Re: [Mesa-dev] [PATCH 10/10] egl/android: Add fallback to kms_swrast driver

2016-07-19 Thread Tomasz Figa
On Wed, Jul 20, 2016 at 7:40 AM, Rob Herring wrote:
> On Fri, Jul 15, 2016 at 2:53 AM, Tomasz Figa wrote:
>> If no hardware driver is present, it is possible to fall back to
>> the kms_swrast driver with any DRI node that supports dumb GEM create
>> and mmap IOCTLs with softpipe/llvmpipe drivers. This patch makes the
>> Android EGL platform code retry probe with kms_swrast if hardware-only
>> probe fails.
>
> Presumably, you need a gralloc that supports this too? It would be
> nice to have access to it to reproduce this setup.

Our use case is running the system in QEMU with the vgem driver, so our
gralloc has a backend for vgem. However, it should work with any
available card or render node (more about render nodes below); no
special support in gralloc is really needed. It's just using kms_swrast
instead of the native driver.
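
The retry order described in the patch can be sketched roughly as follows. This is only an illustration in Python, not the Mesa code: `open_device`, `fake_probe` and the node names are hypothetical stand-ins, and the real logic lives in droid_open_device()/droid_probe_device().

```python
from typing import Callable, Iterable, Optional, Tuple


def open_device(nodes: Iterable[str],
                probe: Callable[[str, bool], bool]) -> Optional[Tuple[str, bool]]:
    """Return (node, used_swrast) for the first node that probes successfully.

    Hypothetical helper: a hardware-only pass over all nodes runs first,
    and the kms_swrast pass only runs if that found nothing.
    """
    nodes = list(nodes)
    for swrast in (False, True):
        for node in nodes:
            if probe(node, swrast):
                return node, swrast
    return None


def fake_probe(node: str, swrast: bool) -> bool:
    """Assumed setup: no hardware driver binds, kms_swrast works on card0."""
    return swrast and node == "card0"


result = open_device(["renderD128", "card0"], fake_probe)
```

With the assumed probe, the hardware pass fails on both nodes and the second pass picks card0 with swrast enabled.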

>
> [...]
>
>>  #define DRM_RENDER_DEV_NAME  "%s/renderD%d"
>>
>>  static int
>> -droid_open_device(_EGLDisplay *dpy)
>> +droid_open_device(_EGLDisplay *dpy, int swrast)
>>  {
>> struct dri2_egl_display *dri2_dpy = dpy->DriverData;
>> const int limit = 64;
>> @@ -933,7 +936,7 @@ droid_open_device(_EGLDisplay *dpy)
>>if (fd < 0)
>>   continue;
>>
>> -  if (!droid_probe_device(dpy, fd))
>> +  if (!droid_probe_device(dpy, fd, swrast))
>
> This only gets here if a render node is present and successfully
> opened.

This is the case when HAS_GRALLOC_HEADERS is not defined, which means
only render nodes are supported. If you look at the other case, it
will use whatever FD was provided by gralloc using that private
perform call.

> I would think in the sw rendering case, we want this to work
> when there's only a card node present. Furthermore, you can't do dumb
> allocs on a render node, so I don't see how this can work at all.

This is only because the dumb alloc ioctl is not allowed on render nodes,
but that's the only thing preventing it from working. We had a similar
restriction on mmap, but now everyone can just mmap the PRIME FD directly.
We actually carry a patch in our tree that allows the dumb alloc and mmap
ioctls on render nodes, because it makes things like the swrast fallback
much, much easier and doesn't seem to be harmful at all. It might be worth
discussing this again on the dri-devel mailing list.

In any case, this patch alone, even without any kernel changes, should
work just fine with a gralloc that returns a control node FD from the
GET_FD perform call.

Best regards,
Tomasz
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] nir: Silence missing field initializer warnings for vectors in nir_constant_expressions

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

nir/nir_constant_expressions.c: In function 'evaluate_ball2':
nir/nir_constant_expressions.c:279:7: warning: missing initializer for field 
'z' of 'struct bool_vec' [-Wmissing-field-initializers]
   };
   ^
nir/nir_constant_expressions.c:234:10: note: 'z' declared here
bool z;
  ^

Number of total warnings in my build reduced from 2532 to 2304
(reduction of 228).

v2: Initialize bool vectors with 0 instead of false to keep the
generator simpler.  Suggested by Ken.

Signed-off-by: Ian Romanick 
---
 src/compiler/nir/nir_constant_expressions.py | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/compiler/nir/nir_constant_expressions.py 
b/src/compiler/nir/nir_constant_expressions.py
index 96d5255..6b4d071 100644
--- a/src/compiler/nir/nir_constant_expressions.py
+++ b/src/compiler/nir/nir_constant_expressions.py
@@ -299,6 +299,9 @@ evaluate_${name}(unsigned num_components, unsigned bit_size,
_src[${j}].${get_const_field(input_types[j])}[${k}],
 % endif
  % endfor
+ % for k in range(op.input_sizes[j], 4):
+0,
+ % endfor
  };
   % endfor
 
-- 
2.5.5
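
The effect of the added template loop can be sketched outside of Mako. This is a simplified illustration, not the generator itself: `vec_initializer` and the example field strings are assumptions; only the idea of padding the initializer up to four components with 0 comes from the patch.

```python
def vec_initializer(components, width=4):
    """Build a C initializer list for a `width`-wide vector struct.

    Components that the opcode does not use are filled with a literal 0,
    so every field has an explicit initializer and
    -Wmissing-field-initializers stays quiet.
    """
    if len(components) > width:
        raise ValueError("too many components for the vector struct")
    padded = list(components) + ["0"] * (width - len(components))
    return "{ " + ", ".join(padded) + " }"


# A two-component source, as in evaluate_ball2 (field names are illustrative).
init = vec_initializer(["_src[0].u32[0]", "_src[0].u32[1]"])
```

Using a plain 0 for every type, including bool vectors, keeps the padding independent of the component type, which matches the v2 note.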



[Mesa-dev] [PATCH 2/2] nir/algebraic: Optimize common array indexing sequence

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

Some shaders include code that looks like:

   uniform int i;
   uniform vec4 bones[...];

   foo(bones[i * 3], bones[i * 3 + 1], bones[i * 3 + 2]);

CSE would do some work on this:

   x = i * 3
   foo(bones[x], bones[x + 1], bones[x + 2]);

The compiler may then add '<< 4 + base' to the index calculations.
This results in expressions like

   x = i * 3
   foo(bones[x << 4], bones[(x + 1) << 4], bones[(x + 2) << 4]);

Just rearranging the math to produce (i * 48) + 16 saves an
instruction, and it allows CSE to do more work.

   x = i * 48;
   foo(bones[x], bones[x + 16], bones[x + 32]);

So, ~6 instructions becomes ~3.

Some individual shader-db results look pretty bad.  However, I have a
really, really hard time believing the change in estimated cycles in,
for example, 3dmmes-taiji/51.shader_test after looking at that change in
the generated code.

G45
total instructions in shared programs: 4020840 -> 4010070 (-0.27%)
instructions in affected programs: 177460 -> 166690 (-6.07%)
helped: 894
HURT: 0

total cycles in shared programs: 98829000 -> 98784990 (-0.04%)
cycles in affected programs: 3936648 -> 3892638 (-1.12%)
helped: 894
HURT: 0

Ironlake
total instructions in shared programs: 6418887 -> 6408117 (-0.17%)
instructions in affected programs: 177460 -> 166690 (-6.07%)
helped: 894
HURT: 0

total cycles in shared programs: 143504542 -> 143460532 (-0.03%)
cycles in affected programs: 3936648 -> 3892638 (-1.12%)
helped: 894
HURT: 0

Sandy Bridge
total instructions in shared programs: 8357887 -> 8339251 (-0.22%)
instructions in affected programs: 432715 -> 414079 (-4.31%)
helped: 2795
HURT: 0

total cycles in shared programs: 118284184 -> 118207412 (-0.06%)
cycles in affected programs: 6114626 -> 6037854 (-1.26%)
helped: 2478
HURT: 317

Ivy Bridge
total instructions in shared programs: 7669390 -> 7653822 (-0.20%)
instructions in affected programs: 388234 -> 372666 (-4.01%)
helped: 2795
HURT: 0

total cycles in shared programs: 68381982 -> 68263684 (-0.17%)
cycles in affected programs: 1972658 -> 1854360 (-6.00%)
helped: 2458
HURT: 307

Haswell
total instructions in shared programs: 7082636 -> 7067068 (-0.22%)
instructions in affected programs: 388234 -> 372666 (-4.01%)
helped: 2795
HURT: 0

total cycles in shared programs: 68282020 -> 68164158 (-0.17%)
cycles in affected programs: 1891820 -> 1773958 (-6.23%)
helped: 2459
HURT: 261

Broadwell
total instructions in shared programs: 9002466 -> 8985875 (-0.18%)
instructions in affected programs: 658784 -> 642193 (-2.52%)
helped: 2795
HURT: 5

total cycles in shared programs: 78503092 -> 78450404 (-0.07%)
cycles in affected programs: 2873304 -> 2820616 (-1.83%)
helped: 2275
HURT: 415

Skylake
total instructions in shared programs: 9156978 -> 9140387 (-0.18%)
instructions in affected programs: 682625 -> 666034 (-2.43%)
helped: 2795
HURT: 5

total cycles in shared programs: 75591392 -> 75550574 (-0.05%)
cycles in affected programs: 3192120 -> 3151302 (-1.28%)
helped: 2271
HURT: 425

Signed-off-by: Ian Romanick 
---
 src/compiler/nir/nir_opt_algebraic.py | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/src/compiler/nir/nir_opt_algebraic.py 
b/src/compiler/nir/nir_opt_algebraic.py
index 0f0896b..37cb700 100644
--- a/src/compiler/nir/nir_opt_algebraic.py
+++ b/src/compiler/nir/nir_opt_algebraic.py
@@ -119,6 +119,17 @@ optimizations = [
(('~fadd@64', a, ('fmul', c , ('fadd', b, ('fneg', a)))), ('flrp', a, b, c), '!options->lower_flrp64'),
(('ffma', a, b, c), ('fadd', ('fmul', a, b), c), 'options->lower_ffma'),
(('~fadd', ('fmul', a, b), c), ('ffma', a, b, c), 'options->fuse_ffma'),
+
+   # (a * #b + #c) << #d
+   # ((a * #b) << #d) + (#c << #d)
+   # (a * (#b << #d)) + (#c << #d)
+   (('ishl', ('iadd', ('imul', a, '#b'), '#c'), '#d'),
+('iadd', ('imul', a, ('ishl', b, d)), ('ishl', c, d))),
+
+   # (a * #b) << #c
+   # a * (#b << #c)
+   (('ishl', ('imul', a, '#b'), '#c'), ('imul', a, ('ishl', b, c))),
+
# Comparison simplifications
(('~inot', ('flt', a, b)), ('fge', a, b)),
(('~inot', ('fge', a, b)), ('flt', a, b)),
-- 
2.5.5
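
As a sanity check on the two rewrite rules, the identity can be tested numerically. The sketch below assumes 32-bit wrapping integers and shift counts in 0..31; the function names are made up for illustration and are not part of nir_opt_algebraic.py.

```python
import random

MASK = 0xFFFFFFFF  # model 32-bit wrapping integer arithmetic


def shl(x, d):
    """32-bit left shift that discards overflowed bits."""
    return (x << d) & MASK


def original(a, b, c, d):
    """(a * #b + #c) << #d, the pattern matched by the new rule."""
    return shl((a * b + c) & MASK, d)


def rewritten(a, b, c, d):
    """(a * (#b << #d)) + (#c << #d), the replacement expression."""
    return (a * shl(b, d) + shl(c, d)) & MASK


def check(samples=10000, seed=0):
    """Compare both forms on random 32-bit inputs."""
    rng = random.Random(seed)
    for _ in range(samples):
        a, b, c = (rng.getrandbits(32) for _ in range(3))
        d = rng.randrange(32)
        if original(a, b, c, d) != rewritten(a, b, c, d):
            return False
    return True


equivalent = check()

# The bones[] example with i = 5: offsets for i*3, i*3 + 1, i*3 + 2, each << 4.
bones_offsets = [rewritten(5, 3, k, 4) for k in range(3)]
```

Both forms agree because a left shift by d is a multiplication by 2**d modulo 2**32, which distributes over the wrapped add and multiply. With c set to 0 the same check covers the second rule, (a * #b) << #c.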



Re: [Mesa-dev] introducing radv - proof of concept vulkan driver for AMD VI chipsets

2016-07-19 Thread Edward O'Callaghan
Yea, that is pretty sweet Bas & Dave! I also agree - open code is open,
promises are just promises.

Best Wishes,
Edward.

On 07/20/2016 06:16 AM, Jason Ekstrand wrote:
> Good work guys!  Welcome to the Open-source Vulkan club :)
> 
> On Tue, Jul 19, 2016 at 12:59 PM, Dave Airlie wrote:
> 
> I was waiting for an open source driver to appear when I realised I
> should really just write one myself, some talking with Bas later, and
> we decided to see where we could get.
> 
> This is the point at which we were willing to show it to others, it's
> not really a vulkan driver yet, so far it's a vulkan triangle demos
> driver.
> 
> It renders the tri and cube demos from the vulkan loader,
> and the triangle demo from Sascha Willems demos
> and the Vulkan CTS smoke tests (all 4 of them one of which draws a
> triangle).
> 
> There is a lot of work to do, and it's at the stage where we are
> seeing if anyone else wants to join in at the start, before we make
> too many serious design decisions or take a path we really don't want
> to.
> 
> So far it's only been run on Tonga and Fiji chips I think, we are
> hoping to support radeon kernel driver for SI/CIK at some point, but I
> think we need to get things a bit further on VI chips first.
> 
> The code is currently here:
> https://github.com/airlied/mesa/tree/semi-interesting
> 
> There is a not-interesting branch which contains all the pre-history
> which might be useful for someone else bringing up a vulkan driver on
> other hardware.
> 
> The code is pretty much based on the Intel anv driver, with the winsys
> ported from gallium driver,
> and most of the state setup from there. Bas wrote the code to connect
> NIR<->LLVM IR so we could reuse it in the future for SPIR-V in GL if
> required. It also copies AMD addrlib over, (this should be shared).
> 
> Also we don't do SPIR-V->LLVM direct. We use NIR as it has the best
> chance for inter shader stage optimisations (vertex/fragment combined)
> which neither SPIR-V or LLVM handles for us, (nir doesn't do it yet
> but it can).
> 
> If you want to submit bug reports, they will only be taken seriously
> if accompanied by working patches at this stage, and we've no plans to
> merge to master yet, but open to discussion on when we could do that
> and what would be required.
> 
> Dave.


Re: [Mesa-dev] [PATCH 06/11] vl/util: add copy func for yv12image to nv12surface

2016-07-19 Thread Zhang, Boyuan
Hi Andy,


Thanks for the confirmation. It seems like all basic functionality is working 
now.


I will look into the cqp issue. As for the speed-related issue, we understand 
the reason and already plan to work on it after this bring-up. However, 
it will be a separate patch in the future and won't be included in this 
bring-up patch set.


Regards,

Boyuan


From: Andy Furniss 
Sent: July 19, 2016 2:39:44 PM
To: Zhang, Boyuan; 'Christian König'; mesa-dev@lists.freedesktop.org
Subject: Re: [PATCH 06/11] vl/util: add copy func for yv12image to nv12surface

Andy Furniss wrote:
> Zhang, Boyuan wrote:
>> Hi Andy,
>>
>> I just submitted another patch set, most of the issues you reported
>>  are solved, please see the information below:
>>
>> - Giving different frame rate should result different output size.
>> The final result from my side is very close to the CBR I set.
>> Please give a try with different frame rate and bit rate.
>>
>> - Picture corruption (half height pic) is caused by interlaced
>> setting. Interlace encoding is not supported. However, for
>> transcoding case, VAAPI decode will use interlace mode, which will
>> cause this issue. The temp solution is to use an Environmental
>> Variable to disable interlace when doing transcoding. Please try
>> the following command with the new patch: DISABLE_INTERLACE=true
>> gst-launch-1.0 filesrc location=~/big_buck_bunny_720p_1mb.mp4 !
>> qtdemux ! h264parse ! vaapidecode ! vaapih264enc ! filesink
>> location=out.264
>>
>> - I420 yuv -> nv12 case seems working fine on my side, can you
>> please provide the testing raw file and command you were using? I
>> want to reproduce the issue from my side and try to fix it if
>> possible. Thanks a lot!
>
> Will try new patches tomorrow.

DISABLE_INTERLACE=true does fix the decode -> encode issue.

bitrate seems to be working OK now with different fps and various rates
I tested. Gstreamer apparently can't count > 102M so that was as high
as I could go.

Stability on Tonga is good.

Remaining issues -

The default people will get just using ... ! vaapih264enc ! ... is not
sane - it encodes with a qp=0 so is huge.
vaapih264enc parameters init-qp and min-qp have no effect, though I am
not sure they would be the right ones to specify cqp anyway.

Speed - though omxh264dec has issues with bitrates, so a direct
comparison is hard, it's always 3x faster than vaapi.

Speed 2 - there seems to be an issue in the case where the bitrate
requested is higher than can be achieved WRT the content to be encoded.

It's up to twice as slow as it would be encoding something that had the
detail to be constrained by the bitrate. This leads to the strange
situation when say screen recording 1080p60 that when nothing much is
happening the framerate can't be reached, but if there is a lot going
on then it can. This is at very high rates = 100M, but then to record
an fps type  game the higher rate may be needed for the fast action.


Re: [Mesa-dev] [PATCH 09/12] st/va: add functions for VAAPI encode

2016-07-19 Thread Zhang, Boyuan
>> -   context->decoder->begin_frame(context->decoder, context->target, &context->desc.base);
>> +   if (context->decoder->entrypoint != PIPE_VIDEO_ENTRYPOINT_ENCODE)
>> +  context->decoder->begin_frame(context->decoder, context->target, &context->desc.base);

>Why do we do so here? Could we avoid that?

>I would rather like to keep the begin_frame()/end_frame() handling as it is.

>Christian.


This is on purpose. Based on my testing, the application will call begin_frame 
first, then call PictureParameter/SequenceParameter/... to pass us all 
picture-related parameters. However, some of those values are actually required 
by the begin_picture call in radeon_vce, so we have to delay the call until we 
have received all the parameters that are needed. The same applies to the 
encode_bitstream call. That's why I delay both calls to end_frame, where we 
have all the necessary values.
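
The ordering problem can be modelled with a small sketch. Everything here is hypothetical (the `FakeEncoder` and `Context` classes, the `frame_rate_num` value); it only shows why the encoder-side begin_frame is deferred until the parameter buffers have arrived, not how st/va is actually structured.

```python
class FakeEncoder:
    """Stand-in for the codec object; it records calls and the parameters seen."""

    def __init__(self):
        self.calls = []

    def begin_frame(self, params):
        self.calls.append(("begin_frame", dict(params)))

    def encode_bitstream(self, params):
        self.calls.append(("encode_bitstream", dict(params)))

    def end_frame(self, params):
        self.calls.append(("end_frame", dict(params)))


class Context:
    """Hypothetical state-tracker context."""

    def __init__(self, codec, is_encode):
        self.codec = codec
        self.is_encode = is_encode
        self.params = {}

    def begin_picture(self):
        # Decode can start the frame right away; encode has no parameters yet.
        if not self.is_encode:
            self.codec.begin_frame(self.params)

    def render_picture(self, **params):
        # Sequence/picture/rate-control buffers arrive after begin_picture.
        self.params.update(params)

    def end_picture(self):
        if self.is_encode:
            # Deferred calls: every parameter is known at this point.
            self.codec.begin_frame(self.params)
            self.codec.encode_bitstream(self.params)
        self.codec.end_frame(self.params)


enc = FakeEncoder()
ctx = Context(enc, is_encode=True)
ctx.begin_picture()
ctx.render_picture(frame_rate_num=30)
ctx.end_picture()
call_order = [name for name, _ in enc.calls]
```

In this model the encoder still sees begin_frame before encode_bitstream and end_frame, only later than the decode path would issue it.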


Regards,

Boyuan


From: Christian König 
Sent: July 19, 2016 4:55:43 AM
To: Zhang, Boyuan; mesa-dev@lists.freedesktop.org
Cc: adf.li...@gmail.com
Subject: Re: [PATCH 09/12] st/va: add functions for VAAPI encode

Am 19.07.2016 um 00:43 schrieb Boyuan Zhang:
> Add necessary functions/changes for VAAPI encoding to buffer and picture. 
> These changes will allow driver to handle all Vaapi encode related 
> operations. This patch doesn't change the Vaapi decode behaviour.
>
> Signed-off-by: Boyuan Zhang 
> ---
>   src/gallium/state_trackers/va/buffer.c |   6 +
>   src/gallium/state_trackers/va/picture.c| 169 
> -
>   src/gallium/state_trackers/va/va_private.h |   3 +
>   3 files changed, 176 insertions(+), 2 deletions(-)
>
> diff --git a/src/gallium/state_trackers/va/buffer.c 
> b/src/gallium/state_trackers/va/buffer.c
> index 7d3167b..dfcebbe 100644
> --- a/src/gallium/state_trackers/va/buffer.c
> +++ b/src/gallium/state_trackers/va/buffer.c
> @@ -133,6 +133,12 @@ vlVaMapBuffer(VADriverContextP ctx, VABufferID buf_id, 
> void **pbuff)
> if (!buf->derived_surface.transfer || !*pbuff)
>return VA_STATUS_ERROR_INVALID_BUFFER;
>
> +  if (buf->type == VAEncCodedBufferType) {
> + ((VACodedBufferSegment*)buf->data)->buf = *pbuff;
> + ((VACodedBufferSegment*)buf->data)->size = buf->coded_size;
> + ((VACodedBufferSegment*)buf->data)->next = NULL;
> + *pbuff = buf->data;
> +  }
>  } else {
> pipe_mutex_unlock(drv->mutex);
> *pbuff = buf->data;
> diff --git a/src/gallium/state_trackers/va/picture.c 
> b/src/gallium/state_trackers/va/picture.c
> index 89ac024..4793194 100644
> --- a/src/gallium/state_trackers/va/picture.c
> +++ b/src/gallium/state_trackers/va/picture.c
> @@ -78,7 +78,8 @@ vlVaBeginPicture(VADriverContextP ctx, VAContextID 
> context_id, VASurfaceID rende
> return VA_STATUS_SUCCESS;
>  }
>
> -   context->decoder->begin_frame(context->decoder, context->target, &context->desc.base);
> +   if (context->decoder->entrypoint != PIPE_VIDEO_ENTRYPOINT_ENCODE)
> +  context->decoder->begin_frame(context->decoder, context->target, &context->desc.base);

Why do we do so here? Could we avoid that?

I would rather like to keep the begin_frame()/end_frame() handling as it is.

Christian.

>
>  return VA_STATUS_SUCCESS;
>   }
> @@ -278,6 +279,139 @@ handleVASliceDataBufferType(vlVaContext *context, 
> vlVaBuffer *buf)
> num_buffers, (const void * const*)buffers, sizes);
>   }
>
> +static VAStatus
> +handleVAEncMiscParameterTypeRateControl(vlVaContext *context, 
> VAEncMiscParameterBuffer *misc)
> +{
> +   VAEncMiscParameterRateControl *rc = (VAEncMiscParameterRateControl 
> *)misc->data;
> +   if (context->desc.h264enc.rate_ctrl.rate_ctrl_method ==
> +   PIPE_H264_ENC_RATE_CONTROL_METHOD_CONSTANT)
> +  context->desc.h264enc.rate_ctrl.target_bitrate = rc->bits_per_second;
> +   else
> +  context->desc.h264enc.rate_ctrl.target_bitrate = rc->bits_per_second * 
> rc->target_percentage;
> +   context->desc.h264enc.rate_ctrl.peak_bitrate = rc->bits_per_second;
> +   if (context->desc.h264enc.rate_ctrl.target_bitrate < 200)
> +  context->desc.h264enc.rate_ctrl.vbv_buffer_size = 
> MIN2((context->desc.h264enc.rate_ctrl.target_bitrate * 2.75), 200);
> +   else
> +  context->desc.h264enc.rate_ctrl.vbv_buffer_size = 
> context->desc.h264enc.rate_ctrl.target_bitrate;
> +   context->desc.h264enc.rate_ctrl.target_bits_picture =
> +context->desc.h264enc.rate_ctrl.target_bitrate / 
> context->desc.h264enc.rate_ctrl.frame_rate_num;
> +   context->desc.h264enc.rate_ctrl.peak_bits_picture_integer =
> +context->desc.h264enc.rate_ctrl.peak_bitrate / 
> context->desc.h264enc.rate_ctrl.frame_rate_num;
> +   context->desc.h264enc.rate_ctrl.peak_bits_picture_fraction = 0;
> +
> +   return VA_STATUS_SUCCESS;
> +}
> +
> +static VAStatus
> +handleVAEncSequenceParameterBufferType(vlVaDriver *drv, vlVaContext 
> *context, vlVaBuffer *buf)
> +{
> +   

Re: [Mesa-dev] [PATCH 05/12] st/va: add encode entrypoint

2016-07-19 Thread Zhang, Boyuan
>> @@ -150,7 +167,16 @@ vlVaCreateConfig(VADriverContextP ctx, VAProfile 
>> profile, VAEntrypoint entrypoin
>>  if (entrypoint != VAEntrypointVLD)
>> return VA_STATUS_ERROR_UNSUPPORTED_ENTRYPOINT;
>>
>> -   *config_id = p;
>> +   if (entrypoint == VAEntrypointEncSlice || entrypoint == 
>> VAEntrypointEncPicture)
>> +  config->entrypoint = PIPE_VIDEO_ENTRYPOINT_ENCODE;
>> +   else
>> +  config->entrypoint = PIPE_VIDEO_ENTRYPOINT_BITSTREAM;

>Well that doesn't make much sense here.

>First we return an error if the entrypoint isn't VAEntrypointVLD and
>then check if it's an encoding entry point.

>In addition to that, I already wondered if we are really going to support
>slice level as well as picture level encoding.

>I think that it should only be one of the two.

>Regards,
>Christian.


Hi Christian,


Sorry for the confusion. The first two lines of code

>>  if (entrypoint != VAEntrypointVLD)
>> return VA_STATUS_ERROR_UNSUPPORTED_ENTRYPOINT;

will actually be removed in the last patch, where we enable VAAPI encode 
(Patch 12/12). In other words, we don't accept the encode entrypoints until 
we enable VAAPI encode. Therefore, we still accept only VAEntrypointVLD in 
this patch.


And we need to accept both the picture-level and slice-level entrypoints. For 
some applications, e.g. the libva h264encode test, if we don't enable 
slice-level encode, the call fails and the application reports that h264 encode 
is not supported. If we enable both, it will still use picture-level encode. 
That's why I put both here.


Regards,

Boyuan


From: Christian König 
Sent: July 19, 2016 4:52 AM
To: Zhang, Boyuan; mesa-dev@lists.freedesktop.org
Cc: adf.li...@gmail.com
Subject: Re: [PATCH 05/12] st/va: add encode entrypoint

Am 19.07.2016 um 00:43 schrieb Boyuan Zhang:
> VAAPI passes PIPE_VIDEO_ENTRYPOINT_ENCODE as entry point for encoding case. 
> We will save this encode entry point in config. config_id was used as profile 
> previously. Now, config has both profile and entrypoint field, and config_id 
> is used to get the config object. Later on, we pass this entrypoint to 
> context->templat.entrypoint instead of always hardcoded to 
> PIPE_VIDEO_ENTRYPOINT_BITSTREAM for decoding case previously.
>
> Signed-off-by: Boyuan Zhang 
> ---
>   src/gallium/state_trackers/va/config.c | 69 
> +++---
>   src/gallium/state_trackers/va/context.c| 59 ++---
>   src/gallium/state_trackers/va/surface.c| 14 --
>   src/gallium/state_trackers/va/va_private.h |  5 +++
>   4 files changed, 115 insertions(+), 32 deletions(-)
>
> diff --git a/src/gallium/state_trackers/va/config.c 
> b/src/gallium/state_trackers/va/config.c
> index 9ca0aa8..7ea7e24 100644
> --- a/src/gallium/state_trackers/va/config.c
> +++ b/src/gallium/state_trackers/va/config.c
> @@ -34,6 +34,8 @@
>
>   #include "va_private.h"
>
> +#include "util/u_handle_table.h"
> +
>   DEBUG_GET_ONCE_BOOL_OPTION(mpeg4, "VAAPI_MPEG4_ENABLED", false)
>
>   VAStatus
> @@ -128,14 +130,29 @@ VAStatus
>   vlVaCreateConfig(VADriverContextP ctx, VAProfile profile, VAEntrypoint 
> entrypoint,
>VAConfigAttrib *attrib_list, int num_attribs, VAConfigID 
> *config_id)
>   {
> +   vlVaDriver *drv;
> +   vlVaConfig *config;
>  struct pipe_screen *pscreen;
>  enum pipe_video_profile p;
>
>  if (!ctx)
> return VA_STATUS_ERROR_INVALID_CONTEXT;
>
> +   drv = VL_VA_DRIVER(ctx);
> +
> +   if (!drv)
> +  return VA_STATUS_ERROR_INVALID_CONTEXT;
> +
> +   config = CALLOC(1, sizeof(vlVaConfig));
> +   if (!config)
> +  return VA_STATUS_ERROR_ALLOCATION_FAILED;
> +
>  if (profile == VAProfileNone && entrypoint == VAEntrypointVideoProc) {
> -  *config_id = PIPE_VIDEO_PROFILE_UNKNOWN;
> +  config->entrypoint = VAEntrypointVideoProc;
> +  config->profile = PIPE_VIDEO_PROFILE_UNKNOWN;
> +  pipe_mutex_lock(drv->mutex);
> +  *config_id = handle_table_add(drv->htab, config);
> +  pipe_mutex_unlock(drv->mutex);
> return VA_STATUS_SUCCESS;
>  }
>
> @@ -150,7 +167,16 @@ vlVaCreateConfig(VADriverContextP ctx, VAProfile 
> profile, VAEntrypoint entrypoin
>  if (entrypoint != VAEntrypointVLD)
> return VA_STATUS_ERROR_UNSUPPORTED_ENTRYPOINT;
>
> -   *config_id = p;
> +   if (entrypoint == VAEntrypointEncSlice || entrypoint == 
> VAEntrypointEncPicture)
> +  config->entrypoint = PIPE_VIDEO_ENTRYPOINT_ENCODE;
> +   else
> +  config->entrypoint = PIPE_VIDEO_ENTRYPOINT_BITSTREAM;

Well that doesn't make much sense here.

First we return an error if the entrypoint isn't VAEntrypointVLD and
then check if it's an encoding entry point.

In addition to that, I already wondered if we are really going to support
slice level as well as picture level encoding.

I think that it should only be one of the two.

Regards,
Christian.

> +
> +   config->profile = p;
> +
> +   

Re: [Mesa-dev] [PATCH v1 8/8] nvc0: disable MS images on GM107+

2016-07-19 Thread Ilia Mirkin
Series is

Reviewed-by: Ilia Mirkin 

Note that you'll start using PBO downloads via images as a result of
this change, please run some of the deqp pbo tests (might be in gles3,
not gles31) to double-check that all's well.

  -ilia

On Tue, Jul 19, 2016 at 4:15 PM, Samuel Pitoiset
 wrote:
> MS images have to be handled explicitly and I don't plan to implement
> them for now.
>
> v1: - check that sample_count > 1
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 7 +++
>  1 file changed, 7 insertions(+)
>
> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c 
> b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
> index f681631..a3cd046 100644
> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
> @@ -90,6 +90,13 @@ nvc0_screen_is_format_supported(struct pipe_screen 
> *pscreen,
>   PIPE_BIND_LINEAR |
>   PIPE_BIND_SHARED);
>
> +   if (bindings & PIPE_BIND_SHADER_IMAGE && sample_count > 1 &&
> +   nouveau_screen(pscreen)->class_3d >= GM107_3D_CLASS) {
> +  /* MS images are currently unsupported on Maxwell because they have to
> +   * be handled explicitly. */
> +  return false;
> +   }
> +
> return (( nvc0_format_table[format].usage |
>  nvc0_vertex_format[format].usage) & bindings) == bindings;
>  }
> --
> 2.9.0
>


Re: [Mesa-dev] [PATCH 56/56] glsl: Replace most assertions with unreachable()

2016-07-19 Thread Matt Turner
Certainly not necessary to do this, but if you care to include `size`
output in commit messages, this one might be a good candidate.


Re: [Mesa-dev] [PATCH 00/56] Die copy-and-paste code, die

2016-07-19 Thread Matt Turner
1-27 are

Reviewed-by: Matt Turner 


Re: [Mesa-dev] [PATCH 19/56] glsl: Use find_msb_uint to implement ir_unop_find_lsb

2016-07-19 Thread Matt Turner
On Tue, Jul 19, 2016 at 12:24 PM, Ian Romanick  wrote:
> From: Ian Romanick 
>
> (X & -X) calculates a value with only the least significant bit of X
> set.  Since there is only one bit set, the LSB is the MSB.
>
> Signed-off-by: Ian Romanick 
> ---
>  src/compiler/glsl/ir_constant_expression.cpp | 19 +--
>  1 file changed, 9 insertions(+), 10 deletions(-)
>
> diff --git a/src/compiler/glsl/ir_constant_expression.cpp 
> b/src/compiler/glsl/ir_constant_expression.cpp
> index 5f4cae2..71afb33 100644
> --- a/src/compiler/glsl/ir_constant_expression.cpp
> +++ b/src/compiler/glsl/ir_constant_expression.cpp
> @@ -1560,16 +1560,15 @@ ir_expression::constant_expression_value(struct 
> hash_table *variable_context)
>
> case ir_unop_find_lsb:
>for (unsigned c = 0; c < components; c++) {
> - if (op[0]->value.i[c] == 0)
> -data.i[c] = -1;
> - else {
> -unsigned pos = 0;
> -unsigned v = op[0]->value.u[c];
> -
> -for (; !(v & 1); v >>= 1) {
> -   pos++;
> -}
> -data.u[c] = pos;
> + switch (op[0]->type->base_type) {
> + case GLSL_TYPE_UINT:
> +data.i[c] = find_msb_uint(op[0]->value.u[c] & 
> -int(op[0]->value.u[c]));

The int cast looks a bit weird (and unnecessary, right?)

I'd drop it unless there's a good reason to keep it.
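
For reference, the (X & -X) trick from the commit message can be checked with a short sketch. It models 32-bit unsigned values in Python; `find_msb_uint` here is a stand-in written for the example, not Mesa's helper, and it assumes the helper returns -1 for zero.

```python
MASK = 0xFFFFFFFF  # model 32-bit unsigned arithmetic


def find_msb_uint(v):
    """Index of the highest set bit, or -1 when v is zero (assumed behaviour)."""
    return v.bit_length() - 1


def find_lsb(v):
    """Index of the lowest set bit via (X & -X), or -1 when v is zero."""
    v &= MASK
    lowest = v & (-v & MASK)  # two's-complement negate, then isolate the low bit
    return find_msb_uint(lowest)


examples = [find_lsb(x) for x in (0, 1, 0b101000, 0x80000000)]
```

The negation is done modulo 2**32 here, which mirrors what negating an unsigned value does in C, so the unsigned form gives the same isolated bit without going through int.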


Re: [Mesa-dev] [PATCH 6/7] i965: Rewrite FS input handling to use the new NIR intrinsics.

2016-07-19 Thread Jason Ekstrand
On Tue, Jul 19, 2016 at 5:02 PM, Kenneth Graunke 
wrote:

> On Tuesday, July 19, 2016 1:39:23 PM PDT Jason Ekstrand wrote:
> > On Mon, Jul 18, 2016 at 1:26 PM, Kenneth Graunke 
> > wrote:
> >
> > > This eliminates the need to walk the list of input variables, recurse
> > > into their types (via logic largely redundant with nir_lower_io), and
> > > interpolate all possible inputs up front.  The backend no longer has
> > > to care about variables at all, which eliminates complications from
> > > trying to pack multiple variables into the same location.  Instead,
> > > each intrinsic specifies exactly what's needed.
> > >
> > > This should unblock Timothy's work on GL_ARB_enhanced_layouts.
> > >
> > > Each load_interpolated_input intrinsic corresponds to PLN instructions,
> > > while load_barycentric_at_* intrinsics correspond to pixel interpolator
> > > messages.  The pixel/centroid/sample barycentric intrinsics simply
> refer
> > > to payload fields (delta_xy[]), and don't actually generate any code.
> > >
> > > Because we use a single intrinsic for both centroid-qualified variables
> > > and interpolateAtCentroid(), they become indistinguishable.  We stop
> > > sending pixel interpolator messages for those, and instead use the
> > > payload provided data, which should be considerably faster.
> > >
> > > On Broadwell:
> > >
> > > total instructions in shared programs: 9067751 -> 9067570 (-0.00%)
> > > instructions in affected programs: 145902 -> 145721 (-0.12%)
> > > helped: 422
> > > HURT: 209
> > >
> > > total spills in shared programs: 2849 -> 2899 (1.76%)
> > > spills in affected programs: 760 -> 810 (6.58%)
> > > helped: 0
> > > HURT: 10
> > >
> > > total fills in shared programs: 3910 -> 3950 (1.02%)
> > > fills in affected programs: 617 -> 657 (6.48%)
> > > helped: 0
> > > HURT: 10
> > >
> > > LOST:   3
> > > GAINED: 3
> > >
> > > The differences mostly appear to be slight changes in MOVs.
> > >
> > > Signed-off-by: Kenneth Graunke 
> > > ---
> > >  src/mesa/drivers/dri/i965/brw_fs.cpp | 175 -
> > >  src/mesa/drivers/dri/i965/brw_fs.h   |   9 +-
> > >  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 410
> > > ---
> > >  src/mesa/drivers/dri/i965/brw_nir.c  |  16 +-
> > >  4 files changed, 269 insertions(+), 341 deletions(-)
> > >
> > > diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
> > > b/src/mesa/drivers/dri/i965/brw_fs.cpp
> > > index 94127bc..06007fe 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> > > +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> > > @@ -1067,21 +1067,27 @@ fs_visitor::emit_fragcoord_interpolation(fs_reg
> > > wpos)
> > > bld.MOV(wpos, this->wpos_w);
> > >  }
> > >
> > > -static enum brw_barycentric_mode
> > > -barycentric_mode(enum glsl_interp_mode mode,
> > > - bool is_centroid, bool is_sample)
> > > +enum brw_barycentric_mode
> > > +brw_barycentric_mode(enum glsl_interp_mode mode, nir_intrinsic_op op)
> > >  {
> > > -   unsigned bary;
> > > -
> > > /* Barycentric modes don't make sense for flat inputs. */
> > > assert(mode != INTERP_MODE_FLAT);
> > >
> > > -   if (is_sample) {
> > > -  bary = BRW_BARYCENTRIC_PERSPECTIVE_SAMPLE;
> > > -   } else if (is_centroid) {
> > > -  bary = BRW_BARYCENTRIC_PERSPECTIVE_CENTROID;
> > > -   } else {
> > > +   unsigned bary;
> > > +   switch (op) {
> > > +   case nir_intrinsic_load_barycentric_pixel:
> > > +   case nir_intrinsic_load_barycentric_at_offset:
> > >bary = BRW_BARYCENTRIC_PERSPECTIVE_PIXEL;
> > > +  break;
> > > +   case nir_intrinsic_load_barycentric_centroid:
> > > +  bary = BRW_BARYCENTRIC_PERSPECTIVE_CENTROID;
> > > +  break;
> > > +   case nir_intrinsic_load_barycentric_sample:
> > > +   case nir_intrinsic_load_barycentric_at_sample:
> > > +  bary = BRW_BARYCENTRIC_PERSPECTIVE_SAMPLE;
> > > +  break;
> > > +   default:
> > > +  assert(!"invalid intrinsic");
> > > }
> > >
> > > if (mode == INTERP_MODE_NOPERSPECTIVE)
> > > @@ -1101,107 +1107,6 @@ centroid_to_pixel(enum brw_barycentric_mode bary)
> > > return (enum brw_barycentric_mode) ((unsigned) bary - 1);
> > >  }
> > >
> > > -void
> > > -fs_visitor::emit_general_interpolation(fs_reg *attr, const char *name,
> > > -   const glsl_type *type,
> > > -   glsl_interp_mode interpolation_mode,
> > > -   int *location, bool mod_centroid,
> > > -   bool mod_sample)
> > > -{
> > > -   assert(stage == MESA_SHADER_FRAGMENT);
> > > -   brw_wm_prog_data *prog_data = (brw_wm_prog_data*) this->prog_data;
> > > -
> > > -   if (type->is_array() || type->is_matrix()) {
> > > -  const glsl_type *elem_type = glsl_get_array_element(type);
> > > -  const unsigned length = glsl_get_length(type);
> > > -
> > > -  for (unsigned i = 

Re: [Mesa-dev] [PATCH 2/7] nir: Add a nir_lower_io flag for using load_interpolated_input intrins.

2016-07-19 Thread Kenneth Graunke
On Tuesday, July 19, 2016 12:57:23 PM PDT Jason Ekstrand wrote:
> On Mon, Jul 18, 2016 at 10:00 PM, Chris Forbes  wrote:
> 
> > Seems a little unfortunate to add a random bool to this interface which is
> > otherwise fairly descriptive, but OK.
> >
> 
> I agree that this is a bit unfortunate.  I was going to suggest adding a
> flags parameter and a bitfield union but a flags parameter for a single
> bool is also kind-of stupid.  I vote we keep it as-is and make it a flags
> parameter once we have 2 bools.
> 
> --Jason

I agree, this sucks.  I've dropped this patch locally in favor of adding
a nir_shader_compiler_options::use_interpolated_input_intrinsics flag
(which I just folded into patch 3 as it's a tiny amount of code).

That was really easy to hook up and is a lot cleaner.


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 23/36] isl: Remove duplicate px->sa conversions

2016-07-19 Thread Nanley Chery
On Tue, Jul 19, 2016 at 04:16:23PM -0700, Jason Ekstrand wrote:
> On Tue, Jul 19, 2016 at 4:12 PM, Nanley Chery  wrote:
> 
> > On Fri, Jul 01, 2016 at 04:08:49PM -0700, Jason Ekstrand wrote:
> > > In all three cases, we start with width and height taken from
> > > isl_surf::phys_slice0_extent_sa which is already in samples.  There is no
> > > need to do the conversion and doing so gives us an incorrect value.
> >
> > Thanks for noticing this bug! I think this patch is missing one
> > necessary change. The level width and height must be adjusted
> > for the sample count.
> >
> > Here's an example that demonstrates the post-patch issue:
> >  * User creates a 2x1px 4xIMS image
> >  * Level 0 is 4x2sa (2x1px)
> >  * Level 1 is 2x1sa (1x1px) but should be 2x2sa
> >
> 
> I thought this was an issue the first time through too.  But then Chad
> reminded me that there is no mipmapped multisampling.
> 

Thanks for pointing that out. This patch is,
Reviewed-by: Nanley Chery 

> --Jason
> 
> 
> >
> > - Nanley
> >
> > > ---
> > >  src/intel/isl/isl.c | 20 
> > >  1 file changed, 20 deletions(-)
> > >
> > > diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
> > > index 404cfc1..be3adfc 100644
> > > --- a/src/intel/isl/isl.c
> > > +++ b/src/intel/isl/isl.c
> > > @@ -610,18 +610,6 @@ isl_calc_phys_slice0_extent_sa_gen4_2d(
> > >uint32_t W = isl_minify(W0, l);
> > >uint32_t H = isl_minify(H0, l);
> > >
> > > -  if (msaa_layout == ISL_MSAA_LAYOUT_INTERLEAVED) {
> > > - /* From the Broadwell PRM >> Volume 5: Memory Views >>
> > Computing Mip Level
> > > -  * Sizes (p133):
> > > -  *
> > > -  *If the surface is multisampled and it is a depth or
> > stencil
> > > -  *surface or Multisampled Surface StorageFormat in
> > > -  *SURFACE_STATE is MSFMT_DEPTH_STENCIL, W_L and H_L must be
> > > -  *adjusted as follows before proceeding: [...]
> > > -  */
> > > > - isl_msaa_interleaved_scale_px_to_sa(info->samples, &W, &H);
> > > -  }
> > > -
> > >uint32_t w = isl_align_npot(W, image_align_sa->w);
> > >uint32_t h = isl_align_npot(H, image_align_sa->h);
> > >
> > > @@ -1285,17 +1273,9 @@ get_image_offset_sa_gen4_2d(const struct isl_surf
> > *surf,
> > > for (uint32_t l = 0; l < level; ++l) {
> > >if (l == 1) {
> > >   uint32_t W = isl_minify(W0, l);
> > > -
> > > - if (surf->msaa_layout == ISL_MSAA_LAYOUT_INTERLEAVED)
> > > > -isl_msaa_interleaved_scale_px_to_sa(surf->samples, &W,
> > NULL);
> > > -
> > >   x += isl_align_npot(W, image_align_sa.w);
> > >} else {
> > >   uint32_t H = isl_minify(H0, l);
> > > -
> > > - if (surf->msaa_layout == ISL_MSAA_LAYOUT_INTERLEAVED)
> > > -isl_msaa_interleaved_scale_px_to_sa(surf->samples, NULL,
> > );
> > > -
> > >   y += isl_align_npot(H, image_align_sa.h);
> > >}
> > > }
> > > --
> > > 2.5.0.400.gff86faf
> > >
> > > ___
> > > mesa-dev mailing list
> > > mesa-dev@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> >


Re: [Mesa-dev] [PATCH 6/7] i965: Rewrite FS input handling to use the new NIR intrinsics.

2016-07-19 Thread Kenneth Graunke
On Tuesday, July 19, 2016 1:39:23 PM PDT Jason Ekstrand wrote:
> On Mon, Jul 18, 2016 at 1:26 PM, Kenneth Graunke 
> wrote:
> 
> > This eliminates the need to walk the list of input variables, recurse
> > into their types (via logic largely redundant with nir_lower_io), and
> > interpolate all possible inputs up front.  The backend no longer has
> > to care about variables at all, which eliminates complications from
> > trying to pack multiple variables into the same location.  Instead,
> > each intrinsic specifies exactly what's needed.
> >
> > This should unblock Timothy's work on GL_ARB_enhanced_layouts.
> >
> > Each load_interpolated_input intrinsic corresponds to PLN instructions,
> > while load_barycentric_at_* intrinsics correspond to pixel interpolator
> > messages.  The pixel/centroid/sample barycentric intrinsics simply refer
> > to payload fields (delta_xy[]), and don't actually generate any code.
> >
> > Because we use a single intrinsic for both centroid-qualified variables
> > and interpolateAtCentroid(), they become indistinguishable.  We stop
> > sending pixel interpolator messages for those, and instead use the
> > payload provided data, which should be considerably faster.
> >
> > On Broadwell:
> >
> > total instructions in shared programs: 9067751 -> 9067570 (-0.00%)
> > instructions in affected programs: 145902 -> 145721 (-0.12%)
> > helped: 422
> > HURT: 209
> >
> > total spills in shared programs: 2849 -> 2899 (1.76%)
> > spills in affected programs: 760 -> 810 (6.58%)
> > helped: 0
> > HURT: 10
> >
> > total fills in shared programs: 3910 -> 3950 (1.02%)
> > fills in affected programs: 617 -> 657 (6.48%)
> > helped: 0
> > HURT: 10
> >
> > LOST:   3
> > GAINED: 3
> >
> > The differences mostly appear to be slight changes in MOVs.
> >
> > Signed-off-by: Kenneth Graunke 
> > ---
> >  src/mesa/drivers/dri/i965/brw_fs.cpp | 175 -
> >  src/mesa/drivers/dri/i965/brw_fs.h   |   9 +-
> >  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 410
> > ---
> >  src/mesa/drivers/dri/i965/brw_nir.c  |  16 +-
> >  4 files changed, 269 insertions(+), 341 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
> > b/src/mesa/drivers/dri/i965/brw_fs.cpp
> > index 94127bc..06007fe 100644
> > --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> > @@ -1067,21 +1067,27 @@ fs_visitor::emit_fragcoord_interpolation(fs_reg
> > wpos)
> > bld.MOV(wpos, this->wpos_w);
> >  }
> >
> > -static enum brw_barycentric_mode
> > -barycentric_mode(enum glsl_interp_mode mode,
> > - bool is_centroid, bool is_sample)
> > +enum brw_barycentric_mode
> > +brw_barycentric_mode(enum glsl_interp_mode mode, nir_intrinsic_op op)
> >  {
> > -   unsigned bary;
> > -
> > /* Barycentric modes don't make sense for flat inputs. */
> > assert(mode != INTERP_MODE_FLAT);
> >
> > -   if (is_sample) {
> > -  bary = BRW_BARYCENTRIC_PERSPECTIVE_SAMPLE;
> > -   } else if (is_centroid) {
> > -  bary = BRW_BARYCENTRIC_PERSPECTIVE_CENTROID;
> > -   } else {
> > +   unsigned bary;
> > +   switch (op) {
> > +   case nir_intrinsic_load_barycentric_pixel:
> > +   case nir_intrinsic_load_barycentric_at_offset:
> >bary = BRW_BARYCENTRIC_PERSPECTIVE_PIXEL;
> > +  break;
> > +   case nir_intrinsic_load_barycentric_centroid:
> > +  bary = BRW_BARYCENTRIC_PERSPECTIVE_CENTROID;
> > +  break;
> > +   case nir_intrinsic_load_barycentric_sample:
> > +   case nir_intrinsic_load_barycentric_at_sample:
> > +  bary = BRW_BARYCENTRIC_PERSPECTIVE_SAMPLE;
> > +  break;
> > +   default:
> > +  assert(!"invalid intrinsic");
> > }
> >
> > if (mode == INTERP_MODE_NOPERSPECTIVE)
> > @@ -1101,107 +1107,6 @@ centroid_to_pixel(enum brw_barycentric_mode bary)
> > return (enum brw_barycentric_mode) ((unsigned) bary - 1);
> >  }
> >
> > -void
> > -fs_visitor::emit_general_interpolation(fs_reg *attr, const char *name,
> > -   const glsl_type *type,
> > -   glsl_interp_mode
> > interpolation_mode,
> > -   int *location, bool mod_centroid,
> > -   bool mod_sample)
> > -{
> > -   assert(stage == MESA_SHADER_FRAGMENT);
> > -   brw_wm_prog_data *prog_data = (brw_wm_prog_data*) this->prog_data;
> > -
> > -   if (type->is_array() || type->is_matrix()) {
> > -  const glsl_type *elem_type = glsl_get_array_element(type);
> > -  const unsigned length = glsl_get_length(type);
> > -
> > -  for (unsigned i = 0; i < length; i++) {
> > - emit_general_interpolation(attr, name, elem_type,
> > interpolation_mode,
> > -location, mod_centroid, mod_sample);
> > -  }
> > -   } else if (type->is_record()) {
> > -  for (unsigned i = 0; i < type->length; i++) {
> > - const 

Re: [Mesa-dev] [PATCH v2 23/36] isl: Remove duplicate px->sa conversions

2016-07-19 Thread Nanley Chery
On Tue, Jul 19, 2016 at 04:12:17PM -0700, Nanley Chery wrote:
> On Fri, Jul 01, 2016 at 04:08:49PM -0700, Jason Ekstrand wrote:
> > In all three cases, we start with width and height taken from
> > isl_surf::phys_slice0_extent_sa which is already in samples.  There is no
> > need to do the conversion and doing so gives us an incorrect value.
> 
> Thanks for noticing this bug! I think this patch is missing one
> necessary change. The level width and height must be adjusted
> for the sample count.
> 
> Here's an example that demonstrates the post-patch issue:
>  * User creates a 2x1px 4xIMS image
>  * Level 0 is 4x2sa (2x1px) 
>  * Level 1 is 2x1sa (1x1px) but should be 2x2sa
> 
> - Nanley

I think we should also Cc stable as it fixes an incorrect layout bug.

- Nanley

> 
> > ---
> >  src/intel/isl/isl.c | 20 
> >  1 file changed, 20 deletions(-)
> > 
> > diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
> > index 404cfc1..be3adfc 100644
> > --- a/src/intel/isl/isl.c
> > +++ b/src/intel/isl/isl.c
> > @@ -610,18 +610,6 @@ isl_calc_phys_slice0_extent_sa_gen4_2d(
> >uint32_t W = isl_minify(W0, l);
> >uint32_t H = isl_minify(H0, l);
> >  
> > -  if (msaa_layout == ISL_MSAA_LAYOUT_INTERLEAVED) {
> > - /* From the Broadwell PRM >> Volume 5: Memory Views >> Computing 
> > Mip Level
> > -  * Sizes (p133):
> > -  *
> > -  *If the surface is multisampled and it is a depth or stencil
> > -  *surface or Multisampled Surface StorageFormat in
> > -  *SURFACE_STATE is MSFMT_DEPTH_STENCIL, W_L and H_L must be
> > -  *adjusted as follows before proceeding: [...]
> > -  */
> > > - isl_msaa_interleaved_scale_px_to_sa(info->samples, &W, &H);
> > -  }
> > -
> >uint32_t w = isl_align_npot(W, image_align_sa->w);
> >uint32_t h = isl_align_npot(H, image_align_sa->h);
> >  
> > @@ -1285,17 +1273,9 @@ get_image_offset_sa_gen4_2d(const struct isl_surf 
> > *surf,
> > for (uint32_t l = 0; l < level; ++l) {
> >if (l == 1) {
> >   uint32_t W = isl_minify(W0, l);
> > -
> > - if (surf->msaa_layout == ISL_MSAA_LAYOUT_INTERLEAVED)
> > > -isl_msaa_interleaved_scale_px_to_sa(surf->samples, &W, NULL);
> > -
> >   x += isl_align_npot(W, image_align_sa.w);
> >} else {
> >   uint32_t H = isl_minify(H0, l);
> > -
> > - if (surf->msaa_layout == ISL_MSAA_LAYOUT_INTERLEAVED)
> > > -isl_msaa_interleaved_scale_px_to_sa(surf->samples, NULL, &H);
> > -
> >   y += isl_align_npot(H, image_align_sa.h);
> >}
> > }
> > -- 
> > 2.5.0.400.gff86faf
> > 
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 23/36] isl: Remove duplicate px->sa conversions

2016-07-19 Thread Jason Ekstrand
On Tue, Jul 19, 2016 at 4:12 PM, Nanley Chery  wrote:

> On Fri, Jul 01, 2016 at 04:08:49PM -0700, Jason Ekstrand wrote:
> > In all three cases, we start with width and height taken from
> > isl_surf::phys_slice0_extent_sa which is already in samples.  There is no
> > need to do the conversion and doing so gives us an incorrect value.
>
> Thanks for noticing this bug! I think this patch is missing one
> necessary change. The level width and height must be adjusted
> for the sample count.
>
> Here's an example that demonstrates the post-patch issue:
>  * User creates a 2x1px 4xIMS image
>  * Level 0 is 4x2sa (2x1px)
>  * Level 1 is 2x1sa (1x1px) but should be 2x2sa
>

I thought this was an issue the first time through too.  But then Chad
reminded me that there is no mipmapped multisampling.

--Jason


>
> - Nanley
>
> > ---
> >  src/intel/isl/isl.c | 20 
> >  1 file changed, 20 deletions(-)
> >
> > diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
> > index 404cfc1..be3adfc 100644
> > --- a/src/intel/isl/isl.c
> > +++ b/src/intel/isl/isl.c
> > @@ -610,18 +610,6 @@ isl_calc_phys_slice0_extent_sa_gen4_2d(
> >uint32_t W = isl_minify(W0, l);
> >uint32_t H = isl_minify(H0, l);
> >
> > -  if (msaa_layout == ISL_MSAA_LAYOUT_INTERLEAVED) {
> > - /* From the Broadwell PRM >> Volume 5: Memory Views >>
> Computing Mip Level
> > -  * Sizes (p133):
> > -  *
> > -  *If the surface is multisampled and it is a depth or
> stencil
> > -  *surface or Multisampled Surface StorageFormat in
> > -  *SURFACE_STATE is MSFMT_DEPTH_STENCIL, W_L and H_L must be
> > -  *adjusted as follows before proceeding: [...]
> > -  */
> > > - isl_msaa_interleaved_scale_px_to_sa(info->samples, &W, &H);
> > -  }
> > -
> >uint32_t w = isl_align_npot(W, image_align_sa->w);
> >uint32_t h = isl_align_npot(H, image_align_sa->h);
> >
> > @@ -1285,17 +1273,9 @@ get_image_offset_sa_gen4_2d(const struct isl_surf
> *surf,
> > for (uint32_t l = 0; l < level; ++l) {
> >if (l == 1) {
> >   uint32_t W = isl_minify(W0, l);
> > -
> > - if (surf->msaa_layout == ISL_MSAA_LAYOUT_INTERLEAVED)
> > > -isl_msaa_interleaved_scale_px_to_sa(surf->samples, &W,
> NULL);
> > -
> >   x += isl_align_npot(W, image_align_sa.w);
> >} else {
> >   uint32_t H = isl_minify(H0, l);
> > -
> > - if (surf->msaa_layout == ISL_MSAA_LAYOUT_INTERLEAVED)
> > -isl_msaa_interleaved_scale_px_to_sa(surf->samples, NULL,
> &H);
> > -
> >   y += isl_align_npot(H, image_align_sa.h);
> >}
> > }
> > --
> > 2.5.0.400.gff86faf
> >
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>


Re: [Mesa-dev] [PATCH v2 23/36] isl: Remove duplicate px->sa conversions

2016-07-19 Thread Nanley Chery
On Fri, Jul 01, 2016 at 04:08:49PM -0700, Jason Ekstrand wrote:
> In all three cases, we start with width and height taken from
> isl_surf::phys_slice0_extent_sa which is already in samples.  There is no
> need to do the conversion and doing so gives us an incorrect value.

Thanks for noticing this bug! I think this patch is missing one
necessary change. The level width and height must be adjusted
for the sample count.

Here's an example that demonstrates the post-patch issue:
 * User creates a 2x1px 4xIMS image
 * Level 0 is 4x2sa (2x1px) 
 * Level 1 is 2x1sa (1x1px) but should be 2x2sa

- Nanley

> ---
>  src/intel/isl/isl.c | 20 
>  1 file changed, 20 deletions(-)
> 
> diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
> index 404cfc1..be3adfc 100644
> --- a/src/intel/isl/isl.c
> +++ b/src/intel/isl/isl.c
> @@ -610,18 +610,6 @@ isl_calc_phys_slice0_extent_sa_gen4_2d(
>uint32_t W = isl_minify(W0, l);
>uint32_t H = isl_minify(H0, l);
>  
> -  if (msaa_layout == ISL_MSAA_LAYOUT_INTERLEAVED) {
> - /* From the Broadwell PRM >> Volume 5: Memory Views >> Computing 
> Mip Level
> -  * Sizes (p133):
> -  *
> -  *If the surface is multisampled and it is a depth or stencil
> -  *surface or Multisampled Surface StorageFormat in
> -  *SURFACE_STATE is MSFMT_DEPTH_STENCIL, W_L and H_L must be
> -  *adjusted as follows before proceeding: [...]
> -  */
> - isl_msaa_interleaved_scale_px_to_sa(info->samples, &W, &H);
> -  }
> -
>uint32_t w = isl_align_npot(W, image_align_sa->w);
>uint32_t h = isl_align_npot(H, image_align_sa->h);
>  
> @@ -1285,17 +1273,9 @@ get_image_offset_sa_gen4_2d(const struct isl_surf 
> *surf,
> for (uint32_t l = 0; l < level; ++l) {
>if (l == 1) {
>   uint32_t W = isl_minify(W0, l);
> -
> - if (surf->msaa_layout == ISL_MSAA_LAYOUT_INTERLEAVED)
> -isl_msaa_interleaved_scale_px_to_sa(surf->samples, &W, NULL);
> -
>   x += isl_align_npot(W, image_align_sa.w);
>} else {
>   uint32_t H = isl_minify(H0, l);
> -
> - if (surf->msaa_layout == ISL_MSAA_LAYOUT_INTERLEAVED)
> -isl_msaa_interleaved_scale_px_to_sa(surf->samples, NULL, &H);
> -
>   y += isl_align_npot(H, image_align_sa.h);
>}
> }
> -- 
> 2.5.0.400.gff86faf
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 10/10] egl/android: Add fallback to kms_swrast driver

2016-07-19 Thread Rob Herring
On Fri, Jul 15, 2016 at 2:53 AM, Tomasz Figa  wrote:
> If no hardware driver is present, it is possible to fall back to
> the kms_swrast driver with any DRI node that supports dumb GEM create
> and mmap IOCTLs with softpipe/llvmpipe drivers. This patch makes the
> Android EGL platform code retry probe with kms_swrast if hardware-only
> probe fails.

Presumably, you need a gralloc that supports this too? It would be
nice to have access to it to reproduce this setup.

[...]

>  #define DRM_RENDER_DEV_NAME  "%s/renderD%d"
>
>  static int
> -droid_open_device(_EGLDisplay *dpy)
> +droid_open_device(_EGLDisplay *dpy, int swrast)
>  {
> struct dri2_egl_display *dri2_dpy = dpy->DriverData;
> const int limit = 64;
> @@ -933,7 +936,7 @@ droid_open_device(_EGLDisplay *dpy)
>if (fd < 0)
>   continue;
>
> -  if (!droid_probe_device(dpy, fd))
> +  if (!droid_probe_device(dpy, fd, swrast))

This only gets here if a render node is present and successfully
opened. I would think in the sw rendering case, we want this to work
when there's only a card node present. Furthermore, you can't do dumb
allocs on a render node, so I don't see how this can work at all.
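
For reference, the probe order described in the commit message (hardware
first, then retry with kms_swrast) could be sketched roughly as below. This
is not the code from the patch: example_probe_fn, example_open_with_fallback
and the stub probe are hypothetical names, and the candidate list is assumed
to hold whatever nodes (card or render) were opened beforehand.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical probe callback: returns true if the node behind `fd` can be
 * used in the requested mode (swrast == true means software rendering). */
typedef bool (*example_probe_fn)(int fd, bool swrast);

/* Try every candidate fd with the hardware driver first. Only if none of
 * them probes successfully, walk the same list again with swrast enabled.
 * Returns the chosen fd, or -1 if nothing works; *used_swrast reports which
 * pass succeeded. */
static int example_open_with_fallback(const int *fds, size_t n,
                                      example_probe_fn probe,
                                      bool *used_swrast)
{
    assert(used_swrast != NULL);

    for (int pass = 0; pass < 2; pass++) {
        bool swrast = (pass == 1);
        for (size_t i = 0; i < n; i++) {
            if (fds[i] >= 0 && probe(fds[i], swrast)) {
                *used_swrast = swrast;
                return fds[i];
            }
        }
    }
    return -1;
}

/* Stub for illustration only: pretends no hardware driver exists and that
 * only fd 7 supports the software path. */
static bool example_stub_probe(int fd, bool swrast)
{
    return swrast && fd == 7;
}
```

With the stub probe and candidates { -1, 5, 7 }, the hardware pass fails for
every entry and the second pass picks fd 7 with *used_swrast set to true.
Whether the candidates may be card nodes is exactly the open question here.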

Rob


Re: [Mesa-dev] Testing patches 9987 and 9988

2016-07-19 Thread Marek Olšák
The first one just landed.

The second one will take some time I guess.

This branch contains both, but it won't be merged in this form:
https://cgit.freedesktop.org/~mareko/mesa/log/?h=si-mid-ib-gfx-fence

Marek

On Tue, Jul 19, 2016 at 3:53 PM, ⚛ <0xe2.0x9a.0...@gmail.com> wrote:
> Hello
>
> I would like to test http://patchwork.freedesktop.org/series/9987/ and
> http://patchwork.freedesktop.org/series/9988/ but the mbox patches
> aren't compatible with mesa-git.
>
> Would it be possible to update 9987 and 9988 to match mesa-git?
>
> Do 9987 and 9988 assume additional public patches that need to be
> applied prior to them?
>
> Thanks
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl/ast: don't allow subroutine uniform comparisons

2016-07-19 Thread Andres Gomez
On Tue, 2016-07-19 at 13:45 -0700, Ian Romanick wrote:
> On 07/19/2016 06:54 AM, Andres Gomez wrote:
...
> > So, what would be the conclusion? Do we allow subroutine variables 
> > comparison?
> 
> There is no conclusion yet.  I opened a Khronos gitlab tracker (right
> after Dave sent his original patch) for the CTS.  I'll try to get it on
> the conference call agenda for this week.

Thanks, Ian! ☺

-- 

Br,

Andres


Re: [Mesa-dev] [PATCH] glsl/ast: don't allow subroutine uniform comparisons

2016-07-19 Thread Ian Romanick
On 07/19/2016 06:54 AM, Andres Gomez wrote:
> Hi,
> 
> Just dropped:
> https://lists.freedesktop.org/archives/mesa-dev/2016-July/123485.html
> 
> I didn't realize there was already this thread open.
> 
> On Tue, 2016-06-07 at 09:59 -0700, Ian Romanick wrote:
>> On 06/06/2016 10:20 PM, Dave Airlie wrote:
>>> From: Dave Airlie 
>>>
>>> This fixes:
>>> GL45-CTS.shader_subroutine.subroutines_cannot_be_assigned_float_int_values_or_be_compared
>>>
>>> though I'm not 100% sure why this is illegal from the spec,
>>> but it makes us pass the test, and I really can't see a use case for this.
>>
>> I think the test is wrong.  Section 5.9 (Expressions) of the GLSL 4.5
>> spec clearly says:
>>
>> The equality operators equal (==), and not equal (!=) operate on
>> all types (except aggregates that contain opaque types).
> 
> In my opinion, the specs are somehow contradictory or not completely
> clear.
> 
> AFAIU, subroutine variables are to be used just in the way functions
> are called. Although the spec doesn't say it explicitly, this means
> that these variables are not to be used in any other way than those
> left for function calls. Therefore, a comparison between 2 subroutine
> variables should also cause a compilation error.
> 
> From The OpenGL® Shading Language 4.40, page 117:
> 
>   "  To use subroutines, a subroutine type is declared, one or more
>  functions are associated with that subroutine type, and a
>  subroutine variable of that type is declared. The function
>  currently assigned to the variable function is then called by
>  using function calling syntax replacing a function name with the
>  name of the subroutine variable. Subroutine variables are
>  uniforms, and are assigned to specific functions only through
>  commands (UniformSubroutinesuiv) in the OpenGL API."
> 
> From The OpenGL® Shading Language 4.40, page 118:
> 
>   "  Subroutine uniform variables are called the same way functions
>  are called. When a subroutine variable (or an element of a
>  subroutine variable array) is associated with a particular
>  function, all function calls through that variable will call that
>  particular function."
> 
>> As much as anyone would use subroutines, you could imagine this being
>> used like:
>>
>> value = foo(param1, param2);
>> if (foo != bar)
>> value += bar(param1, param2);
> 
> If that would be the case, and we agree that subroutines can be
> compared, then we have, at least, some other bug to correct.
> 
> I've made some piglit tests with the following scenarios:
>  * == comparison result:
> * foo and bar point to the same subroutine function -> false
> * foo and bar point to different subroutine functions -> false
>  * != comparison result:
> * foo and bar point to the same subroutine function -> false
> * foo and bar point to different subroutine functions -> false
> 
> So, what would be the conclusion? Do we allow subroutine variables comparison?

There is no conclusion yet.  I opened a Khronos gitlab tracker (right
after Dave sent his original patch) for the CTS.  I'll try to get it on
the conference call agenda for this week.

> FTR, I passed this patch through an "all" piglit run and through GL44 CTS and 
> it doesn't cause any regression.



Re: [Mesa-dev] [PATCH 7/7] i965: Delete the FS_OPCODE_INTERPOLATE_AT_CENTROID virtual opcode.

2016-07-19 Thread Jason Ekstrand
Patches 2, 3, 4, and 7 are

Reviewed-by: Jason Ekstrand 

I've left comments on a few others.

On Mon, Jul 18, 2016 at 10:30 PM, Jason Ekstrand 
wrote:

> On Jul 18, 2016 10:11 PM, "Chris Forbes"  wrote:
> >
> > I remember arguing about this when it got added -- tradeoff was payload
> size/register pressure vs needing to call out to this unit, if centroid
> barycentric coords weren't required for anything else? It does seem fairly
> pointless, though.
> >
> > For the series:-
> >
> > Reviewed-by: Chris Forbes 
>
> I'd like to chip in before you get too excited and push. I'll take a
> proper look tomorrow.
>
> > On Tue, Jul 19, 2016 at 8:26 AM, Kenneth Graunke 
> wrote:
> >>
> >> We no longer use this message.  As far as I can tell, it's fairly
> >> useless - the equivalent information is provided in the payload.
> >>
> >> Signed-off-by: Kenneth Graunke 
> >> ---
> >>  src/mesa/drivers/dri/i965/brw_defines.h| 1 -
> >>  src/mesa/drivers/dri/i965/brw_fs.cpp   | 2 --
> >>  src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 5 -
> >>  src/mesa/drivers/dri/i965/brw_shader.cpp   | 2 --
> >>  4 files changed, 10 deletions(-)
> >>
> >> diff --git a/src/mesa/drivers/dri/i965/brw_defines.h
> b/src/mesa/drivers/dri/i965/brw_defines.h
> >> index b5a259e..2814fa7 100644
> >> --- a/src/mesa/drivers/dri/i965/brw_defines.h
> >> +++ b/src/mesa/drivers/dri/i965/brw_defines.h
> >> @@ -1120,7 +1120,6 @@ enum opcode {
> >> FS_OPCODE_UNPACK_HALF_2x16_SPLIT_X,
> >> FS_OPCODE_UNPACK_HALF_2x16_SPLIT_Y,
> >> FS_OPCODE_PLACEHOLDER_HALT,
> >> -   FS_OPCODE_INTERPOLATE_AT_CENTROID,
> >> FS_OPCODE_INTERPOLATE_AT_SAMPLE,
> >> FS_OPCODE_INTERPOLATE_AT_SHARED_OFFSET,
> >> FS_OPCODE_INTERPOLATE_AT_PER_SLOT_OFFSET,
> >> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
> b/src/mesa/drivers/dri/i965/brw_fs.cpp
> >> index 06007fe..120d6dd 100644
> >> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> >> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> >> @@ -250,7 +250,6 @@ fs_inst::is_send_from_grf() const
> >> switch (opcode) {
> >> case FS_OPCODE_VARYING_PULL_CONSTANT_LOAD_GEN7:
> >> case SHADER_OPCODE_SHADER_TIME_ADD:
> >> -   case FS_OPCODE_INTERPOLATE_AT_CENTROID:
> >> case FS_OPCODE_INTERPOLATE_AT_SAMPLE:
> >> case FS_OPCODE_INTERPOLATE_AT_SHARED_OFFSET:
> >> case FS_OPCODE_INTERPOLATE_AT_PER_SLOT_OFFSET:
> >> @@ -4785,7 +4784,6 @@ get_lowered_simd_width(const struct
> brw_device_info *devinfo,
> >> case FS_OPCODE_PACK_HALF_2x16_SPLIT:
> >> case FS_OPCODE_UNPACK_HALF_2x16_SPLIT_X:
> >> case FS_OPCODE_UNPACK_HALF_2x16_SPLIT_Y:
> >> -   case FS_OPCODE_INTERPOLATE_AT_CENTROID:
> >> case FS_OPCODE_INTERPOLATE_AT_SAMPLE:
> >> case FS_OPCODE_INTERPOLATE_AT_SHARED_OFFSET:
> >> case FS_OPCODE_INTERPOLATE_AT_PER_SLOT_OFFSET:
> >> diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> >> index 1e9c7da..a390184 100644
> >> --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> >> +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> >> @@ -2054,11 +2054,6 @@ fs_generator::generate_code(const cfg_t *cfg,
> int dispatch_width)
> >>   }
> >>   break;
> >>
> >> -  case FS_OPCODE_INTERPOLATE_AT_CENTROID:
> >> - generate_pixel_interpolator_query(inst, dst, src[0], src[1],
> >> -
>  GEN7_PIXEL_INTERPOLATOR_LOC_CENTROID);
> >> - break;
> >> -
> >>case FS_OPCODE_INTERPOLATE_AT_SAMPLE:
> >>   generate_pixel_interpolator_query(inst, dst, src[0], src[1],
> >>
> GEN7_PIXEL_INTERPOLATOR_LOC_SAMPLE);
> >> diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp
> b/src/mesa/drivers/dri/i965/brw_shader.cpp
> >> index f3b5487..559e44c 100644
> >> --- a/src/mesa/drivers/dri/i965/brw_shader.cpp
> >> +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
> >> @@ -367,8 +367,6 @@ brw_instruction_name(const struct brw_device_info
> *devinfo, enum opcode op)
> >> case FS_OPCODE_PLACEHOLDER_HALT:
> >>return "placeholder_halt";
> >>
> >> -   case FS_OPCODE_INTERPOLATE_AT_CENTROID:
> >> -  return "interp_centroid";
> >> case FS_OPCODE_INTERPOLATE_AT_SAMPLE:
> >>return "interp_sample";
> >> case FS_OPCODE_INTERPOLATE_AT_SHARED_OFFSET:
> >> --
> >> 2.9.0
> >>
> >> ___
> >> mesa-dev mailing list
> >> mesa-dev@lists.freedesktop.org
> >> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> >
> >
> >
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> >
>


Re: [Mesa-dev] [PATCH 6/7] i965: Rewrite FS input handling to use the new NIR intrinsics.

2016-07-19 Thread Jason Ekstrand
On Mon, Jul 18, 2016 at 1:26 PM, Kenneth Graunke 
wrote:

> This eliminates the need to walk the list of input variables, recurse
> into their types (via logic largely redundant with nir_lower_io), and
> interpolate all possible inputs up front.  The backend no longer has
> to care about variables at all, which eliminates complications from
> trying to pack multiple variables into the same location.  Instead,
> each intrinsic specifies exactly what's needed.
>
> This should unblock Timothy's work on GL_ARB_enhanced_layouts.
>
> Each load_interpolated_input intrinsic corresponds to PLN instructions,
> while load_barycentric_at_* intrinsics correspond to pixel interpolator
> messages.  The pixel/centroid/sample barycentric intrinsics simply refer
> to payload fields (delta_xy[]), and don't actually generate any code.
>
> Because we use a single intrinsic for both centroid-qualified variables
> and interpolateAtCentroid(), they become indistinguishable.  We stop
> sending pixel interpolator messages for those, and instead use the
> payload provided data, which should be considerably faster.
>
> On Broadwell:
>
> total instructions in shared programs: 9067751 -> 9067570 (-0.00%)
> instructions in affected programs: 145902 -> 145721 (-0.12%)
> helped: 422
> HURT: 209
>
> total spills in shared programs: 2849 -> 2899 (1.76%)
> spills in affected programs: 760 -> 810 (6.58%)
> helped: 0
> HURT: 10
>
> total fills in shared programs: 3910 -> 3950 (1.02%)
> fills in affected programs: 617 -> 657 (6.48%)
> helped: 0
> HURT: 10
>
> LOST:   3
> GAINED: 3
>
> The differences mostly appear to be slight changes in MOVs.
>
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp | 175 -
>  src/mesa/drivers/dri/i965/brw_fs.h   |   9 +-
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 410
> ---
>  src/mesa/drivers/dri/i965/brw_nir.c  |  16 +-
>  4 files changed, 269 insertions(+), 341 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
> b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index 94127bc..06007fe 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -1067,21 +1067,27 @@ fs_visitor::emit_fragcoord_interpolation(fs_reg
> wpos)
> bld.MOV(wpos, this->wpos_w);
>  }
>
> -static enum brw_barycentric_mode
> -barycentric_mode(enum glsl_interp_mode mode,
> - bool is_centroid, bool is_sample)
> +enum brw_barycentric_mode
> +brw_barycentric_mode(enum glsl_interp_mode mode, nir_intrinsic_op op)
>  {
> -   unsigned bary;
> -
> /* Barycentric modes don't make sense for flat inputs. */
> assert(mode != INTERP_MODE_FLAT);
>
> -   if (is_sample) {
> -  bary = BRW_BARYCENTRIC_PERSPECTIVE_SAMPLE;
> -   } else if (is_centroid) {
> -  bary = BRW_BARYCENTRIC_PERSPECTIVE_CENTROID;
> -   } else {
> +   unsigned bary;
> +   switch (op) {
> +   case nir_intrinsic_load_barycentric_pixel:
> +   case nir_intrinsic_load_barycentric_at_offset:
>bary = BRW_BARYCENTRIC_PERSPECTIVE_PIXEL;
> +  break;
> +   case nir_intrinsic_load_barycentric_centroid:
> +  bary = BRW_BARYCENTRIC_PERSPECTIVE_CENTROID;
> +  break;
> +   case nir_intrinsic_load_barycentric_sample:
> +   case nir_intrinsic_load_barycentric_at_sample:
> +  bary = BRW_BARYCENTRIC_PERSPECTIVE_SAMPLE;
> +  break;
> +   default:
> +  assert(!"invalid intrinsic");
> }
>
> if (mode == INTERP_MODE_NOPERSPECTIVE)
> @@ -1101,107 +1107,6 @@ centroid_to_pixel(enum brw_barycentric_mode bary)
> return (enum brw_barycentric_mode) ((unsigned) bary - 1);
>  }
>
> -void
> -fs_visitor::emit_general_interpolation(fs_reg *attr, const char *name,
> -   const glsl_type *type,
> -   glsl_interp_mode
> interpolation_mode,
> -   int *location, bool mod_centroid,
> -   bool mod_sample)
> -{
> -   assert(stage == MESA_SHADER_FRAGMENT);
> -   brw_wm_prog_data *prog_data = (brw_wm_prog_data*) this->prog_data;
> -
> -   if (type->is_array() || type->is_matrix()) {
> -  const glsl_type *elem_type = glsl_get_array_element(type);
> -  const unsigned length = glsl_get_length(type);
> -
> -  for (unsigned i = 0; i < length; i++) {
> - emit_general_interpolation(attr, name, elem_type,
> interpolation_mode,
> -location, mod_centroid, mod_sample);
> -  }
> -   } else if (type->is_record()) {
> -  for (unsigned i = 0; i < type->length; i++) {
> - const glsl_type *field_type = type->fields.structure[i].type;
> - emit_general_interpolation(attr, name, field_type,
> interpolation_mode,
> -location, mod_centroid, mod_sample);
> -  }
> -   } else {
> -  assert(type->is_scalar() || type->is_vector());
> -
> -  if 

Re: [Mesa-dev] [PATCH 4/4] radeonsi: emit PS exports last

2016-07-19 Thread Nicolai Hähnle

On 19.07.2016 18:32, Marek Olšák wrote:

On Tue, Jul 19, 2016 at 3:43 PM, Nicolai Hähnle  wrote:

Patches 1, 3 & 4 are

Reviewed-by: Nicolai Hähnle 


Why not patch 2?


That was me being thoroughly confused today. 2 is 3 to a nearest 
approximation or something like that...


Anyway, series is

Reviewed-by: Nicolai Hähnle 



Marek


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] introducing radv - proof of concept vulkan driver for AMD VI chipsets

2016-07-19 Thread Jason Ekstrand
Good work guys!  Welcome to the Open-source Vulkan club :)

On Tue, Jul 19, 2016 at 12:59 PM, Dave Airlie  wrote:

> I was waiting for an open source driver to appear when I realised I
> should really just write one myself, some talking with Bas later, and
> we decided to see where we could get.
>
> This is the point at which we were willing to show it to others, it's
> not really a vulkan driver yet, so far it's a vulkan triangle demos
> driver.
>
> It renders the tri and cube demos from the vulkan loader,
> and the triangle demo from Sascha Willems demos
> and the Vulkan CTS smoke tests (all 4 of them one of which draws a
> triangle).
>
> There is a lot of work to do, and it's at the stage where we are
> seeing if anyone else wants to join in at the start, before we make
> too many serious design decisions or take a path we really don't want
> to.
>
> So far it's only been run on Tonga and Fiji chips I think, we are
> hoping to support radeon kernel driver for SI/CIK at some point, but I
> think we need to get things a bit further on VI chips first.
>
> The code is currently here:
> https://github.com/airlied/mesa/tree/semi-interesting
>
> There is a not-interesting branch which contains all the pre-history
> which might be useful for someone else bringing up a vulkan driver on
> other hardware.
>
> The code is pretty much based on the Intel anv driver, with the winsys
> ported from gallium driver,
> and most of the state setup from there. Bas wrote the code to connect
> NIR<->LLVM IR so we could reuse it in the future for SPIR-V in GL if
> required. It also copies AMD addrlib over, (this should be shared).
>
> Also we don't do SPIR-V->LLVM direct. We use NIR as it has the best
> chance for inter shader stage optimisations (vertex/fragment combined)
> which neither SPIR-V or LLVM handles for us, (nir doesn't do it yet
> but it can).
>
> If you want to submit bug reports, they will only be taken seriously
> if accompanied by working patches at this stage, and we've no plans to
> merge to master yet, but open to discussion on when we could do that
> and what would be required.
>
> Dave.


[Mesa-dev] [PATCH v1 3/8] gm107/ir: lower surface operations

2016-07-19 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset 
---
 .../nouveau/codegen/nv50_ir_lowering_nvc0.cpp  | 76 +-
 .../nouveau/codegen/nv50_ir_lowering_nvc0.h|  2 +
 2 files changed, 77 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
index 92bc0bb..2604296 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
@@ -2112,6 +2112,78 @@ NVC0LoweringPass::handleSurfaceOpNVC0(TexInstruction *su)
}
 }
 
+void
+NVC0LoweringPass::processSurfaceCoordsGM107(TexInstruction *su)
+{
+   const int slot = su->tex.r;
+   const int dim = su->tex.target.getDim();
+   const int arg = dim + (su->tex.target.isArray() || su->tex.target.isCube());
+   Value *ind = su->getIndirectR();
+   int pos = 0;
+
+   bld.setPosition(su, false);
+
+   // add texture handle
+   switch (su->op) {
+   case OP_SUSTP:
+  pos = 4;
+  break;
+   case OP_SUREDP:
+  pos = (su->subOp == NV50_IR_SUBOP_ATOM_CAS) ? 2 : 1;
+  break;
+   default:
+  assert(pos == 0);
+  break;
+   }
+   su->setSrc(arg + pos, loadTexHandle(ind, slot + 32));
+
+   // prevent read fault when the image is not actually bound
+   CmpInstruction *pred =
+  bld.mkCmp(OP_SET, CC_EQ, TYPE_U32, bld.getSSA(1, FILE_PREDICATE),
+TYPE_U32, bld.mkImm(0),
+loadSuInfo32(ind, slot, NVC0_SU_INFO_ADDR));
+   if (su->op != OP_SUSTP && su->tex.format) {
+  const TexInstruction::ImgFormatDesc *format = su->tex.format;
+  int blockwidth = format->bits[0] + format->bits[1] +
+   format->bits[2] + format->bits[3];
+
+  assert(format->components != 0);
+  // make sure that the format doesn't mismatch when it's not FMT_NONE
+  bld.mkCmp(OP_SET_OR, CC_NE, TYPE_U32, pred->getDef(0),
+TYPE_U32, bld.loadImm(NULL, blockwidth / 8),
+loadSuInfo32(ind, slot, NVC0_SU_INFO_BSIZE),
+pred->getDef(0));
+   }
+   su->setPredicate(CC_NOT_P, pred->getDef(0));
+}
+
+void
+NVC0LoweringPass::handleSurfaceOpGM107(TexInstruction *su)
+{
+   processSurfaceCoordsGM107(su);
+
+   if (su->op == OP_SULDP)
+  convertSurfaceFormat(su);
+
+   if (su->op == OP_SUREDP) {
+  Value *def = su->getDef(0);
+
+  su->op = OP_SUREDB;
+  su->setDef(0, bld.getSSA());
+
+  bld.setPosition(su, true);
+
+  // make sure to initialize dst value when the atomic operation is not
+  // performed
+  Instruction *mov = bld.mkMov(bld.getSSA(), bld.loadImm(NULL, 0));
+
+  assert(su->cc == CC_NOT_P);
+  mov->setPredicate(CC_P, su->getPredicate());
+
+  bld.mkOp2(OP_UNION, TYPE_U32, def, su->getDef(0), mov->getDef(0));
+   }
+}
+
 bool
 NVC0LoweringPass::handleWRSV(Instruction *i)
 {
@@ -2604,7 +2676,9 @@ NVC0LoweringPass::visit(Instruction *i)
case OP_SUSTP:
case OP_SUREDB:
case OP_SUREDP:
-  if (targ->getChipset() >= NVISA_GK104_CHIPSET)
+  if (targ->getChipset() >= NVISA_GM107_CHIPSET)
+ handleSurfaceOpGM107(i->asTex());
+  else if (targ->getChipset() >= NVISA_GK104_CHIPSET)
  handleSurfaceOpNVE4(i->asTex());
   else
  handleSurfaceOpNVC0(i->asTex());
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.h 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.h
index 4d7d8cc..104bc03 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.h
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.h
@@ -106,6 +106,7 @@ protected:
bool handleSUQ(TexInstruction *);
bool handleATOM(Instruction *);
bool handleCasExch(Instruction *, bool needCctl);
+   void handleSurfaceOpGM107(TexInstruction *);
void handleSurfaceOpNVE4(TexInstruction *);
void handleSurfaceOpNVC0(TexInstruction *);
void handleSharedATOM(Instruction *);
@@ -135,6 +136,7 @@ private:
Value *loadTexHandle(Value *ptr, unsigned int slot);
 
void adjustCoordinatesMS(TexInstruction *);
+   void processSurfaceCoordsGM107(TexInstruction *);
void processSurfaceCoordsNVE4(TexInstruction *);
void processSurfaceCoordsNVC0(TexInstruction *);
void convertSurfaceFormat(TexInstruction *);
-- 
2.9.0



[Mesa-dev] [PATCH v1 2/8] nvc0: bind images for 3d/cp shaders on GM107+

2016-07-19 Thread Samuel Pitoiset
On Maxwell, image binding is slightly different from (and much better
than) Fermi and Kepler: a texture view needs to be uploaded for each
image, which simplifies things a lot.

v1:
 - use nvc0_create_texture_view()
 - always set NV50_TEXVIEW_SCALED_COORDS
 - create texture views at bind time

Signed-off-by: Samuel Pitoiset 
---
 src/gallium/drivers/nouveau/nvc0/nvc0_context.c |   5 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_context.h |   4 +
 src/gallium/drivers/nouveau/nvc0/nvc0_state.c   |  23 -
 src/gallium/drivers/nouveau/nvc0/nvc0_tex.c | 108 ++--
 src/gallium/drivers/nouveau/nvc0/nve4_compute.c |  85 +--
 5 files changed, 207 insertions(+), 18 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_context.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_context.c
index 1137e6c..4bd240b 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_context.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_context.c
@@ -161,8 +161,11 @@ nvc0_context_unreference_resources(struct nvc0_context 
*nvc0)
   for (i = 0; i < NVC0_MAX_BUFFERS; ++i)
  pipe_resource_reference(&nvc0->buffers[s][i].buffer, NULL);
 
-  for (i = 0; i < NVC0_MAX_IMAGES; ++i)
+  for (i = 0; i < NVC0_MAX_IMAGES; ++i) {
  pipe_resource_reference(&nvc0->images[s][i].resource, NULL);
+ if (nvc0->screen->base.class_3d >= GM107_3D_CLASS)
+pipe_sampler_view_reference(&nvc0->images_tic[s][i], NULL);
+  }
}
 
for (s = 0; s < 2; ++s) {
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_context.h 
b/src/gallium/drivers/nouveau/nvc0/nvc0_context.h
index 4b73ec3..6890f57 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_context.h
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_context.h
@@ -246,6 +246,7 @@ struct nvc0_context {
uint32_t buffers_valid[6];
 
struct pipe_image_view images[6][NVC0_MAX_IMAGES];
+   struct pipe_sampler_view *images_tic[6][NVC0_MAX_IMAGES]; /* GM107+ */
uint16_t images_dirty[6];
uint16_t images_valid[6];
 
@@ -349,6 +350,9 @@ struct pipe_sampler_view *
 nvc0_create_sampler_view(struct pipe_context *,
  struct pipe_resource *,
  const struct pipe_sampler_view *);
+struct pipe_sampler_view *
+gm107_create_texture_view_from_image(struct pipe_context *,
+ const struct pipe_image_view *);
 
 /* nvc0_transfer.c */
 void
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
index 441cfc9..fcb695a 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
@@ -1296,6 +1296,19 @@ nvc0_bind_images_range(struct nvc0_context *nvc0, const 
unsigned s,
 
  pipe_resource_reference(
>resource, pimages[p].resource);
+
+ if (nvc0->screen->base.class_3d >= GM107_3D_CLASS) {
+if (nvc0->images_tic[s][i]) {
+   struct nv50_tic_entry *old =
+  nv50_tic_entry(nvc0->images_tic[s][i]);
+   nvc0_screen_tic_unlock(nvc0->screen, old);
+   pipe_sampler_view_reference(&nvc0->images_tic[s][i], NULL);
+}
+
+nvc0->images_tic[s][i] =
+   gm107_create_texture_view_from_image(&nvc0->base.pipe,
+&pimages[p]);
+ }
   }
   if (!mask)
  return false;
@@ -1303,8 +1316,16 @@ nvc0_bind_images_range(struct nvc0_context *nvc0, const 
unsigned s,
   mask = ((1 << nr) - 1) << start;
   if (!(nvc0->images_valid[s] & mask))
  return false;
-  for (i = start; i < end; ++i)
+  for (i = start; i < end; ++i) {
  pipe_resource_reference(&nvc0->images[s][i].resource, NULL);
+ if (nvc0->screen->base.class_3d >= GM107_3D_CLASS) {
+struct nv50_tic_entry *old = 
nv50_tic_entry(nvc0->images_tic[s][i]);
+if (old) {
+   nvc0_screen_tic_unlock(nvc0->screen, old);
+   pipe_sampler_view_reference(&nvc0->images_tic[s][i], NULL);
+}
+ }
+  }
   nvc0->images_valid[s] &= ~mask;
}
nvc0->images_dirty[s] |= mask;
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_tex.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_tex.c
index 71c1b84..8abf1b5 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_tex.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_tex.c
@@ -236,6 +236,42 @@ gm107_create_texture_view(struct pipe_context *pipe,
return >pipe;
 }
 
+struct pipe_sampler_view *
+gm107_create_texture_view_from_image(struct pipe_context *pipe,
+ const struct pipe_image_view *view)
+{
+   struct nv04_resource *res = nv04_resource(view->resource);
+   struct pipe_sampler_view templ = {};
+   enum pipe_texture_target target;
+   uint32_t flags = 0;
+
+   if (!res)
+  return NULL;
+   target = res->base.target;
+
+   if (target == PIPE_TEXTURE_CUBE || target 

[Mesa-dev] [PATCH v1 6/8] gm107/ir: add emission for SUREDx

2016-07-19 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset 
---
 .../drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 50 ++
 1 file changed, 50 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
index 7d77ca3..791273f 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
@@ -205,6 +205,7 @@ private:
void emitSUHandle(const int s);
void emitSUSTx();
void emitSULDx();
+   void emitSUREDx();
 };
 
 
/***
@@ -2913,6 +2914,51 @@ CodeEmitterGM107::emitSULDx()
 
emitSUHandle(1);
 }
+
+void
+CodeEmitterGM107::emitSUREDx()
+{
+   const TexInstruction *insn = this->insn->asTex();
+   uint8_t type = 0, subOp;
+
+   if (insn->subOp == NV50_IR_SUBOP_ATOM_CAS)
+  emitInsn(0xeac0);
+   else
+  emitInsn(0xea60);
+
+   if (insn->op == OP_SUREDB)
+  emitField(0x34, 1, 1);
+   emitSUTarget();
+
+   // destination type
+   switch (insn->dType) {
+   case TYPE_S32: type = 1; break;
+   case TYPE_U64: type = 2; break;
+   case TYPE_F32: type = 3; break;
+   case TYPE_S64: type = 5; break;
+   default:
+  assert(insn->dType == TYPE_U32);
+  break;
+   }
+
+   // atomic operation
+   if (insn->subOp == NV50_IR_SUBOP_ATOM_CAS) {
+  subOp = 0;
+   } else if (insn->subOp == NV50_IR_SUBOP_ATOM_EXCH) {
+  subOp = 8;
+   } else {
+  subOp = insn->subOp;
+   }
+
+   emitField(0x24, 3, type);
+   emitField(0x1d, 4, subOp);
+   emitGPR  (0x14, insn->src(1));
+   emitGPR  (0x08, insn->src(0));
+   emitGPR  (0x00, insn->def(0));
+
+   emitSUHandle(2);
+}
+
 
/***
  * assembler front-end
  
**/
@@ -3235,6 +3281,10 @@ CodeEmitterGM107::emitInstruction(Instruction *i)
case OP_SULDP:
   emitSULDx();
   break;
+   case OP_SUREDB:
+   case OP_SUREDP:
+  emitSUREDx();
+  break;
default:
   assert(!"invalid opcode");
   emitNOP();
-- 
2.9.0



[Mesa-dev] [PATCH v1 1/8] nvc0: increase the tex handles area size in the driver cb

2016-07-19 Thread Samuel Pitoiset
Currently, we can store 32 texture handles of one 32-bit integer each,
which fits the underlying hardware perfectly except on GM107+, which
requires uploading a texture view for each image.

This patch increases the number of storable texture handles in the
driver constant buffer from 32 to 40 because we expose 8 images.
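
As a rough cross-check of the offsets touched below, the shift can be
recomputed from the handle count. This is only a sketch: TEX_INFO_BASE,
area_end and OLD_OFFSETS are names made up for illustration, and the
numeric values are copied from the defines in this patch.

```python
# Sketch only: recompute where the areas following the tex handle area start.
# TEX_INFO_BASE, area_end and OLD_OFFSETS are illustrative names, not driver
# symbols; the values mirror the defines changed by this patch.
TEX_INFO_BASE = 0x020   # start of the tex handle area, NVC0_CB_AUX_TEX_INFO(0)
HANDLE_SIZE = 4         # one 32-bit integer per handle


def area_end(num_handles):
    """Offset of the first byte after the tex handle area."""
    return TEX_INFO_BASE + num_handles * HANDLE_SIZE


# Offsets of the later areas before this patch, as listed in the removed lines.
OLD_OFFSETS = {
    "MS_INFO": 0x0a0,
    "GRID_INFO": 0x0e0,
    "UCP_INFO": 0x100,
    "SAMPLE_INFO": 0x180,
    "BUF_INFO": 0x200,
    "SU_INFO": 0x400,
    "MP_INFO": 0x600,
}

# Growing the area from 32 to 40 handles pushes every later area by the same
# amount.
shift = area_end(40) - area_end(32)
new_offsets = {name: off + shift for name, off in OLD_OFFSETS.items()}

for name, off in new_offsets.items():
    print(f"{name}: {OLD_OFFSETS[name]:#05x} -> {off:#05x}")
```

Eight extra handles of 4 bytes each account for the 0x20 shift applied to
every define after the tex handle area.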

Signed-off-by: Samuel Pitoiset 
---
 src/gallium/drivers/nouveau/nvc0/nvc0_context.h | 22 +++---
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_context.h 
b/src/gallium/drivers/nouveau/nvc0/nvc0_context.h
index 7acd477..4b73ec3 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_context.h
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_context.h
@@ -108,34 +108,34 @@
 /* XXX: Figure out what this UNK data is. */
 #define NVC0_CB_AUX_UNK_INFO0x000
 #define NVC0_CB_AUX_UNK_SIZE(8 * 4)
-/* 32 textures handles, at 1 32-bits integer each */
+/* 40 textures handles (8 for GM107+ images only), at 1 32-bits integer each */
 #define NVC0_CB_AUX_TEX_INFO(i) 0x020 + (i) * 4
-#define NVC0_CB_AUX_TEX_SIZE(32 * 4)
+#define NVC0_CB_AUX_TEX_SIZE(40 * 4)
 /* 8 sets of 32-bits coordinate offsets */
-#define NVC0_CB_AUX_MS_INFO 0x0a0
+#define NVC0_CB_AUX_MS_INFO 0x0c0
 #define NVC0_CB_AUX_MS_SIZE (8 * 2 * 4)
 /* block/grid size, at 3 32-bits integers each, gridid and work_dim */
-#define NVC0_CB_AUX_GRID_INFO(i)0x0e0 + (i) * 4 /* CP */
+#define NVC0_CB_AUX_GRID_INFO(i)0x100 + (i) * 4 /* CP */
 #define NVC0_CB_AUX_GRID_SIZE   (8 * 4)
 /* 8 user clip planes, at 4 32-bits floats each */
-#define NVC0_CB_AUX_UCP_INFO0x100
+#define NVC0_CB_AUX_UCP_INFO0x120
 #define NVC0_CB_AUX_UCP_SIZE(PIPE_MAX_CLIP_PLANES * 4 * 4)
 /* 13 ubos, at 4 32-bits integer each */
-#define NVC0_CB_AUX_UBO_INFO(i) 0x100 + (i) * 4 * 4 /* CP */
+#define NVC0_CB_AUX_UBO_INFO(i) 0x120 + (i) * 4 * 4 /* CP */
 #define NVC0_CB_AUX_UBO_SIZE((NVC0_MAX_PIPE_CONSTBUFS - 1) * 4 * 4)
 /* 8 sets of 32-bits integer pairs sample offsets */
-#define NVC0_CB_AUX_SAMPLE_INFO 0x180 /* FP */
+#define NVC0_CB_AUX_SAMPLE_INFO 0x1a0 /* FP */
 #define NVC0_CB_AUX_SAMPLE_SIZE (8 * 4 * 2)
 /* draw parameters (index bais, base instance, drawid) */
-#define NVC0_CB_AUX_DRAW_INFO   0x180 /* VP */
+#define NVC0_CB_AUX_DRAW_INFO   0x1a0 /* VP */
 /* 32 user buffers, at 4 32-bits integers each */
-#define NVC0_CB_AUX_BUF_INFO(i) 0x200 + (i) * 4 * 4
+#define NVC0_CB_AUX_BUF_INFO(i) 0x220 + (i) * 4 * 4
 #define NVC0_CB_AUX_BUF_SIZE(NVC0_MAX_BUFFERS * 4 * 4)
 /* 8 surfaces, at 16 32-bits integers each */
-#define NVC0_CB_AUX_SU_INFO(i)  0x400 + (i) * 16 * 4
+#define NVC0_CB_AUX_SU_INFO(i)  0x420 + (i) * 16 * 4
 #define NVC0_CB_AUX_SU_SIZE (NVC0_MAX_IMAGES * 16 * 4)
 /* 1 64-bits address and 1 32-bits sequence */
-#define NVC0_CB_AUX_MP_INFO 0x600
+#define NVC0_CB_AUX_MP_INFO 0x620
 #define NVC0_CB_AUX_MP_SIZE 3 * 4
 /* 4 32-bits floats for the vertex runout, put at the end */
 #define NVC0_CB_AUX_RUNOUT_INFO NVC0_CB_USR_SIZE + (NVC0_CB_AUX_SIZE * 6)
-- 
2.9.0



[Mesa-dev] [PATCH v1 8/8] nvc0: disable MS images on GM107+

2016-07-19 Thread Samuel Pitoiset
MS images have to be handled explicitly and I don't plan to implement
them for now.

v1: - check that sample_count > 1
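
The early-out added here can be modelled roughly as follows. This is a
sketch, not the driver code: the constant values are placeholders chosen
for illustration (the real ones live in the gallium and nouveau headers),
OLDER_3D_CLASS and rejects_ms_image are hypothetical names, and the real
function goes on to consult the format tables when the check does not fire.

```python
# Placeholder values for illustration only; they are NOT the real header values.
PIPE_BIND_SHADER_IMAGE = 1 << 0
PIPE_BIND_SAMPLER_VIEW = 1 << 1
OLDER_3D_CLASS = 1      # hypothetical stand-in for any pre-Maxwell 3D class
GM107_3D_CLASS = 2      # only the ordering relative to OLDER_3D_CLASS matters


def rejects_ms_image(bindings, sample_count, class_3d):
    """True when the format query should fail early for an MS shader image."""
    wants_image = bool(bindings & PIPE_BIND_SHADER_IMAGE)
    is_multisampled = sample_count > 1
    is_maxwell_or_newer = class_3d >= GM107_3D_CLASS
    return wants_image and is_multisampled and is_maxwell_or_newer


print(rejects_ms_image(PIPE_BIND_SHADER_IMAGE, 4, GM107_3D_CLASS))
print(rejects_ms_image(PIPE_BIND_SHADER_IMAGE, 1, GM107_3D_CLASS))
```

The sample_count > 1 condition is what keeps single-sample images usable on
GM107+.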

Signed-off-by: Samuel Pitoiset 
---
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
index f681631..a3cd046 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
@@ -90,6 +90,13 @@ nvc0_screen_is_format_supported(struct pipe_screen *pscreen,
  PIPE_BIND_LINEAR |
  PIPE_BIND_SHARED);
 
+   if (bindings & PIPE_BIND_SHADER_IMAGE && sample_count > 1 &&
+   nouveau_screen(pscreen)->class_3d >= GM107_3D_CLASS) {
+  /* MS images are currently unsupported on Maxwell because they have to
+   * be handled explicitly. */
+  return false;
+   }
+
return (( nvc0_format_table[format].usage |
 nvc0_vertex_format[format].usage) & bindings) == bindings;
 }
-- 
2.9.0



[Mesa-dev] [PATCH v1 4/8] gm107/ra: fix constraints for surface operations

2016-07-19 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp | 25 --
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
index 63fe9c0..2d3486b 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_ra.cpp
@@ -2093,8 +2093,29 @@ 
RegAlloc::InsertConstraintsPass::texConstraintGM107(TexInstruction *tex)
   textureMask(tex);
condenseDefs(tex);
 
-   if (tex->op == OP_SUSTB || tex->op == OP_SUSTP) {
-  condenseSrcs(tex, 3, (3 + typeSizeof(tex->dType) / 4) - 1);
+   if (isSurfaceOp(tex->op)) {
+  int s = tex->tex.target.getDim() +
+ (tex->tex.target.isArray() || tex->tex.target.isCube());
+  int n = 0;
+
+  switch (tex->op) {
+  case OP_SUSTB:
+  case OP_SUSTP:
+ n = 4;
+ break;
+  case OP_SUREDB:
+  case OP_SUREDP:
+ if (tex->subOp == NV50_IR_SUBOP_ATOM_CAS)
+n = 2;
+ break;
+  default:
+ break;
+  }
+
+  if (s > 1)
+ condenseSrcs(tex, 0, s - 1);
+  if (n > 1)
+ condenseSrcs(tex, 1, n); // do not condense the tex handle
} else
if (isTextureOp(tex->op)) {
   if (tex->op != OP_TXQ) {
-- 
2.9.0



[Mesa-dev] [PATCH v1 7/8] nv50/ir: print OP_SUREDB subops in debug mode

2016-07-19 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp
index ae0dd78..22f2f5d 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp
@@ -570,6 +570,7 @@ void Instruction::print() const
  PRINT("%s ", interpStr[ipa]);
   switch (op) {
   case OP_SUREDP:
+  case OP_SUREDB:
   case OP_ATOM:
  if (subOp < ARRAY_SIZE(atomSubOpStr))
 PRINT("%s ", atomSubOpStr[subOp]);
-- 
2.9.0



[Mesa-dev] [PATCH v1 5/8] gm107/ir: add emission for SUSTx and SULDx

2016-07-19 Thread Samuel Pitoiset
v1: - remove one occurrence of TEX_TARGET_CUBE_ARRAY

Signed-off-by: Samuel Pitoiset 
---
 .../drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 104 +
 1 file changed, 104 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
index 6904eba..7d77ca3 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp
@@ -200,6 +200,11 @@ private:
void emitMEMBAR();
 
void emitVOTE();
+
+   void emitSUTarget();
+   void emitSUHandle(const int s);
+   void emitSUSTx();
+   void emitSULDx();
 };
 
 
/***
@@ -2817,6 +2822,97 @@ CodeEmitterGM107::emitVOTE()
emitPRED (0x27, insn->src(0));
 }
 
+void
+CodeEmitterGM107::emitSUTarget()
+{
+   const TexInstruction *insn = this->insn->asTex();
+   int target = 0;
+
+   assert(insn->op >= OP_SULDB && insn->op <= OP_SUREDP);
+
+   if (insn->tex.target == TEX_TARGET_BUFFER) {
+  target = 2;
+   } else if (insn->tex.target == TEX_TARGET_1D_ARRAY) {
+  target = 4;
+   } else if (insn->tex.target == TEX_TARGET_2D ||
+  insn->tex.target == TEX_TARGET_RECT) {
+  target = 6;
+   } else if (insn->tex.target == TEX_TARGET_2D_ARRAY ||
+  insn->tex.target == TEX_TARGET_CUBE ||
+  insn->tex.target == TEX_TARGET_CUBE_ARRAY) {
+  target = 8;
+   } else if (insn->tex.target == TEX_TARGET_3D) {
+  target = 10;
+   } else {
+  assert(insn->tex.target == TEX_TARGET_1D);
+   }
+   emitField(0x20, 4, target);
+}
+
+void
+CodeEmitterGM107::emitSUHandle(const int s)
+{
+   const TexInstruction *insn = this->insn->asTex();
+
+   assert(insn->op >= OP_SULDB && insn->op <= OP_SUREDP);
+
+   if (insn->src(s).getFile() == FILE_GPR) {
+  emitGPR(0x27, insn->src(s));
+   } else {
+  ImmediateValue *imm = insn->getSrc(s)->asImm();
+  assert(imm);
+  emitField(0x33, 1, 1);
+  emitField(0x24, 13, imm->reg.data.u32);
+   }
+}
+
+void
+CodeEmitterGM107::emitSUSTx()
+{
+   const TexInstruction *insn = this->insn->asTex();
+
+   emitInsn(0xeb20);
+   if (insn->op == OP_SUSTB)
+  emitField(0x34, 1, 1);
+   emitSUTarget();
+
+   emitLDSTc(0x18);
+   emitField(0x14, 4, 0xf); // rgba
+   emitGPR  (0x08, insn->src(0));
+   emitGPR  (0x00, insn->src(1));
+
+   emitSUHandle(2);
+}
+
+void
+CodeEmitterGM107::emitSULDx()
+{
+   const TexInstruction *insn = this->insn->asTex();
+   int type = 0;
+
+   emitInsn(0xeb00);
+   if (insn->op == OP_SULDB)
+  emitField(0x34, 1, 1);
+   emitSUTarget();
+
+   switch (insn->dType) {
+   case TYPE_S8:   type = 1; break;
+   case TYPE_U16:  type = 2; break;
+   case TYPE_S16:  type = 3; break;
+   case TYPE_U32:  type = 4; break;
+   case TYPE_U64:  type = 5; break;
+   case TYPE_B128: type = 6; break;
+   default:
+  assert(insn->dType == TYPE_U8);
+  break;
+   }
+   emitLDSTc(0x18);
+   emitField(0x14, 3, type);
+   emitGPR  (0x00, insn->def(0));
+   emitGPR  (0x08, insn->src(0));
+
+   emitSUHandle(1);
+}
 
/***
  * assembler front-end
  
**/
@@ -3131,6 +3227,14 @@ CodeEmitterGM107::emitInstruction(Instruction *i)
case OP_VOTE:
   emitVOTE();
   break;
+   case OP_SUSTB:
+   case OP_SUSTP:
+  emitSUSTx();
+  break;
+   case OP_SULDB:
+   case OP_SULDP:
+  emitSULDx();
+  break;
default:
   assert(!"invalid opcode");
   emitNOP();
-- 
2.9.0



[Mesa-dev] [PATCH 06/12] glsl: Instead of walking the name pointer in get_intrinsic_opcode, advance an offset

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

I think this makes the next patch more clear.

   text    data     bss     dec     hex filename
7528883  273096   28584 7830563  777c23 /tmp/i965_dri-64bit-before.so
7528835  273096   28584 7830515  777bf3 /tmp/i965_dri-64bit-after.so
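
To illustrate the difference in plain Python rather than in the generated
C, here is a small sketch of both styles for a single name. The function
names are made up for illustration, and only one intrinsic is handled; the
point is that the offset variant never loses the full name, so it stays
available for things like the assertions this patch emits.

```python
PREFIX = "__intrinsic_"


def match_by_walking(name):
    """Old style: consume each matched piece, then compare what is left."""
    assert name.startswith(PREFIX)
    name = name[len(PREFIX):]          # the original name is gone from here on
    if name.startswith("image_"):
        name = name[len("image_"):]
        return name == "load"
    return False


def match_by_offset(name):
    """New style: leave the name alone and advance an offset into it."""
    assert name.startswith(PREFIX)
    offset = len(PREFIX)
    if name.startswith("image_", offset):
        offset += len("image_")
        # The full name can still be checked against the expected string.
        return name[offset:] == "load" and name == PREFIX + "image_load"
    return False


for candidate in ("__intrinsic_image_load", "__intrinsic_image_store"):
    print(candidate, match_by_walking(candidate), match_by_offset(candidate))
```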

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/nir_intrinsic_map.py | 22 +-
 1 file changed, 9 insertions(+), 13 deletions(-)

diff --git a/src/compiler/glsl/nir_intrinsic_map.py 
b/src/compiler/glsl/nir_intrinsic_map.py
index 337f1e9..c2b78a0 100644
--- a/src/compiler/glsl/nir_intrinsic_map.py
+++ b/src/compiler/glsl/nir_intrinsic_map.py
@@ -140,32 +140,31 @@ def emit_trie_leaf(indent, d):
 
 
 def trie_as_C_code(trie, indent="   ", prefix_string="__intrinsic_"):
+offset = len(prefix_string)
 conditional = "if"
 
 c_code = ""
 for (s, t, d) in trie:
 if d is not None:
-c_code +=  "{}{} (name[0] == '\\0') {{\n".format(indent, 
conditional)
-c_code += "{}   /* {} */\n".format(indent, prefix_string)
+c_code +=  "{}{} (name[{}] == '\\0') {{\n".format(indent, 
conditional, offset)
+c_code += "{}   assert(strcmp(name, \"{}\") == 
0);\n".format(indent, prefix_string)
 c_code += emit_trie_leaf(indent + "   ", d);
 
 else:
 # Before emitting the string comparison, check to see of the
 # subtree has a single element with an empty string.  In that
-# case, use strcmp() instead of strncmp() and don't advance the
-# name pointer.
+# case, use strcmp() instead of strncmp().
 
 if len(t) == 1 and t[0][2] is not None:
 if s == "":
-c_code += "{}{} (name[0] == '\\0') {{\n".format(indent, 
conditional, s)
+c_code += "{}{} (name[{}] == '\\0') {{\n".format(indent, 
conditional, offset)
 else:
-c_code += "{}{} (strcmp(name, \"{}\") == 0) 
{{\n".format(indent, conditional, s)
+c_code += "{}{} (strcmp(name + {}, \"{}\") == 0) 
{{\n".format(indent, conditional, offset, s)
 
-c_code += "{}   /* {} */\n".format(indent, prefix_string + s)
+c_code += "{}   assert(strcmp(name, \"{}\") == 
0);\n".format(indent, prefix_string + s)
 c_code += emit_trie_leaf(indent + "   ", t[0][2]);
 else:
-c_code += "{}{} (strncmp(name, \"{}\", {}) == 0) 
{{\n".format(indent, conditional, s, len(s))
-c_code += "{}   name += {};\n\n".format(indent, len(s))
+c_code += "{}{} (strncmp(name + {}, \"{}\", {}) == 0) 
{{\n".format(indent, conditional, offset, s, len(s))
 
 c_code += trie_as_C_code(t, indent + "   ", prefix_string + s)
 
@@ -182,10 +181,7 @@ namespace _glsl_to_nir {
 nir_intrinsic_op
 get_intrinsic_opcode(const char *name, const ir_dereference *return_deref)
 {
-   if (strncmp(name, "__intrinsic_", 12) == 0)
-  name += 12;
-   else
-  unreachable("Intrinsic name does not begin with '__intrinsic_'");
+   assert(strncmp(name, "__intrinsic_", 12) == 0);
 
nir_intrinsic_op int_op;
nir_intrinsic_op uint_op;
-- 
2.5.5



[Mesa-dev] [PATCH 01/12] glsl: Refactor code to find NIR opcode for an intrinsic and add unit test

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

The next patches are going to significantly change the implementation,
so I really want a unit test.

   text    data     bss     dec     hex filename
7529123  273096   28584 7830803  777d13 /tmp/i965_dri-64bit-before.so
7529283  273096   28584 7830963  777db3 /tmp/i965_dri-64bit-after.so

I'm honestly not sure why it grew by 160 bytes.
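
For reference, the 160-byte figure falls straight out of the two rows
above. A quick sketch of the arithmetic, with the tuples laid out as
(text, data, bss, dec) and the helper name made up for illustration:

```python
# Rows copied from the size output above: (text, data, bss, dec).
before = (7529123, 273096, 28584, 7830803)
after = (7529283, 273096, 28584, 7830963)


def dec_matches(row):
    """The dec column should be the sum of text, data and bss."""
    text, data, bss, dec = row
    return text + data + bss == dec


text_growth = after[0] - before[0]
total_growth = after[3] - before[3]
print(text_growth, total_growth, dec_matches(before), dec_matches(after))
```

Since data and bss are unchanged, all of the growth is in the text segment.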

Signed-off-by: Ian Romanick 
Reviewed-by: Iago Toral Quiroga 
---
 src/compiler/Makefile.glsl.am  |  11 +
 src/compiler/glsl/glsl_to_nir.cpp  | 238 +++--
 .../glsl/tests/get_intrinsic_opcode_test.cpp   | 157 ++
 3 files changed, 293 insertions(+), 113 deletions(-)
 create mode 100644 src/compiler/glsl/tests/get_intrinsic_opcode_test.cpp

diff --git a/src/compiler/Makefile.glsl.am b/src/compiler/Makefile.glsl.am
index 4e90f16..1132aae 100644
--- a/src/compiler/Makefile.glsl.am
+++ b/src/compiler/Makefile.glsl.am
@@ -47,6 +47,7 @@ check_PROGRAMS += \
glsl/glsl_test  \
glsl/tests/blob-test\
glsl/tests/general-ir-test  \
+   glsl/tests/get-intrinsic-opcode-test\
glsl/tests/sampler-types-test   \
glsl/tests/uniform-initializer-test
 
@@ -94,6 +95,16 @@ glsl_tests_sampler_types_test_LDADD =
\
$(top_builddir)/src/libglsl_util.la \
$(PTHREAD_LIBS)
 
+glsl_tests_get_intrinsic_opcode_test_SOURCES = \
+   glsl/tests/get_intrinsic_opcode_test.cpp
+glsl_tests_get_intrinsic_opcode_test_CFLAGS =  \
+   $(PTHREAD_CFLAGS)
+glsl_tests_get_intrinsic_opcode_test_LDADD =   \
+   $(top_builddir)/src/gtest/libgtest.la   \
+   glsl/libglsl.la \
+   $(top_builddir)/src/libglsl_util.la \
+   $(PTHREAD_LIBS)
+
 noinst_LTLIBRARIES += glsl/libglsl.la glsl/libglcpp.la glsl/libstandalone.la
 
 glsl_libglcpp_la_LIBADD =  \
diff --git a/src/compiler/glsl/glsl_to_nir.cpp 
b/src/compiler/glsl/glsl_to_nir.cpp
index 20302e3..266f150 100644
--- a/src/compiler/glsl/glsl_to_nir.cpp
+++ b/src/compiler/glsl/glsl_to_nir.cpp
@@ -602,123 +602,135 @@ nir_visitor::visit(ir_return *ir)
nir_builder_instr_insert(, >instr);
 }
 
+namespace _glsl_to_nir {
+
+nir_intrinsic_op
+get_intrinsic_opcode(const char *name, const ir_dereference *return_deref)
+{
+   nir_intrinsic_op op;
+
+   if (strcmp(name, "__intrinsic_atomic_read") == 0) {
+  op = nir_intrinsic_atomic_counter_read_var;
+   } else if (strcmp(name, "__intrinsic_atomic_increment") == 0) {
+  op = nir_intrinsic_atomic_counter_inc_var;
+   } else if (strcmp(name, "__intrinsic_atomic_predecrement") == 0) {
+  op = nir_intrinsic_atomic_counter_dec_var;
+   } else if (strcmp(name, "__intrinsic_image_load") == 0) {
+  op = nir_intrinsic_image_load;
+   } else if (strcmp(name, "__intrinsic_image_store") == 0) {
+  op = nir_intrinsic_image_store;
+   } else if (strcmp(name, "__intrinsic_image_atomic_add") == 0) {
+  op = nir_intrinsic_image_atomic_add;
+   } else if (strcmp(name, "__intrinsic_image_atomic_min") == 0) {
+  op = nir_intrinsic_image_atomic_min;
+   } else if (strcmp(name, "__intrinsic_image_atomic_max") == 0) {
+  op = nir_intrinsic_image_atomic_max;
+   } else if (strcmp(name, "__intrinsic_image_atomic_and") == 0) {
+  op = nir_intrinsic_image_atomic_and;
+   } else if (strcmp(name, "__intrinsic_image_atomic_or") == 0) {
+  op = nir_intrinsic_image_atomic_or;
+   } else if (strcmp(name, "__intrinsic_image_atomic_xor") == 0) {
+  op = nir_intrinsic_image_atomic_xor;
+   } else if (strcmp(name, "__intrinsic_image_atomic_exchange") == 0) {
+  op = nir_intrinsic_image_atomic_exchange;
+   } else if (strcmp(name, "__intrinsic_image_atomic_comp_swap") == 0) {
+  op = nir_intrinsic_image_atomic_comp_swap;
+   } else if (strcmp(name, "__intrinsic_memory_barrier") == 0) {
+  op = nir_intrinsic_memory_barrier;
+   } else if (strcmp(name, "__intrinsic_image_size") == 0) {
+  op = nir_intrinsic_image_size;
+   } else if (strcmp(name, "__intrinsic_image_samples") == 0) {
+  op = nir_intrinsic_image_samples;
+   } else if (strcmp(name, "__intrinsic_store_ssbo") == 0) {
+  op = nir_intrinsic_store_ssbo;
+   } else if (strcmp(name, "__intrinsic_load_ssbo") == 0) {
+  op = nir_intrinsic_load_ssbo;
+   } else if (strcmp(name, "__intrinsic_atomic_add_ssbo") == 0) {
+  op = nir_intrinsic_ssbo_atomic_add;
+   } else if (strcmp(name, "__intrinsic_atomic_and_ssbo") == 0) {
+  op = nir_intrinsic_ssbo_atomic_and;
+   } else if (strcmp(name, "__intrinsic_atomic_or_ssbo") == 0) {
+  op = nir_intrinsic_ssbo_atomic_or;
+   } else if (strcmp(name, "__intrinsic_atomic_xor_ssbo") == 0) 

[Mesa-dev] [PATCH 05/12] glsl: Replace the linear search in get_intrinsic_opcode with a radix trie

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

If there is a way to do this cleanly in mako, I'm very interested to
hear about it.

   text    data     bss     dec     hex filename
7529003  273096   28584 7830683  777c9b /tmp/i965_dri-64bit-before.so
7528883  273096   28584 7830563  777c23 /tmp/i965_dri-64bit-after.so

v2: Replace list.sort() with sorted() for a pretty dramatic code
cleanup.  Suggested by Dylan.

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/nir_intrinsic_map.py | 126 +
 1 file changed, 114 insertions(+), 12 deletions(-)

diff --git a/src/compiler/glsl/nir_intrinsic_map.py 
b/src/compiler/glsl/nir_intrinsic_map.py
index f6e9241..337f1e9 100644
--- a/src/compiler/glsl/nir_intrinsic_map.py
+++ b/src/compiler/glsl/nir_intrinsic_map.py
@@ -67,6 +67,115 @@ intrinsics = [("__intrinsic_atomic_read", 
("nir_intrinsic_atomic_counter_read_va
   ("__intrinsic_atomic_exchange_shared", 
("nir_intrinsic_shared_atomic_exchange", None)),
   ("__intrinsic_atomic_comp_swap_shared", 
("nir_intrinsic_shared_atomic_comp_swap", None))]
 
+def remove_prefix(table, prefix_length):
+    """Strip prefix_length characters off the name of each entry in table."""
+
+    return [(s[prefix_length:], d) for (s, d) in table]
+
+
+def generate_trie(table):
+    """table is a list of (string, data) tuples.  It is assumed to be sorted by
+    string.
+
+    A radix trie (or compact prefix trie) is recursively generated from the
+    list of names.  Names are partitioned into groups that have at least
+    prefix_thresh (tunable parameter) common prefix characters.  Each of these
+    groups becomes the branches at the current level of the tree.  The
+    matching prefix characters from each group are removed, and the group is
+    recursively operated on in the same fashion.
+
+    The recursion terminates when no groups can be formed with at least
+    prefix_thresh matching characters.
+
+    Each node in the trie is a 3-element tuple:
+
+        (prefix_string, [child_nodes], client_data)
+
+    One of [child_nodes] or client_data will be None.
+
+    See https://en.wikipedia.org/wiki/Radix_tree for more background details
+    on the data structure.
+
+    """
+
+    # Threshold for considering two strings to have the same prefix.
+    prefix_thresh = 1
+
+    if len(table) == 1 and table[0][0] == "":
+        return [("", None, table[0][1])]
+
+    trie_level = []
+
+    (s, d) = table[0]
+    candidates = [(s, d)]
+    base = s
+    prefix_length = len(s)
+
+    for (s, d) in table[1:]:
+        if s[:prefix_thresh] == base[:prefix_thresh]:
+            candidates.append((s, d))
+
+            l = len(s[:([x[0]==x[1] for x in zip(s, base)]+[0]).index(0)])
+            if l < prefix_length:
+                prefix_length = l
+        else:
+            trie_level.append((base[:prefix_length], generate_trie(remove_prefix(candidates, prefix_length)), None))
+
+            candidates = [(s, d)]
+            base = s
+            prefix_length = len(s)
+
+    trie_level.append((base[:prefix_length], generate_trie(remove_prefix(candidates, prefix_length)), None))
+
+    return trie_level
+
+
+def emit_trie_leaf(indent, d):
+    if d[1] is None:
+        return "{}return {};\n".format(indent, d[0])
+    else:
+        c_code = "{}int_op = {};\n".format(indent, d[0])
+        c_code += "{}uint_op = {};\n".format(indent, d[1])
+        return c_code
+
+
+def trie_as_C_code(trie, indent="   ", prefix_string="__intrinsic_"):
+    conditional = "if"
+
+    c_code = ""
+    for (s, t, d) in trie:
+        if d is not None:
+            c_code +=  "{}{} (name[0] == '\\0') {{\n".format(indent, conditional)
+            c_code += "{}   /* {} */\n".format(indent, prefix_string)
+            c_code += emit_trie_leaf(indent + "   ", d);
+
+        else:
+            # Before emitting the string comparison, check to see if the
+            # subtree has a single element with an empty string.  In that
+            # case, use strcmp() instead of strncmp() and don't advance the
+            # name pointer.
+
+            if len(t) == 1 and t[0][2] is not None:
+                if s == "":
+                    c_code += "{}{} (name[0] == '\\0') {{\n".format(indent, conditional, s)
+                else:
+                    c_code += "{}{} (strcmp(name, \"{}\") == 0) {{\n".format(indent, conditional, s)
+
+                c_code += "{}   /* {} */\n".format(indent, prefix_string + s)
+                c_code += emit_trie_leaf(indent + "   ", t[0][2]);
+            else:
+                c_code += "{}{} (strncmp(name, \"{}\", {}) == 0) {{\n".format(indent, conditional, s, len(s))
+                c_code += "{}   name += {};\n\n".format(indent, len(s))
+
+                c_code += trie_as_C_code(t, indent + "   ", prefix_string + s)
+
+        conditional = "} else if"
+
+    c_code += "{}}} else\n".format(indent)
+c_code += "{}   
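
Editorial sketch (not part of the patch): the grouping step that the generate_trie docstring describes can be tried in isolation with the Python below. It is a simplified approximation under stated assumptions, not the code from the series. It uses os.path.commonprefix from the standard library instead of the zip()-based prefix length computation, it assumes a non-empty, sorted table of unique names, and build_trie, make_branch and lookup are hypothetical names. The node layout (prefix, children, data) follows the docstring.

```python
import os.path


def build_trie(table):
    """Build a radix trie from a sorted list of unique (name, data) tuples.

    Each node is a (prefix, children, data) tuple; exactly one of
    children and data is None, as described in the patch docstring.
    """
    if len(table) == 1 and table[0][0] == "":
        return [("", None, table[0][1])]

    level = []
    group = [table[0]]
    for entry in table[1:]:
        # Names that share a first character belong to the same branch.
        if entry[0][:1] == group[0][0][:1]:
            group.append(entry)
        else:
            level.append(make_branch(group))
            group = [entry]
    level.append(make_branch(group))
    return level


def make_branch(group):
    """Factor the longest common prefix out of a group and recurse."""
    prefix = os.path.commonprefix([name for (name, _) in group])
    rest = [(name[len(prefix):], data) for (name, data) in group]
    return (prefix, build_trie(rest), None)


def lookup(trie, name):
    """Walk the trie; return the data stored for name, or None."""
    for (prefix, children, data) in trie:
        if children is None:
            # Leaf: it matches only once the whole name is consumed.
            if name == "":
                return data
        elif name.startswith(prefix):
            found = lookup(children, name[len(prefix):])
            if found is not None:
                return found
    return None
```

The lookup helper is there only to check the structure; the patch instead turns the trie into C string comparisons at build time.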

[Mesa-dev] [PATCH 02/12] glsl: Replace big pile of hand-written code with a generator

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

Right now the generator generates nearly identical code.  There is no
change in the binary size.

   text    data     bss     dec     hex filename
7529283  273096   28584 7830963  777db3 /tmp/i965_dri-64bit-before.so
7529283  273096   28584 7830963  777db3 /tmp/i965_dri-64bit-after.so

v2: Import print_function, and wrap "main" code in 'if __name__ ==
"__main__"' block.  Both suggested by Dylan.

Signed-off-by: Ian Romanick 
Reviewed-by: Iago Toral Quiroga 
---
 src/compiler/Makefile.glsl.am  |  11 ++-
 src/compiler/glsl/glsl_to_nir.cpp  | 123 +
 src/compiler/glsl/nir_intrinsic_map.py |  97 ++
 3 files changed, 107 insertions(+), 124 deletions(-)
 create mode 100644 src/compiler/glsl/nir_intrinsic_map.py

diff --git a/src/compiler/Makefile.glsl.am b/src/compiler/Makefile.glsl.am
index 1132aae..af80e60 100644
--- a/src/compiler/Makefile.glsl.am
+++ b/src/compiler/Makefile.glsl.am
@@ -27,6 +27,7 @@ EXTRA_DIST += glsl/tests glsl/glcpp/tests glsl/README \
glsl/glsl_parser.yy \
glsl/glcpp/glcpp-lex.l  \
glsl/glcpp/glcpp-parse.y\
+   glsl/nir_intrinsic_map.cpp  \
SConscript.glsl
 
 TESTS += glsl/glcpp/tests/glcpp-test   \
@@ -208,6 +209,10 @@ glsl/glcpp/glcpp-lex.c: glsl/glcpp/glcpp-lex.l
$(MKDIR_GEN)
$(LEX_GEN) -o $@ $(srcdir)/glsl/glcpp/glcpp-lex.l
 
+glsl/nir_intrinsic_map.cpp: glsl/nir_intrinsic_map.py
+   $(MKDIR_GEN)
+   $(PYTHON_GEN) $(srcdir)/glsl/nir_intrinsic_map.py > $@ || ($(RM) $@; 
false)
+
 # Only the parsers (specifically the header files generated at the same time)
 # need to be in BUILT_SOURCES. Though if we list the parser headers YACC is
 # called for the .c/.cpp file and the .h files. By listing the .c/.cpp files
@@ -218,14 +223,16 @@ BUILT_SOURCES +=  \
glsl/glsl_parser.cpp\
glsl/glsl_lexer.cpp \
glsl/glcpp/glcpp-parse.c\
-   glsl/glcpp/glcpp-lex.c
+   glsl/glcpp/glcpp-lex.c  \
+   glsl/nir_intrinsic_map.cpp
 CLEANFILES +=  \
glsl/glcpp/glcpp-parse.h\
glsl/glsl_parser.h  \
glsl/glsl_parser.cpp\
glsl/glsl_lexer.cpp \
glsl/glcpp/glcpp-parse.c\
-   glsl/glcpp/glcpp-lex.c
+   glsl/glcpp/glcpp-lex.c  \
+   glsl/nir_intrinsic_map.cpp
 
 clean-local:
$(RM) -r subtest-cr subtest-cr-lf subtest-lf subtest-lf-cr
diff --git a/src/compiler/glsl/glsl_to_nir.cpp 
b/src/compiler/glsl/glsl_to_nir.cpp
index 266f150..3b8424e 100644
--- a/src/compiler/glsl/glsl_to_nir.cpp
+++ b/src/compiler/glsl/glsl_to_nir.cpp
@@ -602,128 +602,7 @@ nir_visitor::visit(ir_return *ir)
    nir_builder_instr_insert(&b, &instr->instr);
 }
 
-namespace _glsl_to_nir {
-
-nir_intrinsic_op
-get_intrinsic_opcode(const char *name, const ir_dereference *return_deref)
-{
-   nir_intrinsic_op op;
-
-   if (strcmp(name, "__intrinsic_atomic_read") == 0) {
-  op = nir_intrinsic_atomic_counter_read_var;
-   } else if (strcmp(name, "__intrinsic_atomic_increment") == 0) {
-  op = nir_intrinsic_atomic_counter_inc_var;
-   } else if (strcmp(name, "__intrinsic_atomic_predecrement") == 0) {
-  op = nir_intrinsic_atomic_counter_dec_var;
-   } else if (strcmp(name, "__intrinsic_image_load") == 0) {
-  op = nir_intrinsic_image_load;
-   } else if (strcmp(name, "__intrinsic_image_store") == 0) {
-  op = nir_intrinsic_image_store;
-   } else if (strcmp(name, "__intrinsic_image_atomic_add") == 0) {
-  op = nir_intrinsic_image_atomic_add;
-   } else if (strcmp(name, "__intrinsic_image_atomic_min") == 0) {
-  op = nir_intrinsic_image_atomic_min;
-   } else if (strcmp(name, "__intrinsic_image_atomic_max") == 0) {
-  op = nir_intrinsic_image_atomic_max;
-   } else if (strcmp(name, "__intrinsic_image_atomic_and") == 0) {
-  op = nir_intrinsic_image_atomic_and;
-   } else if (strcmp(name, "__intrinsic_image_atomic_or") == 0) {
-  op = nir_intrinsic_image_atomic_or;
-   } else if (strcmp(name, "__intrinsic_image_atomic_xor") == 0) {
-  op = nir_intrinsic_image_atomic_xor;
-   } else if (strcmp(name, "__intrinsic_image_atomic_exchange") == 0) {
-  op = nir_intrinsic_image_atomic_exchange;
-   } else if (strcmp(name, "__intrinsic_image_atomic_comp_swap") == 0) {
-  op = nir_intrinsic_image_atomic_comp_swap;
-   } else if (strcmp(name, "__intrinsic_memory_barrier") == 0) {
-  op = nir_intrinsic_memory_barrier;
-   } else if (strcmp(name, "__intrinsic_image_size") == 0) {
-  
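
Editorial sketch (not part of the patch): the idea of generating the strcmp() ladder from a table, rather than writing it by hand, can be illustrated with the short Python script below. It is an assumption-laden approximation. It uses plain str.format instead of the Mako template the real nir_intrinsic_map.py uses, the table holds only three of the entries, the emitted function takes a single argument, and generate is a hypothetical name.

```python
from __future__ import print_function

# Small subset of the (GLSL intrinsic name, NIR opcode) table, for
# illustration only.
INTRINSICS = [
    ("__intrinsic_image_load", "nir_intrinsic_image_load"),
    ("__intrinsic_image_store", "nir_intrinsic_image_store"),
    ("__intrinsic_memory_barrier", "nir_intrinsic_memory_barrier"),
]


def generate(intrinsics):
    """Return C source for an if/else-if ladder of strcmp() calls."""
    lines = ["nir_intrinsic_op",
             "get_intrinsic_opcode(const char *name)",
             "{"]
    keyword = "if"
    for (name, op) in intrinsics:
        lines.append('   {} (strcmp(name, "{}") == 0) {{'.format(keyword, name))
        lines.append("      return {};".format(op))
        keyword = "} else if"
    lines.append("   } else {")
    lines.append('      unreachable("Unknown intrinsic name");')
    lines.append("   }")
    lines.append("}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(generate(INTRINSICS), end="")
```

Redirecting the script's standard output to a .cpp file mirrors what the Makefile rule in this patch does with $(PYTHON_GEN).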

[Mesa-dev] [PATCH 00/12 RESEND] ARB_shader_atomic_counter_ops for NIR and i965

2016-07-19 Thread Ian Romanick
Most of the series has already been reviewed by Iago.  However, patches 5
through 8 still need review.  Dylan and Jason had provided some feedback
on patch 5, and Ilia and I had some discussion (which is now mostly
captured in the commit message) about patch 8.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 11/12] i965: Refactor emission of atomic counter operations

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

This will make it easier to add more operations.

Signed-off-by: Ian Romanick 
Reviewed-by: Iago Toral Quiroga 
---
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp   | 19 ---
 src/mesa/drivers/dri/i965/brw_shader.cpp   | 16 
 src/mesa/drivers/dri/i965/brw_shader.h |  3 +++
 src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 19 ---
 4 files changed, 27 insertions(+), 30 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 3da6d75..9a06dfe 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -3598,23 +3598,12 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
   fs_reg tmp;
 
   /* Emit a surface read or atomic op. */
-  switch (instr->intrinsic) {
-  case nir_intrinsic_atomic_counter_read:
+  if (instr->intrinsic == nir_intrinsic_atomic_counter_read) {
  tmp = emit_untyped_read(bld, brw_imm_ud(surface), offset, 1, 1);
- break;
-
-  case nir_intrinsic_atomic_counter_inc:
- tmp = emit_untyped_atomic(bld, brw_imm_ud(surface), offset, fs_reg(),
-   fs_reg(), 1, 1, BRW_AOP_INC);
- break;
-
-  case nir_intrinsic_atomic_counter_dec:
+  } else {
  tmp = emit_untyped_atomic(bld, brw_imm_ud(surface), offset, fs_reg(),
-   fs_reg(), 1, 1, BRW_AOP_PREDEC);
- break;
-
-  default:
- unreachable("Unreachable");
+   fs_reg(), 1, 1,
+   get_atomic_counter_op(instr->intrinsic));
   }
 
   /* Assign the result. */
diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp 
b/src/mesa/drivers/dri/i965/brw_shader.cpp
index f3b5487..1036e4c 100644
--- a/src/mesa/drivers/dri/i965/brw_shader.cpp
+++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
@@ -599,6 +599,22 @@ brw_abs_immediate(enum brw_reg_type type, struct brw_reg 
*reg)
return false;
 }
 
+/**
+ * Get the appropriate atomic op for an atomic counter intrinsic.
+ */
+unsigned
+get_atomic_counter_op(nir_intrinsic_op op)
+{
+   switch (op) {
+   case nir_intrinsic_atomic_counter_inc:
+  return BRW_AOP_INC;
+   case nir_intrinsic_atomic_counter_dec:
+  return BRW_AOP_PREDEC;
+   default:
+  unreachable("Not reachable.");
+   }
+}
+
 unsigned
 tesslevel_outer_components(GLenum tes_primitive_mode)
 {
diff --git a/src/mesa/drivers/dri/i965/brw_shader.h 
b/src/mesa/drivers/dri/i965/brw_shader.h
index dd9eb2d..86736e4 100644
--- a/src/mesa/drivers/dri/i965/brw_shader.h
+++ b/src/mesa/drivers/dri/i965/brw_shader.h
@@ -27,6 +27,7 @@
 #include "brw_reg.h"
 #include "brw_defines.h"
 #include "brw_context.h"
+#include "compiler/nir/nir.h"
 
 #ifdef __cplusplus
 #include "brw_ir_allocator.h"
@@ -300,6 +301,8 @@ unsigned tesslevel_outer_components(GLenum 
tes_primitive_mode);
 unsigned tesslevel_inner_components(GLenum tes_primitive_mode);
 unsigned writemask_for_backwards_vector(unsigned mask);
 
+unsigned get_atomic_counter_op(nir_intrinsic_op op);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
index 3b20508..fed8e5f 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
@@ -739,24 +739,13 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr 
*instr)
 
   dest = get_nir_dest(instr->dest);
 
-  switch (instr->intrinsic) {
-  case nir_intrinsic_atomic_counter_inc:
- tmp = emit_untyped_atomic(bld, surface, offset,
-   src_reg(), src_reg(),
-   1, 1,
-   BRW_AOP_INC);
- break;
-  case nir_intrinsic_atomic_counter_dec:
+  if (instr->intrinsic == nir_intrinsic_atomic_counter_read) {
+ tmp = emit_untyped_read(bld, surface, offset, 1, 1);
+  } else {
  tmp = emit_untyped_atomic(bld, surface, offset,
src_reg(), src_reg(),
1, 1,
-   BRW_AOP_PREDEC);
- break;
-  case nir_intrinsic_atomic_counter_read:
- tmp = emit_untyped_read(bld, surface, offset, 1, 1);
- break;
-  default:
- unreachable("Unreachable");
+   get_atomic_counter_op(instr->intrinsic));
   }
 
   bld.MOV(retype(dest, tmp.type), tmp);
-- 
2.5.5



[Mesa-dev] [PATCH 10/12] nir/intrinsics: Add more atomic_counter ops

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

v2: Delete some stray debug code noticed by Iago.

Signed-off-by: Ian Romanick 
Reviewed-by: Iago Toral Quiroga 
---
 src/compiler/glsl/glsl_to_nir.cpp  | 39 +++---
 src/compiler/glsl/nir_intrinsic_map.py |  8 +
 .../glsl/tests/get_intrinsic_opcode_test.cpp   |  8 +
 src/compiler/nir/nir_intrinsics.h  | 14 
 src/compiler/nir/nir_lower_atomics.c   | 38 +
 5 files changed, 102 insertions(+), 5 deletions(-)

diff --git a/src/compiler/glsl/glsl_to_nir.cpp 
b/src/compiler/glsl/glsl_to_nir.cpp
index 3b8424e..f2b8b05 100644
--- a/src/compiler/glsl/glsl_to_nir.cpp
+++ b/src/compiler/glsl/glsl_to_nir.cpp
@@ -616,11 +616,40 @@ nir_visitor::visit(ir_call *ir)
   switch (op) {
   case nir_intrinsic_atomic_counter_read_var:
   case nir_intrinsic_atomic_counter_inc_var:
-  case nir_intrinsic_atomic_counter_dec_var: {
- ir_dereference *param =
-(ir_dereference *) ir->actual_parameters.get_head();
- instr->variables[0] = evaluate_deref(&instr->instr, param);
- nir_ssa_dest_init(&instr->instr, &instr->dest, 1, 32, NULL);
+  case nir_intrinsic_atomic_counter_dec_var:
+  case nir_intrinsic_atomic_counter_add_var:
+  case nir_intrinsic_atomic_counter_min_var:
+  case nir_intrinsic_atomic_counter_max_var:
+  case nir_intrinsic_atomic_counter_and_var:
+  case nir_intrinsic_atomic_counter_or_var:
+  case nir_intrinsic_atomic_counter_xor_var:
+  case nir_intrinsic_atomic_counter_exchange_var:
+  case nir_intrinsic_atomic_counter_comp_swap_var: {
+ /* Set the counter variable dereference. */
+ exec_node *param = ir->actual_parameters.get_head();
+ ir_dereference *counter = (ir_dereference *)param;
+
+ instr->variables[0] = evaluate_deref(&instr->instr, counter);
+ param = param->get_next();
+
+ /* Set the intrinsic destination. */
+ if (ir->return_deref) {
+nir_ssa_dest_init(&instr->instr, &instr->dest, 1, 32, NULL);
+ }
+
+ /* Set the intrinsic parameters. */
+ if (!param->is_tail_sentinel()) {
+instr->src[0] =
+   nir_src_for_ssa(evaluate_rvalue((ir_dereference *)param));
+param = param->get_next();
+ }
+
+ if (!param->is_tail_sentinel()) {
+instr->src[1] =
+   nir_src_for_ssa(evaluate_rvalue((ir_dereference *)param));
+param = param->get_next();
+ }
+
  nir_builder_instr_insert(&b, &instr->instr);
  break;
   }
diff --git a/src/compiler/glsl/nir_intrinsic_map.py 
b/src/compiler/glsl/nir_intrinsic_map.py
index 3c2c76c..59b394e 100644
--- a/src/compiler/glsl/nir_intrinsic_map.py
+++ b/src/compiler/glsl/nir_intrinsic_map.py
@@ -27,6 +27,14 @@ from mako.template import Template
 intrinsics = [("__intrinsic_atomic_read", 
("nir_intrinsic_atomic_counter_read_var", None)),
   ("__intrinsic_atomic_increment", 
("nir_intrinsic_atomic_counter_inc_var", None)),
   ("__intrinsic_atomic_predecrement", 
("nir_intrinsic_atomic_counter_dec_var", None)),
+  ("__intrinsic_atomic_add", 
("nir_intrinsic_atomic_counter_add_var", None)),
+  ("__intrinsic_atomic_min", 
("nir_intrinsic_atomic_counter_min_var", None)),
+  ("__intrinsic_atomic_max", 
("nir_intrinsic_atomic_counter_max_var", None)),
+  ("__intrinsic_atomic_and", 
("nir_intrinsic_atomic_counter_and_var", None)),
+  ("__intrinsic_atomic_or", 
("nir_intrinsic_atomic_counter_or_var", None)),
+  ("__intrinsic_atomic_xor", 
("nir_intrinsic_atomic_counter_xor_var", None)),
+  ("__intrinsic_atomic_exchange", 
("nir_intrinsic_atomic_counter_exchange_var", None)),
+  ("__intrinsic_atomic_comp_swap", 
("nir_intrinsic_atomic_counter_comp_swap_var", None)),
   ("__intrinsic_image_load", ("nir_intrinsic_image_load", None)),
   ("__intrinsic_image_store", ("nir_intrinsic_image_store", None)),
   ("__intrinsic_image_atomic_add", 
("nir_intrinsic_image_atomic_add", None)),
diff --git a/src/compiler/glsl/tests/get_intrinsic_opcode_test.cpp 
b/src/compiler/glsl/tests/get_intrinsic_opcode_test.cpp
index aeecf32..d270a03 100644
--- a/src/compiler/glsl/tests/get_intrinsic_opcode_test.cpp
+++ b/src/compiler/glsl/tests/get_intrinsic_opcode_test.cpp
@@ -45,6 +45,14 @@ static const struct test_vector {
test_vector("__intrinsic_atomic_read", 
nir_intrinsic_atomic_counter_read_var),
test_vector("__intrinsic_atomic_increment", 
nir_intrinsic_atomic_counter_inc_var),
test_vector("__intrinsic_atomic_predecrement", 
nir_intrinsic_atomic_counter_dec_var),
+   test_vector("__intrinsic_atomic_add", nir_intrinsic_atomic_counter_add),
+   test_vector("__intrinsic_atomic_min", nir_intrinsic_atomic_counter_min),
+   

[Mesa-dev] [PATCH 12/12] i965: Enable ARB_shader_atomic_counter_ops

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

Signed-off-by: Ian Romanick 
Reviewed-by: Iago Toral Quiroga 
---
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 23 ---
 src/mesa/drivers/dri/i965/brw_shader.cpp | 16 
 src/mesa/drivers/dri/i965/brw_vec4_nir.cpp   | 16 +---
 src/mesa/drivers/dri/i965/intel_extensions.c |  1 +
 4 files changed, 50 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 9a06dfe..90fb7d3 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -3586,23 +3586,40 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
switch (instr->intrinsic) {
case nir_intrinsic_atomic_counter_inc:
case nir_intrinsic_atomic_counter_dec:
-   case nir_intrinsic_atomic_counter_read: {
+   case nir_intrinsic_atomic_counter_read:
+   case nir_intrinsic_atomic_counter_add:
+   case nir_intrinsic_atomic_counter_min:
+   case nir_intrinsic_atomic_counter_max:
+   case nir_intrinsic_atomic_counter_and:
+   case nir_intrinsic_atomic_counter_or:
+   case nir_intrinsic_atomic_counter_xor:
+   case nir_intrinsic_atomic_counter_exchange:
+   case nir_intrinsic_atomic_counter_comp_swap: {
   if (stage == MESA_SHADER_FRAGMENT &&
   instr->intrinsic != nir_intrinsic_atomic_counter_read)
  ((struct brw_wm_prog_data *)prog_data)->has_side_effects = true;
 
+  /* Get some metadata from the image intrinsic. */
+  const nir_intrinsic_info *info = &nir_intrinsic_infos[instr->intrinsic];
+
   /* Get the arguments of the atomic intrinsic. */
   const fs_reg offset = get_nir_src(instr->src[0]);
   const unsigned surface = (stage_prog_data->binding_table.abo_start +
 instr->const_index[0]);
+  const fs_reg src0 = (info->num_srcs >= 2
+   ? get_nir_src(instr->src[1]) : fs_reg());
+  const fs_reg src1 = (info->num_srcs >= 3
+   ? get_nir_src(instr->src[2]) : fs_reg());
   fs_reg tmp;
 
+  assert(info->num_srcs <= 3);
+
   /* Emit a surface read or atomic op. */
   if (instr->intrinsic == nir_intrinsic_atomic_counter_read) {
  tmp = emit_untyped_read(bld, brw_imm_ud(surface), offset, 1, 1);
   } else {
- tmp = emit_untyped_atomic(bld, brw_imm_ud(surface), offset, fs_reg(),
-   fs_reg(), 1, 1,
+ tmp = emit_untyped_atomic(bld, brw_imm_ud(surface), offset, src0,
+   src1, 1, 1,
get_atomic_counter_op(instr->intrinsic));
   }
 
diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp 
b/src/mesa/drivers/dri/i965/brw_shader.cpp
index 1036e4c..e53ea7e 100644
--- a/src/mesa/drivers/dri/i965/brw_shader.cpp
+++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
@@ -610,6 +610,22 @@ get_atomic_counter_op(nir_intrinsic_op op)
   return BRW_AOP_INC;
case nir_intrinsic_atomic_counter_dec:
   return BRW_AOP_PREDEC;
+   case nir_intrinsic_atomic_counter_add:
+  return BRW_AOP_ADD;
+   case nir_intrinsic_atomic_counter_min:
+  return BRW_AOP_UMIN;
+   case nir_intrinsic_atomic_counter_max:
+  return BRW_AOP_UMAX;
+   case nir_intrinsic_atomic_counter_and:
+  return BRW_AOP_AND;
+   case nir_intrinsic_atomic_counter_or:
+  return BRW_AOP_OR;
+   case nir_intrinsic_atomic_counter_xor:
+  return BRW_AOP_XOR;
+   case nir_intrinsic_atomic_counter_exchange:
+  return BRW_AOP_MOV;
+   case nir_intrinsic_atomic_counter_comp_swap:
+  return BRW_AOP_CMPWR;
default:
   unreachable("Not reachable.");
}
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
index fed8e5f..1908d33 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
@@ -730,11 +730,21 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr 
*instr)
case nir_intrinsic_atomic_counter_dec: {
   unsigned surf_index = prog_data->base.binding_table.abo_start +
  (unsigned) instr->const_index[0];
+  const vec4_builder bld =
+ vec4_builder(this).at_end().annotate(current_annotation, base_ir);
+
+  /* Get some metadata from the image intrinsic. */
+  const nir_intrinsic_info *info = &nir_intrinsic_infos[instr->intrinsic];
+
+  /* Get the arguments of the atomic intrinsic. */
   src_reg offset = get_nir_src(instr->src[0], nir_type_int,
instr->num_components);
   const src_reg surface = brw_imm_ud(surf_index);
-  const vec4_builder bld =
- vec4_builder(this).at_end().annotate(current_annotation, base_ir);
+  const src_reg src0 = (info->num_srcs >= 2
+   ? get_nir_src(instr->src[1]) : src_reg());
+  const src_reg 

[Mesa-dev] [PATCH 03/12] glsl: Don't explicitly store "__intrinsic_"

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

Every valid intrinsic function name starts with "__intrinsic_".  Compare
and store the string once instead of 41 times.

   text    data     bss     dec     hex filename
7529283  273096   28584 7830963  777db3 /tmp/i965_dri-64bit-before.so
7529067  273096   28584 7830747  777cdb /tmp/i965_dri-64bit-after.so

Signed-off-by: Ian Romanick 
Reviewed-by: Iago Toral Quiroga 
---
 src/compiler/glsl/nir_intrinsic_map.py | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/nir_intrinsic_map.py 
b/src/compiler/glsl/nir_intrinsic_map.py
index c14438b..4d999c1 100644
--- a/src/compiler/glsl/nir_intrinsic_map.py
+++ b/src/compiler/glsl/nir_intrinsic_map.py
@@ -73,8 +73,13 @@ namespace _glsl_to_nir {
 nir_intrinsic_op
 get_intrinsic_opcode(const char *name, const ir_dereference *return_deref)
 {
+   if (strncmp(name, "__intrinsic_", 12) == 0)
+  name += 12;
+   else
+  unreachable("Intrinsic name does not begin with '__intrinsic_'");
+
 % for (name, ops) in intrinsics:
-   if (strcmp(name, "${name}") == 0) {
+   if (strcmp(name, "${name[12:]}") == 0) {
 % if ops[1] is None:
   return ${ops[0]};
 % else:
-- 
2.5.5
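
Editorial sketch (not part of the patch): the effect of checking and skipping the "__intrinsic_" prefix once, then matching only the remainder, can be modelled in Python as below. This is an illustration under assumptions. The real lookup is generated C code, the table here holds only two entries, and OPCODES is a hypothetical name. The constant 12 in the patch is the length of the prefix.

```python
PREFIX = "__intrinsic_"

# Hypothetical two-entry table.  Keys are stored with the common prefix
# already removed, which is what "${name[12:]}" does in the template.
OPCODES = {
    name[len(PREFIX):]: op
    for (name, op) in [
        ("__intrinsic_image_load", "nir_intrinsic_image_load"),
        ("__intrinsic_image_store", "nir_intrinsic_image_store"),
    ]
}


def get_intrinsic_opcode(name):
    """Check the shared prefix once, then match only the remainder."""
    if not name.startswith(PREFIX):
        raise ValueError("Intrinsic name does not begin with '__intrinsic_'")
    return OPCODES[name[len(PREFIX):]]
```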



[Mesa-dev] [PATCH 07/12] glsl: Sometimes use a switch-statement in get_intrinsic_opcode

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

All of this code operates on the assumption that get_intrinsic_opcode
will only be called with valid names.  If the first character of every
node in the trie level is unique, the prefix can be uniquely identified
by switching on just that first character.  Otherwise we have to string
compare with the prefix.  The switch-statement is quite a bit more
compact than the if-statement ladder of string compares.

   text    data     bss     dec     hex filename
7528835  273096   28584 7830515  777bf3 /tmp/i965_dri-64bit-before.so
7528339  273096   28584 7830019  777a03 /tmp/i965_dri-64bit-after.so

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/nir_intrinsic_map.py | 101 +
 1 file changed, 78 insertions(+), 23 deletions(-)

diff --git a/src/compiler/glsl/nir_intrinsic_map.py 
b/src/compiler/glsl/nir_intrinsic_map.py
index c2b78a0..3c2c76c 100644
--- a/src/compiler/glsl/nir_intrinsic_map.py
+++ b/src/compiler/glsl/nir_intrinsic_map.py
@@ -73,6 +73,26 @@ def remove_prefix(table, prefix_length):
 return [(s[prefix_length:], d) for (s, d) in table]
 
 
+def first_character_always_unique(trie):
+    """Determine whether the first character in every node in the trie level is
+    unique.
+
+    """
+    seen_characters = set()
+    for (s, t, d) in trie:
+        if s == "":
+            c = ""
+        else:
+            c = s[:1]
+
+        if c in seen_characters:
+            return False
+        else:
+            seen_characters.add(c)
+
+    return True
+
+
 def generate_trie(table):
 """table is a list of (string, data) tuples.  It is assumed to be sorted by
 string.
@@ -130,48 +150,83 @@ def generate_trie(table):
 return trie_level
 
 
-def emit_trie_leaf(indent, d):
+def emit_trie_leaf(indent, d, inside_switch):
 if d[1] is None:
 return "{}return {};\n".format(indent, d[0])
 else:
 c_code = "{}int_op = {};\n".format(indent, d[0])
 c_code += "{}uint_op = {};\n".format(indent, d[1])
+if inside_switch:
+c_code += "{}break;\n".format(indent)
 return c_code
 
 
 def trie_as_C_code(trie, indent="   ", prefix_string="__intrinsic_"):
 offset = len(prefix_string)
-conditional = "if"
-
 c_code = ""
-for (s, t, d) in trie:
-if d is not None:
-c_code +=  "{}{} (name[{}] == '\\0') {{\n".format(indent, 
conditional, offset)
-c_code += "{}   assert(strcmp(name, \"{}\") == 
0);\n".format(indent, prefix_string)
-c_code += emit_trie_leaf(indent + "   ", d);
 
-else:
-# Before emitting the string comparison, check to see of the
-# subtree has a single element with an empty string.  In that
-# case, use strcmp() instead of strncmp().
+# If the first character of every node in the trie level is unique, the
+# prefix can be uniquely identified by switching on just that first
+# character.  Otherwise we have to string compare with the prefix.
+if first_character_always_unique(trie):
+c_code += "{}switch(name[{}]) {{\n".format(indent, offset)
+
+for (s, t, d) in trie:
+if d is not None:
+c_code += "{}case '\\0':\n".format(indent)
+c_code += "{}   assert(strcmp(name, \"{}\") == 
0);\n".format(indent, prefix_string)
+c_code += emit_trie_leaf(indent + "   ", d, True);
 
-if len(t) == 1 and t[0][2] is not None:
-if s == "":
-c_code += "{}{} (name[{}] == '\\0') {{\n".format(indent, 
conditional, offset)
+else:
+if len(t) == 1 and t[0][2] is not None:
+if s == "":
+c_code += "{}case '\\0':\n".format(indent)
+else:
+c_code += "{}case '{}':\n".format(indent, s[0])
+
+c_code += "{}   assert(strcmp(name, \"{}\") == 
0);\n".format(indent, prefix_string + s)
+c_code += emit_trie_leaf(indent + "   ", t[0][2], True);
 else:
-c_code += "{}{} (strcmp(name + {}, \"{}\") == 0) 
{{\n".format(indent, conditional, offset, s)
+c_code += "{}case '{}':\n".format(indent, s[0])
+
+c_code += trie_as_C_code(t, indent + "   ", prefix_string 
+ s)
+c_code += "{}   break;\n".format(indent)
+
+c_code += "{}default:\n".format(indent)
+c_code += "{}   unreachable(\"Invalid intrinsic 
name\");\n".format(indent)
+c_code += "{}}}\n".format(indent)
+else:
+conditional = "if"
+
+for (s, t, d) in trie:
+if d is not None:
+c_code +=  "{}{} (name[{}] == '\\0') {{\n".format(indent, 
conditional, offset)
+c_code += "{}   assert(strcmp(name, \"{}\") == 
0);\n".format(indent, prefix_string)
+c_code += 
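
Editorial sketch (not part of the patch): the choice between a switch on one character and a ladder of string compares, as described in the commit message, can be reproduced on a flat list of prefixes with the Python below. It is a simplified approximation. It works on one trie level only and does not recurse, and first_characters_unique and emit_dispatch are hypothetical names rather than functions from the series.

```python
def first_characters_unique(prefixes):
    """True when no two prefixes start with the same character."""
    firsts = [p[:1] for p in prefixes]
    return len(set(firsts)) == len(firsts)


def emit_dispatch(prefixes, offset=0, indent="   "):
    """Emit C code that picks a branch for one trie level."""
    lines = []
    if first_characters_unique(prefixes):
        # One character is enough to identify the branch.
        lines.append("{}switch (name[{}]) {{".format(indent, offset))
        for p in prefixes:
            label = p[0] if p else "\\0"
            lines.append("{}case '{}':".format(indent, label))
            lines.append("{}   /* matched \"{}\" */".format(indent, p))
            lines.append("{}   break;".format(indent))
        lines.append("{}default:".format(indent))
        lines.append("{}   unreachable(\"Invalid intrinsic name\");".format(indent))
        lines.append("{}}}".format(indent))
    else:
        # Ambiguous first characters: compare each whole prefix.
        keyword = "if"
        for p in prefixes:
            lines.append("{}{} (strncmp(name + {}, \"{}\", {}) == 0) {{".format(
                indent, keyword, offset, p, len(p)))
            lines.append("{}   /* matched \"{}\" */".format(indent, p))
            keyword = "} else if"
        lines.append("{}}} else {{".format(indent))
        lines.append("{}   unreachable(\"Invalid intrinsic name\");".format(indent))
        lines.append("{}}}".format(indent))
    return "\n".join(lines) + "\n"
```

The switch form is only safe under the assumption stated in the commit message, namely that the function is called with valid names, since a single character no longer proves that the rest of the prefix matches.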

[Mesa-dev] [PATCH 04/12] glsl: Refactor a little code in get_intrinsic_opcode

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

   text    data     bss     dec     hex filename
7529067  273096   28584 7830747  777cdb /tmp/i965_dri-64bit-before.so
7529003  273096   28584 7830683  777c9b /tmp/i965_dri-64bit-after.so

Signed-off-by: Ian Romanick 
Reviewed-by: Iago Toral Quiroga 
---
 src/compiler/glsl/nir_intrinsic_map.py | 20 +---
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/src/compiler/glsl/nir_intrinsic_map.py 
b/src/compiler/glsl/nir_intrinsic_map.py
index 4d999c1..f6e9241 100644
--- a/src/compiler/glsl/nir_intrinsic_map.py
+++ b/src/compiler/glsl/nir_intrinsic_map.py
@@ -78,22 +78,28 @@ get_intrinsic_opcode(const char *name, const ir_dereference 
*return_deref)
else
   unreachable("Intrinsic name does not begin with '__intrinsic_'");
 
+   nir_intrinsic_op int_op;
+   nir_intrinsic_op uint_op;
+
 % for (name, ops) in intrinsics:
if (strcmp(name, "${name[12:]}") == 0) {
 % if ops[1] is None:
   return ${ops[0]};
 % else:
-  assert(return_deref);
-  if (return_deref->type == glsl_type::int_type)
- return ${ops[0]};
-  else if (return_deref->type == glsl_type::uint_type)
- return ${ops[1]};
-  else
- unreachable("Invalid type");
+  int_op = ${ops[0]};
+  uint_op = ${ops[1]};
 % endif
} else
 % endfor
   unreachable("Unknown intrinsic name");
+
+   assert(return_deref);
+   if (return_deref->type == glsl_type::int_type)
+  return int_op;
+   else if (return_deref->type == glsl_type::uint_type)
+  return uint_op;
+   else
+  unreachable("Invalid type");
 }
 }
 """
-- 
2.5.5
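
Editorial sketch (not part of the patch): the refactor moves the int/uint selection out of every table entry and performs it once after the name has been matched. A Python model of that control flow is shown below. It is an illustration only. The real code compares return_deref->type against glsl_type::int_type and glsl_type::uint_type, the two table entries are assumed examples, and the string type tags are placeholders.

```python
INT_TYPE = "int"
UINT_TYPE = "uint"

# Each entry is (int_op, uint_op); uint_op is None when the opcode does
# not depend on the return type.  The entries are assumed examples.
INTRINSICS = {
    "image_load": ("nir_intrinsic_image_load", None),
    "atomic_min_ssbo": ("nir_intrinsic_ssbo_atomic_imin",
                        "nir_intrinsic_ssbo_atomic_umin"),
}


def get_intrinsic_opcode(name, return_type=None):
    """Match the name first, then pick the signed or unsigned opcode once."""
    (int_op, uint_op) = INTRINSICS[name]
    if uint_op is None:
        return int_op

    # Shared tail: this selection used to be repeated for every entry.
    if return_type == INT_TYPE:
        return int_op
    if return_type == UINT_TYPE:
        return uint_op
    raise ValueError("Invalid type")
```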



[Mesa-dev] [PATCH 08/12] glsl: Kill __intrinsic_atomic_sub

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

Just generate an __intrinsic_atomic_add with a negated parameter.

Some background on the non-obvious reasons for the big change to
builtin_builder::call()... this is cribbed from some discussion with
Ilia on mesa-dev.

Why change builtin_builder::call() to allow taking dereferences and
create them here rather than just feeding in the ir_variables directly?
The problem is the neg_data ir_variable node would have to be in two
lists at the same time: the instruction stream and parameters.  The
ir_variable node is automatically added to the instruction stream by the
call to make_temp.  Restructuring the code so that the ir_variables
could be in parameters then move them to the instruction stream would
have been pretty terrible.

ir_call in the instruction stream has an exec_list that contains
ir_dereference_variable nodes.

The builtin_builder::call method previously took an exec_list of
ir_variables and created a list of ir_dereference_variable.  All of the
original users of that method wanted to make a function call using
exactly the set of parameters passed to the built-in function (i.e.,
call __intrinsic_atomic_add using the parameters to atomicAdd).  For
these users, the list of ir_variables already existed:  the list of
parameters in the built-in function signature.

This new caller doesn't do that.  It wants to call a function with a
parameter from the function and a value calculated in the function.  So,
I changed builtin_builder::call to take a list that could either be a
list of ir_variable or a list of ir_dereference_variable.  In the former
case it behaves just as it previously did.  In the latter case, it uses
(and removes from the input list) the ir_dereference_variable nodes
instead of creating new ones.
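
The two behaviours described above can be sketched with a toy model.
This is Python, not the real exec_list / ir_* classes; every class and
function name below is hypothetical. The one property carried over from
the description is that a node can be a member of only one list at a
time.

```python
class Node:
    """Toy IR node that may live in at most one list at a time."""

    def __init__(self):
        self.owner = None

    def remove(self):
        if self.owner is not None:
            self.owner.items.remove(self)
            self.owner = None


class Variable(Node):
    def __init__(self, name):
        super().__init__()
        self.name = name


class Deref(Node):
    def __init__(self, var):
        super().__init__()
        self.var = var


class NodeList:
    def __init__(self):
        self.items = []

    def push_tail(self, node):
        # This is the constraint that makes sharing a variable node between
        # the instruction stream and a parameter list impossible.
        assert node.owner is None, "node is already in a list"
        node.owner = self
        self.items.append(node)

    def is_empty(self):
        return not self.items


def build_actual_params(params):
    """Accept a list of Variable or a list of Deref, as the new call() does."""
    actual = NodeList()
    for node in list(params.items):  # iterate a copy: nodes may be unlinked
        if isinstance(node, Deref):
            node.remove()            # steal the existing dereference
            actual.push_tail(node)
        else:
            assert isinstance(node, Variable)
            actual.push_tail(Deref(node))  # old behaviour: make a new one
    return actual
```

Passing a list of variables leaves the input list untouched, while
passing a list of dereferences drains it, which is what the
assert(parameters.is_empty()) in the patch checks.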

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/builtin_functions.cpp| 50 +++---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp |  8 -
 2 files changed, 46 insertions(+), 12 deletions(-)

diff --git a/src/compiler/glsl/builtin_functions.cpp b/src/compiler/glsl/builtin_functions.cpp
index db2d3e3..8158348 100644
--- a/src/compiler/glsl/builtin_functions.cpp
+++ b/src/compiler/glsl/builtin_functions.cpp
@@ -3317,13 +3317,29 @@ builtin_builder::asin_expr(ir_variable *x, float p0, float p1)
   mul(abs(x), imm(p1));
 }
 
+/**
+ * Generate a ir_call to a function with a set of parameters
+ *
+ * The input \c params can either be a list of \c ir_variable or a list of
+ * \c ir_dereference_variable.  In the latter case, all nodes will be removed
+ * from \c params and used directly as the parameters to the generated
+ * \c ir_call.
+ */
 ir_call *
 builtin_builder::call(ir_function *f, ir_variable *ret, exec_list params)
 {
exec_list actual_params;
 
-   foreach_in_list(ir_variable, var, &params) {
-  actual_params.push_tail(var_ref(var));
+   foreach_in_list_safe(ir_instruction, ir, &params) {
+  ir_dereference_variable *d = ir->as_dereference_variable();
+  if (d != NULL) {
+ d->remove();
+ actual_params.push_tail(d);
+  } else {
+ ir_variable *var = ir->as_variable();
+ assert(var != NULL);
+ actual_params.push_tail(var_ref(var));
+  }
}
 
ir_function_signature *sig =
@@ -5301,8 +5317,34 @@ builtin_builder::_atomic_counter_op1(const char *intrinsic,
MAKE_SIG(glsl_type::uint_type, avail, 2, counter, data);
 
ir_variable *retval = body.make_temp(glsl_type::uint_type, "atomic_retval");
-   body.emit(call(shader->symbols->get_function(intrinsic), retval,
-  sig->parameters));
+
+   /* Instead of generating an __intrinsic_atomic_sub, generate an
+* __intrinsic_atomic_add with the data parameter negated.
+*/
+   if (strcmp("__intrinsic_atomic_sub", intrinsic) == 0) {
+  ir_variable *const neg_data =
+ body.make_temp(glsl_type::uint_type, "neg_data");
+
+  body.emit(assign(neg_data, neg(data)));
+
+  exec_list parameters;
+
+  parameters.push_tail(new(mem_ctx) ir_dereference_variable(counter));
+  parameters.push_tail(new(mem_ctx) ir_dereference_variable(neg_data));
+
+  ir_function *const func =
+ shader->symbols->get_function("__intrinsic_atomic_add");
+  ir_instruction *const c = call(func, retval, parameters);
+
+  assert(c != NULL);
+  assert(parameters.is_empty());
+
+  body.emit(c);
+   } else {
+  body.emit(call(shader->symbols->get_function(intrinsic), retval,
+ sig->parameters));
+   }
+
body.emit(ret(retval));
return sig;
 }
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 7564119..cad8f9d 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -3210,13 +3210,6 @@ glsl_to_tgsi_visitor::visit_atomic_counter_intrinsic(ir_call *ir)
  val = 

[Mesa-dev] [PATCH 09/12] nir/intrinsics: Include atomic_counter_ in the names used in macro invocations

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

Otherwise grepping for where atomic_counter_inc and friends are defined
is a very frustrating experience.

Signed-off-by: Ian Romanick 
Reviewed-by: Iago Toral Quiroga 
---
 src/compiler/nir/nir_intrinsics.h | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/compiler/nir/nir_intrinsics.h b/src/compiler/nir/nir_intrinsics.h
index 2f74555..0f2c5ec 100644
--- a/src/compiler/nir/nir_intrinsics.h
+++ b/src/compiler/nir/nir_intrinsics.h
@@ -137,12 +137,12 @@ INTRINSIC(set_vertex_count, 1, ARR(1), false, 0, 0, 0, xx, xx, xx, 0)
  */
 
 #define ATOMIC(name, flags) \
-   INTRINSIC(atomic_counter_##name##_var, 0, ARR(0), true, 1, 1, 0, xx, xx, xx, flags) \
-   INTRINSIC(atomic_counter_##name, 1, ARR(1), true, 1, 0, 1, BASE, xx, xx, flags)
+   INTRINSIC(name##_var, 0, ARR(0), true, 1, 1, 0, xx, xx, xx, flags) \
+   INTRINSIC(name, 1, ARR(1), true, 1, 0, 1, BASE, xx, xx, flags)
 
-ATOMIC(inc, 0)
-ATOMIC(dec, 0)
-ATOMIC(read, NIR_INTRINSIC_CAN_ELIMINATE)
+ATOMIC(atomic_counter_inc, 0)
+ATOMIC(atomic_counter_dec, 0)
+ATOMIC(atomic_counter_read, NIR_INTRINSIC_CAN_ELIMINATE)
 
 /*
  * Image load, store and atomic intrinsics.
-- 
2.5.5



Re: [Mesa-dev] [PATCH 5/7] i965: Move load_interpolated_input/barycentric_* intrinsics to the top.

2016-07-19 Thread Jason Ekstrand
On Mon, Jul 18, 2016 at 1:26 PM, Kenneth Graunke 
wrote:

> Currently, i965 interpolates all FS inputs at the top of the program.
> This has advantages and disadvantages, but I'd like to keep that policy
> while reworking this code.  We can consider changing it independently.
>
> The next patch will make the compiler generate PLN instructions "on the
> fly", when it encounters an input load intrinsic, rather than doing it
> for all inputs at the start of the program.
>
> To emulate this behavior, we introduce an ugly pass to move all NIR
> load_interpolated_input and payload-based (not interpolator message)
> load_barycentric_* intrinsics to the shader's start block.
>
> This helps avoid regressions in shader-db for cases such as:
>
>if (...) {
>   ...load some input...
>} else {
>   ...load that same input...
>}
>
> which CSE can't handle, because there's no dominance relationship
> between the two loads.  Because the start block dominates all others,
> we can CSE all inputs and emit PLNs exactly once, as we did before.
>
> Ideally, global value numbering would eliminate these redundant loads,
> while not forcing them all the way to the start block.  When that lands,
> we should consider dropping this hacky pass.
>

Ugh... You're probably right that we need to do this but it's ugly...  I
look forward to deleting this code :)
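
To make the dominance argument concrete, here is a small sketch in
Python. It is not NIR code; the CFG encoding, the block names and both
function names are assumptions made for this example. It models a CSE
pass that only removes a load when an identical load sits in a
dominating block, and shows the if/else case surviving while the hoisted
case collapses to one load.

```python
def dominators(cfg, entry):
    """Iterative dominator sets for a CFG given as {block: [successors]}.

    Assumes every non-entry block is reachable from the entry block.
    """
    blocks = list(cfg)
    preds = {b: [] for b in blocks}
    for b, succs in cfg.items():
        for s in succs:
            preds[s].append(b)

    dom = {b: set(blocks) for b in blocks}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            if b == entry:
                continue
            new = {b} | set.intersection(*(dom[p] for p in preds[b]))
            if new != dom[b]:
                dom[b] = new
                changed = True
    return dom


def cse(loads, dom):
    """loads: list of (block, input_name) in program order.

    A load is dropped only if an identical, earlier load lives in a block
    that dominates it.
    """
    kept = []
    for block, inp in loads:
        if any(k_inp == inp and k_block in dom[block]
               for k_block, k_inp in kept):
            continue
        kept.append((block, inp))
    return kept


# Diamond-shaped CFG for "if (...) { ... } else { ... }".
CFG = {"start": ["then", "else"], "then": ["merge"],
       "else": ["merge"], "merge": []}
DOM = dominators(CFG, "start")

in_branches = cse([("then", "in0"), ("else", "in0")], DOM)
hoisted = cse([("start", "in0"), ("start", "in0")], DOM)
```

Neither branch dominates the other, so in_branches keeps both loads;
once both are moved to the start block, hoisted contains a single load.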


> Again, this pass currently does nothing, as i965 doesn't generate these
> intrinsics yet.  But it will shortly, and I figured I'd separate this
> code as it's relatively self-contained.
>
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp | 78
> 
>  1 file changed, 78 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
> b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index ea6616b..94127bc 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -6400,6 +6400,83 @@ computed_depth_mode(const nir_shader *shader)
>  }
>
>  /**
> + * Move load_interpolated_input with simple (payload-based) barycentric
> modes
> + * to the top of the program so we don't emit multiple PLNs for the same
> input.
> + *
> + * This works around CSE not being able to handle non-dominating cases
> + * such as:
> + *
> + *if (...) {
> + *   interpolate input
> + *} else {
> + *   interpolate the same exact input
> + *}
> + *
> + * This should be replaced by global value numbering someday.
> + */
> +void
> +move_interpolation_to_top(nir_shader *nir)
> +{
> +   nir_foreach_function(f, nir) {
> +  if (!f->impl)
> + continue;
> +
> +  nir_builder b;
> +  nir_builder_init(&b, f->impl);
> +  b.cursor = nir_before_block(nir_start_block(f->impl));
> +
> +  nir_foreach_block(block, f->impl) {
> + nir_foreach_instr_safe(instr, block) {
> +if (instr->type != nir_instr_type_intrinsic)
> +   continue;
> +
> +nir_intrinsic_instr *load = nir_instr_as_intrinsic(instr);
> +if (load->intrinsic != nir_intrinsic_load_interpolated_input)
> +   continue;
> +
> +nir_intrinsic_instr *bary =
> +   nir_instr_as_intrinsic(load->src[0].ssa->parent_instr);
> +
> +/* Leave interpolateAtSample/Offset() where it is. */
> +if (bary->intrinsic ==
> nir_intrinsic_load_barycentric_at_sample ||
> +bary->intrinsic ==
> nir_intrinsic_load_barycentric_at_offset)
> +   continue;
> +
> +/* Make a new load_barycentric_* intrinsic at the top */
> +nir_ssa_def *top_bary =
> +   nir_load_barycentric(&b, bary->intrinsic,
> +nir_intrinsic_interp_mode(bary));
> +
> +/* Make a new load_intrinsic_input at the top */
> +nir_intrinsic_instr *top_load =
> nir_intrinsic_instr_create(nir,
> +   nir_intrinsic_load_interpolated_input);
> +top_load->num_components = load->num_components;
> +top_load->src[0] = nir_src_for_ssa(top_bary);
> +/* We don't support indirects today - otherwise we might not
> + * be able to move this to the top. add_const_offset_to_base
> + * guarantees the offset will be 0.
> + */
> +assert(nir_src_as_const_value(load->src[1]) &&
> +   nir_src_as_const_value(load->src[1])->u32[0] == 0);
> +top_load->src[1] = nir_src_for_ssa(nir_imm_int(&b, 0));
> +top_load->const_index[0] = load->const_index[0];
> +top_load->const_index[1] = load->const_index[1];
> +nir_ssa_dest_init(&top_load->instr, &top_load->dest,
> +  load->dest.ssa.num_components,
> +  load->dest.ssa.bit_size, NULL);
> +
> +nir_ssa_def_rewrite_uses(&load->dest.ssa,
> +
>  nir_src_for_ssa(&top_load->dest.ssa));
> +

Re: [Mesa-dev] [PATCH 2/7] nir: Add a nir_lower_io flag for using load_interpolated_input intrins.

2016-07-19 Thread Jason Ekstrand
On Mon, Jul 18, 2016 at 10:00 PM, Chris Forbes  wrote:

> Seems a little unfortunate to add a random bool to this interface which is
> otherwise fairly descriptive, but OK.
>

I agree that this is a bit unfortunate.  I was going to suggest adding a
flags parameter and a bitfield union but a flags parameter for a single
bool is also kind-of stupid.  I vote we keep it as-is and make it a flags
parameter once we have 2 bools.

--Jason


>
> On Tue, Jul 19, 2016 at 8:26 AM, Kenneth Graunke 
> wrote:
>
>> While my intention is that the new intrinsics should be usable by all
>> drivers, we need to make them optional until all drivers switch.
>>
>> This doesn't do anything yet, but I added it as a separate patch to
>> keep the interface churn separate for easier review.
>>
>> Signed-off-by: Kenneth Graunke 
>> ---
>>  src/compiler/nir/nir.h  |  3 ++-
>>  src/compiler/nir/nir_lower_io.c | 15 +++
>>  src/gallium/drivers/freedreno/ir3/ir3_cmdline.c |  2 +-
>>  src/mesa/drivers/dri/i965/brw_blorp.c   |  2 +-
>>  src/mesa/drivers/dri/i965/brw_nir.c | 18 +-
>>  src/mesa/drivers/dri/i965/brw_program.c |  4 ++--
>>  src/mesa/state_tracker/st_glsl_to_nir.cpp   |  2 +-
>>  7 files changed, 27 insertions(+), 19 deletions(-)
>>
>> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
>> index ac11998..e996e0e 100644
>> --- a/src/compiler/nir/nir.h
>> +++ b/src/compiler/nir/nir.h
>> @@ -2324,7 +2324,8 @@ void nir_assign_var_locations(struct exec_list
>> *var_list, unsigned *size,
>>
>>  void nir_lower_io(nir_shader *shader,
>>nir_variable_mode modes,
>> -  int (*type_size)(const struct glsl_type *));
>> +  int (*type_size)(const struct glsl_type *),
>> +  bool use_load_interpolated_input_intrinsics);
>>  nir_src *nir_get_io_offset_src(nir_intrinsic_instr *instr);
>>  nir_src *nir_get_io_vertex_index_src(nir_intrinsic_instr *instr);
>>
>> diff --git a/src/compiler/nir/nir_lower_io.c
>> b/src/compiler/nir/nir_lower_io.c
>> index b05a73f..aa8a517 100644
>> --- a/src/compiler/nir/nir_lower_io.c
>> +++ b/src/compiler/nir/nir_lower_io.c
>> @@ -39,6 +39,7 @@ struct lower_io_state {
>> void *mem_ctx;
>> int (*type_size)(const struct glsl_type *type);
>> nir_variable_mode modes;
>> +   bool use_interpolated_input;
>>  };
>>
>>  void
>> @@ -394,7 +395,8 @@ nir_lower_io_block(nir_block *block,
>>  static void
>>  nir_lower_io_impl(nir_function_impl *impl,
>>nir_variable_mode modes,
>> -  int (*type_size)(const struct glsl_type *))
>> +  int (*type_size)(const struct glsl_type *),
>> +  bool use_interpolated_input)
>>  {
>> struct lower_io_state state;
>>
>> @@ -402,6 +404,7 @@ nir_lower_io_impl(nir_function_impl *impl,
>> state.mem_ctx = ralloc_parent(impl);
>> state.modes = modes;
>> state.type_size = type_size;
>> +   state.use_interpolated_input = use_interpolated_input;
>>
>> nir_foreach_block(block, impl) {
>>nir_lower_io_block(block, &state);
>> @@ -413,11 +416,15 @@ nir_lower_io_impl(nir_function_impl *impl,
>>
>>  void
>>  nir_lower_io(nir_shader *shader, nir_variable_mode modes,
>> - int (*type_size)(const struct glsl_type *))
>> + int (*type_size)(const struct glsl_type *),
>> + bool use_interpolated_input)
>>  {
>> nir_foreach_function(function, shader) {
>> -  if (function->impl)
>> - nir_lower_io_impl(function->impl, modes, type_size);
>> +  if (function->impl) {
>> + nir_lower_io_impl(function->impl, modes, type_size,
>> +   use_interpolated_input &&
>> +   shader->stage == MESA_SHADER_FRAGMENT);
>> +  }
>> }
>>  }
>>
>> diff --git a/src/gallium/drivers/freedreno/ir3/ir3_cmdline.c
>> b/src/gallium/drivers/freedreno/ir3/ir3_cmdline.c
>> index 41532fc..a8a8c1b 100644
>> --- a/src/gallium/drivers/freedreno/ir3/ir3_cmdline.c
>> +++ b/src/gallium/drivers/freedreno/ir3/ir3_cmdline.c
>> @@ -93,7 +93,7 @@ load_glsl(unsigned num_files, char* const* files,
>> gl_shader_stage stage)
>> // TODO nir_assign_var_locations??
>>
>> NIR_PASS_V(nir, nir_lower_system_values);
>> -   NIR_PASS_V(nir, nir_lower_io, nir_var_all, st_glsl_type_size);
>> +   NIR_PASS_V(nir, nir_lower_io, nir_var_all, st_glsl_type_size,
>> false);
>> NIR_PASS_V(nir, nir_lower_samplers, prog);
>>
>> return nir;
>> diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c
>> b/src/mesa/drivers/dri/i965/brw_blorp.c
>> index 282a5b2..0473cfe 100644
>> --- a/src/mesa/drivers/dri/i965/brw_blorp.c
>> +++ b/src/mesa/drivers/dri/i965/brw_blorp.c
>> @@ -209,7 +209,7 @@ brw_blorp_compile_nir_shader(struct brw_context *brw,
>> struct nir_shader *nir,
>>unsigned end = var->data.location 

[Mesa-dev] introducing radv - proof of concept vulkan driver for AMD VI chipsets

2016-07-19 Thread Dave Airlie
I was waiting for an open source driver to appear when I realised I
should really just write one myself, some talking with Bas later, and
we decided to see where we could get.

This is the point at which we were willing to show it to others, it's
not really a vulkan driver yet, so far it's a vulkan triangle demos
driver.

It renders the tri and cube demos from the Vulkan loader, the triangle
demo from Sascha Willems' demos, and the Vulkan CTS smoke tests (all 4
of them, one of which draws a triangle).

There is a lot of work to do, and it's at the stage where we are
seeing if anyone else wants to join in at the start, before we make
too many serious design decisions or take a path we really don't want
to.

So far it's only been run on Tonga and Fiji chips I think, we are
hoping to support radeon kernel driver for SI/CIK at some point, but I
think we need to get things a bit further on VI chips first.

The code is currently here:
https://github.com/airlied/mesa/tree/semi-interesting

There is a not-interesting branch which contains all the pre-history
which might be useful for someone else bringing up a vulkan driver on
other hardware.

The code is pretty much based on the Intel anv driver, with the winsys
ported from the gallium driver and most of the state setup from there.
Bas wrote the code to connect NIR<->LLVM IR so we could reuse it in the
future for SPIR-V in GL if required. It also copies AMD addrlib over
(this should be shared).

Also we don't go from SPIR-V to LLVM directly. We use NIR as it has the
best chance for inter-shader-stage optimisations (vertex/fragment
combined), which neither SPIR-V nor LLVM handles for us (NIR doesn't do
it yet, but it can).

If you want to submit bug reports, they will only be taken seriously
if accompanied by working patches at this stage, and we've no plans to
merge to master yet, but open to discussion on when we could do that
and what would be required.

Dave.


Re: [Mesa-dev] [PATCH 1/7] nir: Add new intrinsics for fragment shader input interpolation.

2016-07-19 Thread Jason Ekstrand
On Mon, Jul 18, 2016 at 1:26 PM, Kenneth Graunke 
wrote:

> Backends can normally handle shader inputs solely by looking at
> load_input intrinsics, and ignore the nir_variables in nir->inputs.
>
> One exception is fragment shader inputs.  load_input doesn't capture
> the necessary interpolation information - flat, smooth, noperspective
> mode, and centroid, sample, or pixel for the location.  This means
> that backends have to interpolate based on the nir_variables, then
> associate those with the load_input intrinsics (say, by storing a
> map of which variables are at which locations).
>
> With GL_ARB_enhanced_layouts, we're going to have multiple varyings
> packed into a single vec4 location.  The intrinsics make this easy:
> simply load N components from location .  However,
> working with variables and correlating the two is very awkward; we'd
> much rather have intrinsics capture all the necessary information.
>
> Fragment shader input interpolation typically works by producing a
> set of barycentric coordinates, then using those to do a linear
> interpolation between the values at the triangle's corners.
>
> We represent this by introducing five new load_barycentric_* intrinsics:
>
> - load_barycentric_pixel (ordinary variable)
> - load_barycentric_centroid  (centroid qualified variable)
> - load_barycentric_sample(sample qualified variable)
> - load_barycentric_at_sample (ARB_gpu_shader5's interpolateAtSample())
> - load_barycentric_at_offset (ARB_gpu_shader5's interpolateAtOffset())
>
> Each of these take the interpolation mode (smooth or noperspective only)
> as a const_index, and produce a vec2.  The last two also take a sample
> or offset source.
>
> We then introduce a new load_interpolated_input intrinsic, which
> is like a normal load_input intrinsic, but with an additional
> barycentric coordinate source.
>
> The intention is that flat inputs will still use regular load_input
> intrinsics.  This makes them distinguishable from normal inputs that
> need fancy interpolation, while also providing all the necessary data.
>
> This nicely unifies regular inputs and interpolateAt functions.
> Qualifiers and variables become irrelevant; there are just
> load_barycentric intrinsics that determine the interpolation.
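
For readers unfamiliar with the scheme, the interpolation step can be
sketched as below. This is plain Python, not NIR; the function name is
invented, and the convention that the vec2 holds the weights of the
second and third vertices (with the first weight implied) is an
assumption made for the example, not something the patch states.

```python
def interpolate(bary, v0, v1, v2):
    """Linearly interpolate a per-vertex value across a triangle.

    bary is a two-component barycentric coordinate; the third weight is
    implied because the three weights sum to one.
    """
    b1, b2 = bary
    b0 = 1.0 - b1 - b2
    return b0 * v0 + b1 * v1 + b2 * v2
```

At a corner the result is exactly that corner's value, for example
interpolate((1.0, 0.0), 1.0, 2.0, 3.0) gives the second vertex's value.
A flat input skips this step entirely, which is why it can stay on a
plain load_input.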
>
> Signed-off-by: Kenneth Graunke 
> ---
>  src/compiler/nir/nir.h|  6 ++
>  src/compiler/nir/nir_builder.h| 11 +++
>  src/compiler/nir/nir_intrinsics.h | 24 
>  src/compiler/nir/nir_lower_io.c   |  1 +
>  src/compiler/nir/nir_print.c  |  1 +
>  5 files changed, 43 insertions(+)
>
> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
> index c5d3b6b..ac11998 100644
> --- a/src/compiler/nir/nir.h
> +++ b/src/compiler/nir/nir.h
> @@ -992,6 +992,11 @@ typedef enum {
>  */
> NIR_INTRINSIC_COMPONENT = 8,
>
> +   /**
> +* Interpolation mode (only meaningful for FS inputs).
> +*/
> +   NIR_INTRINSIC_INTERP_MODE = 9,
> +
> NIR_INTRINSIC_NUM_INDEX_FLAGS,
>
>  } nir_intrinsic_index_flag;
> @@ -1059,6 +1064,7 @@ INTRINSIC_IDX_ACCESSORS(range, RANGE, unsigned)
>  INTRINSIC_IDX_ACCESSORS(desc_set, DESC_SET, unsigned)
>  INTRINSIC_IDX_ACCESSORS(binding, BINDING, unsigned)
>  INTRINSIC_IDX_ACCESSORS(component, COMPONENT, unsigned)
> +INTRINSIC_IDX_ACCESSORS(interp_mode, INTERP_MODE, unsigned)
>
>  /**
>   * \group texture information
> diff --git a/src/compiler/nir/nir_builder.h
> b/src/compiler/nir/nir_builder.h
> index 09cdf72..435582a 100644
> --- a/src/compiler/nir/nir_builder.h
> +++ b/src/compiler/nir/nir_builder.h
> @@ -458,6 +458,17 @@ nir_load_system_value(nir_builder *build,
> nir_intrinsic_op op, int index)
> return >dest.ssa;
>  }
>
> +static inline nir_ssa_def *
> +nir_load_barycentric(nir_builder *build, nir_intrinsic_op op,
> + unsigned interp_mode)
> +{
> +   nir_intrinsic_instr *bary = nir_intrinsic_instr_create(build->shader,
> op);
> +   nir_ssa_dest_init(&bary->instr, &bary->dest, 2, 32, NULL);
> +   nir_intrinsic_set_interp_mode(bary, interp_mode);
> +   nir_builder_instr_insert(build, &bary->instr);
> +   return &bary->dest.ssa;
> +}
> +
>  static inline void
>  nir_jump(nir_builder *build, nir_jump_type jump_type)
>  {
> diff --git a/src/compiler/nir/nir_intrinsics.h
> b/src/compiler/nir/nir_intrinsics.h
> index 2f74555..29917e3 100644
> --- a/src/compiler/nir/nir_intrinsics.h
> +++ b/src/compiler/nir/nir_intrinsics.h
> @@ -306,6 +306,27 @@ SYSTEM_VALUE(num_work_groups, 3, 0, xx, xx, xx)
>  SYSTEM_VALUE(helper_invocation, 1, 0, xx, xx, xx)
>  SYSTEM_VALUE(channel_num, 1, 0, xx, xx, xx)
>
> +/**
> + * Barycentric coordinate intrinsics.
> + *
> + * These set up the barycentric coordinates for a particular
> interpolation.
> + * The first three are for the simple cases: pixel, centroid, or
> per-sample
> + * (at gl_SampleID).  The next two handle interpolating at a specified
> + * sample location, or interpolating with a vec2 

Re: [Mesa-dev] [PATCH 00/56] Die copy-and-paste code, die

2016-07-19 Thread Matt Turner
On Tue, Jul 19, 2016 at 12:24 PM, Ian Romanick  wrote:
> After seeing Dave's series to add support for GL_ARB_gpu_shader_int64 and
> thinking about adding support for 8- and 16-bit integers, I decided
> that something had to be done about the cut-and-paste madness that is
> ir_constant_expression.cpp.  I decided to take a page from Jason's book
> and generate it from a machine description of the expressions.  The
> result is this series.
>
> You may notice from some of the earlier patches in this series that I
> started this work over a year ago.  The previous work was an attempt to
> generate opt_algebraic.cpp which was ultimately abandoned.  It may be
> worth picking that up again.
>
> I haven't done *anything* for SCons, so hopefully Jose or someone can
> help out there.
>
> All of this is available at:
>
> https://cgit.freedesktop.org/~idr/mesa/log/?h=generated-glsl-ir
>
> Other possible follow-up work:
>
>  - A few expressions don't have constant evaluation support.  I don't
>think I've seen a real shader use any of these, so there's a reason
>we haven't "missed" them.
>
> - frexp_sig

A program could do something crazy like

  float array[int(frexp(42.0, exp))];

in which case we'd need to handle this.

> - frexp_exp

exp is an "out" parameter of frexp(), so it can't be used for things
like sizing an array.
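
For reference, constant evaluation of the two halves could look roughly
like this. It is a sketch using Python's math.frexp, which returns a
significand in [0.5, 1) and an integer exponent; mapping its two results
onto frexp_sig and frexp_exp is my reading of the operation names, and
the function names below are invented.

```python
import math


def fold_frexp_sig(x):
    """Constant-fold the significand half of frexp()."""
    return math.frexp(x)[0]


def fold_frexp_exp(x):
    """Constant-fold the exponent half of frexp()."""
    return math.frexp(x)[1]


sig = fold_frexp_sig(42.0)
exp = fold_frexp_exp(42.0)
```

The two results recombine to the input, since math.ldexp(sig, exp)
gives back 42.0.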

> - vote_any
> - vote_all
> - vote_eq

These are trivial. For constant inputs, any/all return the argument,
and vote_eq returns true.
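
A sketch of that folding rule, in Python with invented function names.
The assumption is that a constant operand has the same value in every
invocation, so a vote over N identical values can be decided without
knowing N.

```python
def fold_vote_any(value):
    """any() over identical booleans is just that boolean."""
    return value


def fold_vote_all(value):
    """all() over identical booleans is just that boolean."""
    return value


def fold_vote_eq(value):
    """Every invocation holds the same constant, so they always agree."""
    return True


def simulate(value, invocations=8):
    """Brute-force reference: evaluate the votes over N identical lanes."""
    lanes = [value] * invocations
    return any(lanes), all(lanes), len(set(lanes)) == 1
```

The folded results match the brute-force simulation for both true and
false inputs.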

> - imul_high
> - carry
> - borrow

Same story as frexp_exp.

>
>  - Generate validation code for expressions.  A few times while
>developing this series I had questions about what the IR actually
>supported.  In quite a few cases the IR support is different from
>what GLSL supports.  I would often look to ir_validate.cpp to answer

What can you say... often GLSL is stupid. :)

Really though, we have some intentional differences like allowing
vector versions of logical operations.

I'll start reviewing.


[Mesa-dev] [Bug 89599] symbol 'x86_64_entry_start' is already defined when building with LLVM/clang

2016-07-19 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=89599

--- Comment #12 from austinengl...@gmail.com  ---
(In reply to Matt Turner from comment #11)
> I sent a modified version of Tomasz's patch last week to mesa-dev.
> 
> Would anyone like to test it?
> 
> [PATCH] mapi: Massage code to allow clang to compile.

https://lists.freedesktop.org/archives/mesa-dev/2016-July/122804.html works for
me, with mesa 11.0.6 and clang 3.7.1.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH] nir: Fix uninitialized use of 'replacement'.

2016-07-19 Thread Anuj Phogat
On Tue, Jul 19, 2016 at 12:25 PM, Kenneth Graunke  wrote:
> For intrinsics we don't care about, just skip to the next loop iteration
> and process the next instruction.  We don't want to execute the rest of
> the code.
>
> This was a bug in commit cdfc05ea6e8c87876cdbf588aa8e03d70f3da4bb.
>
> Signed-off-by: Kenneth Graunke 
> ---
>  src/compiler/nir/nir_lower_io.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/compiler/nir/nir_lower_io.c b/src/compiler/nir/nir_lower_io.c
> index 71d2432..189370d 100644
> --- a/src/compiler/nir/nir_lower_io.c
> +++ b/src/compiler/nir/nir_lower_io.c
> @@ -352,41 +352,41 @@ nir_lower_io_block(nir_block *block,
>
>case nir_intrinsic_store_var:
>   replacement = lower_store(intrin, state, vertex_index, offset);
>   break;
>
>case nir_intrinsic_var_atomic_add:
>case nir_intrinsic_var_atomic_imin:
>case nir_intrinsic_var_atomic_umin:
>case nir_intrinsic_var_atomic_imax:
>case nir_intrinsic_var_atomic_umax:
>case nir_intrinsic_var_atomic_and:
>case nir_intrinsic_var_atomic_or:
>case nir_intrinsic_var_atomic_xor:
>case nir_intrinsic_var_atomic_exchange:
>case nir_intrinsic_var_atomic_comp_swap:
>   assert(vertex_index == NULL);
>   replacement = lower_atomic(intrin, state, offset);
>   break;
>
>default:
> - break;
> + continue;
>}
>
>if (nir_intrinsic_infos[intrin->intrinsic].has_dest) {
>   if (intrin->dest.is_ssa) {
>  nir_ssa_dest_init(&replacement->instr, &replacement->dest,
>intrin->dest.ssa.num_components,
>intrin->dest.ssa.bit_size, NULL);
>  nir_ssa_def_rewrite_uses(&intrin->dest.ssa,
>   nir_src_for_ssa(&replacement->dest.ssa));
>   } else {
>  nir_dest_copy(&replacement->dest, &intrin->dest, state->mem_ctx);
>   }
>}
>
>nir_instr_insert_before(&intrin->instr, &replacement->instr);
>nir_instr_remove(&intrin->instr);
> }
>
> return true;
>  }
> --
> 2.9.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reviewed-by: Anuj Phogat 
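
The control flow being fixed can be sketched as follows. This is a
Python analogue with invented names, not the NIR code: in the C
original, "break" only leaves the switch, so the replacement code after
it still runs, whereas "continue" moves on to the next instruction.

```python
def lower_block(instrs, lowerers):
    """instrs: list of (opcode, operand) tuples.

    lowerers maps the opcodes we care about to a lowering function;
    everything else is passed through untouched.
    """
    out = []
    for instr in instrs:
        opcode = instr[0]
        if opcode not in lowerers:
            # Equivalent of "default: continue;": skip the rest of the
            # loop body so no stale or unset replacement is ever used.
            out.append(instr)
            continue

        replacement = lowerers[opcode](instr)
        out.append(replacement)
    return out


LOWERERS = {
    "load_var": lambda i: ("load_input", i[1]),
    "store_var": lambda i: ("store_output", i[1]),
}

result = lower_block([("load_var", 1), ("nop", 0), ("store_var", 2)],
                     LOWERERS)
```

The unrelated "nop" instruction comes out unchanged, while the two
variable accesses are rewritten.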


[Mesa-dev] [PATCH 55/56] glsl: Refactor handling of horizontal operations

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/ir_expression_operation.py | 15 +++
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/src/compiler/glsl/ir_expression_operation.py b/src/compiler/glsl/ir_expression_operation.py
index bd71370..bac5a12 100644
--- a/src/compiler/glsl/ir_expression_operation.py
+++ b/src/compiler/glsl/ir_expression_operation.py
@@ -365,22 +365,21 @@ class operation(object):
   if self.c_expression is None:
  return None
 
-  if self.num_operands == 1:
- if horizontal_operation in self.flags and non_assign_operation in self.flags:
+  if horizontal_operation in self.flags:
+ if non_assign_operation in self.flags:
 return constant_template_horizontal_nonassignment.render(op=self)
- elif horizontal_operation in self.flags:
+ elif types_identical_operation in self.flags:
+return constant_template_horizontal_single_implementation.render(op=self)
+ else:
 return constant_template_horizontal.render(op=self)
-  elif self.num_operands == 2:
+
+  if self.num_operands == 2:
  if self.name == "mul":
 return constant_template_mul.render(op=self)
  elif self.name == "vector_extract":
 return constant_template_vector_extract.render(op=self)
  elif vector_scalar_operation in self.flags:
 return constant_template_vector_scalar.render(op=self)
- elif horizontal_operation in self.flags and types_identical_operation in self.flags:
-return constant_template_horizontal_single_implementation.render(op=self)
- elif horizontal_operation in self.flags:
-return constant_template_horizontal.render(op=self)
   elif self.num_operands == 3:
  if self.name == "vector_insert":
 return constant_template_vector_insert.render(op=self)
-- 
2.5.5



[Mesa-dev] [PATCH 56/56] glsl: Replace most assertions with unreachable()

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/ir_expression_operation.py | 19 +--
 1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/src/compiler/glsl/ir_expression_operation.py b/src/compiler/glsl/ir_expression_operation.py
index bac5a12..c466c08 100644
--- a/src/compiler/glsl/ir_expression_operation.py
+++ b/src/compiler/glsl/ir_expression_operation.py
@@ -103,7 +103,7 @@ constant_template_common = mako.template.Template("""\
 break;
 % endfor
  default:
-assert(0);
+unreachable("invalid type");
  }
   }
   break;""")
@@ -134,7 +134,7 @@ constant_template_vector_scalar = mako.template.Template("""\
 break;
 % endfor
  default:
-assert(0);
+unreachable("invalid type");
  }
   }
   break;""")
@@ -157,7 +157,7 @@ constant_template_mul = mako.template.Template("""\
break;
 % endfor
 default:
-   assert(0);
+   unreachable("invalid type");
 }
  }
   } else {
@@ -215,7 +215,7 @@ constant_template_horizontal = mako.template.Template("""\
  break;
 % endfor
   default:
- assert(0);
+ unreachable("invalid type");
   }
   break;""")
 
@@ -232,7 +232,7 @@ constant_template_vector_extract = mako.template.Template("""\
  break;
 % endfor
   default:
- assert(0);
+ unreachable("invalid type");
   }
   break;
}""")
@@ -251,8 +251,7 @@ constant_template_vector_insert = mako.template.Template("""\
  break;
 % endfor
   default:
- assert(!"Should not get here.");
- break;
+ unreachable("invalid type");
   }
   break;
}""")
@@ -268,7 +267,7 @@ constant_template_vector = mako.template.Template("""\
 break;
 % endfor
  default:
-assert(0);
+unreachable("invalid type");
  }
   }
   break;""")
@@ -292,7 +291,7 @@ constant_template_lrp = mako.template.Template("""\
 break;
 % endfor
  default:
-assert(0);
+unreachable("invalid type");
  }
   }
   break;
@@ -311,7 +310,7 @@ constant_template_csel = mako.template.Template("""\
 break;
 % endfor
  default:
-assert(0);
+unreachable("invalid type");
  }
   }
   break;""")
-- 
2.5.5



[Mesa-dev] [PATCH 51/56] glsl: Eliminate constant_template0

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

This template is mostly an artefact of the development of the original
patch series and to minimize the differences between the original code
and the generated code.

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/ir_expression_operation.py | 38 +++-
 1 file changed, 4 insertions(+), 34 deletions(-)

diff --git a/src/compiler/glsl/ir_expression_operation.py b/src/compiler/glsl/ir_expression_operation.py
index 0ad27f6..44f4f68 100644
--- a/src/compiler/glsl/ir_expression_operation.py
+++ b/src/compiler/glsl/ir_expression_operation.py
@@ -90,27 +90,10 @@ signed_numeric_types = (int_type, float_type, double_type)
 integer_types = (uint_type, int_type)
 real_types = (float_type, double_type)
 
-# This template is for unary and binary operations that can only have operands
-# of a single type or the implementation for all types is identical.
-# ir_unop_logic_not is an example of the former, and ir_quadop_bitfield_insert
-# is an example of the latter..
-constant_template0 = mako.template.Template("""\
-   case ${op.get_enum_name()}:
-% if len(op.source_types) == 1:
-  assert(op[0]->type->base_type == ${op.source_types[0].glsl_type});
-% endif
-  for (unsigned c = 0; c < op[0]->type->components(); c++)
-% for (dst_type, src_types) in op.signatures():
-% if loop.index == 0:
- data.${dst_type.union_field}[c] = ${op.get_c_expression(src_types)};
-% endif
-% endfor
-  break;""")
-
 # This template is for operations that can have operands of a several
 # different types, and each type may or may not has a different C expression.
-# ir_unop_bit_not and ir_unop_neg are examples.
-constant_template3 = mako.template.Template("""\
+# This is used by most operations.
+constant_template_common = mako.template.Template("""\
case ${op.get_enum_name()}:
   for (unsigned c = 0; c < op[0]->type->components(); c++) {
  switch (this->type->base_type) {
@@ -417,10 +400,6 @@ class operation(object):
 return constant_template2.render(op=self)
  elif self.dest_type is not None:
 return constant_template5.render(op=self)
- elif len(self.source_types) == 1:
-return constant_template0.render(op=self)
- else:
-return constant_template3.render(op=self)
   elif self.num_operands == 2:
  if self.name == "mul":
 return constant_template_mul.render(op=self)
@@ -432,12 +411,8 @@ class operation(object):
 return 
constant_template_horizontal_single_implementation.render(op=self)
  elif horizontal_operation in self.flags:
 return constant_template_horizontal.render(op=self)
- elif len(self.source_types) == 1:
-return constant_template0.render(op=self)
  elif self.dest_type is not None:
 return constant_template5.render(op=self)
- else:
-return constant_template3.render(op=self)
   elif self.num_operands == 3:
  if self.name == "vector_insert":
 return constant_template_vector_insert.render(op=self)
@@ -445,15 +420,11 @@ class operation(object):
 return constant_template_lrp.render(op=self)
  elif self.name == "csel":
 return constant_template_csel.render(op=self)
- else:
-return constant_template3.render(op=self)
   elif self.num_operands == 4:
  if self.name == "vector":
 return constant_template_vector.render(op=self)
- elif types_identical_operation in self.flags:
-return constant_template0.render(op=self)
 
-  return None
+  return constant_template_common.render(op=self)
 
 
def get_c_expression(self, types, indices=("c", "c", "c")):
@@ -722,8 +693,7 @@ ir_expression_operation = [
operation("bitfield_insert", 4,
  all_signatures=((uint_type, (uint_type, uint_type, int_type, 
int_type)),
  (int_type, (int_type, int_type, int_type, 
int_type))),
- c_expression="bitfield_insert({src0}, {src1}, {src2}, {src3})",
- flags=types_identical_operation),
+ c_expression="bitfield_insert({src0}, {src1}, {src2}, {src3})"),
 
operation("vector", 4, source_types=all_types, 
c_expression="anything-except-None"),
 ]
-- 
2.5.5
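
The control-flow change in get_template() can be pictured as "match the special cases first, then fall through to one shared template" instead of returning None. What follows is a minimal sketch under stated assumptions: `pick_template` and `SPECIAL_BY_NAME` are hypothetical names, the flag-based branches of the real function are left out, and only the template names are taken from the patch.

```python
# Hypothetical, simplified stand-in for operation.get_template().
# Only the template names come from the patch; the lookup table is an
# assumption made for illustration, and flag-based branches are omitted.
SPECIAL_BY_NAME = {
    (2, "mul"): "constant_template_mul",
    (3, "vector_insert"): "constant_template_vector_insert",
    (3, "lrp"): "constant_template_lrp",
    (3, "csel"): "constant_template_csel",
    (4, "vector"): "constant_template_vector",
}


def pick_template(name, num_operands):
    """Return the name of the template an operation would be rendered with."""
    special = SPECIAL_BY_NAME.get((num_operands, name))
    if special is not None:
        return special
    # Anything without a dedicated template shares a single fallback.
    return "constant_template_common"


print(pick_template("lrp", 3))
print(pick_template("bitfield_insert", 4))
```

With a single fallback at the end, a new operation needs a template entry only when its generated code really differs from the common shape.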



[Mesa-dev] [PATCH 52/56] glsl: Eliminate constant_template5

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

constant_template_common can now handle the case where the result type
is different from the input type by using type_signature_iter.

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/ir_expression_operation.py | 23 +--
 1 file changed, 1 insertion(+), 22 deletions(-)

diff --git a/src/compiler/glsl/ir_expression_operation.py 
b/src/compiler/glsl/ir_expression_operation.py
index 44f4f68..4a0dda9 100644
--- a/src/compiler/glsl/ir_expression_operation.py
+++ b/src/compiler/glsl/ir_expression_operation.py
@@ -96,7 +96,7 @@ real_types = (float_type, double_type)
 constant_template_common = mako.template.Template("""\
case ${op.get_enum_name()}:
   for (unsigned c = 0; c < op[0]->type->components(); c++) {
- switch (this->type->base_type) {
+ switch (op[0]->type->base_type) {
 % for (dst_type, src_types) in op.signatures():
  case ${src_types[0].glsl_type}:
 data.${dst_type.union_field}[c] = 
${op.get_c_expression(src_types)};
@@ -117,23 +117,6 @@ constant_template2 = mako.template.Template("""\
  data.${op.dest_type.union_field}[c] = 
${op.get_c_expression(op.source_types)};
   break;""")
 
-# This template is for operations with an output type that doesn't match the
-# input types.
-constant_template5 = mako.template.Template("""\
-   case ${op.get_enum_name()}:
-  for (unsigned c = 0; c < components; c++) {
- switch (op[0]->type->base_type) {
-% for (dst_type, src_types) in op.signatures():
- case ${src_types[0].glsl_type}:
-data.${dst_type.union_field}[c] = 
${op.get_c_expression(src_types)};
-break;
-% endfor
- default:
-assert(0);
- }
-  }
-  break;""")
-
 # This template is for binary operations that can operate on some combination
 # of scalar and vector operands.
 constant_template_vector_scalar = mako.template.Template("""\
@@ -398,8 +381,6 @@ class operation(object):
 return 
constant_template_horizontal_single_implementation.render(op=self)
  elif self.dest_type is not None and len(self.source_types) == 1:
 return constant_template2.render(op=self)
- elif self.dest_type is not None:
-return constant_template5.render(op=self)
   elif self.num_operands == 2:
  if self.name == "mul":
 return constant_template_mul.render(op=self)
@@ -411,8 +392,6 @@ class operation(object):
 return 
constant_template_horizontal_single_implementation.render(op=self)
  elif horizontal_operation in self.flags:
 return constant_template_horizontal.render(op=self)
- elif self.dest_type is not None:
-return constant_template5.render(op=self)
   elif self.num_operands == 3:
  if self.name == "vector_insert":
 return constant_template_vector_insert.render(op=self)
-- 
2.5.5
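
One way to see why the switch has to key on op[0]->type rather than this->type: when the result type is fixed (a comparison producing bool, say), every signature shares the same destination type, so only the operand type can tell the cases apart. The sketch below is an illustration, not the generator itself. `GlslType`, `signatures` and `emit_switch` are hypothetical helpers, plain string formatting stands in for the Mako template, and the `<` expression is an invented example.

```python
from collections import namedtuple

# Hypothetical miniature of the generator's type table.
GlslType = namedtuple("GlslType", ["union_field", "glsl_type"])

FLOAT = GlslType("f", "GLSL_TYPE_FLOAT")
DOUBLE = GlslType("d", "GLSL_TYPE_DOUBLE")
BOOL = GlslType("b", "GLSL_TYPE_BOOL")


def signatures(source_types, num_operands, dest_type=None):
    """Yield (dest_type, source_types) pairs, one per source type.

    With dest_type=None the result type follows the source type;
    otherwise every signature shares the given result type.
    """
    for src in source_types:
        dst = src if dest_type is None else dest_type
        yield (dst, num_operands * (src,))


def emit_switch(source_types, num_operands, c_expression, dest_type=None):
    """Return C source for a switch keyed on the operand type."""
    lines = ["switch (op[0]->type->base_type) {"]
    for dst, srcs in signatures(source_types, num_operands, dest_type):
        operands = {
            "src{}".format(i): "op[{}]->value.{}[c]".format(i, t.union_field)
            for i, t in enumerate(srcs)
        }
        lines.append("case {}:".format(srcs[0].glsl_type))
        lines.append("   data.{}[c] = {};".format(
            dst.union_field, c_expression.format(**operands)))
        lines.append("   break;")
    lines += ["default:", "   assert(0);", "}"]
    return "\n".join(lines)


print(emit_switch((FLOAT, DOUBLE), 2, "{src0} < {src1}", dest_type=BOOL))
```

Both cases write data.b, but each reads its operands from a different union field, which is the situation constant_template5 used to cover.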



[Mesa-dev] [PATCH 29/56] glsl: Begin generating code for the most basic constant expressions

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

This covers unary operations where all of the supported types use the
same C expression for evaluation.

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/ir_expression_operation.py | 175 ---
 1 file changed, 158 insertions(+), 17 deletions(-)

diff --git a/src/compiler/glsl/ir_expression_operation.py 
b/src/compiler/glsl/ir_expression_operation.py
index a2337d8..6866baf 100644
--- a/src/compiler/glsl/ir_expression_operation.py
+++ b/src/compiler/glsl/ir_expression_operation.py
@@ -24,8 +24,97 @@
 import mako.template
 import sys
 
+class type(object):
+   def __init__(self, c_type, union_field, glsl_type):
+  self.c_type = c_type
+  self.union_field = union_field
+  self.glsl_type = glsl_type
+
+
+class type_signature_iter(object):
+   """Basic iterator for a set of type signatures.  Various kinds of sequences 
of
+   types come in, and an iteration of type_signature objects come out.
+
+   """
+
+   def __init__(self, source_types, num_operands):
+  """Initialize an iterator from a sequence of input types and a number
+  operands.  This is for signatures where all the operands have the same
+  type and the result type of the operation is the same as the input type.
+
+  """
+  self.dest_type = None
+  self.source_types = source_types
+  self.num_operands = num_operands
+  self.i = 0
+
+   def __init__(self, dest_type, source_types, num_operands):
+  """Initialize an iterator from a result tpye, a sequence of input types 
and a
+  number operands.  This is for signatures where all the operands have the
+  same type but the result type of the operation is different from the
+  input type.
+
+  """
+  self.dest_type = dest_type
+  self.source_types = source_types
+  self.num_operands = num_operands
+  self.i = 0
+
+   def __iter__(self):
+  return self
+
+   def next(self):
+  if self.i < len(self.source_types):
+ i = self.i
+ self.i += 1
+
+ if self.dest_type is None:
+dest_type = self.source_types[i]
+ else:
+dest_type = self.dest_type
+
+ return (dest_type, self.num_operands * (self.source_types[i],))
+  else:
+ raise StopIteration()
+
+
+uint_type = type("unsigned", "u", "GLSL_TYPE_UINT")
+int_type = type("int", "i", "GLSL_TYPE_INT")
+float_type = type("float", "f", "GLSL_TYPE_FLOAT")
+double_type = type("double", "d", "GLSL_TYPE_DOUBLE")
+bool_type = type("bool", "b", "GLSL_TYPE_BOOL")
+
+numeric_types = (uint_type, int_type, float_type, double_type)
+integer_types = (uint_type, int_type)
+real_types = (float_type, double_type)
+
+# This template is for unary operations that can only have operands of a
+# single type.  ir_unop_logic_not is an example.
+constant_template0 = mako.template.Template("""\
+   case ${op.get_enum_name()}:
+  assert(op[0]->type->base_type == ${op.source_types[0].glsl_type});
+  for (unsigned c = 0; c < op[0]->type->components(); c++)
+ data.${op.source_types[0].union_field}[c] = 
${op.get_c_expression(op.source_types)};
+  break;""")
+
+# This template is for unary operations that can have operands of a several
+# different types.  ir_unop_bit_not is an example.
+constant_template1 = mako.template.Template("""\
+   case ${op.get_enum_name()}:
+  switch (op[0]->type->base_type) {
+% for (dst_type, src_types) in op.signatures():
+  case ${src_types[0].glsl_type}:
+ for (unsigned c = 0; c < op[0]->type->components(); c++)
+data.${dst_type.union_field}[c] = 
${op.get_c_expression(src_types)};
+ break;
+% endfor
+  default:
+ assert(0);
+  }
+  break;""")
+
 class operation(object):
-   def __init__(self, name, num_operands, printable_name = None):
+   def __init__(self, name, num_operands, printable_name = None, source_types 
= None, c_expression = None):
   self.name = name
   self.num_operands = num_operands
 
@@ -34,24 +123,60 @@ class operation(object):
   else:
  self.printable_name = printable_name
 
+  self.source_types = source_types
+  self.dest_type = None
+
+  if c_expression is None:
+ self.c_expression = None
+  elif isinstance(c_expression, str):
+ self.c_expression = {'default': c_expression}
+  else:
+ self.c_expression = c_expression
+
 
def get_enum_name(self):
   return "ir_{}op_{}".format(("un", "bin", "tri", 
"quad")[self.num_operands-1], self.name)
 
 
+   def get_template(self):
+  if self.c_expression is None:
+ return None
+
+  if self.num_operands == 1:
+ if len(self.source_types) == 1:
+return constant_template0.render(op=self)
+ else:
+return constant_template1.render(op=self)
+
+  return None
+
+
+   def get_c_expression(self, types):
+  src0 = 

[Mesa-dev] [PATCH 45/56] glsl: Generate code for constant ir_quadop_bitfield_insert expressions

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/ir_expression_operation.py | 25 +
 1 file changed, 21 insertions(+), 4 deletions(-)

diff --git a/src/compiler/glsl/ir_expression_operation.py 
b/src/compiler/glsl/ir_expression_operation.py
index 033d947..66d015a 100644
--- a/src/compiler/glsl/ir_expression_operation.py
+++ b/src/compiler/glsl/ir_expression_operation.py
@@ -91,12 +91,20 @@ integer_types = (uint_type, int_type)
 real_types = (float_type, double_type)
 
 # This template is for unary and binary operations that can only have operands
-# of a single type.  ir_unop_logic_not is an example.
+# of a single type or the implementation for all types is identical.
+# ir_unop_logic_not is an example of the former, and ir_quadop_bitfield_insert
+# is an example of the latter..
 constant_template0 = mako.template.Template("""\
case ${op.get_enum_name()}:
+% if len(op.source_types) == 1:
   assert(op[0]->type->base_type == ${op.source_types[0].glsl_type});
+% endif
   for (unsigned c = 0; c < op[0]->type->components(); c++)
- data.${op.source_types[0].union_field}[c] = 
${op.get_c_expression(op.source_types)};
+% for (dst_type, src_types) in op.signatures():
+% if loop.index == 0:
+ data.${dst_type.union_field}[c] = ${op.get_c_expression(src_types)};
+% endif
+% endfor
   break;""")
 
 # This template is for unary operations that can have operands of a several
@@ -394,6 +402,9 @@ class operation(object):
 return constant_template_vector_insert.render(op=self)
  else:
 return constant_template3.render(op=self)
+  elif self.num_operands == 4:
+ if types_identical_operation in self.flags:
+return constant_template0.render(op=self)
 
   return None
 
@@ -402,12 +413,14 @@ class operation(object):
   src0 = "op[0]->value.{}[{}]".format(types[0].union_field, indices[0])
   src1 = "op[1]->value.{}[{}]".format(types[1].union_field, indices[1]) if 
len(types) >= 2 else "ERROR"
   src2 = "op[2]->value.{}[{}]".format(types[2].union_field, indices[2]) if 
len(types) >= 3 else "ERROR"
+  src3 = "op[3]->value.{}[c]".format(types[3].union_field) if len(types) 
>= 4 else "ERROR"
 
   expr = self.c_expression[types[0].union_field] if types[0].union_field 
in self.c_expression else self.c_expression['default']
 
   return expr.format(src0=src0,
  src1=src1,
- src2=src2)
+ src2=src2,
+ src3=src3)
 
 
def signatures(self):
@@ -657,7 +670,11 @@ ir_expression_operation = [
# operand2 is the index in operand0 to be modified
operation("vector_insert", 3, source_types=all_types, 
c_expression="anything-except-None"),
 
-   operation("bitfield_insert", 4),
+   operation("bitfield_insert", 4,
+ all_signatures=((uint_type, (uint_type, uint_type, int_type, 
int_type)),
+ (int_type, (int_type, int_type, int_type, 
int_type))),
+ c_expression="bitfield_insert({src0}, {src1}, {src2}, {src3})",
+ flags=types_identical_operation),
 
operation("vector", 4),
 ]
-- 
2.5.5
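
The placeholder expansion that this patch extends to a fourth operand can be sketched in isolation. This is a simplified, free-standing take on get_c_expression(), assuming the expression table is a dict keyed by union field with a 'default' entry (as in the patch). The function signature and the loop are illustrative, not the code in the tree.

```python
def get_c_expression(c_expression, union_fields, indices=("c", "c", "c", "c")):
    """Expand {src0}..{src3} placeholders into C operand accesses.

    c_expression maps a union field name (or 'default') to a format string.
    union_fields holds the union field of each operand, e.g. ("u", "u", "i", "i").
    """
    operands = {}
    for n in range(4):
        key = "src{}".format(n)
        if n < len(union_fields):
            operands[key] = "op[{}]->value.{}[{}]".format(
                n, union_fields[n], indices[n])
        else:
            # Operand not present for this operation; make misuse obvious.
            operands[key] = "ERROR"

    # A per-type expression wins over the 'default' one.
    expr = c_expression.get(union_fields[0], c_expression["default"])
    return expr.format(**operands)


table = {"default": "bitfield_insert({src0}, {src1}, {src2}, {src3})"}
print(get_c_expression(table, ("u", "u", "i", "i")))
```

Because str.format ignores unused keyword arguments, unary and binary expressions can be expanded through the same call without caring about the extra placeholders.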



[Mesa-dev] [PATCH 24/56] glsl: Sort GLSL type enums in switch-statements in enum order

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/ir_constant_expression.cpp | 56 ++--
 1 file changed, 28 insertions(+), 28 deletions(-)

diff --git a/src/compiler/glsl/ir_constant_expression.cpp 
b/src/compiler/glsl/ir_constant_expression.cpp
index cfd60cc..35945c1 100644
--- a/src/compiler/glsl/ir_constant_expression.cpp
+++ b/src/compiler/glsl/ir_constant_expression.cpp
@@ -657,14 +657,14 @@ ir_expression::constant_expression_value(struct 
hash_table *variable_context)
switch (this->operation) {
case ir_unop_bit_not:
switch (op[0]->type->base_type) {
-   case GLSL_TYPE_INT:
-   for (unsigned c = 0; c < components; c++)
-   data.i[c] = ~ op[0]->value.i[c];
-   break;
case GLSL_TYPE_UINT:
for (unsigned c = 0; c < components; c++)
data.u[c] = ~ op[0]->value.u[c];
break;
+   case GLSL_TYPE_INT:
+   for (unsigned c = 0; c < components; c++)
+   data.i[c] = ~ op[0]->value.i[c];
+   break;
default:
assert(0);
}
@@ -1423,12 +1423,12 @@ ir_expression::constant_expression_value(struct 
hash_table *variable_context)
  case GLSL_TYPE_FLOAT:
 data.b[c] = op[0]->value.f[c] == op[1]->value.f[c];
 break;
- case GLSL_TYPE_BOOL:
-data.b[c] = op[0]->value.b[c] == op[1]->value.b[c];
-break;
  case GLSL_TYPE_DOUBLE:
 data.b[c] = op[0]->value.d[c] == op[1]->value.d[c];
 break;
+ case GLSL_TYPE_BOOL:
+data.b[c] = op[0]->value.b[c] == op[1]->value.b[c];
+break;
  default:
 assert(0);
  }
@@ -1447,12 +1447,12 @@ ir_expression::constant_expression_value(struct 
hash_table *variable_context)
  case GLSL_TYPE_FLOAT:
 data.b[c] = op[0]->value.f[c] != op[1]->value.f[c];
 break;
- case GLSL_TYPE_BOOL:
-data.b[c] = op[0]->value.b[c] != op[1]->value.b[c];
-break;
  case GLSL_TYPE_DOUBLE:
 data.b[c] = op[0]->value.d[c] != op[1]->value.d[c];
 break;
+ case GLSL_TYPE_BOOL:
+data.b[c] = op[0]->value.b[c] != op[1]->value.b[c];
+break;
  default:
 assert(0);
  }
@@ -1519,12 +1519,12 @@ ir_expression::constant_expression_value(struct 
hash_table *variable_context)
c0 += c0_inc, c1 += c1_inc, c++) {
 
   switch (op[0]->type->base_type) {
-  case GLSL_TYPE_INT:
-  data.i[c] = op[0]->value.i[c0] & op[1]->value.i[c1];
-  break;
   case GLSL_TYPE_UINT:
   data.u[c] = op[0]->value.u[c0] & op[1]->value.u[c1];
   break;
+  case GLSL_TYPE_INT:
+  data.i[c] = op[0]->value.i[c0] & op[1]->value.i[c1];
+  break;
   default:
   assert(0);
   }
@@ -1537,12 +1537,12 @@ ir_expression::constant_expression_value(struct 
hash_table *variable_context)
c0 += c0_inc, c1 += c1_inc, c++) {
 
   switch (op[0]->type->base_type) {
-  case GLSL_TYPE_INT:
-  data.i[c] = op[0]->value.i[c0] | op[1]->value.i[c1];
-  break;
   case GLSL_TYPE_UINT:
   data.u[c] = op[0]->value.u[c0] | op[1]->value.u[c1];
   break;
+  case GLSL_TYPE_INT:
+  data.i[c] = op[0]->value.i[c0] | op[1]->value.i[c1];
+  break;
   default:
   assert(0);
   }
@@ -1581,12 +1581,12 @@ ir_expression::constant_expression_value(struct 
hash_table *variable_context)
c0 += c0_inc, c1 += c1_inc, c++) {
 
   switch (op[0]->type->base_type) {
-  case GLSL_TYPE_INT:
-  data.i[c] = op[0]->value.i[c0] ^ op[1]->value.i[c1];
-  break;
   case GLSL_TYPE_UINT:
   data.u[c] = op[0]->value.u[c0] ^ op[1]->value.u[c1];
   break;
+  case GLSL_TYPE_INT:
+  data.i[c] = op[0]->value.i[c0] ^ op[1]->value.i[c1];
+  break;
   default:
   assert(0);
   }
@@ -1747,21 +1747,21 @@ ir_expression::constant_expression_value(struct 
hash_table *variable_context)
   memcpy(&data, &op[0]->value, sizeof(data));
 
   switch (this->type->base_type) {
-  case GLSL_TYPE_INT:
- data.i[idx] = op[1]->value.i[0];
- break;
   case GLSL_TYPE_UINT:
  data.u[idx] = op[1]->value.u[0];
  break;
+  case GLSL_TYPE_INT:
+ data.i[idx] = op[1]->value.i[0];
+ break;
   case GLSL_TYPE_FLOAT:
  data.f[idx] = op[1]->value.f[0];
  break;
-  case GLSL_TYPE_BOOL:
- data.b[idx] = op[1]->value.b[0];
- break;
   case GLSL_TYPE_DOUBLE:
  data.d[idx] = op[1]->value.d[0];
  

[Mesa-dev] [PATCH 09/56] glsl: Generate the ir_last_* values

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

This ensures that they remain correct if the list is rearranged or new
opcodes are added.  I checked a diff of before and after to ensure that
each ir_last_ had the same value.

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/ir_expression_operation.py | 48 
 1 file changed, 20 insertions(+), 28 deletions(-)

diff --git a/src/compiler/glsl/ir_expression_operation.py 
b/src/compiler/glsl/ir_expression_operation.py
index 35f1be6..743ca91 100644
--- a/src/compiler/glsl/ir_expression_operation.py
+++ b/src/compiler/glsl/ir_expression_operation.py
@@ -162,13 +162,7 @@ ir_expression_operation = [
("vote_any", 1, None, None),
("vote_all", 1, None, None),
("vote_eq", 1, None, None),
-
-"""
-   /**
-* A sentinel marking the last of the unary operations.
-*/
-   ir_last_unop = ir_unop_vote_eq,
-""",
+   "",
("add", 2, "+", None),
("sub", 2, "-", None),
("mul", 2, "*", "Floating-point or low 32-bit integer multiply."),
@@ -284,11 +278,6 @@ ir_expression_operation = [
("interpolate_at_sample", 2, None, None),
 """
/**
-* A sentinel marking the last of the binary operations.
-*/
-   ir_last_binop = ir_binop_interpolate_at_sample,
-
-   /**
 * \\name Fused floating-point multiply-add, part of ARB_gpu_shader5.
 */
/*@{*/""",
@@ -319,25 +308,10 @@ ir_expression_operation = [
 * operand2 is the index in operand0 to be modified
 */""",
("vector_insert", 3, None, None),
-"""
-   /**
-* A sentinel marking the last of the ternary operations.
-*/
-   ir_last_triop = ir_triop_vector_insert,
-""",
+   "",
("bitfield_insert", 4, None, None),
"",
("vector", 4, None, None),
-"""
-   /**
-* A sentinel marking the last of the ternary operations.
-*/
-   ir_last_quadop = ir_quadop_vector,
-
-   /**
-* A sentinel marking the last of all operations.
-*/
-   ir_last_opcode = ir_quadop_vector""",
 ]
 
 def name_from_item(item):
@@ -378,7 +352,25 @@ ${item}
${name_from_item(item)},${"" if item[3] is None else " /**< {} 
*/".format(item[3])}
 %  endif
 % endfor
+
+   /**
+* Sentinels marking the last of each kind of operation;
+*/
+% for (name, i) in lasts:
+   ir_last_${("un", "bin", "tri", "quad")[i]}op = ${name_from_item((name, 
i+1))},
+% endfor
+   ir_last_opcode = ir_quadop_${lasts[3][0]}
 };""")
 
+   lasts = [None, None, None, None]
+   for item in reversed(ir_expression_operation):
+  if isinstance(item, str):
+ continue
+
+  i = item[1] - 1
+  if lasts[i] is None:
+ lasts[i] = (item[0], i)
+
print(enum_template.render(values=ir_expression_operation,
+  lasts=lasts,
   name_from_item=name_from_item))
-- 
2.5.5
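
The sentinel computation can be tried on its own. This sketch runs the same reverse scan over a toy list; the list contents are a small made-up subset, and `find_lasts` and `PREFIXES` are names chosen here rather than taken from the patch.

```python
# Toy operation list: (name, num_operands) tuples, with plain strings
# standing in for the comment blocks the real list interleaves.
operations = [
    ("bit_not", 1),
    ("vote_eq", 1),
    "",
    ("add", 2),
    ("interpolate_at_sample", 2),
    "/* comment block */",
    ("fma", 3),
    ("vector_insert", 3),
    "",
    ("bitfield_insert", 4),
    ("vector", 4),
]

PREFIXES = ("un", "bin", "tri", "quad")


def find_lasts(items):
    """Return the name of the last operation for each operand count (1..4)."""
    lasts = [None, None, None, None]
    for item in reversed(items):
        if isinstance(item, str):
            continue  # skip comment text
        i = item[1] - 1
        if lasts[i] is None:
            # First hit when walking backwards is the last in list order.
            lasts[i] = item[0]
    return lasts


for i, name in enumerate(find_lasts(operations)):
    print("ir_last_{0}op = ir_{0}op_{1},".format(PREFIXES[i], name))
```

Walking the list backwards means the first match per operand count is the answer, so reordering or appending operations needs no manual sentinel update.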



[Mesa-dev] [PATCH 47/56] glsl: Generate code for constant ir_triop_lrp expressions

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/ir_expression_operation.py | 29 +++-
 1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ir_expression_operation.py 
b/src/compiler/glsl/ir_expression_operation.py
index c53b66e..7161713 100644
--- a/src/compiler/glsl/ir_expression_operation.py
+++ b/src/compiler/glsl/ir_expression_operation.py
@@ -332,6 +332,31 @@ constant_template_vector = mako.template.Template("""\
   }
   break;""")
 
+# This template is for ir_triop_lrp.
+constant_template_lrp = mako.template.Template("""\
+   case ${op.get_enum_name()}: {
+  assert(op[0]->type->base_type == GLSL_TYPE_FLOAT ||
+ op[0]->type->base_type == GLSL_TYPE_DOUBLE);
+  assert(op[1]->type->base_type == GLSL_TYPE_FLOAT ||
+ op[1]->type->base_type == GLSL_TYPE_DOUBLE);
+  assert(op[2]->type->base_type == GLSL_TYPE_FLOAT ||
+ op[2]->type->base_type == GLSL_TYPE_DOUBLE);
+
+  unsigned c2_inc = op[2]->type->is_scalar() ? 0 : 1;
+  for (unsigned c = 0, c2 = 0; c < components; c2 += c2_inc, c++) {
+ switch (this->type->base_type) {
+% for (dst_type, src_types) in op.signatures():
+ case ${src_types[0].glsl_type}:
+data.${dst_type.union_field}[c] = ${op.get_c_expression(src_types, 
("c", "c", "c2"))};
+break;
+% endfor
+ default:
+assert(0);
+ }
+  }
+  break;
+   }""")
+
 
 vector_scalar_operation = "vector-scalar"
 horizontal_operation = "horizontal"
@@ -416,6 +441,8 @@ class operation(object):
   elif self.num_operands == 3:
  if self.name == "vector_insert":
 return constant_template_vector_insert.render(op=self)
+ elif self.name == "lrp":
+return constant_template_lrp.render(op=self)
  else:
 return constant_template3.render(op=self)
   elif self.num_operands == 4:
@@ -665,7 +692,7 @@ ir_expression_operation = [
# Fused floating-point multiply-add, part of ARB_gpu_shader5.
operation("fma", 3, source_types=real_types, c_expression="{src0} * {src1} 
+ {src2}"),
 
-   operation("lrp", 3),
+   operation("lrp", 3, source_types=real_types, c_expression={'f': "{src0} * 
(1.0f - {src2}) + ({src1} * {src2})", 'd': "{src0} * (1.0 - {src2}) + ({src1} * 
{src2})"}),
 
# Conditional Select
#
-- 
2.5.5
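
For reference, the arithmetic the lrp template generates is x * (1 - a) + y * a per component, with the third operand allowed to be a scalar that is reused for every component (the c2_inc trick). A small sketch of that behaviour, assuming Python lists in place of ir_constant values. The function and parameter names are illustrative.

```python
def lrp(x, y, a):
    """Component-wise linear interpolation: x * (1 - a) + y * a.

    x and y are equal-length sequences.  a is either the same length or a
    one-element sequence that is reused for every component.
    """
    a_inc = 0 if len(a) == 1 else 1  # mirrors c2_inc in the template
    result = []
    ai = 0
    for c in range(len(x)):
        result.append(x[c] * (1.0 - a[ai]) + y[c] * a[ai])
        ai += a_inc
    return result


print(lrp([0.0, 10.0], [4.0, 20.0], [0.5]))        # scalar weight
print(lrp([0.0, 10.0], [4.0, 20.0], [0.0, 1.0]))   # per-component weight
```

The generated C keeps separate 'f' and 'd' expressions only so that the literal 1.0 has the matching type; the formula is the same in both.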



[Mesa-dev] [PATCH 27/56] glsl: Compact a bunch of things onto one line

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

Even though they are much too long for that.

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/ir_constant_expression.cpp | 86 +++-
 1 file changed, 20 insertions(+), 66 deletions(-)

diff --git a/src/compiler/glsl/ir_constant_expression.cpp 
b/src/compiler/glsl/ir_constant_expression.cpp
index 34ebf30..7c231e5 100644
--- a/src/compiler/glsl/ir_constant_expression.cpp
+++ b/src/compiler/glsl/ir_constant_expression.cpp
@@ -996,76 +996,52 @@ ir_expression::constant_expression_value(struct 
hash_table *variable_context)
 
case ir_unop_pack_snorm_2x16:
   assert(op[0]->type == glsl_type::vec2_type);
-  data.u[0] = pack_2x16(pack_snorm_1x16,
-op[0]->value.f[0],
-op[0]->value.f[1]);
+  data.u[0] = pack_2x16(pack_snorm_1x16, op[0]->value.f[0], 
op[0]->value.f[1]);
   break;
 
case ir_unop_pack_snorm_4x8:
   assert(op[0]->type == glsl_type::vec4_type);
-  data.u[0] = pack_4x8(pack_snorm_1x8,
-   op[0]->value.f[0],
-   op[0]->value.f[1],
-   op[0]->value.f[2],
-   op[0]->value.f[3]);
+  data.u[0] = pack_4x8(pack_snorm_1x8, op[0]->value.f[0], 
op[0]->value.f[1], op[0]->value.f[2], op[0]->value.f[3]);
   break;
 
case ir_unop_pack_unorm_2x16:
   assert(op[0]->type == glsl_type::vec2_type);
-  data.u[0] = pack_2x16(pack_unorm_1x16,
-op[0]->value.f[0],
-op[0]->value.f[1]);
+  data.u[0] = pack_2x16(pack_unorm_1x16, op[0]->value.f[0], 
op[0]->value.f[1]);
   break;
 
case ir_unop_pack_unorm_4x8:
   assert(op[0]->type == glsl_type::vec4_type);
-  data.u[0] = pack_4x8(pack_unorm_1x8,
-   op[0]->value.f[0],
-   op[0]->value.f[1],
-   op[0]->value.f[2],
-   op[0]->value.f[3]);
+  data.u[0] = pack_4x8(pack_unorm_1x8, op[0]->value.f[0], 
op[0]->value.f[1], op[0]->value.f[2], op[0]->value.f[3]);
   break;
 
case ir_unop_pack_half_2x16:
   assert(op[0]->type == glsl_type::vec2_type);
-  data.u[0] = pack_2x16(pack_half_1x16,
-op[0]->value.f[0],
-op[0]->value.f[1]);
+  data.u[0] = pack_2x16(pack_half_1x16, op[0]->value.f[0], 
op[0]->value.f[1]);
   break;
 
case ir_unop_unpack_snorm_2x16:
   assert(op[0]->type == glsl_type::uint_type);
-  unpack_2x16(unpack_snorm_1x16,
-  op[0]->value.u[0],
-  &data.f[0], &data.f[1]);
+  unpack_2x16(unpack_snorm_1x16, op[0]->value.u[0], &data.f[0], 
&data.f[1]);
   break;
 
case ir_unop_unpack_snorm_4x8:
   assert(op[0]->type == glsl_type::uint_type);
-  unpack_4x8(unpack_snorm_1x8,
- op[0]->value.u[0],
- &data.f[0], &data.f[1], &data.f[2], &data.f[3]);
+  unpack_4x8(unpack_snorm_1x8, op[0]->value.u[0], &data.f[0], &data.f[1], 
&data.f[2], &data.f[3]);
   break;
 
case ir_unop_unpack_unorm_2x16:
   assert(op[0]->type == glsl_type::uint_type);
-  unpack_2x16(unpack_unorm_1x16,
-  op[0]->value.u[0],
-  &data.f[0], &data.f[1]);
+  unpack_2x16(unpack_unorm_1x16, op[0]->value.u[0], &data.f[0], 
&data.f[1]);
   break;
 
case ir_unop_unpack_unorm_4x8:
   assert(op[0]->type == glsl_type::uint_type);
-  unpack_4x8(unpack_unorm_1x8,
- op[0]->value.u[0],
- &data.f[0], &data.f[1], &data.f[2], &data.f[3]);
+  unpack_4x8(unpack_unorm_1x8, op[0]->value.u[0], &data.f[0], &data.f[1], 
&data.f[2], &data.f[3]);
   break;
 
case ir_unop_unpack_half_2x16:
   assert(op[0]->type == glsl_type::uint_type);
-  unpack_2x16(unpack_half_1x16,
-  op[0]->value.u[0],
-  &data.f[0], &data.f[1]);
+  unpack_2x16(unpack_half_1x16, op[0]->value.u[0], &data.f[0], &data.f[1]);
   break;
 
case ir_unop_bitfield_reverse:
@@ -1251,18 +1227,10 @@ ir_expression::constant_expression_value(struct 
hash_table *variable_context)
 
  switch (op[0]->type->base_type) {
  case GLSL_TYPE_UINT:
-if (op[1]->value.u[c1] == 0) {
-   data.u[c] = 0;
-} else {
-   data.u[c] = op[0]->value.u[c0] / op[1]->value.u[c1];
-}
+data.u[c] = op[1]->value.u[c1] == 0 ? 0 : op[0]->value.u[c0] / 
op[1]->value.u[c1];
 break;
  case GLSL_TYPE_INT:
-if (op[1]->value.i[c1] == 0) {
-   data.i[c] = 0;
-} else {
-   data.i[c] = op[0]->value.i[c0] / op[1]->value.i[c1];
-}
+data.i[c] = op[1]->value.i[c1] == 0 ? 0 : op[0]->value.i[c0] / 
op[1]->value.i[c1];
 break;
  case GLSL_TYPE_FLOAT:
 data.f[c] = op[0]->value.f[c0] / op[1]->value.f[c1];
@@ -1285,32 +1253,22 @@ ir_expression::constant_expression_value(struct 
hash_table *variable_context)
 
  

[Mesa-dev] [PATCH 36/56] glsl: Generate code for some constant binary expression that are horizontal

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

This covers only operations whose implementation is identical regardless
of type.  The only such operations are ir_binop_all_equal and
ir_binop_any_nequal.

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/ir_expression_operation.py | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/src/compiler/glsl/ir_expression_operation.py 
b/src/compiler/glsl/ir_expression_operation.py
index 75dfa0f..882d5ea 100644
--- a/src/compiler/glsl/ir_expression_operation.py
+++ b/src/compiler/glsl/ir_expression_operation.py
@@ -180,7 +180,8 @@ constant_template_vector_scalar = 
mako.template.Template("""\
   }
   break;""")
 
-# This template is for unary operations that are horizontal.  That is, the
+# This template is for operations that are horizontal and either have only a
+# single type or the implementation for all types is identical.  That is, the
 # operation consumes a vector and produces a scalar.
 constant_template_horizontal_single_implementation = 
mako.template.Template("""\
case ${op.get_enum_name()}:
@@ -190,6 +191,7 @@ constant_template_horizontal_single_implementation = 
mako.template.Template("""\
 
 vector_scalar_operation = "vector-scalar"
 horizontal_operation = "horizontal"
+types_identical_operation = "identical"
 
 class operation(object):
def __init__(self, name, num_operands, printable_name = None, source_types 
= None, dest_type = None, c_expression = None, flags = None):
@@ -243,6 +245,8 @@ class operation(object):
   elif self.num_operands == 2:
  if vector_scalar_operation in self.flags:
 return constant_template_vector_scalar.render(op=self)
+ elif horizontal_operation in self.flags and types_identical_operation 
in self.flags:
+return 
constant_template_horizontal_single_implementation.render(op=self)
  elif len(self.source_types) == 1:
 return constant_template0.render(op=self)
  elif self.dest_type is not None:
@@ -425,11 +429,11 @@ ir_expression_operation = [
 
# Returns single boolean for whether all components of operands[0]
# equal the components of operands[1].
-   operation("all_equal", 2),
+   operation("all_equal", 2, source_types=all_types, dest_type=bool_type, 
c_expression="op[0]->has_value(op[1])", flags=frozenset((horizontal_operation, 
types_identical_operation))),
 
# Returns single boolean for whether any component of operands[0]
# is not equal to the corresponding component of operands[1].
-   operation("any_nequal", 2),
+   operation("any_nequal", 2, source_types=all_types, dest_type=bool_type, 
c_expression="!op[0]->has_value(op[1])", flags=frozenset((horizontal_operation, 
types_identical_operation))),
 
# Bit-wise binary operations.
operation("lshift", 2, printable_name="<<"),
-- 
2.5.5
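
"Horizontal" here means two vectors go in and a single boolean comes out, and the same comparison works whatever the element type. A rough Python analogue follows, with lists standing in for constant vectors. These helpers only approximate what op[0]->has_value(op[1]) does and are not a port of it.

```python
def all_equal(lhs, rhs):
    """True only if both vectors have the same length and equal components."""
    return len(lhs) == len(rhs) and all(a == b for a, b in zip(lhs, rhs))


def any_nequal(lhs, rhs):
    """True if any component differs; the exact negation of all_equal."""
    return not all_equal(lhs, rhs)


print(all_equal([1, 2, 3], [1, 2, 3]))
print(any_nequal([1.0, 2.0], [1.0, 2.5]))
```

Since neither helper inspects the element type, one implementation serves int, float and bool vectors alike, which is the property the types_identical_operation flag marks.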



[Mesa-dev] [PATCH 23/56] glsl: Always use correct float types in constant expression handling

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/ir_constant_expression.cpp | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/compiler/glsl/ir_constant_expression.cpp 
b/src/compiler/glsl/ir_constant_expression.cpp
index 07ded9f..cfd60cc 100644
--- a/src/compiler/glsl/ir_constant_expression.cpp
+++ b/src/compiler/glsl/ir_constant_expression.cpp
@@ -842,7 +842,7 @@ ir_expression::constant_expression_value(struct hash_table 
*variable_context)
   for (unsigned c = 0; c < op[0]->type->components(); c++) {
  switch (this->type->base_type) {
  case GLSL_TYPE_FLOAT:
-data.f[c] = op[0]->value.f[c] - floor(op[0]->value.f[c]);
+data.f[c] = op[0]->value.f[c] - floorf(op[0]->value.f[c]);
 break;
  case GLSL_TYPE_DOUBLE:
 data.d[c] = op[0]->value.d[c] - floor(op[0]->value.d[c]);
@@ -915,10 +915,10 @@ ir_expression::constant_expression_value(struct 
hash_table *variable_context)
 data.i[c] = (op[0]->value.i[c] > 0) - (op[0]->value.i[c] < 0);
 break;
  case GLSL_TYPE_FLOAT:
-data.f[c] = float((op[0]->value.f[c] > 0)-(op[0]->value.f[c] < 0));
+data.f[c] = float((op[0]->value.f[c] > 0.0F) - (op[0]->value.f[c] 
< 0.0F));
 break;
  case GLSL_TYPE_DOUBLE:
-data.d[c] = double((op[0]->value.d[c] > 0)-(op[0]->value.d[c] < 
0));
+data.d[c] = double((op[0]->value.d[c] > 0.0) - (op[0]->value.d[c] 
< 0.0));
 break;
  default:
 assert(0);
@@ -930,7 +930,7 @@ ir_expression::constant_expression_value(struct hash_table 
*variable_context)
   for (unsigned c = 0; c < op[0]->type->components(); c++) {
  switch (this->type->base_type) {
  case GLSL_TYPE_FLOAT:
-if (op[0]->value.f[c] != 0.0)
+if (op[0]->value.f[c] != 0.0F)
data.f[c] = 1.0F / op[0]->value.f[c];
 break;
  case GLSL_TYPE_DOUBLE:
@@ -997,7 +997,7 @@ ir_expression::constant_expression_value(struct hash_table 
*variable_context)
case ir_unop_dFdy_fine:
   assert(op[0]->type->base_type == GLSL_TYPE_FLOAT);
   for (unsigned c = 0; c < op[0]->type->components(); c++) {
- data.f[c] = 0.0;
+ data.f[c] = 0.0F;
   }
   break;
 
-- 
2.5.5



[Mesa-dev] [PATCH 43/56] glsl: Generate code for constant ir_binop_vector_extract expressions

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/ir_expression_operation.py | 22 +-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ir_expression_operation.py 
b/src/compiler/glsl/ir_expression_operation.py
index e7a74e3..ec04f57 100644
--- a/src/compiler/glsl/ir_expression_operation.py
+++ b/src/compiler/glsl/ir_expression_operation.py
@@ -270,6 +270,24 @@ constant_template_horizontal = mako.template.Template("""\
   }
   break;""")
 
+# This template is for ir_binop_vector_extract.
+constant_template_vector_extract = mako.template.Template("""\
+   case ${op.get_enum_name()}: {
+  const int c = CLAMP(op[1]->value.i[0], 0,
+  (int) op[0]->type->vector_elements - 1);
+
+  switch (op[0]->type->base_type) {
+% for (dst_type, src_types) in op.signatures():
+  case ${src_types[0].glsl_type}:
+ data.${dst_type.union_field}[0] = op[0]->value.${src_types[0].union_field}[c];
+ break;
+% endfor
+  default:
+ assert(0);
+  }
+  break;
+   }""")
+
 
 vector_scalar_operation = "vector-scalar"
 horizontal_operation = "horizontal"
@@ -337,6 +355,8 @@ class operation(object):
   elif self.num_operands == 2:
  if self.name == "mul":
 return constant_template_mul.render(op=self)
+ elif self.name == "vector_extract":
+return constant_template_vector_extract.render(op=self)
  elif vector_scalar_operation in self.flags:
 return constant_template_vector_scalar.render(op=self)
  elif horizontal_operation in self.flags and types_identical_operation 
in self.flags:
@@ -574,7 +594,7 @@ ir_expression_operation = [
#
# operand0 is the vector
# operand1 is the index of the field to read from operand0
-   operation("vector_extract", 2),
+   operation("vector_extract", 2, source_types=all_types, c_expression="anything-except-None"),
 
# Interpolate fs input at offset
#
-- 
2.5.5
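
The generated case clamps the index operand into the vector's valid range
before reading a single component. A minimal sketch of that behaviour in
Python, assuming a plain list stands in for the constant vector;
`vector_extract` here is a hypothetical helper, not part of the generator
script:

```python
def vector_extract(vector, index):
    """Read one component, clamping the index into the valid range."""
    # Mirrors CLAMP(index, 0, vector_elements - 1) from the template.
    c = max(0, min(index, len(vector) - 1))
    return vector[c]


if __name__ == "__main__":
    v = [10, 20, 30, 40]
    print(vector_extract(v, 2), vector_extract(v, -1), vector_extract(v, 7))
```

Out-of-range indices therefore read the first or last component instead of
reading outside the constant's storage.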



[Mesa-dev] [PATCH 49/56] glsl: Use the generated constant expression code

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

Immediately before this patch,

diff -wud src/compiler/glsl/ir_constant_expression.cpp \
  src/compiler/glsl/ir_expression_operation_constant.h

should be "minimal."

Signed-off-by: Ian Romanick 
---
 src/compiler/Makefile.glsl.am|6 +
 src/compiler/Makefile.sources|1 +
 src/compiler/glsl/ir_constant_expression.cpp | 1114 +-
 3 files changed, 8 insertions(+), 1113 deletions(-)

diff --git a/src/compiler/Makefile.glsl.am b/src/compiler/Makefile.glsl.am
index 125580d..b8225cb 100644
--- a/src/compiler/Makefile.glsl.am
+++ b/src/compiler/Makefile.glsl.am
@@ -201,6 +201,10 @@ glsl/ir_expression_operation.h: 
glsl/ir_expression_operation.py
$(MKDIR_GEN)
$(PYTHON_GEN) $(srcdir)/glsl/ir_expression_operation.py enum > $@ || 
($(RM) $@; false)
 
+glsl/ir_expression_operation_constant.h: glsl/ir_expression_operation.py
+   $(MKDIR_GEN)
+   $(PYTHON_GEN) $(srcdir)/glsl/ir_expression_operation.py constant > $@ 
|| ($(RM) $@; false)
+
 glsl/ir_expression_operation_strings.h: glsl/ir_expression_operation.py
$(MKDIR_GEN)
$(PYTHON_GEN) $(srcdir)/glsl/ir_expression_operation.py strings > $@ || 
($(RM) $@; false)
@@ -215,6 +219,7 @@ BUILT_SOURCES +=\
glsl/glsl_parser.cpp\
glsl/glsl_lexer.cpp \
glsl/ir_expression_operation.h  \
+   glsl/ir_expression_operation_constant.h \
glsl/ir_expression_operation_strings.h  \
glsl/glcpp/glcpp-parse.c\
glsl/glcpp/glcpp-lex.c
@@ -224,6 +229,7 @@ CLEANFILES +=   
\
glsl/glsl_parser.cpp\
glsl/glsl_lexer.cpp \
glsl/ir_expression_operation.h  \
+   glsl/ir_expression_operation_constant.h \
glsl/ir_expression_operation_strings.h  \
glsl/glcpp/glcpp-parse.c\
glsl/glcpp/glcpp-lex.c
diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources
index 6b54426..79b588f 100644
--- a/src/compiler/Makefile.sources
+++ b/src/compiler/Makefile.sources
@@ -146,6 +146,7 @@ GLSL_COMPILER_CXX_FILES = \
 # libglsl generated sources
 LIBGLSL_GENERATED_FILES = \
glsl/ir_expression_operation.h \
+   glsl/ir_expression_operation_constant.h \
glsl/ir_expression_operation_strings.h \
glsl/glsl_lexer.cpp \
glsl/glsl_parser.cpp \
diff --git a/src/compiler/glsl/ir_constant_expression.cpp 
b/src/compiler/glsl/ir_constant_expression.cpp
index 7c231e5..5d739ad 100644
--- a/src/compiler/glsl/ir_constant_expression.cpp
+++ b/src/compiler/glsl/ir_constant_expression.cpp
@@ -654,1119 +654,7 @@ ir_expression::constant_expression_value(struct 
hash_table *variable_context)
   return NULL;
}
 
-   switch (this->operation) {
-   case ir_unop_bit_not:
-   switch (op[0]->type->base_type) {
-   case GLSL_TYPE_UINT:
-   for (unsigned c = 0; c < components; c++)
-   data.u[c] = ~ op[0]->value.u[c];
-   break;
-   case GLSL_TYPE_INT:
-   for (unsigned c = 0; c < components; c++)
-   data.i[c] = ~ op[0]->value.i[c];
-   break;
-   default:
-   assert(0);
-   }
-   break;
-
-   case ir_unop_logic_not:
-  assert(op[0]->type->base_type == GLSL_TYPE_BOOL);
-  for (unsigned c = 0; c < op[0]->type->components(); c++)
-  data.b[c] = !op[0]->value.b[c];
-  break;
-
-   case ir_unop_neg:
-  for (unsigned c = 0; c < op[0]->type->components(); c++) {
- switch (this->type->base_type) {
- case GLSL_TYPE_UINT:
-data.u[c] = -((int) op[0]->value.u[c]);
-break;
- case GLSL_TYPE_INT:
-data.i[c] = -op[0]->value.i[c];
-break;
- case GLSL_TYPE_FLOAT:
-data.f[c] = -op[0]->value.f[c];
-break;
- case GLSL_TYPE_DOUBLE:
-data.d[c] = -op[0]->value.d[c];
-break;
- default:
-assert(0);
- }
-  }
-  break;
-
-   case ir_unop_abs:
-  for (unsigned c = 0; c < op[0]->type->components(); c++) {
- switch (this->type->base_type) {
- case GLSL_TYPE_INT:
-data.i[c] = op[0]->value.i[c];
-if (data.i[c] < 0)
-   data.i[c] = -data.i[c];
-break;
- case GLSL_TYPE_FLOAT:
-data.f[c] = fabs(op[0]->value.f[c]);
-break;
- case GLSL_TYPE_DOUBLE:
-data.d[c] = fabs(op[0]->value.d[c]);
-break;
- default:
-assert(0);
- }
-  }
-  break;
-
-   case ir_unop_sign:
-  for (unsigned c = 0; c < 

[Mesa-dev] [PATCH 30/56] glsl: Generate code for constant unary expression that map one type to another

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

ir_unop_i2b is omitted because its source can either be int or uint.
That makes it special.

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/ir_expression_operation.py | 81 +++-
 1 file changed, 57 insertions(+), 24 deletions(-)

diff --git a/src/compiler/glsl/ir_expression_operation.py 
b/src/compiler/glsl/ir_expression_operation.py
index 6866baf..bc1690b 100644
--- a/src/compiler/glsl/ir_expression_operation.py
+++ b/src/compiler/glsl/ir_expression_operation.py
@@ -113,8 +113,18 @@ constant_template1 = mako.template.Template("""\
   }
   break;""")
 
+# This template is for unary operations that map an operand of one type to an
+# operand of another type.  ir_unop_f2b is an example.
+constant_template2 = mako.template.Template("""\
+   case ${op.get_enum_name()}:
+  assert(op[0]->type->base_type == ${op.source_types[0].glsl_type});
+  for (unsigned c = 0; c < op[0]->type->components(); c++)
+ data.${op.dest_type.union_field}[c] = 
${op.get_c_expression(op.source_types)};
+  break;""")
+
+
 class operation(object):
-   def __init__(self, name, num_operands, printable_name = None, source_types 
= None, c_expression = None):
+   def __init__(self, name, num_operands, printable_name = None, source_types 
= None, dest_type = None, c_expression = None):
   self.name = name
   self.num_operands = num_operands
 
@@ -124,7 +134,7 @@ class operation(object):
  self.printable_name = printable_name
 
   self.source_types = source_types
-  self.dest_type = None
+  self.dest_type = dest_type
 
   if c_expression is None:
  self.c_expression = None
@@ -143,7 +153,9 @@ class operation(object):
  return None
 
   if self.num_operands == 1:
- if len(self.source_types) == 1:
+ if self.dest_type is not None:
+return constant_template2.render(op=self)
+ elif len(self.source_types) == 1:
 return constant_template0.render(op=self)
  else:
 return constant_template1.render(op=self)
@@ -177,27 +189,48 @@ ir_expression_operation = [
operation("exp2", 1, source_types=(float_type,), 
c_expression="exp2f({src0})"),
operation("log2", 1, source_types=(float_type,), 
c_expression="log2f({src0})"),
 
-   operation("f2i", 1), # Float-to-integer conversion.
-   operation("f2u", 1), # Float-to-unsigned conversion.
-   operation("i2f", 1), # Integer-to-float conversion.
-   operation("f2b", 1), # Float-to-boolean conversion
-   operation("b2f", 1), # Boolean-to-float conversion
-   operation("i2b", 1), # int-to-boolean conversion
-   operation("b2i", 1), # Boolean-to-int conversion
-   operation("u2f", 1), # Unsigned-to-float conversion.
-   operation("i2u", 1), # Integer-to-unsigned conversion.
-   operation("u2i", 1), # Unsigned-to-integer conversion.
-   operation("d2f", 1), # Double-to-float conversion.
-   operation("f2d", 1), # Float-to-double conversion.
-   operation("d2i", 1), # Double-to-integer conversion.
-   operation("i2d", 1), # Integer-to-double conversion.
-   operation("d2u", 1), # Double-to-unsigned conversion.
-   operation("u2d", 1), # Unsigned-to-double conversion.
-   operation("d2b", 1), # Double-to-boolean conversion.
-   operation("bitcast_i2f", 1), # 'Bit-identical int-to-float "conversion"
-   operation("bitcast_f2i", 1), # 'Bit-identical float-to-int "conversion"
-   operation("bitcast_u2f", 1), # 'Bit-identical uint-to-float "conversion"
-   operation("bitcast_f2u", 1), # 'Bit-identical float-to-uint "conversion"
+   # Float-to-integer conversion.
+   operation("f2i", 1, source_types=(float_type,), dest_type=int_type, 
c_expression="(int) {src0}"),
+   # Float-to-unsigned conversion.
+   operation("f2u", 1, source_types=(float_type,), dest_type=uint_type, 
c_expression="(unsigned) {src0}"),
+   # Integer-to-float conversion.
+   operation("i2f", 1, source_types=(int_type,), dest_type=float_type, 
c_expression="(float) {src0}"),
+   # Float-to-boolean conversion
+   operation("f2b", 1, source_types=(float_type,), dest_type=bool_type, 
c_expression="{src0} != 0.0F ? true : false"),
+   # Boolean-to-float conversion
+   operation("b2f", 1, source_types=(bool_type,), dest_type=float_type, 
c_expression="{src0} ? 1.0F : 0.0F"),
+   # int-to-boolean conversion
+   operation("i2b", 1),
+   # Boolean-to-int conversion
+   operation("b2i", 1, source_types=(bool_type,), dest_type=int_type, 
c_expression="{src0} ? 1 : 0"),
+   # Unsigned-to-float conversion.
+   operation("u2f", 1, source_types=(uint_type,), dest_type=float_type, 
c_expression="(float) {src0}"),
+   # Integer-to-unsigned conversion.
+   operation("i2u", 1, source_types=(int_type,), dest_type=uint_type, 
c_expression="{src0}"),
+   # Unsigned-to-integer conversion.
+   

[Mesa-dev] [PATCH 50/56] glsl: Eliminate one of the templates for simpler operations

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

The differences between these two templates were mostly an artefact of
how the original patch series was developed and of the effort to minimize
the differences between the original code and the generated code.

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/ir_expression_operation.py | 24 +++-
 1 file changed, 3 insertions(+), 21 deletions(-)

diff --git a/src/compiler/glsl/ir_expression_operation.py 
b/src/compiler/glsl/ir_expression_operation.py
index b1b7101..0ad27f6 100644
--- a/src/compiler/glsl/ir_expression_operation.py
+++ b/src/compiler/glsl/ir_expression_operation.py
@@ -107,25 +107,9 @@ constant_template0 = mako.template.Template("""\
 % endfor
   break;""")
 
-# This template is for unary operations that can have operands of a several
-# different types.  ir_unop_bit_not is an example.
-constant_template1 = mako.template.Template("""\
-   case ${op.get_enum_name()}:
-  switch (op[0]->type->base_type) {
-% for (dst_type, src_types) in op.signatures():
-  case ${src_types[0].glsl_type}:
- for (unsigned c = 0; c < op[0]->type->components(); c++)
-data.${dst_type.union_field}[c] = 
${op.get_c_expression(src_types)};
- break;
-% endfor
-  default:
- assert(0);
-  }
-  break;""")
-
-# This template is for unary operations that can have operands of a several
-# different types, and each type has a different C expression.  ir_unop_neg is
-# an example.
+# This template is for operations that can have operands of a several
+# different types, and each type may or may not has a different C expression.
+# ir_unop_bit_not and ir_unop_neg are examples.
 constant_template3 = mako.template.Template("""\
case ${op.get_enum_name()}:
   for (unsigned c = 0; c < op[0]->type->components(); c++) {
@@ -435,8 +419,6 @@ class operation(object):
 return constant_template5.render(op=self)
  elif len(self.source_types) == 1:
 return constant_template0.render(op=self)
- elif len(self.c_expression) == 1 and 'default' in self.c_expression:
-return constant_template1.render(op=self)
  else:
 return constant_template3.render(op=self)
   elif self.num_operands == 2:
-- 
2.5.5
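
The surviving template works for both cases because the C expression is
looked up per type, with a shared 'default' entry as the fallback. A
minimal sketch of that lookup, assuming c_expression is a dict keyed by
union field; `pick_expression`, `render`, and the two example dicts are
illustrative and not copied from the script:

```python
def pick_expression(c_expression, union_field):
    """Return the per-type expression, or the shared default."""
    if union_field in c_expression:
        return c_expression[union_field]
    return c_expression["default"]


def render(c_expression, union_field, index="c"):
    """Substitute the operand accessor into the chosen expression."""
    src0 = "op[0]->value.{}[{}]".format(union_field, index)
    return pick_expression(c_expression, union_field).format(src0=src0)


if __name__ == "__main__":
    # One expression for every type (the bit_not style).
    bit_not = {"default": "~ {src0}"}
    # A special case for one type plus a default (the neg style).
    neg = {"u": "-((int) {src0})", "default": "-{src0}"}
    print(render(bit_not, "i"))
    print(render(neg, "u"))
    print(render(neg, "f"))
```

An operation with a single expression is then just the degenerate case
where only 'default' is present.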



[Mesa-dev] [PATCH 33/56] glsl: Generate code for constant binary expressions that combine vector and scalar operands

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/ir_expression_operation.py | 66 +---
 1 file changed, 51 insertions(+), 15 deletions(-)

diff --git a/src/compiler/glsl/ir_expression_operation.py 
b/src/compiler/glsl/ir_expression_operation.py
index 2ac2a28..ac568ae 100644
--- a/src/compiler/glsl/ir_expression_operation.py
+++ b/src/compiler/glsl/ir_expression_operation.py
@@ -141,9 +141,32 @@ constant_template2 = mako.template.Template("""\
  data.${op.dest_type.union_field}[c] = 
${op.get_c_expression(op.source_types)};
   break;""")
 
+# This template is for binary operations that can operate on some combination
+# of scalar and vector operands.
+constant_template_vector_scalar = mako.template.Template("""\
+   case ${op.get_enum_name()}:
+  assert(op[0]->type == op[1]->type || op0_scalar || op1_scalar);
+  for (unsigned c = 0, c0 = 0, c1 = 0;
+   c < components;
+   c0 += c0_inc, c1 += c1_inc, c++) {
+
+ switch (op[0]->type->base_type) {
+% for (dst_type, src_types) in op.signatures():
+ case ${src_types[0].glsl_type}:
+data.${dst_type.union_field}[c] = ${op.get_c_expression(src_types, 
("c0", "c1"))};
+break;
+% endfor
+ default:
+assert(0);
+ }
+  }
+  break;""")
+
+
+vector_scalar_operation = "vector-scalar"
 
 class operation(object):
-   def __init__(self, name, num_operands, printable_name = None, source_types 
= None, dest_type = None, c_expression = None):
+   def __init__(self, name, num_operands, printable_name = None, source_types 
= None, dest_type = None, c_expression = None, flags = None):
   self.name = name
   self.num_operands = num_operands
 
@@ -162,6 +185,13 @@ class operation(object):
   else:
  self.c_expression = c_expression
 
+  if flags is None:
+ self.flags = frozenset()
+  elif isinstance(flags, str):
+ self.flags = frozenset([flags])
+  else:
+ self.flags = frozenset(flags)
+
 
def get_enum_name(self):
   return "ir_{}op_{}".format(("un", "bin", "tri", 
"quad")[self.num_operands-1], self.name)
@@ -181,19 +211,22 @@ class operation(object):
  else:
 return constant_template3.render(op=self)
   elif self.num_operands == 2:
- if len(self.source_types) == 1:
+ if vector_scalar_operation in self.flags:
+return constant_template_vector_scalar.render(op=self)
+ elif len(self.source_types) == 1:
 return constant_template0.render(op=self)
 
   return None
 
 
-   def get_c_expression(self, types):
-  src0 = "op[0]->value.{}[c]".format(types[0].union_field)
-  src1 = "op[1]->value.{}[c]".format(types[1].union_field) if len(types) 
>= 2 else "ERROR"
+   def get_c_expression(self, types, indices=("c", "c")):
+  src0 = "op[0]->value.{}[{}]".format(types[0].union_field, indices[0])
+  src1 = "op[1]->value.{}[{}]".format(types[1].union_field, indices[1]) if 
len(types) >= 2 else "ERROR"
 
   expr = self.c_expression[types[0].union_field] if types[0].union_field 
in self.c_expression else self.c_expression['default']
 
-  return expr.format(src0=src0)
+  return expr.format(src0=src0,
+ src1=src1)
 
 
def signatures(self):
@@ -329,12 +362,12 @@ ir_expression_operation = [
operation("vote_all", 1),
operation("vote_eq", 1),
 
-   operation("add", 2, printable_name="+"),
-   operation("sub", 2, printable_name="-"),
+   operation("add", 2, printable_name="+", source_types=numeric_types, 
c_expression="{src0} + {src1}", flags=vector_scalar_operation),
+   operation("sub", 2, printable_name="-", source_types=numeric_types, 
c_expression="{src0} - {src1}", flags=vector_scalar_operation),
# "Floating-point or low 32-bit integer multiply."
operation("mul", 2, printable_name="*"),
operation("imul_high", 2),   # Calculates the high 32-bits of a 64-bit 
multiply.
-   operation("div", 2, printable_name="/"),
+   operation("div", 2, printable_name="/", source_types=numeric_types, 
c_expression={'u': "{src1} == 0 ? 0 : {src0} / {src1}", 'i': "{src1} == 0 ? 0 : 
{src0} / {src1}", 'default': "{src0} / {src1}"}, flags=vector_scalar_operation),
 
# Returns the carry resulting from the addition of the two arguments.
operation("carry", 2),
@@ -344,7 +377,10 @@ ir_expression_operation = [
operation("borrow", 2),
 
# Either (vector % vector) or (vector % scalar)
-   operation("mod", 2, printable_name="%"),
+   #
+   # We don't use fmod because it rounds toward zero; GLSL specifies the use
+   # of floor.
+   operation("mod", 2, printable_name="%", source_types=numeric_types, 
c_expression={'u': "{src1} == 0 ? 0 : {src0} % {src1}", 'i': "{src1} == 0 ? 0 : 
{src0} % {src1}", 'f': "{src0} - {src1} * floorf({src0} / {src1})", 'd': 
"{src0} - {src1} * floor({src0} / 

[Mesa-dev] [PATCH 15/56] glsl: Delete spurious comment about mod not taking integer operands

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

This hasn't been true since we added support for GLSL 1.30.

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/ir_expression_operation.py | 7 +--
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/src/compiler/glsl/ir_expression_operation.py 
b/src/compiler/glsl/ir_expression_operation.py
index 5c7ad35..8098dac 100644
--- a/src/compiler/glsl/ir_expression_operation.py
+++ b/src/compiler/glsl/ir_expression_operation.py
@@ -145,12 +145,7 @@ ir_expression_operation = [
# from the first argument.
("borrow", 2, None),
 
-   # Takes one of two combinations of arguments:
-   #
-   # - mod(vecN, vecN)
-   # - mod(vecN, float)
-   #
-   # Does not take integer types.
+   # Either (vector % vector) or (vector % scalar)
("mod", 2, "%"),
 
# Binary comparison operators which return a boolean vector.
-- 
2.5.5
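
The replacement comment names two operand shapes, vector % vector and
vector % scalar. A minimal sketch for floating-point operands, assuming
the floor-based definition that GLSL gives for mod, x - y * floor(x / y);
`glsl_mod` and its list-based vectors are illustrative, not Mesa code:

```python
import math


def glsl_mod(x, y):
    """Component-wise mod; y is a vector of equal length or a scalar."""
    ys = list(y) if isinstance(y, (list, tuple)) else [y] * len(x)
    if len(ys) != len(x):
        raise ValueError("operand sizes must match, or y must be a scalar")
    # Floor-based: the result takes the sign of the divisor, unlike
    # math.fmod, which truncates toward zero.
    return [a - b * math.floor(a / b) for a, b in zip(x, ys)]


if __name__ == "__main__":
    print(glsl_mod([5.5, -1.0], 2.0))         # vector % scalar
    print(glsl_mod([7.0, 7.0], [4.0, -4.0]))  # vector % vector
    print(math.fmod(-1.0, 2.0))               # truncating result, for contrast
```

The sketch does not guard against a zero divisor; Python raises
ZeroDivisionError in that case.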



[Mesa-dev] [PATCH 21/56] glsl: Extract ir_triop_bitfield_extract implementation to a separate function

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/ir_constant_expression.cpp | 66 ++--
 1 file changed, 42 insertions(+), 24 deletions(-)

diff --git a/src/compiler/glsl/ir_constant_expression.cpp 
b/src/compiler/glsl/ir_constant_expression.cpp
index 0f90c4e..ebd2e07 100644
--- a/src/compiler/glsl/ir_constant_expression.cpp
+++ b/src/compiler/glsl/ir_constant_expression.cpp
@@ -537,6 +537,38 @@ ldexp_flush_subnormal(double x, int exp)
return !isnormal(result) ? copysign(0.0, x) : result;
 }
 
+static uint32_t
+bitfield_extract_uint(uint32_t value, int offset, int bits)
+{
+   if (bits == 0)
+  return 0;
+   else if (offset < 0 || bits < 0)
+  return 0; /* Undefined, per spec. */
+   else if (offset + bits > 32)
+  return 0; /* Undefined, per spec. */
+   else {
+  value <<= 32 - bits - offset;
+  value >>= 32 - bits;
+  return value;
+   }
+}
+
+static int32_t
+bitfield_extract_int(int32_t value, int offset, int bits)
+{
+   if (bits == 0)
+  return 0;
+   else if (offset < 0 || bits < 0)
+  return 0; /* Undefined, per spec. */
+   else if (offset + bits > 32)
+  return 0; /* Undefined, per spec. */
+   else {
+  value <<= 32 - bits - offset;
+  value >>= 32 - bits;
+  return value;
+   }
+}
+
 ir_constant *
 ir_expression::constant_expression_value(struct hash_table *variable_context)
 {
@@ -1610,34 +1642,20 @@ ir_expression::constant_expression_value(struct hash_table *variable_context)
   data.u[1] = *((uint32_t *)&op[0]->value.d[0] + 1);
   break;
 
-   case ir_triop_bitfield_extract: {
+   case ir_triop_bitfield_extract:
   for (unsigned c = 0; c < components; c++) {
- int offset = op[1]->value.i[c];
- int bits = op[2]->value.i[c];
-
- if (bits == 0)
-data.u[c] = 0;
- else if (offset < 0 || bits < 0)
-data.u[c] = 0; /* Undefined, per spec. */
- else if (offset + bits > 32)
-data.u[c] = 0; /* Undefined, per spec. */
- else {
-if (op[0]->type->base_type == GLSL_TYPE_INT) {
-   /* int so that the right shift will sign-extend. */
-   int value = op[0]->value.i[c];
-   value <<= 32 - bits - offset;
-   value >>= 32 - bits;
-   data.i[c] = value;
-} else {
-   unsigned value = op[0]->value.u[c];
-   value <<= 32 - bits - offset;
-   value >>= 32 - bits;
-   data.u[c] = value;
-}
+ switch (this->type->base_type) {
+ case GLSL_TYPE_UINT:
+data.u[c] = bitfield_extract_uint(op[0]->value.u[c], op[1]->value.i[c], op[2]->value.i[c]);
+break;
+ case GLSL_TYPE_INT:
+data.i[c] = bitfield_extract_int(op[0]->value.i[c], op[1]->value.i[c], op[2]->value.i[c]);
+break;
+ default:
+assert(0);
  }
   }
   break;
-   }
 
case ir_binop_ldexp:
   for (unsigned c = 0; c < op[0]->type->components(); c++) {
-- 
2.5.5
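
Both helpers shift the field up to the top of the 32-bit word and then
back down, so the bits above it fall off and, for the signed variant, the
right shift sign-extends. A minimal Python sketch of the same logic;
Python integers are unbounded, so this version masks to 32 bits and makes
the sign extension explicit, which is an adaptation and not how the C++
code is written:

```python
MASK32 = 0xFFFFFFFF


def bitfield_extract_uint(value, offset, bits):
    """Extract `bits` bits starting at `offset`, zero-extended."""
    if bits == 0:
        return 0
    if offset < 0 or bits < 0 or offset + bits > 32:
        return 0  # undefined per spec; the patch returns 0
    value = (value << (32 - bits - offset)) & MASK32
    return value >> (32 - bits)


def bitfield_extract_int(value, offset, bits):
    """Extract `bits` bits starting at `offset`, sign-extended."""
    field = bitfield_extract_uint(value & MASK32, offset, bits)
    if field == 0:
        return 0
    # A non-zero field implies bits > 0, so this shift is always valid.
    if field & (1 << (bits - 1)):
        field -= 1 << bits
    return field


if __name__ == "__main__":
    print(hex(bitfield_extract_uint(0xABCD1234, 8, 8)))
    print(bitfield_extract_int(0x0000F000, 12, 4))
```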



[Mesa-dev] [PATCH 26/56] glsl: Sort constant expression handling by IR operand enum value

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/ir_constant_expression.cpp | 648 +--
 1 file changed, 324 insertions(+), 324 deletions(-)

diff --git a/src/compiler/glsl/ir_constant_expression.cpp 
b/src/compiler/glsl/ir_constant_expression.cpp
index c41b014..34ebf30 100644
--- a/src/compiler/glsl/ir_constant_expression.cpp
+++ b/src/compiler/glsl/ir_constant_expression.cpp
@@ -676,6 +676,124 @@ ir_expression::constant_expression_value(struct 
hash_table *variable_context)
   data.b[c] = !op[0]->value.b[c];
   break;
 
+   case ir_unop_neg:
+  for (unsigned c = 0; c < op[0]->type->components(); c++) {
+ switch (this->type->base_type) {
+ case GLSL_TYPE_UINT:
+data.u[c] = -((int) op[0]->value.u[c]);
+break;
+ case GLSL_TYPE_INT:
+data.i[c] = -op[0]->value.i[c];
+break;
+ case GLSL_TYPE_FLOAT:
+data.f[c] = -op[0]->value.f[c];
+break;
+ case GLSL_TYPE_DOUBLE:
+data.d[c] = -op[0]->value.d[c];
+break;
+ default:
+assert(0);
+ }
+  }
+  break;
+
+   case ir_unop_abs:
+  for (unsigned c = 0; c < op[0]->type->components(); c++) {
+ switch (this->type->base_type) {
+ case GLSL_TYPE_INT:
+data.i[c] = op[0]->value.i[c];
+if (data.i[c] < 0)
+   data.i[c] = -data.i[c];
+break;
+ case GLSL_TYPE_FLOAT:
+data.f[c] = fabs(op[0]->value.f[c]);
+break;
+ case GLSL_TYPE_DOUBLE:
+data.d[c] = fabs(op[0]->value.d[c]);
+break;
+ default:
+assert(0);
+ }
+  }
+  break;
+
+   case ir_unop_sign:
+  for (unsigned c = 0; c < op[0]->type->components(); c++) {
+ switch (this->type->base_type) {
+ case GLSL_TYPE_INT:
+data.i[c] = (op[0]->value.i[c] > 0) - (op[0]->value.i[c] < 0);
+break;
+ case GLSL_TYPE_FLOAT:
+data.f[c] = float((op[0]->value.f[c] > 0.0F) - (op[0]->value.f[c] 
< 0.0F));
+break;
+ case GLSL_TYPE_DOUBLE:
+data.d[c] = double((op[0]->value.d[c] > 0.0) - (op[0]->value.d[c] 
< 0.0));
+break;
+ default:
+assert(0);
+ }
+  }
+  break;
+
+   case ir_unop_rcp:
+  for (unsigned c = 0; c < op[0]->type->components(); c++) {
+ switch (this->type->base_type) {
+ case GLSL_TYPE_FLOAT:
+if (op[0]->value.f[c] != 0.0F)
+   data.f[c] = 1.0F / op[0]->value.f[c];
+break;
+ case GLSL_TYPE_DOUBLE:
+if (op[0]->value.d[c] != 0.0)
+   data.d[c] = 1.0 / op[0]->value.d[c];
+break;
+ default:
+assert(0);
+ }
+  }
+  break;
+
+   case ir_unop_rsq:
+  for (unsigned c = 0; c < op[0]->type->components(); c++) {
+ if (op[0]->type->base_type == GLSL_TYPE_DOUBLE)
+data.d[c] = 1.0 / sqrt(op[0]->value.d[c]);
+ else
+data.f[c] = 1.0F / sqrtf(op[0]->value.f[c]);
+  }
+  break;
+
+   case ir_unop_sqrt:
+  for (unsigned c = 0; c < op[0]->type->components(); c++) {
+ if (op[0]->type->base_type == GLSL_TYPE_DOUBLE)
+data.d[c] = sqrt(op[0]->value.d[c]);
+ else
+data.f[c] = sqrtf(op[0]->value.f[c]);
+  }
+  break;
+
+   case ir_unop_exp:
+  assert(op[0]->type->base_type == GLSL_TYPE_FLOAT);
+  for (unsigned c = 0; c < op[0]->type->components(); c++)
+ data.f[c] = expf(op[0]->value.f[c]);
+  break;
+
+   case ir_unop_log:
+  assert(op[0]->type->base_type == GLSL_TYPE_FLOAT);
+  for (unsigned c = 0; c < op[0]->type->components(); c++)
+ data.f[c] = logf(op[0]->value.f[c]);
+  break;
+
+   case ir_unop_exp2:
+  assert(op[0]->type->base_type == GLSL_TYPE_FLOAT);
+  for (unsigned c = 0; c < op[0]->type->components(); c++)
+ data.f[c] = exp2f(op[0]->value.f[c]);
+  break;
+
+   case ir_unop_log2:
+  assert(op[0]->type->base_type == GLSL_TYPE_FLOAT);
+  for (unsigned c = 0; c < op[0]->type->components(); c++)
+ data.f[c] = log2f(op[0]->value.f[c]);
+  break;
+
case ir_unop_f2i:
   assert(op[0]->type->base_type == GLSL_TYPE_FLOAT);
   for (unsigned c = 0; c < op[0]->type->components(); c++)
@@ -694,10 +812,10 @@ ir_expression::constant_expression_value(struct 
hash_table *variable_context)
  data.f[c] = (float) op[0]->value.i[c];
   break;
 
-   case ir_unop_u2f:
-  assert(op[0]->type->base_type == GLSL_TYPE_UINT);
+   case ir_unop_f2b:
+  assert(op[0]->type->base_type == GLSL_TYPE_FLOAT);
   for (unsigned c = 0; c < op[0]->type->components(); c++)
- data.f[c] = (float) op[0]->value.u[c];
+ data.b[c] = op[0]->value.f[c] 

[Mesa-dev] [PATCH 54/56] glsl: Use constant_template_horizontal instead of constant_template_horizontal_single_implementation for unops

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

This changes the "shape" of all the pack and unpack operators, but they
should function the same.

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/ir_expression_operation.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ir_expression_operation.py 
b/src/compiler/glsl/ir_expression_operation.py
index a22b6f9..bd71370 100644
--- a/src/compiler/glsl/ir_expression_operation.py
+++ b/src/compiler/glsl/ir_expression_operation.py
@@ -369,7 +369,7 @@ class operation(object):
  if horizontal_operation in self.flags and non_assign_operation in 
self.flags:
 return constant_template_horizontal_nonassignment.render(op=self)
  elif horizontal_operation in self.flags:
-return 
constant_template_horizontal_single_implementation.render(op=self)
+return constant_template_horizontal.render(op=self)
   elif self.num_operands == 2:
  if self.name == "mul":
 return constant_template_mul.render(op=self)
-- 
2.5.5



[Mesa-dev] [PATCH 40/56] glsl: Generate code for constant ir_binop_dot expressions

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/ir_expression_operation.py | 19 ++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ir_expression_operation.py 
b/src/compiler/glsl/ir_expression_operation.py
index de04ef4..8f71252 100644
--- a/src/compiler/glsl/ir_expression_operation.py
+++ b/src/compiler/glsl/ir_expression_operation.py
@@ -205,6 +205,21 @@ constant_template_horizontal_nonassignment = 
mako.template.Template("""\
   ${op.c_expression['default']};
   break;""")
 
+# This template is for binary operations that are horizontal.  That is, the
+# operation consumes a vector and produces a scalar.
+constant_template_horizontal = mako.template.Template("""\
+   case ${op.get_enum_name()}:
+  switch (op[0]->type->base_type) {
+% for (dst_type, src_types) in op.signatures():
+  case ${src_types[0].glsl_type}:
+ data.${dst_type.union_field}[0] = ${op.get_c_expression(src_types)};
+ break;
+% endfor
+  default:
+ assert(0);
+  }
+  break;""")
+
 
 vector_scalar_operation = "vector-scalar"
 horizontal_operation = "horizontal"
@@ -274,6 +289,8 @@ class operation(object):
 return constant_template_vector_scalar.render(op=self)
  elif horizontal_operation in self.flags and types_identical_operation 
in self.flags:
 return 
constant_template_horizontal_single_implementation.render(op=self)
+ elif horizontal_operation in self.flags:
+return constant_template_horizontal.render(op=self)
  elif len(self.source_types) == 1:
 return constant_template0.render(op=self)
  elif self.dest_type is not None:
@@ -478,7 +495,7 @@ ir_expression_operation = [
operation("logic_xor", 2, printable_name="^^", source_types=(bool_type,), 
c_expression="{src0} != {src1}"),
operation("logic_or", 2, printable_name="||", source_types=(bool_type,), 
c_expression="{src0} || {src1}"),
 
-   operation("dot", 2),
+   operation("dot", 2, source_types=real_types, c_expression={'f': 
"dot_f(op[0], op[1])", 'd': "dot_d(op[0], op[1])"}, flags=horizontal_operation),
operation("min", 2, source_types=numeric_types, c_expression="MIN2({src0}, 
{src1})", flags=vector_scalar_operation),
operation("max", 2, source_types=numeric_types, c_expression="MAX2({src0}, 
{src1})", flags=vector_scalar_operation),
 
-- 
2.5.5
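
"Horizontal" here means the operation reduces its vector operands to a
single scalar, which is why the template writes only component 0 of the
result. The bodies of dot_f and dot_d are not shown in this patch; a
minimal sketch assuming the usual sum-of-products definition, with `dot`
as an illustrative name:

```python
def dot(a, b):
    """Reduce two equal-length vectors to one scalar."""
    if len(a) != len(b):
        raise ValueError("dot requires operands of the same size")
    return sum(x * y for x, y in zip(a, b))


if __name__ == "__main__":
    print(dot([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))
```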



[Mesa-dev] [PATCH 35/56] glsl: Generate code for constant unary expression that are horizontal

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/ir_expression_operation.py | 22 --
 1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/src/compiler/glsl/ir_expression_operation.py 
b/src/compiler/glsl/ir_expression_operation.py
index e2075b4..75dfa0f 100644
--- a/src/compiler/glsl/ir_expression_operation.py
+++ b/src/compiler/glsl/ir_expression_operation.py
@@ -180,8 +180,16 @@ constant_template_vector_scalar = 
mako.template.Template("""\
   }
   break;""")
 
+# This template is for unary operations that are horizontal.  That is, the
+# operation consumes a vector and produces a scalar.
+constant_template_horizontal_single_implementation = 
mako.template.Template("""\
+   case ${op.get_enum_name()}:
+  data.${op.dest_type.union_field}[0] = ${op.c_expression['default']};
+  break;""")
+
 
 vector_scalar_operation = "vector-scalar"
+horizontal_operation = "horizontal"
 
 class operation(object):
def __init__(self, name, num_operands, printable_name = None, source_types 
= None, dest_type = None, c_expression = None, flags = None):
@@ -220,7 +228,9 @@ class operation(object):
  return None
 
   if self.num_operands == 1:
- if self.dest_type is not None and len(self.source_types) == 1:
+ if horizontal_operation in self.flags:
+return 
constant_template_horizontal_single_implementation.render(op=self)
+ elif self.dest_type is not None and len(self.source_types) == 1:
 return constant_template2.render(op=self)
  elif self.dest_type is not None:
 return constant_template5.render(op=self)
@@ -332,11 +342,11 @@ ir_expression_operation = [
operation("dFdy_fine", 1, printable_name="dFdyFine", 
source_types=(float_type,), c_expression="0.0f"),
 
# Floating point pack and unpack operations.
-   operation("pack_snorm_2x16", 1, printable_name="packSnorm2x16"),
-   operation("pack_snorm_4x8", 1, printable_name="packSnorm4x8"),
-   operation("pack_unorm_2x16", 1, printable_name="packUnorm2x16"),
-   operation("pack_unorm_4x8", 1, printable_name="packUnorm4x8"),
-   operation("pack_half_2x16", 1, printable_name="packHalf2x16"),
+   operation("pack_snorm_2x16", 1, printable_name="packSnorm2x16", 
source_types=(float_type,), dest_type=uint_type, 
c_expression="pack_2x16(pack_snorm_1x16, op[0]->value.f[0], 
op[0]->value.f[1])", flags=horizontal_operation),
+   operation("pack_snorm_4x8", 1, printable_name="packSnorm4x8", 
source_types=(float_type,), dest_type=uint_type, 
c_expression="pack_4x8(pack_snorm_1x8, op[0]->value.f[0], op[0]->value.f[1], 
op[0]->value.f[2], op[0]->value.f[3])", flags=horizontal_operation),
+   operation("pack_unorm_2x16", 1, printable_name="packUnorm2x16", 
source_types=(float_type,), dest_type=uint_type, 
c_expression="pack_2x16(pack_unorm_1x16, op[0]->value.f[0], 
op[0]->value.f[1])", flags=horizontal_operation),
+   operation("pack_unorm_4x8", 1, printable_name="packUnorm4x8", 
source_types=(float_type,), dest_type=uint_type, 
c_expression="pack_4x8(pack_unorm_1x8, op[0]->value.f[0], op[0]->value.f[1], 
op[0]->value.f[2], op[0]->value.f[3])", flags=horizontal_operation),
+   operation("pack_half_2x16", 1, printable_name="packHalf2x16", 
source_types=(float_type,), dest_type=uint_type, 
c_expression="pack_2x16(pack_half_1x16, op[0]->value.f[0], op[0]->value.f[1])", 
flags=horizontal_operation),
operation("unpack_snorm_2x16", 1, printable_name="unpackSnorm2x16"),
operation("unpack_snorm_4x8", 1, printable_name="unpackSnorm4x8"),
operation("unpack_unorm_2x16", 1, printable_name="unpackUnorm2x16"),
-- 
2.5.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 53/56] glsl: Eliminate constant_template2

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

constant_template_common can now handle the case where the result type
is different from the input type by using type_signature_iter.  This
changes the "shape" of all the cast-style operators, but they should
function the same.

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/ir_expression_operation.py | 11 ---
 1 file changed, 11 deletions(-)

diff --git a/src/compiler/glsl/ir_expression_operation.py 
b/src/compiler/glsl/ir_expression_operation.py
index 4a0dda9..a22b6f9 100644
--- a/src/compiler/glsl/ir_expression_operation.py
+++ b/src/compiler/glsl/ir_expression_operation.py
@@ -108,15 +108,6 @@ constant_template_common = mako.template.Template("""\
   }
   break;""")
 
-# This template is for unary operations that map an operand of one type to an
-# operand of another type.  ir_unop_f2b is an example.
-constant_template2 = mako.template.Template("""\
-   case ${op.get_enum_name()}:
-  assert(op[0]->type->base_type == ${op.source_types[0].glsl_type});
-  for (unsigned c = 0; c < op[0]->type->components(); c++)
- data.${op.dest_type.union_field}[c] = 
${op.get_c_expression(op.source_types)};
-  break;""")
-
 # This template is for binary operations that can operate on some combination
 # of scalar and vector operands.
 constant_template_vector_scalar = mako.template.Template("""\
@@ -379,8 +370,6 @@ class operation(object):
 return constant_template_horizontal_nonassignment.render(op=self)
  elif horizontal_operation in self.flags:
 return 
constant_template_horizontal_single_implementation.render(op=self)
- elif self.dest_type is not None and len(self.source_types) == 1:
-return constant_template2.render(op=self)
   elif self.num_operands == 2:
  if self.name == "mul":
 return constant_template_mul.render(op=self)
-- 
2.5.5



[Mesa-dev] [PATCH 28/56] glsl: Convert tuple into a class

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

This makes things a little more clear now, and it will make future
changes... possible.
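
As a rough illustration of the change, the sketch below contrasts a positional tuple with a cut-down copy of the class this patch adds. The `old_style` name and the sample objects are illustrative only and do not appear in the patch.

```python
# Old style: a bare tuple, where the meaning of each slot is positional.
old_style = ("bit_not", 1, "~")

# New style: a small class mirroring the one added by this patch.
class operation(object):
    def __init__(self, name, num_operands, printable_name=None):
        self.name = name
        self.num_operands = num_operands
        # Fall back to the internal name when no printable name is given.
        if printable_name is None:
            self.printable_name = name
        else:
            self.printable_name = printable_name

    def get_enum_name(self):
        prefix = ("un", "bin", "tri", "quad")[self.num_operands - 1]
        return "ir_{}op_{}".format(prefix, self.name)

bit_not = operation("bit_not", 1, printable_name="~")
fma = operation("fma", 3)

print(bit_not.get_enum_name())  # ir_unop_bit_not
print(fma.get_enum_name())      # ir_triop_fma
print(fma.printable_name)       # fma
```

With named attributes in place, adding further per-operation data later means adding keyword arguments rather than widening every tuple in the list.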

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/ir_expression_operation.py | 265 ++-
 1 file changed, 138 insertions(+), 127 deletions(-)

diff --git a/src/compiler/glsl/ir_expression_operation.py 
b/src/compiler/glsl/ir_expression_operation.py
index 8098dac..a2337d8 100644
--- a/src/compiler/glsl/ir_expression_operation.py
+++ b/src/compiler/glsl/ir_expression_operation.py
@@ -24,100 +24,114 @@
 import mako.template
 import sys
 
+class operation(object):
+   def __init__(self, name, num_operands, printable_name = None):
+  self.name = name
+  self.num_operands = num_operands
+
+  if printable_name is None:
+ self.printable_name = name
+  else:
+ self.printable_name = printable_name
+
+
+   def get_enum_name(self):
+  return "ir_{}op_{}".format(("un", "bin", "tri", 
"quad")[self.num_operands-1], self.name)
+
+
 ir_expression_operation = [
-   # Name        operands  string
-   ("bit_not", 1, "~"),
-   ("logic_not", 1, "!"),
-   ("neg", 1, None),
-   ("abs", 1, None),
-   ("sign", 1, None),
-   ("rcp", 1, None),
-   ("rsq", 1, None),
-   ("sqrt", 1, None),
-   ("exp", 1, None), # Log base e on gentype
-   ("log", 1, None), # Natural log on gentype
-   ("exp2", 1, None),
-   ("log2", 1, None),
-   ("f2i", 1, None), # Float-to-integer conversion.
-   ("f2u", 1, None), # Float-to-unsigned conversion.
-   ("i2f", 1, None), # Integer-to-float conversion.
-   ("f2b", 1, None), # Float-to-boolean conversion
-   ("b2f", 1, None), # Boolean-to-float conversion
-   ("i2b", 1, None), # int-to-boolean conversion
-   ("b2i", 1, None), # Boolean-to-int conversion
-   ("u2f", 1, None), # Unsigned-to-float conversion.
-   ("i2u", 1, None), # Integer-to-unsigned conversion.
-   ("u2i", 1, None), # Unsigned-to-integer conversion.
-   ("d2f", 1, None), # Double-to-float conversion.
-   ("f2d", 1, None), # Float-to-double conversion.
-   ("d2i", 1, None), # Double-to-integer conversion.
-   ("i2d", 1, None), # Integer-to-double conversion.
-   ("d2u", 1, None), # Double-to-unsigned conversion.
-   ("u2d", 1, None), # Unsigned-to-double conversion.
-   ("d2b", 1, None), # Double-to-boolean conversion.
-   ("bitcast_i2f", 1, None), # 'Bit-identical int-to-float "conversion"
-   ("bitcast_f2i", 1, None), # 'Bit-identical float-to-int "conversion"
-   ("bitcast_u2f", 1, None), # 'Bit-identical uint-to-float "conversion"
-   ("bitcast_f2u", 1, None), # 'Bit-identical float-to-uint "conversion"
+   operation("bit_not", 1, printable_name="~"),
+   operation("logic_not", 1, printable_name="!"),
+   operation("neg", 1),
+   operation("abs", 1),
+   operation("sign", 1),
+   operation("rcp", 1),
+   operation("rsq", 1),
+   operation("sqrt", 1),
+   operation("exp", 1), # Log base e on gentype
+   operation("log", 1), # Natural log on gentype
+   operation("exp2", 1),
+   operation("log2", 1),
+   operation("f2i", 1), # Float-to-integer conversion.
+   operation("f2u", 1), # Float-to-unsigned conversion.
+   operation("i2f", 1), # Integer-to-float conversion.
+   operation("f2b", 1), # Float-to-boolean conversion
+   operation("b2f", 1), # Boolean-to-float conversion
+   operation("i2b", 1), # int-to-boolean conversion
+   operation("b2i", 1), # Boolean-to-int conversion
+   operation("u2f", 1), # Unsigned-to-float conversion.
+   operation("i2u", 1), # Integer-to-unsigned conversion.
+   operation("u2i", 1), # Unsigned-to-integer conversion.
+   operation("d2f", 1), # Double-to-float conversion.
+   operation("f2d", 1), # Float-to-double conversion.
+   operation("d2i", 1), # Double-to-integer conversion.
+   operation("i2d", 1), # Integer-to-double conversion.
+   operation("d2u", 1), # Double-to-unsigned conversion.
+   operation("u2d", 1), # Unsigned-to-double conversion.
+   operation("d2b", 1), # Double-to-boolean conversion.
+   operation("bitcast_i2f", 1), # 'Bit-identical int-to-float "conversion"
+   operation("bitcast_f2i", 1), # 'Bit-identical float-to-int "conversion"
+   operation("bitcast_u2f", 1), # 'Bit-identical uint-to-float "conversion"
+   operation("bitcast_f2u", 1), # 'Bit-identical float-to-uint "conversion"
 
# Unary floating-point rounding operations.
-   ("trunc", 1, None),
-   ("ceil", 1, None),
-   ("floor", 1, None),
-   ("fract", 1, None),
-   ("round_even", 1, None),
+   operation("trunc", 1),
+   operation("ceil", 1),
+   operation("floor", 1),
+   operation("fract", 1),
+   operation("round_even", 1),
 
# Trigonometric operations.
-   ("sin", 1, None),
-   ("cos", 1, None),

[Mesa-dev] [PATCH 41/56] glsl: Generate code for constant ir_triop_fma and ir_triop_bitfield_extract expressions

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

ir_triop_bitfield_extract is a little weird because the second and third
operands are always int, so they may differ in type from the first
operand.
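
A minimal sketch of how the mixed operand types end up in the generated expression, assuming a simplified stand-in for the generator's type objects. The `type_info` tuple and the free-standing `get_c_expression` function are hypothetical; the real method also falls back to a 'default' entry, which is omitted here.

```python
from collections import namedtuple

# Hypothetical stand-in for the generator's type objects; only the
# union_field attribute matters for this sketch.
type_info = namedtuple("type_info", ["glsl_type", "union_field"])
uint_type = type_info("GLSL_TYPE_UINT", "u")
int_type = type_info("GLSL_TYPE_INT", "i")

c_expression = {'u': "bitfield_extract_uint({src0}, {src1}, {src2})",
                'i': "bitfield_extract_int({src0}, {src1}, {src2})"}

def get_c_expression(c_expression, types, indices=("c", "c", "c")):
    # Each source reads from the union field matching its own type, so the
    # int offset and bits operands use .i even when operand 0 uses .u.
    srcs = ["op[{}]->value.{}[{}]".format(n, t.union_field, indices[n])
            for n, t in enumerate(types)]
    # The type of the first operand selects the expression variant.
    expr = c_expression[types[0].union_field]
    return expr.format(src0=srcs[0], src1=srcs[1], src2=srcs[2])

expr = get_c_expression(c_expression, (uint_type, int_type, int_type))
print(expr)
```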

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/ir_expression_operation.py | 18 +-
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/src/compiler/glsl/ir_expression_operation.py 
b/src/compiler/glsl/ir_expression_operation.py
index 8f71252..dc270dd 100644
--- a/src/compiler/glsl/ir_expression_operation.py
+++ b/src/compiler/glsl/ir_expression_operation.py
@@ -181,7 +181,7 @@ constant_template_vector_scalar = 
mako.template.Template("""\
  switch (op[0]->type->base_type) {
 % for (dst_type, src_types) in op.signatures():
  case ${src_types[0].glsl_type}:
-data.${dst_type.union_field}[c] = ${op.get_c_expression(src_types, 
("c0", "c1"))};
+data.${dst_type.union_field}[c] = ${op.get_c_expression(src_types, 
("c0", "c1", "c2"))};
 break;
 % endfor
  default:
@@ -297,18 +297,22 @@ class operation(object):
 return constant_template5.render(op=self)
  else:
 return constant_template3.render(op=self)
+  elif self.num_operands == 3:
+ return constant_template3.render(op=self)
 
   return None
 
 
-   def get_c_expression(self, types, indices=("c", "c")):
+   def get_c_expression(self, types, indices=("c", "c", "c")):
   src0 = "op[0]->value.{}[{}]".format(types[0].union_field, indices[0])
   src1 = "op[1]->value.{}[{}]".format(types[1].union_field, indices[1]) if 
len(types) >= 2 else "ERROR"
+  src2 = "op[2]->value.{}[{}]".format(types[2].union_field, indices[2]) if 
len(types) >= 3 else "ERROR"
 
   expr = self.c_expression[types[0].union_field] if types[0].union_field 
in self.c_expression else self.c_expression['default']
 
   return expr.format(src0=src0,
- src1=src1)
+ src1=src1,
+ src2=src2)
 
 
def signatures(self):
@@ -533,7 +537,7 @@ ir_expression_operation = [
operation("interpolate_at_sample", 2),
 
# Fused floating-point multiply-add, part of ARB_gpu_shader5.
-   operation("fma", 3),
+   operation("fma", 3, source_types=real_types, c_expression="{src0} * {src1} 
+ {src2}"),
 
operation("lrp", 3),
 
@@ -545,7 +549,11 @@ ir_expression_operation = [
# See also lower_instructions_visitor::ldexp_to_arith
operation("csel", 3),
 
-   operation("bitfield_extract", 3),
+   operation("bitfield_extract", 3,
+ all_signatures=((int_type, (uint_type, int_type, int_type)),
+ (int_type, (int_type, int_type, int_type))),
+ c_expression={'u': "bitfield_extract_uint({src0}, {src1}, 
{src2})",
+   'i': "bitfield_extract_int({src0}, {src1}, 
{src2})"}),
 
# Generate a value with one field of a vector changed
#
-- 
2.5.5



[Mesa-dev] [PATCH 17/56] glsl: Extract ir_unop_bitfield_reverse implementation to a separate function

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/ir_constant_expression.cpp | 40 +++-
 1 file changed, 27 insertions(+), 13 deletions(-)

diff --git a/src/compiler/glsl/ir_constant_expression.cpp 
b/src/compiler/glsl/ir_constant_expression.cpp
index 04b5877..13315ed 100644
--- a/src/compiler/glsl/ir_constant_expression.cpp
+++ b/src/compiler/glsl/ir_constant_expression.cpp
@@ -477,6 +477,23 @@ ir_rvalue::constant_expression_value(struct hash_table *)
return NULL;
 }
 
+static uint32_t
+bitfield_reverse(uint32_t v)
+{
+   /* http://graphics.stanford.edu/~seander/bithacks.html#BitReverseObvious */
+   uint32_t r = v; // r will be reversed bits of v; first get LSB of v
+   int s = sizeof(v) * CHAR_BIT - 1; // extra shift needed at end
+
+   for (v >>= 1; v; v >>= 1) {
+  r <<= 1;
+  r |= v & 1;
+  s--;
+   }
+   r <<= s; // shift when v's highest bits are zero
+
+   return r;
+}
+
 ir_constant *
 ir_expression::constant_expression_value(struct hash_table *variable_context)
 {
@@ -1482,20 +1499,17 @@ ir_expression::constant_expression_value(struct 
hash_table *variable_context)
   break;
 
case ir_unop_bitfield_reverse:
-  /* http://graphics.stanford.edu/~seander/bithacks.html#BitReverseObvious 
*/
-  for (unsigned c = 0; c < components; c++) {
- unsigned int v = op[0]->value.u[c]; // input bits to be reversed
- unsigned int r = v; // r will be reversed bits of v; first get LSB of 
v
- int s = sizeof(v) * CHAR_BIT - 1; // extra shift needed at end
-
- for (v >>= 1; v; v >>= 1) {
-r <<= 1;
-r |= v & 1;
-s--;
+  for (unsigned c = 0; c < op[0]->type->components(); c++) {
+ switch (this->type->base_type) {
+ case GLSL_TYPE_UINT:
+data.u[c] = bitfield_reverse(op[0]->value.u[c]);
+break;
+ case GLSL_TYPE_INT:
+data.i[c] = bitfield_reverse(op[0]->value.i[c]);
+break;
+ default:
+assert(0);
  }
- r <<= s; // shift when v's highest bits are zero
-
- data.u[c] = r;
   }
   break;
 
-- 
2.5.5



[Mesa-dev] [PATCH 48/56] glsl: Generate code for constant ir_triop_csel expressions

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/ir_expression_operation.py | 24 +++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ir_expression_operation.py 
b/src/compiler/glsl/ir_expression_operation.py
index 7161713..b1b7101 100644
--- a/src/compiler/glsl/ir_expression_operation.py
+++ b/src/compiler/glsl/ir_expression_operation.py
@@ -357,6 +357,24 @@ constant_template_lrp = mako.template.Template("""\
   break;
}""")
 
+# This template is for ir_triop_csel.  This expression is really unique
+# because not all of the operands are the same type, and the second operand
+# determines the type of the expression (instead of the first).
+constant_template_csel = mako.template.Template("""\
+   case ${op.get_enum_name()}:
+  for (unsigned c = 0; c < components; c++) {
+ switch (this->type->base_type) {
+% for (dst_type, src_types) in op.signatures():
+ case ${src_types[1].glsl_type}:
+data.${dst_type.union_field}[c] = 
${op.get_c_expression(src_types)};
+break;
+% endfor
+ default:
+assert(0);
+ }
+  }
+  break;""")
+
 
 vector_scalar_operation = "vector-scalar"
 horizontal_operation = "horizontal"
@@ -443,6 +461,8 @@ class operation(object):
 return constant_template_vector_insert.render(op=self)
  elif self.name == "lrp":
 return constant_template_lrp.render(op=self)
+ elif self.name == "csel":
+return constant_template_csel.render(op=self)
  else:
 return constant_template3.render(op=self)
   elif self.num_operands == 4:
@@ -700,7 +720,9 @@ ir_expression_operation = [
# component on vectors).
#
# See also lower_instructions_visitor::ldexp_to_arith
-   operation("csel", 3),
+   operation("csel", 3,
+ all_signatures=zip(all_types, zip(len(all_types) * (bool_type,), 
all_types, all_types)),
+ c_expression="{src0} ? {src1} : {src2}"),
 
operation("bitfield_extract", 3,
  all_signatures=((int_type, (uint_type, int_type, int_type)),
-- 
2.5.5



[Mesa-dev] [PATCH 02/56] glsl: Don't support integer types for operations that can't handle them

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

ir_unop_fract already forbade integer types in ir_validate.  ir_unop_rcp,
ir_unop_rsq, and ir_unop_sqrt should also forbid them in ir_validate.

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/ir_constant_expression.cpp | 14 --
 src/compiler/glsl/ir_validate.cpp            |  2 ++
 2 files changed, 2 insertions(+), 14 deletions(-)

diff --git a/src/compiler/glsl/ir_constant_expression.cpp 
b/src/compiler/glsl/ir_constant_expression.cpp
index 3d15a42..b4486dd 100644
--- a/src/compiler/glsl/ir_constant_expression.cpp
+++ b/src/compiler/glsl/ir_constant_expression.cpp
@@ -729,12 +729,6 @@ ir_expression::constant_expression_value(struct hash_table 
*variable_context)
case ir_unop_fract:
   for (unsigned c = 0; c < op[0]->type->components(); c++) {
  switch (this->type->base_type) {
- case GLSL_TYPE_UINT:
-data.u[c] = 0;
-break;
- case GLSL_TYPE_INT:
-data.i[c] = 0;
-break;
  case GLSL_TYPE_FLOAT:
 data.f[c] = op[0]->value.f[c] - floor(op[0]->value.f[c]);
 break;
@@ -823,14 +817,6 @@ ir_expression::constant_expression_value(struct hash_table 
*variable_context)
case ir_unop_rcp:
   for (unsigned c = 0; c < op[0]->type->components(); c++) {
  switch (this->type->base_type) {
- case GLSL_TYPE_UINT:
-if (op[0]->value.u[c] != 0.0)
-   data.u[c] = 1 / op[0]->value.u[c];
-break;
- case GLSL_TYPE_INT:
-if (op[0]->value.i[c] != 0.0)
-   data.i[c] = 1 / op[0]->value.i[c];
-break;
  case GLSL_TYPE_FLOAT:
 if (op[0]->value.f[c] != 0.0)
data.f[c] = 1.0F / op[0]->value.f[c];
diff --git a/src/compiler/glsl/ir_validate.cpp 
b/src/compiler/glsl/ir_validate.cpp
index 6331868..f93b4f2 100644
--- a/src/compiler/glsl/ir_validate.cpp
+++ b/src/compiler/glsl/ir_validate.cpp
@@ -260,6 +260,8 @@ ir_validate::visit_leave(ir_expression *ir)
case ir_unop_rcp:
case ir_unop_rsq:
case ir_unop_sqrt:
+  assert(ir->type->base_type == GLSL_TYPE_FLOAT ||
+ ir->type->base_type == GLSL_TYPE_DOUBLE);
   assert(ir->type == ir->operands[0]->type);
   break;
 
-- 
2.5.5



[Mesa-dev] [PATCH 38/56] glsl: Generate code for constant ir_binop_ldexp expressions

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

ldexp is weird because its two operands have different types.  Add
support for directly specifying the exact signatures of all the possible
variations of an operation.
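
The idea can be sketched as follows, assuming an explicit signature list is a sequence of (destination type, source types) pairs as in the ldexp entry below. The `type_info` tuple and the `render_cases` helper are hypothetical and only approximate what the Mako templates emit.

```python
from collections import namedtuple

# Hypothetical stand-in for the generator's type objects.
type_info = namedtuple("type_info", ["glsl_type", "union_field"])
float_type = type_info("GLSL_TYPE_FLOAT", "f")
double_type = type_info("GLSL_TYPE_DOUBLE", "d")
int_type = type_info("GLSL_TYPE_INT", "i")

# Explicit signatures: the exponent operand is int for both variants.
ldexp_signatures = ((float_type, (float_type, int_type)),
                    (double_type, (double_type, int_type)))
ldexp_expression = {'f': "ldexpf_flush_subnormal({src0}, {src1})",
                    'd': "ldexp_flush_subnormal({src0}, {src1})"}

def render_cases(signatures, c_expression):
    # Emit one switch case per listed signature.
    lines = []
    for dst_type, src_types in signatures:
        src0 = "op[0]->value.{}[c]".format(src_types[0].union_field)
        src1 = "op[1]->value.{}[c]".format(src_types[1].union_field)
        expr = c_expression[src_types[0].union_field].format(src0=src0,
                                                             src1=src1)
        lines.append("case {}:".format(src_types[0].glsl_type))
        lines.append("   data.{}[c] = {};".format(dst_type.union_field, expr))
        lines.append("   break;")
    return "\n".join(lines)

cases = render_cases(ldexp_signatures, ldexp_expression)
print(cases)
```

Listing the signatures directly avoids teaching the signature iterator about every operation whose operands do not share a type.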

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/ir_expression_operation.py | 23 +++
 1 file changed, 19 insertions(+), 4 deletions(-)

diff --git a/src/compiler/glsl/ir_expression_operation.py 
b/src/compiler/glsl/ir_expression_operation.py
index 2351dcf..de9c7b7 100644
--- a/src/compiler/glsl/ir_expression_operation.py
+++ b/src/compiler/glsl/ir_expression_operation.py
@@ -202,7 +202,7 @@ types_identical_operation = "identical"
 non_assign_operation = "nonassign"
 
 class operation(object):
-   def __init__(self, name, num_operands, printable_name = None, source_types 
= None, dest_type = None, c_expression = None, flags = None):
+   def __init__(self, name, num_operands, printable_name = None, source_types 
= None, dest_type = None, c_expression = None, flags = None, all_signatures = 
None):
   self.name = name
   self.num_operands = num_operands
 
@@ -211,7 +211,13 @@ class operation(object):
   else:
  self.printable_name = printable_name
 
-  self.source_types = source_types
+  self.all_signatures = all_signatures
+
+  if source_types is None:
+ self.source_types = ()
+  else:
+ self.source_types = source_types
+
   self.dest_type = dest_type
 
   if c_expression is None:
@@ -261,6 +267,8 @@ class operation(object):
 return constant_template0.render(op=self)
  elif self.dest_type is not None:
 return constant_template5.render(op=self)
+ else:
+return constant_template3.render(op=self)
 
   return None
 
@@ -276,7 +284,10 @@ class operation(object):
 
 
def signatures(self):
-  return type_signature_iter(self.dest_type, self.source_types, 
self.num_operands)
+  if self.all_signatures is not None:
+ return self.all_signatures
+  else:
+ return type_signature_iter(self.dest_type, self.source_types, 
self.num_operands)
 
 
 ir_expression_operation = [
@@ -469,7 +480,11 @@ ir_expression_operation = [
operation("ubo_load", 2),
 
# Multiplies a number by two to a power, part of ARB_gpu_shader5.
-   operation("ldexp", 2),
+   operation("ldexp", 2,
+ all_signatures=((float_type, (float_type, int_type)),
+ (double_type, (double_type, int_type))),
+ c_expression={'f': "ldexpf_flush_subnormal({src0}, {src1})",
+   'd': "ldexp_flush_subnormal({src0}, {src1})"}),
 
# Extract a scalar from a vector
#
-- 
2.5.5



[Mesa-dev] [PATCH 19/56] glsl: Use find_msb_uint to implement ir_unop_find_lsb

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

(X & -X) calculates a value with only the least significant bit of X
set.  Since there is only one bit set, the LSB is the MSB.
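
The identity can be checked with a small Python model, assuming 32-bit two's-complement values. Python integers are unbounded, so the `MASK32` constant and the explicit masking are additions for this sketch; `find_msb_uint` follows the loop from the C++ helper in patch 18.

```python
MASK32 = 0xFFFFFFFF

def find_msb_uint(v):
    """Index of the highest set bit of a 32-bit value, or -1 for zero."""
    v &= MASK32
    count = 0
    # For v == 0 the loop stops at count == 32, giving 31 - 32 == -1.
    while (v & (1 << 31)) == 0 and count != 32:
        count += 1
        v = (v << 1) & MASK32
    return 31 - count

def find_lsb(x):
    """Index of the lowest set bit of a 32-bit value, or -1 for zero."""
    x &= MASK32
    # x & -x keeps only the lowest set bit (two's complement negation).
    isolated = x & (-x & MASK32)
    return find_msb_uint(isolated)

print(find_lsb(12))          # 2
print(find_lsb(0x80000000))  # 31
print(find_lsb(0))           # -1
```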

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/ir_constant_expression.cpp | 19 +--
 1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/src/compiler/glsl/ir_constant_expression.cpp 
b/src/compiler/glsl/ir_constant_expression.cpp
index 5f4cae2..71afb33 100644
--- a/src/compiler/glsl/ir_constant_expression.cpp
+++ b/src/compiler/glsl/ir_constant_expression.cpp
@@ -1560,16 +1560,15 @@ ir_expression::constant_expression_value(struct 
hash_table *variable_context)
 
case ir_unop_find_lsb:
   for (unsigned c = 0; c < components; c++) {
- if (op[0]->value.i[c] == 0)
-data.i[c] = -1;
- else {
-unsigned pos = 0;
-unsigned v = op[0]->value.u[c];
-
-for (; !(v & 1); v >>= 1) {
-   pos++;
-}
-data.u[c] = pos;
+ switch (op[0]->type->base_type) {
+ case GLSL_TYPE_UINT:
+data.i[c] = find_msb_uint(op[0]->value.u[c] & 
-int(op[0]->value.u[c]));
+break;
+ case GLSL_TYPE_INT:
+data.i[c] = find_msb_uint(op[0]->value.i[c] & -op[0]->value.i[c]);
+break;
+ default:
+assert(0);
  }
   }
   break;
-- 
2.5.5



[Mesa-dev] [PATCH 42/56] glsl: Generate code for constant ir_binop_mul expressions

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/ir_expression_operation.py | 56 +++-
 1 file changed, 54 insertions(+), 2 deletions(-)

diff --git a/src/compiler/glsl/ir_expression_operation.py 
b/src/compiler/glsl/ir_expression_operation.py
index dc270dd..e7a74e3 100644
--- a/src/compiler/glsl/ir_expression_operation.py
+++ b/src/compiler/glsl/ir_expression_operation.py
@@ -190,6 +190,56 @@ constant_template_vector_scalar = 
mako.template.Template("""\
   }
   break;""")
 
+# This template is for multiplication.  It is unique because it has to support
+# matrix * vector and matrix * matrix operations, and those are just different.
+constant_template_mul = mako.template.Template("""\
+   case ${op.get_enum_name()}:
+  /* Check for equal types, or unequal types involving scalars */
+  if ((op[0]->type == op[1]->type && !op[0]->type->is_matrix())
+  || op0_scalar || op1_scalar) {
+ for (unsigned c = 0, c0 = 0, c1 = 0;
+  c < components;
+  c0 += c0_inc, c1 += c1_inc, c++) {
+
+switch (op[0]->type->base_type) {
+% for (dst_type, src_types) in op.signatures():
+case ${src_types[0].glsl_type}:
+   data.${dst_type.union_field}[c] = 
${op.get_c_expression(src_types, ("c0", "c1", "c2"))};
+   break;
+% endfor
+default:
+   assert(0);
+}
+ }
+  } else {
+ assert(op[0]->type->is_matrix() || op[1]->type->is_matrix());
+
+ /* Multiply an N-by-M matrix with an M-by-P matrix.  Since either
+  * matrix can be a GLSL vector, either N or P can be 1.
+  *
+  * For vec*mat, the vector is treated as a row vector.  This
+  * means the vector is a 1-row x M-column matrix.
+  *
+  * For mat*vec, the vector is treated as a column vector.  Since
+  * matrix_columns is 1 for vectors, this just works.
+  */
+ const unsigned n = op[0]->type->is_vector()
+? 1 : op[0]->type->vector_elements;
+ const unsigned m = op[1]->type->vector_elements;
+ const unsigned p = op[1]->type->matrix_columns;
+ for (unsigned j = 0; j < p; j++) {
+for (unsigned i = 0; i < n; i++) {
+   for (unsigned k = 0; k < m; k++) {
+  if (op[0]->type->base_type == GLSL_TYPE_DOUBLE)
+ data.d[i+n*j] += 
op[0]->value.d[i+n*k]*op[1]->value.d[k+m*j];
+  else
+ data.f[i+n*j] += 
op[0]->value.f[i+n*k]*op[1]->value.f[k+m*j];
+   }
+}
+ }
+  }
+  break;""")
+
 # This template is for operations that are horizontal and either have only a
 # single type or the implementation for all types is identical.  That is, the
 # operation consumes a vector and produces a scalar.
@@ -285,7 +335,9 @@ class operation(object):
  else:
 return constant_template3.render(op=self)
   elif self.num_operands == 2:
- if vector_scalar_operation in self.flags:
+ if self.name == "mul":
+return constant_template_mul.render(op=self)
+ elif vector_scalar_operation in self.flags:
 return constant_template_vector_scalar.render(op=self)
  elif horizontal_operation in self.flags and types_identical_operation 
in self.flags:
 return 
constant_template_horizontal_single_implementation.render(op=self)
@@ -454,7 +506,7 @@ ir_expression_operation = [
operation("add", 2, printable_name="+", source_types=numeric_types, 
c_expression="{src0} + {src1}", flags=vector_scalar_operation),
operation("sub", 2, printable_name="-", source_types=numeric_types, 
c_expression="{src0} - {src1}", flags=vector_scalar_operation),
# "Floating-point or low 32-bit integer multiply."
-   operation("mul", 2, printable_name="*"),
+   operation("mul", 2, printable_name="*", source_types=numeric_types, 
c_expression="{src0} * {src1}"),
operation("imul_high", 2),   # Calculates the high 32-bits of a 64-bit 
multiply.
operation("div", 2, printable_name="/", source_types=numeric_types, 
c_expression={'u': "{src1} == 0 ? 0 : {src0} / {src1}", 'i': "{src1} == 0 ? 0 : 
{src0} / {src1}", 'default': "{src0} / {src1}"}, flags=vector_scalar_operation),
 
-- 
2.5.5



[Mesa-dev] [PATCH 14/56] glsl: Delete spurious comment about updating ir_expression::get_num_operands

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

This hasn't been necessary since 007f48815.

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/ir_expression_operation.py | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/src/compiler/glsl/ir_expression_operation.py 
b/src/compiler/glsl/ir_expression_operation.py
index 10c9626..5c7ad35 100644
--- a/src/compiler/glsl/ir_expression_operation.py
+++ b/src/compiler/glsl/ir_expression_operation.py
@@ -269,9 +269,6 @@ if __name__ == "__main__":
  */
 """
enum_template = mako.template.Template(copyright + """
-/* Update ir_expression::get_num_operands() and operator_strs when
- * updating this list.
- */
 enum ir_expression_operation {
 % for item in values:
${name_from_item(item)},
-- 
2.5.5



[Mesa-dev] [PATCH 46/56] glsl: Generate code for constant ir_quadop_vector expressions

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/ir_expression_operation.py | 22 --
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/src/compiler/glsl/ir_expression_operation.py 
b/src/compiler/glsl/ir_expression_operation.py
index 66d015a..c53b66e 100644
--- a/src/compiler/glsl/ir_expression_operation.py
+++ b/src/compiler/glsl/ir_expression_operation.py
@@ -316,6 +316,22 @@ constant_template_vector_insert = 
mako.template.Template("""\
   break;
}""")
 
+# This template is for ir_quadop_vector.
+constant_template_vector = mako.template.Template("""\
+   case ${op.get_enum_name()}:
+  for (unsigned c = 0; c < this->type->vector_elements; c++) {
+ switch (this->type->base_type) {
+% for (dst_type, src_types) in op.signatures():
+ case ${src_types[0].glsl_type}:
+data.${dst_type.union_field}[c] = 
op[c]->value.${src_types[0].union_field}[0];
+break;
+% endfor
+ default:
+assert(0);
+ }
+  }
+  break;""")
+
 
 vector_scalar_operation = "vector-scalar"
 horizontal_operation = "horizontal"
@@ -403,7 +419,9 @@ class operation(object):
  else:
 return constant_template3.render(op=self)
   elif self.num_operands == 4:
- if types_identical_operation in self.flags:
+ if self.name == "vector":
+return constant_template_vector.render(op=self)
+ elif types_identical_operation in self.flags:
 return constant_template0.render(op=self)
 
   return None
@@ -676,7 +694,7 @@ ir_expression_operation = [
  c_expression="bitfield_insert({src0}, {src1}, {src2}, {src3})",
  flags=types_identical_operation),
 
-   operation("vector", 4),
+   operation("vector", 4, source_types=all_types, 
c_expression="anything-except-None"),
 ]
 
 
-- 
2.5.5



[Mesa-dev] [PATCH 18/56] glsl: Extract ir_unop_find_msb implementation to a separate function

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/ir_constant_expression.cpp | 49 +++-
 1 file changed, 34 insertions(+), 15 deletions(-)

diff --git a/src/compiler/glsl/ir_constant_expression.cpp 
b/src/compiler/glsl/ir_constant_expression.cpp
index 13315ed..5f4cae2 100644
--- a/src/compiler/glsl/ir_constant_expression.cpp
+++ b/src/compiler/glsl/ir_constant_expression.cpp
@@ -494,6 +494,31 @@ bitfield_reverse(uint32_t v)
return r;
 }
 
+static int
+find_msb_uint(uint32_t v)
+{
+   int count = 0;
+
+   /* If v == 0, then the loop will terminate when count == 32.  In that case
+* 31-count will produce the -1 result required by GLSL findMSB().
+*/
+   while (((v & (1u << 31)) == 0) && count != 32) {
+  count++;
+  v <<= 1;
+   }
+
+   return 31 - count;
+}
+
+static int
+find_msb_int(int32_t v)
+{
+   /* If v is signed, findMSB() returns the position of the most significant
+* zero bit.
+*/
+   return find_msb_uint(v < 0 ? ~v : v);
+}
+
 ir_constant *
 ir_expression::constant_expression_value(struct hash_table *variable_context)
 {
@@ -1520,21 +1545,15 @@ ir_expression::constant_expression_value(struct 
hash_table *variable_context)
 
case ir_unop_find_msb:
   for (unsigned c = 0; c < components; c++) {
- int v = op[0]->value.i[c];
-
- if (v == 0 || (op[0]->type->base_type == GLSL_TYPE_INT && v == -1))
-data.i[c] = -1;
- else {
-int count = 0;
-unsigned top_bit = op[0]->type->base_type == GLSL_TYPE_UINT
-   ? 0 : v & (1u << 31);
-
-while (((v & (1u << 31)) == top_bit) && count != 32) {
-   count++;
-   v <<= 1;
-}
-
-data.i[c] = 31 - count;
+ switch (op[0]->type->base_type) {
+ case GLSL_TYPE_UINT:
+data.i[c] = find_msb_uint(op[0]->value.u[c]);
+break;
+ case GLSL_TYPE_INT:
+data.i[c] = find_msb_int(op[0]->value.i[c]);
+break;
+ default:
+assert(0);
  }
   }
   break;
-- 
2.5.5



[Mesa-dev] [PATCH 44/56] glsl: Generate code for constant ir_triop_vector_insert expressions

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/ir_expression_operation.py | 27 +--
 1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/src/compiler/glsl/ir_expression_operation.py 
b/src/compiler/glsl/ir_expression_operation.py
index ec04f57..033d947 100644
--- a/src/compiler/glsl/ir_expression_operation.py
+++ b/src/compiler/glsl/ir_expression_operation.py
@@ -288,6 +288,26 @@ constant_template_vector_extract = 
mako.template.Template("""\
   break;
}""")
 
+# This template is for ir_triop_vector_insert.
+constant_template_vector_insert = mako.template.Template("""\
+   case ${op.get_enum_name()}: {
+  const unsigned idx = op[2]->value.u[0];
+
+  memcpy(&data, &op[0]->value, sizeof(data));
+
+  switch (this->type->base_type) {
+% for (dst_type, src_types) in op.signatures():
+  case ${src_types[0].glsl_type}:
+ data.${dst_type.union_field}[idx] = 
op[1]->value.${src_types[0].union_field}[0];
+ break;
+% endfor
+  default:
+ assert(!"Should not get here.");
+ break;
+  }
+  break;
+   }""")
+
 
 vector_scalar_operation = "vector-scalar"
 horizontal_operation = "horizontal"
@@ -370,7 +390,10 @@ class operation(object):
  else:
 return constant_template3.render(op=self)
   elif self.num_operands == 3:
- return constant_template3.render(op=self)
+ if self.name == "vector_insert":
+return constant_template_vector_insert.render(op=self)
+ else:
+return constant_template3.render(op=self)
 
   return None
 
@@ -632,7 +655,7 @@ ir_expression_operation = [
# operand0 is the vector
# operand1 is the value to write into the vector result
# operand2 is the index in operand0 to be modified
-   operation("vector_insert", 3),
+   operation("vector_insert", 3, source_types=all_types, 
c_expression="anything-except-None"),
 
operation("bitfield_insert", 4),
 
-- 
2.5.5



[Mesa-dev] [PATCH 22/56] glsl: Extract ir_quadop_bitfield_insert implementation to a separate function

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/ir_constant_expression.cpp | 48 +---
 1 file changed, 23 insertions(+), 25 deletions(-)

diff --git a/src/compiler/glsl/ir_constant_expression.cpp 
b/src/compiler/glsl/ir_constant_expression.cpp
index ebd2e07..07ded9f 100644
--- a/src/compiler/glsl/ir_constant_expression.cpp
+++ b/src/compiler/glsl/ir_constant_expression.cpp
@@ -569,6 +569,26 @@ bitfield_extract_int(int32_t value, int offset, int bits)
}
 }
 
+static uint32_t
+bitfield_insert(uint32_t base, uint32_t insert, int offset, int bits)
+{
+   if (bits == 0)
+  return base;
+   else if (offset < 0 || bits < 0)
+  return 0; /* Undefined, per spec. */
+   else if (offset + bits > 32)
+  return 0; /* Undefined, per spec. */
+   else {
+  unsigned insert_mask = ((1ull << bits) - 1) << offset;
+
+  insert <<= offset;
+  insert &= insert_mask;
+  base &= ~insert_mask;
+
+  return base | insert;
+   }
+}
+
 ir_constant *
 ir_expression::constant_expression_value(struct hash_table *variable_context)
 {
@@ -1749,32 +1769,10 @@ ir_expression::constant_expression_value(struct 
hash_table *variable_context)
   break;
}
 
-   case ir_quadop_bitfield_insert: {
-  for (unsigned c = 0; c < components; c++) {
- int offset = op[2]->value.i[c];
- int bits = op[3]->value.i[c];
-
- if (bits == 0)
-data.u[c] = op[0]->value.u[c];
- else if (offset < 0 || bits < 0)
-data.u[c] = 0; /* Undefined, per spec. */
- else if (offset + bits > 32)
-data.u[c] = 0; /* Undefined, per spec. */
- else {
-unsigned insert_mask = ((1ull << bits) - 1) << offset;
-
-unsigned insert = op[1]->value.u[c];
-insert <<= offset;
-insert &= insert_mask;
-
-unsigned base = op[0]->value.u[c];
-base &= ~insert_mask;
-
-data.u[c] = base | insert;
- }
-  }
+   case ir_quadop_bitfield_insert:
+  for (unsigned c = 0; c < components; c++)
+ data.u[c] = bitfield_insert(op[0]->value.u[c], op[1]->value.u[c], 
op[2]->value.i[c], op[3]->value.i[c]);
   break;
-   }
 
case ir_quadop_vector:
   for (unsigned c = 0; c < this->type->vector_elements; c++) {
-- 
2.5.5
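For anyone checking the extracted helper by hand, its behavior can be sketched in Python as below. This is a model under stated assumptions, not Mesa code: Python integers are unbounded, so MASK32 (a name introduced here) stands in for uint32_t wraparound, and the zero returned for the undefined cases simply mirrors what the C helper chooses.

```python
MASK32 = 0xFFFFFFFF  # stand-in for uint32_t truncation


def bitfield_insert(base, insert, offset, bits):
    """Python model of the static bitfield_insert() helper in the patch."""
    if bits == 0:
        return base & MASK32
    if offset < 0 or bits < 0:
        return 0  # undefined per spec; the C helper returns 0
    if offset + bits > 32:
        return 0  # undefined per spec; the C helper returns 0

    # The C code builds the mask from 1ull so that bits == 32 is well defined.
    insert_mask = (((1 << bits) - 1) << offset) & MASK32

    insert = (insert << offset) & insert_mask
    base = (base & MASK32) & ~insert_mask
    return base | insert


print(hex(bitfield_insert(0xFFFFFFFF, 0x0, 8, 8)))
print(hex(bitfield_insert(0x0, 0xABCD, 4, 8)))
```

Note that the bits == 0 test comes first, so a zero-width insert returns base even when offset is negative, exactly as in the C version.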



[Mesa-dev] [PATCH 39/56] glsl: Generate code for constant ir_binop_lshift and ir_binop_rshift expressions

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

The code generated is quite different from what was previously used.  I
believe that it is still correct by the GLSL spec, and I believe, due to
C rules about shifts, the behavior will be the same.

Section 5.9 (Expressions) of the GLSL 4.50 spec says:

The result is undefined if the right operand is negative, or greater
than or equal to the number of bits in the left expression's base
type.

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/ir_expression_operation.py | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/src/compiler/glsl/ir_expression_operation.py 
b/src/compiler/glsl/ir_expression_operation.py
index de9c7b7..de04ef4 100644
--- a/src/compiler/glsl/ir_expression_operation.py
+++ b/src/compiler/glsl/ir_expression_operation.py
@@ -163,7 +163,17 @@ constant_template5 = mako.template.Template("""\
 # of scalar and vector operands.
 constant_template_vector_scalar = mako.template.Template("""\
case ${op.get_enum_name()}:
+% if "mixed" in op.flags:
+% for i in xrange(op.num_operands):
+  assert(op[${i}]->type->base_type == ${op.source_types[0].glsl_type} ||
+% for src_type in op.source_types[1:-1]:
+ op[${i}]->type->base_type == ${src_type.glsl_type} ||
+% endfor
+ op[${i}]->type->base_type == ${op.source_types[-1].glsl_type});
+% endfor
+% else:
   assert(op[0]->type == op[1]->type || op0_scalar || op1_scalar);
+% endif
   for (unsigned c = 0, c0 = 0, c1 = 0;
c < components;
c0 += c0_inc, c1 += c1_inc, c++) {
@@ -200,6 +210,7 @@ vector_scalar_operation = "vector-scalar"
 horizontal_operation = "horizontal"
 types_identical_operation = "identical"
 non_assign_operation = "nonassign"
+mixed_type_operation = "mixed"
 
 class operation(object):
def __init__(self, name, num_operands, printable_name = None, source_types 
= None, dest_type = None, c_expression = None, flags = None, all_signatures = 
None):
@@ -457,8 +468,8 @@ ir_expression_operation = [
operation("any_nequal", 2, source_types=all_types, dest_type=bool_type, 
c_expression="!op[0]->has_value(op[1])", flags=frozenset((horizontal_operation, 
types_identical_operation))),
 
# Bit-wise binary operations.
-   operation("lshift", 2, printable_name="<<"),
-   operation("rshift", 2, printable_name=">>"),
+   operation("lshift", 2, printable_name="<<", source_types=integer_types, 
c_expression="{src0} << {src1}", flags=frozenset((vector_scalar_operation, 
mixed_type_operation))),
+   operation("rshift", 2, printable_name=">>", source_types=integer_types, 
c_expression="{src0} >> {src1}", flags=frozenset((vector_scalar_operation, 
mixed_type_operation))),
operation("bit_and", 2, printable_name="&", source_types=integer_types, 
c_expression="{src0} & {src1}", flags=vector_scalar_operation),
operation("bit_xor", 2, printable_name="^", source_types=integer_types, 
c_expression="{src0} ^ {src1}", flags=vector_scalar_operation),
operation("bit_or", 2, printable_name="|", source_types=integer_types, 
c_expression="{src0} | {src1}", flags=vector_scalar_operation),
-- 
2.5.5
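To see what the "mixed" branch of the template expands to, here is a rough Python sketch that builds the same assertions with plain string formatting instead of Mako. mixed_type_asserts is a hypothetical name, and the sketch uses Python 3's range where the patch uses xrange. Like the template, it assumes at least two source types.

```python
def mixed_type_asserts(num_operands, glsl_types):
    """Emit one C assert per operand, accepting any type in glsl_types.

    Follows the shape of the template's "mixed" branch: the first type
    opens the assert, the middle types are chained with ||, and the last
    type closes the expression.
    """
    lines = []
    for i in range(num_operands):
        operand = "op[{}]->type->base_type".format(i)
        lines.append("assert({} == {} ||".format(operand, glsl_types[0]))
        for glsl_type in glsl_types[1:-1]:
            lines.append("       {} == {} ||".format(operand, glsl_type))
        lines.append("       {} == {});".format(operand, glsl_types[-1]))
    return "\n".join(lines)


print(mixed_type_asserts(2, ("GLSL_TYPE_UINT", "GLSL_TYPE_INT")))
```

With integer_types the middle slice is empty, so each operand gets a two-line assert. Each operand is checked independently, which is what lets lshift and rshift accept operands whose base types differ (for example int << uint).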



[Mesa-dev] [PATCH 34/56] glsl: Generate code for constant expressions that have an output type that differs from the input types

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/ir_expression_operation.py | 44 +---
 1 file changed, 33 insertions(+), 11 deletions(-)

diff --git a/src/compiler/glsl/ir_expression_operation.py 
b/src/compiler/glsl/ir_expression_operation.py
index ac568ae..e2075b4 100644
--- a/src/compiler/glsl/ir_expression_operation.py
+++ b/src/compiler/glsl/ir_expression_operation.py
@@ -84,6 +84,7 @@ float_type = type("float", "f", "GLSL_TYPE_FLOAT")
 double_type = type("double", "d", "GLSL_TYPE_DOUBLE")
 bool_type = type("bool", "b", "GLSL_TYPE_BOOL")
 
+all_types = (uint_type, int_type, float_type, double_type, bool_type)
 numeric_types = (uint_type, int_type, float_type, double_type)
 signed_numeric_types = (int_type, float_type, double_type)
 integer_types = (uint_type, int_type)
@@ -141,6 +142,23 @@ constant_template2 = mako.template.Template("""\
  data.${op.dest_type.union_field}[c] = 
${op.get_c_expression(op.source_types)};
   break;""")
 
+# This template is for operations with an output type that doesn't match the
+# input types.
+constant_template5 = mako.template.Template("""\
+   case ${op.get_enum_name()}:
+  for (unsigned c = 0; c < components; c++) {
+ switch (op[0]->type->base_type) {
+% for (dst_type, src_types) in op.signatures():
+ case ${src_types[0].glsl_type}:
+data.${dst_type.union_field}[c] = 
${op.get_c_expression(src_types)};
+break;
+% endfor
+ default:
+assert(0);
+ }
+  }
+  break;""")
+
 # This template is for binary operations that can operate on some combination
 # of scalar and vector operands.
 constant_template_vector_scalar = mako.template.Template("""\
@@ -202,8 +220,10 @@ class operation(object):
  return None
 
   if self.num_operands == 1:
- if self.dest_type is not None:
+ if self.dest_type is not None and len(self.source_types) == 1:
 return constant_template2.render(op=self)
+ elif self.dest_type is not None:
+return constant_template5.render(op=self)
  elif len(self.source_types) == 1:
 return constant_template0.render(op=self)
  elif len(self.c_expression) == 1 and 'default' in self.c_expression:
@@ -215,6 +235,8 @@ class operation(object):
 return constant_template_vector_scalar.render(op=self)
  elif len(self.source_types) == 1:
 return constant_template0.render(op=self)
+ elif self.dest_type is not None:
+return constant_template5.render(op=self)
 
   return None
 
@@ -258,7 +280,7 @@ ir_expression_operation = [
# Boolean-to-float conversion
operation("b2f", 1, source_types=(bool_type,), dest_type=float_type, 
c_expression="{src0} ? 1.0F : 0.0F"),
# int-to-boolean conversion
-   operation("i2b", 1),
+   operation("i2b", 1, source_types=integer_types, dest_type=bool_type, 
c_expression="{src0} ? true : false"),
# Boolean-to-int conversion
operation("b2i", 1, source_types=(bool_type,), dest_type=int_type, 
c_expression="{src0} ? 1 : 0"),
# Unsigned-to-float conversion.
@@ -323,9 +345,9 @@ ir_expression_operation = [
 
# Bit operations, part of ARB_gpu_shader5.
operation("bitfield_reverse", 1, source_types=integer_types, 
c_expression="bitfield_reverse({src0})"),
-   operation("bit_count", 1),
-   operation("find_msb", 1),
-   operation("find_lsb", 1),
+   operation("bit_count", 1, source_types=integer_types, dest_type=int_type, 
c_expression="_mesa_bitcount({src0})"),
+   operation("find_msb", 1, source_types=integer_types, dest_type=int_type, 
c_expression={'u': "find_msb_uint({src0})", 'i': "find_msb_int({src0})"}),
+   operation("find_lsb", 1, source_types=integer_types, dest_type=int_type, 
c_expression={'u': "find_msb_uint({src0} & -int({src0}))", 'i': 
"find_msb_uint({src0} & -{src0})"}),
 
operation("saturate", 1, printable_name="sat", source_types=(float_type,), 
c_expression="CLAMP({src0}, 0.0f, 1.0f)"),
 
@@ -384,12 +406,12 @@ ir_expression_operation = [
 
# Binary comparison operators which return a boolean vector.
# The type of both operands must be equal.
-   operation("less", 2, printable_name="<"),
-   operation("greater", 2, printable_name=">"),
-   operation("lequal", 2, printable_name="<="),
-   operation("gequal", 2, printable_name=">="),
-   operation("equal", 2, printable_name="=="),
-   operation("nequal", 2, printable_name="!="),
+   operation("less", 2, printable_name="<", source_types=numeric_types, 
dest_type=bool_type, c_expression="{src0} < {src1}"),
+   operation("greater", 2, printable_name=">", source_types=numeric_types, 
dest_type=bool_type, c_expression="{src0} > {src1}"),
+   operation("lequal", 2, printable_name="<=", source_types=numeric_types, 
dest_type=bool_type, c_expression="{src0} <= {src1}"),
+   operation("gequal", 2, printable_name=">=", 

[Mesa-dev] [PATCH 08/56] glsl: Generate ir_expression_operation.h from Python

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

There are differences in where end-of-line comments are placed, but
'diff -wud' is clean.

v2: Massive rebase.

Signed-off-by: Ian Romanick 
---
 src/compiler/Makefile.glsl.am|   8 +
 src/compiler/Makefile.sources|   2 +-
 src/compiler/glsl/.gitignore |   1 +
 src/compiler/glsl/ir_expression_operation.h  | 340 
 src/compiler/glsl/ir_expression_operation.py | 384 +++
 src/mesa/Makefile.sources|   1 +
 src/mesa/drivers/dri/i965/Makefile.am|   1 +
 7 files changed, 396 insertions(+), 341 deletions(-)
 delete mode 100644 src/compiler/glsl/ir_expression_operation.h
 create mode 100644 src/compiler/glsl/ir_expression_operation.py

diff --git a/src/compiler/Makefile.glsl.am b/src/compiler/Makefile.glsl.am
index bfb3161..0dba618 100644
--- a/src/compiler/Makefile.glsl.am
+++ b/src/compiler/Makefile.glsl.am
@@ -25,6 +25,7 @@ EXTRA_DIST += glsl/tests glsl/glcpp/tests glsl/README \
glsl/TODO glsl/glcpp/README \
glsl/glsl_lexer.ll  \
glsl/glsl_parser.yy \
+   glsl/ir_expression_operation.py \
glsl/glcpp/glcpp-lex.l  \
glsl/glcpp/glcpp-parse.y\
SConscript.glsl
@@ -178,6 +179,7 @@ am__v_YACC_1 =
 
 YACC_GEN = $(AM_V_YACC)$(YACC) $(YFLAGS)
 LEX_GEN = $(AM_V_LEX)$(LEX) $(LFLAGS)
+PYTHON_GEN = $(AM_V_GEN)$(PYTHON2) $(PYTHON_FLAGS)
 
 glsl/glsl_parser.cpp glsl/glsl_parser.h: glsl/glsl_parser.yy
$(MKDIR_GEN)
@@ -195,6 +197,10 @@ glsl/glcpp/glcpp-lex.c: glsl/glcpp/glcpp-lex.l
$(MKDIR_GEN)
$(LEX_GEN) -o $@ $(srcdir)/glsl/glcpp/glcpp-lex.l
 
+glsl/ir_expression_operation.h: glsl/ir_expression_operation.py
+   $(MKDIR_GEN)
+   $(PYTHON_GEN) $(srcdir)/glsl/ir_expression_operation.py > $@ || ($(RM) 
$@; false)
+
 # Only the parsers (specifically the header files generated at the same time)
 # need to be in BUILT_SOURCES. Though if we list the parser headers YACC is
 # called for the .c/.cpp file and the .h files. By listing the .c/.cpp files
@@ -204,6 +210,7 @@ glsl/glcpp/glcpp-lex.c: glsl/glcpp/glcpp-lex.l
 BUILT_SOURCES +=   \
glsl/glsl_parser.cpp\
glsl/glsl_lexer.cpp \
+   glsl/ir_expression_operation.h  \
glsl/glcpp/glcpp-parse.c\
glsl/glcpp/glcpp-lex.c
 CLEANFILES +=  \
@@ -211,6 +218,7 @@ CLEANFILES +=   
\
glsl/glsl_parser.h  \
glsl/glsl_parser.cpp\
glsl/glsl_lexer.cpp \
+   glsl/ir_expression_operation.h  \
glsl/glcpp/glcpp-parse.c\
glsl/glcpp/glcpp-lex.c
 
diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources
index a2dd234..f645173 100644
--- a/src/compiler/Makefile.sources
+++ b/src/compiler/Makefile.sources
@@ -39,7 +39,6 @@ LIBGLSL_FILES = \
glsl/ir_equals.cpp \
glsl/ir_expression_flattening.cpp \
glsl/ir_expression_flattening.h \
-   glsl/ir_expression_operation.h \
glsl/ir_function_can_inline.cpp \
glsl/ir_function_detect_recursion.cpp \
glsl/ir_function_inlining.h \
@@ -146,6 +145,7 @@ GLSL_COMPILER_CXX_FILES = \
 
 # libglsl generated sources
 LIBGLSL_GENERATED_FILES = \
+   glsl/ir_expression_operation.h \
glsl/glsl_lexer.cpp \
glsl/glsl_parser.cpp \
glsl/glsl_parser.h
diff --git a/src/compiler/glsl/.gitignore b/src/compiler/glsl/.gitignore
index 09951ba..30f4bca 100644
--- a/src/compiler/glsl/.gitignore
+++ b/src/compiler/glsl/.gitignore
@@ -3,6 +3,7 @@ glsl_parser.cpp
 glsl_parser.h
 glsl_parser.output
 glsl_test
+ir_expression_operation.h
 subtest-cr/
 subtest-lf/
 subtest-cr-lf/
diff --git a/src/compiler/glsl/ir_expression_operation.h 
b/src/compiler/glsl/ir_expression_operation.h
deleted file mode 100644
index a97ce84..0000000
--- a/src/compiler/glsl/ir_expression_operation.h
+++ /dev/null
@@ -1,340 +0,0 @@
-/*
- * Copyright (C) 2010 Intel Corporation
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice (including the next

[Mesa-dev] [PATCH 31/56] glsl: Generate code for constant unary expressions that have different implementations for each source type

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/ir_expression_operation.py | 45 
 1 file changed, 33 insertions(+), 12 deletions(-)

diff --git a/src/compiler/glsl/ir_expression_operation.py 
b/src/compiler/glsl/ir_expression_operation.py
index bc1690b..a491ccf 100644
--- a/src/compiler/glsl/ir_expression_operation.py
+++ b/src/compiler/glsl/ir_expression_operation.py
@@ -85,6 +85,7 @@ double_type = type("double", "d", "GLSL_TYPE_DOUBLE")
 bool_type = type("bool", "b", "GLSL_TYPE_BOOL")
 
 numeric_types = (uint_type, int_type, float_type, double_type)
+signed_numeric_types = (int_type, float_type, double_type)
 integer_types = (uint_type, int_type)
 real_types = (float_type, double_type)
 
@@ -113,6 +114,24 @@ constant_template1 = mako.template.Template("""\
   }
   break;""")
 
+# This template is for unary operations that can have operands of a several
+# different types, and each type has a different C expression.  ir_unop_neg is
+# an example.
+constant_template3 = mako.template.Template("""\
+   case ${op.get_enum_name()}:
+  for (unsigned c = 0; c < op[0]->type->components(); c++) {
+ switch (this->type->base_type) {
+% for (dst_type, src_types) in op.signatures():
+ case ${src_types[0].glsl_type}:
+data.${dst_type.union_field}[c] = 
${op.get_c_expression(src_types)};
+break;
+% endfor
+ default:
+assert(0);
+ }
+  }
+  break;""")
+
 # This template is for unary operations that map an operand of one type to an
 # operand of another type.  ir_unop_f2b is an example.
 constant_template2 = mako.template.Template("""\
@@ -157,8 +176,10 @@ class operation(object):
 return constant_template2.render(op=self)
  elif len(self.source_types) == 1:
 return constant_template0.render(op=self)
- else:
+ elif len(self.c_expression) == 1 and 'default' in self.c_expression:
 return constant_template1.render(op=self)
+ else:
+return constant_template3.render(op=self)
 
   return None
 
@@ -178,12 +199,12 @@ class operation(object):
 ir_expression_operation = [
operation("bit_not", 1, printable_name="~", source_types=integer_types, 
c_expression="~ {src0}"),
operation("logic_not", 1, printable_name="!", source_types=(bool_type,), 
c_expression="!{src0}"),
-   operation("neg", 1),
-   operation("abs", 1),
-   operation("sign", 1),
-   operation("rcp", 1),
-   operation("rsq", 1),
-   operation("sqrt", 1),
+   operation("neg", 1, source_types=numeric_types, c_expression={'u': "-((int) 
{src0})", 'default': "-{src0}"}),
+   operation("abs", 1, source_types=signed_numeric_types, c_expression={'i': 
"{src0} < 0 ? -{src0} : {src0}", 'f': "fabsf({src0})", 'd': "fabs({src0})"}),
+   operation("sign", 1, source_types=signed_numeric_types, c_expression={'i': 
"({src0} > 0) - ({src0} < 0)", 'f': "float(({src0} > 0.0F) - ({src0} < 0.0F))", 
'd': "double(({src0} > 0.0) - ({src0} < 0.0))"}),
+   operation("rcp", 1, source_types=real_types, c_expression={'f': "{src0} != 
0.0F ? 1.0F / {src0} : 0.0F", 'd': "{src0} != 0.0 ? 1.0 / {src0} : 0.0"}),
+   operation("rsq", 1, source_types=real_types, c_expression={'f': "1.0F / 
sqrtf({src0})", 'd': "1.0 / sqrt({src0})"}),
+   operation("sqrt", 1, source_types=real_types, c_expression={'f': 
"sqrtf({src0})", 'd': "sqrt({src0})"}),
operation("exp", 1, source_types=(float_type,), 
c_expression="expf({src0})"), # Log base e on gentype
operation("log", 1, source_types=(float_type,), 
c_expression="logf({src0})"), # Natural log on gentype
operation("exp2", 1, source_types=(float_type,), 
c_expression="exp2f({src0})"),
@@ -233,11 +254,11 @@ ir_expression_operation = [
operation("bitcast_f2u", 1, source_types=(float_type,), 
dest_type=uint_type, c_expression="bitcast_f2u({src0})"),
 
# Unary floating-point rounding operations.
-   operation("trunc", 1),
-   operation("ceil", 1),
-   operation("floor", 1),
-   operation("fract", 1),
-   operation("round_even", 1),
+   operation("trunc", 1, source_types=real_types, c_expression={'f': 
"truncf({src0})", 'd': "trunc({src0})"}),
+   operation("ceil", 1, source_types=real_types, c_expression={'f': 
"ceilf({src0})", 'd': "ceil({src0})"}),
+   operation("floor", 1, source_types=real_types, c_expression={'f': 
"floorf({src0})", 'd': "floor({src0})"}),
+   operation("fract", 1, source_types=real_types, c_expression={'f': "{src0} - 
floorf({src0})", 'd': "{src0} - floor({src0})"}),
+   operation("round_even", 1, source_types=real_types, c_expression={'f': 
"_mesa_roundevenf({src0})", 'd': "_mesa_roundeven({src0})"}),
 
# Trigonometric operations.
operation("sin", 1, source_types=(float_type,), 
c_expression="sinf({src0})"),
-- 
2.5.5
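The template selection for unary operations after this patch can be summarized as a small decision function. The sketch below is a reading aid, not the script itself: pick_unary_template is a hypothetical name, it returns template names rather than rendering them, and it assumes (as the use of op.c_expression['default'] elsewhere in the script suggests) that a plain-string c_expression is stored as a one-entry dict under 'default'. The dest_type test is taken from the surrounding context of the method, not from this hunk.

```python
def pick_unary_template(source_types, dest_type, c_expression):
    """Return the name of the template a unary operation would use."""
    if dest_type is not None:
        # Output type differs from the input type, e.g. f2b.
        return "constant_template2"
    elif len(source_types) == 1:
        # Exactly one legal operand type, e.g. logic_not.
        return "constant_template0"
    elif len(c_expression) == 1 and "default" in c_expression:
        # Several operand types sharing one C expression, e.g. bit_not.
        return "constant_template1"
    else:
        # Several operand types, each with its own C expression, e.g. neg.
        return "constant_template3"


neg_expr = {"u": "-((int) {src0})", "default": "-{src0}"}
print(pick_unary_template(("uint", "int", "float", "double"), None, neg_expr))
```

The order of the tests matters: an operation such as neg has a 'default' entry too, but because its dict has more than one entry it falls through to constant_template3.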


[Mesa-dev] [PATCH 37/56] glsl: Generate code for constant unary expressions that don't assign the destination

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

These are operations like the pack and unpack functions: each is implemented
by a separate helper function that writes multiple outputs from a single
input, so the generated code does not assign the result itself.

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/ir_expression_operation.py | 26 ++
 1 file changed, 18 insertions(+), 8 deletions(-)

diff --git a/src/compiler/glsl/ir_expression_operation.py 
b/src/compiler/glsl/ir_expression_operation.py
index 882d5ea..2351dcf 100644
--- a/src/compiler/glsl/ir_expression_operation.py
+++ b/src/compiler/glsl/ir_expression_operation.py
@@ -188,10 +188,18 @@ constant_template_horizontal_single_implementation = 
mako.template.Template("""\
   data.${op.dest_type.union_field}[0] = ${op.c_expression['default']};
   break;""")
 
+# This template is for operations that are horizontal and do not assign the
+# result.  The various unpack operations are examples.
+constant_template_horizontal_nonassignment = mako.template.Template("""\
+   case ${op.get_enum_name()}:
+  ${op.c_expression['default']};
+  break;""")
+
 
 vector_scalar_operation = "vector-scalar"
 horizontal_operation = "horizontal"
 types_identical_operation = "identical"
+non_assign_operation = "nonassign"
 
 class operation(object):
def __init__(self, name, num_operands, printable_name = None, source_types 
= None, dest_type = None, c_expression = None, flags = None):
@@ -230,7 +238,9 @@ class operation(object):
  return None
 
   if self.num_operands == 1:
- if horizontal_operation in self.flags:
+ if horizontal_operation in self.flags and non_assign_operation in 
self.flags:
+return constant_template_horizontal_nonassignment.render(op=self)
+ elif horizontal_operation in self.flags:
 return 
constant_template_horizontal_single_implementation.render(op=self)
  elif self.dest_type is not None and len(self.source_types) == 1:
 return constant_template2.render(op=self)
@@ -351,11 +361,11 @@ ir_expression_operation = [
operation("pack_unorm_2x16", 1, printable_name="packUnorm2x16", 
source_types=(float_type,), dest_type=uint_type, 
c_expression="pack_2x16(pack_unorm_1x16, op[0]->value.f[0], 
op[0]->value.f[1])", flags=horizontal_operation),
operation("pack_unorm_4x8", 1, printable_name="packUnorm4x8", 
source_types=(float_type,), dest_type=uint_type, 
c_expression="pack_4x8(pack_unorm_1x8, op[0]->value.f[0], op[0]->value.f[1], 
op[0]->value.f[2], op[0]->value.f[3])", flags=horizontal_operation),
operation("pack_half_2x16", 1, printable_name="packHalf2x16", 
source_types=(float_type,), dest_type=uint_type, 
c_expression="pack_2x16(pack_half_1x16, op[0]->value.f[0], op[0]->value.f[1])", 
flags=horizontal_operation),
-   operation("unpack_snorm_2x16", 1, printable_name="unpackSnorm2x16"),
-   operation("unpack_snorm_4x8", 1, printable_name="unpackSnorm4x8"),
-   operation("unpack_unorm_2x16", 1, printable_name="unpackUnorm2x16"),
-   operation("unpack_unorm_4x8", 1, printable_name="unpackUnorm4x8"),
-   operation("unpack_half_2x16", 1, printable_name="unpackHalf2x16"),
+   operation("unpack_snorm_2x16", 1, printable_name="unpackSnorm2x16", 
source_types=(uint_type,), dest_type=float_type, 
c_expression="unpack_2x16(unpack_snorm_1x16, op[0]->value.u[0], &data.f[0], 
&data.f[1])", flags=frozenset((horizontal_operation, non_assign_operation))),
+   operation("unpack_snorm_4x8", 1, printable_name="unpackSnorm4x8", 
source_types=(uint_type,), dest_type=float_type, 
c_expression="unpack_4x8(unpack_snorm_1x8, op[0]->value.u[0], &data.f[0], 
&data.f[1], &data.f[2], &data.f[3])", flags=frozenset((horizontal_operation, 
non_assign_operation))),
+   operation("unpack_unorm_2x16", 1, printable_name="unpackUnorm2x16", 
source_types=(uint_type,), dest_type=float_type, 
c_expression="unpack_2x16(unpack_unorm_1x16, op[0]->value.u[0], &data.f[0], 
&data.f[1])", flags=frozenset((horizontal_operation, non_assign_operation))),
+   operation("unpack_unorm_4x8", 1, printable_name="unpackUnorm4x8", 
source_types=(uint_type,), dest_type=float_type, 
c_expression="unpack_4x8(unpack_unorm_1x8, op[0]->value.u[0], &data.f[0], 
&data.f[1], &data.f[2], &data.f[3])", flags=frozenset((horizontal_operation, 
non_assign_operation))),
+   operation("unpack_half_2x16", 1, printable_name="unpackHalf2x16", 
source_types=(uint_type,), dest_type=float_type, 
c_expression="unpack_2x16(unpack_half_1x16, op[0]->value.u[0], &data.f[0], 
&data.f[1])", flags=frozenset((horizontal_operation, non_assign_operation))),
 
# Bit operations, part of ARB_gpu_shader5.
operation("bitfield_reverse", 1, source_types=integer_types, 
c_expression="bitfield_reverse({src0})"),
@@ -366,8 +376,8 @@ ir_expression_operation = [
operation("saturate", 1, printable_name="sat", source_types=(float_type,), 
c_expression="CLAMP({src0}, 0.0f, 1.0f)"),
 
# Double packing, part of ARB_gpu_shader_fp64.
-   operation("pack_double_2x32", 1, printable_name="packDouble2x32"),
-   operation("unpack_double_2x32", 1, 

[Mesa-dev] [PATCH 32/56] glsl: Generate code for constant binary expressions that have one operand type

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/ir_expression_operation.py | 16 ++--
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/src/compiler/glsl/ir_expression_operation.py 
b/src/compiler/glsl/ir_expression_operation.py
index a491ccf..2ac2a28 100644
--- a/src/compiler/glsl/ir_expression_operation.py
+++ b/src/compiler/glsl/ir_expression_operation.py
@@ -89,8 +89,8 @@ signed_numeric_types = (int_type, float_type, double_type)
 integer_types = (uint_type, int_type)
 real_types = (float_type, double_type)
 
-# This template is for unary operations that can only have operands of a
-# single type.  ir_unop_logic_not is an example.
+# This template is for unary and binary operations that can only have operands
+# of a single type.  ir_unop_logic_not is an example.
 constant_template0 = mako.template.Template("""\
case ${op.get_enum_name()}:
   assert(op[0]->type->base_type == ${op.source_types[0].glsl_type});
@@ -180,12 +180,16 @@ class operation(object):
 return constant_template1.render(op=self)
  else:
 return constant_template3.render(op=self)
+  elif self.num_operands == 2:
+ if len(self.source_types) == 1:
+return constant_template0.render(op=self)
 
   return None
 
 
def get_c_expression(self, types):
   src0 = "op[0]->value.{}[c]".format(types[0].union_field)
+  src1 = "op[1]->value.{}[c]".format(types[1].union_field) if len(types) 
>= 2 else "ERROR"
 
   expr = self.c_expression[types[0].union_field] if types[0].union_field 
in self.c_expression else self.c_expression['default']
 
@@ -366,15 +370,15 @@ ir_expression_operation = [
operation("bit_xor", 2, printable_name="^"),
operation("bit_or", 2, printable_name="|"),
 
-   operation("logic_and", 2, printable_name="&&"),
-   operation("logic_xor", 2, printable_name="^^"),
-   operation("logic_or", 2, printable_name="||"),
+   operation("logic_and", 2, printable_name="&&", source_types=(bool_type,), 
c_expression="{src0} && {src1}"),
+   operation("logic_xor", 2, printable_name="^^", source_types=(bool_type,), 
c_expression="{src0} != {src1}"),
+   operation("logic_or", 2, printable_name="||", source_types=(bool_type,), 
c_expression="{src0} || {src1}"),
 
operation("dot", 2),
operation("min", 2),
operation("max", 2),
 
-   operation("pow", 2),
+   operation("pow", 2, source_types=(float_type,), c_expression="powf({src0}, 
{src1})"),
 
# Load a value the size of a given GLSL type from a uniform block.
#
-- 
2.5.5
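The placeholder expansion in get_c_expression can be tried outside the generator with a small stand-alone sketch. Several things here are assumptions rather than the script's actual code: GlslType is a hypothetical stand-in for the script's type class, the function takes the expression dict as a parameter instead of reading self.c_expression, and the final str.format call is a guess at the part of the method the hunk does not show. How the caller builds the types tuple for binary operations is also not visible in the hunk, so the sketch passes one entry per operand.

```python
from collections import namedtuple

# Hypothetical stand-in for the script's type class.
GlslType = namedtuple("GlslType", ["name", "union_field", "glsl_type"])
bool_type = GlslType("bool", "b", "GLSL_TYPE_BOOL")
float_type = GlslType("float", "f", "GLSL_TYPE_FLOAT")


def get_c_expression(c_expression, types):
    """Expand {src0}/{src1} into per-component reads of the operands.

    c_expression is a dict keyed by union field with a 'default' fallback.
    """
    src0 = "op[0]->value.{}[c]".format(types[0].union_field)
    src1 = ("op[1]->value.{}[c]".format(types[1].union_field)
            if len(types) >= 2 else "ERROR")

    key = types[0].union_field
    expr = c_expression[key] if key in c_expression else c_expression["default"]

    # Assumed final step: substitute the operand strings into the expression.
    return expr.format(src0=src0, src1=src1)


print(get_c_expression({"default": "{src0} && {src1}"}, (bool_type, bool_type)))
print(get_c_expression({"default": "powf({src0}, {src1})"}, (float_type, float_type)))
```

The "ERROR" fallback for src1 is harmless for unary operations because their expressions never mention {src1}; if one did, the generated C would fail to compile, which makes the mistake visible at build time.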



[Mesa-dev] [PATCH 16/56] glsl: Use _mesa_bitcount to implement constant ir_unop_bit_count

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/ir_constant_expression.cpp | 11 ++-
 1 file changed, 2 insertions(+), 9 deletions(-)

diff --git a/src/compiler/glsl/ir_constant_expression.cpp 
b/src/compiler/glsl/ir_constant_expression.cpp
index 694c9c7..04b5877 100644
--- a/src/compiler/glsl/ir_constant_expression.cpp
+++ b/src/compiler/glsl/ir_constant_expression.cpp
@@ -1500,15 +1500,8 @@ ir_expression::constant_expression_value(struct 
hash_table *variable_context)
   break;
 
case ir_unop_bit_count:
-  for (unsigned c = 0; c < components; c++) {
- unsigned count = 0;
- unsigned v = op[0]->value.u[c];
-
- for (; v; count++) {
-v &= v - 1;
- }
- data.u[c] = count;
-  }
+  for (unsigned c = 0; c < components; c++)
+ data.i[c] = _mesa_bitcount(op[0]->value.u[c]);
   break;
 
case ir_unop_find_msb:
-- 
2.5.5
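The open-coded loop this patch removes clears the lowest set bit on each iteration, so it runs once per set bit. A quick Python sketch shows that it agrees with a plain population count. The names are hypothetical, and bit_count_builtin is only a Python stand-in for a popcount helper, not _mesa_bitcount itself.

```python
def bit_count_loop(v):
    """The removed loop: v &= v - 1 clears the lowest set bit each pass."""
    count = 0
    while v:
        v &= v - 1
        count += 1
    return count


def bit_count_builtin(v):
    """Population count of the low 32 bits, standing in for a popcount helper."""
    return bin(v & 0xFFFFFFFF).count("1")


for value in (0x0, 0x1, 0xFF, 0x80000001, 0xFFFFFFFF):
    print(hex(value), bit_count_loop(value), bit_count_builtin(value))
```

The patch also changes the store from data.u[c] to data.i[c], which matches the int destination type that the generator entry for bit_count declares elsewhere in the series.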



[Mesa-dev] [PATCH 20/56] glsl: Extract ir_binop_ldexp implementation to a separate function

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

---
 src/compiler/glsl/ir_constant_expression.cpp | 39 
 1 file changed, 28 insertions(+), 11 deletions(-)

diff --git a/src/compiler/glsl/ir_constant_expression.cpp 
b/src/compiler/glsl/ir_constant_expression.cpp
index 71afb33..0f90c4e 100644
--- a/src/compiler/glsl/ir_constant_expression.cpp
+++ b/src/compiler/glsl/ir_constant_expression.cpp
@@ -519,6 +519,24 @@ find_msb_int(int32_t v)
return find_msb_uint(v < 0 ? ~v : v);
 }
 
+static float
+ldexpf_flush_subnormal(float x, int exp)
+{
+   const float result = ldexpf(x, exp);
+
+   /* Flush subnormal values to zero. */
+   return !isnormal(result) ? copysignf(0.0f, x) : result;
+}
+
+static double
+ldexp_flush_subnormal(double x, int exp)
+{
+   const double result = ldexp(x, exp);
+
+   /* Flush subnormal values to zero. */
+   return !isnormal(result) ? copysign(0.0, x) : result;
+}
+
 ir_constant *
 ir_expression::constant_expression_value(struct hash_table *variable_context)
 {
@@ -1622,17 +1640,16 @@ ir_expression::constant_expression_value(struct 
hash_table *variable_context)
}
 
case ir_binop_ldexp:
-  for (unsigned c = 0; c < components; c++) {
- if (op[0]->type->base_type == GLSL_TYPE_DOUBLE) {
-data.d[c] = ldexp(op[0]->value.d[c], op[1]->value.i[c]);
-/* Flush subnormal values to zero. */
-if (!isnormal(data.d[c]))
-   data.d[c] = copysign(0.0, op[0]->value.d[c]);
- } else {
-data.f[c] = ldexpf(op[0]->value.f[c], op[1]->value.i[c]);
-/* Flush subnormal values to zero. */
-if (!isnormal(data.f[c]))
-   data.f[c] = copysignf(0.0f, op[0]->value.f[c]);
+  for (unsigned c = 0; c < op[0]->type->components(); c++) {
+ switch (this->type->base_type) {
+ case GLSL_TYPE_FLOAT:
+data.f[c] = ldexpf_flush_subnormal(op[0]->value.f[c], 
op[1]->value.i[c]);
+break;
+ case GLSL_TYPE_DOUBLE:
+data.d[c] = ldexp_flush_subnormal(op[0]->value.d[c], 
op[1]->value.i[c]);
+break;
+ default:
+assert(0);
  }
   }
   break;
-- 
2.5.5
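The flush-to-zero behavior of the double helper can be modeled in Python as below. This is a sketch with two stated assumptions: is_normal is a hand-written stand-in for C's isnormal(), and the OverflowError branch emulates C's ldexp returning an infinity on overflow, since Python's math.ldexp raises instead.

```python
import math
import sys


def is_normal(x):
    """Stand-in for C isnormal(): finite, non-zero, and not subnormal."""
    return math.isfinite(x) and abs(x) >= sys.float_info.min


def ldexp_flush_subnormal(x, exp):
    """Python model of the double-precision helper added by the patch."""
    try:
        result = math.ldexp(x, exp)
    except OverflowError:
        # C's ldexp would return a signed infinity here.
        result = math.copysign(math.inf, x)

    # Anything that is not a normal number becomes zero with the sign of x.
    return result if is_normal(result) else math.copysign(0.0, x)


print(ldexp_flush_subnormal(1.0, 3))
print(ldexp_flush_subnormal(-1.0, -1074))
```

As in the C helper, the !isnormal() test also catches zero, infinity, and NaN results, so all of those come back as a zero carrying the sign of x. That matches the code being replaced, which used the same test inline.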



[Mesa-dev] [PATCH 13/56] glsl: Do not generate comments or extra whitespace in expression files

2016-07-19 Thread Ian Romanick
From: Ian Romanick 

The comments and whitespace can live in the Python code.

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/ir_expression_operation.py | 517 +++
 1 file changed, 216 insertions(+), 301 deletions(-)

diff --git a/src/compiler/glsl/ir_expression_operation.py 
b/src/compiler/glsl/ir_expression_operation.py
index 8e2dd27..10c9626 100644
--- a/src/compiler/glsl/ir_expression_operation.py
+++ b/src/compiler/glsl/ir_expression_operation.py
@@ -25,294 +25,220 @@ import mako.template
 import sys
 
 ir_expression_operation = [
-   # Name        operands  string  comment
-   ("bit_not", 1, "~", None),
-   ("logic_not", 1, "!", None),
-   ("neg", 1, None, None),
-   ("abs", 1, None, None),
-   ("sign", 1, None, None),
-   ("rcp", 1, None, None),
-   ("rsq", 1, None, None),
-   ("sqrt", 1, None, None),
-   ("exp", 1, None, "Log base e on gentype"),
-   ("log", 1, None, "Natural log on gentype"),
-   ("exp2", 1, None, None),
-   ("log2", 1, None, None),
-   ("f2i", 1, None, "Float-to-integer conversion."),
-   ("f2u", 1, None, "Float-to-unsigned conversion."),
-   ("i2f", 1, None, "Integer-to-float conversion."),
-   ("f2b", 1, None, "Float-to-boolean conversion"),
-   ("b2f", 1, None, "Boolean-to-float conversion"),
-   ("i2b", 1, None, "int-to-boolean conversion"),
-   ("b2i", 1, None, "Boolean-to-int conversion"),
-   ("u2f", 1, None, "Unsigned-to-float conversion."),
-   ("i2u", 1, None, "Integer-to-unsigned conversion."),
-   ("u2i", 1, None, "Unsigned-to-integer conversion."),
-   ("d2f", 1, None, "Double-to-float conversion."),
-   ("f2d", 1, None, "Float-to-double conversion."),
-   ("d2i", 1, None, "Double-to-integer conversion."),
-   ("i2d", 1, None, "Integer-to-double conversion."),
-   ("d2u", 1, None, "Double-to-unsigned conversion."),
-   ("u2d", 1, None, "Unsigned-to-double conversion."),
-   ("d2b", 1, None, "Double-to-boolean conversion."),
-   ("bitcast_i2f", 1, None, 'Bit-identical int-to-float "conversion"'),
-   ("bitcast_f2i", 1, None, 'Bit-identical float-to-int "conversion"'),
-   ("bitcast_u2f", 1, None, 'Bit-identical uint-to-float "conversion"'),
-   ("bitcast_f2u", 1, None, 'Bit-identical float-to-uint "conversion"'),
-"""
-   /**
-* \\name Unary floating-point rounding operations.
-*/
-   /*@{*/""",
-   ("trunc", 1, None, None),
-   ("ceil", 1, None, None),
-   ("floor", 1, None, None),
-   ("fract", 1, None, None),
-   ("round_even", 1, None, None),
-"""   /*@}*/
-
-   /**
-* \\name Trigonometric operations.
-*/
-   /*@{*/""",
-   ("sin", 1, None, None),
-   ("cos", 1, None, None),
-"""   /*@}*/
-
-   /**
-* \\name Partial derivatives.
-*/
-   /*@{*/""",
-   ("dFdx", 1, None, None),
-   ("dFdx_coarse", 1, "dFdxCoarse", None),
-   ("dFdx_fine", 1, "dFdxFine", None),
-   ("dFdy", 1, None, None),
-   ("dFdy_coarse", 1, "dFdyCoarse", None),
-   ("dFdy_fine", 1, "dFdyFine", None),
-"""   /*@}*/
-
-   /**
-* \\name Floating point pack and unpack operations.
-*/
-   /*@{*/""",
-   ("pack_snorm_2x16", 1, "packSnorm2x16", None),
-   ("pack_snorm_4x8", 1, "packSnorm4x8", None),
-   ("pack_unorm_2x16", 1, "packUnorm2x16", None),
-   ("pack_unorm_4x8", 1, "packUnorm4x8", None),
-   ("pack_half_2x16", 1, "packHalf2x16", None),
-   ("unpack_snorm_2x16", 1, "unpackSnorm2x16", None),
-   ("unpack_snorm_4x8", 1, "unpackSnorm4x8", None),
-   ("unpack_unorm_2x16", 1, "unpackUnorm2x16", None),
-   ("unpack_unorm_4x8", 1, "unpackUnorm4x8", None),
-   ("unpack_half_2x16", 1, "unpackHalf2x16", None),
-"""   /*@}*/
-
-   /**
-* \\name Bit operations, part of ARB_gpu_shader5.
-*/
-   /*@{*/""",
-   ("bitfield_reverse", 1, None, None),
-   ("bit_count", 1, None, None),
-   ("find_msb", 1, None, None),
-   ("find_lsb", 1, None, None),
-"""   /*@}*/
-""",
-   ("saturate", 1, "sat", None),
-"""
-   /**
-* \\name Double packing, part of ARB_gpu_shader_fp64.
-*/
-   /*@{*/""",
-   ("pack_double_2x32", 1, "packDouble2x32", None),
-   ("unpack_double_2x32", 1, "unpackDouble2x32", None),
-"""   /*@}*/
-""",
-   ("frexp_sig", 1, None, None),
-   ("frexp_exp", 1, None, None),
-   "",
-   ("noise", 1, None, None),
-   "",
-   ("subroutine_to_int", 1, None, None),
-"""   /**
-* Interpolate fs input at centroid
-*
-* operand0 is the fs input.
-*/""",
-   ("interpolate_at_centroid", 1, None, None),
-"""
-   /**
-* Ask the driver for the total size of a buffer block.
-*
-* operand0 is the ir_constant buffer block index in the linked shader.
-*/""",
-   ("get_buffer_size", 1, None, None),
-"""
-   /**
-* Calculate length of an unsized array inside a buffer block.
-* This opcode is going to be replaced in a lowering pass inside
-* the linker.
-*
-* operand0 is the unsized array's ir_value for the calculation
-* of its length.
-*/""",
-   ("ssbo_unsized_array_length", 1, None, None),
-"""
-   /**
-* Vote among threads on the 
