Re: [Mesa-dev] NIR constant problem for GPU which doesn't have native integer support

2019-01-04 Thread Qiang Yu
Thanks for your guys' information, although it's much deeper than my
question and also my understanding of the NIR and compiler knowledge.

This patch fix my problem:
https://patchwork.freedesktop.org/patch/268946/
But seems it's not the final solution, right? And I'm not sure if we will
have other integer problem besides constant for non-integer-gpu, do we?

To be honest, as a backend developer, NIR non-integer-gpu support is
out of my ability. So maybe I have to take this patch temporarily and work
on other topic for sometime to see if any progress of this support in NIR
in the future.

Thanks,
Qiang

On Sat, Jan 5, 2019 at 12:40 AM Ilia Mirkin  wrote:
>
> On Fri, Jan 4, 2019 at 10:46 AM Jason Ekstrand  wrote:
> >
> > On Fri, Jan 4, 2019 at 4:07 AM Erik Faye-Lund 
> >  wrote:
> >>
> >> On Thu, 2019-01-03 at 11:58 -0600, Jason Ekstrand wrote:
> >> > On Thu, Jan 3, 2019 at 3:39 AM Erik Faye-Lund <
> >> > erik.faye-l...@collabora.com> wrote:
> >> > > On Wed, 2019-01-02 at 10:16 -0600, Jason Ekstrand wrote:
> >> > > > On Wed, Jan 2, 2019 at 9:43 AM Ilia Mirkin 
> >> > > > wrote:
> >> > > > > Have a look at the first 4 patches in the series from Jonathan
> >> > > > > Marek
> >> > > > > to address some of these issues:
> >> > > > >
> >> > > > > https://patchwork.freedesktop.org/series/54295/
> >> > > > >
> >> > > > > Not sure exactly what state that work is in, but I've added
> >> > > > > Jonathan
> >> > > > > to CC, perhaps he can provide an update.
> >> > > > >
> >> > > > > Cheers,
> >> > > > >
> >> > > > >   -ilia
> >> > > > >
> >> > > > > On Wed, Jan 2, 2019 at 6:28 AM Qiang Yu 
> >> > > wrote:
> >> > > > > >
> >> > > > > > Hi guys,
> >> > > > > >
> >> > > > > > I found the problem with this test fragment shader when lima
> >> > > > > development:
> >> > > > > > uniform int color;
> >> > > > > > void main() {
> >> > > > > > if (color > 1)
> >> > > > > > gl_FragColor = vec4(1.0, 0.0, 0.0, 1);
> >> > > > > > else
> >> > > > > > gl_FragColor = vec4(0.0, 1.0, 0.0, 1);
> >> > > > > > }
> >> > > > > >
> >> > > > > > nir_print_shader output:
> >> > > > > > impl main {
> >> > > > > > block block_0:
> >> > > > > > /* preds: */
> >> > > > > > vec1 32 ssa_0 = load_const (0x0001 /* 0.00
> >> > > */)
> >> > > > > > vec4 32 ssa_1 = load_const (0x3f80 /* 1.00
> >> > > */,
> >> > > > > > 0x /* 0.00 */, 0x /* 0.00 */,
> >> > > 0x3f80
> >> > > > > /*
> >> > > > > > 1.00 */)
> >> > > > > > vec4 32 ssa_2 = load_const (0x /* 0.00
> >> > > */,
> >> > > > > > 0x3f80 /* 1.00 */, 0x /* 0.00 */,
> >> > > 0x3f80
> >> > > > > /*
> >> > > > > > 1.00 */)
> >> > > > > > vec1 32 ssa_3 = load_const (0x /* 0.00
> >> > > */)
> >> > > > > > vec1 32 ssa_4 = intrinsic load_uniform (ssa_3) (0, 1,
> >> > > 0)
> >> > > > > /*
> >> > > > > > base=0 */ /* range=1 */ /* component=0 */   /* color */
> >> > > > > > vec1 32 ssa_5 = slt ssa_0, ssa_4
> >> > > > > > vec1 32 ssa_6 = fnot ssa_5
> >> > > > > > vec4 32 ssa_7 = bcsel ssa_6., ssa_2, ssa_1
> >> > > > > > intrinsic store_output (ssa_7, ssa_3) (0, 15, 0) /*
> >> > > > > base=0 */
> >> > > > > > /* wrmask=xyzw */ /* component=0 */   /* gl_FragColor */
> >> > > > > > /* succs: block_1 */
> >> > > > > > block block_1:
> >> > > > > > }
> >> > > > > >
> >> > > > > > ssa0 is not converted to float when glsl to nir. I see
> >> > > > > glsl_to_nir.cpp
> >> > > > > > will create flt/ilt/ult
> >> > > > > > based on source type for gpu support native integer, but for
> >> > > gpu
> >> > > > > not
> >> > > > > > support native
> >> > > > > > integer, just create slt for all source type. And in
> >> > > > > > nir_lower_constant_initializers,
> >> > > > > > there's also no type conversion for integer constant.
> >> > > >
> >> > > > This is a generally sticky issue.  In NIR, we have no concept of
> >> > > > types on SSA values which has proven perfectly reasonable and
> >> > > > actually very powerful in a world where integers are supported
> >> > > > natively.  Unfortunately, it causes significant problems for
> >> > > float-
> >> > > > only architectures.
> >> > >
> >> > > I would like to take this chance to say that this untyped SSA-value
> >> > > choice has lead to issues in both radeon_si (because LLVM values
> >> > > are
> >> > > typed) and zink (similarly, because SPIR-V values are typed), where
> >> > > we
> >> > > need to to bitcasts on every access because there's just not enough
> >> > > information available to emit variables with the right type.
> >> >
> >> > I'm not sure if I agree that the two problems are the same or not...
> >> > More on that in a bit.
> >> >
> >> > > It took us a lot of time to realize that the meta-data from the
> >> > > opcodes
> >> > > doesn't *really* provide this, because the rest of nir doesn't
> >> > > treat
> >> > > values consistently. 

Re: [Mesa-dev] [PATCH] st/glsl: refactor st_link_nir()

2019-01-04 Thread Eric Anholt
Timothy Arceri  writes:

> The functional change here is moving the nir_lower_io_to_scalar_early()
> calls inside st_nir_link_shaders() and moving the st_nir_opts() call
> after the call to nir_lower_io_arrays_to_elements().
>
> This fixes a bug with the following piglit test due to the current code
> not cleaning up dead code after we lower arrays. This was causing an
> assert in the new duplicate varyings link time opt introduced in
> 70be9afccb23.
>
> tests/spec/glsl-1.10/execution/vsfs-unused-array-member.shader_test
>
> Moving the nir_lower_io_to_scalar_early() calls also allows us to tidy
> up the code a little and merge some loops.

I love the reduction in stage tracking logic.

Reviewed-by: Eric Anholt 


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFC 4/6] dri: Enable fp16 configs and visuals

2019-01-04 Thread Ilia Mirkin
On Fri, Jan 4, 2019 at 5:04 PM Ilia Mirkin  wrote:
>
> On Fri, Jan 4, 2019 at 4:56 PM Kevin Strasser  
> wrote:
> > diff --git a/src/mesa/drivers/dri/common/utils.c 
> > b/src/mesa/drivers/dri/common/utils.c
> > index b52c59f..1cf9362 100644
> > --- a/src/mesa/drivers/dri/common/utils.c
> > +++ b/src/mesa/drivers/dri/common/utils.c
> > @@ -200,6 +200,10 @@ driCreateConfigs(mesa_format format,
> >{ 0x03FF, 0x000FFC00, 0x3FF0, 0x },
> >/* MESA_FORMAT_R10G10B10A2_UNORM */
> >{ 0x03FF, 0x000FFC00, 0x3FF0, 0xC000 },
> > +  /* MESA_FORMAT_RGBX_FLOAT16 */
> > +  { 0x, 0x, 0x, 
> > 0x},
> > +  /* MESA_FORMAT_RGBA_FLOAT16 */
> > +  { 0x, 0x, 0x, 
> > 0x},
>
> I'm about 37% sure that these need to have a ULL suffix on them, as
> otherwise immediates aren't upgraded to a wider type. I'm surprised
> there were no warnings/errors about that. Perhaps I'm just wrong
> though -- good to check either way.
>
> If if turns out I'm right, there were a few other instances of such
> wide immediates.

As confirmed on IRC, appears that I'm wrong about this. The values do
get upgraded to the wider type.

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFC 4/6] dri: Enable fp16 configs and visuals

2019-01-04 Thread Ilia Mirkin
On Fri, Jan 4, 2019 at 4:56 PM Kevin Strasser  wrote:
> diff --git a/src/mesa/drivers/dri/common/utils.c 
> b/src/mesa/drivers/dri/common/utils.c
> index b52c59f..1cf9362 100644
> --- a/src/mesa/drivers/dri/common/utils.c
> +++ b/src/mesa/drivers/dri/common/utils.c
> @@ -200,6 +200,10 @@ driCreateConfigs(mesa_format format,
>{ 0x03FF, 0x000FFC00, 0x3FF0, 0x },
>/* MESA_FORMAT_R10G10B10A2_UNORM */
>{ 0x03FF, 0x000FFC00, 0x3FF0, 0xC000 },
> +  /* MESA_FORMAT_RGBX_FLOAT16 */
> +  { 0x, 0x, 0x, 
> 0x},
> +  /* MESA_FORMAT_RGBA_FLOAT16 */
> +  { 0x, 0x, 0x, 
> 0x},

I'm about 37% sure that these need to have a ULL suffix on them, as
otherwise immediates aren't upgraded to a wider type. I'm surprised
there were no warnings/errors about that. Perhaps I'm just wrong
though -- good to check either way.

If if turns out I'm right, there were a few other instances of such
wide immediates.

Cheers,

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [RFC 0/6] Enable fp16 visuals and fbconfigs

2019-01-04 Thread Kevin Strasser
This series enables fp16 fbconfigs and visuals by leveraging existing off-screen
rendering support.

These formats can be used in conjunction with EXT_surface_SMPTE2086_metadata
(not yet implemented by any drivers) to support EXT_gl_colorspace_scrgb /
EXT_gl_colorspace_scrgb_linear, used in places like Android wide color gamut.

While I have run this series against Piglit, I still need to sort out test
coverage for these formats. If anyone has pointers to existing tests that would
be really helpful.

As an easy smoke test I have a modified version of kmscube:
  https://github.com/strassek/kmscube/commits/fp16

Kevin Strasser (6):
  dri: Support 64 bit rgba masks
  dri: Set bit for float configs
  drm-uapi: Add fp16 formats to drm_fourcc.h
  dri: Enable fp16 configs and visuals
  gallium/winsys/kms: Respect format bpp
  gbm: Add visuals and buffer handling for fp16 formats

 include/GL/internal/dri_interface.h| 11 ++-
 include/drm-uapi/drm_fourcc.h  |  8 +++
 src/egl/drivers/dri2/egl_dri2.c| 32 +++--
 src/egl/drivers/dri2/egl_dri2.h|  6 +-
 src/egl/drivers/dri2/platform_android.c|  2 +-
 src/egl/drivers/dri2/platform_drm.c| 79 ++
 src/egl/drivers/dri2/platform_surfaceless.c|  2 +-
 src/egl/drivers/dri2/platform_wayland.c|  2 +-
 src/egl/drivers/dri2/platform_x11.c|  6 +-
 .../auxiliary/pipe-loader/driinfo_gallium.h|  1 +
 src/gallium/state_trackers/dri/dri2.c  | 22 ++
 src/gallium/state_trackers/dri/dri_drawable.c  |  3 +
 src/gallium/state_trackers/dri/dri_screen.c| 26 ++-
 src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c  |  2 +-
 src/gbm/backends/dri/gbm_dri.c | 38 ++-
 src/gbm/backends/dri/gbm_driint.h  |  9 +--
 src/gbm/main/gbm.c |  3 +
 src/gbm/main/gbm.h |  9 +++
 src/glx/glxconfig.h|  2 +-
 src/loader/loader_dri3_helper.c|  5 ++
 src/mesa/drivers/dri/common/dri_util.c |  8 +++
 src/mesa/drivers/dri/common/utils.c| 31 -
 src/mesa/drivers/dri/i965/intel_screen.c   | 39 ++-
 src/mesa/main/mtypes.h |  2 +-
 src/mesa/state_tracker/st_cb_fbo.c |  3 +
 src/util/xmlpool/t_options.h   |  5 ++
 26 files changed, 311 insertions(+), 45 deletions(-)

-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [RFC 5/6] gallium/winsys/kms: Respect format bpp

2019-01-04 Thread Kevin Strasser
Needed for allocating buffers with pixel formats wider than 32 bpp.

Signed-off-by: Kevin Strasser 
---
 src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c 
b/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c
index 9564d94..e921a61 100644
--- a/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c
+++ b/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c
@@ -182,7 +182,7 @@ kms_sw_displaytarget_create(struct sw_winsys *ws,
kms_sw_dt->format = format;
 
memset(_req, 0, sizeof(create_req));
-   create_req.bpp = 32;
+   create_req.bpp = util_format_get_blocksizebits(format);
create_req.width = width;
create_req.height = height;
ret = drmIoctl(kms_sw->fd, DRM_IOCTL_MODE_CREATE_DUMB, _req);
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [RFC 4/6] dri: Enable fp16 configs and visuals

2019-01-04 Thread Kevin Strasser
Add dri image formats for RGBA ordered 64 bpp IEEE 754 half precision
floating point.

Introduce a new dri configuration option so users can diable exposure of
fp16 formats, following the same design and policy of rgb10a2 (opt-in
for i965 and opt-out for gallium).

Add a loader cap field so loaders can indicate if they know how to
handle fp16 formats.

Add visuals to gallium and i965, leverage existing offscreen render
support for MESA_FORMAT_RGBA_FLOAT16 and MESA_FORMAT_RGBX_FLOAT16.

Signed-off-by: Kevin Strasser 
---
 include/GL/internal/dri_interface.h|  5 +++
 src/egl/drivers/dri2/egl_dri2.c|  2 ++
 .../auxiliary/pipe-loader/driinfo_gallium.h|  1 +
 src/gallium/state_trackers/dri/dri2.c  | 22 
 src/gallium/state_trackers/dri/dri_drawable.c  |  3 ++
 src/gallium/state_trackers/dri/dri_screen.c| 26 ++-
 src/loader/loader_dri3_helper.c|  5 +++
 src/mesa/drivers/dri/common/dri_util.c |  8 +
 src/mesa/drivers/dri/common/utils.c| 10 ++
 src/mesa/drivers/dri/i965/intel_screen.c   | 39 --
 src/mesa/state_tracker/st_cb_fbo.c |  3 ++
 src/util/xmlpool/t_options.h   |  5 +++
 12 files changed, 125 insertions(+), 4 deletions(-)

diff --git a/include/GL/internal/dri_interface.h 
b/include/GL/internal/dri_interface.h
index c5761c4..76ebfae 100644
--- a/include/GL/internal/dri_interface.h
+++ b/include/GL/internal/dri_interface.h
@@ -1038,6 +1038,7 @@ enum dri_loader_cap {
 * only BGRA ordering can be exposed.
 */
DRI_LOADER_CAP_RGBA_ORDERING,
+   DRI_LOADER_CAP_FP16,
 };
 
 struct __DRIdri2LoaderExtensionRec {
@@ -1277,6 +1278,8 @@ struct __DRIdri2ExtensionRec {
 #define __DRI_IMAGE_FORMAT_XBGR2101010  0x1010
 #define __DRI_IMAGE_FORMAT_ABGR2101010  0x1011
 #define __DRI_IMAGE_FORMAT_SABGR8   0x1012
+#define __DRI_IMAGE_FORMAT_XBGR16161616F 0x1013
+#define __DRI_IMAGE_FORMAT_ABGR16161616F 0x1014
 
 #define __DRI_IMAGE_USE_SHARE  0x0001
 #define __DRI_IMAGE_USE_SCANOUT0x0002
@@ -1322,6 +1325,8 @@ struct __DRIdri2ExtensionRec {
 #define __DRI_IMAGE_FOURCC_RGBX1010102 0x30335852
 #define __DRI_IMAGE_FOURCC_BGRA1010102 0x30334142
 #define __DRI_IMAGE_FOURCC_BGRX1010102 0x30335842
+#define __DRI_IMAGE_FOURCC_ABGR16161616F 0x48344241
+#define __DRI_IMAGE_FOURCC_XBGR16161616F 0x48344258
 #define __DRI_IMAGE_FOURCC_YUV410  0x39565559
 #define __DRI_IMAGE_FOURCC_YUV411  0x31315559
 #define __DRI_IMAGE_FOURCC_YUV420  0x32315559
diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c
index fb8f1b7..6b44767 100644
--- a/src/egl/drivers/dri2/egl_dri2.c
+++ b/src/egl/drivers/dri2/egl_dri2.c
@@ -2233,6 +2233,8 @@ dri2_num_fourcc_format_planes(EGLint format)
case DRM_FORMAT_ABGR2101010:
case DRM_FORMAT_RGBA1010102:
case DRM_FORMAT_BGRA1010102:
+   case DRM_FORMAT_XBGR16161616F:
+   case DRM_FORMAT_ABGR16161616F:
case DRM_FORMAT_YUYV:
case DRM_FORMAT_YVYU:
case DRM_FORMAT_UYVY:
diff --git a/src/gallium/auxiliary/pipe-loader/driinfo_gallium.h 
b/src/gallium/auxiliary/pipe-loader/driinfo_gallium.h
index 9db0dc0..76637e5 100644
--- a/src/gallium/auxiliary/pipe-loader/driinfo_gallium.h
+++ b/src/gallium/auxiliary/pipe-loader/driinfo_gallium.h
@@ -37,4 +37,5 @@ DRI_CONF_SECTION_MISCELLANEOUS
DRI_CONF_ALWAYS_HAVE_DEPTH_BUFFER("false")
DRI_CONF_GLSL_ZERO_INIT("false")
DRI_CONF_ALLOW_RGB10_CONFIGS("true")
+   DRI_CONF_ALLOW_FP16_CONFIGS("true")
 DRI_CONF_SECTION_END
diff --git a/src/gallium/state_trackers/dri/dri2.c 
b/src/gallium/state_trackers/dri/dri2.c
index 6fc07e4..c7f429a 100644
--- a/src/gallium/state_trackers/dri/dri2.c
+++ b/src/gallium/state_trackers/dri/dri2.c
@@ -74,6 +74,10 @@ struct dri2_format_mapping {
 };
 
 static const struct dri2_format_mapping dri2_format_table[] = {
+  { __DRI_IMAGE_FOURCC_ABGR16161616F, __DRI_IMAGE_FORMAT_ABGR16161616F,
+__DRI_IMAGE_COMPONENTS_RGBA,  PIPE_FORMAT_R16G16B16A16_FLOAT },
+  { __DRI_IMAGE_FOURCC_XBGR16161616F, __DRI_IMAGE_FORMAT_XBGR16161616F,
+__DRI_IMAGE_COMPONENTS_RGB,   PIPE_FORMAT_R16G16B16X16_FLOAT },
   { __DRI_IMAGE_FOURCC_ARGB2101010,   __DRI_IMAGE_FORMAT_ARGB2101010,
 __DRI_IMAGE_COMPONENTS_RGBA,  PIPE_FORMAT_B10G10R10A2_UNORM },
   { __DRI_IMAGE_FOURCC_XRGB2101010,   __DRI_IMAGE_FORMAT_XRGB2101010,
@@ -222,6 +226,12 @@ dri2_drawable_get_buffers(struct dri_drawable *drawable,
* may occur as the stvis->color_format.
*/
   switch(format) {
+  case PIPE_FORMAT_R16G16B16A16_FLOAT:
+ depth = 64;
+ break;
+  case PIPE_FORMAT_R16G16B16X16_FLOAT:
+ depth = 48;
+ break;
   case PIPE_FORMAT_B10G10R10A2_UNORM:
   case PIPE_FORMAT_R10G10B10A2_UNORM:
   case PIPE_FORMAT_BGRA_UNORM:
@@ -300,6 +310,12 @@ dri_image_drawable_get_buffers(struct 

[Mesa-dev] [RFC 2/6] dri: Set bit for float configs

2019-01-04 Thread Kevin Strasser
Populate __DRI_ATTRIB_FLOAT_BIT, such that we can filter for formats
containing floating point pixel data and egl/glx can satisfy the
requirements for their respective extensions.

Signed-off-by: Kevin Strasser 
---
 src/egl/drivers/dri2/egl_dri2.c |  2 ++
 src/egl/drivers/dri2/platform_drm.c | 18 +++---
 src/gbm/backends/dri/gbm_dri.c  | 12 
 src/gbm/backends/dri/gbm_driint.h   |  1 +
 src/mesa/drivers/dri/common/utils.c |  5 +
 5 files changed, 35 insertions(+), 3 deletions(-)

diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c
index d19950d..fb8f1b7 100644
--- a/src/egl/drivers/dri2/egl_dri2.c
+++ b/src/egl/drivers/dri2/egl_dri2.c
@@ -202,6 +202,8 @@ dri2_add_config(_EGLDisplay *disp, const __DRIconfig 
*dri_config, int id,
  ); ++i) {
   switch (attrib) {
   case __DRI_ATTRIB_RENDER_TYPE:
+ if (value & __DRI_ATTRIB_FLOAT_BIT)
+_eglSetConfigKey(, EGL_COLOR_COMPONENT_TYPE_EXT, 
EGL_COLOR_COMPONENT_TYPE_FLOAT_EXT);
  if (value & __DRI_ATTRIB_RGBA_BIT)
 value = EGL_RGB_BUFFER;
  else if (value & __DRI_ATTRIB_LUMINANCE_BIT)
diff --git a/src/egl/drivers/dri2/platform_drm.c 
b/src/egl/drivers/dri2/platform_drm.c
index 47563d6..ae235e9 100644
--- a/src/egl/drivers/dri2/platform_drm.c
+++ b/src/egl/drivers/dri2/platform_drm.c
@@ -98,6 +98,7 @@ dri2_drm_config_is_compatible(struct dri2_egl_display 
*dri2_dpy,
const struct gbm_dri_visual *visual = NULL;
unsigned long long int red, green, blue, alpha;
unsigned int mask_hi = 0, mask_lo, render_type;
+   bool is_float;
int i;
 
/* Check that the EGLConfig being used to render to the surface is
@@ -129,6 +130,10 @@ dri2_drm_config_is_compatible(struct dri2_egl_display 
*dri2_dpy,
_lo);
alpha = (unsigned long long int)mask_hi << 32 | mask_lo;
 
+   dri2_dpy->core->getConfigAttrib(config, __DRI_ATTRIB_RENDER_TYPE,
+   _type);
+   is_float = (render_type & __DRI_ATTRIB_FLOAT_BIT) ? true : false;
+
for (i = 0; i < dri2_dpy->gbm_dri->num_visuals; i++) {
   visual = _dpy->gbm_dri->visual_table[i];
   if (visual->gbm_format == surface->format)
@@ -141,7 +146,8 @@ dri2_drm_config_is_compatible(struct dri2_egl_display 
*dri2_dpy,
if (red != visual->rgba_masks.red ||
green != visual->rgba_masks.green ||
blue != visual->rgba_masks.blue ||
-   (alpha && visual->rgba_masks.alpha && alpha != 
visual->rgba_masks.alpha)) {
+   (alpha && visual->rgba_masks.alpha && alpha != 
visual->rgba_masks.alpha) ||
+   is_float != visual->is_float) {
   return false;
}
 
@@ -649,7 +655,8 @@ drm_add_configs_for_visuals(_EGLDriver *drv, _EGLDisplay 
*disp)
for (unsigned i = 0; dri2_dpy->driver_configs[i]; i++) {
   const __DRIconfig *config = dri2_dpy->driver_configs[i];
   unsigned long long int red, green, blue, alpha;
-  unsigned int mask_hi = 0, mask_lo;
+  unsigned int mask_hi = 0, mask_lo, render_type;
+  bool is_float;
 
   dri2_dpy->core->getConfigAttrib(config, __DRI_ATTRIB_RED_MASK_HI,
   _hi);
@@ -675,13 +682,18 @@ drm_add_configs_for_visuals(_EGLDriver *drv, _EGLDisplay 
*disp)
   _lo);
   alpha = (unsigned long long int)mask_hi << 32 | mask_lo;
 
+  dri2_dpy->core->getConfigAttrib(dri2_dpy->driver_configs[i],
+  __DRI_ATTRIB_RENDER_TYPE, _type);
+  is_float = (render_type & __DRI_ATTRIB_FLOAT_BIT) ? true : false;
+
   for (unsigned j = 0; j < num_visuals; j++) {
  struct dri2_egl_config *dri2_conf;
 
  if (visuals[j].rgba_masks.red != red ||
  visuals[j].rgba_masks.green != green ||
  visuals[j].rgba_masks.blue != blue ||
- visuals[j].rgba_masks.alpha != alpha)
+ visuals[j].rgba_masks.alpha != alpha ||
+ visuals[j].is_float != is_float)
 continue;
 
  const EGLint attr_list[] = {
diff --git a/src/gbm/backends/dri/gbm_dri.c b/src/gbm/backends/dri/gbm_dri.c
index abbb0b9..225b0de 100644
--- a/src/gbm/backends/dri/gbm_dri.c
+++ b/src/gbm/backends/dri/gbm_dri.c
@@ -487,50 +487,62 @@ static const struct gbm_dri_visual 
gbm_dri_visuals_table[] = {
{
  GBM_FORMAT_R8, __DRI_IMAGE_FORMAT_R8,
  { 0x00ff, 0x, 0x, 0x },
+ false,
},
{
  GBM_FORMAT_GR88, __DRI_IMAGE_FORMAT_GR88,
  { 0x00ff, 0xff00, 0x, 0x },
+ false,
},
{
  GBM_FORMAT_ARGB1555, __DRI_IMAGE_FORMAT_ARGB1555,
  { 0x7c00, 0x03e0, 0x001f, 0x8000 },
+ false,
},
{
  GBM_FORMAT_RGB565, __DRI_IMAGE_FORMAT_RGB565,
  { 0xf800, 0x07e0, 0x001f, 0x },
+ false,
},
{
  GBM_FORMAT_XRGB, __DRI_IMAGE_FORMAT_XRGB,
   

[Mesa-dev] [RFC 6/6] gbm: Add visuals and buffer handling for fp16 formats

2019-01-04 Thread Kevin Strasser
Set loader caps indicating that gbm can handle both rgba ordering and
fp16 formats.

Signed-off-by: Kevin Strasser 
---
 src/gbm/backends/dri/gbm_dri.c | 26 +-
 src/gbm/main/gbm.c |  3 +++
 src/gbm/main/gbm.h |  9 +
 3 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/src/gbm/backends/dri/gbm_dri.c b/src/gbm/backends/dri/gbm_dri.c
index 225b0de..cc1943e 100644
--- a/src/gbm/backends/dri/gbm_dri.c
+++ b/src/gbm/backends/dri/gbm_dri.c
@@ -118,6 +118,19 @@ dri_get_buffers_with_format(__DRIdrawable * driDrawable,
count, out_count, surf->dri_private);
 }
 
+static unsigned
+dri_get_capability(void *loaderPrivate, enum dri_loader_cap cap)
+{
+   /* Note: loaderPrivate is _EGLDisplay* */
+   switch (cap) {
+   case DRI_LOADER_CAP_RGBA_ORDERING:
+   case DRI_LOADER_CAP_FP16:
+  return 1;
+   default:
+  return 0;
+   }
+}
+
 static int
 image_get_buffers(__DRIdrawable *driDrawable,
   unsigned int format,
@@ -215,11 +228,12 @@ static const __DRIimageLookupExtension 
image_lookup_extension = {
 };
 
 static const __DRIdri2LoaderExtension dri2_loader_extension = {
-   .base = { __DRI_DRI2_LOADER, 3 },
+   .base = { __DRI_DRI2_LOADER, 4 },
 
.getBuffers  = dri_get_buffers,
.flushFrontBuffer= dri_flush_front_buffer,
.getBuffersWithFormat= dri_get_buffers_with_format,
+   .getCapability   = dri_get_capability,
 };
 
 static const __DRIimageLoaderExtension image_loader_extension = {
@@ -544,6 +558,16 @@ static const struct gbm_dri_visual gbm_dri_visuals_table[] 
= {
  { 0x03ff, 0x000ffc00, 0x3ff0, 0xc000 },
  false,
},
+   {
+ GBM_FORMAT_XBGR16161616F, __DRI_IMAGE_FORMAT_XBGR16161616F,
+ { 0x, 0x, 0x, 
0x},
+ true,
+   },
+   {
+ GBM_FORMAT_ABGR16161616F, __DRI_IMAGE_FORMAT_ABGR16161616F,
+ { 0x, 0x, 0x, 
0x},
+ true,
+   },
 };
 
 static int
diff --git a/src/gbm/main/gbm.c b/src/gbm/main/gbm.c
index 2501518..28e9237 100644
--- a/src/gbm/main/gbm.c
+++ b/src/gbm/main/gbm.c
@@ -271,6 +271,9 @@ gbm_bo_get_bpp(struct gbm_bo *bo)
   case GBM_FORMAT_RGBA1010102:
   case GBM_FORMAT_BGRA1010102:
  return 32;
+  case GBM_FORMAT_XBGR16161616F:
+  case GBM_FORMAT_ABGR16161616F:
+ return 64;
}
 }
 
diff --git a/src/gbm/main/gbm.h b/src/gbm/main/gbm.h
index 9b52887..4c6ab37 100644
--- a/src/gbm/main/gbm.h
+++ b/src/gbm/main/gbm.h
@@ -150,6 +150,15 @@ enum gbm_bo_format {
 #define GBM_FORMAT_RGBA1010102 __gbm_fourcc_code('R', 'A', '3', '0') /* [31:0] 
R:G:B:A 10:10:10:2 little endian */
 #define GBM_FORMAT_BGRA1010102 __gbm_fourcc_code('B', 'A', '3', '0') /* [31:0] 
B:G:R:A 10:10:10:2 little endian */
 
+/*
+ * Floating point 64bpp RGB
+ * IEEE 754-2008 binary16 half-precision float
+ * [15:0] sign:exponent:mantissa 1:5:10
+ */
+#define GBM_FORMAT_XBGR16161616F __gbm_fourcc_code('X', 'B', '4', 'H') /* 
[63:0] x:B:G:R 16:16:16:16 little endian */
+
+#define GBM_FORMAT_ABGR16161616F __gbm_fourcc_code('A', 'B', '4', 'H') /* 
[63:0] A:B:G:R 16:16:16:16 little endian */
+
 /* packed YCbCr */
 #define GBM_FORMAT_YUYV__gbm_fourcc_code('Y', 'U', 'Y', 'V') 
/* [31:0] Cr0:Y1:Cb0:Y0 8:8:8:8 little endian */
 #define GBM_FORMAT_YVYU__gbm_fourcc_code('Y', 'V', 'Y', 'U') 
/* [31:0] Cb0:Y1:Cr0:Y0 8:8:8:8 little endian */
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [RFC 3/6] drm-uapi: Add fp16 formats to drm_fourcc.h

2019-01-04 Thread Kevin Strasser
Add 64 bpp 16:16:16:16 half float pixel formats. Each 16 bit component is
formatted in IEEE-754 half-precision float (binary16) 1:5:10
MSb-sign:exponent:fraction form.

Signed-off-by: Kevin Strasser 
---
 include/drm-uapi/drm_fourcc.h | 8 
 1 file changed, 8 insertions(+)

diff --git a/include/drm-uapi/drm_fourcc.h b/include/drm-uapi/drm_fourcc.h
index d5e5235..58ffeae 100644
--- a/include/drm-uapi/drm_fourcc.h
+++ b/include/drm-uapi/drm_fourcc.h
@@ -105,6 +105,14 @@ extern "C" {
 #define DRM_FORMAT_RGBA1010102 fourcc_code('R', 'A', '3', '0') /* [31:0] 
R:G:B:A 10:10:10:2 little endian */
 #define DRM_FORMAT_BGRA1010102 fourcc_code('B', 'A', '3', '0') /* [31:0] 
B:G:R:A 10:10:10:2 little endian */
 
+/*
+ * Floating point 64bpp RGB
+ * IEEE 754-2008 binary16 half-precision float
+ * [15:0] sign:exponent:mantissa 1:5:10
+ */
+#define DRM_FORMAT_XBGR16161616F fourcc_code('X', 'B', '4', 'H') /* [63:0] 
x:B:G:R 16:16:16:16 little endian */
+#define DRM_FORMAT_ABGR16161616F fourcc_code('A', 'B', '4', 'H') /* [63:0] 
A:B:G:R 16:16:16:16 little endian */
+
 /* packed YCbCr */
 #define DRM_FORMAT_YUYVfourcc_code('Y', 'U', 'Y', 'V') /* 
[31:0] Cr0:Y1:Cb0:Y0 8:8:8:8 little endian */
 #define DRM_FORMAT_YVYUfourcc_code('Y', 'V', 'Y', 'U') /* 
[31:0] Cb0:Y1:Cr0:Y0 8:8:8:8 little endian */
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [RFC 1/6] dri: Support 64 bit rgba masks

2019-01-04 Thread Kevin Strasser
The dri core api was written with the assumption that all attribute values
would fit into 32 bits. This limitation means the config handlers can't
accept 64 bpp formats. Reserve 64 bits for rgba masks and add new
attributes that allow access to the upper 32 bits.

Signed-off-by: Kevin Strasser 
---
 include/GL/internal/dri_interface.h |  6 ++-
 src/egl/drivers/dri2/egl_dri2.c | 28 +---
 src/egl/drivers/dri2/egl_dri2.h |  6 +--
 src/egl/drivers/dri2/platform_android.c |  2 +-
 src/egl/drivers/dri2/platform_drm.c | 67 ++---
 src/egl/drivers/dri2/platform_surfaceless.c |  2 +-
 src/egl/drivers/dri2/platform_wayland.c |  2 +-
 src/egl/drivers/dri2/platform_x11.c |  6 +--
 src/gbm/backends/dri/gbm_driint.h   |  8 ++--
 src/glx/glxconfig.h |  2 +-
 src/mesa/drivers/dri/common/utils.c | 16 ++-
 src/mesa/main/mtypes.h  |  2 +-
 12 files changed, 108 insertions(+), 39 deletions(-)

diff --git a/include/GL/internal/dri_interface.h 
b/include/GL/internal/dri_interface.h
index 072f379..c5761c4 100644
--- a/include/GL/internal/dri_interface.h
+++ b/include/GL/internal/dri_interface.h
@@ -747,7 +747,11 @@ struct __DRIuseInvalidateExtensionRec {
 #define __DRI_ATTRIB_YINVERTED 47
 #define __DRI_ATTRIB_FRAMEBUFFER_SRGB_CAPABLE  48
 #define __DRI_ATTRIB_MUTABLE_RENDER_BUFFER 49 /* 
EGL_MUTABLE_RENDER_BUFFER_BIT_KHR */
-#define __DRI_ATTRIB_MAX   50
+#define __DRI_ATTRIB_RED_MASK_HI   50
+#define __DRI_ATTRIB_GREEN_MASK_HI 51
+#define __DRI_ATTRIB_BLUE_MASK_HI  52
+#define __DRI_ATTRIB_ALPHA_MASK_HI 53
+#define __DRI_ATTRIB_MAX   54
 
 /* __DRI_ATTRIB_RENDER_TYPE */
 #define __DRI_ATTRIB_RGBA_BIT  0x01
diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c
index 0be9235..d19950d 100644
--- a/src/egl/drivers/dri2/egl_dri2.c
+++ b/src/egl/drivers/dri2/egl_dri2.c
@@ -179,7 +179,7 @@ dri2_match_config(const _EGLConfig *conf, const _EGLConfig 
*criteria)
 struct dri2_egl_config *
 dri2_add_config(_EGLDisplay *disp, const __DRIconfig *dri_config, int id,
 EGLint surface_type, const EGLint *attr_list,
-const unsigned int *rgba_masks)
+const unsigned long long int *rgba_masks)
 {
struct dri2_egl_config *conf;
struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp);
@@ -187,7 +187,7 @@ dri2_add_config(_EGLDisplay *disp, const __DRIconfig 
*dri_config, int id,
unsigned int attrib, value, double_buffer;
bool srgb = false;
EGLint key, bind_to_texture_rgb, bind_to_texture_rgba;
-   unsigned int dri_masks[4] = { 0, 0, 0, 0 };
+   unsigned long long int dri_masks[4] = { 0, 0, 0, 0 };
_EGLConfig *matching_config;
EGLint num_configs = 0;
EGLint config_id;
@@ -234,19 +234,35 @@ dri2_add_config(_EGLDisplay *disp, const __DRIconfig 
*dri_config, int id,
  break;
 
   case __DRI_ATTRIB_RED_MASK:
- dri_masks[0] = value;
+ dri_masks[0] |= value;
+ break;
+
+  case __DRI_ATTRIB_RED_MASK_HI:
+ dri_masks[0] |= (unsigned long long int)value << 32;
  break;
 
   case __DRI_ATTRIB_GREEN_MASK:
- dri_masks[1] = value;
+ dri_masks[1] |= value;
+ break;
+
+  case __DRI_ATTRIB_GREEN_MASK_HI:
+ dri_masks[1] |= (unsigned long long int)value << 32;
  break;
 
   case __DRI_ATTRIB_BLUE_MASK:
- dri_masks[2] = value;
+ dri_masks[2] |= value;
+ break;
+
+  case __DRI_ATTRIB_BLUE_MASK_HI:
+ dri_masks[2] |= (unsigned long long int)value << 32;
  break;
 
   case __DRI_ATTRIB_ALPHA_MASK:
- dri_masks[3] = value;
+ dri_masks[3] |= value;
+ break;
+
+  case __DRI_ATTRIB_ALPHA_MASK_HI:
+ dri_masks[3] |= (unsigned long long int)value << 32;
  break;
 
   case __DRI_ATTRIB_ACCUM_RED_SIZE:
diff --git a/src/egl/drivers/dri2/egl_dri2.h b/src/egl/drivers/dri2/egl_dri2.h
index 4abe1ba..3916c68 100644
--- a/src/egl/drivers/dri2/egl_dri2.h
+++ b/src/egl/drivers/dri2/egl_dri2.h
@@ -404,7 +404,7 @@ dri2_lookup_egl_image(__DRIscreen *screen, void *image, 
void *data);
 struct dri2_egl_config *
 dri2_add_config(_EGLDisplay *disp, const __DRIconfig *dri_config, int id,
 EGLint surface_type, const EGLint *attr_list,
-const unsigned int *rgba_masks);
+const unsigned long long int *rgba_masks);
 
 _EGLImage *
 dri2_create_image_khr(_EGLDriver *drv, _EGLDisplay *disp,
@@ -420,7 +420,7 @@ EGLBoolean
 dri2_initialize_x11(_EGLDriver *drv, _EGLDisplay *disp);
 void
 dri2_teardown_x11(struct dri2_egl_display *dri2_dpy);
-unsigned int
+unsigned long long int
 dri2_x11_get_red_mask_for_depth(struct dri2_egl_display *dri2_dpy, int depth);
 #else
 static inline 

[Mesa-dev] [PATCH] gallium/swr: Fix multi-context sync fence deadlock.

2019-01-04 Thread Bruce Cherniak
Various recreation scenarios lead to API thread getting stuck in
swr_fence_finish().  This is a multi-context issue, whereby one context
overwrites the fence read-value with a previous sync's lesser value.
The fence sync value is supposed to be always increasing.

In swr_fence_cb(), only update the "read" value if the new value is
greater.

(This may seem like we're not waiting on the other context to finish, but
had we needed for it to finish there would have been a wait prior to
submitting a new sync.)

cc: mesa-sta...@lists.freedesktop.org
---
 src/gallium/drivers/swr/swr_fence.cpp | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/swr/swr_fence.cpp 
b/src/gallium/drivers/swr/swr_fence.cpp
index b05ac8cec0..074d82a3b4 100644
--- a/src/gallium/drivers/swr/swr_fence.cpp
+++ b/src/gallium/drivers/swr/swr_fence.cpp
@@ -50,7 +50,9 @@ swr_fence_cb(uint64_t userData, uint64_t userData2, uint64_t 
userData3)
swr_fence_do_work(fence);
 
/* Correct value is in SwrSync data, and not the fence write field. */
-   fence->read = userData2;
+   /* Contexts may not finish in order, but fence value always increases */
+   if (fence->read < userData2)
+  fence->read = userData2;
 }
 
 /*
-- 
2.17.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] intel/blorp: Be more conservative about copying clear colors

2019-01-04 Thread Rafael Antognolli
On Fri, Jan 04, 2019 at 01:07:07PM -0600, Jason Ekstrand wrote:
> In 92eb5bbc68d7324 we attempted to avoid copying clear colors whenever
> we weren't doing a resolve.  However, this broke MSAA resolves because
> we need the clear color in the source.  This patch makes blorp much more
> conservative such that it only avoids the clear color copy if either
> aux_usage == NONE or it's explicitly doing a fast-clear.

Ah, nice! I think I tried to fix this by not setting the
surface->clear_color_addr.buffer in some cases, but I think it broke
some crucible test.  I don't remember the details of how, though.

Anyway, this looks better, and I assume it doesn't break anything.

Reviewed-by: Rafael Antognolli 

> Fixes: 92eb5bbc68d7 "intel/blorp: Only copy clear color when doing..."
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107728
> Cc: Rafael Antognolli 
> ---
>  src/intel/blorp/blorp_genX_exec.h | 9 ++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/src/intel/blorp/blorp_genX_exec.h 
> b/src/intel/blorp/blorp_genX_exec.h
> index 9010b03fb67..29afe8ac78b 100644
> --- a/src/intel/blorp/blorp_genX_exec.h
> +++ b/src/intel/blorp/blorp_genX_exec.h
> @@ -1326,7 +1326,7 @@ blorp_emit_memcpy(struct blorp_batch *batch,
>  static void
>  blorp_emit_surface_state(struct blorp_batch *batch,
>   const struct brw_blorp_surface_info *surface,
> - enum isl_aux_op op,
> + enum isl_aux_op aux_op,
>   void *state, uint32_t state_offset,
>   const bool color_write_disables[4],
>   bool is_render_target)
> @@ -1382,7 +1382,7 @@ blorp_emit_surface_state(struct blorp_batch *batch,
>surface->aux_addr, *aux_addr);
> }
>  
> -   if (surface->clear_color_addr.buffer) {
> +   if (aux_usage != ISL_AUX_USAGE_NONE && surface->clear_color_addr.buffer) {
>  #if GEN_GEN >= 10
>assert((surface->clear_color_addr.offset & 0x3f) == 0);
>uint32_t *clear_addr = state + isl_dev->ss.clear_color_state_offset;
> @@ -1390,7 +1390,10 @@ blorp_emit_surface_state(struct blorp_batch *batch,
>isl_dev->ss.clear_color_state_offset,
>surface->clear_color_addr, *clear_addr);
>  #elif GEN_GEN >= 7
> -  if (op == ISL_AUX_OP_FULL_RESOLVE || op == ISL_AUX_OP_PARTIAL_RESOLVE) 
> {
> +  /* Fast clears just whack the AUX surface and don't actually use the
> +   * clear color for anything.  We can avoid the MI memcpy on that case.
> +   */
> +  if (aux_op != ISL_AUX_OP_FAST_CLEAR) {
>   struct blorp_address dst_addr = 
> blorp_get_surface_base_address(batch);
>   dst_addr.offset += state_offset + isl_dev->ss.clear_value_offset;
>   blorp_emit_memcpy(batch, dst_addr, surface->clear_color_addr,
> -- 
> 2.20.1
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] intel/blorp: Be more conservative about copying clear colors

2019-01-04 Thread Jason Ekstrand
In 92eb5bbc68d7324 we attempted to avoid copying clear colors whenever
we weren't doing a resolve.  However, this broke MSAA resolves because
we need the clear color in the source.  This patch makes blorp much more
conservative such that it only avoids the clear color copy if either
aux_usage == NONE or it's explicitly doing a fast-clear.

Fixes: 92eb5bbc68d7 "intel/blorp: Only copy clear color when doing..."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107728
Cc: Rafael Antognolli 
---
 src/intel/blorp/blorp_genX_exec.h | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/src/intel/blorp/blorp_genX_exec.h 
b/src/intel/blorp/blorp_genX_exec.h
index 9010b03fb67..29afe8ac78b 100644
--- a/src/intel/blorp/blorp_genX_exec.h
+++ b/src/intel/blorp/blorp_genX_exec.h
@@ -1326,7 +1326,7 @@ blorp_emit_memcpy(struct blorp_batch *batch,
 static void
 blorp_emit_surface_state(struct blorp_batch *batch,
  const struct brw_blorp_surface_info *surface,
- enum isl_aux_op op,
+ enum isl_aux_op aux_op,
  void *state, uint32_t state_offset,
  const bool color_write_disables[4],
  bool is_render_target)
@@ -1382,7 +1382,7 @@ blorp_emit_surface_state(struct blorp_batch *batch,
   surface->aux_addr, *aux_addr);
}
 
-   if (surface->clear_color_addr.buffer) {
+   if (aux_usage != ISL_AUX_USAGE_NONE && surface->clear_color_addr.buffer) {
 #if GEN_GEN >= 10
   assert((surface->clear_color_addr.offset & 0x3f) == 0);
   uint32_t *clear_addr = state + isl_dev->ss.clear_color_state_offset;
@@ -1390,7 +1390,10 @@ blorp_emit_surface_state(struct blorp_batch *batch,
   isl_dev->ss.clear_color_state_offset,
   surface->clear_color_addr, *clear_addr);
 #elif GEN_GEN >= 7
-  if (op == ISL_AUX_OP_FULL_RESOLVE || op == ISL_AUX_OP_PARTIAL_RESOLVE) {
+  /* Fast clears just whack the AUX surface and don't actually use the
+   * clear color for anything.  We can avoid the MI memcpy on that case.
+   */
+  if (aux_op != ISL_AUX_OP_FAST_CLEAR) {
  struct blorp_address dst_addr = blorp_get_surface_base_address(batch);
  dst_addr.offset += state_offset + isl_dev->ss.clear_value_offset;
  blorp_emit_memcpy(batch, dst_addr, surface->clear_color_addr,
-- 
2.20.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] NIR constant problem for GPU which doesn't have native integer support

2019-01-04 Thread Ilia Mirkin
On Fri, Jan 4, 2019 at 10:46 AM Jason Ekstrand  wrote:
>
> On Fri, Jan 4, 2019 at 4:07 AM Erik Faye-Lund  
> wrote:
>>
>> On Thu, 2019-01-03 at 11:58 -0600, Jason Ekstrand wrote:
>> > On Thu, Jan 3, 2019 at 3:39 AM Erik Faye-Lund <
>> > erik.faye-l...@collabora.com> wrote:
>> > > On Wed, 2019-01-02 at 10:16 -0600, Jason Ekstrand wrote:
>> > > > On Wed, Jan 2, 2019 at 9:43 AM Ilia Mirkin 
>> > > > wrote:
>> > > > > Have a look at the first 4 patches in the series from Jonathan
>> > > > > Marek
>> > > > > to address some of these issues:
>> > > > >
>> > > > > https://patchwork.freedesktop.org/series/54295/
>> > > > >
>> > > > > Not sure exactly what state that work is in, but I've added
>> > > > > Jonathan
>> > > > > to CC, perhaps he can provide an update.
>> > > > >
>> > > > > Cheers,
>> > > > >
>> > > > >   -ilia
>> > > > >
>> > > > > On Wed, Jan 2, 2019 at 6:28 AM Qiang Yu 
>> > > wrote:
>> > > > > >
>> > > > > > Hi guys,
>> > > > > >
>> > > > > > I found the problem with this test fragment shader when lima
>> > > > > development:
>> > > > > > uniform int color;
>> > > > > > void main() {
>> > > > > > if (color > 1)
>> > > > > > gl_FragColor = vec4(1.0, 0.0, 0.0, 1);
>> > > > > > else
>> > > > > > gl_FragColor = vec4(0.0, 1.0, 0.0, 1);
>> > > > > > }
>> > > > > >
>> > > > > > nir_print_shader output:
>> > > > > > impl main {
>> > > > > > block block_0:
>> > > > > > /* preds: */
>> > > > > > vec1 32 ssa_0 = load_const (0x0001 /* 0.00
>> > > */)
>> > > > > > vec4 32 ssa_1 = load_const (0x3f80 /* 1.00
>> > > */,
>> > > > > > 0x /* 0.00 */, 0x /* 0.00 */,
>> > > 0x3f80
>> > > > > /*
>> > > > > > 1.00 */)
>> > > > > > vec4 32 ssa_2 = load_const (0x /* 0.00
>> > > */,
>> > > > > > 0x3f80 /* 1.00 */, 0x /* 0.00 */,
>> > > 0x3f80
>> > > > > /*
>> > > > > > 1.00 */)
>> > > > > > vec1 32 ssa_3 = load_const (0x /* 0.00
>> > > */)
>> > > > > > vec1 32 ssa_4 = intrinsic load_uniform (ssa_3) (0, 1,
>> > > 0)
>> > > > > /*
>> > > > > > base=0 */ /* range=1 */ /* component=0 */   /* color */
>> > > > > > vec1 32 ssa_5 = slt ssa_0, ssa_4
>> > > > > > vec1 32 ssa_6 = fnot ssa_5
>> > > > > > vec4 32 ssa_7 = bcsel ssa_6., ssa_2, ssa_1
>> > > > > > intrinsic store_output (ssa_7, ssa_3) (0, 15, 0) /*
>> > > > > base=0 */
>> > > > > > /* wrmask=xyzw */ /* component=0 */   /* gl_FragColor */
>> > > > > > /* succs: block_1 */
>> > > > > > block block_1:
>> > > > > > }
>> > > > > >
>> > > > > > ssa0 is not converted to float when glsl to nir. I see
>> > > > > glsl_to_nir.cpp
>> > > > > > will create flt/ilt/ult
>> > > > > > based on source type for gpu support native integer, but for
>> > > gpu
>> > > > > not
>> > > > > > support native
>> > > > > > integer, just create slt for all source type. And in
>> > > > > > nir_lower_constant_initializers,
>> > > > > > there's also no type conversion for integer constant.
>> > > >
>> > > > This is a generally sticky issue.  In NIR, we have no concept of
>> > > > types on SSA values which has proven perfectly reasonable and
>> > > > actually very powerful in a world where integers are supported
>> > > > natively.  Unfortunately, it causes significant problems for
>> > > float-
>> > > > only architectures.
>> > >
>> > > I would like to take this chance to say that this untyped SSA-value
>> > > choice has lead to issues in both radeon_si (because LLVM values
>> > > are
>> > > typed) and zink (similarly, because SPIR-V values are typed), where
>> > > we
>> > > need to to bitcasts on every access because there's just not enough
>> > > information available to emit variables with the right type.
>> >
>> > I'm not sure if I agree that the two problems are the same or not...
>> > More on that in a bit.
>> >
>> > > It took us a lot of time to realize that the meta-data from the
>> > > opcodes
>> > > doesn't *really* provide this, because the rest of nir doesn't
>> > > treat
>> > > values consistently. In fact, this feels arguably more like buggy
>> > > behavior; why do we even have fmov when all of the time the
>> > > compiler
>> > > will emit imovs for floating-point values...? Or why do we have
>> > > bitcast
>> >
>> > Why do we have different mov opcodes?  Because they have different
>> > behavior in the presence of source/destination modifiers.
>>
>> Is this general NIR-behavior (i.e will this be honored by constant
>> folding etc), or is it Intel specific? If it's NIR-behavior, is it
>> documented somewhere?
>
>
> No, constant folding doesn't do modifiers.  I had completely forgotten about 
> this fact until we started this discussion.
>
>>
>> In either case, I have a feeling that this is a mis-design; the
>> modifiers apply to the operands, not the opcodes. It sounds a bit like
>> imov shouldn't have been an ALU instruction, but perhaps some other,
>> new 

[Mesa-dev] [Bug 99553] Tracker bug for runnning OpenCL applications on Clover

2019-01-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=99553
Bug 99553 depends on bug 102009, which changed state.

Bug 102009 Summary: [clover, amdgcn] Blender crashes when compiling OpenCL 
kernel
https://bugs.freedesktop.org/show_bug.cgi?id=102009

   What|Removed |Added

 Status|RESOLVED|REOPENED
 Resolution|FIXED   |---

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 102009] [clover, amdgcn] Blender crashes when compiling OpenCL kernel

2019-01-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102009

Jan Vesely  changed:

   What|Removed |Added

 Resolution|FIXED   |---
 Status|RESOLVED|REOPENED

--- Comment #10 from Jan Vesely  ---
(In reply to Markus from comment #9)
> After installing rocm 2.0 on Ubuntu 18.10 this is now working without
> crashes in Blender 2.79.
> 
> Issue can be considered resolved from my perspective.

clover is not part of rocm

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 99553] Tracker bug for runnning OpenCL applications on Clover

2019-01-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=99553
Bug 99553 depends on bug 102009, which changed state.

Bug 102009 Summary: [clover, amdgcn] Blender crashes when compiling OpenCL 
kernel
https://bugs.freedesktop.org/show_bug.cgi?id=102009

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 102009] [clover, amdgcn] Blender crashes when compiling OpenCL kernel

2019-01-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102009

Markus  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #9 from Markus  ---
After installing rocm 2.0 on Ubuntu 18.10 this is now working without crashes
in Blender 2.79.

Issue can be considered resolved from my perspective.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] NIR constant problem for GPU which doesn't have native integer support

2019-01-04 Thread Jason Ekstrand
On Fri, Jan 4, 2019 at 4:07 AM Erik Faye-Lund 
wrote:

> On Thu, 2019-01-03 at 11:58 -0600, Jason Ekstrand wrote:
> > On Thu, Jan 3, 2019 at 3:39 AM Erik Faye-Lund <
> > erik.faye-l...@collabora.com> wrote:
> > > On Wed, 2019-01-02 at 10:16 -0600, Jason Ekstrand wrote:
> > > > On Wed, Jan 2, 2019 at 9:43 AM Ilia Mirkin 
> > > > wrote:
> > > > > Have a look at the first 4 patches in the series from Jonathan
> > > > > Marek
> > > > > to address some of these issues:
> > > > >
> > > > > https://patchwork.freedesktop.org/series/54295/
> > > > >
> > > > > Not sure exactly what state that work is in, but I've added
> > > > > Jonathan
> > > > > to CC, perhaps he can provide an update.
> > > > >
> > > > > Cheers,
> > > > >
> > > > >   -ilia
> > > > >
> > > > > On Wed, Jan 2, 2019 at 6:28 AM Qiang Yu 
> > > wrote:
> > > > > >
> > > > > > Hi guys,
> > > > > >
> > > > > > I found the problem with this test fragment shader when lima
> > > > > development:
> > > > > > uniform int color;
> > > > > > void main() {
> > > > > > if (color > 1)
> > > > > > gl_FragColor = vec4(1.0, 0.0, 0.0, 1);
> > > > > > else
> > > > > > gl_FragColor = vec4(0.0, 1.0, 0.0, 1);
> > > > > > }
> > > > > >
> > > > > > nir_print_shader output:
> > > > > > impl main {
> > > > > > block block_0:
> > > > > > /* preds: */
> > > > > > vec1 32 ssa_0 = load_const (0x0001 /* 0.00
> > > */)
> > > > > > vec4 32 ssa_1 = load_const (0x3f80 /* 1.00
> > > */,
> > > > > > 0x /* 0.00 */, 0x /* 0.00 */,
> > > 0x3f80
> > > > > /*
> > > > > > 1.00 */)
> > > > > > vec4 32 ssa_2 = load_const (0x /* 0.00
> > > */,
> > > > > > 0x3f80 /* 1.00 */, 0x /* 0.00 */,
> > > 0x3f80
> > > > > /*
> > > > > > 1.00 */)
> > > > > > vec1 32 ssa_3 = load_const (0x /* 0.00
> > > */)
> > > > > > vec1 32 ssa_4 = intrinsic load_uniform (ssa_3) (0, 1,
> > > 0)
> > > > > /*
> > > > > > base=0 */ /* range=1 */ /* component=0 */   /* color */
> > > > > > vec1 32 ssa_5 = slt ssa_0, ssa_4
> > > > > > vec1 32 ssa_6 = fnot ssa_5
> > > > > > vec4 32 ssa_7 = bcsel ssa_6., ssa_2, ssa_1
> > > > > > intrinsic store_output (ssa_7, ssa_3) (0, 15, 0) /*
> > > > > base=0 */
> > > > > > /* wrmask=xyzw */ /* component=0 */   /* gl_FragColor */
> > > > > > /* succs: block_1 */
> > > > > > block block_1:
> > > > > > }
> > > > > >
> > > > > > ssa0 is not converted to float when glsl to nir. I see
> > > > > glsl_to_nir.cpp
> > > > > > will create flt/ilt/ult
> > > > > > based on source type for gpu support native integer, but for
> > > gpu
> > > > > not
> > > > > > support native
> > > > > > integer, just create slt for all source type. And in
> > > > > > nir_lower_constant_initializers,
> > > > > > there's also no type conversion for integer constant.
> > > >
> > > > This is a generally sticky issue.  In NIR, we have no concept of
> > > > types on SSA values which has proven perfectly reasonable and
> > > > actually very powerful in a world where integers are supported
> > > > natively.  Unfortunately, it causes significant problems for
> > > float-
> > > > only architectures.
> > >
> > > I would like to take this chance to say that this untyped SSA-value
> > > choice has lead to issues in both radeon_si (because LLVM values
> > > are
> > > typed) and zink (similarly, because SPIR-V values are typed), where
> > > we
> > > need to to bitcasts on every access because there's just not enough
> > > information available to emit variables with the right type.
> >
> > I'm not sure if I agree that the two problems are the same or not...
> > More on that in a bit.
> >
> > > It took us a lot of time to realize that the meta-data from the
> > > opcodes
> > > doesn't *really* provide this, because the rest of nir doesn't
> > > treat
> > > values consistently. In fact, this feels arguably more like buggy
> > > behavior; why do we even have fmov when all of the time the
> > > compiler
> > > will emit imovs for floating-point values...? Or why do we have
> > > bitcast
> >
> > Why do we have different mov opcodes?  Because they have different
> > behavior in the presence of source/destination modifiers.
>
> Is this general NIR-behavior (i.e will this be honored by constant
> folding etc), or is it Intel specific? If it's NIR-behavior, is it
> documented somewhere?
>

No, constant folding doesn't do modifiers.  I had completely forgotten
about this fact until we started this discussion.


> In either case, I have a feeling that this is a mis-design; the
> modifiers apply to the operands, not the opcodes. It sounds a bit like
> imov shouldn't have been an ALU instruction, but perhaps some other,
> new type. Or perhaps the concept we have should have been a bit finer
> granularity. It's a bit hard to tell without more details, and I can't
> seem to find any on my own.
>

I'm very unclear about 

Re: [Mesa-dev] st/va: Return correct status from vlVaQuerySurfaceStatus

2019-01-04 Thread Liu, Leo
Please add some commit messages. With that, the patch is

Reviewed-by: Leo Liu 

On 1/3/19 5:20 AM, Das, Indrajit-kumar wrote:
> From: Indrajit Das 
> Date: Thu, 3 Jan 2019 14:36:33 +0530
> Subject: [PATCH] st/va: Return correct status from vlVaQuerySurfaceStatus
>
> Signed-off-by: Indrajit Das 
> ---
>   src/gallium/state_trackers/va/surface.c | 31 +
>   1 file changed, 31 insertions(+)
>
> diff --git a/src/gallium/state_trackers/va/surface.c 
> b/src/gallium/state_trackers/va/surface.c
> index cc26efe..e7ed64b 100644
> --- a/src/gallium/state_trackers/va/surface.c
> +++ b/src/gallium/state_trackers/va/surface.c
> @@ -146,9 +146,40 @@ vlVaSyncSurface(VADriverContextP ctx, VASurfaceID 
> render_target)
>   VAStatus
>   vlVaQuerySurfaceStatus(VADriverContextP ctx, VASurfaceID render_target, 
> VASurfaceStatus *status)
>   {
> +   vlVaDriver *drv;
> +   vlVaSurface *surf;
> +   vlVaContext *context;
> +
>  if (!ctx)
> return VA_STATUS_ERROR_INVALID_CONTEXT;
>   
> +   drv = VL_VA_DRIVER(ctx);
> +   if (!drv)
> +  return VA_STATUS_ERROR_INVALID_CONTEXT;
> +
> +   mtx_lock(>mutex);
> +
> +   surf = handle_table_get(drv->htab, render_target);
> +   if (!surf || !surf->buffer) {
> +  mtx_unlock(>mutex);
> +  return VA_STATUS_ERROR_INVALID_SURFACE;
> +   }
> +
> +   context = handle_table_get(drv->htab, surf->ctx);
> +   if (!context) {
> +  mtx_unlock(>mutex);
> +  return VA_STATUS_ERROR_INVALID_CONTEXT;
> +   }
> +
> +   if (context->decoder->entrypoint == PIPE_VIDEO_ENTRYPOINT_ENCODE) {
> +  if(surf->feedback == NULL)
> + *status=VASurfaceReady;
> +  else
> + *status=VASurfaceRendering;
> +   }
> +
> +   mtx_unlock(>mutex);
> +
>  return VA_STATUS_SUCCESS;
>   }
>   
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 107115] Make darktable OpenCL support work on Clover and RadeonSI

2019-01-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=107115

Jan Vesely  changed:

   What|Removed |Added

   See Also||https://bugs.freedesktop.or
   ||g/show_bug.cgi?id=109224

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 99553] Tracker bug for runnning OpenCL applications on Clover

2019-01-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=99553

Jan Vesely  changed:

   What|Removed |Added

 Depends on||109224


Referenced Bugs:

https://bugs.freedesktop.org/show_bug.cgi?id=109224
[Bug 109224] OpenCL in darktable with AMD Turks doesn't works
-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 87738] [OpenCL] Please add Image support

2019-01-04 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=87738

Jan Vesely  changed:

   What|Removed |Added

 Blocks||109224


Referenced Bugs:

https://bugs.freedesktop.org/show_bug.cgi?id=109224
[Bug 109224] OpenCL in darktable with AMD Turks doesn't works
-- 
You are receiving this mail because:
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] NIR constant problem for GPU which doesn't have native integer support

2019-01-04 Thread Erik Faye-Lund
On Thu, 2019-01-03 at 11:58 -0600, Jason Ekstrand wrote:
> On Thu, Jan 3, 2019 at 3:39 AM Erik Faye-Lund <
> erik.faye-l...@collabora.com> wrote:
> > On Wed, 2019-01-02 at 10:16 -0600, Jason Ekstrand wrote:
> > > On Wed, Jan 2, 2019 at 9:43 AM Ilia Mirkin 
> > > wrote:
> > > > Have a look at the first 4 patches in the series from Jonathan
> > > > Marek
> > > > to address some of these issues:
> > > > 
> > > > https://patchwork.freedesktop.org/series/54295/
> > > > 
> > > > Not sure exactly what state that work is in, but I've added
> > > > Jonathan
> > > > to CC, perhaps he can provide an update.
> > > > 
> > > > Cheers,
> > > > 
> > > >   -ilia
> > > > 
> > > > On Wed, Jan 2, 2019 at 6:28 AM Qiang Yu 
> > wrote:
> > > > >
> > > > > Hi guys,
> > > > >
> > > > > I found the problem with this test fragment shader when lima
> > > > development:
> > > > > uniform int color;
> > > > > void main() {
> > > > > if (color > 1)
> > > > > gl_FragColor = vec4(1.0, 0.0, 0.0, 1);
> > > > > else
> > > > > gl_FragColor = vec4(0.0, 1.0, 0.0, 1);
> > > > > }
> > > > >
> > > > > nir_print_shader output:
> > > > > impl main {
> > > > > block block_0:
> > > > > /* preds: */
> > > > > vec1 32 ssa_0 = load_const (0x0001 /* 0.00
> > */)
> > > > > vec4 32 ssa_1 = load_const (0x3f80 /* 1.00
> > */,
> > > > > 0x /* 0.00 */, 0x /* 0.00 */,
> > 0x3f80
> > > > /*
> > > > > 1.00 */)
> > > > > vec4 32 ssa_2 = load_const (0x /* 0.00
> > */,
> > > > > 0x3f80 /* 1.00 */, 0x /* 0.00 */,
> > 0x3f80
> > > > /*
> > > > > 1.00 */)
> > > > > vec1 32 ssa_3 = load_const (0x /* 0.00
> > */)
> > > > > vec1 32 ssa_4 = intrinsic load_uniform (ssa_3) (0, 1,
> > 0)
> > > > /*
> > > > > base=0 */ /* range=1 */ /* component=0 */   /* color */
> > > > > vec1 32 ssa_5 = slt ssa_0, ssa_4
> > > > > vec1 32 ssa_6 = fnot ssa_5
> > > > > vec4 32 ssa_7 = bcsel ssa_6., ssa_2, ssa_1
> > > > > intrinsic store_output (ssa_7, ssa_3) (0, 15, 0) /*
> > > > base=0 */
> > > > > /* wrmask=xyzw */ /* component=0 */   /* gl_FragColor */
> > > > > /* succs: block_1 */
> > > > > block block_1:
> > > > > }
> > > > >
> > > > > ssa0 is not converted to float when glsl to nir. I see
> > > > glsl_to_nir.cpp
> > > > > will create flt/ilt/ult
> > > > > based on source type for gpu support native integer, but for
> > gpu
> > > > not
> > > > > support native
> > > > > integer, just create slt for all source type. And in
> > > > > nir_lower_constant_initializers,
> > > > > there's also no type conversion for integer constant.
> > > 
> > > This is a generally sticky issue.  In NIR, we have no concept of
> > > types on SSA values which has proven perfectly reasonable and
> > > actually very powerful in a world where integers are supported
> > > natively.  Unfortunately, it causes significant problems for
> > float-
> > > only architectures.
> > 
> > I would like to take this chance to say that this untyped SSA-value
> > choice has lead to issues in both radeon_si (because LLVM values
> > are
> > typed) and zink (similarly, because SPIR-V values are typed), where
> > we
> > need to to bitcasts on every access because there's just not enough
> > information available to emit variables with the right type.
> 
> I'm not sure if I agree that the two problems are the same or not... 
> More on that in a bit.
>  
> > It took us a lot of time to realize that the meta-data from the
> > opcodes
> > doesn't *really* provide this, because the rest of nir doesn't
> > treat
> > values consistently. In fact, this feels arguably more like buggy
> > behavior; why do we even have fmov when all of the time the
> > compiler
> > will emit imovs for floating-point values...? Or why do we have
> > bitcast
> 
> Why do we have different mov opcodes?  Because they have different
> behavior in the presence of source/destination modifiers.

Is this general NIR-behavior (i.e will this be honored by constant
folding etc), or is it Intel specific? If it's NIR-behavior, is it
documented somewhere?

In either case, I have a feeling that this is a mis-design; the
modifiers apply to the operands, not the opcodes. It sounds a bit like
imov shouldn't have been an ALU instruction, but perhaps some other,
new type. Or perhaps the concept we have should have been a bit finer
granularity. It's a bit hard to tell without more details, and I can't
seem to find any on my own.

Generally, I wish we had some more of the details here written down.
Implementing support for NIR for a new architecture is currently quite
tricky, mostly because chasing down the details are hard.

> You likely don't use those in radeon or Zink but we do use them on
> Intel.  That being said, I've very seriously considered adding a
> generic nir_op_mov which is entirely typeless and doesn't support
> modifiers at all and make 

Re: [Mesa-dev] [PATCH v2 39/53] intel/compiler: add a helper to do conversions between integer and half-float

2019-01-04 Thread Iago Toral
On Fri, 2019-01-04 at 02:02 -0800, Francisco Jerez wrote:
> Iago Toral  writes:
> 
> > On Wed, 2019-01-02 at 15:00 -0800, Francisco Jerez wrote:
> > > Iago Toral Quiroga  writes:
> > > 
> > > > There are hardware restrictions to consider that seem to affect
> > > > atom platforms
> > > > only.
> > > 
> > > Same comment here as for PATCH 13 of this series.  This and PATCH
> > > 40
> > > shouldn't be necessary anymore with [1] in place.  Please drop
> > > them.
> > 
> > Actually, I think your pass doesn't handle this case. I have just
> > tested this and I get various regressions for conversions between
> > integers (specifically integers lower than 32-bit, so I wonder if
> > this
> > restriction only affects these cases) and half-float.
> > 
> > Here is an example of final IR generated with your pass and without
> > the
> > call to fixup_int_half_float_conversion from my series:
> > 
> > mov(8) vgrf1+0.0:HF, vgrf0<2>:W 
> > 
> > Which should be:
> > 
> > mov(8) vgrf1<2>:HF, vgrf0<2>:W
> > 
> > It seems your pass doesn't act on this code since
> > INTEL_DEBUG=optimizer
> > doesn't show any trace of it.
> > 
> > If this is not a bug in your pass and just that it doesn't handle
> > this
> > case, I am happy to add the support for it as part of my series if
> > that
> > makes sense to you, just let me know if that is the case.
> > 
> 
> It's not really a bug, you just need to add a case to
> has_dst_aligned_region_restriction() for it to return true for FP16
> instructions with this restriction, sorry I didn't point that out
> before.

Ok, I'll do that as part of my series then, thanks!

Iago

> > Iago
> > 
> > > [1] 
> > > 
https://lists.freedesktop.org/archives/mesa-dev/2018-December/212775.html
> > > 
> > > > ---
> > > >  src/intel/compiler/brw_fs_nir.cpp | 32
> > > > +++
> > > >  1 file changed, 32 insertions(+)
> > > > 
> > > > diff --git a/src/intel/compiler/brw_fs_nir.cpp
> > > > b/src/intel/compiler/brw_fs_nir.cpp
> > > > index 802f5cb0944..a9fd98bab68 100644
> > > > --- a/src/intel/compiler/brw_fs_nir.cpp
> > > > +++ b/src/intel/compiler/brw_fs_nir.cpp
> > > > @@ -696,6 +696,38 @@ fixup_64bit_conversion(const fs_builder
> > > > ,
> > > > return false;
> > > >  }
> > > >  
> > > > +static bool
> > > > +fixup_int_half_float_conversion(const fs_builder ,
> > > > +fs_reg dst, fs_reg src,
> > > > +bool saturate,
> > > > +const struct gen_device_info
> > > > *devinfo)
> > > > +{
> > > > +   /* CHV PRM, 3D Media GPGPU Engine, Register Region
> > > > Restrictions,
> > > > +* Special Restrictions:
> > > > +*
> > > > +*"Conversion between Integer and HF (Half Float) must
> > > > be
> > > > DWord
> > > > +* aligned and strided by a DWord on the destination."
> > > > +*
> > > > +* The same restriction is listed for other hardware
> > > > platforms,
> > > > however,
> > > > +* empirical testing suggests that only atom platforms are
> > > > affected.
> > > > +*/
> > > > +   if (!devinfo->is_cherryview &&
> > > > !gen_device_info_is_9lp(devinfo))
> > > > +  return false;
> > > > +
> > > > +   if (!((dst.type == BRW_REGISTER_TYPE_HF &&
> > > > !brw_reg_type_is_floating_point(src.type)) ||
> > > > + (src.type == BRW_REGISTER_TYPE_HF &&
> > > > !brw_reg_type_is_floating_point(dst.type
> > > > +  return false;
> > > > +
> > > > +   fs_reg tmp =
> > > > horiz_stride(retype(bld.vgrf(BRW_REGISTER_TYPE_F,
> > > > 1),
> > > > +dst.type),
> > > > + 2);
> > > > +   bld.MOV(tmp, src);
> > > > +   fs_inst *inst = bld.MOV(dst, tmp);
> > > > +   inst->saturate = saturate;
> > > > +
> > > > +   return true;
> > > > +}
> > > > +
> > > >  void
> > > >  fs_visitor::nir_emit_alu(const fs_builder , nir_alu_instr
> > > > *instr)
> > > >  {
> > > > -- 
> > > > 2.17.1
> > > > 
> > > > ___
> > > > mesa-dev mailing list
> > > > mesa-dev@lists.freedesktop.org
> > > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 39/53] intel/compiler: add a helper to do conversions between integer and half-float

2019-01-04 Thread Francisco Jerez
Iago Toral  writes:

> On Wed, 2019-01-02 at 15:00 -0800, Francisco Jerez wrote:
>> Iago Toral Quiroga  writes:
>> 
>> > There are hardware restrictions to consider that seem to affect
>> > atom platforms
>> > only.
>> 
>> Same comment here as for PATCH 13 of this series.  This and PATCH 40
>> shouldn't be necessary anymore with [1] in place.  Please drop them.
>
> Actually, I think your pass doesn't handle this case. I have just
> tested this and I get various regressions for conversions between
> integers (specifically integers lower than 32-bit, so I wonder if this
> restriction only affects these cases) and half-float.
>
> Here is an example of final IR generated with your pass and without the
> call to fixup_int_half_float_conversion from my series:
>
> mov(8) vgrf1+0.0:HF, vgrf0<2>:W 
>
> Which should be:
>
> mov(8) vgrf1<2>:HF, vgrf0<2>:W
>
> It seems your pass doesn't act on this code since INTEL_DEBUG=optimizer
> doesn't show any trace of it.
>
> If this is not a bug in your pass and just that it doesn't handle this
> case, I am happy to add the support for it as part of my series if that
> makes sense to you, just let me know if that is the case.
>

It's not really a bug, you just need to add a case to
has_dst_aligned_region_restriction() for it to return true for FP16
instructions with this restriction, sorry I didn't point that out
before.

> Iago
>
>> [1] 
>> https://lists.freedesktop.org/archives/mesa-dev/2018-December/212775.html
>> 
>> > ---
>> >  src/intel/compiler/brw_fs_nir.cpp | 32
>> > +++
>> >  1 file changed, 32 insertions(+)
>> > 
>> > diff --git a/src/intel/compiler/brw_fs_nir.cpp
>> > b/src/intel/compiler/brw_fs_nir.cpp
>> > index 802f5cb0944..a9fd98bab68 100644
>> > --- a/src/intel/compiler/brw_fs_nir.cpp
>> > +++ b/src/intel/compiler/brw_fs_nir.cpp
>> > @@ -696,6 +696,38 @@ fixup_64bit_conversion(const fs_builder ,
>> > return false;
>> >  }
>> >  
>> > +static bool
>> > +fixup_int_half_float_conversion(const fs_builder ,
>> > +fs_reg dst, fs_reg src,
>> > +bool saturate,
>> > +const struct gen_device_info
>> > *devinfo)
>> > +{
>> > +   /* CHV PRM, 3D Media GPGPU Engine, Register Region
>> > Restrictions,
>> > +* Special Restrictions:
>> > +*
>> > +*"Conversion between Integer and HF (Half Float) must be
>> > DWord
>> > +* aligned and strided by a DWord on the destination."
>> > +*
>> > +* The same restriction is listed for other hardware platforms,
>> > however,
>> > +* empirical testing suggests that only atom platforms are
>> > affected.
>> > +*/
>> > +   if (!devinfo->is_cherryview &&
>> > !gen_device_info_is_9lp(devinfo))
>> > +  return false;
>> > +
>> > +   if (!((dst.type == BRW_REGISTER_TYPE_HF &&
>> > !brw_reg_type_is_floating_point(src.type)) ||
>> > + (src.type == BRW_REGISTER_TYPE_HF &&
>> > !brw_reg_type_is_floating_point(dst.type
>> > +  return false;
>> > +
>> > +   fs_reg tmp = horiz_stride(retype(bld.vgrf(BRW_REGISTER_TYPE_F,
>> > 1),
>> > +dst.type),
>> > + 2);
>> > +   bld.MOV(tmp, src);
>> > +   fs_inst *inst = bld.MOV(dst, tmp);
>> > +   inst->saturate = saturate;
>> > +
>> > +   return true;
>> > +}
>> > +
>> >  void
>> >  fs_visitor::nir_emit_alu(const fs_builder , nir_alu_instr
>> > *instr)
>> >  {
>> > -- 
>> > 2.17.1
>> > 
>> > ___
>> > mesa-dev mailing list
>> > mesa-dev@lists.freedesktop.org
>> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 07/10] intel/fs: Introduce regioning lowering pass.

2019-01-04 Thread Francisco Jerez
Iago Toral  writes:

> On Sat, 2018-12-29 at 12:39 -0800, Francisco Jerez wrote:
>> This legalization pass is meant to handle situations where the source
>> or destination regioning controls of an instruction are unsupported
>> by
>> the hardware and need to be lowered away into separate instructions.
>> This should be more reliable and future-proof than the current
>> approach of handling CHV/BXT restrictions manually all over the
>> visitor.  The same mechanism is leveraged to lower unsupported type
>> conversions easily, which obsoletes the lower_conversions pass.
>> ---
>>  src/intel/Makefile.sources|   1 +
>>  src/intel/compiler/brw_fs.cpp |   5 +-
>>  src/intel/compiler/brw_fs.h   |  21 +-
>>  src/intel/compiler/brw_fs_lower_regioning.cpp | 382
>> ++
>>  src/intel/compiler/brw_ir_fs.h|  10 +
>>  src/intel/compiler/meson.build|   1 +
>>  6 files changed, 401 insertions(+), 19 deletions(-)
>>  create mode 100644 src/intel/compiler/brw_fs_lower_regioning.cpp
>> 
>> diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources
>> index 5e7d32293b7..6b9874d2b80 100644
>> --- a/src/intel/Makefile.sources
>> +++ b/src/intel/Makefile.sources
>> @@ -64,6 +64,7 @@ COMPILER_FILES = \
>>  compiler/brw_fs_live_variables.h \
>>  compiler/brw_fs_lower_conversions.cpp \
>>  compiler/brw_fs_lower_pack.cpp \
>> +compiler/brw_fs_lower_regioning.cpp \
>>  compiler/brw_fs_nir.cpp \
>>  compiler/brw_fs_reg_allocate.cpp \
>>  compiler/brw_fs_register_coalesce.cpp \
>> diff --git a/src/intel/compiler/brw_fs.cpp
>> b/src/intel/compiler/brw_fs.cpp
>> index 889509badab..caa7a798332 100644
>> --- a/src/intel/compiler/brw_fs.cpp
>> +++ b/src/intel/compiler/brw_fs.cpp
>> @@ -6471,7 +6471,10 @@ fs_visitor::optimize()
>>OPT(dead_code_eliminate);
>> }
>>  
>> -   if (OPT(lower_conversions)) {
>> +   progress = false;
>> +   OPT(lower_conversions);
>> +   OPT(lower_regioning);
>> +   if (progress) {
>>OPT(opt_copy_propagation);
>>OPT(dead_code_eliminate);
>>OPT(lower_simd_width);
>> diff --git a/src/intel/compiler/brw_fs.h
>> b/src/intel/compiler/brw_fs.h
>> index dc36ecc21ac..36825754931 100644
>> --- a/src/intel/compiler/brw_fs.h
>> +++ b/src/intel/compiler/brw_fs.h
>> @@ -164,6 +164,7 @@ public:
>> void lower_uniform_pull_constant_loads();
>> bool lower_load_payload();
>> bool lower_pack();
>> +   bool lower_regioning();
>> bool lower_conversions();
>> bool lower_logical_sends();
>> bool lower_integer_multiplication();
>> @@ -536,24 +537,8 @@ namespace brw {
>>}
>> }
>>  
>> -   /**
>> -* Remove any modifiers from the \p i-th source region of the
>> instruction,
>> -* including negate, abs and any implicit type conversion to the
>> execution
>> -* type.  Instead any source modifiers will be implemented as a
>> separate
>> -* MOV instruction prior to the original instruction.
>> -*/
>> -   inline bool
>> -   lower_src_modifiers(fs_visitor *v, bblock_t *block, fs_inst
>> *inst, unsigned i)
>> -   {
>> -  assert(inst->components_read(i) == 1);
>> -  const fs_builder ibld(v, block, inst);
>> -  const fs_reg tmp = ibld.vgrf(get_exec_type(inst));
>> -
>> -  ibld.MOV(tmp, inst->src[i]);
>> -  inst->src[i] = tmp;
>> -
>> -  return true;
>> -   }
>> +   bool
>> +   lower_src_modifiers(fs_visitor *v, bblock_t *block, fs_inst
>> *inst, unsigned i);
>>  }
>>  
>>  void shuffle_from_32bit_read(const brw::fs_builder ,
>> diff --git a/src/intel/compiler/brw_fs_lower_regioning.cpp
>> b/src/intel/compiler/brw_fs_lower_regioning.cpp
>> new file mode 100644
>> index 000..9578622401d
>> --- /dev/null
>> +++ b/src/intel/compiler/brw_fs_lower_regioning.cpp
>> @@ -0,0 +1,382 @@
>> +/*
>> + * Copyright © 2018 Intel Corporation
>> + *
>> + * Permission is hereby granted, free of charge, to any person
>> obtaining a
>> + * copy of this software and associated documentation files (the
>> "Software"),
>> + * to deal in the Software without restriction, including without
>> limitation
>> + * the rights to use, copy, modify, merge, publish, distribute,
>> sublicense,
>> + * and/or sell copies of the Software, and to permit persons to whom
>> the
>> + * Software is furnished to do so, subject to the following
>> conditions:
>> + *
>> + * The above copyright notice and this permission notice (including
>> the next
>> + * paragraph) shall be included in all copies or substantial
>> portions of the
>> + * Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
>> EXPRESS OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
>> MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO
>> EVENT SHALL
>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES
>> OR OTHER
>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT 

Re: [Mesa-dev] [PATCH 07/10] intel/fs: Introduce regioning lowering pass.

2019-01-04 Thread Iago Toral
On Sat, 2018-12-29 at 12:39 -0800, Francisco Jerez wrote:
> This legalization pass is meant to handle situations where the source
> or destination regioning controls of an instruction are unsupported
> by
> the hardware and need to be lowered away into separate instructions.
> This should be more reliable and future-proof than the current
> approach of handling CHV/BXT restrictions manually all over the
> visitor.  The same mechanism is leveraged to lower unsupported type
> conversions easily, which obsoletes the lower_conversions pass.
> ---
>  src/intel/Makefile.sources|   1 +
>  src/intel/compiler/brw_fs.cpp |   5 +-
>  src/intel/compiler/brw_fs.h   |  21 +-
>  src/intel/compiler/brw_fs_lower_regioning.cpp | 382
> ++
>  src/intel/compiler/brw_ir_fs.h|  10 +
>  src/intel/compiler/meson.build|   1 +
>  6 files changed, 401 insertions(+), 19 deletions(-)
>  create mode 100644 src/intel/compiler/brw_fs_lower_regioning.cpp
> 
> diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources
> index 5e7d32293b7..6b9874d2b80 100644
> --- a/src/intel/Makefile.sources
> +++ b/src/intel/Makefile.sources
> @@ -64,6 +64,7 @@ COMPILER_FILES = \
>   compiler/brw_fs_live_variables.h \
>   compiler/brw_fs_lower_conversions.cpp \
>   compiler/brw_fs_lower_pack.cpp \
> + compiler/brw_fs_lower_regioning.cpp \
>   compiler/brw_fs_nir.cpp \
>   compiler/brw_fs_reg_allocate.cpp \
>   compiler/brw_fs_register_coalesce.cpp \
> diff --git a/src/intel/compiler/brw_fs.cpp
> b/src/intel/compiler/brw_fs.cpp
> index 889509badab..caa7a798332 100644
> --- a/src/intel/compiler/brw_fs.cpp
> +++ b/src/intel/compiler/brw_fs.cpp
> @@ -6471,7 +6471,10 @@ fs_visitor::optimize()
>OPT(dead_code_eliminate);
> }
>  
> -   if (OPT(lower_conversions)) {
> +   progress = false;
> +   OPT(lower_conversions);
> +   OPT(lower_regioning);
> +   if (progress) {
>OPT(opt_copy_propagation);
>OPT(dead_code_eliminate);
>OPT(lower_simd_width);
> diff --git a/src/intel/compiler/brw_fs.h
> b/src/intel/compiler/brw_fs.h
> index dc36ecc21ac..36825754931 100644
> --- a/src/intel/compiler/brw_fs.h
> +++ b/src/intel/compiler/brw_fs.h
> @@ -164,6 +164,7 @@ public:
> void lower_uniform_pull_constant_loads();
> bool lower_load_payload();
> bool lower_pack();
> +   bool lower_regioning();
> bool lower_conversions();
> bool lower_logical_sends();
> bool lower_integer_multiplication();
> @@ -536,24 +537,8 @@ namespace brw {
>}
> }
>  
> -   /**
> -* Remove any modifiers from the \p i-th source region of the
> instruction,
> -* including negate, abs and any implicit type conversion to the
> execution
> -* type.  Instead any source modifiers will be implemented as a
> separate
> -* MOV instruction prior to the original instruction.
> -*/
> -   inline bool
> -   lower_src_modifiers(fs_visitor *v, bblock_t *block, fs_inst
> *inst, unsigned i)
> -   {
> -  assert(inst->components_read(i) == 1);
> -  const fs_builder ibld(v, block, inst);
> -  const fs_reg tmp = ibld.vgrf(get_exec_type(inst));
> -
> -  ibld.MOV(tmp, inst->src[i]);
> -  inst->src[i] = tmp;
> -
> -  return true;
> -   }
> +   bool
> +   lower_src_modifiers(fs_visitor *v, bblock_t *block, fs_inst
> *inst, unsigned i);
>  }
>  
>  void shuffle_from_32bit_read(const brw::fs_builder ,
> diff --git a/src/intel/compiler/brw_fs_lower_regioning.cpp
> b/src/intel/compiler/brw_fs_lower_regioning.cpp
> new file mode 100644
> index 000..9578622401d
> --- /dev/null
> +++ b/src/intel/compiler/brw_fs_lower_regioning.cpp
> @@ -0,0 +1,382 @@
> +/*
> + * Copyright © 2018 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person
> obtaining a
> + * copy of this software and associated documentation files (the
> "Software"),
> + * to deal in the Software without restriction, including without
> limitation
> + * the rights to use, copy, modify, merge, publish, distribute,
> sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom
> the
> + * Software is furnished to do so, subject to the following
> conditions:
> + *
> + * The above copyright notice and this permission notice (including
> the next
> + * paragraph) shall be included in all copies or substantial
> portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO
> EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES
> OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
> ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
> OTHER DEALINGS
> + * IN THE SOFTWARE.
> + */
> +
> 

Re: [Mesa-dev] [PATCH v2 39/53] intel/compiler: add a helper to do conversions between integer and half-float

2019-01-04 Thread Iago Toral
On Wed, 2019-01-02 at 15:00 -0800, Francisco Jerez wrote:
> Iago Toral Quiroga  writes:
> 
> > There are hardware restrictions to consider that seem to affect
> > atom platforms
> > only.
> 
> Same comment here as for PATCH 13 of this series.  This and PATCH 40
> shouldn't be necessary anymore with [1] in place.  Please drop them.

Actually, I think your pass doesn't handle this case. I have just
tested this and I get various regressions for conversions between
integers (specifically integers lower than 32-bit, so I wonder if this
restriction only affects these cases) and half-float.

Here is an example of final IR generated with your pass and without the
call to fixup_int_half_float_conversion from my series:

mov(8) vgrf1+0.0:HF, vgrf0<2>:W 

Which should be:

mov(8) vgrf1<2>:HF, vgrf0<2>:W

It seems your pass doesn't act on this code since INTEL_DEBUG=optimizer
doesn't show any trace of it.

If this is not a bug in your pass and just that it doesn't handle this
case, I am happy to add the support for it as part of my series if that
makes sense to you, just let me know if that is the case.

Iago

> [1] 
> https://lists.freedesktop.org/archives/mesa-dev/2018-December/212775.html
> 
> > ---
> >  src/intel/compiler/brw_fs_nir.cpp | 32
> > +++
> >  1 file changed, 32 insertions(+)
> > 
> > diff --git a/src/intel/compiler/brw_fs_nir.cpp
> > b/src/intel/compiler/brw_fs_nir.cpp
> > index 802f5cb0944..a9fd98bab68 100644
> > --- a/src/intel/compiler/brw_fs_nir.cpp
> > +++ b/src/intel/compiler/brw_fs_nir.cpp
> > @@ -696,6 +696,38 @@ fixup_64bit_conversion(const fs_builder ,
> > return false;
> >  }
> >  
> > +static bool
> > +fixup_int_half_float_conversion(const fs_builder ,
> > +fs_reg dst, fs_reg src,
> > +bool saturate,
> > +const struct gen_device_info
> > *devinfo)
> > +{
> > +   /* CHV PRM, 3D Media GPGPU Engine, Register Region
> > Restrictions,
> > +* Special Restrictions:
> > +*
> > +*"Conversion between Integer and HF (Half Float) must be
> > DWord
> > +* aligned and strided by a DWord on the destination."
> > +*
> > +* The same restriction is listed for other hardware platforms,
> > however,
> > +* empirical testing suggests that only atom platforms are
> > affected.
> > +*/
> > +   if (!devinfo->is_cherryview &&
> > !gen_device_info_is_9lp(devinfo))
> > +  return false;
> > +
> > +   if (!((dst.type == BRW_REGISTER_TYPE_HF &&
> > !brw_reg_type_is_floating_point(src.type)) ||
> > + (src.type == BRW_REGISTER_TYPE_HF &&
> > !brw_reg_type_is_floating_point(dst.type
> > +  return false;
> > +
> > +   fs_reg tmp = horiz_stride(retype(bld.vgrf(BRW_REGISTER_TYPE_F,
> > 1),
> > +dst.type),
> > + 2);
> > +   bld.MOV(tmp, src);
> > +   fs_inst *inst = bld.MOV(dst, tmp);
> > +   inst->saturate = saturate;
> > +
> > +   return true;
> > +}
> > +
> >  void
> >  fs_visitor::nir_emit_alu(const fs_builder , nir_alu_instr
> > *instr)
> >  {
> > -- 
> > 2.17.1
> > 
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev