Re: [Mesa-dev] NIR constant problem for GPU which doesn't have native integer support
Thanks for your guys' information, although it's much deeper than my question and also my understanding of the NIR and compiler knowledge. This patch fix my problem: https://patchwork.freedesktop.org/patch/268946/ But seems it's not the final solution, right? And I'm not sure if we will have other integer problem besides constant for non-integer-gpu, do we? To be honest, as a backend developer, NIR non-integer-gpu support is out of my ability. So maybe I have to take this patch temporarily and work on other topic for sometime to see if any progress of this support in NIR in the future. Thanks, Qiang On Sat, Jan 5, 2019 at 12:40 AM Ilia Mirkin wrote: > > On Fri, Jan 4, 2019 at 10:46 AM Jason Ekstrand wrote: > > > > On Fri, Jan 4, 2019 at 4:07 AM Erik Faye-Lund > > wrote: > >> > >> On Thu, 2019-01-03 at 11:58 -0600, Jason Ekstrand wrote: > >> > On Thu, Jan 3, 2019 at 3:39 AM Erik Faye-Lund < > >> > erik.faye-l...@collabora.com> wrote: > >> > > On Wed, 2019-01-02 at 10:16 -0600, Jason Ekstrand wrote: > >> > > > On Wed, Jan 2, 2019 at 9:43 AM Ilia Mirkin > >> > > > wrote: > >> > > > > Have a look at the first 4 patches in the series from Jonathan > >> > > > > Marek > >> > > > > to address some of these issues: > >> > > > > > >> > > > > https://patchwork.freedesktop.org/series/54295/ > >> > > > > > >> > > > > Not sure exactly what state that work is in, but I've added > >> > > > > Jonathan > >> > > > > to CC, perhaps he can provide an update. > >> > > > > > >> > > > > Cheers, > >> > > > > > >> > > > > -ilia > >> > > > > > >> > > > > On Wed, Jan 2, 2019 at 6:28 AM Qiang Yu > >> > > wrote: > >> > > > > > > >> > > > > > Hi guys, > >> > > > > > > >> > > > > > I found the problem with this test fragment shader when lima > >> > > > > development: > >> > > > > > uniform int color; > >> > > > > > void main() { > >> > > > > > if (color > 1) > >> > > > > > gl_FragColor = vec4(1.0, 0.0, 0.0, 1); > >> > > > > > else > >> > > > > > gl_FragColor = vec4(0.0, 1.0, 0.0, 1); > >> > > > > > } > >> > > > > > > >> > > > > > nir_print_shader output: > >> > > > > > impl main { > >> > > > > > block block_0: > >> > > > > > /* preds: */ > >> > > > > > vec1 32 ssa_0 = load_const (0x0001 /* 0.00 > >> > > */) > >> > > > > > vec4 32 ssa_1 = load_const (0x3f80 /* 1.00 > >> > > */, > >> > > > > > 0x /* 0.00 */, 0x /* 0.00 */, > >> > > 0x3f80 > >> > > > > /* > >> > > > > > 1.00 */) > >> > > > > > vec4 32 ssa_2 = load_const (0x /* 0.00 > >> > > */, > >> > > > > > 0x3f80 /* 1.00 */, 0x /* 0.00 */, > >> > > 0x3f80 > >> > > > > /* > >> > > > > > 1.00 */) > >> > > > > > vec1 32 ssa_3 = load_const (0x /* 0.00 > >> > > */) > >> > > > > > vec1 32 ssa_4 = intrinsic load_uniform (ssa_3) (0, 1, > >> > > 0) > >> > > > > /* > >> > > > > > base=0 */ /* range=1 */ /* component=0 */ /* color */ > >> > > > > > vec1 32 ssa_5 = slt ssa_0, ssa_4 > >> > > > > > vec1 32 ssa_6 = fnot ssa_5 > >> > > > > > vec4 32 ssa_7 = bcsel ssa_6., ssa_2, ssa_1 > >> > > > > > intrinsic store_output (ssa_7, ssa_3) (0, 15, 0) /* > >> > > > > base=0 */ > >> > > > > > /* wrmask=xyzw */ /* component=0 */ /* gl_FragColor */ > >> > > > > > /* succs: block_1 */ > >> > > > > > block block_1: > >> > > > > > } > >> > > > > > > >> > > > > > ssa0 is not converted to float when glsl to nir. I see > >> > > > > glsl_to_nir.cpp > >> > > > > > will create flt/ilt/ult > >> > > > > > based on source type for gpu support native integer, but for > >> > > gpu > >> > > > > not > >> > > > > > support native > >> > > > > > integer, just create slt for all source type. And in > >> > > > > > nir_lower_constant_initializers, > >> > > > > > there's also no type conversion for integer constant. > >> > > > > >> > > > This is a generally sticky issue. In NIR, we have no concept of > >> > > > types on SSA values which has proven perfectly reasonable and > >> > > > actually very powerful in a world where integers are supported > >> > > > natively. Unfortunately, it causes significant problems for > >> > > float- > >> > > > only architectures. > >> > > > >> > > I would like to take this chance to say that this untyped SSA-value > >> > > choice has lead to issues in both radeon_si (because LLVM values > >> > > are > >> > > typed) and zink (similarly, because SPIR-V values are typed), where > >> > > we > >> > > need to to bitcasts on every access because there's just not enough > >> > > information available to emit variables with the right type. > >> > > >> > I'm not sure if I agree that the two problems are the same or not... > >> > More on that in a bit. > >> > > >> > > It took us a lot of time to realize that the meta-data from the > >> > > opcodes > >> > > doesn't *really* provide this, because the rest of nir doesn't > >> > > treat > >> > > values consistently.
Re: [Mesa-dev] [PATCH] st/glsl: refactor st_link_nir()
Timothy Arceri writes: > The functional change here is moving the nir_lower_io_to_scalar_early() > calls inside st_nir_link_shaders() and moving the st_nir_opts() call > after the call to nir_lower_io_arrays_to_elements(). > > This fixes a bug with the following piglit test due to the current code > not cleaning up dead code after we lower arrays. This was causing an > assert in the new duplicate varyings link time opt introduced in > 70be9afccb23. > > tests/spec/glsl-1.10/execution/vsfs-unused-array-member.shader_test > > Moving the nir_lower_io_to_scalar_early() calls also allows us to tidy > up the code a little and merge some loops. I love the reduction in stage tracking logic. Reviewed-by: Eric Anholt signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [RFC 4/6] dri: Enable fp16 configs and visuals
On Fri, Jan 4, 2019 at 5:04 PM Ilia Mirkin wrote: > > On Fri, Jan 4, 2019 at 4:56 PM Kevin Strasser > wrote: > > diff --git a/src/mesa/drivers/dri/common/utils.c > > b/src/mesa/drivers/dri/common/utils.c > > index b52c59f..1cf9362 100644 > > --- a/src/mesa/drivers/dri/common/utils.c > > +++ b/src/mesa/drivers/dri/common/utils.c > > @@ -200,6 +200,10 @@ driCreateConfigs(mesa_format format, > >{ 0x03FF, 0x000FFC00, 0x3FF0, 0x }, > >/* MESA_FORMAT_R10G10B10A2_UNORM */ > >{ 0x03FF, 0x000FFC00, 0x3FF0, 0xC000 }, > > + /* MESA_FORMAT_RGBX_FLOAT16 */ > > + { 0x, 0x, 0x, > > 0x}, > > + /* MESA_FORMAT_RGBA_FLOAT16 */ > > + { 0x, 0x, 0x, > > 0x}, > > I'm about 37% sure that these need to have a ULL suffix on them, as > otherwise immediates aren't upgraded to a wider type. I'm surprised > there were no warnings/errors about that. Perhaps I'm just wrong > though -- good to check either way. > > If if turns out I'm right, there were a few other instances of such > wide immediates. As confirmed on IRC, appears that I'm wrong about this. The values do get upgraded to the wider type. -ilia ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [RFC 4/6] dri: Enable fp16 configs and visuals
On Fri, Jan 4, 2019 at 4:56 PM Kevin Strasser wrote: > diff --git a/src/mesa/drivers/dri/common/utils.c > b/src/mesa/drivers/dri/common/utils.c > index b52c59f..1cf9362 100644 > --- a/src/mesa/drivers/dri/common/utils.c > +++ b/src/mesa/drivers/dri/common/utils.c > @@ -200,6 +200,10 @@ driCreateConfigs(mesa_format format, >{ 0x03FF, 0x000FFC00, 0x3FF0, 0x }, >/* MESA_FORMAT_R10G10B10A2_UNORM */ >{ 0x03FF, 0x000FFC00, 0x3FF0, 0xC000 }, > + /* MESA_FORMAT_RGBX_FLOAT16 */ > + { 0x, 0x, 0x, > 0x}, > + /* MESA_FORMAT_RGBA_FLOAT16 */ > + { 0x, 0x, 0x, > 0x}, I'm about 37% sure that these need to have a ULL suffix on them, as otherwise immediates aren't upgraded to a wider type. I'm surprised there were no warnings/errors about that. Perhaps I'm just wrong though -- good to check either way. If if turns out I'm right, there were a few other instances of such wide immediates. Cheers, -ilia ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [RFC 0/6] Enable fp16 visuals and fbconfigs
This series enables fp16 fbconfigs and visuals by leveraging existing off-screen rendering support. These formats can be used in conjunction with EXT_surface_SMPTE2086_metadata (not yet implemented by any drivers) to support EXT_gl_colorspace_scrgb / EXT_gl_colorspace_scrgb_linear, used in places like Android wide color gamut. While I have run this series against Piglit, I still need to sort out test coverage for these formats. If anyone has pointers to existing tests that would be really helpful. As an easy smoke test I have a modified version of kmscube: https://github.com/strassek/kmscube/commits/fp16 Kevin Strasser (6): dri: Support 64 bit rgba masks dri: Set bit for float configs drm-uapi: Add fp16 formats to drm_fourcc.h dri: Enable fp16 configs and visuals gallium/winsys/kms: Respect format bpp gbm: Add visuals and buffer handling for fp16 formats include/GL/internal/dri_interface.h| 11 ++- include/drm-uapi/drm_fourcc.h | 8 +++ src/egl/drivers/dri2/egl_dri2.c| 32 +++-- src/egl/drivers/dri2/egl_dri2.h| 6 +- src/egl/drivers/dri2/platform_android.c| 2 +- src/egl/drivers/dri2/platform_drm.c| 79 ++ src/egl/drivers/dri2/platform_surfaceless.c| 2 +- src/egl/drivers/dri2/platform_wayland.c| 2 +- src/egl/drivers/dri2/platform_x11.c| 6 +- .../auxiliary/pipe-loader/driinfo_gallium.h| 1 + src/gallium/state_trackers/dri/dri2.c | 22 ++ src/gallium/state_trackers/dri/dri_drawable.c | 3 + src/gallium/state_trackers/dri/dri_screen.c| 26 ++- src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c | 2 +- src/gbm/backends/dri/gbm_dri.c | 38 ++- src/gbm/backends/dri/gbm_driint.h | 9 +-- src/gbm/main/gbm.c | 3 + src/gbm/main/gbm.h | 9 +++ src/glx/glxconfig.h| 2 +- src/loader/loader_dri3_helper.c| 5 ++ src/mesa/drivers/dri/common/dri_util.c | 8 +++ src/mesa/drivers/dri/common/utils.c| 31 - src/mesa/drivers/dri/i965/intel_screen.c | 39 ++- src/mesa/main/mtypes.h | 2 +- src/mesa/state_tracker/st_cb_fbo.c | 3 + src/util/xmlpool/t_options.h | 5 ++ 26 files changed, 311 insertions(+), 45 deletions(-) -- 2.7.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [RFC 5/6] gallium/winsys/kms: Respect format bpp
Needed for allocating buffers with pixel formats wider than 32 bpp. Signed-off-by: Kevin Strasser --- src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c b/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c index 9564d94..e921a61 100644 --- a/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c +++ b/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c @@ -182,7 +182,7 @@ kms_sw_displaytarget_create(struct sw_winsys *ws, kms_sw_dt->format = format; memset(_req, 0, sizeof(create_req)); - create_req.bpp = 32; + create_req.bpp = util_format_get_blocksizebits(format); create_req.width = width; create_req.height = height; ret = drmIoctl(kms_sw->fd, DRM_IOCTL_MODE_CREATE_DUMB, _req); -- 2.7.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [RFC 4/6] dri: Enable fp16 configs and visuals
Add dri image formats for RGBA ordered 64 bpp IEEE 754 half precision floating point. Introduce a new dri configuration option so users can diable exposure of fp16 formats, following the same design and policy of rgb10a2 (opt-in for i965 and opt-out for gallium). Add a loader cap field so loaders can indicate if they know how to handle fp16 formats. Add visuals to gallium and i965, leverage existing offscreen render support for MESA_FORMAT_RGBA_FLOAT16 and MESA_FORMAT_RGBX_FLOAT16. Signed-off-by: Kevin Strasser --- include/GL/internal/dri_interface.h| 5 +++ src/egl/drivers/dri2/egl_dri2.c| 2 ++ .../auxiliary/pipe-loader/driinfo_gallium.h| 1 + src/gallium/state_trackers/dri/dri2.c | 22 src/gallium/state_trackers/dri/dri_drawable.c | 3 ++ src/gallium/state_trackers/dri/dri_screen.c| 26 ++- src/loader/loader_dri3_helper.c| 5 +++ src/mesa/drivers/dri/common/dri_util.c | 8 + src/mesa/drivers/dri/common/utils.c| 10 ++ src/mesa/drivers/dri/i965/intel_screen.c | 39 -- src/mesa/state_tracker/st_cb_fbo.c | 3 ++ src/util/xmlpool/t_options.h | 5 +++ 12 files changed, 125 insertions(+), 4 deletions(-) diff --git a/include/GL/internal/dri_interface.h b/include/GL/internal/dri_interface.h index c5761c4..76ebfae 100644 --- a/include/GL/internal/dri_interface.h +++ b/include/GL/internal/dri_interface.h @@ -1038,6 +1038,7 @@ enum dri_loader_cap { * only BGRA ordering can be exposed. */ DRI_LOADER_CAP_RGBA_ORDERING, + DRI_LOADER_CAP_FP16, }; struct __DRIdri2LoaderExtensionRec { @@ -1277,6 +1278,8 @@ struct __DRIdri2ExtensionRec { #define __DRI_IMAGE_FORMAT_XBGR2101010 0x1010 #define __DRI_IMAGE_FORMAT_ABGR2101010 0x1011 #define __DRI_IMAGE_FORMAT_SABGR8 0x1012 +#define __DRI_IMAGE_FORMAT_XBGR16161616F 0x1013 +#define __DRI_IMAGE_FORMAT_ABGR16161616F 0x1014 #define __DRI_IMAGE_USE_SHARE 0x0001 #define __DRI_IMAGE_USE_SCANOUT0x0002 @@ -1322,6 +1325,8 @@ struct __DRIdri2ExtensionRec { #define __DRI_IMAGE_FOURCC_RGBX1010102 0x30335852 #define __DRI_IMAGE_FOURCC_BGRA1010102 0x30334142 #define __DRI_IMAGE_FOURCC_BGRX1010102 0x30335842 +#define __DRI_IMAGE_FOURCC_ABGR16161616F 0x48344241 +#define __DRI_IMAGE_FOURCC_XBGR16161616F 0x48344258 #define __DRI_IMAGE_FOURCC_YUV410 0x39565559 #define __DRI_IMAGE_FOURCC_YUV411 0x31315559 #define __DRI_IMAGE_FOURCC_YUV420 0x32315559 diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c index fb8f1b7..6b44767 100644 --- a/src/egl/drivers/dri2/egl_dri2.c +++ b/src/egl/drivers/dri2/egl_dri2.c @@ -2233,6 +2233,8 @@ dri2_num_fourcc_format_planes(EGLint format) case DRM_FORMAT_ABGR2101010: case DRM_FORMAT_RGBA1010102: case DRM_FORMAT_BGRA1010102: + case DRM_FORMAT_XBGR16161616F: + case DRM_FORMAT_ABGR16161616F: case DRM_FORMAT_YUYV: case DRM_FORMAT_YVYU: case DRM_FORMAT_UYVY: diff --git a/src/gallium/auxiliary/pipe-loader/driinfo_gallium.h b/src/gallium/auxiliary/pipe-loader/driinfo_gallium.h index 9db0dc0..76637e5 100644 --- a/src/gallium/auxiliary/pipe-loader/driinfo_gallium.h +++ b/src/gallium/auxiliary/pipe-loader/driinfo_gallium.h @@ -37,4 +37,5 @@ DRI_CONF_SECTION_MISCELLANEOUS DRI_CONF_ALWAYS_HAVE_DEPTH_BUFFER("false") DRI_CONF_GLSL_ZERO_INIT("false") DRI_CONF_ALLOW_RGB10_CONFIGS("true") + DRI_CONF_ALLOW_FP16_CONFIGS("true") DRI_CONF_SECTION_END diff --git a/src/gallium/state_trackers/dri/dri2.c b/src/gallium/state_trackers/dri/dri2.c index 6fc07e4..c7f429a 100644 --- a/src/gallium/state_trackers/dri/dri2.c +++ b/src/gallium/state_trackers/dri/dri2.c @@ -74,6 +74,10 @@ struct dri2_format_mapping { }; static const struct dri2_format_mapping dri2_format_table[] = { + { __DRI_IMAGE_FOURCC_ABGR16161616F, __DRI_IMAGE_FORMAT_ABGR16161616F, +__DRI_IMAGE_COMPONENTS_RGBA, PIPE_FORMAT_R16G16B16A16_FLOAT }, + { __DRI_IMAGE_FOURCC_XBGR16161616F, __DRI_IMAGE_FORMAT_XBGR16161616F, +__DRI_IMAGE_COMPONENTS_RGB, PIPE_FORMAT_R16G16B16X16_FLOAT }, { __DRI_IMAGE_FOURCC_ARGB2101010, __DRI_IMAGE_FORMAT_ARGB2101010, __DRI_IMAGE_COMPONENTS_RGBA, PIPE_FORMAT_B10G10R10A2_UNORM }, { __DRI_IMAGE_FOURCC_XRGB2101010, __DRI_IMAGE_FORMAT_XRGB2101010, @@ -222,6 +226,12 @@ dri2_drawable_get_buffers(struct dri_drawable *drawable, * may occur as the stvis->color_format. */ switch(format) { + case PIPE_FORMAT_R16G16B16A16_FLOAT: + depth = 64; + break; + case PIPE_FORMAT_R16G16B16X16_FLOAT: + depth = 48; + break; case PIPE_FORMAT_B10G10R10A2_UNORM: case PIPE_FORMAT_R10G10B10A2_UNORM: case PIPE_FORMAT_BGRA_UNORM: @@ -300,6 +310,12 @@ dri_image_drawable_get_buffers(struct
[Mesa-dev] [RFC 2/6] dri: Set bit for float configs
Populate __DRI_ATTRIB_FLOAT_BIT, such that we can filter for formats containing floating point pixel data and egl/glx can satisfy the requirements for their respective extensions. Signed-off-by: Kevin Strasser --- src/egl/drivers/dri2/egl_dri2.c | 2 ++ src/egl/drivers/dri2/platform_drm.c | 18 +++--- src/gbm/backends/dri/gbm_dri.c | 12 src/gbm/backends/dri/gbm_driint.h | 1 + src/mesa/drivers/dri/common/utils.c | 5 + 5 files changed, 35 insertions(+), 3 deletions(-) diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c index d19950d..fb8f1b7 100644 --- a/src/egl/drivers/dri2/egl_dri2.c +++ b/src/egl/drivers/dri2/egl_dri2.c @@ -202,6 +202,8 @@ dri2_add_config(_EGLDisplay *disp, const __DRIconfig *dri_config, int id, ); ++i) { switch (attrib) { case __DRI_ATTRIB_RENDER_TYPE: + if (value & __DRI_ATTRIB_FLOAT_BIT) +_eglSetConfigKey(, EGL_COLOR_COMPONENT_TYPE_EXT, EGL_COLOR_COMPONENT_TYPE_FLOAT_EXT); if (value & __DRI_ATTRIB_RGBA_BIT) value = EGL_RGB_BUFFER; else if (value & __DRI_ATTRIB_LUMINANCE_BIT) diff --git a/src/egl/drivers/dri2/platform_drm.c b/src/egl/drivers/dri2/platform_drm.c index 47563d6..ae235e9 100644 --- a/src/egl/drivers/dri2/platform_drm.c +++ b/src/egl/drivers/dri2/platform_drm.c @@ -98,6 +98,7 @@ dri2_drm_config_is_compatible(struct dri2_egl_display *dri2_dpy, const struct gbm_dri_visual *visual = NULL; unsigned long long int red, green, blue, alpha; unsigned int mask_hi = 0, mask_lo, render_type; + bool is_float; int i; /* Check that the EGLConfig being used to render to the surface is @@ -129,6 +130,10 @@ dri2_drm_config_is_compatible(struct dri2_egl_display *dri2_dpy, _lo); alpha = (unsigned long long int)mask_hi << 32 | mask_lo; + dri2_dpy->core->getConfigAttrib(config, __DRI_ATTRIB_RENDER_TYPE, + _type); + is_float = (render_type & __DRI_ATTRIB_FLOAT_BIT) ? true : false; + for (i = 0; i < dri2_dpy->gbm_dri->num_visuals; i++) { visual = _dpy->gbm_dri->visual_table[i]; if (visual->gbm_format == surface->format) @@ -141,7 +146,8 @@ dri2_drm_config_is_compatible(struct dri2_egl_display *dri2_dpy, if (red != visual->rgba_masks.red || green != visual->rgba_masks.green || blue != visual->rgba_masks.blue || - (alpha && visual->rgba_masks.alpha && alpha != visual->rgba_masks.alpha)) { + (alpha && visual->rgba_masks.alpha && alpha != visual->rgba_masks.alpha) || + is_float != visual->is_float) { return false; } @@ -649,7 +655,8 @@ drm_add_configs_for_visuals(_EGLDriver *drv, _EGLDisplay *disp) for (unsigned i = 0; dri2_dpy->driver_configs[i]; i++) { const __DRIconfig *config = dri2_dpy->driver_configs[i]; unsigned long long int red, green, blue, alpha; - unsigned int mask_hi = 0, mask_lo; + unsigned int mask_hi = 0, mask_lo, render_type; + bool is_float; dri2_dpy->core->getConfigAttrib(config, __DRI_ATTRIB_RED_MASK_HI, _hi); @@ -675,13 +682,18 @@ drm_add_configs_for_visuals(_EGLDriver *drv, _EGLDisplay *disp) _lo); alpha = (unsigned long long int)mask_hi << 32 | mask_lo; + dri2_dpy->core->getConfigAttrib(dri2_dpy->driver_configs[i], + __DRI_ATTRIB_RENDER_TYPE, _type); + is_float = (render_type & __DRI_ATTRIB_FLOAT_BIT) ? true : false; + for (unsigned j = 0; j < num_visuals; j++) { struct dri2_egl_config *dri2_conf; if (visuals[j].rgba_masks.red != red || visuals[j].rgba_masks.green != green || visuals[j].rgba_masks.blue != blue || - visuals[j].rgba_masks.alpha != alpha) + visuals[j].rgba_masks.alpha != alpha || + visuals[j].is_float != is_float) continue; const EGLint attr_list[] = { diff --git a/src/gbm/backends/dri/gbm_dri.c b/src/gbm/backends/dri/gbm_dri.c index abbb0b9..225b0de 100644 --- a/src/gbm/backends/dri/gbm_dri.c +++ b/src/gbm/backends/dri/gbm_dri.c @@ -487,50 +487,62 @@ static const struct gbm_dri_visual gbm_dri_visuals_table[] = { { GBM_FORMAT_R8, __DRI_IMAGE_FORMAT_R8, { 0x00ff, 0x, 0x, 0x }, + false, }, { GBM_FORMAT_GR88, __DRI_IMAGE_FORMAT_GR88, { 0x00ff, 0xff00, 0x, 0x }, + false, }, { GBM_FORMAT_ARGB1555, __DRI_IMAGE_FORMAT_ARGB1555, { 0x7c00, 0x03e0, 0x001f, 0x8000 }, + false, }, { GBM_FORMAT_RGB565, __DRI_IMAGE_FORMAT_RGB565, { 0xf800, 0x07e0, 0x001f, 0x }, + false, }, { GBM_FORMAT_XRGB, __DRI_IMAGE_FORMAT_XRGB,
[Mesa-dev] [RFC 6/6] gbm: Add visuals and buffer handling for fp16 formats
Set loader caps indicating that gbm can handle both rgba ordering and fp16 formats. Signed-off-by: Kevin Strasser --- src/gbm/backends/dri/gbm_dri.c | 26 +- src/gbm/main/gbm.c | 3 +++ src/gbm/main/gbm.h | 9 + 3 files changed, 37 insertions(+), 1 deletion(-) diff --git a/src/gbm/backends/dri/gbm_dri.c b/src/gbm/backends/dri/gbm_dri.c index 225b0de..cc1943e 100644 --- a/src/gbm/backends/dri/gbm_dri.c +++ b/src/gbm/backends/dri/gbm_dri.c @@ -118,6 +118,19 @@ dri_get_buffers_with_format(__DRIdrawable * driDrawable, count, out_count, surf->dri_private); } +static unsigned +dri_get_capability(void *loaderPrivate, enum dri_loader_cap cap) +{ + /* Note: loaderPrivate is _EGLDisplay* */ + switch (cap) { + case DRI_LOADER_CAP_RGBA_ORDERING: + case DRI_LOADER_CAP_FP16: + return 1; + default: + return 0; + } +} + static int image_get_buffers(__DRIdrawable *driDrawable, unsigned int format, @@ -215,11 +228,12 @@ static const __DRIimageLookupExtension image_lookup_extension = { }; static const __DRIdri2LoaderExtension dri2_loader_extension = { - .base = { __DRI_DRI2_LOADER, 3 }, + .base = { __DRI_DRI2_LOADER, 4 }, .getBuffers = dri_get_buffers, .flushFrontBuffer= dri_flush_front_buffer, .getBuffersWithFormat= dri_get_buffers_with_format, + .getCapability = dri_get_capability, }; static const __DRIimageLoaderExtension image_loader_extension = { @@ -544,6 +558,16 @@ static const struct gbm_dri_visual gbm_dri_visuals_table[] = { { 0x03ff, 0x000ffc00, 0x3ff0, 0xc000 }, false, }, + { + GBM_FORMAT_XBGR16161616F, __DRI_IMAGE_FORMAT_XBGR16161616F, + { 0x, 0x, 0x, 0x}, + true, + }, + { + GBM_FORMAT_ABGR16161616F, __DRI_IMAGE_FORMAT_ABGR16161616F, + { 0x, 0x, 0x, 0x}, + true, + }, }; static int diff --git a/src/gbm/main/gbm.c b/src/gbm/main/gbm.c index 2501518..28e9237 100644 --- a/src/gbm/main/gbm.c +++ b/src/gbm/main/gbm.c @@ -271,6 +271,9 @@ gbm_bo_get_bpp(struct gbm_bo *bo) case GBM_FORMAT_RGBA1010102: case GBM_FORMAT_BGRA1010102: return 32; + case GBM_FORMAT_XBGR16161616F: + case GBM_FORMAT_ABGR16161616F: + return 64; } } diff --git a/src/gbm/main/gbm.h b/src/gbm/main/gbm.h index 9b52887..4c6ab37 100644 --- a/src/gbm/main/gbm.h +++ b/src/gbm/main/gbm.h @@ -150,6 +150,15 @@ enum gbm_bo_format { #define GBM_FORMAT_RGBA1010102 __gbm_fourcc_code('R', 'A', '3', '0') /* [31:0] R:G:B:A 10:10:10:2 little endian */ #define GBM_FORMAT_BGRA1010102 __gbm_fourcc_code('B', 'A', '3', '0') /* [31:0] B:G:R:A 10:10:10:2 little endian */ +/* + * Floating point 64bpp RGB + * IEEE 754-2008 binary16 half-precision float + * [15:0] sign:exponent:mantissa 1:5:10 + */ +#define GBM_FORMAT_XBGR16161616F __gbm_fourcc_code('X', 'B', '4', 'H') /* [63:0] x:B:G:R 16:16:16:16 little endian */ + +#define GBM_FORMAT_ABGR16161616F __gbm_fourcc_code('A', 'B', '4', 'H') /* [63:0] A:B:G:R 16:16:16:16 little endian */ + /* packed YCbCr */ #define GBM_FORMAT_YUYV__gbm_fourcc_code('Y', 'U', 'Y', 'V') /* [31:0] Cr0:Y1:Cb0:Y0 8:8:8:8 little endian */ #define GBM_FORMAT_YVYU__gbm_fourcc_code('Y', 'V', 'Y', 'U') /* [31:0] Cb0:Y1:Cr0:Y0 8:8:8:8 little endian */ -- 2.7.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [RFC 3/6] drm-uapi: Add fp16 formats to drm_fourcc.h
Add 64 bpp 16:16:16:16 half float pixel formats. Each 16 bit component is formatted in IEEE-754 half-precision float (binary16) 1:5:10 MSb-sign:exponent:fraction form. Signed-off-by: Kevin Strasser --- include/drm-uapi/drm_fourcc.h | 8 1 file changed, 8 insertions(+) diff --git a/include/drm-uapi/drm_fourcc.h b/include/drm-uapi/drm_fourcc.h index d5e5235..58ffeae 100644 --- a/include/drm-uapi/drm_fourcc.h +++ b/include/drm-uapi/drm_fourcc.h @@ -105,6 +105,14 @@ extern "C" { #define DRM_FORMAT_RGBA1010102 fourcc_code('R', 'A', '3', '0') /* [31:0] R:G:B:A 10:10:10:2 little endian */ #define DRM_FORMAT_BGRA1010102 fourcc_code('B', 'A', '3', '0') /* [31:0] B:G:R:A 10:10:10:2 little endian */ +/* + * Floating point 64bpp RGB + * IEEE 754-2008 binary16 half-precision float + * [15:0] sign:exponent:mantissa 1:5:10 + */ +#define DRM_FORMAT_XBGR16161616F fourcc_code('X', 'B', '4', 'H') /* [63:0] x:B:G:R 16:16:16:16 little endian */ +#define DRM_FORMAT_ABGR16161616F fourcc_code('A', 'B', '4', 'H') /* [63:0] A:B:G:R 16:16:16:16 little endian */ + /* packed YCbCr */ #define DRM_FORMAT_YUYVfourcc_code('Y', 'U', 'Y', 'V') /* [31:0] Cr0:Y1:Cb0:Y0 8:8:8:8 little endian */ #define DRM_FORMAT_YVYUfourcc_code('Y', 'V', 'Y', 'U') /* [31:0] Cb0:Y1:Cr0:Y0 8:8:8:8 little endian */ -- 2.7.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [RFC 1/6] dri: Support 64 bit rgba masks
The dri core api was written with the assumption that all attribute values would fit into 32 bits. This limitation means the config handlers can't accept 64 bpp formats. Reserve 64 bits for rgba masks and add new attributes that allow access to the upper 32 bits. Signed-off-by: Kevin Strasser --- include/GL/internal/dri_interface.h | 6 ++- src/egl/drivers/dri2/egl_dri2.c | 28 +--- src/egl/drivers/dri2/egl_dri2.h | 6 +-- src/egl/drivers/dri2/platform_android.c | 2 +- src/egl/drivers/dri2/platform_drm.c | 67 ++--- src/egl/drivers/dri2/platform_surfaceless.c | 2 +- src/egl/drivers/dri2/platform_wayland.c | 2 +- src/egl/drivers/dri2/platform_x11.c | 6 +-- src/gbm/backends/dri/gbm_driint.h | 8 ++-- src/glx/glxconfig.h | 2 +- src/mesa/drivers/dri/common/utils.c | 16 ++- src/mesa/main/mtypes.h | 2 +- 12 files changed, 108 insertions(+), 39 deletions(-) diff --git a/include/GL/internal/dri_interface.h b/include/GL/internal/dri_interface.h index 072f379..c5761c4 100644 --- a/include/GL/internal/dri_interface.h +++ b/include/GL/internal/dri_interface.h @@ -747,7 +747,11 @@ struct __DRIuseInvalidateExtensionRec { #define __DRI_ATTRIB_YINVERTED 47 #define __DRI_ATTRIB_FRAMEBUFFER_SRGB_CAPABLE 48 #define __DRI_ATTRIB_MUTABLE_RENDER_BUFFER 49 /* EGL_MUTABLE_RENDER_BUFFER_BIT_KHR */ -#define __DRI_ATTRIB_MAX 50 +#define __DRI_ATTRIB_RED_MASK_HI 50 +#define __DRI_ATTRIB_GREEN_MASK_HI 51 +#define __DRI_ATTRIB_BLUE_MASK_HI 52 +#define __DRI_ATTRIB_ALPHA_MASK_HI 53 +#define __DRI_ATTRIB_MAX 54 /* __DRI_ATTRIB_RENDER_TYPE */ #define __DRI_ATTRIB_RGBA_BIT 0x01 diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c index 0be9235..d19950d 100644 --- a/src/egl/drivers/dri2/egl_dri2.c +++ b/src/egl/drivers/dri2/egl_dri2.c @@ -179,7 +179,7 @@ dri2_match_config(const _EGLConfig *conf, const _EGLConfig *criteria) struct dri2_egl_config * dri2_add_config(_EGLDisplay *disp, const __DRIconfig *dri_config, int id, EGLint surface_type, const EGLint *attr_list, -const unsigned int *rgba_masks) +const unsigned long long int *rgba_masks) { struct dri2_egl_config *conf; struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp); @@ -187,7 +187,7 @@ dri2_add_config(_EGLDisplay *disp, const __DRIconfig *dri_config, int id, unsigned int attrib, value, double_buffer; bool srgb = false; EGLint key, bind_to_texture_rgb, bind_to_texture_rgba; - unsigned int dri_masks[4] = { 0, 0, 0, 0 }; + unsigned long long int dri_masks[4] = { 0, 0, 0, 0 }; _EGLConfig *matching_config; EGLint num_configs = 0; EGLint config_id; @@ -234,19 +234,35 @@ dri2_add_config(_EGLDisplay *disp, const __DRIconfig *dri_config, int id, break; case __DRI_ATTRIB_RED_MASK: - dri_masks[0] = value; + dri_masks[0] |= value; + break; + + case __DRI_ATTRIB_RED_MASK_HI: + dri_masks[0] |= (unsigned long long int)value << 32; break; case __DRI_ATTRIB_GREEN_MASK: - dri_masks[1] = value; + dri_masks[1] |= value; + break; + + case __DRI_ATTRIB_GREEN_MASK_HI: + dri_masks[1] |= (unsigned long long int)value << 32; break; case __DRI_ATTRIB_BLUE_MASK: - dri_masks[2] = value; + dri_masks[2] |= value; + break; + + case __DRI_ATTRIB_BLUE_MASK_HI: + dri_masks[2] |= (unsigned long long int)value << 32; break; case __DRI_ATTRIB_ALPHA_MASK: - dri_masks[3] = value; + dri_masks[3] |= value; + break; + + case __DRI_ATTRIB_ALPHA_MASK_HI: + dri_masks[3] |= (unsigned long long int)value << 32; break; case __DRI_ATTRIB_ACCUM_RED_SIZE: diff --git a/src/egl/drivers/dri2/egl_dri2.h b/src/egl/drivers/dri2/egl_dri2.h index 4abe1ba..3916c68 100644 --- a/src/egl/drivers/dri2/egl_dri2.h +++ b/src/egl/drivers/dri2/egl_dri2.h @@ -404,7 +404,7 @@ dri2_lookup_egl_image(__DRIscreen *screen, void *image, void *data); struct dri2_egl_config * dri2_add_config(_EGLDisplay *disp, const __DRIconfig *dri_config, int id, EGLint surface_type, const EGLint *attr_list, -const unsigned int *rgba_masks); +const unsigned long long int *rgba_masks); _EGLImage * dri2_create_image_khr(_EGLDriver *drv, _EGLDisplay *disp, @@ -420,7 +420,7 @@ EGLBoolean dri2_initialize_x11(_EGLDriver *drv, _EGLDisplay *disp); void dri2_teardown_x11(struct dri2_egl_display *dri2_dpy); -unsigned int +unsigned long long int dri2_x11_get_red_mask_for_depth(struct dri2_egl_display *dri2_dpy, int depth); #else static inline
[Mesa-dev] [PATCH] gallium/swr: Fix multi-context sync fence deadlock.
Various recreation scenarios lead to API thread getting stuck in swr_fence_finish(). This is a multi-context issue, whereby one context overwrites the fence read-value with a previous sync's lesser value. The fence sync value is supposed to be always increasing. In swr_fence_cb(), only update the "read" value if the new value is greater. (This may seem like we're not waiting on the other context to finish, but had we needed for it to finish there would have been a wait prior to submitting a new sync.) cc: mesa-sta...@lists.freedesktop.org --- src/gallium/drivers/swr/swr_fence.cpp | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/swr/swr_fence.cpp b/src/gallium/drivers/swr/swr_fence.cpp index b05ac8cec0..074d82a3b4 100644 --- a/src/gallium/drivers/swr/swr_fence.cpp +++ b/src/gallium/drivers/swr/swr_fence.cpp @@ -50,7 +50,9 @@ swr_fence_cb(uint64_t userData, uint64_t userData2, uint64_t userData3) swr_fence_do_work(fence); /* Correct value is in SwrSync data, and not the fence write field. */ - fence->read = userData2; + /* Contexts may not finish in order, but fence value always increases */ + if (fence->read < userData2) + fence->read = userData2; } /* -- 2.17.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] intel/blorp: Be more conservative about copying clear colors
On Fri, Jan 04, 2019 at 01:07:07PM -0600, Jason Ekstrand wrote: > In 92eb5bbc68d7324 we attempted to avoid copying clear colors whenever > we weren't doing a resolve. However, this broke MSAA resolves because > we need the clear color in the source. This patch makes blorp much more > conservative such that it only avoids the clear color copy if either > aux_usage == NONE or it's explicitly doing a fast-clear. Ah, nice! I think I tried to fix this by not setting the surface->clear_color_addr.buffer in some cases, but I think it broke some crucible test. I don't remember the details of how, though. Anyway, this looks better, and I assume it doesn't break anything. Reviewed-by: Rafael Antognolli > Fixes: 92eb5bbc68d7 "intel/blorp: Only copy clear color when doing..." > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107728 > Cc: Rafael Antognolli > --- > src/intel/blorp/blorp_genX_exec.h | 9 ++--- > 1 file changed, 6 insertions(+), 3 deletions(-) > > diff --git a/src/intel/blorp/blorp_genX_exec.h > b/src/intel/blorp/blorp_genX_exec.h > index 9010b03fb67..29afe8ac78b 100644 > --- a/src/intel/blorp/blorp_genX_exec.h > +++ b/src/intel/blorp/blorp_genX_exec.h > @@ -1326,7 +1326,7 @@ blorp_emit_memcpy(struct blorp_batch *batch, > static void > blorp_emit_surface_state(struct blorp_batch *batch, > const struct brw_blorp_surface_info *surface, > - enum isl_aux_op op, > + enum isl_aux_op aux_op, > void *state, uint32_t state_offset, > const bool color_write_disables[4], > bool is_render_target) > @@ -1382,7 +1382,7 @@ blorp_emit_surface_state(struct blorp_batch *batch, >surface->aux_addr, *aux_addr); > } > > - if (surface->clear_color_addr.buffer) { > + if (aux_usage != ISL_AUX_USAGE_NONE && surface->clear_color_addr.buffer) { > #if GEN_GEN >= 10 >assert((surface->clear_color_addr.offset & 0x3f) == 0); >uint32_t *clear_addr = state + isl_dev->ss.clear_color_state_offset; > @@ -1390,7 +1390,10 @@ blorp_emit_surface_state(struct blorp_batch *batch, >isl_dev->ss.clear_color_state_offset, >surface->clear_color_addr, *clear_addr); > #elif GEN_GEN >= 7 > - if (op == ISL_AUX_OP_FULL_RESOLVE || op == ISL_AUX_OP_PARTIAL_RESOLVE) > { > + /* Fast clears just whack the AUX surface and don't actually use the > + * clear color for anything. We can avoid the MI memcpy on that case. > + */ > + if (aux_op != ISL_AUX_OP_FAST_CLEAR) { > struct blorp_address dst_addr = > blorp_get_surface_base_address(batch); > dst_addr.offset += state_offset + isl_dev->ss.clear_value_offset; > blorp_emit_memcpy(batch, dst_addr, surface->clear_color_addr, > -- > 2.20.1 > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] intel/blorp: Be more conservative about copying clear colors
In 92eb5bbc68d7324 we attempted to avoid copying clear colors whenever we weren't doing a resolve. However, this broke MSAA resolves because we need the clear color in the source. This patch makes blorp much more conservative such that it only avoids the clear color copy if either aux_usage == NONE or it's explicitly doing a fast-clear. Fixes: 92eb5bbc68d7 "intel/blorp: Only copy clear color when doing..." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107728 Cc: Rafael Antognolli --- src/intel/blorp/blorp_genX_exec.h | 9 ++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/src/intel/blorp/blorp_genX_exec.h b/src/intel/blorp/blorp_genX_exec.h index 9010b03fb67..29afe8ac78b 100644 --- a/src/intel/blorp/blorp_genX_exec.h +++ b/src/intel/blorp/blorp_genX_exec.h @@ -1326,7 +1326,7 @@ blorp_emit_memcpy(struct blorp_batch *batch, static void blorp_emit_surface_state(struct blorp_batch *batch, const struct brw_blorp_surface_info *surface, - enum isl_aux_op op, + enum isl_aux_op aux_op, void *state, uint32_t state_offset, const bool color_write_disables[4], bool is_render_target) @@ -1382,7 +1382,7 @@ blorp_emit_surface_state(struct blorp_batch *batch, surface->aux_addr, *aux_addr); } - if (surface->clear_color_addr.buffer) { + if (aux_usage != ISL_AUX_USAGE_NONE && surface->clear_color_addr.buffer) { #if GEN_GEN >= 10 assert((surface->clear_color_addr.offset & 0x3f) == 0); uint32_t *clear_addr = state + isl_dev->ss.clear_color_state_offset; @@ -1390,7 +1390,10 @@ blorp_emit_surface_state(struct blorp_batch *batch, isl_dev->ss.clear_color_state_offset, surface->clear_color_addr, *clear_addr); #elif GEN_GEN >= 7 - if (op == ISL_AUX_OP_FULL_RESOLVE || op == ISL_AUX_OP_PARTIAL_RESOLVE) { + /* Fast clears just whack the AUX surface and don't actually use the + * clear color for anything. We can avoid the MI memcpy on that case. + */ + if (aux_op != ISL_AUX_OP_FAST_CLEAR) { struct blorp_address dst_addr = blorp_get_surface_base_address(batch); dst_addr.offset += state_offset + isl_dev->ss.clear_value_offset; blorp_emit_memcpy(batch, dst_addr, surface->clear_color_addr, -- 2.20.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] NIR constant problem for GPU which doesn't have native integer support
On Fri, Jan 4, 2019 at 10:46 AM Jason Ekstrand wrote: > > On Fri, Jan 4, 2019 at 4:07 AM Erik Faye-Lund > wrote: >> >> On Thu, 2019-01-03 at 11:58 -0600, Jason Ekstrand wrote: >> > On Thu, Jan 3, 2019 at 3:39 AM Erik Faye-Lund < >> > erik.faye-l...@collabora.com> wrote: >> > > On Wed, 2019-01-02 at 10:16 -0600, Jason Ekstrand wrote: >> > > > On Wed, Jan 2, 2019 at 9:43 AM Ilia Mirkin >> > > > wrote: >> > > > > Have a look at the first 4 patches in the series from Jonathan >> > > > > Marek >> > > > > to address some of these issues: >> > > > > >> > > > > https://patchwork.freedesktop.org/series/54295/ >> > > > > >> > > > > Not sure exactly what state that work is in, but I've added >> > > > > Jonathan >> > > > > to CC, perhaps he can provide an update. >> > > > > >> > > > > Cheers, >> > > > > >> > > > > -ilia >> > > > > >> > > > > On Wed, Jan 2, 2019 at 6:28 AM Qiang Yu >> > > wrote: >> > > > > > >> > > > > > Hi guys, >> > > > > > >> > > > > > I found the problem with this test fragment shader when lima >> > > > > development: >> > > > > > uniform int color; >> > > > > > void main() { >> > > > > > if (color > 1) >> > > > > > gl_FragColor = vec4(1.0, 0.0, 0.0, 1); >> > > > > > else >> > > > > > gl_FragColor = vec4(0.0, 1.0, 0.0, 1); >> > > > > > } >> > > > > > >> > > > > > nir_print_shader output: >> > > > > > impl main { >> > > > > > block block_0: >> > > > > > /* preds: */ >> > > > > > vec1 32 ssa_0 = load_const (0x0001 /* 0.00 >> > > */) >> > > > > > vec4 32 ssa_1 = load_const (0x3f80 /* 1.00 >> > > */, >> > > > > > 0x /* 0.00 */, 0x /* 0.00 */, >> > > 0x3f80 >> > > > > /* >> > > > > > 1.00 */) >> > > > > > vec4 32 ssa_2 = load_const (0x /* 0.00 >> > > */, >> > > > > > 0x3f80 /* 1.00 */, 0x /* 0.00 */, >> > > 0x3f80 >> > > > > /* >> > > > > > 1.00 */) >> > > > > > vec1 32 ssa_3 = load_const (0x /* 0.00 >> > > */) >> > > > > > vec1 32 ssa_4 = intrinsic load_uniform (ssa_3) (0, 1, >> > > 0) >> > > > > /* >> > > > > > base=0 */ /* range=1 */ /* component=0 */ /* color */ >> > > > > > vec1 32 ssa_5 = slt ssa_0, ssa_4 >> > > > > > vec1 32 ssa_6 = fnot ssa_5 >> > > > > > vec4 32 ssa_7 = bcsel ssa_6., ssa_2, ssa_1 >> > > > > > intrinsic store_output (ssa_7, ssa_3) (0, 15, 0) /* >> > > > > base=0 */ >> > > > > > /* wrmask=xyzw */ /* component=0 */ /* gl_FragColor */ >> > > > > > /* succs: block_1 */ >> > > > > > block block_1: >> > > > > > } >> > > > > > >> > > > > > ssa0 is not converted to float when glsl to nir. I see >> > > > > glsl_to_nir.cpp >> > > > > > will create flt/ilt/ult >> > > > > > based on source type for gpu support native integer, but for >> > > gpu >> > > > > not >> > > > > > support native >> > > > > > integer, just create slt for all source type. And in >> > > > > > nir_lower_constant_initializers, >> > > > > > there's also no type conversion for integer constant. >> > > > >> > > > This is a generally sticky issue. In NIR, we have no concept of >> > > > types on SSA values which has proven perfectly reasonable and >> > > > actually very powerful in a world where integers are supported >> > > > natively. Unfortunately, it causes significant problems for >> > > float- >> > > > only architectures. >> > > >> > > I would like to take this chance to say that this untyped SSA-value >> > > choice has lead to issues in both radeon_si (because LLVM values >> > > are >> > > typed) and zink (similarly, because SPIR-V values are typed), where >> > > we >> > > need to to bitcasts on every access because there's just not enough >> > > information available to emit variables with the right type. >> > >> > I'm not sure if I agree that the two problems are the same or not... >> > More on that in a bit. >> > >> > > It took us a lot of time to realize that the meta-data from the >> > > opcodes >> > > doesn't *really* provide this, because the rest of nir doesn't >> > > treat >> > > values consistently. In fact, this feels arguably more like buggy >> > > behavior; why do we even have fmov when all of the time the >> > > compiler >> > > will emit imovs for floating-point values...? Or why do we have >> > > bitcast >> > >> > Why do we have different mov opcodes? Because they have different >> > behavior in the presence of source/destination modifiers. >> >> Is this general NIR-behavior (i.e will this be honored by constant >> folding etc), or is it Intel specific? If it's NIR-behavior, is it >> documented somewhere? > > > No, constant folding doesn't do modifiers. I had completely forgotten about > this fact until we started this discussion. > >> >> In either case, I have a feeling that this is a mis-design; the >> modifiers apply to the operands, not the opcodes. It sounds a bit like >> imov shouldn't have been an ALU instruction, but perhaps some other, >> new
[Mesa-dev] [Bug 99553] Tracker bug for runnning OpenCL applications on Clover
https://bugs.freedesktop.org/show_bug.cgi?id=99553 Bug 99553 depends on bug 102009, which changed state. Bug 102009 Summary: [clover, amdgcn] Blender crashes when compiling OpenCL kernel https://bugs.freedesktop.org/show_bug.cgi?id=102009 What|Removed |Added Status|RESOLVED|REOPENED Resolution|FIXED |--- -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 102009] [clover, amdgcn] Blender crashes when compiling OpenCL kernel
https://bugs.freedesktop.org/show_bug.cgi?id=102009 Jan Vesely changed: What|Removed |Added Resolution|FIXED |--- Status|RESOLVED|REOPENED --- Comment #10 from Jan Vesely --- (In reply to Markus from comment #9) > After installing rocm 2.0 on Ubuntu 18.10 this is now working without > crashes in Blender 2.79. > > Issue can be considered resolved from my perspective. clover is not part of rocm -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 99553] Tracker bug for runnning OpenCL applications on Clover
https://bugs.freedesktop.org/show_bug.cgi?id=99553 Bug 99553 depends on bug 102009, which changed state. Bug 102009 Summary: [clover, amdgcn] Blender crashes when compiling OpenCL kernel https://bugs.freedesktop.org/show_bug.cgi?id=102009 What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 102009] [clover, amdgcn] Blender crashes when compiling OpenCL kernel
https://bugs.freedesktop.org/show_bug.cgi?id=102009 Markus changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #9 from Markus --- After installing rocm 2.0 on Ubuntu 18.10 this is now working without crashes in Blender 2.79. Issue can be considered resolved from my perspective. -- You are receiving this mail because: You are the assignee for the bug. You are the QA Contact for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] NIR constant problem for GPU which doesn't have native integer support
On Fri, Jan 4, 2019 at 4:07 AM Erik Faye-Lund wrote: > On Thu, 2019-01-03 at 11:58 -0600, Jason Ekstrand wrote: > > On Thu, Jan 3, 2019 at 3:39 AM Erik Faye-Lund < > > erik.faye-l...@collabora.com> wrote: > > > On Wed, 2019-01-02 at 10:16 -0600, Jason Ekstrand wrote: > > > > On Wed, Jan 2, 2019 at 9:43 AM Ilia Mirkin > > > > wrote: > > > > > Have a look at the first 4 patches in the series from Jonathan > > > > > Marek > > > > > to address some of these issues: > > > > > > > > > > https://patchwork.freedesktop.org/series/54295/ > > > > > > > > > > Not sure exactly what state that work is in, but I've added > > > > > Jonathan > > > > > to CC, perhaps he can provide an update. > > > > > > > > > > Cheers, > > > > > > > > > > -ilia > > > > > > > > > > On Wed, Jan 2, 2019 at 6:28 AM Qiang Yu > > > wrote: > > > > > > > > > > > > Hi guys, > > > > > > > > > > > > I found the problem with this test fragment shader when lima > > > > > development: > > > > > > uniform int color; > > > > > > void main() { > > > > > > if (color > 1) > > > > > > gl_FragColor = vec4(1.0, 0.0, 0.0, 1); > > > > > > else > > > > > > gl_FragColor = vec4(0.0, 1.0, 0.0, 1); > > > > > > } > > > > > > > > > > > > nir_print_shader output: > > > > > > impl main { > > > > > > block block_0: > > > > > > /* preds: */ > > > > > > vec1 32 ssa_0 = load_const (0x0001 /* 0.00 > > > */) > > > > > > vec4 32 ssa_1 = load_const (0x3f80 /* 1.00 > > > */, > > > > > > 0x /* 0.00 */, 0x /* 0.00 */, > > > 0x3f80 > > > > > /* > > > > > > 1.00 */) > > > > > > vec4 32 ssa_2 = load_const (0x /* 0.00 > > > */, > > > > > > 0x3f80 /* 1.00 */, 0x /* 0.00 */, > > > 0x3f80 > > > > > /* > > > > > > 1.00 */) > > > > > > vec1 32 ssa_3 = load_const (0x /* 0.00 > > > */) > > > > > > vec1 32 ssa_4 = intrinsic load_uniform (ssa_3) (0, 1, > > > 0) > > > > > /* > > > > > > base=0 */ /* range=1 */ /* component=0 */ /* color */ > > > > > > vec1 32 ssa_5 = slt ssa_0, ssa_4 > > > > > > vec1 32 ssa_6 = fnot ssa_5 > > > > > > vec4 32 ssa_7 = bcsel ssa_6., ssa_2, ssa_1 > > > > > > intrinsic store_output (ssa_7, ssa_3) (0, 15, 0) /* > > > > > base=0 */ > > > > > > /* wrmask=xyzw */ /* component=0 */ /* gl_FragColor */ > > > > > > /* succs: block_1 */ > > > > > > block block_1: > > > > > > } > > > > > > > > > > > > ssa0 is not converted to float when glsl to nir. I see > > > > > glsl_to_nir.cpp > > > > > > will create flt/ilt/ult > > > > > > based on source type for gpu support native integer, but for > > > gpu > > > > > not > > > > > > support native > > > > > > integer, just create slt for all source type. And in > > > > > > nir_lower_constant_initializers, > > > > > > there's also no type conversion for integer constant. > > > > > > > > This is a generally sticky issue. In NIR, we have no concept of > > > > types on SSA values which has proven perfectly reasonable and > > > > actually very powerful in a world where integers are supported > > > > natively. Unfortunately, it causes significant problems for > > > float- > > > > only architectures. > > > > > > I would like to take this chance to say that this untyped SSA-value > > > choice has lead to issues in both radeon_si (because LLVM values > > > are > > > typed) and zink (similarly, because SPIR-V values are typed), where > > > we > > > need to to bitcasts on every access because there's just not enough > > > information available to emit variables with the right type. > > > > I'm not sure if I agree that the two problems are the same or not... > > More on that in a bit. > > > > > It took us a lot of time to realize that the meta-data from the > > > opcodes > > > doesn't *really* provide this, because the rest of nir doesn't > > > treat > > > values consistently. In fact, this feels arguably more like buggy > > > behavior; why do we even have fmov when all of the time the > > > compiler > > > will emit imovs for floating-point values...? Or why do we have > > > bitcast > > > > Why do we have different mov opcodes? Because they have different > > behavior in the presence of source/destination modifiers. > > Is this general NIR-behavior (i.e will this be honored by constant > folding etc), or is it Intel specific? If it's NIR-behavior, is it > documented somewhere? > No, constant folding doesn't do modifiers. I had completely forgotten about this fact until we started this discussion. > In either case, I have a feeling that this is a mis-design; the > modifiers apply to the operands, not the opcodes. It sounds a bit like > imov shouldn't have been an ALU instruction, but perhaps some other, > new type. Or perhaps the concept we have should have been a bit finer > granularity. It's a bit hard to tell without more details, and I can't > seem to find any on my own. > I'm very unclear about
Re: [Mesa-dev] st/va: Return correct status from vlVaQuerySurfaceStatus
Please add some commit messages. With that, the patch is Reviewed-by: Leo Liu On 1/3/19 5:20 AM, Das, Indrajit-kumar wrote: > From: Indrajit Das > Date: Thu, 3 Jan 2019 14:36:33 +0530 > Subject: [PATCH] st/va: Return correct status from vlVaQuerySurfaceStatus > > Signed-off-by: Indrajit Das > --- > src/gallium/state_trackers/va/surface.c | 31 + > 1 file changed, 31 insertions(+) > > diff --git a/src/gallium/state_trackers/va/surface.c > b/src/gallium/state_trackers/va/surface.c > index cc26efe..e7ed64b 100644 > --- a/src/gallium/state_trackers/va/surface.c > +++ b/src/gallium/state_trackers/va/surface.c > @@ -146,9 +146,40 @@ vlVaSyncSurface(VADriverContextP ctx, VASurfaceID > render_target) > VAStatus > vlVaQuerySurfaceStatus(VADriverContextP ctx, VASurfaceID render_target, > VASurfaceStatus *status) > { > + vlVaDriver *drv; > + vlVaSurface *surf; > + vlVaContext *context; > + > if (!ctx) > return VA_STATUS_ERROR_INVALID_CONTEXT; > > + drv = VL_VA_DRIVER(ctx); > + if (!drv) > + return VA_STATUS_ERROR_INVALID_CONTEXT; > + > + mtx_lock(>mutex); > + > + surf = handle_table_get(drv->htab, render_target); > + if (!surf || !surf->buffer) { > + mtx_unlock(>mutex); > + return VA_STATUS_ERROR_INVALID_SURFACE; > + } > + > + context = handle_table_get(drv->htab, surf->ctx); > + if (!context) { > + mtx_unlock(>mutex); > + return VA_STATUS_ERROR_INVALID_CONTEXT; > + } > + > + if (context->decoder->entrypoint == PIPE_VIDEO_ENTRYPOINT_ENCODE) { > + if(surf->feedback == NULL) > + *status=VASurfaceReady; > + else > + *status=VASurfaceRendering; > + } > + > + mtx_unlock(>mutex); > + > return VA_STATUS_SUCCESS; > } > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 107115] Make darktable OpenCL support work on Clover and RadeonSI
https://bugs.freedesktop.org/show_bug.cgi?id=107115 Jan Vesely changed: What|Removed |Added See Also||https://bugs.freedesktop.or ||g/show_bug.cgi?id=109224 -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 99553] Tracker bug for runnning OpenCL applications on Clover
https://bugs.freedesktop.org/show_bug.cgi?id=99553 Jan Vesely changed: What|Removed |Added Depends on||109224 Referenced Bugs: https://bugs.freedesktop.org/show_bug.cgi?id=109224 [Bug 109224] OpenCL in darktable with AMD Turks doesn't works -- You are receiving this mail because: You are the assignee for the bug. You are the QA Contact for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 87738] [OpenCL] Please add Image support
https://bugs.freedesktop.org/show_bug.cgi?id=87738 Jan Vesely changed: What|Removed |Added Blocks||109224 Referenced Bugs: https://bugs.freedesktop.org/show_bug.cgi?id=109224 [Bug 109224] OpenCL in darktable with AMD Turks doesn't works -- You are receiving this mail because: You are the assignee for the bug.___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] NIR constant problem for GPU which doesn't have native integer support
On Thu, 2019-01-03 at 11:58 -0600, Jason Ekstrand wrote: > On Thu, Jan 3, 2019 at 3:39 AM Erik Faye-Lund < > erik.faye-l...@collabora.com> wrote: > > On Wed, 2019-01-02 at 10:16 -0600, Jason Ekstrand wrote: > > > On Wed, Jan 2, 2019 at 9:43 AM Ilia Mirkin > > > wrote: > > > > Have a look at the first 4 patches in the series from Jonathan > > > > Marek > > > > to address some of these issues: > > > > > > > > https://patchwork.freedesktop.org/series/54295/ > > > > > > > > Not sure exactly what state that work is in, but I've added > > > > Jonathan > > > > to CC, perhaps he can provide an update. > > > > > > > > Cheers, > > > > > > > > -ilia > > > > > > > > On Wed, Jan 2, 2019 at 6:28 AM Qiang Yu > > wrote: > > > > > > > > > > Hi guys, > > > > > > > > > > I found the problem with this test fragment shader when lima > > > > development: > > > > > uniform int color; > > > > > void main() { > > > > > if (color > 1) > > > > > gl_FragColor = vec4(1.0, 0.0, 0.0, 1); > > > > > else > > > > > gl_FragColor = vec4(0.0, 1.0, 0.0, 1); > > > > > } > > > > > > > > > > nir_print_shader output: > > > > > impl main { > > > > > block block_0: > > > > > /* preds: */ > > > > > vec1 32 ssa_0 = load_const (0x0001 /* 0.00 > > */) > > > > > vec4 32 ssa_1 = load_const (0x3f80 /* 1.00 > > */, > > > > > 0x /* 0.00 */, 0x /* 0.00 */, > > 0x3f80 > > > > /* > > > > > 1.00 */) > > > > > vec4 32 ssa_2 = load_const (0x /* 0.00 > > */, > > > > > 0x3f80 /* 1.00 */, 0x /* 0.00 */, > > 0x3f80 > > > > /* > > > > > 1.00 */) > > > > > vec1 32 ssa_3 = load_const (0x /* 0.00 > > */) > > > > > vec1 32 ssa_4 = intrinsic load_uniform (ssa_3) (0, 1, > > 0) > > > > /* > > > > > base=0 */ /* range=1 */ /* component=0 */ /* color */ > > > > > vec1 32 ssa_5 = slt ssa_0, ssa_4 > > > > > vec1 32 ssa_6 = fnot ssa_5 > > > > > vec4 32 ssa_7 = bcsel ssa_6., ssa_2, ssa_1 > > > > > intrinsic store_output (ssa_7, ssa_3) (0, 15, 0) /* > > > > base=0 */ > > > > > /* wrmask=xyzw */ /* component=0 */ /* gl_FragColor */ > > > > > /* succs: block_1 */ > > > > > block block_1: > > > > > } > > > > > > > > > > ssa0 is not converted to float when glsl to nir. I see > > > > glsl_to_nir.cpp > > > > > will create flt/ilt/ult > > > > > based on source type for gpu support native integer, but for > > gpu > > > > not > > > > > support native > > > > > integer, just create slt for all source type. And in > > > > > nir_lower_constant_initializers, > > > > > there's also no type conversion for integer constant. > > > > > > This is a generally sticky issue. In NIR, we have no concept of > > > types on SSA values which has proven perfectly reasonable and > > > actually very powerful in a world where integers are supported > > > natively. Unfortunately, it causes significant problems for > > float- > > > only architectures. > > > > I would like to take this chance to say that this untyped SSA-value > > choice has lead to issues in both radeon_si (because LLVM values > > are > > typed) and zink (similarly, because SPIR-V values are typed), where > > we > > need to to bitcasts on every access because there's just not enough > > information available to emit variables with the right type. > > I'm not sure if I agree that the two problems are the same or not... > More on that in a bit. > > > It took us a lot of time to realize that the meta-data from the > > opcodes > > doesn't *really* provide this, because the rest of nir doesn't > > treat > > values consistently. In fact, this feels arguably more like buggy > > behavior; why do we even have fmov when all of the time the > > compiler > > will emit imovs for floating-point values...? Or why do we have > > bitcast > > Why do we have different mov opcodes? Because they have different > behavior in the presence of source/destination modifiers. Is this general NIR-behavior (i.e will this be honored by constant folding etc), or is it Intel specific? If it's NIR-behavior, is it documented somewhere? In either case, I have a feeling that this is a mis-design; the modifiers apply to the operands, not the opcodes. It sounds a bit like imov shouldn't have been an ALU instruction, but perhaps some other, new type. Or perhaps the concept we have should have been a bit finer granularity. It's a bit hard to tell without more details, and I can't seem to find any on my own. Generally, I wish we had some more of the details here written down. Implementing support for NIR for a new architecture is currently quite tricky, mostly because chasing down the details are hard. > You likely don't use those in radeon or Zink but we do use them on > Intel. That being said, I've very seriously considered adding a > generic nir_op_mov which is entirely typeless and doesn't support > modifiers at all and make
Re: [Mesa-dev] [PATCH v2 39/53] intel/compiler: add a helper to do conversions between integer and half-float
On Fri, 2019-01-04 at 02:02 -0800, Francisco Jerez wrote: > Iago Toral writes: > > > On Wed, 2019-01-02 at 15:00 -0800, Francisco Jerez wrote: > > > Iago Toral Quiroga writes: > > > > > > > There are hardware restrictions to consider that seem to affect > > > > atom platforms > > > > only. > > > > > > Same comment here as for PATCH 13 of this series. This and PATCH > > > 40 > > > shouldn't be necessary anymore with [1] in place. Please drop > > > them. > > > > Actually, I think your pass doesn't handle this case. I have just > > tested this and I get various regressions for conversions between > > integers (specifically integers lower than 32-bit, so I wonder if > > this > > restriction only affects these cases) and half-float. > > > > Here is an example of final IR generated with your pass and without > > the > > call to fixup_int_half_float_conversion from my series: > > > > mov(8) vgrf1+0.0:HF, vgrf0<2>:W > > > > Which should be: > > > > mov(8) vgrf1<2>:HF, vgrf0<2>:W > > > > It seems your pass doesn't act on this code since > > INTEL_DEBUG=optimizer > > doesn't show any trace of it. > > > > If this is not a bug in your pass and just that it doesn't handle > > this > > case, I am happy to add the support for it as part of my series if > > that > > makes sense to you, just let me know if that is the case. > > > > It's not really a bug, you just need to add a case to > has_dst_aligned_region_restriction() for it to return true for FP16 > instructions with this restriction, sorry I didn't point that out > before. Ok, I'll do that as part of my series then, thanks! Iago > > Iago > > > > > [1] > > > https://lists.freedesktop.org/archives/mesa-dev/2018-December/212775.html > > > > > > > --- > > > > src/intel/compiler/brw_fs_nir.cpp | 32 > > > > +++ > > > > 1 file changed, 32 insertions(+) > > > > > > > > diff --git a/src/intel/compiler/brw_fs_nir.cpp > > > > b/src/intel/compiler/brw_fs_nir.cpp > > > > index 802f5cb0944..a9fd98bab68 100644 > > > > --- a/src/intel/compiler/brw_fs_nir.cpp > > > > +++ b/src/intel/compiler/brw_fs_nir.cpp > > > > @@ -696,6 +696,38 @@ fixup_64bit_conversion(const fs_builder > > > > , > > > > return false; > > > > } > > > > > > > > +static bool > > > > +fixup_int_half_float_conversion(const fs_builder , > > > > +fs_reg dst, fs_reg src, > > > > +bool saturate, > > > > +const struct gen_device_info > > > > *devinfo) > > > > +{ > > > > + /* CHV PRM, 3D Media GPGPU Engine, Register Region > > > > Restrictions, > > > > +* Special Restrictions: > > > > +* > > > > +*"Conversion between Integer and HF (Half Float) must > > > > be > > > > DWord > > > > +* aligned and strided by a DWord on the destination." > > > > +* > > > > +* The same restriction is listed for other hardware > > > > platforms, > > > > however, > > > > +* empirical testing suggests that only atom platforms are > > > > affected. > > > > +*/ > > > > + if (!devinfo->is_cherryview && > > > > !gen_device_info_is_9lp(devinfo)) > > > > + return false; > > > > + > > > > + if (!((dst.type == BRW_REGISTER_TYPE_HF && > > > > !brw_reg_type_is_floating_point(src.type)) || > > > > + (src.type == BRW_REGISTER_TYPE_HF && > > > > !brw_reg_type_is_floating_point(dst.type > > > > + return false; > > > > + > > > > + fs_reg tmp = > > > > horiz_stride(retype(bld.vgrf(BRW_REGISTER_TYPE_F, > > > > 1), > > > > +dst.type), > > > > + 2); > > > > + bld.MOV(tmp, src); > > > > + fs_inst *inst = bld.MOV(dst, tmp); > > > > + inst->saturate = saturate; > > > > + > > > > + return true; > > > > +} > > > > + > > > > void > > > > fs_visitor::nir_emit_alu(const fs_builder , nir_alu_instr > > > > *instr) > > > > { > > > > -- > > > > 2.17.1 > > > > > > > > ___ > > > > mesa-dev mailing list > > > > mesa-dev@lists.freedesktop.org > > > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 39/53] intel/compiler: add a helper to do conversions between integer and half-float
Iago Toral writes: > On Wed, 2019-01-02 at 15:00 -0800, Francisco Jerez wrote: >> Iago Toral Quiroga writes: >> >> > There are hardware restrictions to consider that seem to affect >> > atom platforms >> > only. >> >> Same comment here as for PATCH 13 of this series. This and PATCH 40 >> shouldn't be necessary anymore with [1] in place. Please drop them. > > Actually, I think your pass doesn't handle this case. I have just > tested this and I get various regressions for conversions between > integers (specifically integers lower than 32-bit, so I wonder if this > restriction only affects these cases) and half-float. > > Here is an example of final IR generated with your pass and without the > call to fixup_int_half_float_conversion from my series: > > mov(8) vgrf1+0.0:HF, vgrf0<2>:W > > Which should be: > > mov(8) vgrf1<2>:HF, vgrf0<2>:W > > It seems your pass doesn't act on this code since INTEL_DEBUG=optimizer > doesn't show any trace of it. > > If this is not a bug in your pass and just that it doesn't handle this > case, I am happy to add the support for it as part of my series if that > makes sense to you, just let me know if that is the case. > It's not really a bug, you just need to add a case to has_dst_aligned_region_restriction() for it to return true for FP16 instructions with this restriction, sorry I didn't point that out before. > Iago > >> [1] >> https://lists.freedesktop.org/archives/mesa-dev/2018-December/212775.html >> >> > --- >> > src/intel/compiler/brw_fs_nir.cpp | 32 >> > +++ >> > 1 file changed, 32 insertions(+) >> > >> > diff --git a/src/intel/compiler/brw_fs_nir.cpp >> > b/src/intel/compiler/brw_fs_nir.cpp >> > index 802f5cb0944..a9fd98bab68 100644 >> > --- a/src/intel/compiler/brw_fs_nir.cpp >> > +++ b/src/intel/compiler/brw_fs_nir.cpp >> > @@ -696,6 +696,38 @@ fixup_64bit_conversion(const fs_builder , >> > return false; >> > } >> > >> > +static bool >> > +fixup_int_half_float_conversion(const fs_builder , >> > +fs_reg dst, fs_reg src, >> > +bool saturate, >> > +const struct gen_device_info >> > *devinfo) >> > +{ >> > + /* CHV PRM, 3D Media GPGPU Engine, Register Region >> > Restrictions, >> > +* Special Restrictions: >> > +* >> > +*"Conversion between Integer and HF (Half Float) must be >> > DWord >> > +* aligned and strided by a DWord on the destination." >> > +* >> > +* The same restriction is listed for other hardware platforms, >> > however, >> > +* empirical testing suggests that only atom platforms are >> > affected. >> > +*/ >> > + if (!devinfo->is_cherryview && >> > !gen_device_info_is_9lp(devinfo)) >> > + return false; >> > + >> > + if (!((dst.type == BRW_REGISTER_TYPE_HF && >> > !brw_reg_type_is_floating_point(src.type)) || >> > + (src.type == BRW_REGISTER_TYPE_HF && >> > !brw_reg_type_is_floating_point(dst.type >> > + return false; >> > + >> > + fs_reg tmp = horiz_stride(retype(bld.vgrf(BRW_REGISTER_TYPE_F, >> > 1), >> > +dst.type), >> > + 2); >> > + bld.MOV(tmp, src); >> > + fs_inst *inst = bld.MOV(dst, tmp); >> > + inst->saturate = saturate; >> > + >> > + return true; >> > +} >> > + >> > void >> > fs_visitor::nir_emit_alu(const fs_builder , nir_alu_instr >> > *instr) >> > { >> > -- >> > 2.17.1 >> > >> > ___ >> > mesa-dev mailing list >> > mesa-dev@lists.freedesktop.org >> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 07/10] intel/fs: Introduce regioning lowering pass.
Iago Toral writes: > On Sat, 2018-12-29 at 12:39 -0800, Francisco Jerez wrote: >> This legalization pass is meant to handle situations where the source >> or destination regioning controls of an instruction are unsupported >> by >> the hardware and need to be lowered away into separate instructions. >> This should be more reliable and future-proof than the current >> approach of handling CHV/BXT restrictions manually all over the >> visitor. The same mechanism is leveraged to lower unsupported type >> conversions easily, which obsoletes the lower_conversions pass. >> --- >> src/intel/Makefile.sources| 1 + >> src/intel/compiler/brw_fs.cpp | 5 +- >> src/intel/compiler/brw_fs.h | 21 +- >> src/intel/compiler/brw_fs_lower_regioning.cpp | 382 >> ++ >> src/intel/compiler/brw_ir_fs.h| 10 + >> src/intel/compiler/meson.build| 1 + >> 6 files changed, 401 insertions(+), 19 deletions(-) >> create mode 100644 src/intel/compiler/brw_fs_lower_regioning.cpp >> >> diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources >> index 5e7d32293b7..6b9874d2b80 100644 >> --- a/src/intel/Makefile.sources >> +++ b/src/intel/Makefile.sources >> @@ -64,6 +64,7 @@ COMPILER_FILES = \ >> compiler/brw_fs_live_variables.h \ >> compiler/brw_fs_lower_conversions.cpp \ >> compiler/brw_fs_lower_pack.cpp \ >> +compiler/brw_fs_lower_regioning.cpp \ >> compiler/brw_fs_nir.cpp \ >> compiler/brw_fs_reg_allocate.cpp \ >> compiler/brw_fs_register_coalesce.cpp \ >> diff --git a/src/intel/compiler/brw_fs.cpp >> b/src/intel/compiler/brw_fs.cpp >> index 889509badab..caa7a798332 100644 >> --- a/src/intel/compiler/brw_fs.cpp >> +++ b/src/intel/compiler/brw_fs.cpp >> @@ -6471,7 +6471,10 @@ fs_visitor::optimize() >>OPT(dead_code_eliminate); >> } >> >> - if (OPT(lower_conversions)) { >> + progress = false; >> + OPT(lower_conversions); >> + OPT(lower_regioning); >> + if (progress) { >>OPT(opt_copy_propagation); >>OPT(dead_code_eliminate); >>OPT(lower_simd_width); >> diff --git a/src/intel/compiler/brw_fs.h >> b/src/intel/compiler/brw_fs.h >> index dc36ecc21ac..36825754931 100644 >> --- a/src/intel/compiler/brw_fs.h >> +++ b/src/intel/compiler/brw_fs.h >> @@ -164,6 +164,7 @@ public: >> void lower_uniform_pull_constant_loads(); >> bool lower_load_payload(); >> bool lower_pack(); >> + bool lower_regioning(); >> bool lower_conversions(); >> bool lower_logical_sends(); >> bool lower_integer_multiplication(); >> @@ -536,24 +537,8 @@ namespace brw { >>} >> } >> >> - /** >> -* Remove any modifiers from the \p i-th source region of the >> instruction, >> -* including negate, abs and any implicit type conversion to the >> execution >> -* type. Instead any source modifiers will be implemented as a >> separate >> -* MOV instruction prior to the original instruction. >> -*/ >> - inline bool >> - lower_src_modifiers(fs_visitor *v, bblock_t *block, fs_inst >> *inst, unsigned i) >> - { >> - assert(inst->components_read(i) == 1); >> - const fs_builder ibld(v, block, inst); >> - const fs_reg tmp = ibld.vgrf(get_exec_type(inst)); >> - >> - ibld.MOV(tmp, inst->src[i]); >> - inst->src[i] = tmp; >> - >> - return true; >> - } >> + bool >> + lower_src_modifiers(fs_visitor *v, bblock_t *block, fs_inst >> *inst, unsigned i); >> } >> >> void shuffle_from_32bit_read(const brw::fs_builder , >> diff --git a/src/intel/compiler/brw_fs_lower_regioning.cpp >> b/src/intel/compiler/brw_fs_lower_regioning.cpp >> new file mode 100644 >> index 000..9578622401d >> --- /dev/null >> +++ b/src/intel/compiler/brw_fs_lower_regioning.cpp >> @@ -0,0 +1,382 @@ >> +/* >> + * Copyright © 2018 Intel Corporation >> + * >> + * Permission is hereby granted, free of charge, to any person >> obtaining a >> + * copy of this software and associated documentation files (the >> "Software"), >> + * to deal in the Software without restriction, including without >> limitation >> + * the rights to use, copy, modify, merge, publish, distribute, >> sublicense, >> + * and/or sell copies of the Software, and to permit persons to whom >> the >> + * Software is furnished to do so, subject to the following >> conditions: >> + * >> + * The above copyright notice and this permission notice (including >> the next >> + * paragraph) shall be included in all copies or substantial >> portions of the >> + * Software. >> + * >> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, >> EXPRESS OR >> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF >> MERCHANTABILITY, >> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO >> EVENT SHALL >> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES >> OR OTHER >> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT
Re: [Mesa-dev] [PATCH 07/10] intel/fs: Introduce regioning lowering pass.
On Sat, 2018-12-29 at 12:39 -0800, Francisco Jerez wrote: > This legalization pass is meant to handle situations where the source > or destination regioning controls of an instruction are unsupported > by > the hardware and need to be lowered away into separate instructions. > This should be more reliable and future-proof than the current > approach of handling CHV/BXT restrictions manually all over the > visitor. The same mechanism is leveraged to lower unsupported type > conversions easily, which obsoletes the lower_conversions pass. > --- > src/intel/Makefile.sources| 1 + > src/intel/compiler/brw_fs.cpp | 5 +- > src/intel/compiler/brw_fs.h | 21 +- > src/intel/compiler/brw_fs_lower_regioning.cpp | 382 > ++ > src/intel/compiler/brw_ir_fs.h| 10 + > src/intel/compiler/meson.build| 1 + > 6 files changed, 401 insertions(+), 19 deletions(-) > create mode 100644 src/intel/compiler/brw_fs_lower_regioning.cpp > > diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources > index 5e7d32293b7..6b9874d2b80 100644 > --- a/src/intel/Makefile.sources > +++ b/src/intel/Makefile.sources > @@ -64,6 +64,7 @@ COMPILER_FILES = \ > compiler/brw_fs_live_variables.h \ > compiler/brw_fs_lower_conversions.cpp \ > compiler/brw_fs_lower_pack.cpp \ > + compiler/brw_fs_lower_regioning.cpp \ > compiler/brw_fs_nir.cpp \ > compiler/brw_fs_reg_allocate.cpp \ > compiler/brw_fs_register_coalesce.cpp \ > diff --git a/src/intel/compiler/brw_fs.cpp > b/src/intel/compiler/brw_fs.cpp > index 889509badab..caa7a798332 100644 > --- a/src/intel/compiler/brw_fs.cpp > +++ b/src/intel/compiler/brw_fs.cpp > @@ -6471,7 +6471,10 @@ fs_visitor::optimize() >OPT(dead_code_eliminate); > } > > - if (OPT(lower_conversions)) { > + progress = false; > + OPT(lower_conversions); > + OPT(lower_regioning); > + if (progress) { >OPT(opt_copy_propagation); >OPT(dead_code_eliminate); >OPT(lower_simd_width); > diff --git a/src/intel/compiler/brw_fs.h > b/src/intel/compiler/brw_fs.h > index dc36ecc21ac..36825754931 100644 > --- a/src/intel/compiler/brw_fs.h > +++ b/src/intel/compiler/brw_fs.h > @@ -164,6 +164,7 @@ public: > void lower_uniform_pull_constant_loads(); > bool lower_load_payload(); > bool lower_pack(); > + bool lower_regioning(); > bool lower_conversions(); > bool lower_logical_sends(); > bool lower_integer_multiplication(); > @@ -536,24 +537,8 @@ namespace brw { >} > } > > - /** > -* Remove any modifiers from the \p i-th source region of the > instruction, > -* including negate, abs and any implicit type conversion to the > execution > -* type. Instead any source modifiers will be implemented as a > separate > -* MOV instruction prior to the original instruction. > -*/ > - inline bool > - lower_src_modifiers(fs_visitor *v, bblock_t *block, fs_inst > *inst, unsigned i) > - { > - assert(inst->components_read(i) == 1); > - const fs_builder ibld(v, block, inst); > - const fs_reg tmp = ibld.vgrf(get_exec_type(inst)); > - > - ibld.MOV(tmp, inst->src[i]); > - inst->src[i] = tmp; > - > - return true; > - } > + bool > + lower_src_modifiers(fs_visitor *v, bblock_t *block, fs_inst > *inst, unsigned i); > } > > void shuffle_from_32bit_read(const brw::fs_builder , > diff --git a/src/intel/compiler/brw_fs_lower_regioning.cpp > b/src/intel/compiler/brw_fs_lower_regioning.cpp > new file mode 100644 > index 000..9578622401d > --- /dev/null > +++ b/src/intel/compiler/brw_fs_lower_regioning.cpp > @@ -0,0 +1,382 @@ > +/* > + * Copyright © 2018 Intel Corporation > + * > + * Permission is hereby granted, free of charge, to any person > obtaining a > + * copy of this software and associated documentation files (the > "Software"), > + * to deal in the Software without restriction, including without > limitation > + * the rights to use, copy, modify, merge, publish, distribute, > sublicense, > + * and/or sell copies of the Software, and to permit persons to whom > the > + * Software is furnished to do so, subject to the following > conditions: > + * > + * The above copyright notice and this permission notice (including > the next > + * paragraph) shall be included in all copies or substantial > portions of the > + * Software. > + * > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, > EXPRESS OR > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF > MERCHANTABILITY, > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO > EVENT SHALL > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES > OR OTHER > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, > ARISING > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR > OTHER DEALINGS > + * IN THE SOFTWARE. > + */ > + >
Re: [Mesa-dev] [PATCH v2 39/53] intel/compiler: add a helper to do conversions between integer and half-float
On Wed, 2019-01-02 at 15:00 -0800, Francisco Jerez wrote: > Iago Toral Quiroga writes: > > > There are hardware restrictions to consider that seem to affect > > atom platforms > > only. > > Same comment here as for PATCH 13 of this series. This and PATCH 40 > shouldn't be necessary anymore with [1] in place. Please drop them. Actually, I think your pass doesn't handle this case. I have just tested this and I get various regressions for conversions between integers (specifically integers lower than 32-bit, so I wonder if this restriction only affects these cases) and half-float. Here is an example of final IR generated with your pass and without the call to fixup_int_half_float_conversion from my series: mov(8) vgrf1+0.0:HF, vgrf0<2>:W Which should be: mov(8) vgrf1<2>:HF, vgrf0<2>:W It seems your pass doesn't act on this code since INTEL_DEBUG=optimizer doesn't show any trace of it. If this is not a bug in your pass and just that it doesn't handle this case, I am happy to add the support for it as part of my series if that makes sense to you, just let me know if that is the case. Iago > [1] > https://lists.freedesktop.org/archives/mesa-dev/2018-December/212775.html > > > --- > > src/intel/compiler/brw_fs_nir.cpp | 32 > > +++ > > 1 file changed, 32 insertions(+) > > > > diff --git a/src/intel/compiler/brw_fs_nir.cpp > > b/src/intel/compiler/brw_fs_nir.cpp > > index 802f5cb0944..a9fd98bab68 100644 > > --- a/src/intel/compiler/brw_fs_nir.cpp > > +++ b/src/intel/compiler/brw_fs_nir.cpp > > @@ -696,6 +696,38 @@ fixup_64bit_conversion(const fs_builder , > > return false; > > } > > > > +static bool > > +fixup_int_half_float_conversion(const fs_builder , > > +fs_reg dst, fs_reg src, > > +bool saturate, > > +const struct gen_device_info > > *devinfo) > > +{ > > + /* CHV PRM, 3D Media GPGPU Engine, Register Region > > Restrictions, > > +* Special Restrictions: > > +* > > +*"Conversion between Integer and HF (Half Float) must be > > DWord > > +* aligned and strided by a DWord on the destination." > > +* > > +* The same restriction is listed for other hardware platforms, > > however, > > +* empirical testing suggests that only atom platforms are > > affected. > > +*/ > > + if (!devinfo->is_cherryview && > > !gen_device_info_is_9lp(devinfo)) > > + return false; > > + > > + if (!((dst.type == BRW_REGISTER_TYPE_HF && > > !brw_reg_type_is_floating_point(src.type)) || > > + (src.type == BRW_REGISTER_TYPE_HF && > > !brw_reg_type_is_floating_point(dst.type > > + return false; > > + > > + fs_reg tmp = horiz_stride(retype(bld.vgrf(BRW_REGISTER_TYPE_F, > > 1), > > +dst.type), > > + 2); > > + bld.MOV(tmp, src); > > + fs_inst *inst = bld.MOV(dst, tmp); > > + inst->saturate = saturate; > > + > > + return true; > > +} > > + > > void > > fs_visitor::nir_emit_alu(const fs_builder , nir_alu_instr > > *instr) > > { > > -- > > 2.17.1 > > > > ___ > > mesa-dev mailing list > > mesa-dev@lists.freedesktop.org > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev