[Mesa-dev] [Bug 102530] [bisected] Kodi crashes when launching a stream - commit bd2662bf
https://bugs.freedesktop.org/show_bug.cgi?id=102530

--- Comment #7 from Erik Faye-Lund ---
Timothy Arceri: In either case, it looks like the handling of location = -1 is incorrect in the no-error case. I think we should ignore the command in this case...

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
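The behaviour suggested in this comment could look roughly like the sketch below. It is not Mesa's code: `struct sketch_program`, `sketch_set_uniform_1i` and the fixed-size storage are hypothetical names made up for illustration. The only fact relied on is that OpenGL defines a Uniform* call with location -1 as silently ignored, so the check has to come before any indexing, with or without error checking.

```c
#include <stdbool.h>

/* Hypothetical, simplified program object: one int slot per uniform. */
#define SKETCH_MAX_UNIFORMS 8

struct sketch_program {
   int values[SKETCH_MAX_UNIFORMS];
   int num_uniforms;
};

/* Returns true if a value was stored, false if the call was ignored. */
bool
sketch_set_uniform_1i(struct sketch_program *prog, int location, int value,
                      bool no_error)
{
   /* Location -1 is legal input and must be ignored on both paths,
    * before the location is ever used as an index. */
   if (location == -1)
      return false;

   if (!no_error) {
      /* Error-checking path: reject out-of-range locations.
       * (A real implementation would record a GL error here.) */
      if (location < 0 || location >= prog->num_uniforms)
         return false;
   }

   /* No-error path: the application promises valid input, so no
    * further validation is done. */
   prog->values[location] = value;
   return true;
}
```

With this shape, a no-error context skips validation but still never indexes storage with -1.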
Re: [Mesa-dev] [PATCH 1/2] loader/dri3: Use client local back to front blit in copySubBuffer if available
On 04/09/2017 14:27, Thomas Hellstrom wrote:
> The copySubBuffer functionality always attempted a server side blit from
> back to fake front if a fake front was present, and we weren't displaying
> on a remote GPU.
>
> Now that we always have local blit capability on modern drivers, first
> attempt a local blit, and only if that fails, try the server blit.
>
> Signed-off-by: Thomas Hellstrom
> ---
>  src/loader/loader_dri3_helper.c | 16 +++-
>  1 file changed, 7 insertions(+), 9 deletions(-)
>
> diff --git a/src/loader/loader_dri3_helper.c b/src/loader/loader_dri3_helper.c
> index e3120f5..c0a6e0d 100644
> --- a/src/loader/loader_dri3_helper.c
> +++ b/src/loader/loader_dri3_helper.c
> @@ -635,14 +635,6 @@ loader_dri3_copy_sub_buffer(struct loader_dri3_drawable *draw,
>                                      back->image,
>                                      0, 0, back->width, back->height,
>                                      0, 0, __BLIT_FLAG_FLUSH);
> -      /* We use blit_image to update our fake front,
> -       */
> -      if (draw->have_fake_front)
> -         (void) loader_dri3_blit_image(draw,
> -                                       dri3_fake_front_buffer(draw)->image,
> -                                       back->image,
> -                                       x, y, width, height,
> -                                       x, y, __BLIT_FLAG_FLUSH);
>     }

Why remove that part? It's a bit easier to read when the is_different_gpu path is separated from the normal path here.

>     loader_dri3_swapbuffer_barrier(draw);
> @@ -656,7 +648,13 @@ loader_dri3_copy_sub_buffer(struct loader_dri3_drawable *draw,
>     /* Refresh the fake front (if present) after we just damaged the real
>      * front.
>      */
> -   if (draw->have_fake_front && !draw->is_different_gpu) {
> +   if (draw->have_fake_front &&
> +       !loader_dri3_blit_image(draw,
> +                               dri3_fake_front_buffer(draw)->image,
> +                               back->image,
> +                               x, y, width, height,
> +                               x, y, __BLIT_FLAG_FLUSH) &&
> +       !draw->is_different_gpu) {
>        dri3_fence_reset(draw->conn, dri3_fake_front_buffer(draw));
>        dri3_copy_area(draw->conn,
>                       back->pixmap,

In the case of is_different_gpu, reverting to a server copy if loader_dri3_blit_image fails doesn't seem a good choice, because what the server sees is the linear copy of the buffers. You'd have to add a blit from the linear copy to the tiled buffer (which won't work if local blit is not available).

Yours,

Axel Davy
[Mesa-dev] [PATCH 10/10] egl/dri2: Add Wayland+EGL support for RGB10 winsys buffers.
Successfully tested under Weston 3.0, both with the new (experimental) dmabuf+modifiers path, and the old buffer import path. Photometer confirms 10 rgb bits from rendering to display. Signed-off-by: Mario Kleiner--- src/egl/drivers/dri2/egl_dri2.c | 3 ++ src/egl/drivers/dri2/egl_dri2.h | 2 + src/egl/drivers/dri2/platform_wayland.c | 65 --- src/egl/wayland/wayland-drm/wayland-drm.c | 6 +++ 4 files changed, 70 insertions(+), 6 deletions(-) diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c index 2667aa5..b044716 100644 --- a/src/egl/drivers/dri2/egl_dri2.c +++ b/src/egl/drivers/dri2/egl_dri2.c @@ -986,6 +986,9 @@ dri2_display_destroy(_EGLDisplay *disp) wl_event_queue_destroy(dri2_dpy->wl_queue); if (dri2_dpy->wl_dpy_wrapper) wl_proxy_wrapper_destroy(dri2_dpy->wl_dpy_wrapper); + + u_vector_finish(_dpy->wl_modifiers.argb2101010); + u_vector_finish(_dpy->wl_modifiers.xrgb2101010); u_vector_finish(_dpy->wl_modifiers.argb); u_vector_finish(_dpy->wl_modifiers.xrgb); u_vector_finish(_dpy->wl_modifiers.rgb565); diff --git a/src/egl/drivers/dri2/egl_dri2.h b/src/egl/drivers/dri2/egl_dri2.h index 4a52b49..2353a0f 100644 --- a/src/egl/drivers/dri2/egl_dri2.h +++ b/src/egl/drivers/dri2/egl_dri2.h @@ -220,6 +220,8 @@ struct dri2_egl_display struct wl_event_queue*wl_queue; struct zwp_linux_dmabuf_v1 *wl_dmabuf; struct { + struct u_vectorxrgb2101010; + struct u_vectorargb2101010; struct u_vectorxrgb; struct u_vectorargb; struct u_vectorrgb565; diff --git a/src/egl/drivers/dri2/platform_wayland.c b/src/egl/drivers/dri2/platform_wayland.c index bf2adbf..3d46723 100644 --- a/src/egl/drivers/dri2/platform_wayland.c +++ b/src/egl/drivers/dri2/platform_wayland.c @@ -61,6 +61,8 @@ enum wl_drm_format_flags { HAS_ARGB = 1, HAS_XRGB = 2, HAS_RGB565 = 4, + HAS_ARGB2101010 = 8, + HAS_XRGB2101010 = 16, }; static int @@ -148,18 +150,26 @@ dri2_wl_create_window_surface(_EGLDriver *drv, _EGLDisplay *disp, if (dri2_dpy->wl_dmabuf || dri2_dpy->wl_drm) { if (conf->RedSize 
== 5) dri2_surf->format = WL_DRM_FORMAT_RGB565; - else if (conf->AlphaSize == 0) + else if (conf->RedSize == 8 && conf->AlphaSize == 0) dri2_surf->format = WL_DRM_FORMAT_XRGB; - else + else if (conf->RedSize == 8) dri2_surf->format = WL_DRM_FORMAT_ARGB; + else if (conf->RedSize == 10 && conf->AlphaSize == 0) + dri2_surf->format = WL_DRM_FORMAT_XRGB2101010; + else if (conf->RedSize == 10) + dri2_surf->format = WL_DRM_FORMAT_ARGB2101010; } else { assert(dri2_dpy->wl_shm); if (conf->RedSize == 5) dri2_surf->format = WL_SHM_FORMAT_RGB565; - else if (conf->AlphaSize == 0) + else if (conf->RedSize == 8 && conf->AlphaSize == 0) dri2_surf->format = WL_SHM_FORMAT_XRGB; - else + else if (conf->RedSize == 8) dri2_surf->format = WL_SHM_FORMAT_ARGB; + else if (conf->RedSize == 10 && conf->AlphaSize == 0) + dri2_surf->format = WL_SHM_FORMAT_XRGB2101010; + else if (conf->RedSize == 10) + dri2_surf->format = WL_SHM_FORMAT_ARGB2101010; } dri2_surf->wl_queue = wl_display_create_queue(dri2_dpy->wl_dpy); @@ -339,8 +349,9 @@ get_back_bo(struct dri2_egl_surface *dri2_surf) uint64_t *modifiers; int num_modifiers; - /* currently supports three WL DRM formats, + /* currently supports five WL DRM formats, * WL_DRM_FORMAT_ARGB, WL_DRM_FORMAT_XRGB, +* WL_DRM_FORMAT_ARGB2101010, WL_DRM_FORMAT_XRGB2101010, * and WL_DRM_FORMAT_RGB565 */ switch (dri2_surf->format) { @@ -359,6 +370,16 @@ get_back_bo(struct dri2_egl_surface *dri2_surf) modifiers = u_vector_tail(_dpy->wl_modifiers.rgb565); num_modifiers = u_vector_length(_dpy->wl_modifiers.rgb565); break; + case WL_DRM_FORMAT_ARGB2101010: + dri_image_format = __DRI_IMAGE_FORMAT_ARGB2101010; + modifiers = u_vector_tail(_dpy->wl_modifiers.argb2101010); + num_modifiers = u_vector_length(_dpy->wl_modifiers.argb2101010); + break; + case WL_DRM_FORMAT_XRGB2101010: + dri_image_format = __DRI_IMAGE_FORMAT_XRGB2101010; + modifiers = u_vector_tail(_dpy->wl_modifiers.xrgb2101010); + num_modifiers = u_vector_length(_dpy->wl_modifiers.xrgb2101010); + break; 
default: /* format is not supported */ return -1; @@ -582,6 +603,8 @@ dri2_wl_get_buffers(__DRIdrawable * driDrawable, switch (dri2_surf->format) { case WL_DRM_FORMAT_ARGB: case WL_DRM_FORMAT_XRGB: + case WL_DRM_FORMAT_ARGB2101010: + case WL_DRM_FORMAT_XRGB2101010: bpp = 32; break; case WL_DRM_FORMAT_RGB565: @@ -921,6 +944,14 @@
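The format selection that this patch extends in dri2_wl_create_window_surface() can be summarised as a small decision function. The sketch below is only an illustration: the enum and function names are hypothetical stand-ins, not the WL_DRM_FORMAT_* or WL_SHM_FORMAT_* codes themselves. It mirrors the branches visible in the hunk: a red size of 5, 8 or 10 bits picks the format family, and an alpha size of zero picks the X variant.

```c
/* Hypothetical stand-ins for the Wayland buffer format codes. */
enum sketch_format {
   SKETCH_FORMAT_NONE = 0,
   SKETCH_FORMAT_RGB565,
   SKETCH_FORMAT_XRGB8888,
   SKETCH_FORMAT_ARGB8888,
   SKETCH_FORMAT_XRGB2101010,
   SKETCH_FORMAT_ARGB2101010
};

/* Choose a buffer format from the EGLConfig's red and alpha sizes. */
enum sketch_format
sketch_pick_format(int red_size, int alpha_size)
{
   if (red_size == 5)
      return SKETCH_FORMAT_RGB565;
   if (red_size == 8)
      return alpha_size == 0 ? SKETCH_FORMAT_XRGB8888
                             : SKETCH_FORMAT_ARGB8888;
   if (red_size == 10)
      return alpha_size == 0 ? SKETCH_FORMAT_XRGB2101010
                             : SKETCH_FORMAT_ARGB2101010;

   /* Any other red size has no matching format in this sketch. */
   return SKETCH_FORMAT_NONE;
}
```

Checking the red size explicitly matters once more than one 32-bit layout exists: the older "alpha size is zero, else ARGB" test would have mapped 10 bpc configs onto the 8 bpc formats.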
[Mesa-dev] [PATCH 07/10] i965/screen: Honor 'expose_rgb10_configs' option.
Allows to prevent exposing RGB10 configs and visuals to clients. Signed-off-by: Mario Kleiner--- src/mesa/drivers/dri/i965/intel_screen.c | 19 +++ 1 file changed, 19 insertions(+) diff --git a/src/mesa/drivers/dri/i965/intel_screen.c b/src/mesa/drivers/dri/i965/intel_screen.c index c0fdde3..5754ea7 100644 --- a/src/mesa/drivers/dri/i965/intel_screen.c +++ b/src/mesa/drivers/dri/i965/intel_screen.c @@ -1904,11 +1904,20 @@ intel_screen_make_configs(__DRIscreen *dri_screen) uint8_t depth_bits[4], stencil_bits[4]; __DRIconfig **configs = NULL; + /* Shall we expose 10 bpc formats? */ + bool expose_rgb10_configs = driQueryOptionb(_screen->optionCache, + "expose_rgb10_configs"); + /* Generate singlesample configs without accumulation buffer. */ for (unsigned i = 0; i < ARRAY_SIZE(formats); i++) { __DRIconfig **new_configs; int num_depth_stencil_bits = 2; + if (!expose_rgb10_configs && + (formats[i] == MESA_FORMAT_B10G10R10A2_UNORM || + formats[i] == MESA_FORMAT_B10G10R10X2_UNORM)) + continue; + /* Starting with DRI2 protocol version 1.1 we can request a depth/stencil * buffer that has a different number of bits per pixel than the color * buffer, gen >= 6 supports this. 
@@ -1945,6 +1954,11 @@ intel_screen_make_configs(__DRIscreen *dri_screen) for (unsigned i = 0; i < ARRAY_SIZE(formats); i++) { __DRIconfig **new_configs; + if (!expose_rgb10_configs && + (formats[i] == MESA_FORMAT_B10G10R10A2_UNORM || + formats[i] == MESA_FORMAT_B10G10R10X2_UNORM)) + continue; + if (formats[i] == MESA_FORMAT_B5G6R5_UNORM) { depth_bits[0] = 16; stencil_bits[0] = 0; @@ -1978,6 +1992,11 @@ intel_screen_make_configs(__DRIscreen *dri_screen) if (devinfo->gen < 6) break; + if (!expose_rgb10_configs && + (formats[i] == MESA_FORMAT_B10G10R10A2_UNORM || + formats[i] == MESA_FORMAT_B10G10R10X2_UNORM)) + continue; + __DRIconfig **new_configs; const int num_depth_stencil_bits = 2; int num_msaa_modes = 0; -- 2.7.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
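The same skip condition is repeated in each of the three config-generation loops. Its effect can be sketched in isolation as a filter over a format list. Everything named `sketch_*` or `SKETCH_*` below is hypothetical; only the idea of skipping the two 10 bpc formats when the option is false comes from the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the MESA_FORMAT_* values in the patch. */
enum sketch_fmt {
   SKETCH_FMT_B8G8R8A8,
   SKETCH_FMT_B5G6R5,
   SKETCH_FMT_B10G10R10A2,
   SKETCH_FMT_B10G10R10X2
};

static bool
sketch_is_rgb10(enum sketch_fmt fmt)
{
   return fmt == SKETCH_FMT_B10G10R10A2 || fmt == SKETCH_FMT_B10G10R10X2;
}

/* Copy the formats that should be exposed into out[] and return how
 * many were kept. out[] must have room for n entries. */
size_t
sketch_filter_formats(const enum sketch_fmt *in, size_t n,
                      bool expose_rgb10, enum sketch_fmt *out)
{
   size_t kept = 0;

   for (size_t i = 0; i < n; i++) {
      /* Same test as in the patch: skip 10 bpc formats when disabled. */
      if (!expose_rgb10 && sketch_is_rgb10(in[i]))
         continue;
      out[kept++] = in[i];
   }
   return kept;
}
```

A helper predicate like `sketch_is_rgb10` is one way to avoid writing the two-format comparison three times, though the patch itself keeps the checks inline.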
[Mesa-dev] [PATCH 08/10] drirc: Don't expose 10 bpc visuals/configs to gnome-shell.
Set 'expose_rgb10_configs' to false when gnome-shell is the client.

Gnome-Shell/Wayland (= Mutter drm/kms wayland backend) currently can't handle non-RGB8 configs. It treats any framebuffer as RGBX8 or RGBA8, so if provided with an RGB10A2 or RGB10X2 framebuffer, the compositor's kms backend will simply pass it to the kernel as RGBX8 for scanout, resulting in false colors.

Gnome-Shell/X11 displays 10 bpc drawables correctly without any color artifacts if X-Screen DefaultDepth 30 is set.

Both Gnome-Shell Wayland and X11 for some reason seem to have problems with hit-testing for RGB10 modes, making them almost unusable: neither context menus (right mouse click) on the desktop, nor the icons in the dock, nor any part of the menu bar at the top, nor any icons on the desktop respond to mouse clicks. The same problem appears for window decorations (resizing, moving or closing windows via mouse is impossible).

The same problem happens when testing with the amdgpu-pro proprietary OpenGL library in "DefaultDepth 30" mode, and with the NVidia proprietary driver in depth 30 mode, so this seems to be a problem inside Gnome-Shell, not in Mesa, X or Wayland.

Not exposing RGB10 configs keeps Gnome-Shell usable, and still allows other X clients to do RGB10 rendering if X "DefaultDepth 30" is selected.

No such problems happened under a Gnome flashback session (Metacity), with Compiz-based UIs, or under KDE-5 with or without compositing.

Signed-off-by: Mario Kleiner
---
 src/util/drirc | 4
 1 file changed, 4 insertions(+)

diff --git a/src/util/drirc b/src/util/drirc
index 30ac9c8..c3170be 100644
--- a/src/util/drirc
+++ b/src/util/drirc
@@ -160,6 +160,10 @@ TODO: document the other workarounds.
+
+
+
+
-- 
2.7.4
[Mesa-dev] [PATCH 09/10] egl/x11: Match depth 30 RGB visuals to 32-bit RGBA EGLConfigs.
Similar to the matching of 24 bit RGB visuals to 32-bit RGBA EGLConfigs.

Fixes failure of piglit egl tests to select ARGB2101010 visuals via eglChooseConfig() with EGL_ALPHA_BITS 2 on a depth 30 X-Screen.

Signed-off-by: Mario Kleiner
---
 src/egl/drivers/dri2/platform_x11.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/egl/drivers/dri2/platform_x11.c b/src/egl/drivers/dri2/platform_x11.c
index 062c8a4..df768ab 100644
--- a/src/egl/drivers/dri2/platform_x11.c
+++ b/src/egl/drivers/dri2/platform_x11.c
@@ -781,13 +781,14 @@ dri2_x11_add_configs_for_visuals(struct dri2_egl_display *dri2_dpy,
             config_count++;

             /* Allow a 24-bit RGB visual to match a 32-bit RGBA EGLConfig.
+             * Ditto for 30-bit RGB visuals to match a 32-bit RGBA EGLConfig.
              * Otherwise it will only match a 32-bit RGBA visual.  On a
              * composited window manager on X11, this will make all of the
              * EGLConfigs with destination alpha get blended by the
              * compositor.  This is probably not what the application
              * wants... especially on drivers that only have 32-bit RGBA
              * EGLConfigs! */
-            if (d.data->depth == 24) {
+            if (d.data->depth == 24 || d.data->depth == 30) {
                rgba_masks[3] =
                   ~(rgba_masks[0] | rgba_masks[1] | rgba_masks[2]);
                dri2_conf = dri2_add_config(disp, config, config_count + 1,
-- 
2.7.4
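The mask computation in this hunk derives the alpha mask as whatever bits the colour channels leave unused in a 32-bit pixel. A small worked sketch, assuming the usual channel layouts for depth 24 (8:8:8) and depth 30 (10:10:10); the function name `sketch_alpha_mask` is hypothetical:

```c
#include <stdint.h>

/* Alpha mask = all bits of a 32-bit pixel not claimed by R, G or B.
 * Mirrors rgba_masks[3] = ~(rgba_masks[0] | rgba_masks[1] | rgba_masks[2]). */
uint32_t
sketch_alpha_mask(uint32_t red, uint32_t green, uint32_t blue)
{
   return ~(red | green | blue);
}

/* Depth 24, 8 bits per channel:
 *    sketch_alpha_mask(0x00ff0000, 0x0000ff00, 0x000000ff) == 0xff000000
 * Depth 30, 10 bits per channel:
 *    sketch_alpha_mask(0x3ff00000, 0x000ffc00, 0x000003ff) == 0xc0000000
 */
```

For a depth 30 visual the leftover is the top two bits, which is why such a visual can match an EGLConfig with EGL_ALPHA_BITS 2.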
[Mesa-dev] [PATCH 06/10] dri/common: Add option to disable exposure of 10 bpc color configs.
A few clients don't like RGB10X2 and RGB10A2 fbconfigs and visuals. Add a new driconf option 'expose_rgb10_configs' to allow per application enable/disable. The option defaults to enabled. Signed-off-by: Mario Kleiner--- src/mesa/drivers/dri/common/dri_util.c | 11 +++ src/util/xmlpool/t_options.h | 5 + 2 files changed, 12 insertions(+), 4 deletions(-) diff --git a/src/mesa/drivers/dri/common/dri_util.c b/src/mesa/drivers/dri/common/dri_util.c index 31a3040..972a1a4 100644 --- a/src/mesa/drivers/dri/common/dri_util.c +++ b/src/mesa/drivers/dri/common/dri_util.c @@ -55,6 +55,10 @@ const char __dri2ConfigOptions[] = DRI_CONF_SECTION_PERFORMANCE DRI_CONF_VBLANK_MODE(DRI_CONF_VBLANK_DEF_INTERVAL_1) DRI_CONF_SECTION_END + + DRI_CONF_SECTION_MISCELLANEOUS + DRI_CONF_EXPOSE_RGB10_CONFIGS("true") + DRI_CONF_SECTION_END DRI_CONF_END; /*/ @@ -144,6 +148,9 @@ driCreateNewScreen2(int scrn, int fd, psp->fd = fd; psp->myNum = scrn; +driParseOptionInfo(>optionInfo, __dri2ConfigOptions); +driParseConfigFiles(>optionCache, >optionInfo, psp->myNum, "dri2"); + *driver_configs = psp->driver->InitScreen(psp); if (*driver_configs == NULL) { free(psp); @@ -179,10 +186,6 @@ driCreateNewScreen2(int scrn, int fd, if (psp->max_gl_es2_version >= 30) psp->api_mask |= (1 << __DRI_API_GLES3); -driParseOptionInfo(>optionInfo, __dri2ConfigOptions); -driParseConfigFiles(>optionCache, >optionInfo, psp->myNum, "dri2"); - - return psp; } diff --git a/src/util/xmlpool/t_options.h b/src/util/xmlpool/t_options.h index d3f31fc..08f92c6 100644 --- a/src/util/xmlpool/t_options.h +++ b/src/util/xmlpool/t_options.h @@ -380,6 +380,11 @@ DRI_CONF_OPT_BEGIN_B(glsl_zero_init, def) \ DRI_CONF_DESC(en,gettext("Force uninitialized variables to default to zero")) \ DRI_CONF_OPT_END +#define DRI_CONF_EXPOSE_RGB10_CONFIGS(def) \ +DRI_CONF_OPT_BEGIN_B(expose_rgb10_configs, def) \ +DRI_CONF_DESC(en,gettext("Expose visuals and fbconfigs with rgb10a2 formats")) \ +DRI_CONF_OPT_END + /** * \brief Initialization configuration 
options */ -- 2.7.4
[Mesa-dev] RGB10 bit rendering support for OpenGL + Intel i965
Hi,

this patch series adds support to the i965 classic Mesa driver for ARGB2101010 and XRGB2101010 fbconfigs/visuals for X11/GLX, X11/EGL and Wayland/EGL, and for rendering into those winsys framebuffers with 10 bpc precision.

Tested hw/sw configs on top of Ubuntu 16.04.3 LTS, on top of mesa master:

- Tested on Intel Ironlake (gen5), Ivybridge (gen7), Haswell (gen7.5). Lightly tested on Skylake (visual correctness, no time for piglit runs).
- Tested with the intel ddx (X-Server 1.18.4 and 1.19.3) with sna and uxa backends, DRI2 and DRI3/Present, with and without compositing (when a choice was possible, e.g., under KDE-5), for default X-Screen depth of 24 bit and then DefaultDepth 30 bit.
- Also tested with the modesetting ddx under depth 24 -- modesetting doesn't support depth 30 yet.
- Tested with KDE Plasma-5, Compiz-based UIs and Gnome flashback (Metacity), and with Gnome Shell X11 and Wayland.

Mesa "make check" passes.

"piglit run gpu" results, Mesa vs. Mesa with these patches: no regressions under depth 24, except for glx-visuals-depth and glx-visuals-stencil. Both turned out to be problems in the piglit tests, for which I'll send out a patch.

Mesa with 10 bit patches under X-Screen depth 24 vs. 30 shows another failure in egl-configless-context due to limitations of that test, for which I'll send a patch. Other than that, a couple of tests are skipped. If I make those tests not skip, I get some failures because various tests assume the alpha channel is at least 8 bits deep, when here it is only 2 bits. Some glean tests also fail due to slightly insufficient precision.

glxgears, es2gears, glmark work. My own application works.

I also have some kernel patches for intel-kms in preparation to get XRGB2101010 framebuffers displayed with 10 bpc, and measurements with a photometer and my own application confirm we get 10 bpc from OpenGL rendering to the display on X11 with/without compositing, windowed or unredirected page-flipped.

I also tested Wayland + Weston master, with gbm-format=xrgb2101010 for a 10 bit framebuffer. First "as is" with the experimental dmabuf+modifiers path, and then hacked to use the old WL_BUFFER import path. Both work, and photometer measurement shows 10 bpc from rendering to display.

Visually all tested desktop UIs seem to render correctly in X-Screen depth 30 (save for some minor funky colors in window decorations with some toolkits if desktop compositing is disabled under KDE-5), also under Wayland + Weston.

The exception is Gnome-Shell. Gnome-Shell Wayland shows funky false colors. This is due to limitations in Mutter's/COGL drm/kms backend, which assumes any content is argb / xrgb and doesn't check for mismatch.

Gnome-Shell X11 displays all content visually correctly. Both Gnome-Shell Wayland and X11, though, don't respond to mouse clicks on desktop icons, elements in the menu bar or dock, on window decorations, or right-click context menus on the desktop, although the mouse works just fine inside the client area of X11 windows. It's as if somehow hit-testing doesn't work when RGBA1010102 configs are exposed? The same problem also happens under X-Screen color depth 30 with NVidia's proprietary graphics drivers, and under AMD's amdgpu-pro hybrid driver, so it doesn't seem to be a bug specific to this patch series or X11 vs. Wayland, but some Gnome-Shell problem? I didn't manage to track it down, but this series includes a new driconf option to prevent exposing 10 bpc configs to clients and a default driconf black-list entry for gnome-shell for the moment.

All in all the patch series seems to work well. If this looks about right to you, I'd also give the gallium drivers a try for 10 bit enablement.

Thanks,
-mario
[Mesa-dev] [PATCH 05/10] i965/screen: Add XRGB2101010 and ARGB2101010 support for DRI3.
Allow DRI3/Present buffer sharing for 10 bpc buffers. Otherwise composited desktops under DRI3 will only display black client areas for redirected windows.

Signed-off-by: Mario Kleiner
---
 src/mesa/drivers/dri/i965/intel_screen.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/intel_screen.c b/src/mesa/drivers/dri/i965/intel_screen.c
index 47008b5..c0fdde3 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.c
+++ b/src/mesa/drivers/dri/i965/intel_screen.c
@@ -180,6 +180,12 @@ static const struct __DRI2flushExtensionRec intelFlushExtension = {
 };

 static const struct intel_image_format intel_image_formats[] = {
+   { __DRI_IMAGE_FOURCC_ARGB2101010, __DRI_IMAGE_COMPONENTS_RGBA, 1,
+     { { 0, 0, 0, __DRI_IMAGE_FORMAT_ARGB2101010, 4 } } },
+
+   { __DRI_IMAGE_FOURCC_XRGB2101010, __DRI_IMAGE_COMPONENTS_RGB, 1,
+     { { 0, 0, 0, __DRI_IMAGE_FORMAT_XRGB2101010, 4 } } },
+
    { __DRI_IMAGE_FOURCC_ARGB8888, __DRI_IMAGE_COMPONENTS_RGBA, 1,
      { { 0, 0, 0, __DRI_IMAGE_FORMAT_ARGB8888, 4 } } },

-- 
2.7.4
[Mesa-dev] [PATCH 01/10] i965/screen: Add basic support for rendering 10 bpc/depth 30 framebuffers.
Expose formats which are supported at least back to Gen 5 Ironlake, possibly further. Allow creation of 10 bpc winsys buffers for drawables.

glxinfo now lists new RGBA 10 10 10 2/0 formats. Works correctly under DRI2 without compositing.

Signed-off-by: Mario Kleiner
---
 src/mesa/drivers/dri/i965/intel_screen.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/intel_screen.c b/src/mesa/drivers/dri/i965/intel_screen.c
index d39509b..47008b5 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.c
+++ b/src/mesa/drivers/dri/i965/intel_screen.c
@@ -1486,7 +1486,13 @@ intelCreateBuffer(__DRIscreen *dri_screen,
       fb->Visual.samples = num_samples;
    }

-   if (mesaVis->redBits == 5) {
+   if (mesaVis->redBits == 10 && mesaVis->alphaBits > 0) {
+      rgbFormat = mesaVis->redMask == 0x3ff00000 ? MESA_FORMAT_B10G10R10A2_UNORM
+                                                 : MESA_FORMAT_R10G10B10A2_UNORM;
+   } else if (mesaVis->redBits == 10) {
+      rgbFormat = mesaVis->redMask == 0x3ff00000 ? MESA_FORMAT_B10G10R10X2_UNORM
+                                                 : MESA_FORMAT_R10G10B10X2_UNORM;
+   } else if (mesaVis->redBits == 5) {
       rgbFormat = mesaVis->redMask == 0x1f ? MESA_FORMAT_R5G6B5_UNORM
                                            : MESA_FORMAT_B5G6R5_UNORM;
    } else if (mesaVis->sRGBCapable) {
@@ -1874,6 +1880,10 @@ intel_screen_make_configs(__DRIscreen *dri_screen)

       /* Required by Android, for HAL_PIXEL_FORMAT_RGBX_8888. */
       MESA_FORMAT_R8G8B8X8_UNORM,
+
+      /* For 10 bpc, 30 bit depth framebuffers */
+      MESA_FORMAT_B10G10R10A2_UNORM,
+      MESA_FORMAT_B10G10R10X2_UNORM,
    };

    /* GLX_SWAP_COPY_OML is not supported due to page flipping. */
-- 
2.7.4
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 02/10] i965: Support XRGB2101010 and ARGB2101010 compositing under DRI2.
This works well as tested under Compiz, KDE-5, Gnome-Shell. Signed-off-by: Mario Kleiner--- src/mesa/drivers/dri/i965/intel_blit.c | 8 src/mesa/drivers/dri/i965/intel_tex_image.c | 12 ++-- 2 files changed, 18 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/intel_blit.c b/src/mesa/drivers/dri/i965/intel_blit.c index 819a3da..b324c47 100644 --- a/src/mesa/drivers/dri/i965/intel_blit.c +++ b/src/mesa/drivers/dri/i965/intel_blit.c @@ -161,6 +161,14 @@ intel_miptree_blit_compatible_formats(mesa_format src, mesa_format dst) return (dst == MESA_FORMAT_R8G8B8A8_UNORM || dst == MESA_FORMAT_R8G8B8X8_UNORM); + if (src == MESA_FORMAT_B10G10R10A2_UNORM || src == MESA_FORMAT_B10G10R10X2_UNORM) + return (dst == MESA_FORMAT_B10G10R10A2_UNORM || + dst == MESA_FORMAT_B10G10R10X2_UNORM); + + if (src == MESA_FORMAT_R10G10B10A2_UNORM || src == MESA_FORMAT_R10G10B10X2_UNORM) + return (dst == MESA_FORMAT_R10G10B10A2_UNORM || + dst == MESA_FORMAT_R10G10B10X2_UNORM); + return false; } diff --git a/src/mesa/drivers/dri/i965/intel_tex_image.c b/src/mesa/drivers/dri/i965/intel_tex_image.c index 4661581..405de99 100644 --- a/src/mesa/drivers/dri/i965/intel_tex_image.c +++ b/src/mesa/drivers/dri/i965/intel_tex_image.c @@ -246,11 +246,19 @@ intelSetTexBuffer2(__DRIcontext *pDRICtx, GLint target, if (rb->mt->cpp == 4) { if (texture_format == __DRI_TEXTURE_FORMAT_RGB) { internal_format = GL_RGB; - texFormat = MESA_FORMAT_B8G8R8X8_UNORM; + if (rb->mt->format == MESA_FORMAT_B10G10R10X2_UNORM || + rb->mt->format == MESA_FORMAT_B10G10R10A2_UNORM) +texFormat = MESA_FORMAT_B10G10R10X2_UNORM; + else +texFormat = MESA_FORMAT_B8G8R8X8_UNORM; } else { internal_format = GL_RGBA; - texFormat = MESA_FORMAT_B8G8R8A8_UNORM; + if (rb->mt->format == MESA_FORMAT_B10G10R10X2_UNORM || + rb->mt->format == MESA_FORMAT_B10G10R10A2_UNORM) +texFormat = MESA_FORMAT_B10G10R10A2_UNORM; + else +texFormat = MESA_FORMAT_B8G8R8A8_UNORM; } } else if (rb->mt->cpp == 2) { internal_format = GL_RGB; -- 2.7.4 ___ 
[Mesa-dev] [PATCH 04/10] loader/dri3: Add XRGB2101010 and ARGB2101010 support.
To allow DRI3/Present buffer sharing for 10 bpc buffers.

Signed-off-by: Mario Kleiner
---
 src/loader/loader_dri3_helper.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/loader/loader_dri3_helper.c b/src/loader/loader_dri3_helper.c
index e3120f5..e3d819f 100644
--- a/src/loader/loader_dri3_helper.c
+++ b/src/loader/loader_dri3_helper.c
@@ -980,6 +980,8 @@ image_format_to_fourcc(int format)
    case __DRI_IMAGE_FORMAT_ARGB8888: return __DRI_IMAGE_FOURCC_ARGB8888;
    case __DRI_IMAGE_FORMAT_ABGR8888: return __DRI_IMAGE_FOURCC_ABGR8888;
    case __DRI_IMAGE_FORMAT_XBGR8888: return __DRI_IMAGE_FOURCC_XBGR8888;
+   case __DRI_IMAGE_FORMAT_XRGB2101010: return __DRI_IMAGE_FOURCC_XRGB2101010;
+   case __DRI_IMAGE_FORMAT_ARGB2101010: return __DRI_IMAGE_FOURCC_ARGB2101010;
    }
    return 0;
 }
-- 
2.7.4
[Mesa-dev] [PATCH 03/10] dri: Add 10 bpc formats as available formats.
Used to support ARGB2101010 and XRGB2101010 winsys framebuffers / drawables, but added other 10 bpc fourcc's as well for consistency with definitions in wayland_drm.h, gbm.h, and drm_fourcc.h.

Signed-off-by: Mario Kleiner
---
 include/GL/internal/dri_interface.h | 8
 1 file changed, 8 insertions(+)

diff --git a/include/GL/internal/dri_interface.h b/include/GL/internal/dri_interface.h
index 1c91bde..195c2a7 100644
--- a/include/GL/internal/dri_interface.h
+++ b/include/GL/internal/dri_interface.h
@@ -1246,6 +1246,14 @@ struct __DRIdri2ExtensionRec {
 #define __DRI_IMAGE_FOURCC_ABGR8888     0x34324241
 #define __DRI_IMAGE_FOURCC_XBGR8888     0x34324258
 #define __DRI_IMAGE_FOURCC_SARGB8888    0x83324258
+#define __DRI_IMAGE_FOURCC_ARGB2101010  0x30335241
+#define __DRI_IMAGE_FOURCC_XRGB2101010  0x30335258
+#define __DRI_IMAGE_FOURCC_ABGR2101010  0x30334241
+#define __DRI_IMAGE_FOURCC_XBGR2101010  0x30334258
+#define __DRI_IMAGE_FOURCC_RGBA1010102  0x30334152
+#define __DRI_IMAGE_FOURCC_RGBX1010102  0x30335852
+#define __DRI_IMAGE_FOURCC_BGRA1010102  0x30334142
+#define __DRI_IMAGE_FOURCC_BGRX1010102  0x30335842
 #define __DRI_IMAGE_FOURCC_YUV410       0x39565559
 #define __DRI_IMAGE_FOURCC_YUV411       0x31315559
 #define __DRI_IMAGE_FOURCC_YUV420       0x32315559
-- 
2.7.4
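The hex values added here are four ASCII characters packed in little-endian order, the same scheme drm_fourcc.h uses. A minimal sketch of that packing follows; the helper name `sketch_fourcc` is hypothetical, and the character codes ("AR30", "XR30" and so on) are read back from the constants in the patch.

```c
#include <stdint.h>

/* Pack four ASCII characters into a 32-bit code, first character in
 * the lowest byte. */
uint32_t
sketch_fourcc(char a, char b, char c, char d)
{
   return (uint32_t)(unsigned char)a |
          ((uint32_t)(unsigned char)b << 8) |
          ((uint32_t)(unsigned char)c << 16) |
          ((uint32_t)(unsigned char)d << 24);
}

/* sketch_fourcc('A', 'R', '3', '0') == 0x30335241   ARGB2101010
 * sketch_fourcc('X', 'R', '3', '0') == 0x30335258   XRGB2101010
 * sketch_fourcc('R', 'A', '3', '0') == 0x30334152   RGBA1010102
 * sketch_fourcc('B', 'X', '3', '0') == 0x30335842   BGRX1010102
 */
```

Reading a constant from its low byte upwards gives the four characters back, which is a quick way to check a newly added define against its name.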
Re: [Mesa-dev] [PATCH 2/2] loader/dri3: Invalidate the drawable after copySubBuffer
On 04/09/17 09:27 PM, Thomas Hellstrom wrote:
> Anyone using copySubBuffer as a replacement for swapBuffers would probably
> want window resizing to update the viewport.
>
> Signed-off-by: Thomas Hellstrom
> ---
>  src/loader/loader_dri3_helper.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/src/loader/loader_dri3_helper.c b/src/loader/loader_dri3_helper.c
> index c0a6e0d..9549b18 100644
> --- a/src/loader/loader_dri3_helper.c
> +++ b/src/loader/loader_dri3_helper.c
> @@ -664,6 +664,8 @@ loader_dri3_copy_sub_buffer(struct loader_dri3_drawable *draw,
>        dri3_fence_trigger(draw->conn, dri3_fake_front_buffer(draw));
>        dri3_fence_await(draw->conn, dri3_fake_front_buffer(draw));
>     }
> +
> +   draw->ext->flush->invalidate(draw->dri_drawable);
>     dri3_fence_await(draw->conn, back);
>  }

Your rationale makes some sense to me, but I notice that dri2CopySubBuffer doesn't seem to do this. Do you have a test case where this makes a difference?

-- 
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Mesa and X developer
Re: [Mesa-dev] [PATCH 1/2] loader/dri3: Use client local back to front blit in copySubBuffer if available
On 04/09/17 09:27 PM, Thomas Hellstrom wrote:
> The copySubBuffer functionality always attempted a server side blit from
> back to fake front if a fake front was present, and we weren't displaying
> on a remote GPU.
>
> Now that we always have local blit capability on modern drivers, first
> attempt a local blit, and only if that fails, try the server blit.
>
> Signed-off-by: Thomas Hellstrom
> ---
>  src/loader/loader_dri3_helper.c | 16 +++-
>  1 file changed, 7 insertions(+), 9 deletions(-)
>
> diff --git a/src/loader/loader_dri3_helper.c b/src/loader/loader_dri3_helper.c
> index e3120f5..c0a6e0d 100644
> --- a/src/loader/loader_dri3_helper.c
> +++ b/src/loader/loader_dri3_helper.c
> @@ -635,14 +635,6 @@ loader_dri3_copy_sub_buffer(struct loader_dri3_drawable *draw,
>                                      back->image,
>                                      0, 0, back->width, back->height,
>                                      0, 0, __BLIT_FLAG_FLUSH);
> -      /* We use blit_image to update our fake front,
> -       */
> -      if (draw->have_fake_front)
> -         (void) loader_dri3_blit_image(draw,
> -                                       dri3_fake_front_buffer(draw)->image,
> -                                       back->image,
> -                                       x, y, width, height,
> -                                       x, y, __BLIT_FLAG_FLUSH);
>     }
>
>     loader_dri3_swapbuffer_barrier(draw);
> @@ -656,7 +648,13 @@ loader_dri3_copy_sub_buffer(struct loader_dri3_drawable *draw,
>     /* Refresh the fake front (if present) after we just damaged the real
>      * front.
>      */
> -   if (draw->have_fake_front && !draw->is_different_gpu) {
> +   if (draw->have_fake_front &&
> +       !loader_dri3_blit_image(draw,
> +                               dri3_fake_front_buffer(draw)->image,
> +                               back->image,
> +                               x, y, width, height,
> +                               x, y, __BLIT_FLAG_FLUSH) &&
> +       !draw->is_different_gpu) {
>        dri3_fence_reset(draw->conn, dri3_fake_front_buffer(draw));
>        dri3_copy_area(draw->conn,
>                       back->pixmap,
>

Reviewed-by: Michel Dänzer

-- 
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Mesa and X developer
[Mesa-dev] [Bug 102522] [radeonsi, bisected] commit 147d7fb772 causes full-window map to flash green in Crea
https://bugs.freedesktop.org/show_bug.cgi?id=102522

Michel Dänzer changed:

        What|Removed                         |Added
  QA Contact|dri-devel@lists.freedesktop.org |mesa-dev@lists.freedesktop.org
          CC|                                |charmai...@vmware.com
   Component|Drivers/Gallium/radeonsi        |Mesa core
    Assignee|dri-devel@lists.freedesktop.org |mesa-dev@lists.freedesktop.org

--- Comment #1 from Michel Dänzer ---
Charmaine, any ideas?

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
[Mesa-dev] [Bug 102530] [bisected] Kodi crashes when launching a stream - commit bd2662bf
https://bugs.freedesktop.org/show_bug.cgi?id=102530

--- Comment #6 from Timothy Arceri ---
(In reply to Alexandre Demers from comment #5)
> Created attachment 133967 [details]
> Kodi's log after segfault
>
> This is the core dump produced by Kodi when segfaulting.
> /usr/bin/kodi: line 175: 16779 Segmentation fault (core dumped)
> "$LIBDIR/${bin_name}/${bin_name}.bin" $SAVED_ARGS
>
> If you need something else, let me know.

So that looks like the stack trace from having NO_ERROR enabled. Can you provide one for the crash that happens when NO_ERROR is disabled?

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.
[Mesa-dev] [PATCH] llvmpipe, draw: increase shader cache limits
From: Roland Scheidegger

We're not particularly concerned with memory usage, if the tradeoff is shader recompiles. And it's common for apps to have a lot of shaders nowadays (and, since our shaders include a LOT of context state, of course we may create quite a bit more shaders even).

So quadruple the amount of shaders draw will cache (from 128 to 512). For llvmpipe (fs shaders) quadruple the number of instructions, keep the number of variants the same for now (only with very simple, non-texturing shaders the variant limit could really be reached), and simplify the definition, it's probably easier to just have one different definition per branch...
---
 src/gallium/auxiliary/draw/draw_private.h | 2 +-
 src/gallium/drivers/llvmpipe/lp_limits.h  | 4 +---
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_private.h b/src/gallium/auxiliary/draw/draw_private.h
index 030bb2c..06ad737 100644
--- a/src/gallium/auxiliary/draw/draw_private.h
+++ b/src/gallium/auxiliary/draw/draw_private.h
@@ -103,7 +103,7 @@ struct vertex_header {


 /* maximum number of shader variants we can cache */
-#define DRAW_MAX_SHADER_VARIANTS 128
+#define DRAW_MAX_SHADER_VARIANTS 512

 /**
  * Private context for the drawing module.
diff --git a/src/gallium/drivers/llvmpipe/lp_limits.h b/src/gallium/drivers/llvmpipe/lp_limits.h
index 5294ced..c280816 100644
--- a/src/gallium/drivers/llvmpipe/lp_limits.h
+++ b/src/gallium/drivers/llvmpipe/lp_limits.h
@@ -78,10 +78,8 @@
 /**
  * Max number of instructions (for all fragment shaders combined per context)
  * that will be kept around (counted in terms of llvm ir).
- * Note: the definition looks odd, but there's branches which use a different
- * number of max shader variants.
  */
-#define LP_MAX_SHADER_INSTRUCTIONS MAX2(256*1024, 512*LP_MAX_SHADER_VARIANTS)
+#define LP_MAX_SHADER_INSTRUCTIONS (2048 * LP_MAX_SHADER_VARIANTS)

 /**
  * Max number of setup variants that will be kept around.
--
2.7.4
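For readers checking the "quadruple" claim for llvmpipe, here is a minimal sketch of the arithmetic. It assumes LP_MAX_SHADER_VARIANTS is 1024; the patch does not state that value, so treat it as an assumption. The helper names (old_limit, new_limit, growth_factor) are made up for illustration and are not part of Mesa.

```c
/* Assumption: LP_MAX_SHADER_VARIANTS is 1024 on the branch being patched.
 * The patch does not quote this value. */
#define LP_MAX_SHADER_VARIANTS 1024

#define MAX2(a, b) ((a) > (b) ? (a) : (b))

/* Old and new definitions, copied from the diff. */
#define OLD_LP_MAX_SHADER_INSTRUCTIONS MAX2(256*1024, 512*LP_MAX_SHADER_VARIANTS)
#define NEW_LP_MAX_SHADER_INSTRUCTIONS (2048 * LP_MAX_SHADER_VARIANTS)

/* Hypothetical helpers, only here to make the numbers easy to inspect. */
unsigned old_limit(void)
{
    return OLD_LP_MAX_SHADER_INSTRUCTIONS;
}

unsigned new_limit(void)
{
    return NEW_LP_MAX_SHADER_INSTRUCTIONS;
}

/* Ratio of the new instruction limit to the old one. */
unsigned growth_factor(void)
{
    return new_limit() / old_limit();
}
```

Under that assumption the old limit is 512 * 1024 instructions and the new one is 2048 * 1024, a factor of four. On a branch with a smaller variant count the old MAX2() floor of 256 * 1024 would apply and the ratio would differ, which is presumably why the old definition "looked odd".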
Re: [Mesa-dev] [PATCH 4/4] ac/debug: take ASIC generation into account when printing registers
gfx9d.h contains almost no named values. Does it obtain named values from sid.h when the same field is also present in gfx9d.h? Marek On Mon, Sep 4, 2017 at 2:11 PM, Nicolai Hähnlewrote: > From: Nicolai Hähnle > > There were some overlapping changes in gfx9 especially in the CB/DB > blocks which made register dumps rather misleading. > > The split is along the lines of the header files, so we'll print VI-only > fields on SI and CI, for example, but we won't print GFX9 fields on > SI/CI/VI, and we won't print SI/CI/VI fields on GFX9. > --- > src/amd/common/ac_debug.c| 83 ++ > src/amd/common/sid_tables.py | 201 > +++ > 2 files changed, 177 insertions(+), 107 deletions(-) > > diff --git a/src/amd/common/ac_debug.c b/src/amd/common/ac_debug.c > index 570ba850851..54685356f1d 100644 > --- a/src/amd/common/ac_debug.c > +++ b/src/amd/common/ac_debug.c > @@ -94,68 +94,83 @@ static void print_value(FILE *file, uint32_t value, int > bits) > } > > static void print_named_value(FILE *file, const char *name, uint32_t value, > int bits) > { > print_spaces(file, INDENT_PKT); > fprintf(file, COLOR_YELLOW "%s" COLOR_RESET " <- ", name); > print_value(file, value, bits); > } > > +static const struct si_reg *find_register(const struct si_reg *table, > + unsigned table_size, > + unsigned offset) > +{ > + for (unsigned i = 0; i < table_size; i++) { > + const struct si_reg *reg = [i]; > + > + if (reg->offset == offset) > + return reg; > + } > + > + return NULL; > +} > + > void ac_dump_reg(FILE *file, enum chip_class chip_class, unsigned offset, > uint32_t value, uint32_t field_mask) > { > - int r, f; > + const struct si_reg *reg = NULL; > > - for (r = 0; r < ARRAY_SIZE(sid_reg_table); r++) { > - const struct si_reg *reg = _reg_table[r]; > - const char *reg_name = sid_strings + reg->name_offset; > + if (chip_class >= GFX9) > + reg = find_register(gfx9d_reg_table, > ARRAY_SIZE(gfx9d_reg_table), offset); > + if (!reg) > + reg = find_register(sid_reg_table, ARRAY_SIZE(sid_reg_table), > 
offset); > > - if (reg->offset == offset) { > - bool first_field = true; > + if (reg) { > + const char *reg_name = sid_strings + reg->name_offset; > + bool first_field = true; > > - print_spaces(file, INDENT_PKT); > - fprintf(file, COLOR_YELLOW "%s" COLOR_RESET " <- ", > - reg_name); > + print_spaces(file, INDENT_PKT); > + fprintf(file, COLOR_YELLOW "%s" COLOR_RESET " <- ", > + reg_name); > > - if (!reg->num_fields) { > - print_value(file, value, 32); > - return; > - } > + if (!reg->num_fields) { > + print_value(file, value, 32); > + return; > + } > > - for (f = 0; f < reg->num_fields; f++) { > - const struct si_field *field = > sid_fields_table + reg->fields_offset + f; > - const int *values_offsets = > sid_strings_offsets + field->values_offset; > - uint32_t val = (value & field->mask) >> > - (ffs(field->mask) - 1); > + for (unsigned f = 0; f < reg->num_fields; f++) { > + const struct si_field *field = sid_fields_table + > reg->fields_offset + f; > + const int *values_offsets = sid_strings_offsets + > field->values_offset; > + uint32_t val = (value & field->mask) >> > + (ffs(field->mask) - 1); > > - if (!(field->mask & field_mask)) > - continue; > + if (!(field->mask & field_mask)) > + continue; > > - /* Indent the field. */ > - if (!first_field) > - print_spaces(file, > -INDENT_PKT + > strlen(reg_name) + 4); > + /* Indent the field. */ > + if (!first_field) > + print_spaces(file, > +INDENT_PKT + strlen(reg_name) + > 4); > > - /* Print the field. */ > -
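The lookup order in the quoted ac_dump_reg() change can be sketched in isolation as follows. This is only an illustration under stated assumptions: struct reg_desc, enum chip_gen, the two tables, their offsets and names, and lookup_name() are all hypothetical stand-ins, not the real si_reg tables generated by sid_tables.py.

```c
#include <stddef.h>

/* Hypothetical, trimmed-down register descriptor. The real struct si_reg
 * stores name and field offsets into string tables instead. */
struct reg_desc {
    unsigned offset;
    const char *name;
};

/* Hypothetical generation enum, ordered oldest to newest. */
enum chip_gen { GEN_SI, GEN_CIK, GEN_VI, GEN_GFX9 };

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Made-up tables: 0x104 exists in both, 0x100 only in the shared one. */
static const struct reg_desc shared_table[] = {
    { 0x100, "SHARED_REG_A" },
    { 0x104, "SHARED_REG_B" },
};

static const struct reg_desc gfx9_table[] = {
    { 0x104, "GFX9_REG_B" },
};

/* Linear search, same shape as find_register() in the quoted patch. */
static const struct reg_desc *find_register(const struct reg_desc *table,
                                            unsigned table_size,
                                            unsigned offset)
{
    for (unsigned i = 0; i < table_size; i++) {
        if (table[i].offset == offset)
            return &table[i];
    }
    return NULL;
}

/* Prefer the GFX9 table on GFX9, then fall back to the shared table. */
const char *lookup_name(enum chip_gen gen, unsigned offset)
{
    const struct reg_desc *reg = NULL;

    if (gen >= GEN_GFX9)
        reg = find_register(gfx9_table, ARRAY_SIZE(gfx9_table), offset);
    if (!reg)
        reg = find_register(shared_table, ARRAY_SIZE(shared_table), offset);

    return reg ? reg->name : NULL;
}
```

With these made-up tables, offset 0x104 resolves to the GFX9 entry on GFX9 and to the shared entry on older chips, while 0x100 resolves to the shared entry everywhere. The sketch covers register lookup only; it does not answer the question about where named field values come from.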
Re: [Mesa-dev] What is the difference between ROCm and Clover?
On Mon, Sep 4, 2017 at 3:56 PM, Nicolai Hähnle wrote:
> On 04.09.2017 15:38, Aaron Watry wrote:
>>
>> On Sun, Sep 3, 2017 at 3:20 PM, David Niklas wrote:
>>>
>>> Hello,
>>> I'm interested in knowing why there are two different OpenCL
>>> implementations of OpenCL drivers for AMD cards.
>>> Is it because they support different ASICs?
>>> One is older and going to be replaced?
>>
>>
>> As near as I can tell, ROCm requires PCIe 3.0 Atomic operations for
>> both the CPU and GPU. As such, ROCm will never support the VLIW
>> radeons (5000-6000 series) or the earlier generations of GCN (SI, CI,
>> possibly VI?). According to the ROCm page, it requires
>> Fiji/Polaris/Vega. It also requires either a Haswell or Ryzen CPU (or
>> newer), so anyone with an older CPU will be left out of ROCm support.
>> External GPUs via Thunderbolt 1/2 enclosures are also unsupported.
>
>
> For what it's worth, if you have an "older" system without PCIe 3.0 atomics,
> you could still try and see how far you get with ROCm. As long as you don't
> *actually* need atomics between the CPU and GPU in your own code, you're
> probably fine. I've never tried it, though.

I can confirm with 100% certainty that PCIe 3.0 atomics aren't
required if you don't need them. In the majority of cases, you don't
need them.

Marek
[Mesa-dev] [Bug 102530] [bisected] Kodi crashes when launching a stream - commit bd2662bf
https://bugs.freedesktop.org/show_bug.cgi?id=102530

--- Comment #5 from Alexandre Demers ---
Created attachment 133967
  --> https://bugs.freedesktop.org/attachment.cgi?id=133967=edit
Kodi's log after segfault

This is the core dump produced by Kodi when segfaulting.

/usr/bin/kodi: line 175: 16779 Segmentation fault (core dumped) "$LIBDIR/${bin_name}/${bin_name}.bin" $SAVED_ARGS

If you need something else, let me know.
Re: [Mesa-dev] [PATCH] radeonsi/gfx9: always flush DB metadata on framebuffer changes
Reviewed-by: Marek OlšákMarek On Mon, Sep 4, 2017 at 2:16 PM, Nicolai Hähnle wrote: > From: Nicolai Hähnle > > This fixes GL45-CTS.shader_image_load_store.basic-glsl-earlyFragTests. > > Cc: mesa-sta...@lists.freedesktop.org > -- > FWIW, Vulkan also always flushes DB metadata on gfx9. Wait-for-idle > is not required here. > --- > src/gallium/drivers/radeonsi/si_pipe.h | 4 ++-- > src/gallium/drivers/radeonsi/si_state.c | 11 ++- > src/gallium/drivers/radeonsi/si_state_draw.c | 3 ++- > 3 files changed, 14 insertions(+), 4 deletions(-) > > diff --git a/src/gallium/drivers/radeonsi/si_pipe.h > b/src/gallium/drivers/radeonsi/si_pipe.h > index 386a6dc886d..b82ec7ef9f8 100644 > --- a/src/gallium/drivers/radeonsi/si_pipe.h > +++ b/src/gallium/drivers/radeonsi/si_pipe.h > @@ -54,23 +54,23 @@ > /* VMEM L1 can optionally be bypassed (GLC=1). Other names: TC L1 */ > #define SI_CONTEXT_INV_VMEM_L1 (R600_CONTEXT_PRIVATE_FLAG << 2) > /* Used by everything except CB/DB, can be bypassed (SLC=1). Other names: TC > L2 */ > #define SI_CONTEXT_INV_GLOBAL_L2 (R600_CONTEXT_PRIVATE_FLAG << 3) > /* Write dirty L2 lines back to memory (shader and CP DMA stores), but don't > * invalidate L2. SI-CIK can't do it, so they will do complete invalidation. > */ > #define SI_CONTEXT_WRITEBACK_GLOBAL_L2 (R600_CONTEXT_PRIVATE_FLAG << 4) > /* Writeback & invalidate the L2 metadata cache. It can only be coupled with > * a CB or DB flush. */ > #define SI_CONTEXT_INV_L2_METADATA (R600_CONTEXT_PRIVATE_FLAG << 5) > -/* gap */ > /* Framebuffer caches. */ > -#define SI_CONTEXT_FLUSH_AND_INV_DB(R600_CONTEXT_PRIVATE_FLAG << 7) > +#define SI_CONTEXT_FLUSH_AND_INV_DB(R600_CONTEXT_PRIVATE_FLAG << 6) > +#define SI_CONTEXT_FLUSH_AND_INV_DB_META (R600_CONTEXT_PRIVATE_FLAG << 7) > #define SI_CONTEXT_FLUSH_AND_INV_CB(R600_CONTEXT_PRIVATE_FLAG << 8) > /* Engine synchronization. 
*/ > #define SI_CONTEXT_VS_PARTIAL_FLUSH(R600_CONTEXT_PRIVATE_FLAG << 9) > #define SI_CONTEXT_PS_PARTIAL_FLUSH(R600_CONTEXT_PRIVATE_FLAG << 10) > #define SI_CONTEXT_CS_PARTIAL_FLUSH(R600_CONTEXT_PRIVATE_FLAG << 11) > #define SI_CONTEXT_VGT_FLUSH (R600_CONTEXT_PRIVATE_FLAG << 12) > #define SI_CONTEXT_VGT_STREAMOUT_SYNC (R600_CONTEXT_PRIVATE_FLAG << 13) > > #define SI_PREFETCH_VBO_DESCRIPTORS(1 << 0) > #define SI_PREFETCH_LS (1 << 1) > diff --git a/src/gallium/drivers/radeonsi/si_state.c > b/src/gallium/drivers/radeonsi/si_state.c > index 41b08f8de4f..365c1248b2f 100644 > --- a/src/gallium/drivers/radeonsi/si_state.c > +++ b/src/gallium/drivers/radeonsi/si_state.c > @@ -2569,23 +2569,32 @@ static void si_set_framebuffer_state(struct > pipe_context *ctx, > > sctx->framebuffer.CB_has_shader_readable_metadata); > > sctx->b.flags |= SI_CONTEXT_CS_PARTIAL_FLUSH; > > /* u_blitter doesn't invoke depth decompression when it does multiple > * blits in a row, but the only case when it matters for DB is when > * doing generate_mipmap. So here we flush DB manually between > * individual generate_mipmap blits. > * Note that lower mipmap levels aren't compressed. > */ > - if (sctx->generate_mipmap_for_depth) > + if (sctx->generate_mipmap_for_depth) { > si_make_DB_shader_coherent(sctx, 1, false, > > sctx->framebuffer.DB_has_shader_readable_metadata); > + } else if (sctx->b.chip_class == GFX9) { > + /* It appears that DB metadata "leaks" in a sequence of: > +* - depth clear > +* - DCC decompress for shader image writes (with DB > disabled) > +* - render with DEPTH_BEFORE_SHADER=1 > +* Flushing DB metadata works around the problem. > +*/ > + sctx->b.flags |= SI_CONTEXT_FLUSH_AND_INV_DB_META; > + } > > /* Take the maximum of the old and new count. If the new count is > lower, > * dirtying is needed to disable the unbound colorbuffers. 
> */ > sctx->framebuffer.dirty_cbufs |= > (1 << MAX2(sctx->framebuffer.state.nr_cbufs, > state->nr_cbufs)) - 1; > sctx->framebuffer.dirty_zsbuf |= sctx->framebuffer.state.zsbuf != > state->zsbuf; > > si_dec_framebuffer_counters(>framebuffer.state); > util_copy_framebuffer_state(>framebuffer.state, state); > diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c > b/src/gallium/drivers/radeonsi/si_state_draw.c > index 81751d2186e..7ee6cf88e88 100644 > --- a/src/gallium/drivers/radeonsi/si_state_draw.c > +++ b/src/gallium/drivers/radeonsi/si_state_draw.c > @@ -905,21 +905,22 @@ void si_emit_cache_flush(struct si_context *sctx) > if (rctx->flags & SI_CONTEXT_FLUSH_AND_INV_DB) >
Re: [Mesa-dev] [PATCH] radeonsi/gfx9: proper workaround for LS/HS VGPR initialization bug
Would it be possible to use this workaround only when LS vertices > HS
vertices? (which should be rare)

Marek

On Mon, Sep 4, 2017 at 8:11 PM, Nicolai Hähnle wrote:
> From: Nicolai Hähnle
>
> When the HS wave is empty, the hardware writes the LS VGPRs starting at
> v0 instead of v2. Workaround by shifting them back into place when
> necessary. For simplicity, this is always done in the LS prolog.
>
> According to the hardware team, this will be fixed in future chips,
> so take that into account already.
>
> Note that this is not a bug fix, as the bug was already worked
> around by commit 166823bfd26 ("radeonsi/gfx9: add a temporary workaround
> for a tessellation driver bug"). This change merely replaces the
> workaround by one that should be better.
Re: [Mesa-dev] [PATCH v4 1/1] clover: Wait for requested operation if blocking flag is set
Jan Veselywrites: > v2: wait in map_buffer and map_image as well > v3: use event::wait instead of wait (skips fence wait for hard_event) > v4: use wait_signalled() > > Signed-off-by: Jan Vesely > --- > Hi Francisco, > > once again sorry for the delay, and thanks for you patience. > This patch applies on top of the two you attached during our email > discussion. > From what I can tell, the functionality is identical to v3 after your > two patches are applied ("event:wait()" calls "wait_signalled()"), but I > suppose calling non-virtual function is preferrable. if not, feel free > to use v3. > Yeah, I find v4 more readable than calling the base class' implementation of wait(). Patch is: Reviewed-by: Francisco Jerez Thanks. > thanks, > Jan > > src/gallium/state_trackers/clover/api/transfer.cpp | 30 > -- > 1 file changed, 28 insertions(+), 2 deletions(-) > > diff --git a/src/gallium/state_trackers/clover/api/transfer.cpp > b/src/gallium/state_trackers/clover/api/transfer.cpp > index f7046253be..34559042ae 100644 > --- a/src/gallium/state_trackers/clover/api/transfer.cpp > +++ b/src/gallium/state_trackers/clover/api/transfer.cpp > @@ -295,6 +295,9 @@ clEnqueueReadBuffer(cl_command_queue d_q, cl_mem d_mem, > cl_bool blocking, > , obj_origin, obj_pitch, > region)); > > + if (blocking) > + hev().wait_signalled(); > + > ret_object(rd_ev, hev); > return CL_SUCCESS; > > @@ -325,6 +328,9 @@ clEnqueueWriteBuffer(cl_command_queue d_q, cl_mem d_mem, > cl_bool blocking, > ptr, {}, obj_pitch, > region)); > > + if (blocking) > + hev().wait_signalled(); > + > ret_object(rd_ev, hev); > return CL_SUCCESS; > > @@ -362,6 +368,9 @@ clEnqueueReadBufferRect(cl_command_queue d_q, cl_mem > d_mem, cl_bool blocking, > , obj_origin, obj_pitch, > region)); > > + if (blocking) > + hev().wait_signalled(); > + > ret_object(rd_ev, hev); > return CL_SUCCESS; > > @@ -399,6 +408,9 @@ clEnqueueWriteBufferRect(cl_command_queue d_q, cl_mem > d_mem, cl_bool blocking, > ptr, host_origin, host_pitch, > 
region)); > > + if (blocking) > + hev().wait_signalled(); > + > ret_object(rd_ev, hev); > return CL_SUCCESS; > > @@ -504,6 +516,9 @@ clEnqueueReadImage(cl_command_queue d_q, cl_mem d_mem, > cl_bool blocking, > , src_origin, src_pitch, > region)); > > + if (blocking) > + hev().wait_signalled(); > + > ret_object(rd_ev, hev); > return CL_SUCCESS; > > @@ -538,6 +553,9 @@ clEnqueueWriteImage(cl_command_queue d_q, cl_mem d_mem, > cl_bool blocking, > ptr, {}, src_pitch, > region)); > > + if (blocking) > + hev().wait_signalled(); > + > ret_object(rd_ev, hev); > return CL_SUCCESS; > > @@ -667,7 +685,11 @@ clEnqueueMapBuffer(cl_command_queue d_q, cl_mem d_mem, > cl_bool blocking, > > void *map = mem.resource(q).add_map(q, flags, blocking, obj_origin, > region); > > - ret_object(rd_ev, create(q, CL_COMMAND_MAP_BUFFER, deps)); > + auto hev = create(q, CL_COMMAND_MAP_BUFFER, deps); > + if (blocking) > + hev().wait_signalled(); > + > + ret_object(rd_ev, hev); > ret_error(r_errcode, CL_SUCCESS); > return map; > > @@ -695,7 +717,11 @@ clEnqueueMapImage(cl_command_queue d_q, cl_mem d_mem, > cl_bool blocking, > > void *map = img.resource(q).add_map(q, flags, blocking, origin, region); > > - ret_object(rd_ev, create(q, CL_COMMAND_MAP_IMAGE, deps)); > + auto hev = create(q, CL_COMMAND_MAP_IMAGE, deps); > + if (blocking) > + hev().wait_signalled(); > + > + ret_object(rd_ev, hev); > ret_error(r_errcode, CL_SUCCESS); > return map; > > -- > 2.13.5 signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] gallium: build ddebug, noop, rbug, trace as part of auxiliary
From: Marek OlšákBuilding gallium is faster by 7.5 seconds on a 4core/8thread 3GHz CPU. (gallium build time is reduced by 15% when building only radeonsi) Non-recursive makefiles are great! --- src/gallium/Makefile.am| 12 src/gallium/auxiliary/Makefile.am | 10 ++- .../auxiliary/target-helpers/inline_debug_helper.h | 26 - src/gallium/drivers/ddebug/Makefile.am | 9 -- src/gallium/drivers/ddebug/Makefile.sources| 14 - src/gallium/drivers/noop/Makefile.am | 16 --- src/gallium/drivers/noop/Makefile.sources | 8 +++--- src/gallium/drivers/rbug/Makefile.am | 33 -- src/gallium/drivers/rbug/Makefile.sources | 18 ++-- src/gallium/drivers/trace/Makefile.am | 14 - src/gallium/drivers/trace/Makefile.sources | 26 - src/gallium/state_trackers/osmesa/Makefile.am | 3 +- src/gallium/targets/d3dadapter9/Makefile.am| 8 +- src/gallium/targets/dri/Makefile.am| 10 +-- src/gallium/targets/libgl-xlib/Makefile.am | 6 +--- src/gallium/targets/osmesa/Makefile.am | 4 +-- src/gallium/targets/pipe-loader/Makefile.am| 6 +--- src/gallium/tests/unit/Makefile.am | 1 - 18 files changed, 54 insertions(+), 170 deletions(-) delete mode 100644 src/gallium/drivers/ddebug/Makefile.am delete mode 100644 src/gallium/drivers/noop/Makefile.am delete mode 100644 src/gallium/drivers/rbug/Makefile.am delete mode 100644 src/gallium/drivers/trace/Makefile.am diff --git a/src/gallium/Makefile.am b/src/gallium/Makefile.am index 9f98a7e..9e8b827 100644 --- a/src/gallium/Makefile.am +++ b/src/gallium/Makefile.am @@ -4,26 +4,20 @@ SUBDIRS = ## Gallium auxiliary module ## SUBDIRS += auxiliary SUBDIRS += auxiliary/pipe-loader ## ## Gallium pipe drivers and their respective winsys' ## -SUBDIRS += \ - drivers/ddebug \ - drivers/noop \ - drivers/trace \ - drivers/rbug - ## freedreno/msm/kgsl if HAVE_GALLIUM_FREEDRENO SUBDIRS += drivers/freedreno winsys/freedreno/drm endif ## i915g/i915 if HAVE_GALLIUM_I915 SUBDIRS += drivers/i915 winsys/i915/drm endif @@ -176,20 +170,26 @@ endif if HAVE_ST_NINE SUBDIRS += state_trackers/nine 
targets/d3dadapter9 endif ## ## Don't forget to bundle the remaining (non autotools) state-trackers/targets ## EXTRA_DIST += \ include \ + drivers/noop/SConscript \ + drivers/rbug/README \ + drivers/rbug/SConscript \ + drivers/trace/trace.xsl \ + drivers/trace/README \ + drivers/trace/SConscript \ state_trackers/README \ state_trackers/wgl targets/libgl-gdi \ targets/graw-gdi targets/graw-null targets/graw-xlib \ state_trackers/hgl targets/haiku-softpipe \ tools ## ## Gallium tests ## diff --git a/src/gallium/auxiliary/Makefile.am b/src/gallium/auxiliary/Makefile.am index a64ead2..5a92c1a 100644 --- a/src/gallium/auxiliary/Makefile.am +++ b/src/gallium/auxiliary/Makefile.am @@ -1,32 +1,40 @@ include Makefile.sources +include $(top_srcdir)/src/gallium/drivers/ddebug/Makefile.sources +include $(top_srcdir)/src/gallium/drivers/noop/Makefile.sources +include $(top_srcdir)/src/gallium/drivers/rbug/Makefile.sources +include $(top_srcdir)/src/gallium/drivers/trace/Makefile.sources include $(top_srcdir)/src/gallium/Automake.inc noinst_LTLIBRARIES = libgallium.la AM_CFLAGS = \ -I$(top_srcdir)/src/loader \ -I$(top_builddir)/src/compiler/nir \ -I$(top_srcdir)/src/gallium/auxiliary/util \ $(GALLIUM_CFLAGS) \ $(LIBUNWIND_CFLAGS) \ $(VISIBILITY_CFLAGS) \ $(MSVC2013_COMPAT_CFLAGS) AM_CXXFLAGS = \ $(VISIBILITY_CXXFLAGS) \ $(MSVC2013_COMPAT_CXXFLAGS) libgallium_la_SOURCES = \ $(C_SOURCES) \ $(NIR_SOURCES) \ - $(GENERATED_SOURCES) + $(GENERATED_SOURCES) \ + $(DDEBUG_SOURCES) \ + $(NOOP_SOURCES) \ + $(RBUG_SOURCES) \ + $(TRACE_SOURCES) if HAVE_LIBDRM AM_CFLAGS += \ $(LIBDRM_CFLAGS) libgallium_la_SOURCES += \ $(RENDERONLY_SOURCES) endif diff --git a/src/gallium/auxiliary/target-helpers/inline_debug_helper.h b/src/gallium/auxiliary/target-helpers/inline_debug_helper.h index 2443bf2..8556376 100644 --- a/src/gallium/auxiliary/target-helpers/inline_debug_helper.h +++ b/src/gallium/auxiliary/target-helpers/inline_debug_helper.h @@ -4,56 +4,30 @@ #include "pipe/p_compiler.h" #include 
"util/u_debug.h" #include "util/u_tests.h" /* Helper function to wrap a screen with * one or more debug driver: rbug, trace. */ -#ifdef GALLIUM_DDEBUG #include "ddebug/dd_public.h" -#endif - -#ifdef GALLIUM_TRACE #include "trace/tr_public.h" -#endif - -#ifdef GALLIUM_RBUG #include
Re: [Mesa-dev] [PATCH 2/2] clover: Query and export half precision support
Jan Veselywrites: > Signed-off-by: Jan Vesely With the spelling fixed up (s/has_halfs/has_halves/) patch is: Reviewed-by: Francisco Jerez > --- > > src/gallium/state_trackers/clover/api/device.cpp | 14 +++--- > src/gallium/state_trackers/clover/core/device.cpp | 5 + > src/gallium/state_trackers/clover/core/device.hpp | 1 + > 3 files changed, 17 insertions(+), 3 deletions(-) > > diff --git a/src/gallium/state_trackers/clover/api/device.cpp > b/src/gallium/state_trackers/clover/api/device.cpp > index b202102389..7b31b10e15 100644 > --- a/src/gallium/state_trackers/clover/api/device.cpp > +++ b/src/gallium/state_trackers/clover/api/device.cpp > @@ -150,7 +150,7 @@ clGetDeviceInfo(cl_device_id d_dev, cl_device_info param, >break; > > case CL_DEVICE_PREFERRED_VECTOR_WIDTH_HALF: > - buf.as_scalar() = 0; > + buf.as_scalar() = dev.has_halfs() ? 8 : 0; >break; > > case CL_DEVICE_MAX_CLOCK_FREQUENCY: > @@ -213,6 +213,13 @@ clGetDeviceInfo(cl_device_id d_dev, cl_device_info param, >buf.as_scalar() = 128; >break; > > + case CL_DEVICE_HALF_FP_CONFIG: > + // This is the "mandated minimum half precision floating-point > + // capability" for OpenCL 1.x. > + buf.as_scalar() = > + CL_FP_INF_NAN | CL_FP_ROUND_TO_NEAREST; > + break; > + > case CL_DEVICE_SINGLE_FP_CONFIG: >// This is the "mandated minimum single precision floating-point >// capability" for OpenCL 1.1. In OpenCL 1.2, nothing is required for > @@ -329,7 +336,8 @@ clGetDeviceInfo(cl_device_id d_dev, cl_device_info param, > " cl_khr_local_int32_base_atomics" > " cl_khr_local_int32_extended_atomics" > " cl_khr_byte_addressable_store" > - + std::string(dev.has_doubles() ? " cl_khr_fp64" : ""); > + + std::string(dev.has_doubles() ? " cl_khr_fp64" : "") > + + std::string(dev.has_halfs() ? 
" cl_khr_fp16" : ""); >break; > > case CL_DEVICE_PLATFORM: > @@ -365,7 +373,7 @@ clGetDeviceInfo(cl_device_id d_dev, cl_device_info param, >break; > > case CL_DEVICE_NATIVE_VECTOR_WIDTH_HALF: > - buf.as_scalar() = 0; > + buf.as_scalar() = dev.has_halfs() ? 8 : 0; >break; > > case CL_DEVICE_OPENCL_C_VERSION: > diff --git a/src/gallium/state_trackers/clover/core/device.cpp > b/src/gallium/state_trackers/clover/core/device.cpp > index fc74bd51a9..f38696cc44 100644 > --- a/src/gallium/state_trackers/clover/core/device.cpp > +++ b/src/gallium/state_trackers/clover/core/device.cpp > @@ -191,6 +191,11 @@ device::has_doubles() const { > } > > bool > +device::has_halfs() const { > + return pipe->get_param(pipe, PIPE_CAP_HALFS); > +} > + > +bool > device::has_unified_memory() const { > return pipe->get_param(pipe, PIPE_CAP_UMA); > } > diff --git a/src/gallium/state_trackers/clover/core/device.hpp > b/src/gallium/state_trackers/clover/core/device.hpp > index 4e11519421..4ef53de486 100644 > --- a/src/gallium/state_trackers/clover/core/device.hpp > +++ b/src/gallium/state_trackers/clover/core/device.hpp > @@ -67,6 +67,7 @@ namespace clover { >cl_uint max_compute_units() const; >bool image_support() const; >bool has_doubles() const; > + bool has_halfs() const; >bool has_unified_memory() const; >cl_uint mem_base_addr_align() const; > > -- > 2.13.5 signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] radeonsi/gfx9: proper workaround for LS/HS VGPR initialization bug
From: Nicolai HähnleWhen the HS wave is empty, the hardware writes the LS VGPRs starting at v0 instead of v2. Workaround by shifting them back into place when necessary. For simplicity, this is always done in the LS prolog. According to the hardware team, this will be fixed in future chips, so take that into account already. Note that this is not a bug fix, as the bug was already worked around by commit 166823bfd26 ("radeonsi/gfx9: add a temporary workaround for a tessellation driver bug"). This change merely replaces the workaround by one that should be better. --- src/gallium/drivers/radeonsi/si_shader.c | 71 src/gallium/drivers/radeonsi/si_state_draw.c | 6 +-- 2 files changed, 53 insertions(+), 24 deletions(-) diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c index 0e89ccac09d..6840b8e4b65 100644 --- a/src/gallium/drivers/radeonsi/si_shader.c +++ b/src/gallium/drivers/radeonsi/si_shader.c @@ -5629,20 +5629,31 @@ static void si_init_exec_from_input(struct si_shader_context *ctx, { LLVMValueRef args[] = { LLVMGetParam(ctx->main_fn, param), LLVMConstInt(ctx->i32, bitoffset, 0), }; lp_build_intrinsic(ctx->gallivm.builder, "llvm.amdgcn.init.exec.from.input", ctx->voidt, args, 2, LP_FUNC_ATTR_CONVERGENT); } +static bool si_vs_needs_prolog(struct si_screen *sscreen, + struct si_shader_selector *sel, bool as_ls) +{ + /* VGPR initialization fixup for Vega10 and Raven is always done in the +* VS prolog. */ + return sel->vs_needs_prolog || + (as_ls && + (sscreen->b.family == CHIP_VEGA10 || +sscreen->b.family == CHIP_RAVEN)); +} + static bool si_compile_tgsi_main(struct si_shader_context *ctx, bool is_monolithic) { struct si_shader *shader = ctx->shader; struct si_shader_selector *sel = shader->selector; struct lp_build_tgsi_context *bld_base = >bld_base; // TODO clean all this up! 
switch (ctx->type) { case PIPE_SHADER_VERTEX: @@ -5705,21 +5716,21 @@ static bool si_compile_tgsi_main(struct si_shader_context *ctx, * * For monolithic merged shaders, the first shader is wrapped in an * if-block together with its prolog in si_build_wrapper_function. */ if (ctx->screen->b.chip_class >= GFX9) { if (!is_monolithic && sel->info.num_instructions > 1 && /* not empty shader */ (shader->key.as_es || shader->key.as_ls) && (ctx->type == PIPE_SHADER_TESS_EVAL || (ctx->type == PIPE_SHADER_VERTEX && - !sel->vs_needs_prolog))) { + !si_vs_needs_prolog(ctx->screen, sel, shader->key.as_ls { si_init_exec_from_input(ctx, ctx->param_merged_wave_info, 0); } else if (ctx->type == PIPE_SHADER_TESS_CTRL || ctx->type == PIPE_SHADER_GEOMETRY) { if (!is_monolithic) si_init_exec_full_mask(ctx); /* The barrier must execute for all shaders in a * threadgroup. */ @@ -6357,33 +6368,34 @@ int si_compile_tgsi_shader(struct si_screen *sscreen, si_build_vs_prolog_function(, _key); parts[0] = ctx.main_fn; } si_build_wrapper_function(, parts + !need_prolog, 1 + need_prolog, need_prolog, 0); } else if (is_monolithic && ctx.type == PIPE_SHADER_TESS_CTRL) { if (sscreen->b.chip_class >= GFX9) { struct si_shader_selector *ls = shader->key.part.tcs.ls; LLVMValueRef parts[4]; + bool vs_needs_prolog = si_vs_needs_prolog(sscreen, ls, true); /* TCS main part */ parts[2] = ctx.main_fn; /* TCS epilog */ union si_shader_part_key tcs_epilog_key; memset(_epilog_key, 0, sizeof(tcs_epilog_key)); tcs_epilog_key.tcs_epilog.states = shader->key.part.tcs.epilog; si_build_tcs_epilog_function(, _epilog_key); parts[3] = ctx.main_fn; /* VS prolog */ - if (ls->vs_needs_prolog) { + if (vs_needs_prolog) { union si_shader_part_key vs_prolog_key; si_get_vs_prolog_key(>info,
Re: [Mesa-dev] [PATCH 3/3] radeonsi/gfx9: implement primitive binning
I actually made a mistake while porting the code. All UINT_MAX
occurrences should stay, and UINT_MAX should be the terminator, so I'm
adding this:

diff --git a/src/gallium/drivers/radeonsi/si_state_binning.c b/src/gallium/drivers/radeonsi/si_state_binning.c
index 56bcdc8..d75e86e 100644
--- a/src/gallium/drivers/radeonsi/si_state_binning.c
+++ b/src/gallium/drivers/radeonsi/si_state_binning.c
@@ -55,7 +55,7 @@ static struct uvec2 si_find_bin_size(struct si_screen *sscreen,
 	const struct si_bin_size_map *subtable =
 		&table[log_num_rb_per_se][log_num_se][0];

-	for (i = 0; subtable[i].bin_size_x != 0; i++) {
+	for (i = 0; subtable[i].start != UINT_MAX; i++) {
 		if (sum >= subtable[i].start && sum < subtable[i + 1].start)
 			break;
 	}

Marek
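As a rough illustration of why the terminator matters, here is a self-contained sketch of the range search with a UINT_MAX sentinel. The struct layout mirrors the one quoted in the patch, but example_subtable and its values are invented for this example and are not the real Raven tuning tables; find_bin_size() is a simplified stand-in for si_find_bin_size() that takes the subtable directly.

```c
#include <limits.h>

struct uvec2 {
    unsigned x, y;
};

struct bin_size_map {
    unsigned start;      /* first "sum" value this row applies to */
    unsigned bin_size_x;
    unsigned bin_size_y;
};

/* Hypothetical subtable with illustrative values only. The last row has
 * start == UINT_MAX and acts as the terminator. */
const struct bin_size_map example_subtable[] = {
    { 0,        128, 128 },
    { 2,        128,  64 },
    { 5,         64,  64 },
    { UINT_MAX,   0,   0 },
};

/* Find the row where sum >= row[i].start and sum < row[i + 1].start.
 * The loop stops at the terminator, so row[i + 1] is always in bounds. */
struct uvec2 find_bin_size(const struct bin_size_map *subtable, unsigned sum)
{
    unsigned i;

    for (i = 0; subtable[i].start != UINT_MAX; i++) {
        if (sum >= subtable[i].start && sum < subtable[i + 1].start)
            break;
    }

    struct uvec2 size = { subtable[i].bin_size_x, subtable[i].bin_size_y };
    return size;
}
```

Because the loop condition tests start rather than bin_size_x, a row whose bin size is zero can sit in the middle of a table without ending the search early, and every sum below UINT_MAX lands in the last real row at the latest. In this sketch a sum equal to UINT_MAX falls through to the terminator row and yields a zero bin size.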
Re: [Mesa-dev] [PATCH 3/3] radeonsi/gfx9: implement primitive binning
On Mon, Sep 4, 2017 at 3:56 PM, Nicolai Hähnlewrote: > On 01.09.2017 02:57, Marek Olšák wrote: >> >> From: Marek Olšák >> >> This increases performance, but it was tuned for Raven, not Vega. >> We don't know yet how Vega will perform, hopefully not worse. >> --- >> src/gallium/drivers/radeon/r600_pipe_common.c | 2 + >> src/gallium/drivers/radeon/r600_pipe_common.h | 2 + >> src/gallium/drivers/radeonsi/Makefile.sources | 1 + >> src/gallium/drivers/radeonsi/si_hw_context.c| 2 + >> src/gallium/drivers/radeonsi/si_pipe.c | 5 + >> src/gallium/drivers/radeonsi/si_pipe.h | 2 + >> src/gallium/drivers/radeonsi/si_state.c | 26 +- >> src/gallium/drivers/radeonsi/si_state.h | 6 +- >> src/gallium/drivers/radeonsi/si_state_binning.c | 448 >> >> src/gallium/drivers/radeonsi/si_state_shaders.c | 2 + >> 10 files changed, 489 insertions(+), 7 deletions(-) >> create mode 100644 src/gallium/drivers/radeonsi/si_state_binning.c >> > [snip] > >> diff --git a/src/gallium/drivers/radeonsi/si_state_binning.c >> b/src/gallium/drivers/radeonsi/si_state_binning.c >> new file mode 100644 >> index 000..56bcdc8 >> --- /dev/null >> +++ b/src/gallium/drivers/radeonsi/si_state_binning.c >> @@ -0,0 +1,448 @@ >> +/* >> + * Copyright 2017 Advanced Micro Devices, Inc. >> + * >> + * Permission is hereby granted, free of charge, to any person obtaining >> a >> + * copy of this software and associated documentation files (the >> "Software"), >> + * to deal in the Software without restriction, including without >> limitation >> + * on the rights to use, copy, modify, merge, publish, distribute, sub >> + * license, and/or sell copies of the Software, and to permit persons to >> whom >> + * the Software is furnished to do so, subject to the following >> conditions: >> + * >> + * The above copyright notice and this permission notice (including the >> next >> + * paragraph) shall be included in all copies or substantial portions of >> the >> + * Software. 
>> + * >> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, >> EXPRESS OR >> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF >> MERCHANTABILITY, >> + * FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT >> SHALL >> + * THE AUTHOR(S) AND/OR THEIR SUPPLIERS BE LIABLE FOR ANY CLAIM, >> + * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR >> + * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR >> THE >> + * USE OR OTHER DEALINGS IN THE SOFTWARE. >> + */ >> + >> +/* This file handles register programming of primitive binning. */ >> + >> +#include "si_pipe.h" >> +#include "sid.h" >> +#include "gfx9d.h" >> +#include "radeon/r600_cs.h" >> + >> +struct uvec2 { >> + unsigned x, y; >> +}; >> + >> +struct si_bin_size_map { >> + unsigned start; >> + unsigned bin_size_x; >> + unsigned bin_size_y; >> +}; >> + >> +typedef struct si_bin_size_map si_bin_size_subtable[3][9]; >> + >> +/* Find the bin size where sum is >= table[i].start and < table[i + >> 1].start. */ >> +static struct uvec2 si_find_bin_size(struct si_screen *sscreen, >> +const si_bin_size_subtable table[], >> +unsigned sum) >> +{ >> + unsigned log_num_rb_per_se = >> + util_logbase2_ceil(sscreen->b.info.num_render_backends / >> + sscreen->b.info.max_se); >> + unsigned log_num_se = util_logbase2_ceil(sscreen->b.info.max_se); >> + unsigned i; >> + >> + /* Get the chip-specific subtable. 
*/ >> + const struct si_bin_size_map *subtable = >> + [log_num_rb_per_se][log_num_se][0]; >> + >> + for (i = 0; subtable[i].bin_size_x != 0; i++) { >> + if (sum >= subtable[i].start && sum < subtable[i + >> 1].start) >> + break; >> + } >> + >> + struct uvec2 size = {subtable[i].bin_size_x, >> subtable[i].bin_size_y}; >> + return size; >> +} >> + >> +static struct uvec2 si_get_color_bin_size(struct si_context *sctx, >> + unsigned cb_target_enabled_4bit) >> +{ >> + unsigned nr_samples = sctx->framebuffer.nr_samples; >> + unsigned sum = 0; >> + >> + /* Compute the sum of all Bpp. */ >> + for (unsigned i = 0; i < sctx->framebuffer.state.nr_cbufs; i++) { >> + if (!(cb_target_enabled_4bit & (0xf << (i * 4 >> + continue; >> + >> + struct r600_texture *rtex = >> + (struct >> r600_texture*)sctx->framebuffer.state.cbufs[i]->texture; >> + sum += rtex->surface.bpe; >> + } > > > I believe this should early-out for !sum, for depth-only rendering. See the table. There are different values for depth-only rendering depending on the number of SEs /
[Mesa-dev] [Bug 101340] i915_surface.c:108:4: error: too few arguments to function ‘util_blitter_default_src_texture’
https://bugs.freedesktop.org/show_bug.cgi?id=101340

Emil Velikov changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #2 from Emil Velikov ---
Resolved, as pointed out by Vinson.
Re: [Mesa-dev] [PATCH] i965: expose sRGB visuals and EGL_KHR_gl_colorspace
On 09/04/2017 07:13 PM, Jason Ekstrand wrote:
> A quick scan through and this looks pretty gross. There may be no better
> way to do it but I'm not sure with only a cursory glance. I'd like this to
> wait on either me or Ken spending enough brain cells on it to do a proper
> review.

Sure, no problem.

> On September 4, 2017 6:12:20 AM Tapani Pälli wrote:
>> Patch exposes sRGB visuals and adds DRI integer query support for
>> __DRI2_RENDERER_HAS_FRAMEBUFFER_SRGB. Further changes make sure that
>> we mark if the app explicitly wanted sRGB and for these framebuffers
>> we don't turn sRGB off in intel_gles3_srgb_workaround. This way we
>> keep compatibility for existing applications relying on default sRGB
>> and only add more visual support.
>>
>> With this change, following dEQP tests start to pass:
>>
>>    dEQP-EGL.functional.wide_color.window__colorspace_srgb
>>    dEQP-EGL.functional.wide_color.pbuffer__colorspace_srgb
>>
>> Signed-off-by: Tapani Pälli
>> ---
>> I did see following tests fail during CI run:
>>
>>    ES31-CTS.functional.blend_equation_advanced.srgb.colorburn.hswm64
>>    ES31-CTS.functional.blend_equation_advanced.basic.colordodge.hswm64
>>
>> However they don't seem to fail on non-hsw, so I'm not sure if this is
>> really a regression here? I saw that Ken has been burning some colors
>> before so I've CC:d him.
>>
>> Note, this might have some effect on following bug:
>> https://bugs.freedesktop.org/show_bug.cgi?id=102503
>>
>> For me SuperTuxKart was working when testing (!) But it could be I was
>> testing a wrong version.
src/mesa/drivers/dri/i965/brw_context.c | 16 ++-- src/mesa/drivers/dri/i965/intel_fbo.h | 5 + src/mesa/drivers/dri/i965/intel_screen.c | 12 3 files changed, 27 insertions(+), 6 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c index 6441311d47..a8e39e3bb6 100644 --- a/src/mesa/drivers/dri/i965/brw_context.c +++ b/src/mesa/drivers/dri/i965/brw_context.c @@ -1112,8 +1112,8 @@ intelUnbindContext(__DRIcontext * driContextPriv) * * Unfortunately, renderbuffer setup happens before a context is created. So * in intel_screen.c we always set up sRGB, and here, if you're a GLES2/3 - * context (without an sRGB visual, though we don't have sRGB visuals exposed - * yet), we go turn that back off before anyone finds out. + * context (without an sRGB visual), we go turn that back off before anyone + * finds out. */ static void intel_gles3_srgb_workaround(struct brw_context *brw, @@ -1124,15 +1124,19 @@ intel_gles3_srgb_workaround(struct brw_context *brw, if (_mesa_is_desktop_gl(ctx) || !fb->Visual.sRGBCapable) return; - /* Some day when we support the sRGB capable bit on visuals available for - * GLES, we'll need to respect that and not disable things here. - */ - fb->Visual.sRGBCapable = false; for (int i = 0; i < BUFFER_COUNT; i++) { struct gl_renderbuffer *rb = fb->Attachment[i].Renderbuffer; + + /* Check if sRGB was specifically asked for. */ + struct intel_renderbuffer *irb = intel_get_renderbuffer(fb, i); + if (irb && irb->explicit_srgb) + return; + if (rb) rb->Format = _mesa_get_srgb_format_linear(rb->Format); } + /* Disable sRGB from framebuffers that are not compatible. */ + fb->Visual.sRGBCapable = false; } GLboolean diff --git a/src/mesa/drivers/dri/i965/intel_fbo.h b/src/mesa/drivers/dri/i965/intel_fbo.h index 1e2494286b..c8c2ed9a1b 100644 --- a/src/mesa/drivers/dri/i965/intel_fbo.h +++ b/src/mesa/drivers/dri/i965/intel_fbo.h @@ -116,6 +116,11 @@ struct intel_renderbuffer * for the duration of a mapping. 
*/ bool singlesample_mt_is_tmp; + + /** + * Application specifically asked for a sRGB visual. + */ + bool explicit_srgb; }; diff --git a/src/mesa/drivers/dri/i965/intel_screen.c b/src/mesa/drivers/dri/i965/intel_screen.c index d39509bcb8..79cc962ab1 100644 --- a/src/mesa/drivers/dri/i965/intel_screen.c +++ b/src/mesa/drivers/dri/i965/intel_screen.c @@ -1341,6 +1341,9 @@ brw_query_renderer_integer(__DRIscreen *dri_screen, case __DRI2_RENDERER_HAS_TEXTURE_3D: value[0] = 1; return 0; + case __DRI2_RENDERER_HAS_FRAMEBUFFER_SRGB: + value[0] = screen->mesa_format_supports_render[MESA_FORMAT_B8G8R8A8_SRGB]; + return 0; default: return driQueryRendererIntegerCommon(dri_screen, param, value); } @@ -1486,12 +1489,16 @@ intelCreateBuffer(__DRIscreen *dri_screen, fb->Visual.samples = num_samples; } + bool is_srgb = false; + if (mesaVis->redBits == 5) { rgbFormat = mesaVis->redMask == 0x1f ? MESA_FORMAT_R5G6B5_UNORM : MESA_FORMAT_B5G6R5_UNORM; } else if (mesaVis->sRGBCapable) { rgbFormat = mesaVis->redMask == 0xff ? MESA_FORMAT_R8G8B8A8_SRGB : MESA_FORMAT_B8G8R8A8_SRGB; + /*
Re: [Mesa-dev] [PATCH] i965: expose sRGB visuals and EGL_KHR_gl_colorspace
Hi;

On 09/04/2017 06:42 PM, Emil Velikov wrote:
> Hi Tapani,
>
> On 4 September 2017 at 14:11, Tapani Pälli wrote:
>> Patch exposes sRGB visuals and adds DRI integer query support for
>> __DRI2_RENDERER_HAS_FRAMEBUFFER_SRGB. Further changes make sure that
>> we mark if the app explicitly wanted sRGB and for these framebuffers
>> we don't turn sRGB off in intel_gles3_srgb_workaround. This way we
>> keep compatibility for existing applications relying on default sRGB
>> and only add more visual support.
>>
>> With this change, following dEQP tests start to pass:
>>
>>    dEQP-EGL.functional.wide_color.window__colorspace_srgb
>>    dEQP-EGL.functional.wide_color.pbuffer__colorspace_srgb
>>
>> Signed-off-by: Tapani Pälli
>
> There's a couple of minor suggestions below. With those
> Reviewed-by: Emil Velikov
>
> Please give me a couple of days so I can test this with KDE/Plasma.
> Just in case ... ;-)
>
>> @@ -1504,10 +1511,12 @@ intelCreateBuffer(__DRIscreen *dri_screen,
>>     /* setup the hardware-based renderbuffers */
>>     rb = intel_create_winsys_renderbuffer(screen, rgbFormat, num_samples);
>>     _mesa_attach_and_own_rb(fb, BUFFER_FRONT_LEFT, &rb->Base.Base);
>> +   rb->explicit_srgb = is_srgb ? true : false;
>
> Both variables are of type bool, so this can be:
>    rb->explicit_srgb = is_srgb;

oops yes of course, I think I had originally something more complicated ..

>>
>>     if (mesaVis->doubleBufferMode) {
>>        rb = intel_create_winsys_renderbuffer(screen, rgbFormat, num_samples);
>>        _mesa_attach_and_own_rb(fb, BUFFER_BACK_LEFT, &rb->Base.Base);
>> +      rb->explicit_srgb = is_srgb ? true : false;
>
> Ditto.
>
>>     }
>>
>>     /*
>> @@ -1854,6 +1863,9 @@ intel_screen_make_configs(__DRIscreen *dri_screen)
>>        MESA_FORMAT_B8G8R8A8_UNORM,
>>        MESA_FORMAT_B8G8R8X8_UNORM,
>>
>> +      MESA_FORMAT_B8G8R8A8_SRGB,
>> +      MESA_FORMAT_B8G8R8X8_SRGB,
>> +
>
> I was going to mention - you need more, yet it seems like my earlier
> patch never got a reply [1]
> If you've got a few minutes a pair of eyes would be appreciated.
>
> -Emil
>
> [1] https://patchwork.freedesktop.org/patch/169660/

Thanks Emil, I will spend some time with your patch tomorrow!
[Mesa-dev] [Bug 102463] SpaceEngine can't render main scenery properly
https://bugs.freedesktop.org/show_bug.cgi?id=102463

Hi-Angel changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hi-an...@yandex.ru
[Mesa-dev] [Bug 102518] [apitrace, backtrace] Crash in _mesa_is_bufferobj during load of "XCOM 2: War of the Chosen"
https://bugs.freedesktop.org/show_bug.cgi?id=102518

Marc Di Luzio changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mdiluzio@feralinteractive.com
[Mesa-dev] [Bug 102518] [apitrace, backtrace] Crash in _mesa_is_bufferobj during load of "XCOM 2: War of the Chosen"
https://bugs.freedesktop.org/show_bug.cgi?id=102518

--- Comment #8 from Marc Di Luzio ---
Cheers Kai. That line is being removed from the default branch as we speak,
while we sort out the actual error on our side.

Apologies for any wasted time.
Re: [Mesa-dev] [Mesa-stable] [PATCH] spirv: Add support for the HelperInvocation builtin
Yeah, just haven't gotten around to pushing it. Feel free.

On September 4, 2017 7:33:38 AM Andres Gomez wrote:

> Jason, has this patch fallen through the cracks ?
>
> On Mon, 2017-08-21 at 22:11 -0700, Jason Ekstrand wrote:
>> I have no idea how this got missed but it's been missing since forever.
>>
>> Cc: mesa-sta...@lists.freedesktop.org
>> ---
>>  src/compiler/spirv/vtn_variables.c | 5 -
>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/compiler/spirv/vtn_variables.c b/src/compiler/spirv/vtn_variables.c
>> index 6a8776b..87cb935 100644
>> --- a/src/compiler/spirv/vtn_variables.c
>> +++ b/src/compiler/spirv/vtn_variables.c
>> @@ -1121,6 +1121,10 @@ vtn_get_builtin_location(struct vtn_builder *b,
>>        *location = FRAG_RESULT_DEPTH;
>>        assert(*mode == nir_var_shader_out);
>>        break;
>> +   case SpvBuiltInHelperInvocation:
>> +      *location = SYSTEM_VALUE_HELPER_INVOCATION;
>> +      set_mode_system_value(mode);
>> +      break;
>>     case SpvBuiltInNumWorkgroups:
>>        *location = SYSTEM_VALUE_NUM_WORK_GROUPS;
>>        set_mode_system_value(mode);
>> @@ -1177,7 +1181,6 @@ vtn_get_builtin_location(struct vtn_builder *b,
>>        *location = SYSTEM_VALUE_VIEW_INDEX;
>>        set_mode_system_value(mode);
>>        break;
>> -   case SpvBuiltInHelperInvocation:
>>     default:
>>        unreachable("unsupported builtin");
>>     }
>
> --
> Br,
>
> Andres
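The patch in this thread moves a case label that sat directly above `default:` (and so fell through to `unreachable()`) into a case of its own. A minimal sketch of that dispatch shape follows. The `BUILTIN_*` and `LOC_*` enums and the `builtin_location()` helper are hypothetical stand-ins, not the real SPIR-V or NIR definitions, and the sketch returns `false` where the real function calls `unreachable()`:

```c
#include <stdbool.h>

/* Hypothetical stand-ins; not the real SpvBuiltIn / gl_system_value enums. */
enum builtin {
   BUILTIN_FRAG_DEPTH,
   BUILTIN_HELPER_INVOCATION,
   BUILTIN_NUM_WORKGROUPS,
   BUILTIN_UNSUPPORTED,
};

enum location {
   LOC_NONE,
   LOC_FRAG_RESULT_DEPTH,
   LOC_SYSVAL_HELPER_INVOCATION,
   LOC_SYSVAL_NUM_WORK_GROUPS,
};

/* Map a builtin to a location; returns false for unsupported builtins. */
static bool
builtin_location(enum builtin b, enum location *loc, bool *is_sysval)
{
   *loc = LOC_NONE;
   *is_sysval = false;

   switch (b) {
   case BUILTIN_FRAG_DEPTH:
      *loc = LOC_FRAG_RESULT_DEPTH;
      return true;
   case BUILTIN_HELPER_INVOCATION:
      /* Own case: before the fix this label fell through to default. */
      *loc = LOC_SYSVAL_HELPER_INVOCATION;
      *is_sysval = true;
      return true;
   case BUILTIN_NUM_WORKGROUPS:
      *loc = LOC_SYSVAL_NUM_WORK_GROUPS;
      *is_sysval = true;
      return true;
   default:
      return false;
   }
}
```

Giving every supported builtin an explicit case keeps `default:` reserved for values that are genuinely unhandled.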
Re: [Mesa-dev] [PATCH] i965: expose sRGB visuals and EGL_KHR_gl_colorspace
A quick scan through and this looks pretty gross. There may be no better
way to do it but I'm not sure with only a cursory glance. I'd like this to
wait on either me or Ken spending enough brain cells on it to do a proper
review.

On September 4, 2017 6:12:20 AM Tapani Pälli wrote:

> Patch exposes sRGB visuals and adds DRI integer query support for
> __DRI2_RENDERER_HAS_FRAMEBUFFER_SRGB. Further changes make sure that
> we mark if the app explicitly wanted sRGB and for these framebuffers
> we don't turn sRGB off in intel_gles3_srgb_workaround. This way we
> keep compatibility for existing applications relying on default sRGB
> and only add more visual support.
>
> With this change, following dEQP tests start to pass:
>
>    dEQP-EGL.functional.wide_color.window__colorspace_srgb
>    dEQP-EGL.functional.wide_color.pbuffer__colorspace_srgb
>
> Signed-off-by: Tapani Pälli
> ---
> I did see following tests fail during CI run:
>
>    ES31-CTS.functional.blend_equation_advanced.srgb.colorburn.hswm64
>    ES31-CTS.functional.blend_equation_advanced.basic.colordodge.hswm64
>
> However they don't seem to fail on non-hsw, so I'm not sure if this is
> really a regression here? I saw that Ken has been burning some colors
> before so I've CC:d him.
>
> Note, this might have some effect on following bug:
> https://bugs.freedesktop.org/show_bug.cgi?id=102503
>
> For me SuperTuxKart was working when testing (!) But it could be I was
> testing a wrong version.
>
>  src/mesa/drivers/dri/i965/brw_context.c  | 16 ++--
>  src/mesa/drivers/dri/i965/intel_fbo.h    | 5 +
>  src/mesa/drivers/dri/i965/intel_screen.c | 12
>  3 files changed, 27 insertions(+), 6 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
> index 6441311d47..a8e39e3bb6 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.c
> +++ b/src/mesa/drivers/dri/i965/brw_context.c
> @@ -1112,8 +1112,8 @@ intelUnbindContext(__DRIcontext * driContextPriv)
>   *
>   * Unfortunately, renderbuffer setup happens before a context is created. So
>   * in intel_screen.c we always set up sRGB, and here, if you're a GLES2/3
> - * context (without an sRGB visual, though we don't have sRGB visuals exposed
> - * yet), we go turn that back off before anyone finds out.
> + * context (without an sRGB visual), we go turn that back off before anyone
> + * finds out.
>   */
>  static void
>  intel_gles3_srgb_workaround(struct brw_context *brw,
> @@ -1124,15 +1124,19 @@ intel_gles3_srgb_workaround(struct brw_context *brw,
>     if (_mesa_is_desktop_gl(ctx) || !fb->Visual.sRGBCapable)
>        return;
>
> -   /* Some day when we support the sRGB capable bit on visuals available for
> -    * GLES, we'll need to respect that and not disable things here.
> -    */
> -   fb->Visual.sRGBCapable = false;
>     for (int i = 0; i < BUFFER_COUNT; i++) {
>        struct gl_renderbuffer *rb = fb->Attachment[i].Renderbuffer;
> +
> +      /* Check if sRGB was specifically asked for. */
> +      struct intel_renderbuffer *irb = intel_get_renderbuffer(fb, i);
> +      if (irb && irb->explicit_srgb)
> +         return;
> +
>        if (rb)
>           rb->Format = _mesa_get_srgb_format_linear(rb->Format);
>     }
> +   /* Disable sRGB from framebuffers that are not compatible. */
> +   fb->Visual.sRGBCapable = false;
>  }
>
>  GLboolean
> diff --git a/src/mesa/drivers/dri/i965/intel_fbo.h b/src/mesa/drivers/dri/i965/intel_fbo.h
> index 1e2494286b..c8c2ed9a1b 100644
> --- a/src/mesa/drivers/dri/i965/intel_fbo.h
> +++ b/src/mesa/drivers/dri/i965/intel_fbo.h
> @@ -116,6 +116,11 @@ struct intel_renderbuffer
>      * for the duration of a mapping.
>      */
>     bool singlesample_mt_is_tmp;
> +
> +   /**
> +    * Application specifically asked for a sRGB visual.
> +    */
> +   bool explicit_srgb;
>  };
>
> diff --git a/src/mesa/drivers/dri/i965/intel_screen.c b/src/mesa/drivers/dri/i965/intel_screen.c
> index d39509bcb8..79cc962ab1 100644
> --- a/src/mesa/drivers/dri/i965/intel_screen.c
> +++ b/src/mesa/drivers/dri/i965/intel_screen.c
> @@ -1341,6 +1341,9 @@ brw_query_renderer_integer(__DRIscreen *dri_screen,
>     case __DRI2_RENDERER_HAS_TEXTURE_3D:
>        value[0] = 1;
>        return 0;
> +   case __DRI2_RENDERER_HAS_FRAMEBUFFER_SRGB:
> +      value[0] = screen->mesa_format_supports_render[MESA_FORMAT_B8G8R8A8_SRGB];
> +      return 0;
>     default:
>        return driQueryRendererIntegerCommon(dri_screen, param, value);
>     }
> @@ -1486,12 +1489,16 @@ intelCreateBuffer(__DRIscreen *dri_screen,
>        fb->Visual.samples = num_samples;
>     }
>
> +   bool is_srgb = false;
> +
>     if (mesaVis->redBits == 5) {
>        rgbFormat = mesaVis->redMask == 0x1f ? MESA_FORMAT_R5G6B5_UNORM
>                                             : MESA_FORMAT_B5G6R5_UNORM;
>     } else if (mesaVis->sRGBCapable) {
>        rgbFormat = mesaVis->redMask == 0xff ? MESA_FORMAT_R8G8B8A8_SRGB
>                                             : MESA_FORMAT_B8G8R8A8_SRGB;
> +      /* mesaVis->sRGBCapable was set, user is asking for sRGB */
> +      is_srgb = true;
>     } else if
Re: [Mesa-dev] [PATCH] i965: expose sRGB visuals and EGL_KHR_gl_colorspace
Hi Tapani,

On 4 September 2017 at 14:11, Tapani Pälli wrote:
> Patch exposes sRGB visuals and adds DRI integer query support for
> __DRI2_RENDERER_HAS_FRAMEBUFFER_SRGB. Further changes make sure that
> we mark if the app explicitly wanted sRGB and for these framebuffers
> we don't turn sRGB off in intel_gles3_srgb_workaround. This way we
> keep compatibility for existing applications relying on default sRGB
> and only add more visual support.
>
> With this change, following dEQP tests start to pass:
>
>    dEQP-EGL.functional.wide_color.window__colorspace_srgb
>    dEQP-EGL.functional.wide_color.pbuffer__colorspace_srgb
>
> Signed-off-by: Tapani Pälli

There's a couple of minor suggestions below. With those
Reviewed-by: Emil Velikov

Please give me a couple of days so I can test this with KDE/Plasma.
Just in case ... ;-)

> @@ -1504,10 +1511,12 @@ intelCreateBuffer(__DRIscreen *dri_screen,
>     /* setup the hardware-based renderbuffers */
>     rb = intel_create_winsys_renderbuffer(screen, rgbFormat, num_samples);
>     _mesa_attach_and_own_rb(fb, BUFFER_FRONT_LEFT, &rb->Base.Base);
> +   rb->explicit_srgb = is_srgb ? true : false;

Both variables are of type bool, so this can be:
   rb->explicit_srgb = is_srgb;

>
>     if (mesaVis->doubleBufferMode) {
>        rb = intel_create_winsys_renderbuffer(screen, rgbFormat, num_samples);
>        _mesa_attach_and_own_rb(fb, BUFFER_BACK_LEFT, &rb->Base.Base);
> +      rb->explicit_srgb = is_srgb ? true : false;

Ditto.

>     }
>
>     /*
> @@ -1854,6 +1863,9 @@ intel_screen_make_configs(__DRIscreen *dri_screen)
>        MESA_FORMAT_B8G8R8A8_UNORM,
>        MESA_FORMAT_B8G8R8X8_UNORM,
>
> +      MESA_FORMAT_B8G8R8A8_SRGB,
> +      MESA_FORMAT_B8G8R8X8_SRGB,
> +

I was going to mention - you need more, yet it seems like my earlier
patch never got a reply [1]
If you've got a few minutes a pair of eyes would be appreciated.

-Emil

[1] https://patchwork.freedesktop.org/patch/169660/
Re: [Mesa-dev] [PATCH 1/2] gallium: Add PIPE_CAP_HALFS
On Mon, Sep 4, 2017 at 4:13 PM, Jan Veselywrote: > On Sat, 2017-09-02 at 22:21 +0200, Erik Faye-Lund wrote: >> On Sat, Sep 2, 2017 at 2:55 AM, Jan Vesely wrote: >> > Denotes native half precision float operations capability >> > >> > Signed-off-by: Jan Vesely >> > --- >> > I can change the spelling to HALVES, but simplified english sounded more >> > appropriate. >> > >> > src/gallium/docs/source/screen.rst | 1 + >> > src/gallium/drivers/etnaviv/etnaviv_screen.c | 1 + >> > src/gallium/drivers/freedreno/freedreno_screen.c | 1 + >> > src/gallium/drivers/i915/i915_screen.c | 1 + >> > src/gallium/drivers/llvmpipe/lp_screen.c | 1 + >> > src/gallium/drivers/nouveau/nv30/nv30_screen.c | 1 + >> > src/gallium/drivers/nouveau/nv50/nv50_screen.c | 1 + >> > src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 1 + >> > src/gallium/drivers/r300/r300_screen.c | 1 + >> > src/gallium/drivers/r600/r600_pipe.c | 1 + >> > src/gallium/drivers/radeonsi/si_pipe.c | 1 + >> > src/gallium/drivers/softpipe/sp_screen.c | 1 + >> > src/gallium/drivers/svga/svga_screen.c | 1 + >> > src/gallium/drivers/swr/swr_screen.cpp | 1 + >> > src/gallium/drivers/vc4/vc4_screen.c | 1 + >> > src/gallium/drivers/virgl/virgl_screen.c | 1 + >> > src/gallium/include/pipe/p_defines.h | 1 + >> > 17 files changed, 17 insertions(+) >> > >> > diff --git a/src/gallium/docs/source/screen.rst >> > b/src/gallium/docs/source/screen.rst >> > index be14ddd0c0..e27a0e8325 100644 >> > --- a/src/gallium/docs/source/screen.rst >> > +++ b/src/gallium/docs/source/screen.rst >> > @@ -370,6 +370,7 @@ The integer capabilities: >> > * ``PIPE_CAP_TGSI_MUL_ZERO_WINS``: Whether TGSI shaders support the >> >``TGSI_PROPERTY_MUL_ZERO_WINS`` shader property. >> > * ``PIPE_CAP_DOUBLES``: Whether double precision floating-point operations >> > +* ``PIPE_CAP_HALFS``: Whether half precision floating-point operations >> >are supported. >> > * ``PIPE_CAP_INT64``: Whether 64-bit integer operations are supported. 
>> > * ``PIPE_CAP_INT64_DIVMOD``: Whether 64-bit integer division/modulo
>> > diff --git a/src/gallium/drivers/etnaviv/etnaviv_screen.c b/src/gallium/drivers/etnaviv/etnaviv_screen.c
>> > index f400e423de..9b4ff5bbf9 100644
>> > --- a/src/gallium/drivers/etnaviv/etnaviv_screen.c
>> > +++ b/src/gallium/drivers/etnaviv/etnaviv_screen.c
>> > @@ -248,6 +248,7 @@ etna_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
>> >     case PIPE_CAP_TGSI_FS_FBFETCH:
>> >     case PIPE_CAP_TGSI_MUL_ZERO_WINS:
>> >     case PIPE_CAP_DOUBLES:
>> > +   case PIPE_CAP_HALFS:
>> >     case PIPE_CAP_INT64:
>> >     case PIPE_CAP_INT64_DIVMOD:
>> >     case PIPE_CAP_TGSI_TEX_TXF_LZ:
>>
>> Shouldn't this be a shader cap? Some GPUs only support FP16 in the
>> fragment shader, for instance...
>
> Interesting, which GPUs would that be? I assumed that unified shaders
> would prevent such differences between stages.

At least ARM Mali-400. A Gallium driver for it is currently in the works:
- https://github.com/yuq/mesa-lima

Tegra 2 also has differing precisions in the vertex and fragment
shaders, but it's not even doing FP16, rather some unusual FP24 variant.
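The screen-cap versus shader-cap question raised above can be illustrated with a small sketch. Everything here (`sketch_screen`, `sketch_get_stage_param()`, the `SKETCH_*` enums) is hypothetical and only mirrors the general idea of a per-stage query; it is not Gallium's actual interface. It assumes a device that, as described in the reply, supports FP16 in the fragment stage only:

```c
#include <stdbool.h>

/* Hypothetical names for illustration; not Gallium's real enums. */
enum sketch_stage {
   SKETCH_STAGE_VERTEX,
   SKETCH_STAGE_FRAGMENT,
   SKETCH_STAGE_COUNT,
};

enum sketch_stage_cap {
   SKETCH_STAGE_CAP_FP16,
};

struct sketch_screen {
   bool fp16[SKETCH_STAGE_COUNT]; /* native FP16 support per stage */
};

/* Per-stage query: can express "fragment only" precisely. */
static int
sketch_get_stage_param(const struct sketch_screen *screen,
                       enum sketch_stage stage, enum sketch_stage_cap cap)
{
   if ((unsigned)stage >= SKETCH_STAGE_COUNT)
      return 0;

   switch (cap) {
   case SKETCH_STAGE_CAP_FP16:
      return screen->fp16[stage] ? 1 : 0;
   }
   return 0;
}

/* Screen-wide query: can only report what every stage supports. */
static int
sketch_get_screen_fp16(const struct sketch_screen *screen)
{
   for (int s = 0; s < SKETCH_STAGE_COUNT; s++) {
      if (!screen->fp16[s])
         return 0;
   }
   return 1;
}
```

With a single screen-wide flag, a fragment-only device has to report 0 and loses the capability altogether, which is the concern behind the "shader cap" suggestion.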
Re: [Mesa-dev] [PATCH 1/2] gallium: Add PIPE_CAP_HALFS
On Mon, 2017-09-04 at 15:56 +0200, Nicolai Hähnle wrote: > On 02.09.2017 02:55, Jan Vesely wrote: > > Denotes native half precision float operations capability > > > > Signed-off-by: Jan Vesely> > --- > > I can change the spelling to HALVES, but simplified english sounded more > > appropriate. > > > > src/gallium/docs/source/screen.rst | 1 + > > src/gallium/drivers/etnaviv/etnaviv_screen.c | 1 + > > src/gallium/drivers/freedreno/freedreno_screen.c | 1 + > > src/gallium/drivers/i915/i915_screen.c | 1 + > > src/gallium/drivers/llvmpipe/lp_screen.c | 1 + > > src/gallium/drivers/nouveau/nv30/nv30_screen.c | 1 + > > src/gallium/drivers/nouveau/nv50/nv50_screen.c | 1 + > > src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 1 + > > src/gallium/drivers/r300/r300_screen.c | 1 + > > src/gallium/drivers/r600/r600_pipe.c | 1 + > > src/gallium/drivers/radeonsi/si_pipe.c | 1 + > > src/gallium/drivers/softpipe/sp_screen.c | 1 + > > src/gallium/drivers/svga/svga_screen.c | 1 + > > src/gallium/drivers/swr/swr_screen.cpp | 1 + > > src/gallium/drivers/vc4/vc4_screen.c | 1 + > > src/gallium/drivers/virgl/virgl_screen.c | 1 + > > src/gallium/include/pipe/p_defines.h | 1 + > > 17 files changed, 17 insertions(+) > > > > diff --git a/src/gallium/docs/source/screen.rst > > b/src/gallium/docs/source/screen.rst > > index be14ddd0c0..e27a0e8325 100644 > > --- a/src/gallium/docs/source/screen.rst > > +++ b/src/gallium/docs/source/screen.rst > > @@ -370,6 +370,7 @@ The integer capabilities: > > * ``PIPE_CAP_TGSI_MUL_ZERO_WINS``: Whether TGSI shaders support the > > ``TGSI_PROPERTY_MUL_ZERO_WINS`` shader property. > > * ``PIPE_CAP_DOUBLES``: Whether double precision floating-point operations > > +* ``PIPE_CAP_HALFS``: Whether half precision floating-point operations > > are supported. > > This is awfully vague without more context. *How* are half-precision > floating-point operations supported? I followed PIPE_CAP_DOUBLES here, so I think it's intentional. 
my understanding is that it denotes that pipe driver wants to get FP16 operations in the IR. Ideally it would mean that the target supports fp16 instructions, but for example clang enables cl_khr_fp16 for all GCN hw even if 16 bit types are only legal for VI+. Jan > > Cheers, > Nicolai > > > > * ``PIPE_CAP_INT64``: Whether 64-bit integer operations are supported. > > * ``PIPE_CAP_INT64_DIVMOD``: Whether 64-bit integer division/modulo > > diff --git a/src/gallium/drivers/etnaviv/etnaviv_screen.c > > b/src/gallium/drivers/etnaviv/etnaviv_screen.c > > index f400e423de..9b4ff5bbf9 100644 > > --- a/src/gallium/drivers/etnaviv/etnaviv_screen.c > > +++ b/src/gallium/drivers/etnaviv/etnaviv_screen.c > > @@ -248,6 +248,7 @@ etna_screen_get_param(struct pipe_screen *pscreen, enum > > pipe_cap param) > > case PIPE_CAP_TGSI_FS_FBFETCH: > > case PIPE_CAP_TGSI_MUL_ZERO_WINS: > > case PIPE_CAP_DOUBLES: > > + case PIPE_CAP_HALFS: > > case PIPE_CAP_INT64: > > case PIPE_CAP_INT64_DIVMOD: > > case PIPE_CAP_TGSI_TEX_TXF_LZ: > > diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c > > b/src/gallium/drivers/freedreno/freedreno_screen.c > > index b26f67e4e2..949c79cfa0 100644 > > --- a/src/gallium/drivers/freedreno/freedreno_screen.c > > +++ b/src/gallium/drivers/freedreno/freedreno_screen.c > > @@ -309,6 +309,7 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum > > pipe_cap param) > > case PIPE_CAP_TGSI_FS_FBFETCH: > > case PIPE_CAP_TGSI_MUL_ZERO_WINS: > > case PIPE_CAP_DOUBLES: > > + case PIPE_CAP_HALFS: > > case PIPE_CAP_INT64: > > case PIPE_CAP_INT64_DIVMOD: > > case PIPE_CAP_TGSI_TEX_TXF_LZ: > > diff --git a/src/gallium/drivers/i915/i915_screen.c > > b/src/gallium/drivers/i915/i915_screen.c > > index e700e294da..d4ca6d6dfc 100644 > > --- a/src/gallium/drivers/i915/i915_screen.c > > +++ b/src/gallium/drivers/i915/i915_screen.c > > @@ -300,6 +300,7 @@ i915_get_param(struct pipe_screen *screen, enum > > pipe_cap cap) > > case PIPE_CAP_TGSI_FS_FBFETCH: > > case 
PIPE_CAP_TGSI_MUL_ZERO_WINS: > > case PIPE_CAP_DOUBLES: > > + case PIPE_CAP_HALFS: > > case PIPE_CAP_INT64: > > case PIPE_CAP_INT64_DIVMOD: > > case PIPE_CAP_TGSI_TEX_TXF_LZ: > > diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c > > b/src/gallium/drivers/llvmpipe/lp_screen.c > > index 32a405088f..a3ba042733 100644 > > --- a/src/gallium/drivers/llvmpipe/lp_screen.c > > +++ b/src/gallium/drivers/llvmpipe/lp_screen.c > > @@ -359,6 +359,7 @@ llvmpipe_get_param(struct pipe_screen *screen, enum > > pipe_cap param) > > case PIPE_CAP_BINDLESS_TEXTURE: > > case PIPE_CAP_NIR_SAMPLERS_AS_DEREF: > > case PIPE_CAP_MEMOBJ: > > + case PIPE_CAP_HALFS: > > return 0; > > } > > /* should only get
Re: [Mesa-dev] [Mesa-stable] [PATCH] spirv: Add support for the HelperInvocation builtin
Jason, has this patch fallen through the cracks ?

On Mon, 2017-08-21 at 22:11 -0700, Jason Ekstrand wrote:
> I have no idea how this got missed but it's been missing since forever.
>
> Cc: mesa-sta...@lists.freedesktop.org
> ---
>  src/compiler/spirv/vtn_variables.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/src/compiler/spirv/vtn_variables.c b/src/compiler/spirv/vtn_variables.c
> index 6a8776b..87cb935 100644
> --- a/src/compiler/spirv/vtn_variables.c
> +++ b/src/compiler/spirv/vtn_variables.c
> @@ -1121,6 +1121,10 @@ vtn_get_builtin_location(struct vtn_builder *b,
>        *location = FRAG_RESULT_DEPTH;
>        assert(*mode == nir_var_shader_out);
>        break;
> +   case SpvBuiltInHelperInvocation:
> +      *location = SYSTEM_VALUE_HELPER_INVOCATION;
> +      set_mode_system_value(mode);
> +      break;
>     case SpvBuiltInNumWorkgroups:
>        *location = SYSTEM_VALUE_NUM_WORK_GROUPS;
>        set_mode_system_value(mode);
> @@ -1177,7 +1181,6 @@ vtn_get_builtin_location(struct vtn_builder *b,
>        *location = SYSTEM_VALUE_VIEW_INDEX;
>        set_mode_system_value(mode);
>        break;
> -   case SpvBuiltInHelperInvocation:
>     default:
>        unreachable("unsupported builtin");
>     }

--
Br,

Andres
Re: [Mesa-dev] [PATCH] util/ralloc: set prev-pointers correctly in ralloc_adopt
On Monday, 2017-09-04 14:06:30 +0200, Nicolai Hähnle wrote:
> From: Nicolai Hähnle
>
> Found by inspection.
>
> I'm not aware of any actual failures caused by this, but a precise
> sequence of ralloc_adopt and ralloc_free should be able to cause
> problems.
> ---
>  src/util/ralloc.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/src/util/ralloc.c b/src/util/ralloc.c
> index bf46439df4e..50e629fe450 100644
> --- a/src/util/ralloc.c
> +++ b/src/util/ralloc.c
> @@ -304,24 +304,26 @@ ralloc_adopt(const void *new_ctx, void *old_ctx)
>     new_info = get_header(new_ctx);
>
>     /* If there are no children, bail. */
>     if (unlikely(old_info->child == NULL))
>        return;
>
>     /* Set all the children's parent to new_ctx; get a pointer to the last child. */
>     for (child = old_info->child; child->next != NULL; child = child->next) {
>        child->parent = new_info;
>     }
> +   child->parent = new_info;
>
>     /* Connect the two lists together; parent them to new_ctx; make old_ctx empty. */
>     child->next = new_info->child;
> -   child->parent = new_info;
> +   if (new_info->child)
> +      new_info->child->prev = child;

I would've written it like this as I find it clearer:

   if (child->next)
      child->next->prev = child;

Doesn't change much though, so either way:
Reviewed-by: Eric Engestrom

>     new_info->child = old_info->child;
>     old_info->child = NULL;
>  }
>
>  void *
>  ralloc_parent(const void *ptr)
>  {
>     ralloc_header *info;
>
>     if (unlikely(ptr == NULL))
> --
> 2.11.0
>
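The splice discussed in this review can be sketched in isolation. The `struct node` below is a simplified stand-in for ralloc's header (an assumption, not the real layout), and `adopt_children()` and `children_consistent()` are hypothetical helpers written for this illustration. The splice uses the `child->next->prev` form suggested in the review:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for ralloc's header; not the real struct layout. */
struct node {
   struct node *parent;
   struct node *child; /* first child */
   struct node *prev;
   struct node *next;
};

/* Move all children of old_ctx to the front of new_ctx's child list. */
static void
adopt_children(struct node *new_ctx, struct node *old_ctx)
{
   struct node *child;

   assert(new_ctx != NULL && old_ctx != NULL);

   /* If there are no children, bail. */
   if (old_ctx->child == NULL)
      return;

   /* Reparent every child; the loop stops on the last one. */
   for (child = old_ctx->child; child->next != NULL; child = child->next)
      child->parent = new_ctx;
   child->parent = new_ctx;

   /* Link the old tail to new_ctx's former first child, in both directions. */
   child->next = new_ctx->child;
   if (child->next)
      child->next->prev = child;

   new_ctx->child = old_ctx->child;
   old_ctx->child = NULL;
}

/* Returns 1 when parent and prev links agree along ctx's child list. */
static int
children_consistent(const struct node *ctx)
{
   const struct node *prev = NULL;

   for (const struct node *c = ctx->child; c != NULL; c = c->next) {
      if (c->parent != ctx || c->prev != prev)
         return 0;
      prev = c;
   }
   return 1;
}
```

Without the `prev` update, the former first child of `new_ctx` would keep a stale `prev` pointer, so a later unlink that trusts `prev` could corrupt the list; that is the kind of sequence the commit message alludes to.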
Re: [Mesa-dev] [PATCH 1/2] gallium: Add PIPE_CAP_HALFS
On Sat, 2017-09-02 at 22:21 +0200, Erik Faye-Lund wrote: > On Sat, Sep 2, 2017 at 2:55 AM, Jan Veselywrote: > > Denotes native half precision float operations capability > > > > Signed-off-by: Jan Vesely > > --- > > I can change the spelling to HALVES, but simplified english sounded more > > appropriate. > > > > src/gallium/docs/source/screen.rst | 1 + > > src/gallium/drivers/etnaviv/etnaviv_screen.c | 1 + > > src/gallium/drivers/freedreno/freedreno_screen.c | 1 + > > src/gallium/drivers/i915/i915_screen.c | 1 + > > src/gallium/drivers/llvmpipe/lp_screen.c | 1 + > > src/gallium/drivers/nouveau/nv30/nv30_screen.c | 1 + > > src/gallium/drivers/nouveau/nv50/nv50_screen.c | 1 + > > src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 1 + > > src/gallium/drivers/r300/r300_screen.c | 1 + > > src/gallium/drivers/r600/r600_pipe.c | 1 + > > src/gallium/drivers/radeonsi/si_pipe.c | 1 + > > src/gallium/drivers/softpipe/sp_screen.c | 1 + > > src/gallium/drivers/svga/svga_screen.c | 1 + > > src/gallium/drivers/swr/swr_screen.cpp | 1 + > > src/gallium/drivers/vc4/vc4_screen.c | 1 + > > src/gallium/drivers/virgl/virgl_screen.c | 1 + > > src/gallium/include/pipe/p_defines.h | 1 + > > 17 files changed, 17 insertions(+) > > > > diff --git a/src/gallium/docs/source/screen.rst > > b/src/gallium/docs/source/screen.rst > > index be14ddd0c0..e27a0e8325 100644 > > --- a/src/gallium/docs/source/screen.rst > > +++ b/src/gallium/docs/source/screen.rst > > @@ -370,6 +370,7 @@ The integer capabilities: > > * ``PIPE_CAP_TGSI_MUL_ZERO_WINS``: Whether TGSI shaders support the > >``TGSI_PROPERTY_MUL_ZERO_WINS`` shader property. > > * ``PIPE_CAP_DOUBLES``: Whether double precision floating-point operations > > +* ``PIPE_CAP_HALFS``: Whether half precision floating-point operations > >are supported. > > * ``PIPE_CAP_INT64``: Whether 64-bit integer operations are supported. 
> > * ``PIPE_CAP_INT64_DIVMOD``: Whether 64-bit integer division/modulo
> > diff --git a/src/gallium/drivers/etnaviv/etnaviv_screen.c b/src/gallium/drivers/etnaviv/etnaviv_screen.c
> > index f400e423de..9b4ff5bbf9 100644
> > --- a/src/gallium/drivers/etnaviv/etnaviv_screen.c
> > +++ b/src/gallium/drivers/etnaviv/etnaviv_screen.c
> > @@ -248,6 +248,7 @@ etna_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
> >     case PIPE_CAP_TGSI_FS_FBFETCH:
> >     case PIPE_CAP_TGSI_MUL_ZERO_WINS:
> >     case PIPE_CAP_DOUBLES:
> > +   case PIPE_CAP_HALFS:
> >     case PIPE_CAP_INT64:
> >     case PIPE_CAP_INT64_DIVMOD:
> >     case PIPE_CAP_TGSI_TEX_TXF_LZ:
>
> Shouldn't this be a shader cap? Some GPUs only support FP16 in the
> fragment shader, for instance...

Interesting, which GPUs would that be? I assumed that unified shaders
would prevent such differences between stages.

thanks,
Jan

--
Jan Vesely
Re: [Mesa-dev] [PATCH 1/2] gallium: Add PIPE_CAP_HALFS
On 02.09.2017 02:55, Jan Vesely wrote: Denotes native half precision float operations capability Signed-off-by: Jan Vesely--- I can change the spelling to HALVES, but simplified english sounded more appropriate. src/gallium/docs/source/screen.rst | 1 + src/gallium/drivers/etnaviv/etnaviv_screen.c | 1 + src/gallium/drivers/freedreno/freedreno_screen.c | 1 + src/gallium/drivers/i915/i915_screen.c | 1 + src/gallium/drivers/llvmpipe/lp_screen.c | 1 + src/gallium/drivers/nouveau/nv30/nv30_screen.c | 1 + src/gallium/drivers/nouveau/nv50/nv50_screen.c | 1 + src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 1 + src/gallium/drivers/r300/r300_screen.c | 1 + src/gallium/drivers/r600/r600_pipe.c | 1 + src/gallium/drivers/radeonsi/si_pipe.c | 1 + src/gallium/drivers/softpipe/sp_screen.c | 1 + src/gallium/drivers/svga/svga_screen.c | 1 + src/gallium/drivers/swr/swr_screen.cpp | 1 + src/gallium/drivers/vc4/vc4_screen.c | 1 + src/gallium/drivers/virgl/virgl_screen.c | 1 + src/gallium/include/pipe/p_defines.h | 1 + 17 files changed, 17 insertions(+) diff --git a/src/gallium/docs/source/screen.rst b/src/gallium/docs/source/screen.rst index be14ddd0c0..e27a0e8325 100644 --- a/src/gallium/docs/source/screen.rst +++ b/src/gallium/docs/source/screen.rst @@ -370,6 +370,7 @@ The integer capabilities: * ``PIPE_CAP_TGSI_MUL_ZERO_WINS``: Whether TGSI shaders support the ``TGSI_PROPERTY_MUL_ZERO_WINS`` shader property. * ``PIPE_CAP_DOUBLES``: Whether double precision floating-point operations +* ``PIPE_CAP_HALFS``: Whether half precision floating-point operations are supported. This is awfully vague without more context. *How* are half-precision floating-point operations supported? Cheers, Nicolai * ``PIPE_CAP_INT64``: Whether 64-bit integer operations are supported. 
* ``PIPE_CAP_INT64_DIVMOD``: Whether 64-bit integer division/modulo diff --git a/src/gallium/drivers/etnaviv/etnaviv_screen.c b/src/gallium/drivers/etnaviv/etnaviv_screen.c index f400e423de..9b4ff5bbf9 100644 --- a/src/gallium/drivers/etnaviv/etnaviv_screen.c +++ b/src/gallium/drivers/etnaviv/etnaviv_screen.c @@ -248,6 +248,7 @@ etna_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param) case PIPE_CAP_TGSI_FS_FBFETCH: case PIPE_CAP_TGSI_MUL_ZERO_WINS: case PIPE_CAP_DOUBLES: + case PIPE_CAP_HALFS: case PIPE_CAP_INT64: case PIPE_CAP_INT64_DIVMOD: case PIPE_CAP_TGSI_TEX_TXF_LZ: diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c b/src/gallium/drivers/freedreno/freedreno_screen.c index b26f67e4e2..949c79cfa0 100644 --- a/src/gallium/drivers/freedreno/freedreno_screen.c +++ b/src/gallium/drivers/freedreno/freedreno_screen.c @@ -309,6 +309,7 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param) case PIPE_CAP_TGSI_FS_FBFETCH: case PIPE_CAP_TGSI_MUL_ZERO_WINS: case PIPE_CAP_DOUBLES: + case PIPE_CAP_HALFS: case PIPE_CAP_INT64: case PIPE_CAP_INT64_DIVMOD: case PIPE_CAP_TGSI_TEX_TXF_LZ: diff --git a/src/gallium/drivers/i915/i915_screen.c b/src/gallium/drivers/i915/i915_screen.c index e700e294da..d4ca6d6dfc 100644 --- a/src/gallium/drivers/i915/i915_screen.c +++ b/src/gallium/drivers/i915/i915_screen.c @@ -300,6 +300,7 @@ i915_get_param(struct pipe_screen *screen, enum pipe_cap cap) case PIPE_CAP_TGSI_FS_FBFETCH: case PIPE_CAP_TGSI_MUL_ZERO_WINS: case PIPE_CAP_DOUBLES: + case PIPE_CAP_HALFS: case PIPE_CAP_INT64: case PIPE_CAP_INT64_DIVMOD: case PIPE_CAP_TGSI_TEX_TXF_LZ: diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c b/src/gallium/drivers/llvmpipe/lp_screen.c index 32a405088f..a3ba042733 100644 --- a/src/gallium/drivers/llvmpipe/lp_screen.c +++ b/src/gallium/drivers/llvmpipe/lp_screen.c @@ -359,6 +359,7 @@ llvmpipe_get_param(struct pipe_screen *screen, enum pipe_cap param) case PIPE_CAP_BINDLESS_TEXTURE: case 
PIPE_CAP_NIR_SAMPLERS_AS_DEREF: case PIPE_CAP_MEMOBJ: + case PIPE_CAP_HALFS: return 0; } /* should only get here on unhandled cases */ diff --git a/src/gallium/drivers/nouveau/nv30/nv30_screen.c b/src/gallium/drivers/nouveau/nv30/nv30_screen.c index 72f886c911..084bc887ee 100644 --- a/src/gallium/drivers/nouveau/nv30/nv30_screen.c +++ b/src/gallium/drivers/nouveau/nv30/nv30_screen.c @@ -209,6 +209,7 @@ nv30_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param) case PIPE_CAP_TGSI_FS_FBFETCH: case PIPE_CAP_TGSI_MUL_ZERO_WINS: case PIPE_CAP_DOUBLES: + case PIPE_CAP_HALFS: case PIPE_CAP_INT64: case PIPE_CAP_INT64_DIVMOD: case PIPE_CAP_TGSI_TEX_TXF_LZ: diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c b/src/gallium/drivers/nouveau/nv50/nv50_screen.c index 0f25cd5fed..6b40515748 100644 ---
Re: [Mesa-dev] What is the difference between ROCm and Clover?
On 04.09.2017 15:38, Aaron Watry wrote:
On Sun, Sep 3, 2017 at 3:20 PM, David Niklas wrote:

Hello,
I'm interested in knowing why there are two different OpenCL
implementations of OpenCL drivers for AMD cards.
Is it because they support different ASICs?
One is older and going to be replaced?

As near as I can tell, ROCm requires PCIe 3.0 Atomic operations for both the CPU and GPU. As such, ROCm will never support the VLIW radeons (5000-6000 series) or the earlier generations of GCN (SI, CI, possibly VI?). According to the ROCm page, it requires Fiji/Polaris/Vega. It also requires either a Haswell or Ryzen CPU (or newer), so anyone with an older CPU will be left out of ROCm support. External GPUs via Thunderbolt 1/2 enclosures are also unsupported.

For what it's worth, if you have an "older" system without PCIe 3.0 atomics, you could still try and see how far you get with ROCm. As long as you don't *actually* need atomics between the CPU and GPU in your own code, you're probably fine. I've never tried it, though.

Cheers,
Nicolai

There is a bit of overlap between the two projects, in that Clover/libclc supports Polaris cards on various CPUs, but it's safe to say that for the moment, if your system is supported by ROCm, it is the more complete/well-tested option. Clover/Libclc are both still works in progress that work for some use cases, but they are not complete, and have never passed a run of the CL 1.2 conformance test suite in any configuration, and it's definitely a use at your own risk type of thing for now.

--Aaron

Thanks,
David

--
Learn what the world is really like,
but never forget what it ought to be.
Re: [Mesa-dev] [PATCH 3/3] radeonsi/gfx9: implement primitive binning
On 01.09.2017 02:57, Marek Olšák wrote: From: Marek OlšákThis increases performance, but it was tuned for Raven, not Vega. We don't know yet how Vega will perform, hopefully not worse. --- src/gallium/drivers/radeon/r600_pipe_common.c | 2 + src/gallium/drivers/radeon/r600_pipe_common.h | 2 + src/gallium/drivers/radeonsi/Makefile.sources | 1 + src/gallium/drivers/radeonsi/si_hw_context.c| 2 + src/gallium/drivers/radeonsi/si_pipe.c | 5 + src/gallium/drivers/radeonsi/si_pipe.h | 2 + src/gallium/drivers/radeonsi/si_state.c | 26 +- src/gallium/drivers/radeonsi/si_state.h | 6 +- src/gallium/drivers/radeonsi/si_state_binning.c | 448 src/gallium/drivers/radeonsi/si_state_shaders.c | 2 + 10 files changed, 489 insertions(+), 7 deletions(-) create mode 100644 src/gallium/drivers/radeonsi/si_state_binning.c [snip] diff --git a/src/gallium/drivers/radeonsi/si_state_binning.c b/src/gallium/drivers/radeonsi/si_state_binning.c new file mode 100644 index 000..56bcdc8 --- /dev/null +++ b/src/gallium/drivers/radeonsi/si_state_binning.c @@ -0,0 +1,448 @@ +/* + * Copyright 2017 Advanced Micro Devices, Inc. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * on the rights to use, copy, modify, merge, publish, distribute, sub + * license, and/or sell copies of the Software, and to permit persons to whom + * the Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. 
IN NO EVENT SHALL + * THE AUTHOR(S) AND/OR THEIR SUPPLIERS BE LIABLE FOR ANY CLAIM, + * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR + * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE + * USE OR OTHER DEALINGS IN THE SOFTWARE. + */ + +/* This file handles register programming of primitive binning. */ + +#include "si_pipe.h" +#include "sid.h" +#include "gfx9d.h" +#include "radeon/r600_cs.h" + +struct uvec2 { + unsigned x, y; +}; + +struct si_bin_size_map { + unsigned start; + unsigned bin_size_x; + unsigned bin_size_y; +}; + +typedef struct si_bin_size_map si_bin_size_subtable[3][9]; + +/* Find the bin size where sum is >= table[i].start and < table[i + 1].start. */ +static struct uvec2 si_find_bin_size(struct si_screen *sscreen, +const si_bin_size_subtable table[], +unsigned sum) +{ + unsigned log_num_rb_per_se = + util_logbase2_ceil(sscreen->b.info.num_render_backends / + sscreen->b.info.max_se); + unsigned log_num_se = util_logbase2_ceil(sscreen->b.info.max_se); + unsigned i; + + /* Get the chip-specific subtable. */ + const struct si_bin_size_map *subtable = + [log_num_rb_per_se][log_num_se][0]; + + for (i = 0; subtable[i].bin_size_x != 0; i++) { + if (sum >= subtable[i].start && sum < subtable[i + 1].start) + break; + } + + struct uvec2 size = {subtable[i].bin_size_x, subtable[i].bin_size_y}; + return size; +} + +static struct uvec2 si_get_color_bin_size(struct si_context *sctx, + unsigned cb_target_enabled_4bit) +{ + unsigned nr_samples = sctx->framebuffer.nr_samples; + unsigned sum = 0; + + /* Compute the sum of all Bpp. */ + for (unsigned i = 0; i < sctx->framebuffer.state.nr_cbufs; i++) { + if (!(cb_target_enabled_4bit & (0xf << (i * 4 + continue; + + struct r600_texture *rtex = + (struct r600_texture*)sctx->framebuffer.state.cbufs[i]->texture; + sum += rtex->surface.bpe; + } I believe this should early-out for !sum, for depth-only rendering. 
+ + /* Multiply the sum by some function of the number of samples. */ + if (nr_samples >= 2) { + if (sctx->ps_iter_samples >= 2) + sum *= nr_samples; + else + sum *= 2; + } + + static const si_bin_size_subtable table[] = { + { + /* One RB / SE */ + { + /* One shader engine */ + {0, 128, 128 }, +
Re: [Mesa-dev] [PATCH] i965: Set "Subslice Hashing Mode" to 16x16 on Apollolake.
Thanks Eero!

On Tue, 2017-08-29 at 12:26 +0300, Eero Tamminen wrote:
> Hi,
>
> On 28.08.2017 17:33, Andres Gomez wrote:
> > Kenneth, would we want this patch in 17.1 or we shouldn't bother ?
>
> See this for more extensive info on its impact:
> https://bugs.freedesktop.org/show_bug.cgi?id=102272
>
> (It's just performance. While it improves some cases, it regresses others.)
>
>
> - Eero
>
> > On Tue, 2017-05-30 at 16:28 -0700, Kenneth Graunke wrote:
> > > As of 4.11, the kernel isn't bothering to set the subslice hashing mode
> > > on Apollolake, leaving it at the default of 8x8. (It initializes it to
> > > 16x4 on most platforms.)
> > >
> > > Performance data for GPUTest Triangle on Apollolake at 1024x640:
> > >
> > > X-tiled RT:
> > > ---
> > > 8x8 -> 16x4: 2.4325% +/- 0.383683% (n=107)
> > > 8x8 -> 8x4: -3.75105% +/- 0.592491% (n=40)
> > > 8x8 -> 16x16: 6.17238% +/- 0.67157% (n=30)
> > >
> > > Y-tiled RT:
> > > ---
> > > 8x8 -> 16x4: 1.30307% +/- 0.297292% (n=205)
> > > 8x8 -> 8x4: -0.769282% +/- 0.729557% (n=35)
> > > 8x8 -> 16x16: 3.00254% +/- 0.715503% (n=40)
> > >
> > > 8x MSAA RT (INTEL_FORCE_MSAA=8):
> > >
> > > 8x8 -> 16x4: 1.38889% +/- 0.93729% (n=7)
> > > 8x8 -> 8x4: -2.10643% +/- 1.15153% (n=3)
> > > 8x8 -> 16x16: 3.87183% +/- 1.08851% (n=5)
> > >
> > > Based on this, we choose 16x16 for Apollolake.
> > >
> > > Skylake GT2 with X-tiled buffers appears to be a toss-up between 16x4
> > > and 16x16, and with Y-tiled buffers it doesn't seem to really matter.
> > > So we'll leave Skylake alone for now.
> > >
> > > The hashing mode doesn't seem to make a measurable impact on more
> > > complex benchmarks.
> > > ---
> > > src/mesa/drivers/dri/i965/brw_defines.h | 7 +++++++
> > > src/mesa/drivers/dri/i965/brw_state_upload.c | 9 +++++++++
> > > 2 files changed, 16 insertions(+)
> > >
> > > diff --git a/src/mesa/drivers/dri/i965/brw_defines.h
> > > b/src/mesa/drivers/dri/i965/brw_defines.h
> > > index 312dddafd77..1278634269a 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_defines.h
> > > +++ b/src/mesa/drivers/dri/i965/brw_defines.h
> > > @@ -1616,6 +1616,13 @@ enum brw_pixel_shader_coverage_mask_mode {
> > > # define GEN8_HIZ_PMA_MASK_BITS \
> > > REG_MASK(GEN8_HIZ_NP_PMA_FIX_ENABLE |
> > > GEN8_HIZ_NP_EARLY_Z_FAILS_DISABLE)
> > >
> > > +#define GEN7_GT_MODE 0x7008
> > > +# define GEN9_SUBSLICE_HASHING_8x8 (0 << 8)
> > > +# define GEN9_SUBSLICE_HASHING_16x4 (1 << 8)
> > > +# define GEN9_SUBSLICE_HASHING_8x4 (2 << 8)
> > > +# define GEN9_SUBSLICE_HASHING_16x16 (3 << 8)
> > > +# define GEN9_SUBSLICE_HASHING_MASK_BITS REG_MASK(3 << 8)
> > > +
> > > /* Predicate registers */
> > > #define MI_PREDICATE_SRC0 0x2400
> > > #define MI_PREDICATE_SRC1 0x2408
> > > diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c
> > > b/src/mesa/drivers/dri/i965/brw_state_upload.c
> > > index 4647f1c41e0..6a8547c4ede 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_state_upload.c
> > > +++ b/src/mesa/drivers/dri/i965/brw_state_upload.c
> > > @@ -70,6 +70,15 @@ brw_upload_initial_gpu_state(struct brw_context *brw)
> > > GEN9_FLOAT_BLEND_OPTIMIZATION_ENABLE |
> > > GEN9_PARTIAL_RESOLVE_DISABLE_IN_VC);
> > > ADVANCE_BATCH();
> > > +
> > > + if (brw->is_broxton) {
> > > + BEGIN_BATCH(3);
> > > + OUT_BATCH(MI_LOAD_REGISTER_IMM | (3 - 2));
> > > + OUT_BATCH(GEN7_GT_MODE);
> > > + OUT_BATCH(GEN9_SUBSLICE_HASHING_MASK_BITS |
> > > + GEN9_SUBSLICE_HASHING_16x16);
> > > + ADVANCE_BATCH();
> > > + }
> > > }
> > >
> > > if (brw->gen >= 8) {

--
Br,
Andres
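The dword that the quoted patch loads into GEN7_GT_MODE combines a mask half and a data half. The sketch below models that on the CPU, assuming that `REG_MASK(value)` shifts the value into the upper 16 bits and that the upper half acts as a per-bit write enable for the lower half. Both points are assumptions made for illustration, and `masked_reg_write` is a hypothetical helper, not a driver or hardware interface.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to mirror the i965 macro: move the value into the upper half. */
#define REG_MASK(value) ((uint32_t)(value) << 16)

#define SUBSLICE_HASHING_8x8       (0u << 8)
#define SUBSLICE_HASHING_16x4      (1u << 8)
#define SUBSLICE_HASHING_8x4       (2u << 8)
#define SUBSLICE_HASHING_16x16     (3u << 8)
#define SUBSLICE_HASHING_MASK_BITS REG_MASK(3u << 8)

/* Software model (an assumption, not hardware documentation) of a masked
 * register: bits 31:16 of the written dword select which of bits 15:0 are
 * updated; unselected bits keep their old value. */
static uint32_t masked_reg_write(uint32_t old_value, uint32_t written)
{
   uint32_t enable = written >> 16;
   uint32_t data = written & 0xffffu;

   assert((old_value >> 16) == 0);   /* only the low half holds state */
   return (old_value & ~enable & 0xffffu) | (data & enable);
}
```

Under this model, writing `SUBSLICE_HASHING_MASK_BITS | SUBSLICE_HASHING_16x16` changes only the two hashing-mode bits and leaves every other bit of the register as it was, which is why no read-modify-write is needed in the batch.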
Re: [Mesa-dev] What is the difference between ROCm and Clover?
On Sun, Sep 3, 2017 at 3:20 PM, David Niklas wrote:
> Hello,
> I'm interested in knowing why there are two different OpenCL
> implementations of OpenCL drivers for AMD cards.
> Is it because they support different ASICs?
> One is older and going to be replaced?

As near as I can tell, ROCm requires PCIe 3.0 Atomic operations for both the CPU and GPU. As such, ROCm will never support the VLIW radeons (5000-6000 series) or the earlier generations of GCN (SI, CI, possibly VI?). According to the ROCm page, it requires Fiji/Polaris/Vega. It also requires either a Haswell or Ryzen CPU (or newer), so anyone with an older CPU will be left out of ROCm support. External GPUs via Thunderbolt 1/2 enclosures are also unsupported.

There is a bit of overlap between the two projects, in that Clover/libclc supports Polaris cards on various CPUs, but it's safe to say that for the moment, if your system is supported by ROCm, it is the more complete/well-tested option. Clover/Libclc are both still works in progress that work for some use cases, but they are not complete, and have never passed a run of the CL 1.2 conformance test suite in any configuration, and it's definitely a use at your own risk type of thing for now.

--Aaron

>
>
> Thanks,
> David
[Mesa-dev] [PATCH] i965: expose sRGB visuals and EGL_KHR_gl_colorspace
Patch exposes sRGB visuals and adds DRI integer query support for __DRI2_RENDERER_HAS_FRAMEBUFFER_SRGB. Further changes make sure that we mark if the app explicitly wanted sRGB and for these framebuffers we don't turn sRGB off in intel_gles3_srgb_workaround. This way we keep compatibility for existing applications relying on default sRGB and only add more visual support. With this change, following dEQP tests start to pass: dEQP-EGL.functional.wide_color.window__colorspace_srgb dEQP-EGL.functional.wide_color.pbuffer__colorspace_srgb Signed-off-by: Tapani Pälli--- I did see following tests fail during CI run: ES31-CTS.functional.blend_equation_advanced.srgb.colorburn.hswm64 ES31-CTS.functional.blend_equation_advanced.basic.colordodge.hswm64 However they don't seem to fail on non-hsw, so I'm not sure if this is really a regression here? I saw that Ken has been burning some colors before so I've CC:d him. Note, this might have some effect on following bug: https://bugs.freedesktop.org/show_bug.cgi?id=102503 For me SuperTuxKart was working when testing (!) But it could be I was testing a wrong version. src/mesa/drivers/dri/i965/brw_context.c | 16 ++-- src/mesa/drivers/dri/i965/intel_fbo.h| 5 + src/mesa/drivers/dri/i965/intel_screen.c | 12 3 files changed, 27 insertions(+), 6 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c index 6441311d47..a8e39e3bb6 100644 --- a/src/mesa/drivers/dri/i965/brw_context.c +++ b/src/mesa/drivers/dri/i965/brw_context.c @@ -1112,8 +1112,8 @@ intelUnbindContext(__DRIcontext * driContextPriv) * * Unfortunately, renderbuffer setup happens before a context is created. So * in intel_screen.c we always set up sRGB, and here, if you're a GLES2/3 - * context (without an sRGB visual, though we don't have sRGB visuals exposed - * yet), we go turn that back off before anyone finds out. + * context (without an sRGB visual), we go turn that back off before anyone + * finds out. 
*/ static void intel_gles3_srgb_workaround(struct brw_context *brw, @@ -1124,15 +1124,19 @@ intel_gles3_srgb_workaround(struct brw_context *brw, if (_mesa_is_desktop_gl(ctx) || !fb->Visual.sRGBCapable) return; - /* Some day when we support the sRGB capable bit on visuals available for -* GLES, we'll need to respect that and not disable things here. -*/ - fb->Visual.sRGBCapable = false; for (int i = 0; i < BUFFER_COUNT; i++) { struct gl_renderbuffer *rb = fb->Attachment[i].Renderbuffer; + + /* Check if sRGB was specifically asked for. */ + struct intel_renderbuffer *irb = intel_get_renderbuffer(fb, i); + if (irb && irb->explicit_srgb) + return; + if (rb) rb->Format = _mesa_get_srgb_format_linear(rb->Format); } + /* Disable sRGB from framebuffers that are not compatible. */ + fb->Visual.sRGBCapable = false; } GLboolean diff --git a/src/mesa/drivers/dri/i965/intel_fbo.h b/src/mesa/drivers/dri/i965/intel_fbo.h index 1e2494286b..c8c2ed9a1b 100644 --- a/src/mesa/drivers/dri/i965/intel_fbo.h +++ b/src/mesa/drivers/dri/i965/intel_fbo.h @@ -116,6 +116,11 @@ struct intel_renderbuffer * for the duration of a mapping. */ bool singlesample_mt_is_tmp; + + /** +* Application specifically asked for a sRGB visual. 
+*/ + bool explicit_srgb; }; diff --git a/src/mesa/drivers/dri/i965/intel_screen.c b/src/mesa/drivers/dri/i965/intel_screen.c index d39509bcb8..79cc962ab1 100644 --- a/src/mesa/drivers/dri/i965/intel_screen.c +++ b/src/mesa/drivers/dri/i965/intel_screen.c @@ -1341,6 +1341,9 @@ brw_query_renderer_integer(__DRIscreen *dri_screen, case __DRI2_RENDERER_HAS_TEXTURE_3D: value[0] = 1; return 0; + case __DRI2_RENDERER_HAS_FRAMEBUFFER_SRGB: + value[0] = screen->mesa_format_supports_render[MESA_FORMAT_B8G8R8A8_SRGB]; + return 0; default: return driQueryRendererIntegerCommon(dri_screen, param, value); } @@ -1486,12 +1489,16 @@ intelCreateBuffer(__DRIscreen *dri_screen, fb->Visual.samples = num_samples; } + bool is_srgb = false; + if (mesaVis->redBits == 5) { rgbFormat = mesaVis->redMask == 0x1f ? MESA_FORMAT_R5G6B5_UNORM : MESA_FORMAT_B5G6R5_UNORM; } else if (mesaVis->sRGBCapable) { rgbFormat = mesaVis->redMask == 0xff ? MESA_FORMAT_R8G8B8A8_SRGB : MESA_FORMAT_B8G8R8A8_SRGB; + /* mesaVis->sRGBCapable was set, user is asking for sRGB */ + is_srgb = true; } else if (mesaVis->alphaBits == 0) { rgbFormat = mesaVis->redMask == 0xff ? MESA_FORMAT_R8G8B8X8_UNORM : MESA_FORMAT_B8G8R8X8_UNORM; @@ -1504,10 +1511,12 @@ intelCreateBuffer(__DRIscreen *dri_screen, /* setup the hardware-based renderbuffers */ rb =
Re: [Mesa-dev] [PATCH 12/23] intel: Add simple logging façade for Android
Hi,

On 02.09.2017 11:17, Chad Versace wrote:

I'm bringing up Vulkan in the Android container of Chrome OS (ARC++).

On Android, stdio goes to /dev/null. On Android, remote gdb is even more painful than the usual remote gdb. On Android, nothing works like you expect and debugging is hell. I need logging.

Would non-remote Gdb work better? I.e. use a chroot containing your normal Linux setup inside your Android, and use tools from that to debug Android stuff outside the chroot.

Everything that doesn't need to be inside the debugged process (as LD_PRELOAD tools do), such as Gdb, "perf" etc., should work fine as long as you mount /dev, /proc, /sys there and have (the non-stripped versions of) the Android binaries at the same path within the chroot as outside it.

(At least that worked fine for me a few years ago, when I needed to debug & profile Android stuff. If security is nowadays tightened, you may need to use your own more relaxed kernel config.)

- Eero

This patch introduces a small, simple logging API that can easily wrap Android's API. On non-Android platforms, this logger does nothing fancy. It follows the time-honored Unix tradition of spewing everything to stderr with minimal fuss.

My goal here is not perfection. My goal is to make a minimal, clean API, that people hate merely a little instead of a lot, and that's good enough to let me bring up Android Vulkan. And it needs to be fast, which means it must be small. No one wants their game to miss frames while aiming a flaming bow into the jaws of an angry robot t-rex, and thus become t-rex breakfast, because some fool had too much fun designing a bloated, ideal logging API.

If people like it, perhaps we should quickly promote it to src/util.
The API looks like this: #define INTEL_LOG_TAG "intel-vulkan" #define DEBUG intel_logd("try hard thing with foo=%d", foo); n = try_foo(...); if (n < 0) { intel_loge("%s:%d: foo failed bigtime", __FILE__, __LINE__); return VK_ERROR_DEVICE_LOST; } And produces this on non-Android: intel-vulkan: debug: try hard thing with foo=93 intel-vulkan: error: anv_device.c:182: foo failed bigtime --- src/intel/Makefile.sources | 4 +- src/intel/common/intel_log.c | 87 src/intel/common/intel_log.h | 82 + 3 files changed, 172 insertions(+), 1 deletion(-) create mode 100644 src/intel/common/intel_log.c create mode 100644 src/intel/common/intel_log.h diff --git a/src/intel/Makefile.sources b/src/intel/Makefile.sources index 4074ba9ee54..f6a69f65455 100644 --- a/src/intel/Makefile.sources +++ b/src/intel/Makefile.sources @@ -18,7 +18,9 @@ COMMON_FILES = \ common/gen_l3_config.c \ common/gen_l3_config.h \ common/gen_urb_config.c \ - common/gen_sample_positions.h + common/gen_sample_positions.h \ + common/intel_log.c \ + common/intel_log.h COMPILER_FILES = \ compiler/brw_cfg.cpp \ diff --git a/src/intel/common/intel_log.c b/src/intel/common/intel_log.c new file mode 100644 index 000..03d6dc72a8d --- /dev/null +++ b/src/intel/common/intel_log.c @@ -0,0 +1,87 @@ +/* + * Copyright 2017 Google + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. 
+ * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + */ + +#include + +#ifdef ANDROID +#include +#else +#include +#endif + +#include "intel_log.h" + +#ifdef ANDROID +static inline android_LogPriority +level_to_android(enum intel_log_level l) +{ + switch (l) { + case INTEL_LOG_ERROR: return ANDROID_LOG_ERROR; + case INTEL_LOG_WARN: return ANDROID_LOG_WARN; + case INTEL_LOG_INFO: return ANDROID_LOG_INFO; + case INTEL_LOG_DEBUG: return ANDROID_LOG_DEBUG; + } + + unreachable("bad intel_log_level"); +} +#endif + +#ifndef ANDROID +static inline const char * +level_to_str(enum intel_log_level l) +{ + switch
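The non-Android behaviour described in the commit message (one "tag: level: message" line per call, written to stderr) can be sketched roughly as follows. This is a hedged illustration, not the code from the patch: every `toy_` name is hypothetical, and only the "debug" and "error" level strings appear in the example output quoted in the message; "warning" and "info" are guesses.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

enum toy_log_level { TOY_LOG_ERROR, TOY_LOG_WARN, TOY_LOG_INFO, TOY_LOG_DEBUG };

static const char *toy_level_to_str(enum toy_log_level l)
{
   switch (l) {
   case TOY_LOG_ERROR: return "error";
   case TOY_LOG_WARN:  return "warning";   /* guessed spelling */
   case TOY_LOG_INFO:  return "info";      /* guessed spelling */
   case TOY_LOG_DEBUG: return "debug";
   }
   return "unknown";
}

/* Format "tag: level: message" into buf, truncating if it does not fit. */
static int toy_log_vformat(char *buf, size_t size, enum toy_log_level level,
                           const char *tag, const char *fmt, va_list va)
{
   assert(buf != NULL && size > 0);

   int n = snprintf(buf, size, "%s: %s: ", tag, toy_level_to_str(level));
   if (n < 0 || (size_t)n >= size)
      return n;

   int m = vsnprintf(buf + n, size - (size_t)n, fmt, va);
   return m < 0 ? m : n + m;
}

/* Variadic wrapper, convenient for callers that want the formatted string. */
static int toy_log_format(char *buf, size_t size, enum toy_log_level level,
                          const char *tag, const char *fmt, ...)
{
   va_list va;
   va_start(va, fmt);
   int n = toy_log_vformat(buf, size, level, tag, fmt, va);
   va_end(va);
   return n;
}

/* The "spew everything to stderr" path for non-Android builds. */
static void toy_log(enum toy_log_level level, const char *tag,
                    const char *fmt, ...)
{
   char line[512];
   va_list va;
   va_start(va, fmt);
   toy_log_vformat(line, sizeof(line), level, tag, fmt, va);
   va_end(va);
   fprintf(stderr, "%s\n", line);
}
```

A call such as `toy_log(TOY_LOG_DEBUG, "intel-vulkan", "try hard thing with foo=%d", foo)` is meant to correspond to the `intel_logd(...)` usage shown in the message; an Android build would hand the same arguments to the platform logger instead of stderr.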
[Mesa-dev] [PATCH 2/2] loader/dri3: Invalidate the drawable after copySubBuffer
Anyone using copySubBuffer as a replacement for swapBuffers would probably want window resizing to update the viewport.

Signed-off-by: Thomas Hellstrom
---
 src/loader/loader_dri3_helper.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/loader/loader_dri3_helper.c b/src/loader/loader_dri3_helper.c
index c0a6e0d..9549b18 100644
--- a/src/loader/loader_dri3_helper.c
+++ b/src/loader/loader_dri3_helper.c
@@ -664,6 +664,8 @@ loader_dri3_copy_sub_buffer(struct loader_dri3_drawable *draw,
       dri3_fence_trigger(draw->conn, dri3_fake_front_buffer(draw));
       dri3_fence_await(draw->conn, dri3_fake_front_buffer(draw));
    }
+
+   draw->ext->flush->invalidate(draw->dri_drawable);
    dri3_fence_await(draw->conn, back);
 }
--
2.7.4
[Mesa-dev] [PATCH 1/2] loader/dri3: Use client local back to front blit in copySubBuffer if available
The copySubBuffer functionality always attempted a server side blit from back to fake front if a fake front was present, and we weren't displaying on a remote GPU. Now that we always have local blit capability on modern drivers, first attempt a local blit, and only if that fails, try the server blit.

Signed-off-by: Thomas Hellstrom
---
 src/loader/loader_dri3_helper.c | 16 +++++++---------
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/src/loader/loader_dri3_helper.c b/src/loader/loader_dri3_helper.c
index e3120f5..c0a6e0d 100644
--- a/src/loader/loader_dri3_helper.c
+++ b/src/loader/loader_dri3_helper.c
@@ -635,14 +635,6 @@ loader_dri3_copy_sub_buffer(struct loader_dri3_drawable *draw,
                                     back->image,
                                     0, 0, back->width, back->height,
                                     0, 0, __BLIT_FLAG_FLUSH);
-      /* We use blit_image to update our fake front,
-       */
-      if (draw->have_fake_front)
-         (void) loader_dri3_blit_image(draw,
-                                       dri3_fake_front_buffer(draw)->image,
-                                       back->image,
-                                       x, y, width, height,
-                                       x, y, __BLIT_FLAG_FLUSH);
    }

    loader_dri3_swapbuffer_barrier(draw);
@@ -656,7 +648,13 @@ loader_dri3_copy_sub_buffer(struct loader_dri3_drawable *draw,
    /* Refresh the fake front (if present) after we just damaged the real
     * front.
     */
-   if (draw->have_fake_front && !draw->is_different_gpu) {
+   if (draw->have_fake_front &&
+       !loader_dri3_blit_image(draw,
+                               dri3_fake_front_buffer(draw)->image,
+                               back->image,
+                               x, y, width, height,
+                               x, y, __BLIT_FLAG_FLUSH) &&
+       !draw->is_different_gpu) {
       dri3_fence_reset(draw->conn, dri3_fake_front_buffer(draw));
       dri3_copy_area(draw->conn,
                      back->pixmap,
--
2.7.4
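The control flow of the new condition (try the local blit first, fall back to the server copy only when that fails and the drawable is on the same GPU) can be sketched on its own as below. All names are hypothetical stand-ins: the callback plays the role of the local blit call, and the returned enum only reports which path the condition would take.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_refresh {
   TOY_REFRESH_NONE,     /* fake front left untouched */
   TOY_REFRESH_LOCAL,    /* client-side blit succeeded */
   TOY_REFRESH_SERVER    /* fall back to a server-side copy */
};

/* Hypothetical stand-in for the local blit; returns true on success. */
typedef bool (*toy_blit_fn)(void *user);

static bool toy_blit_ok(void *user)   { (void)user; return true; }
static bool toy_blit_fail(void *user) { (void)user; return false; }

/* Mirrors the short-circuit order of the patched condition:
 * have_fake_front && !local_blit(...) && !is_different_gpu */
static enum toy_refresh toy_refresh_fake_front(bool have_fake_front,
                                               bool is_different_gpu,
                                               toy_blit_fn local_blit,
                                               void *user)
{
   assert(local_blit != NULL);

   if (!have_fake_front)
      return TOY_REFRESH_NONE;      /* nothing to refresh, blit not attempted */
   if (local_blit(user))
      return TOY_REFRESH_LOCAL;     /* local blit worked, no server copy */
   if (is_different_gpu)
      return TOY_REFRESH_NONE;      /* server copy is skipped in this case */
   return TOY_REFRESH_SERVER;
}
```

In this model, a failed local blit on a different GPU leaves the fake front stale, because the `!is_different_gpu` term still guards the server copy.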
Re: [Mesa-dev] Question about implementing viewport transfer and const load in nir
On 30.08.2017 16:44, Rob Clark wrote:
On Wed, Aug 30, 2017 at 10:18 AM, Qiang Yu wrote:
On Wed, Aug 30, 2017 at 9:03 PM, Rob Clark wrote:
On Wed, Aug 30, 2017 at 3:26 AM, Qiang Yu wrote:

btw, does lima have some way to write to memory from cmdstream (ie. without setting up a full blown draw)? If so perhaps you could get away with leaving some extra space at the end of your uniform buffer that you copy driver internal uniforms into before kicking the draw?

Unfortunately lima can't do this. Seems you guys all know how to "reserve space in uniform buffer", how?

fwiw, freedreno does have driver specific uniforms, see ir3_driver_param. Although normal uniforms (ie. non-UBO) get copied into internal memory, so I just upload driver specific uniforms (as needed) and immediates to the tail of the uniform memory before the draw.

So you mean freedreno will do an extra copy from the constant buffer set by set_constant_buffer to a driver allocated memory then append the driver spec uniform? If so, that can explain the trick above.

Kind of.. it is really a copy into internal uniform memory that the shaders access. Although if the uniforms were a buffer in system memory and you had a way to memcpy from cmdstream synchronized with draws, then that could work too.

(I do similar w/ some immediates too, since in some cases it avoids an extra move from immed instruction in the shader.)

lima store uniform in system memory and can't write memory in cmd stream, so I have to do a memcpy before draw if not use the constant buffer of set_constant_buffer.

that is a bit unfortunate.. and also means you have to create multiple versions of uniform buffer if internal driver params change between draws but user uniforms did not :-(

Does Lima support uniform buffers? You may be able to put the internal driver params into a separate buffer.

Cheers,
Nicolai

BR,
-R

--
Learn what the world is really like,
but never forget what it ought to be.
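The "memcpy before draw" approach discussed in this thread (user uniforms first, driver-internal parameters appended at the tail) might look roughly like the sketch below. It is an illustration under stated assumptions: uniforms are plain floats, the staging buffer is ordinary CPU memory, and `toy_pack_uniforms` is a hypothetical helper, not a function from lima or freedreno.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy the user's uniforms, then the driver-internal parameters, into one
 * staging buffer that the GPU would read for the next draw.
 * Returns the number of floats written, or 0 if dst is too small. */
static size_t toy_pack_uniforms(float *dst, size_t dst_count,
                                const float *user, size_t user_count,
                                const float *driver, size_t driver_count)
{
   assert(dst != NULL && user != NULL && driver != NULL);

   if (user_count > dst_count || driver_count > dst_count - user_count)
      return 0;

   memcpy(dst, user, user_count * sizeof(*dst));
   memcpy(dst + user_count, driver, driver_count * sizeof(*dst));
   return user_count + driver_count;
}
```

Because the packed result depends on both inputs, a change in the driver parameters alone still forces a fresh copy of the whole buffer, which is the cost pointed out above; keeping the driver parameters in a separate buffer, as suggested in the reply, would avoid re-copying unchanged user uniforms.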
Re: [Mesa-dev] [PATCH 2/2] radeonsi: eliminate PS color outputs when colormask kills them
Both patches:

Reviewed-by: Nicolai Hähnle

On 30.08.2017 00:26, Marek Olšák wrote:
From: Marek Olšák

---
 src/gallium/drivers/radeonsi/si_state.c         | 4 ++++
 src/gallium/drivers/radeonsi/si_state.h         | 1 +
 src/gallium/drivers/radeonsi/si_state_shaders.c | 1 +
 3 files changed, 6 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c
index 41b08f8..5ee8bb9 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -434,20 +434,22 @@ static void *si_create_blend_state_mode(struct pipe_context *ctx,
                   S_028B70_ALPHA_TO_MASK_ENABLE(state->alpha_to_coverage) |
                   S_028B70_ALPHA_TO_MASK_OFFSET0(2) |
                   S_028B70_ALPHA_TO_MASK_OFFSET1(2) |
                   S_028B70_ALPHA_TO_MASK_OFFSET2(2) |
                   S_028B70_ALPHA_TO_MASK_OFFSET3(2));

    if (state->alpha_to_coverage)
       blend->need_src_alpha_4bit |= 0xf;

    blend->cb_target_mask = 0;
+   blend->cb_target_enabled_4bit = 0;
+
    for (int i = 0; i < 8; i++) {
       /* state->rt entries > 0 only written if independent blending */
       const int j = state->independent_blend_enable ?
i : 0; unsigned eqRGB = state->rt[j].rgb_func; unsigned srcRGB = state->rt[j].rgb_src_factor; unsigned dstRGB = state->rt[j].rgb_dst_factor; unsigned eqA = state->rt[j].alpha_func; unsigned srcA = state->rt[j].alpha_src_factor; unsigned dstA = state->rt[j].alpha_dst_factor; @@ -475,20 +477,22 @@ static void *si_create_blend_state_mode(struct pipe_context *ctx, if (blend->dual_src_blend && (eqRGB == PIPE_BLEND_MIN || eqRGB == PIPE_BLEND_MAX || eqA == PIPE_BLEND_MIN || eqA == PIPE_BLEND_MAX)) { assert(!"Unsupported equation for dual source blending"); si_pm4_set_reg(pm4, R_028780_CB_BLEND0_CONTROL + i * 4, blend_cntl); continue; } /* cb_render_state will disable unused ones */ blend->cb_target_mask |= (unsigned)state->rt[j].colormask << (4 * i); + if (state->rt[j].colormask) + blend->cb_target_enabled_4bit |= 0xf << (4 * i); if (!state->rt[j].colormask || !state->rt[j].blend_enable) { si_pm4_set_reg(pm4, R_028780_CB_BLEND0_CONTROL + i * 4, blend_cntl); continue; } /* Blending optimizations for RB+. * These transformations don't change the behavior. * * First, get rid of DST in the blend factors: diff --git a/src/gallium/drivers/radeonsi/si_state.h b/src/gallium/drivers/radeonsi/si_state.h index 26c7b4c..7b7d96c 100644 --- a/src/gallium/drivers/radeonsi/si_state.h +++ b/src/gallium/drivers/radeonsi/si_state.h @@ -48,20 +48,21 @@ struct si_shader_selector; struct si_state_blend { struct si_pm4_state pm4; uint32_tcb_target_mask; boolalpha_to_coverage; boolalpha_to_one; booldual_src_blend; /* Set 0xf or 0x0 (4 bits) per render target if the following is * true. ANDed with spi_shader_col_format. 
*/ + unsignedcb_target_enabled_4bit; unsignedblend_enable_4bit; unsignedneed_src_alpha_4bit; }; struct si_state_rasterizer { struct si_pm4_state pm4; /* poly offset states for 16-bit, 24-bit, and 32-bit zbuffers */ struct si_pm4_state *pm4_poly_offset; unsignedpa_sc_line_stipple; unsignedpa_cl_clip_cntl; diff --git a/src/gallium/drivers/radeonsi/si_state_shaders.c b/src/gallium/drivers/radeonsi/si_state_shaders.c index 71d7987..061b3d2 100644 --- a/src/gallium/drivers/radeonsi/si_state_shaders.c +++ b/src/gallium/drivers/radeonsi/si_state_shaders.c @@ -1348,20 +1348,21 @@ static inline void si_shader_selector_key(struct pipe_context *ctx, */ key->part.ps.epilog.spi_shader_col_format = (blend->blend_enable_4bit & blend->need_src_alpha_4bit & sctx->framebuffer.spi_shader_col_format_blend_alpha) | (blend->blend_enable_4bit & ~blend->need_src_alpha_4bit & sctx->framebuffer.spi_shader_col_format_blend) | (~blend->blend_enable_4bit & blend->need_src_alpha_4bit & sctx->framebuffer.spi_shader_col_format_alpha) |
Re: [Mesa-dev] What is the difference between ROCm and Clover?
Hello David,

the following Reddit link might be interesting for you:
https://www.reddit.com/r/Amd/comments/5jqk54/what_is_the_status_of_opencl_in_amdgpumesa/

As far as I understand it, it basically says that ROCm is going to be the future regarding OpenCL.

On 3 September 2017 at 22:20, David Niklas wrote:
> Hello,
> I'm interested in knowing why there are two different OpenCL
> implementations of OpenCL drivers for AMD cards.
> Is it because they support different ASICs?
> One is older and going to be replaced?
>
> Thanks,
> David
Re: [Mesa-dev] [PATCH] vbo: fix build errors on android
On 4 September 2017 at 06:12, Tapani Pälli wrote:
> incompatible pointer to integer conversion assigning to 'GLintptr' (aka 'int')
> from 'const char *' [-Werror,-Wint-conversion]
>
>    offset = indices;
>           ^ ~~~
>
> Signed-off-by: Tapani Pälli
> ---
>  src/mesa/vbo/vbo_minmax_index.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/mesa/vbo/vbo_minmax_index.c b/src/mesa/vbo/vbo_minmax_index.c
> index 58a2af49ac..1377926bba 100644
> --- a/src/mesa/vbo/vbo_minmax_index.c
> +++ b/src/mesa/vbo/vbo_minmax_index.c
> @@ -255,7 +255,7 @@ vbo_get_minmax_index(struct gl_context *ctx,
>                                    count, min_index, max_index))
>        return;
>
> -   offset = indices;
> +   offset = (GLintptr) indices;

The one-line summary seems a bit too generic; perhaps rework it? Either way,

Fixes: 2d93b462b4d ("vbo: fix offset in minmax cache key")
Reviewed-by: Emil Velikov

-Emil
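The build error quoted above comes from assigning a pointer to an integer type without a cast. A minimal sketch of the pattern follows; it uses the standard intptr_t as a stand-in for GLintptr, and the helper name is made up for illustration, not taken from Mesa.

```c
#include <stdint.h>

/* Hypothetical helper: interpret an "indices" pointer as a byte offset,
 * the way GL does when an element array buffer is bound.
 * intptr_t stands in for GLintptr here. */
static intptr_t
offset_from_indices(const void *indices)
{
   /* Without the explicit cast, clang reports -Wint-conversion,
    * which becomes a hard error under -Werror. */
   return (intptr_t) indices;
}
```

The cast does not change the generated code on common platforms; it only tells the compiler that the conversion is intentional.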
[Mesa-dev] [PATCH] radeonsi/gfx9: always flush DB metadata on framebuffer changes
From: Nicolai Hähnle

This fixes GL45-CTS.shader_image_load_store.basic-glsl-earlyFragTests.

Cc: mesa-sta...@lists.freedesktop.org
--
FWIW, Vulkan also always flushes DB metadata on gfx9. Wait-for-idle is not required here.
---
 src/gallium/drivers/radeonsi/si_pipe.h       |  4 ++--
 src/gallium/drivers/radeonsi/si_state.c      | 11 ++-
 src/gallium/drivers/radeonsi/si_state_draw.c |  3 ++-
 3 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_pipe.h b/src/gallium/drivers/radeonsi/si_pipe.h
index 386a6dc886d..b82ec7ef9f8 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.h
+++ b/src/gallium/drivers/radeonsi/si_pipe.h
@@ -54,23 +54,23 @@
 /* VMEM L1 can optionally be bypassed (GLC=1). Other names: TC L1 */
 #define SI_CONTEXT_INV_VMEM_L1		(R600_CONTEXT_PRIVATE_FLAG << 2)
 /* Used by everything except CB/DB, can be bypassed (SLC=1). Other names: TC L2 */
 #define SI_CONTEXT_INV_GLOBAL_L2	(R600_CONTEXT_PRIVATE_FLAG << 3)
 /* Write dirty L2 lines back to memory (shader and CP DMA stores), but don't
  * invalidate L2. SI-CIK can't do it, so they will do complete invalidation. */
 #define SI_CONTEXT_WRITEBACK_GLOBAL_L2	(R600_CONTEXT_PRIVATE_FLAG << 4)
 /* Writeback & invalidate the L2 metadata cache. It can only be coupled with
  * a CB or DB flush. */
 #define SI_CONTEXT_INV_L2_METADATA	(R600_CONTEXT_PRIVATE_FLAG << 5)
-/* gap */
 /* Framebuffer caches. */
-#define SI_CONTEXT_FLUSH_AND_INV_DB	(R600_CONTEXT_PRIVATE_FLAG << 7)
+#define SI_CONTEXT_FLUSH_AND_INV_DB	(R600_CONTEXT_PRIVATE_FLAG << 6)
+#define SI_CONTEXT_FLUSH_AND_INV_DB_META (R600_CONTEXT_PRIVATE_FLAG << 7)
 #define SI_CONTEXT_FLUSH_AND_INV_CB	(R600_CONTEXT_PRIVATE_FLAG << 8)
 /* Engine synchronization. */
 #define SI_CONTEXT_VS_PARTIAL_FLUSH	(R600_CONTEXT_PRIVATE_FLAG << 9)
 #define SI_CONTEXT_PS_PARTIAL_FLUSH	(R600_CONTEXT_PRIVATE_FLAG << 10)
 #define SI_CONTEXT_CS_PARTIAL_FLUSH	(R600_CONTEXT_PRIVATE_FLAG << 11)
 #define SI_CONTEXT_VGT_FLUSH		(R600_CONTEXT_PRIVATE_FLAG << 12)
 #define SI_CONTEXT_VGT_STREAMOUT_SYNC	(R600_CONTEXT_PRIVATE_FLAG << 13)
 
 #define SI_PREFETCH_VBO_DESCRIPTORS	(1 << 0)
 #define SI_PREFETCH_LS			(1 << 1)
diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c
index 41b08f8de4f..365c1248b2f 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -2569,23 +2569,32 @@ static void si_set_framebuffer_state(struct pipe_context *ctx,
 					   sctx->framebuffer.CB_has_shader_readable_metadata);
 
 	sctx->b.flags |= SI_CONTEXT_CS_PARTIAL_FLUSH;
 
 	/* u_blitter doesn't invoke depth decompression when it does multiple
 	 * blits in a row, but the only case when it matters for DB is when
 	 * doing generate_mipmap. So here we flush DB manually between
 	 * individual generate_mipmap blits.
 	 * Note that lower mipmap levels aren't compressed.
 	 */
-	if (sctx->generate_mipmap_for_depth)
+	if (sctx->generate_mipmap_for_depth) {
 		si_make_DB_shader_coherent(sctx, 1, false,
 					   sctx->framebuffer.DB_has_shader_readable_metadata);
+	} else if (sctx->b.chip_class == GFX9) {
+		/* It appears that DB metadata "leaks" in a sequence of:
+		 *  - depth clear
+		 *  - DCC decompress for shader image writes (with DB disabled)
+		 *  - render with DEPTH_BEFORE_SHADER=1
+		 * Flushing DB metadata works around the problem.
+		 */
+		sctx->b.flags |= SI_CONTEXT_FLUSH_AND_INV_DB_META;
+	}
 
 	/* Take the maximum of the old and new count. If the new count is lower,
 	 * dirtying is needed to disable the unbound colorbuffers.
 	 */
 	sctx->framebuffer.dirty_cbufs |=
 		(1 << MAX2(sctx->framebuffer.state.nr_cbufs, state->nr_cbufs)) - 1;
 	sctx->framebuffer.dirty_zsbuf |= sctx->framebuffer.state.zsbuf != state->zsbuf;
 
 	si_dec_framebuffer_counters(&sctx->framebuffer.state);
 	util_copy_framebuffer_state(&sctx->framebuffer.state, state);
diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c b/src/gallium/drivers/radeonsi/si_state_draw.c
index 81751d2186e..7ee6cf88e88 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -905,21 +905,22 @@ void si_emit_cache_flush(struct si_context *sctx)
 		if (rctx->flags & SI_CONTEXT_FLUSH_AND_INV_DB)
 			cp_coher_cntl |= S_0085F0_DB_ACTION_ENA(1) |
 					 S_0085F0_DB_DEST_BASE_ENA(1);
 	}
 
 	if (rctx->flags & SI_CONTEXT_FLUSH_AND_INV_CB) {
 		/* Flush CMASK/FMASK/DCC. SURFACE_SYNC will wait for idle. */
 		radeon_emit(cs, PKT3(PKT3_EVENT_WRITE,
[Mesa-dev] [PATCH 4/4] ac/debug: take ASIC generation into account when printing registers
From: Nicolai HähnleThere were some overlapping changes in gfx9 especially in the CB/DB blocks which made register dumps rather misleading. The split is along the lines of the header files, so we'll print VI-only fields on SI and CI, for example, but we won't print GFX9 fields on SI/CI/VI, and we won't print SI/CI/VI fields on GFX9. --- src/amd/common/ac_debug.c| 83 ++ src/amd/common/sid_tables.py | 201 +++ 2 files changed, 177 insertions(+), 107 deletions(-) diff --git a/src/amd/common/ac_debug.c b/src/amd/common/ac_debug.c index 570ba850851..54685356f1d 100644 --- a/src/amd/common/ac_debug.c +++ b/src/amd/common/ac_debug.c @@ -94,68 +94,83 @@ static void print_value(FILE *file, uint32_t value, int bits) } static void print_named_value(FILE *file, const char *name, uint32_t value, int bits) { print_spaces(file, INDENT_PKT); fprintf(file, COLOR_YELLOW "%s" COLOR_RESET " <- ", name); print_value(file, value, bits); } +static const struct si_reg *find_register(const struct si_reg *table, + unsigned table_size, + unsigned offset) +{ + for (unsigned i = 0; i < table_size; i++) { + const struct si_reg *reg = [i]; + + if (reg->offset == offset) + return reg; + } + + return NULL; +} + void ac_dump_reg(FILE *file, enum chip_class chip_class, unsigned offset, uint32_t value, uint32_t field_mask) { - int r, f; + const struct si_reg *reg = NULL; - for (r = 0; r < ARRAY_SIZE(sid_reg_table); r++) { - const struct si_reg *reg = _reg_table[r]; - const char *reg_name = sid_strings + reg->name_offset; + if (chip_class >= GFX9) + reg = find_register(gfx9d_reg_table, ARRAY_SIZE(gfx9d_reg_table), offset); + if (!reg) + reg = find_register(sid_reg_table, ARRAY_SIZE(sid_reg_table), offset); - if (reg->offset == offset) { - bool first_field = true; + if (reg) { + const char *reg_name = sid_strings + reg->name_offset; + bool first_field = true; - print_spaces(file, INDENT_PKT); - fprintf(file, COLOR_YELLOW "%s" COLOR_RESET " <- ", - reg_name); + print_spaces(file, INDENT_PKT); + 
fprintf(file, COLOR_YELLOW "%s" COLOR_RESET " <- ", + reg_name); - if (!reg->num_fields) { - print_value(file, value, 32); - return; - } + if (!reg->num_fields) { + print_value(file, value, 32); + return; + } - for (f = 0; f < reg->num_fields; f++) { - const struct si_field *field = sid_fields_table + reg->fields_offset + f; - const int *values_offsets = sid_strings_offsets + field->values_offset; - uint32_t val = (value & field->mask) >> - (ffs(field->mask) - 1); + for (unsigned f = 0; f < reg->num_fields; f++) { + const struct si_field *field = sid_fields_table + reg->fields_offset + f; + const int *values_offsets = sid_strings_offsets + field->values_offset; + uint32_t val = (value & field->mask) >> + (ffs(field->mask) - 1); - if (!(field->mask & field_mask)) - continue; + if (!(field->mask & field_mask)) + continue; - /* Indent the field. */ - if (!first_field) - print_spaces(file, -INDENT_PKT + strlen(reg_name) + 4); + /* Indent the field. */ + if (!first_field) + print_spaces(file, +INDENT_PKT + strlen(reg_name) + 4); - /* Print the field. */ - fprintf(file, "%s = ", sid_strings + field->name_offset); + /* Print the field. */ + fprintf(file, "%s = ", sid_strings + field->name_offset); - if (val < field->num_values && values_offsets[val] >= 0) - fprintf(file, "%s\n", sid_strings + values_offsets[val]); - else
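The lookup order introduced by the patch above (try the GFX9 table first on GFX9 and later, then fall back to the common table) can be sketched in isolation. Everything below is a simplified assumption for illustration: struct reg_desc, enum chip_gen, the two tiny tables and the register names are invented and do not match the real si_reg layout or sid tables.

```c
#include <stddef.h>

/* Hypothetical, cut-down register description. */
struct reg_desc {
   unsigned offset;
   const char *name;
};

/* Stand-in for enum chip_class; only the ordering matters here. */
enum chip_gen { GEN_OLD, GEN_GFX9 };

static const struct reg_desc old_table[] = {
   { 0x100, "OLD_ONLY" },
   { 0x200, "SHARED_OLD_LAYOUT" },
};

static const struct reg_desc gfx9_table[] = {
   { 0x200, "SHARED_GFX9_LAYOUT" },
};

/* Linear search for a register by offset; NULL if it is not in the table. */
static const struct reg_desc *
find_reg(const struct reg_desc *table, size_t n, unsigned offset)
{
   for (size_t i = 0; i < n; i++) {
      if (table[i].offset == offset)
         return &table[i];
   }
   return NULL;
}

/* Prefer the generation-specific table, then fall back to the common one. */
static const struct reg_desc *
lookup_reg(enum chip_gen gen, unsigned offset)
{
   const struct reg_desc *reg = NULL;

   if (gen >= GEN_GFX9)
      reg = find_reg(gfx9_table, sizeof(gfx9_table) / sizeof(gfx9_table[0]), offset);
   if (!reg)
      reg = find_reg(old_table, sizeof(old_table) / sizeof(old_table[0]), offset);
   return reg;
}
```

With this ordering, an offset that exists in both tables resolves to the GFX9 description on GFX9, while registers that only exist in the common table are still found.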
[Mesa-dev] [PATCH 1/4] ac/sid_tables: remove unused variable varname_values
From: Nicolai Hähnle

---
 src/amd/common/sid_tables.py | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/amd/common/sid_tables.py b/src/amd/common/sid_tables.py
index 0a2b7ef1fe4..01970caa7be 100644
--- a/src/amd/common/sid_tables.py
+++ b/src/amd/common/sid_tables.py
@@ -124,21 +124,20 @@ class IntTable:
             'static ' if static else '',
             self.typename, name,
             '\n'.join(fragments)
         ))
 
 class Field:
     def __init__(self, reg, s_name):
         self.s_name = s_name
         self.name = strip_prefix(s_name)
         self.values = []
-        self.varname_values = '%s__%s__values' % (reg.r_name.lower(), self.name.lower())
 
 class Reg:
     def __init__(self, r_name):
         self.r_name = r_name
         self.name = strip_prefix(r_name)
         self.fields = []
         self.own_fields = True
 
 def strip_prefix(s):
-- 
2.11.0
[Mesa-dev] [PATCH 3/4] amd/common: pass chip_class to ac_dump_reg
From: Nicolai Hähnle--- src/amd/common/ac_debug.c | 86 - src/amd/common/ac_debug.h | 4 +- src/gallium/drivers/radeonsi/si_debug.c | 45 +++-- 3 files changed, 75 insertions(+), 60 deletions(-) diff --git a/src/amd/common/ac_debug.c b/src/amd/common/ac_debug.c index 0de00e27e75..570ba850851 100644 --- a/src/amd/common/ac_debug.c +++ b/src/amd/common/ac_debug.c @@ -94,22 +94,22 @@ static void print_value(FILE *file, uint32_t value, int bits) } static void print_named_value(FILE *file, const char *name, uint32_t value, int bits) { print_spaces(file, INDENT_PKT); fprintf(file, COLOR_YELLOW "%s" COLOR_RESET " <- ", name); print_value(file, value, bits); } -void ac_dump_reg(FILE *file, unsigned offset, uint32_t value, -uint32_t field_mask) +void ac_dump_reg(FILE *file, enum chip_class chip_class, unsigned offset, +uint32_t value, uint32_t field_mask) { int r, f; for (r = 0; r < ARRAY_SIZE(sid_reg_table); r++) { const struct si_reg *reg = _reg_table[r]; const char *reg_name = sid_strings + reg->name_offset; if (reg->offset == offset) { bool first_field = true; @@ -189,21 +189,21 @@ static void ac_parse_set_reg_packet(FILE *f, unsigned count, unsigned reg_offset unsigned reg = ((reg_dw & 0x) << 2) + reg_offset; unsigned index = reg_dw >> 28; int i; if (index != 0) { print_spaces(f, INDENT_PKT); fprintf(f, "INDEX = %u\n", index); } for (i = 0; i < count; i++) - ac_dump_reg(f, reg + i*4, ac_ib_get(ib), ~0); + ac_dump_reg(f, ib->chip_class, reg + i*4, ac_ib_get(ib), ~0); } static void ac_parse_packet3(FILE *f, uint32_t header, struct ac_ib_parser *ib, int *current_trace_id) { unsigned first_dw = ib->cur_dw; int count = PKT_COUNT_G(header); unsigned op = PKT3_IT_OPCODE_G(header); const char *predicate = PKT3_PREDICATE(header) ? 
"(predicate)" : ""; int i; @@ -237,74 +237,74 @@ static void ac_parse_packet3(FILE *f, uint32_t header, struct ac_ib_parser *ib, case PKT3_SET_CONFIG_REG: ac_parse_set_reg_packet(f, count, SI_CONFIG_REG_OFFSET, ib); break; case PKT3_SET_UCONFIG_REG: ac_parse_set_reg_packet(f, count, CIK_UCONFIG_REG_OFFSET, ib); break; case PKT3_SET_SH_REG: ac_parse_set_reg_packet(f, count, SI_SH_REG_OFFSET, ib); break; case PKT3_ACQUIRE_MEM: - ac_dump_reg(f, R_0301F0_CP_COHER_CNTL, ac_ib_get(ib), ~0); - ac_dump_reg(f, R_0301F4_CP_COHER_SIZE, ac_ib_get(ib), ~0); - ac_dump_reg(f, R_030230_CP_COHER_SIZE_HI, ac_ib_get(ib), ~0); - ac_dump_reg(f, R_0301F8_CP_COHER_BASE, ac_ib_get(ib), ~0); - ac_dump_reg(f, R_0301E4_CP_COHER_BASE_HI, ac_ib_get(ib), ~0); + ac_dump_reg(f, ib->chip_class, R_0301F0_CP_COHER_CNTL, ac_ib_get(ib), ~0); + ac_dump_reg(f, ib->chip_class, R_0301F4_CP_COHER_SIZE, ac_ib_get(ib), ~0); + ac_dump_reg(f, ib->chip_class, R_030230_CP_COHER_SIZE_HI, ac_ib_get(ib), ~0); + ac_dump_reg(f, ib->chip_class, R_0301F8_CP_COHER_BASE, ac_ib_get(ib), ~0); + ac_dump_reg(f, ib->chip_class, R_0301E4_CP_COHER_BASE_HI, ac_ib_get(ib), ~0); print_named_value(f, "POLL_INTERVAL", ac_ib_get(ib), 16); break; case PKT3_SURFACE_SYNC: if (ib->chip_class >= CIK) { - ac_dump_reg(f, R_0301F0_CP_COHER_CNTL, ac_ib_get(ib), ~0); - ac_dump_reg(f, R_0301F4_CP_COHER_SIZE, ac_ib_get(ib), ~0); - ac_dump_reg(f, R_0301F8_CP_COHER_BASE, ac_ib_get(ib), ~0); + ac_dump_reg(f, ib->chip_class, R_0301F0_CP_COHER_CNTL, ac_ib_get(ib), ~0); + ac_dump_reg(f, ib->chip_class, R_0301F4_CP_COHER_SIZE, ac_ib_get(ib), ~0); + ac_dump_reg(f, ib->chip_class, R_0301F8_CP_COHER_BASE, ac_ib_get(ib), ~0); } else { - ac_dump_reg(f, R_0085F0_CP_COHER_CNTL, ac_ib_get(ib), ~0); - ac_dump_reg(f, R_0085F4_CP_COHER_SIZE, ac_ib_get(ib), ~0); - ac_dump_reg(f, R_0085F8_CP_COHER_BASE, ac_ib_get(ib), ~0); + ac_dump_reg(f, ib->chip_class, R_0085F0_CP_COHER_CNTL, ac_ib_get(ib), ~0); + ac_dump_reg(f, ib->chip_class, R_0085F4_CP_COHER_SIZE, 
ac_ib_get(ib), ~0); + ac_dump_reg(f, ib->chip_class, R_0085F8_CP_COHER_BASE, ac_ib_get(ib), ~0); } print_named_value(f,
[Mesa-dev] [PATCH 2/4] ac/sid_tables: add FieldTable object
From: Nicolai HähnleAutomatically re-use table entries like StringTable and IntTable do. This allows us to get rid of the "fields_owner" logic, and simplifies the next change. --- src/amd/common/sid_tables.py | 115 --- 1 file changed, 85 insertions(+), 30 deletions(-) diff --git a/src/amd/common/sid_tables.py b/src/amd/common/sid_tables.py index 01970caa7be..808a96f834f 100644 --- a/src/amd/common/sid_tables.py +++ b/src/amd/common/sid_tables.py @@ -18,20 +18,22 @@ CopyRight = ''' * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL * THE AUTHOR(S) AND/OR THEIR SUPPLIERS BE LIABLE FOR ANY CLAIM, * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE * USE OR OTHER DEALINGS IN THE SOFTWARE. * */ ''' +import collections +import functools import sys import re class StringTable: """ A class for collecting multiple strings in a single larger string that is used by indexing (to avoid relocations in the resulting binary) """ def __init__(self): @@ -125,26 +127,102 @@ class IntTable: self.typename, name, '\n'.join(fragments) )) class Field: def __init__(self, reg, s_name): self.s_name = s_name self.name = strip_prefix(s_name) self.values = [] +def format(self, string_table, idx_table): +if len(self.values): +values_offsets = [] +for value in self.values: +while value[1] >= len(values_offsets): +values_offsets.append(-1) +values_offsets[value[1]] = string_table.add(strip_prefix(value[0])) +return '{%s, %s(~0u), %s, %s}' % ( +string_table.add(self.name), self.s_name, +len(values_offsets), idx_table.add(values_offsets)) +else: +return '{%s, %s(~0u)}' % (string_table.add(self.name), self.s_name) + +def __eq__(self, other): +return (self.s_name == other.s_name and +self.name == other.name and +len(self.values) == len(other.values) and +all(a[0] == b[0] and a[1] == b[1] for a, b, in 
zip(self.values, other.values))) + +def __ne__(self, other): +return not (self == other) + + +class FieldTable: +""" +A class for collecting multiple arrays of register fields in a single big +array that is used by indexing (to avoid relocations in the resulting binary) +""" +def __init__(self): +self.table = [] +self.idxs = set() +self.name_to_idx = collections.defaultdict(lambda: []) + +def add(self, array): +""" +Add an array of Field objects, and return the index of where to find +the array in the table. +""" +# Check if we can find the array in the table already +for base_idx in self.name_to_idx.get(array[0].name, []): +if base_idx + len(array) > len(self.table): +continue + +for i, a in enumerate(array): +b = self.table[base_idx + i] +if a != b: +break +else: +return base_idx + +base_idx = len(self.table) +self.idxs.add(base_idx) + +for field in array: +self.name_to_idx[field.name].append(len(self.table)) +self.table.append(field) + +return base_idx + +def emit(self, filp, string_table, idx_table): +""" +Write +static const struct si_field sid_fields_table[] = { ... }; +to filp. +""" +idxs = sorted(self.idxs) + [len(self.table)] + +filp.write('static const struct si_field sid_fields_table[] = {\n') + +for start, end in zip(idxs, idxs[1:]): +filp.write('\t/* %s */\n' % (start)) +for field in self.table[start:end]: +filp.write('\t%s,\n' % (field.format(string_table, idx_table))) + +filp.write('};\n') + + class Reg: def __init__(self, r_name): self.r_name = r_name self.name = strip_prefix(r_name) self.fields = [] -self.own_fields = True def strip_prefix(s): '''Strip prefix in the form ._.*_, e.g. 
R_001234_''' return s[s[2:].find('_')+3:] def parse(filename, regs, packets): stream = open(filename) for line in stream: @@ -200,28 +278,27 @@ def parse(filename, regs, packets): for reg in regs: if len(reg.fields) and reg.name.find('0') != -1: reg_dict[reg.name] = reg # Assign fields for reg in regs: if not len(reg.fields): reg0 = reg_dict.get(match_number.sub('0', reg.name)) if reg0 != None: reg.fields =
[Mesa-dev] [PATCH] util/ralloc: set prev-pointers correctly in ralloc_adopt
From: Nicolai Hähnle

Found by inspection. I'm not aware of any actual failures caused by this,
but a precise sequence of ralloc_adopt and ralloc_free should be able to
cause problems.
---
 src/util/ralloc.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/util/ralloc.c b/src/util/ralloc.c
index bf46439df4e..50e629fe450 100644
--- a/src/util/ralloc.c
+++ b/src/util/ralloc.c
@@ -304,24 +304,26 @@ ralloc_adopt(const void *new_ctx, void *old_ctx)
    new_info = get_header(new_ctx);
 
    /* If there are no children, bail. */
    if (unlikely(old_info->child == NULL))
       return;
 
    /* Set all the children's parent to new_ctx; get a pointer to the last child. */
    for (child = old_info->child; child->next != NULL; child = child->next) {
       child->parent = new_info;
    }
+   child->parent = new_info;
 
    /* Connect the two lists together; parent them to new_ctx; make old_ctx empty. */
    child->next = new_info->child;
-   child->parent = new_info;
+   if (new_info->child)
+      new_info->child->prev = child;
    new_info->child = old_info->child;
    old_info->child = NULL;
 }
 
 void *
 ralloc_parent(const void *ptr)
 {
    ralloc_header *info;
 
    if (unlikely(ptr == NULL))
-- 
2.11.0
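To see why the missing prev update matters, here is a stand-alone sketch of the same splice on a simplified node type. struct node and adopt_children are illustrative names only; the real ralloc_header has additional members and the real function works on user pointers via get_header().

```c
#include <stddef.h>

/* Simplified stand-in for ralloc_header; not the real Mesa layout. */
struct node {
   struct node *parent;
   struct node *child;   /* first child */
   struct node *prev;    /* previous sibling */
   struct node *next;    /* next sibling */
};

/* Move all children of old_ctx to the front of new_ctx's child list. */
static void
adopt_children(struct node *new_ctx, struct node *old_ctx)
{
   struct node *child;

   /* Nothing to adopt. */
   if (old_ctx->child == NULL)
      return;

   /* Reparent every child; the loop stops on the last one, which is
    * reparented right after it. */
   for (child = old_ctx->child; child->next != NULL; child = child->next)
      child->parent = new_ctx;
   child->parent = new_ctx;

   /* Splice: the last adopted child now precedes new_ctx's old first
    * child, so that child's prev pointer has to be updated too. */
   child->next = new_ctx->child;
   if (new_ctx->child)
      new_ctx->child->prev = child;

   new_ctx->child = old_ctx->child;
   old_ctx->child = NULL;
}
```

If the prev assignment is left out, the old first child of new_ctx keeps a NULL prev even though it is no longer the head of the list, which is the kind of inconsistency a later unlink could trip over.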
Re: [Mesa-dev] [PATCH] mesa/mtypes: repack gl_texture_object.
On Mon, Sep 4, 2017 at 1:29 PM, Marek Olšák wrote:
> On Sun, Sep 3, 2017 at 1:18 PM, Dave Airlie wrote:
>> From: Dave Airlie
>>
>> reduces size from 1144 to 1128.
>>
>> Signed-off-by: Dave Airlie
>> ---
>>  src/mesa/main/mtypes.h | 10 +-
>>  1 file changed, 5 insertions(+), 5 deletions(-)
>>
>> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
>> index d44897b..3d68a6d 100644
>> --- a/src/mesa/main/mtypes.h
>> +++ b/src/mesa/main/mtypes.h
>> @@ -1012,7 +1012,6 @@ struct gl_texture_object
>>     struct gl_sampler_object Sampler;
>>
>>     GLenum DepthMode;           /**< GL_ARB_depth_texture */
>
> The patch looks good, but here are some ideas for future improvements:
>
> GLenum can be uint16_t everywhere, because GL doesn't set higher bits:
>
> typedef uint16_t GLenum16.
> s/GLenum/GLenum16/
>
>> -   bool StencilSampling;       /**< Should we sample stencil instead of depth? */
>>
>>     GLfloat Priority;           /**< in [0,1] */
>>     GLint BaseLevel;            /**< min mipmap level, OpenGL 1.2 */
>> @@ -1033,12 +1032,17 @@ struct gl_texture_object
>>     GLboolean Immutable;        /**< GL_ARB_texture_storage */
>>     GLboolean _IsFloat;         /**< GL_OES_float_texture */
>>     GLboolean _IsHalfFloat;     /**< GL_OES_half_float_texture */
>> +   bool StencilSampling;       /**< Should we sample stencil instead of depth? */
>> +   bool HandleAllocated;       /**< GL_ARB_bindless_texture */
>
> All bools can be 1 bit:
>
> bool x:1;
> GLboolean y:1;
>
> etc.
>
>>
>>     GLuint MinLevel;            /**< GL_ARB_texture_view */
>>     GLuint MinLayer;            /**< GL_ARB_texture_view */
>>     GLuint NumLevels;           /**< GL_ARB_texture_view */
>>     GLuint NumLayers;           /**< GL_ARB_texture_view */
>
> MinLevel, NumLevels can be ubyte (uint8_t). MinLayer, NumLayers can be
> ushort (uint16_t)... simply by considering the range of possible
> values.

One more: Enums have 4 bytes by default, but you can limit that, because who wants 2^32 possible values:

enum type var:5;

Marek
Re: [Mesa-dev] [PATCH] mesa/mtypes: repack gl_texture_object.
On Sun, Sep 3, 2017 at 1:18 PM, Dave Airlie wrote:
> From: Dave Airlie
>
> reduces size from 1144 to 1128.
>
> Signed-off-by: Dave Airlie
> ---
>  src/mesa/main/mtypes.h | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
> index d44897b..3d68a6d 100644
> --- a/src/mesa/main/mtypes.h
> +++ b/src/mesa/main/mtypes.h
> @@ -1012,7 +1012,6 @@ struct gl_texture_object
>     struct gl_sampler_object Sampler;
>
>     GLenum DepthMode;           /**< GL_ARB_depth_texture */

The patch looks good, but here are some ideas for future improvements:

GLenum can be uint16_t everywhere, because GL doesn't set higher bits:

typedef uint16_t GLenum16.
s/GLenum/GLenum16/

> -   bool StencilSampling;       /**< Should we sample stencil instead of depth? */
>
>     GLfloat Priority;           /**< in [0,1] */
>     GLint BaseLevel;            /**< min mipmap level, OpenGL 1.2 */
> @@ -1033,12 +1032,17 @@ struct gl_texture_object
>     GLboolean Immutable;        /**< GL_ARB_texture_storage */
>     GLboolean _IsFloat;         /**< GL_OES_float_texture */
>     GLboolean _IsHalfFloat;     /**< GL_OES_half_float_texture */
> +   bool StencilSampling;       /**< Should we sample stencil instead of depth? */
> +   bool HandleAllocated;       /**< GL_ARB_bindless_texture */

All bools can be 1 bit:

bool x:1;
GLboolean y:1;

etc.

>
>     GLuint MinLevel;            /**< GL_ARB_texture_view */
>     GLuint MinLayer;            /**< GL_ARB_texture_view */
>     GLuint NumLevels;           /**< GL_ARB_texture_view */
>     GLuint NumLayers;           /**< GL_ARB_texture_view */

MinLevel, NumLevels can be ubyte (uint8_t). MinLayer, NumLayers can be
ushort (uint16_t)... simply by considering the range of possible
values.

Marek
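The narrowing ideas in this mail (1-bit bools, 8-bit levels, 16-bit layers) can be tried on a small stand-in struct. The two structs below are hypothetical and only echo some field names from gl_texture_object; the value ranges are assumed to fit as the mail suggests, which would need checking against the real limits before changing Mesa itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical cut-down struct with full-width members. */
struct tex_flags_wide {
   bool immutable;
   bool is_float;
   bool stencil_sampling;
   bool handle_allocated;
   unsigned min_level;
   unsigned min_layer;
   unsigned num_levels;
   unsigned num_layers;
};

/* Same information with 1-bit flags and narrower integers. */
struct tex_flags_narrow {
   bool immutable:1;
   bool is_float:1;
   bool stencil_sampling:1;
   bool handle_allocated:1;
   uint8_t min_level;     /* assumes mip levels fit in 8 bits */
   uint8_t num_levels;
   uint16_t min_layer;    /* assumes layer counts fit in 16 bits */
   uint16_t num_layers;
};
```

The trade-off is that bit-field members cannot have their address taken and each access needs a mask or shift, so this tends to pay off mainly for structures that exist in large numbers.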
Re: [Mesa-dev] [PATCH] mesa/mtypes: repack gl_texture_object.
Hi,

On 03.09.2017 14:22, Thomas Helland wrote:
> 2017-09-03 13:18 GMT+02:00 Dave Airlie:
>> From: Dave Airlie
>>
>> reduces size from 1144 to 1128.
>>
>> Signed-off-by: Dave Airlie
>> ---
>>  src/mesa/main/mtypes.h | 10 +-
>>  1 file changed, 5 insertions(+), 5 deletions(-)
>>
>> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
>> index d44897b..3d68a6d 100644
>> --- a/src/mesa/main/mtypes.h
>> +++ b/src/mesa/main/mtypes.h
>> @@ -1012,7 +1012,6 @@ struct gl_texture_object
>>     struct gl_sampler_object Sampler;
>>
>>     GLenum DepthMode;           /**< GL_ARB_depth_texture */
>> -   bool StencilSampling;       /**< Should we sample stencil instead of depth? */
>>
>>     GLfloat Priority;           /**< in [0,1] */
>>     GLint BaseLevel;            /**< min mipmap level, OpenGL 1.2 */
>> @@ -1033,12 +1032,17 @@ struct gl_texture_object
>>     GLboolean Immutable;        /**< GL_ARB_texture_storage */
>>     GLboolean _IsFloat;         /**< GL_OES_float_texture */
>>     GLboolean _IsHalfFloat;     /**< GL_OES_half_float_texture */
>> +   bool StencilSampling;       /**< Should we sample stencil instead of depth? */
>> +   bool HandleAllocated;       /**< GL_ARB_bindless_texture */
>
> Maybe we could use "pragma pack" here instead?

Structure re-organization (including changing member types) is done for three reasons:
* Portability between systems (based on their minimum alignment rules)
* Saving memory by minimizing structure sizes
* Improving performance by:
  - getting structures to fit better into caches
  - moving members that are used together into the same cacheline

AFAIK there are three main reasons to use pragma pack on top of that:
* Legacy compatibility for non-portable ABIs (e.g. DOS/Windows ones) when members have to be at specific, non-optimal positions and alignments
* When a commonly used structure takes so much memory in total that the program can run out of memory, and packing is the only way to reduce it further
* Squeezing certain members into the same cacheline after *profiling has shown* that them not being there is actually a problem

Issues that you might get from packing:
* On some platforms (at least older ARMs), the CPU does not guarantee atomic accesses if a variable crosses a page boundary (which could happen when you forbid the compiler from aligning it naturally)
* Compiler optimizations that rely on alignment expectations can break if you pass parts of the structure elsewhere without telling the compiler that they are not properly aligned

> I'm debating with myself whether or not moving this bool away from the
> rest of the bindless_texture related variables is worth saving the few bytes.

For performance reasons, it's good to keep things that are used together also together in memory (structure), but the granularity for that is pretty small... (cache line size, aligned)

- Eero

>>
>>     GLuint MinLevel;            /**< GL_ARB_texture_view */
>>     GLuint MinLayer;            /**< GL_ARB_texture_view */
>>     GLuint NumLevels;           /**< GL_ARB_texture_view */
>>     GLuint NumLayers;           /**< GL_ARB_texture_view */
>>
>> +   /** GL_EXT_memory_object */
>> +   GLenum TextureTiling;
>> +
>>     /** Actual texture images, indexed by [cube face] and [mipmap level] */
>>     struct gl_texture_image *Image[MAX_FACES][MAX_TEXTURE_LEVELS];
>>
>> @@ -1057,13 +1061,9 @@ struct gl_texture_object
>>     /** GL_ARB_shader_image_load_store */
>>     GLenum ImageFormatCompatibilityType;
>>
>> -   /** GL_EXT_memory_object */
>> -   GLenum TextureTiling;
>> -
>>     /** GL_ARB_bindless_texture */
>>     struct util_dynarray SamplerHandles;
>>     struct util_dynarray ImageHandles;
>> -   bool HandleAllocated;
>>  };
>>
>> --
>> 2.9.5
Re: [Mesa-dev] [PATCH 3/3] mesa/mtypes: repack display list structs.
For the series:

Reviewed-by: Marek Olšák

Marek

On Sun, Sep 3, 2017 at 1:06 PM, Dave Airlie wrote:
> From: Dave Airlie
>
> This reduces each of these by 8 bytes.
>
> Signed-off-by: Dave Airlie
> ---
>  src/mesa/main/mtypes.h | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
> index a72a3b2..34da6b9 100644
> --- a/src/mesa/main/mtypes.h
> +++ b/src/mesa/main/mtypes.h
> @@ -4342,8 +4342,8 @@ union gl_dlist_node;
>  struct gl_display_list
>  {
>     GLuint Name;
> -   GLchar *Label;       /**< GL_KHR_debug */
>     GLbitfield Flags;    /**< DLIST_x flags */
> +   GLchar *Label;       /**< GL_KHR_debug */
>     /** The dlist commands are in a linked list of nodes */
>     union gl_dlist_node *Head;
>  };
> @@ -4354,11 +4354,10 @@ struct gl_display_list
>   */
>  struct gl_dlist_state
>  {
> -   GLuint CallDepth;    /**< Current recursion calling depth */
> -
>     struct gl_display_list *CurrentList; /**< List currently being compiled */
>     union gl_dlist_node *CurrentBlock; /**< Pointer to current block of nodes */
>     GLuint CurrentPos;   /**< Index into current block of nodes */
> +   GLuint CallDepth;    /**< Current recursion calling depth */
>
>     GLvertexformat ListVtxfmt;
>
> --
> 2.9.5
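The 8-byte saving in the quoted patch comes from padding, which a small stand-alone sketch can make visible. The structs below are illustrative stand-ins (unsigned replaces GLuint/GLbitfield and void * replaces the node pointer), and the sizes in the comments assume a typical 64-bit ABI with 4-byte unsigned and 8-byte pointers.

```c
#include <stddef.h>

/* Pointer sandwiched between two 4-byte members: on a typical LP64 ABI
 * each 4-byte member is followed by 4 bytes of padding (32 bytes total). */
struct dlist_before {
   unsigned name;
   char *label;
   unsigned flags;
   void *head;
};

/* Both 4-byte members adjacent: they share one 8-byte slot
 * (24 bytes total under the same assumptions). */
struct dlist_after {
   unsigned name;
   unsigned flags;
   char *label;
   void *head;
};
```

On a 32-bit ABI both layouts usually end up the same size, so the reorder is harmless there and only helps where pointers are wider than the integer members.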
[Mesa-dev] [Bug 102518] [apitrace, backtrace] Crash in _mesa_is_bufferobj during load of "XCOM 2: War of the Chosen"
https://bugs.freedesktop.org/show_bug.cgi?id=102518

--- Comment #7 from Kai ---
I can confirm that commenting out the export of MESA_NO_ERROR=1 in XCOM2WotC/config/extra-environment.sh lets me launch the game.

This was tested on (fully updated Debian testing as a base):
GPU: Hawaii PRO [Radeon R9 290] (ChipID = 0x67b1)
Mesa: Git:master/39a69f0692
libdrm: 2.4.82-1
LLVM: SVN:trunk/r312410 (6.0 devel)
X.Org: 2:1.19.3-2
Linux: 4.12.10
Firmware (firmware-amd-graphics): 20170823-1
libclc: Git:master/7331b0a1fa
DDX (xserver-xorg-video-amdgpu): 1.3.0-1
Re: [Mesa-dev] [PATCH] glsl: fix loop analysis of loop terminators
Reviewed-by: Marek Olšák

Marek

On Mon, Sep 4, 2017 at 5:29 AM, Timothy Arceri wrote:
> This code incorrectly assumed that loop terminators will always be
> at the start of the loop. Fortunately we *seem* to avoid any bugs
> because the unrolling code loops over and correctly handles the
> terminators.
>
> However the incorrect analysis can result in loops not being
> unrolled at all. For example the current code would unroll:
>
>   int j = 0;
>   do {
>      if (j > 5)
>         break;
>
>      ... do stuff ...
>
>      j++;
>   } while (j < 4);
>
> But would fail to unroll the following as no iteration limit was
> calculated because it failed to find the terminator:
>
>   int j = 0;
>   do {
>      ... do stuff ...
>
>      j++;
>   } while (j < 4);
>
> Also we would fail to unroll the following as we ended up
> calculating the iteration limit as 6 rather than 4. The unroll
> code then assumed we had 3 terminators rather than 2 as it
> wasn't able to determine that "if (j > 5)" was redundant.
>
>   int j = 0;
>   do {
>      if (j > 5)
>         break;
>
>      ... do stuff ...
>
>      if (bool(i))
>         break;
>
>      j++;
>   } while (j < 4);
> ---
>  src/compiler/glsl/loop_analysis.cpp | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/src/compiler/glsl/loop_analysis.cpp b/src/compiler/glsl/loop_analysis.cpp
> index b9bae43536..253a405dfb 100644
> --- a/src/compiler/glsl/loop_analysis.cpp
> +++ b/src/compiler/glsl/loop_analysis.cpp
> @@ -290,22 +290,20 @@ loop_analysis::visit_leave(ir_loop *ir)
>     foreach_in_list(ir_instruction, node, &ir->body_instructions) {
>        /* Skip over declarations at the start of a loop.
>         */
>        if (node->as_variable())
>           continue;
>
>        ir_if *if_stmt = ((ir_instruction *) node)->as_if();
>
>        if ((if_stmt != NULL) && is_loop_terminator(if_stmt))
>           ls->insert(if_stmt);
> -      else
> -         break;
>     }
>
>
>     foreach_in_list_safe(loop_variable, lv, &ls->variables) {
>        /* Move variables that are already marked as being loop constant to
>         * a separate list.  These trivially don't need to be tested.
>         */
>        if (lv->is_loop_constant()) {
>           lv->remove();
>           ls->constants.push_tail(lv);
> --
> 2.13.5
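The effect of dropping the early break can be shown on a toy model of a loop body. This is only a sketch: enum stmt_kind and both functions are invented for illustration, and it assumes that the trailing `while (j < 4)` condition has already been lowered to a conditional break at the end of the body.

```c
#include <stddef.h>

/* Hypothetical statement kinds standing in for GLSL IR nodes. */
enum stmt_kind { STMT_DECL, STMT_TERMINATOR, STMT_OTHER };

/* Old behaviour: skip leading declarations, collect terminators, and stop
 * at the first statement that is neither. */
static size_t
count_leading_terminators(const enum stmt_kind *body, size_t n)
{
   size_t found = 0;

   for (size_t i = 0; i < n; i++) {
      if (body[i] == STMT_DECL)
         continue;
      if (body[i] == STMT_TERMINATOR)
         found++;
      else
         break;
   }
   return found;
}

/* Patched behaviour: scan the whole body for terminators. */
static size_t
count_all_terminators(const enum stmt_kind *body, size_t n)
{
   size_t found = 0;

   for (size_t i = 0; i < n; i++) {
      if (body[i] == STMT_TERMINATOR)
         found++;
   }
   return found;
}
```

For a body shaped like the second example in the commit message (work, increment, then the lowered exit test), the old scan finds no terminator at all, while the full scan finds the one at the end.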
[Mesa-dev] [Bug 77449] Tracker bug for all bugs related to Steam titles
https://bugs.freedesktop.org/show_bug.cgi?id=77449

Bug 77449 depends on bug 102518, which changed state.

Bug 102518 Summary: [apitrace,backtrace] Crash in _mesa_is_bufferobj during load of "XCOM 2: War of the Chosen"
https://bugs.freedesktop.org/show_bug.cgi?id=102518

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |NOTOURBUG
[Mesa-dev] [Bug 102518] [apitrace, backtrace] Crash in _mesa_is_bufferobj during load of "XCOM 2: War of the Chosen"
https://bugs.freedesktop.org/show_bug.cgi?id=102518

Timothy Arceri changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |NOTOURBUG

--- Comment #6 from Timothy Arceri ---
(In reply to Marc Di Luzio from comment #4)
> FWIW to clear up some confusion - our launch scripts turn on KHR_no_error
> for WOTC for the extra performance.
>
> See /path/to/install/dir/XCOM2WotC/config/extra-environment.sh:2
> export MESA_NO_ERROR=1
>
> I'll take a look and see if we're hitting a GL error at this stage.

(In reply to Marc Di Luzio from comment #5)
> It appears some errors slipped through.
>
> Example:
> [GL_DEBUG] Error message from OpenGL API call with id 1281: GL_INVALID_VALUE
> error generated. out of range.
>
> We'll handle this, I'd think it should be safe to assume this is an
> application bug. From what I understand using a KHR_no_error context is a
> trust handshake that you won't trigger any error states.

Thanks for confirming. Marking as resolved.
[Mesa-dev] [Bug 102530] [bisected] Kodi crashes when launching a stream - commit bd2662bf
https://bugs.freedesktop.org/show_bug.cgi?id=102530

Timothy Arceri changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO

--- Comment #4 from Timothy Arceri ---
(In reply to Tapani Pälli from comment #3)
> (In reply to Tapani Pälli from comment #2)
> > Just a guess but at least it looks possible for
> > "UniformRemapTable[location]" to access garbage, there should be a check for
> > location value before using it.
>
> Disclaimer: I did not realize that undefined behaviour including a crash in
> that situation is actually OK by the no_error spec so forget about this
> comment :)
[Mesa-dev] [Bug 102454] glibc 2.26 doesn't provide anymore xlocale.h
https://bugs.freedesktop.org/show_bug.cgi?id=102454

Eric Engestrom changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #5 from Eric Engestrom ---
Fixed by:

commit 49b428470e28ae6ab22083e43fa41abf622f3b0d
Author: Eric Engestrom
Date:   Thu Aug 31 16:55:56 2017 +

    util: improve compiler guard

    Glibc 2.26 has dropped xlocale.h, but the functions needed (strtod_l()
    and strtof_l()) can be found in stdlib.h.
    Improve the detection method to allow newer builds to still make use of
    the locale-setting.
[Mesa-dev] [Bug 102518] [apitrace, backtrace] Crash in _mesa_is_bufferobj during load of "XCOM 2: War of the Chosen"
https://bugs.freedesktop.org/show_bug.cgi?id=102518

--- Comment #5 from Marc Di Luzio ---
It appears some errors slipped through.

Example:
[GL_DEBUG] Error message from OpenGL API call with id 1281: GL_INVALID_VALUE
error generated. out of range.

We'll handle this, I'd think it should be safe to assume this is an
application bug. From what I understand using a KHR_no_error context is a
trust handshake that you won't trigger any error states.
[Mesa-dev] [Bug 102518] [apitrace, backtrace] Crash in _mesa_is_bufferobj during load of "XCOM 2: War of the Chosen"
https://bugs.freedesktop.org/show_bug.cgi?id=102518

--- Comment #4 from Marc Di Luzio ---
FWIW to clear up some confusion - our launch scripts turn on KHR_no_error
for WOTC for the extra performance.

See /path/to/install/dir/XCOM2WotC/config/extra-environment.sh:2
export MESA_NO_ERROR=1

I'll take a look and see if we're hitting a GL error at this stage.
Re: [Mesa-dev] [PATCH] mesa/mtypes: repack gl_shader_program_data.
Reviewed-by: Samuel Pitoiset

On 09/03/2017 01:12 PM, Dave Airlie wrote:
From: Dave Airlie

This reduces the size from 144 bytes to 128 bytes.

Signed-off-by: Dave Airlie
---
 src/mesa/main/mtypes.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 2dab594..d44897b 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2853,9 +2853,9 @@ struct gl_shader_program_data
    struct gl_uniform_storage *UniformStorage;
 
    unsigned NumUniformBlocks;
-   struct gl_uniform_block *UniformBlocks;
-
    unsigned NumShaderStorageBlocks;
+
+   struct gl_uniform_block *UniformBlocks;
    struct gl_uniform_block *ShaderStorageBlocks;
 
    struct gl_active_atomic_buffer *AtomicBuffers;
@@ -2873,13 +2873,13 @@ struct gl_shader_program_data
     * lands we should switch to using the cache_fallback support.
     */
    bool skip_cache;
+   GLboolean Validated;
 
    /** List of all active resources after linking. */
    struct gl_program_resource *ProgramResourceList;
    unsigned NumProgramResourceList;
 
    enum gl_link_status LinkStatus;   /**< GL_LINK_STATUS */
-   GLboolean Validated;
    GLchar *InfoLog;
 
    unsigned Version;       /**< GLSL version used for linking */
Re: [Mesa-dev] [PATCH] mesa/mtypes: repack gl_texture_object.
Reviewed-by: Samuel Pitoiset

On 09/03/2017 01:18 PM, Dave Airlie wrote:
From: Dave Airlie

reduces size from 1144 to 1128.

Signed-off-by: Dave Airlie
---
 src/mesa/main/mtypes.h | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index d44897b..3d68a6d 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -1012,7 +1012,6 @@ struct gl_texture_object
    struct gl_sampler_object Sampler;
 
    GLenum DepthMode;           /**< GL_ARB_depth_texture */
-   bool StencilSampling;       /**< Should we sample stencil instead of depth? */
 
    GLfloat Priority;           /**< in [0,1] */
    GLint BaseLevel;            /**< min mipmap level, OpenGL 1.2 */
@@ -1033,12 +1032,17 @@ struct gl_texture_object
    GLboolean Immutable;        /**< GL_ARB_texture_storage */
    GLboolean _IsFloat;         /**< GL_OES_float_texture */
    GLboolean _IsHalfFloat;     /**< GL_OES_half_float_texture */
+   bool StencilSampling;       /**< Should we sample stencil instead of depth? */
+   bool HandleAllocated;       /**< GL_ARB_bindless_texture */
 
    GLuint MinLevel;            /**< GL_ARB_texture_view */
    GLuint MinLayer;            /**< GL_ARB_texture_view */
    GLuint NumLevels;           /**< GL_ARB_texture_view */
    GLuint NumLayers;           /**< GL_ARB_texture_view */
 
+   /** GL_EXT_memory_object */
+   GLenum TextureTiling;
+
    /** Actual texture images, indexed by [cube face] and [mipmap level] */
    struct gl_texture_image *Image[MAX_FACES][MAX_TEXTURE_LEVELS];
 
@@ -1057,13 +1061,9 @@ struct gl_texture_object
    /** GL_ARB_shader_image_load_store */
    GLenum ImageFormatCompatibilityType;
 
-   /** GL_EXT_memory_object */
-   GLenum TextureTiling;
-
    /** GL_ARB_bindless_texture */
    struct util_dynarray SamplerHandles;
    struct util_dynarray ImageHandles;
-   bool HandleAllocated;
 };
Re: [Mesa-dev] [PATCH] mesa/mtypes: repack gl_sampler_object.
Reviewed-by: Samuel Pitoiset

On 09/03/2017 01:21 PM, Dave Airlie wrote:
From: Dave Airlie

160->152.

Signed-off-by: Dave Airlie
---
 src/mesa/main/mtypes.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 3d68a6d..db9ea76 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -990,8 +990,8 @@ struct gl_sampler_object
    GLboolean CubeMapSeamless;   /**< GL_AMD_seamless_cubemap_per_texture */
 
    /** GL_ARB_bindless_texture */
-   struct util_dynarray Handles;
    bool HandleAllocated;
+   struct util_dynarray Handles;
 };
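The size reductions in these repack patches come from alignment padding. The following is a minimal sketch of the effect, not the actual Mesa structures: `struct handles`, `struct before` and `struct after` are made-up types, and the byte counts in the comments assume a typical LP64 ABI where pointers are 8 bytes and 8-byte aligned.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pointer-aligned member, standing in for something like a
 * dynamic array header. */
struct handles {
   void *data;
   size_t size;
};

/* Small members separated by a pointer-aligned member: each one-byte
 * field is followed by padding (7 bytes each on LP64). */
struct before {
   unsigned char seamless;
   struct handles h;
   bool allocated;
};

/* Same members, with the one-byte fields grouped so they share a single
 * padding slot in front of the aligned member. */
struct after {
   unsigned char seamless;
   bool allocated;
   struct handles h;
};
```

On an LP64 target this sketch should give 32 bytes for `struct before` and 24 bytes for `struct after`; a tool such as pahole can report the same holes in real structures.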
Re: [Mesa-dev] [PATCH] mesa/mtypes: reorganise gl_shader
Reviewed-by: Samuel Pitoiset

On 09/03/2017 01:09 PM, Dave Airlie wrote:
From: Dave Airlie

This reduces this from 200->182 bytes.

Signed-off-by: Dave Airlie
---
 src/mesa/main/mtypes.h | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 34da6b9..2dab594 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2567,9 +2567,10 @@ struct gl_shader
    GLchar *Label;   /**< GL_KHR_debug */
    unsigned char sha1[20]; /**< SHA1 hash of pre-processed source */
    GLboolean DeletePending;
-   enum gl_compile_status CompileStatus;
    bool IsES;              /**< True if this shader uses GLSL ES */
 
+   enum gl_compile_status CompileStatus;
+
 #ifdef DEBUG
    unsigned SourceChecksum;       /**< for debug/logging purposes */
 #endif
@@ -2581,14 +2582,14 @@ struct gl_shader
 
    unsigned Version;       /**< GLSL version used for linking */
 
-   struct exec_list *ir;
-   struct glsl_symbol_table *symbols;
-
    /**
     * A bitmask of gl_advanced_blend_mode values
     */
    GLbitfield BlendSupport;
 
+   struct exec_list *ir;
+   struct glsl_symbol_table *symbols;
+
    /**
     * Whether early fragment tests are enabled as defined by
     * ARB_shader_image_load_store.
[Mesa-dev] [Bug 102530] [bisected] Kodi crashes when launching a stream - commit bd2662bf
https://bugs.freedesktop.org/show_bug.cgi?id=102530

--- Comment #3 from Tapani Pälli ---
(In reply to Tapani Pälli from comment #2)
> Just a guess but at least it looks possible for
> "UniformRemapTable[location]" to access garbage, there should be a check for
> location value before using it.

Disclaimer: I did not realize that undefined behaviour including a crash in
that situation is actually OK by the no_error spec so forget about this
comment :)
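To make the trade-off in the comment above concrete, here is a small sketch of a remap-table lookup with and without validation. It is not Mesa's code: `struct remap_table`, `lookup_uniform` and the `no_error` flag are hypothetical names. The one behaviour taken from the OpenGL API is that a uniform location of -1 is silently ignored; everything else is an assumption for illustration.

```c
#include <stddef.h>

/* Hypothetical remap table: maps a uniform location to some entry. */
struct remap_table {
   const int *entries;
   size_t count;
};

/* Returns the entry for a location, or NULL when the call should have
 * no effect. */
static const int *
lookup_uniform(const struct remap_table *t, int location, int no_error)
{
   /* Location -1 means "no such uniform": the command is ignored, and
    * this holds whether or not errors are being checked. */
   if (location == -1)
      return NULL;

   /* With error checking enabled, reject out-of-range locations; a real
    * implementation would also record GL_INVALID_OPERATION here. */
   if (!no_error && (location < 0 || (size_t) location >= t->count))
      return NULL;

   /* In no-error mode the range check is skipped, so an out-of-range
    * location is the application's bug and the behaviour is undefined. */
   return &t->entries[location];
}
```

The sketch only shows why the -1 case needs handling in both modes, while other invalid locations are left to the application once it opts into a no-error context.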
Re: [Mesa-dev] [PATCH] vbo: fix build errors on android
Reviewed-by: Charmaine Lee

> On Sep 3, 2017, at 10:13 PM, Tapani Pälli wrote:
>
> incompatible pointer to integer conversion assigning to 'GLintptr' (aka 'int')
> from 'const char *' [-Werror,-Wint-conversion]
>
>       offset = indices;
>              ^ ~~~
>
> Signed-off-by: Tapani Pälli
> ---
>  src/mesa/vbo/vbo_minmax_index.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/mesa/vbo/vbo_minmax_index.c b/src/mesa/vbo/vbo_minmax_index.c
> index 58a2af49ac..1377926bba 100644
> --- a/src/mesa/vbo/vbo_minmax_index.c
> +++ b/src/mesa/vbo/vbo_minmax_index.c
> @@ -255,7 +255,7 @@ vbo_get_minmax_index(struct gl_context *ctx,
>                                       count, min_index, max_index))
>           return;
>
> -      offset = indices;
> +      offset = (GLintptr) indices;
>        indices = ctx->Driver.MapBufferRange(ctx, offset, size,
>                                             GL_MAP_READ_BIT, ib->obj,
>                                             MAP_INTERNAL);
> --
> 2.13.5
>
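For context on the build fix above: when an index buffer object is bound, the `indices` pointer argument carries a byte offset rather than an address, so it has to be converted to an integer explicitly. The sketch below shows that conversion in isolation. It uses `intptr_t` as a stand-in for `GLintptr`, and both helper names are hypothetical.

```c
#include <stdint.h>

/* Interpret a "pointer" that actually encodes a byte offset into a bound
 * buffer. The explicit cast is what silences -Wint-conversion; without
 * it, assigning a pointer to an integer type is a constraint violation. */
static intptr_t
offset_from_indices(const void *indices)
{
   return (intptr_t) indices;
}

/* The reverse direction, as an application would do when passing a
 * buffer offset through a pointer-typed parameter. */
static const void *
indices_from_offset(intptr_t offset)
{
   return (const void *) offset;
}
```

The round trip through a pointer is implementation-defined in C, but it is the established convention for buffer offsets in the OpenGL API.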
Re: [Mesa-dev] [PATCH 0/9] spirv: Improve logging and error handling
On 08/29/2017 01:55 PM, Tapani Pälli wrote:

LGTM, patches 1-5
Reviewed-by: Tapani Pälli

With these changes we can add a nir_spirv_debug_callback that has instance
and shader object in private data so that we can call VK_EXT_debug_report
functionality from anv_pipeline code.

Went through the rest of it, series is:
Reviewed-by: Tapani Pälli

On 08/17/2017 08:22 PM, Jason Ekstrand wrote:

This series has two objectives:

 1) Improve logging to provide more detail and provide hooks so we can
    plumb errors and warnings through to debug_report extensions.

 2) Improve error handling so that not all errors result in killing the
    process. This is done by adding new spv_fail and spv_assert helpers
    which log the error and longjump back to a point where we can clean
    up and return NULL without crashing.

There is still quite a ways to go with error handling if we want to be able
to guarantee that we won't crash given an arbitrary stream of bytes.
However, this should at least be a step in the right direction. There are
probably at least a couple of cases where I could have left an assert() as
an actual "kill the process" assert because it's testing for internal
consistency. However, the vast majority of them are validation checks so
it seemed better to just search+replace them all and we can convert the few
that we want back to real asserts later.

Cc: "Ian Romanick"

Jason Ekstrand (9):
  ralloc: Allow reparenting to a NULL context
  spirv: Parent the nir_shader to the builder while building
  spirv: Re-arrange vtn_builder initialization
  spirv: Rework logging
  spirv: Do something useful with OpSource
  util: Add a NORETURN macro
  spirv: Add vtn_fail and vtn_assert helpers
  spirv: Replace assert with vtn_assert
  spirv: Replace unreachable with vtn_fail

 configure.ac                       | 1 +
 src/amd/vulkan/radv_pipeline.c     | 2 +-
 src/compiler/spirv/nir_spirv.h     | 17 +-
 src/compiler/spirv/spirv2nir.c     | 3 +-
 src/compiler/spirv/spirv_to_nir.c  | 496 +++--
 src/compiler/spirv/vtn_alu.c       | 46 ++--
 src/compiler/spirv/vtn_cfg.c       | 59 ++---
 src/compiler/spirv/vtn_glsl450.c   | 18 +-
 src/compiler/spirv/vtn_private.h   | 56 -
 src/compiler/spirv/vtn_variables.c | 253 +--
 src/intel/vulkan/anv_pipeline.c    | 2 +-
 src/util/macros.h                  | 6 +
 src/util/ralloc.c                  | 2 +-
 13 files changed, 578 insertions(+), 383 deletions(-)
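The error-handling approach described in the cover letter (log the failure, jump back to a recovery point, clean up, return a failure value) can be sketched with standard `setjmp`/`longjmp`. This is only an illustration of the pattern under assumed names: `struct builder`, `builder_fail`, `builder_assert`, `parse_words` and `translate` are hypothetical and are not the series' actual helpers. The one external fact used is the SPIR-V magic number 0x07230203.

```c
#include <setjmp.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical builder state holding the recovery point. */
struct builder {
   jmp_buf fail_jump;
   const char *last_error;
};

/* Record the message and unwind to the setjmp() in translate().
 * This function never returns to its caller. */
static void
builder_fail(struct builder *b, const char *msg)
{
   b->last_error = msg;
   longjmp(b->fail_jump, 1);
}

/* Validation check that fails the translation instead of the process. */
#define builder_assert(b, expr) \
   do { if (!(expr)) builder_fail((b), #expr); } while (0)

static void
parse_words(struct builder *b, const unsigned *words, size_t count)
{
   builder_assert(b, count >= 1);
   builder_assert(b, words[0] == 0x07230203u); /* SPIR-V magic number */
}

/* Returns true on success; on failure stores the failed check's text in
 * *error_out and returns false without aborting. */
static bool
translate(const unsigned *words, size_t count, const char **error_out)
{
   struct builder *b = calloc(1, sizeof(*b));
   if (b == NULL) {
      *error_out = "out of memory";
      return false;
   }

   if (setjmp(b->fail_jump)) {
      /* Reached via longjmp(): clean up and report the failure. */
      *error_out = b->last_error;
      free(b);
      return false;
   }

   parse_words(b, words, count);

   free(b);
   *error_out = NULL;
   return true;
}
```

One design point worth noting: `b` is assigned before `setjmp()` and not modified afterwards, which keeps its value well defined when control returns through `longjmp()`.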
Re: [Mesa-dev] [PATCH mesa] anv: fix off by one in array check
On Monday, 2017-09-04 04:54:36 +, Jason Ekstrand wrote:
> I sent the same patch a few hours later. I don't care which one we land.
> You have a more descriptive commit message.

Alright then, I just pushed mine.

>
> Reviewed-by: Jason Ekstrand
>
> On Sun, Sep 3, 2017 at 11:33 AM, Eric Engestrom wrote:
>
> > `anv_formats[ARRAY_SIZE(anv_formats)]` is already one too far.
> > Spotted by Coverity.
> >
> > CovID: 1417259
> > Fixes: 242211933a0682696170 "anv/formats: Nicely handle unknown VkFormat enums"
> > Cc: Jason Ekstrand
> > Signed-off-by: Eric Engestrom
> > ---
> >  src/intel/vulkan/anv_formats.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/src/intel/vulkan/anv_formats.c b/src/intel/vulkan/anv_formats.c
> > index c23b143cac..eead1aa790 100644
> > --- a/src/intel/vulkan/anv_formats.c
> > +++ b/src/intel/vulkan/anv_formats.c
> > @@ -253,7 +253,7 @@ static const struct anv_format anv_formats[] = {
> >  static bool
> >  format_supported(VkFormat vk_format)
> >  {
> > -   if (vk_format > ARRAY_SIZE(anv_formats))
> > +   if (vk_format >= ARRAY_SIZE(anv_formats))
> >        return false;
> >
> >     return anv_formats[vk_format].isl_format != ISL_FORMAT_UNSUPPORTED;
> > --
> > Cheers,
> >   Eric
> >
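The off-by-one in the patch above is easy to reproduce in isolation. Below is a minimal sketch with a made-up table: `table` and `entry_supported` are hypothetical, and a value of 0 stands in for an unsupported format. The point is only that valid indices run from 0 to ARRAY_SIZE - 1, so the guard must use `>=`.

```c
#include <stdbool.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical lookup table; 0 marks an unsupported entry. */
static const int table[] = { 10, 0, 30 };

static bool
entry_supported(size_t index)
{
   /* ">" would let index == ARRAY_SIZE(table) through and read one
    * element past the end of the array. */
   if (index >= ARRAY_SIZE(table))
      return false;

   return table[index] != 0;
}
```

Here `entry_supported(3)` is rejected by the bounds check, whereas the `>` form would have read `table[3]`, which does not exist.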