Re: [Mesa-dev] [PATCH] mesa: readpixels add support for GL_HALF_FLOAT
On 03/22/2018 04:43 AM, Lin, Johnson wrote:
> Hi,
>
> Thanks for the comments. I just noticed it does not check the extension
> support for EXT_color_buffer_float neither?

That is probably because it is enabled as 'dummy_true' (see
extensions_table.h) so it's always enabled on any driver. I wonder if we can
just go and do the same for EXT_color_buffer_half_float? Is there any driver
that would not support this?

> -----Original Message-----
> From: Palli, Tapani
> Sent: Wednesday, March 21, 2018 6:57 PM
> To: Alejandro Piñeiro; Lin, Johnson; mesa-dev@lists.freedesktop.org
> Subject: Re: [Mesa-dev] [PATCH] mesa: readpixels add support for GL_HALF_FLOAT
>
> On 21.03.2018 12:45, Tapani Pälli wrote:
>> On 21.03.2018 08:52, Alejandro Piñeiro wrote:
>>> On 21/03/18 06:57, Lin Johnson wrote:
>>>> Ext_color_buffer_half_float is using type GL_HALF_FLOAT and
>>>> data_type GL_FLOAT. This fix Android CTS test
>>>> android.view.cts.PixelCopyTest #TestWindowProducerCopyToRGBA16F
>>>>
>>>> Signed-off-by: Lin Johnson
>>>> ---
>>>>  src/mesa/main/readpix.c | 2 ++
>>>>  1 file changed, 2 insertions(+)
>>>>
>>>> diff --git a/src/mesa/main/readpix.c b/src/mesa/main/readpix.c
>>>> index 6ce340ddf9bb..51331dd095ab 100644
>>>> --- a/src/mesa/main/readpix.c
>>>> +++ b/src/mesa/main/readpix.c
>>>> @@ -920,6 +920,8 @@ read_pixels_es3_error_check(GLenum format, GLenum type,
>>>>     case GL_RGBA:
>>>>        if (type == GL_FLOAT && data_type == GL_FLOAT)
>>>>           return GL_NO_ERROR; /* EXT_color_buffer_float */
>>>> +      if (type == GL_HALF_FLOAT && data_type == GL_FLOAT)
>>>> +         return GL_NO_ERROR; /* EXT_color_buffer_half_float */
>>>
>>> If this combination is allowed thanks to that extension, what would
>>> happen if that extension is not supported? shouldn't include a
>>> extension check? Or that is checked in a different place?
>>
>> I was thinking the same. Having seen the test it does not seem to make
>> any kind of checks what is supported (like asking for extension, or
>> maybe asking for GL_IMPLEMENTATION_COLOR_READ_TYPE) but attempts
>> glReadPixels using GL_HALF_FLOAT type, I think it should verify first
>> that such reads are supported. Currently we don't seem to support this
>> extension.
>
> ... but probably support the functionality (OpenGL ES 3.2), so maybe some
> checks needed for ES version (?)
>
>>>>        if (type == GL_UNSIGNED_BYTE && data_type == GL_UNSIGNED_NORMALIZED)
>>>>           return GL_NO_ERROR;
>>>>        if (internalFormat == GL_RGB10_A2 &&

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
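A minimal sketch of the extension check asked about in this thread. It is not Mesa code: struct sketch_extensions, its two fields, and sketch_read_pixels_check() are hypothetical names made up for illustration, and the sketch only assumes that each float read type is accepted when its matching extension is advertised. The enum values are the ones from the Khronos GL headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Enum values as defined in the Khronos GL headers. */
#define SK_GL_NO_ERROR          0
#define SK_GL_INVALID_OPERATION 0x0502
#define SK_GL_FLOAT             0x1406
#define SK_GL_HALF_FLOAT        0x140B
#define SK_GL_RGBA              0x1908

/* Hypothetical extension flags; Mesa keeps its real table elsewhere. */
struct sketch_extensions {
   bool color_buffer_float;
   bool color_buffer_half_float;
};

/* Returns a GL error code for a read of (format, type) from a color buffer
 * whose component data type is data_type. Only the two float cases from the
 * patch are modelled; everything else is rejected in this sketch. */
static unsigned
sketch_read_pixels_check(const struct sketch_extensions *ext,
                         unsigned format, unsigned type, unsigned data_type)
{
   assert(ext != NULL);

   if (format == SK_GL_RGBA && data_type == SK_GL_FLOAT) {
      if (type == SK_GL_FLOAT && ext->color_buffer_float)
         return SK_GL_NO_ERROR;
      if (type == SK_GL_HALF_FLOAT && ext->color_buffer_half_float)
         return SK_GL_NO_ERROR;
   }
   return SK_GL_INVALID_OPERATION;
}
```

If the extension ends up as 'dummy_true', as suggested above, the flag would always be set and the extra test would be a no-op; the sketch only shows where such a gate would sit.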
Re: [Mesa-dev] [PATCH] gallium: Do not add -Wframe-address option for gcc <= 4.4.
Hi Vinson,

Thanks for the patch. I was considering moving the gcc stuff out into its own
function e.g get_gcc_frame_pointer() which could then be wrapped with
#pragma GCC diagnostic which gcc 4.4 should be able to handle. However I'm not
too worried about GCC 4.4 and lower so this patch is also fine by me.

Either way you decided to go you can have a:

Reviewed-by: Timothy Arceri

On 22/03/18 09:10, Vinson Lee wrote:
> This patch fixes these build errors with GCC 4.4.
>
>   Compiling src/gallium/auxiliary/util/u_debug_stack.c ...
> src/gallium/auxiliary/util/u_debug_stack.c: In function ‘debug_backtrace_capture’:
> src/gallium/auxiliary/util/u_debug_stack.c:268: error: #pragma GCC diagnostic not allowed inside functions
> src/gallium/auxiliary/util/u_debug_stack.c:269: error: #pragma GCC diagnostic not allowed inside functions
> src/gallium/auxiliary/util/u_debug_stack.c:271: error: #pragma GCC diagnostic not allowed inside functions
>
> Fixes: 370e356ebab4 ("gallium: silence __builtin_frame_address nonzero argument is unsafe warning")
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105529
> Signed-off-by: Vinson Lee
> ---
>  src/gallium/auxiliary/util/u_debug_stack.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/src/gallium/auxiliary/util/u_debug_stack.c b/src/gallium/auxiliary/util/u_debug_stack.c
> index 974e639..846f648 100644
> --- a/src/gallium/auxiliary/util/u_debug_stack.c
> +++ b/src/gallium/auxiliary/util/u_debug_stack.c
> @@ -264,7 +264,7 @@ debug_backtrace_capture(struct debug_stack_frame *backtrace,
>     }
>  #endif
>
> -#if defined(PIPE_CC_GCC)
> +#if defined(PIPE_CC_GCC) && (PIPE_CC_GCC_VERSION > 404) || defined(__clang__)
>  #pragma GCC diagnostic push
>  #pragma GCC diagnostic ignored "-Wframe-address"
>     frame_pointer = ((const void **)__builtin_frame_address(1));
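The alternative mentioned in the review, moving the builtin into its own function so the pragmas sit at file scope, could be sketched roughly as below. This is an illustration only: SKETCH_GCC_VERSION, SKETCH_HAVE_DIAG_PUSH and sketch_caller_frame_pointer() are hypothetical names, not Mesa's, and the sketch assumes that "#pragma GCC diagnostic push" is only usable from GCC 4.6 onwards, so on older compilers it simply leaves the warning alone.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PIPE_CC_GCC_VERSION: major * 100 + minor. */
#if defined(__GNUC__) && !defined(__clang__)
#define SKETCH_GCC_VERSION (__GNUC__ * 100 + __GNUC_MINOR__)
#else
#define SKETCH_GCC_VERSION 0
#endif

/* Assumption: push/pop is available on clang and on GCC >= 4.6. */
#if defined(__clang__) || SKETCH_GCC_VERSION >= 406
#define SKETCH_HAVE_DIAG_PUSH 1
#else
#define SKETCH_HAVE_DIAG_PUSH 0
#endif

/* The pragmas are at file scope, around the helper, never inside a body. */
#if SKETCH_HAVE_DIAG_PUSH
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wframe-address"
#endif

/* Returns the frame address one level up, or NULL without the builtin. */
static const void *
sketch_caller_frame_pointer(void)
{
#if defined(__GNUC__)
   return __builtin_frame_address(1);
#else
   return NULL;
#endif
}

#if SKETCH_HAVE_DIAG_PUSH
#pragma GCC diagnostic pop
#endif
```

Compared with the version check in the patch, this keeps one code path for all compilers at the cost of an extra function.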
Re: [Mesa-dev] [PATCH 2/3] st/glsl_to_nir: fix driver location for packed doubles
The subject line should have read:

"st/glsl_to_nir: fix driver location for dual-slot packed doubles"

This should also partially fix packed arrays although more is needed to make
sure those work since an array can be packed across multiple other arrays so
we need to make sure everything is assigned a driver location in the correct
order.

On 21/03/18 14:50, Timothy Arceri wrote:
> ---
>  src/mesa/state_tracker/st_glsl_to_nir.cpp | 22 ++++++++++++++++------
>  1 file changed, 16 insertions(+), 6 deletions(-)
>
> diff --git a/src/mesa/state_tracker/st_glsl_to_nir.cpp b/src/mesa/state_tracker/st_glsl_to_nir.cpp
> index afb6120d9d..b01be622f7 100644
> --- a/src/mesa/state_tracker/st_glsl_to_nir.cpp
> +++ b/src/mesa/state_tracker/st_glsl_to_nir.cpp
> @@ -141,16 +141,23 @@ st_nir_assign_var_locations(struct exec_list *var_list, unsigned *size,
>           type = glsl_get_array_element(type);
>        }
>
> +      unsigned var_size = type_size(type);
> +
>        /* Builtins don't allow component packing so we only need to worry about
>         * user defined varyings sharing the same location.
>         */
>        bool processed = false;
>        if (var->data.location >= base) {
>           unsigned glsl_location = var->data.location - base;
> -         if (processed_locs[var->data.index] & ((uint64_t)1 << glsl_location))
> -            processed = true;
> -         else
> -            processed_locs[var->data.index] |= ((uint64_t)1 << glsl_location);
> +
> +         for (unsigned i = 0; i < var_size; i++) {
> +            if (processed_locs[var->data.index] &
> +                ((uint64_t)1 << (glsl_location + i)))
> +               processed = true;
> +            else
> +               processed_locs[var->data.index] |=
> +                  ((uint64_t)1 << (glsl_location + i));
> +         }
>        }
>
>        /* Because component packing allows varyings to share the same location
> @@ -162,9 +169,12 @@ st_nir_assign_var_locations(struct exec_list *var_list, unsigned *size,
>           continue;
>        }
>
> -      assigned_locations[var->data.location] = location;
> +      for (unsigned i = 0; i < var_size; i++) {
> +         assigned_locations[var->data.location + i] = location + i;
> +      }
> +
>        var->data.driver_location = location;
> -      location += type_size(type);
> +      location += var_size;
>     }
>
>     *size += location;
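The per-slot bookkeeping in the hunk above can be sketched in isolation as follows. The helper name sketch_mark_slots() is hypothetical; it only mirrors the loop over var_size and the processed flag from the patch, on a single 64-bit mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Marks `size` consecutive slots starting at `loc` in a 64-bit mask.
 * Returns true if any of those slots was already marked, which is what the
 * `processed` flag records in the patch. Slots that were not yet marked are
 * set, exactly as in the if/else of the patched loop. */
static bool
sketch_mark_slots(uint64_t *mask, unsigned loc, unsigned size)
{
   bool already_seen = false;

   assert(mask != NULL);
   assert(loc + size <= 64); /* a wider shift would be undefined behaviour */

   for (unsigned i = 0; i < size; i++) {
      const uint64_t bit = (uint64_t)1 << (loc + i);

      if (*mask & bit)
         already_seen = true;
      else
         *mask |= bit;
   }
   return already_seen;
}
```

A dual-slot double at location 4 marks slots 4 and 5, so a later variable packed into slot 5 is recognised as sharing an already assigned location.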
Re: [Mesa-dev] [PATCH] mesa: readpixels add support for GL_HALF_FLOAT
Hi,

Thanks for the comments. I just noticed it does not check the extension
support for EXT_color_buffer_float neither?

-----Original Message-----
From: Palli, Tapani
Sent: Wednesday, March 21, 2018 6:57 PM
To: Alejandro Piñeiro; Lin, Johnson; mesa-dev@lists.freedesktop.org
Subject: Re: [Mesa-dev] [PATCH] mesa: readpixels add support for GL_HALF_FLOAT

On 21.03.2018 12:45, Tapani Pälli wrote:
>
>
> On 21.03.2018 08:52, Alejandro Piñeiro wrote:
>> On 21/03/18 06:57, Lin Johnson wrote:
>>> Ext_color_buffer_half_float is using type GL_HALF_FLOAT and
>>> data_type GL_FLOAT. This fix Android CTS test
>>> android.view.cts.PixelCopyTest #TestWindowProducerCopyToRGBA16F
>>>
>>> Signed-off-by: Lin Johnson
>>> ---
>>>  src/mesa/main/readpix.c | 2 ++
>>>  1 file changed, 2 insertions(+)
>>>
>>> diff --git a/src/mesa/main/readpix.c b/src/mesa/main/readpix.c
>>> index 6ce340ddf9bb..51331dd095ab 100644
>>> --- a/src/mesa/main/readpix.c
>>> +++ b/src/mesa/main/readpix.c
>>> @@ -920,6 +920,8 @@ read_pixels_es3_error_check(GLenum format, GLenum type,
>>>     case GL_RGBA:
>>>        if (type == GL_FLOAT && data_type == GL_FLOAT)
>>>           return GL_NO_ERROR; /* EXT_color_buffer_float */
>>> +      if (type == GL_HALF_FLOAT && data_type == GL_FLOAT)
>>> +         return GL_NO_ERROR; /* EXT_color_buffer_half_float */
>>
>> If this combination is allowed thanks to that extension, what would
>> happen if that extension is not supported? shouldn't include a
>> extension check? Or that is checked in a different place?
>
> I was thinking the same. Having seen the test it does not seem to make
> any kind of checks what is supported (like asking for extension, or
> maybe asking for GL_IMPLEMENTATION_COLOR_READ_TYPE) but attempts
> glReadPixels using GL_HALF_FLOAT type, I think it should verify first
> that such reads are supported. Currently we don't seem to support this
> extension.

... but probably support the functionality (OpenGL ES 3.2), so maybe some
checks needed for ES version (?)

>
>
>>>        if (type == GL_UNSIGNED_BYTE && data_type == GL_UNSIGNED_NORMALIZED)
>>>           return GL_NO_ERROR;
>>>        if (internalFormat == GL_RGB10_A2 &&
>>
>>
Re: [Mesa-dev] [PATCH] clover/llvm: Fix build against LLVM/Clang 4.0
Aaron Watry writes:

> The opencl 1.0 langstandard was renamed in 5.0+
>
> v2: Move preprocessor check into compat.hpp
>
> Cc: Mark Janes
> Cc: Francisco Jerez

Reviewed-by: Francisco Jerez

> ---
>  src/gallium/state_trackers/clover/llvm/compat.hpp     | 2 ++
>  src/gallium/state_trackers/clover/llvm/invocation.cpp | 2 +-
>  2 files changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/state_trackers/clover/llvm/compat.hpp b/src/gallium/state_trackers/clover/llvm/compat.hpp
> index 19528a0133..2e070b2eef 100644
> --- a/src/gallium/state_trackers/clover/llvm/compat.hpp
> +++ b/src/gallium/state_trackers/clover/llvm/compat.hpp
> @@ -89,8 +89,10 @@ namespace clover {
>
>  #if HAVE_LLVM >= 0x0500
>           const clang::InputKind ik_opencl = clang::InputKind::OpenCL;
> +         const clang::LangStandard::Kind lang_opencl10 = clang::LangStandard::lang_opencl10;
>  #else
>           const clang::InputKind ik_opencl = clang::IK_OpenCL;
> +         const clang::LangStandard::Kind lang_opencl10 = clang::LangStandard::lang_opencl;
>  #endif
>
>           inline void
> diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> index af78c2ae28..b2c64bc48f 100644
> --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
> +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> @@ -85,7 +85,7 @@ namespace {
>     };
>
>     const clc_version_lang_std cl_version_lang_stds[] = {
> -      { 100, clang::LangStandard::lang_opencl10},
> +      { 100, compat::lang_opencl10},
>        { 110, clang::LangStandard::lang_opencl11},
>        { 120, clang::LangStandard::lang_opencl12},
>        { 200, clang::LangStandard::lang_opencl20},
> --
> 2.14.1
[Mesa-dev] [PATCH] clover/llvm: Fix build against LLVM/Clang 4.0
The opencl 1.0 langstandard was renamed in 5.0+

v2: Move preprocessor check into compat.hpp

Cc: Mark Janes
Cc: Francisco Jerez
---
 src/gallium/state_trackers/clover/llvm/compat.hpp     | 2 ++
 src/gallium/state_trackers/clover/llvm/invocation.cpp | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/clover/llvm/compat.hpp b/src/gallium/state_trackers/clover/llvm/compat.hpp
index 19528a0133..2e070b2eef 100644
--- a/src/gallium/state_trackers/clover/llvm/compat.hpp
+++ b/src/gallium/state_trackers/clover/llvm/compat.hpp
@@ -89,8 +89,10 @@ namespace clover {

 #if HAVE_LLVM >= 0x0500
          const clang::InputKind ik_opencl = clang::InputKind::OpenCL;
+         const clang::LangStandard::Kind lang_opencl10 = clang::LangStandard::lang_opencl10;
 #else
          const clang::InputKind ik_opencl = clang::IK_OpenCL;
+         const clang::LangStandard::Kind lang_opencl10 = clang::LangStandard::lang_opencl;
 #endif

          inline void
diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp b/src/gallium/state_trackers/clover/llvm/invocation.cpp
index af78c2ae28..b2c64bc48f 100644
--- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
+++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
@@ -85,7 +85,7 @@ namespace {
    };

    const clc_version_lang_std cl_version_lang_stds[] = {
-      { 100, clang::LangStandard::lang_opencl10},
+      { 100, compat::lang_opencl10},
       { 110, clang::LangStandard::lang_opencl11},
       { 120, clang::LangStandard::lang_opencl12},
       { 200, clang::LangStandard::lang_opencl20},
--
2.14.1
Re: [Mesa-dev] [ANNOUNCE] mesa 18.0.0-rc5
Hi Emil,

From radv perspective we seem to have one bug in CTS to debug, but all games
I tested that are expected to work seemed to work ok. Per discussion with
Dave I'd like to give the go ahead for releasing this wrt radv.

Thanks,
Bas

On Wed, Mar 21, 2018 at 3:50 PM, Emil Velikov wrote:
> The fifth and final release candidate for Mesa 18.0.0 is now available.
>
> Modulo serious regressions, it is anticipated that it will become
> Mesa 18.0.0 this Friday around 16:00GMT
>
>
> Alex Smith (1):
>       radv: Fix CmdCopyImage between uncompressed and compressed images
>
> Andres Gomez (2):
>       travis: make Meson find the proper llvm-config
>       travis: keep meson version below 0.45.0
>
> Andriy Khulap (1):
>       i965: Fix RELOC_WRITE typo in brw_store_data_imm64()
>
> Anuj Phogat (2):
>       isl: Don't use surface format R32_FLOAT for typed atomic integer operations
>       intel/compiler: Memory fence commit must always be enabled for gen10+
>
> Bas Nieuwenhuizen (6):
>       radv: Always lower indirect derefs after nir_lower_global_vars_to_local.
>       vulkan/wsi: Fix OOM behavior with prime images.
>       radv: Increase the number of dynamic uniform buffers.
>       radv: Implement WaitForFences with !waitAll.
>       radv: Implement waiting on non-submitted fences.
>       radv: Fix copying from 3D images starting at non-zero depth.
>
> Brian Paul (1):
>       mesa: add missing switch case for EXTRA_VERSION_40 in check_extra()
>
> Chuck Atkins (1):
>       glx: Properly handle cases where screen creation fails
>
> Daniel Stone (4):
>       i965: Fix bugs in intel_from_planar
>       egl/wayland: Fix ARGB/XRGB transposition in config map
>       egl/wayland: Always use in-tree wayland-egl-backend.h
>       i965: Fix aux-surface size check
>
> Dave Airlie (15):
>       r600/eg: use texture target to pick array size not view target (v2)
>       r600/sb/cayman: fix indirect ubo access on cayman
>       r600/compute: only mark buffer/image state dirty for fragment shaders
>       r600: fix xfb stream check.
>       ac/nir: to integer the args to bcsel.
>       radv: don't support tc-compat on multisample d32s8 at all.
>       virgl: remap query types to hw support.
>       r600: fix tgsi clock last setting
>       r600: partly revert disabling tiling for 1d texture.
>       r600: implement callstack workaround for evergreen.
>       r600/cayman: fix fragcood loading recip generation.
>       ac/nir: don't apply slice rounding on txf_ms
>       radv: get correct offset into LDS for indexed vars.
>       ac/nir: pass the nir variable through tcs loading.
>       radv: mark all tess output for an indirect access.
>
> Dylan Baker (27):
>       meson: fix test source name for static glapi
>       glapi/check_table: Remove 'extern "C"' block
>       glapi: remove APPLE extensions from test
>       glapi: fix check_table test for non-shared glapi with meson
>       meson: use a custom target instead of a generator for i965 oa
>       Revert "anv/meson: Make anv_entrypoints_gen.py depend on anv_extensions.py"
>       meson: use depend_files to track extra file dependencies
>       meson: use depend_files for adding extra file dependencies
>       meson: define empty variables for libswdri and libswkmsdri
>       meson: add libswdri and libswkmsdri to d3dadaptor link_with
>       meson: add libswdri and libswkmsdri to dri link_with
>       meson: use va-api version reported by pkg-config
>       meson: link dri3 xcb libs into vlwinsys instead of into each target
>       meson: actually link with libomxil-bellagio
>       meson: Actually link xvmc target with libxvmc
>       meson: fix vdpau target linkage
>       meson: fix va target linkage
>       meson: Fix omx-bellagio target linkage
>       meson: Fix xa target linkage
>       meson: fix xvmc target linkage
>       meson: freedreno depends on nir
>       meson: Fix GL and EGL pkg-config files with glvnd
>       meson: fix building without GL
>       meson: radeonsi cannot be built with drm 2.4.90
>       meson: install vulkan_intel.h header
>       autotools: include all meson.build files
>       meson: Add moduledir to d3d.pc
>
> Emil Velikov (2):
>       cherry-ignore: reference correct SHA for the VK_KHX_multiview commit
>       Update version to 18.0.0-rc5
>
> Eric Anholt (5):
>       ac/nir: Fix compiler warning about uninitialized dw_addr.
>       glsl/tests: Fix strict aliasing warning about int64/double.
>       glsl: Silence warnings in the uniform initializer test about 16-bit types
>       glsl/tests: Fix a compiler warning about signed/unsigned loop comparison.
>       i965: Silence compiler warning about promoted_constants.
>
> Eric Engestrom (5):
>       meson: dedup gallium-vdpau logic
>       meson: dedup gallium-xvmc logic
>       meson: dedup gallium-omx logic
>       meson: dedup gallium-va logic
>       meson: dedup gallium-xa logic
>
> Francisco Jerez (1):
>       i965: Fix
Re: [Mesa-dev] [PATCH] radv: Unset ZRANGE_PRECISION when depth was zeroed
On Thu, Mar 8, 2018 at 12:59 PM, James Legg wrote:
> This avoids bug 105396 somehow. I suspect it is a VI and GFX9 hardware
> bug which PAL calls WaTcCompatZRange, but I don't know for sure.
>
> In the VK_FORMAT_D32_SFLOAT case, TILE_STENCIL_DISABLE is not set for
> tc compatible image formats regardless of not having a stencil aspect.
> If TILE_STENCIL_DISABLE was set, ZRANGE_PRECISION would have no effect
> and the bug would occur again.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105396
> CC:
> CC: Dave Airlie
> CC: Bas Nieuwenhuizen
> CC: Samuel Pitoiset
> ---
>  src/amd/vulkan/radv_cmd_buffer.c | 52 +---
>  1 file changed, 49 insertions(+), 3 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
> index 3e0ed0e9a9..89e31a0347 100644
> --- a/src/amd/vulkan/radv_cmd_buffer.c
> +++ b/src/amd/vulkan/radv_cmd_buffer.c
> @@ -915,6 +915,37 @@ radv_emit_fb_ds_state(struct radv_cmd_buffer *cmd_buffer,
>
>         }
>
> +       if (image->surface.htile_size)
> +       {
> +               /* If the last depth clear value was 0.0f, set ZRANGE_PRECISION
> +                * to 0 in dp_z_info for more accuracy with reverse depth; and
> +                * to avoid https://bugs.freedesktop.org/show_bug.cgi?id=105396.
> +                * Otherwise, we leave it set to 1.
> +                */
> +               radeon_emit(cmd_buffer->cs, PKT3(PKT3_COND_WRITE, 7, 0));
> +
> +               const uint32_t write_space = 0 << 8; /* register */
> +               const uint32_t poll_space = 1 << 4; /* memory */
> +               const uint32_t function = 3 << 0; /* equal to the reference */
> +               const uint32_t options = write_space | poll_space | function;
> +               radeon_emit(cmd_buffer->cs, options);
> +
> +               /* poll address - location of the depth clear value */
> +               uint64_t va = radv_buffer_get_va(image->bo);
> +               va += image->offset + image->clear_value_offset;
> +               radeon_emit(cmd_buffer->cs, va);
> +               radeon_emit(cmd_buffer->cs, va >> 32);
> +
> +               radeon_emit(cmd_buffer->cs, fui(0.0f)); /* reference value */
> +               radeon_emit(cmd_buffer->cs, (uint32_t)-1); /* comparison mask */
> +               radeon_emit(cmd_buffer->cs, R_028040_DB_Z_INFO >> 2); /* write address low */
> +               radeon_emit(cmd_buffer->cs, 0u); /* write address high */
> +
> +               /* The value to write data when the condition passes */
> +               uint32_t db_z_info_clear_zero = db_z_info & C_028040_ZRANGE_PRECISION;
> +               radeon_emit(cmd_buffer->cs, db_z_info_clear_zero);
> +       }
> +
>         radeon_set_context_reg(cmd_buffer->cs, R_028B78_PA_SU_POLY_OFFSET_DB_FMT_CNTL,
>                                ds->pa_su_poly_offset_db_fmt_cntl);
>  }
> @@ -3479,7 +3510,8 @@ void radv_CmdEndRenderPass(
>
>  /*
>   * For HTILE we have the following interesting clear words:
> - *   0xfffff30f: Uncompressed, full depth range, for depth+stencil HTILE
> + *   0xfffff30f: Uncompressed, full depth range, for depth+stencil HTILE when ZRANGE_PRECISION is 1
> + *   0x0003f30f: Uncompressed, full depth range, for depth+stencil HTILE when ZRANGE_PRECISION is 0
>   *   0xfffc000f: Uncompressed, full depth range, for depth only HTILE.
>   *   0xfffffff0: Clear depth to 1.0
>   *   0x00000000: Clear depth to 0.0
> @@ -3528,8 +3560,22 @@ static void radv_handle_depth_image_transition(struct radv_cmd_buffer *cmd_buffe
>                 radv_initialize_htile(cmd_buffer, image, range, 0);
>         } else if (!radv_layout_is_htile_compressed(image, src_layout, src_queue_mask) &&
>                    radv_layout_is_htile_compressed(image, dst_layout, dst_queue_mask)) {
> -               uint32_t clear_value = vk_format_is_stencil(image->vk_format) ? 0xfffff30f : 0xfffc000f;
> -               radv_initialize_htile(cmd_buffer, image, range, clear_value);
> +               if (vk_format_is_stencil(image->vk_format)) {
> +                       /* The appropriate clear value depends on DB_Z_INFO's
> +                        * ZRANGE_PRECISION, which can vary depending on the
> +                        * last used clear value, which could be from another
> +                        * command buffer. Instead of picking the appropriate
> +                        * clear value on the GPU, resummarize accurately.
> +                        */
> +                       VkImageSubresourceRange local_range = *range;
> +                       local_range.aspectMask = VK_IMAGE_ASPECT_DEPTH_BIT;
> +                       local_range.baseMipLevel = 0;
> +                       local_range.levelCount = 1;
> +
> +
[Mesa-dev] [PATCH 6/6] nir: Don't condition 'a-b < 0' -> 'a < b' on is_not_used_by_conditional
From: Ian Romanick

Now that i965 recognizes that a-b generates the same conditions as 'a < b',
there is no reason to condition this transformation on 'is not used by
conditional.'

Since this was the only user of the is_not_used_by_conditional function,
delete it.

All Gen6+ platforms had similar results. (Skylake shown)
total instructions in shared programs: 14400775 -> 14400595 (<.01%)
instructions in affected programs: 36712 -> 36532 (-0.49%)
helped: 182
HURT: 26
helped stats (abs) min: 1 max: 2 x̄: 1.13 x̃: 1
helped stats (rel) min: 0.15% max: 1.82% x̄: 0.70% x̃: 0.62%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.24% max: 1.02% x̄: 0.82% x̃: 0.90%
95% mean confidence interval for instructions value: -0.97 -0.76
95% mean confidence interval for instructions %-change: -0.59% -0.43%
Instructions are helped.

total cycles in shared programs: 532929592 -> 532926345 (<.01%)
cycles in affected programs: 478660 -> 475413 (-0.68%)
helped: 187
HURT: 22
helped stats (abs) min: 2 max: 200 x̄: 20.99 x̃: 18
helped stats (rel) min: 0.23% max: 24.10% x̄: 1.48% x̃: 1.03%
HURT stats (abs)   min: 1 max: 214 x̄: 30.86 x̃: 11
HURT stats (rel)   min: 0.01% max: 23.06% x̄: 3.12% x̃: 0.86%
95% mean confidence interval for cycles value: -19.50 -11.57
95% mean confidence interval for cycles %-change: -1.42% -0.58%
Cycles are helped.

GM45 and Iron Lake had similar results. (Iron Lake shown)
total cycles in shared programs: 177851578 -> 177851810 (<.01%)
cycles in affected programs: 24408 -> 24640 (0.95%)
helped: 2
HURT: 4
helped stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4
helped stats (rel) min: 0.42% max: 0.47% x̄: 0.44% x̃: 0.44%
HURT stats (abs)   min: 24 max: 108 x̄: 60.00 x̃: 54
HURT stats (rel)   min: 0.52% max: 1.62% x̄: 1.04% x̃: 1.02%
95% mean confidence interval for cycles value: -7.75 85.08
95% mean confidence interval for cycles %-change: -0.39% 1.49%
Inconclusive result (value mean confidence interval includes 0).

Signed-off-by: Ian Romanick
---
 src/compiler/nir/nir_opt_algebraic.py |  4 +---
 src/compiler/nir/nir_search_helpers.h | 15 ---------------
 2 files changed, 1 insertion(+), 18 deletions(-)

diff --git a/src/compiler/nir/nir_opt_algebraic.py b/src/compiler/nir/nir_opt_algebraic.py
index b9565ce..96232f0 100644
--- a/src/compiler/nir/nir_opt_algebraic.py
+++ b/src/compiler/nir/nir_opt_algebraic.py
@@ -208,9 +208,7 @@ optimizations = [
    # fmax. If b is > 1.0, the bcsel will be replaced with a b2f.
    (('fmin', ('b2f', a), '#b'), ('bcsel', a, ('fmin', b, 1.0), ('fmin', b, 0.0))),

-   # ignore this opt when the result is used by a bcsel or if so we can make
-   # use of conditional modifiers on supported hardware.
-   (('flt(is_not_used_by_conditional)', ('fadd(is_used_once)', a, ('fneg', b)), 0.0), ('flt', a, b)),
+   (('flt', ('fadd(is_used_once)', a, ('fneg', b)), 0.0), ('flt', a, b)),

    (('fge', ('fneg', ('fabs', a)), 0.0), ('feq', a, 0.0)),
    (('~bcsel', ('flt', b, a), b, a), ('fmin', a, b)),
diff --git a/src/compiler/nir/nir_search_helpers.h b/src/compiler/nir/nir_search_helpers.h
index 2e3bd13..2d399bd 100644
--- a/src/compiler/nir/nir_search_helpers.h
+++ b/src/compiler/nir/nir_search_helpers.h
@@ -170,19 +170,4 @@ is_not_used_by_if(nir_alu_instr *instr)
    return list_empty(&instr->dest.dest.ssa.if_uses);
 }

-static inline bool
-is_not_used_by_conditional(nir_alu_instr *instr)
-{
-   if (!is_not_used_by_if(instr))
-      return false;
-
-   nir_foreach_use(use, &instr->dest.dest.ssa) {
-      if (use->parent_instr->type == nir_instr_type_alu &&
-          nir_instr_as_alu(use->parent_instr)->op == nir_op_bcsel)
-         return false;
-   }
-
-   return true;
-}
-
 #endif /* _NIR_SEARCH_ */
--
2.9.5
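As a rough illustration of the algebraic rule itself, the following C sketch compares 'a + (-b) < 0' against 'a < b' on a handful of float values. It is a spot check, not a proof, and it assumes default IEEE 754 float behaviour (no fast-math, no flush-to-zero). All function names are hypothetical.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>
#include <stddef.h>

/* Left-hand side of the rule: flt(fadd(a, fneg(b)), 0.0). */
static bool
sketch_sub_lt_zero(float a, float b)
{
   return (a + (-b)) < 0.0f;
}

/* Right-hand side of the rule: flt(a, b). */
static bool
sketch_lt(float a, float b)
{
   return a < b;
}

/* Returns the number of sample pairs on which the two forms disagree. */
static size_t
sketch_count_mismatches(void)
{
   const float samples[] = {
      0.0f, -0.0f, 1.0f, -1.0f, 1.0e-30f, 3.0e38f, -3.0e38f,
      INFINITY, -INFINITY, NAN,
   };
   const size_t n = sizeof(samples) / sizeof(samples[0]);
   size_t mismatches = 0;

   for (size_t i = 0; i < n; i++) {
      for (size_t j = 0; j < n; j++) {
         if (sketch_sub_lt_zero(samples[i], samples[j]) !=
             sketch_lt(samples[i], samples[j]))
            mismatches++;
      }
   }
   return mismatches;
}
```

The sample list deliberately includes signed zeros, infinities, NaN and values whose difference overflows, since those are the cases where one might expect the two forms to diverge.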
[Mesa-dev] [PATCH 1/6] i965: Add negative_equals methods
From: Ian Romanick

This method is similar to the existing ::equals methods. Instead of
testing that two src_regs are equal to each other, it tests that one is
the negation of the other.

v2: Simplify various checks based on suggestions from Matt. Use
src_reg::type instead of fixed_hw_reg.type in a check. Also suggested
by Matt.

v3: Rebase on 3 years. Fix some problems with negative_equals with VF
constants. Add fs_reg::negative_equals.

Signed-off-by: Ian Romanick
---
 src/intel/compiler/brw_fs.cpp     |  7 ++
 src/intel/compiler/brw_ir_fs.h    |  1 +
 src/intel/compiler/brw_ir_vec4.h  |  1 +
 src/intel/compiler/brw_reg.h      | 46 +++
 src/intel/compiler/brw_shader.cpp |  6 +
 src/intel/compiler/brw_shader.h   |  1 +
 src/intel/compiler/brw_vec4.cpp   |  7 ++
 7 files changed, 69 insertions(+)

diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
index 6eea532..3d454c3 100644
--- a/src/intel/compiler/brw_fs.cpp
+++ b/src/intel/compiler/brw_fs.cpp
@@ -454,6 +454,13 @@ fs_reg::equals(const fs_reg &r) const
 }

 bool
+fs_reg::negative_equals(const fs_reg &r) const
+{
+   return (this->backend_reg::negative_equals(r) &&
+           stride == r.stride);
+}
+
+bool
 fs_reg::is_contiguous() const
 {
    return stride == 1;
diff --git a/src/intel/compiler/brw_ir_fs.h b/src/intel/compiler/brw_ir_fs.h
index 54797ff..f06a33c 100644
--- a/src/intel/compiler/brw_ir_fs.h
+++ b/src/intel/compiler/brw_ir_fs.h
@@ -41,6 +41,7 @@ public:
    fs_reg(enum brw_reg_file file, int nr, enum brw_reg_type type);

    bool equals(const fs_reg &r) const;
+   bool negative_equals(const fs_reg &r) const;
    bool is_contiguous() const;

    /**
diff --git a/src/intel/compiler/brw_ir_vec4.h b/src/intel/compiler/brw_ir_vec4.h
index cbaff2f..95c5119 100644
--- a/src/intel/compiler/brw_ir_vec4.h
+++ b/src/intel/compiler/brw_ir_vec4.h
@@ -43,6 +43,7 @@ public:
    src_reg(struct ::brw_reg reg);

    bool equals(const src_reg &r) const;
+   bool negative_equals(const src_reg &r) const;

    src_reg(class vec4_visitor *v, const struct glsl_type *type);
    src_reg(class vec4_visitor *v, const struct glsl_type *type, int size);
diff --git a/src/intel/compiler/brw_reg.h b/src/intel/compiler/brw_reg.h
index 7ad144b..732bddf 100644
--- a/src/intel/compiler/brw_reg.h
+++ b/src/intel/compiler/brw_reg.h
@@ -255,6 +255,52 @@ brw_regs_equal(const struct brw_reg *a, const struct brw_reg *b)
    return a->bits == b->bits && (df ? a->u64 == b->u64 : a->ud == b->ud);
 }

+static inline bool
+brw_regs_negative_equal(const struct brw_reg *a, const struct brw_reg *b)
+{
+   if (a->file == IMM) {
+      if (a->bits != b->bits)
+         return false;
+
+      switch (a->type) {
+      case BRW_REGISTER_TYPE_UQ:
+      case BRW_REGISTER_TYPE_Q:
+         return a->d64 == -b->d64;
+      case BRW_REGISTER_TYPE_DF:
+         return a->df == -b->df;
+      case BRW_REGISTER_TYPE_UD:
+      case BRW_REGISTER_TYPE_D:
+         return a->d == -b->d;
+      case BRW_REGISTER_TYPE_F:
+         return a->f == -b->f;
+      case BRW_REGISTER_TYPE_VF:
+         /* It is tempting to treat 0 as a negation of 0 (and -0 as a negation
+          * of -0). There are occasions where 0 or -0 is used and the exact
+          * bit pattern is desired. At the very least, changing this to allow
+          * 0 as a negation of 0 causes some fp64 tests to fail on IVB.
+          */
+         return a->ud == (b->ud ^ 0x80808080);
+      case BRW_REGISTER_TYPE_UW:
+      case BRW_REGISTER_TYPE_W:
+      case BRW_REGISTER_TYPE_UV:
+      case BRW_REGISTER_TYPE_V:
+      case BRW_REGISTER_TYPE_HF:
+      case BRW_REGISTER_TYPE_UB:
+      case BRW_REGISTER_TYPE_B:
+         /* FINISHME: Implement support for these types. */
+         return false;
+      default:
+         unreachable("not reached");
+      }
+   } else {
+      struct brw_reg tmp = *a;
+
+      tmp.negate = !tmp.negate;
+
+      return brw_regs_equal(&tmp, b);
+   }
+}
+
 struct brw_indirect {
    unsigned addr_subnr:4;
    int addr_offset:10;
diff --git a/src/intel/compiler/brw_shader.cpp b/src/intel/compiler/brw_shader.cpp
index 054962b..9cdf9fc 100644
--- a/src/intel/compiler/brw_shader.cpp
+++ b/src/intel/compiler/brw_shader.cpp
@@ -685,6 +685,12 @@ backend_reg::equals(const backend_reg &r) const
 }

 bool
+backend_reg::negative_equals(const backend_reg &r) const
+{
+   return brw_regs_negative_equal(this, &r) && offset == r.offset;
+}
+
+bool
 backend_reg::is_zero() const
 {
    if (file != IMM)
diff --git a/src/intel/compiler/brw_shader.h b/src/intel/compiler/brw_shader.h
index fd02feb..7d97ddb 100644
--- a/src/intel/compiler/brw_shader.h
+++ b/src/intel/compiler/brw_shader.h
@@ -59,6 +59,7 @@ struct backend_reg : private brw_reg
    }

    bool equals(const backend_reg &r) const;
+   bool negative_equals(const backend_reg &r) const;
    bool is_zero() const;
    bool is_one() const;

diff --git
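The VF case in brw_regs_negative_equal() can be looked at on its own. The sketch below assumes that a VF immediate packs four 8-bit floats, one per byte, each with its sign in bit 7, which is why XOR with 0x80808080 negates all four lanes at once. The names are hypothetical and nothing here is taken from the Mesa headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: one sign bit (bit 7) per packed 8-bit float lane. */
#define SKETCH_VF_SIGN_BITS 0x80808080u

/* True when every lane of `a` is the sign-flipped lane of `b`. The compare
 * is purely bitwise, so +0 and +0 are NOT treated as negations of each
 * other, matching the comment in the patch. */
static bool
sketch_vf_negative_equal(uint32_t a, uint32_t b)
{
   return a == (b ^ SKETCH_VF_SIGN_BITS);
}
```

Because the test works on bit patterns, a vector of +0 lanes only matches a vector of -0 lanes, and a mismatch in a single lane is enough to fail the comparison.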
[Mesa-dev] [PATCH 4/6] i965/vec4: Allow cmod propagation when src0 is a uniform or shader input
From: Ian Romanick

No shader-db changes. This source must have been written by a previous
instruction, so it cannot be a uniform or a shader input. However, this
change allows the next commit to help more shaders.

Signed-off-by: Ian Romanick
---
 src/intel/compiler/brw_vec4_cmod_propagation.cpp | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/intel/compiler/brw_vec4_cmod_propagation.cpp b/src/intel/compiler/brw_vec4_cmod_propagation.cpp
index 0d72d82..7f1001b 100644
--- a/src/intel/compiler/brw_vec4_cmod_propagation.cpp
+++ b/src/intel/compiler/brw_vec4_cmod_propagation.cpp
@@ -49,7 +49,8 @@ opt_cmod_propagation_local(bblock_t *block)
            inst->opcode != BRW_OPCODE_MOV) ||
           inst->predicate != BRW_PREDICATE_NONE ||
           !inst->dst.is_null() ||
-          inst->src[0].file != VGRF ||
+          (inst->src[0].file != VGRF && inst->src[0].file != ATTR &&
+           inst->src[0].file != UNIFORM) ||
           inst->src[0].abs)
          continue;

--
2.9.5
[Mesa-dev] [PATCH 3/6] i965/fs: Propagate conditional modifiers from compares to adds
From: Ian RomanickThe math inside the add and the cmp in this instruction sequence is the same. We can utilize this to eliminate the compare. add(8) g5<1>F g2<8,8,1>F g64.5<0,1,0>F { align1 1Q compacted }; cmp.z.f0(8) null<1>Fg2<8,8,1>F -g64.5<0,1,0>F { align1 1Q switch }; (-f0) sel(8)g8<1>F (abs)g5<8,8,1>F 3e-37F { align1 1Q }; This is reduced to: add.z.f0(8) g5<1>F g2<8,8,1>F g64.5<0,1,0>F { align1 1Q compacted }; (-f0) sel(8)g8<1>F (abs)g5<8,8,1>F 3e-37F { align1 1Q }; This optimization pass could do even better. The nature of converting vectorized code from the GLSL front end to scalar code in NIR results in sequences like: add(8) g7<1>F g4<8,8,1>F g64.5<0,1,0>F { align1 1Q compacted }; add(8) g6<1>F g3<8,8,1>F g64.5<0,1,0>F { align1 1Q compacted }; add(8) g5<1>F g2<8,8,1>F g64.5<0,1,0>F { align1 1Q compacted }; cmp.z.f0(8) null<1>Fg2<8,8,1>F -g64.5<0,1,0>F { align1 1Q switch }; (-f0) sel(8)g8<1>F (abs)g5<8,8,1>F 3e-37F { align1 1Q }; cmp.z.f0(8) null<1>Fg3<8,8,1>F -g64.5<0,1,0>F { align1 1Q switch }; (-f0) sel(8)g10<1>F (abs)g6<8,8,1>F 3e-37F { align1 1Q }; cmp.z.f0(8) null<1>Fg4<8,8,1>F -g64.5<0,1,0>F { align1 1Q switch }; (-f0) sel(8)g12<1>F (abs)g7<8,8,1>F 3e-37F { align1 1Q }; In this sequence, only the first cmp.z is removed. With different scheduling, all 3 could get removed. Skylake total instructions in shared programs: 14407009 -> 14400173 (-0.05%) instructions in affected programs: 1307274 -> 1300438 (-0.52%) helped: 4880 HURT: 0 helped stats (abs) min: 1 max: 33 x̄: 1.40 x̃: 1 helped stats (rel) min: 0.03% max: 8.70% x̄: 0.70% x̃: 0.52% 95% mean confidence interval for instructions value: -1.45 -1.35 95% mean confidence interval for instructions %-change: -0.72% -0.69% Instructions are helped. 
total cycles in shared programs: 532943169 -> 532923528 (<.01%)
cycles in affected programs: 14065798 -> 14046157 (-0.14%)
helped: 2703
HURT: 339
helped stats (abs) min: 1 max: 1062 x̄: 12.27 x̃: 2
helped stats (rel) min: <.01% max: 28.72% x̄: 0.38% x̃: 0.21%
HURT stats (abs) min: 1 max: 739 x̄: 39.86 x̃: 12
HURT stats (rel) min: 0.02% max: 27.69% x̄: 1.38% x̃: 0.41%
95% mean confidence interval for cycles value: -8.66 -4.26
95% mean confidence interval for cycles %-change: -0.24% -0.14%
Cycles are helped.

LOST: 0
GAINED: 1

Broadwell
total instructions in shared programs: 14719636 -> 14712949 (-0.05%)
instructions in affected programs: 1288188 -> 1281501 (-0.52%)
helped: 4845
HURT: 0
helped stats (abs) min: 1 max: 33 x̄: 1.38 x̃: 1
helped stats (rel) min: 0.03% max: 8.00% x̄: 0.70% x̃: 0.52%
95% mean confidence interval for instructions value: -1.43 -1.33
95% mean confidence interval for instructions %-change: -0.72% -0.68%
Instructions are helped.

total cycles in shared programs: 559599253 -> 559581699 (<.01%)
cycles in affected programs: 13315565 -> 13298011 (-0.13%)
helped: 2600
HURT: 269
helped stats (abs) min: 1 max: 2128 x̄: 12.24 x̃: 2
helped stats (rel) min: <.01% max: 23.95% x̄: 0.41% x̃: 0.20%
HURT stats (abs) min: 1 max: 790 x̄: 53.07 x̃: 20
HURT stats (rel) min: 0.02% max: 15.96% x̄: 1.55% x̃: 0.75%
95% mean confidence interval for cycles value: -8.47 -3.77
95% mean confidence interval for cycles %-change: -0.27% -0.18%
Cycles are helped.

LOST: 0
GAINED: 8

Haswell
total instructions in shared programs: 12978609 -> 12973483 (-0.04%)
instructions in affected programs: 932921 -> 927795 (-0.55%)
helped: 3480
HURT: 0
helped stats (abs) min: 1 max: 33 x̄: 1.47 x̃: 1
helped stats (rel) min: 0.03% max: 7.84% x̄: 0.78% x̃: 0.58%
95% mean confidence interval for instructions value: -1.53 -1.42
95% mean confidence interval for instructions %-change: -0.80% -0.75%
Instructions are helped.

total cycles in shared programs: 410270788 -> 410250531 (<.01%)
cycles in affected programs: 10986161 -> 10965904 (-0.18%)
helped: 2087
HURT: 254
helped stats (abs) min: 1 max: 2672 x̄: 14.63 x̃: 4
helped stats (rel) min: <.01% max: 39.61% x̄: 0.42% x̃: 0.21%
HURT stats (abs) min: 1 max: 519 x̄: 40.49 x̃: 16
HURT stats (rel) min: 0.01% max: 12.83% x̄: 1.20% x̃: 0.47%
95% mean confidence interval for cycles value: -12.82 -4.49
95% mean confidence interval for cycles %-change: -0.31% -0.18%
Cycles are helped.

LOST: 0
GAINED: 5

Ivy Bridge
total instructions in shared programs: 11686082 -> 11681548 (-0.04%)
instructions in affected programs: 937696 -> 933162 (-0.48%)
helped: 3150
HURT: 0
helped stats (abs) min: 1 max: 33 x̄: 1.44 x̃: 1
helped stats (rel) min: 0.03% max: 7.84% x̄: 0.69% x̃: 0.49%
95% mean confidence interval for instructions value: -1.49 -1.38
95% mean confidence interval for instructions %-change: -0.71% -0.67%
Instructions are helped.
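The claim at the top of this patch, that the add and the cmp do the same math, can be illustrated with ordinary floats. This is only a sketch: the two function names are made up for this note, it models just the .z condition, and it deliberately ignores how the hardware treats NaN and infinities (for example +inf added to -inf gives NaN, while the compare of +inf against +inf is true), so it should be read as covering finite inputs only.

```cpp
// Flag produced by "cmp.z.f0 null, a, -b": set when a equals -b.
inline bool flag_from_cmp_z(float a, float b)
{
    return a == -b;
}

// Flag produced by "add.z.f0 dst, a, b": set when the sum is zero.
inline bool flag_from_add_z(float a, float b)
{
    return (a + b) == 0.0f;
}
```

For finite a and b the two functions agree, which is why the add can carry the conditional modifier and the separate cmp can be dropped.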
[Mesa-dev] [PATCH 5/6] i965/vec4: Propagate conditional modifiers from compares to adds
From: Ian Romanick

No changes on Broadwell and later because those platforms do not use the vec4 backend at all.

Ivy Bridge and Haswell had similar results. (Ivy Bridge shown)
total instructions in shared programs: 11682119 -> 11681056 (<.01%)
instructions in affected programs: 150403 -> 149340 (-0.71%)
helped: 950
HURT: 0
helped stats (abs) min: 1 max: 16 x̄: 1.12 x̃: 1
helped stats (rel) min: 0.23% max: 2.78% x̄: 0.82% x̃: 0.71%
95% mean confidence interval for instructions value: -1.19 -1.04
95% mean confidence interval for instructions %-change: -0.84% -0.79%
Instructions are helped.

total cycles in shared programs: 257495842 -> 257495238 (<.01%)
cycles in affected programs: 270302 -> 269698 (-0.22%)
helped: 271
HURT: 13
helped stats (abs) min: 2 max: 14 x̄: 2.42 x̃: 2
helped stats (rel) min: 0.06% max: 1.13% x̄: 0.32% x̃: 0.28%
HURT stats (abs) min: 2 max: 12 x̄: 4.00 x̃: 4
HURT stats (rel) min: 0.15% max: 1.18% x̄: 0.30% x̃: 0.26%
95% mean confidence interval for cycles value: -2.41 -1.84
95% mean confidence interval for cycles %-change: -0.31% -0.26%
Cycles are helped.

Sandy Bridge
total instructions in shared programs: 10430493 -> 10429727 (<.01%)
instructions in affected programs: 120860 -> 120094 (-0.63%)
helped: 766
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.30% max: 2.70% x̄: 0.78% x̃: 0.73%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -0.80% -0.75%
Instructions are helped.

total cycles in shared programs: 146138718 -> 146138446 (<.01%)
cycles in affected programs: 244114 -> 243842 (-0.11%)
helped: 132
HURT: 0
helped stats (abs) min: 2 max: 4 x̄: 2.06 x̃: 2
helped stats (rel) min: 0.03% max: 0.43% x̄: 0.16% x̃: 0.19%
95% mean confidence interval for cycles value: -2.12 -2.00
95% mean confidence interval for cycles %-change: -0.18% -0.15%
Cycles are helped.

GM45 and Iron Lake had identical results.
(Iron Lake shown)
total instructions in shared programs: 7780251 -> 7780248 (<.01%)
instructions in affected programs: 175 -> 172 (-1.71%)
helped: 3
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 1.49% max: 2.44% x̄: 1.81% x̃: 1.49%

total cycles in shared programs: 177851584 -> 177851578 (<.01%)
cycles in affected programs: 9796 -> 9790 (-0.06%)
helped: 3
HURT: 0
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 0.05% max: 0.08% x̄: 0.06% x̃: 0.05%

Signed-off-by: Ian Romanick
---
 src/intel/compiler/brw_vec4_cmod_propagation.cpp | 70 ++--
 1 file changed, 65 insertions(+), 5 deletions(-)

diff --git a/src/intel/compiler/brw_vec4_cmod_propagation.cpp b/src/intel/compiler/brw_vec4_cmod_propagation.cpp
index 7f1001b..5205da4 100644
--- a/src/intel/compiler/brw_vec4_cmod_propagation.cpp
+++ b/src/intel/compiler/brw_vec4_cmod_propagation.cpp
@@ -50,8 +50,14 @@ opt_cmod_propagation_local(bblock_t *block)
           inst->predicate != BRW_PREDICATE_NONE ||
           !inst->dst.is_null() ||
           (inst->src[0].file != VGRF && inst->src[0].file != ATTR &&
-           inst->src[0].file != UNIFORM) ||
-          inst->src[0].abs)
+           inst->src[0].file != UNIFORM))
+         continue;
+
+      /* An ABS source modifier can only be handled when processing a compare
+       * with a value other than zero.
+       */
+      if (inst->src[0].abs &&
+          (inst->opcode != BRW_OPCODE_CMP || inst->src[1].is_zero()))
          continue;
 
       if (inst->opcode == BRW_OPCODE_AND &&
@@ -60,15 +66,68 @@ opt_cmod_propagation_local(bblock_t *block)
             !inst->src[0].negate))
          continue;
 
-      if (inst->opcode == BRW_OPCODE_CMP && !inst->src[1].is_zero())
-         continue;
-
       if (inst->opcode == BRW_OPCODE_MOV &&
           inst->conditional_mod != BRW_CONDITIONAL_NZ)
          continue;
 
       bool read_flag = false;
       foreach_inst_in_block_reverse_starting_from(vec4_instruction, scan_inst, inst) {
+         /* A CMP with a second source of zero can match with anything. A CMP
+          * with a second source that is not zero can only match with an ADD
+          * instruction.
+          */
+         if (inst->opcode == BRW_OPCODE_CMP && !inst->src[1].is_zero()) {
+            bool negate;
+
+            if (scan_inst->opcode != BRW_OPCODE_ADD)
+               goto not_match;
+
+            /* A CMP is basically a subtraction. The result of the
+             * subtraction must be the same as the result of the addition.
+             * This means that one of the operands must be negated. So (a +
+             * b) vs (a == -b) or (a + -b) vs (a == b).
+             */
+            if ((inst->src[0].equals(scan_inst->src[0]) &&
+                 inst->src[1].negative_equals(scan_inst->src[1])) ||
+                (inst->src[0].equals(scan_inst->src[1]) &&
+
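The operand test described in the comment above ("(a + b) vs (a == -b)") might be sketched as follows. Everything here is illustrative: Src is a toy type with just a register number and a negate bit, equals() and negative_equals() only mimic the method names seen in the diff, and the second (commuted) case is an assumption based on ADD being commutative, since the quoted hunk is cut off before it finishes.

```cpp
// Toy source operand: a register number plus a negate modifier. This is a
// hypothetical stand-in, not the backend's real source register class.
struct Src {
    int reg;
    bool negate;

    // Same register, same negate modifier.
    bool equals(const Src &o) const
    {
        return reg == o.reg && negate == o.negate;
    }

    // Same register, opposite negate modifier.
    bool negative_equals(const Src &o) const
    {
        return reg == o.reg && negate != o.negate;
    }
};

// True when "cmp c0, c1" tests the same value that "add a0, a1" computes:
// one operand matches exactly and the other matches with its sign flipped.
inline bool cmp_matches_add(const Src &c0, const Src &c1,
                            const Src &a0, const Src &a1)
{
    return (c0.equals(a0) && c1.negative_equals(a1)) ||
           (c0.equals(a1) && c1.negative_equals(a0)); // assumed commuted case
}
```

With g2 and g64.5 modelled as registers 2 and 64, "cmp g2, -g64.5" matches "add g2, g64.5", while "cmp g2, g64.5" does not.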
[Mesa-dev] [PATCH 2/6] i965/fs: Allow cmod propagation when src0 is a uniform or shader input
From: Ian Romanick

No shader-db changes. This source must have been written by a previous instruction, so it cannot be a uniform or a shader input. However, this change allows the next commit to help about 900 more shaders.

Signed-off-by: Ian Romanick
---
 src/intel/compiler/brw_fs_cmod_propagation.cpp | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/intel/compiler/brw_fs_cmod_propagation.cpp b/src/intel/compiler/brw_fs_cmod_propagation.cpp
index 4625d69..b995a51 100644
--- a/src/intel/compiler/brw_fs_cmod_propagation.cpp
+++ b/src/intel/compiler/brw_fs_cmod_propagation.cpp
@@ -62,7 +62,8 @@ opt_cmod_propagation_local(const gen_device_info *devinfo, bblock_t *block)
            inst->opcode != BRW_OPCODE_MOV) ||
           inst->predicate != BRW_PREDICATE_NONE ||
           !inst->dst.is_null() ||
-          inst->src[0].file != VGRF ||
+          (inst->src[0].file != VGRF && inst->src[0].file != ATTR &&
+           inst->src[0].file != UNIFORM) ||
           inst->src[0].abs)
          continue;
 
-- 
2.9.5
[Mesa-dev] [Bug 77449] Tracker bug for all bugs related to Steam titles
https://bugs.freedesktop.org/show_bug.cgi?id=77449

Bug 77449 depends on bug 105426, which changed state.

Bug 105426 Summary: [regression] Mesa-18.0rc4 - black screen in some Valve games when run under Wine
https://bugs.freedesktop.org/show_bug.cgi?id=105426

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

-- 
You are receiving this mail because:
You are the assignee for the bug.
[Mesa-dev] 2018 Election voting OPEN
To all X.Org Foundation Members:

The X.Org Foundation's annual election is now open and will remain open until 23:59 UTC on 5 April 2018.

Four of the eight director seats are open during this election, with the four nominees receiving the highest vote totals serving as directors for two year terms.

There were six candidates nominated. For a complete list of the candidates and their personal statements, please see the following:

https://www.x.org/wiki/BoardOfDirectors/Elections/2018/

Here are some instructions on how to cast your vote:

Login to the membership system at: https://members.x.org/

If you do not remember your password, you can click on the "lost password" button and enter your user name. An e-mail will be sent to you with your password. If you have problems with the membership system, please e-mail members...@x.org.

When you login you will see a row of buttons that will allow you to update your info, list the members, list the open ballots and logout. Below this you will see a list of open ballots, for which you can cast votes. At the bottom of this page you will see another row of buttons with the current privacy policy, the provisional By-laws, the provisional Membership Agreement and instructions on how to contact the admin.

Note that if you click on the "Ballots" button at any time, you will see a list of the open ballots.

To cast your vote in a ballot, click on the "Cast" button to the right of the ballot you wish to vote on. This will bring up another page with the list of the candidates, and a question of whether or not to approve the new By-laws.

For the election: There is a pull-down selection box next to each candidate. For your top choice, select "1". For your second choice, select "2" and so forth. You should think of the numbers that you are selecting as the ranking of your preferences.

Note that you are NOT required to select your preferences for all four positions. You can leave more than one blank. The only restriction is that you cannot duplicate any of your choices (i.e., you can only select one "1", one "2" and so forth).

After you have completed your ballot, click the "Vote" button. Note that once you click this button, your votes will be cast and you will not be able to make further changes, so please make sure you are satisfied with your votes before clicking the "Vote" button.

After you click the "Vote" button, the system will verify that you have completed a valid ballot. If your ballot is invalid (e.g., you duplicated a selection), it will return you to the previous voting page. If your ballot is valid, your votes will be recorded and the system will show you a notice that your votes were cast.

Note that the election will close at 23:59 UTC on 5 April 2018. At that time, the election committee will count the votes and present the results to the current board for validation. After the current board validates the results, the election committee will present the results to the Members.

Rob Clark, on behalf of the X.Org elections committee
Re: [Mesa-dev] [RFC] Mesa 17.3.x release problems and process improvements
Just one bit of feedback, for the rest I either agree or have no opinion:

On Wed, Mar 21, 2018 at 8:28 PM, Emil Velikov wrote:
> * unfit and late nominations:
>   * any rejections that are unfit based on the existing criteria can
>     be merged as long as:
>     * subsystem specific patches are approved by the team
>       maintainer(s).
>     * patches that cover multiple subsystems are approved by 50%+1
>       of the maintainers of the affected subsystems.

I don't think 50% + 1 is workable. That would mean for a core mesa patch, one would have to get like 5+ people to ack it. Seems like a lot. (And I suspect will lead to debates about how to count "affected" subsystems.) IMHO 2 is enough, i.e. the maintainer that wants it, and another maintainer who thinks it's reasonable.

Cheers,

  -ilia
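To put a number on the "5+ people" estimate above, here is a small sketch of the arithmetic. It assumes "50%+1" is read as "strictly more than half", i.e. floor(n/2) + 1, and the maintainer counts used in the note below are made up for illustration; neither the reading nor the counts come from the thread.

```cpp
// Number of approvals needed under a "50% + 1" rule, read here as
// floor(n / 2) + 1 (an assumption; the proposal does not spell it out).
constexpr unsigned majority_acks(unsigned maintainers)
{
    return maintainers / 2 + 1;
}
```

Under that reading, a patch touching subsystems with eight or nine maintainers in total would need five acks, compared with the flat two suggested in this reply.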
Re: [Mesa-dev] [RFC] Mesa 17.3.x release problems and process improvements
Hi all,

Having gone through the thread a few times, I believe it can be summarised as follows:
 * Greater transparency is needed.
 * Subsystem/team maintainers.
 * Unfit and late nominations.
 * Developers/everyone should be more involved.
 * Greater automation must be explored.

NOTES:
 * Some of the details are not listed in the thread, but have been raised in one form or another.
 * The details focus more on the goals than on the actual means.
 * That said, some details may have been missed - I'm a mere human.

In detail:
 * make the patch queue, release date and blockers accessible at any point in time:
   * queued patches can be accessed via a branch - say wip/17.3, wip/18.0, wip/18.1, etc. The branch _will be_ rebased, although normally reverts are recommended.
   * rejected patches must be listed alongside the reason why, and author+reviewer must be informed (email & IRC?) ASAP.
     * we already document and track those in .cherry-ignore. can we reuse that?
   * patches with trivial conflicts can be merged to the wip branch after another release manager, or the patch author/reviewer, has confirmed the changes.
   * patches that require backports will be rejected. usual rejection procedure applies (described above).
   * if there is a delay due to extra testing time or otherwise, the release manager must list the blocking issues and an ETA must be provided. The ETA must be updated before it's reached. it may be worth having the ETA and rejections in a single place - inside the wip/ branch, html page, elsewhere.
   * the current metabug with release blockers must be made more obvious.
   * release manager can contact Phoronix and/or similar media to publicise expected delays, blockers or seek request for testing.

 * teams are encouraged to have one or multiple maintainers. some of the goals of having such people include:
   * individuals that have greater interaction with the team and knowledge about the team plans. rough examples include:
     * backport/bug is needed, yet person is not available - on a leave (sick, sabbatical, other) or busy with other things.
     * team has higher priority with details not publicly available.
   * can approve unfit or late nominations - see next section.
   * to ensure cover and minimise stress it's encouraged to have multiple maintainers per team, rotated regularly.
   * list of maintainers must be documented

 * unfit and late nominations:
   * any rejections that are unfit based on the existing criteria can be merged as long as:
     * subsystem specific patches are approved by the team maintainer(s).
     * patches that cover multiple subsystems are approved by 50%+1 of the maintainers of the affected subsystems.
   * late nominations can be made after the pre-release announcement. they must be approved by the subsystem maintainers up to X hours before the actual release. approval specifics are identical to the ones listed in the 'unfit' section just above.

 * developers/everyone should be more involved:
   * with the patch queue accessible at any point, everyone is encouraged to keep an eye open and report issues.
   * developers should be more active in providing backports and updating the status of release blocking bugs.
   * release managers and team maintainers must check with the developer (via email, IRC, other) if no action has been made for X days.
   * everyone is encouraged to provide a piglit/dEQP/etc testing summary (via email, attachment, html page, etc). if they do, please ensure that the summary is consistently available, regardless of whether there are any regressions or not. if extra time is needed, reply to the list informing release managers
     * in case of regressions bisection must be provided.

 * testing - pre and post merge, automation:
   NOTE: implementation specifics are up to each team, with goals of:
     a) results must be accessible reasonably easily
     b) high level documentation of the setup and contact points are documented
   * with over 120 developers contributing to mesa, ambiguous patch nominations will always exist.
     * the obvious ones can be automated, others will be applied manually.
   * release manager should deploy automation ensuring that all common combinations build correctly. if a particular combination is missing, interested parties should provide basic information/assistance for setting one up.
   * release manager will push the wip branch, after ensuring that patches follow the criteria and pass build testing
   * pre: automated runtime testing can be utilised at a later stage with gitlab. it does not seem feasible atm.
   * post: teams can setup piglit/dEQP/etc testing, summary and/or bisection. it should be documented if the testing is
[Mesa-dev] [Bug 105670] [regression][hang] Trine1EE hangs GPU after loading screen on Mesa3D-17.3 and later
https://bugs.freedesktop.org/show_bug.cgi?id=105670

            Bug ID: 105670
           Summary: [regression][hang] Trine1EE hangs GPU after loading screen on Mesa3D-17.3 and later
           Product: Mesa
           Version: 17.3
          Hardware: Other
                OS: All
            Status: NEW
          Keywords: regression
          Severity: normal
          Priority: medium
         Component: glsl-compiler
          Assignee: mesa-dev@lists.freedesktop.org
          Reporter: i...@yahoo.com
        QA Contact: intel-3d-b...@lists.freedesktop.org

The game is Trine1 Enchanted Edition, running under Wine-3.3.

The game works fine with Mesa-17.2.0, but with Mesa-17.3.0 hangs right after the loading screen. The hang is soft, the driver tries to reset itself each 10 seconds and turns off my monitor. I am able to switch to text console (kms one) and kill the Xorg server, then reboot the machine. The kernel driver refuses to accept any new commands.

Using software render I was able to capture a small apitrace (90MB compressed) that successfully reproduces the issue. It could be found here:
https://drive.google.com/open?id=1RNKExfdBXUCN7SIhcrdiMwsvwIoCVMTg

By using the qapitrace lookup, I managed to locate the exact operation that hangs. Not surprising, it is a draw operation:
#55251 - works
#55252 - hangs

I tried to git bisect between 17.2.0 and 17.3.0, but I only managed to narrow it down to few steps:
git bisect good 375c4868efa3cf549699557989c8f5c08c0710f0
git bisect bad 09f6bd5ef27c1b16b1468441b070b60c2d57523d

The rest of my bisect log is full of skips, because I cannot find a commit that would work at all. All of them fail to run even `glxgears`, some crash, other give asserts in R600 code, in xmlconfig etc...

My hardware is AMD Radeon HD5670 Redwood Evergreen (R600 driver). Using "R600_DEBUG=nosb" does not workaround the issue.

Also the bug is reproducible on AMD RX480, running latest mesa3d master, llvm-svn 7.0.0svn_r328112 and experimental kernel.
Re: [Mesa-dev] [PATCH 2/4] gallium: add initial support for conservative rasterization
On 22.03.2018 at 00:43, Ilia Mirkin wrote:
> On Wed, Mar 21, 2018 at 7:37 PM, Roland Scheidegger wrote:
>> Personally I'm not a big proponent of propagating single-vendor
>> extensions (which are useless for anything but one specific driver) more
>> or less directly through to gallium.
>> There's an intel extension doing similar things already too.
>> Ideally we'd end up with some bits in gallium which can do whatever the
>> standardized version of it is going to require in some sensible way - at
>> least I'd hope that such an extension will surface...
>
> Agreed. When/if such an extension materializes, we can adjust the
> gallium API in a logical way to cover all the cases. Until then, this
> is the functionality that exists on the GPUs in question.
>

I'm wondering, which bits of these could be done on AMD gpus too? Vega chips support conservative rasterization too. My guess is that what will end up in a standardized extension is probably similar to what's supported by d3d...

I'm just not sure it's really worth the trouble of bothering the gallium interface with basically experimental additions. From what I can tell you could instead implement intel's extension and expose that on nvidia gpus instead (albeit I'm not sure nvidia can do all of that either) - from a quick look the interfaces would be quite different if you started with that instead.

But whatever, I'm not too concerned, but maybe the AMD guys are...

Roland
Re: [Mesa-dev] [PATCH 1/4] mesa: add support for nvidia conservative rasterization extensions
The indentation error shall be fixed.

no_error="true" does mean there's a separate no-error variant of the function. I create such variants for consistency with other functions in viewport.c

On Wed, Mar 21, 2018 at 11:40 PM, Ilia Mirkin wrote:
> On Wed, Mar 21, 2018 at 7:11 PM, pendingchaos wrote:
>> Although the specs write it against compatibility GL 4.3 and allows core
>> profile and GLES2+, it is exposed for GL 1.0+ and GLES1 and GLES2+.
>> ---
>>  src/mapi/glapi/gen/gl_API.xml           |  47 +++
>>  src/mapi/glapi/gen/gl_genexec.py        |   1 +
>>  src/mesa/Makefile.sources               |   2 +
>>  src/mesa/main/attrib.c                  |  60 +++---
>>  src/mesa/main/conservativeraster.c      | 138
>>  src/mesa/main/conservativeraster.h      |  48 +++
>>  src/mesa/main/context.c                 |  10 +++
>>  src/mesa/main/dlist.c                   |  86
>>  src/mesa/main/enable.c                  |  14
>>  src/mesa/main/extensions_table.h        |   4 +
>>  src/mesa/main/get.c                     |   3 +
>>  src/mesa/main/get_hash_params.py        |  13 +++
>>  src/mesa/main/mtypes.h                  |  29 ++-
>>  src/mesa/main/tests/dispatch_sanity.cpp |  27 +++
>>  src/mesa/main/viewport.c                |  57 +
>>  src/mesa/main/viewport.h                |   6 ++
>>  src/mesa/meson.build                    |   2 +
>>  17 files changed, 535 insertions(+), 12 deletions(-)
>>  create mode 100644 src/mesa/main/conservativeraster.c
>>  create mode 100644 src/mesa/main/conservativeraster.h
>>
>> diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml
>> index 38c1921047..0098e6e425 100644
>> --- a/src/mapi/glapi/gen/gl_API.xml
>> +++ b/src/mapi/glapi/gen/gl_API.xml
>> @@ -12871,6 +12871,53 @@
>> + [XML markup lost in the archive] no_error="true">
>
> Indent, both here and below (i.e. param should be indented by 1).
>
> Not 100% sure I remember what no_error="true" means, but IIRC it means
> there's separate dispatch in a no-error context. Doesn't seem
> worthwhile here (and I don't think you added the _no_error variants of
> the functions).
>
> -ilia
Re: [Mesa-dev] [PATCH 4/4] nvc0: add conservative rasterization support
I haven't tested on Maxwell as I don't have easy access to one but I think I can do so sometime tomorrow.

I'll gate on GM200 with the second revision of the patch-set.

prec_bias should always fit in the max value of an immed, 2**12-1, as the maximum subpixel precision bias is 8 on GM200 and later hardware (8|8<<8 < 2**12-1).

On Wed, Mar 21, 2018 at 11:27 PM, Ilia Mirkin wrote:
> On Wed, Mar 21, 2018 at 7:11 PM, pendingchaos wrote:
>> Subpixel precision bias, dilation and the post-snap mode are supported on
>> GM2xx and newer. The pre-snap mode is supported for triangle primitives on
>> GP1xx.
>> ---
>>  src/gallium/drivers/nouveau/nvc0/nvc0_3d.xml.h         |  5 +
>>  src/gallium/drivers/nouveau/nvc0/nvc0_screen.c         | 18 --
>>  src/gallium/drivers/nouveau/nvc0/nvc0_state.c          | 18 ++
>>  src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c |  5 +
>>  src/gallium/drivers/nouveau/nvc0/nvc0_stateobj.h       |  2 +-
>>  5 files changed, 41 insertions(+), 7 deletions(-)
>>
>> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_3d.xml.h b/src/gallium/drivers/nouveau/nvc0/nvc0_3d.xml.h
>> index d7245fbcae..c5456e48b5 100644
>> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_3d.xml.h
>> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_3d.xml.h
>> @@ -447,6 +447,10 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
>>  #define NVC0_3D_VIEWPORT_TRANSLATE_Z__ESIZE    0x0020
>>  #define NVC0_3D_VIEWPORT_TRANSLATE_Z__LEN      0x0010
>>
>> +#define NVC0_3D_SUBPIXEL_PRECISION(i0)         (0x0a1c + 0x20*(i0))
>> +#define NVC0_3D_SUBPIXEL_PRECISION__ESIZE      0x0020
>> +#define NVC0_3D_SUBPIXEL_PRECISION__LEN        0x0010
>> +
>>  #define NVC0_3D_VIEWPORT_HORIZ(i0)             (0x0c00 + 0x10*(i0))
>>  #define NVC0_3D_VIEWPORT_HORIZ__ESIZE          0x0010
>>  #define NVC0_3D_VIEWPORT_HORIZ__LEN            0x0010
>> @@ -780,6 +784,7 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
>>  #define NVC0_3D_UNK1140                        0x1140
>>
>>  #define NVC0_3D_UNK1144                        0x1144
>> +#define NVC0_3D_CONSERVATIVE_RASTER            0x1148
>>
>>  #define NVC0_3D_VTX_ATTR_DEFINE                0x114c
>>  #define NVC0_3D_VTX_ATTR_DEFINE_ATTR__MASK     0x00ff
>> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
>> index ddbb3ec16d..b2b87e01d6 100644
>> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
>> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
>> @@ -172,6 +172,8 @@ nvc0_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
>>        return 30;
>>     case PIPE_CAP_MAX_WINDOW_RECTANGLES:
>>        return NVC0_MAX_WINDOW_RECTANGLES;
>> +   case PIPE_CAP_MAX_CONSERVATIVE_RASTER_SUBPIXEL_PRECISION_BIAS:
>> +      return class_3d>=GM200_3D_CLASS ? 8 : 0;
>
> foo >= bar (here and elsewhere)
>
>>
>>     /* supported caps */
>>     case PIPE_CAP_TEXTURE_MIRROR_CLAMP:
>> @@ -263,7 +265,12 @@ nvc0_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
>>     case PIPE_CAP_TGSI_VS_LAYER_VIEWPORT:
>>     case PIPE_CAP_TGSI_TES_LAYER_VIEWPORT:
>>     case PIPE_CAP_POST_DEPTH_COVERAGE:
>> +   case PIPE_CAP_CONSERVATIVE_RASTER_POST_SNAP_TRIANGLES:
>> +   case PIPE_CAP_CONSERVATIVE_RASTER_POST_SNAP_POINTS_LINES:
>> +   case PIPE_CAP_CONSERVATIVE_RASTER_POST_DEPTH_COVERAGE:
>>        return class_3d >= GM200_3D_CLASS;
>> +   case PIPE_CAP_CONSERVATIVE_RASTER_PRE_SNAP_TRIANGLES:
>> +      return class_3d >= GP100_3D_CLASS;
>>     case PIPE_CAP_SEAMLESS_CUBE_MAP_PER_TEXTURE:
>>     case PIPE_CAP_TGSI_BALLOT:
>>     case PIPE_CAP_BINDLESS_TEXTURE:
>> @@ -309,12 +316,7 @@ nvc0_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
>>     case PIPE_CAP_FENCE_SIGNAL:
>>     case PIPE_CAP_CONSTBUF0_FLAGS:
>>     case PIPE_CAP_PACKED_UNIFORMS:
>> -   case PIPE_CAP_CONSERVATIVE_RASTER_POST_SNAP_TRIANGLES:
>> -   case PIPE_CAP_CONSERVATIVE_RASTER_POST_SNAP_POINTS_LINES:
>> -   case PIPE_CAP_CONSERVATIVE_RASTER_PRE_SNAP_TRIANGLES:
>>     case PIPE_CAP_CONSERVATIVE_RASTER_PRE_SNAP_POINTS_LINES:
>> -   case PIPE_CAP_CONSERVATIVE_RASTER_POST_DEPTH_COVERAGE:
>> -   case PIPE_CAP_MAX_CONSERVATIVE_RASTER_SUBPIXEL_PRECISION_BIAS:
>>        return 0;
>>
>>     case PIPE_CAP_VENDOR_ID:
>> @@ -444,6 +446,8 @@ nvc0_screen_get_shader_param(struct pipe_screen *pscreen,
>>  static float
>>  nvc0_screen_get_paramf(struct pipe_screen *pscreen, enum pipe_capf param)
>>  {
>> +   const uint16_t class_3d = nouveau_screen(pscreen)->class_3d;
>> +
>>     switch (param) {
>>     case PIPE_CAPF_MAX_LINE_WIDTH:
>>
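The bound quoted at the top of this reply, 8|8<<8 < 2**12-1, can be checked at compile time. The sketch below assumes the two bias values are packed as lo | (hi << 8), which is only inferred from that expression; which of the two is the X bias and which is the Y bias is not stated in the message, and the names here are invented for the example.

```cpp
#include <cstdint>

// Largest value that fits in the immediate, per the message: 2**12 - 1.
constexpr std::uint32_t kImmedMax = (1u << 12) - 1;   // 4095

// Maximum subpixel precision bias on GM200 and later, per the message.
constexpr std::uint32_t kMaxBias = 8;

// Assumed packing: one bias in the low byte, the other shifted up by 8.
constexpr std::uint32_t pack_prec_bias(std::uint32_t bias_lo,
                                       std::uint32_t bias_hi)
{
    return bias_lo | (bias_hi << 8);
}

// 8 | (8 << 8) = 8 | 2048 = 2056, which is below 4095.
static_assert(pack_prec_bias(kMaxBias, kMaxBias) == 2056, "8 | 8<<8");
static_assert(pack_prec_bias(kMaxBias, kMaxBias) < kImmedMax,
              "largest packed bias fits in the immediate");
```

Note that << binds tighter than | in C and C++, so the unparenthesised expression in the message evaluates the same way.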
Re: [Mesa-dev] [PATCH 2/4] gallium: add initial support for conservative rasterization
On Wed, Mar 21, 2018 at 7:37 PM, Roland Scheidegger wrote:
> Personally I'm not a big proponent of propagating single-vendor
> extensions (which are useless for anything but one specific driver) more
> or less directly through to gallium.
> There's an intel extension doing similar things already too.
> Ideally we'd end up with some bits in gallium which can do whatever the
> standardized version of it is going to require in some sensible way - at
> least I'd hope that such an extension will surface...

Agreed. When/if such an extension materializes, we can adjust the gallium API in a logical way to cover all the cases. Until then, this is the functionality that exists on the GPUs in question.
Re: [Mesa-dev] Removing GRALLOC_MODULE_PERFORM_GET_DRM_FD
Hey Robert,

On Wed, Mar 21, 2018 at 4:16 PM, Robert Foss wrote:
> Hey,
>
> I've started looking into removing the gralloc method
> GRALLOC_MODULE_PERFORM_GET_DRM_FD.
>
> The issue around this seems to have two parts:
> 1) Finding the right device to open
> 2) Sharing the device between components
>
> Sharing the device between components
> -------------------------------------
>
> Currently the device is used by drm_hwc, gbm_gralloc and mesa.
>
> drm_hwc opens the *primary* node in DrmResources::Init() and creates an
> internal model of what properties/components the device has.
>
> gbm_gralloc uses the *render* node in gbm_dev_create().
>
> Mesa uses the *render* node during dri_screen creation in
> dri2_create_screen() and for loading the driver in
> dri2_initialize_android().
>
> However, problematically, drm_hwc uses OpenGL composition as a fallback
> method, and when doing so mesa has to be able to import buffers, which means
> mesa has to use a *primary* node.
>
> The way this is currently worked around in production systems seems to be to
> disable drm master authentication. This is at least what ChromeOS & Intel
> are doing as far as I understand it.
>

Thanks for kicking this off. I've done a few tests on 2) with VC4 and 8.1.0_r18.

With drm_hwc the primary or master on card0 and gbm_gralloc & Mesa each getting their own fd from render128, I didn't need any of the DRM authentication hacks in the kernel anymore. That's with full overlay composition, everything forced to hwui GL composition or everything done through my hacked up ES2 version of glworker in drm_hwc (well it made it to launcher until succumbing to a resource leak). So I don't think mesa would need a master node and could make do with render.

The one thing that mesa on a render node definitely breaks is flink/GEM names which drm_gralloc uses (the Android-x86 version anyway). No flink anything with render nodes; drm_gralloc would have to move to dmabuf fds. That said, it would finally get rid of the strict coupling between Mesa and gralloc.

Ripping out the PERFORM and drm_gralloc facsimile in gbm_gralloc saves a big bunch of code:
https://github.com/stschake/gbm_gralloc/tree/libdrm_handle_def

With more in mesa/platform_android from the flink stuff.

Thanks,
Stefan
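The arrangement described in this reply, with drm_hwc on the primary node and gbm_gralloc and Mesa each opening their own fd on the render node, might look roughly like this. It is a sketch under assumptions: "render128" is taken to mean /dev/dri/renderD128 (the usual Linux naming, where render node minors start at 128), the helper names are invented, and real components would normally go through libdrm rather than calling open() directly.

```cpp
#include <fcntl.h>
#include <string>

// Path of a DRM primary node, e.g. index 0 -> /dev/dri/card0.
inline std::string primary_node_path(unsigned index)
{
    return "/dev/dri/card" + std::to_string(index);
}

// Path of a DRM render node, e.g. minor 128 -> /dev/dri/renderD128.
inline std::string render_node_path(unsigned minor)
{
    return "/dev/dri/renderD" + std::to_string(minor);
}

// Opens a node and returns its file descriptor, or -1 on failure.
// Each component calls this itself instead of receiving a shared fd.
inline int open_drm_node(const std::string &path)
{
    return open(path.c_str(), O_RDWR | O_CLOEXEC);
}
```

In this picture only the composer would call open_drm_node(primary_node_path(0)); gralloc and Mesa would each call open_drm_node(render_node_path(128)) and share buffers as dmabuf fds rather than flink names.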
Re: [Mesa-dev] [PATCH 1/4] mesa: add support for nvidia conservative rasterization extensions
On Wed, Mar 21, 2018 at 7:11 PM, pendingchaos wrote:
> Although the specs write it against compatibility GL 4.3 and allows core
> profile and GLES2+, it is exposed for GL 1.0+ and GLES1 and GLES2+.
> ---
>  src/mapi/glapi/gen/gl_API.xml           |  47 +++
>  src/mapi/glapi/gen/gl_genexec.py        |   1 +
>  src/mesa/Makefile.sources               |   2 +
>  src/mesa/main/attrib.c                  |  60 +++---
>  src/mesa/main/conservativeraster.c      | 138
>  src/mesa/main/conservativeraster.h      |  48 +++
>  src/mesa/main/context.c                 |  10 +++
>  src/mesa/main/dlist.c                   |  86
>  src/mesa/main/enable.c                  |  14
>  src/mesa/main/extensions_table.h        |   4 +
>  src/mesa/main/get.c                     |   3 +
>  src/mesa/main/get_hash_params.py        |  13 +++
>  src/mesa/main/mtypes.h                  |  29 ++-
>  src/mesa/main/tests/dispatch_sanity.cpp |  27 +++
>  src/mesa/main/viewport.c                |  57 +
>  src/mesa/main/viewport.h                |   6 ++
>  src/mesa/meson.build                    |   2 +
>  17 files changed, 535 insertions(+), 12 deletions(-)
>  create mode 100644 src/mesa/main/conservativeraster.c
>  create mode 100644 src/mesa/main/conservativeraster.h
>
> diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml
> index 38c1921047..0098e6e425 100644
> --- a/src/mapi/glapi/gen/gl_API.xml
> +++ b/src/mapi/glapi/gen/gl_API.xml
> @@ -12871,6 +12871,53 @@
> + [XML markup lost in the archive] no_error="true">

Indent, both here and below (i.e. param should be indented by 1).

Not 100% sure I remember what no_error="true" means, but IIRC it means there's separate dispatch in a no-error context. Doesn't seem worthwhile here (and I don't think you added the _no_error variants of the functions).

  -ilia
Re: [Mesa-dev] [PATCH 2/4] gallium: add initial support for conservative rasterization
Personally I'm not a big proponent of propagating single-vendor extensions (which are useless for anything but one specific driver) more or less directly through to gallium. There's an intel extension doing similar things already too.

Ideally we'd end up with some bits in gallium which can do whatever the standardized version of it is going to require in some sensible way - at least I'd hope that such an extension will surface...

Roland

On 22.03.2018 at 00:11, pendingchaos wrote:
> ---
>  src/gallium/docs/source/cso/rasterizer.rst       | 18 ++
>  src/gallium/docs/source/screen.rst               | 18 ++
>  src/gallium/drivers/etnaviv/etnaviv_screen.c     | 10 ++
>  src/gallium/drivers/freedreno/freedreno_screen.c | 10 ++
>  src/gallium/drivers/i915/i915_screen.c           | 13 +
>  src/gallium/drivers/llvmpipe/lp_screen.c         | 12
>  src/gallium/drivers/nouveau/nv30/nv30_screen.c   | 10 ++
>  src/gallium/drivers/nouveau/nv50/nv50_screen.c   | 10 ++
>  src/gallium/drivers/nouveau/nvc0/nvc0_screen.c   | 10 ++
>  src/gallium/drivers/r300/r300_screen.c           | 10 ++
>  src/gallium/drivers/r600/r600_pipe.c             | 6 ++
>  src/gallium/drivers/r600/r600_pipe_common.c      | 4
>  src/gallium/drivers/radeonsi/si_get.c            | 10 ++
>  src/gallium/drivers/softpipe/sp_screen.c         | 12
>  src/gallium/drivers/svga/svga_screen.c           | 13 +
>  src/gallium/drivers/swr/swr_screen.cpp           | 10 ++
>  src/gallium/drivers/vc4/vc4_screen.c             | 13 -
>  src/gallium/drivers/vc5/vc5_screen.c             | 13 -
>  src/gallium/drivers/virgl/virgl_screen.c         | 10 ++
>  src/gallium/include/pipe/p_defines.h             | 20
>  src/gallium/include/pipe/p_state.h               | 6 ++
>  21 files changed, 236 insertions(+), 2 deletions(-)
>
> diff --git a/src/gallium/docs/source/cso/rasterizer.rst b/src/gallium/docs/source/cso/rasterizer.rst
> index 616e4511a2..4e2d487674 100644
> --- a/src/gallium/docs/source/cso/rasterizer.rst
> +++ b/src/gallium/docs/source/cso/rasterizer.rst
> @@ -340,3 +340,21 @@ clip_plane_enable
>      If any clip distance output is written, those half-spaces for which no
>      clip distance is written count as disabled; i.e. user clip planes and
>      shader clip distances cannot be mixed, and clip distances take precedence.
> +
> +conservative_raster_mode
> +    The conservative rasterization mode. For PIPE_CONSERVATIVE_RASTER_OFF,
> +    conservative rasterization is disabled. For PIPE_CONSERVATIVE_RASTER_POST_SNAP
> +    or PIPE_CONSERVATIVE_RASTER_PRE_SNAP, conservative rasterization is enabled.
> +    When conservative rasterization is enabled, the polygon smooth, line smooth,
> +    point smooth and line stipple settings are ignored.
> +    With the post-snap mode, unlike the pre-snap mode, fragments are never
> +    generated for degenerate primitives. Degenerate primitives, when rasterized,
> +    are considered back-facing and the vertex attributes and depth are that of
> +    the provoking vertex.
> +    If the post-snap mode is used with an unsupported primitive, the pre-snap
> +    mode is used, if supported. Behavior is similar for the pre-snap mode.
> +    If the pre-snap mode is used, fragments are generated with respect to the primitive
> +    before vertex snapping.
> +
> +conservative_raster_dilate
> +    The amount of dilation during conservative rasterization.
> diff --git a/src/gallium/docs/source/screen.rst b/src/gallium/docs/source/screen.rst
> index 3837360fb4..5bc6ee99f0 100644
> --- a/src/gallium/docs/source/screen.rst
> +++ b/src/gallium/docs/source/screen.rst
> @@ -420,6 +420,18 @@ The integer capabilities:
>    by the driver, and the driver can throw assertion failures.
>  * ``PIPE_CAP_PACKED_UNIFORMS``: True if the driver supports packed uniforms
>    as opposed to padding to vec4s.
> +* ``PIPE_CAP_CONSERVATIVE_RASTER_POST_SNAP_TRIANGLES``: Whether the
> +  PIPE_CONSERVATIVE_RASTER_POST_SNAP mode is supported for triangles.
> +* ``PIPE_CAP_CONSERVATIVE_RASTER_POST_SNAP_POINTS_LINES``: Whether the
> +  PIPE_CONSERVATIVE_RASTER_POST_SNAP mode is supported for points and lines.
> +* ``PIPE_CAP_CONSERVATIVE_RASTER_PRE_SNAP_TRIANGLES``: Whether the
> +  PIPE_CONSERVATIVE_RASTER_PRE_SNAP mode is supported for triangles.
> +* ``PIPE_CAP_CONSERVATIVE_RASTER_PRE_SNAP_POINTS_LINES``: Whether the
> +  PIPE_CONSERVATIVE_RASTER_PRE_SNAP mode is supported for points and lines.
> +* ``PIPE_CAP_CONSERVATIVE_RASTER_POST_DEPTH_COVERAGE``: Whether PIPE_CAP_POST_DEPTH_COVERAGE
> +  works with conservative rasterization.
> +* ``PIPE_CAP_MAX_CONSERVATIVE_RASTER_SUBPIXEL_PRECISION_BIAS``: The maximum
> +  subpixel precision bias in bits during conservative rasterization.
>
>
>  .. _pipe_capf:
> @@ -437,6 +449,12
Re: [Mesa-dev] [PATCH 4/4] nvc0: add conservative rasterization support
On Wed, Mar 21, 2018 at 7:11 PM, pendingchaos wrote:
> Subpixel precision bias, dilation and the post-snap mode are supported on
> GM2xx and newer. The pre-snap mode is supported for triangle primitives on
> GP1xx.
> ---
>  src/gallium/drivers/nouveau/nvc0/nvc0_3d.xml.h         | 5 +
>  src/gallium/drivers/nouveau/nvc0/nvc0_screen.c         | 18 --
>  src/gallium/drivers/nouveau/nvc0/nvc0_state.c          | 18 ++
>  src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c | 5 +
>  src/gallium/drivers/nouveau/nvc0/nvc0_stateobj.h       | 2 +-
>  5 files changed, 41 insertions(+), 7 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_3d.xml.h b/src/gallium/drivers/nouveau/nvc0/nvc0_3d.xml.h
> index d7245fbcae..c5456e48b5 100644
> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_3d.xml.h
> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_3d.xml.h
> @@ -447,6 +447,10 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
>  #define NVC0_3D_VIEWPORT_TRANSLATE_Z__ESIZE 0x0020
>  #define NVC0_3D_VIEWPORT_TRANSLATE_Z__LEN 0x0010
>
> +#define NVC0_3D_SUBPIXEL_PRECISION(i0) (0x0a1c + 0x20*(i0))
> +#define NVC0_3D_SUBPIXEL_PRECISION__ESIZE 0x0020
> +#define NVC0_3D_SUBPIXEL_PRECISION__LEN 0x0010
> +
>  #define NVC0_3D_VIEWPORT_HORIZ(i0) (0x0c00 + 0x10*(i0))
>  #define NVC0_3D_VIEWPORT_HORIZ__ESIZE 0x0010
>  #define NVC0_3D_VIEWPORT_HORIZ__LEN 0x0010
> @@ -780,6 +784,7 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
>  #define NVC0_3D_UNK1140 0x1140
>
>  #define NVC0_3D_UNK1144 0x1144
> +#define NVC0_3D_CONSERVATIVE_RASTER 0x1148
>
>  #define NVC0_3D_VTX_ATTR_DEFINE 0x114c
>  #define NVC0_3D_VTX_ATTR_DEFINE_ATTR__MASK 0x00ff
> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
> index ddbb3ec16d..b2b87e01d6 100644
> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
> @@ -172,6 +172,8 @@ nvc0_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
>        return 30;
>     case PIPE_CAP_MAX_WINDOW_RECTANGLES:
>        return NVC0_MAX_WINDOW_RECTANGLES;
> +   case PIPE_CAP_MAX_CONSERVATIVE_RASTER_SUBPIXEL_PRECISION_BIAS:
> +      return class_3d>=GM200_3D_CLASS ? 8 : 0;

foo >= bar (here and elsewhere)

>
>     /* supported caps */
>     case PIPE_CAP_TEXTURE_MIRROR_CLAMP:
> @@ -263,7 +265,12 @@ nvc0_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
>     case PIPE_CAP_TGSI_VS_LAYER_VIEWPORT:
>     case PIPE_CAP_TGSI_TES_LAYER_VIEWPORT:
>     case PIPE_CAP_POST_DEPTH_COVERAGE:
> +   case PIPE_CAP_CONSERVATIVE_RASTER_POST_SNAP_TRIANGLES:
> +   case PIPE_CAP_CONSERVATIVE_RASTER_POST_SNAP_POINTS_LINES:
> +   case PIPE_CAP_CONSERVATIVE_RASTER_POST_DEPTH_COVERAGE:
>        return class_3d >= GM200_3D_CLASS;
> +   case PIPE_CAP_CONSERVATIVE_RASTER_PRE_SNAP_TRIANGLES:
> +      return class_3d >= GP100_3D_CLASS;
>     case PIPE_CAP_SEAMLESS_CUBE_MAP_PER_TEXTURE:
>     case PIPE_CAP_TGSI_BALLOT:
>     case PIPE_CAP_BINDLESS_TEXTURE:
> @@ -309,12 +316,7 @@ nvc0_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
>     case PIPE_CAP_FENCE_SIGNAL:
>     case PIPE_CAP_CONSTBUF0_FLAGS:
>     case PIPE_CAP_PACKED_UNIFORMS:
> -   case PIPE_CAP_CONSERVATIVE_RASTER_POST_SNAP_TRIANGLES:
> -   case PIPE_CAP_CONSERVATIVE_RASTER_POST_SNAP_POINTS_LINES:
> -   case PIPE_CAP_CONSERVATIVE_RASTER_PRE_SNAP_TRIANGLES:
>     case PIPE_CAP_CONSERVATIVE_RASTER_PRE_SNAP_POINTS_LINES:
> -   case PIPE_CAP_CONSERVATIVE_RASTER_POST_DEPTH_COVERAGE:
> -   case PIPE_CAP_MAX_CONSERVATIVE_RASTER_SUBPIXEL_PRECISION_BIAS:
>        return 0;
>
>     case PIPE_CAP_VENDOR_ID:
> @@ -444,6 +446,8 @@ nvc0_screen_get_shader_param(struct pipe_screen *pscreen,
>  static float
>  nvc0_screen_get_paramf(struct pipe_screen *pscreen, enum pipe_capf param)
>  {
> +   const uint16_t class_3d = nouveau_screen(pscreen)->class_3d;
> +
>     switch (param) {
>     case PIPE_CAPF_MAX_LINE_WIDTH:
>     case PIPE_CAPF_MAX_LINE_WIDTH_AA:
> @@ -457,9 +461,11 @@ nvc0_screen_get_paramf(struct pipe_screen *pscreen, enum pipe_capf param)
>     case PIPE_CAPF_MAX_TEXTURE_LOD_BIAS:
>        return 15.0f;
>     case PIPE_CAPF_MIN_CONSERVATIVE_RASTER_DILATE:
> +      return 0.0f;
>     case PIPE_CAPF_MAX_CONSERVATIVE_RASTER_DILATE:
> +      return class_3d>=GM200_3D_CLASS ? 0.75f : 0.0f;
>     case PIPE_CAPF_CONSERVATIVE_RASTER_DILATE_GRANULARITY:
> -      return 0.0f;
> +      return class_3d>=GM200_3D_CLASS
[Mesa-dev] [PATCH 4/4] nvc0: add conservative rasterization support
Subpixel precision bias, dilation and the post-snap mode are supported on GM2xx and newer. The pre-snap mode is supported for triangle primitives on GP1xx. --- src/gallium/drivers/nouveau/nvc0/nvc0_3d.xml.h | 5 + src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 18 -- src/gallium/drivers/nouveau/nvc0/nvc0_state.c | 18 ++ src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c | 5 + src/gallium/drivers/nouveau/nvc0/nvc0_stateobj.h | 2 +- 5 files changed, 41 insertions(+), 7 deletions(-) diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_3d.xml.h b/src/gallium/drivers/nouveau/nvc0/nvc0_3d.xml.h index d7245fbcae..c5456e48b5 100644 --- a/src/gallium/drivers/nouveau/nvc0/nvc0_3d.xml.h +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_3d.xml.h @@ -447,6 +447,10 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. #define NVC0_3D_VIEWPORT_TRANSLATE_Z__ESIZE0x0020 #define NVC0_3D_VIEWPORT_TRANSLATE_Z__LEN 0x0010 +#define NVC0_3D_SUBPIXEL_PRECISION(i0)(0x0a1c + 0x20*(i0)) +#define NVC0_3D_SUBPIXEL_PRECISION__ESIZE 0x0020 +#define NVC0_3D_SUBPIXEL_PRECISION__LEN 0x0010 + #define NVC0_3D_VIEWPORT_HORIZ(i0)(0x0c00 + 0x10*(i0)) #define NVC0_3D_VIEWPORT_HORIZ__ESIZE 0x0010 #define NVC0_3D_VIEWPORT_HORIZ__LEN0x0010 @@ -780,6 +784,7 @@ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
#define NVC0_3D_UNK11400x1140 #define NVC0_3D_UNK11440x1144 +#define NVC0_3D_CONSERVATIVE_RASTER0x1148 #define NVC0_3D_VTX_ATTR_DEFINE0x114c #define NVC0_3D_VTX_ATTR_DEFINE_ATTR__MASK 0x00ff diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c index ddbb3ec16d..b2b87e01d6 100644 --- a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c @@ -172,6 +172,8 @@ nvc0_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param) return 30; case PIPE_CAP_MAX_WINDOW_RECTANGLES: return NVC0_MAX_WINDOW_RECTANGLES; + case PIPE_CAP_MAX_CONSERVATIVE_RASTER_SUBPIXEL_PRECISION_BIAS: + return class_3d>=GM200_3D_CLASS ? 8 : 0; /* supported caps */ case PIPE_CAP_TEXTURE_MIRROR_CLAMP: @@ -263,7 +265,12 @@ nvc0_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param) case PIPE_CAP_TGSI_VS_LAYER_VIEWPORT: case PIPE_CAP_TGSI_TES_LAYER_VIEWPORT: case PIPE_CAP_POST_DEPTH_COVERAGE: + case PIPE_CAP_CONSERVATIVE_RASTER_POST_SNAP_TRIANGLES: + case PIPE_CAP_CONSERVATIVE_RASTER_POST_SNAP_POINTS_LINES: + case PIPE_CAP_CONSERVATIVE_RASTER_POST_DEPTH_COVERAGE: return class_3d >= GM200_3D_CLASS; + case PIPE_CAP_CONSERVATIVE_RASTER_PRE_SNAP_TRIANGLES: + return class_3d >= GP100_3D_CLASS; case PIPE_CAP_SEAMLESS_CUBE_MAP_PER_TEXTURE: case PIPE_CAP_TGSI_BALLOT: case PIPE_CAP_BINDLESS_TEXTURE: @@ -309,12 +316,7 @@ nvc0_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param) case PIPE_CAP_FENCE_SIGNAL: case PIPE_CAP_CONSTBUF0_FLAGS: case PIPE_CAP_PACKED_UNIFORMS: - case PIPE_CAP_CONSERVATIVE_RASTER_POST_SNAP_TRIANGLES: - case PIPE_CAP_CONSERVATIVE_RASTER_POST_SNAP_POINTS_LINES: - case PIPE_CAP_CONSERVATIVE_RASTER_PRE_SNAP_TRIANGLES: case PIPE_CAP_CONSERVATIVE_RASTER_PRE_SNAP_POINTS_LINES: - case PIPE_CAP_CONSERVATIVE_RASTER_POST_DEPTH_COVERAGE: - case PIPE_CAP_MAX_CONSERVATIVE_RASTER_SUBPIXEL_PRECISION_BIAS: return 0; case PIPE_CAP_VENDOR_ID: @@ -444,6 +446,8 @@ 
nvc0_screen_get_shader_param(struct pipe_screen *pscreen, static float nvc0_screen_get_paramf(struct pipe_screen *pscreen, enum pipe_capf param) { + const uint16_t class_3d = nouveau_screen(pscreen)->class_3d; + switch (param) { case PIPE_CAPF_MAX_LINE_WIDTH: case PIPE_CAPF_MAX_LINE_WIDTH_AA: @@ -457,9 +461,11 @@ nvc0_screen_get_paramf(struct pipe_screen *pscreen, enum pipe_capf param) case PIPE_CAPF_MAX_TEXTURE_LOD_BIAS: return 15.0f; case PIPE_CAPF_MIN_CONSERVATIVE_RASTER_DILATE: + return 0.0f; case PIPE_CAPF_MAX_CONSERVATIVE_RASTER_DILATE: + return class_3d>=GM200_3D_CLASS ? 0.75f : 0.0f; case PIPE_CAPF_CONSERVATIVE_RASTER_DILATE_GRANULARITY: - return 0.0f; + return class_3d>=GM200_3D_CLASS ? 0.25f : 0.0f; } NOUVEAU_ERR("unknown PIPE_CAPF %d\n", param); diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c index 99d45a238a..10c450e036 100644 --- a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c +++
[Mesa-dev] [PATCH 1/4] mesa: add support for nvidia conservative rasterization extensions
Although the specs write it against compatibility GL 4.3 and allows core profile and GLES2+, it is exposed for GL 1.0+ and GLES1 and GLES2+. --- src/mapi/glapi/gen/gl_API.xml | 47 +++ src/mapi/glapi/gen/gl_genexec.py| 1 + src/mesa/Makefile.sources | 2 + src/mesa/main/attrib.c | 60 +++--- src/mesa/main/conservativeraster.c | 138 src/mesa/main/conservativeraster.h | 48 +++ src/mesa/main/context.c | 10 +++ src/mesa/main/dlist.c | 86 src/mesa/main/enable.c | 14 src/mesa/main/extensions_table.h| 4 + src/mesa/main/get.c | 3 + src/mesa/main/get_hash_params.py| 13 +++ src/mesa/main/mtypes.h | 29 ++- src/mesa/main/tests/dispatch_sanity.cpp | 27 +++ src/mesa/main/viewport.c| 57 + src/mesa/main/viewport.h| 6 ++ src/mesa/meson.build| 2 + 17 files changed, 535 insertions(+), 12 deletions(-) create mode 100644 src/mesa/main/conservativeraster.c create mode 100644 src/mesa/main/conservativeraster.h diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml index 38c1921047..0098e6e425 100644 --- a/src/mapi/glapi/gen/gl_API.xml +++ b/src/mapi/glapi/gen/gl_API.xml @@ -12871,6 +12871,53 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + http://www.w3.org/2001/XInclude"/> diff --git a/src/mapi/glapi/gen/gl_genexec.py b/src/mapi/glapi/gen/gl_genexec.py index aaff9f230b..be8013b62b 100644 --- a/src/mapi/glapi/gen/gl_genexec.py +++ b/src/mapi/glapi/gen/gl_genexec.py @@ -62,6 +62,7 @@ header = """/** #include "main/colortab.h" #include "main/compute.h" #include "main/condrender.h" +#include "main/conservativeraster.h" #include "main/context.h" #include "main/convolve.h" #include "main/copyimage.h" diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources index 0446078136..43ec55f580 100644 --- a/src/mesa/Makefile.sources +++ b/src/mesa/Makefile.sources @@ -49,6 +49,8 @@ MAIN_FILES = \ main/condrender.c \ main/condrender.h \ main/config.h \ + main/conservativeraster.c \ + main/conservativeraster.h \ main/context.c \ 
main/context.h \ main/convolve.c \ diff --git a/src/mesa/main/attrib.c b/src/mesa/main/attrib.c index 9d3aa728a1..4790bbc036 100644 --- a/src/mesa/main/attrib.c +++ b/src/mesa/main/attrib.c @@ -138,6 +138,9 @@ struct gl_enable_attrib /* GL_ARB_framebuffer_sRGB / GL_EXT_framebuffer_sRGB */ GLboolean sRGBEnabled; + + /* GL_NV_conservative_raster */ + GLboolean ConservativeRasterization; }; @@ -178,6 +181,13 @@ struct texture_state }; +struct viewport_state +{ + struct gl_viewport_attrib ViewportArray[MAX_VIEWPORTS]; + GLuint SubpixelPrecisionBias[2]; +}; + + /** An unused GL_*_BIT value */ #define DUMMY_BIT 0x1000 @@ -394,6 +404,9 @@ _mesa_PushAttrib(GLbitfield mask) /* GL_ARB_framebuffer_sRGB / GL_EXT_framebuffer_sRGB */ attr->sRGBEnabled = ctx->Color.sRGBEnabled; + + /* GL_NV_conservative_raster */ + attr->ConservativeRasterization = ctx->ConservativeRasterization; } if (mask & GL_EVAL_BIT) { @@ -545,11 +558,23 @@ _mesa_PushAttrib(GLbitfield mask) } if (mask & GL_VIEWPORT_BIT) { - if (!push_attrib(ctx, , GL_VIEWPORT_BIT, - sizeof(struct gl_viewport_attrib) - * ctx->Const.MaxViewports, - (void*)>ViewportArray)) + struct viewport_state *viewstate = CALLOC_STRUCT(viewport_state); + if (!viewstate) { + _mesa_error(ctx, GL_OUT_OF_MEMORY, "glPushAttrib(GL_VIEWPORT_BIT)"); + goto end; + } + + if (!save_attrib_data(, GL_VIEWPORT_BIT, viewstate)) { + free(viewstate); + _mesa_error(ctx, GL_OUT_OF_MEMORY, "glPushAttrib(GL_VIEWPORT_BIT)"); goto end; + } + + memcpy(>ViewportArray, >ViewportArray, + sizeof(struct gl_viewport_attrib)*ctx->Const.MaxViewports); + + viewstate->SubpixelPrecisionBias[0] = ctx->SubpixelPrecisionBias[0]; + viewstate->SubpixelPrecisionBias[1] = ctx->SubpixelPrecisionBias[1]; } /* GL_ARB_multisample */ @@ -714,6 +739,13 @@ pop_enable_group(struct gl_context *ctx, const struct gl_enable_attrib *enable) TEST_AND_UPDATE(ctx->Color.sRGBEnabled, enable->sRGBEnabled, GL_FRAMEBUFFER_SRGB); + /* GL_NV_conservative_raster */ + if 
(ctx->Extensions.NV_conservative_raster) { + TEST_AND_UPDATE(ctx->ConservativeRasterization, +
[Mesa-dev] [PATCH 2/4] gallium: add initial support for conservative rasterization
---
 src/gallium/docs/source/cso/rasterizer.rst       | 18 ++
 src/gallium/docs/source/screen.rst               | 18 ++
 src/gallium/drivers/etnaviv/etnaviv_screen.c     | 10 ++
 src/gallium/drivers/freedreno/freedreno_screen.c | 10 ++
 src/gallium/drivers/i915/i915_screen.c           | 13 +
 src/gallium/drivers/llvmpipe/lp_screen.c         | 12
 src/gallium/drivers/nouveau/nv30/nv30_screen.c   | 10 ++
 src/gallium/drivers/nouveau/nv50/nv50_screen.c   | 10 ++
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c   | 10 ++
 src/gallium/drivers/r300/r300_screen.c           | 10 ++
 src/gallium/drivers/r600/r600_pipe.c             | 6 ++
 src/gallium/drivers/r600/r600_pipe_common.c      | 4
 src/gallium/drivers/radeonsi/si_get.c            | 10 ++
 src/gallium/drivers/softpipe/sp_screen.c         | 12
 src/gallium/drivers/svga/svga_screen.c           | 13 +
 src/gallium/drivers/swr/swr_screen.cpp           | 10 ++
 src/gallium/drivers/vc4/vc4_screen.c             | 13 -
 src/gallium/drivers/vc5/vc5_screen.c             | 13 -
 src/gallium/drivers/virgl/virgl_screen.c         | 10 ++
 src/gallium/include/pipe/p_defines.h             | 20
 src/gallium/include/pipe/p_state.h               | 6 ++
 21 files changed, 236 insertions(+), 2 deletions(-)

diff --git a/src/gallium/docs/source/cso/rasterizer.rst b/src/gallium/docs/source/cso/rasterizer.rst
index 616e4511a2..4e2d487674 100644
--- a/src/gallium/docs/source/cso/rasterizer.rst
+++ b/src/gallium/docs/source/cso/rasterizer.rst
@@ -340,3 +340,21 @@ clip_plane_enable
     If any clip distance output is written, those half-spaces for which no
     clip distance is written count as disabled; i.e. user clip planes and
     shader clip distances cannot be mixed, and clip distances take precedence.
+
+conservative_raster_mode
+    The conservative rasterization mode. For PIPE_CONSERVATIVE_RASTER_OFF,
+    conservative rasterization is disabled. For PIPE_CONSERVATIVE_RASTER_POST_SNAP
+    or PIPE_CONSERVATIVE_RASTER_PRE_SNAP, conservative rasterization is enabled.
+    When conservative rasterization is enabled, the polygon smooth, line smooth,
+    point smooth and line stipple settings are ignored.
+    With the post-snap mode, unlike the pre-snap mode, fragments are never
+    generated for degenerate primitives. Degenerate primitives, when rasterized,
+    are considered back-facing and the vertex attributes and depth are that of
+    the provoking vertex.
+    If the post-snap mode is used with an unsupported primitive, the pre-snap
+    mode is used, if supported. Behavior is similar for the pre-snap mode.
+    If the pre-snap mode is used, fragments are generated with respect to the primitive
+    before vertex snapping.
+
+conservative_raster_dilate
+    The amount of dilation during conservative rasterization.
diff --git a/src/gallium/docs/source/screen.rst b/src/gallium/docs/source/screen.rst
index 3837360fb4..5bc6ee99f0 100644
--- a/src/gallium/docs/source/screen.rst
+++ b/src/gallium/docs/source/screen.rst
@@ -420,6 +420,18 @@ The integer capabilities:
   by the driver, and the driver can throw assertion failures.
 * ``PIPE_CAP_PACKED_UNIFORMS``: True if the driver supports packed uniforms
   as opposed to padding to vec4s.
+* ``PIPE_CAP_CONSERVATIVE_RASTER_POST_SNAP_TRIANGLES``: Whether the
+  PIPE_CONSERVATIVE_RASTER_POST_SNAP mode is supported for triangles.
+* ``PIPE_CAP_CONSERVATIVE_RASTER_POST_SNAP_POINTS_LINES``: Whether the
+  PIPE_CONSERVATIVE_RASTER_POST_SNAP mode is supported for points and lines.
+* ``PIPE_CAP_CONSERVATIVE_RASTER_PRE_SNAP_TRIANGLES``: Whether the
+  PIPE_CONSERVATIVE_RASTER_PRE_SNAP mode is supported for triangles.
+* ``PIPE_CAP_CONSERVATIVE_RASTER_PRE_SNAP_POINTS_LINES``: Whether the
+  PIPE_CONSERVATIVE_RASTER_PRE_SNAP mode is supported for points and lines.
+* ``PIPE_CAP_CONSERVATIVE_RASTER_POST_DEPTH_COVERAGE``: Whether PIPE_CAP_POST_DEPTH_COVERAGE
+  works with conservative rasterization.
+* ``PIPE_CAP_MAX_CONSERVATIVE_RASTER_SUBPIXEL_PRECISION_BIAS``: The maximum
+  subpixel precision bias in bits during conservative rasterization.


 .. _pipe_capf:
@@ -437,6 +449,12 @@ The floating-point capabilities are:
   applied to anisotropically filtered textures.
 * ``PIPE_CAPF_MAX_TEXTURE_LOD_BIAS``: The maximum :term:`LOD` bias that may be applied
   to filtered textures.
+* ``PIPE_CAPF_MIN_CONSERVATIVE_RASTER_DILATE``: The minimum conservative rasterization
+  dilation.
+* ``PIPE_CAPF_MAX_CONSERVATIVE_RASTER_DILATE``: The maximum conservative rasterization
+  dilation.
+* ``PIPE_CAPF_CONSERVATIVE_RASTER_DILATE_GRANULARITY``: The conservative rasterization
+  dilation granularity for values relative to the minimum dilation.


 .. _pipe_shader_cap:
diff --git a/src/gallium/drivers/etnaviv/etnaviv_screen.c
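Editorial sketch (not part of the patch): the three PIPE_CAPF_* dilation caps documented above describe a range plus a step size measured from the minimum. One way a caller might map a requested dilation onto a representable value is shown below. The struct, the function name and the round-to-nearest policy are assumptions made for illustration; the patch only documents the caps and does not specify how rounding is done.

```c
#include <assert.h>

/* Hypothetical container for the three caps; this is not a Mesa type. */
struct dilate_caps {
   float min;         /* PIPE_CAPF_MIN_CONSERVATIVE_RASTER_DILATE */
   float max;         /* PIPE_CAPF_MAX_CONSERVATIVE_RASTER_DILATE */
   float granularity; /* PIPE_CAPF_CONSERVATIVE_RASTER_DILATE_GRANULARITY */
};

/* Clamp a requested dilation into [min, max], then snap it to the nearest
 * multiple of the granularity counted from the minimum.
 * ASSUMPTION: round-to-nearest; real hardware may round differently. */
static float
snap_dilate(const struct dilate_caps *caps, float requested)
{
   assert(caps->min <= caps->max);

   float v = requested;
   if (v < caps->min)
      v = caps->min;
   if (v > caps->max)
      v = caps->max;

   /* A granularity of zero is treated as "no quantization" in this sketch. */
   if (caps->granularity <= 0.0f)
      return v;

   /* v - min is non-negative here, so adding 0.5 and truncating rounds
    * to the nearest step. */
   int steps = (int)((v - caps->min) / caps->granularity + 0.5f);
   v = caps->min + (float)steps * caps->granularity;

   return v > caps->max ? caps->max : v;
}
```

With the values the nvc0 patch in this series reports for GM200 and newer (minimum 0.0, maximum 0.75, granularity 0.25), a request of 0.4 would land on 0.5 and a request of 2.0 would be clamped to 0.75 under these assumptions.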
[Mesa-dev] [PATCH 3/4] st/mesa: add support for nvidia conservative rasterization extensions
--- src/mesa/state_tracker/st_atom_rasterizer.c | 12 ++ src/mesa/state_tracker/st_atom_viewport.c | 4 src/mesa/state_tracker/st_context.c | 2 ++ src/mesa/state_tracker/st_extensions.c | 34 + 4 files changed, 52 insertions(+) diff --git a/src/mesa/state_tracker/st_atom_rasterizer.c b/src/mesa/state_tracker/st_atom_rasterizer.c index 1be072e6e3..451935d638 100644 --- a/src/mesa/state_tracker/st_atom_rasterizer.c +++ b/src/mesa/state_tracker/st_atom_rasterizer.c @@ -298,5 +298,17 @@ st_update_rasterizer(struct st_context *st) raster->clip_plane_enable = ctx->Transform.ClipPlanesEnabled; raster->clip_halfz = (ctx->Transform.ClipDepthMode == GL_ZERO_TO_ONE); +/* ST_NEW_RASTERIZER */ + if (ctx->ConservativeRasterization) { + if (ctx->ConservativeRasterMode == GL_CONSERVATIVE_RASTER_MODE_POST_SNAP_NV) + raster->conservative_raster_mode = PIPE_CONSERVATIVE_RASTER_POST_SNAP; + else + raster->conservative_raster_mode = PIPE_CONSERVATIVE_RASTER_PRE_SNAP; + } else { + raster->conservative_raster_mode = PIPE_CONSERVATIVE_RASTER_OFF; + } + + raster->conservative_raster_dilate = ctx->ConservativeRasterDilate; + cso_set_rasterizer(st->cso_context, raster); } diff --git a/src/mesa/state_tracker/st_atom_viewport.c b/src/mesa/state_tracker/st_atom_viewport.c index 6e3347e7cf..1accaa363b 100644 --- a/src/mesa/state_tracker/st_atom_viewport.c +++ b/src/mesa/state_tracker/st_atom_viewport.c @@ -50,9 +50,13 @@ st_update_viewport( struct st_context *st ) for (i = 0; i < st->state.num_viewports; i++) { float *scale = st->state.viewport[i].scale; float *translate = st->state.viewport[i].translate; + uint16_t* subpixel_precision = st->state.viewport[i].subpixel_precision; _mesa_get_viewport_xform(ctx, i, scale, translate); + subpixel_precision[0] = ctx->SubpixelPrecisionBias[0]; + subpixel_precision[1] = ctx->SubpixelPrecisionBias[1]; + /* _NEW_BUFFERS */ /* Drawing to a window where the coordinate system is upside down. 
*/ if (st->state.fb_orientation == Y_0_TOP) { diff --git a/src/mesa/state_tracker/st_context.c b/src/mesa/state_tracker/st_context.c index de30905dd2..0bcccdf84f 100644 --- a/src/mesa/state_tracker/st_context.c +++ b/src/mesa/state_tracker/st_context.c @@ -344,6 +344,8 @@ st_init_driver_flags(struct st_context *st) f->NewPolygonState = ST_NEW_RASTERIZER; f->NewPolygonStipple = ST_NEW_POLY_STIPPLE; f->NewViewport = ST_NEW_VIEWPORT; + f->NewNvConservativeRasterization = ST_NEW_RASTERIZER; + f->NewNvConservativeRasterizationParams = ST_NEW_RASTERIZER; } diff --git a/src/mesa/state_tracker/st_extensions.c b/src/mesa/state_tracker/st_extensions.c index bea61f21cb..02832f3951 100644 --- a/src/mesa/state_tracker/st_extensions.c +++ b/src/mesa/state_tracker/st_extensions.c @@ -494,6 +494,16 @@ void st_init_limits(struct pipe_screen *screen, c->UseSTD430AsDefaultPacking = screen->get_param(screen, PIPE_CAP_LOAD_CONSTBUF); + c->MaxSubpixelPrecisionBiasBits = + screen->get_param(screen, PIPE_CAP_MAX_CONSERVATIVE_RASTER_SUBPIXEL_PRECISION_BIAS); + + c->ConservativeRasterDilateRange[0] = + screen->get_paramf(screen, PIPE_CAPF_MIN_CONSERVATIVE_RASTER_DILATE); + c->ConservativeRasterDilateRange[1] = + screen->get_paramf(screen, PIPE_CAPF_MAX_CONSERVATIVE_RASTER_DILATE); + c->ConservativeRasterDilateGranularity = + screen->get_paramf(screen, PIPE_CAPF_CONSERVATIVE_RASTER_DILATE_GRANULARITY); + /* limit the max combined shader output resources to a driver limit */ temp = screen->get_param(screen, PIPE_CAP_MAX_COMBINED_SHADER_OUTPUT_RESOURCES); if (temp > 0 && c->MaxCombinedShaderOutputResources > temp) @@ -1363,4 +1373,28 @@ void st_init_extensions(struct pipe_screen *screen, extensions->ARB_texture_cube_map_array && extensions->ARB_texture_stencil8 && extensions->ARB_texture_multisample; + + if (screen->get_param(screen, PIPE_CAP_CONSERVATIVE_RASTER_POST_SNAP_TRIANGLES) && + screen->get_param(screen, PIPE_CAP_CONSERVATIVE_RASTER_POST_SNAP_POINTS_LINES) && + 
screen->get_param(screen, PIPE_CAP_CONSERVATIVE_RASTER_POST_DEPTH_COVERAGE)) { + float max_dilate; + bool pre_snap_triangles, pre_snap_points_lines; + + max_dilate = screen->get_paramf(screen, PIPE_CAPF_MAX_CONSERVATIVE_RASTER_DILATE); + + pre_snap_triangles = + screen->get_param(screen, PIPE_CAP_CONSERVATIVE_RASTER_PRE_SNAP_TRIANGLES); + pre_snap_points_lines = + screen->get_param(screen, PIPE_CAP_CONSERVATIVE_RASTER_PRE_SNAP_POINTS_LINES); + + extensions->NV_conservative_raster = + screen->get_param(screen, PIPE_CAP_MAX_CONSERVATIVE_RASTER_SUBPIXEL_PRECISION_BIAS) > 1; + + if (extensions->NV_conservative_raster) { + extensions->NV_conservative_raster_dilate = max_dilate>=0.75; +
[Mesa-dev] [PATCH 0/4] Implement Various Conservative Rasterization Extensions
This patch-set adds support for GL_NV_conservative_raster and GL_NV_conservative_raster_dilate on GM2xx and newer. It also adds support for GL_NV_conservative_raster_pre_snap_triangles on GP1xx.

In doing so, it implements various functions in mesa core, extends the Gallium API, connects the new mesa core functions and the Gallium API through st/mesa and implements support for the Gallium API in the Nouveau driver.

pendingchaos (4):
  mesa: add support for nvidia conservative rasterization extensions
  gallium: add initial support for conservative rasterization
  st/mesa: add support for nvidia conservative rasterization extensions
  nvc0: add conservative rasterization support

 src/gallium/docs/source/cso/rasterizer.rst       | 18 +++
 src/gallium/docs/source/screen.rst               | 18 +++
 src/gallium/drivers/etnaviv/etnaviv_screen.c     | 10 ++
 src/gallium/drivers/freedreno/freedreno_screen.c | 10 ++
 src/gallium/drivers/i915/i915_screen.c           | 13 ++
 src/gallium/drivers/llvmpipe/lp_screen.c         | 12 ++
 src/gallium/drivers/nouveau/nv30/nv30_screen.c   | 10 ++
 src/gallium/drivers/nouveau/nv50/nv50_screen.c   | 10 ++
 src/gallium/drivers/nouveau/nvc0/nvc0_3d.xml.h   | 5 +
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c   | 16 +++
 src/gallium/drivers/nouveau/nvc0/nvc0_state.c    | 18 +++
 .../drivers/nouveau/nvc0/nvc0_state_validate.c   | 5 +
 src/gallium/drivers/nouveau/nvc0/nvc0_stateobj.h | 2 +-
 src/gallium/drivers/r300/r300_screen.c           | 10 ++
 src/gallium/drivers/r600/r600_pipe.c             | 6 +
 src/gallium/drivers/r600/r600_pipe_common.c      | 4 +
 src/gallium/drivers/radeonsi/si_get.c            | 10 ++
 src/gallium/drivers/softpipe/sp_screen.c         | 12 ++
 src/gallium/drivers/svga/svga_screen.c           | 13 ++
 src/gallium/drivers/swr/swr_screen.cpp           | 10 ++
 src/gallium/drivers/vc4/vc4_screen.c             | 13 +-
 src/gallium/drivers/vc5/vc5_screen.c             | 13 +-
 src/gallium/drivers/virgl/virgl_screen.c         | 10 ++
 src/gallium/include/pipe/p_defines.h             | 20 +++
 src/gallium/include/pipe/p_state.h               | 6 +
 src/mapi/glapi/gen/gl_API.xml                    | 47 +++
 src/mapi/glapi/gen/gl_genexec.py                 | 1 +
 src/mesa/Makefile.sources                        | 2 +
 src/mesa/main/attrib.c                           | 60 +++--
 src/mesa/main/conservativeraster.c               | 138 +
 src/mesa/main/conservativeraster.h               | 48 +++
 src/mesa/main/context.c                          | 10 ++
 src/mesa/main/dlist.c                            | 86 +
 src/mesa/main/enable.c                           | 14 +++
 src/mesa/main/extensions_table.h                 | 4 +
 src/mesa/main/get.c                              | 3 +
 src/mesa/main/get_hash_params.py                 | 13 ++
 src/mesa/main/mtypes.h                           | 29 -
 src/mesa/main/tests/dispatch_sanity.cpp          | 27
 src/mesa/main/viewport.c                         | 57 +
 src/mesa/main/viewport.h                         | 6 +
 src/mesa/meson.build                             | 2 +
 src/mesa/state_tracker/st_atom_rasterizer.c      | 12 ++
 src/mesa/state_tracker/st_atom_viewport.c        | 4 +
 src/mesa/state_tracker/st_context.c              | 2 +
 src/mesa/state_tracker/st_extensions.c           | 34 +
 46 files changed, 858 insertions(+), 15 deletions(-)
 create mode 100644 src/mesa/main/conservativeraster.c
 create mode 100644 src/mesa/main/conservativeraster.h

--
2.14.3
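Editorial sketch (not part of the series): patch 3/4 decides which GL extensions to expose from the Gallium caps a driver reports. The standalone function below mirrors the part of that logic that is visible in the archived st_extensions.c hunk: all three post-snap caps are required, the subpixel precision bias limit must exceed 1, and the dilate extension needs a maximum dilation of at least 0.75. The struct and function names are hypothetical, and the pre-snap-triangles rule is an assumption, because the archived hunk is cut off before that point.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of what a driver reports; this is not a Mesa type. */
struct raster_caps {
   bool post_snap_triangles;     /* PIPE_CAP_CONSERVATIVE_RASTER_POST_SNAP_TRIANGLES */
   bool post_snap_points_lines;  /* PIPE_CAP_CONSERVATIVE_RASTER_POST_SNAP_POINTS_LINES */
   bool post_depth_coverage;     /* PIPE_CAP_CONSERVATIVE_RASTER_POST_DEPTH_COVERAGE */
   bool pre_snap_triangles;      /* PIPE_CAP_CONSERVATIVE_RASTER_PRE_SNAP_TRIANGLES */
   int max_subpixel_bias;        /* PIPE_CAP_MAX_CONSERVATIVE_RASTER_SUBPIXEL_PRECISION_BIAS */
   float max_dilate;             /* PIPE_CAPF_MAX_CONSERVATIVE_RASTER_DILATE */
};

/* Hypothetical result record, one flag per GL extension in the cover letter. */
struct raster_exts {
   bool conservative_raster;
   bool conservative_raster_dilate;
   bool pre_snap_triangles;
};

static struct raster_exts
decide_extensions(const struct raster_caps *caps)
{
   struct raster_exts ext = { false, false, false };

   assert(caps);

   /* All three post-snap caps are required before anything is exposed. */
   if (!(caps->post_snap_triangles &&
         caps->post_snap_points_lines &&
         caps->post_depth_coverage))
      return ext;

   ext.conservative_raster = caps->max_subpixel_bias > 1;

   if (ext.conservative_raster) {
      ext.conservative_raster_dilate = caps->max_dilate >= 0.75f;

      /* ASSUMPTION: the archived patch is truncated here; tying the
       * pre-snap extension directly to its cap is a guess. */
      ext.pre_snap_triangles = caps->pre_snap_triangles;
   }

   return ext;
}
```

Fed with the values patch 4/4 reports for GM200-class hardware (post-snap caps true, bias limit 8, maximum dilation 0.75, no pre-snap triangles), this sketch would expose the base and dilate extensions but not the pre-snap one, which matches the GM2xx/GP1xx split described in the cover letter.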
Re: [Mesa-dev] [PATCH] clover/llvm: Fix build against LLVM/Clang 4.0
Aaron Watry writes:

> On Wed, Mar 21, 2018, 4:49 PM Francisco Jerez wrote:
>
>> Aaron Watry writes:
>>
>> > The opencl 1.0 langstandard was renamed in 5.0+
>> >
>> > Cc: Mark Janes
>> > ---
>> >  src/gallium/state_trackers/clover/llvm/invocation.cpp | 4
>> >  1 file changed, 4 insertions(+)
>> >
>> > diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp b/src/gallium/state_trackers/clover/llvm/invocation.cpp
>> > index af78c2ae28..2fb3ce2365 100644
>> > --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
>> > +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
>> > @@ -85,7 +85,11 @@ namespace {
>> >     };
>> >
>> >     const clc_version_lang_std cl_version_lang_stds[] = {
>> > +#if HAVE_LLVM >= 0x0500
>> >        { 100, clang::LangStandard::lang_opencl10},
>> > +#else
>> > +      { 100, clang::LangStandard::lang_opencl},
>> > +#endif
>>
>> Please move this preprocessor magic into an llvm/compat.hpp definition.
>> Thanks!
>>
>
> Sure thing. Do you want to see a v2?
>

I wouldn't mind.

> --Aaron
>
>
>> >        { 110, clang::LangStandard::lang_opencl11},
>> >        { 120, clang::LangStandard::lang_opencl12},
>> >        { 200, clang::LangStandard::lang_opencl20},
>> > --
>> > 2.14.1
>> >
>> > ___
>> > mesa-dev mailing list
>> > mesa-dev@lists.freedesktop.org
>> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>
Re: [Mesa-dev] [PATCH] clover/llvm: Fix build against LLVM/Clang 4.0
On Wed, Mar 21, 2018, 4:49 PM Francisco Jerez wrote:

> Aaron Watry writes:
>
> > The opencl 1.0 langstandard was renamed in 5.0+
> >
> > Cc: Mark Janes
> > ---
> >  src/gallium/state_trackers/clover/llvm/invocation.cpp | 4
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> > index af78c2ae28..2fb3ce2365 100644
> > --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
> > +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> > @@ -85,7 +85,11 @@ namespace {
> >     };
> >
> >     const clc_version_lang_std cl_version_lang_stds[] = {
> > +#if HAVE_LLVM >= 0x0500
> >        { 100, clang::LangStandard::lang_opencl10},
> > +#else
> > +      { 100, clang::LangStandard::lang_opencl},
> > +#endif
>
> Please move this preprocessor magic into an llvm/compat.hpp definition.
> Thanks!
>

Sure thing. Do you want to see a v2?

--Aaron

> >        { 110, clang::LangStandard::lang_opencl11},
> >        { 120, clang::LangStandard::lang_opencl12},
> >        { 200, clang::LangStandard::lang_opencl20},
> > --
> > 2.14.1
> >
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
Re: [Mesa-dev] [PATCH 05/11] intel/compiler: Use null destination register for memory fence messages
Matt Turner writes:

> On Wed, Mar 21, 2018 at 2:56 PM, Francisco Jerez wrote:
>> Matt Turner writes:
>>
>>> From Message Descriptor section in gfxspecs:
>>>
>>>   "Memory fence messages without Commit Enable set do not return
>>>    anything to the thread (response length is 0 and destination
>>>    register is null)."
>>>
>>> This fixes a GPU hang in simulation in the piglit test
>>> arb_shader_image_load_store-shader-mem-barrier
>>>
>>
>> On what platform?
>
> I'm pretty sure Anuj found this in ICL. I'll revert this patch from my
> branch and try to confirm.

That sounds pretty bogus, this patch cannot possibly have any effect on
ICL because brw_memory_fence() is already setting commit_enable
unconditionally on Gen10+, so the destination register won't ever be
null regardless.
Re: [Mesa-dev] [PATCH 05/11] intel/compiler: Use null destination register for memory fence messages
On Wed, Mar 21, 2018 at 2:56 PM, Francisco Jerez wrote:
> Matt Turner writes:
>
>> From Message Descriptor section in gfxspecs:
>>
>>   "Memory fence messages without Commit Enable set do not return
>>    anything to the thread (response length is 0 and destination
>>    register is null)."
>>
>> This fixes a GPU hang in simulation in the piglit test
>> arb_shader_image_load_store-shader-mem-barrier
>>
>
> On what platform?

I'm pretty sure Anuj found this in ICL. I'll revert this patch from my
branch and try to confirm.
Re: [Mesa-dev] [PATCH 05/11] intel/compiler: Use null destination register for memory fence messages
Matt Turner writes:

> From Message Descriptor section in gfxspecs:
>
>   "Memory fence messages without Commit Enable set do not return
>    anything to the thread (response length is 0 and destination
>    register is null)."
>
> This fixes a GPU hang in simulation in the piglit test
> arb_shader_image_load_store-shader-mem-barrier
>

On what platform?

> The mem fence message doesn't send any data, and previously we were
> setting the SEND's src0 to the same register as the destination. I've
> kept that behavior, so src0 will now be the null register in a number of
> cases, necessitating a few changes in the EU validator. The simulator
> and real hardware seem to be okay with this.
> ---
>  src/intel/compiler/brw_eu_emit.c        | 4 ++--
>  src/intel/compiler/brw_eu_validate.c    | 13 +++--
>  src/intel/compiler/brw_fs_nir.cpp       | 14 +++---
>  src/intel/compiler/test_eu_validate.cpp | 9 +
>  4 files changed, 33 insertions(+), 7 deletions(-)
>
> diff --git a/src/intel/compiler/brw_eu_emit.c b/src/intel/compiler/brw_eu_emit.c
> index f039af56d05..fe7fa8723e1 100644
> --- a/src/intel/compiler/brw_eu_emit.c
> +++ b/src/intel/compiler/brw_eu_emit.c
> @@ -3289,8 +3289,8 @@ brw_memory_fence(struct brw_codegen *p,
>  {
>     const struct gen_device_info *devinfo = p->devinfo;
>     const bool commit_enable =
> -      devinfo->gen >= 10 || /* HSD ES # 1404612949 */
> -      (devinfo->gen == 7 && !devinfo->is_haswell);
> +      !(dst.file == BRW_ARCHITECTURE_REGISTER_FILE &&
> +        dst.nr == BRW_ARF_NULL);
>     struct brw_inst *insn;
>
>     brw_push_insn_state(p);
> diff --git a/src/intel/compiler/brw_eu_validate.c b/src/intel/compiler/brw_eu_validate.c
> index d3189d1ef5e..e16dfc3aaf3 100644
> --- a/src/intel/compiler/brw_eu_validate.c
> +++ b/src/intel/compiler/brw_eu_validate.c
> @@ -168,6 +168,14 @@ src1_has_scalar_region(const struct gen_device_info *devinfo, const brw_inst *in
>            brw_inst_src1_hstride(devinfo, inst) == BRW_HORIZONTAL_STRIDE_0;
>  }
>
> +static bool
> +is_mfence(const struct gen_device_info *devinfo, const brw_inst *inst)
> +{
> +   return brw_inst_opcode(devinfo, inst) == BRW_OPCODE_SEND &&
> +          brw_inst_sfid(devinfo, inst) == GEN7_SFID_DATAPORT_DATA_CACHE &&
> +          brw_inst_dp_msg_type(devinfo, inst) == GEN7_DATAPORT_DC_MEMORY_FENCE;
> +}
> +
>  static unsigned
>  num_sources_from_inst(const struct gen_device_info *devinfo,
>                        const brw_inst *inst)
> @@ -236,7 +244,7 @@ sources_not_null(const struct gen_device_info *devinfo,
>     if (num_sources == 3)
>        return (struct string){};
>
> -   if (num_sources >= 1)
> +   if (num_sources >= 1 && !is_mfence(devinfo, inst))
>        ERROR_IF(src0_is_null(devinfo, inst), "src0 is null");
>
>     if (num_sources == 2)
> @@ -256,7 +264,8 @@ send_restrictions(const struct gen_device_info *devinfo,
>                 "send must use direct addressing");
>
>        if (devinfo->gen >= 7) {
> -         ERROR_IF(!src0_is_grf(devinfo, inst), "send from non-GRF");
> +         ERROR_IF(!src0_is_grf(devinfo, inst) && !is_mfence(devinfo, inst),
> +                  "send from non-GRF");
>           ERROR_IF(brw_inst_eot(devinfo, inst) &&
>                    brw_inst_src0_da_reg_nr(devinfo, inst) < 112,
>                    "send with EOT must use g112-g127");
> diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp
> index dbd2105f7e9..063f0256829 100644
> --- a/src/intel/compiler/brw_fs_nir.cpp
> +++ b/src/intel/compiler/brw_fs_nir.cpp
> @@ -3859,9 +3859,17 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
>     case nir_intrinsic_memory_barrier_image:
>     case nir_intrinsic_memory_barrier: {
>        const fs_builder ubld = bld.group(8, 0);
> -      const fs_reg tmp = ubld.vgrf(BRW_REGISTER_TYPE_UD, 2);
> -      ubld.emit(SHADER_OPCODE_MEMORY_FENCE, tmp)
> -         ->size_written = 2 * REG_SIZE;
> +      if (devinfo->gen == 7 && !devinfo->is_haswell) {
> +         const fs_reg tmp = ubld.vgrf(BRW_REGISTER_TYPE_UD, 2);
> +         ubld.emit(SHADER_OPCODE_MEMORY_FENCE, tmp)
> +            ->size_written = 2 * REG_SIZE;
> +      } else {
> +         const fs_reg tmp =
> +            /* HSD ES #1404612949 */
> +            devinfo->gen >= 10 ? ubld.vgrf(BRW_REGISTER_TYPE_UD)
> +                               : bld.null_reg_d();
> +         ubld.emit(SHADER_OPCODE_MEMORY_FENCE, tmp);
> +      }
>        break;
>     }
>
> diff --git a/src/intel/compiler/test_eu_validate.cpp b/src/intel/compiler/test_eu_validate.cpp
> index 161db994b2b..8169f951b2d 100644
> --- a/src/intel/compiler/test_eu_validate.cpp
> +++ b/src/intel/compiler/test_eu_validate.cpp
> @@ -168,6 +168,15 @@ TEST_P(validation_test, math_src1_null_reg)
>     }
>  }
>
> +TEST_P(validation_test, mfence_src0_null_reg)
> +{
> +   /* On HSW+ mfence's src0 is the null register */
> +   if (devinfo.gen >=
[Mesa-dev] [PATCH] gallium: Do not add -Wframe-address option for gcc <= 4.4.
This patch fixes these build errors with GCC 4.4.

  Compiling src/gallium/auxiliary/util/u_debug_stack.c ...
src/gallium/auxiliary/util/u_debug_stack.c: In function ‘debug_backtrace_capture’:
src/gallium/auxiliary/util/u_debug_stack.c:268: error: #pragma GCC diagnostic not allowed inside functions
src/gallium/auxiliary/util/u_debug_stack.c:269: error: #pragma GCC diagnostic not allowed inside functions
src/gallium/auxiliary/util/u_debug_stack.c:271: error: #pragma GCC diagnostic not allowed inside functions

Fixes: 370e356ebab4 ("gallium: silence __builtin_frame_address nonzero argument is unsafe warning")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105529
Signed-off-by: Vinson Lee
---
 src/gallium/auxiliary/util/u_debug_stack.c | 2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_debug_stack.c b/src/gallium/auxiliary/util/u_debug_stack.c
index 974e639..846f648 100644
--- a/src/gallium/auxiliary/util/u_debug_stack.c
+++ b/src/gallium/auxiliary/util/u_debug_stack.c
@@ -264,7 +264,7 @@ debug_backtrace_capture(struct debug_stack_frame *backtrace,
    }
 #endif

-#if defined(PIPE_CC_GCC)
+#if defined(PIPE_CC_GCC) && (PIPE_CC_GCC_VERSION > 404) || defined(__clang__)
 #pragma GCC diagnostic push
 #pragma GCC diagnostic ignored "-Wframe-address"
    frame_pointer = ((const void **)__builtin_frame_address(1));
--
1.7.1
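For readers who want to try the version gate outside of Mesa, here is a minimal sketch of the same idea. It assumes a GCC or Clang compiler. The SKETCH_* macros and sketch_frame_pointer() are hypothetical names, not Mesa's; SKETCH_GCC_VERSION only imitates the major * 100 + minor convention that the 404 comparison in the patch implies. It asks for frame level 0 rather than 1 so that it is safe to run anywhere, which means the suppressed warning would not actually fire here.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a PIPE_CC_GCC_VERSION-style macro:
// major * 100 + minor, or 0 when the compiler is not plain GCC.
#if defined(__GNUC__) && !defined(__clang__)
#define SKETCH_GCC_VERSION (__GNUC__ * 100 + __GNUC_MINOR__)
#else
#define SKETCH_GCC_VERSION 0
#endif

// Only use in-function diagnostic pragmas on GCC newer than 4.4 or on Clang,
// mirroring the condition in the patch.
#if (SKETCH_GCC_VERSION > 404) || defined(__clang__)
#define SKETCH_HAVE_DIAGNOSTIC_PRAGMA 1
#else
#define SKETCH_HAVE_DIAGNOSTIC_PRAGMA 0
#endif

static const void *
sketch_frame_pointer(void)
{
   const void *fp;
#if SKETCH_HAVE_DIAGNOSTIC_PRAGMA
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wframe-address"
#endif
   /* Level 0 (this function's own frame) keeps the sketch safe to run. */
   fp = __builtin_frame_address(0);
#if SKETCH_HAVE_DIAGNOSTIC_PRAGMA
#pragma GCC diagnostic pop
#endif
   return fp;
}
```

On a compiler that fails the gate, the pragmas simply disappear and the builtin is called without any warning suppression.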
Re: [Mesa-dev] [PATCH 08/11] intel: Disable fast color clear on icl
On Wed, Mar 21, 2018 at 2:52 PM, Kenneth Graunke wrote:
> On Wednesday, March 21, 2018 2:06:19 PM PDT Matt Turner wrote:
>> From: Anuj Phogat
>>
>> Disabling fast color clear makes fbo-clearmipmap test render correct
>> texture in base miplevel. Fast color clear is anyways disabled for
>> non-base miplevels.
>> ---
>>  src/mesa/drivers/dri/i965/brw_blorp.c | 4
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c b/src/mesa/drivers/dri/i965/brw_blorp.c
>> index 72578b6ea5c..bee8e409897 100644
>> --- a/src/mesa/drivers/dri/i965/brw_blorp.c
>> +++ b/src/mesa/drivers/dri/i965/brw_blorp.c
>> @@ -1228,6 +1228,10 @@ do_single_blorp_clear(struct brw_context *brw, struct gl_framebuffer *fb,
>>        }
>>     }
>>
>> +   /* FINISHME: Debug and enable fast clears */
>> +   if (devinfo->gen >= 11)
>> +      can_fast_clear = false;
>> +
>>     if (can_fast_clear) {
>>        const enum isl_aux_state aux_state =
>>           intel_miptree_get_aux_state(irb->mt, irb->mt_level, irb->mt_layer);
>>
>
> Not very enthused about this, it's fine if we need to do this for now,
> but we should at least make sure we're tracking the task somewhere.

Yeah. Just filed as MD5-422.
Re: [Mesa-dev] [PATCH 05/11] intel/compiler: Use null destination register for memory fence messages
On Wednesday, March 21, 2018 2:06:16 PM PDT Matt Turner wrote:
> From Message Descriptor section in gfxspecs:
>
>   "Memory fence messages without Commit Enable set do not return
>    anything to the thread (response length is 0 and destination
>    register is null)."
>
> This fixes a GPU hang in simulation in the piglit test
> arb_shader_image_load_store-shader-mem-barrier
>
> The mem fence message doesn't send any data, and previously we were
> setting the SEND's src0 to the same register as the destination. I've
> kept that behavior, so src0 will now be the null register in a number of
> cases, necessitating a few changes in the EU validator. The simulator
> and real hardware seem to be okay with this.
> ---
>  src/intel/compiler/brw_eu_emit.c        | 4 ++--
>  src/intel/compiler/brw_eu_validate.c    | 13 +++--
>  src/intel/compiler/brw_fs_nir.cpp       | 14 +++---
>  src/intel/compiler/test_eu_validate.cpp | 9 +
>  4 files changed, 33 insertions(+), 7 deletions(-)

NAK on using NULL registers as SEND message sources.  It won't end well.
Re: [Mesa-dev] [PATCH 10/11] intel/compiler: Skip 64-bit type tests when types not available
On Wednesday, March 21, 2018 2:06:21 PM PDT Matt Turner wrote:
> ---
>  src/intel/compiler/test_eu_validate.cpp | 39 +
>  1 file changed, 39 insertions(+)

I'd be tempted to write this as

   !devinfo.has_64bit_types &&
   type_sz(inst[i].dst_type) == 8 &&
   inst[i].dst_type != BRW_REGISTER_TYPE_NF

or omit that last part if NF isn't a consideration.  Or at least make
a helper function so as not to repeat so much.  But it's up to you.

Either way, this patch is:
Reviewed-by: Kenneth Graunke

> diff --git a/src/intel/compiler/test_eu_validate.cpp b/src/intel/compiler/test_eu_validate.cpp
> index 8169f951b2d..e36f50a2d7e 100644
> --- a/src/intel/compiler/test_eu_validate.cpp
> +++ b/src/intel/compiler/test_eu_validate.cpp
> @@ -1075,6 +1075,15 @@ TEST_P(validation_test, qword_low_power_align1_regioning_restrictions)
>        return;
>
>     for (unsigned i = 0; i < sizeof(inst) / sizeof(inst[0]); i++) {
> +      if (!devinfo.has_64bit_types &&
> +          (inst[i].dst_type == BRW_REGISTER_TYPE_DF ||
> +           inst[i].dst_type == BRW_REGISTER_TYPE_UQ ||
> +           inst[i].dst_type == BRW_REGISTER_TYPE_Q ||
> +           inst[i].src_type == BRW_REGISTER_TYPE_DF ||
> +           inst[i].src_type == BRW_REGISTER_TYPE_UQ ||
> +           inst[i].src_type == BRW_REGISTER_TYPE_Q))
> +         continue;
> +
>        if (inst[i].opcode == BRW_OPCODE_MOV) {
>           brw_MOV(p, retype(g0, inst[i].dst_type),
>                      retype(g0, inst[i].src_type));
> @@ -1195,6 +1204,15 @@ TEST_P(validation_test, qword_low_power_no_indirect_addressing)
>        return;
>
>     for (unsigned i = 0; i < sizeof(inst) / sizeof(inst[0]); i++) {
> +      if (!devinfo.has_64bit_types &&
> +          (inst[i].dst_type == BRW_REGISTER_TYPE_DF ||
> +           inst[i].dst_type == BRW_REGISTER_TYPE_UQ ||
> +           inst[i].dst_type == BRW_REGISTER_TYPE_Q ||
> +           inst[i].src_type == BRW_REGISTER_TYPE_DF ||
> +           inst[i].src_type == BRW_REGISTER_TYPE_UQ ||
> +           inst[i].src_type == BRW_REGISTER_TYPE_Q))
> +         continue;
> +
>        if (inst[i].opcode == BRW_OPCODE_MOV) {
>           brw_MOV(p, retype(g0, inst[i].dst_type),
>                      retype(g0, inst[i].src_type));
> @@ -1331,6 +1349,15 @@ TEST_P(validation_test, qword_low_power_no_64bit_arf)
>        return;
>
>     for (unsigned i = 0; i < sizeof(inst) / sizeof(inst[0]); i++) {
> +      if (!devinfo.has_64bit_types &&
> +          (inst[i].dst_type == BRW_REGISTER_TYPE_DF ||
> +           inst[i].dst_type == BRW_REGISTER_TYPE_UQ ||
> +           inst[i].dst_type == BRW_REGISTER_TYPE_Q ||
> +           inst[i].src_type == BRW_REGISTER_TYPE_DF ||
> +           inst[i].src_type == BRW_REGISTER_TYPE_UQ ||
> +           inst[i].src_type == BRW_REGISTER_TYPE_Q))
> +         continue;
> +
>        if (inst[i].opcode == BRW_OPCODE_MOV) {
>           brw_MOV(p, retype(inst[i].dst, inst[i].dst_type),
>                      retype(inst[i].src, inst[i].src_type));
> @@ -1359,6 +1386,9 @@ TEST_P(validation_test, qword_low_power_no_64bit_arf)
>        clear_instructions(p);
>     }
>
> +   if (!devinfo.has_64bit_types)
> +      return;
> +
>     /* MAC implicitly reads the accumulator */
>     brw_MAC(p, retype(g0, BRW_REGISTER_TYPE_DF),
>                retype(stride(g0, 4, 4, 1), BRW_REGISTER_TYPE_DF),
> @@ -1529,6 +1559,15 @@ TEST_P(validation_test, qword_low_power_no_depctrl)
>        return;
>
>     for (unsigned i = 0; i < sizeof(inst) / sizeof(inst[0]); i++) {
> +      if (!devinfo.has_64bit_types &&
> +          (inst[i].dst_type == BRW_REGISTER_TYPE_DF ||
> +           inst[i].dst_type == BRW_REGISTER_TYPE_UQ ||
> +           inst[i].dst_type == BRW_REGISTER_TYPE_Q ||
> +           inst[i].src_type == BRW_REGISTER_TYPE_DF ||
> +           inst[i].src_type == BRW_REGISTER_TYPE_UQ ||
> +           inst[i].src_type == BRW_REGISTER_TYPE_Q))
> +         continue;
> +
>        if (inst[i].opcode == BRW_OPCODE_MOV) {
>           brw_MOV(p, retype(g0, inst[i].dst_type),
>                      retype(g0, inst[i].src_type));
>
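The helper function suggested in the review could look roughly like the sketch below. Everything prefixed with sketch_ is a hypothetical stand-in so that the example compiles without Mesa headers: the enum imitates a few BRW_REGISTER_TYPE_* values, sketch_type_sz() imitates type_sz(), and treating NF as an 8-byte type is an assumption made only for this illustration. Unlike the one-line form in the review, this version checks both the destination and the source type, as the patch does.

```cpp
#include <cassert>

// Stand-ins for a handful of BRW_REGISTER_TYPE_* values (hypothetical names).
enum sketch_reg_type {
   SKETCH_TYPE_DF,
   SKETCH_TYPE_UQ,
   SKETCH_TYPE_Q,
   SKETCH_TYPE_NF,
   SKETCH_TYPE_F,
   SKETCH_TYPE_D
};

// Stand-in for type_sz(): size of a register type in bytes.
// Assumption for this sketch: NF is reported as 8 bytes.
static unsigned
sketch_type_sz(sketch_reg_type type)
{
   switch (type) {
   case SKETCH_TYPE_DF:
   case SKETCH_TYPE_UQ:
   case SKETCH_TYPE_Q:
   case SKETCH_TYPE_NF:
      return 8;
   default:
      return 4;
   }
}

// Stand-in for the one gen_device_info field the check needs.
struct sketch_devinfo {
   bool has_64bit_types;
};

// True for DF, UQ and Q; NF is excluded explicitly, as in the review comment.
static bool
sketch_is_64bit_type(sketch_reg_type type)
{
   return sketch_type_sz(type) == 8 && type != SKETCH_TYPE_NF;
}

// One place for the condition that the patch repeats in four test loops.
static bool
sketch_skip_64bit_case(const sketch_devinfo &devinfo,
                       sketch_reg_type dst_type, sketch_reg_type src_type)
{
   return !devinfo.has_64bit_types &&
          (sketch_is_64bit_type(dst_type) || sketch_is_64bit_type(src_type));
}
```

Each loop body would then start with a single `if (sketch_skip_64bit_case(devinfo, inst[i].dst_type, inst[i].src_type)) continue;` instead of the seven-line condition.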
Re: [Mesa-dev] [PATCH 06/11] intel/compiler/icl: Clear "null render target" bit in extended message descriptor
On Wednesday, March 21, 2018 2:06:17 PM PDT Matt Turner wrote:
> From: Jason Ekstrand
>
> Otherwise all our render target writes go no where.
> ---
>  src/intel/compiler/brw_eu_emit.c | 3 +++
>  src/intel/compiler/brw_inst.h    | 3 +++
>  2 files changed, 6 insertions(+)
>
> diff --git a/src/intel/compiler/brw_eu_emit.c b/src/intel/compiler/brw_eu_emit.c
> index fe7fa8723e1..99c09e6f541 100644
> --- a/src/intel/compiler/brw_eu_emit.c
> +++ b/src/intel/compiler/brw_eu_emit.c
> @@ -536,6 +536,9 @@ brw_set_dp_write_message(struct brw_codegen *p,
>     if (devinfo->gen < 7) {
>        brw_inst_set_dp_write_commit(devinfo, insn, send_commit_msg);
>     }
> +
> +   if (devinfo->gen >= 11)
> +      brw_inst_set_null_rt(devinfo, insn, false);
>  }
>
>  void
> diff --git a/src/intel/compiler/brw_inst.h b/src/intel/compiler/brw_inst.h
> index e6998973b64..b569e0e41b7 100644
> --- a/src/intel/compiler/brw_inst.h
> +++ b/src/intel/compiler/brw_inst.h
> @@ -505,6 +505,9 @@ FF(sfid,
>     /* 6: */ 27, 24,
>     /* 7: */ 27, 24,
>     /* 8: */ 27, 24)
> +FF(null_rt,
> +   /* 4-7: */ -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
> +   /* 8: */ 80, 80)
>  FC(base_mrf, 27, 24, devinfo->gen < 6);
>  /** @} */
>
>

This bit is new on Gen11, so letting it slop through on Gen8+ seems
pretty lame.  But, updating all the macros to support Gen11+ only bits
also seems pretty lame.  Given that Curro's already reworking all the
message descriptor stuff, this will probably do for now.

With a comment saying something like /* actually only Gen11+ */,
Reviewed-by: Kenneth Graunke
Re: [Mesa-dev] [PATCH 07/11] intel/compiler/icl: Set the condition for dependency control on gen11+
On Wed, Mar 21, 2018 at 2:51 PM, Kenneth Graunke wrote:
> On Wednesday, March 21, 2018 2:06:18 PM PDT Matt Turner wrote:
>> From: Anuj Phogat
>>
>> When source or destination datatype is 64b or operation is integer
>> DWord multiply, DepCtrl must not be used.
>> We had this restriction on few previous intel platforms. It has been
>> brought back on Gen11+.
>> ---
>>  src/intel/compiler/brw_vec4.cpp | 8 ++--
>>  1 file changed, 6 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp
>> index e4838146ac1..bb668b2538a 100644
>> --- a/src/intel/compiler/brw_vec4.cpp
>> +++ b/src/intel/compiler/brw_vec4.cpp
>> @@ -984,15 +984,19 @@ vec4_visitor::is_dep_ctrl_unsafe(const vec4_instruction *inst)
>>      * SKL PRMs don't include this restriction, however, gen7 seems to be
>>      * affected, at least by the 64b restriction, since DepCtrl with double
>>      * precision instructions seems to produce GPU hangs in some cases.
>> +    *
>> +    * This restriction is back in ICL+ platforms.
>>      */
>> -   if (devinfo->gen == 8 || gen_device_info_is_9lp(devinfo)) {
>> +   if (devinfo->gen == 8 ||
>> +       gen_device_info_is_9lp(devinfo) ||
>> +       devinfo->gen >= 11) {
>>        if (inst->opcode == BRW_OPCODE_MUL &&
>>           IS_DWORD(inst->src[0]) &&
>>           IS_DWORD(inst->src[1]))
>>           return true;
>>     }
>>
>> -   if (devinfo->gen >= 7 && devinfo->gen <= 8) {
>> +   if ((devinfo->gen >= 7 && devinfo->gen <= 8) || devinfo->gen >= 11) {
>>        if (IS_64BIT(inst->dst) || IS_64BIT(inst->src[0]) ||
>>            IS_64BIT(inst->src[1]) || IS_64BIT(inst->src[2]))
>>        return true;
>>
>
> Patch is bogus.  Gen10+ doesn't and Gen11+ /cannot/ use the vec4
> backend, so why are we updating it with Gen11 code?

Right you are. Dropped.
Re: [Mesa-dev] [PATCH 08/11] intel: Disable fast color clear on icl
On Wednesday, March 21, 2018 2:06:19 PM PDT Matt Turner wrote:
> From: Anuj Phogat
>
> Disabling fast color clear makes fbo-clearmipmap test render correct
> texture in base miplevel. Fast color clear is anyways disabled for
> non-base miplevels.
> ---
>  src/mesa/drivers/dri/i965/brw_blorp.c | 4
>  1 file changed, 4 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c b/src/mesa/drivers/dri/i965/brw_blorp.c
> index 72578b6ea5c..bee8e409897 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp.c
> +++ b/src/mesa/drivers/dri/i965/brw_blorp.c
> @@ -1228,6 +1228,10 @@ do_single_blorp_clear(struct brw_context *brw, struct gl_framebuffer *fb,
>        }
>     }
>
> +   /* FINISHME: Debug and enable fast clears */
> +   if (devinfo->gen >= 11)
> +      can_fast_clear = false;
> +
>     if (can_fast_clear) {
>        const enum isl_aux_state aux_state =
>           intel_miptree_get_aux_state(irb->mt, irb->mt_level, irb->mt_layer);
>

Not very enthused about this, it's fine if we need to do this for now,
but we should at least make sure we're tracking the task somewhere.

--Ken
Re: [Mesa-dev] [PATCH 07/11] intel/compiler/icl: Set the condition for dependency control on gen11+
On Wednesday, March 21, 2018 2:06:18 PM PDT Matt Turner wrote:
> From: Anuj Phogat
>
> When source or destination datatype is 64b or operation is integer
> DWord multiply, DepCtrl must not be used.
> We had this restriction on few previous intel platforms. It has been
> brought back on Gen11+.
> ---
>  src/intel/compiler/brw_vec4.cpp | 8 ++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp
> index e4838146ac1..bb668b2538a 100644
> --- a/src/intel/compiler/brw_vec4.cpp
> +++ b/src/intel/compiler/brw_vec4.cpp
> @@ -984,15 +984,19 @@ vec4_visitor::is_dep_ctrl_unsafe(const vec4_instruction *inst)
>      * SKL PRMs don't include this restriction, however, gen7 seems to be
>      * affected, at least by the 64b restriction, since DepCtrl with double
>      * precision instructions seems to produce GPU hangs in some cases.
> +    *
> +    * This restriction is back in ICL+ platforms.
>      */
> -   if (devinfo->gen == 8 || gen_device_info_is_9lp(devinfo)) {
> +   if (devinfo->gen == 8 ||
> +       gen_device_info_is_9lp(devinfo) ||
> +       devinfo->gen >= 11) {
>        if (inst->opcode == BRW_OPCODE_MUL &&
>           IS_DWORD(inst->src[0]) &&
>           IS_DWORD(inst->src[1]))
>           return true;
>     }
>
> -   if (devinfo->gen >= 7 && devinfo->gen <= 8) {
> +   if ((devinfo->gen >= 7 && devinfo->gen <= 8) || devinfo->gen >= 11) {
>        if (IS_64BIT(inst->dst) || IS_64BIT(inst->src[0]) ||
>            IS_64BIT(inst->src[1]) || IS_64BIT(inst->src[2]))
>        return true;
>

Patch is bogus.  Gen10+ doesn't and Gen11+ /cannot/ use the vec4
backend, so why are we updating it with Gen11 code?

--Ken
Re: [Mesa-dev] [PATCH 01/11] intel/tools/aubinator: Drop platform list from print_help()
On Wednesday, March 21, 2018 2:06:12 PM PDT Matt Turner wrote:
> We all know the platform names, and I don't want to update this list
> continually.
> ---
>  src/intel/tools/aubinator.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c
> index 8029dc12155..2a72efa8a2c 100644
> --- a/src/intel/tools/aubinator.c
> +++ b/src/intel/tools/aubinator.c
> @@ -548,7 +548,7 @@ print_help(const char *progname, FILE *file)
>             "Decode aub file contents from either FILE or the standard input.\n\n"
>             "A valid --gen option must be provided.\n\n"
>             " --help display this help and exit\n"
> -           " --gen=platform decode for given platform (ivb, byt, hsw, bdw, chv, skl, kbl, bxt or cnl)\n"
> +           " --gen=platform decode for given platform (3 letter platform name)\n"
>             " --headers decode only command headers\n"
>             " --color[=WHEN] colorize the output; WHEN can be 'auto' (default\n"
>             "if omitted), 'always', or 'never'\n"
>

Patches 1-4 are:
Reviewed-by: Kenneth Graunke
Re: [Mesa-dev] [PATCH] clover/llvm: Fix build against LLVM/Clang 4.0
Aaron Watry writes:

> The opencl 1.0 langstandard was renamed in 5.0+
>
> Cc: Mark Janes
> ---
>  src/gallium/state_trackers/clover/llvm/invocation.cpp | 4
>  1 file changed, 4 insertions(+)
>
> diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> index af78c2ae28..2fb3ce2365 100644
> --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
> +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> @@ -85,7 +85,11 @@ namespace {
>     };
>
>     const clc_version_lang_std cl_version_lang_stds[] = {
> +#if HAVE_LLVM >= 0x0500
>        { 100, clang::LangStandard::lang_opencl10},
> +#else
> +      { 100, clang::LangStandard::lang_opencl},
> +#endif

Please move this preprocessor magic into an llvm/compat.hpp definition.
Thanks!

>        { 110, clang::LangStandard::lang_opencl11},
>        { 120, clang::LangStandard::lang_opencl12},
>        { 200, clang::LangStandard::lang_opencl20},
> --
> 2.14.1
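One way the requested compat definition might be shaped is sketched below. This is not the v2 patch. To stay buildable without Clang headers it uses a stand-in namespace sketch_clang in place of clang, a hypothetical sketch_compat namespace in place of whatever llvm/compat.hpp would provide, a default HAVE_LLVM value chosen only for the example, and assumed field names for clc_version_lang_std.

```cpp
#include <cassert>

// Assumption for the sketch: pretend the build is against LLVM 4.0 unless
// the build system says otherwise.
#ifndef HAVE_LLVM
#define HAVE_LLVM 0x0400
#endif

// Stand-in for clang::LangStandard, with the enumerator renamed at 5.0
// as described in the commit message.
namespace sketch_clang {
   struct LangStandard {
      enum Kind {
#if HAVE_LLVM >= 0x0500
         lang_opencl10,
#else
         lang_opencl,
#endif
         lang_opencl11,
         lang_opencl12,
         lang_opencl20
      };
   };
}

// The version check lives in exactly one place, the compat layer.
namespace sketch_compat {
#if HAVE_LLVM >= 0x0500
   const auto lang_opencl10 = sketch_clang::LangStandard::lang_opencl10;
#else
   const auto lang_opencl10 = sketch_clang::LangStandard::lang_opencl;
#endif
}

// Field names here are assumptions; only the table layout comes from the patch.
struct clc_version_lang_std {
   unsigned version_number;
   sketch_clang::LangStandard::Kind clc_lang_standard;
};

// The table no longer needs any preprocessor conditionals.
const clc_version_lang_std cl_version_lang_stds[] = {
   { 100, sketch_compat::lang_opencl10 },
   { 110, sketch_clang::LangStandard::lang_opencl11 },
   { 120, sketch_clang::LangStandard::lang_opencl12 },
   { 200, sketch_clang::LangStandard::lang_opencl20 },
};
```

With the alias in place, invocation.cpp refers to one name regardless of the LLVM version, which is the effect the review asks for.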
Re: [Mesa-dev] [PATCH] clover/llvm: Fix build against LLVM/Clang 4.0
This patch fixes the clover build for Clang 4.0, which is what the Intel
CI uses.

Tested-by: Mark Janes

Aaron Watry writes:

> The opencl 1.0 langstandard was renamed in 5.0+
>
> Cc: Mark Janes
> ---
>  src/gallium/state_trackers/clover/llvm/invocation.cpp | 4
>  1 file changed, 4 insertions(+)
>
> diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> index af78c2ae28..2fb3ce2365 100644
> --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
> +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> @@ -85,7 +85,11 @@ namespace {
>     };
>
>     const clc_version_lang_std cl_version_lang_stds[] = {
> +#if HAVE_LLVM >= 0x0500
>        { 100, clang::LangStandard::lang_opencl10},
> +#else
> +      { 100, clang::LangStandard::lang_opencl},
> +#endif
>        { 110, clang::LangStandard::lang_opencl11},
>        { 120, clang::LangStandard::lang_opencl12},
>        { 200, clang::LangStandard::lang_opencl20},
> --
> 2.14.1
Re: [Mesa-dev] [PATCH 09/11] intel: Add a Ice Lake PCI IDs
Matches the bspec.

Reviewed-by: Rafael Antognolli

On Wed, Mar 21, 2018 at 02:06:20PM -0700, Matt Turner wrote:
> From: Anuj Phogat
>
> ---
>  include/pci_ids/i965_pci_ids.h | 9 +
>  1 file changed, 9 insertions(+)
>
> diff --git a/include/pci_ids/i965_pci_ids.h b/include/pci_ids/i965_pci_ids.h
> index feb9c582b19..925655e9908 100644
> --- a/include/pci_ids/i965_pci_ids.h
> +++ b/include/pci_ids/i965_pci_ids.h
> @@ -196,3 +196,12 @@ CHIPSET(0x5A50, cnl_5x8, "Intel(R) HD Graphics (Cannonlake 5x8 GT2)")
>  CHIPSET(0x5A51, cnl_5x8, "Intel(R) HD Graphics (Cannonlake 5x8 GT2)")
>  CHIPSET(0x5A52, cnl_5x8, "Intel(R) HD Graphics (Cannonlake 5x8 GT2)")
>  CHIPSET(0x5A54, cnl_5x8, "Intel(R) HD Graphics (Cannonlake 5x8 GT2)")
> +CHIPSET(0x8A50, icl_8x8, "Intel(R) HD Graphics (Ice Lake 8x8 GT2)")
> +CHIPSET(0x8A51, icl_8x8, "Intel(R) HD Graphics (Ice Lake 8x8 GT2)")
> +CHIPSET(0x8A52, icl_8x8, "Intel(R) HD Graphics (Ice Lake 8x8 GT2)")
> +CHIPSET(0x8A5A, icl_6x8, "Intel(R) HD Graphics (Ice Lake 6x8 GT1.5)")
> +CHIPSET(0x8A5B, icl_4x8, "Intel(R) HD Graphics (Ice Lake 4x8 GT1)")
> +CHIPSET(0x8A5C, icl_6x8, "Intel(R) HD Graphics (Ice Lake 6x8 GT1.5)")
> +CHIPSET(0x8A5D, icl_4x8, "Intel(R) HD Graphics (Ice Lake 4x8 GT1)")
> +CHIPSET(0x8A71, icl_1x8, "Intel(R) HD Graphics (Ice Lake 1x8 GT0.5)")
> +CHIPSET(0xFF05, icl_8x8, "Intel(R) HD Graphics (Ice Lake)")
> --
> 2.16.1
Re: [Mesa-dev] [PATCH 03/11] intel/common/icl: Disable hiz surface sampling
On Wed, Mar 21, 2018 at 02:06:14PM -0700, Matt Turner wrote:
> From: Anuj Phogat
>
> On gen11+ AUX_HIZ is not a supported value for surfaces being
> sampled by the 3D sampler.

Reviewed-by: Rafael Antognolli

> ---
>  src/intel/dev/gen_device_info.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/src/intel/dev/gen_device_info.c b/src/intel/dev/gen_device_info.c
> index 3365bdd4dd6..9e684b78a09 100644
> --- a/src/intel/dev/gen_device_info.c
> +++ b/src/intel/dev/gen_device_info.c
> @@ -823,6 +823,7 @@ static const struct gen_device_info gen_device_info_cnl_5x8 = {
>     GEN11_HW_INFO, \
>     .has_64bit_types = false, \
>     .has_integer_dword_mul = false, \
> +   .has_sample_with_hiz = false, \
>     .gt = _gt, .num_slices = _slices, .l3_banks = _l3, \
>     .num_subslices = _subslices
>
> --
> 2.16.1
Re: [Mesa-dev] [PATCH 01/11] intel/tools/aubinator: Drop platform list from print_help()
On Wed, Mar 21, 2018 at 02:06:12PM -0700, Matt Turner wrote:
> We all know the platform names, and I don't want to update this list
> continually.

Reviewed-by: Rafael Antognolli

> ---
>  src/intel/tools/aubinator.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c
> index 8029dc12155..2a72efa8a2c 100644
> --- a/src/intel/tools/aubinator.c
> +++ b/src/intel/tools/aubinator.c
> @@ -548,7 +548,7 @@ print_help(const char *progname, FILE *file)
>             "Decode aub file contents from either FILE or the standard input.\n\n"
>             "A valid --gen option must be provided.\n\n"
>             " --help display this help and exit\n"
> -           " --gen=platform decode for given platform (ivb, byt, hsw, bdw, chv, skl, kbl, bxt or cnl)\n"
> +           " --gen=platform decode for given platform (3 letter platform name)\n"
>             " --headers decode only command headers\n"
>             " --color[=WHEN] colorize the output; WHEN can be 'auto' (default\n"
>             "if omitted), 'always', or 'never'\n"
> --
> 2.16.1
Re: [Mesa-dev] [PATCH 3/3] radv: enable TC-compat HTILE for 16-bit depth surfaces on GFX8
Reviewed-by: Bas Nieuwenhuizen

for the series.

On Wed, Mar 21, 2018 at 9:30 PM, Samuel Pitoiset wrote:
> The hardware only supports 32-bit depth surfaces, but we can
> enable TC-compat HTILE for 16-bit depth surfaces if no Z planes
> are compressed.
>
> The main benefit is to reduce the number of depth decompression
> passes. Also, we don't need to implement DB->CB copies which is
> fine.
>
> This improves Serious Sam 2017 by +4%. Talos and F12017 are also
> affected but I don't see a performance difference.
>
> This also improves the shadowmapping Vulkan demo by 10-15%
> (FPS is now similar to AMDVLK).
>
> No CTS regressions on Polaris10.
>
> Signed-off-by: Samuel Pitoiset
> ---
>  src/amd/vulkan/radv_device.c | 22 --
>  src/amd/vulkan/radv_image.c  | 20
>  2 files changed, 24 insertions(+), 18 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
> index 22500bfc13..9c82fd059f 100644
> --- a/src/amd/vulkan/radv_device.c
> +++ b/src/amd/vulkan/radv_device.c
> @@ -3615,12 +3615,22 @@ radv_calc_decompress_on_z_planes(struct radv_device *device,
>
>                  max_zplanes = max_zplanes + 1;
>          } else {
> -                if (iview->image->info.samples <= 1)
> -                        max_zplanes = 5;
> -                else if (iview->image->info.samples <= 4)
> -                        max_zplanes = 3;
> -                else
> -                        max_zplanes = 2;
> +                if (iview->vk_format == VK_FORMAT_D16_UNORM) {
> +                        /* Do not enable Z plane compression for 16-bit depth
> +                         * surfaces because isn't supported on GFX8. Only
> +                         * 32-bit depth surfaces are supported by the hardware.
> +                         * This allows to maintain shader compatibility and to
> +                         * reduce the number of depth decompressions.
> +                         */
> +                        max_zplanes = 1;
> +                } else {
> +                        if (iview->image->info.samples <= 1)
> +                                max_zplanes = 5;
> +                        else if (iview->image->info.samples <= 4)
> +                                max_zplanes = 3;
> +                        else
> +                                max_zplanes = 2;
> +                }
>          }
>
>          return max_zplanes;
> diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan/radv_image.c
> index 6e5f3e7ad0..dd3189c67d 100644
> --- a/src/amd/vulkan/radv_image.c
> +++ b/src/amd/vulkan/radv_image.c
> @@ -91,18 +91,14 @@ radv_image_is_tc_compat_htile(struct radv_device *device,
>              pCreateInfo->format == VK_FORMAT_D32_SFLOAT_S8_UINT)
>                  return false;
>
> -        if (device->physical_device->rad_info.chip_class >= GFX9) {
> -                /* GFX9 supports both 32-bit and 16-bit depth surfaces. */
> -                if (pCreateInfo->format != VK_FORMAT_D32_SFLOAT_S8_UINT &&
> -                    pCreateInfo->format != VK_FORMAT_D32_SFLOAT &&
> -                    pCreateInfo->format != VK_FORMAT_D16_UNORM)
> -                        return false;
> -        } else {
> -                /* GFX8 only supports 32-bit depth surfaces. */
> -                if (pCreateInfo->format != VK_FORMAT_D32_SFLOAT_S8_UINT &&
> -                    pCreateInfo->format != VK_FORMAT_D32_SFLOAT)
> -                        return false;
> -        }
> +        /* GFX9 supports both 32-bit and 16-bit depth surfaces, while GFX8 only
> +         * supports 32-bit. Though, it's possible to enable TC-compat for
> +         * 16-bit depth surfaces if no Z planes are compressed.
> +         */
> +        if (pCreateInfo->format != VK_FORMAT_D32_SFLOAT_S8_UINT &&
> +            pCreateInfo->format != VK_FORMAT_D32_SFLOAT &&
> +            pCreateInfo->format != VK_FORMAT_D16_UNORM)
> +                return false;
>
>          return true;
>  }
> --
> 2.16.2
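The Z-plane selection in the branch touched by the radv_device.c hunk can be read as a small pure function. The sketch below restates only that branch, which the patch's own comment associates with GFX8. The names sketch_depth_format and sketch_max_zplanes are hypothetical, and the other branch of the real function (the one ending in max_zplanes + 1) is deliberately left out because the quoted hunk does not show it in full.

```cpp
#include <cassert>

// Stand-in for the two depth formats the sketch distinguishes.
enum sketch_depth_format {
   SKETCH_D16_UNORM,
   SKETCH_D32_SFLOAT
};

// Mirrors the "else" branch after the patch: 16-bit depth gets a single
// Z plane (no Z plane compression), otherwise the limit depends on the
// sample count exactly as in the pre-existing code.
static unsigned
sketch_max_zplanes(sketch_depth_format format, unsigned samples)
{
   if (format == SKETCH_D16_UNORM)
      return 1;

   if (samples <= 1)
      return 5;
   else if (samples <= 4)
      return 3;
   else
      return 2;
}
```

For example, a single-sampled 32-bit surface keeps the old limit of 5, while any 16-bit surface now reports 1 regardless of sample count.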
[Mesa-dev] [PATCH] clover/llvm: Fix build against LLVM/Clang 4.0
The opencl 1.0 langstandard was renamed in 5.0+

Cc: Mark Janes
---
 src/gallium/state_trackers/clover/llvm/invocation.cpp | 4
 1 file changed, 4 insertions(+)

diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp b/src/gallium/state_trackers/clover/llvm/invocation.cpp
index af78c2ae28..2fb3ce2365 100644
--- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
+++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
@@ -85,7 +85,11 @@ namespace {
    };

    const clc_version_lang_std cl_version_lang_stds[] = {
+#if HAVE_LLVM >= 0x0500
       { 100, clang::LangStandard::lang_opencl10},
+#else
+      { 100, clang::LangStandard::lang_opencl},
+#endif
       { 110, clang::LangStandard::lang_opencl11},
       { 120, clang::LangStandard::lang_opencl12},
       { 200, clang::LangStandard::lang_opencl20},
--
2.14.1
[Mesa-dev] [PATCH 04/11] intel/compiler/icl: Update the assert in brw_stage_has_packed_dispatch()
From: Anuj Phogat

Rafael ran piglit with the test code enabled and saw no additional GPU
hangs.
---
 src/intel/compiler/brw_compiler.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/compiler/brw_compiler.h b/src/intel/compiler/brw_compiler.h
index 0e27c898203..d3ae6499b91 100644
--- a/src/intel/compiler/brw_compiler.h
+++ b/src/intel/compiler/brw_compiler.h
@@ -1294,7 +1294,7 @@ brw_stage_has_packed_dispatch(MAYBE_UNUSED const struct gen_device_info *devinfo
     * to do a full test run with brw_fs_test_dispatch_packing() hooked up to
     * the NIR front-end before changing this assertion.
     */
-   assert(devinfo->gen <= 10);
+   assert(devinfo->gen <= 11);

    switch (stage) {
    case MESA_SHADER_FRAGMENT: {
--
2.16.1
[Mesa-dev] [PATCH 05/11] intel/compiler: Use null destination register for memory fence messages
From Message Descriptor section in gfxspecs:

   "Memory fence messages without Commit Enable set do not return
    anything to the thread (response length is 0 and destination
    register is null)."

This fixes a GPU hang in simulation in the piglit test
arb_shader_image_load_store-shader-mem-barrier

The mem fence message doesn't send any data, and previously we were
setting the SEND's src0 to the same register as the destination. I've
kept that behavior, so src0 will now be the null register in a number of
cases, necessitating a few changes in the EU validator. The simulator
and real hardware seem to be okay with this.
---
 src/intel/compiler/brw_eu_emit.c        | 4 ++--
 src/intel/compiler/brw_eu_validate.c    | 13 +++--
 src/intel/compiler/brw_fs_nir.cpp       | 14 +++---
 src/intel/compiler/test_eu_validate.cpp | 9 +
 4 files changed, 33 insertions(+), 7 deletions(-)

diff --git a/src/intel/compiler/brw_eu_emit.c b/src/intel/compiler/brw_eu_emit.c
index f039af56d05..fe7fa8723e1 100644
--- a/src/intel/compiler/brw_eu_emit.c
+++ b/src/intel/compiler/brw_eu_emit.c
@@ -3289,8 +3289,8 @@ brw_memory_fence(struct brw_codegen *p,
 {
    const struct gen_device_info *devinfo = p->devinfo;
    const bool commit_enable =
-      devinfo->gen >= 10 || /* HSD ES # 1404612949 */
-      (devinfo->gen == 7 && !devinfo->is_haswell);
+      !(dst.file == BRW_ARCHITECTURE_REGISTER_FILE &&
+        dst.nr == BRW_ARF_NULL);
    struct brw_inst *insn;

    brw_push_insn_state(p);
diff --git a/src/intel/compiler/brw_eu_validate.c b/src/intel/compiler/brw_eu_validate.c
index d3189d1ef5e..e16dfc3aaf3 100644
--- a/src/intel/compiler/brw_eu_validate.c
+++ b/src/intel/compiler/brw_eu_validate.c
@@ -168,6 +168,14 @@ src1_has_scalar_region(const struct gen_device_info *devinfo, const brw_inst *in
           brw_inst_src1_hstride(devinfo, inst) == BRW_HORIZONTAL_STRIDE_0;
 }

+static bool
+is_mfence(const struct gen_device_info *devinfo, const brw_inst *inst)
+{
+   return brw_inst_opcode(devinfo, inst) == BRW_OPCODE_SEND &&
+          brw_inst_sfid(devinfo, inst) == GEN7_SFID_DATAPORT_DATA_CACHE &&
+          brw_inst_dp_msg_type(devinfo, inst) == GEN7_DATAPORT_DC_MEMORY_FENCE;
+}
+
 static unsigned
 num_sources_from_inst(const struct gen_device_info *devinfo,
                       const brw_inst *inst)
@@ -236,7 +244,7 @@ sources_not_null(const struct gen_device_info *devinfo,
    if (num_sources == 3)
       return (struct string){};

-   if (num_sources >= 1)
+   if (num_sources >= 1 && !is_mfence(devinfo, inst))
       ERROR_IF(src0_is_null(devinfo, inst), "src0 is null");

    if (num_sources == 2)
@@ -256,7 +264,8 @@ send_restrictions(const struct gen_device_info *devinfo,
             "send must use direct addressing");

    if (devinfo->gen >= 7) {
-      ERROR_IF(!src0_is_grf(devinfo, inst), "send from non-GRF");
+      ERROR_IF(!src0_is_grf(devinfo, inst) && !is_mfence(devinfo, inst),
+               "send from non-GRF");
       ERROR_IF(brw_inst_eot(devinfo, inst) &&
                brw_inst_src0_da_reg_nr(devinfo, inst) < 112,
                "send with EOT must use g112-g127");
diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp
index dbd2105f7e9..063f0256829 100644
--- a/src/intel/compiler/brw_fs_nir.cpp
+++ b/src/intel/compiler/brw_fs_nir.cpp
@@ -3859,9 +3859,17 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
    case nir_intrinsic_memory_barrier_image:
    case nir_intrinsic_memory_barrier: {
       const fs_builder ubld = bld.group(8, 0);
-      const fs_reg tmp = ubld.vgrf(BRW_REGISTER_TYPE_UD, 2);
-      ubld.emit(SHADER_OPCODE_MEMORY_FENCE, tmp)
-         ->size_written = 2 * REG_SIZE;
+      if (devinfo->gen == 7 && !devinfo->is_haswell) {
+         const fs_reg tmp = ubld.vgrf(BRW_REGISTER_TYPE_UD, 2);
+         ubld.emit(SHADER_OPCODE_MEMORY_FENCE, tmp)
+            ->size_written = 2 * REG_SIZE;
+      } else {
+         const fs_reg tmp =
+            /* HSD ES #1404612949 */
+            devinfo->gen >= 10 ? ubld.vgrf(BRW_REGISTER_TYPE_UD)
+                               : bld.null_reg_d();
+         ubld.emit(SHADER_OPCODE_MEMORY_FENCE, tmp);
+      }
       break;
    }

diff --git a/src/intel/compiler/test_eu_validate.cpp b/src/intel/compiler/test_eu_validate.cpp
index 161db994b2b..8169f951b2d 100644
--- a/src/intel/compiler/test_eu_validate.cpp
+++ b/src/intel/compiler/test_eu_validate.cpp
@@ -168,6 +168,15 @@ TEST_P(validation_test, math_src1_null_reg)
    }
 }

+TEST_P(validation_test, mfence_src0_null_reg)
+{
+   /* On HSW+ mfence's src0 is the null register */
+   if (devinfo.gen >= 8 || devinfo.is_haswell) {
+      brw_memory_fence(p, null);
+      EXPECT_TRUE(validate(p));
+   }
+}
+
 TEST_P(validation_test, opcode46)
 {
    /* opcode 46 is "push" on Gen 4 and 5
--
2.16.1
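To make the relationship between the two halves of this patch easier to see, here is a minimal sketch in plain C. It assumes a simplified register type; struct reg, fence_dst() and fence_commit_enable() are hypothetical names, not the brw_* API. The point is that deriving Commit Enable from "destination is not null" gives the same answer as the old per-generation rule, provided the caller picks the destination the way the brw_fs_nir.cpp hunk does.

```c
#include <stdbool.h>

/* Hypothetical simplified register description. */
enum reg_file { REG_FILE_ARF, REG_FILE_GRF };
#define ARF_NULL 0u

struct reg {
   enum reg_file file;
   unsigned nr;
};

static bool
reg_is_null(struct reg r)
{
   return r.file == REG_FILE_ARF && r.nr == ARF_NULL;
}

/* Emit side: Commit Enable is requested exactly when a real destination
 * register was supplied. */
static bool
fence_commit_enable(struct reg dst)
{
   return !reg_is_null(dst);
}

/* Caller side: pick the destination per generation, following the
 * brw_fs_nir.cpp hunk (IVB/BYT and gen10+ get a GRF, the rest get null). */
static struct reg
fence_dst(int gen, bool is_haswell, struct reg scratch_grf)
{
   const struct reg null_reg = { REG_FILE_ARF, ARF_NULL };

   if (gen == 7 && !is_haswell)
      return scratch_grf;

   return gen >= 10 ? scratch_grf : null_reg;
}
```

Composing the two, fence_commit_enable(fence_dst(gen, hsw, grf)) is true for gen7 non-Haswell and for gen10+, which is the condition the removed lines in brw_eu_emit.c spelled out directly.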
[Mesa-dev] [PATCH 11/11] intel/compiler: Readd ICL to test_eu_validate.cpp
Now that the PCI IDs are upstream, this can be readded.
---
 src/intel/compiler/test_eu_validate.cpp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/intel/compiler/test_eu_validate.cpp b/src/intel/compiler/test_eu_validate.cpp
index e36f50a2d7e..79401222d78 100644
--- a/src/intel/compiler/test_eu_validate.cpp
+++ b/src/intel/compiler/test_eu_validate.cpp
@@ -43,6 +43,7 @@ static const struct gen_info {
    { "glk", },
    { "cfl", },
    { "cnl", },
+   { "icl", },
 };

 class validation_test: public ::testing::TestWithParam {
--
2.16.1
[Mesa-dev] [PATCH 03/11] intel/common/icl: Disable hiz surface sampling
From: Anuj Phogat

On gen11+ AUX_HIZ is not a supported value for surfaces being sampled by
the 3D sampler.
---
 src/intel/dev/gen_device_info.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/intel/dev/gen_device_info.c b/src/intel/dev/gen_device_info.c
index 3365bdd4dd6..9e684b78a09 100644
--- a/src/intel/dev/gen_device_info.c
+++ b/src/intel/dev/gen_device_info.c
@@ -823,6 +823,7 @@ static const struct gen_device_info gen_device_info_cnl_5x8 = {
    GEN11_HW_INFO, \
    .has_64bit_types = false, \
    .has_integer_dword_mul = false, \
+   .has_sample_with_hiz = false, \
    .gt = _gt, .num_slices = _slices, .l3_banks = _l3, \
    .num_subslices = _subslices

--
2.16.1
[Mesa-dev] [PATCH 02/11] intel/common/icl: Add L3 config
From: Anuj Phogat

ICL uses the same L3 configs as CNL, just leaving the SLM configs out.
---
 src/intel/common/gen_l3_config.c | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/src/intel/common/gen_l3_config.c b/src/intel/common/gen_l3_config.c
index 7d58ad8d7c8..b977c6ab136 100644
--- a/src/intel/common/gen_l3_config.c
+++ b/src/intel/common/gen_l3_config.c
@@ -132,6 +132,21 @@ static const struct gen_l3_config cnl_l3_configs[] = {
    {{ 0 }}
 };

+/**
+ * ICL validated L3 configurations.  \sa icl_l3_configs.
+ */
+static const struct gen_l3_config icl_l3_configs[] = {
+   /* SLM URB ALL DC  RO  IS   C   T */
+   {{  0, 64, 64,  0,  0,  0,  0,  0 }},
+   {{  0, 64,  0, 16, 48,  0,  0,  0 }},
+   {{  0, 48,  0, 16, 64,  0,  0,  0 }},
+   {{  0, 32,  0,  0, 96,  0,  0,  0 }},
+   {{  0, 32, 96,  0,  0,  0,  0,  0 }},
+   {{  0, 32,  0, 16, 80,  0,  0,  0 }},
+   {{ 0 }}
+};
+
+
 /**
  * Return a zero-terminated array of validated L3 configurations for the
  * specified device.
@@ -154,6 +169,9 @@ get_l3_configs(const struct gen_device_info *devinfo)
    case 10:
       return cnl_l3_configs;

+   case 11:
+      return icl_l3_configs;
+
    default:
       unreachable("Not implemented");
    }
--
2.16.1
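As a reading aid for the table above, the following sketch shows one way a zero-terminated configuration array like this can be walked. The struct layout and partition names are simplified stand-ins, and treating a zero URB entry as the terminator is an assumption made for this example; only the six rows themselves are taken from the patch.

```c
#include <stddef.h>

/* Hypothetical partition indices, in the column order of the table. */
enum { L3P_SLM, L3P_URB, L3P_ALL, L3P_DC, L3P_RO, L3P_IS, L3P_C, L3P_T,
       NUM_L3P };

struct l3_config {
   unsigned n[NUM_L3P];
};

/* Rows copied from the patch; the final all-zero entry terminates the list. */
static const struct l3_config icl_configs[] = {
   /* SLM URB ALL DC  RO  IS   C   T */
   {{  0, 64, 64,  0,  0,  0,  0,  0 }},
   {{  0, 64,  0, 16, 48,  0,  0,  0 }},
   {{  0, 48,  0, 16, 64,  0,  0,  0 }},
   {{  0, 32,  0,  0, 96,  0,  0,  0 }},
   {{  0, 32, 96,  0,  0,  0,  0,  0 }},
   {{  0, 32,  0, 16, 80,  0,  0,  0 }},
   {{ 0 }}
};

/* Count entries up to the terminator (assumed here: URB share of zero). */
static size_t
count_configs(const struct l3_config *cfgs)
{
   size_t count = 0;

   while (cfgs[count].n[L3P_URB] != 0)
      count++;

   return count;
}

/* Sum of all partition sizes in one configuration. */
static unsigned
config_total(const struct l3_config *cfg)
{
   unsigned total = 0;

   for (unsigned i = 0; i < NUM_L3P; i++)
      total += cfg->n[i];

   return total;
}
```

With the values from the patch, every row sums to 128 and the SLM column is zero throughout, which matches the "leaving the SLM configs out" remark in the commit message.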
[Mesa-dev] [PATCH 09/11] intel: Add a Ice Lake PCI IDs
From: Anuj Phogat

---
 include/pci_ids/i965_pci_ids.h | 9 +
 1 file changed, 9 insertions(+)

diff --git a/include/pci_ids/i965_pci_ids.h b/include/pci_ids/i965_pci_ids.h
index feb9c582b19..925655e9908 100644
--- a/include/pci_ids/i965_pci_ids.h
+++ b/include/pci_ids/i965_pci_ids.h
@@ -196,3 +196,12 @@ CHIPSET(0x5A50, cnl_5x8, "Intel(R) HD Graphics (Cannonlake 5x8 GT2)")
 CHIPSET(0x5A51, cnl_5x8, "Intel(R) HD Graphics (Cannonlake 5x8 GT2)")
 CHIPSET(0x5A52, cnl_5x8, "Intel(R) HD Graphics (Cannonlake 5x8 GT2)")
 CHIPSET(0x5A54, cnl_5x8, "Intel(R) HD Graphics (Cannonlake 5x8 GT2)")
+CHIPSET(0x8A50, icl_8x8, "Intel(R) HD Graphics (Ice Lake 8x8 GT2)")
+CHIPSET(0x8A51, icl_8x8, "Intel(R) HD Graphics (Ice Lake 8x8 GT2)")
+CHIPSET(0x8A52, icl_8x8, "Intel(R) HD Graphics (Ice Lake 8x8 GT2)")
+CHIPSET(0x8A5A, icl_6x8, "Intel(R) HD Graphics (Ice Lake 6x8 GT1.5)")
+CHIPSET(0x8A5B, icl_4x8, "Intel(R) HD Graphics (Ice Lake 4x8 GT1)")
+CHIPSET(0x8A5C, icl_6x8, "Intel(R) HD Graphics (Ice Lake 6x8 GT1.5)")
+CHIPSET(0x8A5D, icl_4x8, "Intel(R) HD Graphics (Ice Lake 4x8 GT1)")
+CHIPSET(0x8A71, icl_1x8, "Intel(R) HD Graphics (Ice Lake 1x8 GT0.5)")
+CHIPSET(0xFF05, icl_8x8, "Intel(R) HD Graphics (Ice Lake)")
--
2.16.1
[Mesa-dev] [PATCH 10/11] intel/compiler: Skip 64-bit type tests when types not available
---
 src/intel/compiler/test_eu_validate.cpp | 39 +
 1 file changed, 39 insertions(+)

diff --git a/src/intel/compiler/test_eu_validate.cpp b/src/intel/compiler/test_eu_validate.cpp
index 8169f951b2d..e36f50a2d7e 100644
--- a/src/intel/compiler/test_eu_validate.cpp
+++ b/src/intel/compiler/test_eu_validate.cpp
@@ -1075,6 +1075,15 @@ TEST_P(validation_test, qword_low_power_align1_regioning_restrictions)
       return;

    for (unsigned i = 0; i < sizeof(inst) / sizeof(inst[0]); i++) {
+      if (!devinfo.has_64bit_types &&
+          (inst[i].dst_type == BRW_REGISTER_TYPE_DF ||
+           inst[i].dst_type == BRW_REGISTER_TYPE_UQ ||
+           inst[i].dst_type == BRW_REGISTER_TYPE_Q ||
+           inst[i].src_type == BRW_REGISTER_TYPE_DF ||
+           inst[i].src_type == BRW_REGISTER_TYPE_UQ ||
+           inst[i].src_type == BRW_REGISTER_TYPE_Q))
+         continue;
+
       if (inst[i].opcode == BRW_OPCODE_MOV) {
          brw_MOV(p, retype(g0, inst[i].dst_type),
                     retype(g0, inst[i].src_type));
@@ -1195,6 +1204,15 @@ TEST_P(validation_test, qword_low_power_no_indirect_addressing)
       return;

    for (unsigned i = 0; i < sizeof(inst) / sizeof(inst[0]); i++) {
+      if (!devinfo.has_64bit_types &&
+          (inst[i].dst_type == BRW_REGISTER_TYPE_DF ||
+           inst[i].dst_type == BRW_REGISTER_TYPE_UQ ||
+           inst[i].dst_type == BRW_REGISTER_TYPE_Q ||
+           inst[i].src_type == BRW_REGISTER_TYPE_DF ||
+           inst[i].src_type == BRW_REGISTER_TYPE_UQ ||
+           inst[i].src_type == BRW_REGISTER_TYPE_Q))
+         continue;
+
       if (inst[i].opcode == BRW_OPCODE_MOV) {
          brw_MOV(p, retype(g0, inst[i].dst_type),
                     retype(g0, inst[i].src_type));
@@ -1331,6 +1349,15 @@ TEST_P(validation_test, qword_low_power_no_64bit_arf)
       return;

    for (unsigned i = 0; i < sizeof(inst) / sizeof(inst[0]); i++) {
+      if (!devinfo.has_64bit_types &&
+          (inst[i].dst_type == BRW_REGISTER_TYPE_DF ||
+           inst[i].dst_type == BRW_REGISTER_TYPE_UQ ||
+           inst[i].dst_type == BRW_REGISTER_TYPE_Q ||
+           inst[i].src_type == BRW_REGISTER_TYPE_DF ||
+           inst[i].src_type == BRW_REGISTER_TYPE_UQ ||
+           inst[i].src_type == BRW_REGISTER_TYPE_Q))
+         continue;
+
       if (inst[i].opcode == BRW_OPCODE_MOV) {
          brw_MOV(p, retype(inst[i].dst, inst[i].dst_type),
                     retype(inst[i].src, inst[i].src_type));
@@ -1359,6 +1386,9 @@ TEST_P(validation_test, qword_low_power_no_64bit_arf)
       clear_instructions(p);
    }

+   if (!devinfo.has_64bit_types)
+      return;
+
    /* MAC implicitly reads the accumulator */
    brw_MAC(p, retype(g0, BRW_REGISTER_TYPE_DF),
               retype(stride(g0, 4, 4, 1), BRW_REGISTER_TYPE_DF),
@@ -1529,6 +1559,15 @@ TEST_P(validation_test, qword_low_power_no_depctrl)
       return;

    for (unsigned i = 0; i < sizeof(inst) / sizeof(inst[0]); i++) {
+      if (!devinfo.has_64bit_types &&
+          (inst[i].dst_type == BRW_REGISTER_TYPE_DF ||
+           inst[i].dst_type == BRW_REGISTER_TYPE_UQ ||
+           inst[i].dst_type == BRW_REGISTER_TYPE_Q ||
+           inst[i].src_type == BRW_REGISTER_TYPE_DF ||
+           inst[i].src_type == BRW_REGISTER_TYPE_UQ ||
+           inst[i].src_type == BRW_REGISTER_TYPE_Q))
+         continue;
+
       if (inst[i].opcode == BRW_OPCODE_MOV) {
          brw_MOV(p, retype(g0, inst[i].dst_type),
                     retype(g0, inst[i].src_type));
--
2.16.1
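The same seven-line condition is added to four test loops. As an illustration only, the check can be expressed as a small predicate; the enum and the helper names below are hypothetical and are not part of the patch or of the brw_* headers.

```c
#include <stdbool.h>

/* Hypothetical subset of register types, just enough for the example. */
enum reg_type { TYPE_UD, TYPE_D, TYPE_F, TYPE_DF, TYPE_UQ, TYPE_Q };

/* True for the three 64-bit types the patch tests for. */
static bool
type_is_64bit(enum reg_type type)
{
   return type == TYPE_DF || type == TYPE_UQ || type == TYPE_Q;
}

/* A test case is skipped when the device lacks 64-bit types and either
 * operand of the case uses one. */
static bool
should_skip_case(bool has_64bit_types, enum reg_type dst_type,
                 enum reg_type src_type)
{
   return !has_64bit_types &&
          (type_is_64bit(dst_type) || type_is_64bit(src_type));
}
```

Whether factoring the condition out like this would be preferable in the actual test file is a judgment call for the patch author; the sketch is only meant to show what the repeated condition evaluates.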
[Mesa-dev] [PATCH 07/11] intel/compiler/icl: Set the condition for dependency control on gen11+
From: Anuj Phogat

When source or destination datatype is 64b or operation is integer DWord
multiply, DepCtrl must not be used. We had this restriction on few
previous intel platforms. It has been brought back on Gen11+.
---
 src/intel/compiler/brw_vec4.cpp | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp
index e4838146ac1..bb668b2538a 100644
--- a/src/intel/compiler/brw_vec4.cpp
+++ b/src/intel/compiler/brw_vec4.cpp
@@ -984,15 +984,19 @@ vec4_visitor::is_dep_ctrl_unsafe(const vec4_instruction *inst)
     * SKL PRMs don't include this restriction, however, gen7 seems to be
     * affected, at least by the 64b restriction, since DepCtrl with double
     * precision instructions seems to produce GPU hangs in some cases.
+    *
+    * This restriction is back in ICL+ platforms.
     */
-   if (devinfo->gen == 8 || gen_device_info_is_9lp(devinfo)) {
+   if (devinfo->gen == 8 ||
+       gen_device_info_is_9lp(devinfo) ||
+       devinfo->gen >= 11) {
       if (inst->opcode == BRW_OPCODE_MUL &&
          IS_DWORD(inst->src[0]) &&
          IS_DWORD(inst->src[1]))
          return true;
    }

-   if (devinfo->gen >= 7 && devinfo->gen <= 8) {
+   if ((devinfo->gen >= 7 && devinfo->gen <= 8) || devinfo->gen >= 11) {
       if (IS_64BIT(inst->dst) || IS_64BIT(inst->src[0]) ||
           IS_64BIT(inst->src[1]) || IS_64BIT(inst->src[2]))
          return true;
--
2.16.1
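The two conditions touched by this patch can be summarized as a truth function over the generation and two instruction properties. The sketch below is a simplification: dep_ctrl_unsafe() is a hypothetical name, the instruction is reduced to two booleans, and only the two checks visible in the hunk are modeled (the real is_dep_ctrl_unsafe() has further checks outside the quoted context).

```c
#include <stdbool.h>

/* Models only the two generation-dependent checks from the hunk.
 *
 * gen               hardware generation number
 * is_9lp            true for the gen9 low-power parts
 * is_dword_mul      instruction is a MUL with DWord sources
 * has_64bit_operand destination or any source is a 64-bit type
 */
static bool
dep_ctrl_unsafe(int gen, bool is_9lp, bool is_dword_mul,
                bool has_64bit_operand)
{
   /* Integer DWord multiply restriction: gen8, gen9lp, and gen11+. */
   if ((gen == 8 || is_9lp || gen >= 11) && is_dword_mul)
      return true;

   /* 64-bit datatype restriction: gen7-8, and gen11+. */
   if (((gen >= 7 && gen <= 8) || gen >= 11) && has_64bit_operand)
      return true;

   return false;
}
```

Under this model gen10 is the only recent generation where neither restriction applies, and gen11 picks up both, which is the "brought back on Gen11+" behaviour described in the commit message.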
[Mesa-dev] [PATCH 06/11] intel/compiler/icl: Clear "null render target" bit in extended message descriptor
From: Jason Ekstrand

Otherwise all our render target writes go nowhere.
---
 src/intel/compiler/brw_eu_emit.c | 3 +++
 src/intel/compiler/brw_inst.h    | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/src/intel/compiler/brw_eu_emit.c b/src/intel/compiler/brw_eu_emit.c
index fe7fa8723e1..99c09e6f541 100644
--- a/src/intel/compiler/brw_eu_emit.c
+++ b/src/intel/compiler/brw_eu_emit.c
@@ -536,6 +536,9 @@ brw_set_dp_write_message(struct brw_codegen *p,
    if (devinfo->gen < 7) {
       brw_inst_set_dp_write_commit(devinfo, insn, send_commit_msg);
    }
+
+   if (devinfo->gen >= 11)
+      brw_inst_set_null_rt(devinfo, insn, false);
 }

 void
diff --git a/src/intel/compiler/brw_inst.h b/src/intel/compiler/brw_inst.h
index e6998973b64..b569e0e41b7 100644
--- a/src/intel/compiler/brw_inst.h
+++ b/src/intel/compiler/brw_inst.h
@@ -505,6 +505,9 @@ FF(sfid,
    /* 6:   */ 27, 24,
    /* 7:   */ 27, 24,
    /* 8:   */ 27, 24)
+FF(null_rt,
+   /* 4-7: */ -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
+   /* 8:   */ 80, 80)
 FC(base_mrf, 27, 24, devinfo->gen < 6);
 /** @} */

--
2.16.1
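The FF(null_rt, ...) entry places the field at bit 80 of the instruction word on gen8 and later. The following sketch shows what setting or clearing a single bit at that position looks like, assuming the 128-bit instruction is stored as two 64-bit words; struct inst128 and the helper names are hypothetical and stand in for the accessors that the FF macro generates.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 128-bit instruction, low word first. */
struct inst128 {
   uint64_t data[2];
};

/* Bit position taken from the FF(null_rt, ...) entry in the patch. */
#define NULL_RT_BIT 80u

static void
inst_set_bit(struct inst128 *inst, unsigned bit, bool value)
{
   const uint64_t mask = UINT64_C(1) << (bit % 64);

   if (value)
      inst->data[bit / 64] |= mask;
   else
      inst->data[bit / 64] &= ~mask;
}

static bool
inst_get_bit(const struct inst128 *inst, unsigned bit)
{
   return (inst->data[bit / 64] >> (bit % 64)) & 1;
}

/* Sketch of the call the patch adds for gen11+: force the bit to a
 * known value instead of leaving whatever was in the word. */
static void
set_null_rt(struct inst128 *inst, bool value)
{
   inst_set_bit(inst, NULL_RT_BIT, value);
}
```

Bit 80 lands in the upper word at offset 16, so clearing it leaves the lower 64 bits untouched.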
[Mesa-dev] [PATCH 08/11] intel: Disable fast color clear on icl
From: Anuj Phogat

Disabling fast color clear makes fbo-clearmipmap test render correct
texture in base miplevel. Fast color clear is anyways disabled for
non-base miplevels.
---
 src/mesa/drivers/dri/i965/brw_blorp.c | 4
 1 file changed, 4 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.c b/src/mesa/drivers/dri/i965/brw_blorp.c
index 72578b6ea5c..bee8e409897 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.c
+++ b/src/mesa/drivers/dri/i965/brw_blorp.c
@@ -1228,6 +1228,10 @@ do_single_blorp_clear(struct brw_context *brw, struct gl_framebuffer *fb,
       }
    }

+   /* FINISHME: Debug and enable fast clears */
+   if (devinfo->gen >= 11)
+      can_fast_clear = false;
+
    if (can_fast_clear) {
       const enum isl_aux_state aux_state =
          intel_miptree_get_aux_state(irb->mt, irb->mt_level, irb->mt_layer);
--
2.16.1
[Mesa-dev] [PATCH 01/11] intel/tools/aubinator: Drop platform list from print_help()
We all know the platform names, and I don't want to update this list
continually.
---
 src/intel/tools/aubinator.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c
index 8029dc12155..2a72efa8a2c 100644
--- a/src/intel/tools/aubinator.c
+++ b/src/intel/tools/aubinator.c
@@ -548,7 +548,7 @@ print_help(const char *progname, FILE *file)
            "Decode aub file contents from either FILE or the standard input.\n\n"
            "A valid --gen option must be provided.\n\n"
            " --help display this help and exit\n"
-           " --gen=platform decode for given platform (ivb, byt, hsw, bdw, chv, skl, kbl, bxt or cnl)\n"
+           " --gen=platform decode for given platform (3 letter platform name)\n"
            " --headers decode only command headers\n"
            " --color[=WHEN] colorize the output; WHEN can be 'auto' (default\n"
            "if omitted), 'always', or 'never'\n"
--
2.16.1
[Mesa-dev] [PATCH] st: Allow accelerated CopyTexImage from RGBA to RGB.
There's nothing to worry about here -- the A channel just gets dropped
by the blit. This avoids a segfault in the fallback path when copying
from a RGBA16_SINT renderbuffer to a RGB16_SINT destination represented
by an RGBA16_SINT texture (the fallback path tries to get/fetch to float
buffers, but the float pack/unpack functions are NULL for SINT/UINT).

Fixes KHR-GLES3.packed_pixels.pbo_rectangle.rgba16i on VC5.
---
 src/mesa/state_tracker/st_cb_texture.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/mesa/state_tracker/st_cb_texture.c b/src/mesa/state_tracker/st_cb_texture.c
index 6345ead6396b..469a82a75390 100644
--- a/src/mesa/state_tracker/st_cb_texture.c
+++ b/src/mesa/state_tracker/st_cb_texture.c
@@ -2327,8 +2327,10 @@ st_CopyTexSubImage(struct gl_context *ctx, GLuint dims,
    /* The base internal format must match the mesa format, so make sure
     * e.g. an RGB internal format is really allocated as RGB and not as RGBA.
     */
-   if (texImage->_BaseFormat !=
-       _mesa_get_format_base_format(texImage->TexFormat) ||
+   if ((texImage->_BaseFormat !=
+        _mesa_get_format_base_format(texImage->TexFormat) &&
+        (texImage->_BaseFormat != GL_RGB ||
+         _mesa_get_format_base_format(texImage->TexFormat) != GL_RGBA)) ||
        rb->_BaseFormat != _mesa_get_format_base_format(rb->Format)) {
       goto fallback;
    }
--
2.16.2
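The new condition is a little dense, so here it is restated as a standalone predicate. This is a sketch only: the enum and needs_fallback() are hypothetical, and each argument stands for a value the real code reads from the texture image or renderbuffer.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the GL base format enums involved. */
enum base_format { BASE_RGB, BASE_RGBA, BASE_OTHER };

/* tex_base / rb_base:    base format requested by the application
 * tex_alloc / rb_alloc:  base format of the format actually allocated
 *
 * Returns true when the accelerated path must be skipped. */
static bool
needs_fallback(enum base_format tex_base, enum base_format tex_alloc,
               enum base_format rb_base, enum base_format rb_alloc)
{
   const bool tex_mismatch = tex_base != tex_alloc;

   /* The one mismatch the patch tolerates: RGB stored in an RGBA texture. */
   const bool rgb_in_rgba = tex_base == BASE_RGB && tex_alloc == BASE_RGBA;

   return (tex_mismatch && !rgb_in_rgba) || rb_base != rb_alloc;
}
```

By De Morgan's law, !rgb_in_rgba is the same as (tex_base != RGB || tex_alloc != RGBA), which is how the patch writes it. The reverse case, an RGBA request backed by an RGB allocation, still falls back.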
Re: [Mesa-dev] [PATCH 5/5] clover: Dynamically calculate __OPENCL_VERSION__ and CLC language version
On Wed, Mar 21, 2018 at 2:52 PM, Aaron Watry wrote:
> On Wed, Mar 21, 2018 at 2:37 PM, Mark Janes wrote:
>> Aaron, this patch breaks the meson build-test in our CI:
>>
>> ../src/gallium/state_trackers/clover/llvm/invocation.cpp:88:36: error:
>> ‘lang_opencl10’ is not a member of ‘clang::LangStandard’
>>          { 100, clang::LangStandard::lang_opencl10},
>>
>> configured with:
>>
>> meson -Dbuild-tests=true
>> -Dgallium-drivers=r300,r600,radeonsi,nouveau,swrast,swr,freedreno,vc4,pl111,etnaviv,imx,svga,virgl
>> -Dgallium-vdpau=true -Dgallium-xvmc=true -Dgallium-xa=true
>> -Dgallium-va=true -Dgallium-nine=true -Dgallium-opencl=standalone
>> -Dgallium-omx=bellagio
>
> I've seen issues with building clover in the past when an incomplete
> set of clang headers is installed. This happens to be the case with at
> least Ubuntu's stock packaged clang. I'm not really sure what your CI
> system is running, but I did just verify that I was able to build my
> full mesa stack in meson with the following build configuration (what
> my normal build script uses when I tell it to use its meson path):
>
> CXXFLAGS=' -O2' CFLAGS=' -O2 -march=native' LD='ld.gold' LDFLAGS=''
> CC='gcc' CXX='g++' meson --prefix /usr/local -D dri-drivers=
> --sysconfdir /etc --libdir /usr/local/lib --buildtype release
> --buildtype release -D gallium-opencl=icd -D gles1=true -D gles2=true
> -D texture-float=true -D gallium-va=true -D gallium-xvmc=false -D
> build-tests=true -D gallium-drivers=radeonsi,r600,swrast -D
> vulkan-drivers=radeon ../
>
> That being said, I'm building against an llvm/clang 7.0 build
> installed in /usr/local which meson picks up in preference to the
> system headers.

I think I see what's going on. The opencl 1.0 language standard was
renamed in clang/Frontend/LangStandards.def between clang 4.0 and 5.0.
It used to just be opencl, now it's opencl10 in clang 5+.

I'll work on a patch to use the clang version to switch the definition
out appropriately.

--Aaron

>
> --Aaron
>
>>
>> Pierre Moreau writes:
>>
>>> Oops, sorry.
>>>
>>> Reviewed-by: Pierre Moreau
>>>
>>> Thanks again for the series!
>>> Pierre
>>>
>>> On 2018-03-20 — 20:23, Aaron Watry wrote:
>>>> ping.
>>>>
>>>> This is the last of the series that still needs review.
>>>>
>>>> --Aaron
>>>>
>>>> On Thu, Mar 1, 2018 at 1:39 PM, Aaron Watry wrote:
>>>>> Use get_language_version to calculate default cl standard based on
>>>>> device capabilities and -cl-std specified in build options.
>>>>>
>>>>> v4: Squash the __OPENCL_VERSION__ and CLC language version patches
>>>>> v3: (Jan) Allow device_version up to 2.2 while device_clc_version
>>>>>     only goes to 2.0
>>>>>     Use get_cl_version to calculate version instead
>>>>> v2: Split out from the previous patch (Pierre)
>>>>>
>>>>> Signed-off-by: Aaron Watry
>>>>> CC: Pierre Moreau
>>>>> CC: Jan Vesely
>>>>> ---
>>>>>  src/gallium/state_trackers/clover/llvm/invocation.cpp | 6 --
>>>>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp b/src/gallium/state_trackers/clover/llvm/invocation.cpp
>>>>> index 8d76f203de..f146695585 100644
>>>>> --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
>>>>> +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
>>>>> @@ -194,7 +194,7 @@ namespace {
>>>>>        compat::set_lang_defaults(c->getInvocation(), c->getLangOpts(),
>>>>>                                  compat::ik_opencl, ::llvm::Triple(target.triple),
>>>>>                                  c->getPreprocessorOpts(),
>>>>> -                                clang::LangStandard::lang_opencl11);
>>>>> +                                get_language_version(opts, device_clc_version));
>>>>>
>>>>>        c->createDiagnostics(new clang::TextDiagnosticPrinter(
>>>>>                                *new raw_string_ostream(r_log),
>>>>> @@ -225,7 +225,9 @@ namespace {
>>>>>        c.getPreprocessorOpts().Includes.push_back("clc/clc.h");
>>>>>
>>>>>        // Add definition for the OpenCL version
>>>>> -      c.getPreprocessorOpts().addMacroDef("__OPENCL_VERSION__=110");
>>>>> +      c.getPreprocessorOpts().addMacroDef("__OPENCL_VERSION__=" +
>>>>> +              std::to_string(get_cl_version(
>>>>> +                  dev.device_version()).version_number));
>>>>>
>>>>>        // clc.h requires that this macro be defined:
>>>>>        c.getPreprocessorOpts().addMacroDef("cl_clang_storage_class_specifiers");
>>>>> --
>>>>> 2.14.1
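Two small mechanics come up in this thread: the OpenCL 1.0 language standard being called opencl before clang 5 and opencl10 from clang 5 on, and the __OPENCL_VERSION__ define being built from a numeric version instead of being hard-coded to 110. The sketch below illustrates both in plain C. The function names are hypothetical, the strings are only labels for the two enumerator spellings mentioned in the discussion, and clover itself does this in C++ against the clang API.

```c
#include <stddef.h>
#include <stdio.h>

/* Label for the OpenCL 1.0 language standard as a function of the clang
 * major version, following the rename described in the thread. */
static const char *
cl10_lang_std_name(int clang_major)
{
   return clang_major >= 5 ? "opencl10" : "opencl";
}

/* Build the preprocessor definition for a numeric CL version such as
 * 110 or 120. Returns the snprintf() result. */
static int
format_opencl_version_define(char *buf, size_t size, unsigned version_number)
{
   return snprintf(buf, size, "__OPENCL_VERSION__=%u", version_number);
}
```

For a device reporting version number 120, the second helper produces the string __OPENCL_VERSION__=120, where the code before the patch always passed 110.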
Re: [Mesa-dev] [RFC] Mesa release improvements - Feature and Stable releases
On 14 March 2018 at 20:13, Andres Gomez wrote:
> On Wed, 2018-03-14 at 16:02 +, Emil Velikov wrote:
>
> [...]
>>
>> Just double-checking:
>> I would suspect you're not suggesting removing the existing email/poke
>> scheme?
>
> Partially. The "announce" mail for the pre-branching period will still
> happen, pointing to the "Metabug" in which to add the WIP features that
> developers intend to land before the deadline.
>
> If some of the developers just reply by mail/IRC/you-name-it, then it
> will be the release manager task to add the blocking bugs with the WIP
> features, as a way of documenting them.
>
Ack makes sense.

>> Providing another means to devs to track/handle things is good IMHO.
>> Whether developers will like it is up-to them. Everyone, your input is
>> appreciated!
>>
>>
>> I'm slightly worried that it might cause extra confusion.
>> Some crude examples follow:
>> - I don't use bugzilla/etc to track my feature work - most teams
>
> I don't think much interaction/documentation is needed. Just mention
> the WIP feature and update its status eventually ... and only for the
> ones developer X wants to have at branchpoint Y before that happens.
> The rest of the work of developer X doesn't need to be in Bugzilla.
>
The gist sounds fine.

>> - Do I open another bug, or list my feature in the metabug - seeming
>> an ongoing theme with metabugs
>
> I think it should be a new blocking bug but I'm open to just document
> it in the Metabug.
>
I'm also inclined to have each feature as separate bug, all listed in
the metabug.

>> - Do I add the bug, reply to the email or both
>
> Preferably, just add the bug.
>
> Once the bug is created and all the parties are in Cc for the bug, I
> understand there is no need for any other way of communication. I'm
> still open to reconsidering, though.
>
All in all the idea sounds sane.

As a summary/overall:
 - Maintainer: open meta bug, send usual release plan + mention that
   feature should be added to the tracker
 - Developers: add bugs to tracker, or
 - Developers: list via other means (email/IRC) of the features they're
   aiming for
   -> Maintainer: add those to the tracker
 - Maintainer: follow-up reminders closer to the branch point
 - Maintainer: one week before branchpoint check with developers if
   features are still on track - drop otherwise

And as always:
 - Developers: can propose minor adjustments (need a definition, say 1
   week?) to the schedule up-to the 3 days before the branchpoint.

Or in a sentence:
There's no actual changes, things are better documented and more
explicit for everyone to see.

I believe that describes it nicely, right?

Andres care to do the honours and add that to the existing
documentation? Might be worth splitting it out to separate page, since
the existing one is getting bit cluttered.

Thanks
Emil
[Mesa-dev] [PATCH 1/3] radv: add radv_image_is_tc_compat_htile() helper
Instead of that huge conditional that's going to be crazy.

Signed-off-by: Samuel Pitoiset
---
 src/amd/vulkan/radv_image.c | 56 -
 1 file changed, 45 insertions(+), 11 deletions(-)

diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan/radv_image.c
index 5ac0f72589..6e5f3e7ad0 100644
--- a/src/amd/vulkan/radv_image.c
+++ b/src/amd/vulkan/radv_image.c
@@ -63,6 +63,50 @@ radv_choose_tiling(struct radv_device *device,
 	return RADEON_SURF_MODE_2D;
 }

+
+static bool
+radv_image_is_tc_compat_htile(struct radv_device *device,
+			      const VkImageCreateInfo *pCreateInfo)
+{
+	/* TC-compat HTILE is only available for GFX8+. */
+	if (device->physical_device->rad_info.chip_class < VI)
+		return false;
+
+	if (pCreateInfo->usage & VK_IMAGE_USAGE_STORAGE_BIT)
+		return false;
+
+	if (pCreateInfo->flags & (VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT |
+				  VK_IMAGE_CREATE_EXTENDED_USAGE_BIT_KHR))
+		return false;
+
+	if (pCreateInfo->tiling == VK_IMAGE_TILING_LINEAR)
+		return false;
+
+	if (pCreateInfo->mipLevels > 1)
+		return false;
+
+	/* FIXME: for some reason TC compat with 2/4/8 samples breaks some cts
+	 * tests - disable for now */
+	if (pCreateInfo->samples >= 2 &&
+	    pCreateInfo->format == VK_FORMAT_D32_SFLOAT_S8_UINT)
+		return false;
+
+	if (device->physical_device->rad_info.chip_class >= GFX9) {
+		/* GFX9 supports both 32-bit and 16-bit depth surfaces. */
+		if (pCreateInfo->format != VK_FORMAT_D32_SFLOAT_S8_UINT &&
+		    pCreateInfo->format != VK_FORMAT_D32_SFLOAT &&
+		    pCreateInfo->format != VK_FORMAT_D16_UNORM)
+			return false;
+	} else {
+		/* GFX8 only supports 32-bit depth surfaces. */
+		if (pCreateInfo->format != VK_FORMAT_D32_SFLOAT_S8_UINT &&
+		    pCreateInfo->format != VK_FORMAT_D32_SFLOAT)
+			return false;
+	}
+
+	return true;
+}
+
 static int
 radv_init_surface(struct radv_device *device,
 		  struct radeon_surf *surface,
@@ -109,17 +153,7 @@ radv_init_surface(struct radv_device *device,
 	if (is_depth) {
 		surface->flags |= RADEON_SURF_ZBUFFER;

-		if (!(pCreateInfo->usage & VK_IMAGE_USAGE_STORAGE_BIT) &&
-		    !(pCreateInfo->flags & (VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT |
-					    VK_IMAGE_CREATE_EXTENDED_USAGE_BIT_KHR)) &&
-		    pCreateInfo->tiling != VK_IMAGE_TILING_LINEAR &&
-		    pCreateInfo->mipLevels <= 1 &&
-		    device->physical_device->rad_info.chip_class >= VI &&
-		    ((pCreateInfo->format == VK_FORMAT_D32_SFLOAT ||
-		      /* for some reason TC compat with 2/4/8 samples breaks some cts tests - disable for now */
-		      (pCreateInfo->samples < 2 && pCreateInfo->format == VK_FORMAT_D32_SFLOAT_S8_UINT)) ||
-		     (device->physical_device->rad_info.chip_class >= GFX9 &&
-		      pCreateInfo->format == VK_FORMAT_D16_UNORM)))
+		if (radv_image_is_tc_compat_htile(device, pCreateInfo))
 			surface->flags |= RADEON_SURF_TC_COMPATIBLE_HTILE;
 	}

--
2.16.2
[Mesa-dev] [PATCH 2/3] radv: add radv_calc_decompress_on_z_planes() helper
Signed-off-by: Samuel Pitoiset--- src/amd/vulkan/radv_device.c | 51 1 file changed, 37 insertions(+), 14 deletions(-) diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c index 36ba0c3833..22500bfc13 100644 --- a/src/amd/vulkan/radv_device.c +++ b/src/amd/vulkan/radv_device.c @@ -3597,6 +3597,35 @@ radv_initialise_color_surface(struct radv_device *device, } } +static unsigned +radv_calc_decompress_on_z_planes(struct radv_device *device, +struct radv_image_view *iview) +{ + unsigned max_zplanes = 0; + + assert(iview->image->tc_compatible_htile); + + if (device->physical_device->rad_info.chip_class >= GFX9) { + /* Default value for 32-bit depth surfaces. */ + max_zplanes = 4; + + if (iview->vk_format == VK_FORMAT_D16_UNORM && + iview->image->info.samples > 1) + max_zplanes = 2; + + max_zplanes = max_zplanes + 1; + } else { + if (iview->image->info.samples <= 1) + max_zplanes = 5; + else if (iview->image->info.samples <= 4) + max_zplanes = 3; + else + max_zplanes = 2; + } + + return max_zplanes; +} + static void radv_initialise_ds_surface(struct radv_device *device, struct radv_ds_buffer_info *ds, @@ -3667,14 +3696,11 @@ radv_initialise_ds_surface(struct radv_device *device, ds->db_z_info |= S_028038_TILE_SURFACE_ENABLE(1); if (iview->image->tc_compatible_htile) { - unsigned max_zplanes = 4; - - if (iview->vk_format == VK_FORMAT_D16_UNORM && - iview->image->info.samples > 1) - max_zplanes = 2; + unsigned max_zplanes = + radv_calc_decompress_on_z_planes(device, iview); - ds->db_z_info |= S_028038_DECOMPRESS_ON_N_ZPLANES(max_zplanes + 1) | - S_028038_ITERATE_FLUSH(1); + ds->db_z_info |= S_028038_DECOMPRESS_ON_N_ZPLANES(max_zplanes) | +S_028038_ITERATE_FLUSH(1); ds->db_stencil_info |= S_02803C_ITERATE_FLUSH(1); } @@ -3752,14 +3778,11 @@ radv_initialise_ds_surface(struct radv_device *device, ds->db_htile_surface = S_028ABC_FULL_CACHE(1); if (iview->image->tc_compatible_htile) { - ds->db_htile_surface |= S_028ABC_TC_COMPATIBLE(1); + unsigned max_zplanes 
= + radv_calc_decompress_on_z_planes(device, iview); - if (iview->image->info.samples <= 1) - ds->db_z_info |= S_028040_DECOMPRESS_ON_N_ZPLANES(5); - else if (iview->image->info.samples <= 4) - ds->db_z_info |= S_028040_DECOMPRESS_ON_N_ZPLANES(3); - else - ds->db_z_info|= S_028040_DECOMPRESS_ON_N_ZPLANES(2); + ds->db_htile_surface |= S_028ABC_TC_COMPATIBLE(1); + ds->db_z_info |= S_028040_DECOMPRESS_ON_N_ZPLANES(max_zplanes); } } } -- 2.16.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
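For readers following the refactor, the selection logic that the new helper centralises can be sketched outside the driver. This is only a standalone model: the enum and the scalar parameters are hypothetical stand-ins for `rad_info.chip_class`, `iview->vk_format` and `iview->image->info.samples`, not the real radv types.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the radv chip class enum (not the real one). */
enum chip_class_sketch { SKETCH_GFX8, SKETCH_GFX9 };

/* Models the value programmed into DECOMPRESS_ON_N_ZPLANES by this patch. */
static unsigned
calc_decompress_on_z_planes(enum chip_class_sketch chip_class,
                            bool is_d16_unorm, unsigned samples)
{
	unsigned max_zplanes;

	if (chip_class >= SKETCH_GFX9) {
		/* Default value for 32-bit depth surfaces. */
		max_zplanes = 4;

		if (is_d16_unorm && samples > 1)
			max_zplanes = 2;

		/* The GFX9 path previously added 1 at the call site. */
		max_zplanes = max_zplanes + 1;
	} else {
		if (samples <= 1)
			max_zplanes = 5;
		else if (samples <= 4)
			max_zplanes = 3;
		else
			max_zplanes = 2;
	}

	return max_zplanes;
}
```

Keeping the "+ 1" inside the helper means both callers can pass the result straight to the register macro.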
[Mesa-dev] [PATCH 3/3] radv: enable TC-compat HTILE for 16-bit depth surfaces on GFX8
The hardware only supports 32-bit depth surfaces, but we can
enable TC-compat HTILE for 16-bit depth surfaces if no Z planes
are compressed.

The main benefit is to reduce the number of depth decompression
passes. Also, we don't need to implement DB->CB copies which is
fine.

This improves Serious Sam 2017 by +4%. Talos and F12017 are also
affected but I don't see a performance difference. This also
improves the shadowmapping Vulkan demo by 10-15% (FPS is now
similar to AMDVLK).

No CTS regressions on Polaris10.

Signed-off-by: Samuel Pitoiset
---
 src/amd/vulkan/radv_device.c | 22
 src/amd/vulkan/radv_image.c  | 20
 2 files changed, 24 insertions(+), 18 deletions(-)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 22500bfc13..9c82fd059f 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -3615,12 +3615,22 @@ radv_calc_decompress_on_z_planes(struct radv_device *device,
 
 		max_zplanes = max_zplanes + 1;
 	} else {
-		if (iview->image->info.samples <= 1)
-			max_zplanes = 5;
-		else if (iview->image->info.samples <= 4)
-			max_zplanes = 3;
-		else
-			max_zplanes = 2;
+		if (iview->vk_format == VK_FORMAT_D16_UNORM) {
+			/* Do not enable Z plane compression for 16-bit depth
+			 * surfaces because isn't supported on GFX8. Only
+			 * 32-bit depth surfaces are supported by the hardware.
+			 * This allows to maintain shader compatibility and to
+			 * reduce the number of depth decompressions.
+			 */
+			max_zplanes = 1;
+		} else {
+			if (iview->image->info.samples <= 1)
+				max_zplanes = 5;
+			else if (iview->image->info.samples <= 4)
+				max_zplanes = 3;
+			else
+				max_zplanes = 2;
+		}
 	}
 
 	return max_zplanes;
diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan/radv_image.c
index 6e5f3e7ad0..dd3189c67d 100644
--- a/src/amd/vulkan/radv_image.c
+++ b/src/amd/vulkan/radv_image.c
@@ -91,18 +91,14 @@ radv_image_is_tc_compat_htile(struct radv_device *device,
 	    pCreateInfo->format == VK_FORMAT_D32_SFLOAT_S8_UINT)
 		return false;
 
-	if (device->physical_device->rad_info.chip_class >= GFX9) {
-		/* GFX9 supports both 32-bit and 16-bit depth surfaces. */
-		if (pCreateInfo->format != VK_FORMAT_D32_SFLOAT_S8_UINT &&
-		    pCreateInfo->format != VK_FORMAT_D32_SFLOAT &&
-		    pCreateInfo->format != VK_FORMAT_D16_UNORM)
-			return false;
-	} else {
-		/* GFX8 only supports 32-bit depth surfaces. */
-		if (pCreateInfo->format != VK_FORMAT_D32_SFLOAT_S8_UINT &&
-		    pCreateInfo->format != VK_FORMAT_D32_SFLOAT)
-			return false;
-	}
+	/* GFX9 supports both 32-bit and 16-bit depth surfaces, while GFX8 only
+	 * supports 32-bit. Though, it's possible to enable TC-compat for
+	 * 16-bit depth surfaces if no Z planes are compressed.
+	 */
+	if (pCreateInfo->format != VK_FORMAT_D32_SFLOAT_S8_UINT &&
+	    pCreateInfo->format != VK_FORMAT_D32_SFLOAT &&
+	    pCreateInfo->format != VK_FORMAT_D16_UNORM)
+		return false;
 
 	return true;
 }
-- 
2.16.2
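As a rough illustration of what this patch changes on GFX8, the two decisions can be modelled in isolation. The enum below is a hypothetical stand-in for the `VkFormat` values (the D24 entry is added here only as an example of a rejected format), and only the final format check of `radv_image_is_tc_compat_htile()` is modelled; the earlier checks in that function are not shown in the patch context and are left out.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the VkFormat values involved. */
enum depth_format_sketch {
	SKETCH_D16_UNORM,
	SKETCH_D24_UNORM_S8_UINT, /* assumption: example of a rejected format */
	SKETCH_D32_SFLOAT,
	SKETCH_D32_SFLOAT_S8_UINT,
};

/* After the patch the format filter no longer depends on the chip class. */
static bool
format_allows_tc_compat_htile(enum depth_format_sketch format)
{
	return format == SKETCH_D32_SFLOAT_S8_UINT ||
	       format == SKETCH_D32_SFLOAT ||
	       format == SKETCH_D16_UNORM;
}

/* GFX8 branch of the helper: 16-bit depth gets no Z plane compression. */
static unsigned
gfx8_max_zplanes(enum depth_format_sketch format, unsigned samples)
{
	if (format == SKETCH_D16_UNORM)
		return 1;

	if (samples <= 1)
		return 5;
	else if (samples <= 4)
		return 3;
	else
		return 2;
}
```

The value 1 is what keeps the surface shader-readable without a decompression pass, at the cost of not compressing Z planes for D16.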
Re: [Mesa-dev] [PATCH 5/5] clover: Dynamically calculate __OPENCL_VERSION__ and CLC language version
On Wed, Mar 21, 2018 at 2:37 PM, Mark Janes wrote:
> Aaron, this patch breaks the meson build-test in our CI:
>
> ../src/gallium/state_trackers/clover/llvm/invocation.cpp:88:36: error:
> ‘lang_opencl10’ is not a member of ‘clang::LangStandard’
>   { 100, clang::LangStandard::lang_opencl10},
>
> configured with:
>
> meson -Dbuild-tests=true
> -Dgallium-drivers=r300,r600,radeonsi,nouveau,swrast,swr,freedreno,vc4,pl111,etnaviv,imx,svga,virgl
> -Dgallium-vdpau=true -Dgallium-xvmc=true -Dgallium-xa=true -Dgallium-va=true
> -Dgallium-nine=true -Dgallium-opencl=standalone -Dgallium-omx=bellagio

I've seen issues with building clover in the past when an incomplete
set of clang headers is installed. This happens to be the case with at
least Ubuntu's stock packaged clang.

I'm not really sure what your CI system is running, but I did just
verify that I was able to build my full mesa stack in meson with the
following build configuration (what my normal build script uses when I
tell it to use its meson path):

CXXFLAGS=' -O2' CFLAGS=' -O2 -march=native' LD='ld.gold' LDFLAGS=''
CC='gcc' CXX='g++' meson --prefix /usr/local -D dri-drivers=
--sysconfdir /etc --libdir /usr/local/lib --buildtype release
--buildtype release -D gallium-opencl=icd -D gles1=true -D gles2=true
-D texture-float=true -D gallium-va=true -D gallium-xvmc=false -D
build-tests=true -D gallium-drivers=radeonsi,r600,swrast -D
vulkan-drivers=radeon ../

That being said, I'm building against an llvm/clang 7.0 build
installed in /usr/local which meson picks up in preference to the
system headers.

--Aaron

>
> Pierre Moreau writes:
>
>> Oops, sorry.
>>
>> Reviewed-by: Pierre Moreau
>>
>> Thanks again for the series!
>> Pierre
>>
>> On 2018-03-20 — 20:23, Aaron Watry wrote:
>>> ping.
>>>
>>> This is the last of the series that still needs review.
>>> >>> --Aaron >>> >>> On Thu, Mar 1, 2018 at 1:39 PM, Aaron Watry wrote: >>> > Use get_language_version to calculate default cl standard based on >>> > device capabilities and -cl-std specified in build options. >>> > >>> > v4: Squash the __OPENCL_VERSION__ and CLC language version patches >>> > v3: (Jan) Allow device_version up to 2.2 while device_clc_version >>> > only goes to 2.0 >>> > Use get_cl_version to calculate version instead >>> > v2: Split out from the previous patch (Pierre) >>> > >>> > Signed-off-by: Aaron Watry >>> > CC: Pierre Moreau >>> > CC: Jan Vesely >>> > --- >>> > src/gallium/state_trackers/clover/llvm/invocation.cpp | 6 -- >>> > 1 file changed, 4 insertions(+), 2 deletions(-) >>> > >>> > diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp >>> > b/src/gallium/state_trackers/clover/llvm/invocation.cpp >>> > index 8d76f203de..f146695585 100644 >>> > --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp >>> > +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp >>> > @@ -194,7 +194,7 @@ namespace { >>> >compat::set_lang_defaults(c->getInvocation(), c->getLangOpts(), >>> > compat::ik_opencl, >>> > ::llvm::Triple(target.triple), >>> > c->getPreprocessorOpts(), >>> > -clang::LangStandard::lang_opencl11); >>> > +get_language_version(opts, >>> > device_clc_version)); >>> > >>> >c->createDiagnostics(new clang::TextDiagnosticPrinter( >>> >*new raw_string_ostream(r_log), >>> > @@ -225,7 +225,9 @@ namespace { >>> >c.getPreprocessorOpts().Includes.push_back("clc/clc.h"); >>> > >>> >// Add definition for the OpenCL version >>> > - c.getPreprocessorOpts().addMacroDef("__OPENCL_VERSION__=110"); >>> > + c.getPreprocessorOpts().addMacroDef("__OPENCL_VERSION__=" + >>> > + std::to_string(get_cl_version( >>> > + dev.device_version()).version_number)); >>> > >>> >// clc.h requires that this macro be defined: >>> > >>> > c.getPreprocessorOpts().addMacroDef("cl_clang_storage_class_specifiers"); >>> > -- >>> > 2.14.1 >>> > >> ___ >> 
mesa-dev mailing list >> mesa-dev@lists.freedesktop.org >> https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 5/5] clover: Dynamically calculate __OPENCL_VERSION__ and CLC language version
Aaron, this patch breaks the meson build-test in our CI:

../src/gallium/state_trackers/clover/llvm/invocation.cpp:88:36: error:
‘lang_opencl10’ is not a member of ‘clang::LangStandard’
  { 100, clang::LangStandard::lang_opencl10},

configured with:

meson -Dbuild-tests=true
-Dgallium-drivers=r300,r600,radeonsi,nouveau,swrast,swr,freedreno,vc4,pl111,etnaviv,imx,svga,virgl
-Dgallium-vdpau=true -Dgallium-xvmc=true -Dgallium-xa=true -Dgallium-va=true
-Dgallium-nine=true -Dgallium-opencl=standalone -Dgallium-omx=bellagio

Pierre Moreau writes:

> Oops, sorry.
>
> Reviewed-by: Pierre Moreau
>
> Thanks again for the series!
> Pierre
>
> On 2018-03-20 — 20:23, Aaron Watry wrote:
>> ping.
>>
>> This is the last of the series that still needs review.
>>
>> --Aaron
>>
>> On Thu, Mar 1, 2018 at 1:39 PM, Aaron Watry wrote:
>> > Use get_language_version to calculate default cl standard based on
>> > device capabilities and -cl-std specified in build options.
>> >
>> > v4: Squash the __OPENCL_VERSION__ and CLC language version patches
>> > v3: (Jan) Allow device_version up to 2.2 while device_clc_version
>> >       only goes to 2.0
>> >     Use get_cl_version to calculate version instead
>> > v2: Split out from the previous patch (Pierre)
>> >
>> > Signed-off-by: Aaron Watry
>> > CC: Pierre Moreau
>> > CC: Jan Vesely
>> > ---
>> >  src/gallium/state_trackers/clover/llvm/invocation.cpp | 6
>> >  1 file changed, 4 insertions(+), 2 deletions(-)
>> >
>> > diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp
>> > b/src/gallium/state_trackers/clover/llvm/invocation.cpp
>> > index 8d76f203de..f146695585 100644
>> > --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
>> > +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
>> > @@ -194,7 +194,7 @@ namespace {
>> >        compat::set_lang_defaults(c->getInvocation(), c->getLangOpts(),
>> >                                  compat::ik_opencl,
>> >                                  ::llvm::Triple(target.triple),
>> >                                  c->getPreprocessorOpts(),
>> > -                                clang::LangStandard::lang_opencl11);
>> > +                                get_language_version(opts,
>> > device_clc_version));
>> >
>> >        c->createDiagnostics(new clang::TextDiagnosticPrinter(
>> >                                *new raw_string_ostream(r_log),
>> > @@ -225,7 +225,9 @@ namespace {
>> >        c.getPreprocessorOpts().Includes.push_back("clc/clc.h");
>> >
>> >        // Add definition for the OpenCL version
>> > -      c.getPreprocessorOpts().addMacroDef("__OPENCL_VERSION__=110");
>> > +      c.getPreprocessorOpts().addMacroDef("__OPENCL_VERSION__=" +
>> > +              std::to_string(get_cl_version(
>> > +                  dev.device_version()).version_number));
>> >
>> >        // clc.h requires that this macro be defined:
>> >
>> > c.getPreprocessorOpts().addMacroDef("cl_clang_storage_class_specifiers");
>> > --
>> > 2.14.1
>> >
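To make the quoted `__OPENCL_VERSION__` change concrete, here is a small standalone sketch of the idea: derive the numeric version from the device version string rather than hard-coding 110. The thread does not show how clover's `get_cl_version` is implemented, so the "major.minor" parsing and both function names below are assumptions made for illustration only.

```c
#include <stdio.h>

/* Hypothetical helper: map a "major.minor" string such as "1.2" to the
 * numeric form used by __OPENCL_VERSION__ (120). Returns -1 on bad input. */
static int
cl_version_number(const char *version)
{
	int major = 0, minor = 0;

	if (sscanf(version, "%d.%d", &major, &minor) != 2)
		return -1;

	return major * 100 + minor * 10;
}

/* Hypothetical helper: build the macro definition string that would be
 * handed to the preprocessor options. Returns snprintf's result, or -1. */
static int
format_opencl_version_macro(char *buf, size_t len, const char *version)
{
	int number = cl_version_number(version);

	if (number < 0)
		return -1;

	return snprintf(buf, len, "__OPENCL_VERSION__=%d", number);
}
```

With a device reporting "1.1" this yields the same definition the old hard-coded line produced.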
[Mesa-dev] [PATCH v2 4/4] spirv: Accept doubles in FaceForward, Reflect and Refract
The SPIR-V spec doesn’t specify a size requirement for these and the
equivalent functions in the GLSL spec have explicit alternatives for
doubles. Refract is a little bit more complicated due to the fact that
the final argument is always supposed to be a scalar 32- or 16-bit
float regardless of the other operands. However in practice it seems
there is a bug in glslang that makes it convert the argument to 64-bit
if you actually try to pass it a 32-bit value while the other
arguments are 64-bit. This adds an optional conversion of the final
argument in order to support any type.

These have been tested against the automatically generated tests of
glsl-4.00/execution/built-in-functions using the ARB_gl_spirv branch
which tests it with quite a large range of combinations.

The issue with glslang has been filed here:
https://github.com/KhronosGroup/glslang/issues/1279

v2: Convert the eta operand of Refract from any size in order to make
    it eventually cope with 16-bit floats.
---
 src/compiler/spirv/vtn_glsl450.c | 22
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/src/compiler/spirv/vtn_glsl450.c b/src/compiler/spirv/vtn_glsl450.c
index 50783fbfb4d..0cabedf741d 100644
--- a/src/compiler/spirv/vtn_glsl450.c
+++ b/src/compiler/spirv/vtn_glsl450.c
@@ -628,14 +628,14 @@ handle_glsl450_alu(struct vtn_builder *b, enum GLSLstd450 entrypoint,
    case GLSLstd450FaceForward:
       val->ssa->def =
          nir_bcsel(nb, nir_flt(nb, nir_fdot(nb, src[2], src[1]),
-                                   nir_imm_float(nb, 0.0)),
+                                   NIR_IMM_FP(nb, 0.0)),
                        src[0], nir_fneg(nb, src[0]));
       return;
 
    case GLSLstd450Reflect:
       /* I - 2 * dot(N, I) * N */
       val->ssa->def =
-         nir_fsub(nb, src[0], nir_fmul(nb, nir_imm_float(nb, 2.0),
+         nir_fsub(nb, src[0], nir_fmul(nb, NIR_IMM_FP(nb, 2.0),
                               nir_fmul(nb, nir_fdot(nb, src[0], src[1]),
                                            src[1])));
       return;
@@ -645,8 +645,22 @@ handle_glsl450_alu(struct vtn_builder *b, enum GLSLstd450 entrypoint,
       nir_ssa_def *N = src[1];
       nir_ssa_def *eta = src[2];
       nir_ssa_def *n_dot_i = nir_fdot(nb, N, I);
-      nir_ssa_def *one = nir_imm_float(nb, 1.0);
-      nir_ssa_def *zero = nir_imm_float(nb, 0.0);
+      nir_ssa_def *one = NIR_IMM_FP(nb, 1.0);
+      nir_ssa_def *zero = NIR_IMM_FP(nb, 0.0);
+      /* According to the SPIR-V and GLSL specs, eta is always a float
+       * regardless of the type of the other operands. However in practice it
+       * seems that if you try to pass it a float then glslang will just
+       * promote it to a double and generate invalid SPIR-V. In order to
+       * support a hypothetical fixed version of glslang we’ll promote eta to
+       * double if the other operands are double also.
+       */
+      if (I->bit_size != eta->bit_size) {
+         nir_op conversion_op =
+            nir_type_conversion_op(nir_type_float | eta->bit_size,
+                                   nir_type_float | I->bit_size,
+                                   nir_rounding_mode_undef);
+         eta = nir_build_alu(nb, conversion_op, eta, NULL, NULL, NULL);
+      }
       /* k = 1.0 - eta * eta * (1.0 - dot(N, I) * dot(N, I)) */
       nir_ssa_def *k =
          nir_fsub(nb, one, nir_fmul(nb, eta, nir_fmul(nb, eta,
-- 
2.14.3
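For reference, the arithmetic these three builtins perform on doubles can be sketched in plain C. This is not the NIR lowering, only a scalar model of the same formulas; `struct dvec3` and all function names are hypothetical. `refract_k` shows the one subtle point of the patch: a 32-bit eta is widened to the operand width before it is used in the k term.

```c
/* Hypothetical 3-component double vector, standing in for a dvec3. */
struct dvec3 { double x, y, z; };

static double
dot_f64(struct dvec3 a, struct dvec3 b)
{
   return a.x * b.x + a.y * b.y + a.z * b.z;
}

/* FaceForward: N if dot(Nref, I) < 0, otherwise -N. */
static struct dvec3
faceforward_f64(struct dvec3 n, struct dvec3 i, struct dvec3 nref)
{
   if (dot_f64(nref, i) < 0.0)
      return n;
   return (struct dvec3){ -n.x, -n.y, -n.z };
}

/* Reflect: I - 2 * dot(N, I) * N, with the constant 2 as a double. */
static struct dvec3
reflect_f64(struct dvec3 i, struct dvec3 n)
{
   double s = 2.0 * dot_f64(n, i);
   return (struct dvec3){ i.x - s * n.x, i.y - s * n.y, i.z - s * n.z };
}

/* Refract's k term: k = 1.0 - eta * eta * (1.0 - dot(N, I) * dot(N, I)).
 * eta arrives as a 32-bit float and is converted to double first. */
static double
refract_k(double n_dot_i, float eta)
{
   double eta64 = (double)eta;
   return 1.0 - eta64 * eta64 * (1.0 - n_dot_i * n_dot_i);
}
```

A negative k corresponds to total internal reflection, where Refract returns the zero vector.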
[Mesa-dev] [PATCH v2 3/4] spirv: Add a 64-bit implementation of OpIsInf
The only change necessary is to change the type of the constant used to
compare against.

This has been tested against the arb_gpu_shader_fp64/execution/
fs-isinf-dvec tests using the ARB_gl_spirv branch.

v2: Use nir_imm_floatN_t for the constant.
---
 src/compiler/spirv/vtn_alu.c | 7
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/src/compiler/spirv/vtn_alu.c b/src/compiler/spirv/vtn_alu.c
index 01be397e271..3226e5b8739 100644
--- a/src/compiler/spirv/vtn_alu.c
+++ b/src/compiler/spirv/vtn_alu.c
@@ -563,10 +563,11 @@ vtn_handle_alu(struct vtn_builder *b, SpvOp opcode,
       val->ssa->def = nir_fne(&b->nb, src[0], src[0]);
       break;
 
-   case SpvOpIsInf:
-      val->ssa->def = nir_ieq(&b->nb, nir_fabs(&b->nb, src[0]),
-                                      nir_imm_float(&b->nb, INFINITY));
+   case SpvOpIsInf: {
+      nir_ssa_def *inf = nir_imm_floatN_t(&b->nb, INFINITY, src[0]->bit_size);
+      val->ssa->def = nir_ieq(&b->nb, nir_fabs(&b->nb, src[0]), inf);
       break;
+   }
 
    case SpvOpFUnordEqual:
    case SpvOpFUnordNotEqual:
-- 
2.14.3
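Why the width of the constant matters can be seen from a standalone model of the comparison. The lowering takes the absolute value and then does an integer equality test against infinity, so the infinity constant has to have the same bit pattern width as the operand. The sketch below assumes IEEE 754 binary32/binary64 representations; the function names are hypothetical.

```c
#include <math.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* 32-bit case: clear the sign bit (fabs) and compare bit patterns. */
static bool
is_inf_f32(float x)
{
   const float inf = INFINITY;
   uint32_t x_bits, inf_bits;

   memcpy(&x_bits, &x, sizeof x_bits);
   memcpy(&inf_bits, &inf, sizeof inf_bits);

   return (x_bits & UINT32_C(0x7fffffff)) == inf_bits;
}

/* 64-bit case: same idea, but the constant must be a 64-bit infinity. */
static bool
is_inf_f64(double x)
{
   const double inf = INFINITY;
   uint64_t x_bits, inf_bits;

   memcpy(&x_bits, &x, sizeof x_bits);
   memcpy(&inf_bits, &inf, sizeof inf_bits);

   return (x_bits & UINT64_C(0x7fffffffffffffff)) == inf_bits;
}
```

NaN inputs compare unequal here because their mantissa bits are non-zero, which matches the expected OpIsInf result.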
[Mesa-dev] [PATCH 2/4] spirv: Use nir_imm_floatN_t for constants for GLSL450 builtins
There is an existing macro that is used to choose between either a
float or a double immediate constant based on the bit size of the
first operand to the builtin. This is now changed to use the new
nir_imm_floatN_t helper function to reduce the number of places that
make this decision.
---
 src/compiler/spirv/vtn_glsl450.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/compiler/spirv/vtn_glsl450.c b/src/compiler/spirv/vtn_glsl450.c
index 7d32914d516..50783fbfb4d 100644
--- a/src/compiler/spirv/vtn_glsl450.c
+++ b/src/compiler/spirv/vtn_glsl450.c
@@ -513,7 +513,7 @@ vtn_nir_alu_op_for_spirv_glsl_opcode(struct vtn_builder *b,
    }
 }
 
-#define NIR_IMM_FP(n, v) (src[0]->bit_size == 64 ? nir_imm_double(n, v) : nir_imm_float(n, v))
+#define NIR_IMM_FP(n, v) (nir_imm_floatN_t(n, v, src[0]->bit_size))
 
 static void
 handle_glsl450_alu(struct vtn_builder *b, enum GLSLstd450 entrypoint,
-- 
2.14.3
[Mesa-dev] [PATCH 1/4] nir/builder: Add a nir_imm_floatN_t helper
This lets you easily build float immediates just given the bit size.
If we have this single place here to handle this then it will be
easier to add support for 16-bit floats later.
---
 src/compiler/nir/nir_builder.h | 13
 1 file changed, 13 insertions(+)

diff --git a/src/compiler/nir/nir_builder.h b/src/compiler/nir/nir_builder.h
index 36e0ae3ac63..32f86249ad3 100644
--- a/src/compiler/nir/nir_builder.h
+++ b/src/compiler/nir/nir_builder.h
@@ -227,6 +227,19 @@ nir_imm_double(nir_builder *build, double x)
    return nir_build_imm(build, 1, 64, v);
 }
 
+static inline nir_ssa_def *
+nir_imm_floatN_t(nir_builder *build, double x, unsigned bit_size)
+{
+   switch (bit_size) {
+   case 32:
+      return nir_imm_float(build, x);
+   case 64:
+      return nir_imm_double(build, x);
+   }
+
+   unreachable("unknown float immediate bit size");
+}
+
 static inline nir_ssa_def *
 nir_imm_vec4(nir_builder *build, float x, float y, float z, float w)
 {
-- 
2.14.3
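The dispatch pattern of the helper can be tried out without NIR. The following is only a sketch: `struct imm_value` is a hypothetical stand-in for the SSA def the real helper returns, and `abort()` stands in for Mesa's `unreachable()`.

```c
#include <stdlib.h>

/* Hypothetical stand-in for an immediate of a given float width. */
struct imm_value {
   unsigned bit_size;
   union {
      float f32;
      double f64;
   } v;
};

/* Build a float immediate from a double, given only the bit size. */
static struct imm_value
imm_floatN(double x, unsigned bit_size)
{
   struct imm_value result = { .bit_size = bit_size };

   switch (bit_size) {
   case 32:
      result.v.f32 = (float)x;
      return result;
   case 64:
      result.v.f64 = x;
      return result;
   }

   /* Unknown float immediate bit size; a 16-bit case would be added above. */
   abort();
}
```

Callers then only need to forward the operand's bit size, which is what the NIR_IMM_FP macro in patch 2/4 does.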
[Mesa-dev] [PATCH 0/4] spirv: Support doubles in some builtin functions
This adds support for doubles in some of the builtin functions. The
last two patches have been posted already and are a v2 based on
Jason’s feedback.

These patches come out of testing using the ARB_gl_spirv branch of
Mesa and Piglit. However they also affect Vulkan and can be tested
with VkRunner using the test branch here:

https://github.com/Igalia/vkrunner/tree/tests

The corresponding tests can be run with:

./src/vkrunner \
  examples/{face-forward,reflect,refract,isinf}-double.shader_test \
  examples/refract-double-exp32.shader_test

Neil Roberts (4):
  nir/builder: Add a nir_imm_floatN_t helper
  spirv: Use nir_imm_floatN_t for constants for GLSL450 builtins
  spirv: Add a 64-bit implementation of OpIsInf
  spirv: Accept doubles in FaceForward, Reflect and Refract

 src/compiler/nir/nir_builder.h   | 13
 src/compiler/spirv/vtn_alu.c     | 7
 src/compiler/spirv/vtn_glsl450.c | 24
 3 files changed, 36 insertions(+), 8 deletions(-)

-- 
2.14.3
Re: [Mesa-dev] [PATCH mesa] meson/configure: detect endian.h instead of trying to guess when it's available
On March 21, 2018 6:47:48 PM UTC, Dylan Baker wrote:
> Quoting Emil Velikov (2018-03-21 10:53:08)
> > On 21 March 2018 at 17:09, Eric Engestrom wrote:
> > > Cc: Maxin B. John
> > > Cc: Khem Raj
> > > Suggested-by: Jon Turney
> > > Signed-off-by: Eric Engestrom
> > > ---
> > >  configure.ac        | 1 +
> > >  meson.build         | 2 +-
> > >  src/util/u_endian.h | 2 +-
> > >  3 files changed, 3 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/configure.ac b/configure.ac
> > > index 29d3c3457a7cdaefc36a..36c56da787e4fab5a355 100644
> > > --- a/configure.ac
> > > +++ b/configure.ac
> > > @@ -865,6 +865,7 @@ fi
> > >  AC_HEADER_MAJOR
> > >  AC_CHECK_HEADER([xlocale.h], [DEFINES="$DEFINES -DHAVE_XLOCALE_H"])
> > >  AC_CHECK_HEADER([sys/sysctl.h], [DEFINES="$DEFINES -DHAVE_SYS_SYSCTL_H"])
> > > +AC_CHECK_HEADER([endian.h], [DEFINES="$DEFINES -DHAVE_ENDIAN_H"])
> > >  AC_CHECK_FUNC([strtof], [DEFINES="$DEFINES -DHAVE_STRTOF"])
> > >  AC_CHECK_FUNC([mkostemp], [DEFINES="$DEFINES -DHAVE_MKOSTEMP"])
> > >  AC_CHECK_FUNC([timespec_get], [DEFINES="$DEFINES -DHAVE_TIMESPEC_GET"])
> > Just hit me - why are we using any of these instead of AC_CHECK_FUNCS
> > and AC_CHECK_HEADERS (note the S at the end).
> > Those take a list + automatically set the HAVE_ macros for us.
> >
> > Off the top of my head - we could even use the _ONCE version which
> > should lead to smaller configure file + faster runtime.
> >
> > I'm thinking out loud here ^^, no changes needed ;-)
> >
> > > diff --git a/meson.build b/meson.build
> > > index 88518ec0f0e9b81759a7..1132b4bd37075d8c9d21 100644
> > > --- a/meson.build
> > > +++ b/meson.build
> > > @@ -904,7 +904,7 @@ elif cc.has_header_symbol('sys/mkdev.h', 'major')
> > >    pre_args += '-DMAJOR_IN_MKDEV'
> > >  endif
> > >
> > > -foreach h : ['xlocale.h', 'sys/sysctl.h', 'linux/futex.h']
> > > +foreach h : ['xlocale.h', 'sys/sysctl.h', 'linux/futex.h', 'endian.h']
> > >    if cc.compiles('#include <@0@>'.format(h), name : '@0@ works'.format(h))
> > >      pre_args += '-DHAVE_@0@'.format(h.to_upper().underscorify())
> > >    endif
> > > diff --git a/src/util/u_endian.h b/src/util/u_endian.h
> > > index 22d011ec0086ee77e11c..e11b381588dbc960e8c3 100644
> > > --- a/src/util/u_endian.h
> > > +++ b/src/util/u_endian.h
> > > @@ -27,7 +27,7 @@
> > >  #ifndef U_ENDIAN_H
> > >  #define U_ENDIAN_H
> > >
> > > -#if defined(__GLIBC__) || defined(ANDROID) || defined(__CYGWIN__)
> > > +#ifdef HAVE_ENDIAN_H
> > I'd either keep the ANDROID hunk here, or add -DHAVE_ENDIAN_H to
> > Android.common.mk
> > With slight inclination towards the latter ;-)
> >
> > With that the patch is
> > Reviewed-by: Emil Velikov
> > Cc:
> >
>
> With Emil's fixes,
> Reviewed-by: Dylan Baker

Thanks!

I have the Android.common.mk hunk that I sent in the other email
applied locally; this is what I'll push tomorrow, I'm just giving
RobHer some time to shout if I'm doing it wrong.

Cheers,
  Eric
Re: [Mesa-dev] [PATCH mesa] meson/configure: detect endian.h instead of trying to guess when it's available
Quoting Emil Velikov (2018-03-21 10:53:08) > On 21 March 2018 at 17:09, Eric Engestromwrote: > > Cc: Maxin B. John > > Cc: Khem Raj > > Suggested-by: Jon Turney > > Signed-off-by: Eric Engestrom > > --- > > configure.ac| 1 + > > meson.build | 2 +- > > src/util/u_endian.h | 2 +- > > 3 files changed, 3 insertions(+), 2 deletions(-) > > > > diff --git a/configure.ac b/configure.ac > > index 29d3c3457a7cdaefc36a..36c56da787e4fab5a355 100644 > > --- a/configure.ac > > +++ b/configure.ac > > @@ -865,6 +865,7 @@ fi > > AC_HEADER_MAJOR > > AC_CHECK_HEADER([xlocale.h], [DEFINES="$DEFINES -DHAVE_XLOCALE_H"]) > > AC_CHECK_HEADER([sys/sysctl.h], [DEFINES="$DEFINES -DHAVE_SYS_SYSCTL_H"]) > > +AC_CHECK_HEADER([endian.h], [DEFINES="$DEFINES -DHAVE_ENDIAN_H"]) > > AC_CHECK_FUNC([strtof], [DEFINES="$DEFINES -DHAVE_STRTOF"]) > > AC_CHECK_FUNC([mkostemp], [DEFINES="$DEFINES -DHAVE_MKOSTEMP"]) > > AC_CHECK_FUNC([timespec_get], [DEFINES="$DEFINES -DHAVE_TIMESPEC_GET"]) > Just hit me - why are we using any of these instead of AC_CHECK_FUNCS > and AC_CHECK_HEADERS (note the S at the end). > Those take a list + automatically set the HAVE_ macros for us. > > Off the top of my head - we could even use the _ONCE version which > should lead to smaller configure file + faster runtime. 
> > I'm thinking out loud here ^^, no changes needed ;-) > > > diff --git a/meson.build b/meson.build > > index 88518ec0f0e9b81759a7..1132b4bd37075d8c9d21 100644 > > --- a/meson.build > > +++ b/meson.build > > @@ -904,7 +904,7 @@ elif cc.has_header_symbol('sys/mkdev.h', 'major') > >pre_args += '-DMAJOR_IN_MKDEV' > > endif > > > > -foreach h : ['xlocale.h', 'sys/sysctl.h', 'linux/futex.h'] > > +foreach h : ['xlocale.h', 'sys/sysctl.h', 'linux/futex.h', 'endian.h'] > >if cc.compiles('#include <@0@>'.format(h), name : '@0@ works'.format(h)) > > pre_args += '-DHAVE_@0@'.format(h.to_upper().underscorify()) > >endif > > diff --git a/src/util/u_endian.h b/src/util/u_endian.h > > index 22d011ec0086ee77e11c..e11b381588dbc960e8c3 100644 > > --- a/src/util/u_endian.h > > +++ b/src/util/u_endian.h > > @@ -27,7 +27,7 @@ > > #ifndef U_ENDIAN_H > > #define U_ENDIAN_H > > > > -#if defined(__GLIBC__) || defined(ANDROID) || defined(__CYGWIN__) > > +#ifdef HAVE_ENDIAN_H > I'd either keep the ANDROID hunk here, or add -DHAVE_ENDIAN_H to > Android.common.mk > With slight inclination towards the latter ;-) > > With that the patch is > Reviewed-by: Emil Velikov > Cc: > With Emil's fixes, Reviewed-by: Dylan Baker signature.asc Description: signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH mesa] meson/configure: detect endian.h instead of trying to guess when it's available
Quoting Emil Velikov (2018-03-21 10:57:09) > On 21 March 2018 at 17:54, Eric Engestromwrote: > > On Wednesday, 2018-03-21 10:45:35 -0700, Dylan Baker wrote: > >> Quoting Eric Engestrom (2018-03-21 10:09:17) > >> > Cc: Maxin B. John > >> > Cc: Khem Raj > >> > Suggested-by: Jon Turney > >> > Signed-off-by: Eric Engestrom > >> > --- > >> > configure.ac| 1 + > >> > meson.build | 2 +- > >> > src/util/u_endian.h | 2 +- > >> > 3 files changed, 3 insertions(+), 2 deletions(-) > >> > > >> > diff --git a/configure.ac b/configure.ac > >> > index 29d3c3457a7cdaefc36a..36c56da787e4fab5a355 100644 > >> > --- a/configure.ac > >> > +++ b/configure.ac > >> > @@ -865,6 +865,7 @@ fi > >> > AC_HEADER_MAJOR > >> > AC_CHECK_HEADER([xlocale.h], [DEFINES="$DEFINES -DHAVE_XLOCALE_H"]) > >> > AC_CHECK_HEADER([sys/sysctl.h], [DEFINES="$DEFINES > >> > -DHAVE_SYS_SYSCTL_H"]) > >> > +AC_CHECK_HEADER([endian.h], [DEFINES="$DEFINES -DHAVE_ENDIAN_H"]) > >> > AC_CHECK_FUNC([strtof], [DEFINES="$DEFINES -DHAVE_STRTOF"]) > >> > AC_CHECK_FUNC([mkostemp], [DEFINES="$DEFINES -DHAVE_MKOSTEMP"]) > >> > AC_CHECK_FUNC([timespec_get], [DEFINES="$DEFINES -DHAVE_TIMESPEC_GET"]) > >> > diff --git a/meson.build b/meson.build > >> > index 88518ec0f0e9b81759a7..1132b4bd37075d8c9d21 100644 > >> > --- a/meson.build > >> > +++ b/meson.build > >> > @@ -904,7 +904,7 @@ elif cc.has_header_symbol('sys/mkdev.h', 'major') > >> >pre_args += '-DMAJOR_IN_MKDEV' > >> > endif > >> > > >> > -foreach h : ['xlocale.h', 'sys/sysctl.h', 'linux/futex.h'] > >> > +foreach h : ['xlocale.h', 'sys/sysctl.h', 'linux/futex.h', 'endian.h'] > >> >if cc.compiles('#include <@0@>'.format(h), name : '@0@ > >> > works'.format(h)) > >> > pre_args += '-DHAVE_@0@'.format(h.to_upper().underscorify()) > >> >endif > >> > diff --git a/src/util/u_endian.h b/src/util/u_endian.h > >> > index 22d011ec0086ee77e11c..e11b381588dbc960e8c3 100644 > >> > --- a/src/util/u_endian.h > >> > +++ b/src/util/u_endian.h > >> > @@ -27,7 +27,7 @@ > >> > #ifndef U_ENDIAN_H > 
>> > #define U_ENDIAN_H > >> > > >> > -#if defined(__GLIBC__) || defined(ANDROID) || defined(__CYGWIN__) > >> > +#ifdef HAVE_ENDIAN_H > >> > >> is it really safe to remove the `defined(ANDROID)` check here? > > > > I'm clearly too tired to do this... > > > > Cc'ing Rob; can you tell us if defining HAVE_ENDIAN_H unconditionally in > > Android.mk seems reasonable? Or is there a way to detect headers on Android? > > > Pretty sure it's safe - we've been doing that for a long time ;-) > > > I also forgot to add the check in scons; I just added this to the commit > > locally: > > 8< > > diff --git a/scons/gallium.py b/scons/gallium.py > > index 75200b89c1fe6d751980..6cb20efcbf4b8c997f60 100755 > > --- a/scons/gallium.py > > +++ b/scons/gallium.py > > @@ -354,6 +354,9 @@ def generate(env): > > if check_header(env, 'xlocale.h'): > > cppdefines += ['HAVE_XLOCALE_H'] > > > > +if check_header(env, 'endian.h'): > > +cppdefines += ['HAVE_ENDIAN_H'] > > + > Wondering if anyone uses scons on POSIX platforms. Regardless - thanks > for updating it. We do build test scons on Linux, it gives a pretty good indication when someone has broken the build on Windows (or is going to). Dylan signature.asc Description: signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
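As an illustration of the approach discussed in this thread, a header can key off a build-system-provided `HAVE_ENDIAN_H` instead of guessing from libc or platform macros. The sketch below is not the actual `u_endian.h`; the `SKETCH_` macro, the compiler-macro fallback and the runtime probe are assumptions added so the example stands on its own.

```c
#include <stdint.h>
#include <string.h>

/* Set by configure/meson/scons (or Android.common.mk) when the header exists. */
#ifdef HAVE_ENDIAN_H
#include <endian.h>
#endif

#if defined(__BYTE_ORDER) && defined(__LITTLE_ENDIAN) && defined(__BIG_ENDIAN)
/* endian.h (or a libc header that pulls it in) told us the byte order. */
# if __BYTE_ORDER == __LITTLE_ENDIAN
#  define SKETCH_IS_LITTLE_ENDIAN 1
# else
#  define SKETCH_IS_LITTLE_ENDIAN 0
# endif
#elif defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__)
/* Assumed fallback: GCC/Clang predefined macros. */
# define SKETCH_IS_LITTLE_ENDIAN (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
#else
# error "cannot determine byte order in this sketch"
#endif

/* Runtime probe, useful as a sanity check of the compile-time answer. */
static int
runtime_is_little_endian(void)
{
   uint32_t probe = 1;
   unsigned char first_byte;

   memcpy(&first_byte, &probe, 1);
   return first_byte == 1;
}
```

The point of the detection is that the header no longer needs to know which platforms ship `endian.h`; each build system answers that question once.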
[Mesa-dev] [PATCH 3/4] intel/genxml: Add SAMPLER_INSTDONE register.
---
 src/intel/genxml/gen10.xml | 23
 src/intel/genxml/gen11.xml | 23
 src/intel/genxml/gen7.xml  | 22
 src/intel/genxml/gen75.xml | 25
 src/intel/genxml/gen8.xml  | 23
 src/intel/genxml/gen9.xml  | 23
 6 files changed, 139 insertions(+)

diff --git a/src/intel/genxml/gen10.xml b/src/intel/genxml/gen10.xml
index afdb580b624..aeb99667592 100644
--- a/src/intel/genxml/gen10.xml
+++ b/src/intel/genxml/gen10.xml
@@ -3504,6 +3504,29 @@
[XML hunk body not preserved]
diff --git a/src/intel/genxml/gen11.xml b/src/intel/genxml/gen11.xml
index a5e67c30bf5..6ca0e785ba0 100644
--- a/src/intel/genxml/gen11.xml
+++ b/src/intel/genxml/gen11.xml
@@ -3500,6 +3500,29 @@
[XML hunk body not preserved]
diff --git a/src/intel/genxml/gen7.xml b/src/intel/genxml/gen7.xml
index 52ca043b517..4865843fcbb 100644
--- a/src/intel/genxml/gen7.xml
+++ b/src/intel/genxml/gen7.xml
@@ -2436,6 +2436,28 @@
[XML hunk body not preserved]
diff --git a/src/intel/genxml/gen75.xml b/src/intel/genxml/gen75.xml
index 9501ec53f83..da06e84ee91 100644
--- a/src/intel/genxml/gen75.xml
+++ b/src/intel/genxml/gen75.xml
@@ -2908,6 +2908,31 @@
[XML hunk body not preserved]
diff --git a/src/intel/genxml/gen8.xml b/src/intel/genxml/gen8.xml
index 10dc787f48a..71626c15cd2 100644
--- a/src/intel/genxml/gen8.xml
+++ b/src/intel/genxml/gen8.xml
@@ -3165,6 +3165,29 @@
[XML hunk body not preserved]
diff --git a/src/intel/genxml/gen9.xml b/src/intel/genxml/gen9.xml
index 90d3a15eb21..c32f2c3162c 100644
--- a/src/intel/genxml/gen9.xml
+++ b/src/intel/genxml/gen9.xml
@@ -3450,6 +3450,29 @@
[XML hunk body not preserved]
-- 
2.14.3
[Mesa-dev] [PATCH 1/4] intel/genxml: Add SC_INSTDONE register.
---
 src/intel/genxml/gen10.xml | 27
 src/intel/genxml/gen11.xml | 27
 src/intel/genxml/gen7.xml  | 19
 src/intel/genxml/gen75.xml | 17
 src/intel/genxml/gen8.xml  | 24
 src/intel/genxml/gen9.xml  | 26
 6 files changed, 140 insertions(+)

diff --git a/src/intel/genxml/gen10.xml b/src/intel/genxml/gen10.xml
index cc696e800d1..e0bf0e91590 100644
--- a/src/intel/genxml/gen10.xml
+++ b/src/intel/genxml/gen10.xml
@@ -3459,6 +3459,33 @@
[XML hunk body not preserved]
diff --git a/src/intel/genxml/gen11.xml b/src/intel/genxml/gen11.xml
index 417fac13654..3278f35b824 100644
--- a/src/intel/genxml/gen11.xml
+++ b/src/intel/genxml/gen11.xml
@@ -3455,6 +3455,33 @@
[XML hunk body not preserved]
diff --git a/src/intel/genxml/gen7.xml b/src/intel/genxml/gen7.xml
index 87e05c94ef5..bc9fa5b65de 100644
--- a/src/intel/genxml/gen7.xml
+++ b/src/intel/genxml/gen7.xml
@@ -2397,6 +2397,25 @@
[XML hunk body not preserved]
diff --git a/src/intel/genxml/gen75.xml b/src/intel/genxml/gen75.xml
index 68aff857f35..9e2b789006f 100644
--- a/src/intel/genxml/gen75.xml
+++ b/src/intel/genxml/gen75.xml
@@ -2869,6 +2869,23 @@
[XML hunk body not preserved]
diff --git a/src/intel/genxml/gen8.xml b/src/intel/genxml/gen8.xml
index 8a4bf34cf7d..0a6be596988 100644
--- a/src/intel/genxml/gen8.xml
+++ b/src/intel/genxml/gen8.xml
@@ -3123,6 +3123,30 @@
[XML hunk body not preserved]
diff --git a/src/intel/genxml/gen9.xml b/src/intel/genxml/gen9.xml
index cfae4a8b658..834f5773ff2 100644
--- a/src/intel/genxml/gen9.xml
+++ b/src/intel/genxml/gen9.xml
@@ -3406,6 +3406,32 @@
[XML hunk body not preserved]
-- 
2.14.3
[Mesa-dev] [PATCH 2/4] intel/genxml: Add ROW_INSTDONE register.
---
 src/intel/genxml/gen10.xml | 18
 src/intel/genxml/gen11.xml | 18
 src/intel/genxml/gen7.xml  | 20
 src/intel/genxml/gen75.xml | 22
 src/intel/genxml/gen8.xml  | 18
 src/intel/genxml/gen9.xml  | 18
 6 files changed, 114 insertions(+)

diff --git a/src/intel/genxml/gen10.xml b/src/intel/genxml/gen10.xml
index e0bf0e91590..afdb580b624 100644
--- a/src/intel/genxml/gen10.xml
+++ b/src/intel/genxml/gen10.xml
@@ -3486,6 +3486,24 @@
[XML hunk body not preserved]
diff --git a/src/intel/genxml/gen11.xml b/src/intel/genxml/gen11.xml
index 3278f35b824..a5e67c30bf5 100644
--- a/src/intel/genxml/gen11.xml
+++ b/src/intel/genxml/gen11.xml
@@ -3482,6 +3482,24 @@
[XML hunk body not preserved]
diff --git a/src/intel/genxml/gen7.xml b/src/intel/genxml/gen7.xml
index bc9fa5b65de..52ca043b517 100644
--- a/src/intel/genxml/gen7.xml
+++ b/src/intel/genxml/gen7.xml
@@ -2416,6 +2416,26 @@
[XML hunk body not preserved]
diff --git a/src/intel/genxml/gen75.xml b/src/intel/genxml/gen75.xml
index 9e2b789006f..9501ec53f83 100644
--- a/src/intel/genxml/gen75.xml
+++ b/src/intel/genxml/gen75.xml
@@ -2886,6 +2886,28 @@
[XML hunk body not preserved]
diff --git a/src/intel/genxml/gen8.xml b/src/intel/genxml/gen8.xml
index 0a6be596988..10dc787f48a 100644
--- a/src/intel/genxml/gen8.xml
+++ b/src/intel/genxml/gen8.xml
@@ -3147,6 +3147,24 @@
[XML hunk body not preserved]
diff --git a/src/intel/genxml/gen9.xml b/src/intel/genxml/gen9.xml
index 834f5773ff2..90d3a15eb21 100644
--- a/src/intel/genxml/gen9.xml
+++ b/src/intel/genxml/gen9.xml
@@ -3432,6 +3432,24 @@
[XML hunk body not preserved]
-- 
2.14.3
[Mesa-dev] [PATCH 4/4] intel/aubinator_error_decode: Decode more registers.
Decode SC_INSTDONE, ROW_INSTDONE and SAMPLER_INSTDONE.
---
 src/intel/tools/aubinator_error_decode.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/src/intel/tools/aubinator_error_decode.c b/src/intel/tools/aubinator_error_decode.c
index db880d74a9e..9abd05fd75a 100644
--- a/src/intel/tools/aubinator_error_decode.c
+++ b/src/intel/tools/aubinator_error_decode.c
@@ -540,6 +540,18 @@ read_data_file(FILE *file)
             print_register(spec, reg_name, reg);
          }
 
+         matched = sscanf(line, " SC_INSTDONE: 0x%08x\n", &reg);
+         if (matched == 1)
+            print_register(spec, "SC_INSTDONE", reg);
+
+         matched = sscanf(line, " SAMPLER_INSTDONE[%*d][%*d]: 0x%08x\n", &reg);
+         if (matched == 1)
+            print_register(spec, "SAMPLER_INSTDONE", reg);
+
+         matched = sscanf(line, " ROW_INSTDONE[%*d][%*d]: 0x%08x\n", &reg);
+         if (matched == 1)
+            print_register(spec, "ROW_INSTDONE", reg);
+
          matched = sscanf(line, " INSTDONE1: 0x%08x\n", &reg);
          if (matched == 1)
             print_register(spec, "INSTDONE_1", reg);
-- 
2.14.3
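The per-slice registers in this patch are matched with the `%*d` form of sscanf(), which consumes an integer without storing it and without counting it in the return value, so a full match still returns 1. The following is only a minimal standalone sketch of that idea: parse_row_instdone() is a hypothetical helper, not a function in aubinator_error_decode.c, and the sample line layout is assumed from the format strings above.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper for illustration; not part of Mesa. */
static bool
parse_row_instdone(const char *line, uint32_t *reg_out)
{
   unsigned int reg;

   /* "%*d" reads and discards each index; only "%08x" is stored and
    * counted, so a complete match makes sscanf() return 1.
    */
   int matched = sscanf(line, " ROW_INSTDONE[%*d][%*d]: 0x%08x\n", &reg);
   if (matched != 1)
      return false;

   *reg_out = reg;
   return true;
}
```

A line for a different register fails on the first literal mismatch, so the helper returns false and the caller can try the next pattern, which is how the chain of sscanf() calls in the patch behaves.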
[Mesa-dev] [Bug 105240] GPU lock-up when running QT5 based celestia
https://bugs.freedesktop.org/show_bug.cgi?id=105240

--- Comment #1 from Hleb Valoshka <375...@gmail.com> ---
Works on Devuan 2 (Debian 9) with Linux 4.9 and 4.15 and Mesa 13.0.6, so I
assume that the problem is in Mesa.

-- 
You are receiving this mail because:
You are the assignee for the bug.
[Mesa-dev] [Bug 105240] GPU lock-up when running QT5 based celestia
https://bugs.freedesktop.org/show_bug.cgi?id=105240

Hleb Valoshka <375...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|dri-devel@lists.freedesktop |mesa-dev@lists.freedesktop.
                   |.org                        |org

-- 
You are receiving this mail because:
You are the assignee for the bug.
Re: [Mesa-dev] [PATCH] i965/tiled_memcpy: realign rgba8_copy_aligned_dst stack in 32-bit builds
On Wednesday, 2018-03-21 10:11:45 -0700, Matt Turner wrote:
> On Wed, Mar 21, 2018 at 2:39 AM, Eric Engestrom wrote:
> > On Tuesday, 2018-03-20 13:39:25 -0700, Scott D Phillips wrote:
> >> When building intel_tiled_memcpy for i686, the stack will only be
> >> 4-byte aligned. This isn't sufficient for SSE temporaries which
> >> require 16-byte alignment. Use the force_align_arg_pointer
> >> function attribute in that case to ensure sufficient alignment.
> >> ---
> >>  src/mesa/drivers/dri/i965/intel_tiled_memcpy.c | 8 +++++++-
> >>  1 file changed, 7 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c b/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c
> >> index 69306828d72..bd8bafbd2d7 100644
> >> --- a/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c
> >> +++ b/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c
> >> @@ -42,6 +42,12 @@
> >>  #include
> >>  #endif
> >>
> >> +#if defined(__GNUC__) && defined(__i386__) && (defined(__SSSE3__) || defined(__SSE2__))
> >
> > Is that a typo? s/SSSE3/SSE3/ ?
>
> Nope, that's correct. SSSE3 is "Supplemental" SSE3, and we have
> compile-time code in this file that uses it over SSE2 to save a few
> instructions.

Oh, didn't know about that; learned something new :)
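For readers unfamiliar with the attribute under discussion: GCC's force_align_arg_pointer, applied to a function definition on x86, makes the prologue realign the stack, which matters on i386 where a caller may only guarantee 4-byte alignment. What follows is a hedged sketch rather than the code from intel_tiled_memcpy.c: SKETCH_REALIGN and copy_swap_rb() are names invented here, only the preprocessor condition is taken from the patch, and the SSE2 loop is just a stand-in for "a function that uses SSE temporaries".

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Same condition as the patch; the macro name is an assumption. */
#if defined(__GNUC__) && defined(__i386__) && (defined(__SSSE3__) || defined(__SSE2__))
#define SKETCH_REALIGN __attribute__((force_align_arg_pointer))
#else
#define SKETCH_REALIGN
#endif

/* Copy 4-byte pixels while swapping bytes 0 and 2 of each pixel.
 * Hypothetical function, used only to show where the attribute goes.
 */
static SKETCH_REALIGN void
copy_swap_rb(uint8_t *dst, const uint8_t *src, size_t bytes)
{
   size_t i = 0;

#if defined(__SSE2__)
   /* Selects bytes 0 and 2 of every 32-bit lane. */
   const __m128i rb_mask = _mm_set1_epi32(0x00ff00ff);

   for (; i + 16 <= bytes; i += 16) {
      __m128i px = _mm_loadu_si128((const __m128i *)(src + i));
      __m128i ga = _mm_andnot_si128(rb_mask, px);
      __m128i rb = _mm_and_si128(px, rb_mask);
      /* Exchange the two 16-bit halves of each lane: byte 0 <-> byte 2. */
      rb = _mm_or_si128(_mm_slli_epi32(rb, 16), _mm_srli_epi32(rb, 16));
      _mm_storeu_si128((__m128i *)(dst + i), _mm_or_si128(ga, rb));
   }
#endif

   /* Scalar tail, and the whole copy when SSE2 is not available. */
   for (; i + 4 <= bytes; i += 4) {
      dst[i + 0] = src[i + 2];
      dst[i + 1] = src[i + 1];
      dst[i + 2] = src[i + 0];
      dst[i + 3] = src[i + 3];
   }
}
```

On other targets the macro expands to nothing, so the attribute costs nothing where the ABI already guarantees 16-byte stack alignment.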
Re: [Mesa-dev] [PATCH mesa] meson/configure: detect endian.h instead of trying to guess when it's available
On Wednesday, 2018-03-21 17:53:08 +, Emil Velikov wrote:
> On 21 March 2018 at 17:09, Eric Engestrom wrote:
> > Cc: Maxin B. John
> > Cc: Khem Raj
> > Suggested-by: Jon Turney
> > Signed-off-by: Eric Engestrom
> > ---
> >  configure.ac        | 1 +
> >  meson.build         | 2 +-
> >  src/util/u_endian.h | 2 +-
> >  3 files changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/configure.ac b/configure.ac
> > index 29d3c3457a7cdaefc36a..36c56da787e4fab5a355 100644
> > --- a/configure.ac
> > +++ b/configure.ac
> > @@ -865,6 +865,7 @@ fi
> >  AC_HEADER_MAJOR
> >  AC_CHECK_HEADER([xlocale.h], [DEFINES="$DEFINES -DHAVE_XLOCALE_H"])
> >  AC_CHECK_HEADER([sys/sysctl.h], [DEFINES="$DEFINES -DHAVE_SYS_SYSCTL_H"])
> > +AC_CHECK_HEADER([endian.h], [DEFINES="$DEFINES -DHAVE_ENDIAN_H"])
> >  AC_CHECK_FUNC([strtof], [DEFINES="$DEFINES -DHAVE_STRTOF"])
> >  AC_CHECK_FUNC([mkostemp], [DEFINES="$DEFINES -DHAVE_MKOSTEMP"])
> >  AC_CHECK_FUNC([timespec_get], [DEFINES="$DEFINES -DHAVE_TIMESPEC_GET"])
> Just hit me - why are we using any of these instead of AC_CHECK_FUNCS
> and AC_CHECK_HEADERS (note the S at the end).
> Those take a list + automatically set the HAVE_ macros for us.
>
> Off the top of my head - we could even use the _ONCE version which
> should lead to smaller configure file + faster runtime.
>
> I'm thinking out loud here ^^, no changes needed ;-)

I'll leave autotools refactors to you, I myself consider it in
maintenance-only mode :P

>
> > diff --git a/meson.build b/meson.build
> > index 88518ec0f0e9b81759a7..1132b4bd37075d8c9d21 100644
> > --- a/meson.build
> > +++ b/meson.build
> > @@ -904,7 +904,7 @@ elif cc.has_header_symbol('sys/mkdev.h', 'major')
> >    pre_args += '-DMAJOR_IN_MKDEV'
> >  endif
> >
> > -foreach h : ['xlocale.h', 'sys/sysctl.h', 'linux/futex.h']
> > +foreach h : ['xlocale.h', 'sys/sysctl.h', 'linux/futex.h', 'endian.h']
> >    if cc.compiles('#include <@0@>'.format(h), name : '@0@ works'.format(h))
> >      pre_args += '-DHAVE_@0@'.format(h.to_upper().underscorify())
> >    endif
> > diff --git a/src/util/u_endian.h b/src/util/u_endian.h
> > index 22d011ec0086ee77e11c..e11b381588dbc960e8c3 100644
> > --- a/src/util/u_endian.h
> > +++ b/src/util/u_endian.h
> > @@ -27,7 +27,7 @@
> >  #ifndef U_ENDIAN_H
> >  #define U_ENDIAN_H
> >
> > -#if defined(__GLIBC__) || defined(ANDROID) || defined(__CYGWIN__)
> > +#ifdef HAVE_ENDIAN_H
> I'd either keep the ANDROID hunk here, or add -DHAVE_ENDIAN_H to
> Android.common.mk
> With slight inclination towards the latter ;-)

Same, I just asked RobHer what he thinks.

>
> With that the patch is
> Reviewed-by: Emil Velikov

Thanks!

> Cc:

Good point, didn't think about that!

>
> -Emil
Re: [Mesa-dev] [PATCH v6 1/2] gallium/winsys/kms: Fix possible leak in map/unmap.
On Tue, Mar 20, 2018 at 9:26 AM, Tomasz Figa wrote:
> On Wed, Mar 21, 2018 at 12:58 AM, Emil Velikov wrote:
>> On 20 March 2018 at 14:24, Tomasz Figa wrote:
>>> On Tue, Mar 20, 2018 at 10:44 PM, Emil Velikov wrote:
>>>> On 20 March 2018 at 04:40, Tomasz Figa wrote:
>>>>> On Tue, Mar 20, 2018 at 2:55 AM, Emil Velikov wrote:
>>>>>> Hi Lepton,
>>>>>>
>>>>>> On 19 March 2018 at 17:33, Lepton Wu wrote:
>>>>>>> If user calls map twice for kms_sw_displaytarget, the first mapped
>>>>>>> buffer could get leaked. Instead of calling mmap every time, just
>>>>>>> reuse previous mapping. Since user could map same displaytarget with
>>>>>>> different flags, we have to keep two different pointers, one for rw
>>>>>>> mapping and one for ro mapping. Also introduce reference count for
>>>>>>> mapped buffer so we can unmap them at right time.
>>>>>>>
>>>>>>> Reviewed-by: Emil Velikov
>>>>>>> Reviewed-by: Tomasz Figa
>>>>>>> Signed-off-by: Lepton Wu
>>>>>>
>>>>>> Nit: normally it's a good idea to have brief revision log when sending
>>>>>> new version:
>>>>>> v2:
>>>>>> - split from larger patch (Emil)
>>>>>> v3:
>>>>>> - remove munmap w/a from dt_destory(Emil)
>>>>>> ...
>>>>>>
>>>>>>> @@ -170,6 +172,14 @@ kms_sw_displaytarget_destroy(struct sw_winsys *ws,
>>>>>>>     if (kms_sw_dt->ref_count > 0)
>>>>>>>        return;
>>>>>>>
>>>>>>> +   if (kms_sw_dt->map_count > 0) {
>>>>>>> +      DEBUG_PRINT("KMS-DEBUG: fix leaked map buffer %u\n", kms_sw_dt->handle);
>>>>>>> +      munmap(kms_sw_dt->mapped, kms_sw_dt->size);
>>>>>>> +      kms_sw_dt->mapped = NULL;
>>>>>>> +      munmap(kms_sw_dt->ro_mapped, kms_sw_dt->size);
>>>>>>> +      kms_sw_dt->ro_mapped = NULL;
>>>>>>> +   }
>>>>>>> +
>>>>>> I could swear this workaround was missing in earlier revisions. I
>>>>>> don't see anything in Tomasz' reply suggesting that we should bring it
>>>>>> back?
>>>>>> AFAICT the added refcounting makes no difference - the driver isn't
>>>>>> cleaning up after itself.
>>>>>>
>>>>>> Am I missing something?
>>>>>
>>>>> I think this is actually consistent with what other winsys
>>>>> implementations do. They free the map (or shadow malloc/shm buffer) in
>>>>> _destroy() callback, so we should probably do the same.
>>>>>
>>>> Looking at the SW winsys - none of them seem to unmap at destroy time.
>>>> Perhaps you meant that the HW ones do?
>>>
>>> dri:
>>> https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/winsys/sw/dri/dri_sw_winsys.c#n128
>>>
>>> gdi:
>>> https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/winsys/sw/gdi/gdi_sw_winsys.c#n116
>>>
>>> hgl:
>>> https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/winsys/sw/hgl/hgl_sw_winsys.c#n152
>>>
>>> xlib:
>>> https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/winsys/sw/xlib/xlib_sw_winsys.c#n260
>>> https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/winsys/sw/xlib/xlib_sw_winsys.c#n271
>>>
>>> They don't do real mapping - they all work on locally allocated memory.
>>> However, after destroy, no resources are leaked and the pointers
>>> returned from _map() are not valid anymore.
>>>
>> As mentioned before - zero objections against changing that, but keep
>> it separate patch.
>> Pretty please?
>
> SGTM.

Thanks all for review. Is there anything else missing for getting this
committed?

>
> Best regards,
> Tomasz
Re: [Mesa-dev] [PATCH mesa] meson/configure: detect endian.h instead of trying to guess when it's available
On 21 March 2018 at 17:54, Eric Engestrom wrote:
> On Wednesday, 2018-03-21 10:45:35 -0700, Dylan Baker wrote:
>> Quoting Eric Engestrom (2018-03-21 10:09:17)
>> > Cc: Maxin B. John
>> > Cc: Khem Raj
>> > Suggested-by: Jon Turney
>> > Signed-off-by: Eric Engestrom
>> > ---
>> >  configure.ac        | 1 +
>> >  meson.build         | 2 +-
>> >  src/util/u_endian.h | 2 +-
>> >  3 files changed, 3 insertions(+), 2 deletions(-)
>> >
>> > diff --git a/configure.ac b/configure.ac
>> > index 29d3c3457a7cdaefc36a..36c56da787e4fab5a355 100644
>> > --- a/configure.ac
>> > +++ b/configure.ac
>> > @@ -865,6 +865,7 @@ fi
>> >  AC_HEADER_MAJOR
>> >  AC_CHECK_HEADER([xlocale.h], [DEFINES="$DEFINES -DHAVE_XLOCALE_H"])
>> >  AC_CHECK_HEADER([sys/sysctl.h], [DEFINES="$DEFINES -DHAVE_SYS_SYSCTL_H"])
>> > +AC_CHECK_HEADER([endian.h], [DEFINES="$DEFINES -DHAVE_ENDIAN_H"])
>> >  AC_CHECK_FUNC([strtof], [DEFINES="$DEFINES -DHAVE_STRTOF"])
>> >  AC_CHECK_FUNC([mkostemp], [DEFINES="$DEFINES -DHAVE_MKOSTEMP"])
>> >  AC_CHECK_FUNC([timespec_get], [DEFINES="$DEFINES -DHAVE_TIMESPEC_GET"])
>> > diff --git a/meson.build b/meson.build
>> > index 88518ec0f0e9b81759a7..1132b4bd37075d8c9d21 100644
>> > --- a/meson.build
>> > +++ b/meson.build
>> > @@ -904,7 +904,7 @@ elif cc.has_header_symbol('sys/mkdev.h', 'major')
>> >    pre_args += '-DMAJOR_IN_MKDEV'
>> >  endif
>> >
>> > -foreach h : ['xlocale.h', 'sys/sysctl.h', 'linux/futex.h']
>> > +foreach h : ['xlocale.h', 'sys/sysctl.h', 'linux/futex.h', 'endian.h']
>> >    if cc.compiles('#include <@0@>'.format(h), name : '@0@ works'.format(h))
>> >      pre_args += '-DHAVE_@0@'.format(h.to_upper().underscorify())
>> >    endif
>> > diff --git a/src/util/u_endian.h b/src/util/u_endian.h
>> > index 22d011ec0086ee77e11c..e11b381588dbc960e8c3 100644
>> > --- a/src/util/u_endian.h
>> > +++ b/src/util/u_endian.h
>> > @@ -27,7 +27,7 @@
>> >  #ifndef U_ENDIAN_H
>> >  #define U_ENDIAN_H
>> >
>> > -#if defined(__GLIBC__) || defined(ANDROID) || defined(__CYGWIN__)
>> > +#ifdef HAVE_ENDIAN_H
>>
>> is it really safe to remove the `defined(ANDROID)` check here?
>
> I'm clearly too tired to do this...
>
> Cc'ing Rob; can you tell us if defining HAVE_ENDIAN_H unconditionally in
> Android.mk seems reasonable? Or is there a way to detect headers on Android?
>
Pretty sure it's safe - we've been doing that for a long time ;-)

> I also forgot to add the check in scons; I just added this to the commit
> locally:
> 8<
> diff --git a/scons/gallium.py b/scons/gallium.py
> index 75200b89c1fe6d751980..6cb20efcbf4b8c997f60 100755
> --- a/scons/gallium.py
> +++ b/scons/gallium.py
> @@ -354,6 +354,9 @@ def generate(env):
>      if check_header(env, 'xlocale.h'):
>          cppdefines += ['HAVE_XLOCALE_H']
>
> +    if check_header(env, 'endian.h'):
> +        cppdefines += ['HAVE_ENDIAN_H']
> +
Wondering if anyone uses scons on POSIX platforms.
Regardless - thanks for updating it.

-Emil
Re: [Mesa-dev] [PATCH mesa] meson/configure: detect endian.h instead of trying to guess when it's available
On Wednesday, 2018-03-21 17:54:02 +, Eric Engestrom wrote:
> On Wednesday, 2018-03-21 10:45:35 -0700, Dylan Baker wrote:
> > Quoting Eric Engestrom (2018-03-21 10:09:17)
> > > Cc: Maxin B. John
> > > Cc: Khem Raj
> > > Suggested-by: Jon Turney
> > > Signed-off-by: Eric Engestrom
> > > ---
> > >  configure.ac        | 1 +
> > >  meson.build         | 2 +-
> > >  src/util/u_endian.h | 2 +-
> > >  3 files changed, 3 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/configure.ac b/configure.ac
> > > index 29d3c3457a7cdaefc36a..36c56da787e4fab5a355 100644
> > > --- a/configure.ac
> > > +++ b/configure.ac
> > > @@ -865,6 +865,7 @@ fi
> > >  AC_HEADER_MAJOR
> > >  AC_CHECK_HEADER([xlocale.h], [DEFINES="$DEFINES -DHAVE_XLOCALE_H"])
> > >  AC_CHECK_HEADER([sys/sysctl.h], [DEFINES="$DEFINES -DHAVE_SYS_SYSCTL_H"])
> > > +AC_CHECK_HEADER([endian.h], [DEFINES="$DEFINES -DHAVE_ENDIAN_H"])
> > >  AC_CHECK_FUNC([strtof], [DEFINES="$DEFINES -DHAVE_STRTOF"])
> > >  AC_CHECK_FUNC([mkostemp], [DEFINES="$DEFINES -DHAVE_MKOSTEMP"])
> > >  AC_CHECK_FUNC([timespec_get], [DEFINES="$DEFINES -DHAVE_TIMESPEC_GET"])
> > > diff --git a/meson.build b/meson.build
> > > index 88518ec0f0e9b81759a7..1132b4bd37075d8c9d21 100644
> > > --- a/meson.build
> > > +++ b/meson.build
> > > @@ -904,7 +904,7 @@ elif cc.has_header_symbol('sys/mkdev.h', 'major')
> > >    pre_args += '-DMAJOR_IN_MKDEV'
> > >  endif
> > >
> > > -foreach h : ['xlocale.h', 'sys/sysctl.h', 'linux/futex.h']
> > > +foreach h : ['xlocale.h', 'sys/sysctl.h', 'linux/futex.h', 'endian.h']
> > >    if cc.compiles('#include <@0@>'.format(h), name : '@0@ works'.format(h))
> > >      pre_args += '-DHAVE_@0@'.format(h.to_upper().underscorify())
> > >    endif
> > > diff --git a/src/util/u_endian.h b/src/util/u_endian.h
> > > index 22d011ec0086ee77e11c..e11b381588dbc960e8c3 100644
> > > --- a/src/util/u_endian.h
> > > +++ b/src/util/u_endian.h
> > > @@ -27,7 +27,7 @@
> > >  #ifndef U_ENDIAN_H
> > >  #define U_ENDIAN_H
> > >
> > > -#if defined(__GLIBC__) || defined(ANDROID) || defined(__CYGWIN__)
> > > +#ifdef HAVE_ENDIAN_H
> >
> > is it really safe to remove the `defined(ANDROID)` check here?
>
> I'm clearly too tired to do this...
>
> Cc'ing Rob; can you tell us if defining HAVE_ENDIAN_H unconditionally in
> Android.mk seems reasonable? Or is there a way to detect headers on Android?

To be clear, I'm suggesting this:
8<
diff --git a/Android.common.mk b/Android.common.mk
index 52dc7bff3be5af1f97b6..e8aed48c31ab1704cbcf 100644
--- a/Android.common.mk
+++ b/Android.common.mk
@@ -70,6 +70,7 @@ LOCAL_CFLAGS += \
 	-DHAVE_DLADDR \
 	-DHAVE_DL_ITERATE_PHDR \
 	-DHAVE_LINUX_FUTEX_H \
+	-DHAVE_ENDIAN_H \
 	-DHAVE_ZLIB \
 	-DMAJOR_IN_SYSMACROS \
 	-fvisibility=hidden \
>8

>
> I also forgot to add the check in scons; I just added this to the commit
> locally:
> 8<
> diff --git a/scons/gallium.py b/scons/gallium.py
> index 75200b89c1fe6d751980..6cb20efcbf4b8c997f60 100755
> --- a/scons/gallium.py
> +++ b/scons/gallium.py
> @@ -354,6 +354,9 @@ def generate(env):
>      if check_header(env, 'xlocale.h'):
>          cppdefines += ['HAVE_XLOCALE_H']
>
> +    if check_header(env, 'endian.h'):
> +        cppdefines += ['HAVE_ENDIAN_H']
> +
>      if check_functions(env, ['strtod_l', 'strtof_l']):
>          cppdefines += ['HAVE_STRTOD_L']
>
> >8
>
> > >  #include <endian.h>
> > >
> > >  #if __BYTE_ORDER == __LITTLE_ENDIAN
> > > --
> > > Cheers,
> > >   Eric
Re: [Mesa-dev] [PATCH mesa] meson/configure: detect endian.h instead of trying to guess when it's available
On Wednesday, 2018-03-21 10:45:35 -0700, Dylan Baker wrote:
> Quoting Eric Engestrom (2018-03-21 10:09:17)
> > Cc: Maxin B. John
> > Cc: Khem Raj
> > Suggested-by: Jon Turney
> > Signed-off-by: Eric Engestrom
> > ---
> >  configure.ac        | 1 +
> >  meson.build         | 2 +-
> >  src/util/u_endian.h | 2 +-
> >  3 files changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/configure.ac b/configure.ac
> > index 29d3c3457a7cdaefc36a..36c56da787e4fab5a355 100644
> > --- a/configure.ac
> > +++ b/configure.ac
> > @@ -865,6 +865,7 @@ fi
> >  AC_HEADER_MAJOR
> >  AC_CHECK_HEADER([xlocale.h], [DEFINES="$DEFINES -DHAVE_XLOCALE_H"])
> >  AC_CHECK_HEADER([sys/sysctl.h], [DEFINES="$DEFINES -DHAVE_SYS_SYSCTL_H"])
> > +AC_CHECK_HEADER([endian.h], [DEFINES="$DEFINES -DHAVE_ENDIAN_H"])
> >  AC_CHECK_FUNC([strtof], [DEFINES="$DEFINES -DHAVE_STRTOF"])
> >  AC_CHECK_FUNC([mkostemp], [DEFINES="$DEFINES -DHAVE_MKOSTEMP"])
> >  AC_CHECK_FUNC([timespec_get], [DEFINES="$DEFINES -DHAVE_TIMESPEC_GET"])
> > diff --git a/meson.build b/meson.build
> > index 88518ec0f0e9b81759a7..1132b4bd37075d8c9d21 100644
> > --- a/meson.build
> > +++ b/meson.build
> > @@ -904,7 +904,7 @@ elif cc.has_header_symbol('sys/mkdev.h', 'major')
> >    pre_args += '-DMAJOR_IN_MKDEV'
> >  endif
> >
> > -foreach h : ['xlocale.h', 'sys/sysctl.h', 'linux/futex.h']
> > +foreach h : ['xlocale.h', 'sys/sysctl.h', 'linux/futex.h', 'endian.h']
> >    if cc.compiles('#include <@0@>'.format(h), name : '@0@ works'.format(h))
> >      pre_args += '-DHAVE_@0@'.format(h.to_upper().underscorify())
> >    endif
> > diff --git a/src/util/u_endian.h b/src/util/u_endian.h
> > index 22d011ec0086ee77e11c..e11b381588dbc960e8c3 100644
> > --- a/src/util/u_endian.h
> > +++ b/src/util/u_endian.h
> > @@ -27,7 +27,7 @@
> >  #ifndef U_ENDIAN_H
> >  #define U_ENDIAN_H
> >
> > -#if defined(__GLIBC__) || defined(ANDROID) || defined(__CYGWIN__)
> > +#ifdef HAVE_ENDIAN_H
>
> is it really safe to remove the `defined(ANDROID)` check here?

I'm clearly too tired to do this...

Cc'ing Rob; can you tell us if defining HAVE_ENDIAN_H unconditionally in
Android.mk seems reasonable? Or is there a way to detect headers on Android?

I also forgot to add the check in scons; I just added this to the commit
locally:
8<
diff --git a/scons/gallium.py b/scons/gallium.py
index 75200b89c1fe6d751980..6cb20efcbf4b8c997f60 100755
--- a/scons/gallium.py
+++ b/scons/gallium.py
@@ -354,6 +354,9 @@ def generate(env):
     if check_header(env, 'xlocale.h'):
         cppdefines += ['HAVE_XLOCALE_H']
 
+    if check_header(env, 'endian.h'):
+        cppdefines += ['HAVE_ENDIAN_H']
+
     if check_functions(env, ['strtod_l', 'strtof_l']):
         cppdefines += ['HAVE_STRTOD_L']
 
>8

>
> >  #include <endian.h>
> >
> >  #if __BYTE_ORDER == __LITTLE_ENDIAN
> > --
> > Cheers,
> >   Eric
Re: [Mesa-dev] [PATCH mesa] meson/configure: detect endian.h instead of trying to guess when it's available
On 21 March 2018 at 17:09, Eric Engestrom wrote:
> Cc: Maxin B. John
> Cc: Khem Raj
> Suggested-by: Jon Turney
> Signed-off-by: Eric Engestrom
> ---
>  configure.ac        | 1 +
>  meson.build         | 2 +-
>  src/util/u_endian.h | 2 +-
>  3 files changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/configure.ac b/configure.ac
> index 29d3c3457a7cdaefc36a..36c56da787e4fab5a355 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -865,6 +865,7 @@ fi
>  AC_HEADER_MAJOR
>  AC_CHECK_HEADER([xlocale.h], [DEFINES="$DEFINES -DHAVE_XLOCALE_H"])
>  AC_CHECK_HEADER([sys/sysctl.h], [DEFINES="$DEFINES -DHAVE_SYS_SYSCTL_H"])
> +AC_CHECK_HEADER([endian.h], [DEFINES="$DEFINES -DHAVE_ENDIAN_H"])
>  AC_CHECK_FUNC([strtof], [DEFINES="$DEFINES -DHAVE_STRTOF"])
>  AC_CHECK_FUNC([mkostemp], [DEFINES="$DEFINES -DHAVE_MKOSTEMP"])
>  AC_CHECK_FUNC([timespec_get], [DEFINES="$DEFINES -DHAVE_TIMESPEC_GET"])
Just hit me - why are we using any of these instead of AC_CHECK_FUNCS
and AC_CHECK_HEADERS (note the S at the end).
Those take a list + automatically set the HAVE_ macros for us.

Off the top of my head - we could even use the _ONCE version which
should lead to smaller configure file + faster runtime.

I'm thinking out loud here ^^, no changes needed ;-)

> diff --git a/meson.build b/meson.build
> index 88518ec0f0e9b81759a7..1132b4bd37075d8c9d21 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -904,7 +904,7 @@ elif cc.has_header_symbol('sys/mkdev.h', 'major')
>    pre_args += '-DMAJOR_IN_MKDEV'
>  endif
>
> -foreach h : ['xlocale.h', 'sys/sysctl.h', 'linux/futex.h']
> +foreach h : ['xlocale.h', 'sys/sysctl.h', 'linux/futex.h', 'endian.h']
>    if cc.compiles('#include <@0@>'.format(h), name : '@0@ works'.format(h))
>      pre_args += '-DHAVE_@0@'.format(h.to_upper().underscorify())
>    endif
> diff --git a/src/util/u_endian.h b/src/util/u_endian.h
> index 22d011ec0086ee77e11c..e11b381588dbc960e8c3 100644
> --- a/src/util/u_endian.h
> +++ b/src/util/u_endian.h
> @@ -27,7 +27,7 @@
>  #ifndef U_ENDIAN_H
>  #define U_ENDIAN_H
>
> -#if defined(__GLIBC__) || defined(ANDROID) || defined(__CYGWIN__)
> +#ifdef HAVE_ENDIAN_H
I'd either keep the ANDROID hunk here, or add -DHAVE_ENDIAN_H to
Android.common.mk
With slight inclination towards the latter ;-)

With that the patch is
Reviewed-by: Emil Velikov
Cc:

-Emil
Re: [Mesa-dev] [PATCH mesa] meson/configure: detect endian.h instead of trying to guess when it's available
Quoting Eric Engestrom (2018-03-21 10:09:17)
> Cc: Maxin B. John
> Cc: Khem Raj
> Suggested-by: Jon Turney
> Signed-off-by: Eric Engestrom
> ---
>  configure.ac        | 1 +
>  meson.build         | 2 +-
>  src/util/u_endian.h | 2 +-
>  3 files changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/configure.ac b/configure.ac
> index 29d3c3457a7cdaefc36a..36c56da787e4fab5a355 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -865,6 +865,7 @@ fi
>  AC_HEADER_MAJOR
>  AC_CHECK_HEADER([xlocale.h], [DEFINES="$DEFINES -DHAVE_XLOCALE_H"])
>  AC_CHECK_HEADER([sys/sysctl.h], [DEFINES="$DEFINES -DHAVE_SYS_SYSCTL_H"])
> +AC_CHECK_HEADER([endian.h], [DEFINES="$DEFINES -DHAVE_ENDIAN_H"])
>  AC_CHECK_FUNC([strtof], [DEFINES="$DEFINES -DHAVE_STRTOF"])
>  AC_CHECK_FUNC([mkostemp], [DEFINES="$DEFINES -DHAVE_MKOSTEMP"])
>  AC_CHECK_FUNC([timespec_get], [DEFINES="$DEFINES -DHAVE_TIMESPEC_GET"])
> diff --git a/meson.build b/meson.build
> index 88518ec0f0e9b81759a7..1132b4bd37075d8c9d21 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -904,7 +904,7 @@ elif cc.has_header_symbol('sys/mkdev.h', 'major')
>    pre_args += '-DMAJOR_IN_MKDEV'
>  endif
>
> -foreach h : ['xlocale.h', 'sys/sysctl.h', 'linux/futex.h']
> +foreach h : ['xlocale.h', 'sys/sysctl.h', 'linux/futex.h', 'endian.h']
>    if cc.compiles('#include <@0@>'.format(h), name : '@0@ works'.format(h))
>      pre_args += '-DHAVE_@0@'.format(h.to_upper().underscorify())
>    endif
> diff --git a/src/util/u_endian.h b/src/util/u_endian.h
> index 22d011ec0086ee77e11c..e11b381588dbc960e8c3 100644
> --- a/src/util/u_endian.h
> +++ b/src/util/u_endian.h
> @@ -27,7 +27,7 @@
>  #ifndef U_ENDIAN_H
>  #define U_ENDIAN_H
>
> -#if defined(__GLIBC__) || defined(ANDROID) || defined(__CYGWIN__)
> +#ifdef HAVE_ENDIAN_H

is it really safe to remove the `defined(ANDROID)` check here?

>  #include <endian.h>
>
>  #if __BYTE_ORDER == __LITTLE_ENDIAN
> --
> Cheers,
>   Eric
[Mesa-dev] [PATCH] st/mesa: don't draw if the bound element array buffer is not allocated
From: Marek Olšák

---
 src/mesa/state_tracker/st_draw.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/src/mesa/state_tracker/st_draw.c b/src/mesa/state_tracker/st_draw.c
index b95a2522b2e..73f936bb4a9 100644
--- a/src/mesa/state_tracker/st_draw.c
+++ b/src/mesa/state_tracker/st_draw.c
@@ -166,20 +166,27 @@ st_draw_vbo(struct gl_context *ctx,
          }
 
          info.index_size = ib->index_size;
          info.min_index = min_index;
          info.max_index = max_index;
 
          if (_mesa_is_bufferobj(bufobj)) {
             /* indices are in a real VBO */
             info.has_user_indices = false;
             info.index.resource = st_buffer_object(bufobj)->buffer;
+
+            /* Return if the bound element array buffer doesn't have any backing
+             * storage. (nothing to do)
+             */
+            if (!info.index.resource)
+               return;
+
             start = pointer_to_offset(ib->ptr) / info.index_size;
          } else {
             /* indices are in user space memory */
             info.has_user_indices = true;
             info.index.user = ib->ptr;
          }
 
          setup_primitive_restart(ctx, &info);
       }
       else {
-- 
2.15.1
Re: [Mesa-dev] [PATCH] u_endian.h: make endianness check libc agnostic
On Wednesday, 2018-03-21 10:11:55 -0700, Dylan Baker wrote:
> Quoting Jon Turney (2018-03-21 09:47:23)
> > On 21/03/2018 15:09, Emil Velikov wrote:
> > > Hi Maxin,
> > >
> > > Welcome back ;-)
> > >
> > > On 21 March 2018 at 14:52, wrote:
> > >> From: Khem Raj
> > >>
> > >> endianness check is OS wide and not specific to libc.
> > >> Fixes build with musl libc
> > >>
> > >> Signed-off-by: Khem Raj
> > >> Signed-off-by: Maxin B. John
> > >> ---
> > >>  src/util/u_endian.h | 2 +-
> > >>  1 file changed, 1 insertion(+), 1 deletion(-)
> > >>
> > >> diff --git a/src/util/u_endian.h b/src/util/u_endian.h
> > >> index 22d011e..4d5b4f4 100644
> > >> --- a/src/util/u_endian.h
> > >> +++ b/src/util/u_endian.h
> > >> @@ -27,7 +27,7 @@
> > >>  #ifndef U_ENDIAN_H
> > >>  #define U_ENDIAN_H
> > >>
> > >> -#if defined(__GLIBC__) || defined(ANDROID) || defined(__CYGWIN__)
> > >> +#if defined(__linux__)
> > >
> > > Fairly sure that glibc, musl and android define __linux__, although
> > > I'm having doubts about Cygwin.
> > > Which platforms did you test this patch on?
> >
> > Yes, I have a hard time believing these two lines are equivalent.
> >
> > I don't know why this isn't an autoconf check for endian.h etc.
> >
> > > Jon, will this confirm if this will work on your end, or we'll need to
> > > add the __CYGWIN__ hunk back?
>
> What about haiku? I think they use glibc as well.
>
> While we're down this road and Jon brought it up, why don't we just do this
> check in the build system(s)? That seems much more reliable.

I just did :)
https://patchwork.freedesktop.org/patch/211859/

>
> Dylan
Re: [Mesa-dev] [PATCH] u_endian.h: make endianness check libc agnostic
Quoting Jon Turney (2018-03-21 09:47:23)
> On 21/03/2018 15:09, Emil Velikov wrote:
> > Hi Maxin,
> >
> > Welcome back ;-)
> >
> > On 21 March 2018 at 14:52, wrote:
> >> From: Khem Raj
> >>
> >> endianness check is OS wide and not specific to libc.
> >> Fixes build with musl libc
> >>
> >> Signed-off-by: Khem Raj
> >> Signed-off-by: Maxin B. John
> >> ---
> >>  src/util/u_endian.h | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/src/util/u_endian.h b/src/util/u_endian.h
> >> index 22d011e..4d5b4f4 100644
> >> --- a/src/util/u_endian.h
> >> +++ b/src/util/u_endian.h
> >> @@ -27,7 +27,7 @@
> >>  #ifndef U_ENDIAN_H
> >>  #define U_ENDIAN_H
> >>
> >> -#if defined(__GLIBC__) || defined(ANDROID) || defined(__CYGWIN__)
> >> +#if defined(__linux__)
> >
> > Fairly sure that glibc, musl and android define __linux__, although
> > I'm having doubts about Cygwin.
> > Which platforms did you test this patch on?
>
> Yes, I have a hard time believing these two lines are equivalent.
>
> I don't know why this isn't an autoconf check for endian.h etc.
>
> > Jon, will this confirm if this will work on your end, or we'll need to
> > add the __CYGWIN__ hunk back?

What about haiku? I think they use glibc as well.

While we're down this road and Jon brought it up, why don't we just do this
check in the build system(s)? That seems much more reliable.

Dylan
Re: [Mesa-dev] [PATCH] i965/tiled_memcpy: realign rgba8_copy_aligned_dst stack in 32-bit builds
On Wed, Mar 21, 2018 at 2:39 AM, Eric Engestrom wrote:
> On Tuesday, 2018-03-20 13:39:25 -0700, Scott D Phillips wrote:
>> When building intel_tiled_memcpy for i686, the stack will only be
>> 4-byte aligned. This isn't sufficient for SSE temporaries which
>> require 16-byte alignment. Use the force_align_arg_pointer
>> function attribute in that case to ensure sufficient alignment.
>> ---
>>  src/mesa/drivers/dri/i965/intel_tiled_memcpy.c | 8 +++++++-
>>  1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c b/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c
>> index 69306828d72..bd8bafbd2d7 100644
>> --- a/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c
>> +++ b/src/mesa/drivers/dri/i965/intel_tiled_memcpy.c
>> @@ -42,6 +42,12 @@
>>  #include
>>  #endif
>>
>> +#if defined(__GNUC__) && defined(__i386__) && (defined(__SSSE3__) || defined(__SSE2__))
>
> Is that a typo? s/SSSE3/SSE3/ ?

Nope, that's correct. SSSE3 is "Supplemental" SSE3, and we have
compile-time code in this file that uses it over SSE2 to save a few
instructions.
[Mesa-dev] [PATCH mesa] meson/configure: detect endian.h instead of trying to guess when it's available
Cc: Maxin B. John
Cc: Khem Raj
Suggested-by: Jon Turney
Signed-off-by: Eric Engestrom
---
 configure.ac        | 1 +
 meson.build         | 2 +-
 src/util/u_endian.h | 2 +-
 3 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/configure.ac b/configure.ac
index 29d3c3457a7cdaefc36a..36c56da787e4fab5a355 100644
--- a/configure.ac
+++ b/configure.ac
@@ -865,6 +865,7 @@ fi
 AC_HEADER_MAJOR
 AC_CHECK_HEADER([xlocale.h], [DEFINES="$DEFINES -DHAVE_XLOCALE_H"])
 AC_CHECK_HEADER([sys/sysctl.h], [DEFINES="$DEFINES -DHAVE_SYS_SYSCTL_H"])
+AC_CHECK_HEADER([endian.h], [DEFINES="$DEFINES -DHAVE_ENDIAN_H"])
 AC_CHECK_FUNC([strtof], [DEFINES="$DEFINES -DHAVE_STRTOF"])
 AC_CHECK_FUNC([mkostemp], [DEFINES="$DEFINES -DHAVE_MKOSTEMP"])
 AC_CHECK_FUNC([timespec_get], [DEFINES="$DEFINES -DHAVE_TIMESPEC_GET"])
diff --git a/meson.build b/meson.build
index 88518ec0f0e9b81759a7..1132b4bd37075d8c9d21 100644
--- a/meson.build
+++ b/meson.build
@@ -904,7 +904,7 @@ elif cc.has_header_symbol('sys/mkdev.h', 'major')
   pre_args += '-DMAJOR_IN_MKDEV'
 endif
 
-foreach h : ['xlocale.h', 'sys/sysctl.h', 'linux/futex.h']
+foreach h : ['xlocale.h', 'sys/sysctl.h', 'linux/futex.h', 'endian.h']
   if cc.compiles('#include <@0@>'.format(h), name : '@0@ works'.format(h))
     pre_args += '-DHAVE_@0@'.format(h.to_upper().underscorify())
   endif
diff --git a/src/util/u_endian.h b/src/util/u_endian.h
index 22d011ec0086ee77e11c..e11b381588dbc960e8c3 100644
--- a/src/util/u_endian.h
+++ b/src/util/u_endian.h
@@ -27,7 +27,7 @@
 #ifndef U_ENDIAN_H
 #define U_ENDIAN_H
 
-#if defined(__GLIBC__) || defined(ANDROID) || defined(__CYGWIN__)
+#ifdef HAVE_ENDIAN_H
 #include <endian.h>
 
 #if __BYTE_ORDER == __LITTLE_ENDIAN
-- 
Cheers,
  Eric
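To make the effect of the patch concrete: once the build system defines HAVE_ENDIAN_H, a header can key off that macro instead of guessing from libc or OS macros. The sketch below is not the contents of u_endian.h; SKETCH_LITTLE_ENDIAN and runtime_is_little_endian() are invented names, and the fallback to the compiler's predefined __BYTE_ORDER__ is an assumption added here so the example also builds when the macro is not defined.

```c
#include <stdint.h>

#ifdef HAVE_ENDIAN_H
/* Path the patch enables: the build system found <endian.h>. */
#include <endian.h>
#if __BYTE_ORDER == __LITTLE_ENDIAN
#define SKETCH_LITTLE_ENDIAN 1
#elif __BYTE_ORDER == __BIG_ENDIAN
#define SKETCH_LITTLE_ENDIAN 0
#endif
#elif defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__)
/* Assumed fallback for this sketch only: GCC/Clang predefined macros. */
#define SKETCH_LITTLE_ENDIAN (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
#endif

#ifndef SKETCH_LITTLE_ENDIAN
#error "byte order could not be determined at compile time"
#endif

/* Runtime probe, used only to cross-check the compile-time answer. */
static int
runtime_is_little_endian(void)
{
   const uint16_t probe = 1;
   return *(const uint8_t *)&probe == 1;
}
```

Building this with and without -DHAVE_ENDIAN_H on a glibc or musl system should give the same answer either way, which is the point of detecting the header rather than enumerating platforms.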
[Mesa-dev] [PATCH] gallium/u_vbuf: Protect against overflow with large instance divisors.
GTF-GLES3.gtf.GL3Tests.instanced_arrays.instanced_arrays_divisor uses -1
as a divisor, so we would overflow to count=0 and upload no data,
triggering the assert below. We want to upload 1 element in this case,
fixing the test on VC5.
---
 src/gallium/auxiliary/util/u_vbuf.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/util/u_vbuf.c b/src/gallium/auxiliary/util/u_vbuf.c
index 95d7990c6ca4..9073f3feed98 100644
--- a/src/gallium/auxiliary/util/u_vbuf.c
+++ b/src/gallium/auxiliary/util/u_vbuf.c
@@ -936,7 +936,12 @@ u_vbuf_upload_buffers(struct u_vbuf *mgr,
          size = mgr->ve->src_format_size[i];
       } else if (instance_div) {
          /* Per-instance attrib. */
-         unsigned count = (num_instances + instance_div - 1) / instance_div;
+         unsigned count = (num_instances + instance_div - 1);
+
+         if (count < num_instances)
+            count = 0xffffffff;
+         count /= instance_div;
+
          first += vb->stride * start_instance;
          size = vb->stride * (count - 1) + mgr->ve->src_format_size[i];
       } else {
-- 
2.16.2
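The arithmetic behind this fix can be tried in isolation. With a divisor of -1 (0xffffffff as unsigned), the usual round-up expression num_instances + instance_div - 1 wraps around, so the division yields 0. The sketch below mirrors the patched logic in a standalone function; instance_count() is a name chosen for this example and UINT_MAX stands in for the saturation constant, on the assumption that unsigned is 32 bits as in the driver.

```c
#include <limits.h>

/* Number of per-instance elements to upload: ceil(num_instances /
 * instance_div), guarded against unsigned wrap-around in the round-up.
 * Standalone illustration; instance_div must be nonzero.
 */
static unsigned
instance_count(unsigned num_instances, unsigned instance_div)
{
   unsigned count = num_instances + instance_div - 1;

   /* Unsigned addition wrapped if the sum is smaller than an operand;
    * saturate so the division cannot produce 0.
    */
   if (count < num_instances)
      count = UINT_MAX;

   return count / instance_div;
}
```

For 4 instances and a divisor of UINT_MAX the sum wraps to 2, the guard saturates it, and the result is 1 element, matching the commit message. Small divisors take the ordinary round-up path unchanged.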