Re: [Mesa-dev] [PATCH 3/6] nir: Try to make sense of the nir_shader_compiler_options code.
Kenneth Graunke <kenn...@whitecape.org> writes:

> The code in glsl_to_nir is entirely dead, as we translate from GLSL to
> NIR at link time, when there isn't a _mesa_glsl_parse_state to pass,
> so every caller passes NULL.
>
> glsl_to_nir seems like the wrong place to try and create the shader
> compiler options structure anyway - tgsi_to_nir, prog_to_nir, and other
> translators all would have to duplicate that code. The driver should
> set this up once with whatever settings it wants, and pass it in.
>
> Eric also added a NirOptions field to ctx->Const.ShaderCompilerOptions[]
> and left a comment saying: "The memory for the options is expected to be
> kept in a single static copy by the driver." This suggests the plan
> was to do exactly that. That pointer was not marked const, however,
> and the dead code used a mix of static structures and ralloced ones.
>
> This patch deletes the dead code in glsl_to_nir, instead making it take
> the shader compiler options as a mandatory argument. It creates an
> (empty) options struct in the i965 driver, and makes NirOptions point
> to that. It marks the pointer const so that we can actually do so
> without generating "discards const qualifier" compiler warnings.

Reviewed-by: Eric Anholt <e...@anholt.net>

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] Mesa 10.4.6
Mesa 10.4.6 has been released. Mesa 10.4.6 is a bug fix release fixing
bugs since the 10.4.5 release (see below for a list of changes).

The tag in the git repository for Mesa 10.4.6 is 'mesa-10.4.6'.

Mesa 10.4.6 is available for download at
ftp://freedesktop.org/pub/mesa/10.4.6/

SHA-256 checksums:
46c9082142e811c01e49a2c332a9ac0a1eb98f2908985fb9df216539d7eaeaf4  MesaLib-10.4.6.tar.gz
d8baedd20e79ccd98a5a7b05e23d59a30892e68de1fcc057ca6873dafca02735  MesaLib-10.4.6.tar.bz2
6aded6eac7f0d4d55117b8b581d8424710bbb4c768fc90f7b881f29311a751aa  MesaLib-10.4.6.zip

I have verified building from the .tar.bz2 file by doing:

    tar xjf MesaLib-10.4.6.tar.bz2
    cd Mesa-10.4.6
    ./configure --enable-gallium-llvm
    make -j6
    make -j6 install

I have also verified that I pushed the tag.

-Emil

--
Changes from 10.4.5 to 10.4.6:

Abdiel Janulgue (2):
      glsl: Don't optimize min/max into saturate when EmitNoSat is set
      st/mesa: For vertex shaders, don't emit saturate when SM 3.0 is unsupported

Andreas Boll (1):
      glx: Fix returned values of GLX_RENDERER_PREFERRED_PROFILE_MESA

Brian Paul (2):
      swrast: fix multiple color buffer writing
      st/mesa: fix sampler view reference counting bug in glDraw/CopyPixels

Chris Forbes (1):
      i965/gs: Check newly-generated GS-out VUE map against correct stage

Eduardo Lima Mitev (1):
      mesa: Fix error validating args for TexSubImage3D

Emil Velikov (7):
      docs: Add sha256 sums for the 10.4.5 release
      install-lib-links: remove the .install-lib-links file
      Revert "mesa: Correct backwards NULL check."
      mesa: cherry-pick the second half of commit 2aa71e9485a
      Revert "gallivm: Update for RTDyldMemoryManager becoming an unique_ptr."
      Update version to 10.4.6
      Add release notes for the 10.4.6 release

Ian Romanick (3):
      mesa: Add missing error checks in _mesa_ProgramBinary
      mesa: Ensure that length is set to zero in _mesa_GetProgramBinary
      mesa: Always generate GL_INVALID_OPERATION in _mesa_GetProgramBinary

Jonathan Gray (1):
      auxilary/os: correct sysctl use in os_get_total_physical_memory()

José Fonseca (1):
      gallivm: Update for RTDyldMemoryManager becoming an unique_ptr.

Leo Liu (1):
      st/omx/dec/h264: fix picture out-of-order with poc type 0 v2

Lucas Stach (1):
      install-lib-links: don't depend on .libs directory

Marek Olšák (2):
      vbo: fix an unitialized-variable warning
      radeonsi: fix point sprites

Matt Turner (4):
      glsl: Rewrite and fix min/max to saturate optimization.
      mesa: Correct backwards NULL check.
      i965/fs: Don't use backend_visitor::instructions after creating the CFG.
      mesa: Correct backwards NULL check.
Re: [Mesa-dev] [PATCH 2/2] radeonsi/compute: Use value from compiler for COMPUTE_PGM_RSRC1.FLOAT_MODE
Reviewed-by: Marek Olšák <marek.ol...@amd.com>

Marek

On Fri, Mar 6, 2015 at 4:53 PM, Tom Stellard <thomas.stell...@amd.com> wrote:
> ---
>  src/gallium/drivers/radeonsi/si_compute.c | 3 ++-
>  src/gallium/drivers/radeonsi/si_shader.c  | 1 +
>  src/gallium/drivers/radeonsi/si_shader.h  | 1 +
>  3 files changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/radeonsi/si_compute.c b/src/gallium/drivers/radeonsi/si_compute.c
> index 5009f69..8609b89 100644
> --- a/src/gallium/drivers/radeonsi/si_compute.c
> +++ b/src/gallium/drivers/radeonsi/si_compute.c
> @@ -377,7 +377,8 @@ static void si_launch_grid(
>                  * XXX: The compiler should account for this.
>                  */
>                 | S_00B848_SGPRS(((MAX2(4 + arg_user_sgpr_count,
> -                                       shader->num_sgprs)) - 1) / 8))
> +                                       shader->num_sgprs)) - 1) / 8)
> +               | S_00B028_FLOAT_MODE(shader->float_mode))
>                 ;
>
>         lds_blocks = shader->lds_size;
> diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
> index b0417ed..87aef4d 100644
> --- a/src/gallium/drivers/radeonsi/si_shader.c
> +++ b/src/gallium/drivers/radeonsi/si_shader.c
> @@ -2546,6 +2546,7 @@ void si_shader_binary_read_config(const struct si_screen *sscreen,
>                 case R_00B848_COMPUTE_PGM_RSRC1:
>                         shader->num_sgprs = MAX2(shader->num_sgprs, (G_00B028_SGPRS(value) + 1) * 8);
>                         shader->num_vgprs = MAX2(shader->num_vgprs, (G_00B028_VGPRS(value) + 1) * 4);
> +                       shader->float_mode = G_00B028_FLOAT_MODE(value);
>                         break;
>                 case R_00B02C_SPI_SHADER_PGM_RSRC2_PS:
>                         shader->lds_size = MAX2(shader->lds_size, G_00B02C_EXTRA_LDS_SIZE(value));
> diff --git a/src/gallium/drivers/radeonsi/si_shader.h b/src/gallium/drivers/radeonsi/si_shader.h
> index 551c7dc..4f2bb91 100644
> --- a/src/gallium/drivers/radeonsi/si_shader.h
> +++ b/src/gallium/drivers/radeonsi/si_shader.h
> @@ -149,6 +149,7 @@ struct si_shader {
>         unsigned                num_vgprs;
>         unsigned                lds_size;
>         unsigned                spi_ps_input_ena;
> +       unsigned                float_mode;
>         unsigned                scratch_bytes_per_wave;
>         unsigned                spi_shader_col_format;
>         unsigned                spi_shader_z_format;
> --
> 2.0.4
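For readers following along, here is a minimal sketch of the round trip this patch sets up: the config reader pulls FLOAT_MODE out of the compiler-provided register value, and the launch path ORs it back into the value it programs. The macro and function names are hypothetical, and the field layout (8 bits starting at bit 12) is an assumption made for illustration; the real S_00B028_*/G_00B028_* macros live in the driver's register headers.

```c
#include <stdint.h>

/* Assumed layout: an 8-bit FLOAT_MODE field starting at bit 12.
 * These stand in for the driver's S_/G_ register macros. */
#define SKETCH_FLOAT_MODE_SHIFT 12u
#define SKETCH_FLOAT_MODE_MASK  0xFFu

/* Place a field value into its position in the register word. */
static inline uint32_t sketch_set_float_mode(uint32_t mode)
{
    return (mode & SKETCH_FLOAT_MODE_MASK) << SKETCH_FLOAT_MODE_SHIFT;
}

/* Extract the field from a full register value. */
static inline uint32_t sketch_get_float_mode(uint32_t reg)
{
    return (reg >> SKETCH_FLOAT_MODE_SHIFT) & SKETCH_FLOAT_MODE_MASK;
}

/* Combine driver-chosen bits with the float mode the compiler asked for. */
static inline uint32_t sketch_build_rsrc1(uint32_t other_bits,
                                          uint32_t compiler_rsrc1)
{
    uint32_t field = SKETCH_FLOAT_MODE_MASK << SKETCH_FLOAT_MODE_SHIFT;
    uint32_t float_mode = sketch_get_float_mode(compiler_rsrc1);

    return (other_bits & ~field) | sketch_set_float_mode(float_mode);
}
```

The point of the design is that the driver no longer hard-codes the float mode; whatever the compiler wrote into its config section is carried through unchanged.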
[Mesa-dev] [PATCH] glsl: Generate link error for non-matching gl_FragCoord redeclarations
in different fragment shaders. This also applies to the case where
gl_FragCoord is redeclared with no layout qualifiers in one fragment
shader and is used, but not redeclared, in another fragment shader.

Signed-off-by: Anuj Phogat <anuj.pho...@gmail.com>
Khronos Bug#12957
Cc: "10.5" <mesa-sta...@lists.freedesktop.org>
Cc: Ian Romanick <i...@freedesktop.org>
---
 src/glsl/linker.cpp | 15 ++-------------
 1 file changed, 2 insertions(+), 13 deletions(-)

diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
index e11b6fa..e8bda4f 100644
--- a/src/glsl/linker.cpp
+++ b/src/glsl/linker.cpp
@@ -1365,24 +1365,13 @@ link_fs_input_layout_qualifiers(struct gl_shader_program *prog,
        *  "If gl_FragCoord is redeclared in any fragment shader in a program,
        *   it must be redeclared in all the fragment shaders in that program
        *   that have a static use gl_FragCoord."
-       *
-       * Exclude the case when one of the 'linked_shader' or 'shader' redeclares
-       * gl_FragCoord with no layout qualifiers but the other one doesn't
-       * redeclare it. If we strictly follow GLSL 1.50 spec's language, it
-       * should be a link error. But, generating link error for this case will
-       * be a wrong behaviour which spec didn't intend to do and it could also
-       * break some applications.
        */
       if ((linked_shader->redeclares_gl_fragcoord
            && !shader->redeclares_gl_fragcoord
-           && shader->uses_gl_fragcoord
-           && (linked_shader->origin_upper_left
-               || linked_shader->pixel_center_integer))
+           && shader->uses_gl_fragcoord)
           || (shader->redeclares_gl_fragcoord
               && !linked_shader->redeclares_gl_fragcoord
-              && linked_shader->uses_gl_fragcoord
-              && (shader->origin_upper_left
-                  || shader->pixel_center_integer))) {
+              && linked_shader->uses_gl_fragcoord)) {
             linker_error(prog, "fragment shader defined with conflicting "
                          "layout qualifiers for gl_FragCoord\n");
       }
--
1.9.3
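The simplified condition can be read as a small symmetric predicate. Below is a minimal sketch of it outside the linker; the struct and function names are hypothetical, and only the two flags the new check actually reads are modelled.

```c
#include <stdbool.h>

/* Hypothetical, reduced view of per-shader state. */
struct sketch_fs_info {
    bool redeclares_gl_fragcoord;
    bool uses_gl_fragcoord;
};

/* Returns true when linking these two fragment shaders should fail:
 * one redeclares gl_FragCoord while the other uses it without
 * redeclaring it. Layout qualifiers no longer matter for this test. */
static bool sketch_fragcoord_conflict(const struct sketch_fs_info *linked,
                                      const struct sketch_fs_info *incoming)
{
    return (linked->redeclares_gl_fragcoord &&
            !incoming->redeclares_gl_fragcoord &&
            incoming->uses_gl_fragcoord) ||
           (incoming->redeclares_gl_fragcoord &&
            !linked->redeclares_gl_fragcoord &&
            linked->uses_gl_fragcoord);
}
```

A shader that neither redeclares nor uses gl_FragCoord never triggers the error, which matches the "static use" wording in the quoted spec text.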
[Mesa-dev] [PATCH] meta: Plug memory leak in blit shader creation
It looks like this has existed since:

commit f5a477ab76b6e0b268387699cd2253a43db0dfae
Author: Ian Romanick <ian.d.roman...@intel.com>
Date:   Mon Dec 16 11:54:08 2013 -0800

    meta: Refactor shader generation code out of mipmap generation path

Valgrind was complaining on the piglit test:
fbo-generatemipmap-formats GL_ARB_texture_float -auto -fbo

Cc: Ian Romanick <ian.d.roman...@intel.com>
Cc: Brian Paul <bri...@vmware.com>
Cc: Eric Anholt <e...@anholt.net>
Reported-by: Mark Janes <mark.a.ja...@intel.com> (Jenkins)
Signed-off-by: Ben Widawsky <b...@bwidawsk.net>
---
 src/mesa/drivers/common/meta.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
index fdc4cf1..2c1abe3 100644
--- a/src/mesa/drivers/common/meta.c
+++ b/src/mesa/drivers/common/meta.c
@@ -270,6 +270,7 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx,

    if (shader->shader_prog != 0) {
       _mesa_UseProgram(shader->shader_prog);
+      ralloc_free(mem_ctx);
       return;
    }

--
2.3.1
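The bug pattern here is a scratch allocation made before an early return. A minimal sketch follows, using malloc/free and a live-allocation counter in place of ralloc; every name is hypothetical, and it assumes (as the patch implies) that the function allocates its context up front and releases it on the normal path.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a cached shader slot. */
struct sketch_blit_shader {
    unsigned prog; /* non-zero once the program has been built */
};

/* Counts live scratch allocations so a leak would be observable. */
static int sketch_live_allocs = 0;

static void *sketch_ctx_new(void)
{
    sketch_live_allocs++;
    return malloc(16);
}

static void sketch_ctx_free(void *ctx)
{
    sketch_live_allocs--;
    free(ctx);
}

static void sketch_setup_blit_shader(struct sketch_blit_shader *shader)
{
    void *mem_ctx = sketch_ctx_new();

    if (shader->prog != 0) {
        /* Cached program: nothing to build, but the scratch context
         * still has to be released before the early return. */
        sketch_ctx_free(mem_ctx);
        return;
    }

    shader->prog = 1; /* pretend to compile and link */
    sketch_ctx_free(mem_ctx);
}
```

Without the free on the cached path, every call after the first would leave one allocation behind, which is the kind of repeated small leak Valgrind reports.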
Re: [Mesa-dev] [PATCH 2/2] i965: Throttle to the previous frame
On 03/06/2015 01:58 PM, Chris Wilson wrote:
> In order to facilitate the concurrency offered by triple buffering and to
> offset the latency induced by swapping via an external process, which may
> incur extra rendering itself, only throttle to the previous frame and not
> the last.
>
> The second issue that mostly affects swap benchmarks, but also can incur
> jitter in the throttling, is that the throttle bo is closer to the next
> SwapBuffers rather than immediately after the previous SwapBuffers.
>
> Throttling to the previous frame doubles the maximum possible latency at
> the benefit of improving throughput and reducing jitter.
>
> v2: Rename first_post_swapbuffer batches array to a plain throttle_batch[]
> as the pluralisation was contorting the name and not making it clear as to
> whether it was the first batch or first_post_swap batch. Not least of
> which was that not all throttle points are SwapBuffers.
>
> Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
> Cc: Daniel Vetter <daniel.vet...@ffwll.ch>
> Cc: Kenneth Graunke <kenn...@whitecape.org>
> Cc: Ben Widawsky <b...@bwidawsk.net>
> Cc: Kristian Høgsberg <k...@bitplanet.net>
> Cc: Chad Versace <chad.vers...@linux.intel.com>

Both patches
Reviewed-by: Chad Versace <chad.vers...@intel.com>
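One way to picture "throttle to the previous frame" is a two-slot history of throttle markers. The sketch below is only my reading of the commit message, not the driver code: the struct, the function, and the use of plain integers as frame markers are all hypothetical.

```c
/* Hypothetical throttle state: slot 0 holds the marker of the most
 * recently recorded frame, slot 1 the marker of the frame before it.
 * 0 means "empty". */
struct sketch_throttle {
    int batch[2];
};

/* Called at a throttle point with the marker of the frame being
 * recorded. Returns the marker the caller should wait on, or 0 if
 * there is nothing old enough to wait for yet. */
static int sketch_throttle_point(struct sketch_throttle *t, int new_marker)
{
    int wait_on = t->batch[1]; /* the older slot, not the latest one */

    t->batch[1] = t->batch[0];
    t->batch[0] = new_marker;
    return wait_on;
}
```

Waiting on the older slot lets one more frame stay in flight, which is where the extra latency and the improved throughput mentioned in the commit message both come from.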
Re: [Mesa-dev] [PATCH 06/13] i965: Simplify generator code for untyped surface messages.
On Fri, Feb 27, 2015 at 05:34:49PM +0200, Francisco Jerez wrote:
> The generate_untyped_*() methods do nothing useful other than calling
> the corresponding function from brw_eu_emit.c. The calls to
> brw_mark_surface_used() will go away too in a future commit.
> ---
>  src/mesa/drivers/dri/i965/brw_fs.h               | 11 --
>  src/mesa/drivers/dri/i965/brw_fs_generator.cpp   | 42 +--
>  src/mesa/drivers/dri/i965/brw_vec4.h             |  9 -
>  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 43 +---
>  4 files changed, 18 insertions(+), 87 deletions(-)

Reviewed-by: Topi Pohjolainen <topi.pohjolai...@intel.com>
Re: [Mesa-dev] [PATCH 03/13] i965: Pass number of components explicitly to brw_untyped_atomic and _surface_read.
On Fri, Feb 27, 2015 at 05:34:46PM +0200, Francisco Jerez wrote:
> And calculate the message response size based on the number of
> components rather than the other way around. This simplifies their
> interface somewhat and allows the caller to request a writeback
> message with more than one vector component in SIMD4x2 mode.
> ---
>  src/mesa/drivers/dri/i965/brw_eu.h               |  4 ++--
>  src/mesa/drivers/dri/i965/brw_eu_emit.c          | 30 +++-
>  src/mesa/drivers/dri/i965/brw_fs_generator.cpp   |  9 ---
>  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp |  5 ++--
>  4 files changed, 32 insertions(+), 16 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_eu.h b/src/mesa/drivers/dri/i965/brw_eu.h
> index 9b1e0e2..87a9f3f 100644
> --- a/src/mesa/drivers/dri/i965/brw_eu.h
> +++ b/src/mesa/drivers/dri/i965/brw_eu.h
> @@ -403,7 +403,7 @@ brw_untyped_atomic(struct brw_compile *p,
>                     unsigned atomic_op,
>                     unsigned bind_table_index,
>                     unsigned msg_length,
> -                   unsigned response_length);
> +                   bool response_expected);

I had to think about this somewhat, but after reading the rest of the
series I think this makes sense.

Reviewed-by: Topi Pohjolainen <topi.pohjolai...@intel.com>

>
>  void
>  brw_untyped_surface_read(struct brw_compile *p,
> @@ -411,7 +411,7 @@ brw_untyped_surface_read(struct brw_compile *p,
>                           struct brw_reg mrf,
>                           unsigned bind_table_index,
>                           unsigned msg_length,
> -                         unsigned response_length);
> +                         unsigned num_channels);
>
>  void
>  brw_pixel_interpolator_query(struct brw_compile *p,
> diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> index cd2ce92..2b1d6ff 100644
> --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> @@ -2729,6 +2729,20 @@ brw_svb_write(struct brw_compile *p,
>                             send_commit_msg); /* send_commit_msg */
>  }
>
> +static unsigned
> +brw_surface_payload_size(struct brw_compile *p,
> +                         unsigned num_channels,
> +                         bool has_simd4x2,
> +                         bool has_simd16)
> +{
> +   if (has_simd4x2 && brw_inst_access_mode(p->brw, p->current) == BRW_ALIGN_16)
> +      return 1;
> +   else if (has_simd16 && p->compressed)
> +      return 2 * num_channels;
> +   else
> +      return num_channels;
> +}
> +
>  static void
>  brw_set_dp_untyped_atomic_message(struct brw_compile *p,
>                                    brw_inst *insn,
> @@ -2782,7 +2796,8 @@ brw_untyped_atomic(struct brw_compile *p,
>                     unsigned atomic_op,
>                     unsigned bind_table_index,
>                     unsigned msg_length,
> -                   unsigned response_length) {
> +                   bool response_expected)
> +{
>     const struct brw_context *brw = p->brw;
>     brw_inst *insn = brw_next_insn(p, BRW_OPCODE_SEND);
>
> @@ -2790,7 +2805,9 @@ brw_untyped_atomic(struct brw_compile *p,
>     brw_set_src0(p, insn, retype(payload, BRW_REGISTER_TYPE_UD));
>     brw_set_src1(p, insn, brw_imm_d(0));
>     brw_set_dp_untyped_atomic_message(
> -      p, insn, atomic_op, bind_table_index, msg_length, response_length,
> +      p, insn, atomic_op, bind_table_index, msg_length,
> +      brw_surface_payload_size(p, response_expected,
> +                               brw->gen >= 8 || brw->is_haswell, true),
>        brw_inst_access_mode(brw, insn) == BRW_ALIGN_1);
>  }
>
> @@ -2800,12 +2817,12 @@ brw_set_dp_untyped_surface_read_message(struct brw_compile *p,
>                                          unsigned bind_table_index,
>                                          unsigned msg_length,
>                                          unsigned response_length,
> +                                        unsigned num_channels,
>                                          bool header_present)
>  {
>     const struct brw_context *brw = p->brw;
>     const unsigned dispatch_width =
>        (brw_inst_exec_size(brw, insn) == BRW_EXECUTE_16 ? 16 : 8);
> -   const unsigned num_channels = response_length / (dispatch_width / 8);
>
>     if (brw->gen >= 8 || brw->is_haswell) {
>        brw_set_message_descriptor(p, insn, HSW_SFID_DATAPORT_DATA_CACHE_1,
> @@ -2843,7 +2860,7 @@ brw_untyped_surface_read(struct brw_compile *p,
>                           struct brw_reg mrf,
>                           unsigned bind_table_index,
>                           unsigned msg_length,
> -                         unsigned response_length)
> +                         unsigned num_channels)
>  {
>     const struct brw_context *brw = p->brw;
>     brw_inst *insn = next_insn(p, BRW_OPCODE_SEND);
> @@ -2851,8 +2868,9 @@ brw_untyped_surface_read(struct brw_compile *p,
>     brw_set_dest(p, insn, retype(dest, BRW_REGISTER_TYPE_UD));
>     brw_set_src0(p, insn, retype(mrf, BRW_REGISTER_TYPE_UD));
Re: [Mesa-dev] [PATCH 1/5] i915: Remove unused IS_MOBILE macro
On Thu, Mar 05, 2015 at 11:49:54AM -0800, Ian Romanick wrote:
> From: Ian Romanick <ian.d.roman...@intel.com>
>
> Inspired by Damien's recent libdrm changes.
>
> Signed-off-by: Ian Romanick <ian.d.roman...@intel.com>
> Cc: Damien Lespiau <damien.lesp...@intel.com>

For the whole series (not that my r-b tag has a lot of value on the mesa
code base):

Reviewed-by: Damien Lespiau <damien.lesp...@intel.com>

--
Damien

> ---
>  src/mesa/drivers/dri/i915/intel_chipset.h | 10 ----------
>  1 file changed, 10 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i915/intel_chipset.h b/src/mesa/drivers/dri/i915/intel_chipset.h
> index 3828085..d05fd08 100644
> --- a/src/mesa/drivers/dri/i915/intel_chipset.h
> +++ b/src/mesa/drivers/dri/i915/intel_chipset.h
> @@ -53,16 +53,6 @@
>  #define IS_PNVG(devid)          (devid == PCI_CHIP_PNV_G)
>  #define IS_PNV(devid)           (IS_PNVG(devid) || IS_PNVGM(devid))
>
> -#define IS_MOBILE(devid)        (devid == PCI_CHIP_I855_GM || \
> -                                 devid == PCI_CHIP_I915_GM || \
> -                                 devid == PCI_CHIP_I945_GM || \
> -                                 devid == PCI_CHIP_I945_GME || \
> -                                 devid == PCI_CHIP_I965_GM || \
> -                                 devid == PCI_CHIP_I965_GME || \
> -                                 devid == PCI_CHIP_GM45_GM || \
> -                                 IS_PNV(devid) || \
> -                                 devid == PCI_CHIP_ILM_G)
> -
>  #define IS_915(devid)           (devid == PCI_CHIP_I915_G || \
>                                   devid == PCI_CHIP_E7221_G || \
>                                   devid == PCI_CHIP_I915_GM)
> --
> 2.1.0
[Mesa-dev] [PATCH 1/6] glsl: Mark array access when copying to a temporary for the ?: operator.
Piglit's spec/glsl-1.20/compiler/structure-and-array-operations/
array-selection.vert test contains the following code:

   gl_Position = (pick_from_a_or_b ? a : b)[i];

where "a" and "b" are uniform vec4[2] variables. ast_to_hir creates a
temporary vec4[2] variable, conditional_tmp, and generates an if-block
to copy one or the other:

   (declare (temporary) (array vec4 2) conditional_tmp)
   (if (var_ref pick_from_a_or_b)
       ((assign () (var_ref conditional_tmp) (var_ref a)))
       ((assign () (var_ref conditional_tmp) (var_ref b))))

However, we failed to update max_array_access for "a" and "b", so it
remained 0 - here, the whole array is being accessed. At link time,
update_array_sizes() used this bogus information to change the types
of "a" and "b" to vec4[1]. We then had assignments from a vec4[1] to
a vec4[2], which is highly illegal.

This tripped assertions in nir_split_var_copies with scalar VS.

Signed-off-by: Kenneth Graunke <kenn...@whitecape.org>
Cc: mesa-sta...@lists.freedesktop.org
---
 src/glsl/ast_to_hir.cpp | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index acb5c76..d387b2e 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -1617,6 +1617,12 @@ ast_expression::do_hir(exec_list *instructions,
           && cond_val != NULL) {
          result = cond_val->value.b[0] ? op[1] : op[2];
       } else {
+         /* The copy to conditional_tmp reads the whole array. */
+         if (type->is_array()) {
+            mark_whole_array_access(op[1]);
+            mark_whole_array_access(op[2]);
+         }
+
          ir_variable *const tmp =
             new(ctx) ir_variable(type, "conditional_tmp", ir_var_temporary);
          instructions->push_tail(tmp);
--
2.2.1
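To make the failure mode concrete, here is a minimal sketch of the bookkeeping involved. The struct and functions are hypothetical simplifications: they model only a declared length and a highest-used index, and assume the link-time shrink keeps elements up to that index.

```c
/* Hypothetical bookkeeping for one array variable. */
struct sketch_array_var {
    unsigned length;           /* declared number of elements */
    unsigned max_array_access; /* highest element index known to be used */
};

/* Record that the whole array is read, e.g. by an array-to-array copy. */
static void sketch_mark_whole_array_access(struct sketch_array_var *v)
{
    if (v->length > 0)
        v->max_array_access = v->length - 1;
}

/* Link-time shrink: keep only the elements up to the highest one used. */
static void sketch_update_array_size(struct sketch_array_var *v)
{
    v->length = v->max_array_access + 1;
}
```

If a whole-array copy is never recorded, a two-element array with max_array_access left at 0 shrinks to one element, while the temporary it is copied into keeps two: the size mismatch described above.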
[Mesa-dev] [PATCH 6/6] nir: Only do gl_FrontFacing workaround in glsl_to_nir for the FS.
Vertex shaders can have shader inputs where location happens to be
VARYING_SLOT_FACE. Without predicating this on the shader stage,
we suddenly end up with load_front_face intrinsics in vertex shaders,
which is nonsensical.

Fixes spec/arb_vertex_buffer_object/pos-array when using NIR for VS.

Signed-off-by: Kenneth Graunke <kenn...@whitecape.org>
---
 src/glsl/nir/glsl_to_nir.cpp | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp
index ddad207..047cb51 100644
--- a/src/glsl/nir/glsl_to_nir.cpp
+++ b/src/glsl/nir/glsl_to_nir.cpp
@@ -251,7 +251,8 @@ nir_visitor::visit(ir_variable *ir)
       break;

    case ir_var_shader_in:
-      if (ir->data.location == VARYING_SLOT_FACE) {
+      if (stage == MESA_SHADER_FRAGMENT &&
+          ir->data.location == VARYING_SLOT_FACE) {
          /* For whatever reason, GLSL IR makes gl_FrontFacing an input */
          var->data.location = SYSTEM_VALUE_FRONT_FACE;
          var->data.mode = nir_var_system_value;
--
2.2.1
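The fix boils down to a predicate that checks the stage before trusting the slot number. A minimal sketch, with a hypothetical enum and a made-up slot value standing in for the real Mesa definitions:

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the shader stage enum and the slot number. */
enum sketch_stage { SKETCH_STAGE_VERTEX, SKETCH_STAGE_FRAGMENT };
#define SKETCH_SLOT_FACE 22

/* An input is the front-facing system value only in a fragment shader;
 * in a vertex shader the same number is an ordinary input location. */
static bool sketch_is_front_face_input(enum sketch_stage stage, int location)
{
    return stage == SKETCH_STAGE_FRAGMENT && location == SKETCH_SLOT_FACE;
}
```

The same integer means different things in different stages, so the location alone is not enough to decide.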
[Mesa-dev] [PATCH 2/6] nir: Delete nir_shader::user_structures and num_user_structures.
Nothing actually uses these, and the only caller of glsl_to_nir()
(brw_fs_nir.cpp) always passes NULL for the _mesa_glsl_parse_state
pointer, meaning they'll always be NULL and 0, respectively.

Just delete them.

Signed-off-by: Kenneth Graunke <kenn...@whitecape.org>
---
 src/glsl/nir/glsl_to_nir.cpp | 11 -----------
 src/glsl/nir/nir.c           |  3 ---
 src/glsl/nir/nir.h           |  4 ----
 src/glsl/nir/nir_print.c     |  4 ----
 4 files changed, 22 deletions(-)

diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp
index adef19c..b82e5c7 100644
--- a/src/glsl/nir/glsl_to_nir.cpp
+++ b/src/glsl/nir/glsl_to_nir.cpp
@@ -154,17 +154,6 @@ glsl_to_nir(exec_list *ir, _mesa_glsl_parse_state *state,

    nir_shader *shader = nir_shader_create(NULL, options);

-   if (state) {
-      shader->num_user_structures = state->num_user_structures;
-      shader->user_structures = ralloc_array(shader, glsl_type *,
-                                             shader->num_user_structures);
-      memcpy(shader->user_structures, state->user_structures,
-            shader->num_user_structures * sizeof(glsl_type *));
-   } else {
-      shader->num_user_structures = 0;
-      shader->user_structures = NULL;
-   }
-
    nir_visitor v1(shader, native_integers);
    nir_function_visitor v2(&v1);
    v2.run(ir);
diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c
index ab57fd4..abad3f8 100644
--- a/src/glsl/nir/nir.c
+++ b/src/glsl/nir/nir.c
@@ -42,9 +42,6 @@ nir_shader_create(void *mem_ctx, const nir_shader_compiler_options *options)

    shader->options = options;

-   shader->num_user_structures = 0;
-   shader->user_structures = NULL;
-
    exec_list_make_empty(&shader->functions);
    exec_list_make_empty(&shader->registers);
    exec_list_make_empty(&shader->globals);
diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
index d5df596..b935354 100644
--- a/src/glsl/nir/nir.h
+++ b/src/glsl/nir/nir.h
@@ -1400,10 +1400,6 @@ typedef struct nir_shader {
    /** list of global register in the shader */
    struct exec_list registers;

-   /** structures used in this shader */
-   unsigned num_user_structures;
-   struct glsl_type **user_structures;
-
    /** next available global register index */
    unsigned reg_alloc;

diff --git a/src/glsl/nir/nir_print.c b/src/glsl/nir/nir_print.c
index 30d4821..f8b14a1 100644
--- a/src/glsl/nir/nir_print.c
+++ b/src/glsl/nir/nir_print.c
@@ -844,10 +844,6 @@ nir_print_shader(nir_shader *shader, FILE *fp)
    print_var_state state;
    init_print_state(&state);

-   for (unsigned i = 0; i < shader->num_user_structures; i++) {
-      glsl_print_struct(shader->user_structures[i], fp);
-   }
-
    struct hash_entry *entry;

    hash_table_foreach(shader->uniforms, entry) {
--
2.2.1
[Mesa-dev] [PATCH 4/6] nir: Add native_integers to nir_shader_compiler_options.
glsl_to_nir, tgsi_to_nir, and prog_to_nir all want to know whether the
driver supports native integers. Presumably other passes may as well.

Adding this to nir_shader_compiler_options is an easy way to provide
that information, as it's accessible via nir_shader::options.

Signed-off-by: Kenneth Graunke <kenn...@whitecape.org>
---
 src/glsl/nir/glsl_to_nir.cpp             | 11 +++++------
 src/glsl/nir/glsl_to_nir.h               |  2 +-
 src/glsl/nir/nir.h                       |  6 ++++++
 src/mesa/drivers/dri/i965/brw_context.c  |  4 +++-
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp |  2 +-
 5 files changed, 16 insertions(+), 9 deletions(-)

diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp
index 7e40ef4..0d96e03 100644
--- a/src/glsl/nir/glsl_to_nir.cpp
+++ b/src/glsl/nir/glsl_to_nir.cpp
@@ -43,7 +43,7 @@ namespace {
 class nir_visitor : public ir_visitor
 {
 public:
-   nir_visitor(nir_shader *shader, bool supports_ints);
+   nir_visitor(nir_shader *shader);
    ~nir_visitor();

    virtual void visit(ir_variable *);
@@ -125,12 +125,11 @@ private:
 }; /* end of anonymous namespace */

 nir_shader *
-glsl_to_nir(exec_list *ir, bool native_integers,
-            const nir_shader_compiler_options *options)
+glsl_to_nir(exec_list *ir, const nir_shader_compiler_options *options)
 {
    nir_shader *shader = nir_shader_create(NULL, options);

-   nir_visitor v1(shader, native_integers);
+   nir_visitor v1(shader);
    nir_function_visitor v2(&v1);
    v2.run(ir);
    visit_exec_list(ir, &v1);
@@ -138,9 +137,9 @@ glsl_to_nir(exec_list *ir, bool native_integers,
    return shader;
 }

-nir_visitor::nir_visitor(nir_shader *shader, bool supports_ints)
+nir_visitor::nir_visitor(nir_shader *shader)
 {
-   this->supports_ints = supports_ints;
+   this->supports_ints = shader->options->native_integers;
    this->shader = shader;
    this->is_global = true;
    this->var_table = _mesa_hash_table_create(NULL, _mesa_hash_pointer,
diff --git a/src/glsl/nir/glsl_to_nir.h b/src/glsl/nir/glsl_to_nir.h
index 7300945..dd62793 100644
--- a/src/glsl/nir/glsl_to_nir.h
+++ b/src/glsl/nir/glsl_to_nir.h
@@ -32,7 +32,7 @@ extern "C" {
 #endif

-nir_shader *glsl_to_nir(exec_list *ir, bool native_integers,
+nir_shader *glsl_to_nir(exec_list *ir,
                         const nir_shader_compiler_options *options);

 #ifdef __cplusplus
diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
index b935354..669a26e 100644
--- a/src/glsl/nir/nir.h
+++ b/src/glsl/nir/nir.h
@@ -1370,6 +1370,12 @@ typedef struct nir_shader_compiler_options {
    bool lower_fsqrt;
    /** lowers fneg and ineg to fsub and isub. */
    bool lower_negate;
+
+   /**
+    * Does the driver support real 32-bit integers?  (Otherwise, integers
+    * are simulated by floats.)
+    */
+   bool native_integers;
 } nir_shader_compiler_options;

 typedef struct nir_shader {
diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
index 892d4ca..03547e9 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -551,7 +551,9 @@ brw_initialize_context_constants(struct brw_context *brw)
       ctx->Const.Program[MESA_SHADER_FRAGMENT].MaxInputComponents = 128;
    }

-   static const nir_shader_compiler_options nir_options = {};
+   static const nir_shader_compiler_options nir_options = {
+      .native_integers = true,
+   };

    /* We want the GLSL compiler to emit code that uses condition codes */
    for (int i = 0; i < MESA_SHADER_STAGES; i++) {
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index e24bf92..ccb5cea 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -87,7 +87,7 @@ fs_visitor::emit_nir_code()
    /* first, lower the GLSL IR shader to NIR */
    lower_output_reads(shader->base.ir);

-   nir_shader *nir = glsl_to_nir(shader->base.ir, true, options);
+   nir_shader *nir = glsl_to_nir(shader->base.ir, options);
    nir_validate_shader(nir);

    nir_lower_global_vars_to_local(nir);
--
2.2.1
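As a rough illustration of the pattern this patch relies on, the sketch below keeps one static const options struct on the "driver" side and lets the shader object carry a const pointer to it, so any pass can query a capability without an extra parameter. All type and function names are hypothetical simplifications of the NIR ones.

```c
#include <stdbool.h>

/* Hypothetical, cut-down versions of the options and shader structs. */
struct sketch_compiler_options {
    bool lower_negate;
    bool native_integers;
};

struct sketch_shader {
    const struct sketch_compiler_options *options;
};

/* Driver-owned, single static copy; unnamed fields default to false. */
static const struct sketch_compiler_options sketch_driver_options = {
    .native_integers = true,
};

/* The shader only stores the pointer; it never copies or frees it. */
static struct sketch_shader
sketch_shader_create(const struct sketch_compiler_options *options)
{
    struct sketch_shader s = { .options = options };
    return s;
}

/* Any translator or pass can read the capability from the shader. */
static bool sketch_supports_ints(const struct sketch_shader *s)
{
    return s->options->native_integers;
}
```

Because the pointer is const and the struct is static, nothing downstream can modify the options or has to worry about their lifetime.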
[Mesa-dev] [PATCH 3/6] nir: Try to make sense of the nir_shader_compiler_options code.
The code in glsl_to_nir is entirely dead, as we translate from GLSL to
NIR at link time, when there isn't a _mesa_glsl_parse_state to pass,
so every caller passes NULL.

glsl_to_nir seems like the wrong place to try and create the shader
compiler options structure anyway - tgsi_to_nir, prog_to_nir, and other
translators all would have to duplicate that code. The driver should
set this up once with whatever settings it wants, and pass it in.

Eric also added a NirOptions field to ctx->Const.ShaderCompilerOptions[]
and left a comment saying: "The memory for the options is expected to be
kept in a single static copy by the driver." This suggests the plan
was to do exactly that. That pointer was not marked const, however,
and the dead code used a mix of static structures and ralloced ones.

This patch deletes the dead code in glsl_to_nir, instead making it take
the shader compiler options as a mandatory argument. It creates an
(empty) options struct in the i965 driver, and makes NirOptions point
to that. It marks the pointer const so that we can actually do so
without generating "discards const qualifier" compiler warnings.

Signed-off-by: Kenneth Graunke <kenn...@whitecape.org>
Cc: Eric Anholt <e...@anholt.net>
---
 src/glsl/nir/glsl_to_nir.cpp             | 28 ++--------------------------
 src/glsl/nir/glsl_to_nir.h               |  4 ++--
 src/mesa/drivers/dri/i965/brw_context.c  |  5 +++++
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp |  5 ++++-
 src/mesa/main/mtypes.h                   |  2 +-
 5 files changed, 14 insertions(+), 30 deletions(-)

Eric, does this look reasonable to you?

diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp
index b82e5c7..7e40ef4 100644
--- a/src/glsl/nir/glsl_to_nir.cpp
+++ b/src/glsl/nir/glsl_to_nir.cpp
@@ -124,34 +124,10 @@ private:

 }; /* end of anonymous namespace */

-static const nir_shader_compiler_options default_options = {
-};
-
 nir_shader *
-glsl_to_nir(exec_list *ir, _mesa_glsl_parse_state *state,
-            bool native_integers)
+glsl_to_nir(exec_list *ir, bool native_integers,
+            const nir_shader_compiler_options *options)
 {
-   const nir_shader_compiler_options *options;
-
-   if (state) {
-      struct gl_context *ctx = state->ctx;
-      struct gl_shader_compiler_options *gl_options =
-         &ctx->Const.ShaderCompilerOptions[state->stage];
-
-      if (!gl_options->NirOptions) {
-         nir_shader_compiler_options *new_options =
-            rzalloc(ctx, nir_shader_compiler_options);
-         options = gl_options->NirOptions = new_options;
-
-         if (gl_options->EmitNoPow)
-            new_options->lower_fpow = true;
-      } else {
-         options = gl_options->NirOptions;
-      }
-   } else {
-      options = &default_options;
-   }
-
    nir_shader *shader = nir_shader_create(NULL, options);

    nir_visitor v1(shader, native_integers);
diff --git a/src/glsl/nir/glsl_to_nir.h b/src/glsl/nir/glsl_to_nir.h
index 58b2cee..7300945 100644
--- a/src/glsl/nir/glsl_to_nir.h
+++ b/src/glsl/nir/glsl_to_nir.h
@@ -32,8 +32,8 @@ extern "C" {
 #endif

-nir_shader *glsl_to_nir(exec_list * ir, _mesa_glsl_parse_state *state,
-                        bool native_integers);
+nir_shader *glsl_to_nir(exec_list *ir, bool native_integers,
+                        const nir_shader_compiler_options *options);

 #ifdef __cplusplus
 }
diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
index 786e6f5..892d4ca 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -68,6 +68,8 @@
 #include "tnl/t_pipeline.h"
 #include "util/ralloc.h"

+#include "glsl/nir/nir.h"
+
 /***
  * Mesa's Driver Functions
  ***/
@@ -549,6 +551,8 @@ brw_initialize_context_constants(struct brw_context *brw)
       ctx->Const.Program[MESA_SHADER_FRAGMENT].MaxInputComponents = 128;
    }

+   static const nir_shader_compiler_options nir_options = {};
+
    /* We want the GLSL compiler to emit code that uses condition codes */
    for (int i = 0; i < MESA_SHADER_STAGES; i++) {
       ctx->Const.ShaderCompilerOptions[i].MaxIfDepth = brw->gen < 6 ? 16 : UINT_MAX;
@@ -562,6 +566,7 @@ brw_initialize_context_constants(struct brw_context *brw)
          (i == MESA_SHADER_FRAGMENT);
       ctx->Const.ShaderCompilerOptions[i].EmitNoIndirectUniform = false;
       ctx->Const.ShaderCompilerOptions[i].LowerClipDistance = true;
+      ctx->Const.ShaderCompilerOptions[i].NirOptions = &nir_options;
    }

    ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].OptimizeForAOS = true;
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index a0300aa..e24bf92 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -82,9 +82,12 @@ count_nir_instrs(nir_shader *nir)
 void
 fs_visitor::emit_nir_code()
 {
+   const nir_shader_compiler_options *options =
+
[Mesa-dev] [PATCH 5/6] nir: Plumb the shader stage into glsl_to_nir().
The next commit needs to know the shader stage in glsl_to_nir(). To
facilitate that, we pass the gl_shader rather than the raw exec_list of
instructions. This has both the exec_list and the stage.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/glsl/nir/glsl_to_nir.cpp             | 14 ++++++++------
 src/glsl/nir/glsl_to_nir.h               |  2 +-
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp |  2 +-
 3 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp
index 0d96e03..ddad207 100644
--- a/src/glsl/nir/glsl_to_nir.cpp
+++ b/src/glsl/nir/glsl_to_nir.cpp
@@ -43,7 +43,7 @@ namespace {
 class nir_visitor : public ir_visitor
 {
 public:
-   nir_visitor(nir_shader *shader);
+   nir_visitor(nir_shader *shader, gl_shader_stage stage);
    ~nir_visitor();

    virtual void visit(ir_variable *);
@@ -83,6 +83,7 @@ private:
    bool supports_ints;

    nir_shader *shader;
+   gl_shader_stage stage;
    nir_function_impl *impl;
    exec_list *cf_node_list;
    nir_instr *result; /* result of the expression tree last visited */
@@ -125,22 +126,23 @@ private:
 }; /* end of anonymous namespace */

 nir_shader *
-glsl_to_nir(exec_list *ir, const nir_shader_compiler_options *options)
+glsl_to_nir(struct gl_shader *sh, const nir_shader_compiler_options *options)
 {
    nir_shader *shader = nir_shader_create(NULL, options);

-   nir_visitor v1(shader);
+   nir_visitor v1(shader, sh->Stage);
    nir_function_visitor v2(&v1);
-   v2.run(ir);
-   visit_exec_list(ir, &v1);
+   v2.run(sh->ir);
+   visit_exec_list(sh->ir, &v1);

    return shader;
 }

-nir_visitor::nir_visitor(nir_shader *shader)
+nir_visitor::nir_visitor(nir_shader *shader, gl_shader_stage stage)
 {
    this->supports_ints = shader->options->native_integers;
    this->shader = shader;
+   this->stage = stage;
    this->is_global = true;
    this->var_table = _mesa_hash_table_create(NULL, _mesa_hash_pointer,
                                              _mesa_key_pointer_equal);
diff --git a/src/glsl/nir/glsl_to_nir.h b/src/glsl/nir/glsl_to_nir.h
index dd62793..3801e8c 100644
--- a/src/glsl/nir/glsl_to_nir.h
+++ b/src/glsl/nir/glsl_to_nir.h
@@ -32,7 +32,7 @@ extern "C" {
 #endif

-nir_shader *glsl_to_nir(exec_list *ir,
+nir_shader *glsl_to_nir(struct gl_shader *sh,
                         const nir_shader_compiler_options *options);

 #ifdef __cplusplus
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index ccb5cea..3bb6806 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -87,7 +87,7 @@ fs_visitor::emit_nir_code()

    /* first, lower the GLSL IR shader to NIR */
    lower_output_reads(shader->base.ir);
-   nir_shader *nir = glsl_to_nir(shader->base.ir, options);
+   nir_shader *nir = glsl_to_nir(&shader->base, options);
    nir_validate_shader(nir);

    nir_lower_global_vars_to_local(nir);
-- 
2.2.1
Re: [Mesa-dev] [PATCH 2/2] glx: remove unneeded ifdef _WIN32 guard
On 03/06/2015 05:34 AM, Emil Velikov wrote:
> The C99 header exists on other platforms as well.
>
> Signed-off-by: Emil Velikov emil.l.veli...@gmail.com
> ---
>  src/glx/glxclient.h | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/src/glx/glxclient.h b/src/glx/glxclient.h
> index a140c87..122ae5d 100644
> --- a/src/glx/glxclient.h
> +++ b/src/glx/glxclient.h
> @@ -47,9 +47,7 @@
>  #include <string.h>
>  #include <stdlib.h>
>  #include <stdio.h>
> -#ifdef _WIN32
>  #include <stdint.h>
> -#endif
>  #include "GL/glxproto.h"
>  #include "glxconfig.h"
>  #include "glxhash.h"

LGTM. I don't think we ever build this on Windows anyway.

Reviewed-by: Brian Paul bri...@vmware.com
Re: [Mesa-dev] [PATCH] i965: Throttle rendering to an fbo
On Thu, Mar 05, 2015 at 02:38:44PM -0800, Ian Romanick wrote:
> On 03/04/2015 10:28 AM, Chad Versace wrote:
> > That text does not appear in the GL spec. When I read the manpage
> > alongside the GL spec, to get a more complete context, I think the
> > manpage contains that phrase simply to contrast with glFinish. In my
> > reading, it does not imply that glFlush may wait for *some* previously
> > issued GL commands to complete.
>
> glFlush was invented to support indirect rendering (especially to the
> front buffer): it flushes the buffer in libGL to the xserver. If you're
> making any other assumptions about what it does or does not do...
> continue at your own peril.

Well, plan B is that the kernel throttles for you. But that's going to
cause a fuss, since at least for benchmarking the kernel kinda lacks the
necessary information to make informed calls about when to throttle and
how.

So is there no GL entry point in Mesa we can abuse to make this happen?
Citing the spec doesn't make the real-world issue go away.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
Re: [Mesa-dev] [PATCH 3/6] nir: Try to make sense of the nir_shader_compiler_options code.
Acked-by: Jason Ekstrand jason.ekstr...@intel.com

On Mar 6, 2015 2:18 AM, Kenneth Graunke kenn...@whitecape.org wrote:
> The code in glsl_to_nir is entirely dead, as we translate from GLSL to NIR
> at link time, when there isn't a _mesa_glsl_parse_state to pass, so every
> caller passes NULL.
>
> glsl_to_nir seems like the wrong place to try and create the shader
> compiler options structure anyway - tgsi_to_nir, prog_to_nir, and other
> translators all would have to duplicate that code. The driver should
> set this up once with whatever settings it wants, and pass it in.
>
> Eric also added a NirOptions field to ctx->Const.ShaderCompilerOptions[]
> and left a comment saying: "The memory for the options is expected to be
> kept in a single static copy by the driver." This suggests the plan was
> to do exactly that. That pointer was not marked const, however, and the
> dead code used a mix of static structures and ralloced ones.
>
> This patch deletes the dead code in glsl_to_nir, instead making it take
> the shader compiler options as a mandatory argument. It creates an
> (empty) options struct in the i965 driver, and makes NirOptions point
> to that. It marks the pointer const so that we can actually do so
> without generating "discards const qualifier" compiler warnings.
>
> Signed-off-by: Kenneth Graunke kenn...@whitecape.org
> Cc: Eric Anholt e...@anholt.net
> ---
>  src/glsl/nir/glsl_to_nir.cpp             | 28 ++--------------------------
>  src/glsl/nir/glsl_to_nir.h               |  4 ++--
>  src/mesa/drivers/dri/i965/brw_context.c  |  5 +++++
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp |  5 ++++-
>  src/mesa/main/mtypes.h                   |  2 +-
>  5 files changed, 14 insertions(+), 30 deletions(-)
>
> Eric, does this look reasonable to you?
>
> diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp
> index b82e5c7..7e40ef4 100644
> --- a/src/glsl/nir/glsl_to_nir.cpp
> +++ b/src/glsl/nir/glsl_to_nir.cpp
> @@ -124,34 +124,10 @@ private:
>
>  }; /* end of anonymous namespace */
>
> -static const nir_shader_compiler_options default_options = {
> -};
> -
>  nir_shader *
> -glsl_to_nir(exec_list *ir, _mesa_glsl_parse_state *state,
> -            bool native_integers)
> +glsl_to_nir(exec_list *ir, bool native_integers,
> +            const nir_shader_compiler_options *options)
>  {
> -   const nir_shader_compiler_options *options;
> -
> -   if (state) {
> -      struct gl_context *ctx = state->ctx;
> -      struct gl_shader_compiler_options *gl_options =
> -         &ctx->Const.ShaderCompilerOptions[state->stage];
> -
> -      if (!gl_options->NirOptions) {
> -         nir_shader_compiler_options *new_options =
> -            rzalloc(ctx, nir_shader_compiler_options);
> -         options = gl_options->NirOptions = new_options;
> -
> -         if (gl_options->EmitNoPow)
> -            new_options->lower_fpow = true;
> -      } else {
> -         options = gl_options->NirOptions;
> -      }
> -   } else {
> -      options = &default_options;
> -   }
> -
>     nir_shader *shader = nir_shader_create(NULL, options);
>
>     nir_visitor v1(shader, native_integers);
> diff --git a/src/glsl/nir/glsl_to_nir.h b/src/glsl/nir/glsl_to_nir.h
> index 58b2cee..7300945 100644
> --- a/src/glsl/nir/glsl_to_nir.h
> +++ b/src/glsl/nir/glsl_to_nir.h
> @@ -32,8 +32,8 @@ extern "C" {
>  #endif
>
> -nir_shader *glsl_to_nir(exec_list * ir, _mesa_glsl_parse_state *state,
> -                        bool native_integers);
> +nir_shader *glsl_to_nir(exec_list *ir, bool native_integers,
> +                        const nir_shader_compiler_options *options);
>
>  #ifdef __cplusplus
>  }
> diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
> index 786e6f5..892d4ca 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.c
> +++ b/src/mesa/drivers/dri/i965/brw_context.c
> @@ -68,6 +68,8 @@
>  #include "tnl/t_pipeline.h"
>  #include "util/ralloc.h"
>
> +#include "glsl/nir/nir.h"
> +
>  /***
>   * Mesa's Driver Functions
>   ***/
> @@ -549,6 +551,8 @@ brw_initialize_context_constants(struct brw_context *brw)
>        ctx->Const.Program[MESA_SHADER_FRAGMENT].MaxInputComponents = 128;
>     }
>
> +   static const nir_shader_compiler_options nir_options = {};
> +
>     /* We want the GLSL compiler to emit code that uses condition codes */
>     for (int i = 0; i < MESA_SHADER_STAGES; i++) {
>        ctx->Const.ShaderCompilerOptions[i].MaxIfDepth = brw->gen < 6 ? 16 : UINT_MAX;
> @@ -562,6 +566,7 @@ brw_initialize_context_constants(struct brw_context *brw)
>           (i == MESA_SHADER_FRAGMENT);
>        ctx->Const.ShaderCompilerOptions[i].EmitNoIndirectUniform = false;
>        ctx->Const.ShaderCompilerOptions[i].LowerClipDistance = true;
> +      ctx->Const.ShaderCompilerOptions[i].NirOptions = &nir_options;
>     }
>
>     ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].OptimizeForAOS = true;
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> index a0300aa..e24bf92 100644
> ---
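The ownership pattern the commit message argues for can be sketched outside Mesa. This is only a minimal model, not the actual i965 code: `compiler_options`, `shader`, `stage_options`, `driver_init_options` and `translate` are hypothetical stand-ins for `nir_shader_compiler_options`, `nir_shader`, the `NirOptions` field, the driver's context setup, and a translator such as glsl_to_nir.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for nir_shader_compiler_options. */
struct compiler_options {
   bool lower_fpow;
};

/* Hypothetical stand-in for nir_shader: it only borrows the options. */
struct shader {
   const struct compiler_options *options;
};

/* The driver keeps exactly one immutable copy for its whole lifetime. */
static const struct compiler_options driver_options = {
   .lower_fpow = true,
};

/* Hypothetical per-stage table, playing the role of the NirOptions field.
 * The pointers are const-qualified, so storing the address of a static
 * const struct does not discard any qualifier. */
enum { NUM_STAGES = 3 };
static const struct compiler_options *stage_options[NUM_STAGES];

/* Done once by the driver, not by each translator. */
static void
driver_init_options(void)
{
   for (int i = 0; i < NUM_STAGES; i++)
      stage_options[i] = &driver_options;
}

/* The translator takes the options as a mandatory argument and never
 * allocates or guesses them itself. */
static struct shader
translate(const struct compiler_options *options)
{
   assert(options != NULL);
   struct shader s = { .options = options };
   return s;
}
```

With this shape, any additional translator only needs the same extra parameter; none of them has to duplicate the logic that decides which lowering flags a driver wants.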
Re: [Mesa-dev] [PATCH 1/2] util: rework _MSC_VER >= 1200 checks
On 06/03/15 14:26, Brian Paul wrote:
> On 03/06/2015 05:34 AM, Emil Velikov wrote:
> > Replace the _MSC_VER >= 1200 with defined (_MSC_VER) and compact
> > if/else statements. We require MSVC 2008 or later with commit
> > 46110c5d564.
> >
> > Signed-off-by: Emil Velikov emil.l.veli...@gmail.com
> > ---
> >  src/util/macros.h | 8 +++-----
> >  1 file changed, 3 insertions(+), 5 deletions(-)
> >
> > diff --git a/src/util/macros.h b/src/util/macros.h
> > index b862bfd..63daba3 100644
> > --- a/src/util/macros.h
> > +++ b/src/util/macros.h
> > @@ -73,15 +73,13 @@ do {                        \
> >     assert(!str);            \
> >     __builtin_unreachable(); \
> >  } while (0)
> > -#elif _MSC_VER >= 1200
> > +#elif defined (_MSC_VER)
> >  #define unreachable(str)    \
> >  do {                        \
> >     assert(!str);            \
> >     __assume(0);             \
> >  } while (0)
> > -#endif
> > -
> > -#ifndef unreachable
> > +#else
> >  #define unreachable(str) assert(!str)
> >  #endif
> >
> > @@ -99,7 +97,7 @@ do {                        \
> >  #define assume(expr) ((expr) ? ((void) 0) \
> >                               : (assert(!"assumption failed"), \
> >                                  __builtin_unreachable()))
> > -#elif _MSC_VER >= 1200
> > +#elif defined (_MSC_VER)
> >  #define assume(expr) __assume(expr)
> >  #else
> >  #define assume(expr) assert(expr)
>
> Building with this patch now and looks good so far.
>
> Reviewed-by: Brian Paul bri...@vmware.com

Looks good to me too. The minimum _MSC_VER we need to worry about is
1500 -- MSVC 2008.

Jose
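For readers who want to try the compacted if/elif/else chain in isolation, here is a small self-contained sketch. It assumes `__GNUC__` as the test for `__builtin_unreachable()` (the real header may use a different configure-time check), and `stage_name` with its enum is a hypothetical caller invented for illustration.

```c
#include <assert.h>

/* One chain, three outcomes: GCC/Clang builtin, MSVC intrinsic, or a
 * plain assert as the portable fallback. */
#if defined(__GNUC__)
#define unreachable(str)    \
do {                        \
   assert(!str);            \
   __builtin_unreachable(); \
} while (0)
#elif defined(_MSC_VER)
#define unreachable(str)    \
do {                        \
   assert(!str);            \
   __assume(0);             \
} while (0)
#else
#define unreachable(str) assert(!str)
#endif

/* Hypothetical caller: every valid enum value returns inside the switch. */
enum stage { STAGE_VERTEX, STAGE_FRAGMENT };

static const char *
stage_name(enum stage s)
{
   switch (s) {
   case STAGE_VERTEX:
      return "vertex";
   case STAGE_FRAGMENT:
      return "fragment";
   }
   /* Only reached if the caller passed a value outside the enum. */
   unreachable("invalid shader stage");
}
```

Because the fallback now lives in the `#else` arm, exactly one definition of `unreachable` is ever visible, which is what lets the patch drop the separate `#ifndef unreachable` block.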
[Mesa-dev] [PATCH 1/2] i965: Throttle rendering to an fbo
When rendering to an fbo, even though it may be acting as a winsys
frontbuffer or just generally, we never throttle. However, when rendering
to an fbo, there is no natural frame boundary. Conventionally we use
SwapBuffers and glFinish, but potential callers often avoid glFinish for
being too heavy-handed (waiting on all outstanding rendering to complete).
The kernel provides a soft-throttling option for this case that waits for
rendering older than 20ms to be complete (that's a little too lax to be
used for swapbuffers, but is here a useful safety net). The remaining
choice is then either never to throttle, to throttle after every draw
call, or to throttle at an intermediate user-defined point such as glFlush
and thus all the implied flushes. This patch opts for the latter as that
is the current method used for flushing to front buffers.

v2: Defer the throttling from inside the flush to the next
intel_prepare_render() and switch non-fbo frontbuffer throttling over to
use the same lax method. The issue being that
glFlush()/intel_prepare_read() is just as likely to be called inside a
tight loop and not at frame boundaries.

Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk
Cc: Daniel Vetter daniel.vet...@ffwll.ch
Cc: Kenneth Graunke kenn...@whitecape.org
Cc: Ben Widawsky b...@bwidawsk.net
Cc: Kristian Høgsberg k...@bitplanet.net
Cc: Chad Versace chad.vers...@linux.intel.com
---
 src/mesa/drivers/dri/i965/brw_context.c  | 15 +++++++++++----
 src/mesa/drivers/dri/i965/brw_context.h  |  3 ++-
 src/mesa/drivers/dri/i965/intel_screen.c |  8 ++++----
 3 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
index 972e458..2ed5f16 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -232,8 +232,8 @@ intel_glFlush(struct gl_context *ctx)

    intel_batchbuffer_flush(brw);
    intel_flush_front(ctx);
-   if (brw_is_front_buffer_drawing(ctx->DrawBuffer))
-      brw->need_throttle = true;
+
+   brw->need_front_throttle = true;
 }

 static void
@@ -1238,12 +1238,19 @@ intel_prepare_render(struct brw_context *brw)
     * the swap, and getting our hands on that doesn't seem worth it,
     * so we just us the first batch we emitted after the last swap.
     */
-   if (brw->need_throttle && brw->first_post_swapbuffers_batch) {
+   if (brw->need_swap_throttle && brw->first_post_swapbuffers_batch) {
       if (!brw->disable_throttling)
          drm_intel_bo_wait_rendering(brw->first_post_swapbuffers_batch);
       drm_intel_bo_unreference(brw->first_post_swapbuffers_batch);
       brw->first_post_swapbuffers_batch = NULL;
-      brw->need_throttle = false;
+      brw->need_swap_throttle = false;
+      brw->need_front_throttle = false;
+   }
+
+   if (brw->need_front_throttle) {
+      __DRIscreen *psp = brw->intelScreen->driScrnPriv;
+      drmCommandNone(psp->fd, DRM_I915_GEM_THROTTLE);
+      brw->need_front_throttle = false;
    }
 }

diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index 682fbe9..b90e050 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -1031,7 +1031,8 @@ struct brw_context

    /** Framerate throttling: @{ */
    drm_intel_bo *first_post_swapbuffers_batch;
-   bool need_throttle;
+   bool need_swap_throttle;
+   bool need_front_throttle;
    /** @} */

    GLuint stats_wm;
diff --git a/src/mesa/drivers/dri/i965/intel_screen.c b/src/mesa/drivers/dri/i965/intel_screen.c
index cea7ddf..044388a 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.c
+++ b/src/mesa/drivers/dri/i965/intel_screen.c
@@ -174,10 +174,10 @@ intel_dri2_flush_with_flags(__DRIcontext *cPriv,
    if (flags & __DRI2_FLUSH_DRAWABLE)
       intel_resolve_for_dri2_flush(brw, dPriv);

-   if (reason == __DRI2_THROTTLE_SWAPBUFFER ||
-       reason == __DRI2_THROTTLE_FLUSHFRONT) {
-      brw->need_throttle = true;
-   }
+   if (reason == __DRI2_THROTTLE_SWAPBUFFER)
+      brw->need_swap_throttle = true;
+   if (reason == __DRI2_THROTTLE_FLUSHFRONT)
+      brw->need_front_throttle = true;

    intel_batchbuffer_flush(brw);

-- 
2.1.4
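The deferred two-flag scheme described in the v2 note can be modelled without any DRM calls. The following is a hedged sketch only: `throttle_state` and its counters are hypothetical, with `strict_waits` standing in for waiting on a specific batch and `lax_waits` for the kernel's soft throttle request.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the throttle state kept in the context. */
struct throttle_state {
   bool need_swap_throttle;
   bool need_front_throttle;
   bool have_post_swap_batch; /* stands in for first_post_swapbuffers_batch */
   int strict_waits;          /* times we waited on a specific batch */
   int lax_waits;             /* times we asked for the soft throttle */
};

/* A flush only records that throttling is wanted; it never blocks. */
static void
on_flush(struct throttle_state *t)
{
   t->need_front_throttle = true;
}

/* A swap requests the stricter, per-frame throttle. */
static void
on_swap(struct throttle_state *t)
{
   t->need_swap_throttle = true;
}

/* The actual waiting is deferred to the start of the next rendering. */
static void
prepare_render(struct throttle_state *t)
{
   if (t->need_swap_throttle && t->have_post_swap_batch) {
      t->strict_waits++;
      t->have_post_swap_batch = false;
      t->need_swap_throttle = false;
      t->need_front_throttle = false; /* the strict wait covers the lax one */
   }

   if (t->need_front_throttle) {
      t->lax_waits++;
      t->need_front_throttle = false;
   }
}
```

In this model, calling `on_flush` many times in a tight loop costs at most one lax wait at the next `prepare_render`, which is the behaviour the v2 change is after.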
Re: [Mesa-dev] [PATCH 1/6] glsl: Mark array access when copying to a temporary for the ?: operator.
I gave you an ack on 3; I'll let Eric actually review it. The rest are

Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com

On Mar 6, 2015 2:18 AM, Kenneth Graunke kenn...@whitecape.org wrote:
> Piglit's
> spec/glsl-1.20/compiler/structure-and-array-operations/array-selection.vert
> test contains the following code:
>
>    gl_Position = (pick_from_a_or_b ? a : b)[i];
>
> where a and b are uniform vec4[2] variables. ast_to_hir creates a
> temporary vec4[2] variable, conditional_tmp, and generates an
> if-block to copy one or the other:
>
>    (declare (temporary) (array vec4 2) conditional_tmp)
>    (if (var_ref pick_from_a_or_b)
>        ((assign () (var_ref conditional_tmp) (var_ref a)))
>        ((assign () (var_ref conditional_tmp) (var_ref b))))
>
> However, we failed to update max_array_access for a and b, so it remained
> 0 - here, the whole array is being accessed. At link time,
> update_array_sizes() used this bogus information to change the types of
> a and b to vec4[1]. We then had assignments from a vec4[1] to a
> vec4[2], which is highly illegal.
>
> This tripped assertions in nir_split_var_copies with scalar VS.
>
> Signed-off-by: Kenneth Graunke kenn...@whitecape.org
> Cc: mesa-sta...@lists.freedesktop.org
> ---
>  src/glsl/ast_to_hir.cpp | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
> index acb5c76..d387b2e 100644
> --- a/src/glsl/ast_to_hir.cpp
> +++ b/src/glsl/ast_to_hir.cpp
> @@ -1617,6 +1617,12 @@ ast_expression::do_hir(exec_list *instructions,
>            && cond_val != NULL) {
>           result = cond_val->value.b[0] ? op[1] : op[2];
>        } else {
> +         /* The copy to conditional_tmp reads the whole array. */
> +         if (type->is_array()) {
> +            mark_whole_array_access(op[1]);
> +            mark_whole_array_access(op[2]);
> +         }
> +
>           ir_variable *const tmp =
>              new(ctx) ir_variable(type, "conditional_tmp", ir_var_temporary);
>           instructions->push_tail(tmp);
> --
> 2.2.1
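The failure mode in the commit message comes down to a usage counter that a whole-array copy forgot to bump. The sketch below is a deliberately simplified model under stated assumptions: `array_var`, `mark_access`, `mark_whole_access` and `shrink_to_used` are hypothetical names, loosely mirroring `max_array_access`, `mark_whole_array_access()` and the link-time resizing in `update_array_sizes()`.

```c
#include <assert.h>

/* Hypothetical model of an array variable as the linker sees it. */
struct array_var {
   unsigned length;           /* declared size, e.g. 2 for vec4[2] */
   unsigned max_array_access; /* highest index known to be used */
};

/* Record that element `index` is used. */
static void
mark_access(struct array_var *v, unsigned index)
{
   if (index > v->max_array_access)
      v->max_array_access = index;
}

/* A whole-array copy touches every element, including the last one. */
static void
mark_whole_access(struct array_var *v)
{
   mark_access(v, v->length - 1);
}

/* Link-time shrink: keep only the elements that were marked as used. */
static void
shrink_to_used(struct array_var *v)
{
   v->length = v->max_array_access + 1;
}
```

If `a` is copied whole but never marked, `shrink_to_used` turns a two-element array into a one-element one, which then no longer matches the two-element temporary it is assigned to. Marking both operands before the copy keeps their declared size.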
Re: [Mesa-dev] [PATCH 1/2] util: rework _MSC_VER >= 1200 checks
On 03/06/2015 05:34 AM, Emil Velikov wrote:
> Replace the _MSC_VER >= 1200 with defined (_MSC_VER) and compact
> if/else statements. We require MSVC 2008 or later with commit
> 46110c5d564.
>
> Signed-off-by: Emil Velikov emil.l.veli...@gmail.com
> ---
>  src/util/macros.h | 8 +++-----
>  1 file changed, 3 insertions(+), 5 deletions(-)
>
> diff --git a/src/util/macros.h b/src/util/macros.h
> index b862bfd..63daba3 100644
> --- a/src/util/macros.h
> +++ b/src/util/macros.h
> @@ -73,15 +73,13 @@ do {                        \
>     assert(!str);            \
>     __builtin_unreachable(); \
>  } while (0)
> -#elif _MSC_VER >= 1200
> +#elif defined (_MSC_VER)
>  #define unreachable(str)    \
>  do {                        \
>     assert(!str);            \
>     __assume(0);             \
>  } while (0)
> -#endif
> -
> -#ifndef unreachable
> +#else
>  #define unreachable(str) assert(!str)
>  #endif
>
> @@ -99,7 +97,7 @@ do {                        \
>  #define assume(expr) ((expr) ? ((void) 0) \
>                               : (assert(!"assumption failed"), \
>                                  __builtin_unreachable()))
> -#elif _MSC_VER >= 1200
> +#elif defined (_MSC_VER)
>  #define assume(expr) __assume(expr)
>  #else
>  #define assume(expr) assert(expr)

Building with this patch now and looks good so far.

Reviewed-by: Brian Paul bri...@vmware.com
[Mesa-dev] [PATCH 2/2] i965: Throttle to the previous frame
In order to facilitate the concurrency offered by triple buffering and to
offset the latency induced by swapping via an external process, which may
incur extra rendering itself, only throttle to the previous frame and not
the last. This doubles the maximum possible latency at the benefit of
improving throughput and reducing jitter.

Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk
Cc: Daniel Vetter daniel.vet...@ffwll.ch
Cc: Kenneth Graunke kenn...@whitecape.org
Cc: Ben Widawsky b...@bwidawsk.net
Cc: Kristian Høgsberg k...@bitplanet.net
Cc: Chad Versace chad.vers...@linux.intel.com
---
 src/mesa/drivers/dri/i965/brw_context.c       | 19 ++++++++++++-------
 src/mesa/drivers/dri/i965/brw_context.h       |  2 +-
 src/mesa/drivers/dri/i965/intel_batchbuffer.c |  7 ++++---
 3 files changed, 17 insertions(+), 11 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
index 2ed5f16..6897c2c 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -928,8 +928,10 @@ intelDestroyContext(__DRIcontext * driContextPriv)

    intel_batchbuffer_free(brw);

-   drm_intel_bo_unreference(brw->first_post_swapbuffers_batch);
-   brw->first_post_swapbuffers_batch = NULL;
+   drm_intel_bo_unreference(brw->first_post_swapbuffers_batch[1]);
+   drm_intel_bo_unreference(brw->first_post_swapbuffers_batch[0]);
+   brw->first_post_swapbuffers_batch[1] = NULL;
+   brw->first_post_swapbuffers_batch[0] = NULL;

    driDestroyOptionCache(&brw->optionCache);

@@ -1238,11 +1240,14 @@ intel_prepare_render(struct brw_context *brw)
     * the swap, and getting our hands on that doesn't seem worth it,
     * so we just us the first batch we emitted after the last swap.
     */
-   if (brw->need_swap_throttle && brw->first_post_swapbuffers_batch) {
-      if (!brw->disable_throttling)
-         drm_intel_bo_wait_rendering(brw->first_post_swapbuffers_batch);
-      drm_intel_bo_unreference(brw->first_post_swapbuffers_batch);
-      brw->first_post_swapbuffers_batch = NULL;
+   if (brw->need_swap_throttle && brw->first_post_swapbuffers_batch[0]) {
+      if (brw->first_post_swapbuffers_batch[1]) {
+         if (!brw->disable_throttling)
+            drm_intel_bo_wait_rendering(brw->first_post_swapbuffers_batch[1]);
+         drm_intel_bo_unreference(brw->first_post_swapbuffers_batch[1]);
+      }
+      brw->first_post_swapbuffers_batch[1] = brw->first_post_swapbuffers_batch[0];
+      brw->first_post_swapbuffers_batch[0] = NULL;
       brw->need_swap_throttle = false;
       brw->need_front_throttle = false;
    }
diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index b90e050..e347f26 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -1030,7 +1030,7 @@ struct brw_context
    bool front_buffer_dirty;

    /** Framerate throttling: @{ */
-   drm_intel_bo *first_post_swapbuffers_batch;
+   drm_intel_bo *first_post_swapbuffers_batch[2];
    bool need_swap_throttle;
    bool need_front_throttle;
    /** @} */
diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
index 5ac4d18..460b4b9 100644
--- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
+++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
@@ -168,6 +168,7 @@ static void
 brw_new_batch(struct brw_context *brw)
 {
    /* Create a new batchbuffer and reset the associated state: */
+   drm_intel_gem_bo_clear_relocs(brw->batch.bo, 0);
    intel_batchbuffer_reset(brw);

    /* If the kernel supports hardware contexts, then most hardware state is
@@ -289,9 +290,9 @@ _intel_batchbuffer_flush(struct brw_context *brw,
    if (brw->batch.used == 0)
       return 0;

-   if (brw->first_post_swapbuffers_batch == NULL) {
-      brw->first_post_swapbuffers_batch = brw->batch.bo;
-      drm_intel_bo_reference(brw->first_post_swapbuffers_batch);
+   if (brw->first_post_swapbuffers_batch[0] == NULL) {
+      brw->first_post_swapbuffers_batch[0] = brw->batch.bo;
+      drm_intel_bo_reference(brw->first_post_swapbuffers_batch[0]);
    }

    if (unlikely(INTEL_DEBUG & DEBUG_BATCH)) {
-- 
2.1.4
Re: [Mesa-dev] [PATCH] Fix invalid extern "C" around header inclusion.
On 06/03/15 00:11, Mark Janes wrote:
> Matt Turner matts...@gmail.com writes:
>
> > On Thu, Mar 5, 2015 at 4:54 AM, Jose Fonseca jfons...@vmware.com wrote:
> > > Thanks for doing this. It looks great.
> > >
> > > Reviewed-by: Jose Fonseca jfons...@vmware.com
> > >
> > > Feel free to push my patch or I'll push it after you push this one.
> >
> > Thanks José. I've just pushed it.
>
> ^^ I think Matt is talking about my patch, not José's. The no_extern_c.h
> patch still needs to be pushed.

Thanks for the heads-up. I've just pushed mine.

Jose
Re: [Mesa-dev] [PATCH 05/13] i965: Fix the untyped surface opcodes to deal with indirect surface access.
Pohjolainen, Topi topi.pohjolai...@intel.com writes:

> On Fri, Mar 06, 2015 at 02:29:15PM +0200, Francisco Jerez wrote:
> > Pohjolainen, Topi topi.pohjolai...@intel.com writes:
> >
> > > On Fri, Feb 27, 2015 at 05:34:48PM +0200, Francisco Jerez wrote:
> > > > Change brw_untyped_atomic() and brw_untyped_surface_read() to take the
> > > > surface index as a register instead of a constant and to use
> > > > brw_send_indirect_message() to emit the indirect variant of send with a
> > > > dynamically calculated message descriptor. This will be required to
> > > > support variable indexing of image arrays for
> > > > ARB_shader_image_load_store.
> > > > ---
> > > >  src/mesa/drivers/dri/i965/brw_eu.h               |  10 +-
> > > >  src/mesa/drivers/dri/i965/brw_eu_emit.c          | 158 +--
> > > >  src/mesa/drivers/dri/i965/brw_fs_generator.cpp   |   4 +-
> > > >  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp |   4 +-
> > > >  4 files changed, 96 insertions(+), 80 deletions(-)
> > > >
> > > > diff --git a/src/mesa/drivers/dri/i965/brw_eu.h b/src/mesa/drivers/dri/i965/brw_eu.h
> > > > index 87a9f3f..9cc9123 100644
> > > > --- a/src/mesa/drivers/dri/i965/brw_eu.h
> > > > +++ b/src/mesa/drivers/dri/i965/brw_eu.h
> > > > @@ -398,18 +398,18 @@ void brw_CMP(struct brw_compile *p,
> > > >
> > > >  void
> > > >  brw_untyped_atomic(struct brw_compile *p,
> > > > -                   struct brw_reg dest,
> > > > +                   struct brw_reg dst,
> > > >                     struct brw_reg payload,
> > > > +                   struct brw_reg surface,
> > > >                     unsigned atomic_op,
> > > > -                   unsigned bind_table_index,
> > > >                     unsigned msg_length,
> > > >                     bool response_expected);
> > > >
> > > >  void
> > > >  brw_untyped_surface_read(struct brw_compile *p,
> > > > -                         struct brw_reg dest,
> > > > -                         struct brw_reg mrf,
> > > > -                         unsigned bind_table_index,
> > > > +                         struct brw_reg dst,
> > > > +                         struct brw_reg payload,
> > > > +                         struct brw_reg surface,
> > > >                           unsigned msg_length,
> > > >                           unsigned num_channels);
> > > >
> > > > diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > > > index 0b655d4..34695bf 100644
> > > > --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > > > +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > > > @@ -2518,6 +2518,48 @@ brw_send_indirect_message(struct brw_compile *p,
> > > >     return setup;
> > > >  }
> > > >
> > > > +static struct brw_inst *
> > > > +brw_send_indirect_surface_message(struct brw_compile *p,
> > > > +                                  unsigned sfid,
> > > > +                                  struct brw_reg dst,
> > > > +                                  struct brw_reg payload,
> > > > +                                  struct brw_reg surface,
> > > > +                                  unsigned message_len,
> > > > +                                  unsigned response_len,
> > > > +                                  bool header_present)
> > > > +{
> > > > +   const struct brw_context *brw = p->brw;
> > > > +   struct brw_inst *insn;
> > > > +
> > > > +   if (surface.file != BRW_IMMEDIATE_VALUE) {
> > > > +      struct brw_reg addr = retype(brw_address_reg(0), BRW_REGISTER_TYPE_UD);
> > > > +
> > > > +      brw_push_insn_state(p);
> > > > +      brw_set_default_access_mode(p, BRW_ALIGN_1);
> > > > +      brw_set_default_mask_control(p, BRW_MASK_DISABLE);
> > > > +      brw_set_default_predicate_control(p, BRW_PREDICATE_NONE);
> > > > +
> > > > +      /* Mask out invalid bits from the surface index to avoid hangs e.g. when
> > > > +       * some surface array is accessed out of bounds.
> > > > +       */
> > > > +      insn = brw_AND(p, addr,
> > > > +                     suboffset(vec1(retype(surface, BRW_REGISTER_TYPE_UD)),
> > > > +                               BRW_GET_SWZ(surface.dw1.bits.swizzle, 0)),
> > > > +                     brw_imm_ud(0xff));
> > > > +
> > > > +      brw_pop_insn_state(p);
> > > > +
> > > > +      surface = addr;
> > > > +   }
> > > > +
> > > > +   insn = brw_send_indirect_message(p, sfid, dst, payload, surface);
> > > > +   brw_inst_set_mlen(brw, insn, message_len);
> > > > +   brw_inst_set_rlen(brw, insn, response_len);
> > > > +   brw_inst_set_header_present(brw, insn, header_present);
> > >
> > > I'll continue the discussion we started with patch number one here if you
> > > don't mind. What I find confusing is that in case 'surface' is not an
> > > immediate then these three calls modify the OR-instruction. Otherwise they
> > > modify the send. Or am I missing something?
> >
> > Yeah, that's the whole point of the OR instruction, indirect message
> > sends no longer have an immediate source so all these control bits have
> > to be specified somewhere else. The caller doesn't care whether the
> > returned instruction is a SEND or some other opcode as long as it has
> > room for the control fields.
>
> This I understand, what I miss is the effect of setting mlen/rlen/header
> to an OR-instruction.

Those are fields of the message descriptor that is usually part of the
immediate field of the SEND instruction. For indirect message sends they
have to be loaded to an address register together with the remaining
descriptor control bits, which is what the
Re: [Mesa-dev] [PATCH 05/13] i965: Fix the untyped surface opcodes to deal with indirect surface access.
Pohjolainen, Topi topi.pohjolai...@intel.com writes: On Fri, Feb 27, 2015 at 05:34:48PM +0200, Francisco Jerez wrote: Change brw_untyped_atomic() and brw_untyped_surface_read() to take the surface index as a register instead of a constant and to use brw_send_indirect_message() to emit the indirect variant of send with a dynamically calculated message descriptor. This will be required to support variable indexing of image arrays for ARB_shader_image_load_store. --- src/mesa/drivers/dri/i965/brw_eu.h | 10 +- src/mesa/drivers/dri/i965/brw_eu_emit.c | 158 +-- src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 4 +- src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 4 +- 4 files changed, 96 insertions(+), 80 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_eu.h b/src/mesa/drivers/dri/i965/brw_eu.h index 87a9f3f..9cc9123 100644 --- a/src/mesa/drivers/dri/i965/brw_eu.h +++ b/src/mesa/drivers/dri/i965/brw_eu.h @@ -398,18 +398,18 @@ void brw_CMP(struct brw_compile *p, void brw_untyped_atomic(struct brw_compile *p, - struct brw_reg dest, + struct brw_reg dst, struct brw_reg payload, + struct brw_reg surface, unsigned atomic_op, - unsigned bind_table_index, unsigned msg_length, bool response_expected); void brw_untyped_surface_read(struct brw_compile *p, - struct brw_reg dest, - struct brw_reg mrf, - unsigned bind_table_index, + struct brw_reg dst, + struct brw_reg payload, + struct brw_reg surface, unsigned msg_length, unsigned num_channels); diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c index 0b655d4..34695bf 100644 --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c @@ -2518,6 +2518,48 @@ brw_send_indirect_message(struct brw_compile *p, return setup; } +static struct brw_inst * +brw_send_indirect_surface_message(struct brw_compile *p, + unsigned sfid, + struct brw_reg dst, + struct brw_reg payload, + struct brw_reg surface, + unsigned message_len, + unsigned response_len, + bool 
header_present) +{ + const struct brw_context *brw = p-brw; + struct brw_inst *insn; + + if (surface.file != BRW_IMMEDIATE_VALUE) { + struct brw_reg addr = retype(brw_address_reg(0), BRW_REGISTER_TYPE_UD); + + brw_push_insn_state(p); + brw_set_default_access_mode(p, BRW_ALIGN_1); + brw_set_default_mask_control(p, BRW_MASK_DISABLE); + brw_set_default_predicate_control(p, BRW_PREDICATE_NONE); + + /* Mask out invalid bits from the surface index to avoid hangs e.g. when + * some surface array is accessed out of bounds. + */ + insn = brw_AND(p, addr, + suboffset(vec1(retype(surface, BRW_REGISTER_TYPE_UD)), + BRW_GET_SWZ(surface.dw1.bits.swizzle, 0)), + brw_imm_ud(0xff)); + + brw_pop_insn_state(p); + + surface = addr; + } + + insn = brw_send_indirect_message(p, sfid, dst, payload, surface); + brw_inst_set_mlen(brw, insn, message_len); + brw_inst_set_rlen(brw, insn, response_len); + brw_inst_set_header_present(brw, insn, header_present); I'll continue the discussion we started with patch number one here if you don't mind. What I find confusing is that in case 'surface' is not an immediate then these three calls modify the OR-instruction. Otherwise they modify the send. Or am I missing something? Yeah, that's the whole point of the OR instruction, indirect message sends no longer have an immediate source so all these control bits have to be specified somewhere else. The caller doesn't care whether the returned instruction is a SEND or some other opcode as long as it has room for the control fields. signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
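For readers following the thread, the point of the reply above can be sketched in a few lines of C: with an indirect send the descriptor lives in a register, so the static control fields (message length, response length, header flag) end up in the immediate operand of the OR that builds it, while the surface index is the dynamic part. This is a rough sketch, not the driver's code. All `sketch_*` names are hypothetical, and the bit positions (surface index in bits 7:0, header-present in bit 19, response length in bits 24:20, message length in bits 28:25) are an assumption made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field positions, assumed for illustration only. */
#define SKETCH_SURFACE_MASK  0xffu
#define SKETCH_HEADER_SHIFT  19
#define SKETCH_RLEN_SHIFT    20
#define SKETCH_MLEN_SHIFT    25

/* Static part: the bits that setters such as mlen/rlen/header_present would
 * write into the immediate operand of the OR instruction. */
static uint32_t
sketch_control_bits(unsigned mlen, unsigned rlen, int header_present)
{
   assert(mlen <= 0xf);
   assert(rlen <= 0x1f);
   return ((uint32_t)mlen << SKETCH_MLEN_SHIFT) |
          ((uint32_t)rlen << SKETCH_RLEN_SHIFT) |
          ((uint32_t)(header_present ? 1 : 0) << SKETCH_HEADER_SHIFT);
}

/* Dynamic part: mask the surface index (the AND in the patch), then fold in
 * the control bits (the OR emitted for the indirect send). */
static uint32_t
sketch_build_descriptor(uint32_t surface, uint32_t control_bits)
{
   uint32_t addr = surface & SKETCH_SURFACE_MASK;
   return addr | control_bits;
}
```

Under these assumptions the caller never needs to know whether the control bits end up in a SEND or in the OR feeding it; it only needs an instruction with room for them.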
[Mesa-dev] [PATCH 2/2] glx: remove unneeded ifdef _WIN32 guard
The C99 header exists on other platforms as well.

Signed-off-by: Emil Velikov emil.l.veli...@gmail.com
---
 src/glx/glxclient.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/src/glx/glxclient.h b/src/glx/glxclient.h
index a140c87..122ae5d 100644
--- a/src/glx/glxclient.h
+++ b/src/glx/glxclient.h
@@ -47,9 +47,7 @@
 #include <string.h>
 #include <stdlib.h>
 #include <stdio.h>
-#ifdef _WIN32
 #include <stdint.h>
-#endif
 #include "GL/glxproto.h"
 #include "glxconfig.h"
 #include "glxhash.h"
--
2.1.3
Re: [Mesa-dev] [PATCH 05/13] i965: Fix the untyped surface opcodes to deal with indirect surface access.
On Fri, Mar 06, 2015 at 02:29:15PM +0200, Francisco Jerez wrote: Pohjolainen, Topi topi.pohjolai...@intel.com writes: On Fri, Feb 27, 2015 at 05:34:48PM +0200, Francisco Jerez wrote: Change brw_untyped_atomic() and brw_untyped_surface_read() to take the surface index as a register instead of a constant and to use brw_send_indirect_message() to emit the indirect variant of send with a dynamically calculated message descriptor. This will be required to support variable indexing of image arrays for ARB_shader_image_load_store. --- src/mesa/drivers/dri/i965/brw_eu.h | 10 +- src/mesa/drivers/dri/i965/brw_eu_emit.c | 158 +-- src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 4 +- src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 4 +- 4 files changed, 96 insertions(+), 80 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_eu.h b/src/mesa/drivers/dri/i965/brw_eu.h index 87a9f3f..9cc9123 100644 --- a/src/mesa/drivers/dri/i965/brw_eu.h +++ b/src/mesa/drivers/dri/i965/brw_eu.h @@ -398,18 +398,18 @@ void brw_CMP(struct brw_compile *p, void brw_untyped_atomic(struct brw_compile *p, - struct brw_reg dest, + struct brw_reg dst, struct brw_reg payload, + struct brw_reg surface, unsigned atomic_op, - unsigned bind_table_index, unsigned msg_length, bool response_expected); void brw_untyped_surface_read(struct brw_compile *p, - struct brw_reg dest, - struct brw_reg mrf, - unsigned bind_table_index, + struct brw_reg dst, + struct brw_reg payload, + struct brw_reg surface, unsigned msg_length, unsigned num_channels); diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c index 0b655d4..34695bf 100644 --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c @@ -2518,6 +2518,48 @@ brw_send_indirect_message(struct brw_compile *p, return setup; } +static struct brw_inst * +brw_send_indirect_surface_message(struct brw_compile *p, + unsigned sfid, + struct brw_reg dst, + struct brw_reg payload, + struct brw_reg 
surface, + unsigned message_len, + unsigned response_len, + bool header_present) +{ + const struct brw_context *brw = p-brw; + struct brw_inst *insn; + + if (surface.file != BRW_IMMEDIATE_VALUE) { + struct brw_reg addr = retype(brw_address_reg(0), BRW_REGISTER_TYPE_UD); + + brw_push_insn_state(p); + brw_set_default_access_mode(p, BRW_ALIGN_1); + brw_set_default_mask_control(p, BRW_MASK_DISABLE); + brw_set_default_predicate_control(p, BRW_PREDICATE_NONE); + + /* Mask out invalid bits from the surface index to avoid hangs e.g. when + * some surface array is accessed out of bounds. + */ + insn = brw_AND(p, addr, + suboffset(vec1(retype(surface, BRW_REGISTER_TYPE_UD)), + BRW_GET_SWZ(surface.dw1.bits.swizzle, 0)), + brw_imm_ud(0xff)); + + brw_pop_insn_state(p); + + surface = addr; + } + + insn = brw_send_indirect_message(p, sfid, dst, payload, surface); + brw_inst_set_mlen(brw, insn, message_len); + brw_inst_set_rlen(brw, insn, response_len); + brw_inst_set_header_present(brw, insn, header_present); I'll continue the discussion we started with patch number one here if you don't mind. What I find confusing is that in case 'surface' is not an immediate then these three calls modify the OR-instruction. Otherwise they modify the send. Or am I missing something? Yeah, that's the whole point of the OR instruction, indirect message sends no longer have an immediate source so all these control bits have to be specified somewhere else. The caller doesn't care whether the returned instruction is a SEND or some other opcode as long as it has room for the control fields. This I understand, what I miss is the effect of setting mlen/rlen/header to an OR-instruction. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 05/13] i965: Fix the untyped surface opcodes to deal with indirect surface access.
On Fri, Mar 06, 2015 at 02:46:51PM +0200, Francisco Jerez wrote: Pohjolainen, Topi topi.pohjolai...@intel.com writes: On Fri, Mar 06, 2015 at 02:29:15PM +0200, Francisco Jerez wrote: Pohjolainen, Topi topi.pohjolai...@intel.com writes: On Fri, Feb 27, 2015 at 05:34:48PM +0200, Francisco Jerez wrote: Change brw_untyped_atomic() and brw_untyped_surface_read() to take the surface index as a register instead of a constant and to use brw_send_indirect_message() to emit the indirect variant of send with a dynamically calculated message descriptor. This will be required to support variable indexing of image arrays for ARB_shader_image_load_store. --- src/mesa/drivers/dri/i965/brw_eu.h | 10 +- src/mesa/drivers/dri/i965/brw_eu_emit.c | 158 +-- src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 4 +- src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 4 +- 4 files changed, 96 insertions(+), 80 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_eu.h b/src/mesa/drivers/dri/i965/brw_eu.h index 87a9f3f..9cc9123 100644 --- a/src/mesa/drivers/dri/i965/brw_eu.h +++ b/src/mesa/drivers/dri/i965/brw_eu.h @@ -398,18 +398,18 @@ void brw_CMP(struct brw_compile *p, void brw_untyped_atomic(struct brw_compile *p, - struct brw_reg dest, + struct brw_reg dst, struct brw_reg payload, + struct brw_reg surface, unsigned atomic_op, - unsigned bind_table_index, unsigned msg_length, bool response_expected); void brw_untyped_surface_read(struct brw_compile *p, - struct brw_reg dest, - struct brw_reg mrf, - unsigned bind_table_index, + struct brw_reg dst, + struct brw_reg payload, + struct brw_reg surface, unsigned msg_length, unsigned num_channels); diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c index 0b655d4..34695bf 100644 --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c @@ -2518,6 +2518,48 @@ brw_send_indirect_message(struct brw_compile *p, return setup; } +static struct brw_inst * 
+brw_send_indirect_surface_message(struct brw_compile *p, + unsigned sfid, + struct brw_reg dst, + struct brw_reg payload, + struct brw_reg surface, + unsigned message_len, + unsigned response_len, + bool header_present) +{ + const struct brw_context *brw = p-brw; + struct brw_inst *insn; + + if (surface.file != BRW_IMMEDIATE_VALUE) { + struct brw_reg addr = retype(brw_address_reg(0), BRW_REGISTER_TYPE_UD); + + brw_push_insn_state(p); + brw_set_default_access_mode(p, BRW_ALIGN_1); + brw_set_default_mask_control(p, BRW_MASK_DISABLE); + brw_set_default_predicate_control(p, BRW_PREDICATE_NONE); + + /* Mask out invalid bits from the surface index to avoid hangs e.g. when + * some surface array is accessed out of bounds. + */ + insn = brw_AND(p, addr, + suboffset(vec1(retype(surface, BRW_REGISTER_TYPE_UD)), + BRW_GET_SWZ(surface.dw1.bits.swizzle, 0)), + brw_imm_ud(0xff)); + + brw_pop_insn_state(p); + + surface = addr; + } + + insn = brw_send_indirect_message(p, sfid, dst, payload, surface); + brw_inst_set_mlen(brw, insn, message_len); + brw_inst_set_rlen(brw, insn, response_len); + brw_inst_set_header_present(brw, insn, header_present); I'll continue the discussion we started with patch number one here if you don't mind. What I find confusing is that in case 'surface' is not an immediate then these three calls modify the OR-instruction. Otherwise they modify the send. Or am I missing something? Yeah, that's the whole point of the OR instruction, indirect message sends no longer have an immediate source so all these control bits have to be specified somewhere else. The caller doesn't care whether the returned instruction is a SEND or some other opcode as long as it has room for the control fields. This I understand, what I miss is the effect of setting mlen/rlen/header to an OR-instruction. Those are fields of the message descriptor that is usually part of
Re: [Mesa-dev] [PATCH 05/13] i965: Fix the untyped surface opcodes to deal with indirect surface access.
On Fri, Feb 27, 2015 at 05:34:48PM +0200, Francisco Jerez wrote:
> Change brw_untyped_atomic() and brw_untyped_surface_read() to take the
> surface index as a register instead of a constant and to use
> brw_send_indirect_message() to emit the indirect variant of send with
> a dynamically calculated message descriptor. This will be required to
> support variable indexing of image arrays for
> ARB_shader_image_load_store.
> ---
>  src/mesa/drivers/dri/i965/brw_eu.h | 10 +-
>  src/mesa/drivers/dri/i965/brw_eu_emit.c | 158 +--
>  src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 4 +-
>  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 4 +-
>  4 files changed, 96 insertions(+), 80 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_eu.h b/src/mesa/drivers/dri/i965/brw_eu.h
> index 87a9f3f..9cc9123 100644
> --- a/src/mesa/drivers/dri/i965/brw_eu.h
> +++ b/src/mesa/drivers/dri/i965/brw_eu.h
> @@ -398,18 +398,18 @@ void brw_CMP(struct brw_compile *p,
>
>  void
>  brw_untyped_atomic(struct brw_compile *p,
> -                   struct brw_reg dest,
> +                   struct brw_reg dst,
>                     struct brw_reg payload,
> +                   struct brw_reg surface,
>                     unsigned atomic_op,
> -                   unsigned bind_table_index,
>                     unsigned msg_length,
>                     bool response_expected);
>
>  void
>  brw_untyped_surface_read(struct brw_compile *p,
> -                         struct brw_reg dest,
> -                         struct brw_reg mrf,
> -                         unsigned bind_table_index,
> +                         struct brw_reg dst,
> +                         struct brw_reg payload,
> +                         struct brw_reg surface,
>                           unsigned msg_length,
>                           unsigned num_channels);
>
> diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> index 0b655d4..34695bf 100644
> --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> @@ -2518,6 +2518,48 @@ brw_send_indirect_message(struct brw_compile *p,
>     return setup;
>  }
>
> +static struct brw_inst *
> +brw_send_indirect_surface_message(struct brw_compile *p,
> +                                  unsigned sfid,
> +                                  struct brw_reg dst,
> +                                  struct brw_reg payload,
> +                                  struct brw_reg surface,
> +                                  unsigned message_len,
> +                                  unsigned response_len,
> +                                  bool header_present)
> +{
> +   const struct brw_context *brw = p->brw;
> +   struct brw_inst *insn;
> +
> +   if (surface.file != BRW_IMMEDIATE_VALUE) {
> +      struct brw_reg addr = retype(brw_address_reg(0), BRW_REGISTER_TYPE_UD);
> +
> +      brw_push_insn_state(p);
> +      brw_set_default_access_mode(p, BRW_ALIGN_1);
> +      brw_set_default_mask_control(p, BRW_MASK_DISABLE);
> +      brw_set_default_predicate_control(p, BRW_PREDICATE_NONE);
> +
> +      /* Mask out invalid bits from the surface index to avoid hangs e.g. when
> +       * some surface array is accessed out of bounds.
> +       */
> +      insn = brw_AND(p, addr,
> +                     suboffset(vec1(retype(surface, BRW_REGISTER_TYPE_UD)),
> +                               BRW_GET_SWZ(surface.dw1.bits.swizzle, 0)),
> +                     brw_imm_ud(0xff));
> +
> +      brw_pop_insn_state(p);
> +
> +      surface = addr;
> +   }
> +
> +   insn = brw_send_indirect_message(p, sfid, dst, payload, surface);
> +   brw_inst_set_mlen(brw, insn, message_len);
> +   brw_inst_set_rlen(brw, insn, response_len);
> +   brw_inst_set_header_present(brw, insn, header_present);

I'll continue the discussion we started with patch number one here if you don't mind. What I find confusing is that in case 'surface' is not an immediate then these three calls modify the OR-instruction. Otherwise they modify the send. Or am I missing something?
[Mesa-dev] [PATCH 1/2] util: rework _MSC_VER >= 1200 checks
Replace the _MSC_VER >= 1200 with defined (_MSC_VER) and compact if/else statements. We require MSVC 2008 or later with commit 46110c5d564.

Signed-off-by: Emil Velikov emil.l.veli...@gmail.com
---
 src/util/macros.h | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/src/util/macros.h b/src/util/macros.h
index b862bfd..63daba3 100644
--- a/src/util/macros.h
+++ b/src/util/macros.h
@@ -73,15 +73,13 @@ do {\
    assert(!str);\
    __builtin_unreachable(); \
 } while (0)
-#elif _MSC_VER >= 1200
+#elif defined (_MSC_VER)
 #define unreachable(str)\
 do {\
    assert(!str);\
    __assume(0); \
 } while (0)
-#endif
-
-#ifndef unreachable
+#else
 #define unreachable(str) assert(!str)
 #endif

@@ -99,7 +97,7 @@ do { \
 #define assume(expr) ((expr) ? ((void) 0) \
                              : (assert(!"assumption failed"), \
                                 __builtin_unreachable()))
-#elif _MSC_VER >= 1200
+#elif defined (_MSC_VER)
 #define assume(expr) __assume(expr)
 #else
 #define assume(expr) assert(expr)
--
2.1.3
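As a rough illustration of the resulting if/elif/else chain, here is a self-contained C sketch of an unreachable-style macro and one typical use. The `sketch_*` names are hypothetical, and the first branch tests `__GNUC__` as an assumption; the quoted hunk does not show which condition guards the `__builtin_unreachable()` branch in macros.h.

```c
#include <assert.h>

#if defined(__GNUC__)   /* assumption: stands in for the real feature check */
#define sketch_unreachable(str)  \
   do {                          \
      assert(!str);              \
      __builtin_unreachable();   \
   } while (0)
#elif defined(_MSC_VER)
#define sketch_unreachable(str)  \
   do {                          \
      assert(!str);              \
      __assume(0);               \
   } while (0)
#else
#define sketch_unreachable(str) assert(!str)
#endif

enum sketch_kind { SKETCH_A, SKETCH_B };

/* Every valid enum value returns from the switch; the macro documents that
 * falling out of it cannot happen and lets the optimizer rely on that. */
static int
sketch_cost(enum sketch_kind k)
{
   switch (k) {
   case SKETCH_A:
      return 1;
   case SKETCH_B:
      return 2;
   }
   sketch_unreachable("invalid sketch_kind");
   return 0; /* only reached with the plain assert() fallback and NDEBUG */
}
```

A single chain with a final `#else` guarantees that exactly one definition is selected, which is what replacing the separate `#ifndef unreachable` block achieves.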
Re: [Mesa-dev] [PATCH 01/13] i965: Factor out logic to build a send message instruction with indirect descriptor.
Pohjolainen, Topi topi.pohjolai...@intel.com writes:

> On Fri, Feb 27, 2015 at 05:34:44PM +0200, Francisco Jerez wrote:
>> ---
>>  src/mesa/drivers/dri/i965/brw_eu.h | 19 ++--
>>  src/mesa/drivers/dri/i965/brw_eu_emit.c | 58 ++--
>>  src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 55 +-
>>  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 37 ---
>>  4 files changed, 77 insertions(+), 92 deletions(-)
>
> After discussing this further in the context of patch number five I'm now
> convinced and this patch is:
>
> Reviewed-by: Topi Pohjolainen topi.pohjolai...@intel.com

Thanks Topi.
Re: [Mesa-dev] [PATCH 04/13] i965: Mask out unused Align16 components in brw_untyped_atomic.
Pohjolainen, Topi topi.pohjolai...@intel.com writes:

> On Fri, Feb 27, 2015 at 05:34:47PM +0200, Francisco Jerez wrote:
>> This is currently not a problem because the vec4 visitor happens to
>> mask out unused components from the destination, but it might become
>> an issue when we start using atomics without writeback message. In
>> any case it seems sensible to set it again here because the
>> consequences of setting the wrong writemask (random graphics memory
>> corruption) are difficult to debug and can easily go unnoticed.
>
> I started thinking if this should be an assertion here and should we force
> the logic in the visitor to consider the writemask correctly instead?
> I don't have a strong opinion, merely just wondering aloud.

That would be rather inconvenient for my (not yet sent for review) ARB_shader_image_load_store intrinsic lowering code. If we made it an assertion, say:

| emit(some_surface_opcode, vgrf(rlen), payload, surface, control);

or

| emit(some_surface_opcode, reg_null_ud(), payload, surface, control);

would fail with an assertion failure. I need that to just work no matter what surface opcode is specified.

It would also be somewhat misleading, because these messages have the annoying property of clobbering destination register components even if they're masked out, so in Align16 they kind of always behave as if WRITEMASK_XYZW was specified as far as the destination region is concerned.

>> ---
>>  src/mesa/drivers/dri/i965/brw_eu_emit.c | 13 +++--
>>  1 file changed, 11 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c
>> index 2b1d6ff..0b655d4 100644
>> --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
>> +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
>> @@ -2799,16 +2799,25 @@ brw_untyped_atomic(struct brw_compile *p,
>>                     bool response_expected)
>>  {
>>     const struct brw_context *brw = p->brw;
>> +   const bool align1 = (brw_inst_access_mode(brw, p->current) == BRW_ALIGN_1);
>> +   /* Mask out unused components -- This is especially important in Align16
>> +    * mode on generations that don't have native support for SIMD4x2 atomics,
>> +    * because unused but enabled components will cause the dataport to perform
>> +    * additional atomic operations on the addresses that happen to be in the
>> +    * uninitialized Y, Z and W coordinates of the payload.
>> +    */
>> +   const unsigned mask = (align1 ? WRITEMASK_XYZW : WRITEMASK_X);
>>     brw_inst *insn = brw_next_insn(p, BRW_OPCODE_SEND);
>>
>> -   brw_set_dest(p, insn, retype(dest, BRW_REGISTER_TYPE_UD));
>> +   brw_set_dest(p, insn, retype(brw_writemask(dest, mask),
>> +                                BRW_REGISTER_TYPE_UD));
>>     brw_set_src0(p, insn, retype(payload, BRW_REGISTER_TYPE_UD));
>>     brw_set_src1(p, insn, brw_imm_d(0));
>>     brw_set_dp_untyped_atomic_message(
>>        p, insn, atomic_op, bind_table_index, msg_length,
>>        brw_surface_payload_size(p, response_expected,
>>                                 brw->gen >= 8 || brw->is_haswell, true),
>> -      brw_inst_access_mode(brw, insn) == BRW_ALIGN_1);
>> +      align1);
>>  }
>>
>>  static void
>> --
>> 2.1.3
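The mask selection in the patch above can be sketched in isolation. This is a simplified model, not the driver's code: the `SKETCH_*` constants and the function name are hypothetical, the values assume one bit per destination component, and it assumes that applying a writemask intersects it with whatever mask the destination already carries.

```c
#include <assert.h>
#include <stdbool.h>

/* One bit per destination component; values assumed for this sketch. */
#define SKETCH_WRITEMASK_X     0x1u
#define SKETCH_WRITEMASK_XYZW  0xfu

/* In Align1 mode all components stay enabled; in Align16 mode only X is
 * kept, so the Y, Z and W slots of the payload cannot trigger extra atomic
 * operations. */
static unsigned
sketch_atomic_dest_mask(bool align1, unsigned requested_mask)
{
   const unsigned mask = align1 ? SKETCH_WRITEMASK_XYZW : SKETCH_WRITEMASK_X;

   assert(requested_mask <= SKETCH_WRITEMASK_XYZW);
   return requested_mask & mask;
}
```

Forcing the mask instead of asserting on it means that a caller passing a full destination register or a null destination gets the same safe result.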
Re: [Mesa-dev] [PATCH 08/13] i965/vec4: Add support for untyped surface message sends from GRF.
Pohjolainen, Topi topi.pohjolai...@intel.com writes: On Fri, Feb 27, 2015 at 05:34:51PM +0200, Francisco Jerez wrote: This doesn't actually enable untyped surface message sends from GRF yet, the upcoming atomic counter and image intrinsic lowering code will. --- src/mesa/drivers/dri/i965/brw_vec4.cpp | 7 --- src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 16 +++- src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 5 +++-- 3 files changed, 14 insertions(+), 14 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index e19..0004b10 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp @@ -256,6 +256,8 @@ vec4_instruction::is_send_from_grf() switch (opcode) { case SHADER_OPCODE_SHADER_TIME_ADD: case VS_OPCODE_PULL_CONSTANT_LOAD_GEN7: + case SHADER_OPCODE_UNTYPED_ATOMIC: + case SHADER_OPCODE_UNTYPED_SURFACE_READ: return true; default: return false; @@ -270,6 +272,8 @@ vec4_instruction::regs_read(unsigned arg) const switch (opcode) { case SHADER_OPCODE_SHADER_TIME_ADD: + case SHADER_OPCODE_UNTYPED_ATOMIC: + case SHADER_OPCODE_UNTYPED_SURFACE_READ: return arg == 0 ? mlen : 1; Before the logic always falled back to returning one. Now we may return one, two or three I think. I may be mistaken though, I'm just reading vec4_visitor::emit_untyped_atomic() and it can produce message lengths up to three. Does this effect the instruction scheduling logic and if not, can you explain why not? Before my change that wouldn't ever happen because we were using fake MRFs to assemble the message payload and the MRF register index would be specified as inst-base_mrf, so the payload wouldn't be an actual source of the untyped surface instruction. This change adds an additional source for the payload, but a fake MRF is still passed in as explicit source temporarily. 
A future commit will change the vec4 visitor to build untyped and typed surface message payloads directly in normal GRFs instead of fake MRFs. case VS_OPCODE_PULL_CONSTANT_LOAD_GEN7: @@ -347,9 +351,6 @@ vec4_visitor::implied_mrf_writes(vec4_instruction *inst) case SHADER_OPCODE_TG4: case SHADER_OPCODE_TG4_OFFSET: return inst-header_present ? 1 : 0; - case SHADER_OPCODE_UNTYPED_ATOMIC: - case SHADER_OPCODE_UNTYPED_SURFACE_READ: - return 0; default: unreachable(not reached); } diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp index 22fdd63..ef0cde9 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp @@ -1459,19 +1459,17 @@ vec4_generator::generate_code(const cfg_t *cfg) break; case SHADER_OPCODE_UNTYPED_ATOMIC: - assert(src[0].file == BRW_IMMEDIATE_VALUE -src[1].file == BRW_IMMEDIATE_VALUE); - brw_untyped_atomic(p, dst, brw_message_reg(inst-base_mrf), -src[1], src[0].dw1.ud, inst-mlen, + assert(src[1].file == BRW_IMMEDIATE_VALUE +src[2].file == BRW_IMMEDIATE_VALUE); + brw_untyped_atomic(p, dst, src[0], src[2], src[1].dw1.ud, inst-mlen, !inst-dst.is_null()); - brw_mark_surface_used(prog_data-base, src[1].dw1.ud); + brw_mark_surface_used(prog_data-base, src[2].dw1.ud); break; case SHADER_OPCODE_UNTYPED_SURFACE_READ: - assert(src[0].file == BRW_IMMEDIATE_VALUE); - brw_untyped_surface_read(p, dst, brw_message_reg(inst-base_mrf), - src[0], inst-mlen, 1); - brw_mark_surface_used(prog_data-base, src[0].dw1.ud); + assert(src[1].file == BRW_IMMEDIATE_VALUE); + brw_untyped_surface_read(p, dst, src[0], src[1], inst-mlen, 1); + brw_mark_surface_used(prog_data-base, src[1].dw1.ud); break; case SHADER_OPCODE_FIND_LIVE_CHANNEL: diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp index f25bff9..b8cfe8f 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp +++ 
b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp @@ -2953,6 +2953,7 @@ vec4_visitor::emit_untyped_atomic(unsigned atomic_op, unsigned surf_index, * unused channels will be masked out. */ vec4_instruction *inst = emit(SHADER_OPCODE_UNTYPED_ATOMIC, dst, + brw_message_reg(0), src_reg(atomic_op), src_reg(surf_index)); inst-base_mrf = 0; inst-mlen = mlen; @@ -2969,8 +2970,8 @@ vec4_visitor::emit_untyped_surface_read(unsigned surf_index, dst_reg dst, * untyped surface read message, but that's OK because unused * channels will be masked out.
Re: [Mesa-dev] [PATCH 01/13] i965: Factor out logic to build a send message instruction with indirect descriptor.
On Fri, Feb 27, 2015 at 05:34:44PM +0200, Francisco Jerez wrote:
> ---
>  src/mesa/drivers/dri/i965/brw_eu.h | 19 ++--
>  src/mesa/drivers/dri/i965/brw_eu_emit.c | 58 ++--
>  src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 55 +-
>  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 37 ---
>  4 files changed, 77 insertions(+), 92 deletions(-)

After discussing this further in the context of patch number five I'm now convinced and this patch is:

Reviewed-by: Topi Pohjolainen topi.pohjolai...@intel.com
[Mesa-dev] [PATCH 4/6] dri/intel: use the aligned_alloc/free over the _mesa_* wrappers
Signed-off-by: Emil Velikov emil.l.veli...@gmail.com --- src/mesa/drivers/dri/i915/intel_buffer_objects.c | 19 ++- src/mesa/drivers/dri/i965/intel_mipmap_tree.c| 5 +++-- 2 files changed, 13 insertions(+), 11 deletions(-) diff --git a/src/mesa/drivers/dri/i915/intel_buffer_objects.c b/src/mesa/drivers/dri/i915/intel_buffer_objects.c index ef06743..ca8fc8c 100644 --- a/src/mesa/drivers/dri/i915/intel_buffer_objects.c +++ b/src/mesa/drivers/dri/i915/intel_buffer_objects.c @@ -25,6 +25,7 @@ * **/ +#include c11_stdlib.h #include main/imports.h #include main/mtypes.h @@ -96,7 +97,7 @@ intel_bufferobj_free(struct gl_context * ctx, struct gl_buffer_object *obj) */ _mesa_buffer_unmap_all_mappings(ctx, obj); - _mesa_align_free(intel_obj-sys_buffer); + aligned_free(intel_obj-sys_buffer); drm_intel_bo_unreference(intel_obj-buffer); free(intel_obj); @@ -133,7 +134,7 @@ intel_bufferobj_data(struct gl_context * ctx, if (intel_obj-buffer != NULL) release_buffer(intel_obj); - _mesa_align_free(intel_obj-sys_buffer); + aligned_free(intel_obj-sys_buffer); intel_obj-sys_buffer = NULL; if (size != 0) { @@ -142,7 +143,7 @@ intel_bufferobj_data(struct gl_context * ctx, */ if (target == GL_ARRAY_BUFFER || target == GL_ELEMENT_ARRAY_BUFFER) { intel_obj-sys_buffer = -_mesa_align_malloc(size, ctx-Const.MinMapBufferAlignment); +aligned_alloc(ctx-Const.MinMapBufferAlignment, size); if (intel_obj-sys_buffer != NULL) { if (data != NULL) memcpy(intel_obj-sys_buffer, data, size); @@ -193,7 +194,7 @@ intel_bufferobj_subdata(struct gl_context * ctx, return; } - _mesa_align_free(intel_obj-sys_buffer); + aligned_free(intel_obj-sys_buffer); intel_obj-sys_buffer = NULL; } @@ -301,7 +302,7 @@ intel_bufferobj_map_range(struct gl_context * ctx, return obj-Mappings[index].Pointer; } - _mesa_align_free(intel_obj-sys_buffer); + aligned_free(intel_obj-sys_buffer); intel_obj-sys_buffer = NULL; } @@ -350,7 +351,7 @@ intel_bufferobj_map_range(struct gl_context * ctx, if (access GL_MAP_FLUSH_EXPLICIT_BIT) { 
intel_obj-range_map_buffer[index] = -_mesa_align_malloc(length + extra, alignment); +aligned_alloc(alignment, length + extra); obj-Mappings[index].Pointer = intel_obj-range_map_buffer[index] + extra; } else { @@ -445,7 +446,7 @@ intel_bufferobj_unmap(struct gl_context * ctx, struct gl_buffer_object *obj, * usage inside of a batchbuffer. */ intel_batchbuffer_emit_mi_flush(intel); - _mesa_align_free(intel_obj-range_map_buffer[index]); + aligned_free(intel_obj-range_map_buffer[index]); intel_obj-range_map_buffer[index] = NULL; } else if (intel_obj-range_map_bo[index] != NULL) { const unsigned extra = obj-Mappings[index].Pointer - @@ -490,7 +491,7 @@ intel_bufferobj_buffer(struct intel_context *intel, 0, intel_obj-Base.Size, intel_obj-sys_buffer); - _mesa_align_free(intel_obj-sys_buffer); + aligned_free(intel_obj-sys_buffer); intel_obj-sys_buffer = NULL; intel_obj-offset = 0; } @@ -677,7 +678,7 @@ intel_buffer_object_purgeable(struct gl_context * ctx, return intel_buffer_purgeable(intel_obj-buffer); if (option == GL_RELEASED_APPLE) { - _mesa_align_free(intel_obj-sys_buffer); + aligned_free(intel_obj-sys_buffer); intel_obj-sys_buffer = NULL; return GL_RELEASED_APPLE; diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c index 36c3b26..e7265f3 100644 --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c @@ -27,6 +27,7 @@ #include GL/gl.h #include GL/internal/dri_interface.h +#include c11_stdlib.h #include intel_batchbuffer.h #include intel_mipmap_tree.h @@ -1945,7 +1946,7 @@ intel_miptree_map_movntdqa(struct brw_context *brw, map-stride = ALIGN(misalignment + width_bytes, 16); - map-buffer = _mesa_align_malloc(map-stride * map-h, 16); + map-buffer = aligned_alloc(16, map-stride * map-h); /* Offset the destination so it has the same misalignment as src. 
*/ map-ptr = map-buffer + misalignment; @@ -1968,7 +1969,7 @@ intel_miptree_unmap_movntdqa(struct brw_context *brw, unsigned int level, unsigned int slice) { - _mesa_align_free(map-buffer); + aligned_free(map-buffer); map-buffer = NULL; map-ptr = NULL; } -- 2.1.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
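One detail worth spelling out in the patch above is the argument order: `_mesa_align_malloc(size, alignment)` becomes `aligned_alloc(alignment, size)`. Below is a minimal sketch using only standard C11, with hypothetical `sketch_*` names. It rounds the size up to a multiple of the alignment, which C11 originally required of callers, and releases the memory with plain `free()`. The `aligned_free()` used in the patch is not part of standard C and is assumed here to come from the `c11_stdlib.h` wrapper header that the patch includes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Round size up to the next multiple of alignment (a power of two). */
static size_t
sketch_round_up(size_t size, size_t alignment)
{
   return (size + alignment - 1) & ~(alignment - 1);
}

/* Note the C11 argument order: alignment first, then size. */
static void *
sketch_aligned_alloc(size_t alignment, size_t size)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return aligned_alloc(alignment, sketch_round_up(size, alignment));
}
```

A caller would pair `sketch_aligned_alloc(16, stride * height)` with `free()` on platforms that provide the C11 function natively.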
[Mesa-dev] [PATCH 2/5] egl/main: convert thread management to use c11 threads
Convert the code to use the C11 threads implementation, and nuke the Windows non-pthreads code-path. The c11/threads_win32.h abstraction should be better than the current code.

Signed-off-by: Emil Velikov emil.l.veli...@gmail.com
---
 src/egl/main/eglcurrent.c | 48 ++-
 1 file changed, 6 insertions(+), 42 deletions(-)

diff --git a/src/egl/main/eglcurrent.c b/src/egl/main/eglcurrent.c
index dc32ed4..5d8cae4 100644
--- a/src/egl/main/eglcurrent.c
+++ b/src/egl/main/eglcurrent.c
@@ -29,6 +29,7 @@
 #include <stdlib.h>
 #include <string.h>
 #include "c99_compat.h"
+#include "c11/threads.h"

 #include "egllog.h"
 #include "eglcurrent.h"
@@ -41,14 +42,9 @@

 /* a fallback thread info to guarantee that every thread always has one */
 static _EGLThreadInfo dummy_thread = _EGL_THREAD_INFO_INITIALIZER;
-
-
-#if HAVE_PTHREAD
-#include <pthread.h>
-
 static mtx_t _egl_TSDMutex = _MTX_INITIALIZER_NP;
 static EGLBoolean _egl_TSDInitialized;
-static pthread_key_t _egl_TSD;
+static tss_t _egl_TSD;
 static void (*_egl_FreeTSD)(_EGLThreadInfo *);

 #ifdef GLX_USE_TLS
@@ -58,7 +54,7 @@ static __thread const _EGLThreadInfo *_egl_TLS

 static inline void _eglSetTSD(const _EGLThreadInfo *t)
 {
-   pthread_setspecific(_egl_TSD, (const void *) t);
+   tss_set(_egl_TSD, (const void *) t);
 #ifdef GLX_USE_TLS
    _egl_TLS = t;
 #endif
@@ -69,7 +65,7 @@ static inline _EGLThreadInfo *_eglGetTSD(void)
 #ifdef GLX_USE_TLS
    return (_EGLThreadInfo *) _egl_TLS;
 #else
-   return (_EGLThreadInfo *) pthread_getspecific(_egl_TSD);
+   return (_EGLThreadInfo *) tss_get(_egl_TSD);
 #endif
 }

@@ -82,7 +78,7 @@ static inline void _eglFiniTSD(void)
       _egl_TSDInitialized = EGL_FALSE;
       if (t && _egl_FreeTSD)
          _egl_FreeTSD((void *) t);
-      pthread_key_delete(_egl_TSD);
+      tss_delete(_egl_TSD);
    }
    mtx_unlock(&_egl_TSDMutex);
 }
@@ -94,7 +90,7 @@ static inline EGLBoolean _eglInitTSD(void (*dtor)(_EGLThreadInfo *))

       /* check again after acquiring lock */
       if (!_egl_TSDInitialized) {
-         if (pthread_key_create(&_egl_TSD, (void (*)(void *)) dtor) != 0) {
+         if (tss_create(&_egl_TSD, (void (*)(void *)) dtor) != thrd_success) {
             mtx_unlock(&_egl_TSDMutex);
             return EGL_FALSE;
          }
@@ -109,38 +105,6 @@ static inline EGLBoolean _eglInitTSD(void (*dtor)(_EGLThreadInfo *))
    return EGL_TRUE;
 }

-#else /* HAVE_PTHREAD */
-static const _EGLThreadInfo *_egl_TSD;
-static void (*_egl_FreeTSD)(_EGLThreadInfo *);
-
-static inline void _eglSetTSD(const _EGLThreadInfo *t)
-{
-   _egl_TSD = t;
-}
-
-static inline _EGLThreadInfo *_eglGetTSD(void)
-{
-   return (_EGLThreadInfo *) _egl_TSD;
-}
-
-static inline void _eglFiniTSD(void)
-{
-   if (_egl_FreeTSD && _egl_TSD)
-      _egl_FreeTSD((_EGLThreadInfo *) _egl_TSD);
-}
-
-static inline EGLBoolean _eglInitTSD(void (*dtor)(_EGLThreadInfo *))
-{
-   if (!_egl_FreeTSD && dtor) {
-      _egl_FreeTSD = dtor;
-      _eglAddAtExitCall(_eglFiniTSD);
-   }
-   return EGL_TRUE;
-}
-
-#endif /* !HAVE_PTHREAD */
-
-
 static void
 _eglInitThreadInfo(_EGLThreadInfo *t)
 {
--
2.1.3
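The pthread-to-C11 mapping used in the patch above (`pthread_key_create` to `tss_create`, `pthread_setspecific` to `tss_set`, `pthread_getspecific` to `tss_get`, `pthread_key_delete` to `tss_delete`) can be tried out with a small standalone sketch. It assumes a C library that ships `<threads.h>`, whereas Mesa carries its own `c11/threads.h`. The `sketch_*` names are hypothetical, and the mutex-protected double check of the real code is left out for brevity, so this version is only safe when initialised from a single thread.

```c
#include <assert.h>
#include <stddef.h>
#include <threads.h>

static tss_t sketch_key;
static int sketch_key_ready;

/* Create the thread-specific storage key once; dtor may be NULL. */
static int
sketch_tsd_init(tss_dtor_t dtor)
{
   if (!sketch_key_ready) {
      if (tss_create(&sketch_key, dtor) != thrd_success)
         return 0;
      sketch_key_ready = 1;
   }
   return 1;
}

/* Store a per-thread pointer under the key. */
static int
sketch_tsd_set(void *value)
{
   assert(sketch_key_ready);
   return tss_set(sketch_key, value) == thrd_success;
}

/* Fetch the calling thread's pointer (NULL if never set). */
static void *
sketch_tsd_get(void)
{
   assert(sketch_key_ready);
   return tss_get(sketch_key);
}

/* Release the key; values are not destroyed by tss_delete itself. */
static void
sketch_tsd_fini(void)
{
   if (sketch_key_ready) {
      tss_delete(sketch_key);
      sketch_key_ready = 0;
   }
}
```

Unlike `pthread_key_create`, which returns 0 on success, the C11 calls report `thrd_success`, which is why the comparison in the patch changes along with the function name.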
[Mesa-dev] [PATCH 4/5] glx: remove final reference to THREADS
Left over from commit 18db13f5865 (mapi: THREADS was always defined, remove it)

Signed-off-by: Emil Velikov emil.l.veli...@gmail.com
---
 src/glx/glxcurrent.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/src/glx/glxcurrent.c b/src/glx/glxcurrent.c
index dc2acd5..86fb658 100644
--- a/src/glx/glxcurrent.c
+++ b/src/glx/glxcurrent.c
@@ -138,10 +138,6 @@ __glXGetCurrentContext(void)

 # endif /* defined( GLX_USE_TLS ) */

-#elif defined( THREADS )
-
-#error Unknown threading method specified.
-
 #else

 /* not thread safe */
--
2.1.3
[Mesa-dev] [PATCH 3/5] configure: require pthreads for POSIX builds
This has been an implicit rule for building mesa for a long time. Let's make it official and just bail out at configure time. This way we can clean up some of our glx code.

Signed-off-by: Emil Velikov emil.l.veli...@gmail.com
---
 configure.ac | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/configure.ac b/configure.ac
index 90c7737..4989444 100644
--- a/configure.ac
+++ b/configure.ac
@@ -657,6 +657,9 @@ mingw*)
     ;;
 *)
     AX_PTHREAD
+    if test x$ax_pthread_ok = xno; then
+        AC_MSG_ERROR([Building mesa on this platform requires pthreads])
+    fi
     ;;
 esac
 dnl AX_PTHREADS leaves PTHREAD_LIBS empty for gcc and sets PTHREAD_CFLAGS
--
2.1.3
Re: [Mesa-dev] [PATCH] clover: Return the minimum required value for CL_DEVICE_SINGLE_FP_CONFIG
Tom Stellard t...@stellard.net writes:

> On Thu, Mar 05, 2015 at 08:42:25PM +0200, Francisco Jerez wrote:
> > Tom Stellard thomas.stell...@amd.com writes:
> >
> > > This means dropping CL_FP_DENORM from the current return value.
> > > ---
> > >  src/gallium/state_trackers/clover/api/device.cpp | 4 +++-
> > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/src/gallium/state_trackers/clover/api/device.cpp b/src/gallium/state_trackers/clover/api/device.cpp
> > > index b1f556f..db3b931 100644
> > > --- a/src/gallium/state_trackers/clover/api/device.cpp
> > > +++ b/src/gallium/state_trackers/clover/api/device.cpp
> > > @@ -201,8 +201,10 @@ clGetDeviceInfo(cl_device_id d_dev, cl_device_info param,
> > >        break;
> > >
> > >     case CL_DEVICE_SINGLE_FP_CONFIG:
> > > +      // This is the mandated minimum single precision floating-point
> > > +      // capability
> >
> > Could you add that this is according to the OpenCL 1.1 specification?
> > OpenCL 1.2 is even weaker (CL_FP_INF_NAN is not required, only one of
> > CL_FP_ROUND_TO_ZERO or CL_FP_ROUND_TO_NEAREST is required, and no FP
> > capabilities at all are required for custom devices as Jan pointed out).
> >
> > >        buf.as_scalar<cl_device_fp_config>() =
> > > -         CL_FP_DENORM | CL_FP_INF_NAN | CL_FP_ROUND_TO_NEAREST;
> > > +         CL_FP_INF_NAN | CL_FP_ROUND_TO_NEAREST;
> >
> > I'm okay with this change, but I'm curious, is this motivated by your
> > architecture not supporting denorms?
>
> It can, but supporting them hurts performance.

Sounds like you want to advertise denorm support and rely on the
-cl-denorms-are-zero compiler option to decide whether to flush them to
zero or not?

> -Tom
>
> > >        break;
> > >
> > >     case CL_DEVICE_DOUBLE_FP_CONFIG:
> > > --
> > > 2.0.4
Re: [Mesa-dev] [PATCH 1/5] egl/main: use c11/threads' mutex directly
Hi all, Just accidently pushed the series to master. I'll revert them in a second. -Emil On 06/03/15 16:54, Emil Velikov wrote: Remove the inline wrappers/abstraction layer. Signed-off-by: Emil Velikov emil.l.veli...@gmail.com --- src/egl/main/Makefile.sources | 1 - src/egl/main/eglapi.c | 14 + src/egl/main/eglcurrent.c | 13 - src/egl/main/egldisplay.c | 13 + src/egl/main/egldisplay.h | 4 +-- src/egl/main/egldriver.c | 8 +++--- src/egl/main/eglglobals.c | 9 +++--- src/egl/main/eglglobals.h | 4 +-- src/egl/main/egllog.c | 18 ++-- src/egl/main/eglmutex.h | 66 --- src/egl/main/eglscreen.c | 8 +++--- 11 files changed, 47 insertions(+), 111 deletions(-) delete mode 100644 src/egl/main/eglmutex.h diff --git a/src/egl/main/Makefile.sources b/src/egl/main/Makefile.sources index 6a917e2..75f060a 100644 --- a/src/egl/main/Makefile.sources +++ b/src/egl/main/Makefile.sources @@ -26,7 +26,6 @@ LIBEGL_C_FILES := \ eglmisc.h \ eglmode.c \ eglmode.h \ - eglmutex.h \ eglscreen.c \ eglscreen.h \ eglstring.c \ diff --git a/src/egl/main/eglapi.c b/src/egl/main/eglapi.c index 2258830..a74efcd 100644 --- a/src/egl/main/eglapi.c +++ b/src/egl/main/eglapi.c @@ -87,6 +87,8 @@ #include stdlib.h #include string.h #include c99_compat.h +#include c11/threads.h +#include eglcompiler.h #include eglglobals.h #include eglcontext.h @@ -275,7 +277,7 @@ _eglLockDisplay(EGLDisplay display) { _EGLDisplay *dpy = _eglLookupDisplay(display); if (dpy) - _eglLockMutex(dpy-Mutex); + mtx_lock(dpy-Mutex); return dpy; } @@ -286,7 +288,7 @@ _eglLockDisplay(EGLDisplay display) static inline void _eglUnlockDisplay(_EGLDisplay *dpy) { - _eglUnlockMutex(dpy-Mutex); + mtx_unlock(dpy-Mutex); } @@ -896,7 +898,7 @@ eglWaitClient(void) RETURN_EGL_SUCCESS(NULL, EGL_TRUE); disp = ctx-Resource.Display; - _eglLockMutex(disp-Mutex); + mtx_lock(disp-Mutex); /* let bad current context imply bad current surface */ if (_eglGetContextHandle(ctx) == EGL_NO_CONTEXT || @@ -942,7 +944,7 @@ eglWaitNative(EGLint engine) 
RETURN_EGL_SUCCESS(NULL, EGL_TRUE); disp = ctx-Resource.Display; - _eglLockMutex(disp-Mutex); + mtx_lock(disp-Mutex); /* let bad current context imply bad current surface */ if (_eglGetContextHandle(ctx) == EGL_NO_CONTEXT || @@ -1457,10 +1459,10 @@ eglReleaseThread(void) t-CurrentAPIIndex = i; -_eglLockMutex(disp-Mutex); +mtx_lock(disp-Mutex); drv = disp-Driver; (void) drv-API.MakeCurrent(drv, disp, NULL, NULL, NULL); -_eglUnlockMutex(disp-Mutex); +mtx_unlock(disp-Mutex); } } diff --git a/src/egl/main/eglcurrent.c b/src/egl/main/eglcurrent.c index 3d49641..dc32ed4 100644 --- a/src/egl/main/eglcurrent.c +++ b/src/egl/main/eglcurrent.c @@ -31,7 +31,6 @@ #include c99_compat.h #include egllog.h -#include eglmutex.h #include eglcurrent.h #include eglglobals.h @@ -47,7 +46,7 @@ static _EGLThreadInfo dummy_thread = _EGL_THREAD_INFO_INITIALIZER; #if HAVE_PTHREAD #include pthread.h -static _EGLMutex _egl_TSDMutex = _EGL_MUTEX_INITIALIZER; +static mtx_t _egl_TSDMutex = _MTX_INITIALIZER_NP; static EGLBoolean _egl_TSDInitialized; static pthread_key_t _egl_TSD; static void (*_egl_FreeTSD)(_EGLThreadInfo *); @@ -76,7 +75,7 @@ static inline _EGLThreadInfo *_eglGetTSD(void) static inline void _eglFiniTSD(void) { - _eglLockMutex(_egl_TSDMutex); + mtx_lock(_egl_TSDMutex); if (_egl_TSDInitialized) { _EGLThreadInfo *t = _eglGetTSD(); @@ -85,18 +84,18 @@ static inline void _eglFiniTSD(void) _egl_FreeTSD((void *) t); pthread_key_delete(_egl_TSD); } - _eglUnlockMutex(_egl_TSDMutex); + mtx_unlock(_egl_TSDMutex); } static inline EGLBoolean _eglInitTSD(void (*dtor)(_EGLThreadInfo *)) { if (!_egl_TSDInitialized) { - _eglLockMutex(_egl_TSDMutex); + mtx_lock(_egl_TSDMutex); /* check again after acquiring lock */ if (!_egl_TSDInitialized) { if (pthread_key_create(_egl_TSD, (void (*)(void *)) dtor) != 0) { -_eglUnlockMutex(_egl_TSDMutex); +mtx_unlock(_egl_TSDMutex); return EGL_FALSE; } _egl_FreeTSD = dtor; @@ -104,7 +103,7 @@ static inline EGLBoolean _eglInitTSD(void (*dtor)(_EGLThreadInfo *)) 
_egl_TSDInitialized = EGL_TRUE; } - _eglUnlockMutex(_egl_TSDMutex); + mtx_unlock(_egl_TSDMutex); } return EGL_TRUE; diff --git a/src/egl/main/egldisplay.c b/src/egl/main/egldisplay.c index a167ae5..b7a5b8f 100644 --- a/src/egl/main/egldisplay.c +++
[Mesa-dev] [PATCH 0/6] Slimdown import.[ch] - add c11_stdlib.h wrapper
C11 introduces the aligned_alloc() function, while mesa already has its own wrapper. Create a new header, and make use of it.

I was aiming to convert gallium as well, but that code takes a slightly different approach: it wraps malloc and friends, and rolls out a custom align_malloc with extra, rather convoluted code for memory debugging. This series therefore does not cover gallium. If the debugging code is unused, we could easily port gallium over as well.

Cheers,
Emil
Re: [Mesa-dev] [PATCH 5/6] st/mesa: don't use the mesa wrapper _mesa_align_free
On Fri, Mar 6, 2015 at 11:32 AM, Emil Velikov emil.l.veli...@gmail.com wrote: Upon closer look it seems that TexData is no longer used. Perhaps we can nuke it ? http://patchwork.freedesktop.org/patch/43969/ Signed-off-by: Emil Velikov emil.l.veli...@gmail.com --- src/mesa/state_tracker/st_cb_texture.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/src/mesa/state_tracker/st_cb_texture.c b/src/mesa/state_tracker/st_cb_texture.c index a8b19a1..8f3060d 100644 --- a/src/mesa/state_tracker/st_cb_texture.c +++ b/src/mesa/state_tracker/st_cb_texture.c @@ -26,6 +26,8 @@ **/ #include stdio.h +#include c11_stdlib.h + #include main/bufferobj.h #include main/enums.h #include main/fbobject.h @@ -178,7 +180,7 @@ st_FreeTextureImageBuffer(struct gl_context *ctx, pipe_resource_reference(stImage-pt, NULL); } - _mesa_align_free(stImage-TexData); + aligned_free(stImage-TexData); stImage-TexData = NULL; free(stImage-transfer); @@ -1534,7 +1536,7 @@ copy_image_data_to_texture(struct st_context *st, stImage-TexData, srcRowStride, srcSliceStride); - _mesa_align_free(stImage-TexData); + aligned_free(stImage-TexData); stImage-TexData = NULL; } -- 2.1.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 5/5] glx: remove support for non-multithreaded platforms
Implicitly required for a while, although commit 9385c592c68 (mapi: remove u_thread.h) was the one that put the final nail on the coffin. Signed-off-by: Emil Velikov emil.l.veli...@gmail.com --- docs/dispatch.html| 7 ++- src/glx/glxclient.h | 18 +- src/glx/glxcurrent.c | 11 --- src/glx/tests/fake_glx_screen.cpp | 2 +- 4 files changed, 4 insertions(+), 34 deletions(-) diff --git a/docs/dispatch.html b/docs/dispatch.html index aacd01e..77cfba3 100644 --- a/docs/dispatch.html +++ b/docs/dispatch.html @@ -185,8 +185,6 @@ ways that the dispatch table pointer can be accessed. There are four different methods that can be used:/p ol -liUsing tt_glapi_Dispatch/tt directly in builds for non-multithreaded -environments./li liUsing tt_glapi_Dispatch/tt and tt_glapi_get_dispatch/tt in multithreaded environments./li liUsing tt_glapi_Dispatch/tt and ttpthread_getspecific/tt in @@ -204,9 +202,8 @@ terribly relevant./p few preprocessor defines./p ul -liIf ttGLX_USE_TLS/tt is defined, method #4 is used./li -liIf ttHAVE_PTHREAD/tt is defined, method #3 is used./li -liIf ttWIN32_THREADS/tt is defined, method #2 is used./li +liIf ttGLX_USE_TLS/tt is defined, method #3 is used./li +liIf ttHAVE_PTHREAD/tt is defined, method #2 is used./li liIf none of the preceding are defined, method #1 is used./li /ul diff --git a/src/glx/glxclient.h b/src/glx/glxclient.h index a140c87..4211d31 100644 --- a/src/glx/glxclient.h +++ b/src/glx/glxclient.h @@ -47,15 +47,13 @@ #include string.h #include stdlib.h #include stdio.h +#include pthread.h #ifdef _WIN32 #include stdint.h #endif #include GL/glxproto.h #include glxconfig.h #include glxhash.h -#if defined( HAVE_PTHREAD ) -# include pthread.h -#endif #include util/macros.h #include glxextensions.h @@ -631,7 +629,6 @@ extern void __glXPreferEGL(int state); extern int __glXDebug; /* This is per-thread storage in an MT environment */ -#if defined( HAVE_PTHREAD ) extern void __glXSetCurrentContext(struct glx_context * c); @@ -648,14 +645,6 @@ extern struct 
glx_context *__glXGetCurrentContext(void); # endif /* defined( GLX_USE_TLS ) */ -#else - -extern struct glx_context *__glXcurrentContext; -#define __glXGetCurrentContext() __glXcurrentContext -#define __glXSetCurrentContext(gc) __glXcurrentContext = gc - -#endif /* defined( HAVE_PTHREAD ) */ - extern void __glXSetCurrentContextNull(void); @@ -663,14 +652,9 @@ extern void __glXSetCurrentContextNull(void); ** Global lock for all threads in this address space using the GLX ** extension */ -#if defined( HAVE_PTHREAD ) extern pthread_mutex_t __glXmutex; #define __glXLock()pthread_mutex_lock(__glXmutex) #define __glXUnlock() pthread_mutex_unlock(__glXmutex) -#else -#define __glXLock() -#define __glXUnlock() -#endif /* ** Setup for a command. Initialize the extension for dpy if necessary. diff --git a/src/glx/glxcurrent.c b/src/glx/glxcurrent.c index 86fb658..7f47a42 100644 --- a/src/glx/glxcurrent.c +++ b/src/glx/glxcurrent.c @@ -33,9 +33,7 @@ * Client-side GLX interface for current context management. 
*/ -#ifdef HAVE_PTHREAD #include pthread.h -#endif #include glxclient.h @@ -67,8 +65,6 @@ struct glx_context dummyContext = { * Current context management and locking */ -#if defined( HAVE_PTHREAD ) - _X_HIDDEN pthread_mutex_t __glXmutex = PTHREAD_MUTEX_INITIALIZER; # if defined( GLX_USE_TLS ) @@ -138,13 +134,6 @@ __glXGetCurrentContext(void) # endif /* defined( GLX_USE_TLS ) */ -#else - -/* not thread safe */ -_X_HIDDEN struct glx_context *__glXcurrentContext = dummyContext; - -#endif - _X_HIDDEN void __glXSetCurrentContextNull(void) diff --git a/src/glx/tests/fake_glx_screen.cpp b/src/glx/tests/fake_glx_screen.cpp index ccb1afa..db20749 100644 --- a/src/glx/tests/fake_glx_screen.cpp +++ b/src/glx/tests/fake_glx_screen.cpp @@ -77,7 +77,7 @@ indirect_create_context_attribs(struct glx_screen *base, __thread void *__glX_tls_Context = NULL; -#if defined(HAVE_PTHREAD) !defined(GLX_USE_TLS) +#if !defined(GLX_USE_TLS) extern C struct glx_context * __glXGetCurrentContext() { -- 2.1.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
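With the single-threaded fallback gone, the "current context" is always per-thread state. The sketch below shows the general shape of a TLS-backed getter that falls back to a dummy object, similar in spirit to the GLX_USE_TLS path. It is an illustration only: struct ctx, dummy_ctx, set_current() and get_current() are hypothetical names, and C11 _Thread_local is used here where the Mesa code uses __thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct glx_context. */
struct ctx {
    int id;
};

/* Shared placeholder returned when a thread has no context bound. */
static struct ctx dummy_ctx = { 0 };

/* One pointer per thread; starts out NULL on every new thread. */
static _Thread_local struct ctx *current_ctx;

static void set_current(struct ctx *c)
{
    current_ctx = c;
}

/* Never returns NULL, so callers can dereference without checking. */
static struct ctx *get_current(void)
{
    return current_ctx ? current_ctx : &dummy_ctx;
}
```

Returning a dummy object instead of NULL keeps the hot path free of NULL checks, at the cost of callers having to compare against the dummy when they need to know whether a real context is bound.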
Re: [Mesa-dev] [PATCH 1/5] egl/main: use c11/threads' mutex directly
On 06/03/15 17:05, Emil Velikov wrote:
> Hi all,
>
> Just accidentally pushed the series to master. I'll revert them in a second.

All done. Apologies for the noise.

-Emil
[Mesa-dev] [PATCH 6/6] mesa/main: remove _mesa_align_malloc/free
Signed-off-by: Emil Velikov emil.l.veli...@gmail.com --- src/mesa/main/imports.c | 76 - src/mesa/main/imports.h | 6 2 files changed, 82 deletions(-) diff --git a/src/mesa/main/imports.c b/src/mesa/main/imports.c index 5961587..3937a02 100644 --- a/src/mesa/main/imports.c +++ b/src/mesa/main/imports.c @@ -64,82 +64,6 @@ extern int vsnprintf(char *str, size_t count, const char *fmt, va_list arg); #endif -/**/ -/** \name Memory */ -/*@{*/ - -/** - * Allocate aligned memory. - * - * \param bytes number of bytes to allocate. - * \param alignment alignment (must be greater than zero). - * - * Allocates extra memory to accommodate rounding up the address for - * alignment and to record the real malloc address. - * - * \sa _mesa_align_free(). - */ -void * -_mesa_align_malloc(size_t bytes, unsigned long alignment) -{ -#if defined(HAVE_POSIX_MEMALIGN) - void *mem; - int err = posix_memalign( mem, alignment, bytes); - if (err) - return NULL; - return mem; -#elif defined(_WIN32) defined(_MSC_VER) - return _aligned_malloc(bytes, alignment); -#else - uintptr_t ptr, buf; - - assert( alignment 0 ); - - ptr = (uintptr_t)malloc(bytes + alignment + sizeof(void *)); - if (!ptr) - return NULL; - - buf = (ptr + alignment + sizeof(void *)) ~(uintptr_t)(alignment - 1); - *(uintptr_t *)(buf - sizeof(void *)) = ptr; - -#ifdef DEBUG - /* mark the non-aligned area */ - while ( ptr buf - sizeof(void *) ) { - *(unsigned long *)ptr = 0xcdcdcdcd; - ptr += sizeof(unsigned long); - } -#endif - - return (void *) buf; -#endif /* defined(HAVE_POSIX_MEMALIGN) */ -} - -/** - * Free memory which was allocated with either _mesa_align_malloc(). - * \param ptr pointer to the memory to be freed. - * The actual address to free is stored in the word immediately before the - * address the client sees. - * Note that it is legal to pass NULL pointer to this function and will be - * handled accordingly. 
- */ -void -_mesa_align_free(void *ptr) -{ -#if defined(HAVE_POSIX_MEMALIGN) - free(ptr); -#elif defined(_WIN32) defined(_MSC_VER) - _aligned_free(ptr); -#else - if (ptr) { - void **cubbyHole = (void **) ((char *) ptr - sizeof(void *)); - void *realAddr = *cubbyHole; - free(realAddr); - } -#endif /* defined(HAVE_POSIX_MEMALIGN) */ -} - -/*@}*/ - /**/ /** \name Math */ diff --git a/src/mesa/main/imports.h b/src/mesa/main/imports.h index a3767fd..db3a91d 100644 --- a/src/mesa/main/imports.h +++ b/src/mesa/main/imports.h @@ -361,12 +361,6 @@ _mesa_little_endian(void) */ extern void * -_mesa_align_malloc( size_t bytes, unsigned long alignment ); - -extern void -_mesa_align_free( void *ptr ); - -extern void * _mesa_exec_malloc( GLuint size ); extern void -- 2.1.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 5/6] st/mesa: don't use the mesa wrapper _mesa_align_free
Upon closer look it seems that TexData is no longer used. Perhaps we can nuke it ? Signed-off-by: Emil Velikov emil.l.veli...@gmail.com --- src/mesa/state_tracker/st_cb_texture.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/src/mesa/state_tracker/st_cb_texture.c b/src/mesa/state_tracker/st_cb_texture.c index a8b19a1..8f3060d 100644 --- a/src/mesa/state_tracker/st_cb_texture.c +++ b/src/mesa/state_tracker/st_cb_texture.c @@ -26,6 +26,8 @@ **/ #include stdio.h +#include c11_stdlib.h + #include main/bufferobj.h #include main/enums.h #include main/fbobject.h @@ -178,7 +180,7 @@ st_FreeTextureImageBuffer(struct gl_context *ctx, pipe_resource_reference(stImage-pt, NULL); } - _mesa_align_free(stImage-TexData); + aligned_free(stImage-TexData); stImage-TexData = NULL; free(stImage-transfer); @@ -1534,7 +1536,7 @@ copy_image_data_to_texture(struct st_context *st, stImage-TexData, srcRowStride, srcSliceStride); - _mesa_align_free(stImage-TexData); + aligned_free(stImage-TexData); stImage-TexData = NULL; } -- 2.1.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] clover: Return the minimum required value for CL_DEVICE_SINGLE_FP_CONFIG
On Thu, Mar 05, 2015 at 08:42:25PM +0200, Francisco Jerez wrote:
> Tom Stellard thomas.stell...@amd.com writes:
>
> > This means dropping CL_FP_DENORM from the current return value.
> > ---
> >  src/gallium/state_trackers/clover/api/device.cpp | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/src/gallium/state_trackers/clover/api/device.cpp b/src/gallium/state_trackers/clover/api/device.cpp
> > index b1f556f..db3b931 100644
> > --- a/src/gallium/state_trackers/clover/api/device.cpp
> > +++ b/src/gallium/state_trackers/clover/api/device.cpp
> > @@ -201,8 +201,10 @@ clGetDeviceInfo(cl_device_id d_dev, cl_device_info param,
> >        break;
> >
> >     case CL_DEVICE_SINGLE_FP_CONFIG:
> > +      // This is the mandated minimum single precision floating-point
> > +      // capability
>
> Could you add that this is according to the OpenCL 1.1 specification?
> OpenCL 1.2 is even weaker (CL_FP_INF_NAN is not required, only one of
> CL_FP_ROUND_TO_ZERO or CL_FP_ROUND_TO_NEAREST is required, and no FP
> capabilities at all are required for custom devices as Jan pointed out).
>
> >        buf.as_scalar<cl_device_fp_config>() =
> > -         CL_FP_DENORM | CL_FP_INF_NAN | CL_FP_ROUND_TO_NEAREST;
> > +         CL_FP_INF_NAN | CL_FP_ROUND_TO_NEAREST;
>
> I'm okay with this change, but I'm curious, is this motivated by your
> architecture not supporting denorms?

It can, but supporting them hurts performance.

-Tom

> >        break;
> >
> >     case CL_DEVICE_DOUBLE_FP_CONFIG:
> > --
> > 2.0.4
[Mesa-dev] [PATCH 2/2] radeonsi/compute: Use value from compiler for COMPUTE_PGM_RSRC1.FLOAT_MODE
---
 src/gallium/drivers/radeonsi/si_compute.c | 3 ++-
 src/gallium/drivers/radeonsi/si_shader.c  | 1 +
 src/gallium/drivers/radeonsi/si_shader.h  | 1 +
 3 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_compute.c b/src/gallium/drivers/radeonsi/si_compute.c
index 5009f69..8609b89 100644
--- a/src/gallium/drivers/radeonsi/si_compute.c
+++ b/src/gallium/drivers/radeonsi/si_compute.c
@@ -377,7 +377,8 @@ static void si_launch_grid(
 		 * XXX: The compiler should account for this.
 		 */
 		| S_00B848_SGPRS(((MAX2(4 + arg_user_sgpr_count,
-		                        shader->num_sgprs)) - 1) / 8))
+		                        shader->num_sgprs)) - 1) / 8)
+		| S_00B028_FLOAT_MODE(shader->float_mode))
 		;

 	lds_blocks = shader->lds_size;
diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index b0417ed..87aef4d 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -2546,6 +2546,7 @@ void si_shader_binary_read_config(const struct si_screen *sscreen,
 		case R_00B848_COMPUTE_PGM_RSRC1:
 			shader->num_sgprs = MAX2(shader->num_sgprs, (G_00B028_SGPRS(value) + 1) * 8);
 			shader->num_vgprs = MAX2(shader->num_vgprs, (G_00B028_VGPRS(value) + 1) * 4);
+			shader->float_mode = G_00B028_FLOAT_MODE(value);
 			break;
 		case R_00B02C_SPI_SHADER_PGM_RSRC2_PS:
 			shader->lds_size = MAX2(shader->lds_size, G_00B02C_EXTRA_LDS_SIZE(value));
diff --git a/src/gallium/drivers/radeonsi/si_shader.h b/src/gallium/drivers/radeonsi/si_shader.h
index 551c7dc..4f2bb91 100644
--- a/src/gallium/drivers/radeonsi/si_shader.h
+++ b/src/gallium/drivers/radeonsi/si_shader.h
@@ -149,6 +149,7 @@ struct si_shader {
 	unsigned			num_vgprs;
 	unsigned			lds_size;
 	unsigned			spi_ps_input_ena;
+	unsigned			float_mode;
 	unsigned			scratch_bytes_per_wave;
 	unsigned			spi_shader_col_format;
 	unsigned			spi_shader_z_format;
--
2.0.4
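The patch relies on the usual S_/G_ register-field macro pair: G_ extracts a field from the value the compiler emitted, S_ packs it back into the register word the driver programs. A minimal sketch of that round trip follows. The shift and mask are placeholders chosen for illustration and are not claimed to match the real COMPUTE_PGM_RSRC1 layout; S_FLOAT_MODE, G_FLOAT_MODE, read_config() and build_rsrc1() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder layout for illustration: an 8-bit field at bits 12..19. */
#define FLOAT_MODE_SHIFT 12u
#define FLOAT_MODE_MASK  0xFFu

/* S_ = "set": move a field value into position inside the register word. */
#define S_FLOAT_MODE(x) (((uint32_t)(x) & FLOAT_MODE_MASK) << FLOAT_MODE_SHIFT)
/* G_ = "get": pull the field value back out of a register word. */
#define G_FLOAT_MODE(x) (((uint32_t)(x) >> FLOAT_MODE_SHIFT) & FLOAT_MODE_MASK)

struct shader_config {
    unsigned float_mode;
};

/* Remember what the compiler asked for when parsing the shader binary. */
static void read_config(struct shader_config *cfg, uint32_t rsrc1_from_compiler)
{
    cfg->float_mode = G_FLOAT_MODE(rsrc1_from_compiler);
}

/* Re-emit the saved value when the driver builds its own register word. */
static uint32_t build_rsrc1(const struct shader_config *cfg, uint32_t other_fields)
{
    /* Clear the field first so stale bits in other_fields cannot leak in. */
    uint32_t cleared = other_fields & ~((uint32_t)FLOAT_MODE_MASK << FLOAT_MODE_SHIFT);
    return cleared | S_FLOAT_MODE(cfg->float_mode);
}
```

The point of carrying float_mode through the shader struct is that the driver no longer hard-codes a value the compiler may disagree with.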
[Mesa-dev] [PATCH 1/2] clover: Return the minimum required value for CL_DEVICE_SINGLE_FP_CONFIG v2
This means dropping CL_FP_DENORM from the current return value.

v2:
  - Add comments about minimum values for OpenCL 1.2.
---
 src/gallium/state_trackers/clover/api/device.cpp | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/clover/api/device.cpp b/src/gallium/state_trackers/clover/api/device.cpp
index b1f556f..b79997f 100644
--- a/src/gallium/state_trackers/clover/api/device.cpp
+++ b/src/gallium/state_trackers/clover/api/device.cpp
@@ -201,8 +201,11 @@ clGetDeviceInfo(cl_device_id d_dev, cl_device_info param,
       break;

    case CL_DEVICE_SINGLE_FP_CONFIG:
+      // This is the mandated minimum single precision floating-point
+      // capability for OpenCL 1.1. In OpenCL 1.2, CL_FP_INF_NAN
+      // is no longer required and nothing is required for custom devices.
       buf.as_scalar<cl_device_fp_config>() =
-         CL_FP_DENORM | CL_FP_INF_NAN | CL_FP_ROUND_TO_NEAREST;
+         CL_FP_INF_NAN | CL_FP_ROUND_TO_NEAREST;
       break;

    case CL_DEVICE_DOUBLE_FP_CONFIG:
--
2.0.4
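To make the effect of the change concrete, here is a small sketch of how an application might test a reported FP config bitfield against the OpenCL 1.1 minimum discussed above. It deliberately avoids the OpenCL headers so that it builds anywhere: the FP_* constants and the fp_config type are local stand-ins with placeholder bit values, and meets_minimum() and has_denorms() are hypothetical helpers. Real code would include CL/cl.h and use the CL_FP_* constants returned by clGetDeviceInfo.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins; bit positions are placeholders, not taken from cl.h. */
typedef uint64_t fp_config;
#define FP_DENORM           (1u << 0)
#define FP_INF_NAN          (1u << 1)
#define FP_ROUND_TO_NEAREST (1u << 2)

/* The minimum the patch comment attributes to OpenCL 1.1. */
#define FP_MIN_SINGLE_CL11 (FP_INF_NAN | FP_ROUND_TO_NEAREST)

/* True when every required capability bit is present. */
static bool meets_minimum(fp_config reported)
{
    return (reported & FP_MIN_SINGLE_CL11) == FP_MIN_SINGLE_CL11;
}

/* True when the device advertises denormal support. */
static bool has_denorms(fp_config reported)
{
    return (reported & FP_DENORM) != 0;
}
```

Both the old and the new return value pass meets_minimum(); only the old one passes has_denorms(), which is the visible difference for applications after this patch.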
[Mesa-dev] [PATCH 3/6] mesa: use c11_stdlib.h' aligned_alloc/free
Signed-off-by: Emil Velikov emil.l.veli...@gmail.com --- src/mesa/main/bufferobj.c | 8 +--- src/mesa/math/m_debug_norm.c | 6 -- src/mesa/math/m_debug_xform.c | 6 -- src/mesa/math/m_matrix.c | 10 ++ src/mesa/math/m_vector.c | 5 +++-- src/mesa/program/prog_parameter.c | 11 ++- src/mesa/swrast/s_texture.c | 5 +++-- src/mesa/tnl/t_vb_program.c | 5 +++-- src/mesa/tnl/t_vb_vertex.c| 5 +++-- src/mesa/tnl/t_vertex.c | 6 -- src/mesa/vbo/vbo_exec_api.c | 8 +--- 11 files changed, 46 insertions(+), 29 deletions(-) diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c index e1c5877..d54d37d 100644 --- a/src/mesa/main/bufferobj.c +++ b/src/mesa/main/bufferobj.c @@ -32,6 +32,8 @@ #include stdbool.h #include inttypes.h /* for PRId64 macro */ +#include c11_stdlib.h + #include glheader.h #include enums.h #include hash.h @@ -417,7 +419,7 @@ _mesa_delete_buffer_object(struct gl_context *ctx, { (void) ctx; - _mesa_align_free(bufObj-Data); + aligned_free(bufObj-Data); /* assign strange values here to help w/ debugging */ bufObj-RefCount = -1000; @@ -569,9 +571,9 @@ _mesa_buffer_data( struct gl_context *ctx, GLenum target, GLsizeiptrARB size, (void) target; - _mesa_align_free( bufObj-Data ); + aligned_free( bufObj-Data ); - new_data = _mesa_align_malloc( size, ctx-Const.MinMapBufferAlignment ); + new_data = aligned_alloc( ctx-Const.MinMapBufferAlignment, size ); if (new_data) { bufObj-Data = (GLubyte *) new_data; bufObj-Size = size; diff --git a/src/mesa/math/m_debug_norm.c b/src/mesa/math/m_debug_norm.c index 197b43c..ed61c44 100644 --- a/src/mesa/math/m_debug_norm.c +++ b/src/mesa/math/m_debug_norm.c @@ -27,6 +27,8 @@ */ #include c99_math.h +#include c11_stdlib.h + #include main/glheader.h #include main/context.h #include main/macros.h @@ -209,7 +211,7 @@ static int test_norm_function( normal_func func, int mtype, long *cycles ) (void) cycles; - mat-m = _mesa_align_malloc( 16 * sizeof(GLfloat), 16 ); + mat-m = aligned_alloc( 16, 16 * sizeof(GLfloat) ); mat-inv = m = mat-m; 
init_matrix( m ); @@ -328,7 +330,7 @@ static int test_norm_function( normal_func func, int mtype, long *cycles ) } } - _mesa_align_free( mat-m ); + aligned_free( mat-m ); return 1; } diff --git a/src/mesa/math/m_debug_xform.c b/src/mesa/math/m_debug_xform.c index 632c82e..2a0778b 100644 --- a/src/mesa/math/m_debug_xform.c +++ b/src/mesa/math/m_debug_xform.c @@ -26,6 +26,8 @@ * Updated for P6 architecture by Gareth Hughes. */ +#include c11_stdlib.h + #include main/glheader.h #include main/context.h #include main/macros.h @@ -183,7 +185,7 @@ static int test_transform_function( transform_func func, int psize, return 0; } - mat-m = _mesa_align_malloc( 16 * sizeof(GLfloat), 16 ); + mat-m = aligned_alloc( 16, 16 * sizeof(GLfloat) ); mat-type = mtypes[mtype]; m = mat-m; @@ -273,7 +275,7 @@ static int test_transform_function( transform_func func, int psize, } } - _mesa_align_free( mat-m ); + aligned_free( mat-m ); return 1; } diff --git a/src/mesa/math/m_matrix.c b/src/mesa/math/m_matrix.c index 0475a7a..ebee5b9 100644 --- a/src/mesa/math/m_matrix.c +++ b/src/mesa/math/m_matrix.c @@ -35,6 +35,8 @@ #include c99_math.h +#include c11_stdlib.h + #include main/glheader.h #include main/imports.h #include main/macros.h @@ -1469,10 +1471,10 @@ _math_matrix_loadf( GLmatrix *mat, const GLfloat *m ) void _math_matrix_ctr( GLmatrix *m ) { - m-m = _mesa_align_malloc( 16 * sizeof(GLfloat), 16 ); + m-m = aligned_alloc( 16, 16 * sizeof(GLfloat) ); if (m-m) memcpy( m-m, Identity, sizeof(Identity) ); - m-inv = _mesa_align_malloc( 16 * sizeof(GLfloat), 16 ); + m-inv = aligned_alloc( 16, 16 * sizeof(GLfloat) ); if (m-inv) memcpy( m-inv, Identity, sizeof(Identity) ); m-type = MATRIX_IDENTITY; @@ -1489,10 +1491,10 @@ _math_matrix_ctr( GLmatrix *m ) void _math_matrix_dtr( GLmatrix *m ) { - _mesa_align_free( m-m ); + aligned_free( m-m ); m-m = NULL; - _mesa_align_free( m-inv ); + aligned_free( m-inv ); m-inv = NULL; } diff --git a/src/mesa/math/m_vector.c b/src/mesa/math/m_vector.c index 
831f953..e1d5c27 100644 --- a/src/mesa/math/m_vector.c +++ b/src/mesa/math/m_vector.c @@ -27,6 +27,7 @@ */ #include stdio.h +#include c11_stdlib.h #include main/glheader.h #include main/imports.h @@ -101,7 +102,7 @@ _mesa_vector4f_alloc( GLvector4f *v, GLbitfield flags, GLuint count, { v-stride = 4 * sizeof(GLfloat); v-size = 2; - v-storage = _mesa_align_malloc( count * 4 * sizeof(GLfloat), alignment ); + v-storage = aligned_alloc( alignment, count * 4 * sizeof(GLfloat) ); v-storage_count = count; v-start = (GLfloat *) v-storage; v-data = (GLfloat (*)[4]) v-storage; @@ -119,7 +120,7
[Mesa-dev] [PATCH 1/6] c11: add c11 compatibility wrapper around stdlib.h
Used for aligned_alloc and other C11 functions missing from the header. Signed-off-by: Emil Velikov emil.l.veli...@gmail.com --- include/c11_stdlib.h | 118 +++ 1 file changed, 118 insertions(+) create mode 100644 include/c11_stdlib.h diff --git a/include/c11_stdlib.h b/include/c11_stdlib.h new file mode 100644 index 000..04e494f --- /dev/null +++ b/include/c11_stdlib.h @@ -0,0 +1,118 @@ +/* + * Mesa 3-D graphics library + * + * Copyright (C) 1999-2007 Brian Paul All Rights Reserved. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included + * in all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS + * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. + */ + +/** + * Wrapper for stdlib.h which makes sure we have definitions of all the c11 + * functions. + */ + +#ifndef _C11_STDLIB_H_ +#define _C11_STDLIB_H_ + +#include stdint.h +#include stdlib.h +#include c99_compat.h + + +#if !defined(_ISOC11_SOURCE) __STDC_VERSION__ 201112L + +#if defined(_WIN32) !defined(__CYGWIN__) +#include malloc.h +#endif + +/** + * Allocate aligned memory. 
+ * + * \param alignment alignment (must be greater than zero). + * \param size number of bytes to allocate. + * + * Allocates extra memory to accommodate rounding up the address for + * alignment and to record the real malloc address. + * + * \sa aligned_free(). + */ +static inline void * +aligned_alloc(size_t alignment, size_t size) +{ +#if defined(HAVE_POSIX_MEMALIGN) + void *mem; + int err = posix_memalign(mem, alignment, size); + if (err) + return NULL; + return mem; +#elif defined(_WIN32) !defined(__CYGWIN__) + return _aligned_malloc(size, alignment); +#else + uintptr_t ptr, buf; + + assert( alignment 0 ); + + ptr = (uintptr_t)malloc(size + alignment + sizeof(void *)); + if (!ptr) + return NULL; + + buf = (ptr + alignment + sizeof(void *)) ~(uintptr_t)(alignment - 1); + *(uintptr_t *)(buf - sizeof(void *)) = ptr; + +#ifdef DEBUG + /* mark the non-aligned area */ + while ( ptr buf - sizeof(void *) ) { + *(unsigned long *)ptr = 0xcdcdcdcd; + ptr += sizeof(unsigned long); + } +#endif + + return (void *) buf; +#endif /* defined(HAVE_POSIX_MEMALIGN) */ +} + +#endif /* C11 */ + +/** + * Free memory which was allocated with aligned_alloc(). + * + * \param ptr pointer to the memory to be freed. + * + * The actual address to free is stored in the word immediately before the + * address the client sees. + * Note that it is legal to pass NULL pointer to this function and will be + * handled accordingly. + */ +static inline void +aligned_free(void *ptr) +{ +#if defined(HAVE_POSIX_MEMALIGN) + free(ptr); +#elif defined(_WIN32) !defined(__CYGWIN__) + _aligned_free(ptr); +#else + if (ptr) { + void **cubbyHole = (void **) ((char *) ptr - sizeof(void *)); + void *realAddr = *cubbyHole; + free(realAddr); + } +#endif /* defined(HAVE_POSIX_MEMALIGN) */ +} + +#endif /* #define _C11_STDLIB_H_ */ -- 2.1.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
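The generic fallback in this header (the branch used when neither posix_memalign nor _aligned_malloc is available) over-allocates, rounds the address to the requested alignment, and stashes the real malloc() pointer in the word just below the block it hands out. The sketch below restates that technique in standalone form. sketch_aligned_alloc() and sketch_aligned_free() are hypothetical names, and the power-of-two check, overflow check and memcpy-based stash are additions of this sketch, not part of the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Allocate size bytes aligned to alignment (a power of two). */
static void *sketch_aligned_alloc(size_t alignment, size_t size)
{
    /* The mask trick below only works for power-of-two alignments. */
    if (alignment == 0 || (alignment & (alignment - 1)) != 0)
        return NULL;
    /* Refuse requests whose padded size would overflow size_t. */
    if (alignment > SIZE_MAX - sizeof(void *) ||
        size > SIZE_MAX - alignment - sizeof(void *))
        return NULL;

    /* Extra room: up to alignment bytes of slack plus one pointer. */
    void *raw = malloc(size + alignment + sizeof(void *));
    if (!raw)
        return NULL;

    /* Step past the slack and the stash slot, then round down. */
    uintptr_t aligned = ((uintptr_t)raw + alignment + sizeof(void *))
                        & ~(uintptr_t)(alignment - 1);

    /* Stash the real address just below the block the caller sees.
     * memcpy avoids a misaligned store when alignment is small. */
    memcpy((char *)aligned - sizeof(void *), &raw, sizeof(raw));
    return (void *)aligned;
}

/* Free a block from sketch_aligned_alloc(); NULL is accepted. */
static void sketch_aligned_free(void *ptr)
{
    if (ptr) {
        void *raw;
        memcpy(&raw, (char *)ptr - sizeof(void *), sizeof(raw));
        free(raw);
    }
}
```

Note the argument order: like C11 aligned_alloc(), the sketch takes the alignment first and the size second, which is the reverse of _mesa_align_malloc(bytes, alignment). Call sites converted in this series have to swap their arguments accordingly.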
[Mesa-dev] [PATCH 2/6] mesa: inline _mesa_align_{re, c}alloc into their only users
Signed-off-by: Emil Velikov emil.l.veli...@gmail.com --- src/mesa/main/imports.c | 74 +-- src/mesa/main/imports.h | 7 src/mesa/program/prog_parameter.c | 18 +++--- src/mesa/tnl/t_vertex.c | 5 ++- 4 files changed, 18 insertions(+), 86 deletions(-) diff --git a/src/mesa/main/imports.c b/src/mesa/main/imports.c index a7ffe22..5961587 100644 --- a/src/mesa/main/imports.c +++ b/src/mesa/main/imports.c @@ -115,57 +115,7 @@ _mesa_align_malloc(size_t bytes, unsigned long alignment) } /** - * Same as _mesa_align_malloc(), but using calloc(1, ) instead of - * malloc() - */ -void * -_mesa_align_calloc(size_t bytes, unsigned long alignment) -{ -#if defined(HAVE_POSIX_MEMALIGN) - void *mem; - - mem = _mesa_align_malloc(bytes, alignment); - if (mem != NULL) { - (void) memset(mem, 0, bytes); - } - - return mem; -#elif defined(_WIN32) defined(_MSC_VER) - void *mem; - - mem = _aligned_malloc(bytes, alignment); - if (mem != NULL) { - (void) memset(mem, 0, bytes); - } - - return mem; -#else - uintptr_t ptr, buf; - - assert( alignment 0 ); - - ptr = (uintptr_t)calloc(1, bytes + alignment + sizeof(void *)); - if (!ptr) - return NULL; - - buf = (ptr + alignment + sizeof(void *)) ~(uintptr_t)(alignment - 1); - *(uintptr_t *)(buf - sizeof(void *)) = ptr; - -#ifdef DEBUG - /* mark the non-aligned area */ - while ( ptr buf - sizeof(void *) ) { - *(unsigned long *)ptr = 0xcdcdcdcd; - ptr += sizeof(unsigned long); - } -#endif - - return (void *)buf; -#endif /* defined(HAVE_POSIX_MEMALIGN) */ -} - -/** - * Free memory which was allocated with either _mesa_align_malloc() - * or _mesa_align_calloc(). + * Free memory which was allocated with either _mesa_align_malloc(). * \param ptr pointer to the memory to be freed. * The actual address to free is stored in the word immediately before the * address the client sees. @@ -188,28 +138,6 @@ _mesa_align_free(void *ptr) #endif /* defined(HAVE_POSIX_MEMALIGN) */ } -/** - * Reallocate memory, with alignment. 
- */ -void * -_mesa_align_realloc(void *oldBuffer, size_t oldSize, size_t newSize, -unsigned long alignment) -{ -#if defined(_WIN32) defined(_MSC_VER) - (void) oldSize; - return _aligned_realloc(oldBuffer, newSize, alignment); -#else - const size_t copySize = (oldSize newSize) ? oldSize : newSize; - void *newBuf = _mesa_align_malloc(newSize, alignment); - if (newBuf oldBuffer copySize 0) { - memcpy(newBuf, oldBuffer, copySize); - } - - _mesa_align_free(oldBuffer); - return newBuf; -#endif -} - /*@}*/ diff --git a/src/mesa/main/imports.h b/src/mesa/main/imports.h index 7921000..a3767fd 100644 --- a/src/mesa/main/imports.h +++ b/src/mesa/main/imports.h @@ -363,17 +363,10 @@ _mesa_little_endian(void) extern void * _mesa_align_malloc( size_t bytes, unsigned long alignment ); -extern void * -_mesa_align_calloc( size_t bytes, unsigned long alignment ); - extern void _mesa_align_free( void *ptr ); extern void * -_mesa_align_realloc(void *oldBuffer, size_t oldSize, size_t newSize, -unsigned long alignment); - -extern void * _mesa_exec_malloc( GLuint size ); extern void diff --git a/src/mesa/program/prog_parameter.c b/src/mesa/program/prog_parameter.c index 5939f6f..edb5389 100644 --- a/src/mesa/program/prog_parameter.c +++ b/src/mesa/program/prog_parameter.c @@ -116,19 +116,27 @@ _mesa_add_parameter(struct gl_program_parameter_list *paramList, assert(size 0); if (oldNum + sz4 paramList-Size) { + size_t newSize, copySize; + void *newBuf; + /* Need to grow the parameter list array (alloc some extra) */ paramList-Size = paramList-Size + 4 * sz4; + newSize = paramList-Size * 4 * sizeof(gl_constant_value); + copySize = MIN2(oldNum * 4 * sizeof(gl_constant_value), newSize); + /* realloc arrays */ paramList-Parameters = realloc(paramList-Parameters, paramList-Size * sizeof(struct gl_program_parameter)); - paramList-ParameterValues = (gl_constant_value (*)[4]) - _mesa_align_realloc(paramList-ParameterValues, /* old buf */ - oldNum * 4 * sizeof(gl_constant_value),/* old sz */ - 
paramList-Size*4*sizeof(gl_constant_value),/*new*/ - 16); + newBuf = _mesa_align_malloc(newSize, 16); + if (newBuf paramList-ParameterValues copySize 0) { + memcpy(newBuf, paramList-ParameterValues, copySize); + } + + _mesa_align_free(paramList-ParameterValues); + paramList-ParameterValues = (gl_constant_value (*)[4]) newBuf; } if (!paramList-Parameters || diff --git a/src/mesa/tnl/t_vertex.c b/src/mesa/tnl/t_vertex.c index 369d6d9..607977c 100644 --- a/src/mesa/tnl/t_vertex.c +++ b/src/mesa/tnl/t_vertex.c @@ -26,6 +26,7 @@ */ #include
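For readers following the patch, here is a minimal standalone sketch in C of the "allocate new aligned block, copy, free old block" sequence that replaces _mesa_align_realloc() in prog_parameter.c. It assumes the generic fallback strategy shown in imports.c (over-allocate, round up, keep the real pointer just before the aligned block). The sketch_* names are hypothetical and are not Mesa functions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for _mesa_align_malloc(): over-allocate, round the
 * address up, and stash the pointer returned by malloc() in the word
 * immediately before the aligned block. */
static void *
sketch_align_malloc(size_t bytes, size_t alignment)
{
   char *raw;
   uintptr_t buf;

   /* The rounding trick below only works for power-of-two alignments. */
   assert(alignment > 0 && (alignment & (alignment - 1)) == 0);

   raw = malloc(bytes + alignment + sizeof(void *));
   if (!raw)
      return NULL;

   buf = ((uintptr_t) raw + alignment + sizeof(void *)) &
         ~(uintptr_t) (alignment - 1);
   memcpy((char *) buf - sizeof(void *), &raw, sizeof(raw));
   return (void *) buf;
}

/* Hypothetical stand-in for _mesa_align_free(): recover the stashed pointer. */
static void
sketch_align_free(void *ptr)
{
   if (ptr) {
      char *raw;
      memcpy(&raw, (char *) ptr - sizeof(void *), sizeof(raw));
      free(raw);
   }
}

/* The open-coded "realloc": new block, copy the smaller size, free the old. */
static void *
sketch_align_grow(void *old_buf, size_t old_size, size_t new_size,
                  size_t alignment)
{
   const size_t copy_size = old_size < new_size ? old_size : new_size;
   void *new_buf = sketch_align_malloc(new_size, alignment);

   if (new_buf && old_buf && copy_size > 0)
      memcpy(new_buf, old_buf, copy_size);

   sketch_align_free(old_buf);
   return new_buf;
}
```

As in the patch, the old buffer is freed even when the new allocation fails, so a caller has to treat a NULL return as loss of the old contents.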
[Mesa-dev] [PATCH 1/5] egl/main: use c11/threads' mutex directly
Remove the inline wrappers/abstraction layer. Signed-off-by: Emil Velikov emil.l.veli...@gmail.com --- src/egl/main/Makefile.sources | 1 - src/egl/main/eglapi.c | 14 + src/egl/main/eglcurrent.c | 13 - src/egl/main/egldisplay.c | 13 + src/egl/main/egldisplay.h | 4 +-- src/egl/main/egldriver.c | 8 +++--- src/egl/main/eglglobals.c | 9 +++--- src/egl/main/eglglobals.h | 4 +-- src/egl/main/egllog.c | 18 ++-- src/egl/main/eglmutex.h | 66 --- src/egl/main/eglscreen.c | 8 +++--- 11 files changed, 47 insertions(+), 111 deletions(-) delete mode 100644 src/egl/main/eglmutex.h diff --git a/src/egl/main/Makefile.sources b/src/egl/main/Makefile.sources index 6a917e2..75f060a 100644 --- a/src/egl/main/Makefile.sources +++ b/src/egl/main/Makefile.sources @@ -26,7 +26,6 @@ LIBEGL_C_FILES := \ eglmisc.h \ eglmode.c \ eglmode.h \ - eglmutex.h \ eglscreen.c \ eglscreen.h \ eglstring.c \ diff --git a/src/egl/main/eglapi.c b/src/egl/main/eglapi.c index 2258830..a74efcd 100644 --- a/src/egl/main/eglapi.c +++ b/src/egl/main/eglapi.c @@ -87,6 +87,8 @@ #include stdlib.h #include string.h #include c99_compat.h +#include c11/threads.h +#include eglcompiler.h #include eglglobals.h #include eglcontext.h @@ -275,7 +277,7 @@ _eglLockDisplay(EGLDisplay display) { _EGLDisplay *dpy = _eglLookupDisplay(display); if (dpy) - _eglLockMutex(dpy-Mutex); + mtx_lock(dpy-Mutex); return dpy; } @@ -286,7 +288,7 @@ _eglLockDisplay(EGLDisplay display) static inline void _eglUnlockDisplay(_EGLDisplay *dpy) { - _eglUnlockMutex(dpy-Mutex); + mtx_unlock(dpy-Mutex); } @@ -896,7 +898,7 @@ eglWaitClient(void) RETURN_EGL_SUCCESS(NULL, EGL_TRUE); disp = ctx-Resource.Display; - _eglLockMutex(disp-Mutex); + mtx_lock(disp-Mutex); /* let bad current context imply bad current surface */ if (_eglGetContextHandle(ctx) == EGL_NO_CONTEXT || @@ -942,7 +944,7 @@ eglWaitNative(EGLint engine) RETURN_EGL_SUCCESS(NULL, EGL_TRUE); disp = ctx-Resource.Display; - _eglLockMutex(disp-Mutex); + mtx_lock(disp-Mutex); /* let bad current 
context imply bad current surface */ if (_eglGetContextHandle(ctx) == EGL_NO_CONTEXT || @@ -1457,10 +1459,10 @@ eglReleaseThread(void) t-CurrentAPIIndex = i; -_eglLockMutex(disp-Mutex); +mtx_lock(disp-Mutex); drv = disp-Driver; (void) drv-API.MakeCurrent(drv, disp, NULL, NULL, NULL); -_eglUnlockMutex(disp-Mutex); +mtx_unlock(disp-Mutex); } } diff --git a/src/egl/main/eglcurrent.c b/src/egl/main/eglcurrent.c index 3d49641..dc32ed4 100644 --- a/src/egl/main/eglcurrent.c +++ b/src/egl/main/eglcurrent.c @@ -31,7 +31,6 @@ #include c99_compat.h #include egllog.h -#include eglmutex.h #include eglcurrent.h #include eglglobals.h @@ -47,7 +46,7 @@ static _EGLThreadInfo dummy_thread = _EGL_THREAD_INFO_INITIALIZER; #if HAVE_PTHREAD #include pthread.h -static _EGLMutex _egl_TSDMutex = _EGL_MUTEX_INITIALIZER; +static mtx_t _egl_TSDMutex = _MTX_INITIALIZER_NP; static EGLBoolean _egl_TSDInitialized; static pthread_key_t _egl_TSD; static void (*_egl_FreeTSD)(_EGLThreadInfo *); @@ -76,7 +75,7 @@ static inline _EGLThreadInfo *_eglGetTSD(void) static inline void _eglFiniTSD(void) { - _eglLockMutex(_egl_TSDMutex); + mtx_lock(_egl_TSDMutex); if (_egl_TSDInitialized) { _EGLThreadInfo *t = _eglGetTSD(); @@ -85,18 +84,18 @@ static inline void _eglFiniTSD(void) _egl_FreeTSD((void *) t); pthread_key_delete(_egl_TSD); } - _eglUnlockMutex(_egl_TSDMutex); + mtx_unlock(_egl_TSDMutex); } static inline EGLBoolean _eglInitTSD(void (*dtor)(_EGLThreadInfo *)) { if (!_egl_TSDInitialized) { - _eglLockMutex(_egl_TSDMutex); + mtx_lock(_egl_TSDMutex); /* check again after acquiring lock */ if (!_egl_TSDInitialized) { if (pthread_key_create(_egl_TSD, (void (*)(void *)) dtor) != 0) { -_eglUnlockMutex(_egl_TSDMutex); +mtx_unlock(_egl_TSDMutex); return EGL_FALSE; } _egl_FreeTSD = dtor; @@ -104,7 +103,7 @@ static inline EGLBoolean _eglInitTSD(void (*dtor)(_EGLThreadInfo *)) _egl_TSDInitialized = EGL_TRUE; } - _eglUnlockMutex(_egl_TSDMutex); + mtx_unlock(_egl_TSDMutex); } return EGL_TRUE; diff --git 
a/src/egl/main/egldisplay.c b/src/egl/main/egldisplay.c index a167ae5..b7a5b8f 100644 --- a/src/egl/main/egldisplay.c +++ b/src/egl/main/egldisplay.c @@ -35,13 +35,14 @@ #include assert.h #include stdlib.h #include string.h +#include c11/threads.h + #include eglcontext.h #include eglcurrent.h #include eglsurface.h #include egldisplay.h #include egldriver.h #include eglglobals.h -#include
Re: [Mesa-dev] [PATCH 1/5] egl/main: use c11/threads' mutex directly
On 06/03/15 17:05, Emil Velikov wrote:
> Hi all,
>
> Just accidentally pushed the series to master. I'll revert them in a second.

All done. Apologies for the noise.

-Emil
Re: [Mesa-dev] [PATCH 1/5] i915: Remove unused IS_MOBILE macro
Series Reviewed-by: Jordan Justen jordan.l.jus...@intel.com

On 2015-03-05 11:49:54, Ian Romanick wrote:
> From: Ian Romanick ian.d.roman...@intel.com
>
> Inspired by Damien's recent libdrm changes.
>
> Signed-off-by: Ian Romanick ian.d.roman...@intel.com
> Cc: Damien Lespiau damien.lesp...@intel.com
> ---
>  src/mesa/drivers/dri/i915/intel_chipset.h | 10 ----------
>  1 file changed, 10 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i915/intel_chipset.h b/src/mesa/drivers/dri/i915/intel_chipset.h
> index 3828085..d05fd08 100644
> --- a/src/mesa/drivers/dri/i915/intel_chipset.h
> +++ b/src/mesa/drivers/dri/i915/intel_chipset.h
> @@ -53,16 +53,6 @@
>  #define IS_PNVG(devid)    (devid == PCI_CHIP_PNV_G)
>  #define IS_PNV(devid)     (IS_PNVG(devid) || IS_PNVGM(devid))
>
> -#define IS_MOBILE(devid)  (devid == PCI_CHIP_I855_GM || \
> -                           devid == PCI_CHIP_I915_GM || \
> -                           devid == PCI_CHIP_I945_GM || \
> -                           devid == PCI_CHIP_I945_GME || \
> -                           devid == PCI_CHIP_I965_GM || \
> -                           devid == PCI_CHIP_I965_GME || \
> -                           devid == PCI_CHIP_GM45_GM || \
> -                           IS_PNV(devid) || \
> -                           devid == PCI_CHIP_ILM_G)
> -
>  #define IS_915(devid)     (devid == PCI_CHIP_I915_G || \
>                             devid == PCI_CHIP_E7221_G || \
>                             devid == PCI_CHIP_I915_GM)
> --
> 2.1.0
Re: [Mesa-dev] [PATCH 09/13] i965: Pass the number of components as a source of the untyped surface read opcode.
On Fri, Feb 27, 2015 at 05:34:52PM +0200, Francisco Jerez wrote:
> ---
>  src/mesa/drivers/dri/i965/brw_fs_generator.cpp   | 5 +++--
>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp     | 2 +-
>  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 6 ++++--
>  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp   | 3 ++-
>  4 files changed, 10 insertions(+), 6 deletions(-)

Reviewed-by: Topi Pohjolainen topi.pohjolai...@intel.com
Re: [Mesa-dev] [PATCH] i965/nir: Resolve source modifiers on Gen8+ logic operations.
On Fri, Mar 06, 2015 at 01:33:05AM -0800, Kenneth Graunke wrote:
> On Gen8+, AND/OR/XOR/NOT don't support the abs() source modifier, and
> negate changes meaning to bitwise-not (~, not -).  This isn't what NIR
> expects, so we should resolve the source modifiers via a MOV.
>
> +30 Piglits (fs-op-bit{and,or,xor}-not-abs-*).
>
> Signed-off-by: Kenneth Graunke kenn...@whitecape.org
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp     | 11 +++++++++++
>  src/mesa/drivers/dri/i965/brw_fs.h       |  1 +
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 15 +++++++++++++++
>  3 files changed, 27 insertions(+)

Reviewed-by: Topi Pohjolainen topi.pohjolai...@intel.com
Re: [Mesa-dev] [PATCH 01/13] i965: Factor out logic to build a send message instruction with indirect descriptor.
Pohjolainen, Topi topi.pohjolai...@intel.com writes: On Fri, Mar 06, 2015 at 10:37:06AM +0200, Pohjolainen, Topi wrote: On Fri, Feb 27, 2015 at 05:34:44PM +0200, Francisco Jerez wrote: [..] +/** + * Send message to shared unit \p sfid with a possibly indirect descriptor \p + * desc. If the descriptor is not an immediate it will be transparently + * loaded to an address register using an OR instruction that will be returned + * to the caller so additional descriptor bits can be specified with the usual + * brw_set_*_message() helper functions. + */ Right, you exploit this in patch number five. I exploit it in the generator code for all typed and untyped surface opcodes, and was considering to take more advantage from it in the texturing and pull constant load opcode -- it will likely reduce the amount of code required to emit such messages to less than one third. I think at least this comment is misleading as it doesn't say anything about the returned instruction in case the given descriptor is an immediate. I've edited the comment locally, hopefully it's more obvious now: | /** | * Send message to shared unit \p sfid with a possibly indirect descriptor \p | * desc. If \p desc is not an immediate it will be transparently loaded to an | * address register using an OR instruction. The returned instruction can be | * passed as argument to the usual brw_set_*_message() functions in order to | * specify any additional descriptor bits -- If \p desc is an immediate this | * will be the SEND instruction itself, otherwise it will be the OR | * instruction. | */ All in all I'm not too happy about the return value having such differing semantics depending on the given descriptor type. The point is that all the semantics that callers of this function care about is being able to set descriptor control bits on the returned instruction, they don't care about the actual SEND instruction itself. 
Consider fs_generator::generate_varying_pull_constant_load_gen7: | if (index.file == BRW_IMMEDIATE_VALUE) { | | uint32_t surf_index = index.dw1.ud; | | brw_inst *send = brw_next_insn(p, BRW_OPCODE_SEND); | brw_set_dest(p, send, retype(dst, BRW_REGISTER_TYPE_UW)); | brw_set_src0(p, send, offset); | brw_set_sampler_message(p, send, | surf_index, | 0, /* LD message ignores sampler unit */ | GEN5_SAMPLER_MESSAGE_SAMPLE_LD, | rlen, | mlen, | false, /* no header */ | simd_mode, | 0); | | brw_mark_surface_used(prog_data, surf_index); | | } else { | | struct brw_reg addr = vec1(retype(brw_address_reg(0), BRW_REGISTER_TYPE_UD)); | | brw_push_insn_state(p); | brw_set_default_mask_control(p, BRW_MASK_DISABLE); | brw_set_default_access_mode(p, BRW_ALIGN_1); | | /* a0.0 = surf_index 0xff */ | brw_inst *insn_and = brw_next_insn(p, BRW_OPCODE_AND); | brw_inst_set_exec_size(p-brw, insn_and, BRW_EXECUTE_1); | brw_set_dest(p, insn_and, addr); | brw_set_src0(p, insn_and, vec1(retype(index, BRW_REGISTER_TYPE_UD))); | brw_set_src1(p, insn_and, brw_imm_ud(0x0ff)); | | | /* a0.0 |= descriptor */ | brw_inst *insn_or = brw_next_insn(p, BRW_OPCODE_OR); | brw_set_sampler_message(p, insn_or, | 0 /* surface */, | 0 /* sampler */, | GEN5_SAMPLER_MESSAGE_SAMPLE_LD, | rlen /* rlen */, | mlen /* mlen */, | false /* header */, | simd_mode, | 0); | brw_inst_set_exec_size(p-brw, insn_or, BRW_EXECUTE_1); | brw_inst_set_src1_reg_type(p-brw, insn_or, BRW_REGISTER_TYPE_UD); | brw_set_src0(p, insn_or, addr); | brw_set_dest(p, insn_or, addr); | | | /* dst = send(offset, a0.0) */ | brw_inst *insn_send = brw_next_insn(p, BRW_OPCODE_SEND); | brw_set_dest(p, insn_send, retype(dst, BRW_REGISTER_TYPE_UW)); | brw_set_src0(p, insn_send, offset); | brw_set_indirect_send_descriptor(p, insn_send, BRW_SFID_SAMPLER, addr); | | brw_pop_insn_state(p); | | /* visitor knows more than we do about the surface limit required, | * so has already done marking. 
| */ | } This allows us to simplify the whole snippet into: | brw_inst *insn = brw_send_indirect_surface_message( | p, BRW_SFID_SAMPLER, dst, offset, index, mlen, rlen, false); | brw_set_sampler_message(p, insn, | 0 /* surface */, | 0 /* sampler */, | GEN5_SAMPLER_MESSAGE_SAMPLE_LD, | 0 /* rlen */, | 0 /* mlen */, |
Re: [Mesa-dev] [PATCH 10/13] i965: Reorder sources of the untyped atomic opcode.
On Fri, Feb 27, 2015 at 05:34:53PM +0200, Francisco Jerez wrote:
> This is consistent with the untyped surface read opcode.  From now on
> all typed and untyped surface access opcodes will follow the same
> pattern: src[0] will be the message payload, src[1] will be the
> surface index and src[2] will be a control immediate (atomic operation
> for atomic opcodes and number of vector components for surface read
> and write opcodes).
> ---
>  src/mesa/drivers/dri/i965/brw_fs_generator.cpp   | 4 ++--
>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp     | 2 +-
>  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 4 ++--
>  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp   | 2 +-
>  4 files changed, 6 insertions(+), 6 deletions(-)

Reviewed-by: Topi Pohjolainen topi.pohjolai...@intel.com
Re: [Mesa-dev] [PATCH 08/13] i965/vec4: Add support for untyped surface message sends from GRF.
On Fri, Feb 27, 2015 at 05:34:51PM +0200, Francisco Jerez wrote: This doesn't actually enable untyped surface message sends from GRF yet, the upcoming atomic counter and image intrinsic lowering code will. --- src/mesa/drivers/dri/i965/brw_vec4.cpp | 7 --- src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 16 +++- src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 5 +++-- 3 files changed, 14 insertions(+), 14 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index e19..0004b10 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp @@ -256,6 +256,8 @@ vec4_instruction::is_send_from_grf() switch (opcode) { case SHADER_OPCODE_SHADER_TIME_ADD: case VS_OPCODE_PULL_CONSTANT_LOAD_GEN7: + case SHADER_OPCODE_UNTYPED_ATOMIC: + case SHADER_OPCODE_UNTYPED_SURFACE_READ: return true; default: return false; @@ -270,6 +272,8 @@ vec4_instruction::regs_read(unsigned arg) const switch (opcode) { case SHADER_OPCODE_SHADER_TIME_ADD: + case SHADER_OPCODE_UNTYPED_ATOMIC: + case SHADER_OPCODE_UNTYPED_SURFACE_READ: return arg == 0 ? mlen : 1; Before the logic always falled back to returning one. Now we may return one, two or three I think. I may be mistaken though, I'm just reading vec4_visitor::emit_untyped_atomic() and it can produce message lengths up to three. Does this effect the instruction scheduling logic and if not, can you explain why not? case VS_OPCODE_PULL_CONSTANT_LOAD_GEN7: @@ -347,9 +351,6 @@ vec4_visitor::implied_mrf_writes(vec4_instruction *inst) case SHADER_OPCODE_TG4: case SHADER_OPCODE_TG4_OFFSET: return inst-header_present ? 
1 : 0; - case SHADER_OPCODE_UNTYPED_ATOMIC: - case SHADER_OPCODE_UNTYPED_SURFACE_READ: - return 0; default: unreachable(not reached); } diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp index 22fdd63..ef0cde9 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp @@ -1459,19 +1459,17 @@ vec4_generator::generate_code(const cfg_t *cfg) break; case SHADER_OPCODE_UNTYPED_ATOMIC: - assert(src[0].file == BRW_IMMEDIATE_VALUE -src[1].file == BRW_IMMEDIATE_VALUE); - brw_untyped_atomic(p, dst, brw_message_reg(inst-base_mrf), -src[1], src[0].dw1.ud, inst-mlen, + assert(src[1].file == BRW_IMMEDIATE_VALUE +src[2].file == BRW_IMMEDIATE_VALUE); + brw_untyped_atomic(p, dst, src[0], src[2], src[1].dw1.ud, inst-mlen, !inst-dst.is_null()); - brw_mark_surface_used(prog_data-base, src[1].dw1.ud); + brw_mark_surface_used(prog_data-base, src[2].dw1.ud); break; case SHADER_OPCODE_UNTYPED_SURFACE_READ: - assert(src[0].file == BRW_IMMEDIATE_VALUE); - brw_untyped_surface_read(p, dst, brw_message_reg(inst-base_mrf), - src[0], inst-mlen, 1); - brw_mark_surface_used(prog_data-base, src[0].dw1.ud); + assert(src[1].file == BRW_IMMEDIATE_VALUE); + brw_untyped_surface_read(p, dst, src[0], src[1], inst-mlen, 1); + brw_mark_surface_used(prog_data-base, src[1].dw1.ud); break; case SHADER_OPCODE_FIND_LIVE_CHANNEL: diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp index f25bff9..b8cfe8f 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp @@ -2953,6 +2953,7 @@ vec4_visitor::emit_untyped_atomic(unsigned atomic_op, unsigned surf_index, * unused channels will be masked out. 
*/ vec4_instruction *inst = emit(SHADER_OPCODE_UNTYPED_ATOMIC, dst, + brw_message_reg(0), src_reg(atomic_op), src_reg(surf_index)); inst-base_mrf = 0; inst-mlen = mlen; @@ -2969,8 +2970,8 @@ vec4_visitor::emit_untyped_surface_read(unsigned surf_index, dst_reg dst, * untyped surface read message, but that's OK because unused * channels will be masked out. */ - vec4_instruction *inst = emit(SHADER_OPCODE_UNTYPED_SURFACE_READ, - dst, src_reg(surf_index)); + vec4_instruction *inst = emit(SHADER_OPCODE_UNTYPED_SURFACE_READ, dst, + brw_message_reg(0), src_reg(surf_index)); inst-base_mrf = 0; inst-mlen = 1; } -- 2.1.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev
Re: [Mesa-dev] [PATCH 01/13] i965: Factor out logic to build a send message instruction with indirect descriptor.
On Fri, Mar 06, 2015 at 10:37:06AM +0200, Pohjolainen, Topi wrote: On Fri, Feb 27, 2015 at 05:34:44PM +0200, Francisco Jerez wrote: --- src/mesa/drivers/dri/i965/brw_eu.h | 19 ++-- src/mesa/drivers/dri/i965/brw_eu_emit.c | 58 ++-- src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 55 +- src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 37 --- 4 files changed, 77 insertions(+), 92 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_eu.h b/src/mesa/drivers/dri/i965/brw_eu.h index 1b954c8..9b1e0e2 100644 --- a/src/mesa/drivers/dri/i965/brw_eu.h +++ b/src/mesa/drivers/dri/i965/brw_eu.h @@ -205,11 +205,6 @@ void brw_set_sampler_message(struct brw_compile *p, unsigned simd_mode, unsigned return_format); -void brw_set_indirect_send_descriptor(struct brw_compile *p, - brw_inst *insn, - unsigned sfid, - struct brw_reg descriptor); - void brw_set_dp_read_message(struct brw_compile *p, brw_inst *insn, unsigned binding_table_index, @@ -243,6 +238,20 @@ void brw_urb_WRITE(struct brw_compile *p, unsigned offset, unsigned swizzle); +/** + * Send message to shared unit \p sfid with a possibly indirect descriptor \p + * desc. If the descriptor is not an immediate it will be transparently + * loaded to an address register using an OR instruction that will be returned + * to the caller so additional descriptor bits can be specified with the usual + * brw_set_*_message() helper functions. + */ Right, you exploit this in patch number five. I think at least this comment is misleading as it doesn't say anything about the returned instruction in case the given descriptor is an immediate. All in all I'm not too happy about the return value having such differing semantics depending on the given descriptor type. 
+struct brw_inst * +brw_send_indirect_message(struct brw_compile *p, + unsigned sfid, + struct brw_reg dst, + struct brw_reg payload, + struct brw_reg desc); + void brw_ff_sync(struct brw_compile *p, struct brw_reg dest, unsigned msg_reg_nr, diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c index e69840a..cd2ce92 100644 --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c @@ -751,21 +751,6 @@ brw_set_sampler_message(struct brw_compile *p, } } -void brw_set_indirect_send_descriptor(struct brw_compile *p, - brw_inst *insn, - unsigned sfid, - struct brw_reg descriptor) -{ - /* Only a0.0 may be used as SEND's descriptor operand. */ - assert(descriptor.file == BRW_ARCHITECTURE_REGISTER_FILE); - assert(descriptor.type == BRW_REGISTER_TYPE_UD); - assert(descriptor.nr == BRW_ARF_ADDRESS); - assert(descriptor.subnr == 0); - - brw_set_message_descriptor(p, insn, sfid, 0, 0, false, false); - brw_set_src1(p, insn, descriptor); -} - static void gen7_set_dp_scratch_message(struct brw_compile *p, brw_inst *inst, @@ -2490,6 +2475,49 @@ void brw_urb_WRITE(struct brw_compile *p, swizzle); } +struct brw_inst * +brw_send_indirect_message(struct brw_compile *p, + unsigned sfid, + struct brw_reg dst, + struct brw_reg payload, + struct brw_reg desc) +{ + const struct brw_context *brw = p-brw; + struct brw_inst *send, *setup; + + assert(desc.type == BRW_REGISTER_TYPE_UD); + + if (desc.file == BRW_IMMEDIATE_VALUE) { + setup = send = next_insn(p, BRW_OPCODE_SEND); If I'm reading this correctly, all the callers in this patch use 'desc' of type other than BRW_IMMEDIATE_VALUE. Hence returning the actual send-instruction as the descriptor instuction is not needed by any of the logic modified in this patch. Do we really need to do this or could we just return NULL since in this case there really isn't any OR-instruction setting the descriptor bits? 
(Your documentation above says that the returned instruction is an OR setting the descriptor. Returning the SEND instead is not the same really). ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
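The return-value convention debated in this thread can be modelled in a few lines of C. This is a toy sketch only: the toy_* types and functions are hypothetical and far simpler than the real brw_compile/brw_inst code. It assumes, following Francisco's revised comment, that the helper hands back the SEND itself for an immediate descriptor and the OR that loads the address register otherwise, so the caller can set extra descriptor bits on whatever it gets back.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum toy_opcode { TOY_OP_OR, TOY_OP_SEND };

struct toy_inst {
   enum toy_opcode opcode;
   uint32_t desc_bits;   /* descriptor control bits carried by this insn */
};

struct toy_program {
   struct toy_inst insts[8];
   size_t count;
};

/* Append one instruction to the (fixed-size) toy program. */
static struct toy_inst *
toy_next_insn(struct toy_program *p, enum toy_opcode opcode)
{
   struct toy_inst *inst;

   assert(p->count < sizeof(p->insts) / sizeof(p->insts[0]));
   inst = &p->insts[p->count++];
   inst->opcode = opcode;
   inst->desc_bits = 0;
   return inst;
}

/* Returns the instruction that carries the descriptor bits: the SEND for an
 * immediate descriptor, the OR feeding the address register otherwise.
 * imm_desc is only used in the immediate case. */
static struct toy_inst *
toy_send_indirect_message(struct toy_program *p, bool desc_is_immediate,
                          uint32_t imm_desc)
{
   struct toy_inst *setup;

   if (desc_is_immediate) {
      setup = toy_next_insn(p, TOY_OP_SEND);
      setup->desc_bits = imm_desc;
   } else {
      setup = toy_next_insn(p, TOY_OP_OR);   /* a0.0 = desc | extra bits */
      toy_next_insn(p, TOY_OP_SEND);         /* send using a0.0 */
   }

   return setup;
}

/* Caller side: works the same whichever instruction was returned. */
static void
toy_set_message_bits(struct toy_inst *inst, uint32_t bits)
{
   inst->desc_bits |= bits;
}
```

The caller never inspects the opcode of the returned instruction, which is the point Francisco makes; returning NULL for the immediate case, as Topi suggests, would instead force every caller to branch.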
[Mesa-dev] [PATCH] i965/nir: Resolve source modifiers on Gen8+ logic operations.
On Gen8+, AND/OR/XOR/NOT don't support the abs() source modifier, and
negate changes meaning to bitwise-not (~, not -).  This isn't what NIR
expects, so we should resolve the source modifiers via a MOV.

+30 Piglits (fs-op-bit{and,or,xor}-not-abs-*).

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/brw_fs.cpp     | 11 +++++++++++
 src/mesa/drivers/dri/i965/brw_fs.h       |  1 +
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 15 +++++++++++++++
 3 files changed, 27 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index d6acc23..428234f 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -1561,6 +1561,17 @@ fs_visitor::emit_sampleid_setup()
    return reg;
 }
 
+void
+fs_visitor::resolve_source_modifiers(fs_reg *src)
+{
+   if (!src->abs && !src->negate)
+      return;
+
+   fs_reg temp = retype(vgrf(1), src->type);
+   emit(MOV(temp, *src));
+   *src = temp;
+}
+
 fs_reg
 fs_visitor::fix_math_operand(fs_reg src)
 {
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
index 70098d8..ec77962 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -299,6 +299,7 @@ public:
                              int texunit);
    fs_reg emit_mcs_fetch(fs_reg coordinate, int components, fs_reg sampler);
    void emit_gen6_gather_wa(uint8_t wa, fs_reg dst);
+   void resolve_source_modifiers(fs_reg *src);
    fs_reg fix_math_operand(fs_reg src);
    fs_inst *emit_math(enum opcode op, fs_reg dst, fs_reg src0);
    fs_inst *emit_math(enum opcode op, fs_reg dst, fs_reg src0, fs_reg src1);
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 66f7918..a0300aa 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -935,15 +935,30 @@ fs_visitor::nir_emit_alu(nir_alu_instr *instr)
       break;
 
    case nir_op_inot:
+      if (brw->gen >= 8) {
+         resolve_source_modifiers(&op[0]);
+      }
       emit(NOT(result, op[0]));
       break;
    case nir_op_ixor:
+      if (brw->gen >= 8) {
+         resolve_source_modifiers(&op[0]);
+         resolve_source_modifiers(&op[1]);
+      }
       emit(XOR(result, op[0], op[1]));
       break;
    case nir_op_ior:
+      if (brw->gen >= 8) {
+         resolve_source_modifiers(&op[0]);
+         resolve_source_modifiers(&op[1]);
+      }
       emit(OR(result, op[0], op[1]));
       break;
    case nir_op_iand:
+      if (brw->gen >= 8) {
+         resolve_source_modifiers(&op[0]);
+         resolve_source_modifiers(&op[1]);
+      }
       emit(AND(result, op[0], op[1]));
       break;
 
-- 
2.2.2
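To make the commit message concrete, here is a small C model of why the modifiers have to be resolved before a Gen8+ logic operation. It is a sketch under stated assumptions: the toy_* names are hypothetical, registers are modelled as plain 32-bit integers, and the only hardware behaviour modelled is the one described in the commit message (negate on a logic-op source means bitwise-not, abs is not available).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct toy_src {
   int32_t value;
   bool negate;
   bool abs;
};

/* What a MOV (or any arithmetic instruction) does with the modifiers. */
static int32_t
toy_apply_arith_modifiers(struct toy_src s)
{
   int32_t v = s.value;

   if (s.abs && v < 0)
      v = -v;
   if (s.negate)
      v = -v;
   return v;
}

/* What a Gen8+ logic instruction sees: negate is reinterpreted as '~'. */
static uint32_t
toy_gen8_logic_operand(struct toy_src s)
{
   uint32_t v = (uint32_t) s.value;

   assert(!s.abs);   /* abs is not supported on logic-op sources */
   return s.negate ? ~v : v;
}

/* Counterpart of resolve_source_modifiers(): "MOV temp, src", then use temp. */
static void
toy_resolve_source_modifiers(struct toy_src *s)
{
   if (!s->abs && !s->negate)
      return;

   s->value = toy_apply_arith_modifiers(*s);
   s->abs = false;
   s->negate = false;
}

static uint32_t
toy_gen8_and(struct toy_src a, struct toy_src b)
{
   return toy_gen8_logic_operand(a) & toy_gen8_logic_operand(b);
}
```

With a = -(5) and b = 0xff, the unresolved AND computes ~5 & 0xff = 0xfa, whereas NIR expects -5 & 0xff = 0xfb; resolving a first gives the expected value.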
Re: [Mesa-dev] [PATCH] i965: Fix URB size for CHV
On Thu, Mar 05, 2015 at 01:48:29PM -0800, Kenneth Graunke wrote:
> On Thursday, March 05, 2015 07:41:29 PM Ville Syrjälä wrote:
> > On Fri, Jan 23, 2015 at 12:12:56PM +0200, ville.syrj...@linux.intel.com wrote:
> > > From: Ville Syrjälä ville.syrj...@linux.intel.com
> > >
> > > Increase the device info .urb.size for CHV to match the default URB
> > > size (192kB).
> > >
> > > Signed-off-by: Ville Syrjälä ville.syrj...@linux.intel.com
> >
> > Ping?
>
> Oh, sorry!  I thought I'd reviewed this.  It does indeed appear to be
> 192kB.
>
> Reviewed-by: Kenneth Graunke kenn...@whitecape.org
>
> Have you tested it?  Assuming it doesn't explode, feel free to push this.
> Thanks for catching the mistake!

Yep, been running with this patch for ~6 months or so ;)

Pushed now. Thanks.

-- 
Ville Syrjälä
Intel OTC
Re: [Mesa-dev] [PATCH 04/13] i965: Mask out unused Align16 components in brw_untyped_atomic.
On Fri, Feb 27, 2015 at 05:34:47PM +0200, Francisco Jerez wrote:
> This is currently not a problem because the vec4 visitor happens to
> mask out unused components from the destination, but it might become
> an issue when we start using atomics without writeback message.  In
> any case it seems sensible to set it again here because the
> consequences of setting the wrong writemask (random graphics memory
> corruption) are difficult to debug and can easily go unnoticed.

I started wondering whether this should be an assertion here, with the
logic in the visitor forced to set the writemask correctly instead. I don't
have a strong opinion, just wondering aloud.

> ---
>  src/mesa/drivers/dri/i965/brw_eu_emit.c | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> index 2b1d6ff..0b655d4 100644
> --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> @@ -2799,16 +2799,25 @@ brw_untyped_atomic(struct brw_compile *p,
>                     bool response_expected)
>  {
>     const struct brw_context *brw = p->brw;
> +   const bool align1 = (brw_inst_access_mode(brw, p->current) == BRW_ALIGN_1);
> +   /* Mask out unused components -- This is especially important in Align16
> +    * mode on generations that don't have native support for SIMD4x2 atomics,
> +    * because unused but enabled components will cause the dataport to perform
> +    * additional atomic operations on the addresses that happen to be in the
> +    * uninitialized Y, Z and W coordinates of the payload.
> +    */
> +   const unsigned mask = (align1 ? WRITEMASK_XYZW : WRITEMASK_X);
>     brw_inst *insn = brw_next_insn(p, BRW_OPCODE_SEND);
>
> -   brw_set_dest(p, insn, retype(dest, BRW_REGISTER_TYPE_UD));
> +   brw_set_dest(p, insn, retype(brw_writemask(dest, mask),
> +                                BRW_REGISTER_TYPE_UD));
>     brw_set_src0(p, insn, retype(payload, BRW_REGISTER_TYPE_UD));
>     brw_set_src1(p, insn, brw_imm_d(0));
>     brw_set_dp_untyped_atomic_message(
>        p, insn, atomic_op, bind_table_index, msg_length,
>        brw_surface_payload_size(p, response_expected,
>                                 brw->gen >= 8 || brw->is_haswell, true),
> -      brw_inst_access_mode(brw, insn) == BRW_ALIGN_1);
> +      align1);
>  }
>
>  static void
> --
> 2.1.3
Re: [Mesa-dev] [PATCH 07/13] i965: Don't request untyped atomic writeback message if the destination is null.
On Fri, Feb 27, 2015 at 05:34:50PM +0200, Francisco Jerez wrote:
> ---
>  src/mesa/drivers/dri/i965/brw_fs_generator.cpp   | 2 +-
>  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 3 ++-
>  2 files changed, 3 insertions(+), 2 deletions(-)

Reviewed-by: Topi Pohjolainen topi.pohjolai...@intel.com
Re: [Mesa-dev] [PATCH] i965/fs: Implement SIMD16 dual source blending.
On Thursday, March 05, 2015 09:39:58 PM Jason Ekstrand wrote:
> This looks fine to me.  I just kicked off a build on our test farm and,
> assuming that looks good (I'll send another e-mail in the morning if it
> does),
>
> Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com
>
> I ran shader-db on the change and I was kind of surprised to see that it
> doesn't really do anything.
>
> GAINED: shaders/dolphin/smg.1.shader_test FS SIMD16
>
> total instructions in shared programs: 5769629 -> 5769629 (0.00%)
> instructions in affected programs:     0 -> 0
> helped:                                0
> HURT:                                  0
> GAINED:                                1
> LOST:                                  0
>
> Perhaps shader-db doesn't account for some other GL state required for
> dual-source because I doubt only one shader uses it.  Ken?
>
> --Jason

That would be dolphin/smg.1.shader_test - the one lonely shader that uses
layout qualifiers to specify the dual source color output index:

    layout(location = 0, index = 1) out vec4 ocol1;

Other applications (such as Unigine) most likely call
glBindFragDataLocationIndexed to assign the location and index.
Unfortunately, shader-db doesn't capture this, as it's tied to API calls,
and not part of the shader itself.

Eric's new shader-db-2 project that uses apitrace would catch this (but at
a large cost).

We could probably capture this somehow - add some kind of annotations to
the file with the locations/indexes of each shader input/output, then make
the API calls after compiling the shader...relink to make them take effect,
which would also cause a new precompile, then replace the original
results...seems like a pain, but probably doable...
Re: [Mesa-dev] [PATCH] i915: Fix GCC unused-variable warning in release build.
Reviewed-by: Timothy Arceri t_arc...@yahoo.com.au

On Tue, 2015-03-03 at 18:57 -0800, Vinson Lee wrote:
> i915_debug_fp.c: In function ‘i915_disassemble_program’:
> i915_debug_fp.c:302:11: warning: unused variable ‘size’ [-Wunused-variable]
>    GLuint size = program[0] & 0x1ff;
>           ^
>
> Signed-off-by: Vinson Lee v...@freedesktop.org
> ---
>  src/mesa/drivers/dri/i915/i915_debug_fp.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i915/i915_debug_fp.c b/src/mesa/drivers/dri/i915/i915_debug_fp.c
> index 9b4bc76..3f09902 100644
> --- a/src/mesa/drivers/dri/i915/i915_debug_fp.c
> +++ b/src/mesa/drivers/dri/i915/i915_debug_fp.c
> @@ -299,12 +299,11 @@ print_dcl_op(GLuint opcode, const GLuint * program)
>  void
>  i915_disassemble_program(const GLuint * program, GLuint sz)
>  {
> -   GLuint size = program[0] & 0x1ff;
>     GLint i;
>
>     printf("\t\tBEGIN\n");
>
> -   assert(size + 2 == sz);
> +   assert(program[0] & 0x1ff + 2 == sz);
>
>     program++;
>     for (i = 1; i < sz; i += 3, program += 3) {
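A note on the replacement assert in this patch: the archive appears to have dropped the `&` operator. If the new line is read as `program[0] & 0x1ff + 2 == sz`, C precedence groups it as `program[0] & ((0x1ff + 2) == sz)`, which is not the test the removed `size + 2 == sz` performed. The sketch below only illustrates that grouping; the helper names and the values used with them are hypothetical, and it assumes the reading of the garbled line given here rather than the exact text of the patch.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration; not part of i915_debug_fp.c. */

/* The check the removed code performed: size = header & 0x1ff. */
static int
check_with_parens(uint32_t header, uint32_t sz)
{
   return (header & 0x1ff) + 2 == sz;
}

/* How C groups `header & 0x1ff + 2 == sz`: both `+` and `==` bind more
 * tightly than `&`, so the mask is applied to the result of the comparison. */
static int
check_without_parens(uint32_t header, uint32_t sz)
{
   return header & ((0x1ff + 2) == sz);
}
```

With `header = 0x10` and `sz = 0x12` the parenthesized check holds while the unparenthesized one does not, so an assert written without the parentheses could fire on valid input.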
Re: [Mesa-dev] mesa-10.4.4: gallivm/lp_bld_misc.cpp:503:38: error: no viable conversion from 'ShaderMemoryManager *' to 'std::unique_ptr<RTDyldMemoryManager>'
On 4 March 2015 at 18:07, Roland Scheidegger srol...@vmware.com wrote: Am 04.03.2015 um 12:38 schrieb Jose Fonseca: On 04/03/15 02:00, Emil Velikov wrote: On 27 February 2015 at 23:28, Sedat Dilek sedat.di...@gmail.com wrote: On Mon, Feb 9, 2015 at 6:30 PM, Emil Velikov emil.l.veli...@gmail.com wrote: On 07/02/15 21:44, Sedat Dilek wrote: Hi, I was building mesa v10.4.4 with my llvm-toolchain v3.6.0rc2. My build breaks like this... ... Please cherry-pick... commit ef7e0b39a24966526b102643523feac765771842 gallivm: Update for RTDyldMemoryManager becoming an unique_ptr. ..for mesa 10.4 Git branch. Hi Sedat, Picking a fix in a stable branch against a non-final release sounds like a no-go in our books. As the official llvm 3.6 rolls out we'll pick this fix for the stable branches - until then I would recommend (a) applying it locally or (b) using mesa from the 10.5 or master branch. Just FYI... [LLVMdev] LLVM 3.6 Release (see [1]). Please pick this patch for-10.4, thanks. As promised, mesa 10.4.6 will feature this. But is cross-porting this patch enough? As I said when this first issue was raised fixing the build with LLVM 3.6 is just half of the problem. It must also _run_ correctly. And building correctly doesn't necessarily means it will run correctly. That is, unless somebody actually ensures that all LLVM 3.6 related fixes have been crossported and that things run correctly, it is misleading to enable the build of Mesa 10.4.6 with LLVM 3.6. I don't know about radeon drivers, but at least from llvmpipe POV I simply don't have the time to do this (go through every LLVM 3.6 related patch, ensure they are all in 10.4.6, and test). I quickly went through the diffs between 10.4 branch, and found one such commit is missing: http://cgit.freedesktop.org/mesa/mesa/commit/?id=74f505fa73eda0c9b5b1984bebb44cedac8e8794 https://bugs.freedesktop.org/show_bug.cgi?id=85467 But there might be more, and I don't know if crossporting this is safe or not. 
Therefore my stance is that building Mesa stable releases with LLVM releases made after the Mesa release was branched is still unsupported. If people want to do so, they do so at their own peril. And any incoming bugs will be unsupported, use Mesa. If having a Mesa release capable of building with LLVM 3.6 is so important, I think it might be easier/safer to just make a new release from a recent enough commit than to try to backport it. This is quite right, the above commit is a must if you want to build with llvm 3.6. I am quite sure the crossport should be safe (it missed the branch point of 10.4 just narrowly), and I don't think there are any other patches missing, but no guarantees... I think it is sort of unfortunate that the latest mesa release wouldn't run with the latest llvm release, but the fact remains that without testing this all sounds a bit risky... Thanks for the input gents. So the input we've got so far is that no-one is testing llvm 3.6 with mesa 10.4. I would love to give it a spin, yet Arch Linux doesn't have llvm 3.6. There is also the double-free bug mentioned in https://bugs.freedesktop.org/show_bug.cgi?id=89387 All that said, Sedat, I will revert the commit and release 10.4.6 without it. On the positive side, mesa 10.5.0 is coming out later on today, which should work like a charm with llvm 3.6. Cheers Emil
Re: [Mesa-dev] [PATCH 1/2] clover: Return the minimum required value for CL_DEVICE_SINGLE_FP_CONFIG v2
On Fri, 2015-03-06 at 15:53 +, Tom Stellard wrote: This means dropping CL_FP_DENORM from the current return value. v2: - Add comments about minimum values for OpenCL 1.2. --- src/gallium/state_trackers/clover/api/device.cpp | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/src/gallium/state_trackers/clover/api/device.cpp b/src/gallium/state_trackers/clover/api/device.cpp index b1f556f..b79997f 100644 --- a/src/gallium/state_trackers/clover/api/device.cpp +++ b/src/gallium/state_trackers/clover/api/device.cpp @@ -201,8 +201,11 @@ clGetDeviceInfo(cl_device_id d_dev, cl_device_info param, break; case CL_DEVICE_SINGLE_FP_CONFIG: + // This is the mandated minimum single precision floating-point + // capability for OpenCL 1.1. In OpenCL 1.2, CL_FP_INF_NAN + // is no longer required and nothing is required for custom devices. I think Francisco's information was not entirely correct. OpenCL 1.0, 1.1 minimum is INF_NAN | RTN for full profile (page 36, and 40, respectively) OpenCL 1.2, 2.0 minimum is INF_NAN | RTN if not TYPE_CUSTOM for full profile (pages 42, and 67, respectively) OpenCL 1.0, 1.1, embedded profile minimum is RTZ or RTN (pages 298, and 363) OpenCL 1.2 2.0 embedded profile minimum is RTZ or RTN if not TYPE_CUSTOM (pages 352, and 262) sorry I did not catch the email yesterday. jan buf.as_scalarcl_device_fp_config() = - CL_FP_DENORM | CL_FP_INF_NAN | CL_FP_ROUND_TO_NEAREST; + CL_FP_INF_NAN | CL_FP_ROUND_TO_NEAREST; break; case CL_DEVICE_DOUBLE_FP_CONFIG: -- Jan Vesely jan.ves...@rutgers.edu signature.asc Description: This is a digitally signed message part ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] clover: Return the minimum required value for CL_DEVICE_SINGLE_FP_CONFIG v2
Jan Vesely jan.ves...@rutgers.edu writes: On Fri, 2015-03-06 at 15:53 +, Tom Stellard wrote: This means dropping CL_FP_DENORM from the current return value. v2: - Add comments about minimum values for OpenCL 1.2. --- src/gallium/state_trackers/clover/api/device.cpp | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/src/gallium/state_trackers/clover/api/device.cpp b/src/gallium/state_trackers/clover/api/device.cpp index b1f556f..b79997f 100644 --- a/src/gallium/state_trackers/clover/api/device.cpp +++ b/src/gallium/state_trackers/clover/api/device.cpp @@ -201,8 +201,11 @@ clGetDeviceInfo(cl_device_id d_dev, cl_device_info param, break; case CL_DEVICE_SINGLE_FP_CONFIG: + // This is the mandated minimum single precision floating-point + // capability for OpenCL 1.1. In OpenCL 1.2, CL_FP_INF_NAN + // is no longer required and nothing is required for custom devices. I think Francisco's information was not entirely correct. OpenCL 1.0, 1.1 minimum is INF_NAN | RTN for full profile (page 36, and 40, respectively) OpenCL 1.2, 2.0 minimum is INF_NAN | RTN if not TYPE_CUSTOM for full profile (pages 42, and 67, respectively) OpenCL 1.0, 1.1, embedded profile minimum is RTZ or RTN (pages 298, and 363) OpenCL 1.2 2.0 embedded profile minimum is RTZ or RTN if not TYPE_CUSTOM (pages 352, and 262) sorry I did not catch the email yesterday. Ah, you're right, I ended up looking at the embedded profile by accident. With the CL_FP_INF_NAN sentence left out this patch is: Reviewed-by: Francisco Jerez curroje...@riseup.net jan buf.as_scalarcl_device_fp_config() = - CL_FP_DENORM | CL_FP_INF_NAN | CL_FP_ROUND_TO_NEAREST; + CL_FP_INF_NAN | CL_FP_ROUND_TO_NEAREST; break; case CL_DEVICE_DOUBLE_FP_CONFIG: -- Jan Vesely jan.ves...@rutgers.edu signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
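The minimums Jan lists above can be written down as a small predicate. This is only a sketch of that summary, not clover code: the `SKETCH_FP_*` constants are local stand-ins named after the `cl_device_fp_config` bits (real code would take them from the OpenCL headers), and the function name and its parameters are hypothetical. The `custom_exempt` flag is meant to stand for "OpenCL 1.2 or later and the device is of the custom type".

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins named after the cl_device_fp_config bits; assumptions
 * for this sketch only. */
enum {
   SKETCH_FP_DENORM           = 1 << 0,
   SKETCH_FP_INF_NAN          = 1 << 1,
   SKETCH_FP_ROUND_TO_NEAREST = 1 << 2,
   SKETCH_FP_ROUND_TO_ZERO    = 1 << 3
};

/* Returns true if `config` meets the single-precision minimum as summarized
 * in the review above:
 *   full profile:     INF_NAN and ROUND_TO_NEAREST are both required
 *   embedded profile: ROUND_TO_ZERO or ROUND_TO_NEAREST is required
 *   custom_exempt:    nothing is required */
static bool
sketch_meets_single_fp_minimum(uint32_t config, bool full_profile,
                               bool custom_exempt)
{
   const uint32_t full_min = SKETCH_FP_INF_NAN | SKETCH_FP_ROUND_TO_NEAREST;
   const uint32_t embedded_any =
      SKETCH_FP_ROUND_TO_ZERO | SKETCH_FP_ROUND_TO_NEAREST;

   if (custom_exempt)
      return true;
   if (full_profile)
      return (config & full_min) == full_min;
   return (config & embedded_any) != 0;
}
```

Under this reading, the value the patch returns (`INF_NAN | ROUND_TO_NEAREST`, with `DENORM` dropped) is exactly the full-profile minimum.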
Re: [Mesa-dev] [PATCH 1/5] egl/main: use c11/threads' mutex directly
On Fri, Mar 6, 2015 at 10:14 AM, Emil Velikov emil.l.veli...@gmail.com wrote: On 06/03/15 17:05, Emil Velikov wrote: Hi all, Just accidentally pushed the series to master. I'll revert them in a second. All done. Apologies for the noise. Series looks OK to me. You can re-push with Reviewed-by: Brian Paul bri...@vmware.com
Re: [Mesa-dev] [PATCH 1/6] c11: add c11 compatibility wrapper around stdlib.h
On Fri, Mar 6, 2015 at 9:32 AM, Emil Velikov emil.l.veli...@gmail.com wrote: Used for aligned_alloc and other C11 functions missing from the header. Signed-off-by: Emil Velikov emil.l.veli...@gmail.com --- include/c11_stdlib.h | 118 ++ I wonder if this should be include/c11/stdlib.h instead. I also wonder if I should have put c99_math.h in c99/math.h Jose followed my pattern with c99_alloca.h We should probably be more consistent about this. What do you think? + 1 file changed, 118 insertions(+) create mode 100644 include/c11_stdlib.h diff --git a/include/c11_stdlib.h b/include/c11_stdlib.h new file mode 100644 index 000..04e494f --- /dev/null +++ b/include/c11_stdlib.h @@ -0,0 +1,118 @@ +/* + * Mesa 3-D graphics library + * + * Copyright (C) 1999-2007 Brian Paul All Rights Reserved. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included + * in all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS + * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. + */ + +/** + * Wrapper for stdlib.h which makes sure we have definitions of all the c11 + * functions. 
+ */ + +#ifndef _C11_STDLIB_H_ +#define _C11_STDLIB_H_ + +#include stdint.h I stdint.h really needed here? Otherwise than the naming issue and the stdint.h question, the series looks good to me. Reviewed-by: Brian Paul bri...@vmware.com +#include stdlib.h +#include c99_compat.h + + +#if !defined(_ISOC11_SOURCE) __STDC_VERSION__ 201112L + +#if defined(_WIN32) !defined(__CYGWIN__) +#include malloc.h +#endif + +/** + * Allocate aligned memory. + * + * \param alignment alignment (must be greater than zero). + * \param size number of bytes to allocate. + * + * Allocates extra memory to accommodate rounding up the address for + * alignment and to record the real malloc address. + * + * \sa aligned_free(). + */ +static inline void * +aligned_alloc(size_t alignment, size_t size) +{ +#if defined(HAVE_POSIX_MEMALIGN) + void *mem; + int err = posix_memalign(mem, alignment, size); + if (err) + return NULL; + return mem; +#elif defined(_WIN32) !defined(__CYGWIN__) + return _aligned_malloc(size, alignment); +#else + uintptr_t ptr, buf; + + assert( alignment 0 ); + + ptr = (uintptr_t)malloc(size + alignment + sizeof(void *)); + if (!ptr) + return NULL; + + buf = (ptr + alignment + sizeof(void *)) ~(uintptr_t)(alignment - 1); + *(uintptr_t *)(buf - sizeof(void *)) = ptr; + +#ifdef DEBUG + /* mark the non-aligned area */ + while ( ptr buf - sizeof(void *) ) { + *(unsigned long *)ptr = 0xcdcdcdcd; + ptr += sizeof(unsigned long); + } +#endif + + return (void *) buf; +#endif /* defined(HAVE_POSIX_MEMALIGN) */ +} + +#endif /* C11 */ + +/** + * Free memory which was allocated with aligned_alloc(). + * + * \param ptr pointer to the memory to be freed. + * + * The actual address to free is stored in the word immediately before the + * address the client sees. + * Note that it is legal to pass NULL pointer to this function and will be + * handled accordingly. 
+ */ +static inline void +aligned_free(void *ptr) +{ +#if defined(HAVE_POSIX_MEMALIGN) + free(ptr); +#elif defined(_WIN32) !defined(__CYGWIN__) + _aligned_free(ptr); +#else + if (ptr) { + void **cubbyHole = (void **) ((char *) ptr - sizeof(void *)); + void *realAddr = *cubbyHole; + free(realAddr); + } +#endif /* defined(HAVE_POSIX_MEMALIGN) */ +} + +#endif /* #define _C11_STDLIB_H_ */ -- 2.1.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
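The generic fallback in the quoted patch (over-allocate, round the address up, and stash the real malloc pointer in the word just before the aligned address) is hard to follow in flattened form, so here is a minimal standalone sketch of the same idea. The `sketch_` names are hypothetical and this is not the Mesa header; unlike the quoted code, it rejects alignments that are not a power of two instead of asserting.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical names for illustration; not part of Mesa. */

/* Allocate `size` bytes aligned to `alignment` (a power of two).
 * Layout: [ padding | saved malloc pointer | aligned user memory ... ] */
static void *
sketch_aligned_alloc(size_t alignment, size_t size)
{
   if (alignment == 0 || (alignment & (alignment - 1)) != 0)
      return NULL;
   if (size > SIZE_MAX - alignment - sizeof(void *))
      return NULL;

   void *raw = malloc(size + alignment + sizeof(void *));
   if (!raw)
      return NULL;

   /* Leave room for the saved pointer, then round up to the alignment. */
   uintptr_t start = (uintptr_t)raw + sizeof(void *);
   uintptr_t aligned = (start + alignment - 1) & ~(uintptr_t)(alignment - 1);

   /* Record the real malloc address in the word before the user pointer. */
   ((void **)aligned)[-1] = raw;
   return (void *)aligned;
}

/* Free memory from sketch_aligned_alloc(); NULL is accepted. */
static void
sketch_aligned_free(void *ptr)
{
   if (ptr)
      free(((void **)ptr)[-1]);
}
```

The free function has to undo exactly the bookkeeping the allocator did, which is why the patch pairs `aligned_alloc()` with its own `aligned_free()` rather than plain `free()` on this path.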
Re: [Mesa-dev] [PATCH] clover: Return the minimum required value for CL_DEVICE_SINGLE_FP_CONFIG
On Mar 6, 2015, at 8:56 AM, Francisco Jerez curroje...@riseup.net wrote: Tom Stellard t...@stellard.net mailto:t...@stellard.net writes: On Thu, Mar 05, 2015 at 08:42:25PM +0200, Francisco Jerez wrote: Tom Stellard thomas.stell...@amd.com writes: This means dropping CL_FP_DENORM from the current return value. --- src/gallium/state_trackers/clover/api/device.cpp | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/src/gallium/state_trackers/clover/api/device.cpp b/src/gallium/state_trackers/clover/api/device.cpp index b1f556f..db3b931 100644 --- a/src/gallium/state_trackers/clover/api/device.cpp +++ b/src/gallium/state_trackers/clover/api/device.cpp @@ -201,8 +201,10 @@ clGetDeviceInfo(cl_device_id d_dev, cl_device_info param, break; case CL_DEVICE_SINGLE_FP_CONFIG: + // This is the mandated minimum single precision floating-point + // capability Could you add that this is according to the OpenCL 1.1 specification? OpenCL 1.2 is even weaker (CL_FP_INF_NAN is not required, only one of CL_FP_ROUND_TO_ZERO or CL_FP_ROUND_TO_NEAREST is required, and no FP capabilities at all are required for custom devices as Jan pointed out). buf.as_scalarcl_device_fp_config() = - CL_FP_DENORM | CL_FP_INF_NAN | CL_FP_ROUND_TO_NEAREST; + CL_FP_INF_NAN | CL_FP_ROUND_TO_NEAREST; I'm okay with this change, but I'm curious, is this motivated by your architecture not supporting denorms? It can, but supporting them hurts performance. Sounds like you want to advertise denorm support and rely on the -cl-denorms-are-zero compiler option to decide whether to flush them to zero or not? This is true for newer devices which have more instructions as fast with denormal support. For the currently supported devices, the performance difference is quite extreme and the denormal support is not that useful. 
-Tom break; case CL_DEVICE_DOUBLE_FP_CONFIG: -- 2.0.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] i965: Throttle rendering to an fbo
On 03/06/2015 06:56 AM, Chris Wilson wrote: When rendering to an fbo, even though it may be acting as a winsys frontbuffer or just generally, we never throttle. However, when rendering to an fbo, there is no natural frame boundary. Conventionally we use SwapBuffers and glFinish, but potential callers often avoid glFinish for being too heavy-handed (waiting on all outstanding rendering to complete). The kernel provides a soft-throttling option for this case that waits for rendering older than 20ms to be complete (that's a little too lax to be used for swapbuffers, but is here a useful safety net). The remaining choice is then either never to throttle, throttle after every draw call, or after an intermediate user-defined point such as glFlush and thus all the implied flushes. This patch opts for the latter as that is the current method used for flushing to front buffers. v2: Defer the throttling from inside the flush to the next intel_prepare_render() and switch non-fbo frontbuffer throttling over to use the same lax method. The issue being that glFlush()/intel_prepare_read() is just as likely to be called inside a tight loop and not at frame boundaries. Thanks for the change. Moving the throttle to intel_prepare_render() makes sense to me. Comments below. diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c index 972e458..2ed5f16 100644 --- a/src/mesa/drivers/dri/i965/brw_context.c +++ b/src/mesa/drivers/dri/i965/brw_context.c @@ -232,8 +232,8 @@ intel_glFlush(struct gl_context *ctx) intel_batchbuffer_flush(brw); intel_flush_front(ctx); - if (brw_is_front_buffer_drawing(ctx->DrawBuffer)) - brw->need_throttle = true; + + brw->need_front_throttle = true; } Because of the variable name need_front_throttle, this code looks incorrect for fbo rendering, even though it's not. Please add a comment here explaining that need_front_throttle is being hijacked for fbo, back, and front buffer rendering.
With that little comment, patch 1 is Reviewed-by: Chad Versace chad.vers...@intel.com I'm still looking at patch 2.
[Mesa-dev] [Bug 89477] include/no_extern_c.h:47:1: error: template with C linkage
https://bugs.freedesktop.org/show_bug.cgi?id=89477 Bug ID: 89477 Summary: include/no_extern_c.h:47:1: error: template with C linkage Product: Mesa Version: git Hardware: x86-64 (AMD64) OS: Linux (All) Status: NEW Keywords: bisected, regression Severity: blocker Priority: medium Component: Mesa core Assignee: mesa-dev@lists.freedesktop.org Reporter: v...@freedesktop.org QA Contact: mesa-dev@lists.freedesktop.org CC: jfons...@vmware.com, mark.a.ja...@intel.com mesa: bf061a3d2ec00aa486cda0fb4af04e50e8522868 (master 10.6.0-devel) CXX codegen/nv50_ir_from_tgsi.lo In file included from ../../../../include/c99_compat.h:28:0, from ../../../../src/gallium/include/pipe/p_compiler.h:32, from ../../../../src/gallium/auxiliary/tgsi/tgsi_dump.h:31, from codegen/nv50_ir_from_tgsi.cpp:24: ../../../../include/no_extern_c.h:47:1: error: template with C linkage templateclass T class _IncludeInsideExternCNotPortable; ^ Build error is introduced with commit bfb4db83b618d57fcc5f0c9e9fdb3a7ff33d07f3. commit bfb4db83b618d57fcc5f0c9e9fdb3a7ff33d07f3 Author: José Fonseca jfons...@vmware.com Date: Thu Dec 11 22:14:14 2014 + include: Add helper header to help trap includes inside extern C. This is just to help repro and fixing these issues with any C++ compiler -- -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] i965: Throttle to the previous frame
On 03/06/2015 06:56 AM, Chris Wilson wrote: In order to facilitate the concurrency offered by triple buffering and to offset the latency induced by swapping via an external process, which may incur extra rendering itself, only throttle to the previous frame and not the last. This doubles the maximum possible latency at the benefit of improving throughput and reducing jitter. Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk Cc: Daniel Vetter daniel.vet...@ffwll.ch Cc: Kenneth Graunke kenn...@whitecape.org Cc: Ben Widawsky b...@bwidawsk.net Cc: Kristian Høgsberg k...@bitplanet.net Cc: Chad Versace chad.vers...@linux.intel.com --- src/mesa/drivers/dri/i965/brw_context.c | 19 --- src/mesa/drivers/dri/i965/brw_context.h | 2 +- src/mesa/drivers/dri/i965/intel_batchbuffer.c | 7 --- 3 files changed, 17 insertions(+), 11 deletions(-) I'm ok with the patch's idea. Just please rename the variable. It's badly named now. The batches are used for more than just swapbuffers. And it holds two batches, but how can there be two first batches? How about brw-post_throttle_batches? Any name without first or swap would be fine. 
diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c index 2ed5f16..6897c2c 100644 --- a/src/mesa/drivers/dri/i965/brw_context.c +++ b/src/mesa/drivers/dri/i965/brw_context.c @@ -928,8 +928,10 @@ intelDestroyContext(__DRIcontext * driContextPriv) intel_batchbuffer_free(brw); - drm_intel_bo_unreference(brw-first_post_swapbuffers_batch); - brw-first_post_swapbuffers_batch = NULL; + drm_intel_bo_unreference(brw-first_post_swapbuffers_batch[1]); + drm_intel_bo_unreference(brw-first_post_swapbuffers_batch[0]); + brw-first_post_swapbuffers_batch[1] = NULL; + brw-first_post_swapbuffers_batch[0] = NULL; driDestroyOptionCache(brw-optionCache); @@ -1238,11 +1240,14 @@ intel_prepare_render(struct brw_context *brw) * the swap, and getting our hands on that doesn't seem worth it, * so we just us the first batch we emitted after the last swap. */ - if (brw-need_swap_throttle brw-first_post_swapbuffers_batch) { - if (!brw-disable_throttling) - drm_intel_bo_wait_rendering(brw-first_post_swapbuffers_batch); - drm_intel_bo_unreference(brw-first_post_swapbuffers_batch); - brw-first_post_swapbuffers_batch = NULL; + if (brw-need_swap_throttle brw-first_post_swapbuffers_batch[0]) { + if (brw-first_post_swapbuffers_batch[1]) { + if (!brw-disable_throttling) + drm_intel_bo_wait_rendering(brw-first_post_swapbuffers_batch[1]); + drm_intel_bo_unreference(brw-first_post_swapbuffers_batch[1]); + } + brw-first_post_swapbuffers_batch[1] = brw-first_post_swapbuffers_batch[0]; + brw-first_post_swapbuffers_batch[0] = NULL; brw-need_swap_throttle = false; brw-need_front_throttle = false; } diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h index b90e050..e347f26 100644 --- a/src/mesa/drivers/dri/i965/brw_context.h +++ b/src/mesa/drivers/dri/i965/brw_context.h @@ -1030,7 +1030,7 @@ struct brw_context bool front_buffer_dirty; /** Framerate throttling: @{ */ - drm_intel_bo *first_post_swapbuffers_batch; + drm_intel_bo 
*first_post_swapbuffers_batch[2]; bool need_swap_throttle; bool need_front_throttle; /** @} */ diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c b/src/mesa/drivers/dri/i965/intel_batchbuffer.c index 5ac4d18..460b4b9 100644 --- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c +++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c @@ -168,6 +168,7 @@ static void brw_new_batch(struct brw_context *brw) { /* Create a new batchbuffer and reset the associated state: */ + drm_intel_gem_bo_clear_relocs(brw-batch.bo, 0); intel_batchbuffer_reset(brw); /* If the kernel supports hardware contexts, then most hardware state is @@ -289,9 +290,9 @@ _intel_batchbuffer_flush(struct brw_context *brw, if (brw-batch.used == 0) return 0; - if (brw-first_post_swapbuffers_batch == NULL) { - brw-first_post_swapbuffers_batch = brw-batch.bo; - drm_intel_bo_reference(brw-first_post_swapbuffers_batch); + if (brw-first_post_swapbuffers_batch[0] == NULL) { + brw-first_post_swapbuffers_batch[0] = brw-batch.bo; + drm_intel_bo_reference(brw-first_post_swapbuffers_batch[0]); } if (unlikely(INTEL_DEBUG DEBUG_BATCH)) { ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
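The two-slot scheme in this patch can be modeled without any driver code. The sketch below is an assumption-laden illustration, not the i965 implementation: `struct sketch_bo` is a made-up stand-in for a buffer object, setting `waited` stands in for `drm_intel_bo_wait_rendering()`, decrementing `refcount` stands in for `drm_intel_bo_unreference()`, and the `need_swap_throttle` and `disable_throttling` conditions from the patch are left out.

```c
#include <stddef.h>

/* Hypothetical stand-in for a buffer object. */
struct sketch_bo {
   int refcount;
   int waited;   /* set once something has waited on its rendering */
};

/* batches[0]: first batch of the current frame.
 * batches[1]: first batch of the previous frame. */
struct sketch_throttle {
   struct sketch_bo *batches[2];
};

/* Called on every batch flush: remember only the first batch of a frame. */
static void
sketch_note_flush(struct sketch_throttle *t, struct sketch_bo *bo)
{
   if (t->batches[0] == NULL) {
      t->batches[0] = bo;
      bo->refcount++;
   }
}

/* Called at a frame boundary: wait on the previous frame's batch, not the
 * one just recorded, then shift the current frame into the previous slot. */
static void
sketch_throttle_frame(struct sketch_throttle *t)
{
   if (t->batches[0] == NULL)
      return;

   if (t->batches[1]) {
      t->batches[1]->waited = 1;
      t->batches[1]->refcount--;
   }
   t->batches[1] = t->batches[0];
   t->batches[0] = NULL;
}
```

Because the wait always targets slot 1, the CPU can run up to one extra frame ahead, which is the latency-for-throughput trade the commit message describes.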
Re: [Mesa-dev] [PATCH] include: Add helper header to help trap includes inside extern C.
Unfortunately, my build configuration does not build gallium. This caused me to miss some files which still have extern C includes. Due to this oversight, master encounters build errors. I'll send a patch asap. -Mark Mark Janes mark.a.ja...@intel.com writes: Assuming my patch to fix 'extern C' issues is pushed before this one, Reviewed-by: Mark Janes mark.a.ja...@intel.com Jose Fonseca jfons...@vmware.com writes: From: José Fonseca jfons...@vmware.com This is just to help repro and fixing these issues with any C++ compiler -- commiting this will of course wait until all issues are addressed. $ scons src/glsl/ scons: Reading SConscript files ... Checking for GCC ... yes Checking for Clang ... no Checking for X11 (x11 xext xdamage xfixes glproto = 1.4.13)... yes Checking for XCB (x11-xcb xcb-glx = 1.8.1 xcb-dri2 = 1.8)... yes Checking for XF86VIDMODE (xxf86vm)... yes Checking for DRM (libdrm = 2.4.38)... yes Checking for UDEV (libudev = 151)... yes warning: LLVM disabled: not building llvmpipe scons: done reading SConscript files. scons: Building targets ... scons: building associated VariantDir targets: build/linux-x86_64-debug/glsl Compiling src/glsl/ast_array_index.cpp ... Compiling src/glsl/ast_expr.cpp ... Compiling src/glsl/ast_function.cpp ... Compiling src/glsl/ast_to_hir.cpp ... Compiling src/glsl/ast_type.cpp ... Compiling src/glsl/builtin_functions.cpp ... 
In file included from include/c99_compat.h:28:0, from src/mapi/u_compiler.h:4, from src/mapi/u_thread.h:47, from src/mapi/glapi/glapi.h:47, from src/mesa/main/mtypes.h:42, from src/mesa/main/errors.h:47, from src/mesa/main/imports.h:41, from src/mesa/main/core.h:44, from src/glsl/builtin_functions.cpp:58: include/no_extern_c.h:48:1: error: template with C linkage templateclass T class _IncludeInsideExternCNotPortable; ^ In file included from include/c99_compat.h:28:0, from include/c11/threads.h:38, from src/mapi/u_thread.h:49, from src/mapi/glapi/glapi.h:47, from src/mesa/main/mtypes.h:42, from src/mesa/main/errors.h:47, from src/mesa/main/imports.h:41, from src/mesa/main/core.h:44, from src/glsl/builtin_functions.cpp:58: include/no_extern_c.h:48:1: error: template with C linkage templateclass T class _IncludeInsideExternCNotPortable; ^ Compiling src/glsl/builtin_types.cpp ... Compiling src/glsl/builtin_variables.cpp ... scons: *** [build/linux-x86_64-debug/glsl/builtin_functions.os] Error 1 scons: building terminated because of errors. --- include/c99_compat.h | 2 ++ include/no_extern_c.h | 49 + src/util/u_atomic.h | 3 +++ 3 files changed, 54 insertions(+) create mode 100644 include/no_extern_c.h diff --git a/include/c99_compat.h b/include/c99_compat.h index 429c601..a8819ac 100644 --- a/include/c99_compat.h +++ b/include/c99_compat.h @@ -25,6 +25,8 @@ * **/ +#include no_extern_c.h + #ifndef _C99_COMPAT_H_ #define _C99_COMPAT_H_ diff --git a/include/no_extern_c.h b/include/no_extern_c.h new file mode 100644 index 000..d038a4f --- /dev/null +++ b/include/no_extern_c.h @@ -0,0 +1,49 @@ +/** + * + * Copyright 2014 VMware, Inc. + * All Rights Reserved. 
+ * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included + * in all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS + * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. + * + **/ + + +/* + * Including system's headers inside `extern C { ... }` is not safe, as system + * headers may have C++ code in
[Mesa-dev] [PATCH] i965/skl: Fix the order of the arguments for the LD sampler message
In Skylake the order of the arguments for sample messages with the LD type are u, v, lod, r whereas previously they were u, lod, v, r. This fixes 82 Piglit tests using texelFetch. --- I have a feeling this probably isn't the right way to do this patch so maybe someone who knows the compiler better can write a better one. If the arguments are now in a convenient order at least for 2D textures is it possible to avoid a MOV now or something? I haven't run it though a full Piglit run to check for regressions but instead I only ran it with -t texelFetch. src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 13 +++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp index 6b48f70..7ce9dfa 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp @@ -1742,15 +1742,24 @@ fs_visitor::emit_texture_gen7(ir_texture_opcode op, fs_reg dst, length++; break; case ir_txf: - /* Unfortunately, the parameters for LD are intermixed: u, lod, v, r. */ + /* Unfortunately, the parameters for LD are intermixed: u, lod, v, r. + * On Gen9 they are u, v, lod, r + */ + emit(MOV(retype(sources[length], BRW_REGISTER_TYPE_D), coordinate)); coordinate = offset(coordinate, 1); length++; + if (brw-gen = 9 coord_components = 2) { + emit(MOV(retype(sources[length], BRW_REGISTER_TYPE_D), coordinate)); + coordinate = offset(coordinate, 1); + length++; + } + emit(MOV(retype(sources[length], BRW_REGISTER_TYPE_D), lod)); length++; - for (int i = 1; i coord_components; i++) { + for (int i = brw-gen = 9 ? 2 : 1; i coord_components; i++) { emit(MOV(retype(sources[length], BRW_REGISTER_TYPE_D), coordinate)); coordinate = offset(coordinate, 1); length++; -- 1.9.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
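The reordering in this patch can be checked in isolation. The following is a sketch of the payload ordering only, using plain integers in place of registers; the function name and parameters are hypothetical, and it assumes at least one coordinate component, as the patch's first unconditional MOV does.

```c
/* Hypothetical model of the LD payload ordering; not the i965 visitor.
 * Writes the arguments into `out` and returns how many were written.
 *   before Gen9: u, lod, v, r
 *   Gen9:        u, v, lod, r */
static int
sketch_ld_payload(int gen, const int *coord, int coord_components,
                  int lod, int *out)
{
   int length = 0;
   int next = 0;

   out[length++] = coord[next++];        /* u always comes first */

   if (gen >= 9 && coord_components >= 2)
      out[length++] = coord[next++];     /* v moves ahead of lod on Gen9 */

   out[length++] = lod;

   while (next < coord_components)       /* remaining coordinates */
      out[length++] = coord[next++];

   return length;
}
```

For a three-component fetch this gives `u, lod, v, r` before Gen9 and `u, v, lod, r` on Gen9, matching the orders stated in the commit message.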
[Mesa-dev] [Bug 89477] include/no_extern_c.h:47:1: error: template with C linkage
https://bugs.freedesktop.org/show_bug.cgi?id=89477 --- Comment #2 from Mark Janes mark.a.ja...@intel.com --- patch sent to list.
Re: [Mesa-dev] [PATCH] nouveau: Fix build, invalid extern "C" around header inclusion.
Reviewed-by: Ilia Mirkin imir...@alum.mit.edu

However you should probably split off the r300_public.h change into a separate commit -- a little awkward to have 'nouveau:' as the subject of a change to r300.

  -ilia

On Fri, Mar 6, 2015 at 4:16 PM, Mark Janes mark.a.ja...@intel.com wrote:
> The previous patch to fix header inclusion within extern "C" neglected to fix the occurrences of this pattern in nouveau files. When the helper to detect this issue was pushed to master, it broke the build for the nouveau driver.
>
> This patch fixes the nouveau build.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89477
> ---
>  src/gallium/auxiliary/tgsi/tgsi_scan.h                    | 7 +++++++
>  src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 2 --
>  src/gallium/drivers/r300/r300_public.h                    | 8 ++++++++
>  3 files changed, 15 insertions(+), 2 deletions(-)
[Mesa-dev] [PATCH] nouveau: Fix build, invalid extern "C" around header inclusion.
The previous patch to fix header inclusion within extern "C" neglected to fix the occurrences of this pattern in nouveau files. When the helper to detect this issue was pushed to master, it broke the build for the nouveau driver.

This patch fixes the nouveau build.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89477
---
 src/gallium/auxiliary/tgsi/tgsi_scan.h                    | 7 +++++++
 src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 2 --
 src/gallium/drivers/r300/r300_public.h                    | 8 ++++++++
 3 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_scan.h b/src/gallium/auxiliary/tgsi/tgsi_scan.h
index 5dc9267..0ea0e88 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_scan.h
+++ b/src/gallium/auxiliary/tgsi/tgsi_scan.h
@@ -33,6 +33,10 @@
 #include "pipe/p_state.h"
 #include "pipe/p_shader_tokens.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
 /**
  * Shader summary info
  */
@@ -114,5 +118,8 @@ tgsi_scan_shader(const struct tgsi_token *tokens,
 extern boolean
 tgsi_is_passthrough_shader(const struct tgsi_token *tokens);
 
+#ifdef __cplusplus
+} // extern "C"
+#endif
 
 #endif /* TGSI_SCAN_H */
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
index 6e75730..1e0a695 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
@@ -20,11 +20,9 @@
  * OTHER DEALINGS IN THE SOFTWARE.
  */
 
-extern "C" {
 #include "tgsi/tgsi_dump.h"
 #include "tgsi/tgsi_scan.h"
 #include "tgsi/tgsi_util.h"
-}
 
 #include <set>
 
diff --git a/src/gallium/drivers/r300/r300_public.h b/src/gallium/drivers/r300/r300_public.h
index b605920..57a69cb 100644
--- a/src/gallium/drivers/r300/r300_public.h
+++ b/src/gallium/drivers/r300/r300_public.h
@@ -2,8 +2,16 @@
 #ifndef R300_PUBLIC_H
 #define R300_PUBLIC_H
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
 struct radeon_winsys;
 
 struct pipe_screen* r300_screen_create(struct radeon_winsys *rws);
 
+#ifdef __cplusplus
+} // extern "C"
+#endif
+
 #endif
-- 
2.1.4
Re: [Mesa-dev] [PATCH, v2 1/2] nouveau: Fix build, invalid extern "C" around header inclusion.
Series is Reviewed-by: Ilia Mirkin imir...@alum.mit.edu

Thanks for splitting them up!

On Fri, Mar 6, 2015 at 4:36 PM, Mark Janes mark.a.ja...@intel.com wrote:
> A previous patch to fix header inclusion within extern "C" neglected to fix the occurrences of this pattern in nouveau files. When the helper to detect this issue was pushed to master, it broke the build for the nouveau driver.
>
> This patch fixes the nouveau build.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89477
> ---
>  src/gallium/auxiliary/tgsi/tgsi_scan.h                    | 7 +++++++
>  src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 2 --
>  2 files changed, 7 insertions(+), 2 deletions(-)
[Mesa-dev] [Bug 89477] include/no_extern_c.h:47:1: error: template with C linkage
https://bugs.freedesktop.org/show_bug.cgi?id=89477 --- Comment #1 from Mark Janes mark.a.ja...@intel.com --- Can you please include your configure line? I'm having trouble reproducing your failure. -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH, v2 2/2] r300g: Fix build, invalid extern "C" around header inclusion.
A previous patch to fix header inclusion within extern "C" neglected to fix the occurrences of this pattern in r300 files. When the helper to detect this issue was pushed to master, it broke the build for the r300 driver.

This patch fixes the r300 build.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89477
---
 src/gallium/drivers/r300/r300_public.h | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/src/gallium/drivers/r300/r300_public.h b/src/gallium/drivers/r300/r300_public.h
index b605920..57a69cb 100644
--- a/src/gallium/drivers/r300/r300_public.h
+++ b/src/gallium/drivers/r300/r300_public.h
@@ -2,8 +2,16 @@
 #ifndef R300_PUBLIC_H
 #define R300_PUBLIC_H
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
 struct radeon_winsys;
 
 struct pipe_screen* r300_screen_create(struct radeon_winsys *rws);
 
+#ifdef __cplusplus
+} // extern "C"
+#endif
+
 #endif
-- 
2.1.4
[Mesa-dev] [PATCH, v2 1/2] nouveau: Fix build, invalid extern "C" around header inclusion.
A previous patch to fix header inclusion within extern "C" neglected to fix the occurrences of this pattern in nouveau files. When the helper to detect this issue was pushed to master, it broke the build for the nouveau driver.

This patch fixes the nouveau build.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89477
---
 src/gallium/auxiliary/tgsi/tgsi_scan.h                    | 7 +++++++
 src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 2 --
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_scan.h b/src/gallium/auxiliary/tgsi/tgsi_scan.h
index 5dc9267..0ea0e88 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_scan.h
+++ b/src/gallium/auxiliary/tgsi/tgsi_scan.h
@@ -33,6 +33,10 @@
 #include "pipe/p_state.h"
 #include "pipe/p_shader_tokens.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
 /**
  * Shader summary info
  */
@@ -114,5 +118,8 @@ tgsi_scan_shader(const struct tgsi_token *tokens,
 extern boolean
 tgsi_is_passthrough_shader(const struct tgsi_token *tokens);
 
+#ifdef __cplusplus
+} // extern "C"
+#endif
 
 #endif /* TGSI_SCAN_H */
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
index 6e75730..1e0a695 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
@@ -20,11 +20,9 @@
  * OTHER DEALINGS IN THE SOFTWARE.
  */
 
-extern "C" {
 #include "tgsi/tgsi_dump.h"
 #include "tgsi/tgsi_scan.h"
 #include "tgsi/tgsi_util.h"
-}
 
 #include <set>
 
-- 
2.1.4
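The guard pattern these patches add to the headers can be sketched in isolation. This is only an illustration: the header name `example_scan.h` and the function `example_count_tokens` are hypothetical and do not exist in Mesa. The idea is that the header wraps its own declarations in `extern "C"`, so a C++ file can include it directly instead of wrapping the `#include` lines, which would also give C linkage to anything those headers pull in.

```c
/* example_scan.h (hypothetical header, shown inline to keep the sketch in one file) */
#ifndef EXAMPLE_SCAN_H
#define EXAMPLE_SCAN_H

/* Includes stay OUTSIDE the linkage block, so nothing they declare
 * (for instance C++ templates) is ever given C linkage. */
#include <assert.h>
#include <stddef.h>

#ifdef __cplusplus
extern "C" {
#endif

/* Only this header's own declarations get C linkage. */
size_t example_count_tokens(const int *tokens, int end_marker);

#ifdef __cplusplus
} /* extern "C" */
#endif

#endif /* EXAMPLE_SCAN_H */

/* example_scan.c: a plain C definition matching the declaration above.
 * Counts the entries that precede end_marker. */
size_t example_count_tokens(const int *tokens, int end_marker)
{
    size_t n = 0;

    assert(tokens != NULL);
    while (tokens[n] != end_marker)
        n++;
    return n;
}
```

With the guards inside the header, a C++ caller would simply write `#include "example_scan.h"` with no surrounding `extern "C" { ... }` block.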
[Mesa-dev] [PATCH 1/2] i965: Throttle rendering to an fbo
When rendering to an fbo, even though it may be acting as a winsys frontbuffer or just generally, we never throttle. However, when rendering to an fbo, there is no natural frame boundary. Conventionally we use SwapBuffers and glFinish, but potential callers often avoid glFinish for being too heavy-handed (waiting on all outstanding rendering to complete). The kernel provides a soft-throttling option for this case that waits for rendering older than 20ms to be complete (that's a little too lax to be used for swapbuffers, but is here a useful safety net). The remaining choice is then either never to throttle, to throttle after every draw call, or to throttle at an intermediate user-defined point such as glFlush and thus all the implied flushes. This patch opts for the last of these, as that is the current method used for flushing to front buffers.

v2: Defer the throttling from inside the flush to the next intel_prepare_render() and switch non-fbo frontbuffer throttling over to use the same lax method. The issue being that glFlush()/intel_prepare_read() is just as likely to be called inside a tight loop and not at frame boundaries.
v3: Rename from need_front_throttle to need_flush_throttle to avoid any ambiguity between front buffer rendering and fbo rendering. (Chad)

Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk
Cc: Daniel Vetter daniel.vet...@ffwll.ch
Cc: Kenneth Graunke kenn...@whitecape.org
Cc: Ben Widawsky b...@bwidawsk.net
Cc: Kristian Høgsberg k...@bitplanet.net
Cc: Chad Versace chad.vers...@linux.intel.com
Reviewed-by: Chad Versace chad.vers...@linux.intel.com
---
 src/mesa/drivers/dri/i965/brw_context.c  | 16 ++++++++++++----
 src/mesa/drivers/dri/i965/brw_context.h  | 12 +++++++++++-
 src/mesa/drivers/dri/i965/intel_screen.c |  8 ++++----
 3 files changed, 27 insertions(+), 9 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
index 972e458..bfda55f 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -232,8 +232,8 @@ intel_glFlush(struct gl_context *ctx)
 
    intel_batchbuffer_flush(brw);
    intel_flush_front(ctx);
-   if (brw_is_front_buffer_drawing(ctx->DrawBuffer))
-      brw->need_throttle = true;
+
+   brw->need_flush_throttle = true;
 }
 
 static void
@@ -1238,12 +1238,20 @@ intel_prepare_render(struct brw_context *brw)
     * the swap, and getting our hands on that doesn't seem worth it,
     * so we just us the first batch we emitted after the last swap.
     */
-   if (brw->need_throttle && brw->first_post_swapbuffers_batch) {
+   if (brw->need_swap_throttle && brw->first_post_swapbuffers_batch) {
       if (!brw->disable_throttling)
          drm_intel_bo_wait_rendering(brw->first_post_swapbuffers_batch);
       drm_intel_bo_unreference(brw->first_post_swapbuffers_batch);
       brw->first_post_swapbuffers_batch = NULL;
-      brw->need_throttle = false;
+      brw->need_swap_throttle = false;
+      /* Throttling here is more precise than the throttle ioctl, so skip it */
+      brw->need_flush_throttle = false;
+   }
+
+   if (brw->need_flush_throttle) {
+      __DRIscreen *psp = brw->intelScreen->driScrnPriv;
+      drmCommandNone(psp->fd, DRM_I915_GEM_THROTTLE);
+      brw->need_flush_throttle = false;
    }
 }
 
diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index 682fbe9..7854300 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -1031,7 +1031,17 @@ struct brw_context
 
    /** Framerate throttling: @{ */
    drm_intel_bo *first_post_swapbuffers_batch;
-   bool need_throttle;
+   /* Limit the number of outstanding SwapBuffers by waiting for an earlier
+    * frame of rendering to complete. This gives a very precise cap to the
+    * latency between input and output such that rendering never gets more
+    * than a frame behind the user. (With the caveat that we technically are
+    * not using the SwapBuffers itself as a barrier but the first batch
+    * submitted afterwards, which may be immediately prior to the next
+    * SwapBuffers.)
+    */
+   bool need_swap_throttle;
+   /** General throttling, not caught by throttling between SwapBuffers */
+   bool need_flush_throttle;
    /** @} */
 
    GLuint stats_wm;
diff --git a/src/mesa/drivers/dri/i965/intel_screen.c b/src/mesa/drivers/dri/i965/intel_screen.c
index cea7ddf..3640b67 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.c
+++ b/src/mesa/drivers/dri/i965/intel_screen.c
@@ -174,10 +174,10 @@ intel_dri2_flush_with_flags(__DRIcontext *cPriv,
    if (flags & __DRI2_FLUSH_DRAWABLE)
       intel_resolve_for_dri2_flush(brw, dPriv);
 
-   if (reason == __DRI2_THROTTLE_SWAPBUFFER ||
-       reason == __DRI2_THROTTLE_FLUSHFRONT) {
-      brw->need_throttle = true;
-   }
+   if (reason == __DRI2_THROTTLE_SWAPBUFFER)
+      brw->need_swap_throttle = true;
+   if (reason == __DRI2_THROTTLE_FLUSHFRONT)
+      brw->need_flush_throttle = true;
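The interplay of the two flags in this patch can be modelled without any driver code. The sketch below is only a model under stated assumptions: `struct sketch_ctx` and the `sketch_*` functions are hypothetical, the counters stand in for `drm_intel_bo_wait_rendering()` and the `DRM_I915_GEM_THROTTLE` ioctl, and the `disable_throttling` option is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver context; only the throttle state. */
struct sketch_ctx {
    bool need_swap_throttle;
    bool need_flush_throttle;
    const void *first_post_swap_batch; /* non-NULL once a batch was flushed */
    int precise_waits;   /* times we waited on one specific batch */
    int soft_throttles;  /* times we asked for the kernel's soft throttle */
};

/* A flush only records that throttling is wanted; the wait is deferred. */
static void sketch_flush(struct sketch_ctx *ctx)
{
    assert(ctx != NULL);
    ctx->need_flush_throttle = true;
}

/* Follows the ordering of the patched intel_prepare_render(): the precise
 * wait is tried first and, when it happens, cancels a pending soft throttle. */
static void sketch_prepare_render(struct sketch_ctx *ctx)
{
    assert(ctx != NULL);

    if (ctx->need_swap_throttle && ctx->first_post_swap_batch) {
        ctx->precise_waits++;              /* models waiting on the batch */
        ctx->first_post_swap_batch = NULL;
        ctx->need_swap_throttle = false;
        ctx->need_flush_throttle = false;  /* the precise wait subsumes it */
    }

    if (ctx->need_flush_throttle) {
        ctx->soft_throttles++;             /* models the throttle ioctl */
        ctx->need_flush_throttle = false;
    }
}
```

In this model any number of flushes between two renders costs at most one soft throttle, which is the point of deferring the wait to the next prepare-render call.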
[Mesa-dev] [PATCH 2/2] i965: Throttle to the previous frame
In order to facilitate the concurrency offered by triple buffering and to offset the latency induced by swapping via an external process, which may incur extra rendering itself, only throttle to the previous frame and not the last. The second issue, which mostly affects swap benchmarks but can also incur jitter in the throttling, is that the throttle bo is closer to the next SwapBuffers rather than immediately after the previous SwapBuffers. Throttling to the previous frame doubles the maximum possible latency in exchange for improved throughput and reduced jitter.

v2: Rename first_post_swapbuffer batches array to a plain throttle_batch[] as the pluralisation was contorting the name and not making it clear as to whether it was the first batch or first_post_swap batch. Not least of which was that not all throttle points are SwapBuffers.

Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk
Cc: Daniel Vetter daniel.vet...@ffwll.ch
Cc: Kenneth Graunke kenn...@whitecape.org
Cc: Ben Widawsky b...@bwidawsk.net
Cc: Kristian Høgsberg k...@bitplanet.net
Cc: Chad Versace chad.vers...@linux.intel.com
---
 src/mesa/drivers/dri/i965/brw_context.c       | 19 ++++++++++++-------
 src/mesa/drivers/dri/i965/brw_context.h       |  2 +-
 src/mesa/drivers/dri/i965/intel_batchbuffer.c |  7 ++++---
 3 files changed, 17 insertions(+), 11 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
index bfda55f..c669397 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -928,8 +928,10 @@ intelDestroyContext(__DRIcontext * driContextPriv)
 
    intel_batchbuffer_free(brw);
 
-   drm_intel_bo_unreference(brw->first_post_swapbuffers_batch);
-   brw->first_post_swapbuffers_batch = NULL;
+   drm_intel_bo_unreference(brw->throttle_batch[1]);
+   drm_intel_bo_unreference(brw->throttle_batch[0]);
+   brw->throttle_batch[1] = NULL;
+   brw->throttle_batch[0] = NULL;
 
    driDestroyOptionCache(&brw->optionCache);
 
@@ -1238,11 +1240,14 @@ intel_prepare_render(struct brw_context *brw)
     * the swap, and getting our hands on that doesn't seem worth it,
     * so we just us the first batch we emitted after the last swap.
     */
-   if (brw->need_swap_throttle && brw->first_post_swapbuffers_batch) {
-      if (!brw->disable_throttling)
-         drm_intel_bo_wait_rendering(brw->first_post_swapbuffers_batch);
-      drm_intel_bo_unreference(brw->first_post_swapbuffers_batch);
-      brw->first_post_swapbuffers_batch = NULL;
+   if (brw->need_swap_throttle && brw->throttle_batch[0]) {
+      if (brw->throttle_batch[1]) {
+         if (!brw->disable_throttling)
+            drm_intel_bo_wait_rendering(brw->throttle_batch[1]);
+         drm_intel_bo_unreference(brw->throttle_batch[1]);
+      }
+      brw->throttle_batch[1] = brw->throttle_batch[0];
+      brw->throttle_batch[0] = NULL;
       brw->need_swap_throttle = false;
       /* Throttling here is more precise than the throttle ioctl, so skip it */
       brw->need_flush_throttle = false;
diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index 7854300..ab7ac05 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -1030,7 +1030,7 @@ struct brw_context
    bool front_buffer_dirty;
 
    /** Framerate throttling: @{ */
-   drm_intel_bo *first_post_swapbuffers_batch;
+   drm_intel_bo *throttle_batch[2];
    /* Limit the number of outstanding SwapBuffers by waiting for an earlier
     * frame of rendering to complete. This gives a very precise cap to the
     * latency between input and output such that rendering never gets more
diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
index 5ac4d18..87862cd 100644
--- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
+++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
@@ -168,6 +168,7 @@ static void
 brw_new_batch(struct brw_context *brw)
 {
    /* Create a new batchbuffer and reset the associated state: */
+   drm_intel_gem_bo_clear_relocs(brw->batch.bo, 0);
    intel_batchbuffer_reset(brw);
 
    /* If the kernel supports hardware contexts, then most hardware state is
@@ -289,9 +290,9 @@ _intel_batchbuffer_flush(struct brw_context *brw,
    if (brw->batch.used == 0)
       return 0;
 
-   if (brw->first_post_swapbuffers_batch == NULL) {
-      brw->first_post_swapbuffers_batch = brw->batch.bo;
-      drm_intel_bo_reference(brw->first_post_swapbuffers_batch);
+   if (brw->throttle_batch[0] == NULL) {
+      brw->throttle_batch[0] = brw->batch.bo;
+      drm_intel_bo_reference(brw->throttle_batch[0]);
    }
 
    if (unlikely(INTEL_DEBUG & DEBUG_BATCH)) {
-- 
2.1.4
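The two-slot scheme can be sketched with plain integers in place of buffer objects. This is a model under stated assumptions rather than driver code: `struct sketch_throttle` and its functions are hypothetical, batch ids are small nonzero integers, and "waiting" is recorded instead of performed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: batches are identified by integers, 0 means "none". */
struct sketch_throttle {
    int slot[2];     /* slot[0]: first batch since the last swap throttle,
                      * slot[1]: the same for the frame before that */
    int last_waited; /* id of the batch most recently waited on, 0 if none */
};

/* On batch flush: remember only the first batch after a throttle point. */
static void sketch_batch_flushed(struct sketch_throttle *t, int batch_id)
{
    assert(t != NULL && batch_id != 0);
    if (t->slot[0] == 0)
        t->slot[0] = batch_id;
}

/* On a swap throttle point: wait on the PREVIOUS frame's batch, then age
 * the current frame's batch into the previous slot. */
static void sketch_swap_throttle(struct sketch_throttle *t)
{
    assert(t != NULL);
    if (t->slot[0] == 0)
        return;                      /* nothing submitted since last time */
    if (t->slot[1] != 0)
        t->last_waited = t->slot[1]; /* models the wait-rendering call */
    t->slot[1] = t->slot[0];
    t->slot[0] = 0;
}
```

In this model the first throttle point never waits, and each later one waits on a batch from one frame earlier than the single-slot scheme would, which is where the extra frame of latency and the extra concurrency both come from.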
[Mesa-dev] [PATCH] i965: Silence GCC maybe-uninitialized warning.
brw_shader.cpp: In function ‘bool brw_saturate_immediate(brw_reg_type, brw_reg*)’:
brw_shader.cpp:618:31: warning: ‘sat_imm.brw_saturate_immediate(brw_reg_type, brw_reg*)::<anonymous union>::ud’ may be used uninitialized in this function [-Wmaybe-uninitialized]
       reg->dw1.ud = sat_imm.ud;
                               ^

Signed-off-by: Vinson Lee v...@freedesktop.org
---
 src/mesa/drivers/dri/i965/brw_shader.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp b/src/mesa/drivers/dri/i965/brw_shader.cpp
index f2b4d82..ff0ef4b 100644
--- a/src/mesa/drivers/dri/i965/brw_shader.cpp
+++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
@@ -584,7 +584,7 @@ brw_saturate_immediate(enum brw_reg_type type, struct brw_reg *reg)
       unsigned ud;
       int d;
       float f;
-   } imm = { reg->dw1.ud }, sat_imm;
+   } imm = { reg->dw1.ud }, sat_imm = { 0 };
 
    switch (type) {
    case BRW_REGISTER_TYPE_UD:
-- 
2.3.1
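The effect of the `= { 0 }` initializer can be seen in a cut-down sketch. Everything here is hypothetical (`union sketch_imm`, `enum sketch_type`, `sketch_saturate`) and it does not reproduce the real `brw_saturate_immediate()`; it only shows that a switch which does not assign on every path leaves the union with a defined value once it is initialized. The float case assumes IEEE 754 single precision and a 32-bit `unsigned`.

```c
#include <assert.h>

/* Hypothetical cut-down immediate: the same bits viewed three ways. */
union sketch_imm {
    unsigned ud;
    int d;
    float f;
};

enum sketch_type { SKETCH_TYPE_UD, SKETCH_TYPE_D, SKETCH_TYPE_F };

/* Returns the bits of the saturated immediate. Types that the switch does
 * not handle yield the zero initializer instead of indeterminate bits. */
static unsigned sketch_saturate(enum sketch_type type, unsigned bits)
{
    union sketch_imm imm = { bits }, sat_imm = { 0 }; /* both initialized */

    switch (type) {
    case SKETCH_TYPE_UD:
        sat_imm.ud = imm.ud;   /* nothing to clamp */
        break;
    case SKETCH_TYPE_F:
        /* clamp to [0, 1] */
        sat_imm.f = imm.f < 0.0f ? 0.0f : (imm.f > 1.0f ? 1.0f : imm.f);
        break;
    default:
        break;                 /* sat_imm keeps its initializer */
    }
    return sat_imm.ud;
}
```

Without the initializer the `default` path would return an indeterminate value, which is the situation the compiler warning describes.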
[Mesa-dev] [PATCH] egl/dri2: Fix GCC maybe-uninitialized warning.
egl_dri2.c: In function ‘dri2_bind_tex_image’:
egl_dri2.c:1240:4: warning: ‘format’ may be used uninitialized in this function [-Wmaybe-uninitialized]
    (*dri2_dpy->tex_buffer->setTexBuffer2)(dri2_ctx->dri_context,
    ^

Signed-off-by: Vinson Lee v...@freedesktop.org
---
 src/egl/drivers/dri2/egl_dri2.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c
index d503196..88022e0 100644
--- a/src/egl/drivers/dri2/egl_dri2.c
+++ b/src/egl/drivers/dri2/egl_dri2.c
@@ -1227,6 +1227,7 @@ dri2_bind_tex_image(_EGLDriver *drv,
       break;
    default:
       assert(0);
+      format = 0;
    }
 
    switch (dri2_surf->base.TextureTarget) {
-- 
2.3.1
Re: [Mesa-dev] [PATCH 1/6] c11: add c11 compatibility wrapper around stdlib.h
On 07/03/15 07:23, Jose Fonseca wrote:
> On 06/03/15 18:26, Brian Paul wrote:
>> On Fri, Mar 6, 2015 at 9:32 AM, Emil Velikov emil.l.veli...@gmail.com wrote:
>>> Used for aligned_alloc and other C11 functions missing from the header.
>>>
>>> Signed-off-by: Emil Velikov emil.l.veli...@gmail.com
>>> ---
>>> include/c11_stdlib.h | 118 ++
>>
>> I wonder if this should be include/c11/stdlib.h instead.
>>
>> I also wonder if I should have put c99_math.h in c99/math.h. Jose followed my pattern with c99_alloca.h.
>>
>> We should probably be more consistent about this. What do you think?
>
> No, {c11,c99}_foo.h are really different from {c11,c99}/foo.h.
>
> include/c99 is added to the include path. It is meant to have _complete_ header replacements (stdint.h, inttypes.h) for compilers that don't provide these C99 headers -- MSVC prior to 2013. This way C code can always just have #include <stdint.h> regardless of the compiler.
>
> If c99_math.h is moved to c99/math.h then it will be impossible to ever include MSVC's real math.h whenever include/c99 is on the include path.
>
> This is also why c11/threads.h is in c11. It too is a complete implementation of C11's threads.h. In fact we could just add include/c11 to the include path and drop the c11/ prefix, but two things gave me pause: GLIB has started to implement C11 threads.h, and we still haven't eliminated the use of the non-portable _MTX_INITIALIZER_NP from the Mesa tree.
>
> Now, things like c11_math.h c11_stdlib.h are a different beast -- they're not complete replacements for the headers, just wrappers around the headers which fix up missing things. They can't ever be put in the include path.
>
> We can consider moving the c99_foo.h/c11_foo.h wrappers somewhere else (another subdirectory, or util) or renaming them (like u_foo.h). But the distinction from the stuff in c99/ and c11/ must be preserved somehow.

Oh, never mind this. I just saw your other reply.
Jose ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2] nv30: Add unused attribute to function nv40_fp_bra.
Silences GCC unused-function warning.

nv30/nvfx_fragprog.c:333:1: warning: ‘nv40_fp_bra’ defined but not used [-Wunused-function]
 nv40_fp_bra(struct nvfx_fpc *fpc, unsigned target)
 ^

Signed-off-by: Vinson Lee v...@freedesktop.org
---
 src/gallium/drivers/nouveau/nv30/nvfx_fragprog.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/nv30/nvfx_fragprog.c b/src/gallium/drivers/nouveau/nv30/nvfx_fragprog.c
index 6600997..abd51c8 100644
--- a/src/gallium/drivers/nouveau/nv30/nvfx_fragprog.c
+++ b/src/gallium/drivers/nouveau/nv30/nvfx_fragprog.c
@@ -329,7 +329,7 @@ nv40_fp_rep(struct nvfx_fpc *fpc, unsigned count, unsigned target)
 }
 
 /* warning: this only works forward, and probably only if not inside any IF */
-static void
+static __attribute__((unused)) void
 nv40_fp_bra(struct nvfx_fpc *fpc, unsigned target)
 {
 	struct nvfx_relocation reloc;
-- 
2.3.1
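A small sketch of the same idea, wrapped in a macro so that compilers without the GNU attribute syntax still build the file. The macro `SKETCH_UNUSED` and both functions are hypothetical names introduced for illustration; they are not Mesa's.

```c
#include <assert.h>

/* Hypothetical portability macro: the GNU attribute where it is known to
 * be available, nothing elsewhere. */
#if defined(__GNUC__)
#define SKETCH_UNUSED __attribute__((unused))
#else
#define SKETCH_UNUSED
#endif

/* Kept for future use but never called in this file; the attribute tells
 * GCC not to emit -Wunused-function for it. */
static SKETCH_UNUSED int sketch_branch_offset(int from, int to)
{
    return to - from;
}

/* A helper that is called, for contrast; it needs no attribute. */
static int sketch_loop_count(int count)
{
    assert(count >= 0);
    return count;
}
```

The attribute only silences the diagnostic; the function is still compiled and remains callable if a later change starts using it.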
[Mesa-dev] [PATCH] i915: Fix GCC unused-but-set-variable warning in release build.
i915_fragprog.c: In function ‘i915ValidateFragmentProgram’:
i915_fragprog.c:1453:11: warning: variable ‘k’ set but not used [-Wunused-but-set-variable]
       int k;
           ^

Signed-off-by: Vinson Lee v...@freedesktop.org
---
 src/mesa/drivers/dri/i915/i915_fragprog.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i915/i915_fragprog.c b/src/mesa/drivers/dri/i915/i915_fragprog.c
index d42da5a..9b00223 100644
--- a/src/mesa/drivers/dri/i915/i915_fragprog.c
+++ b/src/mesa/drivers/dri/i915/i915_fragprog.c
@@ -1450,8 +1450,6 @@ i915ValidateFragmentProgram(struct i915_context *i915)
 
    if (s2 != i915->state.Ctx[I915_CTXREG_LIS2] ||
        s4 != i915->state.Ctx[I915_CTXREG_LIS4]) {
-      int k;
-
       I915_STATECHANGE(i915, I915_UPLOAD_CTX);
 
       /* Must do this *after* statechange, so as not to affect
@@ -1471,8 +1469,7 @@ i915ValidateFragmentProgram(struct i915_context *i915)
       i915->state.Ctx[I915_CTXREG_LIS2] = s2;
       i915->state.Ctx[I915_CTXREG_LIS4] = s4;
 
-      k = intel->vtbl.check_vertex_size(intel, intel->vertex_size);
-      assert(k);
+      assert(intel->vtbl.check_vertex_size(intel, intel->vertex_size));
    }
 
    if (!p->params_uptodate)
-- 
2.3.1
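One property of `assert()` is worth keeping in mind when a call is folded into it: when `NDEBUG` is defined, standard C expands `assert(expr)` to a void expression and `expr` is not evaluated at all. Whether that matters for `check_vertex_size` is not established here. The sketch below shows an alternative that keeps the call in every build and still avoids the warning; `sketch_check_vertex_size`, `sketch_validate` and the call counter are hypothetical.

```c
#include <assert.h>

static int sketch_calls; /* counts how often the check actually ran */

/* Hypothetical check with an observable side effect (the counter). */
static int sketch_check_vertex_size(int size)
{
    sketch_calls++;
    return size > 0;
}

/* Call unconditionally, assert on the result, then mark the variable as
 * used so that a build with NDEBUG does not warn about it. */
static void sketch_validate(int size)
{
    int ok = sketch_check_vertex_size(size);

    assert(ok);
    (void) ok;
}
```

With this shape the check runs exactly once per call whether or not assertions are compiled in; folding it into `assert()` is equivalent only when the callee has no side effects the caller relies on.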
[Mesa-dev] [PATCH] radeon: Fix GCC unused-but-set-variable warnings.
radeon_fbo.c: In function ‘radeon_map_renderbuffer_s8z24’:
radeon_fbo.c:162:9: warning: variable ‘ret’ set but not used [-Wunused-but-set-variable]
     int ret;
         ^
radeon_fbo.c: In function ‘radeon_map_renderbuffer_z16’:
radeon_fbo.c:200:9: warning: variable ‘ret’ set but not used [-Wunused-but-set-variable]
     int ret;
         ^
radeon_fbo.c: In function ‘radeon_map_renderbuffer’:
radeon_fbo.c:242:8: warning: variable ‘ret’ set but not used [-Wunused-but-set-variable]
    int ret;
        ^
radeon_fbo.c: In function ‘radeon_unmap_renderbuffer’:
radeon_fbo.c:419:14: warning: variable ‘ok’ set but not used [-Wunused-but-set-variable]
    GLboolean ok;
              ^

Signed-off-by: Vinson Lee v...@freedesktop.org
---
 src/mesa/drivers/dri/radeon/radeon_fbo.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/src/mesa/drivers/dri/radeon/radeon_fbo.c b/src/mesa/drivers/dri/radeon/radeon_fbo.c
index 110b030..2e3cb2b 100644
--- a/src/mesa/drivers/dri/radeon/radeon_fbo.c
+++ b/src/mesa/drivers/dri/radeon/radeon_fbo.c
@@ -169,6 +169,9 @@ radeon_map_renderbuffer_s8z24(struct gl_context *ctx,
     rrb->map_buffer = malloc(w * h * 4);
     ret = radeon_bo_map(rrb->bo, !!(mode & GL_MAP_WRITE_BIT));
     assert(!ret);
+    if (!ret) {
+        return;
+    }
     untiled_s8z24_map = rrb->map_buffer;
     tiled_s8z24_map = rrb->bo->ptr;
 
@@ -207,6 +210,9 @@ radeon_map_renderbuffer_z16(struct gl_context *ctx,
     rrb->map_buffer = malloc(w * h * 2);
     ret = radeon_bo_map(rrb->bo, !!(mode & GL_MAP_WRITE_BIT));
     assert(!ret);
+    if (!ret) {
+        return;
+    }
     untiled_z16_map = rrb->map_buffer;
     tiled_z16_map = rrb->bo->ptr;
 
@@ -291,6 +297,9 @@ radeon_map_renderbuffer(struct gl_context *ctx,
 
    ret = radeon_bo_map(rrb->map_bo, !!(mode & GL_MAP_WRITE_BIT));
    assert(!ret);
+   if (!ret) {
+      return;
+   }
 
    map = rrb->map_bo->ptr;
 
@@ -449,6 +458,7 @@ radeon_unmap_renderbuffer(struct gl_context *ctx,
                                     rrb->map_w, rrb->map_h,
                                     GL_FALSE);
         assert(ok);
+        (void) ok;
     }
 
     radeon_bo_unref(rrb->map_bo);
-- 
2.3.1
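A minimal sketch of an early return after a failed map, assuming the map call follows the convention implied by `assert(!ret)`, namely that zero means success. Under that assumption the early return belongs on the nonzero branch. `struct sketch_bo`, `sketch_bo_map` and `sketch_map_renderbuffer` are hypothetical stand-ins and do not reproduce the radeon API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical buffer object; map_result is what the fake map call returns. */
struct sketch_bo {
    int map_result; /* 0 = success, nonzero = failure (assumed convention) */
    void *ptr;      /* CPU address once mapped */
};

static int sketch_bo_map(struct sketch_bo *bo)
{
    assert(bo != NULL);
    return bo->map_result;
}

/* Returns the mapping, or NULL when the map call failed. Using `ret` in a
 * real branch also keeps -Wunused-but-set-variable quiet in release builds. */
static void *sketch_map_renderbuffer(struct sketch_bo *bo)
{
    int ret = sketch_bo_map(bo);

    if (ret != 0) {
        return NULL; /* bail out on failure only */
    }
    return bo->ptr;
}
```

The success path continues to use the mapping, and only a failed map takes the early exit.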
[Mesa-dev] [PATCH v2] egl/dri2: Fix GCC maybe-uninitialized warning.
egl_dri2.c: In function ‘dri2_bind_tex_image’:
egl_dri2.c:1240:4: warning: ‘format’ may be used uninitialized in this function [-Wmaybe-uninitialized]
    (*dri2_dpy->tex_buffer->setTexBuffer2)(dri2_ctx->dri_context,
    ^

Suggested-by: Ilia Mirkin imir...@alum.mit.edu
Signed-off-by: Vinson Lee v...@freedesktop.org
---
 src/egl/drivers/dri2/egl_dri2.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c
index d503196..c5c475d 100644
--- a/src/egl/drivers/dri2/egl_dri2.c
+++ b/src/egl/drivers/dri2/egl_dri2.c
@@ -1226,7 +1226,8 @@ dri2_bind_tex_image(_EGLDriver *drv,
       format = __DRI_TEXTURE_FORMAT_RGBA;
       break;
    default:
-      assert(0);
+      _eglError(EGL_BAD_SURFACE, "unrecognized format");
+      return EGL_FALSE;
    }
 
    switch (dri2_surf->base.TextureTarget) {
@@ -1234,7 +1235,8 @@ dri2_bind_tex_image(_EGLDriver *drv,
       target = GL_TEXTURE_2D;
       break;
    default:
-      assert(0);
+      _eglError(EGL_BAD_SURFACE, "unrecognized target");
+      return EGL_FALSE;
    }
 
    (*dri2_dpy->tex_buffer->setTexBuffer2)(dri2_ctx->dri_context,
-- 
2.3.1
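The v2 approach, failing the call in the `default:` case instead of asserting, can be sketched as a small translation helper. The enums and `sketch_translate_format` are hypothetical; the real code reports the failure through `_eglError()` and returns `EGL_FALSE`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum sketch_surface_format {
    SKETCH_SURF_RGB,
    SKETCH_SURF_RGBA,
    SKETCH_SURF_OTHER
};

enum sketch_tex_format { SKETCH_TEX_RGB = 1, SKETCH_TEX_RGBA = 2 };

/* Translates a surface format. Unknown values make the function fail, so
 * the caller never goes on to read an output that was not written. */
static bool sketch_translate_format(enum sketch_surface_format in,
                                    enum sketch_tex_format *out)
{
    assert(out != NULL);

    switch (in) {
    case SKETCH_SURF_RGB:
        *out = SKETCH_TEX_RGB;
        return true;
    case SKETCH_SURF_RGBA:
        *out = SKETCH_TEX_RGBA;
        return true;
    default:
        return false; /* caller turns this into an API error */
    }
}
```

Unlike `assert(0)`, this behaves the same with and without `NDEBUG`, and every path that reaches the use of the value has assigned it, which is what removes the maybe-uninitialized warning.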
[Mesa-dev] [PATCH 4/4] Clover: use get_device_vendor instead of get_vendor
The pipe's get_vendor method returns something more akin to a driver vendor string in most cases, instead of the actual device vendor. Use get_device_vendor instead, which was introduced specifically for this purpose.
---
 src/gallium/state_trackers/clover/core/device.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/clover/core/device.cpp b/src/gallium/state_trackers/clover/core/device.cpp
index c3f3b4e..42b45b7 100644
--- a/src/gallium/state_trackers/clover/core/device.cpp
+++ b/src/gallium/state_trackers/clover/core/device.cpp
@@ -192,7 +192,7 @@ device::device_name() const {
 
 std::string
 device::vendor_name() const {
-   return pipe->get_vendor(pipe);
+   return pipe->get_device_vendor(pipe);
 }
 
 enum pipe_shader_ir
-- 
2.1.2.766.gaa23a90
[Mesa-dev] [PATCH 3/4] Implement get_device_vendor() for existing drivers
The only hackish ones are llvmpipe and softpipe, which currently return the same string as for get_vendor(), while ideally they should return the CPU vendor.
---
 src/gallium/drivers/freedreno/freedreno_screen.c |  8 ++++++++
 src/gallium/drivers/galahad/glhd_screen.c        | 10 ++++++++++
 src/gallium/drivers/i915/i915_screen.c           |  7 +++++++
 src/gallium/drivers/ilo/ilo_screen.c             |  7 +++++++
 src/gallium/drivers/llvmpipe/lp_screen.c         |  1 +
 src/gallium/drivers/noop/noop_pipe.c             |  6 ++++++
 src/gallium/drivers/nouveau/nouveau_screen.c     |  7 +++++++
 src/gallium/drivers/r300/r300_screen.c           |  6 ++++++
 src/gallium/drivers/radeon/r600_pipe_common.c    |  6 ++++++
 src/gallium/drivers/rbug/rbug_screen.c           | 10 ++++++++++
 src/gallium/drivers/softpipe/sp_screen.c         |  1 +
 src/gallium/drivers/svga/svga_screen.c           |  1 +
 src/gallium/drivers/trace/tr_screen.c            | 22 ++++++++++++++++++++++
 src/gallium/drivers/vc4/vc4_screen.c             |  1 +
 14 files changed, 93 insertions(+)

diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c b/src/gallium/drivers/freedreno/freedreno_screen.c
index 3e9a3f3..7ebf5e8 100644
--- a/src/gallium/drivers/freedreno/freedreno_screen.c
+++ b/src/gallium/drivers/freedreno/freedreno_screen.c
@@ -96,6 +96,13 @@ fd_screen_get_vendor(struct pipe_screen *pscreen)
 	return "freedreno";
 }
 
+static const char *
+fd_screen_get_device_vendor(struct pipe_screen *pscreen)
+{
+	return "Qualcomm";
+}
+
+
 static uint64_t
 fd_screen_get_timestamp(struct pipe_screen *pscreen)
 {
@@ -533,6 +540,7 @@ fd_screen_create(struct fd_device *dev)
 
 	pscreen->get_name = fd_screen_get_name;
 	pscreen->get_vendor = fd_screen_get_vendor;
+	pscreen->get_device_vendor = fd_screen_get_vendor;
 
 	pscreen->get_timestamp = fd_screen_get_timestamp;
 
diff --git a/src/gallium/drivers/galahad/glhd_screen.c b/src/gallium/drivers/galahad/glhd_screen.c
index 11ab1a9..9fafdbf 100644
--- a/src/gallium/drivers/galahad/glhd_screen.c
+++ b/src/gallium/drivers/galahad/glhd_screen.c
@@ -69,6 +69,15 @@ galahad_screen_get_vendor(struct pipe_screen *_screen)
    return screen->get_vendor(screen);
 }
 
+static const char *
+galahad_screen_get_device_vendor(struct pipe_screen *_screen)
+{
+   struct galahad_screen *glhd_screen = galahad_screen(_screen);
+   struct pipe_screen *screen = glhd_screen->screen;
+
+   return screen->get_device_vendor(screen);
+}
+
 static int
 galahad_screen_get_param(struct pipe_screen *_screen,
                          enum pipe_cap param)
@@ -361,6 +370,7 @@ galahad_screen_create(struct pipe_screen *screen)
    GLHD_SCREEN_INIT(destroy);
    GLHD_SCREEN_INIT(get_name);
    GLHD_SCREEN_INIT(get_vendor);
+   GLHD_SCREEN_INIT(get_device_vendor);
    GLHD_SCREEN_INIT(get_param);
    GLHD_SCREEN_INIT(get_shader_param);
    //GLHD_SCREEN_INIT(get_video_param);
diff --git a/src/gallium/drivers/i915/i915_screen.c b/src/gallium/drivers/i915/i915_screen.c
index dc76464..9ab9c48 100644
--- a/src/gallium/drivers/i915/i915_screen.c
+++ b/src/gallium/drivers/i915/i915_screen.c
@@ -55,6 +55,12 @@ i915_get_vendor(struct pipe_screen *screen)
 }
 
 static const char *
+i915_get_device_vendor(struct pipe_screen *screen)
+{
+   return "Intel";
+}
+
+static const char *
 i915_get_name(struct pipe_screen *screen)
 {
    static char buffer[128];
@@ -547,6 +553,7 @@ i915_screen_create(struct i915_winsys *iws)
 
    is->base.get_name = i915_get_name;
    is->base.get_vendor = i915_get_vendor;
+   is->base.get_device_vendor = i915_get_device_vendor;
    is->base.get_param = i915_get_param;
    is->base.get_shader_param = i915_get_shader_param;
    is->base.get_paramf = i915_get_paramf;
diff --git a/src/gallium/drivers/ilo/ilo_screen.c b/src/gallium/drivers/ilo/ilo_screen.c
index bf0a84a..80ea4c7 100644
--- a/src/gallium/drivers/ilo/ilo_screen.c
+++ b/src/gallium/drivers/ilo/ilo_screen.c
@@ -515,6 +515,12 @@ ilo_get_vendor(struct pipe_screen *screen)
 }
 
 static const char *
+ilo_get_device_vendor(struct pipe_screen *screen)
+{
+   return "Intel";
+}
+
+static const char *
 ilo_get_name(struct pipe_screen *screen)
 {
    struct ilo_screen *is = ilo_screen(screen);
@@ -844,6 +850,7 @@ ilo_screen_create(struct intel_winsys *ws)
    is->base.destroy = ilo_screen_destroy;
    is->base.get_name = ilo_get_name;
    is->base.get_vendor = ilo_get_vendor;
+   is->base.get_device_vendor = ilo_get_device_vendor;
    is->base.get_param = ilo_get_param;
    is->base.get_paramf = ilo_get_paramf;
    is->base.get_shader_param = ilo_get_shader_param;
diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c b/src/gallium/drivers/llvmpipe/lp_screen.c
index 3387d3a..4b45725 100644
--- a/src/gallium/drivers/llvmpipe/lp_screen.c
+++ b/src/gallium/drivers/llvmpipe/lp_screen.c
@@ -589,6 +589,7 @@ llvmpipe_create_screen(struct sw_winsys *winsys)
    screen->base.get_name = llvmpipe_get_name;
    screen->base.get_vendor = llvmpipe_get_vendor;
+
[Mesa-dev] [PATCH 2/4] Introduce get_device_vendor() entrypoint for pipes
This will be needed by Clover to return the correct information to
CL_DEVICE_VENDOR info queries.
---
 src/gallium/include/pipe/p_screen.h | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/src/gallium/include/pipe/p_screen.h b/src/gallium/include/pipe/p_screen.h
index 4018f8a..cba4c95 100644
--- a/src/gallium/include/pipe/p_screen.h
+++ b/src/gallium/include/pipe/p_screen.h
@@ -228,6 +228,15 @@ struct pipe_screen {
                                 unsigned index,
                                 struct pipe_driver_query_info *info);
+
+   /**
+    * Returns the device vendor.
+    *
+    * The returned value should return the actual device vendor/manufacturer,
+    * rather than a potentially generic driver string.
+    */
+   const char *(*get_device_vendor)( struct pipe_screen * );
+
 };

-- 
2.1.2.766.gaa23a90
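For readers unfamiliar with how pipe_screen hooks are wired up, here is a
minimal sketch of the new entrypoint in use. It is not Mesa code: the struct
below is a cut-down stand-in for the real struct pipe_screen, and the
example_* names and vendor strings are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Cut-down stand-in for struct pipe_screen (assumption: the real
 * structure has many more hooks; only the two vendor hooks are shown). */
struct pipe_screen {
   const char *(*get_vendor)(struct pipe_screen *);
   const char *(*get_device_vendor)(struct pipe_screen *);
};

/* Hypothetical driver whose driver vendor and hardware vendor differ. */
static const char *
example_get_vendor(struct pipe_screen *screen)
{
   (void)screen;
   return "Example Driver Project";
}

static const char *
example_get_device_vendor(struct pipe_screen *screen)
{
   (void)screen;
   return "Example Hardware Inc.";
}

/* Fill in the hooks, as a driver's screen-creation function would. */
void
example_screen_init(struct pipe_screen *screen)
{
   screen->get_vendor = example_get_vendor;
   screen->get_device_vendor = example_get_device_vendor;
}

/* Caller-side helper: returns 1 if the device vendor string differs
 * from the driver vendor string, 0 otherwise. */
int
screen_has_distinct_device_vendor(struct pipe_screen *screen)
{
   assert(screen != NULL);
   assert(screen->get_vendor != NULL);
   assert(screen->get_device_vendor != NULL);
   return strcmp(screen->get_vendor(screen),
                 screen->get_device_vendor(screen)) != 0;
}
```

A caller would initialise a screen with example_screen_init() and then call
screen->get_device_vendor(screen) whenever it needs the hardware vendor rather
than the driver string.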
[Mesa-dev] [PATCHv2 0/4] Separate device from driver vendor
OpenCL (as opposed to OpenGL) has separate vendor strings for the
implementation/driver/platform and for the device. CL_PLATFORM_VENDOR is akin
to the GL_VENDOR string, while CL_DEVICE_VENDOR is supposed to return the
actual device vendor. (For example, the AMD OpenCL platform returns
"GenuineIntel" as CL_DEVICE_VENDOR for an Intel CPU.)

This patchset separates (where possible/necessary) the device vendor from the
driver vendor in the existing gallium drivers, and makes clover use the newly
introduced device vendor for CL_DEVICE_VENDOR queries.

For some drivers, get_device_vendor is mapped to get_vendor. This is usually
because get_vendor already returns an appropriate string. The exceptions are
softpipe and llvmpipe, where the returned vendor should be the CPU vendor (and
I don't know enough about the structure of these drivers to extract the
information 8-P).

Changes since v1:
* vc4's get_device_vendor maps to vc4_screen_get_vendor() instead of the
  non-existent vc4_screen_get_device_vendor();
* freedreno should report "Qualcomm" as device vendor, so it needs an actual
  implementation of get_device_vendor distinct from that of get_vendor.

Giuseppe Bilotta (4):
  Whitespace cleanup
  Introduce get_device_vendor() entrypoint for pipes
  Implement get_device_vendor() for existing drivers
  Clover: use get_device_vendor instead of get_vendor

 src/gallium/drivers/freedreno/freedreno_screen.c  |  8 ++++++++
 src/gallium/drivers/galahad/glhd_screen.c         | 10 ++++++++++
 src/gallium/drivers/i915/i915_screen.c            |  7 +++++++
 src/gallium/drivers/ilo/ilo_screen.c              |  7 +++++++
 src/gallium/drivers/llvmpipe/lp_screen.c          |  1 +
 src/gallium/drivers/noop/noop_pipe.c              |  6 ++++++
 src/gallium/drivers/nouveau/nouveau_screen.c      |  7 +++++++
 src/gallium/drivers/r300/r300_screen.c            |  6 ++++++
 src/gallium/drivers/radeon/r600_pipe_common.c     |  6 ++++++
 src/gallium/drivers/rbug/rbug_screen.c            | 10 ++++++++++
 src/gallium/drivers/softpipe/sp_screen.c          |  1 +
 src/gallium/drivers/svga/svga_screen.c            |  1 +
 src/gallium/drivers/trace/tr_screen.c             | 22 ++++++++++++++++++++++
 src/gallium/drivers/vc4/vc4_screen.c              |  1 +
 src/gallium/include/pipe/p_screen.h               | 11 ++++++++++-
 src/gallium/state_trackers/clover/core/device.cpp |  2 +-
 16 files changed, 104 insertions(+), 2 deletions(-)

-- 
2.1.2.766.gaa23a90
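The two ways a driver can satisfy the new hook, as described in the cover
letter (reuse get_vendor, or provide a distinct implementation), might look
roughly like this. This is only an illustration under stated assumptions: the
struct is a reduced stand-in for struct pipe_screen, and the soc_*/oss_* driver
names, the device_vendor_is() helper and all vendor strings are hypothetical,
not taken from any real driver.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Reduced stand-in for struct pipe_screen (assumption: only the two
 * vendor hooks matter for this sketch). */
struct pipe_screen {
   const char *(*get_vendor)(struct pipe_screen *);
   const char *(*get_device_vendor)(struct pipe_screen *);
};

/* Strategy 1: the driver vendor string already names the hardware
 * vendor, so get_device_vendor simply points at get_vendor. */
static const char *
soc_get_vendor(struct pipe_screen *screen)
{
   (void)screen;
   return "ExampleSoC Ltd.";
}

void
soc_screen_init(struct pipe_screen *screen)
{
   screen->get_vendor = soc_get_vendor;
   screen->get_device_vendor = soc_get_vendor;   /* same function */
}

/* Strategy 2: the driver string names the driver project, so a
 * separate function reports the hardware manufacturer. */
static const char *
oss_get_vendor(struct pipe_screen *screen)
{
   (void)screen;
   return "example-oss-driver";
}

static const char *
oss_get_device_vendor(struct pipe_screen *screen)
{
   (void)screen;
   return "ExampleChip Co.";
}

void
oss_screen_init(struct pipe_screen *screen)
{
   screen->get_vendor = oss_get_vendor;
   screen->get_device_vendor = oss_get_device_vendor;
}

/* What a device-vendor query would compare against: returns 1 if the
 * screen reports `expected` as its device vendor, 0 otherwise. */
int
device_vendor_is(struct pipe_screen *screen, const char *expected)
{
   assert(screen != NULL && screen->get_device_vendor != NULL);
   return strcmp(screen->get_device_vendor(screen), expected) == 0;
}
```

Either way the caller only ever goes through get_device_vendor, so a state
tracker does not need to know which strategy a given driver chose.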