[Mesa-dev] [PATCH 06/15] i965: add helper for creating packing writemask

2016-07-19 Thread Timothy Arceri
For example where n=3 first_component=1 this will give us 0xE (WRITEMASK_YZW). Reviewed-by: Edward O'Callaghan --- src/mesa/drivers/dri/i965/brw_reg.h | 6 ++ 1 file changed, 6 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_reg.h
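
The arithmetic behind the example in the commit message can be sketched in C as below. This is only an illustration under assumptions, not the helper the patch adds to brw_reg.h: the name packing_writemask and its parameter order are hypothetical, and the only fact taken from the message is that n=3 with first_component=1 yields 0xE.

```c
#include <assert.h>

/* Hypothetical sketch, not the helper from the patch: build a mask with
 * n consecutive bits set, starting at bit first_component
 * (bit 0 = X, bit 1 = Y, bit 2 = Z, bit 3 = W).
 */
static unsigned
packing_writemask(unsigned n, unsigned first_component)
{
   /* The packed components must fit inside a single vec4. */
   assert(n >= 1 && first_component + n <= 4);

   /* (1 << n) - 1 sets the low n bits; the shift moves them into place. */
   return ((1u << n) - 1u) << first_component;
}
```

With n=3 and first_component=1 the low three bits (0x7) are shifted left by one, which gives 0xE as stated above.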

Re: [Mesa-dev] [PATCH] i965: Use tex_mocs instead of rb_mocs for GL images.

2016-07-19 Thread Kenneth Graunke
On Monday, July 18, 2016 10:58:31 PM PDT Ben Widawsky wrote: > On Mon, Jul 18, 2016 at 07:08:46PM -0700, Kenneth Graunke wrote: > > Fixes a 10-20% performance regression in OglCSDof caused by commit > > 5a8c89038abab0184ea72664ab390ec6ca58b4d6, which made images (in the > > image load/store sense)

[Mesa-dev] V5 ARB_enhanced_layouts packing support for i965 Gen6+

2016-07-19 Thread Timothy Arceri
V5: - rebase on Ken's interpolation clean-ups [1] V4: - add vec4 backend support and enable for Gen6+ V3: - Rewrite patch 9 (add support for packing arrays) to not add hacks to the type_size() functions. - Add packing support for the load_output intrinsics (patch 12) - Add

[Mesa-dev] [PATCH 02/15] i965: enable component packing for vs and fs

2016-07-19 Thread Timothy Arceri
Rather than trying to work out the total number of components used at a location we simply treat all outputs as vec4s. --- src/mesa/drivers/dri/i965/brw_fs.h | 1 - src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 22 ++ src/mesa/drivers/dri/i965/brw_fs_visitor.cpp |

[Mesa-dev] [PATCH 09/15] i965/vec4: add component packing for gs

2016-07-19 Thread Timothy Arceri
Reviewed-by: Edward O'Callaghan --- src/mesa/drivers/dri/i965/brw_vec4_gs_nir.cpp | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_nir.cpp b/src/mesa/drivers/dri/i965/brw_vec4_gs_nir.cpp index 9ebfb27..16d2410 100644 ---

[Mesa-dev] [PATCH 15/15] docs: mark ARB_enhanced_layouts as DONE for i965

2016-07-19 Thread Timothy Arceri
Reviewed-by: Edward O'Callaghan --- docs/GL3.txt | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/GL3.txt b/docs/GL3.txt index 1335397..ebaf4bf 100644 --- a/docs/GL3.txt +++ b/docs/GL3.txt @@ -193,11 +193,11 @@ GL 4.4, GLSL 4.40:

[Mesa-dev] [PATCH 12/15] i965/vec4: add support for packing tes inputs

2016-07-19 Thread Timothy Arceri
--- src/mesa/drivers/dri/i965/brw_vec4_tes.cpp | 14 ++ 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_tes.cpp b/src/mesa/drivers/dri/i965/brw_vec4_tes.cpp index 6639c86..8266a9d 100644 ---

[Mesa-dev] [PATCH 10/15] i965/vec4: support packing tcs inputs

2016-07-19 Thread Timothy Arceri
Reviewed-by: Edward O'Callaghan --- src/mesa/drivers/dri/i965/brw_vec4_tcs.cpp | 8 ++-- src/mesa/drivers/dri/i965/brw_vec4_tcs.h | 1 + 2 files changed, 7 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_tcs.cpp

[Mesa-dev] [PATCH 13/15] i965/vec4: add packing support for tes load outputs

2016-07-19 Thread Timothy Arceri
Reviewed-by: Edward O'Callaghan --- src/mesa/drivers/dri/i965/brw_vec4_tcs.cpp | 17 + src/mesa/drivers/dri/i965/brw_vec4_tcs.h | 1 + 2 files changed, 14 insertions(+), 4 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_tcs.cpp

[Mesa-dev] [PATCH 07/15] i965/vec4: add support for packing inputs

2016-07-19 Thread Timothy Arceri
Reviewed-by: Edward O'Callaghan --- src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp index f3b4528..33ad852 100644 ---

[Mesa-dev] [PATCH 01/15] i965: bring back type_size_vec4_times_4()

2016-07-19 Thread Timothy Arceri
We will use this for output varyings. To make component packing simpler we will just treat all varyings as vec4s. --- src/mesa/drivers/dri/i965/brw_fs.cpp | 13 + src/mesa/drivers/dri/i965/brw_shader.h | 1 + 2 files changed, 14 insertions(+) diff --git

Re: [Mesa-dev] [PATCH 09/12] st/va: add functions for VAAPI encode

2016-07-19 Thread Christian König
Am 19.07.2016 um 00:43 schrieb Boyuan Zhang: Add necessary functions/changes for VAAPI encoding to buffer and picture. These changes will allow driver to handle all Vaapi encode related operations. This patch doesn't change the Vaapi decode behaviour. Signed-off-by: Boyuan Zhang

Re: [Mesa-dev] [Mesa-stable] [PATCH] mapi: Export all GLES 3.1 functions in libGLESv2.so

2016-07-19 Thread Andreas Boll
Hi, sorry for being late but this patch doesn't mention that all those symbols should be exported in libGL.so too [1]. If you look at the history of static_data.py it was mentioned that this list of functions should never grow [2]. Thanks, Andreas [1]

[Mesa-dev] [PATCH 14/15] i965: enable ARB_enhanced_layouts for gen6+

2016-07-19 Thread Timothy Arceri
Reviewed-by: Edward O'Callaghan --- src/mesa/drivers/dri/i965/intel_extensions.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c b/src/mesa/drivers/dri/i965/intel_extensions.c index c557137..ec89094 100644 ---

[Mesa-dev] [PATCH 03/15] i965: add component packing support for load_output intrinsics

2016-07-19 Thread Timothy Arceri
--- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 38 +++- 1 file changed, 33 insertions(+), 5 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp index 395594f..e75e7f7 100644 ---

[Mesa-dev] [PATCH 05/15] i965: add helpers for creating component layout swizzle

2016-07-19 Thread Timothy Arceri
This will be used to swizzle components to the beginning or end of the vector based on the component layout qualifier and whether we are doing a load or store. Reviewed-by: Edward O'Callaghan --- src/mesa/drivers/dri/i965/brw_reg.h | 3 +++ 1 file changed, 3

[Mesa-dev] [PATCH 11/15] i965/vec4: add support for packing tcs outputs

2016-07-19 Thread Timothy Arceri
Reviewed-by: Edward O'Callaghan --- src/mesa/drivers/dri/i965/brw_vec4_tcs.cpp | 7 +++ 1 file changed, 7 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_tcs.cpp b/src/mesa/drivers/dri/i965/brw_vec4_tcs.cpp index 8bd150a..4bc3be7 100644 ---

[Mesa-dev] [PATCH 04/15] nir: add doubles component packing support

2016-07-19 Thread Timothy Arceri
This makes sure we give the correct driver location for doubles when using component packing. --- src/compiler/nir/nir_lower_io.c | 16 1 file changed, 16 insertions(+) diff --git a/src/compiler/nir/nir_lower_io.c b/src/compiler/nir/nir_lower_io.c index e480264..7a72e69 100644

[Mesa-dev] [PATCH 08/15] i965/vec4: add support for packing vs/gs/tes outputs

2016-07-19 Thread Timothy Arceri
Here we create a new output_generic_reg array with the ability to store the dst_reg for each component of user defined varyings. This is needed as the previous code only stored the dst_reg based on the varying location which meant packed varyings would overwrite each other. ---

Re: [Mesa-dev] [PATCH 11/12] st/va: add environmental variable to disable interlace

2016-07-19 Thread Christian König
Am 19.07.2016 um 00:43 schrieb Boyuan Zhang: Add environmental variable to disable interlace mode. At VAAPI decoding stage, driver can not distinguish b/w pure decoding case and transcoding case. And since interlace encoding is not supported, we have to disable interlace for transcoding case.

Re: [Mesa-dev] [PATCH 05/12] st/va: add encode entrypoint

2016-07-19 Thread Christian König
Am 19.07.2016 um 00:43 schrieb Boyuan Zhang: VAAPI passes PIPE_VIDEO_ENTRYPOINT_ENCODE as entry point for encoding case. We will save this encode entry point in config. config_id was used as profile previously. Now, config has both profile and entrypoint field, and config_id is used to get

Re: [Mesa-dev] [PATCH 02/12] vl: add entry point

2016-07-19 Thread Christian König
Am 19.07.2016 um 00:43 schrieb Boyuan Zhang: Add entrypoint to distinguish H.264 decode and encode. For example, in patch 5/11 when is calling "VaCreateContext", "pps" and "sps" shouldn't be allocated for H.264 encoding. So we need to use the entry_point to determine this is H.264 decode or

[Mesa-dev] [PATCH 91/95] i965/vec4: dump subnr for FIXED_GRF

2016-07-19 Thread Iago Toral Quiroga
This came in handy when debugging the payload setup for Tess Eval, since it prints correct subnr for attributes that can be loaded in the second half of a register. --- src/mesa/drivers/dri/i965/brw_vec4.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git

[Mesa-dev] [PATCH 85/95] i965/vec4/tcs: fix outputs for 64-bit data

2016-07-19 Thread Iago Toral Quiroga
--- src/mesa/drivers/dri/i965/brw_vec4_tcs.cpp | 29 +++-- 1 file changed, 27 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_tcs.cpp b/src/mesa/drivers/dri/i965/brw_vec4_tcs.cpp index 70f81a0..cdfcefa 100644 ---

[Mesa-dev] [PATCH 92/95] i965/vec4/scalarize_df: do not scalarize instructions with identity swizzles

2016-07-19 Thread Iago Toral Quiroga
We can implement them directly. Also, document other possible improvements for future reference. --- src/mesa/drivers/dri/i965/brw_vec4.cpp | 46 +- 1 file changed, 45 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp

[Mesa-dev] [PATCH 90/95] i965/vec4: implement force_vstride0 for FIXED_GRF

2016-07-19 Thread Iago Toral Quiroga
From: Samuel Iglesias Gonsálvez Signed-off-by: Samuel Iglesias Gonsálvez --- src/mesa/drivers/dri/i965/brw_vec4.cpp | 13 - 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp

[Mesa-dev] [PATCH 80/95] i965/vec4: fix move_push_constants_to_pull_constants() for 64-bit data

2016-07-19 Thread Iago Toral Quiroga
--- src/mesa/drivers/dri/i965/brw_vec4.cpp | 20 +--- 1 file changed, 17 insertions(+), 3 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index d7fbb5d..5c7a07a 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp +++

[Mesa-dev] [PATCH 78/95] i965/vec4: fix move_uniform_array_access_to_pull_constant() for 64-bit data

2016-07-19 Thread Iago Toral Quiroga
--- src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 19 +-- 1 file changed, 17 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp index 441a450..40ba648 100644 ---

[Mesa-dev] [PATCH 88/95] i965/vec4: split instructions that read 64-bit attrs in TessEval

2016-07-19 Thread Iago Toral Quiroga
The tessellation evaluation stage generates source regions with a vstride=0 for these so they hit the gen7 hardware decompression bug. Split them to prevent this. --- src/mesa/drivers/dri/i965/brw_vec4.cpp | 13 - 1 file changed, 12 insertions(+), 1 deletion(-) diff --git

[Mesa-dev] [PATCH 77/95] i965/vec4: fix scratch writes for 64bit data

2016-07-19 Thread Iago Toral Quiroga
Mostly the same stuff as usual: we need to shuffle the data before we write and we need to emit two 32-bit write messages (with appropriate 32-bit writemask channels set) for a full dvec4 scratch write. --- src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 64 ++ 1 file

[Mesa-dev] [PATCH 79/95] i965/vec4: fix indentation in move_push_constants_to_pull_constants()

2016-07-19 Thread Iago Toral Quiroga
--- src/mesa/drivers/dri/i965/brw_vec4.cpp | 60 +- 1 file changed, 30 insertions(+), 30 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index 99b30ce..d7fbb5d 100644 ---

[Mesa-dev] [PATCH 86/95] i965/vec4/tes: fix input loading for 64bit data types

2016-07-19 Thread Iago Toral Quiroga
FIXME: We need to fix the case where not all the attributes fit in the push constant buffer --- src/mesa/drivers/dri/i965/brw_vec4_tes.cpp | 63 +++--- 1 file changed, 48 insertions(+), 15 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_tes.cpp

[Mesa-dev] [PATCH 74/95] i965/vec4: do not split scratch read/write opcodes

2016-07-19 Thread Iago Toral Quiroga
64-bit scratch reads/writes require shuffling data around so we need to have access to the full 64-bit data. We will do the right thing for these when we emit the messages. --- src/mesa/drivers/dri/i965/brw_vec4.cpp | 9 + 1 file changed, 9 insertions(+) diff --git

[Mesa-dev] [PATCH 81/95] i965/vec4: make emit_pull_constant_load support 64-bit loads

2016-07-19 Thread Iago Toral Quiroga
This way callers don't need to know about 64-bit particularities and we reuse some code. --- src/mesa/drivers/dri/i965/brw_vec4.cpp | 22 ++- src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 81 ++ 2 files changed, 50 insertions(+), 53 deletions(-) diff --git

[Mesa-dev] [PATCH 87/95] i965/vec4/tes: fix setup_payload() for 64bit data types

2016-07-19 Thread Iago Toral Quiroga
Use a width of 2 with 64-bit attributes. Also, if we have a dvec split across two registers such that components XY are stored in the second half of a register and components ZW are stored in the first half of the next register, fix up the regioning parameters for channels ZW. ---

[Mesa-dev] [PATCH 89/95] i965/vec4: fix writes to Z/W:DF from a FIXED_GRF

2016-07-19 Thread Iago Toral Quiroga
These can happen, for example, in tessellation evaluation when it maps incoming attributes to FIXED_GRF registers. In this case, just as with VGRFs, we need to make sure we have vstride=0 for these to work. --- src/mesa/drivers/dri/i965/brw_vec4.cpp | 3 +-- 1 file changed, 1 insertion(+), 2

[Mesa-dev] [PATCH 83/95] i965/vec4/gs: fix input loading for 64bit data

2016-07-19 Thread Iago Toral Quiroga
From: Samuel Iglesias Gonsálvez Signed-off-by: Samuel Iglesias Gonsálvez --- src/mesa/drivers/dri/i965/brw_vec4_gs_nir.cpp | 43 +-- 1 file changed, 28 insertions(+), 15 deletions(-) diff --git

[Mesa-dev] [PATCH 82/95] i965/vec4: fix store output for 64-bit types

2016-07-19 Thread Iago Toral Quiroga
We need to shuffle the data before it is written to the URB. Also, dvec3/4 need two vec4 slots. --- src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 19 +++ 1 file changed, 15 insertions(+), 4 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp

[Mesa-dev] [PATCH 53/95] i965/disasm: fix subreg for dst in Align16 mode

2016-07-19 Thread Iago Toral Quiroga
There is a single bit for this, so it is a binary 0 or 1 meaning offset 0B or 16B respectively. --- src/mesa/drivers/dri/i965/brw_disasm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_disasm.c b/src/mesa/drivers/dri/i965/brw_disasm.c index

[Mesa-dev] [PATCH 56/95] i965/vec4: fix regs_written for doubles

2016-07-19 Thread Iago Toral Quiroga
--- src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp index 265bb17..ae8704a 100644 ---

[Mesa-dev] [PATCH 44/95] i965/vec4: teach CSE about exec_size, group and doubles

2016-07-19 Thread Iago Toral Quiroga
--- src/mesa/drivers/dri/i965/brw_vec4_cse.cpp | 30 -- 1 file changed, 24 insertions(+), 6 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp b/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp index 0c1f0c3..d1bd9fa 100644 ---

[Mesa-dev] [PATCH 2/3] r600: advertise 8 bits subpixel precision for viewport bounds

2016-07-19 Thread Józef Kucia
Signed-off-by: Józef Kucia --- src/gallium/drivers/r600/r600_pipe.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/r600/r600_pipe.c b/src/gallium/drivers/r600/r600_pipe.c index 6bd027b..a3b6189 100644 ---

Re: [Mesa-dev] [Mesa-stable] [PATCH] mapi: Massage code to allow clang to compile.

2016-07-19 Thread Emil Velikov
On 18 July 2016 at 21:54, Matt Turner wrote: > On Mon, Jul 11, 2016 at 10:49 AM, Matt Turner wrote: >> According to https://llvm.org/bugs/show_bug.cgi?id=19778#c3 this code >> was violating the spec, resulting in it failing to compile. >> >> Cc:

Re: [Mesa-dev] [PATCH] glsl: subroutine types cannot be compared

2016-07-19 Thread Andres Gomez
Dropping this patch. It seems I overlooked: https://lists.freedesktop.org/archives/mesa-dev/2016-June/119616.html On Mon, 2016-07-18 at 16:39 +0300, Andres Gomez wrote: > subroutine variables are to be used just in the way functions are > called. Although the spec doesn't say it explicitly, this

Re: [Mesa-dev] [PATCH 1/3] i965: Use intel_get_image_dims in alloc_texture_storage

2016-07-19 Thread Iago Toral
On Mon, 2016-07-18 at 22:16 -0700, Jason Ekstrand wrote: > The intel_get_image_dims helper function handles some image dimension > sanitization for us for things such as 1-D array textures.  We should > probably be using it here. > > Signed-off-by: Jason Ekstrand > Cc:

Re: [Mesa-dev] [Mesa-stable] [PATCH] mapi: Massage code to allow clang to compile.

2016-07-19 Thread Jose Fonseca
AFAICS, this code is only used when USE_X86_ASM/USE_X86_64_ASM. These are never defined on Windows (we never use the assembly files on Windows, regardless which compiler is used), therefore there should be no impact to MSVC or Windows builds. Acked-by: Jose Fonseca

Re: [Mesa-dev] [PATCH] glsl/ast: don't allow subroutine uniform comparisons

2016-07-19 Thread Andres Gomez
Hi, Just dropped: https://lists.freedesktop.org/archives/mesa-dev/2016-July/123485.html I didn't realize there was already this thread open. On Tue, 2016-06-07 at 09:59 -0700, Ian Romanick wrote: > On 06/06/2016 10:20 PM, Dave Airlie wrote: > > From: Dave Airlie > > > >

Re: [Mesa-dev] [PATCH 3/9] st/mesa: completely rewrite state atoms

2016-07-19 Thread Rob Clark
On Mon, Jul 18, 2016 at 9:11 AM, Marek Olšák wrote: > From: Marek Olšák > > The goal is to do this in st_validate_state: >while (dirty) > atoms[u_bit_scan()]->update(st); > > That implies that atoms can't specify which flags they consume. >
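
The loop quoted above can be sketched as a small self-contained C example. Everything here is a stand-in under assumptions: st_sketch, atom_sketch, record_update and scan_lowest_bit are hypothetical names, and scan_lowest_bit only mimics the idea of returning and clearing the lowest set bit; it is not Mesa's u_bit_scan and the structures are not the ones from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_ATOMS 4

/* Hypothetical stand-in for the state tracker context. */
struct st_sketch {
   uint32_t dirty;         /* one dirty bit per atom */
   int order[NUM_ATOMS];   /* atom indices in the order they were updated */
   int num_run;
};

/* Hypothetical stand-in for a state atom. */
struct atom_sketch {
   void (*update)(struct st_sketch *st, int index);
};

/* Example update callback: just records that the atom ran. */
static void
record_update(struct st_sketch *st, int index)
{
   st->order[st->num_run++] = index;
}

/* Return the index of the lowest set bit and clear it in *mask. */
static int
scan_lowest_bit(uint32_t *mask)
{
   int i = 0;

   assert(*mask != 0);
   while (!(*mask & (1u << i)))
      i++;
   *mask &= ~(1u << i);
   return i;
}

/* Visit only the atoms whose dirty bit is set, lowest bit first. */
static void
validate_state_sketch(struct st_sketch *st, const struct atom_sketch *atoms)
{
   while (st->dirty) {
      int i = scan_lowest_bit(&st->dirty);
      atoms[i].update(st, i);
   }
}
```

In this sketch the cost of validation scales with the number of dirty bits rather than with the total number of atoms, since clean atoms are never visited.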

Re: [Mesa-dev] [PATCH] glsl/ast: don't allow subroutine uniform comparisons

2016-07-19 Thread Andres Gomez
On Tue, 2016-06-07 at 15:20 +1000, Dave Airlie wrote: > From: Dave Airlie > > This fixes: > GL45-CTS.shader_subroutine.subroutines_cannot_be_assigned_float_int_values_or_be_compared > > though I'm not 100% sure why this is illegal from the spec, > but it makes us pass the

Re: [Mesa-dev] [PATCH 07/10] egl/android: Make drm_gralloc headers optional

2016-07-19 Thread Rob Clark
On Tue, Jul 19, 2016 at 6:54 AM, Emil Velikov wrote: > On 19 July 2016 at 04:21, Tomasz Figa wrote: >> On Tue, Jul 19, 2016 at 2:35 AM, Emil Velikov >> wrote: >>> On 18 July 2016 at 16:38, Tomasz Figa

Re: [Mesa-dev] [PATCH 3/4] radeonsi: set optimal settings in COMPUTE_RESOURCE_LIMITS

2016-07-19 Thread Nicolai Hähnle
On 18.07.2016 14:14, Marek Olšák wrote: From: Marek Olšák ported from Vulkan --- src/gallium/drivers/radeonsi/si_compute.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/src/gallium/drivers/radeonsi/si_compute.c

[Mesa-dev] Testing patches 9987 and 9988

2016-07-19 Thread
Hello I would like to test http://patchwork.freedesktop.org/series/9987/ and http://patchwork.freedesktop.org/series/9988/ but the mbox patches aren't compatible with mesa-git. Would it be possible to update 9987 and 9988 to match mesa-git? Do 9987 and 9988 assume additional public patches that

Re: [Mesa-dev] [PATCH 1/3] gallium: split transfer_inline_write into buffer and texture callbacks

2016-07-19 Thread Nicolai Hähnle
On 18.07.2016 14:25, Marek Olšák wrote: From: Marek Olšák to reduce the call indirections with u_resource_vtbl. The worst call tree you could get was: - u_transfer_inline_write_vtbl - u_default_transfer_inline_write - u_transfer_map_vtbl -

Re: [Mesa-dev] Testing patches 9987 and 9988

2016-07-19 Thread Mike Lothian
Hi You're best replying directly to the posts on the mailing list for these. Most folk won't know the patch series by their patchwork ID I think Marek posted a branch with his patches applied, it might be easier to test that, I'm sure he'll rebase his patches after review Cheers Mike On

Re: [Mesa-dev] [PATCH 4/4] radeonsi: emit PS exports last

2016-07-19 Thread Nicolai Hähnle
Patches 1, 3 & 4 are Reviewed-by: Nicolai Hähnle On 18.07.2016 14:14, Marek Olšák wrote: From: Marek Olšák This effectively removes s_waitcnt instructions after FP16 exports. Before: v_cvt_pkrtz_f16_f32_e32 v0, v0, v1 ; 5E000300

Re: [Mesa-dev] [PATCH 3/3] radeonsi: implement buffer_subdata without indirect calls

2016-07-19 Thread Nicolai Hähnle
Patches 2 & 3: Reviewed-by: Nicolai Hähnle On 18.07.2016 14:25, Marek Olšák wrote: From: Marek Olšák There is less noise in CPU profile data now. --- src/gallium/drivers/r600/r600_pipe.c| 2 +-

Re: [Mesa-dev] [PATCH 0/5] gallium pb_cache optimizations

2016-07-19 Thread Nicolai Hähnle
Series is Reviewed-by: Nicolai Hähnle On 18.07.2016 14:35, Marek Olšák wrote: Hi, These are small optimizations for reducing pb_cache overhead with Bioshock Infinite. Please review. Marek ___ mesa-dev mailing list

[Mesa-dev] [PATCH v2 3/8] nv50/ir: optimize ADD(ADD(a, b), c) to ADD3(a, b, c)

2016-07-19 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset --- .../drivers/nouveau/codegen/nv50_ir_peephole.cpp | 55 ++ 1 file changed, 55 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp

[Mesa-dev] [PATCH v2 5/8] nv50/ir: optimize ADD3(d, a, b, 0x0) to ADD(d, a, b)

2016-07-19 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 8 1 file changed, 8 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp

[Mesa-dev] [PATCH v2 2/8] gm107/ir: add emission for OP_ADD3

2016-07-19 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset --- .../drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp | 34 ++ 1 file changed, 34 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp

[Mesa-dev] [PATCH 84/95] i965/vec4/tcs: fix input loading for 64-bit data

2016-07-19 Thread Iago Toral Quiroga
--- src/mesa/drivers/dri/i965/brw_vec4_tcs.cpp | 27 --- 1 file changed, 24 insertions(+), 3 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_tcs.cpp b/src/mesa/drivers/dri/i965/brw_vec4_tcs.cpp index f61c612..70f81a0 100644 ---

[Mesa-dev] [PATCH 94/95] i965/vec4: enable ARB_gpu_shader_fp64 for Haswell

2016-07-19 Thread Iago Toral Quiroga
--- src/mesa/drivers/dri/i965/intel_extensions.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c b/src/mesa/drivers/dri/i965/intel_extensions.c index c557137..6ba44b8 100644 --- a/src/mesa/drivers/dri/i965/intel_extensions.c +++

[Mesa-dev] [PATCH 93/95] i965/vec4/scalarize_df: Always scalarize XY / ZW writemasks

2016-07-19 Thread Iago Toral Quiroga
Now that we are letting some instructions through without being fully scalarized we have to make sure that we do scalarize any that have XY / ZW writemasks, since these don't have native support. --- src/mesa/drivers/dri/i965/brw_vec4.cpp | 10 +- 1 file changed, 9 insertions(+), 1

[Mesa-dev] [PATCH 68/95] i965/vec4: Prevent copy propagation from violating pre-gen8 restrictions

2016-07-19 Thread Iago Toral Quiroga
In gen < 8 instructions that write more than one register need to read more than one register too. Make sure we don't break that restriction by copy propagating from a uniform. --- src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp | 7 +++ 1 file changed, 7 insertions(+) diff --git

[Mesa-dev] [PATCH 95/95] i965/gen7: expose OpenGL 4.0 on Haswell

2016-07-19 Thread Iago Toral Quiroga
ARB_gpu_shader_fp64 was the last piece missing. Notice that some hardware and kernel combinations do not support pipelined register writes, which are required for some OpenGL 4.0 features, in which case the driver won't expose 4.0. --- src/mesa/drivers/dri/i965/intel_extensions.c | 2 ++

[Mesa-dev] [PATCH 03/95] i965/vec4/nir: allocate two registers for dvec3/dvec4

2016-07-19 Thread Iago Toral Quiroga
From: Connor Abbott --- src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp index 6662a1e..1f8fa80 100644 ---

[Mesa-dev] [PATCH mesa] configure.ac: raise Mako required version to 0.8.0

2016-07-19 Thread Eric Engestrom
It seems [0] old versions of Mako are no longer supported. Emil mentioned it might need v0.8.0 [1] for isl_format_layout [2], although I didn't get a confirmation that it's really the minimum. Let's raise it to that to avoid getting other bugs. We might lower it a bit again later if it turns out

[Mesa-dev] [PATCH 72/95] i965/vec4: Do not use DepCtrl with 64-bit instructions

2016-07-19 Thread Iago Toral Quiroga
The BDW PRM says that it is not supported, but it seems that gen7 is also affected, since doing DepCtrl on double-float instructions leads to GPU hangs in some cases, which is probably not surprising knowing that this is not supported in new hardware iterations. The SKL PRMs do not mention this

[Mesa-dev] [PATCH 65/95] i965/vec4: Fix SSBO stores for 64-bit data

2016-07-19 Thread Iago Toral Quiroga
In this case we need to shuffle the 64-bit data before we write it to memory, source from reg_offset + 1 to write components Z and W and consider that each DF channel is twice as big. --- src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 40 -- 1 file changed, 32

[Mesa-dev] [PATCH 76/95] i965/vec4: fix scratch reads for 64bit data

2016-07-19 Thread Iago Toral Quiroga
--- src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 15 +-- 1 file changed, 13 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp index 454ad03..6e09778 100644 ---

[Mesa-dev] [PATCH 69/95] i965/vec4: don't propagate single-precision uniforms into 4-wide instructions

2016-07-19 Thread Iago Toral Quiroga
Otherwise we end up producing code that violates the register region restriction that says that when execsize == width and hstride != 0 the vstride can't be 0. --- src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp | 11 +++ 1 file changed, 11 insertions(+) diff --git

[Mesa-dev] [PATCH 73/95] i965/vec4: set force_vstride0 on any 64-bit source that has subnr > 0

2016-07-19 Thread Iago Toral Quiroga
From: Samuel Iglesias Gonsálvez Sometimes we emit code that has subnr > 0 to select the second half of a DF register (components Z or W). For example, the 64-bit shuffling code does this. For that code to work properly we need to make sure that we use a vstride=0 on

[Mesa-dev] [PATCH 70/95] i965/vec4: don't copy propagate if subnr is set

2016-07-19 Thread Iago Toral Quiroga
From: Samuel Iglesias Gonsálvez This means we would copy propagate partial reads or writes and that can affect the result. Signed-off-by: Samuel Iglesias Gonsálvez --- src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp | 3 +++ 1 file changed,

[Mesa-dev] [PATCH 71/95] i965/vec4: extend the DWORD multiply DepCtrl restriction to all gen8 platforms

2016-07-19 Thread Iago Toral Quiroga
--- src/mesa/drivers/dri/i965/brw_vec4.cpp | 9 ++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index e204d81..b4a22d1 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp +++

[Mesa-dev] [PATCH 75/95] i965/vec4: fix scratch offset for 64bit data

2016-07-19 Thread Iago Toral Quiroga
A vec4 is 16 bytes and a dvec4 is 32 bytes so for doubles we have to multiply the reladdr by 2. The reg_offset part is in units of 16 bytes and is used to select the low/high 16-byte chunk of a full dvec4, so we don't want to multiply that part of the address. ---
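
The address arithmetic described in this message can be illustrated with a short C sketch. It assumes that reladdr counts whole vec4 (or dvec4) elements and that the result is expressed in 16-byte units; the function name and signature are hypothetical, and the actual patch emits shader instructions rather than computing the value on the CPU.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical sketch: scratch offset in units of 16 bytes.
 *
 * reladdr    - relative index of the element being addressed
 * reg_offset - already in 16-byte units; for doubles it selects the
 *              low (0) or high (1) half of a dvec4
 * is_64bit   - true when the element is a dvec4 (32 bytes)
 */
static unsigned
scratch_offset_16b_units(unsigned reladdr, unsigned reg_offset, bool is_64bit)
{
   /* A dvec4 spans two 16-byte slots, so only the reladdr part is doubled. */
   unsigned scaled_reladdr = is_64bit ? reladdr * 2 : reladdr;

   return scaled_reladdr + reg_offset;
}
```

For instance, the high half of the fourth dvec4 (reladdr=3, reg_offset=1) lands at slot 3 * 2 + 1 = 7 in this sketch.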

[Mesa-dev] [PATCH 62/95] i965/vec4: Add a shuffle_64bit_data helper

2016-07-19 Thread Iago Toral Quiroga
SIMD4x2 64bit data is stored in register space like this: r0.0:DF x0 y0 z0 w0 r0.1:DF x1 y1 z1 w1 When we need to write data such as this to memory using 32-bit write messages we need to shuffle it in this fashion: r0.0:DF x0 y0 x1 y1 r0.1:DF z0 w0 z1 w1 and emit two 32-bit write messages,
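
The reordering described in this message can be sketched on a plain array of doubles. This is an illustration only: it assumes the eight values are laid out as x0 y0 z0 w0 x1 y1 z1 w1, the function name is hypothetical, and the real helper operates on registers in the vec4 backend rather than on memory.

```c
#include <assert.h>

/* Hypothetical sketch of the shuffle described above.
 *
 * src layout: x0 y0 z0 w0 | x1 y1 z1 w1
 * dst layout: x0 y0 x1 y1 | z0 w0 z1 w1
 */
static void
shuffle_64bit_sketch(const double src[8], double dst[8])
{
   assert(src != dst);   /* not an in-place shuffle */

   /* First half: XY of vertex 0 followed by XY of vertex 1. */
   dst[0] = src[0];
   dst[1] = src[1];
   dst[2] = src[4];
   dst[3] = src[5];

   /* Second half: ZW of vertex 0 followed by ZW of vertex 1. */
   dst[4] = src[2];
   dst[5] = src[3];
   dst[6] = src[6];
   dst[7] = src[7];
}
```

After the shuffle each half holds the same pair of components for both vertices, which matches the layout shown for the two 32-bit write messages.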

[Mesa-dev] [PATCH 66/95] i965/vec4: don't constant propagate 64-bit immediates

2016-07-19 Thread Iago Toral Quiroga
From: Connor Abbott v2: Also check if the instruction source target is 64-bit. (Samuel) Signed-off-by: Samuel Iglesias Gonsálvez --- src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp | 7 +++ 1 file changed, 7 insertions(+) diff

[Mesa-dev] [PATCH 63/95] i965/vec4: Fix UBO loads for 64-bit data

2016-07-19 Thread Iago Toral Quiroga
We need to emit two 32-bit load messages to load a full dvec4. If only 1 or 2 double components are needed dead-code-elimination will remove the second one. We also need to shuffle the result of the 32-bit messages to form valid 64-bit SIMD4x2 data. --- src/mesa/drivers/dri/i965/brw_vec4_nir.cpp

[Mesa-dev] [PATCH 64/95] i965/vec4: Fix SSBO loads for 64-bit data

2016-07-19 Thread Iago Toral Quiroga
Same requirements as for UBO loads. --- src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 32 -- 1 file changed, 26 insertions(+), 6 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp index 172bf48..5bc1fd5

[Mesa-dev] [PATCH 39/95] i965/vec4: dump the instruction execution size

2016-07-19 Thread Iago Toral Quiroga
--- src/mesa/drivers/dri/i965/brw_vec4.cpp | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index c55d594..8316691 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp +++

[Mesa-dev] [PATCH 57/95] i965/vec4: fix pack_uniform_registers for doubles

2016-07-19 Thread Iago Toral Quiroga
We need to consider the fact that dvec3/4 require two vec4 slots. --- src/mesa/drivers/dri/i965/brw_vec4.cpp | 11 +-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index 1b190ab..95b408e
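
The slot count mentioned in this message follows from the sizes involved: a vec4 slot is 16 bytes and each double component is 8 bytes. A minimal C sketch of that arithmetic is given below; the function name is hypothetical and this is not the code from the patch.

```c
#include <assert.h>

/* Hypothetical sketch: number of 16-byte vec4 slots needed by a
 * double-precision vector with the given number of components.
 */
static unsigned
vec4_slots_for_double_vector(unsigned components)
{
   assert(components >= 1 && components <= 4);

   /* 8 bytes per double, rounded up to whole 16-byte slots. */
   return (components * 8 + 15) / 16;
}
```

A double or dvec2 fits in one slot (8 or 16 bytes), while a dvec3 or dvec4 (24 or 32 bytes) needs two.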

[Mesa-dev] [PATCH 54/95] i965/vec4: fix regs_read() for doubles

2016-07-19 Thread Iago Toral Quiroga
--- src/mesa/drivers/dri/i965/brw_vec4.cpp | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index 9400baa..a366548 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp +++

[Mesa-dev] [Bug 96993] new gallium swr driver can not be built on Windows

2016-07-19 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=96993 Bug ID: 96993 Summary: new gallium swr driver can not be built on Windows Product: Mesa Version: unspecified Hardware: Other OS: Windows (All) Status: NEW

Re: [Mesa-dev] [PATCH 02/15] i965: enable component packing for vs and fs

2016-07-19 Thread Timothy Arceri
On Tue, 2016-07-19 at 13:03 +0200, Alejandro Piñeiro wrote: > Is this the correct version of the patch? It uses nir_lower_io with 4 > parameters, while nir_lower_io on master uses 3 (and afaik, it has > been > using 3 for a while). > > FWIW, this patch doesn't apply cleanly with current master

[Mesa-dev] [PATCH] freedreno/a2xx: silence missing case 'SHADER_COMPUTE' warning

2016-07-19 Thread Francesco Ansanelli
--- src/gallium/drivers/freedreno/a2xx/disasm-a2xx.c |3 +++ 1 file changed, 3 insertions(+) diff --git a/src/gallium/drivers/freedreno/a2xx/disasm-a2xx.c b/src/gallium/drivers/freedreno/a2xx/disasm-a2xx.c index f00d5d4..54b3514 100644 --- a/src/gallium/drivers/freedreno/a2xx/disasm-a2xx.c

[Mesa-dev] [PATCH 67/95] i965/vec4: prevent copy-propagation from values with a different type size

2016-07-19 Thread Iago Toral Quiroga
Because the meaning of the swizzles and writemasks involved is different, so replacing the source would lead to different semantics. --- src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp | 7 +++ 1 file changed, 7 insertions(+) diff --git

[Mesa-dev] [PATCH 60/95] i965/vec4/nir: do not emit 64-bit MAD

2016-07-19 Thread Iago Toral Quiroga
RepCtrl=1 does not work with 64-bit operands so we need to use RepCtrl=0. In that situation, the regioning generated for the sources seems to be equivalent to <4,4,1>:DF, so it will only work for components XY, which means that we have to move any other swizzle to a temporary so that we can

[Mesa-dev] [PATCH 61/95] i965/vec4: do not emit 64-bit MAD

2016-07-19 Thread Iago Toral Quiroga
RepCtrl=1 does not work with 64-bit operands so we need to use RepCtrl=0. In that situation, the regioning generated for the sources seems to be equivalent to <4,4,1>:DF, so it will only work for components XY, which means that we have to move any other swizzle to a temporary so that we can

[Mesa-dev] [PATCH] configure.ac: Use ${datarootdir} for --with-vulkan-icddir help string too

2016-07-19 Thread Andreas Boll
The help string wasn't updated in cbc37f7. Fixes: cbc37f7 ("anv: install the intel_icd.json to ${datarootdir} by default") Signed-off-by: Andreas Boll --- configure.ac | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/configure.ac b/configure.ac

[Mesa-dev] [PATCH 51/95] i965/vec4: add a sanity check for force_vstride0

2016-07-19 Thread Iago Toral Quiroga
We only set this to true when fixing up 64-bit regions and for one specific purpose only, so check that nothing else sets this to true. This helped me find a bug where the field was incorrectly initialized to true in some cases. --- src/mesa/drivers/dri/i965/brw_vec4.cpp | 3 +++ 1 file changed, 3

[Mesa-dev] [PATCH 55/95] i965/vec4: teach register coalescing about 64-bit

2016-07-19 Thread Iago Toral Quiroga
Specifically, at least for now, we don't want to deal with the fact that channel sizes for fp64 instructions are twice the size, so prevent coalescing from instructions with a different type size. Also, we should check that if we are coalescing a register from another MOV we should be reading the

[Mesa-dev] [PATCH 49/95] i965/vec4: implement access to DF source components Z/W

2016-07-19 Thread Iago Toral Quiroga
The general idea is that with 32-bit swizzles we cannot address DF components Z/W directly, so instead we select the region that starts at the middle of the SIMD register and use X/Y swizzles. The above, however, has the caveat that we can't do that without violating register region restrictions
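The mapping described in that snippet can be sketched roughly as follows. This is only an illustration of the stated idea (Z/W are reached by starting the region at the middle of the register and using X/Y swizzles); the enum, struct and function names are hypothetical, and the register region restrictions mentioned as a caveat are not modelled:

```c
#include <assert.h>

/* Hypothetical names for illustration only, not Mesa's data structures. */
enum df_chan { DF_X = 0, DF_Y, DF_Z, DF_W };

struct df_access {
   unsigned half;        /* 0: region starts at register start, 1: at the middle */
   enum df_chan swizzle; /* always DF_X or DF_Y */
};

/* Map a logical DF component to the register half and the X/Y swizzle
 * used to read it. */
static struct df_access
df_component_access(enum df_chan c)
{
   assert(c >= DF_X && c <= DF_W);
   struct df_access a;
   a.half = (unsigned)c / 2;                     /* Z and W live in the second half */
   a.swizzle = (enum df_chan)((unsigned)c % 2);  /* Z reads as X, W reads as Y */
   return a;
}
```

With this mapping, X and Y are read from the first half directly, and Z and W are read as X and Y of the second half.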

[Mesa-dev] [PATCH 52/95] i965/vec4: print subnr in dump_instruction()

2016-07-19 Thread Iago Toral Quiroga
Also, we use reg_offset=1 with DF uniforms when we try to access components Z/W, so print reg_offset for them too. --- src/mesa/drivers/dri/i965/brw_vec4.cpp | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp

[Mesa-dev] [PATCH 59/95] i965/vec4: Skip swizzle to subnr in 3src instructions with DF operands

2016-07-19 Thread Iago Toral Quiroga
We make scalar sources in 3src instructions use subnr instead of swizzles because they don't really use swizzles. With doubles it is more complicated because we use vstride=0 in more scenarios in which they don't produce scalar regions. Also RepCtrl is not allowed with 64-bit operands, so we

[Mesa-dev] [PATCH 43/95] i965/disasm: print NibCtrl for instructions with execsize 4

2016-07-19 Thread Iago Toral Quiroga
--- src/mesa/drivers/dri/i965/brw_disasm.c | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_disasm.c b/src/mesa/drivers/dri/i965/brw_disasm.c index c8bdeab..d5e9916 100644 --- a/src/mesa/drivers/dri/i965/brw_disasm.c +++

[Mesa-dev] [PATCH 58/95] i965/vec4: fix indentation in pack_uniform_registers

2016-07-19 Thread Iago Toral Quiroga
--- src/mesa/drivers/dri/i965/brw_vec4.cpp | 30 +++--- 1 file changed, 15 insertions(+), 15 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index 95b408e..68efea6 100644 ---

[Mesa-dev] [PATCH 45/95] i965/vec4: split double-precision bcsel

2016-07-19 Thread Iago Toral Quiroga
There is a hardware bug affecting compressed double-precision bcsel instructions in align16 mode by which they do not read the predication mask properly, leading to incorrect behavior at least in non-uniform control flow scenarios. The bug does not affect other predicated instructions and it does not

[Mesa-dev] [PATCH 35/95] i965/vec4: fix optimize predicate for doubles

2016-07-19 Thread Iago Toral Quiroga
--- src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp index c9b8edf..d7c6bf4 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp +++

Re: [Mesa-dev] [PATCH] configure.ac: Use ${datarootdir} for --with-vulkan-icddir help string too

2016-07-19 Thread Eric Engestrom
On Tue, Jul 19, 2016 at 12:45:54PM +0200, Andreas Boll wrote: > The help string wasn't updated in cbc37f7. > > Fixes: cbc37f7 ("anv: install the intel_icd.json to ${datarootdir} by > default") > > Signed-off-by: Andreas Boll Good catch! Reviewed-by: Eric Engestrom

Re: [Mesa-dev] [Mesa-stable] [PATCH] mapi: Export all GLES 3.1 functions in libGLESv2.so

2016-07-19 Thread Emil Velikov
On 19 July 2016 at 09:55, Andreas Boll wrote: > Hi, > > sorry for being late but this patch doesn't mention that all those > symbols should be exported in libGL.so too [1]. > If you look at the history of static_data.py it was mentioned that > this list of functions
