[Mesa-dev] [Bug 98172] Concurrent call to glClientWaitSync results in segfault in one of the waiters.

2016-10-11 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=98172 --- Comment #2 from Michel Dänzer --- Created attachment 127204 --> https://bugs.freedesktop.org/attachment.cgi?id=127204&action=edit Work with a local reference of so->fence Does this patch help?

Re: [Mesa-dev] [PATCH] radeonsi: emit TA_CS_BC_BASE_ADDR on SI only if the kernel allows it

2016-10-11 Thread Nicolai Hähnle
Reviewed-by: Nicolai Hähnle On 10.10.2016 13:25, Marek Olšák wrote: From: Marek Olšák The kernel patch has been sent to amd-gfx. --- src/gallium/drivers/radeonsi/si_compute.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/radeonsi/si_compute.c

[Mesa-dev] [PATCH v2 014/103] i965/disasm: align16 DF source regions have a width of 2

2016-10-11 Thread Iago Toral Quiroga
Reviewed-by: Francisco Jerez --- src/mesa/drivers/dri/i965/brw_disasm.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_disasm.c b/src/mesa/drivers/dri/i965/brw_disasm.c index 5e51be7..1d2a4d2 100644 --- a/src/mesa/drivers/dri/i965/brw_disasm

[Mesa-dev] [PATCH v2 000/103] i965 Haswell ARB_gpu_shader_fp64 / OpenGL 4.0

2016-10-11 Thread Iago Toral Quiroga
It's been some time since we sent the first version of the patches, so here is a v2, which adds: 1. Feedback from Curro to v1. I think the only thing missing is the suggestion to change the semantics of the offset() helper in vec4 to match those in the scalar backend. I sent this as a separate ser

[Mesa-dev] [PATCH 2/3] radv/winsys: Move a 'default:' to the end of case stmt

2016-10-11 Thread Edward O'Callaghan
Shift this down and maintain the exact same behaviour as the current code. Signed-off-by: Edward O'Callaghan --- src/amd/vulkan/winsys/amdgpu/radv_amdgpu_bo.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_bo.c b/src/amd/vulkan/win

[Mesa-dev] [PATCH v2 017/103] i965/vec4: add VEC4_OPCODE_PICK_{LOW, HIGH}_32BIT opcodes

2016-10-11 Thread Iago Toral Quiroga
These opcodes will pick the low/high 32-bit in each 64-bit data element using Align1 mode. We will use this, for example, to do things like unpackDouble2x32. We use Align1 mode because in order to implement this in Align16 mode we would need to use 32-bit logical swizzles (XZ for low, YW for high)
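
As a rough CPU-side illustration of what picking the low or high 32 bits of each 64-bit element means, the following C sketch models the data movement only. The helper names are hypothetical and are not part of Mesa; the real opcodes are emitted by the i965 generator as Align1 instructions, which this sketch does not attempt to reproduce.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers (not Mesa API): select one 32-bit half of a
 * 64-bit data element. */
static uint32_t pick_low_32bit(uint64_t elem)
{
    return (uint32_t)(elem & 0xffffffffu);
}

static uint32_t pick_high_32bit(uint64_t elem)
{
    return (uint32_t)(elem >> 32);
}

/* Models GLSL unpackDouble2x32: out[0] receives the 32 least
 * significant bits of the double, out[1] the 32 most significant. */
static void unpack_double_2x32(double d, uint32_t out[2])
{
    uint64_t bits;

    assert(sizeof d == sizeof bits);
    memcpy(&bits, &d, sizeof bits);  /* reinterpret without aliasing issues */
    out[0] = pick_low_32bit(bits);
    out[1] = pick_high_32bit(bits);
}
```

For example, unpacking the double 1.0 (bit pattern 0x3FF0000000000000) would give a low half of 0 and a high half of 0x3FF00000.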

[Mesa-dev] [PATCH 1/3] radv/winsys: Trivial style and readability fixups

2016-10-11 Thread Edward O'Callaghan
Drop/add a few newlines where appropriate and drop a couple of unnecessary braces. Signed-off-by: Edward O'Callaghan --- src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c | 16 ++-- src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.h | 2 +- src/amd/vulkan/winsys/amdgpu/radv_amdgpu_su

[Mesa-dev] [PATCH v2 013/103] i965/vec4: set correct register regions for 32-bit and 64-bit

2016-10-11 Thread Iago Toral Quiroga
For 32-bit instructions we want to use <4,4,1> regions for VGRF sources so we should really set a width of 4 (we were setting 8). For 64-bit instructions we want to use a width of 2 because the hardware uses 32-bit swizzles, meaning that we can only address 2 consecutive 64-bit components in a row

[Mesa-dev] Various radv fixups, style + one mem leak fix

2016-10-11 Thread Edward O'Callaghan
Nothing major here, patch 3 is the only interesting one. Edward O'Callaghan (3): [PATCH 1/3] radv/winsys: Trivial style and readability fixups [PATCH 2/3] radv/winsys: Move a 'default:' to the end of case stmt [PATCH 3/3] radv/winsys: Fix mem leak at failed do_winsys_init() call ___

[Mesa-dev] [PATCH v2 022/103] i965/vec4: implement double packing

2016-10-11 Thread Iago Toral Quiroga
--- src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 11 +++ 1 file changed, 11 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp index 2631bf3..37c3d7c 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp +++ b/src/mesa/dri

[Mesa-dev] [PATCH v2 032/103] i965/vec4: implement d2i, d2u, i2d and u2d

2016-10-11 Thread Iago Toral Quiroga
--- src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 14 ++ 1 file changed, 14 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp index 0170d21..cc10247 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp +++ b/src/mesa/

[Mesa-dev] [PATCH v2 029/103] i965/vec4: Rename DF to/from F generator opcodes

2016-10-11 Thread Iago Toral Quiroga
The opcodes are not specific for conversions to/from float since we need the same for conversions to/from other 32-bit types. Rename the opcodes accordingly and change the asserts to check the size of the types involved instead. --- src/mesa/drivers/dri/i965/brw_defines.h | 4 ++--

[Mesa-dev] [PATCH v2 008/103] i965/vec4: add support for printing DF immediates

2016-10-11 Thread Iago Toral Quiroga
From: Connor Abbott Reviewed-by: Francisco Jerez --- src/mesa/drivers/dri/i965/brw_vec4.cpp | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index 6aa9102..c29cfb5 100644 --- a/src/mesa/drivers/dri/i965/brw_v

[Mesa-dev] [PATCH 3/3] radv/winsys: Fix mem leak at failed do_winsys_init() call site

2016-10-11 Thread Edward O'Callaghan
Probably unlikely; however, ensure we don't leak a heap allocation on the fail path. Signed-off-by: Edward O'Callaghan --- src/amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys.c b/src/amd/vulkan/winsy
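
The general pattern of releasing a heap allocation when an init step fails can be sketched as below. This is a minimal illustration under stated assumptions: every name here (`sketch_winsys`, `sketch_init`, `sketch_winsys_create`, `live_allocs`) is hypothetical, and the code is not the radv winsys implementation, whose diff is truncated above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a winsys object. */
struct sketch_winsys {
    int fd;
};

/* Counts outstanding allocations so the fail path can be checked. */
static int live_allocs;

/* Hypothetical init step that fails for an invalid descriptor. */
static bool sketch_init(struct sketch_winsys *ws, int fd)
{
    ws->fd = fd;
    return fd >= 0;
}

static struct sketch_winsys *sketch_winsys_create(int fd)
{
    struct sketch_winsys *ws = calloc(1, sizeof *ws);

    if (!ws)
        return NULL;
    live_allocs++;

    if (!sketch_init(ws, fd)) {
        /* Fail path: free the allocation before bailing out. */
        free(ws);
        live_allocs--;
        return NULL;
    }
    return ws;
}

static void sketch_winsys_destroy(struct sketch_winsys *ws)
{
    assert(ws != NULL && live_allocs > 0);
    free(ws);
    live_allocs--;
}
```

Without the `free()` on the fail path, each failed create call would leave one allocation unreachable.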

[Mesa-dev] [PATCH v2 007/103] i965/vec4/nir: fix emitting 64-bit immediates

2016-10-11 Thread Iago Toral Quiroga
--- src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 22 ++ 1 file changed, 18 insertions(+), 4 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp index 05e7f29..ce95c8d 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4

[Mesa-dev] [PATCH v2 010/103] i965/vec4: translate d2f/f2d

2016-10-11 Thread Iago Toral Quiroga
--- src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 24 1 file changed, 24 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp index ce95c8d..b75337c 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp +++ b

[Mesa-dev] [PATCH v2 002/103] i965/vec4/nir: simplify glsl_type_for_nir_alu_type()

2016-10-11 Thread Iago Toral Quiroga
From: Connor Abbott Less duplication, one less case to handle for doubles and support for sized NIR types. v2: Fix call to get_instance by swapping rows and columns params (Iago) Signed-off-by: Iago Toral Quiroga Reviewed-by: Francisco Jerez --- src/mesa/drivers/dri/i965/brw_vec4_nir.cpp

[Mesa-dev] [PATCH v2 026/103] i965/vec4: fix get_nir_dest() to use DF type for 64-bit destinations

2016-10-11 Thread Iago Toral Quiroga
v2: Make dst_reg_for_nir_reg() handle this for nir_register since we want to have the correct type set before we call offset(). --- src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp b/

[Mesa-dev] [PATCH v2 003/103] i965/vec4/nir: allocate two registers for dvec3/dvec4

2016-10-11 Thread Iago Toral Quiroga
From: Connor Abbott v2 (Curro): - Do not special-case for a bit-size of 64, divide the bit_size by 32 instead. - Use DIV_ROUND_UP so we can handle sub-32-bit types. --- src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/s
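
The two v2 review points can be illustrated with a small C sketch. It assumes the register count is scaled by the number of 32-bit units per channel; the function `sketch_num_regs` and its parameters are hypothetical, and the macro is defined locally for the example rather than taken from Mesa's headers.

```c
#include <assert.h>

/* Local definition for the sketch: integer division rounding up. */
#define DIV_ROUND_UP(a, b) (((a) + (b) - 1) / (b))

/* Hypothetical helper: registers to allocate for a NIR register with
 * the given number of array elements and per-channel bit size. */
static unsigned sketch_num_regs(unsigned array_elems, unsigned bit_size)
{
    assert(array_elems > 0 && bit_size > 0);

    /* No special case for 64: 64 / 32 = 2, 32 / 32 = 1.
     * Rounding up keeps sub-32-bit types at 1 instead of 0. */
    return array_elems * DIV_ROUND_UP(bit_size, 32);
}
```

With plain truncating division a 16-bit type would compute 16 / 32 = 0 registers, which is why rounding up matters for sub-32-bit types.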

[Mesa-dev] [PATCH v2 012/103] i965: add brw_vecn_grf()

2016-10-11 Thread Iago Toral Quiroga
From: Connor Abbott Reviewed-by: Francisco Jerez --- src/mesa/drivers/dri/i965/brw_reg.h | 6 ++ 1 file changed, 6 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_reg.h b/src/mesa/drivers/dri/i965/brw_reg.h index 8907c9c..1fa2595 100644 --- a/src/mesa/drivers/dri/i965/brw_reg.h +

[Mesa-dev] [PATCH v2 034/103] i965/vec4: implement fsign() for doubles

2016-10-11 Thread Iago Toral Quiroga
v2: use a predicated MOV instead of a CMP, like we do in d2b, to skip loading a double immediate. --- src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 64 +++--- 1 file changed, 49 insertions(+), 15 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp b/sr

[Mesa-dev] [PATCH v2 018/103] i965/vec4: add VEC4_OPCODE_SET_{LOW, HIGH}_32BIT opcodes

2016-10-11 Thread Iago Toral Quiroga
These opcodes will set the low/high 32-bit in each 64-bit data element using Align1 mode. We will use this to implement packDouble2x32. We use Align1 mode because in order to implement this in Align16 mode we would need to use 32-bit logical swizzles (XZ for low, YW for high), but the IR works in
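
A CPU-side model of setting one 32-bit half of a 64-bit element, and of building packDouble2x32 from two such writes, might look like the C sketch below. The helper names are hypothetical and not Mesa API; the sketch only models the data movement, not the Align1 code the generator emits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers (not Mesa API): overwrite one 32-bit half of a
 * 64-bit element and leave the other half untouched. */
static uint64_t set_low_32bit(uint64_t elem, uint32_t v)
{
    return (elem & 0xffffffff00000000ull) | v;
}

static uint64_t set_high_32bit(uint64_t elem, uint32_t v)
{
    return (elem & 0x00000000ffffffffull) | ((uint64_t)v << 32);
}

/* Models GLSL packDouble2x32: in[0] supplies the 32 least significant
 * bits, in[1] the 32 most significant bits. */
static double pack_double_2x32(const uint32_t in[2])
{
    uint64_t bits = 0;
    double d;

    assert(sizeof d == sizeof bits);
    bits = set_low_32bit(bits, in[0]);   /* partial write 1 */
    bits = set_high_32bit(bits, in[1]);  /* partial write 2 */
    memcpy(&d, &bits, sizeof d);
    return d;
}
```

Both writes target the same 64-bit element and each one is partial, so neither makes the other redundant.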

[Mesa-dev] [PATCH v2 004/103] i965/vec4/nir: Add bit-size information to types

2016-10-11 Thread Iago Toral Quiroga
Reviewed-by: Francisco Jerez --- src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp index af76730..5048c4e 100644 --- a/src/mesa/drivers/dri

[Mesa-dev] [PATCH v2 011/103] i965: fix subnr overflow in suboffset()

2016-10-11 Thread Iago Toral Quiroga
--- src/mesa/drivers/dri/i965/brw_reg.h | 13 + 1 file changed, 5 insertions(+), 8 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_reg.h b/src/mesa/drivers/dri/i965/brw_reg.h index 3b46d27..8907c9c 100644 --- a/src/mesa/drivers/dri/i965/brw_reg.h +++ b/src/mesa/drivers/dri/i9

[Mesa-dev] [PATCH v2 005/103] i965/vec4/nir: support doubles in ALU operations

2016-10-11 Thread Iago Toral Quiroga
Basically, this involves considering the bit-size information to set the appropriate type on both operands and destination. v2 (Curro) - Don't use two temporaries (and write one of them twice) to obtain the nir_alu_type. Reviewed-by: Francisco Jerez --- src/mesa/drivers/dri/i965/brw_vec4

[Mesa-dev] [PATCH v2 028/103] i965/vec4: fix register allocation for 64-bit undef sources

2016-10-11 Thread Iago Toral Quiroga
Reviewed-by: Francisco Jerez --- src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp index fdd3cba..4dffd76 100644 --- a/src/mesa/drivers/dri/i965/

[Mesa-dev] [PATCH v2 021/103] i965/vec4: implement double unpacking

2016-10-11 Thread Iago Toral Quiroga
--- src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 12 1 file changed, 12 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp index 04f70ef..2631bf3 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp +++ b/src/mesa/dr

[Mesa-dev] [PATCH v2 006/103] i965/vec4/nir: set the right type for 64-bit registers

2016-10-11 Thread Iago Toral Quiroga
From: Connor Abbott --- src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp index 0d4c8f5..05e7f29 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp +++ b/

[Mesa-dev] [PATCH v2 035/103] i965/vec4: fix optimize predicate for doubles

2016-10-11 Thread Iago Toral Quiroga
--- src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp index c0cb141..088ed13 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp +++ b/src

[Mesa-dev] [PATCH v2 067/103] i965/vec4: Fix SSBO stores for 64-bit data

2016-10-11 Thread Iago Toral Quiroga
In this case we need to shuffle the 64-bit data before we write it to memory, source from reg_offset + 1 to write components Z and W and consider that each DF channel is twice as big. --- src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 40 -- 1 file changed, 32 insertions(

[Mesa-dev] [PATCH v2 044/103] i965/vec4: add a horiz_offset() helper

2016-10-11 Thread Iago Toral Quiroga
This will come in handy when we implement a simd lowering pass in a follow-up patch. --- src/mesa/drivers/dri/i965/brw_ir_vec4.h | 41 + 1 file changed, 41 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_ir_vec4.h b/src/mesa/drivers/dri/i965/brw_ir_vec4.h

[Mesa-dev] [PATCH v2 041/103] i965/vec4: use the IR's execution size

2016-10-11 Thread Iago Toral Quiroga
In the vec4 backend the generator sets the execution size to 8 for all instructions by default. However, to implement 64-bit floating-point we will need to split certain instructions into smaller sizes, so we need the IR to convey this information like we do in the scalar backend. This patch uses the

[Mesa-dev] [PATCH v2 051/103] i965/vec4: teach cmod propagation about different execution sizes

2016-10-11 Thread Iago Toral Quiroga
We can't propagate the conditional modifier from one instruction to another of a different execution size / group, since that would change the channels affected by the conditional. --- src/mesa/drivers/dri/i965/brw_vec4_cmod_propagation.cpp | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)

[Mesa-dev] [PATCH v2 074/103] i965/vec4: Do not use DepCtrl with 64-bit instructions

2016-10-11 Thread Iago Toral Quiroga
The BDW PRM says that it is not supported, but it seems that gen7 is also affected, since doing DepCtrl on double-float instructions leads to GPU hangs in some cases, which is probably not surprising knowing that this is not supported in new hardware iterations. The SKL PRMs do not mention this res

[Mesa-dev] [PATCH v2 019/103] i965/vec4: Fix DCE for VEC4_OPCODE_SET_{LOW, HIGH}_32BIT

2016-10-11 Thread Iago Toral Quiroga
These align1 opcodes do partial writes of 64-bit data. The problem is that we want to use them to write on the same register to implement packDouble2x32 and from the point of view of DCE, since both opcodes write to the same register, only the last one stands and decides to eliminate the first, whi

[Mesa-dev] [PATCH v2 077/103] i965/vec4: fix scratch reads for 64bit data

2016-10-11 Thread Iago Toral Quiroga
v2: Setup for a 64-bit scratch read by checking the type size of the correct register (Iago) --- src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 15 +-- 1 file changed, 13 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/driver

[Mesa-dev] [PATCH v2 033/103] i965/vec4: implement d2b

2016-10-11 Thread Iago Toral Quiroga
v2 (Curro): - Generate the flag register with a predicated MOV instead of a CMP instruction, which has the benefit that we can skip loading a DF 0.0 constant. - Avoid the PICK_LOW_32BIT + MOV by using the flag result and a SEL to set the boolean result. --- src/mesa/drivers/dri/i965

[Mesa-dev] [PATCH v2 053/103] i965/vec4: add a scalarization pass for double-precision instructions

2016-10-11 Thread Iago Toral Quiroga
The hardware only supports 32-bit swizzles, which means that we can only directly access channels XY of a DF, making access to channels ZW more difficult, especially considering the various regioning restrictions imposed by the hardware. The combination of both things makes handling random swizzles o

[Mesa-dev] [PATCH v2 097/103] i965/vec4: run scalarize_df() after spilling

2016-10-11 Thread Iago Toral Quiroga
Spilling of 64-bit data requires data shuffling for the corresponding scratch read/write messages. This produces unsupported swizzle regions and writemasks that we need to scalarize. --- src/mesa/drivers/dri/i965/brw_vec4.cpp | 18 ++ 1 file changed, 18 insertions(+) diff --git a/

[Mesa-dev] [PATCH v2 039/103] i965/vec4: fix size_written for doubles

2016-10-11 Thread Iago Toral Quiroga
--- src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp index 619e010..4e7515c 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.c

[Mesa-dev] [PATCH v2 016/103] i965/vec4: add dst_null_df()

2016-10-11 Thread Iago Toral Quiroga
Reviewed-by: Francisco Jerez --- src/mesa/drivers/dri/i965/brw_vec4.h | 5 + 1 file changed, 5 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h index 1505ba6..86e58f3 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.h +++ b/src/mesa/drive

[Mesa-dev] [PATCH v2 084/103] i965/vec4: fix attribute setup for doubles

2016-10-11 Thread Iago Toral Quiroga
--- src/mesa/drivers/dri/i965/brw_vec4.cpp | 21 ++--- 1 file changed, 14 insertions(+), 7 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index 426faf0..56a46ad 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp +++ b/sr

[Mesa-dev] [PATCH v2 037/103] i965/vec4: use the new helper function to create double immediates

2016-10-11 Thread Iago Toral Quiroga
From: Samuel Iglesias Gonsálvez Signed-off-by: Samuel Iglesias Gonsálvez --- src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp index 4d5fa96..1da

[Mesa-dev] [PATCH v2 047/103] i965/vec4: make the generator set correct NibCtrl for SIMD4 DF instructions

2016-10-11 Thread Iago Toral Quiroga
From the HSW PRM, Command Reference, QtrCtrl: "NibCtrl is only allowed for SIMD4 instructions with a DF (Double Float) source or destination type." v2: Assert that the type is DF (Samuel) v3: Don't set the default group to 0 and then set it only for 4-wide instructions. Instead, assert

[Mesa-dev] [PATCH v2 024/103] i965/vec4: fix base offset for nir_registers with doubles

2016-10-11 Thread Iago Toral Quiroga
v2: do this inside dst_reg_for_nir_reg() instead of its callers --- src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp index 815082e..860ec51 100644 --- a/src/mesa

[Mesa-dev] [PATCH v2 009/103] i965/vec4: add double/float conversion pseudo-opcodes

2016-10-11 Thread Iago Toral Quiroga
These need to be emitted as align1 MOV's, since they need to have a stride of 2 on the float register (whether src or dest) so that data from another thread doesn't cross the middle of a SIMD8 register. v2 (Iago): - The float-to-double needs to align 32-bit data to 64-bit before doing the conversi

[Mesa-dev] [PATCH v2 079/103] i965/vec4: fix move_uniform_array_access_to_pull_constant() for 64-bit data

2016-10-11 Thread Iago Toral Quiroga
--- src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 19 +-- 1 file changed, 17 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp index b0b5f39..f12a114 100644 --- a/src/mesa/drivers/dri/i965

[Mesa-dev] [PATCH v2 063/103] i965/vec4: support multiple dispatch widths and groups in the IR builder.

2016-10-11 Thread Iago Toral Quiroga
--- src/mesa/drivers/dri/i965/brw_vec4_builder.h | 39 ++-- 1 file changed, 37 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_builder.h b/src/mesa/drivers/dri/i965/brw_vec4_builder.h index dab6e03..8352542 100644 --- a/src/mesa/drivers/dri/i

[Mesa-dev] [PATCH v2 093/103] i965/vec4: split instructions that read 64-bit interleaved attributes

2016-10-11 Thread Iago Toral Quiroga
Stages that use interleaved attributes generate regions with a vstride=0 that can hit the gen7 hardware decompression bug. --- src/mesa/drivers/dri/i965/brw_vec4.cpp | 28 ++-- 1 file changed, 26 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4

[Mesa-dev] [PATCH v2 075/103] i965/vec4: do not split scratch read/write opcodes

2016-10-11 Thread Iago Toral Quiroga
64-bit scratch reads/writes require shuffling data around, so we need to have access to the full 64-bit data. We will do the right thing for these when we emit the messages. --- src/mesa/drivers/dri/i965/brw_vec4.cpp | 9 + 1 file changed, 9 insertions(+) diff --git a/src/mesa/drivers/dri/

[Mesa-dev] [PATCH v2 083/103] i965/vec4: fix indentation in lower_attributes_to_hw_regs()

2016-10-11 Thread Iago Toral Quiroga
--- src/mesa/drivers/dri/i965/brw_vec4.cpp | 16 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index e732bf4..426faf0 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp +++ b/src/mesa

[Mesa-dev] [PATCH v2 052/103] i965/vec4: split double-precision bcsel

2016-10-11 Thread Iago Toral Quiroga
There is a hardware bug affecting compressed double-precision bcsel instructions in align16 mode by which they won't read the predication mask properly. The bug does not affect other predicated instructions and it does not affect bcsel in Align1 mode either. This was found empirically and verified by C

[Mesa-dev] [PATCH v2 042/103] i965/vec4: dump the instruction execution size

2016-10-11 Thread Iago Toral Quiroga
Reviewed-by: Francisco Jerez --- src/mesa/drivers/dri/i965/brw_vec4.cpp | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index 2bde628..3191eab 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp

[Mesa-dev] [PATCH v2 068/103] i965/vec4: don't constant propagate 64-bit immediates

2016-10-11 Thread Iago Toral Quiroga
From: Connor Abbott v2: Also check if the instruction source target is 64-bit. (Samuel) Signed-off-by: Samuel Iglesias Gonsálvez --- src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp | 7 +++ 1 file changed, 7 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_copy_propag

[Mesa-dev] [PATCH v2 043/103] i965/vec4: handle 32 and 64 bit channels in liveness analysis

2016-10-11 Thread Iago Toral Quiroga
From: "Juan A. Suarez Romero" Our current data flow analysis does not take into account that channels on 64-bit operands are 64-bit. This is a problem when the same register is accessed using both 64-bit and 32-bit channels. This is very common in operations where we need to access 64-bit data in

[Mesa-dev] [PATCH v2 059/103] i965/vec4: fix indentation in pack_uniform_registers

2016-10-11 Thread Iago Toral Quiroga
--- src/mesa/drivers/dri/i965/brw_vec4.cpp | 30 +++--- 1 file changed, 15 insertions(+), 15 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index b79fd5e..45d49e9 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.cp

[Mesa-dev] [PATCH v2 095/103] i965/vec4/scalarize_df: support more swizzles via vstride=0

2016-10-11 Thread Iago Toral Quiroga
By exploiting gen7's hardware decompression bug with vstride=0 we gain the capacity to support additional swizzle combinations. This also fixes ZW writes from X/Y channels like in: mov r2.z:df r0.:df Because DF regions use 2-wide rows with a vstride of 2, the region generated for the source

[Mesa-dev] [PATCH v2 027/103] i965/vec4: make opt_vector_float ignore doubles

2016-10-11 Thread Iago Toral Quiroga
The pass does not support doubles in its current form. --- src/mesa/drivers/dri/i965/brw_vec4.cpp | 1 + 1 file changed, 1 insertion(+) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index 06fa38f..675b7fc 100644 --- a/src/mesa/drivers/dri/i965/brw_v

[Mesa-dev] [PATCH v2 048/103] i965/vec4: dump NibCtrl for instructions with execsize != 8

2016-10-11 Thread Iago Toral Quiroga
v2: do it in the same fashion as the FS backend for consistency (Curro) --- src/mesa/drivers/dri/i965/brw_vec4.cpp | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index 490cbae..69fdb1e 100644 --- a/src/mesa/dr

[Mesa-dev] [PATCH v2 030/103] i965/vec4: add helpers for conversions to/from doubles

2016-10-11 Thread Iago Toral Quiroga
Use these helpers to implement d2f and f2d. We will reuse these helpers when we implement things like d2i or i2d as well. --- src/mesa/drivers/dri/i965/brw_vec4.h | 5 +++ src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 54 +++--- 2 files changed, 39 insertions(+), 20 d

[Mesa-dev] [PATCH v2 064/103] i965/vec4: Add a shuffle_64bit_data helper

2016-10-11 Thread Iago Toral Quiroga
SIMD4x2 64bit data is stored in register space like this: r0.0:DF x0 y0 z0 w0 r0.1:DF x1 y1 z1 w1 When we need to write data such as this to memory using 32-bit write messages we need to shuffle it in this fashion: r0.0:DF x0 y0 x1 y1 r0.1:DF z0 w0 z1 w1 and emit two 32-bit write messages,
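
The shuffle described above can be modeled on the CPU with two 4-element rows per register. The C sketch below is an assumption-laden illustration: `sketch_shuffle_64bit` is a hypothetical name, the arrays stand in for the two halves of a register, and the real helper emits vec4 IR instructions rather than copying doubles.

```c
#include <assert.h>

/* src and dst each hold 8 doubles: indices 0..3 model r0.0:DF and
 * indices 4..7 model r0.1:DF.
 *
 *   src: x0 y0 z0 w0 | x1 y1 z1 w1
 *   dst: x0 y0 x1 y1 | z0 w0 z1 w1
 */
static void sketch_shuffle_64bit(const double *src, double *dst)
{
    assert(src != dst);  /* the sketch does not support in-place use */

    for (int v = 0; v < 2; v++) {       /* v selects vertex 0 or 1 */
        dst[0 + 2 * v + 0] = src[4 * v + 0];  /* x -> first row  */
        dst[0 + 2 * v + 1] = src[4 * v + 1];  /* y -> first row  */
        dst[4 + 2 * v + 0] = src[4 * v + 2];  /* z -> second row */
        dst[4 + 2 * v + 1] = src[4 * v + 3];  /* w -> second row */
    }
}
```

After the shuffle, each row holds the XY (or ZW) pairs of both vertices, which is the arrangement the text says the 32-bit write messages need.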

[Mesa-dev] [PATCH v2 055/103] i965/vec4: implement access to DF source components Z/W

2016-10-11 Thread Iago Toral Quiroga
The general idea is that with 32-bit swizzles we cannot address DF components Z/W directly, so instead we select the region that starts at the 16B offset into the register and use X/Y swizzles. The above, however, has the caveat that we can't do that without violating register region restricti

[Mesa-dev] [PATCH v2 094/103] i965/vec4/scalarize_df: do not scalarize swizzles that we can support natively

2016-10-11 Thread Iago Toral Quiroga
Certain swizzles like XYZW can be supported by translating only the first two 64-bit swizzle channels to 32-bit channels. This happens with swizzles such that the first two logical components, when translated to 32-bit channels and replicated across the second dvec2 row, select the same channels sp

[Mesa-dev] [PATCH v2 069/103] i965/vec4: prevent copy-propagation from values with a different type size

2016-10-11 Thread Iago Toral Quiroga
Because the meaning of the swizzles and writemasks involved is different, so replacing the source would lead to different semantics. --- src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp | 7 +++ 1 file changed, 7 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_copy_propag

[Mesa-dev] [PATCH v2 092/103] i965/vec4: dump subnr for FIXED_GRF

2016-10-11 Thread Iago Toral Quiroga
This came in handy when debugging the payload setup for Tess Eval, since it prints correct subnr for attributes that can be loaded in the second half of a register. --- src/mesa/drivers/dri/i965/brw_vec4.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i

[Mesa-dev] [PATCH v2 099/103] i965/vec4: avoid spilling of registers that mix 32-bit and 64-bit access

2016-10-11 Thread Iago Toral Quiroga
When 64-bit registers are (un)spilled, we need to execute data shuffling code before writing to or after reading from memory. If we have instructions that operate on 64-bit data via 32-bit instructions, (un)spills for the register produced by 32-bit instructions will not do data shuffling at all (b

[Mesa-dev] [PATCH v2 065/103] i965/vec4: Fix UBO loads for 64-bit data

2016-10-11 Thread Iago Toral Quiroga
We need to emit 2 32-bit load messages to load a full dvec4. If only 1 or 2 double components are needed dead-code-elimination will remove the second one. We also need to shuffle the result of the 32-bit messages to form valid 64-bit SIMD4x2 data. --- src/mesa/drivers/dri/i965/brw_vec4_nir.cpp |

[Mesa-dev] [PATCH v2 091/103] i965/vec4/tes: consider register offsets during attribute setup

2016-10-11 Thread Iago Toral Quiroga
--- src/mesa/drivers/dri/i965/brw_vec4_tes.cpp | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_tes.cpp b/src/mesa/drivers/dri/i965/brw_vec4_tes.cpp index c8fa2ca..a1aa672 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_tes.cpp +++ b/src/m

[Mesa-dev] [PATCH v2 089/103] i965/vec4/tes: fix input loading for 64bit data types

2016-10-11 Thread Iago Toral Quiroga
--- src/mesa/drivers/dri/i965/brw_vec4_tes.cpp | 72 +++--- 1 file changed, 55 insertions(+), 17 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_tes.cpp b/src/mesa/drivers/dri/i965/brw_vec4_tes.cpp index 226dcb4..f2a4507 100644 --- a/src/mesa/drivers/dri/i965

[Mesa-dev] [PATCH v2 070/103] i965/vec4: Prevent copy propagation from violating pre-gen8 restrictions

2016-10-11 Thread Iago Toral Quiroga
On gen < 8, instructions that write more than one register need to read more than one register too. Make sure we don't break that restriction by copy propagating from a uniform. --- src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp | 7 +++ 1 file changed, 7 insertions(+) diff --git a/sr

[Mesa-dev] [PATCH v2 071/103] i965/vec4: don't propagate single-precision uniforms into 4-wide instructions

2016-10-11 Thread Iago Toral Quiroga
Otherwise we end up producing code that violates the register region restriction that says that when execsize == width and hstride != 0 the vstride can't be 0. --- src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp | 11 +++ 1 file changed, 11 insertions(+) diff --git a/src/mesa/driv

[Mesa-dev] [PATCH v2 066/103] i965/vec4: Fix SSBO loads for 64-bit data

2016-10-11 Thread Iago Toral Quiroga
Same requirements as for UBO loads. --- src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 31 +- 1 file changed, 26 insertions(+), 5 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp index f234e65..001a62f 100

[Mesa-dev] [PATCH v2 001/103] i965/nir: double/dvec2 uniforms only need to be padded to a single vec4 slot

2016-10-11 Thread Iago Toral Quiroga
From: Samuel Iglesias Gonsálvez max_vector_size is used in the vec4 backend to pad out the uniform components to match a size that is a multiple of a vec4. Double and dvec2 uniforms only require a single vec4 slot, not two. Signed-off-by: Samuel Iglesias Gonsálvez Signed-off-by: Iago Toral Quir

[Mesa-dev] [PATCH v2 085/103] i965/vec4: fix store output for 64-bit types

2016-10-11 Thread Iago Toral Quiroga
We need to shuffle the data before it is written to the URB. Also, dvec3/4 need two vec4 slots. --- src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 29 ++--- 1 file changed, 26 insertions(+), 3 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp b/src/mesa/d

[Mesa-dev] [PATCH v2 060/103] i965/vec4: Skip swizzle to subnr in 3src instructions with DF operands

2016-10-11 Thread Iago Toral Quiroga
We make scalar sources in 3src instructions use subnr instead of swizzles because they don't really use swizzles. With doubles it is more complicated because we use vstride=0 in more scenarios in which they don't produce scalar regions. Also RepCtrl=1 is not allowed with 64-bit operands, so we sho

[Mesa-dev] [PATCH v2 090/103] i965/vec4/tes: fix setup_payload() for 64bit data types

2016-10-11 Thread Iago Toral Quiroga
Use a width of 2 with 64-bit attributes. Also, if we have a dvec3/4 attribute that gets split across two registers such that components XY are stored in the second half of a register and components ZW are stored in the first half of the next, we need to fix regioning for any instruction that reads

[Mesa-dev] [PATCH v2 082/103] i965/vec4: make emit_pull_constant_load support 64-bit loads

2016-10-11 Thread Iago Toral Quiroga
This way callers don't need to know about 64-bit particularities and we reuse some code. --- src/mesa/drivers/dri/i965/brw_vec4.cpp | 22 ++- src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 81 ++ 2 files changed, 50 insertions(+), 53 deletions(-) diff --git a

[Mesa-dev] [PATCH v2 086/103] i965/vec4/gs: fix input loading for 64bit data

2016-10-11 Thread Iago Toral Quiroga
From: Samuel Iglesias Gonsálvez v2 (Iago): - Adapt 64-bit path to component packing changes. Signed-off-by: Samuel Iglesias Gonsálvez Signed-off-by: Iago Toral Quiroga --- src/mesa/drivers/dri/i965/brw_vec4_gs_nir.cpp | 51 ++- 1 file changed, 34 insertions(+), 17 d

[Mesa-dev] [PATCH v2 057/103] i965/vec4: teach register coalescing about 64-bit

2016-10-11 Thread Iago Toral Quiroga
Specifically, at least for now, we don't want to deal with the fact that channel sizes for fp64 instructions are twice the size, so prevent coalescing from instructions with a different type size. Also, we should check that if we are coalescing a register from another MOV we should be reading the

[Mesa-dev] [PATCH v2 020/103] i965/vec4: don't copy propagate vector opcodes that operate in align1 mode

2016-10-11 Thread Iago Toral Quiroga
Basically, ALIGN1 mode will ignore swizzles on the input vectors so we don't want the copy propagation pass to mess with them. --- .../drivers/dri/i965/brw_vec4_copy_propagation.cpp | 24 ++ 1 file changed, 24 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_copy_

[Mesa-dev] [PATCH v2 050/103] i965/vec4: teach CSE about exec_size, group and doubles

2016-10-11 Thread Iago Toral Quiroga
--- src/mesa/drivers/dri/i965/brw_vec4_cse.cpp | 31 +++--- 1 file changed, 24 insertions(+), 7 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp b/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp index bef897a..229d7b2 100644 --- a/src/mesa/drivers/dri/i965/

[Mesa-dev] [PATCH v2 015/103] i965/vec4: We only support 32-bit integer ALU operations for now

2016-10-11 Thread Iago Toral Quiroga
Add asserts so we remember to address this when we enable 64-bit integer support, as suggested by Connor and Jason. Reviewed-by: Francisco Jerez --- src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 71 ++ 1 file changed, 53 insertions(+), 18 deletions(-) diff --git a/src

[Mesa-dev] [PATCH v2 078/103] i965/vec4: fix scratch writes for 64bit data

2016-10-11 Thread Iago Toral Quiroga
Mostly the same stuff as usual: we need to shuffle the data before we write and we need to emit two 32-bit write messages (with appropriate 32-bit writemask channels set) for a full dvec4 scratch write. --- src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 64 ++ 1 file chang

[Mesa-dev] [PATCH v2 072/103] i965/vec4: don't copy propagate misaligned registers

2016-10-11 Thread Iago Toral Quiroga
From: Samuel Iglesias Gonsálvez This means we would copy propagate partial reads or writes and that can affect the result. Signed-off-by: Samuel Iglesias Gonsálvez --- src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/mesa/driver

[Mesa-dev] [PATCH v2 046/103] i965/vec4: add a SIMD lowering pass

2016-10-11 Thread Iago Toral Quiroga
Generally, instructions in Align16 mode only ever write to a single register and don't need any form of SIMD splitting, that's why we have never had a SIMD splitting pass in the vec4 backend. However, double-precision instructions typically write 2 registers and in some cases they run into certain

[Mesa-dev] [PATCH v2 025/103] i965/vec4: fix indentation in get_nir_src()

2016-10-11 Thread Iago Toral Quiroga
--- src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp index 860ec51..c825aeb 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp +++ b/src/m

[Mesa-dev] [PATCH v2 040/103] i965/vec4: fix regs_read() for doubles

2016-10-11 Thread Iago Toral Quiroga
--- src/mesa/drivers/dri/i965/brw_vec4.cpp | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index 75a8473..2bde628 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp +++ b/src/mesa/drivers/dri/

[Mesa-dev] [PATCH v2 073/103] i965/vec4: extend the DWORD multiply DepCtrl restriction to all gen8 platforms

2016-10-11 Thread Iago Toral Quiroga
--- src/mesa/drivers/dri/i965/brw_vec4.cpp | 9 ++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index 7af65ab..7f6acc3 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp +++ b/src/mesa/drivers

[Mesa-dev] [PATCH v2 036/103] i965/vec4: add a helper function to create double immediates

2016-10-11 Thread Iago Toral Quiroga
Gen7 hardware does not support double immediates so these need to be moved in 32-bit chunks to a regular vgrf instead. Instead of doing this every time we need to create a DF immediate, create a helper function that does the right thing depending on the hardware generation. v2 (Curro): - Use swi
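
The "32-bit chunks" idea in this snippet can be illustrated with a minimal sketch. It is not the helper the patch adds (its body is not visible here); `split_double_bits` is a hypothetical name, and the code assumes the usual 64-bit IEEE-754 `double`. It only shows how one double immediate decomposes into the two dwords that would each be moved with a 32-bit instruction.

```cpp
#include <cstdint>
#include <cstring>
#include <utility>

// Hypothetical helper (not from the patch): return the low and high
// 32-bit halves of a double's bit pattern, in that order.
inline std::pair<std::uint32_t, std::uint32_t> split_double_bits(double value)
{
    std::uint64_t bits;
    static_assert(sizeof(bits) == sizeof(value),
                  "sketch assumes a 64-bit double");
    // memcpy is the portable way to reinterpret the bits without
    // violating strict aliasing.
    std::memcpy(&bits, &value, sizeof(bits));
    return { static_cast<std::uint32_t>(bits & 0xffffffffu),
             static_cast<std::uint32_t>(bits >> 32) };
}
```

Each half could then be written to one 32-bit channel of a regular register; how the real helper picks registers and writemasks is not shown in this listing.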

[Mesa-dev] [PATCH v2 023/103] i965/vec4/nir: implement double comparisons

2016-10-11 Thread Iago Toral Quiroga
--- src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 21 ++--- 1 file changed, 18 insertions(+), 3 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp index 37c3d7c..815082e 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_

[Mesa-dev] [PATCH v2 049/103] i965/disasm: print NibCtrl for instructions with execsize < 8

2016-10-11 Thread Iago Toral Quiroga
v2 (Curro): - Print it also for execsize < 4. - QtrCtrl is still in effect, so print 2 * qtr_ctl + nib_ctl + 1 - Do not read the nib ctl from the instruction in gen < 7, the field only exists in gen7+. --- src/mesa/drivers/dri/i965/brw_disasm.c | 6 +- 1 file changed, 5 insertions(+)
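
The expression quoted in the v2 notes, 2 * qtr_ctl + nib_ctl + 1, can be checked with a small sketch. `nibble_label` is a hypothetical name, and the value ranges (qtr_ctl in 0..3, nib_ctl in 0..1) are assumptions based on the field names rather than on the patch body, which is cut off here.

```cpp
// Hypothetical helper (not from the patch): combine the quarter and
// nibble controls into the 1-based nibble number given by the formula
// in the commit message.
constexpr unsigned nibble_label(unsigned qtr_ctl, unsigned nib_ctl)
{
    return 2 * qtr_ctl + nib_ctl + 1;
}

// With qtr_ctl in 0..3 and nib_ctl in 0..1 (assumed ranges), the
// result covers 1..8 without gaps.
static_assert(nibble_label(0, 0) == 1, "first nibble");
static_assert(nibble_label(3, 1) == 8, "last nibble");
```

Under those assumed ranges each quarter contributes two consecutive nibble numbers, which is why the quarter control is scaled by two.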

[Mesa-dev] [PATCH v2 080/103] i965/vec4: fix indentation in move_push_constants_to_pull_constants()

2016-10-11 Thread Iago Toral Quiroga
--- src/mesa/drivers/dri/i965/brw_vec4.cpp | 60 +- 1 file changed, 30 insertions(+), 30 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index 75e47f9..0788ba2 100644 --- a/src/mesa/drivers/dri/i965/brw_vec

[Mesa-dev] [PATCH v2 062/103] i965/vec4: do not emit 64-bit MAD

2016-10-11 Thread Iago Toral Quiroga
The previous patch made sure that we do not generate MAD instructions for any NIR's 64-bit ffma, but there is nothing preventing i965 from producing MAD instructions as a result of lowerings or optimization passes. This patch makes sure that any 64-bit MAD produced inside the driver after translati

[Mesa-dev] [PATCH v2 058/103] i965/vec4: fix pack_uniform_registers for doubles

2016-10-11 Thread Iago Toral Quiroga
We need to consider the fact that dvec3/4 require two vec4 slots. --- src/mesa/drivers/dri/i965/brw_vec4.cpp | 11 +-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index e5391b9..b79fd5e 10064
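
The slot count mentioned above follows from simple arithmetic: a vec4 slot holds 16 bytes and each double component takes 8. A minimal sketch, with `vec4_slots_for_dvec` as a hypothetical name that does not come from the patch:

```cpp
// Hypothetical helper (not from the patch): number of 16-byte vec4
// slots needed to hold `components` double-precision values.
constexpr unsigned vec4_slots_for_dvec(unsigned components)
{
    constexpr unsigned bytes_per_double = 8;
    constexpr unsigned bytes_per_vec4_slot = 16;
    // Round up to whole slots.
    return (components * bytes_per_double + bytes_per_vec4_slot - 1)
           / bytes_per_vec4_slot;
}

static_assert(vec4_slots_for_dvec(2) == 1, "dvec2 fits in one slot");
static_assert(vec4_slots_for_dvec(3) == 2, "dvec3 spills into a second slot");
```

A double and a dvec2 fit in one slot, while dvec3 and dvec4 need two, which is the case the patch description calls out.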

[Mesa-dev] [PATCH v2 088/103] i965/vec4/tcs: fix outputs for 64-bit data

2016-10-11 Thread Iago Toral Quiroga
--- src/mesa/drivers/dri/i965/brw_vec4_tcs.cpp | 31 -- 1 file changed, 29 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_tcs.cpp b/src/mesa/drivers/dri/i965/brw_vec4_tcs.cpp index f62dc9c..914396c 100644 --- a/src/mesa/drivers/dri/i965/

[Mesa-dev] [PATCH v2 056/103] i965/disasm: fix subreg for dst in Align16 mode

2016-10-11 Thread Iago Toral Quiroga
There is a single bit for this, so it is a binary 0 or 1 meaning offset 0B or 16B respectively. v2: - Since brw_inst_dst_da16_subreg_nr() is known to be 1, remove it from the expression (Curro) Reviewed-by: Francisco Jerez --- src/mesa/drivers/dri/i965/brw_disasm.c | 2 +- 1 file changed,
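
The mapping described in this snippet is small enough to write out. The sketch below is an illustration only; `align16_dst_subreg_bytes` is a hypothetical name, and the actual expression in brw_disasm.c is not visible in this listing.

```cpp
// Hypothetical helper (not from the patch): translate the single
// Align16 destination subregister bit into a byte offset.
constexpr unsigned align16_dst_subreg_bytes(unsigned subreg_bit)
{
    // Only bit 0 is meaningful: 0 selects offset 0B, 1 selects 16B.
    return (subreg_bit & 1u) * 16u;
}

static_assert(align16_dst_subreg_bytes(0) == 0, "bit clear: 0B");
static_assert(align16_dst_subreg_bytes(1) == 16, "bit set: 16B");
```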

[Mesa-dev] [PATCH v2 081/103] i965/vec4: fix move_push_constants_to_pull_constants() for 64-bit data

2016-10-11 Thread Iago Toral Quiroga
--- src/mesa/drivers/dri/i965/brw_vec4.cpp | 20 +--- 1 file changed, 17 insertions(+), 3 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index 0788ba2..b0bc2d5 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp +++ b/src

[Mesa-dev] [PATCH v2 045/103] i965: move the group field from fs_inst to backend_instruction.

2016-10-11 Thread Iago Toral Quiroga
Just like the exec_size, we are going to need this in the vec4 backend when we implement a simd splitting pass. --- src/mesa/drivers/dri/i965/brw_ir_fs.h | 9 - src/mesa/drivers/dri/i965/brw_shader.h | 9 + src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 1 + 3 fi

[Mesa-dev] [PATCH v2 031/103] i965/vec4: implement hardware workaround for align16 double to float conversion

2016-10-11 Thread Iago Toral Quiroga
From the BDW PRM, Workarounds chapter: "DF->f format conversion for Align16 has wrong emask calculation when source is immediate." So detect the case and move the immediate source to a VGRF before we attempt the conversion. Notice that Broadwell and later are strictly scalar at the moment

[Mesa-dev] [PATCH v2 054/103] i965/vec4: translate 64-bit swizzles to 32-bit

2016-10-11 Thread Iago Toral Quiroga
The hardware can only operate with 32-bit swizzles, which is a rather limiting restriction. However, the idea is not to expose this to the optimization passes, which would be a mess to deal with. Instead, we let the bulk of the vec4 backend ignore this fact and we fix the swizzles right at codegen
