vec4: Fix exec size for MOVs SET_{HIGH, LOW}_32BIT.

Samuel Iglesias Gonsálvez Tue, 07 Mar 2017 05:12:01 -0800


On 04/03/17 01:26, Francisco Jerez wrote:
> Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
> 
>> From: Matt Turner <matts...@gmail.com>
>>
>> Otherwise for a pack_double_2x32_split opcode, we emit:
>>
>>    vec1 64 ssa_135 = pack_double_2x32_split ssa_133, ssa_134
>> mov(8)          g5<1>UD         g5<4>.xUD                       { align16 1Q 
>> compacted };
>> mov(8)          g7<2>UD         g5<4,4,1>UD                     { align1 1Q 
>> };
>>         ERROR: When the destination spans two registers, the source must 
>> span two registers
>>                (exceptions for scalar source and packed-word to packed-dword 
>> expansion)
>> mov(8)          g8<2>UD         g5.4<4,4,1>UD                   { align1 2N 
>> };
>>         ERROR: The offset from the two source registers must be the same
>> mov(8)          g5<1>UD         g6<4>.xUD                       { align16 1Q 
>> compacted };
>> mov(8)          g7.1<2>UD       g5<4,4,1>UD                     { align1 1Q 
>> };
>>         ERROR: When the destination spans two registers, the source must 
>> span two registers
>>                (exceptions for scalar source and packed-word to packed-dword 
>> expansion)
>> mov(8)          g8.1<2>UD       g5.4<4,4,1>UD                   { align1 2N 
>> };
>>         ERROR: The offset from the two source registers must be the same
>>
>> The intention was to emit mov(4)s for the instructions that have ERROR
>> annotations.
>>
>> See tests/spec/arb_gpu_shader_fp64/execution/vs-isinf-dvec.shader_test
>> for example.
>>
>> Reviewed-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com>
>> ---
>>  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp 
>> b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
>> index b570792badd..f6034bc8b76 100644
>> --- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
>> @@ -2025,6 +2025,7 @@ generate_code(struct brw_codegen *p,
>>           assert(type_sz(dst.type) == 8);
>>  
>>           brw_set_default_access_mode(p, BRW_ALIGN_1);
>> +         brw_set_default_exec_size(p, BRW_EXECUTE_4);
>>  
> 
> NAK, we're missing a bug elsewhere if the exec_size coming in from the
> IR is not accurate.  You don't happen to be doubling the execution size
> of this single-precision instruction, do you?
>


The exec_size from the IR is correct and Matt's patch too.

The problem is that the original instruction
(VEC4_OPCODE_SET_{HIGH,LOW}_32BIT) is a double-precision one because dst
is DF, so we doubled the exec_size at the beginning of this function
-generate_code()-.

However, this opcode reads 32-bits elements from source and write them
into the respective 32-bit position (low or high) of the dst, so we
retyped the destination to UD to do it (see below), hence this
instruction is actually a 32-bit one.

We can either use Matt patch or add an exception to the doubling
exec_size condition which is at the beginning of this function. In
either case, I will add a comment explaining this, something like:

/* Don't double exec_size for VEC4_OPCODE_SET_{HIGH,LOW}_32BIT
 * because dst is retyped to a 32-bit type to copy source's values,
 * so the instruction is effectively a 32-bit one.
 */

Which solution do you prefer?

Sam

>>           dst = retype(dst, BRW_REGISTER_TYPE_UD);
>>           if (inst->opcode == VEC4_OPCODE_SET_HIGH_32BIT)
>> @@ -2037,6 +2038,7 @@ generate_code(struct brw_codegen *p,
>>           src[0].hstride = BRW_HORIZONTAL_STRIDE_1;
>>           brw_MOV(p, dst, src[0]);
>>  
>> +         brw_set_default_exec_size(p, BRW_EXECUTE_8);
>>           brw_set_default_access_mode(p, BRW_ALIGN_16);
>>           break;
>>        }
>> -- 
>> 2.11.0
>>
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev

signature.asc
Description: OpenPGP digital signature

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v3 20/24] i965/vec4: Fix exec size for MOVs SET_{HIGH, LOW}_32BIT.

Reply via email to