Jason Ekstrand <ja...@jlekstrand.net> writes:

> On Tue, May 24, 2016 at 12:18 AM, Francisco Jerez <curroje...@riseup.net>
> wrote:
>
>> Due to a Gen7-specific hardware bug native 32-wide instructions get
>> the lower 16 bits of the execution mask applied incorrectly to both
>> halves of the instruction, so the MOV trick we currently use wouldn't
>> work.  Instead emit multiple 16-wide MOV instructions in 32-wide mode
>> in order to cover the whole execution mask.
>> ---
>>  src/mesa/drivers/dri/i965/brw_eu_emit.c | 25 +++++++++++++++++--------
>>  1 file changed, 17 insertions(+), 8 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c
>> index af7caed..d36877c 100644
>> --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
>> +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
>> @@ -3330,6 +3330,7 @@ void
>>  brw_find_live_channel(struct brw_codegen *p, struct brw_reg dst)
>>  {
>>     const struct brw_device_info *devinfo = p->devinfo;
>> +   const unsigned exec_size = 1 << brw_inst_exec_size(devinfo, p->current);
>>     brw_inst *inst;
>>
>>     assert(devinfo->gen >= 7);
>> @@ -3359,15 +3360,23 @@ brw_find_live_channel(struct brw_codegen *p, struct brw_reg dst)
>>
>>           brw_MOV(p, flag, brw_imm_ud(0));
>>
>> -         /* Run a 16-wide instruction returning zero with execution masking
>> -          * and a conditional modifier enabled in order to get the current
>> -          * execution mask in f1.0.
>> +         /* Run enough instructions returning zero with execution masking and
>> +          * a conditional modifier enabled in order to get the full execution
>> +          * mask in f1.0.  We could use a single 32-wide move here if it
>> +          * weren't because of the hardware bug that causes channel enables to
>> +          * be applied incorrectly to the second half of 32-wide instructions
>> +          * on Gen7.
>>            */
>> -         inst = brw_MOV(p, brw_null_reg(), brw_imm_ud(0));
>> -         brw_inst_set_exec_size(devinfo, inst, BRW_EXECUTE_16);
>> -         brw_inst_set_mask_control(devinfo, inst, BRW_MASK_ENABLE);
>> -         brw_inst_set_cond_modifier(devinfo, inst, BRW_CONDITIONAL_Z);
>> -         brw_inst_set_flag_reg_nr(devinfo, inst, 1);
>> +         const unsigned lower_size = MIN2(16, exec_size);
>> +         for (unsigned i = 0; i < exec_size / lower_size; i++) {
>> +            inst = brw_MOV(p, retype(brw_null_reg(), BRW_REGISTER_TYPE_UW),
>> +                           brw_imm_uw(0));
>>
>
> Is there a reason this is changing from D to UW?
>

A UW instruction is likely to have lower execution latency than one with
a 32-bit integer execution type.  It shouldn't have any practical
implications beyond that, since the result of the instruction is only
used to set bits of the flag register.
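
To illustrate why the type shouldn't matter for the result, here is a
rough software model, not driver code.  The function names are made up
for this example, and the model assumes that a masked MOV with the Z
conditional modifier sets one flag bit per enabled channel whose result
compares equal to zero.  Under that assumption, the 16-bit and 32-bit
variants produce the same flag value:

```c
#include <stdint.h>

/* Hypothetical model: flag bits left by a masked "MOV.Z null, 0" whose
 * execution type is 16-bit (the UW case in the patch).
 */
static uint32_t
flag_from_mov_z_16bit(uint32_t exec_mask, unsigned width)
{
   uint32_t flag = 0;

   for (unsigned ch = 0; ch < width; ch++) {
      const uint16_t result = 0;   /* stands in for brw_imm_uw(0) */

      /* Only enabled channels update their flag bit. */
      if ((exec_mask >> ch) & 1)
         flag |= (uint32_t)(result == 0) << ch;
   }

   return flag;
}

/* Same model with a 32-bit integer execution type. */
static uint32_t
flag_from_mov_z_32bit(uint32_t exec_mask, unsigned width)
{
   uint32_t flag = 0;

   for (unsigned ch = 0; ch < width; ch++) {
      const uint32_t result = 0;   /* stands in for a 32-bit zero immediate */

      if ((exec_mask >> ch) & 1)
         flag |= (uint32_t)(result == 0) << ch;
   }

   return flag;
}
```

In this model only the comparison outcome reaches the flag register, so
the width of the zero being moved is invisible to whatever reads the
flag afterwards.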

>
>> +            brw_inst_set_mask_control(devinfo, inst, BRW_MASK_ENABLE);
>> +            brw_inst_set_group(devinfo, inst, lower_size * i);
>> +            brw_inst_set_cond_modifier(devinfo, inst, BRW_CONDITIONAL_Z);
>> +            brw_inst_set_flag_reg_nr(devinfo, inst, 1);
>> +            brw_inst_set_exec_size(devinfo, inst, cvt(lower_size) - 1);
>> +         }
>>
>>           brw_FBL(p, vec1(dst), flag);
>>        }
>> --
>> 2.7.3
>>
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>
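
For reference, the loop in the patch can be modelled in plain C roughly
as follows.  This is only a sketch under stated assumptions and not the
driver code: model_mov_z(), model_fbl() and find_live_channel() are
hypothetical helpers invented for the example, exec_mask stands in for
the hardware execution mask, and FBL is assumed to return the index of
the lowest set bit (all ones if no bit is set).

```c
#include <assert.h>
#include <stdint.h>

#define MIN2(a, b) ((a) < (b) ? (a) : (b))

/* Hypothetical model of one masked MOV.Z of zero to the null register:
 * every enabled channel in [group, group + width) sets its flag bit.
 */
static uint32_t
model_mov_z(uint32_t flag, uint32_t exec_mask, unsigned group, unsigned width)
{
   const uint32_t lanes = (width == 32 ? ~0u : (1u << width) - 1) << group;
   return flag | (exec_mask & lanes);
}

/* Assumed FBL behaviour: index of the lowest set bit, ~0u if none. */
static unsigned
model_fbl(uint32_t v)
{
   for (unsigned i = 0; i < 32; i++) {
      if (v & (1u << i))
         return i;
   }
   return ~0u;
}

/* Mirrors the structure of the patched loop: split the instruction into
 * chunks of at most 16 channels and accumulate the flag bits per chunk.
 */
static unsigned
find_live_channel(uint32_t exec_mask, unsigned exec_size)
{
   assert(exec_size == 8 || exec_size == 16 || exec_size == 32);

   const unsigned lower_size = MIN2(16, exec_size);
   uint32_t flag = 0;   /* corresponds to brw_MOV(p, flag, brw_imm_ud(0)) */

   for (unsigned i = 0; i < exec_size / lower_size; i++)
      flag = model_mov_z(flag, exec_mask, lower_size * i, lower_size);

   return model_fbl(flag);
}
```

With only channels 16..31 enabled (exec_mask 0xFFFF0000),
find_live_channel(0xFFFF0000, 32) returns 16 in this model, while a
single 16-wide pass, find_live_channel(0xFFFF0000, 16), sees no live
channel at all.  That is the case the second MOV with group 16 is meant
to cover.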
