Neil Roberts <n...@linux.intel.com> writes:

> Are you sure this patch is necessary? The documentation for the multiply
> instruction on BDW+ says:
>
> SourceType :  *D      
> DestinationType :     *D      
> Project :     EXCLUDE(CHV) 
>
> This to me implies that it should work on BXT because it doesn't say
> EXCLUDE(BXT).
>

In fact both CHV and BXT *can* support 32x32 multiplication, that's not
really the reason that motivated this patch -- The problem is that 32x32
multiplication has QWORD execution type which has several restrictions
on CHV and BXT, the annoying one is:

"CHV, BXT

When source or destination datatype is 64b or operation is integer DWord
multiply, regioning in Align1 must follow these rules:

    Source and Destination horizontal stride must be aligned to the same
    qword.

    Example:
[...]
    // mul (4) r10.0<2>:d r11.0<8;4,2>:d r12.0<8;4,2>:d // Source and 
Destination stride must be 2 since the execution type is Qword."

So it sounds like we could use the native 32x32 multiply with some
additional effort to pass the arguments with the correct stride, but
it's not clear to me whether the solution would be any more better than
splitting up the multiply into 32x16 multiplies, so doing the same as in
CHV and pre-Gen8 seemed like the most KISS solution for now.

> I made a little test case and ran it on my BXT and it seems to work even
> without this patch. I looked at the assembler source and it is
> definitely doing a MUL instruction with D types for the dst and two
> sources.
>
> https://github.com/bpeel/piglit/blob/test-integer-multiply/tests/general/mult32.c
>
Hmm, I guess the docs could be wrong, but I'm not sure what the
consequences would be of violating the alignment rule, I guess the
failure may be non-deterministic.  What stepping of BXT did you test
this on?

> Regards,
> - Neil
>
> Francisco Jerez <curroje...@riseup.net> writes:
>
>> AFAIK BXT has the same annoying alignment limitation as CHV on the
>> source register regions of 32x32 bit MULs, give it the same treatment.
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs.cpp | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
>> b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> index 244f299..fc9f007 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> @@ -3126,9 +3126,9 @@ fs_visitor::lower_integer_multiplication()
>>     bool progress = false;
>>  
>>     /* Gen8's MUL instruction can do a 32-bit x 32-bit -> 32-bit operation
>> -    * directly, but Cherryview cannot.
>> +    * directly, but CHV/BXT cannot.
>>      */
>> -   if (devinfo->gen >= 8 && !devinfo->is_cherryview)
>> +   if (devinfo->gen >= 8 && !devinfo->is_cherryview && !devinfo->is_broxton)
>>        return false;
>>  
>>     foreach_block_and_inst_safe(block, fs_inst, inst, cfg) {
>> -- 
>> 2.4.6
>>
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Attachment: signature.asc
Description: PGP signature

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reply via email to