Neil Roberts <n...@linux.intel.com> writes: > Are you sure this patch is necessary? The documentation for the multiply > instruction on BDW+ says: > > SourceType : *D > DestinationType : *D > Project : EXCLUDE(CHV) > > This to me implies that it should work on BXT because it doesn't say > EXCLUDE(BXT). >
In fact both CHV and BXT *can* support 32x32 multiplication, that's not really the reason that motivated this patch -- The problem is that 32x32 multiplication has QWORD execution type which has several restrictions on CHV and BXT, the annoying one is: "CHV, BXT When source or destination datatype is 64b or operation is integer DWord multiply, regioning in Align1 must follow these rules: Source and Destination horizontal stride must be aligned to the same qword. Example: [...] // mul (4) r10.0<2>:d r11.0<8;4,2>:d r12.0<8;4,2>:d // Source and Destination stride must be 2 since the execution type is Qword." So it sounds like we could use the native 32x32 multiply with some additional effort to pass the arguments with the correct stride, but it's not clear to me whether the solution would be any more better than splitting up the multiply into 32x16 multiplies, so doing the same as in CHV and pre-Gen8 seemed like the most KISS solution for now. > I made a little test case and ran it on my BXT and it seems to work even > without this patch. I looked at the assembler source and it is > definitely doing a MUL instruction with D types for the dst and two > sources. > > https://github.com/bpeel/piglit/blob/test-integer-multiply/tests/general/mult32.c > Hmm, I guess the docs could be wrong, but I'm not sure what the consequences would be of violating the alignment rule, I guess the failure may be non-deterministic. What stepping of BXT did you test this on? > Regards, > - Neil > > Francisco Jerez <curroje...@riseup.net> writes: > >> AFAIK BXT has the same annoying alignment limitation as CHV on the >> source register regions of 32x32 bit MULs, give it the same treatment. >> --- >> src/mesa/drivers/dri/i965/brw_fs.cpp | 4 ++-- >> 1 file changed, 2 insertions(+), 2 deletions(-) >> >> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp >> b/src/mesa/drivers/dri/i965/brw_fs.cpp >> index 244f299..fc9f007 100644 >> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp >> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp >> @@ -3126,9 +3126,9 @@ fs_visitor::lower_integer_multiplication() >> bool progress = false; >> >> /* Gen8's MUL instruction can do a 32-bit x 32-bit -> 32-bit operation >> - * directly, but Cherryview cannot. >> + * directly, but CHV/BXT cannot. >> */ >> - if (devinfo->gen >= 8 && !devinfo->is_cherryview) >> + if (devinfo->gen >= 8 && !devinfo->is_cherryview && !devinfo->is_broxton) >> return false; >> >> foreach_block_and_inst_safe(block, fs_inst, inst, cfg) { >> -- >> 2.4.6 >> >> _______________________________________________ >> mesa-dev mailing list >> mesa-dev@lists.freedesktop.org >> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
signature.asc
Description: PGP signature
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev