On 09/26/2013 03:00 PM, Matt Turner wrote:
> CSE would otherwise combine the two mul(8) emitted by [iu]mulExtended:
> 
>       mul(8)  acc0 x y
>       mach(8) null x y
>       mov(8)  lsb  acc0
>       ...
>       mul(8)  acc0 x y
>       mach(8) msb  x y
> Into:
>       mul(8)  temp x y
>       mov(8)  acc0 temp
>       mach(8) null x y
>       mov(8)  lsb  acc0
>       ...
>       mov(8)  acc0 temp
>       mach(8) msb  x y
> 
> But mul(8) into the accumulator produces more than 32 bits of precision,
> which the following mach(8) requires. That extra precision is lost if the
> multiply writes a general register and the result is then moved to the
> accumulator.
> ---
>  src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
> index ccd4e5e..61b3aeb 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
> @@ -98,7 +98,8 @@ fs_visitor::opt_cse_local(bblock_t *block, exec_list *aeb)
>        if (is_expression(inst) &&
>            !inst->predicate &&
>            !inst->is_partial_write() &&
> -          !inst->conditional_mod)
> +          !inst->conditional_mod &&
> +          inst->dst.file != HW_REG)
>        {
>        bool found = false;

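For anyone following along, the precision loss described in the commit
message can be sketched in plain C++. This is only a simplified model and
not the hardware's actual datapath: it assumes the accumulator behaves like
a 64-bit value and a general register like a 32-bit one, and the function
names are made up for illustration.

```cpp
#include <cstdint>

// Hypothetical model of "mul acc0 x y; mach msb x y": the wide product
// stays in the (assumed 64-bit) accumulator, so the high half survives.
static uint32_t high_via_accumulator(uint32_t x, uint32_t y)
{
    uint64_t acc = static_cast<uint64_t>(x) * y;   // full-width product
    return static_cast<uint32_t>(acc >> 32);       // high 32 bits
}

// Hypothetical model of the CSE'd sequence "mul temp x y; mov acc0 temp":
// the product is truncated to 32 bits in the general register first, so
// the accumulator never sees the upper bits.
static uint32_t high_via_temp(uint32_t x, uint32_t y)
{
    uint32_t temp = x * y;                         // wraps modulo 2^32
    uint64_t acc = temp;                           // upper bits already gone
    return static_cast<uint32_t>(acc >> 32);       // always 0 in this model
}
```

With x = 0x80000000 and y = 4 the full product is 0x200000000, so the
first function returns 2 while the second returns 0. Skipping CSE for
instructions whose destination is a HW_REG avoids that second form.
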
Patches 1-4 are:
Reviewed-by: Kenneth Graunke <[email protected]>

_______________________________________________
mesa-dev mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
