Reviewed-by: Roland Scheidegger <[email protected]>

I'm curious though: on radeonsi, is zext not equivalent to interleaving the low 32 bits of each number with zeros (so that the uninterleave doesn't give you back the low and high bits, respectively)?

Roland

On 09.11.2016 at 12:46, Nicolai Hähnle wrote:
> From: Nicolai Hähnle <[email protected]>
>
> The fix in commit 88f791db75e9f065bac8134e0937e1b76600aa36 was insufficient
> for radeonsi because the vector case was not handled properly. It seems
> piglit only covers the scalar case, unfortunately.
>
> Fixes GL45-CTS.shader_bitfield_operation.[iu]mulExtended.*
> ---
>  src/gallium/auxiliary/gallivm/lp_bld_arit.c | 20 ++++++++++++--------
>  1 file changed, 12 insertions(+), 8 deletions(-)
>
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_arit.c b/src/gallium/auxiliary/gallivm/lp_bld_arit.c
> index 43ad238..5553cb1 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_arit.c
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_arit.c
> @@ -1230,42 +1230,46 @@ lp_build_mul_32_lohi_cpu(struct lp_build_context *bld,
>   * Emits generic code.
>   */
>  LLVMValueRef
>  lp_build_mul_32_lohi(struct lp_build_context *bld,
>                       LLVMValueRef a,
>                       LLVMValueRef b,
>                       LLVMValueRef *res_hi)
>  {
>     struct gallivm_state *gallivm = bld->gallivm;
>     LLVMBuilderRef builder = gallivm->builder;
> -   LLVMValueRef tmp;
> +   LLVMValueRef tmp, shift, res_lo;
>     struct lp_type type_tmp;
> -   LLVMTypeRef wide_type, cast_type;
> +   LLVMTypeRef wide_type, narrow_type;
>
>     type_tmp = bld->type;
> +   narrow_type = lp_build_vec_type(gallivm, type_tmp);
>     type_tmp.width *= 2;
>     wide_type = lp_build_vec_type(gallivm, type_tmp);
> -   type_tmp = bld->type;
> -   type_tmp.length *= 2;
> -   cast_type = lp_build_vec_type(gallivm, type_tmp);
> +   shift = lp_build_const_vec(gallivm, type_tmp, 32);
>
>     if (bld->type.sign) {
>        a = LLVMBuildSExt(builder, a, wide_type, "");
>        b = LLVMBuildSExt(builder, b, wide_type, "");
>     } else {
>        a = LLVMBuildZExt(builder, a, wide_type, "");
>        b = LLVMBuildZExt(builder, b, wide_type, "");
>     }
>     tmp = LLVMBuildMul(builder, a, b, "");
> -   tmp = LLVMBuildBitCast(builder, tmp, cast_type, "");
> -   *res_hi = lp_build_uninterleave1(gallivm, bld->type.length * 2, tmp, 1);
> -   return lp_build_uninterleave1(gallivm, bld->type.length * 2, tmp, 0);
> +
> +   res_lo = LLVMBuildTrunc(builder, tmp, narrow_type, "");
> +
> +   /* Since we truncate anyway, LShr and AShr are equivalent. */
> +   tmp = LLVMBuildLShr(builder, tmp, shift, "");
> +   *res_hi = LLVMBuildTrunc(builder, tmp, narrow_type, "");
> +
> +   return res_lo;
>  }
>
>
>  /* a * b + c */
>  LLVMValueRef
>  lp_build_mad(struct lp_build_context *bld,
>               LLVMValueRef a,
>               LLVMValueRef b,
>               LLVMValueRef c)
>  {
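
For reference, the widen / multiply / truncate / shift sequence in the patch can be sketched per element in plain C. This is only an illustration under assumptions: mul_32_lohi_u and mul_32_lohi_s are hypothetical names, not Mesa functions, and the code models a single 32-bit lane rather than the LLVM vector IR that gallivm actually emits. The signed variant also checks the claim from the patch comment that LShr and AShr give the same result once the value is truncated to 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar analogue of the patched code path; not a Mesa function.
 * Returns the low 32 bits of a * b and stores the high 32 bits in *res_hi. */
static uint32_t
mul_32_lohi_u(uint32_t a, uint32_t b, uint32_t *res_hi)
{
   uint64_t tmp = (uint64_t)a * (uint64_t)b;   /* ZExt both operands, then Mul */

   *res_hi = (uint32_t)(tmp >> 32);            /* LShr by 32, then Trunc */
   return (uint32_t)tmp;                       /* Trunc */
}

/* Signed variant. Results are returned as raw 32-bit patterns so that no
 * implementation-defined signed conversion is involved. */
static uint32_t
mul_32_lohi_s(int32_t a, int32_t b, uint32_t *res_hi)
{
   int64_t wide = (int64_t)a * (int64_t)b;     /* SExt, then Mul; cannot overflow */
   uint64_t tmp = (uint64_t)wide;
   uint64_t lshr = tmp >> 32;                  /* logical shift */
   /* Emulated arithmetic shift: refill the upper 32 bits with the sign bit. */
   uint64_t ashr = lshr | ((tmp >> 63) ? 0xFFFFFFFF00000000ull : 0);

   /* The two shifts differ only in bits that the truncation discards. */
   assert((uint32_t)lshr == (uint32_t)ashr);

   *res_hi = (uint32_t)lshr;
   return (uint32_t)tmp;
}
```

For example, multiplying 0xFFFFFFFF by itself with the unsigned helper gives a low word of 1 and a high word of 0xFFFFFFFE, and multiplying -2 by 3 with the signed helper gives 0xFFFFFFFA and 0xFFFFFFFF.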
