On 18.01.2017 at 06:56, Dave Airlie wrote:
> On 12 December 2016 at 10:11, <[email protected]> wrote:
>> From: Roland Scheidegger <[email protected]>
>>
>> By using a dst_type in the gather interface, gather has some more
>> knowledge about how values should be fetched.
>> E.g. if this is a 3x32bit fetch and dst_type is 4x32bit vector gather
>> will no longer do a ZExt with a 96bit scalar value to 128bit, but
>> just fetch the 96bit as 3x32bit vector (this is still going to be
>> 2 loads of course, but the loads can be done directly to simd vector
>> that way).
>> Also, we can now try to use the right int/float type. This should
>> make no difference really since there's typically no domain transition
>> penalties for such simd loads, however it actually makes a difference
>> since llvm will use different shuffle lowering afterwards so the caller
>> can use this to trick llvm into using sane shuffle afterwards (and yes
>> llvm is really stupid there - nothing against using the shuffle
>> instruction from the correct domain, but not at the cost of doing 3 times
>> more shuffles, the case which actually matters is refusal to use shufps
>> for integer values).
>> Also do some attempt to avoid things which look great on paper but llvm
>> doesn't really handle (e.g. fetching 3-element 8 bit and 16 bit vectors
>> which is simply disastrous - I suspect type legalizer is to blame trying
>> to extend these vectors to 128bit types somehow, so fetching these with
>> scalars like before which is suboptimal due to the ZExt).
>>
>> Remove the ability for truncation (no point, this is gather, not conversion)
>> as it is complex enough already.
>>
>> While here also implement not just the float, but also the 64bit avx2
>> gathers (disabled though since based on the theoretical numbers the benefit
>> just isn't there at all until Skylake at least).
>
> Hi Roland,
>
> This breaks the build on big endian machines.
>
>   CC       gallivm/lp_bld_gather.lo
>   CC       gallivm/lp_bld_init.lo
> gallivm/lp_bld_gather.c: In function 'lp_build_gather_elem_vec':
> gallivm/lp_bld_gather.c:238:42: error: 'dst_elem_type' undeclared
> (first use in this function)
>     LLVMConstInt(dst_elem_type,
>                  ^
> gallivm/lp_bld_gather.c:238:42: note: each undeclared identifier is
> reported only once for each function it appears in
> gallivm/lp_bld_gather.c: In function 'lp_build_gather':
>

Oh right. I thought I had actually hack-tested compilation for this, but
apparently not with the latest version...
I've pushed a trivial fix for this, though I have to say I'm not really
certain this change works correctly on big endian archs, even though I
tried to keep things the same...

Roland

_______________________________________________
mesa-dev mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
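
For readers following the 3x32bit case from the quoted commit message, here is a minimal plain-C sketch of the "two loads straight into a 4-wide destination" idea. It is not Mesa or gallivm code: the function name `fetch_3x32` and the choice to zero the unused fourth lane are assumptions made purely for illustration, and the real code emits LLVM IR rather than doing the copies itself.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not part of gallivm): fetch three 32-bit elements
 * from src into a 4-lane destination using one 64-bit load and one 32-bit
 * load, instead of building a 96-bit scalar and zero-extending it to 128.
 */
static void fetch_3x32(const void *src, uint32_t dst[4])
{
    const unsigned char *p = (const unsigned char *)src;
    uint64_t lo;  /* first load: elements 0 and 1 */
    uint32_t hi;  /* second load: element 2 */

    memcpy(&lo, p, sizeof lo);
    memcpy(&hi, p + sizeof lo, sizeof hi);

    /* Place the loaded bytes directly into the vector lanes. */
    memcpy(&dst[0], &lo, sizeof lo);
    dst[2] = hi;
    dst[3] = 0;   /* assumption: the unused lane is simply zeroed */
}
```

Because the bytes are copied lane by lane, the element order in `dst` matches memory order on both little and big endian hosts, which is not automatically true of the scalar ZExt route, where the position of the value within the wider integer depends on endianness.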
