On Thu, Feb 4, 2016 at 5:48 PM, Matt Turner <matts...@gmail.com> wrote:
> Helps 11 shaders in UnrealEngine4 demos.
>
> I seriously hope they would have given us bitfieldReverse() if we
> exposed GL 4.0 (but we do expose ARB_gpu_shader5, so why not use that
> anyway?).
>
> instructions in affected programs: 4875 -> 4633 (-4.96%)
> cycles in affected programs: 270516 -> 244516 (-9.61%)
>
> I suspect there's a *lot* of room to improve nir_search/opt_algebraic's
> handling of this. We'd actually like to match, e.g., step2 by matching
> step1 once and then doing a pointer comparison for the second instance
> of step1, but unfortunately we generate an enormous tuple instead.
>
> The .text size increases by 6.5% and the .data by 17.5%.
>
>    text    data     bss     dec     hex filename
>   22957   45224       0   68181   10a55 nir_libnir_la-nir_opt_algebraic.o
>   24461   53160       0   77621   12f35 nir_libnir_la-nir_opt_algebraic.o
>
> I'd be happy to remove this if Unreal4 uses bitfieldReverse() if it is
> in a GL 4.0 context once we expose GL 4.0.
> ---
> Maybe it'd be better to make this a separate pass capable of recognizing
> this pattern without blowing up the compiled code size. Probably worth
> checking whether they use bitfieldReverse() under GL 4.0 first...
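
To put a rough number on the "enormous tuple" remark above, here is a small Python sketch. It rebuilds the pattern the same way the patch does and counts operation nodes twice: once fully expanded, and once with shared subexpressions counted a single time. The two counting helpers (expanded_ops, shared_ops) are hypothetical names for illustration only, and this measures the Python tuple structure, not whatever nir_algebraic.py actually emits into the generated C.

```python
def bitfield_reverse(u):
    # Same construction as the pattern in the patch: each step refers to the
    # previous step twice.
    step1 = ('ior', ('ishl', u, 16), ('ushr', u, 16))
    step2 = ('ior', ('ishl', ('iand', step1, 0x00ff00ff), 8), ('ushr', ('iand', step1, 0xff00ff00), 8))
    step3 = ('ior', ('ishl', ('iand', step2, 0x0f0f0f0f), 4), ('ushr', ('iand', step2, 0xf0f0f0f0), 4))
    step4 = ('ior', ('ishl', ('iand', step3, 0x33333333), 2), ('ushr', ('iand', step3, 0xcccccccc), 2))
    step5 = ('ior', ('ishl', ('iand', step4, 0x55555555), 1), ('ushr', ('iand', step4, 0xaaaaaaaa), 1))
    return step5


def expanded_ops(expr):
    """Count operation nodes with every reference expanded (tree view)."""
    if not isinstance(expr, tuple):
        return 0  # variables and constants are not operations
    return 1 + sum(expanded_ops(arg) for arg in expr[1:])


def shared_ops(expr, seen=None):
    """Count operation nodes once per distinct tuple object (DAG view)."""
    if seen is None:
        seen = set()
    if not isinstance(expr, tuple) or id(expr) in seen:
        return 0
    seen.add(id(expr))
    return 1 + sum(shared_ops(arg, seen) for arg in expr[1:])


pattern = bitfield_reverse('x')
print(expanded_ops(pattern))  # 123
print(shared_ops(pattern))    # 23
```

Because every step uses the previous one twice, the expanded count roughly doubles per step, while the shared count grows by only five operations per step.
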
>
>  src/compiler/nir/nir_opt_algebraic.py | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
>
> diff --git a/src/compiler/nir/nir_opt_algebraic.py b/src/compiler/nir/nir_opt_algebraic.py
> index 0a248a2..f92c6b9 100644
> --- a/src/compiler/nir/nir_opt_algebraic.py
> +++ b/src/compiler/nir/nir_opt_algebraic.py
> @@ -311,6 +311,18 @@ optimizations = [
>       'options->lower_unpack_snorm_4x8'),
>  ]
>
> +def bitfield_reverse(u):
> +    step1 = ('ior', ('ishl', u, 16), ('ushr', u, 16))
> +    step2 = ('ior', ('ishl', ('iand', step1, 0x00ff00ff), 8), ('ushr', ('iand', step1, 0xff00ff00), 8))
> +    step3 = ('ior', ('ishl', ('iand', step2, 0x0f0f0f0f), 4), ('ushr', ('iand', step2, 0xf0f0f0f0), 4))
> +    step4 = ('ior', ('ishl', ('iand', step3, 0x33333333), 2), ('ushr', ('iand', step3, 0xcccccccc), 2))
> +    step5 = ('ior', ('ishl', ('iand', step4, 0x55555555), 1), ('ushr', ('iand', step4, 0xaaaaaaaa), 1))
> +
> +    return step5

Mind calling this "ue4_bitfield_reverse"? You're not detecting a generic
bitfield reverse here.

With that, patches 1, 3, and 5 are

Reviewed-by: Jason Ekstrand <jason.ekstr...@intel.com>

I trust Dylan on patch 4. I was just trying to ensure that we got/used a
32-bit value.

--Jason

> +
> +optimizations += [(bitfield_reverse('x'), ('bitfield_reverse', 'x'))]
> +
> +
>  # Add optimizations to handle the case where the result of a ternary is
>  # compared to a constant. This way we can take things like
>  #
> --
> 2.4.10
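
For anyone wanting to sanity-check the five steps outside of NIR, a minimal Python sketch follows. It assumes 32-bit unsigned values and models ior/iand/ishl/ushr with plain integer operators, masking after each shift because Python integers are unbounded. All helper names here are hypothetical and not part of Mesa.

```python
MASK32 = 0xFFFFFFFF  # Python ints are unbounded, so truncate to 32 bits by hand


def swap_groups(x, low_mask, shift):
    """Swap adjacent groups of `shift` bits selected by low_mask and its complement."""
    high_mask = ~low_mask & MASK32
    return (((x & low_mask) << shift) | ((x & high_mask) >> shift)) & MASK32


def reverse_large_first(u):
    """Same step order as the quoted pattern: 16, 8, 4, 2, 1."""
    u &= MASK32
    step1 = ((u << 16) | (u >> 16)) & MASK32
    step2 = swap_groups(step1, 0x00ff00ff, 8)
    step3 = swap_groups(step2, 0x0f0f0f0f, 4)
    step4 = swap_groups(step3, 0x33333333, 2)
    step5 = swap_groups(step4, 0x55555555, 1)
    return step5


def reverse_small_first(u):
    """The same swaps applied in the opposite order: 1, 2, 4, 8, 16."""
    u &= MASK32
    for low_mask, shift in ((0x55555555, 1), (0x33333333, 2),
                            (0x0f0f0f0f, 4), (0x00ff00ff, 8)):
        u = swap_groups(u, low_mask, shift)
    return ((u << 16) | (u >> 16)) & MASK32


def reference_reverse(u):
    """Straightforward reference: reverse the 32-character binary string."""
    return int(format(u & MASK32, "032b")[::-1], 2)


samples = (0x00000001, 0x80000000, 0x12345678, 0xdeadbeef, 0xffffffff)
for value in samples:
    assert reverse_large_first(value) == reference_reverse(value)
    assert reverse_small_first(value) == reference_reverse(value)
```

Both orderings produce the same result, but they are different expression trees, so a pattern written for one would presumably not match the other. That seems to be the point of the naming request: the rule recognizes one specific formulation rather than bit reversal in general.
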
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev