On Thu, Oct 18, 2018 at 2:49 PM Ian Romanick <i...@freedesktop.org> wrote:
> On 10/17/2018 11:33 AM, Jason Ekstrand wrote:
> > From: Connor Abbott <cwabbo...@gmail.com>
> >
> > Shader-db results on Haswell:
> >
> > total instructions in shared programs: 2180337 -> 2154080 (-1.20%)
> > instructions in affected programs: 959766 -> 933509 (-2.74%)
> > helped: 5653
> > HURT: 2560
> >
> > total cycles in shared programs: 12339326 -> 12307102 (-0.26%)
> > cycles in affected programs: 6102794 -> 6070570 (-0.53%)
> > helped: 3838
> > HURT: 4868
>
> In cases like this, the extra statistics generated by my extra changes
> to report.py can be informative.  Give me a few minutes, and I'll gather
> that data.
>
> > Most of the hurt programs seem to be because we generate extra MOV's due
> > to vectorizing things. For example, in
> > shaders/non-free/steam/anomaly-2/158.shader_test, this:
> >
> > add(8) g116<1>.xyF g12<4,4,1>.xyyyF g1.4<0,4,1>.xyyyF { align16 NoDDClr 1Q };
> > add(8) g117<1>.xyF g12<4,4,1>.xyyyF g1.4<0,4,1>.zwwwF { align16 NoDDClr 1Q };
> > add(8) g116<1>.zwF g12<4,4,1>.xxxyF -g1.4<0,4,1>.xxxyF { align16 NoDDChk 1Q };
> > add(8) g117<1>.zwF g12<4,4,1>.xxxyF -g1.4<0,4,1>.zzzwF { align16 NoDDChk 1Q };
> >
> > Turns into this:
> >
> > add(8) g13<1>F g12<4,4,1>.xyxyF g1.4<0,4,1>F { align16 1Q };
> > add(8) g14<1>F g12<4,4,1>.xyxyF -g1.4<0,4,1>F { align16 1Q };
> > mov(8) g116<1>.xyD g13<4,4,1>.xyyyD { align16 NoDDClr 1Q };
> > mov(8) g117<1>.xyD g13<4,4,1>.zwwwD { align16 NoDDClr 1Q };
> > mov(8) g116<1>.zwD g14<4,4,1>.xxxyD { align16 NoDDChk 1Q };
> > mov(8) g117<1>.zwD g14<4,4,1>.zzzwD { align16 NoDDChk 1Q };
> >
> > So we eliminated two add's, but then had to introduce four mov's to
> > transpose the result. Some of the hurt is because vectorization is a bit
> > over-aggressive and we vectorize something when we should have left it
> > as a scalar and CSEd it. Unfortunately, this is all really tricky to do
> > as it involves the interactions between many different components.
>
> This seems to me like vectorization should be done later in the
> optimization pipeline.  I would have guessed that it would go after the
> regular optimization loop.  Did you try calling it from other places to
> see the effects?
>

No, I've done very little work on this.  I mostly rebased Connor's patches, got them working again, and sent them to the list.  Someone was asking about it on IRC in the context of old Mali hardware, I think.  I was surprised to find we'd never actually landed it so I decided to freshen it up a bit so that others could at least experiment with it again.  Turns out that a lot has changed in NIR in the last three years...

--Jason
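
For anyone who wants to sanity-check the numbers quoted above, here is a small Python sketch. It is only an illustration: the helper names are hypothetical and are not taken from shader-db's report.py. It recomputes the percentage deltas from the raw counts and tallies the net instruction change for the anomaly-2 example, assuming every instruction counts as one.

```python
# Sketch only: these helpers are hypothetical and are NOT shader-db's report.py.

def pct_change(before, after):
    """Relative change in percent; negative means the count went down."""
    return (after - before) / before * 100.0


def net_instructions(before_ops, after_ops):
    """Net change in instruction count, weighting every opcode equally."""
    return len(after_ops) - len(before_ops)


# Opcodes from the shaders/non-free/steam/anomaly-2/158.shader_test example.
before_ops = ["add", "add", "add", "add"]
after_ops = ["add", "add", "mov", "mov", "mov", "mov"]

# Two adds removed, four movs added: the program grows by two instructions.
delta = net_instructions(before_ops, after_ops)

# Raw counts from the Haswell shader-db results quoted in the thread.
totals = {
    "total instructions in shared programs": (2180337, 2154080),
    "instructions in affected programs": (959766, 933509),
    "total cycles in shared programs": (12339326, 12307102),
    "cycles in affected programs": (6102794, 6070570),
}

if __name__ == "__main__":
    print(f"net instruction change in the example: {delta:+d}")
    for name, (old, new) in totals.items():
        print(f"{name}: {old} -> {new} ({pct_change(old, new):+.2f}%)")
```

Counting each instruction as one is a simplification. It says nothing about cycles, which is presumably why the helped/HURT split differs between the instruction and cycle statistics.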
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev