On Fri, Feb 22, 2019 at 4:05 PM Timothy Arceri <[email protected]> wrote:
> > > On 23/2/19 8:54 am, Jason Ekstrand wrote: > > On Fri, Feb 22, 2019 at 2:47 PM Timothy Arceri <[email protected] > > <mailto:[email protected]>> wrote: > > > > > > > > On 23/2/19 6:31 am, Rob Clark wrote: > > > On Fri, Feb 22, 2019 at 12:39 PM Jason Ekstrand > > <[email protected] <mailto:[email protected]>> wrote: > > >> > > >> On Fri, Feb 22, 2019 at 9:51 AM Eric Anholt <[email protected] > > <mailto:[email protected]>> wrote: > > >>> > > >>> Timothy Arceri <[email protected] > > <mailto:[email protected]>> writes: > > >>> > > >>>> shader-db results i965 (SKL): > > >>>> > > >>>> total instructions in shared programs: 13219105 -> 13024761 > > (-1.47%) > > >>>> instructions in affected programs: 1169457 -> 975113 (-16.62%) > > >>>> helped: 599 > > >>>> HURT: 154 > > >>>> > > >>>> total cycles in shared programs: 333968972 -> 324822073 > (-2.74%) > > >>>> cycles in affected programs: 130032440 -> 120885541 (-7.03%) > > >>>> helped: 590 > > >>>> HURT: 216 > > >>>> > > >>>> total spills in shared programs: 57947 -> 29130 (-49.73%) > > >>>> spills in affected programs: 53364 -> 24547 (-54.00%) > > >>>> helped: 351 > > >>>> HURT: 0 > > >>>> > > >>>> total fills in shared programs: 51310 -> 25468 (-50.36%) > > >>>> fills in affected programs: 44882 -> 19040 (-57.58%) > > >>>> helped: 351 > > >>>> HURT: 0 > > >>>> --- > > >>>> src/compiler/nir/nir_lower_phis_to_scalar.c | 1 + > > >>>> 1 file changed, 1 insertion(+) > > >>>> > > >>>> diff --git a/src/compiler/nir/nir_lower_phis_to_scalar.c > > b/src/compiler/nir/nir_lower_phis_to_scalar.c > > >>>> index 16001f73685..f6f702bca15 100644 > > >>>> --- a/src/compiler/nir/nir_lower_phis_to_scalar.c > > >>>> +++ b/src/compiler/nir/nir_lower_phis_to_scalar.c > > >>>> @@ -74,6 +74,7 @@ is_phi_src_scalarizable(nir_phi_src *src, > > >>>> /* A phi is scalarizable if we're going to lower it */ > > >>>> return should_lower_phi(nir_instr_as_phi(src_instr), > > state); > > >>>> > > >>>> + case nir_instr_type_tex: > > >>>> case nir_instr_type_load_const: > > >>>> case nir_instr_type_ssa_undef: > > >>>> /* These are trivially scalarizable */ > > >>> > > >>> Sounds promising, but I would definitely not describe > > instr_type_tex as > > >>> "trivially scalarizable" -- could you explain what's going on > > with this > > >>> patch? > > > > Basically it just turns: > > > > if ssa0 { > > ... > > vec4 ssa1 = txf ..... > > } else { > > ... > > vec4 ssa2 = ... > > } > > vec4 ss3 = phi ssa1, ssa2 > > > > Into > > > > if ssa0 { > > ... > > vec4 ssa1 = txf ..... > > vec1 ssa2 = imov ssa1.x > > vec1 ssa3 = imov ssa1.y > > vec1 ssa4 = imov ssa1.z > > vec1 ssa5 = imov ssa1.w > > } else { > > ... > > vec4 ssa6 = ... > > vec1 ssa7 = imov ssa6.x > > vec1 ssa8 = imov ssa6.y > > vec1 ssa9 = imov ssa6.z > > vec1 ssa10 = imov ssa6.w > > } > > vec1 ss11 = phi ssa2, ssa7 > > vec1 ss12 = phi ssa3, ssa8 > > vec1 ss13 = phi ssa4, ssa9 > > vec1 ss14 = phi ssa5, ssa10 > > > > > > This allows a whole bunch more optimisation to take place as often > not > > all of the phi channels are actually used. If some cases large > > chunks of > > logic can be remove from the if branch that doesn't contain the > texture > > access. > > > > >> > > >> > > >> I think I can for Intel though I'm not sure how this affects > > other drivers. > > >> > > >> On Intel hardware, we almost always have to combine all the > > texture sources into one big message. Since having more than one > > source is very common, this means that we have to make a temporary > > copy of the sources anyway. Because we're copying them, having them > > contiguous (a vector in NIR terms) doesn't actually gain us > > anything. We may as well let NIR scalarize them and give more > > freedom to the register allocator and other NIR passes which may > > need to clean things up. We don't want to make the same choice for > > destinations as they are required to be contiguous. > > >> > > > > > > hmm, but this is abut the phi src (ie. the tex dest), not the tex > > src, isn't it? > > > > > > Well, that's a pickle... When I wrote this pass I did so to try and > > explicitly keep from breaking up things that are known to be vectors in > > the back-end such as textures. The idea was to try and not break up > > things like this: > > > > if (...) { > > ssa_1 = tex() > > } else { > > ssa_2 = tex() > > } > > ssa_3 = phi (ssa_1, ssa_2) > > > > in the hopes that it would turn into > > > > if (...) { > > r1 = tex(); > > } else { > > r1 = tex(); > > } > > > > Clearly, that notion was mis-placed. At this point, I really wonder > > what the complexity is saving us. Maybe it's not worth it at all? > > Maybe we need to be more agressive and require all sources to not be > > vectorizable or something? > > Maybe checking if all components of the phi are used before converting > to scalar would be useful? Not sure but I've sent a more conservative v2 > of this patch where a bunch of hurt is gone. > > The remaining hurt is very small and comes from shaders like this: > > if (...) { > r1 = tex(); > } else { > if (...) { > r1 = tex(); > } else { > r1 = undefined; > } > } > We generally shouldn't let undef affect things. This is currently true but not anymore with your new heuristic. I'm thinking about how to actually write the heuristic I have in my head and it's really annoying. :'(
_______________________________________________ mesa-dev mailing list [email protected] https://lists.freedesktop.org/mailman/listinfo/mesa-dev
