On Sat, May 26, 2018 at 2:03 PM, Francisco Jerez <curroje...@riseup.net> wrote:
> Jason Ekstrand <ja...@jlekstrand.net> writes: > > > On Sat, May 26, 2018 at 12:21 PM, Francisco Jerez <curroje...@riseup.net > > > > wrote: > > > >> Jason Ekstrand <ja...@jlekstrand.net> writes: > >> > >> > On Sat, May 26, 2018 at 11:10 AM, Francisco Jerez < > curroje...@riseup.net > >> > > >> > wrote: > >> > > >> >> Jason Ekstrand <ja...@jlekstrand.net> writes: > >> >> > >> >> > From: Francisco Jerez <curroje...@riseup.net> > >> >> > > >> >> > When we don't have PLN (gen4 and gen11+), we implement LINTERP as > >> either > >> >> > LINE+MAC or a pair of MADs. In both cases, the accumulator is > written > >> >> > by the first of the two instructions and read by the second. Even > >> >> > though the accumulator value isn't actually ever used from a > logical > >> >> > instruction perspective, it is trashed so we need to make the > >> scheduler > >> >> > aware. Otherwise, the scheduler could end up re-ordering > instructions > >> >> > and putting a LINTERP between another an instruction which writes > the > >> >> > accumulator and another which tries to use that result. > >> >> > > >> >> > Cc: mesa-sta...@lists.freedesktop.org > >> >> > --- > >> >> > src/intel/compiler/brw_shader.cpp | 3 ++- > >> >> > 1 file changed, 2 insertions(+), 1 deletion(-) > >> >> > > >> >> > diff --git a/src/intel/compiler/brw_shader.cpp > >> b/src/intel/compiler/brw_ > >> >> shader.cpp > >> >> > index 141b64e..dfd2c5c 100644 > >> >> > --- a/src/intel/compiler/brw_shader.cpp > >> >> > +++ b/src/intel/compiler/brw_shader.cpp > >> >> > @@ -984,7 +984,8 @@ backend_instruction::writes_ > >> accumulator_implicitly(const > >> >> struct gen_device_info > >> >> > return writes_accumulator || > >> >> > (devinfo->gen < 6 && > >> >> > ((opcode >= BRW_OPCODE_ADD && opcode < BRW_OPCODE_NOP) > || > >> >> > - (opcode >= FS_OPCODE_DDX_COARSE && opcode <= > >> >> FS_OPCODE_LINTERP))); > >> >> > + (opcode >= FS_OPCODE_DDX_COARSE && opcode <= > >> >> FS_OPCODE_LINTERP))) || > >> >> > + (opcode == FS_OPCODE_LINTERP && !devinfo->has_pln); > >> >> > >> >> The devinfo->has_pln condition is technically inaccurate, because > even > >> >> SNB will fall back to the non-PLN path which overwrites the > accumulator > >> >> for certain valid IR, which yeah I'm aware is not *typically* > >> >> encountered before this series, > >> > > >> > > >> > I'm pretty sure it's impossible before this series because, without > >> SIMD32, > >> > the barycentric coordinates always start at g2 and get incremented by > 2 > >> > every time. The only other way to get something into the coordinates > >> > source of the PLN is with a pixel interpolator message. For that, the > >> > register allocator has a workaround to ensure that it's assigned an > even > >> > register on SNB. One of the early patches in my series replaces the > >> broken > >> > gen4.5-6 PLN fall-back (it didn't work in SIMD16 because it assumed > the > >> > wrong register layout for coordinates) with an assert and Jenkins is > just > >> > fine with the assert. > >> > > >> > >> I know the conditions for the non-PLN fall-back are not typically > >> encountered on Gen5-6, but it's still valid IR, so this implementation > >> of writes_accumulator_implicitly() relies on the behavior of the > >> register allocator, the NIR-to-i965 translation pass and the rest of the > >> visitor being exactly what you expect in every possible codepath for > >> correctness. > > > > > > It's already relied on for correctness because the LINE+MAC fallback that > > we had before is wrong in SIMD16 (see patch 33). > > > > Yes, I had to do basically the same fix on my SIMD32 branch in order for > it to pass piglit on Gen4-5. Still hardly an excuse to make more code > rely on the same broken assumption which wasn't doing it before. > I'm very confused by what you mean here. The old code assumed the LINE+MAC layout register layout which is different from the PLN layout. This means that the fallback is broken for g45-gen6 because the hardware supports PLN so we use the PLN layout which arranges thing x1y1x2y2 but the LINE+MAC fallback assumes x1x2y1y2. The reason why we need so much more code is because we have to split it all the way down to 8-wide in order to handle the PLN layout with LINE+MAC. Looking at the code in your branch, I have no idea how it could possibly work correctly. My point is that the odd register fallback on g4x-gen6 was never used and didn't work even if it was. --Jason > > > >> That wasn't the case in my original patch, nor in your > >> series after PATCHv2 33 because this inaccuracy actually becomes a > >> problem. Instead of introducing code which we know is dubiously > >> correct, CC'ing mesa-stable, and then fixing it immediately, why don't > >> we just do the obviously correct thing from the start? > >> > > > > Sure, we can squash them together if you really want. But if you're > > concerned about this restriction being valid in live code, then 33/53 > also > > needs to go to stable. I'm fine with that if you're unconvinced by my > > argument that LINTERP with an odd coordinate never occurs. > > > > --Jason > > > > > >> > > >> >> but why make things more inaccurate than > >> >> the original only to revert back to a devinfo->gen based check in > >> >> PATCHv2 33? > >> >> > >> > > >> > See above. > >> > > >> > > >> >> I think I'd squash the last hunk of PATCHv2 33 into this one which > would > >> >> give you something functionally equivalent to v1 but updated to > handle > >> >> Gen11 correctly. > >> >> > >> >> > } > >> >> > > >> >> > bool > >> >> > -- > >> >> > 2.5.0.400.gff86faf > >> >> > >> >
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev