Jason Ekstrand <[email protected]> writes: > On Dec 14, 2015 6:24 AM, "Francisco Jerez" <[email protected]> wrote: >> >> Jason Ekstrand <[email protected]> writes: >> >> > On Dec 11, 2015 5:44 AM, "Francisco Jerez" <[email protected]> > wrote: >> >> >> >> Jason Ekstrand <[email protected]> writes: >> >> >> >> > On Dec 10, 2015 6:58 AM, "Francisco Jerez" <[email protected]> >> > wrote: >> >> >> >> >> >> Jason Ekstrand <[email protected]> writes: >> >> >> >> >> >> > We aren't using it anymore. >> >> >> >> >> >> It seems useful to me to be able to represent indirect access as > part >> > of >> >> >> any instruction source or destination register. >> >> >> >> >> >> The following: >> >> >> >> >> >> | mov_indirect g0, g1, a0 >> >> >> | foo g2, g0 >> >> >> >> >> >> and the converse case with indirect destination offset (which you > don't >> >> >> seem to represent currently) can be implemented by the hardware more >> >> >> efficiently using a single instruction in certain cases. The > current >> > IR >> >> >> is able to represent what the hardware can do, but supporting the >> >> >> MOV_INDIRECT instruction only would force us to keep the indirection >> >> >> separate from the instruction that uses it, so it seems like a less >> >> >> expressive representation to me than the current approach, unless >> > you're >> >> >> willing to add _INDIRECT variants of most hardware opcodes. >> >> > >> >> > Yes and, mostly, no. Yes, you can put an indirect on almost anything >> > but >> >> > it has substantial restrictions: >> >> > >> >> Yes, I'm aware of the restrictions of indirect addressing on Gen >> >> hardware. The fact that reladdr can represent addressing modes which >> >> aren't directly implemented in the hardware doesn't invalidate the >> >> abstraction, nor implies that optimization and translation passes must >> >> be made aware of such restrictions. It just means that our abstraction >> >> is a superset of the hardware indirect addressing scheme, which is fine >> >> given a lowering pass to convert indirect addressing into a legal form. >> >> >> >> Your proposal instead goes the opposite direction and replaces the >> >> preexisting abstraction with a new abstraction which can only represent >> >> a tiny subset of the functionality of hardware instructions, which I >> >> don't think is acceptable for a low-level IR. >> >> >> >> > 1) Destination indirects must be uniform (I'm only 95% sure this is > the >> >> > case) >> >> > >> >> > 2) We only have 8 address subregisters on HSW and prior and 16 on BDW >> > which >> >> > means: >> >> > >> >> >> >> That's the kind of thing that SIMD lowering would be able to take care >> >> of easily. >> > >> > Yes, and we use it for MOV_INDIRECT. >> > >> Right, and there's no reason why it couldn't be used for reladdr in the >> same way to handle the above restrictions. >> >> >> > a) All indirects must be uniform OR >> >> > >> >> > b) We must be on Broadwell, have only two indirects, and split > the >> >> > instruction OR >> >> > >> >> > c) We can only have one indirect (and still split the > instruction on >> >> > HSW) >> >> > >> >> > So, yes, "everthing can be uniform", but not in real life. >> >> >> >> > The reladdr abstraction, while maybe ok for a high-level IR, doesn't >> >> > accurately represent the hardware at all because it can't represent >> >> > any of the restrictions without piles of helper functions and checks. >> >> >> >> The reladdr abstraction can represent the full flexibility of the >> >> hardware, while MOV_INDIRECT cannot, that makes reladdr a more accurate >> >> low-level representation of the program. For that reason I'd argue the >> >> exact opposite: MOV_INDIRECT would be a great abstraction and a welcome >> >> simplification for a high-level IR (e.g. NIR), in which nothing is lost >> >> by sacrificing the expressiveness of individual instructions in favour >> >> of instruction composition, because the back-end can always combine >> >> multiple high-level instructions into a single low-level instruction >> >> where the hardware ISA allows, as long as the low-level IR is able to >> >> represent the full functionality of hardware instructions (IOW the >> >> mapping between low-level IR instructions and hardware instructions is >> >> surjective), which is further from being the case after this series. >> > >> > Flexibility and expressiveness isn't always a good thing. >> >> Expressiveness is a good thing as long as it's required to represent the >> capabilities of the hardware. > > As long as it's required to represent those capabilities that we care > about, yes. However, the hardware can do many things that either aren't > useful for us or take so much effort for the little benefit you gain, that > it's not worth it. I'm arguing that saving a single MOV in the few places > that we do in directs isn't worth the complexity. > >> > Yes, reladdr is more "expressive", but it's not getting used because >> > that expressiveness makes it a pain. >> >> AFAIK nobody has bothered to implement it completely down to the >> back-end generator so it's not a wonder that it's not extensively used. >> I'd like to see some evidence that it would actually be a pain >> (e.g. some proof of concept implementation) before we give up and commit >> to an inherently more limited model. If you already have such a proof >> of concept, please share it. If you don't and are fully convinced that >> it will be a pain I suggest you hold up this series for a while and let >> me write the proof of concept -- Doing something with the expectation >> that it's going to be a pain is the best way to make sure it actually >> becomes a pain. Worst-case scenario you're right and it does become a >> pain, at which point I come back and review this series. Best-case >> scenario you're wrong and we don't give up useful functionality of the >> IR based on an expectation. > > Do I have a proof-of-concept in code, no. However, I've run through it in > my head and I have a pretty good idea what it would look like.
Ah, that's what I guessed. I don't think that your previous claim that
supporting indirect addressing based on the current reladdr approach
would require rewriting optimization passes to be recursive demonstrates
a particularly clear notion of what it could look like.
> You are free to go off and do it if you don't believe me, but I don't
> really want to hold things up while you do. If you go off and
> demonstrate that reladdr really is better, its not that hard to
> revert.
>
Whether it will be hard to revert or not depends on how much of the code
added in the meantime assumes that reladdr is no longer a thing, and you
cannot give any guarantees about that. I don't have any obligation to
spend time substantiating or refuting your assumption, and I wont if
your plan is to make things unnecessarily difficult for me in the
meantime.
> One other side-note: One of the substantial advantages to this series is
> the removal of the uniform size arrays. If you do decide to make a
> different prototype, please make that a major design point. When we get
> uniforms in from SPIR-V they may come in std140 packed and with just an
> offset and maybe range. The size arrays are implicitly tied to variables.
> I want them gone.
>
>> > You could say it is getting used, but it's current use is just an
>> > intermediate to something less expressive than MOV_INDIRECT. In that
>> > sense, we are expressing *more* with MOV_INDIRECT than we were before
>> > since we are also expressing in directing on registers and not just
>> > pull constants.
>> >
>> Yes, but MOV_INDIRECT is a dead end in the long-term, and the same
>> functionality you implement here could have been implemented based on
>> the preexisting model.
>>
>> > Also, while reladdr is "expressive", it doesn't actually represent the
>> > hardware. If we wanted that, we would expose the address register file
>> > (which I did consider doing). However, that comes with it's own set of
>> > issues.
>> >
>> It represents the hardware more accurately than MOV_INDIRECT in the
>> following sense: It allows us to represent a wide subset of the indirect
>> addressing capabilities of the hardware down to the generator, on any
>> instruction, source or destination we can possibly address indirectly.
>> MOV_INDIRECT on the other hand allows us to represent a MOV instruction
>> with indirectly-addressed source *only*.
>>
>> >> > The MOV_INDIRECT instruction, on the other hand, is something we can
>> >> > always do because it falls in the "one indirect" case above. The
>> >> > reladdr abstraction also suffers from the recursive reladdr problem
>> >> > which MOV_INDIRECT neatly side-steps by design.
>> >>
>> >> I don't think the reladdr "problem" requires any fundamentally more
>> >> difficult logic in optimization passes than what you need in order to
>> >> handle MOV_INDIRECT instructions reasonably.
>> >
>> > No, it just requires rewriting them because iterating over sources just
>> > became recursive.
>> >
>> That's quite an overstatement, most optimization passes only need to
>> care about the top-level register of a source or destination, because
>> wherever a reladdr is present it's not statically determined so you're
>> unlikely to be able to optimize through it anyway regardless of whether
>> you recurse or not. Cases where you actually want to iterate over each
>> register used by an instruction are a minority and don't require
>> fundamentally changing the optimization pass, the iteration can be
>> factored out as in the straightforward for_each_use() helper I
>> originally implemented to solve a different problem but which happens to
>> handle arbitrarily-nested reladdr as one would expect.
>>
>> >> > We can't even use reladdr past initial NIR -> FS stage because none
> of
>> >> > the optimization passes know what to do with it (and no, I don't want
>> >> > to add reladdr support to any of them).
>> >> >
>> >> > Yes, removing reladdr makes the IR "less expressive", but we're not
>> > really
>> >> > using that expressiveness nor is reladdr the abstraction we want.
> If we
>> >> > want to start taking more advantage of relative addressing in the
>> > future,
>> >> > we can come up with a better abstraction then.
>> >> >
>> >> > --Jason
>> >> >
>> >> >> > ---
>> >> >> > src/mesa/drivers/dri/i965/brw_fs.cpp | 7 +------
>> >> >> > src/mesa/drivers/dri/i965/brw_ir_fs.h | 5 +----
>> >> >> > 2 files changed, 2 insertions(+), 10 deletions(-)
>> >> >> >
>> >> >> > diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
>> >> > b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> >> >> > index 7cc03c5..786c5fb 100644
>> >> >> > --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
>> >> >> > +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> >> >> > @@ -433,7 +433,6 @@ fs_reg::fs_reg(struct ::brw_reg reg) :
>> >> >> > {
>> >> >> > this->reg_offset = 0;
>> >> >> > this->subreg_offset = 0;
>> >> >> > - this->reladdr = NULL;
>> >> >> > this->stride = 1;
>> >> >> > if (this->file == IMM &&
>> >> >> > (this->type != BRW_REGISTER_TYPE_V &&
>> >> >> > @@ -448,7 +447,6 @@ fs_reg::equals(const fs_reg &r) const
>> >> >> > {
>> >> >> > return (this->backend_reg::equals(r) &&
>> >> >> > subreg_offset == r.subreg_offset &&
>> >> >> > - !reladdr && !r.reladdr &&
>> >> >> > stride == r.stride);
>> >> >> > }
>> >> >> >
>> >> >> > @@ -4716,9 +4714,7 @@
>> > fs_visitor::dump_instruction(backend_instruction
>> >> > *be_inst, FILE *file)
>> >> >> > break;
>> >> >> > case UNIFORM:
>> >> >> > fprintf(file, "u%d", inst->src[i].nr +
>> >> > inst->src[i].reg_offset);
>> >> >> > - if (inst->src[i].reladdr) {
>> >> >> > - fprintf(file, "+reladdr");
>> >> >> > - } else if (inst->src[i].subreg_offset) {
>> >> >> > + if (inst->src[i].subreg_offset) {
>> >> >> > fprintf(file, "+%d.%d", inst->src[i].reg_offset,
>> >> >> > inst->src[i].subreg_offset);
>> >> >> > }
>> >> >> > @@ -4829,7 +4825,6 @@
>> >> > fs_visitor::get_instruction_generating_reg(fs_inst *start,
>> >> >> > {
>> >> >> > if (end == start ||
>> >> >> > end->is_partial_write() ||
>> >> >> > - reg.reladdr ||
>> >> >> > !reg.equals(end->dst)) {
>> >> >> > return NULL;
>> >> >> > } else {
>> >> >> > diff --git a/src/mesa/drivers/dri/i965/brw_ir_fs.h
>> >> > b/src/mesa/drivers/dri/i965/brw_ir_fs.h
>> >> >> > index c3eec2e..e4f20f4 100644
>> >> >> > --- a/src/mesa/drivers/dri/i965/brw_ir_fs.h
>> >> >> > +++ b/src/mesa/drivers/dri/i965/brw_ir_fs.h
>> >> >> > @@ -58,8 +58,6 @@ public:
>> >> >> > */
>> >> >> > int subreg_offset;
>> >> >> >
>> >> >> > - fs_reg *reladdr;
>> >> >> > -
>> >> >> > /** Register region horizontal stride */
>> >> >> > uint8_t stride;
>> >> >> > };
>> >> >> > @@ -136,8 +134,7 @@ component(fs_reg reg, unsigned idx)
>> >> >> > static inline bool
>> >> >> > is_uniform(const fs_reg ®)
>> >> >> > {
>> >> >> > - return (reg.stride == 0 || reg.is_null()) &&
>> >> >> > - (!reg.reladdr || is_uniform(*reg.reladdr));
>> >> >> > + return (reg.stride == 0 || reg.is_null());
>> >> >> > }
>> >> >> >
>> >> >> > /**
>> >> >> > --
>> >> >> > 2.5.0.400.gff86faf
>> >> >> >
>> >> >> > _______________________________________________
>> >> >> > mesa-dev mailing list
>> >> >> > [email protected]
>> >> >> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
signature.asc
Description: PGP signature
_______________________________________________ mesa-dev mailing list [email protected] http://lists.freedesktop.org/mailman/listinfo/mesa-dev
