Re: [m5-dev] src/dest detection in the ISA descriptions

Steve Reinhardt Wed, 27 Apr 2011 10:36:28 -0700

On Wed, Apr 27, 2011 at 10:14 AM, Gabe Black <gbl...@eecs.umich.edu> wrote:


> On 04/27/11 08:02, Steve Reinhardt wrote:
> > On Wed, Apr 27, 2011 at 1:21 AM, Gabe Black <gbl...@eecs.umich.edu> wrote:
> >
> >>> Perhaps the heuristics could simply be extended to deal with
> >>> structure field accesses... if the thing after the symbol is a ".",
> >>> then you look past that to see if there's an equals sign or not, and
> >>> behave appropriately.
> >> "appropriately" isn't clearly defined, really. [...]
> >>
> > "appropriately" was just a placeholder for "stuff I'm sure we can figure
> > out later". [...]
>
> My point is that it -is- too complicated to try to figure out if you
> overwrite an entire dest or just part of it. That's why you need to be
> able to say which it is by hand.
>

That's fine... having a system that does the right thing automatically most
of the time (like it does now, ideally with extensions to avoid awkward
workarounds and increase the fraction of time it's automatically right)
coupled with a way to efficiently do manual overrides when it's not right
seems like a fine approach to me.  Extending the heuristics as I said could
still be a part of avoiding workarounds and increasing the number of cases
where it does the right thing automatically.
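
A minimal sketch of that combination (automatic heuristic plus a manual
override), assuming operands are plain identifiers. The names
classifyOperand, OperandUse and partialDests are hypothetical and are not
taken from the actual ISA parser; this only illustrates looking past field
accesses and subscripts for an equals sign.

```cpp
#include <iterator>
#include <regex>
#include <set>
#include <string>

// Hypothetical classification of how a code snippet uses one operand.
enum class OperandUse { Source, Dest, SourceAndDest };

// Count non-overlapping matches of re in text.
static int countMatches(const std::string &text, const std::regex &re)
{
    return static_cast<int>(std::distance(
        std::sregex_iterator(text.begin(), text.end(), re),
        std::sregex_iterator()));
}

// partialDests is the manual override: operands listed there are only
// partly overwritten, so they are treated as sources as well.
OperandUse classifyOperand(const std::string &code, const std::string &name,
                           const std::set<std::string> &partialDests = {})
{
    // Optional run of field accesses (".bytes") and subscripts ("[i]").
    const std::string access = R"((?:\s*\.\s*\w+|\s*\[[^\]]*\])*)";
    const std::string sym = "\\b" + name + "\\b";

    const std::regex anyUse(sym);
    // A lone "=" (not "==") after the symbol is a plain write.
    const std::regex plainWrite(sym + access + R"(\s*=(?!=))");
    // A compound assignment such as "+=" both reads and writes.
    const std::regex compoundWrite(
        sym + access + R"(\s*(?:[-+*/%&|^]|<<|>>)=)");

    const int uses = countMatches(code, anyUse);
    const int plain = countMatches(code, plainWrite);
    const int compound = countMatches(code, compoundWrite);

    const bool written = plain + compound > 0;
    // Read if it appears anywhere other than as a plain write target,
    // or if the override says the write is only partial.
    const bool read = compound > 0 || uses > plain + compound ||
                      (written && partialDests.count(name) > 0);

    if (written && read)
        return OperandUse::SourceAndDest;
    return written ? OperandUse::Dest : OperandUse::Source;
}
```

With the byte-add loop quoted below, classifyOperand(code, "Dest") reports a
pure dest, and passing {"Dest"} as partialDests forces it to be a source as
well. The heuristic cannot tell by itself whether the loop covers all of
Dest, which is exactly the case the override is for.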


> >> My idea is to be able to inherit from the standard op types like
> >> IntRegOperand and allow them to install more than one index and/or
> >> change how they're declared, read, and written. So say for example you
> >> had a 128 bit wide SIMD instruction operating on four floating point
> >> registers which are normally accessible as 32 bit, indexed through some
> >> strange scheme. You could say that operand generates 4 indices,
> >> determined by whatever weird formula or just sequentially. Then you
> >> could define a structure which was the union of an array of 4 32 bit
> >> ints and 4 floats or 2 doubles, or 16 bytes, or ... The constructor
> >> could take the return of reading the 4 indices with
> >> readFloatRegOperandBits to fill its main array, and then the
> >> instruction could access whatever other representation it needed. The
> >> other direction would work as well. Then, assuming we get this
> >> source/dest thing straightened out, you could have an instruction that,
> >> say, did parallel byte adds defined like this:
> >>
> >> for (int i = 0; i < 7; i++)
> >>    Dest.bytes[i] = Source1.bytes[i] + Source2.bytes[i];
> >>
> >> And all the other goop would figure itself out. If some other scheme was
> >> needed (runtime selectable width, selectable float vs. int, etc) then
> >> the IntRegOperand subclass combined with the composite operand type
> >> could have the smarts to do the right thing all tidily hidden away after
> >> it was set up. Note that Dest is just a dest, even though it uses a
> >> structure field.
> >>
> > I'm not sure how you're envisioning this will work... are you assuming
> > there's a full C++ parser like gcc-xml?  How would you know that in this
> > case you're overwriting all of Dest, but if the upper bound of the loop
> > was 6 and not 7 then it would be just a partial overwrite?  That's a
> > level of compiler analysis I *really* don't want to get into.  (Yes, it
> > is theoretically feasible and it would be pretty slick, but I'm sure
> > there are many other things that would have a greater ROI than that.)
> >
> > Could we do this with C++ operator overloading?  Seems like you could
> > just say "Dest = Source1 + Source2;" which would be obvious to our code
> > parser, and then have C++ do the magic.  The type extensions could be
> > used as casts to make this more flexible, e.g., "Dest = Source1@bv +
> > Source2@bv" could do byte-by-byte adds while "Source1@fv + Source2@fv"
> > could do it by 32-bit floats, or something like that.
>
> The only tricky thing there is determining what is a source and what is
> a dest, although it highlights how hard that can be.


I still don't really get your proposal, in terms of how the code sample you
have above would get parsed and turned into a concrete set of source and
dest operand indices.
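
To make the question concrete, here is one possible reading of the composite
operand idea, as a sketch only. Simd128, operandIndices and parallelByteAdd
are hypothetical names, the four indices are assumed to be sequential, and
the loop covers all 16 bytes (the quoted loop stops at 7). The sketch uses
memcpy rather than a union so the reinterpretation is well defined in C++.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical composite operand: one 128-bit value assembled from four
// raw 32-bit register reads, viewable as bytes, 32-bit words or floats.
struct Simd128 {
    std::array<std::uint8_t, 16> bytes{};

    Simd128() = default;

    // Fill the byte array from four raw register values, e.g. what
    // readFloatRegOperandBits would return for each index.
    explicit Simd128(const std::array<std::uint32_t, 4> &regs)
    {
        std::memcpy(bytes.data(), regs.data(), bytes.size());
    }

    // The other direction: split back into four raw register values.
    std::array<std::uint32_t, 4> toRegs() const
    {
        std::array<std::uint32_t, 4> regs{};
        std::memcpy(regs.data(), bytes.data(), bytes.size());
        return regs;
    }

    // View lane i (0..3) as a 32-bit float.
    float floatAt(std::size_t i) const
    {
        float f;
        std::memcpy(&f, bytes.data() + 4 * i, sizeof f);
        return f;
    }
};

// Assumption: an operand rooted at baseIndex claims four sequential
// register indices. A real scheme could use any formula here.
std::array<int, 4> operandIndices(int baseIndex)
{
    return {{baseIndex, baseIndex + 1, baseIndex + 2, baseIndex + 3}};
}

// Parallel byte add over the full 128 bits; each byte wraps separately.
Simd128 parallelByteAdd(const Simd128 &a, const Simd128 &b)
{
    Simd128 dest;
    for (std::size_t i = 0; i < dest.bytes.size(); ++i)
        dest.bytes[i] = static_cast<std::uint8_t>(a.bytes[i] + b.bytes[i]);
    return dest;
}
```

Under this reading the index list comes from the operand declaration
(operandIndices), not from analyzing the loop body, so whether the loop
overwrites the whole dest would still have to be stated by hand.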


> Operator
> overloading seems a little too magical to me, and it wouldn't be obvious
> to somebody looking at the code what's going on. I also don't like
> having to engineer all the possible operations ahead of time and/or
> define a type that does each one. One of the problems I've seen with our
> approach so far is that having a few or even many prepackaged options
> breaks down as you scale it out. I would rather not just push out that
> boundary since we have a better idea of where it needs to be today.
> Granted, the types and operators could be defined as needed, but having
> to make a set for every operation would be pretty cumbersome.
>

I don't think operator overloading would be that confusing in this
particular case, but I agree it's not a general solution.
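
For illustration, a rough sketch of what the overloading approach could look
like. ByteVec and FloatVec are hypothetical stand-ins for whatever the "@bv"
and "@fv" type extensions would expand to; neither exists in the code base.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical view of a 128-bit operand as 16 bytes ("@bv").
struct ByteVec {
    std::array<std::uint8_t, 16> v{};
};

// Hypothetical view of the same operand as four 32-bit floats ("@fv").
struct FloatVec {
    std::array<float, 4> v{};
};

// Byte-by-byte add; each lane wraps around independently.
inline ByteVec operator+(const ByteVec &a, const ByteVec &b)
{
    ByteVec r;
    for (std::size_t i = 0; i < r.v.size(); ++i)
        r.v[i] = static_cast<std::uint8_t>(a.v[i] + b.v[i]);
    return r;
}

// Lane-by-lane float add.
inline FloatVec operator+(const FloatVec &a, const FloatVec &b)
{
    FloatVec r;
    for (std::size_t i = 0; i < r.v.size(); ++i)
        r.v[i] = a.v[i] + b.v[i];
    return r;
}
```

With these, the instruction body is just "dest = src1 + src2;", which the
code parser can classify trivially. The cost is the one Gabe raises: every
operation needs its own operator or type defined ahead of time.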


> >> Also the operands might be smart enough to change how they set
> >> themselves up on the fly. Let's say in a particular mode you only need 2
> >> 32 bit floats and the other two spots are zeros. The operand
> >> initialization code could figure out at construction time, from the
> >> mode it's in, that it doesn't need all 4 operands and could fill in
> >> only 2 spots.
> >> The next operand would then pack in behind it. This would hopefully make
> >> it easier to get multiple behaviors without having to define a new
> >> instruction (or just code blob) for each one.
> >>
> > Are there concrete examples of this that you've encountered?  I don't see
> > us defining many (any?) new ISAs, so at this point we should focus on
> > cleaning up the ones we have (or just implementing extensions, like maybe
> > SSE ops that haven't been done yet).  I'd bet that any features we don't
> > have a concrete use case for now are extremely unlikely to ever get used.
> >
>
> Yes, although I can't talk about them. Being able to take more than one
> slot will be 95% of the way to being able to take however many slots you
> need, so I expect it would add very little complexity.
>

When you say "at construction time", do you mean in the C++ constructor, or
when the C++ gets generated?  Conceptually this seems reasonable to me.
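
If it is the C++ constructor, I imagine something like the sketch below.
SrcRegList and VecOperand are hypothetical names, and the narrowMode flag
stands in for whatever mode information would really be available.

```cpp
#include <vector>

// Hypothetical stand-in for an instruction's list of source register
// indices, filled in operand by operand.
struct SrcRegList {
    std::vector<int> indices;

    // Append count sequential indices starting at baseReg and return the
    // slot where this operand's first index landed.
    int claim(int baseReg, int count)
    {
        const int first = static_cast<int>(indices.size());
        for (int k = 0; k < count; ++k)
            indices.push_back(baseReg + k);
        return first;
    }
};

// Hypothetical operand that decides in its constructor how many slots
// it needs: 2 in the narrow mode, 4 otherwise.
struct VecOperand {
    int numSlots;
    int firstSlot;

    VecOperand(SrcRegList &list, int baseReg, bool narrowMode)
        : numSlots(narrowMode ? 2 : 4),
          firstSlot(list.claim(baseReg, numSlots))
    {
    }
};
```

Constructing a narrow operand and then a full-width one on the same list
gives the first operand slots 0-1 and the second slots 2-5, so the next
operand packs in behind the previous one without any gap.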

Steve
