>>>> My idea is to be able to inherit from the standard op types like
>>>> IntRegOperand and allow them to install more than one index and/or
>>>> change how they're declared, read, and written. So say for example you
>>>> had a 128 bit wide SIMD instruction operating on four floating point
>>>> registers which are normally accessible as 32 bit, indexed through some
>>>> strange scheme. You could say that operand generates 4 indices,
>>>> determined by whatever weird formula or just sequentially. Then you
>>>> could define a structure which was the union of an array of 4 32 bit
>>>> ints and 4 floats or 2 doubles, or 16 bytes, or ... The constructor
>>>> could take the return of reading the 4 indices with
>>>> readFloatRegOperandBits to fill its main array, and then the
>>>> instruction could access whatever other representation it needed. The
>>>> other direction would work as well. Then, assuming we get this
>>>> source/dest thing straightened out, you could have an instruction that,
>>>> say, did parallel byte adds defined like this:
>>>>
>>>> for (int i = 0; i < 7; i++)
>>>>    Dest.bytes[i] = Source1.bytes[i] + Source2.bytes[i];
>>>>
>>>> And all the other goop would figure itself out. If some other scheme was
>>>> needed (runtime selectable width, selectable float vs. int, etc) then
>>>> the IntRegOperand subclass combined with the composite operand type
>>>> could have the smarts to do the right thing all tidily hidden away after
>>>> it was set up. Note that Dest is just a dest, even though it uses a
>>>> structure field.
>>>>
>>> I'm not sure how you're envisioning this will work... are you assuming
>>> there's a full C++ parser like gcc-xml?  How would you know that in this
>>> case you're overwriting all of Dest, but if the upper bound of the loop
>>> was 6 and not 7 then it would be just a partial overwrite?  That's a
>>> level of compiler analysis I *really* don't want to get into.  (Yes, it
>>> is theoretically feasible and it would be pretty slick, but I'm sure
>>> there are many other things that would have a greater ROI than that.)
>>>
>>> Could we do this with C++ operator overloading?  Seems like you could
>>> just say "Dest = Source1 + Source2;" which would be obvious to our code
>>> parser, and then have C++ do the magic.  The type extensions could be
>>> used as casts to make this more flexible, e.g., "Dest = Source1@bv +
>>> Source2@bv" could do byte-by-byte adds while "Source1@fv + Source2@fv"
>>> could do it by 32-bit floats, or something like that.
>> The only tricky thing there is determining what is a source and what is
>> a dest, although it highlights how hard that can be.
>
> I still don't really get your proposal, in terms of how the code sample you
> have parsed above would get turned into a concrete set of source and dest
> operand indices.

Very roughly, it would look something like this.

struct SimdData
{
    union
    {
        uint32_t regBits[4];
        uint8_t bytes[16];
    };
};

def operand_types {{
    'SimdData' : 'SimdData'
}};

class SimdOp(FloatRegOperand):
    def makeConstructor(self):
        # Claim four consecutive source slots; reg_spec (?) is the base reg.
        return '''
            _srcRegIdx[numSrcRegs++] = %(reg)s + 0 + FP_Base_DepTag;
            _srcRegIdx[numSrcRegs++] = %(reg)s + 1 + FP_Base_DepTag;
            _srcRegIdx[numSrcRegs++] = %(reg)s + 2 + FP_Base_DepTag;
            _srcRegIdx[numSrcRegs++] = %(reg)s + 3 + FP_Base_DepTag;
        ''' % {'reg': self.reg_spec}
    def makeRead(self):
        # idxStart (?) is the first of this operand's four register slots.
        fields = {'name': self.base_name, 'idx': self.idxStart}
        return '''
            %(name)s.regBits[0] =
                xc->readFloatRegOperandBits(this, %(idx)d + 0, final_val);
            %(name)s.regBits[1] =
                xc->readFloatRegOperandBits(this, %(idx)d + 1, final_val);
            %(name)s.regBits[2] =
                xc->readFloatRegOperandBits(this, %(idx)d + 2, final_val);
            %(name)s.regBits[3] =
                xc->readFloatRegOperandBits(this, %(idx)d + 3, final_val);
        ''' % fields
    def makeWrite(self):
        # Write each 32 bit piece back to its own slot, not all to the first.
        fields = {'name': self.base_name, 'idx': self.idxStart}
        return '''
            xc->setFloatRegOperandBits(this, %(idx)d + 0, %(name)s.regBits[0]);
            xc->setFloatRegOperandBits(this, %(idx)d + 1, %(name)s.regBits[1]);
            xc->setFloatRegOperandBits(this, %(idx)d + 2, %(name)s.regBits[2]);
            xc->setFloatRegOperandBits(this, %(idx)d + 3, %(name)s.regBits[3]);
        ''' % fields

def operands {{
    'Dest' : ('SimdOp', 'SimdData', 'DEST', 'IsInteger', 1),
    'Source1' : ('SimdOp', 'SimdData', 'SOURCE1', 'IsInteger', 2),
    'Source2' : ('SimdOp', 'SimdData', 'SOURCE2', 'IsInteger', 3),
}};
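
To make the data flow concrete, here is a standalone sketch of what the
generated read, the instruction body, and the generated write would amount to
for the parallel byte add. Nothing here is real M5 code: regFile, readSimd,
writeSimd and parallelByteAdd are names made up for illustration and stand in
for the ExecContext calls. It also assumes the compiler tolerates type punning
through a union (gcc does; the C++ standard doesn't promise it).

```cpp
#include <cstdint>

// A 16 byte SIMD value: four 32 bit register images, also viewable as bytes.
// Reading one union member after writing the other is a compiler extension.
struct SimdData
{
    union
    {
        uint32_t regBits[4];
        uint8_t bytes[16];
    };
};

// Hypothetical stand-in for the float register file (not the ExecContext).
static uint32_t regFile[32];

// Roughly what the generated read code does: pull in four consecutive regs.
SimdData
readSimd(int idxStart)
{
    SimdData d;
    for (int i = 0; i < 4; i++)
        d.regBits[i] = regFile[idxStart + i];
    return d;
}

// Roughly what the generated write code does: push four regs back out.
void
writeSimd(int idxStart, const SimdData &d)
{
    for (int i = 0; i < 4; i++)
        regFile[idxStart + i] = d.regBits[i];
}

// The instruction body: sixteen independent byte adds, each wrapping mod 256.
void
parallelByteAdd(int dest, int src1, int src2)
{
    SimdData Source1 = readSimd(src1);
    SimdData Source2 = readSimd(src2);
    SimdData Dest;
    for (int i = 0; i < 16; i++)
        Dest.bytes[i] = Source1.bytes[i] + Source2.bytes[i];
    writeSimd(dest, Dest);
}
```

The point is that the instruction body only ever touches the bytes view, and
the operand class is the only thing that knows the value lives in four
separate registers.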

>
>>>> Also the operands might be smart enough to change how they set
>>>> themselves up on the fly. Let's say in a particular mode you only need 2
>>>> 32 bit floats and the other two spots are zeros. The operand
>>>> initialization code could figure out what mode it's in at construction
>>>> time that it doesn't need all 4 operands and could only fill in 2 spots.
>>>> The next operand would then pack in behind it. This would hopefully make
>>>> it easier to get multiple behaviors without having to define a new
>>>> instruction (or just code blob) for each one.
>>>>
>>> Are there concrete examples of this that you've encountered?  I don't see
>>> us defining many (any?) new ISAs, so at this point we should focus on
>>> cleaning up the ones we have (or just implementing extensions, like maybe
>>> SSE ops that haven't been done yet).  I'd bet that any features we don't
>>> have a concrete use case for now are extremely unlikely to ever get used.
>>>
>> Yes, although I can't talk about them. Being able to take more than one
>> slot will be 95% of the way to being able to take however many slots you
>> need, so I expect it would add very little complexity.
>>
> When you say "at construction time", do you mean in the C++ constructor, or
> when the C++ gets generated?  Conceptually this seems reasonable to me.

I think in the C++ constructor, but I haven't quite figured out the
details of all this stuff yet. I figured I should make some inroads or
it would never go anywhere.
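
As a rough sketch of the "decide in the C++ constructor" option, assuming it
ends up working that way at all: the constructor picks how many slots each
operand takes, and the next operand packs in right behind it. SketchInst,
addSimdSource and the halfWidth flag are invented for this example, and the
FP_Base_DepTag value is a placeholder, not the real constant.

```cpp
// Hypothetical, stripped-down stand-in for a generated StaticInst subclass.
struct SketchInst
{
    static constexpr int MaxSrcRegs = 8;
    static constexpr int FP_Base_DepTag = 32;  // placeholder value

    int _srcRegIdx[MaxSrcRegs];
    int numSrcRegs;

    // Install 'count' consecutive float register indices starting at 'base'.
    void
    addSimdSource(int base, int count)
    {
        for (int i = 0; i < count; i++)
            _srcRegIdx[numSrcRegs++] = base + i + FP_Base_DepTag;
    }

    // 'halfWidth' is a made-up mode flag: only 2 of the 4 regs are needed.
    SketchInst(int source1, int source2, bool halfWidth) : numSrcRegs(0)
    {
        int count = halfWidth ? 2 : 4;
        addSimdSource(source1, count);
        addSimdSource(source2, count);  // packs in behind source1
    }
};
```

In half width mode the instruction ends up with four source slots instead of
eight, and the second operand starts at slot 2 rather than slot 4.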

Gabe
_______________________________________________
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev
