>>>> My idea is to be able to inherit from the standard op types like
>>>> IntRegOperand and allow them to install more than one index and/or
>>>> change how they're declared, read, and written. So say for example you
>>>> had a 128 bit wide SIMD instruction operating on four floating point
>>>> registers which are normally accessible as 32 bit, indexed through some
>>>> strange scheme. You could say that operand generates 4 indices,
>>>> determined by whatever weird formula or just sequentially. Then you
>>>> could define a structure which was the union of an array of 4 32 bit
>>>> ints and 4 floats or 2 doubles, or 16 bytes, or ... The constructor
>>>> could take the return of reading the 4 indices with
>>>> readFloatRegOperandBits to fill its main array, and then the
>>>> instruction could access whatever other representation it needed. The
>>>> other direction would work as well. Then, assuming we get this
>>>> source/dest thing straightened out, you could have an instruction that,
>>>> say, did parallel byte adds defined like this:
>>>>
>>>> for (int i = 0; i < 7; i++)
>>>>     Dest.bytes[i] = Source1.bytes[i] + Source2.bytes[i];
>>>>
>>>> And all the other goop would figure itself out. If some other scheme was
>>>> needed (runtime selectable width, selectable float vs. int, etc.) then
>>>> the IntRegOperand subclass combined with the composite operand type
>>>> could have the smarts to do the right thing all tidily hidden away after
>>>> it was set up. Note that Dest is just a dest, even though it uses a
>>>> structure field.
>>>>
>>> I'm not sure how you're envisioning this will work... are you assuming
>>> there's a full C++ parser like gcc-xml? How would you know that in this
>>> case you're overwriting all of Dest, but if the upper bound of the loop was
>>> 6 and not 7 then it would be just a partial overwrite? That's a level of
>>> compiler analysis I *really* don't want to get into.
>>> (Yes, it is theoretically feasible and it would be pretty slick, but I'm
>>> sure there are many other things that would have a greater ROI than that.)
>>>
>>> Could we do this with C++ operator overloading? Seems like you could just
>>> say "Dest = Source1 + Source2;" which would be obvious to our code parser,
>>> and then have C++ do the magic. The type extensions could be used as casts
>>> to make this more flexible, e.g., "Dest = Source1@bv + Source2@bv" could do
>>> byte-by-byte adds while "Source1@fv + Source2@fv" could do it by 32-bit
>>> floats, or something like that.
>> The only tricky thing there is determining what is a source and what is
>> a dest, although it highlights how hard that can be.
>
> I still don't really get your proposal, in terms of how the code sample you
> have parsed above would get turned into a concrete set of source and dest
> operand indices.
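
On the operator overloading suggestion quoted above, here is a minimal sketch of what the C++ side might look like, so the instruction body stays a single assignment. Everything in it is made up for illustration: the storage layout of SimdData, the fromRegBits/toRegBits helpers, and the bv() wrapper standing in for the "@bv" type extension. None of it exists in the tree. It copies with memcpy instead of reading through a union, since reading an inactive union member is not well defined in standard C++.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical 128 bit SIMD value, kept as 16 raw bytes.
struct SimdData
{
    std::array<std::uint8_t, 16> bytes{};

    // Pack four 32 bit register images (for example the results of four
    // readFloatRegOperandBits calls) into the byte array.
    static SimdData
    fromRegBits(const std::array<std::uint32_t, 4> &regBits)
    {
        SimdData d;
        std::memcpy(d.bytes.data(), regBits.data(), d.bytes.size());
        return d;
    }

    // Unpack back into four 32 bit register images for writing.
    std::array<std::uint32_t, 4>
    toRegBits() const
    {
        std::array<std::uint32_t, 4> regBits;
        std::memcpy(regBits.data(), bytes.data(), bytes.size());
        return regBits;
    }
};

// Hypothetical stand-in for the "@bv" extension: a wrapper type that
// selects the byte-by-byte interpretation of a SimdData.
struct ByteView
{
    SimdData v;
};

inline ByteView
bv(const SimdData &d)
{
    return ByteView{d};
}

// Parallel byte add over all 16 lanes; each lane wraps modulo 256.
inline SimdData
operator+(const ByteView &a, const ByteView &b)
{
    SimdData out;
    for (std::size_t i = 0; i < out.bytes.size(); i++) {
        out.bytes[i] =
            static_cast<std::uint8_t>(a.v.bytes[i] + b.v.bytes[i]);
    }
    return out;
}
```

With something like that, the instruction body would just be "Dest = bv(Source1) + bv(Source2);", and the parser only has to see a plain assignment to Dest. That covers only the arithmetic, though. As for how the operands turn into register indices:
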
Very roughly, it would look something like this.

struct SimdData
{
    union
    {
        uint32_t regBits[4];
        uint8_t bytes[16];    // 4 x 32 bits = 16 bytes
    };
};

def operand_types {{
    'SimdData' : 'SimdData'
}};

class SimdOp(FloatRegOperand):
    def makeConstructor(self):
        # One source index per 32 bit register. This assumes reg_spec
        # holds the operand's base register expression.
        return '''
        _srcRegIdx[numSrcRegs++] = %s + 0 + FP_Base_DepTag;
        _srcRegIdx[numSrcRegs++] = %s + 1 + FP_Base_DepTag;
        _srcRegIdx[numSrcRegs++] = %s + 2 + FP_Base_DepTag;
        _srcRegIdx[numSrcRegs++] = %s + 3 + FP_Base_DepTag;
        ''' % ((self.reg_spec,) * 4)

    def makeRead(self):
        # idxStart (?) is the first of the operand's four slots.
        return '''
        %s.regBits[0] = xc->readFloatRegOperandBits(this, %d + 0, final_val);
        %s.regBits[1] = xc->readFloatRegOperandBits(this, %d + 1, final_val);
        %s.regBits[2] = xc->readFloatRegOperandBits(this, %d + 2, final_val);
        %s.regBits[3] = xc->readFloatRegOperandBits(this, %d + 3, final_val);
        ''' % ((self.base_name, self.idxStart) * 4)

    def makeWrite(self):
        return '''
        xc->setFloatRegOperandBits(this, %d + 0, %s.regBits[0]);
        xc->setFloatRegOperandBits(this, %d + 1, %s.regBits[1]);
        xc->setFloatRegOperandBits(this, %d + 2, %s.regBits[2]);
        xc->setFloatRegOperandBits(this, %d + 3, %s.regBits[3]);
        ''' % ((self.idxStart, self.base_name) * 4)

def operands {{
    'Dest' : ('SimdOp', 'SimdData', 'DEST', 'IsInteger', 1),
    'Source1' : ('SimdOp', 'SimdData', 'SOURCE1', 'IsInteger', 2),
    'Source2' : ('SimdOp', 'SimdData', 'SOURCE2', 'IsInteger', 3),
}};

>
>>>> Also the operands might be smart enough to change how they set
>>>> themselves up on the fly. Let's say in a particular mode you only need 2
>>>> 32 bit floats and the other two spots are zeros. The operand
>>>> initialization code could figure out what mode it's in at construction
>>>> time that it doesn't need all 4 operands and could only fill in 2 spots.
>>>> The next operand would then pack in behind it. This would hopefully make
>>>> it easier to get multiple behaviors without having to define a new
>>>> instruction (or just code blob) for each one.
>>>>
>>> Are there concrete examples of this that you've encountered? I don't see us
>>> defining many (any?)
>>> new ISAs, so at this point we should focus on cleaning
>>> up the ones we have (or just implementing extensions, like maybe SSE ops
>>> that haven't been done yet). I'd bet that any features we don't have a
>>> concrete use case for now are extremely unlikely to ever get used.
>>>
>> Yes, although I can't talk about them. Being able to take more than one
>> slot will be 95% of the way to being able to take however many slots you
>> need, so I expect it would add very little complexity.
>>
> When you say "at construction time", do you mean in the C++ constructor, or
> when the C++ gets generated? Conceptually this seems reasonable to me.

I think in the C++ constructor, but I haven't quite figured out the details
of all this stuff yet. I figured I should make some inroads or it would
never go anywhere.

Gabe
_______________________________________________
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev