This email has a lot of good tidbits in it that I would like to see as doxygen 
comments in the ptstate base class.

Thanks,
Ali

On Oct 24, 2010, at 2:31 AM, Gabe Black wrote:

> Since replying inline to your top level review isn't possible on
> reviewboard, I'll respond to that part here and the rest there. Also,
> thank you to you and Nate for your reviews. I took a little longer than
> I'd have liked to get back to you considering how much I pestered you
> guys :-), but I didn't want to rush out a sloppy reply.
> 
> Steve Reinhardt wrote:
>> -----------------------------------------------------------
>> This is an automatically generated e-mail. To reply, visit:
>> http://reviews.m5sim.org/r/255/#review405
>> -----------------------------------------------------------
>> 
>> 
>> So overall this is great, I love how it gets rid of the setNextPC & 
>> setNextNPC stuff all over the place, and the '#if ISA_HAS_DELAY_SLOT' in the 
>> bpred, etc.  Most of my concerns are just superficial issues, like making 
>> the syntax a little easier.
>> 
> 
> Thanks!
> 
>> 
>>> If you just
>>> want the address of the current or next instruction or the current micro PC,
>>> you can get those through read-only accessors on either the PCState type or
>>> the *Contexts. These are instAddr(), nextInstAddr(), and microPC().
>>> 
>> 
>> What's the distinction between calling instAddr() and pc() (and similarly 
>> nextInstAddr() vs npc())?  It looks to me like they're currently 
>> interchangeable, yet sometimes one is used and sometimes the other.
>> 
> 
> instAddr, nextInstAddr, and microPC are guaranteed to be there, read
> only, and what ISA independent code is allowed to use. pc, npc, etc. are
> read/write accessors, there at the whim of the ISA and for its exclusive
> use in managing the PC. Another distinction which came up in our recent
> discussion of the PAL bit is that instAddr and nextInstAddr are supposed
> to be literal memory addresses for the instruction, hence the name
> change. The "pc", which is usually equivalent but not always like in the
> case of Alpha, is an ISA level concept. Part of what might be confusing
> is that a lot of the time they're equal to each other. Another somewhat
> contrived example would be if we were to implement an ISA with a word
> addressed PC like the LC2K ISA (all the people not from MI now say "the
> what?"). In an effort towards consistency and towards sharing the
> implementation among ISAs, the pc, npc, etc. scheme is used throughout
> as applicable.
> 
>> 
>>> The PCs were previously managed entirely by the CPU which had to know about 
>>> PC
>>> semantics, try to figure out which dimension to increment the PC in, what to
>>> set NPC/NNPC, etc. These decisions are best left to the ISA in conjunction
>>> with the PC type itself. Because most of the information about how to
>>> increment the PC (mainly what type of instruction it refers to) is contained
>>> in the instruction object, a new advancePC virtual function was added to the
>>> StaticInst class. Subclasses provide an implementation that moves around the
>>> right element of the PC with a minimal amount of decision making. In ISAs 
>>> like
>>> Alpha, the instructions always simply assign NPC to PC without having to 
>>> worry
>>> about micropcs, nnpcs, etc. The added cost of a virtual function call should
>>> be outweighed by not having to figure out as much about what to do with the
>>> PCs and mucking around with the extra elements.
>>> 
>> 
>> Does it make sense to go through an ISA-specific function, for example:
>> 
>> inline
>> TheISA::advancePC(PCState &pcState, StaticInstPtr inst)
>> {
>>    pcState.advance();
>> OR
>>    inst->advancePC(pcState);
>> }
>> 
> 
> As far as pcState.advance(), no. The idea is that the instruction knows
> what would follow it (a microop or a new instruction) and advances the
> PC in the right dimension. As far as inst->advancePC that would
> functionally work, but I'm not sure how it would be useful. Wouldn't
> that just obscure what's going on?
> 
>> 
>>> One drawback of making the StaticInsts advance the PC is that you have to
>>> actually have one to advance the PC. This would, superficially, seem to
>>> require decoding an instruction before fetch could advance. This is, as far 
>>> as
>>> I can tell, realistic. fetch would advance through memory addresses, not 
>>> PCs,
>>> perhaps predicting new memory addresses using existing ones. More
>>> sophisticated decisions about control flow would be made later on, after the
>>> instruction was decoded, and handed back to fetch. If branching needs to
>>> happen, some amount of decoding needs to happen to see that it's a branch,
>>> what the target is, etc. This could get a little more complicated if that 
>>> gets
>>> done by the predecoder, but I'm choosing to ignore that for now.
>>> 
>> 
>> Sounds reasonable to me.
>> 
>> 
>>> Variable length instructions:
>>> 
>>> To handle variable length instructions in x86 and ARM, the predecoder now
>>> takes in the current by reference to the getExtMachInst function. It can
>>> 
>> 
>> You're missing "PC" after "current" here.
>> 
> 
> Yep. Fixed in my local copy.
> 
>> 
>>> ISA parser:
>>> 
>>> To support the new API, all PC related operand types were removed from the
>>> parser and replaced with a PCState type. There are two warts on this
>>> implementation. First, as with all the other operand types, the PCState 
>>> still
>>> has to have a valid operand type even though it doesn't use it. Second, 
>>> using
>>> syntax like PCS.npc(target) doesn't work for two reasons, this looks like 
>>> the
>>> syntax for operand type overriding, and the parser can't figure out if 
>>> you're
>>> reading or writing. Instructions that use the PCS operand (which I've
>>> consistently called it) need to first read it into a local variable,
>>> manipulate it, and then write it back out.
>>> 
>> 
>> See my comments below... I don't see why this change is necessary or even 
>> desirable as opposed to just making the existing PC/NPC/NNPC/UPC etc. 
>> operands do the right thing, at least in the common cases.
>> 
> 
> I would have really like to keep the old syntax since I completely agree
> this new version is a bit clunky and round about. The basic problem is
> that unlike in the previous system, there aren't accessors for reading
> and writing parts of the PC since what those parts are isn't defined
> external to the ISA. Because of that, an instruction that wants to
> modify the PC needs to bring in a copy of the whole thing, do any
> modifications locally, and then send the whole revised version back.
> This could be done locally to update each operand, but then you'd have a
> lot of wasteful copying of the PCState structure and calls to get/set it.
> 
> One possible solution would be to keep track of whether -any- member of
> the PC was being accessed, read it in once, use it, and if any member
> was written to write it out at the end. Then the individual components
> could be accessed with operands. This could be a little limitting,
> though, because the names, quantities, arguments, etc. of the accessors
> and raw members aren't fixed. We'd have to have a syntax for sort of
> dynamically building up an operand type on a per ISA basis to give
> access to all the bits of the PC. This would all be possible, but I
> think is outside the scope of this already very large change.
> 
>> 
>>> The return address stack needed a little extra help because, in the presence
>>> of branch delay slots, it has to merge together elements of the return PC 
>>> and
>>> the call PC. To handle that, a buildRetPC utility function was added. There
>>> are basically only two versions in all the ISAs, but it didn't seem short
>>> enough to put into the generic ISA directory. Also, the branch predictor 
>>> code
>>> in O3 and InOrder were adjusted so that they always store the PC of the 
>>> actual
>>> call instruction in the RAS, not the next PC. If the call instruction is a
>>> microop, the next PC refers to the next microop in the same macroop which is
>>> probably not desirable. The buildRetPC function advances the PC 
>>> intelligently
>>> to the next macroop (in an ISA specific way) so that that case works.
>>> 
>> 
>> I don't know that it matters, but just as a sanity check: wouldn't it be 
>> equivalent to call buildRetPC on the call PC beforehand, then store the 
>> result in the RAS?  I.e., it's not what gets stored that matters, it's using 
>> the buildRetPC() logic as opposed to just using nextPC()?
>> 
>> 
> 
> Yeah, I think that's functionally equivalent. I expect this way would
> have slightly better performance because it defers the work until you
> know you need it, but either way should work.
> 
> Gabe
> _______________________________________________
> m5-dev mailing list
> [email protected]
> http://m5sim.org/mailman/listinfo/m5-dev
> 

_______________________________________________
m5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/m5-dev

Reply via email to