On Wed, Aug 27, 2014 at 8:03 AM, Peter Gavin <[email protected]> wrote:
> On Tue, Aug 26, 2014 at 11:24 PM, Stefan Kristiansson
> <[email protected]> wrote:
>>
>> I'm not sure that it *has* to be slow, mor1kx executes all of them in
>> 1-2 cycles.
>> I'm also not sure if any of the SPRs that would change pipeline state
>> in an unpredictable way is actually allowed to be written to?
>> PC and the *current* register file for instance is stated as undefined
>> behavior if read/written to from a program.
>
>
> I don't believe the manual states that writing the GPR or NPC with mtspr is
> undefined behavior, however, I may have missed it buried in there somewhere.

It's on page 32:

"4.10 Next and Previous Program Counter (NPC and PPC)
The Program Counter registers represent the address just executed and
the address instruction just to be executed.
These and the GPR registers mapped into SPR space should only be used
for debugging purposes by an external debugger. Applications should
use the l.jal instruction to obtain the current program counter and
arithmethic instructions to obtain GPR register values."

So it's not explicitly stating it's undefined behavior, just that
you're not allowed to use them.

> Other potentially state-changing mtsprs would be SR, of course, and the
> TLB/ATB stuff. I found it simplest and safest just to not worry about it
> and always flush on mtspr, since the manual isn't 100% clear about what
> should happen. The impact is extremely minor IMO, since m[ft]spr should
> really only be used in OS code, and I don't think they should be considered
> fast instructions.
>
> The kernel handles this by inserting lots of nops after mtsprs that change
> TLB state or enable the IC/DC, for example:
>
> enable_mmu:
>         /*
>          * enable dmmu & immu
>          * SR[5] = 0, SR[6] = 0, 6th and 7th bit of SR set to 0
>          */
>         l.mfspr r30,r0,SPR_SR
>         l.movhi r28,hi(SPR_SR_DME | SPR_SR_IME)
>         l.ori   r28,r28,lo(SPR_SR_DME | SPR_SR_IME)
>         l.or    r30,r30,r28
>         l.mtspr r0,r30,SPR_SR
>         l.nop
>         l.nop
>         l.nop
>         /* lots more nops follow */
>
> I think this is a mistake, and that it would be better to spell out exactly
> when the state change occurs. These nops are functionally equivalent to
> flushing, on short, in-order pipelines. But simply inserting nops like this
> will be insufficient if someone decides to build an aggressive, speculative,
> dynamically scheduled pipeline based on OpenRISC. Such a pipeline could
> elide all those nops away, and then begin executing the next load
> instruction before the TLB is actually enabled. But if the pipeline flushes
> itself, those nops aren't even necessary.
>

Yes, that's not the cleanest way to enable the MMU.
The 'correct' way is to set up ESR and then issue an l.rfe instruction
(and the kernel uses this approach in other places).

Regardless, I agree that those nops serve no purpose. Implementations
should be able to function properly without them (whether they do that
by special-casing SR accesses or by flushing the pipeline on each mtspr
instruction is beside the point), and I doubt the nops are actually
even needed.

But we digress. I agree on your main point: the SPR accesses might be
slow, and that's just an implementation detail.
However, it's not possible (at least not in the multicore case) to use
main memory as a scratch area, so the motivation for having something
else to save state to is not really about speed.

Stefan
_______________________________________________
OpenRISC mailing list
[email protected]
http://lists.openrisc.net/listinfo/openrisc
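
For reference, the ESR plus l.rfe approach mentioned above could look
roughly like the sketch below. This is only an illustration, not the
kernel's actual code: the labels enable_mmu_via_rfe and mmu_enabled are
made up, the SPR_EPCR_BASE and SPR_ESR_BASE names are assumed to be
defined the way the kernel's SPR definitions header defines them, and
it assumes that exceptions cannot occur between the writes and the
l.rfe and that mmu_enabled is a valid address once translation is on.

```
enable_mmu_via_rfe:
        /* EPCR0 <- address at which execution resumes after l.rfe */
        l.movhi r28,hi(mmu_enabled)
        l.ori   r28,r28,lo(mmu_enabled)
        l.mtspr r0,r28,SPR_EPCR_BASE

        /* ESR0 <- current SR with DME and IME set */
        l.mfspr r30,r0,SPR_SR
        l.ori   r30,r30,(SPR_SR_DME | SPR_SR_IME)
        l.mtspr r0,r30,SPR_ESR_BASE

        /* l.rfe restores SR from ESR0 and PC from EPCR0 */
        l.rfe

mmu_enabled:
        /* first instruction fetched with both MMUs enabled */
```

Since l.rfe changes SR and redirects the PC in a single instruction,
the point at which the new SR takes effect is well defined and no
trailing nops are involved.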
