On Wed, Aug 27, 2014 at 4:56 AM, Matt Thomas <[email protected]> wrote:
> On Aug 26, 2014, at 6:39 PM, Peter Gavin <[email protected]> wrote:
>
> > On Tue, Aug 26, 2014 at 9:00 PM, Matt Thomas <[email protected]> wrote:
> > I'd be happy that if fast context switching isn't implemented that an extra
> > partial set (4 is enough, 8 is better) of GPR SPRs would be made available.
> >
> > Finishing the fast context switching/shadow register stuff is probably the
> > best way to go with this, IMO.
>
> I agree, mostly. But it seems incomplete.
>

This has been discussed in detail earlier, and the conclusion then was,
as Peter suggested, to use the context switching/shadow register
support:
http://lists.openrisc.net/pipermail/openrisc/2014-May/002159.html

To sum it up, you can exploit the shadowed GPRs without using the full
set of fast context switch features. Reading and writing the shadowed
GPRs is already supported in or1ksim, and mor1kx has support for this
as well.
By exploiting the shadowed GPRs like this, you get exactly the
functionality you are asking for (with the added bonus that you have
32 'SPRG' registers instead of 4-8).
Relying on this feature of course limits you to a smaller set of
implementations, but that would be even more true for 'SPRG' registers.

>
> > The problem with using SPRs for temporary data is that accessing them is
> > slow, because the SPR number must be calculated and decoded in order to
> > determine which SPR is being accessed. My implementation actually flushes
> > the pipeline on mtspr instructions, and my guess is that other pipelines
> > are similar, because the mtspr can change important pipeline state in a way
> > that is not predictable. (I could have avoided the flush in some cases,
> > but I didn't think the extra logic was worth it.) And the register being
> > written to by mtspr is probably determined too late to easily bypass the
> > result to earlier pipe stages (unless you have a really deep pipeline).
> > So, depending on pipeline depth and architecture, an mtspr instruction can
> > effectively take several cycles to execute. The result is that it probably
> > won't be any faster than just going through main memory, assuming a
> > reasonable cache hit rate.
>

I'm not sure that it *has* to be slow; mor1kx executes all of them in
1-2 cycles.
I'm also not sure whether any of the SPRs that would change pipeline
state in an unpredictable way are actually allowed to be written to?
PC and the *current* register file, for instance, are stated to have
undefined behavior if read or written from a program.

> Without SPRGs, not trashing registers on exception becomes more difficult.
> Because r0 is not fixed as 0, the very first thing the exception handler must
> do is l.xor r0, r0, r0 to make sure r0 is zero. Or I suppose you can use r0
> as an early temporary and then set it to 0 later. Or you can reserve two
> registers for exclusive kernel use like MIPS does. It just gets nasty.

r0 *may* not be fixed to zero, but you can't rely on it not being fixed,
so you can't use it as a temporary.
Our Linux port resorts to a hack where it uses the memory area between
0x0 and 0x100 as temporary storage.
This of course only works for uniprocessor systems, and that is why we
had the previous discussion.

Side note: to be friendlier to RTL simulations, it's better to use
l.movhi r0,0 or l.andi r0,r0,0 to make sure r0 is cleared.

>
> > BTW I think I remember finding some problems with fast context switching as
> > defined in the spec. It's been a while since I looked at that so I can't
> > recall what those problems are now. And IIRC, there are *no*
> > implementations that support it, not even or1ksim.

As noted above, even though I think it's correct that no implementation
implements the full fast context switching, there are implementations
that implement a useful subset of it.

Stefan
_______________________________________________
OpenRISC mailing list
[email protected]
http://lists.openrisc.net/listinfo/openrisc
