Gabe Black wrote:
> Timothy M Jones wrote:
>> Hi everyone,
>>
>> On 06/06/2010 18:59, Ali Saidi wrote:
>>> On Jun 6, 2010, at 4:53 PM, Steve Reinhardt wrote:
>>>> I've only thought about this briefly, but here are a few quick
>>>> reactions:
>>>>
>>>> - PowerPC has updating ld/st instructions too. How are these handled?
>>>> Whatever we do, we should do the same thing for both.
>>>
>>> Tim, care to comment?
>>
>> Yes, the Power ISA has loads and stores that update a given register
>> with the effective address, exactly like the example Ali gave.
>>
>> I've written these so that the register update is performed in
>> completeAcc(). I haven't profiled performance in O3 and hadn't
>> thought about the dependencies that this would cause. If you've got a
>> better solution to use then I am happy to alter the Power code to use
>> this.
>>
>> To be honest, I am implementing some more instructions in Power and
>> have come across two that load or store multiple register values to
>> memory. I was going to write a question about implementing this to the
>> list anyway! If the solution to the above problem was creating
>> micro-ops, then I could implement the multiple load/store instructions
>> in the same way too, otherwise I will have to find a different
>> solution to this.
>>
>> Cheers
>> Tim
>
> I think microops are generally going to be a pretty good solution, but
> one catch is that when they can't execute in parallel for whatever
> reason (i.e. in the simple CPU) they'll count as two instructions
> instead of one. That could mess with any rough performance measurement you
> were trying to do. Also, it's not a problem per se, but make sure you do
> the register update second so that the load/store has a chance to kill
> the macroop.
>
> Also, if you already have a microop that does a register update like
> this (the stupd, store with update, microop in x86 is like this) then it
> would have to be split into two microops. This scheme would effectively
> make this microop impossible without reintroducing this problem. I'd
> assume it was part of the microop set I borrowed for a reason, and we'd
> be defeating that by eliminating it.
>
> Gabe
> _______________________________________________
> m5-dev mailing list
> [email protected]
> http://m5sim.org/mailman/listinfo/m5-dev

I just thought of another, more important drawback. In an in-order pipeline, the writeback of the separate register-update microop will take up an extra pipeline slot, effectively adding a bubble. In real hardware, I'd imagine the update would be computed in the execute stage at the same time as the address computation. This is especially important for ARM which, if I'm not mistaken, is usually implemented as an in-order pipeline.

Gabe
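
To put rough numbers on the two drawbacks discussed in this thread, here is a minimal sketch of a toy model. It is not M5 code: `Op`, `FUSED`, `SPLIT`, `ops_committed` and `inorder_cycles` are hypothetical names, and the model assumes a single-issue in-order pipeline with no stalls in which every microop occupies one slot.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """One operation occupying a single pipeline slot (toy assumption)."""
    name: str


# Two hypothetical encodings of a "load with update" instruction.
FUSED = [Op("ld_update")]              # one op: memory access plus base update
SPLIT = [Op("ld"), Op("update_base")]  # load first, register update second


def ops_committed(program):
    """Count the ops a simple CPU would report as executed instructions."""
    return sum(len(insn) for insn in program)


def inorder_cycles(program, depth=5):
    """Cycles for a single-issue in-order pipeline with no stalls:
    `depth` cycles to drain the first op, then one more cycle per extra op."""
    n = ops_committed(program)
    return depth + n - 1 if n else 0


if __name__ == "__main__":
    fused_prog = [FUSED] * 10
    split_prog = [SPLIT] * 10
    print("ops:   ", ops_committed(fused_prog), "vs", ops_committed(split_prog))
    print("cycles:", inorder_cycles(fused_prog), "vs", inorder_cycles(split_prog))
```

Under these assumptions, splitting doubles the reported op count and costs one extra cycle per load-with-update, which is the bubble described above. A real in-order model could hide or enlarge that cost, so treat the numbers as illustrative only.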
