Gabe Black wrote:
> Ali Saidi wrote:
>> On Sep 10, 2010, at 2:27 AM, Gabe Black wrote:
>>
>>> Hello again. I'm currently working on x86 support in O3, and one of
>>> the problems I've run into is that the stupd (store, update base)
>>> microop expects to update the base register in completeAcc. I'm
>>> pretty sure, and Kevin has confirmed, that completeAcc is never
>>> called on stores in O3, I think partially because the instruction is
>>> considered executed when the store gets sent to the load/store queue.
>>> Please forgive me, Kevin, if I mangle the facts here. That might be
>>> because the store is considered the memory system's problem at that
>>> point, and generally nothing is done in a store's completeAcc
>>> function anyway. This isn't true, though, in either ARM or x86 when
>>> they try to update the base register there.
>>>
>>> One possible solution might be to call completeAcc right away if
>>> there's no fault in translation. At that point, assuming there isn't
>>> some sort of machine check fault because you stored to something you
>>> shouldn't, the instruction really is at the end of its life and can
>>> be considered executed. This would, I think, also address the problem
>>> we had with ARM where the updated base register would hang around
>>> until the store completed instead of being immediately available for
>>> subsequent instructions, and eliminate the need for splitting the
>>> operation into multiple microops. There would be a performance hit
>>> (hopefully a small one) from calling completeAcc all the time even
>>> when most of the time it doesn't do anything. That may or may not
>>> matter.
>>>
>>> Any thoughts? Any corrections, Kevin?
>>
>> We just changed the micro-ops in ARM so that any of the register
>> updating loads/stores were micro-coded: they calculated the new
>> address, placed it in a temp register, loaded from the temp, and then
>> moved the temp into the real register. This solution worked fine for
>> both O3 and the simple cpus. Since ARM has some interesting options
>> for PC relative loads and loading the PC via a load, I think you can
>> pretty much accomplish anything with three carefully crafted
>> micro-ops.
>>
>> Ali
>>
>> _______________________________________________
>> m5-dev mailing list
>> [email protected]
>> http://m5sim.org/mailman/listinfo/m5-dev
>
> The thing I'm not crazy about is that in x86, I'm working directly from
> the spec for the microops, so the real deal uses a single microop that
> does a store and the update. If there was no advantage to using one
> microop, I'd imagine they wouldn't have bothered. This -sounds- at
> least like it would make that work.
>
> Gabe

And by "real deal" I mean that patent, of course, not anything that
isn't already public (so don't ship me off to Siberia yet, AMD). But
that still was the real deal back in the day.

Gabe
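
For anyone following the thread, here is a rough toy model of the two
options being discussed. It is only a sketch and is not M5 code: ToyCpu,
ToyStoreUpdate, executeStore, executeMicrocoded, kToyMemLimit and the
callCompleteAccOnStores flag are all made-up names, and the "translation
fault" is just an address-range check. It assumes the store's base
update lives in completeAcc, as described above for stupd, and shows (a)
what happens when completeAcc is skipped for stores, (b) the proposed
fix of calling it right after a fault-free translation, and (c) the
three micro-op sequence Ali describes.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <map>

// Toy model only: none of these types or names exist in M5.
constexpr uint64_t kToyMemLimit = 0x10000;  // addresses at/above this "fault"

struct ToyCpu {
    std::array<uint64_t, 8> regs{};        // tiny register file
    std::map<uint64_t, uint64_t> mem;      // sparse stand-in for memory
    bool callCompleteAccOnStores = false;  // models the proposed change
};

// Single microop that stores and updates the base (loosely like stupd).
struct ToyStoreUpdate {
    int dataReg;
    int baseReg;
    int64_t disp;

    // Compute the address and hand the store to "memory".
    // Returns false to model a translation fault.
    bool initiateAcc(ToyCpu &cpu, uint64_t &addr) const {
        addr = cpu.regs[baseReg] + static_cast<uint64_t>(disp);
        if (addr >= kToyMemLimit)
            return false;
        cpu.mem[addr] = cpu.regs[dataReg];
        return true;
    }

    // The base register update lives here, as in the stupd case above.
    void completeAcc(ToyCpu &cpu, uint64_t addr) const {
        cpu.regs[baseReg] = addr;
    }
};

// Models how a store is handled: completeAcc only runs if the flag is set.
void executeStore(ToyCpu &cpu, const ToyStoreUpdate &st) {
    assert(st.baseReg >= 0 && st.baseReg < 8);
    assert(st.dataReg >= 0 && st.dataReg < 8);
    uint64_t addr = 0;
    if (!st.initiateAcc(cpu, addr))
        return;                    // fault: no store, no base update
    if (cpu.callCompleteAccOnStores)
        st.completeAcc(cpu, addr); // proposed: run right after translation
}

// Alternative: three micro-ops using a temp register (here, register 7).
void executeMicrocoded(ToyCpu &cpu, int dataReg, int baseReg, int64_t disp) {
    const int temp = 7;
    cpu.regs[temp] = cpu.regs[baseReg] + static_cast<uint64_t>(disp); // 1
    cpu.mem[cpu.regs[temp]] = cpu.regs[dataReg];                      // 2
    cpu.regs[baseReg] = cpu.regs[temp];                               // 3
}
```

With the flag off, the store lands in memory but the base register keeps
its old value, which is the symptom described at the top of the thread.
With the flag on, or with the three micro-op version, the base ends up
pointing at the stored address. The toy does not model timing, so it
says nothing about the performance question raised above.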
