On 05/24/12 07:56, Nilay Vaish wrote:
> On Thu, 24 May 2012, Gabe Black wrote:
>
>> Some quick testing shows that splitting the x86 condition code register
>> up adds about an 8% penalty on twolf simple atomic, relative to a
>> version of gem5 with some performance improvements that aren't checked
>> in yet. That's ok since this is something that needs to happen, but that
>> shows the overhead of reading the extra registers. Avoiding reading any
>> unnecessary registers (including the zero register as a
>> placeholder/substitute) will help recover that lost performance.
>>
>
> Gabe, what performance improvements do you have in mind? We had
> discussed before the possibility of the registers read/written by a
> microop being decided at the time of construction of the microop,
> instead of at compile time. From the discussion it seemed that we
> would need to assess the execution time impact of any such change
> as it would probably affect all the ISAs.
>
> --
> Nilay
> _______________________________________________
> gem5-dev mailing list
> [email protected]
> http://m5sim.org/mailman/listinfo/gem5-dev

Just finding a way to avoid the unnecessary register reads, i.e. the ones
for bits that aren't being set or read, which is what we talked about
before. If we can simply avoid the reads without changing the way the
CPUs work, then it shouldn't affect the other ISAs. We'll want to keep
the changes localized to the way the StaticInsts are set up, because
technically that's all that really needs to change.

A larger goal might be to reduce the number of calls to the thread
context generally. Maybe we could add a readIntRegs (note the s) and
similar functions which read/write multiple registers at once and reduce
the function call overhead.

It also wouldn't hurt to double check my measurement of how much of an
impact splitting up the registers made.

Gabe
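
As a rough illustration of the readIntRegs idea, here is a minimal sketch
using a toy stand-in for a thread context. Everything in it is an
assumption made for illustration: ToyThreadContext, NumIntRegs, the
callCount() counter, and the readIntRegs signature are hypothetical and
are not gem5's actual ThreadContext interface. It only shows how a
batched read turns several calls into one.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a thread context; NOT gem5's real class.
class ToyThreadContext
{
  public:
    static constexpr std::size_t NumIntRegs = 8;

    void
    setIntReg(std::size_t idx, std::uint64_t val)
    {
        assert(idx < NumIntRegs);
        regs[idx] = val;
    }

    // One call into the context per register read.
    std::uint64_t
    readIntReg(std::size_t idx)
    {
        assert(idx < NumIntRegs);
        ++calls;
        return regs[idx];
    }

    // Assumed batched variant: a single call reads `count` registers,
    // named by `idxs`, into `out`.
    void
    readIntRegs(const std::size_t *idxs, std::size_t count,
                std::uint64_t *out)
    {
        ++calls;
        for (std::size_t i = 0; i < count; ++i) {
            assert(idxs[i] < NumIntRegs);
            out[i] = regs[idxs[i]];
        }
    }

    // Number of read calls made so far, to make the saving visible.
    unsigned callCount() const { return calls; }

  private:
    std::array<std::uint64_t, NumIntRegs> regs{};
    unsigned calls = 0;
};
```

With this toy class, reading two registers through readIntReg costs two
calls, while one readIntRegs call with an index list of length two costs
one. Whether that saving would show up in a real simulator run would
still need to be measured.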
