Applying the patch that splits the x86 condition code register resulted in a more significant performance degradation on my end. The degradation is as large as 50% for simple micro-benchmarks, due to increased pressure on the physical registers. I have not tested SPEC benchmarks yet.
The performance impact is large enough that avoiding unnecessary register reads and writes would improve performance considerably.

Yasuko

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Gabe Black
Sent: Friday, May 25, 2012 12:45 AM
To: [email protected]
Subject: Re: [gem5-dev] performance impact of splitting up the registers

On 05/24/12 07:56, Nilay Vaish wrote:
> On Thu, 24 May 2012, Gabe Black wrote:
>
>> Some quick testing shows that splitting the x86 condition code
>> register up adds about an 8% penalty on twolf simple atomic, relative
>> to a version of gem5 with some performance improvements that aren't
>> checked in yet. That's ok since this is something that needs to
>> happen, but that shows the overhead of reading the extra registers.
>> Avoiding reading any unnecessary registers (including the zero
>> register as a placeholder/substitute) will help recover that lost
>> performance.
>>
>
> Gabe, what performance improvements do you have in mind? We had
> discussed before the possibility of the registers read/written by a
> microop being decided at the time of construction of the microop,
> instead of at compile time. From the discussion it seemed that we
> would need to assess the execution time impact of any such change
> as it would probably affect all the ISAs.
>
> --
> Nilay
> _______________________________________________
> gem5-dev mailing list
> [email protected]
> http://m5sim.org/mailman/listinfo/gem5-dev

Just finding a way to avoid the unnecessary register reads, i.e. the ones for bits that aren't being set or read, which is what we talked about before. If we can simply avoid the reads without changing the way the CPUs work, then it shouldn't affect the other ISAs. We'll want to keep the changes localized to the way the StaticInsts are set up, because technically that's all that really needs to change.

A larger goal might be to reduce the number of calls to the thread context generally.
Maybe we could add a readIntRegs (note the s) and similar functions which read/write multiple registers at once and reduce the function call overhead.

It also wouldn't hurt to double check my measurement as far as how much of an impact splitting up the registers made.

Gabe
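
For illustration, the batched-read idea above could look roughly like the sketch below. This is only a toy under stated assumptions: ToyThreadContext, its call counter, and the readIntRegs signature are hypothetical and are not gem5's actual interface. It shows nothing more than the call-count difference between reading registers one at a time and reading them in a single call.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a thread context; NOT gem5's actual class.
class ToyThreadContext
{
  public:
    virtual ~ToyThreadContext() = default;

    // One virtual call per register read.
    virtual std::uint64_t
    readIntReg(std::size_t idx)
    {
        ++calls;
        return regs[idx];
    }

    // Hypothetical batched read (assumed signature): one virtual call
    // fetches `count` registers into `out`.
    virtual void
    readIntRegs(const std::size_t *indices, std::size_t count,
                std::uint64_t *out)
    {
        ++calls;
        for (std::size_t i = 0; i < count; ++i)
            out[i] = regs[indices[i]];
    }

    void setIntReg(std::size_t idx, std::uint64_t val) { regs[idx] = val; }

    std::size_t calls = 0;  // number of read calls made into the context

  private:
    std::array<std::uint64_t, 16> regs{};
};

// Sum three source registers with one context call per register.
inline std::uint64_t
sumOneByOne(ToyThreadContext &tc, const std::size_t (&src)[3])
{
    return tc.readIntReg(src[0]) + tc.readIntReg(src[1]) +
           tc.readIntReg(src[2]);
}

// Same work using a single batched call.
inline std::uint64_t
sumBatched(ToyThreadContext &tc, const std::size_t (&src)[3])
{
    std::uint64_t vals[3];
    tc.readIntRegs(src, 3, vals);
    return vals[0] + vals[1] + vals[2];
}
```

In this toy, three source registers cost three calls into the context when read individually and one call when batched. Whether that translates into a measurable speedup in the simulator would still need to be measured.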
