Applying the patch that splits the x86 condition code register caused a larger 
performance degradation on my end. For simple micro-benchmarks the slowdown is 
as large as 50%, due to increased pressure on the physical registers. I have 
not tested the SPEC benchmarks yet.

The impact is large enough that avoiding unnecessary register reads and writes 
should recover a substantial part of the lost performance.

Yasuko

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of 
Gabe Black
Sent: Friday, May 25, 2012 12:45 AM
To: [email protected]
Subject: Re: [gem5-dev] performance impact of splitting up the registers

On 05/24/12 07:56, Nilay Vaish wrote:
> On Thu, 24 May 2012, Gabe Black wrote:
>
>> Some quick testing shows that splitting the x86 condition code 
>> register up adds about an 8% penalty on twolf simple atomic, relative 
>> to a version of gem5 with some performance improvements that aren't 
>> checked in yet. That's ok since this is something that needs to 
>> happen, but that shows the overhead of reading the extra registers. 
>> Avoiding reading any unnecessary registers (including the zero 
>> register as a
>> placeholder/substitute) will help recover that lost performance.
>>
>
> Gabe, what performance improvements do you have in mind? We had 
> discussed before the possibility of the registers read/written by a 
> microop being decided at the time of construction of the microop, 
> instead of at compile time. From the discussion it seemed that we 
> would need to assess the execution time impact of any such change 
> as it would probably affect all the ISAs.
>
> --
> Nilay
> _______________________________________________
> gem5-dev mailing list
> [email protected]
> http://m5sim.org/mailman/listinfo/gem5-dev

Just finding a way to avoid the unnecessary register reads, i.e. the ones for 
bits that aren't being set or read, which is what we talked about before. If we 
can simply avoid the reads without changing the way the CPUs work, then it 
shouldn't affect the other ISAs. We'll want to keep the changes localized to 
the way the StaticInsts are set up, because technically that's all that really 
needs to change.
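
As a rough illustration of that idea (a sketch only, not gem5 code), a microop 
could work out at construction time which of the split flag registers it 
actually needs and list only those as sources. The flag bit positions below are 
the architectural x86 ones; the FlagReg names, the three-way grouping, and 
neededFlagRegs are hypothetical and chosen just for this example.

```cpp
#include <cstdint>
#include <vector>

// Architectural x86 RFLAGS bit positions.
constexpr uint32_t CFBit = 1u << 0;
constexpr uint32_t PFBit = 1u << 2;
constexpr uint32_t AFBit = 1u << 4;
constexpr uint32_t ZFBit = 1u << 6;
constexpr uint32_t SFBit = 1u << 7;
constexpr uint32_t DFBit = 1u << 10;
constexpr uint32_t OFBit = 1u << 11;

// Hypothetical split of the condition code register into three pieces.
// The grouping is an assumption for illustration, not the patch's layout.
enum FlagReg { FlagRegZAPS = 0, FlagRegCFOF = 1, FlagRegDF = 2, NumFlagRegs = 3 };

// Which flag bits live in which split register (indexed by FlagReg).
constexpr uint32_t flagRegMasks[NumFlagRegs] = {
    ZFBit | AFBit | PFBit | SFBit,  // FlagRegZAPS
    CFBit | OFBit,                  // FlagRegCFOF
    DFBit                           // FlagRegDF
};

// Given the flag bits a microop reads or sets, return only the split
// registers that hold at least one of those bits. A microop that touches
// no flags gets an empty list, so no placeholder register is read.
std::vector<FlagReg> neededFlagRegs(uint32_t flagsUsed)
{
    std::vector<FlagReg> srcs;
    for (int r = 0; r < NumFlagRegs; ++r) {
        if (flagsUsed & flagRegMasks[r])
            srcs.push_back(static_cast<FlagReg>(r));
    }
    return srcs;
}
```

With this kind of setup, a microop that only tests ZF would list one flag 
source register instead of all of them, and the CPU models would not need to 
change because they simply iterate over whatever sources the instruction lists.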

A larger goal might be to reduce the number of calls to the thread context 
generally. Maybe we could add a readIntRegs (note the s) and similar functions 
which read/write multiple registers at once and reduce the function call 
overhead.
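
A minimal sketch of what such a batched accessor might look like, assuming a 
toy register file. ToyThreadContext, its members, and the readIntRegs signature 
are all hypothetical here; this is not gem5's actual ThreadContext interface, 
and the call counter exists only to make the difference visible.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a thread context (NOT gem5's real class).
struct ToyThreadContext
{
    std::array<uint64_t, 16> intRegs{};
    mutable std::size_t calls = 0;  // number of accessor calls made

    // Existing style: one call per register read.
    uint64_t readIntReg(std::size_t idx) const
    {
        assert(idx < intRegs.size());
        ++calls;
        return intRegs[idx];
    }

    // Hypothetical batched read: fetch `count` registers in one call.
    // `idxs` holds the register indices, `out` receives the values.
    void readIntRegs(const std::size_t *idxs, std::size_t count,
                     uint64_t *out) const
    {
        ++calls;
        for (std::size_t i = 0; i < count; ++i) {
            assert(idxs[i] < intRegs.size());
            out[i] = intRegs[idxs[i]];
        }
    }
};
```

Reading three source registers through readIntReg costs three calls, while 
readIntRegs does the same work in one. Whether that actually helps would depend 
on how the real accessors are dispatched, so it would need to be measured.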

It also wouldn't hurt to double-check my measurement of how much of an impact 
splitting up the registers had.

Gabe
_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev