[Bug rtl-optimization/90168] context-sensitive local register allocation

2019-05-05 Thread ebotcazou at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90168

--- Comment #5 from Eric Botcazou  ---
> How about adjusting REG_FREQ_MAX to be the same as BB_FREQ_MAX? Currently
> REG_FREQ_MAX/BB_FREQ_MAX is 1/10.

The way out is probably to use a 64-bit fixed-point type like profiling.

[Bug rtl-optimization/90168] context-sensitive local register allocation

2019-05-04 Thread fxue at os dot amperecomputing.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90168

--- Comment #4 from Feng Xue  ---
(In reply to Andrew Pinski from comment #3)
> >or to use float type to hold frequency?
> 
> This won't work correctly because floating-point behavior differs between
> hosts.  Some uses of floating point inside GCC were removed because of that
> issue.  I thought that was documented somewhere too.

How about adjusting REG_FREQ_MAX to be the same as BB_FREQ_MAX? Currently
REG_FREQ_MAX/BB_FREQ_MAX is 1/10.

[Bug rtl-optimization/90168] context-sensitive local register allocation

2019-05-04 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90168

--- Comment #3 from Andrew Pinski  ---
>or to use float type to hold frequency?

This won't work correctly because floating-point behavior differs between
hosts.  Some uses of floating point inside GCC were removed because of that
issue.  I thought that was documented somewhere too.

[Bug rtl-optimization/90168] context-sensitive local register allocation

2019-04-19 Thread ebotcazou at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90168

Eric Botcazou  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |NEW
            Version|unknown                     |9.0
           Severity|minor                       |enhancement

[Bug rtl-optimization/90168] context-sensitive local register allocation

2019-04-19 Thread fxue at os dot amperecomputing.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90168

--- Comment #2 from Feng Xue  ---
(In reply to Eric Botcazou from comment #1)
> > Suppose a function like the following, in which 'cond', 'S1' and 'S2' are
> > completely unrelated, meaning they do not access the same variables (in
> > terms of RA, they own separate live range sets).
> > 
> >   f1()
> >   {
> >     if (cond) {
> >       S1
> >     } else {
> >       S2
> >     }
> >   }
> > 
> > Ideally, we can expect that register allocation on 'S1' is totally
> > independent of 'S2': it makes no difference whether 'S2' is present or not.
> 
> This seems a rather far-fetched assumption, to say the least.  This would
> essentially imply that no global optimization is applied to the function.
> 
> > Its result should be the same as for the function below, consisting of
> > only 'S1':
> > 
> >   f2()
> >   {
> >     S1
> >   }
> > 
> > But we found GCC does not have this property. Strictly speaking, this is
> > not a bug, but it exposes some kind of instability in code generation and
> > has an unpredictable impact on some optimizations, such as inlining.
> 
> And do you know of any non-toy/production compiler that has the property?

LLVM has it, and ICC nearly does (only minor differences in register numbers,
but exactly the same spills).

> 
> > Investigation shows this is related to the integer-based frequency
> > normalization (REG_FREQ_FROM_BB) used by RA, which always rounds a small
> > frequency (less than 1) up to 1. In foo1(), the introduction of new code
> > decreases the profile counts of CODES, so the frequency normalization
> > error becomes more significant and actually distorts the original
> > proportion of profile counts among basic blocks in CODES. For example, in
> > foo(), two blocks have counts of 3 and 100 respectively; in foo1() they
> > become 0.3 and 10, which after rounding up are 1 and 10. The proportion
> > thus changes from (3 vs 100) to (1 vs 10).
> > 
> > A possible solution might be to adjust the two scale factors used by
> > REG_FREQ_FROM_BB, namely REG_FREQ_MAX and BB_FREQ_MAX, or to use a float
> > type to hold the frequency?
> 
> Which version of the compiler are you using? This changed in GCC 8.

GCC trunk (9.0.1 20190325)

[Bug rtl-optimization/90168] context-sensitive local register allocation

2019-04-19 Thread ebotcazou at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90168

Eric Botcazou  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |WAITING
   Last reconfirmed|                            |2019-04-19
                 CC|                            |ebotcazou at gcc dot gnu.org
            Summary|Unstable register           |context-sensitive local
                   |allocation result for same  |register allocation
                   |source code                 |
     Ever confirmed|0                           |1
           Severity|normal                      |minor

--- Comment #1 from Eric Botcazou  ---
> Suppose a function like the following, in which 'cond', 'S1' and 'S2' are
> completely unrelated, meaning they do not access the same variables (in
> terms of RA, they own separate live range sets).
> 
>   f1()
>   {
>     if (cond) {
>       S1
>     } else {
>       S2
>     }
>   }
> 
> Ideally, we can expect that register allocation on 'S1' is totally
> independent of 'S2': it makes no difference whether 'S2' is present or not.

This seems a rather far-fetched assumption, to say the least.  This would
essentially imply that no global optimization is applied to the function.

> Its result should be the same as for the function below, consisting of
> only 'S1':
> 
>   f2()
>   {
>     S1
>   }
> 
> But we found GCC does not have this property. Strictly speaking, this is
> not a bug, but it exposes some kind of instability in code generation and
> has an unpredictable impact on some optimizations, such as inlining.

And do you know of any non-toy/production compiler that has the property?

> Investigation shows this is related to the integer-based frequency
> normalization (REG_FREQ_FROM_BB) used by RA, which always rounds a small
> frequency (less than 1) up to 1. In foo1(), the introduction of new code
> decreases the profile counts of CODES, so the frequency normalization
> error becomes more significant and actually distorts the original
> proportion of profile counts among basic blocks in CODES. For example, in
> foo(), two blocks have counts of 3 and 100 respectively; in foo1() they
> become 0.3 and 10, which after rounding up are 1 and 10. The proportion
> thus changes from (3 vs 100) to (1 vs 10).
> 
> A possible solution might be to adjust the two scale factors used by
> REG_FREQ_FROM_BB, namely REG_FREQ_MAX and BB_FREQ_MAX, or to use a float
> type to hold the frequency?

Which version of the compiler are you using? This changed in GCC 8.