Re: RFA: one more version of the patch for PR61360

2014-10-04 Thread Richard Sandiford
Uros Bizjak ubiz...@gmail.com writes:
 On Thu, Oct 2, 2014 at 10:13 PM, Vladimir Makarov vmaka...@redhat.com wrote:

 I guess we achieved the consensus about the following patch to fix
 PR61360

 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61360

 The patch was successfully bootstrapped and tested (w/wo
 -march=amdfam10) on x86/x86-64.

 Is it ok to commit to trunk?


 I've tested your patch and unfortunately it doesn't work:

 In file included from
 /var/tmp/moz-build-dir/js/src/shell/Unified_cpp_js_src_shell0.cpp:15:0:
 /var/tmp/mozilla-central/js/src/shell/js.cpp: In function ‘void
 Process(JSContext*, JSObject*, const char*, bool)’:
 /var/tmp/mozilla-central/js/src/shell/js.cpp:592:1: internal compiler
 error: in lra_update_insn_recog_data, at lra.c:1221
   }
   ^
 0xa9d9ec lra_update_insn_recog_data(rtx_insn*)
  ../../gcc/gcc/lra.c:1220
 0xab450f eliminate_regs_in_insn
  ../../gcc/gcc/lra-eliminations.c:1077
 0xab450f process_insn_for_elimination
  ../../gcc/gcc/lra-eliminations.c:1344
 0xab450f lra_eliminate(bool, bool)
  ../../gcc/gcc/lra-eliminations.c:1408
 0xa9f2da lra(_IO_FILE*)
  ../../gcc/gcc/lra.c:2270
 0xa5d659 do_reload
  ../../gcc/gcc/ira.c:5311
 0xa5d659 execute
  ../../gcc/gcc/ira.c:5470


 Testcase is attached:

   % g++ -c -march=amdfam10 -w -O2 js.ii
 js.ii: In function ‘void RunFile(C)’:
 js.ii:64:1: internal compiler error: in lra_update_insn_recog_data, at
 lra.c:1221


 Thanks for reporting this, Marcus.  The problem now is in
 optimize_function_for_size_p.  It is false, when we define and cache enable
 attributes at early stages (instantation of virtual regs) and true later.

 It is returning us to the same problem.  I believe that we should not have
 enable attributes depending on optimization options or on the current
 running pass if we don't want the current solution in the trunk (with
 recog_init).  Setting right value for optimize_function_for_size_p does not
 solve the problem as we can have different options for different functions
 in the same compilation file.

 So minimal solution would be removing optimize_function_for_size_p from the
 attribute definition.  But I guess we could remove all condition.
 Unfortunately, Ganesh did not post is it really beneficial to switch off
 this alternative for AMD CPUs even if AMD optimization guide recommends it.

 I propose to split the pattern into size-optimized and normal pattern.
 The patch implements this proposal.

An alternative would be to add two new enabled-like attributes,
good_for_size and good_for_speed, that say whether it is efficient
to use a particular alternative.  These attributes wouldn't ever stop
an existing instruction from being recognised; they would just say
whether the RA and optimisers should consider the alternative when
optimising for size or speed respectively.  The attributes would have
the same restrictions as the enabled attribute and could be cached in
the same way.

The enabled attribute would then be purely about whether an
instruction is available, not whether it's efficient in a particular
situation.

The main advantage is that it would be possible to make the size/speed
choice at a basic block level rather than a function level.  In the worst
case we might move an instruction from a hot block to a cold block or
vice versa, but with intelligent spilling that shouldn't happen too
often and at least it wouldn't trigger an ICE.

If that sounds OK, I'll try to get something together next week.
I think Ramana said he had a use for this on ARM too.

Thanks,
Richard





Re: RFA: one more version of the patch for PR61360

2014-10-04 Thread Uros Bizjak
On Sat, Oct 4, 2014 at 12:49 PM, Richard Sandiford
rdsandif...@googlemail.com wrote:
 Uros Bizjak ubiz...@gmail.com writes:
 On Thu, Oct 2, 2014 at 10:13 PM, Vladimir Makarov vmaka...@redhat.com 
 wrote:

 I guess we achieved the consensus about the following patch to fix
 PR61360

 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61360

 The patch was successfully bootstrapped and tested (w/wo
 -march=amdfam10) on x86/x86-64.

 Is it ok to commit to trunk?


 I've tested your patch and unfortunately it doesn't work:

 In file included from
 /var/tmp/moz-build-dir/js/src/shell/Unified_cpp_js_src_shell0.cpp:15:0:
 /var/tmp/mozilla-central/js/src/shell/js.cpp: In function ‘void
 Process(JSContext*, JSObject*, const char*, bool)’:
 /var/tmp/mozilla-central/js/src/shell/js.cpp:592:1: internal compiler
 error: in lra_update_insn_recog_data, at lra.c:1221
   }
   ^
 0xa9d9ec lra_update_insn_recog_data(rtx_insn*)
  ../../gcc/gcc/lra.c:1220
 0xab450f eliminate_regs_in_insn
  ../../gcc/gcc/lra-eliminations.c:1077
 0xab450f process_insn_for_elimination
  ../../gcc/gcc/lra-eliminations.c:1344
 0xab450f lra_eliminate(bool, bool)
  ../../gcc/gcc/lra-eliminations.c:1408
 0xa9f2da lra(_IO_FILE*)
  ../../gcc/gcc/lra.c:2270
 0xa5d659 do_reload
  ../../gcc/gcc/ira.c:5311
 0xa5d659 execute
  ../../gcc/gcc/ira.c:5470


 Testcase is attached:

   % g++ -c -march=amdfam10 -w -O2 js.ii
 js.ii: In function ‘void RunFile(C)’:
 js.ii:64:1: internal compiler error: in lra_update_insn_recog_data, at
 lra.c:1221


 Thanks for reporting this, Marcus.  The problem now is in
 optimize_function_for_size_p.  It is false, when we define and cache enable
 attributes at early stages (instantation of virtual regs) and true later.

 It is returning us to the same problem.  I believe that we should not have
 enable attributes depending on optimization options or on the current
 running pass if we don't want the current solution in the trunk (with
 recog_init).  Setting right value for optimize_function_for_size_p does not
 solve the problem as we can have different options for different functions
 in the same compilation file.

 So minimal solution would be removing optimize_function_for_size_p from the
 attribute definition.  But I guess we could remove all condition.
 Unfortunately, Ganesh did not post is it really beneficial to switch off
 this alternative for AMD CPUs even if AMD optimization guide recommends it.

 I propose to split the pattern into size-optimized and normal pattern.
 The patch implements this proposal.

 An alternative would be to add two new enabled-like attributes,
 good_for_size and good_for_speed, that say whether it is efficient
 to use a particular alternative.  These attributes wouldn't ever stop
 an existing instruction from being recognised; they would just say
 whether the RA and optimisers should consider the alternative when
 optimising for size or speed respectively.  The attributes would have
 the same restrictions as the enabled attribute and could be cached in
 the same way.

 The enabled attribute would then be purely about whether an
 instruction is available, not whether it's efficient in a particular
 situation.

 The main advantage is that it would be possible to make the size/speed
 choice at a basic block level rather than a function level.  In the worst
 case we might move an instruction from a hot block to a cold block or
 vice versa, but with intelligent spilling that shouldn't happen too
 often and at least it wouldn't trigger an ICE.

 If that sounds OK, I'll try to get something together next week.
 I think Ramana said he had a use for this on ARM too.

I think that this would be way better than duplication of patterns.
Perhaps we can name these attributes enable_for_size and
enable_for_speed, and have them in addition to enable attribute.
The final enable condition would be an union of enable,
enable_for_speed and enable_for_size attributes.

Uros.


Re: RFA: one more version of the patch for PR61360

2014-10-04 Thread Richard Sandiford
Uros Bizjak ubiz...@gmail.com writes:
 On Sat, Oct 4, 2014 at 12:49 PM, Richard Sandiford
 rdsandif...@googlemail.com wrote:
 Uros Bizjak ubiz...@gmail.com writes:
 On Thu, Oct 2, 2014 at 10:13 PM, Vladimir Makarov
 vmaka...@redhat.com wrote:

 I guess we achieved the consensus about the following patch to fix
 PR61360

 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61360

 The patch was successfully bootstrapped and tested (w/wo
 -march=amdfam10) on x86/x86-64.

 Is it ok to commit to trunk?


 I've tested your patch and unfortunately it doesn't work:

 In file included from
 /var/tmp/moz-build-dir/js/src/shell/Unified_cpp_js_src_shell0.cpp:15:0:
 /var/tmp/mozilla-central/js/src/shell/js.cpp: In function ‘void
 Process(JSContext*, JSObject*, const char*, bool)’:
 /var/tmp/mozilla-central/js/src/shell/js.cpp:592:1: internal compiler
 error: in lra_update_insn_recog_data, at lra.c:1221
   }
   ^
 0xa9d9ec lra_update_insn_recog_data(rtx_insn*)
  ../../gcc/gcc/lra.c:1220
 0xab450f eliminate_regs_in_insn
  ../../gcc/gcc/lra-eliminations.c:1077
 0xab450f process_insn_for_elimination
  ../../gcc/gcc/lra-eliminations.c:1344
 0xab450f lra_eliminate(bool, bool)
  ../../gcc/gcc/lra-eliminations.c:1408
 0xa9f2da lra(_IO_FILE*)
  ../../gcc/gcc/lra.c:2270
 0xa5d659 do_reload
  ../../gcc/gcc/ira.c:5311
 0xa5d659 execute
  ../../gcc/gcc/ira.c:5470


 Testcase is attached:

   % g++ -c -march=amdfam10 -w -O2 js.ii
 js.ii: In function ‘void RunFile(C)’:
 js.ii:64:1: internal compiler error: in lra_update_insn_recog_data, at
 lra.c:1221


 Thanks for reporting this, Marcus.  The problem now is in
 optimize_function_for_size_p.  It is false, when we define and cache enable
 attributes at early stages (instantation of virtual regs) and true later.

 It is returning us to the same problem.  I believe that we should not have
 enable attributes depending on optimization options or on the current
 running pass if we don't want the current solution in the trunk (with
 recog_init).  Setting right value for optimize_function_for_size_p does not
 solve the problem as we can have different options for different functions
 in the same compilation file.

 So minimal solution would be removing optimize_function_for_size_p from the
 attribute definition.  But I guess we could remove all condition.
 Unfortunately, Ganesh did not post is it really beneficial to switch off
 this alternative for AMD CPUs even if AMD optimization guide recommends it.

 I propose to split the pattern into size-optimized and normal pattern.
 The patch implements this proposal.

 An alternative would be to add two new enabled-like attributes,
 good_for_size and good_for_speed, that say whether it is efficient
 to use a particular alternative.  These attributes wouldn't ever stop
 an existing instruction from being recognised; they would just say
 whether the RA and optimisers should consider the alternative when
 optimising for size or speed respectively.  The attributes would have
 the same restrictions as the enabled attribute and could be cached in
 the same way.

 The enabled attribute would then be purely about whether an
 instruction is available, not whether it's efficient in a particular
 situation.

 The main advantage is that it would be possible to make the size/speed
 choice at a basic block level rather than a function level.  In the worst
 case we might move an instruction from a hot block to a cold block or
 vice versa, but with intelligent spilling that shouldn't happen too
 often and at least it wouldn't trigger an ICE.

 If that sounds OK, I'll try to get something together next week.
 I think Ramana said he had a use for this on ARM too.

 I think that this would be way better than duplication of patterns.
 Perhaps we can name these attributes enable_for_size and
 enable_for_speed, and have them in addition to enable attribute.
 The final enable condition would be an union of enable,
 enable_for_speed and enable_for_size attributes.

I was trying to avoid having enable in the names because the new
attributes would only be strong optimisation hints whereas enabled
is a correctness thing.  We'd need to ignore the new attributes when
constraining an existing instruction (which might have been optimised
the opposite way and then moved) but we can never ignore enabled.

E.g. to take two examples, when constrain_operands is used like this:

if (! constrain_operands (1))
  fatal_insn_not_found (insn);

then only the enabled attribute matters, whereas for validate_change
I think we'd want to take the new attributes into account as well.

good_for_ wasn't a great name though.  Better suggestions are
definitely welcome.

Thanks,
Richard


Re: RFA: one more version of the patch for PR61360

2014-10-03 Thread Uros Bizjak
On Thu, Oct 2, 2014 at 10:13 PM, Vladimir Makarov vmaka...@redhat.com wrote:

 I guess we achieved the consensus about the following patch to fix
 PR61360

 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61360

 The patch was successfully bootstrapped and tested (w/wo
 -march=amdfam10) on x86/x86-64.

 Is it ok to commit to trunk?


 I've tested your patch and unfortunately it doesn't work:

 In file included from
 /var/tmp/moz-build-dir/js/src/shell/Unified_cpp_js_src_shell0.cpp:15:0:
 /var/tmp/mozilla-central/js/src/shell/js.cpp: In function ‘void
 Process(JSContext*, JSObject*, const char*, bool)’:
 /var/tmp/mozilla-central/js/src/shell/js.cpp:592:1: internal compiler
 error: in lra_update_insn_recog_data, at lra.c:1221
   }
   ^
 0xa9d9ec lra_update_insn_recog_data(rtx_insn*)
  ../../gcc/gcc/lra.c:1220
 0xab450f eliminate_regs_in_insn
  ../../gcc/gcc/lra-eliminations.c:1077
 0xab450f process_insn_for_elimination
  ../../gcc/gcc/lra-eliminations.c:1344
 0xab450f lra_eliminate(bool, bool)
  ../../gcc/gcc/lra-eliminations.c:1408
 0xa9f2da lra(_IO_FILE*)
  ../../gcc/gcc/lra.c:2270
 0xa5d659 do_reload
  ../../gcc/gcc/ira.c:5311
 0xa5d659 execute
  ../../gcc/gcc/ira.c:5470


 Testcase is attached:

   % g++ -c -march=amdfam10 -w -O2 js.ii
 js.ii: In function ‘void RunFile(C)’:
 js.ii:64:1: internal compiler error: in lra_update_insn_recog_data, at
 lra.c:1221


 Thanks for reporting this, Marcus.  The problem now is in
 optimize_function_for_size_p.  It is false, when we define and cache enable
 attributes at early stages (instantation of virtual regs) and true later.

 It is returning us to the same problem.  I believe that we should not have
 enable attributes depending on optimization options or on the current
 running pass if we don't want the current solution in the trunk (with
 recog_init).  Setting right value for optimize_function_for_size_p does not
 solve the problem as we can have different options for different functions
 in the same compilation file.

 So minimal solution would be removing optimize_function_for_size_p from the
 attribute definition.  But I guess we could remove all condition.
 Unfortunately, Ganesh did not post is it really beneficial to switch off
 this alternative for AMD CPUs even if AMD optimization guide recommends it.

I propose to split the pattern into size-optimized and normal pattern.
The patch implements this proposal.

Uros.
Index: i386.md
===
--- i386.md (revision 215797)
+++ i386.md (working copy)
@@ -4766,6 +4766,38 @@
 }
 })
 
+(define_insn *floatSWI48:modeMODEF:mode2_sse_size
+  [(set (match_operand:MODEF 0 register_operand =f,x,x)
+   (float:MODEF
+ (match_operand:SWI48 1 nonimmediate_operand m,r,m)))]
+  SSE_FLOAT_MODE_P (MODEF:MODEmode)  TARGET_SSE_MATH
+optimize_function_for_size_p (cfun)
+ @
+   fild%Z1\t%1
+   %vcvtsi2MODEF:ssemodesuffixSWI48:rex64suffix\t{%1, %d0|%d0, %1}
+   %vcvtsi2MODEF:ssemodesuffixSWI48:rex64suffix\t{%1, %d0|%d0, %1}
+  [(set_attr type fmov,sseicvt,sseicvt)
+   (set_attr prefix orig,maybe_vex,maybe_vex)
+   (set_attr mode MODEF:MODE)
+   (set (attr prefix_rex)
+ (if_then_else
+   (and (eq_attr prefix maybe_vex)
+   (match_test SWI48:MODEmode == DImode))
+   (const_string 1)
+   (const_string *)))
+   (set_attr unit i387,*,*)
+   (set_attr athlon_decode *,double,direct)
+   (set_attr amdfam10_decode *,vector,double)
+   (set_attr bdver1_decode *,double,direct)
+   (set_attr fp_int_src true)
+   (set (attr enabled)
+ (cond [(eq_attr alternative 0)
+  (symbol_ref TARGET_MIX_SSE_I387
+X87_ENABLE_FLOAT (MODEF:MODEmode,
+SWI48:MODEmode))
+   ]
+   (const_int 1)))])
+
 (define_insn *floatSWI48:modeMODEF:mode2_sse
   [(set (match_operand:MODEF 0 register_operand =f,x,x)
(float:MODEF
@@ -4795,16 +4827,9 @@
 X87_ENABLE_FLOAT (MODEF:MODEmode,
 SWI48:MODEmode))
 (eq_attr alternative 1)
-  /* ??? For sched1 we need constrain_operands to be able to
- select an alternative.  Leave this enabled before RA.  */
-  (symbol_ref TARGET_INTER_UNIT_CONVERSIONS
-   || optimize_function_for_size_p (cfun)
-   || !(reload_completed
-|| reload_in_progress
-|| lra_in_progress))
+  (symbol_ref TARGET_INTER_UNIT_CONVERSIONS)
]
-   (symbol_ref true)))
-   ])
+   (const_int 1)))])
 
 (define_insn *floatSWI48x:modeMODEF:mode2_i387
   [(set (match_operand:MODEF 0 register_operand =f)


Re: RFA: one more version of the patch for PR61360

2014-10-02 Thread Markus Trippelsdorf
On 2014.09.26 at 16:31 -0400, Vladimir Makarov wrote:
 I guess we achieved the consensus about the following patch to fix PR61360
 
 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61360
 
 The patch was successfully bootstrapped and tested (w/wo 
 -march=amdfam10) on x86/x86-64.
 
 Is it ok to commit to trunk?

I've tested your patch and unfortunately it doesn't work:

In file included from 
/var/tmp/moz-build-dir/js/src/shell/Unified_cpp_js_src_shell0.cpp:15:0:
/var/tmp/mozilla-central/js/src/shell/js.cpp: In function ‘void 
Process(JSContext*, JSObject*, const char*, bool)’:
/var/tmp/mozilla-central/js/src/shell/js.cpp:592:1: internal compiler error: in 
lra_update_insn_recog_data, at lra.c:1221
 }
 ^
0xa9d9ec lra_update_insn_recog_data(rtx_insn*)
../../gcc/gcc/lra.c:1220
0xab450f eliminate_regs_in_insn
../../gcc/gcc/lra-eliminations.c:1077
0xab450f process_insn_for_elimination
../../gcc/gcc/lra-eliminations.c:1344
0xab450f lra_eliminate(bool, bool)
../../gcc/gcc/lra-eliminations.c:1408
0xa9f2da lra(_IO_FILE*)
../../gcc/gcc/lra.c:2270
0xa5d659 do_reload
../../gcc/gcc/ira.c:5311
0xa5d659 execute
../../gcc/gcc/ira.c:5470
Please submit a full bug report,

-- 
Markus


Re: RFA: one more version of the patch for PR61360

2014-10-02 Thread Markus Trippelsdorf
On 2014.10.02 at 09:17 +0200, Markus Trippelsdorf wrote:
 On 2014.09.26 at 16:31 -0400, Vladimir Makarov wrote:
  I guess we achieved the consensus about the following patch to fix PR61360
  
  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61360
  
  The patch was successfully bootstrapped and tested (w/wo 
  -march=amdfam10) on x86/x86-64.
  
  Is it ok to commit to trunk?
 
 I've tested your patch and unfortunately it doesn't work:
 
 In file included from 
 /var/tmp/moz-build-dir/js/src/shell/Unified_cpp_js_src_shell0.cpp:15:0:
 /var/tmp/mozilla-central/js/src/shell/js.cpp: In function ‘void 
 Process(JSContext*, JSObject*, const char*, bool)’:
 /var/tmp/mozilla-central/js/src/shell/js.cpp:592:1: internal compiler error: 
 in lra_update_insn_recog_data, at lra.c:1221
  }
  ^
 0xa9d9ec lra_update_insn_recog_data(rtx_insn*)
 ../../gcc/gcc/lra.c:1220
 0xab450f eliminate_regs_in_insn
 ../../gcc/gcc/lra-eliminations.c:1077
 0xab450f process_insn_for_elimination
 ../../gcc/gcc/lra-eliminations.c:1344
 0xab450f lra_eliminate(bool, bool)
 ../../gcc/gcc/lra-eliminations.c:1408
 0xa9f2da lra(_IO_FILE*)
 ../../gcc/gcc/lra.c:2270
 0xa5d659 do_reload
 ../../gcc/gcc/ira.c:5311
 0xa5d659 execute
 ../../gcc/gcc/ira.c:5470

Testcase is attached:

 % g++ -c -march=amdfam10 -w -O2 js.ii
js.ii: In function ‘void RunFile(C)’:
js.ii:64:1: internal compiler error: in lra_update_insn_recog_data, at 
lra.c:1221

-- 
Markus
void printf(...);
class A;
template typename
class B;
class C {
 public:
  template typename S
  C(S);
};
template class
class D {
 public:
  void m_fn1(int);
};

template 
class BA : public DA {};

typedef bool (*JSNative)(int *, unsigned, A *);
class F {
 public:
  BA m_fn2();
};
struct G {
  JSNative a;
};
struct H {
  H(int, int, int);
};
class I {
 public:
  int m_fn3();
};
class J {
 public:
  enum Result {};
  H helpOption;
  int usage;
  int ver;
  int descr;
  int descrWidth;
  int helpWidth;
  int nextArgument;
  int restArgument;
  J(int)
  : helpOption('h', 0, 0),
usage(ver),
descr(descrWidth),
helpWidth(nextArgument),
restArgument() {}
  Result m_fn4();
  int m_fn5();
};

class OffThreadState {
 public:
  int m_fn6();
} d;
int f, h;
long PRMJ_Now();
static void RunFile(C) {
  long a = PRMJ_Now();
  printf(double(a) / 0);
}

static void Process(int, int) { RunFile(0); }

bool Now(int *, unsigned, A *) {
  F b;
  double c = PRMJ_Now();
  b.m_fn2().m_fn1(c);
}

int InitWatchdog();
G e{Now};
static int Shell() {
  I g;
  if (g.m_fn3()) Process(0, 0);
  if (f) Process(0, 0);
}

main() {
  J i(0);
  switch (i.m_fn4())
  case 0:
  return 0;
  if (i.m_fn5()) return 0;
  if (h) return 0;
  if (d.m_fn6()) return 0;
  if (InitWatchdog()) return 0;
  Shell();
}


Re: RFA: one more version of the patch for PR61360

2014-10-02 Thread Vladimir Makarov

On 2014-10-02 7:31 AM, Markus Trippelsdorf wrote:

On 2014.10.02 at 09:17 +0200, Markus Trippelsdorf wrote:

On 2014.09.26 at 16:31 -0400, Vladimir Makarov wrote:

I guess we achieved the consensus about the following patch to fix PR61360

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61360

The patch was successfully bootstrapped and tested (w/wo
-march=amdfam10) on x86/x86-64.

Is it ok to commit to trunk?


I've tested your patch and unfortunately it doesn't work:

In file included from 
/var/tmp/moz-build-dir/js/src/shell/Unified_cpp_js_src_shell0.cpp:15:0:
/var/tmp/mozilla-central/js/src/shell/js.cpp: In function ‘void 
Process(JSContext*, JSObject*, const char*, bool)’:
/var/tmp/mozilla-central/js/src/shell/js.cpp:592:1: internal compiler error: in 
lra_update_insn_recog_data, at lra.c:1221
  }
  ^
0xa9d9ec lra_update_insn_recog_data(rtx_insn*)
 ../../gcc/gcc/lra.c:1220
0xab450f eliminate_regs_in_insn
 ../../gcc/gcc/lra-eliminations.c:1077
0xab450f process_insn_for_elimination
 ../../gcc/gcc/lra-eliminations.c:1344
0xab450f lra_eliminate(bool, bool)
 ../../gcc/gcc/lra-eliminations.c:1408
0xa9f2da lra(_IO_FILE*)
 ../../gcc/gcc/lra.c:2270
0xa5d659 do_reload
 ../../gcc/gcc/ira.c:5311
0xa5d659 execute
 ../../gcc/gcc/ira.c:5470


Testcase is attached:

  % g++ -c -march=amdfam10 -w -O2 js.ii
js.ii: In function ‘void RunFile(C)’:
js.ii:64:1: internal compiler error: in lra_update_insn_recog_data, at 
lra.c:1221



Thanks for reporting this, Marcus.  The problem now is in 
optimize_function_for_size_p.  It is false, when we define and cache 
enable attributes at early stages (instantation of virtual regs) and 
true later.


It is returning us to the same problem.  I believe that we should not 
have enable attributes depending on optimization options or on the 
current running pass if we don't want the current solution in the trunk 
(with recog_init).  Setting right value for optimize_function_for_size_p 
does not solve the problem as we can have different options for 
different functions in the same compilation file.


So minimal solution would be removing optimize_function_for_size_p from 
the attribute definition.  But I guess we could remove all condition. 
Unfortunately, Ganesh did not post is it really beneficial to switch off 
this alternative for AMD CPUs even if AMD optimization guide recommends it.




Re: RFA: one more version of the patch for PR61360

2014-09-28 Thread Uros Bizjak
On Fri, Sep 26, 2014 at 10:31 PM, Vladimir Makarov vmaka...@redhat.com wrote:
 I guess we achieved the consensus about the following patch to fix PR61360

 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61360

 The patch was successfully bootstrapped and tested (w/wo -march=amdfam10) on
 x86/x86-64.

 Is it ok to commit to trunk?

 2014-09-26  Vladimir Makarov  vmaka...@redhat.com

 PR target/61360
 * lra.c (lra): Remove call of recog_init.
 * recog.c (constrain_operands): Permit reg for memory constraint
 when LRA is used.
 * config/i386/i386.md (*floatSWI48:modeMODEF:mode2_sse):
 Enable first alternative independently on RA stage.

Please also mention that this patch reverts:

2014-04-01  Richard Henderson  r...@redhat.com

PR target/60704
* config/i386/i386.md (*floatSWI48MODEF2_sse): Leave the second
alternative enabled before register allocation.

The x86 part is OK.

BTW: I think that this patch should also be backported to 4.9 branch,
since the original patch and the PR60704 fix were also installed
there.

Thanks,
Uros.


Re: RFA: one more version of the patch for PR61360

2014-09-27 Thread Richard Sandiford
Hi Vlad,

Vladimir Makarov vmaka...@redhat.com writes:
 I guess we achieved the consensus about the following patch to fix PR61360

 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61360

 The patch was successfully bootstrapped and tested (w/wo 
 -march=amdfam10) on x86/x86-64.

 Is it ok to commit to trunk?

 2014-09-26  Vladimir Makarov  vmaka...@redhat.com

  PR target/61360
  * lra.c (lra): Remove call of recog_init.
  * recog.c (constrain_operands): Permit reg for memory constraint
  when LRA is used.
  * config/i386/i386.md (*floatSWI48:modeMODEF:mode2_sse):
  Enable first alternative independently on RA stage.

Many thanks for doing this!  I hadn't realised when seeing the bug
originally that LRA made the recog.c part possible.  Obviously I can't
approve it, but this approach seems much cleaner to me.

Richard