Re: [RFC] PR64703, glibc sysdeps/powerpc/powerpc64/dl-machine.h miscompile

2015-01-27 Thread Alan Modra
On Mon, Jan 26, 2015 at 10:11:14AM +0100, Richard Biener wrote:
 On Sat, Jan 24, 2015 at 12:23 AM, Alan Modra amo...@gmail.com wrote:
  How does this look as a potential fix for PR64703?  I haven't made
  many forays into gimple code, so even though this patch passes
  bootstrap and regression testing on powerpc64-linux it's quite
  possible this is the wrong place to change.  If it does look to be OK,
  then I'll fill out the targetm changes, include a testcase etc.
 
 It looks mostly ok, comments below.

Thanks for looking!

  PR target/64703
  * tree-ssa-alias.c (pt_solution_includes_base): New function,
  extracted from..
  (ref_maybe_used_by_call_p_1): ..here.  Delete dead code checking
  for NULL return from ao_ref_base.  Handle potential memory
  reference by indirect calls on targets using function descriptors.
 
  Index: gcc/tree-ssa-alias.c
  ===
  --- gcc/tree-ssa-alias.c(revision 220025)
  +++ gcc/tree-ssa-alias.c(working copy)
  @@ -1532,6 +1532,23 @@ refs_output_dependent_p (tree store1, tree store2)
 return refs_may_alias_p_1 (r1, r2, false);
   }
 
  +static bool
  +pt_solution_includes_base (struct pt_solution *pt, tree base)
 
 Needs a comment.

Sure, that was part of etc. above. ;)

  +{
  +  if (DECL_P (base))
  +return pt_solution_includes (pt, base);
  +
  +  if ((TREE_CODE (base) == MEM_REF
  +   || TREE_CODE (base) == TARGET_MEM_REF)
  +   TREE_CODE (TREE_OPERAND (base, 0)) == SSA_NAME)
  +{
  +  struct ptr_info_def *pi = SSA_NAME_PTR_INFO (TREE_OPERAND (base, 0));
  +  if (pi)
  +   return pt_solutions_intersect (pt, pi-pt);
  +}
  +  return true;
  +}
  +
   /* If the call CALL may use the memory reference REF return true,
  otherwise return false.  */
 
  @@ -1542,15 +1559,24 @@ ref_maybe_used_by_call_p_1 (gcall *call, ao_ref *r
 unsigned i;
 int flags = gimple_call_flags (call);
 
  +  base = ao_ref_base (ref);
 
 You dropped the
 
   if (!base)
 return true;
 
 check - please put it back.

Hmm, calls to ao_ref_base in tree-ssa-alias.c are a mixed bag.  Some
check the return for NULL, others don't.  All the checks for NULL are
dead code since ao_ref_base never returns NULL.  OK, I'll put it back
and leave cleanup for another day.

 
  +  callee = gimple_call_fn (call);
  +  if (callee  TREE_CODE (callee) == SSA_NAME
 
 Do we never propagate the address of a function descriptor here?
 That is, can we generate a testcase like
 
descr fn;
setup fn
foo (fn);
 
 void foo (fn)
 {
(*fn) (a, b);
 }
 
 and then inline foo so the call becomes
 
   (*fn) (a, b);
 
 ?  We'd then still implicitely read from 'fn'.  I also wonder whether
 (and where!) we resolve such a descriptor reference to the actual
 function on GIMPLE?

Well, if we did end up with a direct call then the value in 'fn' won't
matter, I think.  A direct call implies gcc knows the value of a
function pointer, and can thus use the value rather than the pointer.
In this case it implies gcc has looked through 'fn' to its
initialization, and seen that 'fn' is a function address.
ie. your setup fn above is something like
  fn = *(descr *) some_function;
Now it appears that gcc isn't clever enough to do this, but if it did,
then surely the value of the pointer would be 'some_function', not
'fn'.

I can't see how we could end up with a direct call any other way.  As
far as I know, gcc doesn't have any knowledge that 'descr' has
anything to do with a function address unless that knowledge is
imparted by an initialization such as the above.  ie. The answer to
your and where! question is nowhere.

 That is - don't you simply want to use
 
   if (targetm.function_descriptors
ptr_deref_mayalias_ref_p_1 (callee, ref))
 return true;
 
 here?

No.  I don't want to do needless work on direct calls, particularly
since it appears that ptr_deref_may_alias_ref_p_1 does return true for
direct calls like memcpy.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH] Fix ICE due to invalid thunk (PR ipa/64776)

2015-01-27 Thread Richard Biener
On Mon, 26 Jan 2015, Jakub Jelinek wrote:

 Hi!
 
 On x86_64-darwin, we ICE on one of the pr64307.c testcase, because
 expand_thunk doesn't load non-gimple_val arguments into registers
 for the first argument, only for all the other ones.
 Supposedly normally thunks were meant to have this argument as pointer first
 and thus it wasn't an issue, but in the -O0 -fipa-icf case a thunk is
 created even for a non-method.
 
 This patch fixes it by special-casing the first argument only if
 this_adjusting - then we know it is a pointer that is being adjusted.
 
 Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.

Thanks,
Richard.

 2015-01-26  Jakub Jelinek  ja...@redhat.com
 
   PR ipa/64776
   * cgraphunit.c (cgraph_node::expand_thunk): If not this_adjusting,
   handle the first argument in the same loop as all the other arguments.
 
 --- gcc/cgraphunit.c.jj   2015-01-15 14:05:05.0 +0100
 +++ gcc/cgraphunit.c  2015-01-26 17:26:18.629818527 +0100
 @@ -1610,14 +1610,18 @@ cgraph_node::expand_thunk (bool output_a
for (arg = a; arg; arg = DECL_CHAIN (arg))
  nargs++;
auto_vectree vargs (nargs);
 +  i = 0;
 +  arg = a;
if (this_adjusting)
 -vargs.quick_push (thunk_adjust (bsi, a, 1, fixed_offset,
 - virtual_offset));
 -  else if (nargs)
 -vargs.quick_push (a);
 + {
 +   vargs.quick_push (thunk_adjust (bsi, a, 1, fixed_offset,
 +   virtual_offset));
 +   arg = DECL_CHAIN (a);
 +   i = 1;
 + }
  
if (nargs)
 -for (i = 1, arg = DECL_CHAIN (a); i  nargs; i++, arg = DECL_CHAIN 
 (arg))
 + for (; i  nargs; i++, arg = DECL_CHAIN (arg))
 {
   tree tmp = arg;
   if (!is_gimple_val (arg))
 
   Jakub
 
 

-- 
Richard Biener rguent...@suse.de
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild,
Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)


[PATCH] Fix PR64798

2015-01-27 Thread Richard Biener

The new exceptional EH allocator failed to align exception objects
properly (it ended up aligning to __alignof__((std::size_t))).  The
following fixes that by aligning to what __attribute__((aligned))
would align to (this is what _Unwind_Exception is aligned to, a
member of __cxa_refcounted_exception).

Bootstrapped and tested on x86_64-unknown-linux-gnu - Rainer is
testing this on sparc-solaris where it broke 
g++.old-deja/g++.eh/badalloc1.C.

Ok for trunk?

Thanks,
Richard.

2015-01-27  Richard Biener  rguent...@suse.de

PR libstdc++/64798
* libsupc++/eh_alloc.cc (struct allocated_entry): Align
data member.
(pool::allocate): Adjust allocation size and alignment to
that change.
(pool::free): Adjust pointer offsetting.

Index: libstdc++-v3/libsupc++/eh_alloc.cc
===
--- libstdc++-v3/libsupc++/eh_alloc.cc  (revision 220164)
+++ libstdc++-v3/libsupc++/eh_alloc.cc  (working copy)
@@ -94,7 +94,7 @@ namespace
   };
   struct allocated_entry {
std::size_t size;
-   char data[];
+   char data[] __attribute__((aligned));
   };
 
   // A single mutex controlling emergency allocations.
@@ -133,17 +133,18 @@ namespace
   void *pool::allocate (std::size_t size)
 {
   __gnu_cxx::__scoped_lock sentry(emergency_mutex);
-  // We need an additional size_t member.
-  size += sizeof (std::size_t);
+  // We need an additional size_t member plus the padding to
+  // ensure proper alignment of data.
+  size += offsetof (allocated_entry, data);
   // And we need to at least hand out objects of the size of
   // a freelist entry.
   if (size  sizeof (free_entry))
size = sizeof (free_entry);
-  // And we need to align objects we hand out to the required
-  // alignment of a freelist entry (this really aligns the
+  // And we need to align objects we hand out to the maximum
+  // alignment required on the target (this really aligns the
   // tail which will become a new freelist entry).
-  size = ((size + __alignof__(free_entry) - 1)
-  ~(__alignof__(free_entry) - 1));
+  size = ((size + __alignof__ (allocated_entry::data) - 1)
+  ~(__alignof__ (allocated_entry::data) - 1));
   // Search for an entry of proper size on the freelist.
   free_entry **e;
   for (e = first_free_entry;
@@ -185,7 +186,7 @@ namespace
 {
   __gnu_cxx::__scoped_lock sentry(emergency_mutex);
   allocated_entry *e = reinterpret_cast allocated_entry *
-   (reinterpret_cast char * (data) - sizeof (std::size_t));
+   (reinterpret_cast char * (data) - offsetof (allocated_entry, data));
   std::size_t sz = e-size;
   if (!first_free_entry)
{


Re: RFA: patch to fix a bad code generation for PR64110 -- new constraints addition

2015-01-27 Thread Richard Sandiford
Jeff Law l...@redhat.com writes:
 On 01/24/15 04:29, Richard Sandiford wrote:

 Yeah.  I expect in practice most people who used ? and ! attached
 them to a particular operand for a reason.  From a quick scan through
 386.exp it looked like almost all uses would either want this behaviour
 or wouldn't care.  An interesting exception is:

 (define_insn extendsidi2_1
[(set (match_operand:DI 0 nonimmediate_operand =*A,r,?r,?*o)
  (sign_extend:DI (match_operand:SI 1 register_operand 0,0,r,r)))
 (clobber (reg:CC FLAGS_REG))
 (clobber (match_scratch:SI 2 =X,X,X,r))]
!TARGET_64BIT
#)

 I don't know how effective the third alternative is with LRA.  Surely
 a r-0 alternative is by definition a case where r-r is possible
 only with a ?-cost reload?  Seems to me we could just delete it.
 But assuming it does some good, I suppose the ? really does apply to
 the alternative as a whole.  If we had to reload operand 1 or operand 0,
 there's an extra cost if it can't use the same register as the other
 operand.

 Wouldn't it be better to make ? and ! behave the new way and only
 add new constraints if it turns out that the old behaviour really is
 useful in some cases?

 Maybe stage 4 isn't the time to be making that kind of change.
 Still, it'd be great if someone who's set up do x86_64 benchmarking
 could measure the effect of making ? and ! behave like the
 new constraints.
 My worry isn't the x86_64 port, but all the others that folks don't test 
 as regularly.

 I'd rather go the other direction, have folks familiar with the port go 
 through it changing the constraints where it makes sense.  That just 
 seems a hell of a lot safer.

 A port maintainer could certainly hack something together for testing 
 purposes to guide them as to whether or not there's something to be 
 gained by converting many/most of the ?! to the new constraints.

Yeah, but in practice that's only ever going to be a partial transition.
Many port maintainers won't look at this, so we'll have to support both
versions indefinitely, even if the new behaviour turns out to be the
best for all cases.

I just think we're going to regret having two sets of constraints with
such subtly different meanings.

Looking back at the original PR, Jakub said:

  The ! has been added by me for PR63594, so it isn't there from the era
  when i?86 backend was using reload.  If there is a better way to
  express that RA should prefer to use memory or xmm register and only
  use r constraint if it already is in a r register and doesn't need to
  be reloaded, I can use that.  Whether it is ?, ??? or something else.
  ! description in gcc docs just fitted most what I wanted...

In some ways this seems to match the intention of *.  Originally I think
it was just an RA-only thing and was ignored by reload, but LRA does take it
into account too (which sounds like progress to me).

If I revert the patch locally and change the *vec_dupmode pattern to
use *, it passes both the test for PR64110 and the tests for PR63594.
Would that be OK as an alternative?

Thanks,
Richard



Re: Merge current set of OpenACC changes from gomp-4_0-branch

2015-01-27 Thread Julian Brown
On Mon, 26 Jan 2015 17:34:26 +0300
Ilya Verbin iver...@gmail.com wrote:

 Here is my current patch, it works for OpenMP-MIC, but obviously
 will not work for PTX, since it requires symmetrical changes in the
 plugin.  Could you please take a look, whether it is possible to
 support this new interface in PTX plugin?

I think it can probably be made to work. I'll have a look in more
detail.

Thanks,

Julian


Re: [Patch, Fortran, OOP] PR 64230: [4.9/5 Regression] Invalid memory reference in a compiler-generated finalizer for allocatable component

2015-01-27 Thread Jakub Jelinek
On Tue, Jan 27, 2015 at 07:20:10PM +0100, Janus Weil wrote:
 2015-01-27 10:30 GMT+01:00 Jakub Jelinek ja...@redhat.com:
  Yeah, if you want to add ubsan tests, you need to add gfortran.dg/ubsan/
  directory and hack up ubsan.exp in there
 
 Thanks for the remark, I was suspecting something like that. However,
 for this case it's not really worth the hassle. In fact the test case
 does not really need the sanitizer and should also work without it. So
 I'll just remove the -fsanitize option:
 
 Index: gcc/testsuite/gfortran.dg/class_allocate_18.f90
 ===
 --- gcc/testsuite/gfortran.dg/class_allocate_18.f90(Revision 220180)
 +++ gcc/testsuite/gfortran.dg/class_allocate_18.f90(Arbeitskopie)
 @@ -1,5 +1,4 @@
  ! { dg-do run }
 -! { dg-options -fsanitize=undefined }
  !
  ! PR 64230: [4.9/5 Regression] Invalid memory reference in a
 compiler-generated finalizer for allocatable component
  !

LGTM.

Jakub


Re: [PATCH][RFA][PR target/15184] Partial fix for direct byte access on x86

2015-01-27 Thread Jeff Law

On 01/26/15 22:11, Segher Boessenkool wrote:

On Mon, Jan 26, 2015 at 08:07:29PM -0700, Jeff Law wrote:

The second change we need is an additional simplification.

If we have
(subreg:M1 (zero_extend:M2 (x))

Where M1  M2 and both are scalar integer modes.  It's advantageous to
strip the SUBREG and instead have a wider extension.


Should you also check M1 is not multiple registers?
We're generally working with pseudos, so we could estimate, but not know 
for sure if we're dealing with multiple hard regs.  But more 
importantly, I'm not sure what that check would buy us.






Bootstrapped and regression tested on x86_64-unknown-linux-gnu.
Thoughts?


It looks fine to me.  Well, some comments...


@@ -2643,6 +2644,24 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, 
rtx_insn *i0,
   || GET_CODE (src) == LSHIFTRT)
nshift++;
}
+
+  /* If I0 loads a memory and I3 sets the same memory, then I2 and I3
+are likely manipulating its value.  Ideally we'll be able to combine
+all four insns into a bitfield insertion of some kind.
+
+Note the source in I0 might be inside a sign/zero extension and the
+memory modes in I0 and I3 might be different.  So extract the address
+from the destination of I3 and search for it in the source of I0.
+
+In the event that there's a match but the source/dest do not actually
+refer to the same memory, the worst that happens is we try some
+combinations that we wouldn't have otherwise.  */
+  if ((set0 = single_set (i0))
+  (set3 = single_set (i3))
+  GET_CODE (SET_DEST (set3)) == MEM
+  rtx_referenced_p (XEXP (SET_DEST (set3), 0), SET_SRC (set0)))
+   ngood += 2;


I think you should test MEM_P (SET_SRC (set0)), too.  Or even just test
rtx_equal_p (SET_DEST (set3), SET_SRC (set0)) ?
Yea, we need a tighter test on set0 to ensure it's a MEM.  That code got 
twidded before the last testrun.  I'll take care of that.


Earlier versions checked reg_equal_p on the MEM.  But that's often a 
mistake because the modes of the two memory references may be different. 
 I don't recall which of the various tests, but I was definitely seeing 
SImode in the load and HImode in the store.


Similarly you don't want to check reg_equal_p on the addresses as they 
aren't necessarily the same either (they're obviously related).


That's how I ultimately settled on rtx_referenced_p form you see above. 
 I'm still not sure that's 100% what I want, but I don't have any tests 
yet which require something more complex.








+
if (ngood  2  nshift  2)
return 0;
  }
@@ -5663,6 +5682,25 @@ combine_simplify_rtx (rtx x, machine_mode op0_mode, int 
in_dest,
  return CONST0_RTX (mode);
}

+  /* If we have (subreg:M1 (zero_extend:M2 (x))) or
+(subreg:M1 (sign_extend: M2 (x))) where M1 is wider
+then M2, then go ahead and just widen the original extension.
+
+While the subreg is useful in saying I don't care about those
+upper bits.  Squashing out the subreg results in simpler RTL that
+is more easily matched.  */


Closing quote missing.

Fixed locally.




+  if ((GET_CODE (SUBREG_REG (x)) == ZERO_EXTEND
+  || GET_CODE (SUBREG_REG (x)) == SIGN_EXTEND)
+  SCALAR_INT_MODE_P (GET_MODE (x))
+  SCALAR_INT_MODE_P (GET_MODE (SUBREG_REG (x)))
+  GET_MODE (x)  GET_MODE (SUBREG_REG (x)))


GET_MODE_SIZE instead?
It's work as-is.  But using GET_MODE_SIZE shows the intent clearer. 
I'll fix that momentarily.





Does this do anything good for the dec mem thing on x86?  That would
be a nice bonus :-)
It might, but I haven't tested for that specifically.  If you've got 
sample code or a PR in mind, pass it along and I'll take a look.  I'd 
think dec mem would generally be handled by 3-1 insn combination code 
unless there's something else going on.


jef


[RFC PATCH] Emit DW_LANG_Fortran{03,08}

2015-01-27 Thread Jakub Jelinek
Hi!

DW_LANG_Fortran03 and DW_LANG_Fortran08 DW_AT_language values were recently
accepted into DWARF5.  This patch changes GCC to handle those similarly to
how e.g. the -std=c++11, -std=c++14 or -std=c11 are handled.

As it will take some time for consumers to catch up, I'm enabling that
only if -gdwarf-5 is used for now.

2015-01-27  Jakub Jelinek  ja...@redhat.com

* dwarf2.h (enum dwarf_source_language): Add DW_LANG_Fortran03
and DW_LANG_Fortran08.
* dwarf2out.c (is_fortran): Also return true for DW_LANG_Fortran03
or DW_LANG_Fortran08.
(lower_bound_default): Return 1 for DW_LANG_Fortran03 or
DW_LANG_Fortran08.
(gen_compile_unit_die): Handle GNU Fortran2003 and
GNU Fortran2008 language strings.
* dbxout.c (get_lang_number): Use lang_GNU_Fortran.
* langhooks.h (lang_GNU_Fortran): New prototype.
* langhooks.c (lang_GNU_Fortran): New function.
fortran/
* options.c: Include langhooks.h.
(gfc_post_options): Change lang_hooks.name based on
selected -std= mode.

--- include/dwarf2.h.jj 2014-11-26 20:35:01.0 +0100
+++ include/dwarf2.h2015-01-27 17:55:18.086122137 +0100
@@ -312,6 +312,8 @@ enum dwarf_source_language
 DW_LANG_C_plus_plus_11 = 0x001a, /* dwarf5.20141029.pdf DRAFT */
 DW_LANG_C11 = 0x001d,
 DW_LANG_C_plus_plus_14 = 0x0021,
+DW_LANG_Fortran03 = 0x0022,
+DW_LANG_Fortran08 = 0x0023,
 
 DW_LANG_lo_user = 0x8000,  /* Implementation-defined range start.  */
 DW_LANG_hi_user = 0x,  /* Implementation-defined range start.  */
--- gcc/dwarf2out.c.jj  2015-01-27 17:54:13.0 +0100
+++ gcc/dwarf2out.c 2015-01-27 19:03:30.632411565 +0100
@@ -4736,7 +4736,9 @@ is_fortran (void)
 
   return (lang == DW_LANG_Fortran77
  || lang == DW_LANG_Fortran90
- || lang == DW_LANG_Fortran95);
+ || lang == DW_LANG_Fortran95
+ || lang == DW_LANG_Fortran03
+ || lang == DW_LANG_Fortran08);
 }
 
 /* Return TRUE if the language is Ada.  */
@@ -16720,6 +16722,8 @@ lower_bound_default (void)
 case DW_LANG_Fortran77:
 case DW_LANG_Fortran90:
 case DW_LANG_Fortran95:
+case DW_LANG_Fortran03:
+case DW_LANG_Fortran08:
   return 1;
 case DW_LANG_UPC:
 case DW_LANG_D:
@@ -19781,8 +19785,17 @@ gen_compile_unit_die (const char *filena
 {
   if (strcmp (language_string, GNU Ada) == 0)
language = DW_LANG_Ada95;
-  else if (strcmp (language_string, GNU Fortran) == 0)
-   language = DW_LANG_Fortran95;
+  else if (strncmp (language_string, GNU Fortran, 11) == 0)
+   {
+ language = DW_LANG_Fortran95;
+ if (dwarf_version = 5 /* || !dwarf_strict */)
+   {
+ if (strcmp (language_string, GNU Fortran2003) == 0)
+   language = DW_LANG_Fortran03;
+ else if (strcmp (language_string, GNU Fortran2008) == 0)
+   language = DW_LANG_Fortran08;
+   }
+   }
   else if (strcmp (language_string, GNU Java) == 0)
language = DW_LANG_Java;
   else if (strcmp (language_string, GNU Objective-C) == 0)
@@ -19796,7 +19809,7 @@ gen_compile_unit_die (const char *filena
}
 }
   /* Use a degraded Fortran setting in strict DWARF2 so is_fortran works.  */
-  else if (strcmp (language_string, GNU Fortran) == 0)
+  else if (strncmp (language_string, GNU Fortran, 11) == 0)
 language = DW_LANG_Fortran90;
 
   add_AT_unsigned (die, DW_AT_language, language);
@@ -19806,6 +19819,8 @@ gen_compile_unit_die (const char *filena
 case DW_LANG_Fortran77:
 case DW_LANG_Fortran90:
 case DW_LANG_Fortran95:
+case DW_LANG_Fortran03:
+case DW_LANG_Fortran08:
   /* Fortran has case insensitive identifiers and the front-end
 lowercases everything.  */
   add_AT_unsigned (die, DW_AT_identifier_case, DW_ID_down_case);
--- gcc/dbxout.c.jj 2015-01-15 20:25:30.0 +0100
+++ gcc/dbxout.c2015-01-27 18:58:58.286033152 +0100
@@ -967,7 +967,7 @@ get_lang_number (void)
 return N_SO_CC;
   else if (strcmp (language_string, GNU F77) == 0)
 return N_SO_FORTRAN;
-  else if (strcmp (language_string, GNU Fortran) == 0)
+  else if (lang_GNU_Fortran ())
 return N_SO_FORTRAN90; /* CHECKME */
   else if (strcmp (language_string, GNU Pascal) == 0)
 return N_SO_PASCAL;
--- gcc/langhooks.c.jj  2015-01-09 21:59:54.0 +0100
+++ gcc/langhooks.c 2015-01-27 18:58:37.375387995 +0100
@@ -731,3 +731,11 @@ lang_GNU_CXX (void)
 {
   return strncmp (lang_hooks.name, GNU C++, 7) == 0;
 }
+
+/* Returns true if the current lang_hooks represents the GNU Fortran frontend. 
 */
+
+bool
+lang_GNU_Fortran (void)
+{
+  return strncmp (lang_hooks.name, GNU Fortran, 11) == 0;
+}
--- gcc/langhooks.h.jj  2015-01-05 13:07:13.0 +0100
+++ gcc/langhooks.h 2015-01-27 18:57:51.139172602 +0100
@@ -509,5 +509,6 @@ extern tree add_builtin_type (const char
 
 extern bool lang_GNU_C 

Re: [RFC PATCH] Emit DW_LANG_Fortran{03,08}

2015-01-27 Thread David Malcolm
On Tue, 2015-01-27 at 19:19 +0100, Jakub Jelinek wrote:
 Hi!
 
 DW_LANG_Fortran03 and DW_LANG_Fortran08 DW_AT_language values were recently
 accepted into DWARF5.  This patch changes GCC to handle those similarly to
 how e.g. the -std=c++11, -std=c++14 or -std=c11 are handled.
 
 As it will take some time for consumers to catch up, I'm enabling that
 only if -gdwarf-5 is used for now.
 
 2015-01-27  Jakub Jelinek  ja...@redhat.com
 
   * dwarf2.h (enum dwarf_source_language): Add DW_LANG_Fortran03
   and DW_LANG_Fortran08.
   * dwarf2out.c (is_fortran): Also return true for DW_LANG_Fortran03
   or DW_LANG_Fortran08.
   (lower_bound_default): Return 1 for DW_LANG_Fortran03 or
   DW_LANG_Fortran08.
   (gen_compile_unit_die): Handle GNU Fortran2003 and
   GNU Fortran2008 language strings.
   * dbxout.c (get_lang_number): Use lang_GNU_Fortran.
   * langhooks.h (lang_GNU_Fortran): New prototype.
   * langhooks.c (lang_GNU_Fortran): New function.
 fortran/
   * options.c: Include langhooks.h.
   (gfc_post_options): Change lang_hooks.name based on
   selected -std= mode.

(...snip...)

 --- gcc/fortran/options.c.jj  2015-01-12 21:29:11.0 +0100
 +++ gcc/fortran/options.c 2015-01-27 19:07:33.729285229 +0100
 @@ -43,6 +43,7 @@ along with GCC; see the file COPYING3.
  #include cpp.h
  #include diagnostic.h  /* For global_dc.  */
  #include tm.h
 +#include langhooks.h
  
  gfc_option_t gfc_option;
  
 @@ -398,6 +399,11 @@ gfc_post_options (const char **pfilename
  
gfc_cpp_post_options ();
  
 +  if (gfc_option.allow_std  GFC_STD_F2008)
 +lang_hooks.name = GNU Fortran2008;
 +  else if (gfc_option.allow_std  GFC_STD_F2003)
 +lang_hooks.name = GNU Fortran2003;
 +

Did you test this on rs6000?

In particular, rs6000_output_function_epilogue has a:
  else if (! strcmp (language_string, GNU F77)
   || ! strcmp (language_string, GNU Fortran))
i = 1;

Does that conditional need updating to track the langhooks.name change
(maybe to use your new lang_GNU_Fortran function?)

Dave




Re: [PATCH][2/2] Improve array-bound warnings and VRP

2015-01-27 Thread Martin Uecker


Richard Biener wrote:

 On Mon, 26 Jan 2015, Jakub Jelinek wrote:

  Then it probably should be ok.  I'm really afraid of emitting more warnings
  with such high false positive rate now.

 As the patch also mitigates some of the code bloat we get with
 the complete peeling (regression against 4.7) I have installed it.
 It's also the easiest vehicle to verify range-info is not broken
 by passes between vrp1 and vrp2.

You could make warnings appear only for warn_array_bounds  1
if there are concerns about false positives.

For what it's worth, I tested the old version of both patches on 
one of my projects (mostly numerical algorithms) and it did not 
produce additional warnings.

I really appreciate all improvements in this area.

Martin




Re: [RFC PATCH] Emit DW_LANG_Fortran{03,08}

2015-01-27 Thread Tobias Burnus

Jakub Jelinek wrote:

DW_LANG_Fortran03 and DW_LANG_Fortran08 DW_AT_language values were recently
accepted into DWARF5.  This patch changes GCC to handle those similarly to
how e.g. the -std=c++11, -std=c++14 or -std=c11 are handled.


For completeness: gfortran currently produces GNU Fortran and 
DW_LANG_Fortran95; GCC itself also handles ...Fortran77 and 
...Fortran90, but those are not produced with gfortran.


With the patch, it produces for -gdwarf-2/3/4 (4 is default) or 
-gdwarf-5 -std=f95 the same as above. For -std=f2003 -gdwarf-5, it 
yields GNU Fortran2003 and DW_LANG_Fortran2003. And for -gdwarf-5 and 
the rest of -std= (f2008, f2008ts, gnu, legacy), it produces GNU 
Fortran2008 and DW_LANG_Fortran2008.


(In principle, they could have prepared for the future and added Fortran 
2015 as well.)



Regarding the change: it is fine with me. (However, I wonder how much 
will break, once the || !dwarf_strict is enabled, knowing that 
compilers are often more frequently updated as debuggers, valgrind and 
similar programs. On the other, except of debuggers, most tools should 
care much about the DW_LANG.)


Tobias

PS: Talking about DWARF5, do you know when it will be available as 
public draft? I am especially looking forward to 
http://dwarfstd.org/ShowIssue.php?issue=121221.1 (Allow DW_AT_type with 
DW_TAG_string_type), which would be a low-hanging fruit in terms of 
implementation. Contrary to the array additions of 130313.5.



As it will take some time for consumers to catch up, I'm enabling that
only if -gdwarf-5 is used for now.

2015-01-27  Jakub Jelinek  ja...@redhat.com

* dwarf2.h (enum dwarf_source_language): Add DW_LANG_Fortran03
and DW_LANG_Fortran08.
* dwarf2out.c (is_fortran): Also return true for DW_LANG_Fortran03
or DW_LANG_Fortran08.
(lower_bound_default): Return 1 for DW_LANG_Fortran03 or
DW_LANG_Fortran08.
(gen_compile_unit_die): Handle GNU Fortran2003 and
GNU Fortran2008 language strings.
* dbxout.c (get_lang_number): Use lang_GNU_Fortran.
* langhooks.h (lang_GNU_Fortran): New prototype.
* langhooks.c (lang_GNU_Fortran): New function.
fortran/
* options.c: Include langhooks.h.
(gfc_post_options): Change lang_hooks.name based on
selected -std= mode.


Re: [Patch, Fortran] PR64771 - Fix coarray ICE

2015-01-27 Thread Tobias Burnus

Tobias Burnus wrote:

This one compiles just as well, of course.

 From my side, that patch (using MAX) is fine. Thanks for
bearing the bootstrap failure and for the patch.


I have now committed it (i.e. Rainer's patch) as Rev. 220182.
I have also committed the fixed-up/combined patch to the 4.9 branch as 
Rev. 220184.


(BTW: The original patch was approved by Paul on IRC.)

Tobias


Re: [Patch, Fortran, OOP] PR 64230: [4.9/5 Regression] Invalid memory reference in a compiler-generated finalizer for allocatable component

2015-01-27 Thread Janus Weil
 On Tue, Jan 27, 2015 at 10:24:47AM +0100, Andreas Schwab wrote:

  2015-01-19  Janus Weil  ja...@gcc.gnu.org
 
  PR fortran/64230
  * gfortran.dg/class_allocate_18.f90: Extended.

 FAIL: gfortran.dg/class_allocate_18.f90   -O0  (test for excess errors)
 Excess errors:
 /usr/ia64-suse-linux/bin/ld: cannot find -lubsan

Sorry for the breakage, guys!


2015-01-27 10:30 GMT+01:00 Jakub Jelinek ja...@redhat.com:
 Yeah, if you want to add ubsan tests, you need to add gfortran.dg/ubsan/
 directory and hack up ubsan.exp in there

Thanks for the remark, I was suspecting something like that. However,
for this case it's not really worth the hassle. In fact the test case
does not really need the sanitizer and should also work without it. So
I'll just remove the -fsanitize option:

Index: gcc/testsuite/gfortran.dg/class_allocate_18.f90
===
--- gcc/testsuite/gfortran.dg/class_allocate_18.f90(Revision 220180)
+++ gcc/testsuite/gfortran.dg/class_allocate_18.f90(Arbeitskopie)
@@ -1,5 +1,4 @@
 ! { dg-do run }
-! { dg-options -fsanitize=undefined }
 !
 ! PR 64230: [4.9/5 Regression] Invalid memory reference in a
compiler-generated finalizer for allocatable component
 !


Cheers,
Janus


Re: Merge current set of OpenACC changes from gomp-4_0-branch

2015-01-27 Thread Jack Howarth
Thomas,
 Any plans to fix
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64635 soon? On x86_64
darwin, the OpenACC merge resulted a huge number of failures in the
libgomp test suite…


=== libgomp Summary ===
# of expected passes 10628
# of unexpected failures 724
# of unsupported tests 562

which are resolved with a fix similar to
https://gcc.gnu.org/bugzilla/attachment.cgi?id=34480.
   Jack


On Mon, Jan 26, 2015 at 8:44 AM, Thomas Schwinge
tho...@codesourcery.com wrote:
 Hi!

 Sorry for the late answer -- I've been on sick leave, and just now
 returning to work.  Julian, would you please have a look at the following
 issues?

   In r219682, I have committed to trunk our current set of OpenACC changes,
   which we had prepared on gomp-4_0-branch.  Thanks to everyone who has
   been contributing!

 On Fri, 23 Jan 2015 20:20:53 +0300, Ilya Verbin iver...@gmail.com wrote:
 On 17 Jan 02:16, Ilya Verbin wrote:
  Unfortunately, it broke offloading from shared libraries (I mean common 
  libs
  with NEEDED entries, not dlopened).

 Sorry for that!

  Such things are not covered by the
  testsuite, that's why you missed this issue.  Here is a simple testcase:

 http://news.gmane.org/find-root.php?message_id=%3C20150116231632.GB48380%40msticlxl57.ims.intel.com%3E

 Probably a good motivation for adding such a test case.  ;-)

  So, you don't assume that a device can have multiple images from multiple 
  libs?

 Ping?

 This probably is just a bug that we introduced with our changes?
 (Julian?)


 Also, could you please explain, why did you divide a device initialization 
 into
 two functions -- gomp_init_device and gomp_init_tables?

 As I understand it (again, Julian, please correct me if I got that
 wrong), the reason is that for OpenACC support, we need these as two
 separate (independent) actions.  Is this causing problems for OpenMP
 offloading?


 Currently I'm trying to rebase on trunk my old patch, which fixes offloading
 from dlopened libraries: 
 http://gcc.gnu.org/ml/gcc-patches/2014-11/msg01604.html
 It works for OpenMP and MIC, but I don't know how not to break OpenACC and 
 PTX.


 Grüße,
  Thomas


Re: [Patch, Fortran, OOP] PR 64230: [4.9/5 Regression] Invalid memory reference in a compiler-generated finalizer for allocatable component

2015-01-27 Thread Janus Weil
2015-01-27 19:23 GMT+01:00 Jakub Jelinek ja...@redhat.com:
 On Tue, Jan 27, 2015 at 07:20:10PM +0100, Janus Weil wrote:
 2015-01-27 10:30 GMT+01:00 Jakub Jelinek ja...@redhat.com:
  Yeah, if you want to add ubsan tests, you need to add gfortran.dg/ubsan/
  directory and hack up ubsan.exp in there

 Thanks for the remark, I was suspecting something like that. However,
 for this case it's not really worth the hassle. In fact the test case
 does not really need the sanitizer and should also work without it. So
 I'll just remove the -fsanitize option:

 Index: gcc/testsuite/gfortran.dg/class_allocate_18.f90
 ===
 --- gcc/testsuite/gfortran.dg/class_allocate_18.f90(Revision 220180)
 +++ gcc/testsuite/gfortran.dg/class_allocate_18.f90(Arbeitskopie)
 @@ -1,5 +1,4 @@
  ! { dg-do run }
 -! { dg-options -fsanitize=undefined }
  !
  ! PR 64230: [4.9/5 Regression] Invalid memory reference in a
 compiler-generated finalizer for allocatable component
  !

 LGTM.

Good, committed as r220181. Since I had already backported the
original patch to 4.9 yesterday, I'll do the same there ...

Cheers,
Janus


Re: [RFC PATCH] Emit DW_LANG_Fortran{03,08}

2015-01-27 Thread Jakub Jelinek
On Tue, Jan 27, 2015 at 01:52:12PM -0500, David Malcolm wrote:
  @@ -398,6 +399,11 @@ gfc_post_options (const char **pfilename
   
 gfc_cpp_post_options ();
   
  +  if (gfc_option.allow_std  GFC_STD_F2008)
  +lang_hooks.name = GNU Fortran2008;
  +  else if (gfc_option.allow_std  GFC_STD_F2003)
  +lang_hooks.name = GNU Fortran2003;
  +
 
 Did you test this on rs6000?
 
 In particular, rs6000_output_function_epilogue has a:
   else if (! strcmp (language_string, GNU F77)
  || ! strcmp (language_string, GNU Fortran))
   i = 1;

You're right, missed that.  Consider that changed to lang_GNU_Fortran ().

Jakub


C++ PATCH for c++/63889 (ICE with member variable template)

2015-01-27 Thread Jason Merrill
We were trying to instantiate is_ok with only the innermost set of 
template arguments; we need to make sure that the outer args are 
provided as well.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit e2df55ffbe254dfc15801a204af16d012aeb4cb5
Author: Jason Merrill ja...@redhat.com
Date:   Mon Jan 26 10:55:42 2015 -0500

	PR c++/63889
	* pt.c (finish_template_variable): Move from semantics.c.
	Handle multiple template arg levels.  Handle coercion here.
	(lookup_template_variable): Not here.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index bc26530..d377daa 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -8091,13 +8091,28 @@ tree
 lookup_template_variable (tree templ, tree arglist)
 {
   tree type = unknown_type_node;
-  tsubst_flags_t complain = tf_warning_or_error;
-  tree parms = INNERMOST_TEMPLATE_PARMS (DECL_TEMPLATE_PARMS (templ));
-  arglist = coerce_template_parms (parms, arglist, templ, complain,
-   /*req_all*/true, /*use_default*/true);
   return build2 (TEMPLATE_ID_EXPR, type, templ, arglist);
 }
 
+/* Instantiate a variable declaration from a TEMPLATE_ID_EXPR for use. */
+
+tree
+finish_template_variable (tree var)
+{
+  tree templ = TREE_OPERAND (var, 0);
+
+  tree arglist = TREE_OPERAND (var, 1);
+  tree tmpl_args = DECL_TI_ARGS (DECL_TEMPLATE_RESULT (templ));
+  arglist = add_outermost_template_args (tmpl_args, arglist);
+
+  tree parms = DECL_TEMPLATE_PARMS (templ);
+  tsubst_flags_t complain = tf_warning_or_error;
+  arglist = coerce_innermost_template_parms (parms, arglist, templ, complain,
+	 /*req_all*/true,
+	 /*use_default*/true);
+
+  return instantiate_template (templ, arglist, complain);
+}
 
 struct pair_fn_data
 {
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 915048d..75aa501 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -2454,15 +2454,6 @@ finish_call_expr (tree fn, vectree, va_gc **args, bool disallow_virtual,
   return result;
 }
 
-/* Instantiate a variable declaration from a TEMPLATE_ID_EXPR for use. */
-
-tree
-finish_template_variable (tree var)
-{
-  return instantiate_template (TREE_OPERAND (var, 0), TREE_OPERAND (var, 1),
-   tf_error);
-}
-
 /* Finish a call to a postfix increment or decrement or EXPR.  (Which
is indicated by CODE, which should be POSTINCREMENT_EXPR or
POSTDECREMENT_EXPR.)  */
diff --git a/gcc/testsuite/g++.dg/cpp1y/var-templ22.C b/gcc/testsuite/g++.dg/cpp1y/var-templ22.C
new file mode 100644
index 000..9ddc925
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/var-templ22.C
@@ -0,0 +1,14 @@
+// PR c++/63889
+// { dg-do compile { target c++14 } }
+
+templateclass T
+struct A
+{
+  templateclass
+  static constexpr bool is_ok = true;
+
+  templatebool v = is_okT
+  A(T) { }
+};
+
+Aint p(42);


Re: [Patch, Fortran] PR63861 - fix OpenMP/ACC's gfc_has_alloc_comps

2015-01-27 Thread Jakub Jelinek
On Tue, Jan 27, 2015 at 08:27:07AM +0100, Tobias Burnus wrote:
 2015-01-27  Tobias Burnus  bur...@net-b.de
 
   PR fortran/63861
 gcc/fortran/
   * trans-openmp.c (gfc_has_alloc_comps, gfc_trans_omp_clauses):
   Fix handling for scalar coarrays.
   * trans-types.c (gfc_get_element_type): Add comment.
 
 gcc/testsuite/
   * gfortran.dg/goacc/coarray_2.f90: New.

Ok, thanks.

Jakub


Re: [PATCH] Fix ICE during ipa dumping (PR ipa/64730)

2015-01-27 Thread Richard Biener
On Mon, 26 Jan 2015, Jakub Jelinek wrote:

 Hi!
 
 On various targets, %s in fprintf can't handle NULL arguments,
 and even when edge-call_stmt is non-NULL, it still might have
 UNKNOWN_LOCATION or BUILTINS_LOCATION, which have NULL filename.
 In this particular case it is a fnsplit created call.
 
 Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
 ok for trunk?

Ok.

Thanks,
Richard.

 2015-01-26  Jakub Jelinek  ja...@redhat.com
 
   PR ipa/64730
   * ipa-inline.c (inline_small_functions): Print unknown even
   if edge-call_stmt is non-NULL, but has builtins or unknown
   location.
 
 --- gcc/ipa-inline.c.jj   2015-01-22 21:45:18.0 +0100
 +++ gcc/ipa-inline.c  2015-01-26 15:41:57.193640527 +0100
 @@ -1822,6 +1822,9 @@ inline_small_functions (void)
   Estimated badness is %f, frequency %.2f.\n,
  edge-caller-name (), edge-caller-order,
  edge-call_stmt
 + (LOCATION_LOCUS (gimple_location ((const_gimple)
 + edge-call_stmt))
 + BUILTINS_LOCATION)
  ? gimple_filename ((const_gimple) edge-call_stmt)
  : unknown,
  edge-call_stmt
 
   Jakub
 
 

-- 
Richard Biener rguent...@suse.de
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild,
Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)


Re: [Patch, Fortran, OOP] PR 64230: [4.9/5 Regression] Invalid memory reference in a compiler-generated finalizer for allocatable component

2015-01-27 Thread Jakub Jelinek
On Tue, Jan 27, 2015 at 10:24:47AM +0100, Andreas Schwab wrote:
 Janus Weil ja...@gcc.gnu.org writes:
 
  2015-01-19  Janus Weil  ja...@gcc.gnu.org
 
  PR fortran/64230
  * gfortran.dg/class_allocate_18.f90: Extended.
 
 FAIL: gfortran.dg/class_allocate_18.f90   -O0  (test for excess errors)
 Excess errors:
 /usr/ia64-suse-linux/bin/ld: cannot find -lubsan

Yeah, if you want to add ubsan tests, you need to add gfortran.dg/ubsan/
directory and hack up ubsan.exp in there, from say gcc.dg/ubsan/ubsan.exp
and gfortran.dg/dg.exp.

Jakub


Re: [PATCH] Update BBs in cleanup_barriers pass (PR rtl-optimization/61058)

2015-01-27 Thread Eric Botcazou
 Yes, they do, that is why it crashed during final.

OK.  Why wouldn't it work to call reorder_insns instead of reorder_insns_nobb?

-- 
Eric Botcazou


[PATCH] S/390: -mhotpatch v2

2015-01-27 Thread Dominik Vogt
The attached patch updates the -mhotpatch option and the hopatch
function attribute with (incompatible) new semantics.  Please
refer to the commit in the patch for details.

--

2015-01-27  Dominik Vogt  v...@linux.vnet.ibm.com

* doc/extend.texi: s/390: Update documentation of hotpatch attribute.
* doc/invoke.texi (-mhotpatch): s/390: Update documentation of
-mhotpatch= option.
* config/s390/s390.opt (mhotpatch): s/390: Remove -mhotpatch and
-mno-hotpatch options.  Change syntax of -mhotpatch= option.
* config/s390/s390.c (s390_hotpatch_trampoline_halfwords_default):
Renamed.
(s390_hotpatch_trampoline_halfwords_max): Renamed.
(s390_hotpatch_hw_max): New name.
(s390_hotpatch_trampoline_halfwords): Renamed.
(s390_hotpatch_hw_before_label): New name.
(get_hotpatch_attribute): Removed.
(s390_hotpatch_hw_after_label): New name.
(s390_handle_hotpatch_attribute): Add second parameter to hotpatch
attribute.
(s390_attribute_table): Ditto.
(s390_function_num_hotpatch_trampoline_halfwords): Renamed.
(s390_function_num_hotpatch_hw): New name.
Remove special handling of inline functions and hotpatching.
Return number of nops before and after the function label.
(s390_can_inline_p): Removed.
(s390_asm_output_function_label): Emit a configurable number of nops
after the function label.
(s390_option_override): Update -mhotpatch= syntax and remove -mhotpatch.
(TARGET_CAN_INLINE_P) Removed.
(TARGET_FUNCTION_ATTRIBUTE_INLINABLE_P): New.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
From 9123265bb1d6e325f4edc99a2d1f33a862b3ba53 Mon Sep 17 00:00:00 2001
From: Dominik Vogt v...@linux.vnet.ibm.com
Date: Mon, 1 Dec 2014 15:59:42 +0100
Subject: [PATCH] S/390: -mhotpatch v2

Update the -mhotpatch option and the hotpatch function attribute to take
exactly two arguments.  The first is the number of halfwords to be filled with
two-byte-nops before the function label.  The second is the number of halfwords
to be filled with nops after the label (the biggest available nop instructions
are used).

Further changes are:

* Artificial functions and the main function are also patched.
* Functions selected for hotpatching can still be inlined.  It's the
  responsibility of the user to take care of this when patching, or to
  explicitly disable inlining.
---
 gcc/config/s390/s390.c | 227 -
 gcc/config/s390/s390.opt   |  12 +-
 gcc/doc/extend.texi|  17 +-
 gcc/doc/invoke.texi|  16 +-
 gcc/testsuite/gcc.target/s390/hotpatch-1.c |  14 +-
 gcc/testsuite/gcc.target/s390/hotpatch-10.c|  15 +-
 gcc/testsuite/gcc.target/s390/hotpatch-11.c|  12 +-
 gcc/testsuite/gcc.target/s390/hotpatch-12.c|  14 +-
 gcc/testsuite/gcc.target/s390/hotpatch-13.c|  17 ++
 gcc/testsuite/gcc.target/s390/hotpatch-14.c|  17 ++
 gcc/testsuite/gcc.target/s390/hotpatch-15.c|  17 ++
 gcc/testsuite/gcc.target/s390/hotpatch-16.c|  17 ++
 gcc/testsuite/gcc.target/s390/hotpatch-17.c|  17 ++
 gcc/testsuite/gcc.target/s390/hotpatch-18.c|  16 ++
 gcc/testsuite/gcc.target/s390/hotpatch-19.c|  23 +++
 gcc/testsuite/gcc.target/s390/hotpatch-2.c |  12 +-
 gcc/testsuite/gcc.target/s390/hotpatch-20.c|  20 ++
 gcc/testsuite/gcc.target/s390/hotpatch-3.c |  10 +-
 gcc/testsuite/gcc.target/s390/hotpatch-4.c |  18 +-
 gcc/testsuite/gcc.target/s390/hotpatch-5.c |  15 +-
 gcc/testsuite/gcc.target/s390/hotpatch-6.c |  13 +-
 gcc/testsuite/gcc.target/s390/hotpatch-7.c |  13 +-
 gcc/testsuite/gcc.target/s390/hotpatch-8.c |  24 +--
 gcc/testsuite/gcc.target/s390/hotpatch-9.c |  15 +-
 gcc/testsuite/gcc.target/s390/hotpatch-compile-1.c |  24 +--
 .../gcc.target/s390/hotpatch-compile-10.c  |  12 ++
 .../gcc.target/s390/hotpatch-compile-11.c  |  12 ++
 .../gcc.target/s390/hotpatch-compile-12.c  |  12 ++
 .../gcc.target/s390/hotpatch-compile-13.c  |  29 +++
 .../gcc.target/s390/hotpatch-compile-14.c  |  11 +
 .../gcc.target/s390/hotpatch-compile-15.c  |  43 
 .../gcc.target/s390/hotpatch-compile-16.c  |  24 +++
 gcc/testsuite/gcc.target/s390/hotpatch-compile-2.c |  24 +--
 gcc/testsuite/gcc.target/s390/hotpatch-compile-3.c |  24 +--
 gcc/testsuite/gcc.target/s390/hotpatch-compile-4.c |   2 +-
 gcc/testsuite/gcc.target/s390/hotpatch-compile-5.c |  23 +--
 gcc/testsuite/gcc.target/s390/hotpatch-compile-6.c |   4 +-
 gcc/testsuite/gcc.target/s390/hotpatch-compile-7.c |  66 +-
 gcc/testsuite/gcc.target/s390/hotpatch-compile-8.c |  23 +--
 gcc/testsuite/gcc.target/s390/hotpatch-compile-9.c |  12 ++
 40 files changed, 532 insertions(+), 404 

Re: [PATCH] Fix PR64277

2015-01-27 Thread Richard Biener
On Tue, 27 Jan 2015, Jakub Jelinek wrote:

 On Tue, Jan 27, 2015 at 10:25:48AM +0100, Richard Biener wrote:
  
  This disables array-bound warnings from VRP2 as discussed.
  
  Bootstrapped and tested on x86_64-unknown-linux-gnu - ok for trunk?
 
 So nothing in the testsuite needed to change?  Nice.

Yes.

 Ok for trunk.
 
  I'll search for duplicates and add a few testcases.
 
 Thanks.

Committed as follows (first testcase in PR59124 not fixed - it warns
from the first pass).

2015-01-27  Richard Biener  rguent...@suse.de

PR tree-optimization/56273
PR tree-optimization/59124
PR tree-optimization/64277
* tree-vrp.c (vrp_finalize): Emit array-bound warnings only
from the first VRP pass.

* g++.dg/warn/Warray-bounds-6.C: New testcase.
* gcc.dg/Warray-bounds-12.c: Likewise.
* gcc.dg/Warray-bounds-13.c: Likewise.

Index: gcc/tree-vrp.c
===
*** gcc/tree-vrp.c.orig 2015-01-27 10:34:26.453743828 +0100
--- gcc/tree-vrp.c  2015-01-27 10:43:04.970610102 +0100
*** vrp_finalize (void)
*** 10229,10235 
substitute_and_fold (op_with_constant_singleton_value_range,
   vrp_fold_stmt, false);
  
!   if (warn_array_bounds)
  check_all_array_refs ();
  
/* We must identify jump threading opportunities before we release
--- 10229,10235 
substitute_and_fold (op_with_constant_singleton_value_range,
   vrp_fold_stmt, false);
  
!   if (warn_array_bounds  first_pass_instance)
  check_all_array_refs ();
  
/* We must identify jump threading opportunities before we release
Index: gcc/testsuite/g++.dg/warn/Warray-bounds-6.C
===
*** /dev/null   1970-01-01 00:00:00.0 +
--- gcc/testsuite/g++.dg/warn/Warray-bounds-6.C 2015-01-27 10:40:31.311871855 
+0100
***
*** 0 
--- 1,26 
+ // { dg-do compile }
+ // { dg-options -O3 -Warray-bounds }
+ 
+ struct type {
+ bool a, b;
+ bool get_b() { return b; }
+ };
+ 
+ type stuff[9u];
+ 
+ void bar();
+ 
+ void foo()
+ {
+   for(unsigned i = 0u; i  9u; i++)
+ {
+   if(!stuff[i].a)
+   continue;
+ 
+   bar();
+ 
+   for(unsigned j = i + 1u; j  9u; j++)
+   if(stuff[j].a  stuff[j].get_b()) // { dg-bogus above array bounds }
+ return;
+ }
+ }
Index: gcc/testsuite/gcc.dg/Warray-bounds-12.c
===
*** /dev/null   1970-01-01 00:00:00.0 +
--- gcc/testsuite/gcc.dg/Warray-bounds-12.c 2015-01-27 10:40:58.196175989 
+0100
***
*** 0 
--- 1,26 
+ /* { dg-do compile } */
+ /* { dg-options -O3 -Warray-bounds } */
+ /* { dg-additional-options -mssse3 { target x86_64-*-* i?86-*-* } } */
+ 
+ void foo(short a[], short m)
+ {
+   int i, j;
+   int f1[10];
+   short nc;
+ 
+   nc = m + 1;
+   if (nc  3)
+ {
+   for (i = 0; i = nc; i++)
+   {
+ f1[i] = f1[i] + 1;
+   }
+ }
+ 
+   for (i = 0, j = m; i  nc; i++, j--)
+ {
+   a[i] = f1[i]; /* { dg-bogus above array bounds } */
+   a[j] = i;
+ }
+   return;
+ }
Index: gcc/testsuite/gcc.dg/Warray-bounds-13.c
===
*** /dev/null   1970-01-01 00:00:00.0 +
--- gcc/testsuite/gcc.dg/Warray-bounds-13.c 2015-01-27 10:42:43.738369929 
+0100
***
*** 0 
--- 1,18 
+ /* { dg-do compile } */
+ /* { dg-options -O3 -Warray-bounds } */
+ 
+ extern char *bar[17];
+ 
+ int foo(int argc, char **argv)
+ {
+   int i;
+   int n = 0;
+ 
+   for (i = 0; i  argc; i++)
+ n++;
+ 
+   for (i = 0; i  argc; i++)
+ argv[i] = bar[i + n]; /* { dg-bogus above array bounds } */
+ 
+   return 0;
+ }


Re: [PATCH][AArch64] Use target builtin instead of __builtin_sqrt for vsqrt_f64

2015-01-27 Thread Kyrill Tkachov


On 19/01/15 15:46, Kyrill Tkachov wrote:

On 19/01/15 15:44, James Greenhalgh wrote:

On Mon, Jan 12, 2015 at 05:30:46PM +, Andrew Pinski wrote:

On Mon, Jan 12, 2015 at 7:52 AM, Kyrill Tkachov kyrylo.tkac...@arm.com wrote:

Hi all,

As raised in https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01237.html and
discussed in that thread, using __builtin_sqrt for vsqrt_f64 may end up in a
call to the library sqrt at -O0. To avoid that this patch uses a target
builtin for sqrt on DF mode and uses that to implement the intrinsic.

With this patch I don't see sqrt calls being created at -O0 on a large
arm_neon.h testcase where they were generated before.
aarch64-none-elf testing and the intrinsics testsuite in particular are
clean.
Ok for trunk?

Maybe have a target fold which folds this into sqrt if -fno-math-errno
is supplied.  This might be useful the -ffast-math case.
Maybe also fold it when a constant is supplied too.

Given that we are now in Stage 4, I'd rather see this fixed for GCC 5.0
in the way Kyrill proposed than languishing on a TODO list. Though an
IOU ticket on bugzilla for the missed optimization seems a good idea
to me.

Unless Kyrill already has something in the works to address your
comment, this looks like the right short-term solution to me
(Though Marcus/Richard will have to approve it).

Sorry, this slipped through the cracks.
I agree with James. A missed-optimization issue on bugzilla would be
helpful to keep track of this.


I've filed PR 64821 to keep track of this for GCC 6.
Can I ping https://gcc.gnu.org/ml/gcc-patches/2015-01/msg00710.html then?
It's a regression fix at -O0 so should be appropriate for stage4

Thanks,
Kyrill



Kyrill


Thanks,
James


2015-01-12  Kyrylo Tkachov  kyrylo.tkac...@arm.com

  * config/aarch64/aarch64-simd-builtins.def (sqrt): Use BUILTIN_VDQF_DF.
  * config/aarch64/arm_neon.h (vsqrt_f64): Use __builtin_aarch64_sqrtdf
  instead of __builtin_sqrt.








Re: [PATCH] Fix PR64277

2015-01-27 Thread Ilya Enkovich
2015-01-27 12:47 GMT+03:00 Richard Biener rguent...@suse.de:
 On Tue, 27 Jan 2015, Jakub Jelinek wrote:

 On Tue, Jan 27, 2015 at 10:25:48AM +0100, Richard Biener wrote:
 
  This disables array-bound warnings from VRP2 as discussed.
 
  Bootstrapped and tested on x86_64-unknown-linux-gnu - ok for trunk?

 So nothing in the testsuite needed to change?  Nice.

 Yes.

 Ok for trunk.

  I'll search for duplicates and add a few testcases.

 Thanks.

 Committed as follows (first testcase in PR59124 not fixed - it warns
 from the first pass).

Are you going to port it to 4.9 branch?

Thanks,
Ilya


 2015-01-27  Richard Biener  rguent...@suse.de

 PR tree-optimization/56273
 PR tree-optimization/59124
 PR tree-optimization/64277
 * tree-vrp.c (vrp_finalize): Emit array-bound warnings only
 from the first VRP pass.

 * g++.dg/warn/Warray-bounds-6.C: New testcase.
 * gcc.dg/Warray-bounds-12.c: Likewise.
 * gcc.dg/Warray-bounds-13.c: Likewise.

 Index: gcc/tree-vrp.c
 ===
 *** gcc/tree-vrp.c.orig 2015-01-27 10:34:26.453743828 +0100
 --- gcc/tree-vrp.c  2015-01-27 10:43:04.970610102 +0100
 *** vrp_finalize (void)
 *** 10229,10235 
 substitute_and_fold (op_with_constant_singleton_value_range,
vrp_fold_stmt, false);

 !   if (warn_array_bounds)
   check_all_array_refs ();

 /* We must identify jump threading opportunities before we release
 --- 10229,10235 
 substitute_and_fold (op_with_constant_singleton_value_range,
vrp_fold_stmt, false);

 !   if (warn_array_bounds  first_pass_instance)
   check_all_array_refs ();

 /* We must identify jump threading opportunities before we release
 Index: gcc/testsuite/g++.dg/warn/Warray-bounds-6.C
 ===
 *** /dev/null   1970-01-01 00:00:00.0 +
 --- gcc/testsuite/g++.dg/warn/Warray-bounds-6.C 2015-01-27 10:40:31.311871855 
 +0100
 ***
 *** 0 
 --- 1,26 
 + // { dg-do compile }
 + // { dg-options -O3 -Warray-bounds }
 +
 + struct type {
 + bool a, b;
 + bool get_b() { return b; }
 + };
 +
 + type stuff[9u];
 +
 + void bar();
 +
 + void foo()
 + {
 +   for(unsigned i = 0u; i  9u; i++)
 + {
 +   if(!stuff[i].a)
 +   continue;
 +
 +   bar();
 +
 +   for(unsigned j = i + 1u; j  9u; j++)
 +   if(stuff[j].a  stuff[j].get_b()) // { dg-bogus above array bounds 
 }
 + return;
 + }
 + }
 Index: gcc/testsuite/gcc.dg/Warray-bounds-12.c
 ===
 *** /dev/null   1970-01-01 00:00:00.0 +
 --- gcc/testsuite/gcc.dg/Warray-bounds-12.c 2015-01-27 10:40:58.196175989 
 +0100
 ***
 *** 0 
 --- 1,26 
 + /* { dg-do compile } */
 + /* { dg-options -O3 -Warray-bounds } */
 + /* { dg-additional-options -mssse3 { target x86_64-*-* i?86-*-* } } */
 +
 + void foo(short a[], short m)
 + {
 +   int i, j;
 +   int f1[10];
 +   short nc;
 +
 +   nc = m + 1;
 +   if (nc  3)
 + {
 +   for (i = 0; i = nc; i++)
 +   {
 + f1[i] = f1[i] + 1;
 +   }
 + }
 +
 +   for (i = 0, j = m; i  nc; i++, j--)
 + {
 +   a[i] = f1[i]; /* { dg-bogus above array bounds } */
 +   a[j] = i;
 + }
 +   return;
 + }
 Index: gcc/testsuite/gcc.dg/Warray-bounds-13.c
 ===
 *** /dev/null   1970-01-01 00:00:00.0 +
 --- gcc/testsuite/gcc.dg/Warray-bounds-13.c 2015-01-27 10:42:43.738369929 
 +0100
 ***
 *** 0 
 --- 1,18 
 + /* { dg-do compile } */
 + /* { dg-options -O3 -Warray-bounds } */
 +
 + extern char *bar[17];
 +
 + int foo(int argc, char **argv)
 + {
 +   int i;
 +   int n = 0;
 +
 +   for (i = 0; i  argc; i++)
 + n++;
 +
 +   for (i = 0; i  argc; i++)
 + argv[i] = bar[i + n]; /* { dg-bogus above array bounds } */
 +
 +   return 0;
 + }


Re: [PATCH] Fix for PR64741 (UBSan/ASan integration)

2015-01-27 Thread Jakub Jelinek
On Tue, Jan 27, 2015 at 09:19:20AM +0300, Yury Gribov wrote:
 As described in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64741 , ASan
 may currently report false positives for UBSan internal variables due to
 their incomplete type information. This patch fixes this.
 
 Bootstrapped and regtested on Linux x64. Ok to commit?
 
 -Y

 commit cf083510ece7b7bde1ab5a41e293b5a6a5bb4550
 Author: Yury Gribov y.gri...@samsung.com
 Date:   Mon Jan 26 10:19:03 2015 +0300
 
 2015-01-26  Yury Gribov  y.gri...@samsung.com
 
   PR ubsan/64741
 
   * ubsan.c (ubsan_type_descriptor): Update type size.

No extra newline between PR and * ubsan.c lines.

 --- a/gcc/ubsan.c
 +++ b/gcc/ubsan.c
 @@ -504,6 +504,14 @@ ubsan_type_descriptor (tree type, enum ubsan_print_style 
 pstyle)
tinfo = get_ubsan_type_info_for_type (type);
  
/* Create a new VAR_DECL of type descriptor.  */
 +  const char *tmp = pp_formatted_text (pretty_name);
 +  size_t len = strlen (tmp);
 +  tree str = build_string (len + 1, tmp);
 +  TREE_TYPE (str) = build_array_type (char_type_node,
 +   build_index_type (size_int (len)));
 +  TREE_READONLY (str) = 1;
 +  TREE_STATIC (str) = 1;

While touching this, could you please rewrite it as:
  const char *tmp = pp_formatted_text (pretty_name);
  size_t len = strlen (tmp) + 1;
  tree str = build_string (len, tmp);
  TREE_TYPE (str) = build_array_type_nelts (char_type_node, len);
  TREE_READONLY (str) = 1;
  TREE_STATIC (str) = 1;
?  Or, if you want, do it as a follow-up.  There is another occurrence
of this in ubsan_source_location.

Ok for trunk with or without this change.

Jakub


Re: [PATCH] Update BBs in cleanup_barriers pass (PR rtl-optimization/61058)

2015-01-27 Thread Eric Botcazou
 Because reorder_insns doesn't handle the case of moving a barrier into a
 middle of basic block.

Right, I should have read the audit trail. :-)  The patch is OK then, but add 
a ??? note at the end of the comment saying that the proper thing to do here 
is probably not to run cleanup_barrier for this back-end.

-- 
Eric Botcazou


Re: [PATCH] wide-int division fix (PR tree-optimization/64807)

2015-01-27 Thread Richard Biener
On Mon, 26 Jan 2015, Jakub Jelinek wrote:

 Hi!
 
 On the following testcase we generate wrong code, because
 apparently divmod_internal_2 relies on 0 being the topmost
 element (at b_dividend[m]):
algorithm.  M is the number of significant elements of U however
there needs to be at least one extra element of B_DIVIDEND
allocated, N is the number of elements of B_DIVISOR.  */
 The comment talks just about allocation, but from the code
 it seems it really relies on it being 0.
 There is space for it:
   unsigned HOST_HALF_WIDE_INT
 b_dividend[(4 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_HALF_WIDE_INT) + 
 1];
   unsigned HOST_HALF_WIDE_INT
 b_divisor[4 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_HALF_WIDE_INT];
 (the + 1), and usually there already is a zero in there:
   m = dividend_blocks_needed;
   while (m  1  b_dividend[m - 1] == 0)
 m--;
 so the only problematic case is if m isn't decreased.
 
 Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
 ok for trunk?

Ok.

Thanks,
Richard.

 2015-01-26  Jakub Jelinek  ja...@redhat.com
 
   PR tree-optimization/64807
   * wide-int.cc (wi::divmod_internal): Clear
   b_dividend[dividend_blocks_needed].
 
   * gcc.dg/pr64807.c: New test.
 
 --- gcc/wide-int.cc.jj2015-01-09 21:59:38.0 +0100
 +++ gcc/wide-int.cc   2015-01-26 19:21:56.114316481 +0100
 @@ -1819,6 +1819,7 @@ wi::divmod_internal (HOST_WIDE_INT *quot
divisor_blocks_needed, divisor_prec, sgn);
  
m = dividend_blocks_needed;
 +  b_dividend[m] = 0;
while (m  1  b_dividend[m - 1] == 0)
  m--;
  
 --- gcc/testsuite/gcc.dg/pr64807.c.jj 2015-01-26 19:24:13.612943033 +0100
 +++ gcc/testsuite/gcc.dg/pr64807.c2015-01-26 19:32:34.502237566 +0100
 @@ -0,0 +1,19 @@
 +/* PR tree-optimization/64807 */
 +/* { dg-do run { target int128 } } */
 +/* { dg-options -O2 } */
 +
 +__uint128_t
 +foo (void)
 +{
 +  __uint128_t a = -1;
 +  __uint128_t b = -1;
 +  return a / b;
 +}
 +
 +int
 +main ()
 +{
 +  if (foo () != 1)
 +__builtin_abort ();
 +  return 0;
 +}
 
   Jakub
 
 

-- 
Richard Biener rguent...@suse.de
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild,
Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)


[PATCH, PR tree-optimization/64277] Improve loop iterations count estimation

2015-01-27 Thread Ilya Enkovich
Hi,

This patch was supposed to fix PR tree-optimization/64277.  Tracker is now 
fixed by warnings disabling but I think patch is still useful to avoid dead 
code generated by complete unroll.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

Thanks,
Ilya
--
gcc/

2015-01-27  Ilya Enkovich  ilya.enkov...@intel.com

* tree-ssa-loop-niter.c (record_nonwrapping_iv): Use base
range info when possible to refine estimation.

gcc/testsuite/

2015-01-27  Ilya Enkovich  ilya.enkov...@intel.com

* gcc.dg/pr64277.c: New.


diff --git a/gcc/testsuite/gcc.dg/pr64277.c b/gcc/testsuite/gcc.dg/pr64277.c
new file mode 100644
index 000..0d5ef11
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr64277.c
@@ -0,0 +1,21 @@
+/* PR tree-optimization/64277 */
+/* { dg-do compile } */
+/* { dg-options -O3 -Wall -Werror } */
+
+
+int f1[10];
+void test1 (short a[], short m, unsigned short l)
+{
+  int i = l;
+  for (i = i + 5; i  m; i++)
+f1[i] = a[i]++;
+}
+
+void test2 (short a[], short m, short l)
+{
+  int i;
+  if (m  5)
+m = 5;
+  for (i = m; i  l; i--)
+f1[i] = a[i]++;
+}
diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
index 919f5c0..6a55c6f 100644
--- a/gcc/tree-ssa-loop-niter.c
+++ b/gcc/tree-ssa-loop-niter.c
@@ -2754,6 +2754,7 @@ record_nonwrapping_iv (struct loop *loop, tree base, tree 
step, gimple stmt,
 {
   tree niter_bound, extreme, delta;
   tree type = TREE_TYPE (base), unsigned_type;
+  tree orig_base = base;
 
   if (TREE_CODE (step) != INTEGER_CST || integer_zerop (step))
 return;
@@ -2777,16 +2778,32 @@ record_nonwrapping_iv (struct loop *loop, tree base, 
tree step, gimple stmt,
 
   if (tree_int_cst_sign_bit (step))
 {
+  wide_int min, max;
   extreme = fold_convert (unsigned_type, low);
-  if (TREE_CODE (base) != INTEGER_CST)
+  if (TREE_CODE (orig_base) == SSA_NAME
+  TREE_CODE (high) == INTEGER_CST
+  !POINTER_TYPE_P (TREE_TYPE (orig_base))
+  SSA_NAME_RANGE_INFO (orig_base)
+  get_range_info (orig_base, min, max) == VR_RANGE
+  wi::gts_p (wide_int (high), max))
+   base = wide_int_to_tree (unsigned_type, max);
+  else if (TREE_CODE (base) != INTEGER_CST)
base = fold_convert (unsigned_type, high);
   delta = fold_build2 (MINUS_EXPR, unsigned_type, base, extreme);
   step = fold_build1 (NEGATE_EXPR, unsigned_type, step);
 }
   else
 {
+  wide_int min, max;
   extreme = fold_convert (unsigned_type, high);
-  if (TREE_CODE (base) != INTEGER_CST)
+  if (TREE_CODE (orig_base) == SSA_NAME
+  TREE_CODE (low) == INTEGER_CST
+  !POINTER_TYPE_P (TREE_TYPE (orig_base))
+  SSA_NAME_RANGE_INFO (orig_base)
+  get_range_info (orig_base, min, max) == VR_RANGE
+  wi::gts_p (min, wide_int (low)))
+   base = wide_int_to_tree (unsigned_type, min);
+  else if (TREE_CODE (base) != INTEGER_CST)
base = fold_convert (unsigned_type, low);
   delta = fold_build2 (MINUS_EXPR, unsigned_type, extreme, base);
 }


Re: [PATCH] Update BBs in cleanup_barriers pass (PR rtl-optimization/61058)

2015-01-27 Thread Jakub Jelinek
On Tue, Jan 27, 2015 at 09:25:32AM +0100, Eric Botcazou wrote:
  Yes, they do, that is why it crashed during final.
 
 OK.  Why wouldn't it work to call reorder_insns instead of reorder_insns_nobb?

Because reorder_insns doesn't handle the case of moving a barrier into a
middle of basic block.

  if (!BARRIER_P (from)
   (bb2 = BLOCK_FOR_INSN (from)))
{
  if (BB_END (bb2) == to)
BB_END (bb2) = prev;
  df_set_bb_dirty (bb2);
}

  if (BB_END (bb) == after)
BB_END (bb) = to;

  for (x = from; x != NEXT_INSN (to); x = NEXT_INSN (x))
if (!BARRIER_P (x))
  df_insn_change_bb (x, bb);

from == to is a BARRIER in this case, BB_END (bb) != after (BB_END
is actually PREV_INSN (from)), so this doesn't do anything at all.

While what we need is:

1) set BB_END to after
2) clear BLOCK_FOR_INSN on the notes after AFTER (after addition of
   barrier after FROM == TO) until former PREV_INSN (FROM) (inclusive)

Jakub


Re: [Patch, Fortran, OOP] PR 64230: [4.9/5 Regression] Invalid memory reference in a compiler-generated finalizer for allocatable component

2015-01-27 Thread Andreas Schwab
Janus Weil ja...@gcc.gnu.org writes:

 2015-01-19  Janus Weil  ja...@gcc.gnu.org

 PR fortran/64230
 * gfortran.dg/class_allocate_18.f90: Extended.

FAIL: gfortran.dg/class_allocate_18.f90   -O0  (test for excess errors)
Excess errors:
/usr/ia64-suse-linux/bin/ld: cannot find -lubsan

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
And now for something completely different.


[PATCH] Fix PR64277

2015-01-27 Thread Richard Biener

This disables array-bound warnings from VRP2 as discussed.

Bootstrapped and tested on x86_64-unknown-linux-gnu - ok for trunk?

I'll search for duplicates and add a few testcases.

Thanks,
Richard.

2015-01-27  Richard Biener  rguent...@suse.de

PR tree-optimization/64277
* tree-vrp.c (vrp_finalize): Emit array-bound warnings only
from the first VRP pass.

Index: gcc/tree-vrp.c
===
--- gcc/tree-vrp.c  (revision 220107)
+++ gcc/tree-vrp.c  (working copy)
@@ -10229,7 +10197,7 @@ vrp_finalize (void)
   substitute_and_fold (op_with_constant_singleton_value_range,
   vrp_fold_stmt, false);
 
-  if (warn_array_bounds)
+  if (warn_array_bounds  first_pass_instance)
 check_all_array_refs ();
 
   /* We must identify jump threading opportunities before we release


Re: [PATCH] Fix PR64277

2015-01-27 Thread Jakub Jelinek
On Tue, Jan 27, 2015 at 10:25:48AM +0100, Richard Biener wrote:
 
 This disables array-bound warnings from VRP2 as discussed.
 
 Bootstrapped and tested on x86_64-unknown-linux-gnu - ok for trunk?

So nothing in the testsuite needed to change?  Nice.

Ok for trunk.

 I'll search for duplicates and add a few testcases.

Thanks.

 2015-01-27  Richard Biener  rguent...@suse.de
 
   PR tree-optimization/64277
   * tree-vrp.c (vrp_finalize): Emit array-bound warnings only
   from the first VRP pass.
 
 Index: gcc/tree-vrp.c
 ===
 --- gcc/tree-vrp.c(revision 220107)
 +++ gcc/tree-vrp.c(working copy)
 @@ -10229,7 +10197,7 @@ vrp_finalize (void)
substitute_and_fold (op_with_constant_singleton_value_range,
  vrp_fold_stmt, false);
  
 -  if (warn_array_bounds)
 +  if (warn_array_bounds  first_pass_instance)
  check_all_array_refs ();
  
/* We must identify jump threading opportunities before we release

Jakub


Re: [debug-early] C++ clones and limbo DIEs

2015-01-27 Thread Jason Merrill

On 01/23/2015 01:45 PM, Aldy Hernandez wrote:

It would expect [the flush] to be before free_lang_data and LTO streaming.


The reason this wouldn't make a difference is because, as it stands,
dwarf for the clones are not generated until final.c:

   if (!DECL_IGNORED_P (current_function_decl))
 debug_hooks-function_decl (current_function_decl);

which happens after free_lang_data.


I agree that the current code doesn't have this effect, but we're 
talking about changing things, right? :)



Unfortunately, this sets DECL_ABSTRACT_P for the static_p above, and
refuses to unset it after the call to dwarf2out_decl.


Well, that sounds like a bug.  Why isn't it being unset?  Is it because 
DECL_ABSTRACT_P was already set for the function, so we don't call 
set_decl_abstract_flags (decl, 0)?  Perhaps a solution to that would be 
to avoid calling set_decl_abstract_flags (decl, 1) if the function is 
already marked as abstract.  Or to teach set_decl_abstract_flags not to 
mess with static local variables.


Jason



Re: [PATCH][RFA][PR target/15184] Partial fix for direct byte access on x86

2015-01-27 Thread Segher Boessenkool
On Tue, Jan 27, 2015 at 01:53:34PM -0700, Jeff Law wrote:
 I do have a specific PR in mind, but I cannot currently find it.  It was
 about x86, dec mem and then using the flags...  Must have sent 100 emails
 in that thread...  And cannot find it now!
 Are you referring to 61225?

That is the one, thanks.


Segher


C++ PATCH for c++/58597 (lambda in default arg)

2015-01-27 Thread Jason Merrill
Here, sometimes we can end up in maybe_add_lambda_conv_op with 
current_function_decl set but not cfun.  If we push_function_context in 
that case, the later pop doesn't clear cfun, but leaves it with a value 
that leads to a crash later on.  So let's avoid calling 
push_function_context in that case.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit ce65568ba19c4613c25f48064a0d5e66454265ac
Author: Jason Merrill ja...@redhat.com
Date:   Tue Jan 27 14:26:18 2015 -0500

	PR c++/58597
	* lambda.c (maybe_add_lambda_conv_op): Check cfun rather than
	current_function_decl.

diff --git a/gcc/cp/lambda.c b/gcc/cp/lambda.c
index 6c9e224..b160c8c 100644
--- a/gcc/cp/lambda.c
+++ b/gcc/cp/lambda.c
@@ -854,7 +854,7 @@ prepare_op_call (tree fn, int nargs)
 void
 maybe_add_lambda_conv_op (tree type)
 {
-  bool nested = (current_function_decl != NULL_TREE);
+  bool nested = (cfun != NULL);
   bool nested_def = decl_function_context (TYPE_MAIN_DECL (type));
   tree callop = lambda_function (type);
 
diff --git a/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-defarg6.C b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-defarg6.C
new file mode 100644
index 000..fe8767a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-defarg6.C
@@ -0,0 +1,9 @@
+// PR c++/58597
+// { dg-do compile { target c++11 } }
+
+templatetypename struct A
+{
+  templatetypename T A(T, int = []{ return 0; }()) {}
+};
+
+Aint a = 0;


Bug 62044 - [4.8/4.9 Regression] ICE in USE statement with RENAME for extended derived type

2015-01-27 Thread Paul Richard Thomas
Dear All,

The highly embarrassing bug in mold = allocations to class entities
has been fixed in revisions 220140 and 220191 for trunk and 4.9
respectively. The PR has been set as RESOLVED.

Cheers

Paul


[Patch, fortran] PR63205 - [OOP] Wrongly rejects type = class (for identical declared type)

2015-01-27 Thread Paul Richard Thomas
Dear All,

This patch enables the passing of an allocatable class object, scalar
or array, to a derived type of the declared type, either in an
assignment or as an actual argument. Much of the effort went into
sorting out the finalization call so that the 'left over' allocatable
components added by the dynamic type do not leak memory. At the
moment, the existence of the finalization function is tested for. A
check to see if the dynamic type is the same as the declared type
could be added.

Note that adding the 'must_finalize' field to gfc_expr will be useful
in enabling the missing mandatory finalization calls.

There are still interrogation marks about the patch; especially in
build_class_array_ref, where I do not understand why the added code
does not work in general, except for hidden function results.
Nonetheless, the code does not leak memory, apart perhaps from the
compound derived type constructors, with allocatable components that
already show leaks elsewhere. It is also well ringfenced and so should
not cause any regressions... touch wood!

Bootstraps and regtests on x86_64/FC21 - OK for trunk?

Paul

2015-01-27  Paul Thomas  pa...@gcc.gnu.org

PR fortran/63205
* gfortran.h: Add 'must finalize' field to gfc_expr and
prototypes for gfc_is_alloc_class_scalar_function and for
gfc_is_alloc_class_array_function.
* expr.c (gfc_is_alloc_class_scalar_function,
gfc_is_alloc_class_array_function): New functions.
* trans-array.c (gfc_add_loop_ss_code): Do not move the
expression for allocatable class scalar functions outside the
loop.
(conv_array_index_offset): Cope with deltas being NULL_TREE.
(build_class_array_ref): Do not return with allocatable class
array functions. Add code to pick out the returned class array.
Dereference if necessary and return if not a class object.
(gfc_conv_scalarized_array_ref): Cope with offsets being NULL.
(gfc_walk_function_expr): Return an array ss for the result of
an allocatable class array function.
* trans-expr.c (gfc_conv_subref_array_arg): Remove the assert
that the argument should be a variable. If an allocatable class
array function, set the offset to zero and skip the write-out
loop in this case.
(gfc_conv_procedure_call): Add allocatable class array function
to the assert. Call gfc_conv_subref_array_arg for allocatable
class array function arguments with derived type formal arg..
Add the code for handling allocatable class functions, including
finalization calls to prevent memory leaks.
(arrayfunc_assign_needs_temporary): Return if an allocatable
class array function.
(gfc_trans_assignment_1): Set must_finalize to rhs expression
for allocatable class functions. Set scalar_to_array as needed
for scalar class allocatable functions assigned to an array.
Nullify the allocatable components corresponding the the lhs
derived type so that the finalization does not free them.

2015-01-27  Paul Thomas  pa...@gcc.gnu.org

PR fortran/63205
* gfortran.dg/class_to_type_4.f90: New test
Index: gcc/fortran/gfortran.h
===
*** gcc/fortran/gfortran.h  (revision 208092)
--- gcc/fortran/gfortran.h  (working copy)
*** typedef struct gfc_expr
*** 1753,1758 
--- 1753,1761 
/* Mark an expression as being a MOLD argument of ALLOCATE.  */
unsigned int mold : 1;

+   /* Will require finalization after use.  */
+   unsigned int must_finalize : 1;
+
/* If an expression comes from a Hollerith constant or compile-time
   evaluation of a transfer statement, it may have a prescribed target-
   memory representation, and these cannot always be backformed from
*** bool gfc_expr_check_typed (gfc_expr*, gf
*** 2804,2809 
--- 2807,2814 

  gfc_component * gfc_get_proc_ptr_comp (gfc_expr *);
  bool gfc_is_proc_ptr_comp (gfc_expr *);
+ bool gfc_is_alloc_class_scalar_function (gfc_expr *);
+ bool gfc_is_alloc_class_array_function (gfc_expr *);

  bool gfc_ref_this_image (gfc_ref *ref);
  bool gfc_is_coindexed (gfc_expr *);
Index: gcc/fortran/expr.c
===
*** gcc/fortran/expr.c  (revision 208092)
--- gcc/fortran/expr.c  (working copy)
*** gfc_is_proc_ptr_comp (gfc_expr *expr)
*** 4274,4279 
--- 4274,4313 
  }


+ /* Determine if an expression is a function with an allocatable class scalar
+result.  */
+ bool
+ gfc_is_alloc_class_scalar_function (gfc_expr *expr)
+ {
+   if (expr-expr_type == EXPR_FUNCTION
+expr-value.function.esym
+expr-value.function.esym-result
+expr-value.function.esym-result-ts.type == BT_CLASS
+!CLASS_DATA (expr-value.function.esym-result)-attr.dimension
+CLASS_DATA (expr-value.function.esym-result)-attr.allocatable)
+ return true;
+
+   return false;
+ }
+
+
+ /* Determine if an expression is a 

Re: [PATCH][RFA][PR target/15184] Partial fix for direct byte access on x86

2015-01-27 Thread Segher Boessenkool
On Tue, Jan 27, 2015 at 12:27:38PM -0700, Jeff Law wrote:
 On 01/26/15 22:11, Segher Boessenkool wrote:
 On Mon, Jan 26, 2015 at 08:07:29PM -0700, Jeff Law wrote:
 The second change we need is an additional simplification.
 
 If we have
 (subreg:M1 (zero_extend:M2 (x))
 
 Where M1  M2 and both are scalar integer modes.  It's advantageous to
 strip the SUBREG and instead have a wider extension.
 
 Should you also check M1 is not multiple registers?
 We're generally working with pseudos, so we could estimate, but not know 
 for sure if we're dealing with multiple hard regs.  But more 
 importantly, I'm not sure what that check would buy us.

I mean e.g. DI on a 32-bit target.  My worry is that zero_extend:DI then
is more expensive -- if say, it is implemented as a split, combine itself
cannot get rid of the redundancy.

 Earlier versions checked reg_equal_p on the MEM.  But that's often a 
 mistake because the modes of the two memory references may be different. 
  I don't recall which of the various tests, but I was definitely seeing 
 SImode in the load and HImode in the store.
 
 Similarly you don't want to check reg_equal_p on the addresses as they 
 aren't necessarily the same either (they're obviously related).
 
 That's how I ultimately settled on rtx_referenced_p form you see above. 
  I'm still not sure that's 100% what I want, but I don't have any tests 
 yet which require something more complex.

Okay, if there are actual real cases like that :-)  All this code does
is cull cases that are not useful to try to combine, since without that
combining four insns is very expensive.

 Does this do anything good for the dec mem thing on x86?  That would
 be a nice bonus :-)
 It might, but I haven't tested for that specifically.  If you've got 
 sample code or a PR in mind, pass it along and I'll take a look.  I'd 
 think dec mem would generally be handled by 3-1 insn combination code 
 unless there's something else going on.

I do have a specific PR in mind, but I cannot currently find it.  It was
about x86, dec mem and then using the flags...  Must have sent 100 emails
in that thread...  And cannot find it now!


Segher


Re: [PATCH][RFA][PR target/15184] Partial fix for direct byte access on x86

2015-01-27 Thread Jeff Law

On 01/27/15 13:36, Segher Boessenkool wrote:

On Tue, Jan 27, 2015 at 12:27:38PM -0700, Jeff Law wrote:

On 01/26/15 22:11, Segher Boessenkool wrote:

On Mon, Jan 26, 2015 at 08:07:29PM -0700, Jeff Law wrote:

The second change we need is an additional simplification.

If we have
(subreg:M1 (zero_extend:M2 (x))

Where M1  M2 and both are scalar integer modes.  It's advantageous to
strip the SUBREG and instead have a wider extension.


Should you also check M1 is not multiple registers?

We're generally working with pseudos, so we could estimate, but not know
for sure if we're dealing with multiple hard regs.  But more
importantly, I'm not sure what that check would buy us.


I mean e.g. DI on a 32-bit target.  My worry is that zero_extend:DI then
is more expensive -- if say, it is implemented as a split, combine itself
cannot get rid of the redundancy.
We might lose for a case like (subreg:DI ({zero,sign}_extend:SI (x))) on 
a 32 bit target if something were able to recognize that the upper bits 
were don't cares.


The most likely place for that to happen would be at assembly output 
time -- but that would require the target to exploit the don't care 
semantics of those bits.  I don't recall any port doing that.


We could exploit this in generic splitting code, but I don't think we 
do.  lower-subreg slams in a zero when it finds a paradoxical subreg and 
we've asked for the high word.  I don't immediately see that does 
anything special when the operand of the subreg is anything other than 
another reg or mem.


combine exploits the don't care nature of those bits to eliminate 
masking and such.  It's not going to be able to eliminate the subreg 
entirely unless it folks into some later insn and we're ultimately able 
to narrow the operation back down to SImode.



Also note that while ports may not have special cases around the subreg 
variant, several have special cases for ZERO_EXTEND.  Basically they 
slam in a zero to the upper word, either via a splitter or during 
assembly code output.  Those special cases will be recognized more often 
now.



Jeff


Re: [PATCH][RFA][PR target/15184] Partial fix for direct byte access on x86

2015-01-27 Thread Jeff Law

On 01/27/15 13:36, Segher Boessenkool wrote:

I mean e.g. DI on a 32-bit target.  My worry is that zero_extend:DI then
is more expensive -- if say, it is implemented as a split, combine itself
cannot get rid of the redundancy.

OK.  Let me play with that a bit.



Okay, if there are actual real cases like that :-)  All this code does
is cull cases that are not useful to try to combine, since without that
combining four insns is very expensive.

There are :-)  It surprised me as well.




Does this do anything good for the dec mem thing on x86?  That would
be a nice bonus :-)

It might, but I haven't tested for that specifically.  If you've got
sample code or a PR in mind, pass it along and I'll take a look.  I'd
think dec mem would generally be handled by 3-1 insn combination code
unless there's something else going on.


I do have a specific PR in mind, but I cannot currently find it.  It was
about x86, dec mem and then using the flags...  Must have sent 100 emails
in that thread...  And cannot find it now!

Are you referring to 61225?

Jeff


Re: Fix 59828 - Broken assembly on ppc* with two -mcpu= options

2015-01-27 Thread Alan Modra
On Wed, Jan 21, 2015 at 02:01:44PM -0500, David Edelsohn wrote:
 I want to avoid duplicating the -mcpu parsing logic or the Rube
 Goldberg mechanism to re-generate the -mXXX assembler directive.

Oh well, I had fun writing the patch.  I thought it reasonably
elegant, meeting the goals you state above.  You think differently,
and I won't push my approach further.  The bug isn't important enough
to argue over.

-- 
Alan Modra
Australia Development Lab, IBM


RE: [Patch][wwwdocs]Deprecate the ARM TPCS related options in gcc 5.0

2015-01-27 Thread Terry Guo


 -Original Message-
 From: Gerald Pfeifer [mailto:ger...@pfeifer.com]
 Sent: Monday, January 26, 2015 7:34 PM
 To: Terry Guo
 Cc: gcc-patches@gcc.gnu.org; Richard Earnshaw; Ramana Radhakrishnan
 Subject: Re: [Patch][wwwdocs]Deprecate the ARM TPCS related options in
 gcc 5.0
 
 On Monday 2015-01-26 16:47, Terry Guo wrote:
  This patch intends to update gcc 5.0 change.html to deprecate TPCS
  related options because TPCS is obsoleted per the ABI document at
 
 http://infocenter.arm.com/help/topic/com.arm.doc.ihi0042e/IHI0042E_aapc
 s.pdf.
  Is it OK?
 
 From a language perspective I suggest to say The options list the
options
 here related to the old ABI... or The options related to the old ABI
--
 list the options -- ..., where I somewhat prefer the former.
 
 Please wait for Richard or Ramana for final review and approval.
 
 Gerald

Thanks Gerald. Patch is updated. Is this one OK?

BR,
Terry

Index: htdocs/gcc-5/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v
retrieving revision 1.73
diff -u -p -r1.73 changes.html
--- htdocs/gcc-5/changes.html   26 Jan 2015 09:40:03 -  1.73
+++ htdocs/gcc-5/changes.html   27 Jan 2015 09:35:32 -
@@ -513,8 +513,9 @@ void operator delete[] (void *, std::siz
   li The deprecated option code-mwords-little-endian/code
has been removed.
   /li
-  li The options relating to the old ABI code-mapcs/code and
-  code-mapcs-frame/code have been deprecated.
+  li The options code-mapcs/code, code-mapcs-frame/code,
+  code-mtpcs-frame/code and code-mtpcs-leaf-frame/code
+  which are only applicable to the old ABI have been deprecated.
   /li
   liThe transitional options code-mlra/code and
code-mno-lra/code
have been removed. The ARM backend now uses the local register
allocator





Re: Fix 59828 - Broken assembly on ppc* with two -mcpu= options

2015-01-27 Thread David Edelsohn
On Tue, Jan 27, 2015 at 7:27 PM, Alan Modra amo...@gmail.com wrote:
 On Wed, Jan 21, 2015 at 02:01:44PM -0500, David Edelsohn wrote:
 I want to avoid duplicating the -mcpu parsing logic or the Rube
 Goldberg mechanism to re-generate the -mXXX assembler directive.

 Oh well, I had fun writing the patch.  I thought it reasonably
 elegant, meeting the goals you state above.  You think differently,
 and I won't push my approach further.  The bug isn't important enough
 to argue over.

Alan,

I am sorry that you do not want to finish the patch.  I don't
understand why you find the command line argument so appealing when
the .machine pseudo-op was designed for this purpose.

Thanks, David


Re: [PATCH][AArch64] Improve bit-test-branch pattern to avoid unnecessary register clobber

2015-01-27 Thread Jakub Jelinek
On Tue, Jan 27, 2015 at 02:31:14PM +, Jiong Wang wrote:
 testcase changed to execution version, and moved to gcc.dg. the compile time 
 only
 take several seconds. (previously I am using cc1 built by O0 which at most 
 take 24s)
 
 ok to install?

Ok for the testcase.
The config/aarch64/ bits I'll defer to aarch64 maintainers.

 2015-01-19  Ramana Radhakrishnan  ramana.radhakrish...@arm.com
 Jiong Wang  jiong.w...@arm.com
 
   gcc/
 * config/aarch64/aarch64.md (tboptabmode1): Clobber CC reg instead of 
 scratch reg.
 (cboptabmode1): Likewise.
 * config/aarch64/iterators.md (bcond): New define_code_attr.
 
   gcc/testsuite/
 * gcc.dg/long_branch.c: New testcase.

Jakub


RE: [PATCH RFA MIPS] Prohibit vector modes in accumulators

2015-01-27 Thread Moore, Catherine


 -Original Message-
 From: Matthew Fortune [mailto:matthew.fort...@imgtec.com]
 Sent: Tuesday, January 27, 2015 7:19 AM
 To: Richard Sandiford
 Cc: Robert Suchanek; gcc-patches@gcc.gnu.org; Moore, Catherine
 Subject: RE: [PATCH RFA MIPS] Prohibit vector modes in accumulators
 
 Richard Sandiford rdsandif...@googlemail.com writes:
  Matthew Fortune matthew.fort...@imgtec.com writes:
   2015-01-23  Robert Suchanek  robert.sucha...@imgtec.com
  
* config/mips/mips.c (mips_hard_regno_mode_ok_p): Prohibit
   accumulators
for all vector modes.
  
   This seems like a genuine bug and although it can only be triggered
   by loongson or paired-single support it probably qualifies for fixing.
 
  Agreed FWIW.  We shouldn't mark something as valid for a mode if even
  the mode's move pattern can't handle it.
 
  I think this kind of thing should go in regardless of development stage.
 
 Given that it was one of the pre-existing tests that failed I'm happy that we
 are covering this issue. All of these LRA related issues are likely to phase 
 in
 and out with subtle changes to code-gen so I don't think we can always get a
 test case that fails on trunk.
 
That's true.

 Since Catherine asked for further info then I will leave her to say if she is
 happy to accept on this basis.
 

I withdraw my request for a testcase.

Catherine


[RFC PATCH] Avoid most of the BUILT_IN_*_CHKP enum values

2015-01-27 Thread Jakub Jelinek
Hi!

I've grepped for BUILT_IN_.*_CHKP in the sources and we actually need
far fewer enum values than the 1204 that are being defined.

This patch requires builtins.def to say explicitly (by using
DEF_*BUILTIN_CHKP macro instead of corresponding DEF_*BUILTIN) which
ones need that, for all the others only space in the enum is reserved and
nothing else.

I'd hope this could work around the buggy AIX stabs handling, but even
on say x86_64-linux it has a benefit of decreasing cc1plus .debug_info
by about 2.7MB (of course, with dwz that benefit goes to almost nothing,
just the ~ 7000 bytes or so, plus .debug_str cost (that is merged even
without dwz between TUs).  The cost without dwz is obviously mainly
from repeating that in most of the translation units.  But why declare
BUILT_IN_*_CHKP enums that are never used by anything...

2015-01-27  Jakub Jelinek  ja...@redhat.com

* builtins.def (DEF_BUILTIN_CHKP): Define if not defined.
(DEF_LIB_BUILTIN_CHKP, DEF_EXT_LIB_BUILTIN_CHKP): Redefine.
(DEF_CHKP_BUILTIN): Define using DEF_BUILTIN_CHKP instead
of DEF_BUILTIN.
(BUILT_IN_MEMCPY, BUILT_IN_MEMMOVE, BUILT_IN_MEMSET, BUILT_IN_STRCAT,
BUILT_IN_STRCHR, BUILT_IN_STRCPY, BUILT_IN_STRLEN): Use
DEF_LIB_BUILTIN_CHKP macro instead of DEF_LIB_BUILTIN.
(BUILT_IN_MEMCPY_CHK, BUILT_IN_MEMMOVE_CHK, BUILT_IN_MEMPCPY_CHK,
BUILT_IN_MEMPCPY, BUILT_IN_MEMSET_CHK, BUILT_IN_STPCPY_CHK,
BUILT_IN_STPCPY, BUILT_IN_STRCAT_CHK, BUILT_IN_STRCPY_CHK): Use
DEF_EXT_LIB_BUILTIN_CHKP macro instead of DEF_EXT_LIB_BUILTIN.
* tree-core.h (enum built_in_function): In between
BEGIN_CHKP_BUILTINS and END_CHKP_BUILTINS only define enum values
for builtins that use DEF_BUILTIN_CHKP macro.

--- gcc/builtins.def.jj 2015-01-15 23:39:10.0 +0100
+++ gcc/builtins.def2015-01-27 15:04:44.860924664 +0100
@@ -63,6 +63,16 @@ along with GCC; see the file COPYING3.
 
The builtins is registered only if COND is true.  */
 
+/* A macro for builtins where the
+   BUILT_IN_*_CHKP = BUILT_IN_* + BEGIN_CHKP_BUILTINS + 1
+   enums should be defined too.  */
+#ifndef DEF_BUILTIN_CHKP
+#define DEF_BUILTIN_CHKP(ENUM, NAME, CLASS, TYPE, LIBTYPE, BOTH_P, \
+FALLBACK_P, NONANSI_P, ATTRS, IMPLICIT, COND)  \
+  DEF_BUILTIN(ENUM, NAME, CLASS, TYPE, LIBTYPE, BOTH_P, FALLBACK_P,\
+ NONANSI_P, ATTRS, IMPLICIT, COND)
+#endif
+
 /* A GCC builtin (like __builtin_saveregs) is provided by the
compiler, but does not correspond to a function in the standard
library.  */
@@ -87,6 +97,10 @@ along with GCC; see the file COPYING3.
 #define DEF_LIB_BUILTIN(ENUM, NAME, TYPE, ATTRS)   \
   DEF_BUILTIN (ENUM, __builtin_ NAME, BUILT_IN_NORMAL, TYPE, TYPE,   \
   true, true, false, ATTRS, true, true)
+#undef DEF_LIB_BUILTIN_CHKP
+#define DEF_LIB_BUILTIN_CHKP(ENUM, NAME, TYPE, ATTRS)  \
+  DEF_BUILTIN_CHKP (ENUM, __builtin_ NAME, BUILT_IN_NORMAL, TYPE,\
+   TYPE, true, true, false, ATTRS, true, true)
 
 /* Like DEF_LIB_BUILTIN, except that the function is not one that is
specified by ANSI/ISO C.  So, when we're being fully conformant we
@@ -96,6 +110,10 @@ along with GCC; see the file COPYING3.
 #define DEF_EXT_LIB_BUILTIN(ENUM, NAME, TYPE, ATTRS)   \
   DEF_BUILTIN (ENUM, __builtin_ NAME, BUILT_IN_NORMAL, TYPE, TYPE,   \
   true, true, true, ATTRS, false, true)
+#undef DEF_EXT_LIB_BUILTIN_CHKP
+#define DEF_EXT_LIB_BUILTIN_CHKP(ENUM, NAME, TYPE, ATTRS)  \
+  DEF_BUILTIN_CHKP (ENUM, __builtin_ NAME, BUILT_IN_NORMAL, TYPE,\
+   TYPE, true, true, true, ATTRS, false, true)
 
 /* Like DEF_LIB_BUILTIN, except that the function is only a part of
the standard in C94 or above.  */
@@ -199,8 +217,8 @@ along with GCC; see the file COPYING3.
 /* Builtin used by the implementation of Pointer Bounds Checker.  */
 #undef DEF_CHKP_BUILTIN
 #define DEF_CHKP_BUILTIN(ENUM, NAME, TYPE, ATTRS) \
-  DEF_BUILTIN (ENUM, __builtin_ NAME, BUILT_IN_NORMAL, TYPE, TYPE,\
-  true, true, false, ATTRS, true, true)
+  DEF_BUILTIN_CHKP (ENUM, __builtin_ NAME, BUILT_IN_NORMAL, TYPE,\
+   TYPE, true, true, false, ATTRS, true, true)
 
 /* Define an attribute list for math functions that are normally
impure because some of them may write into global memory for
@@ -595,22 +613,22 @@ DEF_EXT_LIB_BUILTIN(BUILT_IN_BZERO,
 DEF_EXT_LIB_BUILTIN(BUILT_IN_INDEX, index, 
BT_FN_STRING_CONST_STRING_INT, ATTR_PURE_NOTHROW_NONNULL_LEAF)
 DEF_LIB_BUILTIN(BUILT_IN_MEMCHR, memchr, 
BT_FN_PTR_CONST_PTR_INT_SIZE, ATTR_PURE_NOTHROW_NONNULL_LEAF)
 DEF_LIB_BUILTIN(BUILT_IN_MEMCMP, memcmp, 
BT_FN_INT_CONST_PTR_CONST_PTR_SIZE, ATTR_PURE_NOTHROW_NONNULL_LEAF)
-DEF_LIB_BUILTIN(BUILT_IN_MEMCPY, memcpy, 
BT_FN_PTR_PTR_CONST_PTR_SIZE, ATTR_RET1_NOTHROW_NONNULL_LEAF)
-DEF_LIB_BUILTIN(BUILT_IN_MEMMOVE, memmove, 

Re: RFA: patch to fix a bad code generation for PR64110 -- new constraints addition

2015-01-27 Thread Jeff Law

On 01/27/15 07:08, Richard Sandiford wrote:


Yeah, but in practice that's only ever going to be a partial transition.
Many port maintainers won't look at this, so we'll have to support both
versions indefinitely, even if the new behaviour turns out to be the
best for all cases.
Yes, most likely.   I find myself pondering the related question of how 
we get ports to transition to LRA and if we could tie these together. 
Maintainers are going to need to transition to LRA if we're ever going 
to start removing blobs of reload.  As a part of that transition they're 
presumably going to be looking closely at their backend and could make 
the constraint transition.


In an ideal world, we'd declare release X.Y has a cut-off point.  Ports 
that haven't transitioned to LRA get deprecated at that point.  Those 
ports are the ones most likely not to make the constraint transition as 
well.  I think we would have to consider any uses of ?! that remain 
after that point as intentional.




I just think we're going to regret having two sets of constraints with
such subtly different meanings.
But isn't that inevitable?  While I suspect that most instances of ?! 
should be converted, there may be some that should not.  If that's the 
case then we're going to have both forever.





Looking back at the original PR, Jakub said:

   The ! has been added by me for PR63594, so it isn't there from the era
   when i?86 backend was using reload.  If there is a better way to
   express that RA should prefer to use memory or xmm register and only
   use r constraint if it already is in a r register and doesn't need to
   be reloaded, I can use that.  Whether it is ?, ??? or something else.
   ! description in gcc docs just fitted most what I wanted...

In some ways this seems to match the intention of *.  Originally I think
it was just an RA-only thing and was ignored by reload, but LRA does take it
into account too (which sounds like progress to me).

If I revert the patch locally and change the *vec_dupmode pattern to
use *, it passes both the test for PR64110 and the tests for PR63594.
Would that be OK as an alternative?

I think that's up to Uros and Jakub to sort out.

Jeff


Re: [PATCH] Fix ada bootstrap under cygwin-64

2015-01-27 Thread Arnaud Charlet
 this patch fixes the ada bootstrap under cygwin-64.
 
 Boot-strapped under x86_64-pc-cygwin.
 OK for trunk?

OK


Re: [RFC PATCH] Avoid most of the BUILT_IN_*_CHKP enum values

2015-01-27 Thread Ilya Enkovich
2015-01-27 17:27 GMT+03:00 Jakub Jelinek ja...@redhat.com:
 Hi!

 I've grepped for BUILT_IN_.*_CHKP in the sources and we actually need
 far fewer enum values than the 1204 that are being defined.

 This patch requires builtins.def to say explicitly (by using
 DEF_*BUILTIN_CHKP macro instead of corresponding DEF_*BUILTIN) which
 ones need that, for all the others only space in the enum is reserved and
 nothing else.

 I'd hope this could work around the buggy AIX stabs handling, but even
 on say x86_64-linux it has a benefit of decreasing cc1plus .debug_info
 by about 2.7MB (of course, with dwz that benefit goes to almost nothing,
 just the ~ 7000 bytes or so, plus .debug_str cost (that is merged even
 without dwz between TUs).  The cost without dwz is obviously mainly
 from repeating that in most of the translation units.  But why declare
 BUILT_IN_*_CHKP enums that are never used by anything...

Enum values not mentioned in the code are not fully useless.  When we
have builtin functions defined as 'always_inline' functions, they are
instrumented and enum names may be used in dumps and debugging.
That's not a big value though.  Thanks a lot for taking care of it!

Ilya

 2015-01-27  Jakub Jelinek  ja...@redhat.com

 * builtins.def (DEF_BUILTIN_CHKP): Define if not defined.
 (DEF_LIB_BUILTIN_CHKP, DEF_EXT_LIB_BUILTIN_CHKP): Redefine.
 (DEF_CHKP_BUILTIN): Define using DEF_BUILTIN_CHKP instead
 of DEF_BUILTIN.
 (BUILT_IN_MEMCPY, BUILT_IN_MEMMOVE, BUILT_IN_MEMSET, BUILT_IN_STRCAT,
 BUILT_IN_STRCHR, BUILT_IN_STRCPY, BUILT_IN_STRLEN): Use
 DEF_LIB_BUILTIN_CHKP macro instead of DEF_LIB_BUILTIN.
 (BUILT_IN_MEMCPY_CHK, BUILT_IN_MEMMOVE_CHK, BUILT_IN_MEMPCPY_CHK,
 BUILT_IN_MEMPCPY, BUILT_IN_MEMSET_CHK, BUILT_IN_STPCPY_CHK,
 BUILT_IN_STPCPY, BUILT_IN_STRCAT_CHK, BUILT_IN_STRCPY_CHK): Use
 DEF_EXT_LIB_BUILTIN_CHKP macro instead of DEF_EXT_LIB_BUILTIN.
 * tree-core.h (enum built_in_function): In between
 BEGIN_CHKP_BUILTINS and END_CHKP_BUILTINS only define enum values
 for builtins that use DEF_BUILTIN_CHKP macro.

 --- gcc/builtins.def.jj 2015-01-15 23:39:10.0 +0100
 +++ gcc/builtins.def2015-01-27 15:04:44.860924664 +0100
 @@ -63,6 +63,16 @@ along with GCC; see the file COPYING3.

 The builtins is registered only if COND is true.  */

 +/* A macro for builtins where the
 +   BUILT_IN_*_CHKP = BUILT_IN_* + BEGIN_CHKP_BUILTINS + 1
 +   enums should be defined too.  */
 +#ifndef DEF_BUILTIN_CHKP
 +#define DEF_BUILTIN_CHKP(ENUM, NAME, CLASS, TYPE, LIBTYPE, BOTH_P, \
 +FALLBACK_P, NONANSI_P, ATTRS, IMPLICIT, COND)  \
 +  DEF_BUILTIN(ENUM, NAME, CLASS, TYPE, LIBTYPE, BOTH_P, FALLBACK_P,\
 + NONANSI_P, ATTRS, IMPLICIT, COND)
 +#endif
 +
  /* A GCC builtin (like __builtin_saveregs) is provided by the
 compiler, but does not correspond to a function in the standard
 library.  */
 @@ -87,6 +97,10 @@ along with GCC; see the file COPYING3.
  #define DEF_LIB_BUILTIN(ENUM, NAME, TYPE, ATTRS)   \
DEF_BUILTIN (ENUM, __builtin_ NAME, BUILT_IN_NORMAL, TYPE, TYPE,   \
true, true, false, ATTRS, true, true)
 +#undef DEF_LIB_BUILTIN_CHKP
 +#define DEF_LIB_BUILTIN_CHKP(ENUM, NAME, TYPE, ATTRS)  \
 +  DEF_BUILTIN_CHKP (ENUM, __builtin_ NAME, BUILT_IN_NORMAL, TYPE,\
 +   TYPE, true, true, false, ATTRS, true, true)

  /* Like DEF_LIB_BUILTIN, except that the function is not one that is
 specified by ANSI/ISO C.  So, when we're being fully conformant we
 @@ -96,6 +110,10 @@ along with GCC; see the file COPYING3.
  #define DEF_EXT_LIB_BUILTIN(ENUM, NAME, TYPE, ATTRS)   \
DEF_BUILTIN (ENUM, __builtin_ NAME, BUILT_IN_NORMAL, TYPE, TYPE,   \
true, true, true, ATTRS, false, true)
 +#undef DEF_EXT_LIB_BUILTIN_CHKP
 +#define DEF_EXT_LIB_BUILTIN_CHKP(ENUM, NAME, TYPE, ATTRS)  \
 +  DEF_BUILTIN_CHKP (ENUM, __builtin_ NAME, BUILT_IN_NORMAL, TYPE,\
 +   TYPE, true, true, true, ATTRS, false, true)

  /* Like DEF_LIB_BUILTIN, except that the function is only a part of
 the standard in C94 or above.  */
 @@ -199,8 +217,8 @@ along with GCC; see the file COPYING3.
  /* Builtin used by the implementation of Pointer Bounds Checker.  */
  #undef DEF_CHKP_BUILTIN
  #define DEF_CHKP_BUILTIN(ENUM, NAME, TYPE, ATTRS) \
 -  DEF_BUILTIN (ENUM, __builtin_ NAME, BUILT_IN_NORMAL, TYPE, TYPE,\
 -  true, true, false, ATTRS, true, true)
 +  DEF_BUILTIN_CHKP (ENUM, __builtin_ NAME, BUILT_IN_NORMAL, TYPE,\
 +   TYPE, true, true, false, ATTRS, true, true)

  /* Define an attribute list for math functions that are normally
 impure because some of them may write into global memory for
 @@ -595,22 +613,22 @@ DEF_EXT_LIB_BUILTIN(BUILT_IN_BZERO,
  DEF_EXT_LIB_BUILTIN(BUILT_IN_INDEX, index, 
 BT_FN_STRING_CONST_STRING_INT, ATTR_PURE_NOTHROW_NONNULL_LEAF)
  

[PATCH] Add comdat_group effective target (PR bootstrap/64612)

2015-01-27 Thread Jakub Jelinek
Hi!

This patch introduces a new effective target check and adds it to the pr64612.C
- if comdat groups aren't used, there is no guarantee that the D2 dtor will
be emitted always alongside of D1 dtor.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2015-01-27  Jakub Jelinek  ja...@redhat.com

PR bootstrap/64612
* lib/target-supports.exp (check_effective_target_comdat_group): New.
* g++.dg/ipa/pr64612.C: Guard scan-assembler test with
{ target comdat_group }.

* doc/sourcebuild.texi (comdat_group): Document.

--- gcc/testsuite/lib/target-supports.exp.jj2015-01-15 23:39:06.0 
+0100
+++ gcc/testsuite/lib/target-supports.exp   2015-01-26 15:24:55.325236098 
+0100
@@ -6198,3 +6198,13 @@ proc check_effective_target_pie_copyrelo
 
 return $pie_copyreloc_available_saved
 }
+
+# Return 1 if the target uses comdat groups.
+
+proc check_effective_target_comdat_group {} {
+return [check_no_messages_and_pattern comdat_group 
\.section\[^\n\r]*,comdat assembly {
+   // C++
+   inline int foo () { return 1; }
+   int (*fn) () = foo;
+}]
+}
--- gcc/testsuite/g++.dg/ipa/pr64612.C.jj   2015-01-26 15:25:43.301410027 
+0100
+++ gcc/testsuite/g++.dg/ipa/pr64612.C  2015-01-26 15:23:11.380025863 +0100
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options -O3 -std=c++11 } */
-/* { dg-final { scan-assembler _ZN5QListI7QStringED1Ev } } */
+/* { dg-final { scan-assembler _ZN5QListI7QStringED1Ev { target comdat_group 
} } } */
 
 class A
 {
--- gcc/doc/sourcebuild.texi.jj 2015-01-15 23:39:02.0 +0100
+++ gcc/doc/sourcebuild.texi2015-01-27 16:07:37.504081520 +0100
@@ -1930,6 +1930,9 @@ Target supports @code{wchar_t} that is c
 
 @item wchar_t_char32_t_compatible
 Target supports @code{wchar_t} that is compatible with @code{char32_t}.
+
+@item comdat_group
+Target uses comdat groups.
 @end table
 
 @subsubsection Local to tests in @code{gcc.target/i386}

Jakub


[PATCH PR64809]

2015-01-27 Thread Yuri Rumyantsev
Hi All,

Here is a simple patch that cures ICE - skip debug gimples.
Test is also included.

Bootstrap and regression testing did not show any new failures.

Is it OK for trunk?

ChangeLog:

2015-01-27  Yuri Rumyantsev  ysrum...@gmail.com

PR tree-optimization/64809
* cfgexpand.c (reorder_operands): Skip debug gimples.

gcc/testsuite/ChangeLog

* gcc.dg/pr64809.c: New test.


patch
Description: Binary data


Re: [Patch, Fortran] PR64771 - Fix coarray ICE

2015-01-27 Thread Rainer Orth
Steve Kargl s...@troutmask.apl.washington.edu writes:

 On Sat, Jan 24, 2015 at 06:13:04PM +0100, Tobias Burnus wrote:
if (s1-as-type == AS_EXPLICIT)
 -for (i = 0; i  s1-as-rank + s1-as-corank; i++)
 +for (i = 0; i  s1-as-rank + std::max(0, s1-as-corank-1); i++)

 Doesn't this require '#include algorithms'?
 I suspect that you are depending on namespace pollution
 via some other header (coretypes.h?).

It was committed with that change, which unfortunately broke Solaris
bootstrap:

In file included from ./config.h:6:0,
 from /vol/gcc/src/hg/trunk/local/gcc/fortran/interface.c:68:
./auto-host.h:2055:0: error: _FILE_OFFSET_BITS redefined [-Werror]
 #define _FILE_OFFSET_BITS 64
 ^
In file included from /usr/include/iso/stdlib_iso.h:24:0,
 from /usr/include/stdlib.h:11,
 from 
/var/gcc/regression/trunk/11-gcc/build/prev-i386-pc-solaris2.11/libstdc++-v3/include/cstdlib:72,
 from 
/var/gcc/regression/trunk/11-gcc/build/prev-i386-pc-solaris2.11/libstdc++-v3/include/bits/stl_algo.h:59,
 from 
/var/gcc/regression/trunk/11-gcc/build/prev-i386-pc-solaris2.11/libstdc++-v3/include/algorithm:62,
 from /vol/gcc/src/hg/trunk/local/gcc/fortran/interface.c:66:
/var/gcc/regression/trunk/11-gcc/build/prev-gcc/include-fixed/sys/feature_tests.h:213:0:
 note: this is the location of the previous definition
 #define _FILE_OFFSET_BITS 32
 ^

The problem is (as so often) that algorithm was included *before*
config.h.  Moving it after the other includes allows interface.c to
compile without warnings.

Ok for mainline?

Rainer


2015-01-27  Rainer Orth  r...@cebitec.uni-bielefeld.de

gcc/fortran:
* interface.c: Include algorithm after config.h

diff --git a/gcc/fortran/interface.c b/gcc/fortran/interface.c
--- a/gcc/fortran/interface.c
+++ b/gcc/fortran/interface.c
@@ -63,8 +63,6 @@ along with GCC; see the file COPYING3.  
formal argument list points to symbols within the same namespace as
the program unit name.  */
 
-#include algorithm  /* For std::max.  */
-
 #include config.h
 #include system.h
 #include coretypes.h
@@ -73,6 +71,8 @@ along with GCC; see the file COPYING3.  
 #include match.h
 #include arith.h
 
+#include algorithm  /* For std::max.  */
+
 /* The current_interface structure holds information about the
interface currently being parsed.  This structure is saved and
restored during recursive interfaces.  */

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [Patch, Fortran] PR64771 - Fix coarray ICE

2015-01-27 Thread Jakub Jelinek
On Tue, Jan 27, 2015 at 03:55:17PM +0100, Rainer Orth wrote:
 Steve Kargl s...@troutmask.apl.washington.edu writes:
 
  On Sat, Jan 24, 2015 at 06:13:04PM +0100, Tobias Burnus wrote:
 if (s1-as-type == AS_EXPLICIT)
  -  for (i = 0; i  s1-as-rank + s1-as-corank; i++)
  +  for (i = 0; i  s1-as-rank + std::max(0, s1-as-corank-1); i++)
 
  Doesn't this require '#include algorithms'?
  I suspect that you are depending on namespace pollution
  via some other header (coretypes.h?).
 
 It was committed with that change, which unfortunately broke Solaris
 bootstrap:
 
 In file included from ./config.h:6:0,
  from /vol/gcc/src/hg/trunk/local/gcc/fortran/interface.c:68:
 ./auto-host.h:2055:0: error: _FILE_OFFSET_BITS redefined [-Werror]
  #define _FILE_OFFSET_BITS 64
  ^
 In file included from /usr/include/iso/stdlib_iso.h:24:0,
  from /usr/include/stdlib.h:11,
  from 
 /var/gcc/regression/trunk/11-gcc/build/prev-i386-pc-solaris2.11/libstdc++-v3/include/cstdlib:72,
  from 
 /var/gcc/regression/trunk/11-gcc/build/prev-i386-pc-solaris2.11/libstdc++-v3/include/bits/stl_algo.h:59,
  from 
 /var/gcc/regression/trunk/11-gcc/build/prev-i386-pc-solaris2.11/libstdc++-v3/include/algorithm:62,
  from /vol/gcc/src/hg/trunk/local/gcc/fortran/interface.c:66:
 /var/gcc/regression/trunk/11-gcc/build/prev-gcc/include-fixed/sys/feature_tests.h:213:0:
  note: this is the location of the previous definition
  #define _FILE_OFFSET_BITS 32
  ^
 
 The problem is (as so often) that algorithm was included *before*
 config.h.  Moving it after the other includes allows interface.c to
 compile without warnings.

Why don't you use MAX macro instead of std::max as everywhere else
in the gcc sources?

Your change is wrong, you can't include system headers after including
system.h and other headers.

Jakub


Re: [PATCH][AArch64] Improve bit-test-branch pattern to avoid unnecessary register clobber

2015-01-27 Thread Marcus Shawcroft
On 27 January 2015 at 14:31, Jiong Wang jiong.w...@arm.com wrote:

 2015-01-19  Ramana Radhakrishnan  ramana.radhakrish...@arm.com
 Jiong Wang  jiong.w...@arm.com

   gcc/
 * config/aarch64/aarch64.md (tboptabmode1): Clobber CC reg instead
 of scratch reg.
 (cboptabmode1): Likewise.
 * config/aarch64/iterators.md (bcond): New define_code_attr.

OK /Marcus


   gcc/testsuite/
 * gcc.dg/long_branch.c: New testcase.


Re: [PATCH][AArch64] Improve bit-test-branch pattern to avoid unnecessary register clobber

2015-01-27 Thread Jiong Wang

On 19/01/15 10:58, Jakub Jelinek wrote:


On Mon, Jan 19, 2015 at 10:52:14AM +, Ramana Radhakrishnan wrote:

What is aarch64 specific on the testcase?


The number of if-then-else's required to get the compiler to generate
cmp branch sequences rather than the tbnz instruction.

That doesn't mean the same testcase couldn't be tested on other targets and
perhaps find bugs in there.
That said, if the testcase is too expensive to compile (several seconds is
ok, minutes is not), then perhaps it shouldn't be included at all, or should
be guarded with run_expensive_tests target.

Jakub



testcase changed to execution version, and moved to gcc.dg. the compile time 
only
take several seconds. (previously I am using cc1 built by O0 which at most take 
24s)

ok to install?

Thanks.

2015-01-19  Ramana Radhakrishnan  ramana.radhakrish...@arm.com
Jiong Wang  jiong.w...@arm.com

  gcc/
* config/aarch64/aarch64.md (tboptabmode1): Clobber CC reg instead of 
scratch reg.
(cboptabmode1): Likewise.
* config/aarch64/iterators.md (bcond): New define_code_attr.

  gcc/testsuite/
* gcc.dg/long_branch.c: New testcase.
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 597ff8c..1e00396 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -466,13 +466,17 @@
 		   (const_int 0))
 	 (label_ref (match_operand 2  ))
 	 (pc)))
-   (clobber (match_scratch:DI 3 =r))]
+   (clobber (reg:CC CC_REGNUM))]
   
-  *
-  if (get_attr_length (insn) == 8)
-return \ubfx\\t%w3, %w0, %1, #1\;cbz\\t%w3, %l2\;
-  return \tbz\\t%w0, %1, %l2\;
-  
+  {
+if (get_attr_length (insn) == 8)
+  {
+	operands[1] = GEN_INT (HOST_WIDE_INT_1U  UINTVAL (operands[1]));
+	return tst\t%w0, %1\;bcond\t%l2;
+  }
+else
+  return tbz\t%w0, %1, %l2;
+  }
   [(set_attr type branch)
(set (attr length)
 	(if_then_else (and (ge (minus (match_dup 2) (pc)) (const_int -32768))
@@ -486,13 +490,21 @@
  (const_int 0))
 			   (label_ref (match_operand 1  ))
 			   (pc)))
-   (clobber (match_scratch:DI 2 =r))]
+   (clobber (reg:CC CC_REGNUM))]
   
-  *
-  if (get_attr_length (insn) == 8)
-return \ubfx\\t%w2, %w0, sizem1, #1\;cbz\\t%w2, %l1\;
-  return \tbz\\t%w0, sizem1, %l1\;
-  
+  {
+if (get_attr_length (insn) == 8)
+  {
+	char buf[64];
+	uint64_t val = ((uint64_t ) 1)
+			 (GET_MODE_SIZE (MODEmode) * BITS_PER_UNIT - 1);
+	sprintf (buf, tst\t%%w0, %PRId64, val);
+	output_asm_insn (buf, operands);
+	return bcond\t%l1;
+  }
+else
+  return tbz\t%w0, sizem1, %l1;
+  }
   [(set_attr type branch)
(set (attr length)
 	(if_then_else (and (ge (minus (match_dup 1) (pc)) (const_int -32768))
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 7dd3917..bd144f9 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -823,6 +823,9 @@
 		  (smax s) (umax u)
 		  (smin s) (umin u)])
 
+;; Emit conditional branch instructions.
+(define_code_attr bcond [(eq beq) (ne bne) (lt bne) (ge beq)])
+
 ;; Emit cbz/cbnz depending on comparison type.
 (define_code_attr cbz [(eq cbz) (ne cbnz) (lt cbnz) (ge cbz)])
 
diff --git a/gcc/testsuite/gcc.dg/long_branch.c b/gcc/testsuite/gcc.dg/long_branch.c
new file mode 100644
index 000..f388a80
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/long_branch.c
@@ -0,0 +1,198 @@
+/* { dg-do run } */
+/* { dg-options -O2 -fno-reorder-blocks } */
+
+void abort ();
+
+__attribute__((noinline, noclone)) int
+restore (int a, int b)
+{
+  return a * b;
+}
+
+__attribute__((noinline, noclone)) void
+do_nothing (int *input)
+{
+  *input = restore (*input, 1);
+  return;
+}
+
+#define CASE_ENTRY(n) \
+  case n: \
+sum = sum / (n + 1); \
+sum = restore (sum, n + 1); \
+if (sum == (n + addend)) \
+  break;\
+sum = sum / (n + 2); \
+sum = restore (sum, n + 2); \
+sum = sum / (n + 3); \
+sum = restore (sum, n + 3); \
+sum = sum / (n + 4); \
+sum = restore (sum, n + 4); \
+sum = sum / (n + 5); \
+sum = restore (sum, n + 5); \
+sum = sum / (n + 6); \
+sum = restore (sum, n + 6); \
+sum = sum / (n + 7); \
+sum = restore (sum, n + 7); \
+sum = sum / (n + 8); \
+sum = restore (sum, n + 8); \
+sum = sum / (n + 9); \
+sum = restore (sum, n + 9); \
+sum = sum / (n + 10); \
+sum = restore (sum, n + 10); \
+sum = sum / (n + 11); \
+sum = restore (sum, n + 11); \
+sum = sum / (n + 12); \
+sum = restore (sum, n + 12); \
+sum = sum / (n + 13); \
+sum = restore (sum, n + 13); \
+sum = sum / (n + 14); \
+sum = restore (sum, n + 14); \
+sum = sum / (n + 15); \
+sum = restore (sum, n + 15); \
+sum = sum / (n + 16); \
+sum = restore (sum, n + 16); \
+sum = sum / (n + 17); \
+sum = restore (sum, n + 17); \
+sum = sum / (n + 18); \
+sum = restore (sum, n + 18); \
+sum = sum / (n + 19); \
+sum = restore (sum, n + 19); 

[PATCH] Fix ada bootstrap under cygwin-64

2015-01-27 Thread Bernd Edlinger
Hi,


this patch fixes the ada bootstrap under cygwin-64.

Boot-strapped under x86_64-pc-cygwin.
OK for trunk?


Thanks
Bernd.
  2015-01-27  Bernd Edlinger  bernd.edlin...@hotmail.de

Fix build under cygwin/64.
* adaint.h: Add check for __CYGWIN__.
* mingw32.h: Prevent windows.h from including x86intrin.h in GCC.



patch-ada-cygwin64.diff
Description: Binary data


Re: RFA: patch to fix a bad code generation for PR64110 -- new constraints addition

2015-01-27 Thread Vladimir Makarov
On 01/27/2015 12:11 PM, Richard Sandiford wrote:
 Vladimir Makarov vmaka...@redhat.com writes:
 On 01/27/2015 09:08 AM, Richard Sandiford wrote:
 Yeah, but in practice that's only ever going to be a partial transition.
 Many port maintainers won't look at this, so we'll have to support both
 versions indefinitely, even if the new behaviour turns out to be the
 best for all cases.

 I just think we're going to regret having two sets of constraints with
 such subtly different meanings.

 Looking back at the original PR, Jakub said:

   The ! has been added by me for PR63594, so it isn't there from the era
   when i?86 backend was using reload.  If there is a better way to
   express that RA should prefer to use memory or xmm register and only
   use r constraint if it already is in a r register and doesn't need to
   be reloaded, I can use that.  Whether it is ?, ??? or something else.
   ! description in gcc docs just fitted most what I wanted...

 In some ways this seems to match the intention of *.  Originally I think
 it was just an RA-only thing and was ignored by reload, but LRA does take it
 into account too (which sounds like progress to me).
   I guess we don't need '*' in many cases.  It is overused.  Imho, IRA
 should decide what class is better based on costs of alternatives and
 the explicit exclusion of register class by using '*' is a bad practice.

   Saying that I believe we should do register class preferrencing
 algorithm more alternative oriented.  The algorithm should choose first
 an alternative (of may be subset of alternatives) and then register
 classes.  I think it is more logical.  It would permits us to rid off
 all such constraints including '*' and use only one like '?' which
 increases the alternative cost.

   In perspective it is even better to rid of '?' too and have some hook
 (or attribute) to get insn alternative costs which can be depended on
 sub-target or other run-time characteristics.  Otherwise we need to
 duplicate insn descriptions and put different insn guards.  I am going
 to work on this.  But it is hard to say will it work well (may be I have
 some performance issues with this).  This hook somehow (min or average
 of the values for all alternatives) can be used in combiner and other
 algorithms need an insn cost. That is how I see the solution of the
 problem in a long perspective.
 Definitely agree that it'd be better to remove these constraints
 in favour of a new attribute.  preferred_for_size and preferred_for_speed
 give something similar, though they're much more stringent than what
 we need here.

 If I revert the patch locally and change the *vec_dupmode pattern to
 use *, it passes both the test for PR64110 and the tests for PR63594.
 Would that be OK as an alternative?

   I don't think it will work in general case.  It probably works because
 a different class is chosen in IRA.  If IRA for some reasons choose the
 same class, we might see the same problem in LRA.
 But isn't that the point of '*'?  It should stop IRA from using the 'r'
 alternative as an indication that 'r' is a good choice for this instruction.
 If IRA chooses 'r' anyway, it must be because other instructions that
 use the same allocno strongly prefer 'r'.

 And in those some circumstances -- i.e. if IRA does choose 'r' despite
 the constraints in this instruction -- then I think we do want to use the
 'r' alternative.  And AIUI that's also what the new constraint is designed
 to do.  If IRA chooses 'r' anyway, the new constraint causes LRA to prefer
 the 'r' alternative _even if_ another operand (the destination) has to
 be reloaded, which is the fundamental difference between the new constraint
 and '!'.

 So I'm still not sure why '*' wouldn't do what we want.
Frequently use of '*' (and sometimes '!' for reload) means that we need
splitting for this alternative probably into 2/3 insns.  Instead of '*'
use we would need to set up costs of all these insns.  I believe just
ignoring the class with '*' is wrong.  There are some cases where we
need '*' to avoid definitely this reg class, e.g. mmx when we use other
classes for fp values.  But I guess this solution is not reliable and
without the constraints we could set the alternative cost very high to
have a reliable right solution.

  I also don't like when register classes are excluded by '*' for IRA
 (see my thoughts above).
 Understood, and I agree it would be good to move to attributes.
 But in a way, I think that's an even better reason to try to avoid
 adding these new constraints.  It sounds like we're hoping to get rid
 of them as soon as we've added them :-)


Sometimes to get rid off, you should add more :)

But to be serious, what I wrote can not be implemented for GCC-5.0 (and
the generated code performance is still unknown for the proposed
approach).  I believe the current solution is more reliable than using
'*'.   Ridding off the new constraints will be much much smaller problem
than ridding of other constraints.




Re: [PATCH][RFA][PR target/15184] Partial fix for direct byte access on x86

2015-01-27 Thread Jeff Law

On 01/27/15 14:21, Segher Boessenkool wrote:

On Tue, Jan 27, 2015 at 01:53:34PM -0700, Jeff Law wrote:

I do have a specific PR in mind, but I cannot currently find it.  It was
about x86, dec mem and then using the flags...  Must have sent 100 emails
in that thread...  And cannot find it now!

Are you referring to 61225?


That is the one, thanks.

It's not going to help 61225.  The key insns in 61225 are:

(insn 6 3 7 2 (set (reg:SI 91 [ *x_3(D) ])
(mem:SI (reg/v/f:SI 90 [ x ]) [1 *x_3(D)+0 S4 A32])) k.c:11 90 
{*movsi_internal}

 (nil))
(insn 7 6 8 2 (parallel [
(set (reg:SI 88 [ D.1494 ])
(plus:SI (reg:SI 91 [ *x_3(D) ])
(const_int -1 [0x])))
(clobber (reg:CC 17 flags))
]) k.c:11 220 {*addsi_1}
 (expr_list:REG_DEAD (reg:SI 91 [ *x_3(D) ])
(expr_list:REG_UNUSED (reg:CC 17 flags)
(expr_list:REG_EQUAL (plus:SI (mem:SI (reg/v/f:SI 90 [ x ]) 
[1 *x_3(D)+0 S4 A32])

(const_int -1 [0x]))
(nil)
(insn 8 7 9 2 (set (mem:SI (reg/v/f:SI 90 [ x ]) [1 *x_3(D)+0 S4 A32])
(reg:SI 88 [ D.1494 ])) k.c:11 90 {*movsi_internal}
 (nil))
(insn 9 8 10 2 (set (reg:CCZ 17 flags)
(compare:CCZ (reg:SI 88 [ D.1494 ])
(const_int 0 [0]))) k.c:11 3 {*cmpsi_ccno_1}
 (expr_list:REG_DEAD (reg:SI 88 [ D.1494 ])
(nil)))


Note how REG:SI 88 has two uses.   Never do we pass a set of insns into 
try_combine that are useful to optimize this particular case (we never 
include insn #9 in any of the attempted combinations).


Even a bridge pattern isn't going to help here.


jeff








Re: [PATCH] Add comdat_group effective target (PR bootstrap/64612)

2015-01-27 Thread Mike Stump
On Jan 27, 2015, at 7:10 AM, Jakub Jelinek ja...@redhat.com wrote:
 
 This patch introduces a new effective target check and adds it to the 
 pr64612.C
 - if comdat groups aren't used, there is no guarantee that the D2 dtor will
 be emitted always alongside of D1 dtor.
 
 Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.

Re: [RFC PATCH] Avoid most of the BUILT_IN_*_CHKP enum values

2015-01-27 Thread Jakub Jelinek
On Tue, Jan 27, 2015 at 06:04:53PM +0300, Ilya Enkovich wrote:
 2015-01-27 17:27 GMT+03:00 Jakub Jelinek ja...@redhat.com:
  I've grepped for BUILT_IN_.*_CHKP in the sources and we actually need
  far fewer enum values than the 1204 that are being defined.
 
  This patch requires builtins.def to say explicitly (by using
  DEF_*BUILTIN_CHKP macro instead of corresponding DEF_*BUILTIN) which
  ones need that, for all the others only space in the enum is reserved and
  nothing else.
 
  I'd hope this could work around the buggy AIX stabs handling, but even
  on say x86_64-linux it has a benefit of decreasing cc1plus .debug_info
  by about 2.7MB (of course, with dwz that benefit goes to almost nothing,
  just the ~ 7000 bytes or so, plus .debug_str cost (that is merged even
  without dwz between TUs).  The cost without dwz is obviously mainly
  from repeating that in most of the translation units.  But why declare
  BUILT_IN_*_CHKP enums that are never used by anything...
 
 Enum values not mentioned in the code are not fully useless.  When we
 have builtin functions defined as 'always_inline' functions, they are
 instrumented and enum names may be used in dumps and debugging.
 That's not a big value though.  Thanks a lot for taking care of it!

Note, patch successfully bootstrapped/regtested on x86_64-linux and
i686-linux, and David said that on AIX it passed stage1 cc1 linking.

Ok for trunk?

As for the enums, I doubt the pain is worth the trouble.
What perhaps could be done (apparently preexisting issue, because you
include builtins.def just once in built_in_names, would be to tweak
fprintf (file,  built-in %s:%s,
 built_in_class_names[(int) DECL_BUILT_IN_CLASS (node)],
 built_in_names[(int) DECL_FUNCTION_CODE (node)]);
so that if DECL_FUNCTION_CODE is in between
BEGIN_CHKP_BUILTINS and END_CHKP_BUILTINS you don't print (null) on glibc
there or crash (on various other hosts), but actually print
built_in_names[(int) DECL_FUNCTION_CODE (node) - (int) BEGIN_CHKP_BUILTINS - 1]
concatenated with _CHKP.

  2015-01-27  Jakub Jelinek  ja...@redhat.com
 
  * builtins.def (DEF_BUILTIN_CHKP): Define if not defined.
  (DEF_LIB_BUILTIN_CHKP, DEF_EXT_LIB_BUILTIN_CHKP): Redefine.
  (DEF_CHKP_BUILTIN): Define using DEF_BUILTIN_CHKP instead
  of DEF_BUILTIN.
  (BUILT_IN_MEMCPY, BUILT_IN_MEMMOVE, BUILT_IN_MEMSET, 
  BUILT_IN_STRCAT,
  BUILT_IN_STRCHR, BUILT_IN_STRCPY, BUILT_IN_STRLEN): Use
  DEF_LIB_BUILTIN_CHKP macro instead of DEF_LIB_BUILTIN.
  (BUILT_IN_MEMCPY_CHK, BUILT_IN_MEMMOVE_CHK, BUILT_IN_MEMPCPY_CHK,
  BUILT_IN_MEMPCPY, BUILT_IN_MEMSET_CHK, BUILT_IN_STPCPY_CHK,
  BUILT_IN_STPCPY, BUILT_IN_STRCAT_CHK, BUILT_IN_STRCPY_CHK): Use
  DEF_EXT_LIB_BUILTIN_CHKP macro instead of DEF_EXT_LIB_BUILTIN.
  * tree-core.h (enum built_in_function): In between
  BEGIN_CHKP_BUILTINS and END_CHKP_BUILTINS only define enum values
  for builtins that use DEF_BUILTIN_CHKP macro.
 
  --- gcc/builtins.def.jj 2015-01-15 23:39:10.0 +0100
  +++ gcc/builtins.def2015-01-27 15:04:44.860924664 +0100
  @@ -63,6 +63,16 @@ along with GCC; see the file COPYING3.
 
  The builtins is registered only if COND is true.  */
 
  +/* A macro for builtins where the
  +   BUILT_IN_*_CHKP = BUILT_IN_* + BEGIN_CHKP_BUILTINS + 1
  +   enums should be defined too.  */
  +#ifndef DEF_BUILTIN_CHKP
  +#define DEF_BUILTIN_CHKP(ENUM, NAME, CLASS, TYPE, LIBTYPE, BOTH_P, \
  +FALLBACK_P, NONANSI_P, ATTRS, IMPLICIT, COND)  \
  +  DEF_BUILTIN(ENUM, NAME, CLASS, TYPE, LIBTYPE, BOTH_P, FALLBACK_P,\
  + NONANSI_P, ATTRS, IMPLICIT, COND)
  +#endif
  +
   /* A GCC builtin (like __builtin_saveregs) is provided by the
  compiler, but does not correspond to a function in the standard
  library.  */
  @@ -87,6 +97,10 @@ along with GCC; see the file COPYING3.
   #define DEF_LIB_BUILTIN(ENUM, NAME, TYPE, ATTRS)   \
 DEF_BUILTIN (ENUM, __builtin_ NAME, BUILT_IN_NORMAL, TYPE, TYPE,   \
 true, true, false, ATTRS, true, true)
  +#undef DEF_LIB_BUILTIN_CHKP
  +#define DEF_LIB_BUILTIN_CHKP(ENUM, NAME, TYPE, ATTRS)  \
  +  DEF_BUILTIN_CHKP (ENUM, __builtin_ NAME, BUILT_IN_NORMAL, TYPE,\
  +   TYPE, true, true, false, ATTRS, true, true)
 
   /* Like DEF_LIB_BUILTIN, except that the function is not one that is
  specified by ANSI/ISO C.  So, when we're being fully conformant we
  @@ -96,6 +110,10 @@ along with GCC; see the file COPYING3.
   #define DEF_EXT_LIB_BUILTIN(ENUM, NAME, TYPE, ATTRS)   \
 DEF_BUILTIN (ENUM, __builtin_ NAME, BUILT_IN_NORMAL, TYPE, TYPE,   \
 true, true, true, ATTRS, false, true)
  +#undef DEF_EXT_LIB_BUILTIN_CHKP
  +#define DEF_EXT_LIB_BUILTIN_CHKP(ENUM, NAME, TYPE, ATTRS)  \
  +  DEF_BUILTIN_CHKP (ENUM, __builtin_ NAME, BUILT_IN_NORMAL, TYPE,\
  +

Re: [PATCH, PR tree-optimization/64277] Improve loop iterations count estimation

2015-01-27 Thread Ilya Enkovich
On 27 Jan 12:40, Ilya Enkovich wrote:
 Hi,
 
 This patch was supposed to fix PR tree-optimization/64277.  Tracker is now 
 fixed by warnings disabling but I think patch is still useful to avoid dead 
 code generated by complete unroll.
 
 Bootstrapped and tested on x86_64-unknown-linux-gnu.
 
 Thanks,
 Ilya
 --
 gcc/
 
 2015-01-27  Ilya Enkovich  ilya.enkov...@intel.com
 
   * tree-ssa-loop-niter.c (record_nonwrapping_iv): Use base
   range info when possible to refine estimation.
 
 gcc/testsuite/
 
 2015-01-27  Ilya Enkovich  ilya.enkov...@intel.com
 
   * gcc.dg/pr64277.c: New.
 
 

Here is a new version fixed according to comments in the tracker.  I also fixed 
a test to scan cunroll dumps.  Does it look OK?

What are possible branches for this patch?

Thanks,
Ilya
--
diff --git a/gcc/testsuite/gcc.dg/pr64277.c b/gcc/testsuite/gcc.dg/pr64277.c
new file mode 100644
index 000..c6ef331
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr64277.c
@@ -0,0 +1,23 @@
+/* PR tree-optimization/64277 */
+/* { dg-do compile } */
+/* { dg-options -O3 -Wall -Werror -fdump-tree-cunroll-details } */
+/* { dg-final { scan-tree-dump loop with 5 iterations completely unrolled 
cunroll } } */
+/* { dg-final { scan-tree-dump loop with 6 iterations completely unrolled 
cunroll } } */
+/* { dg-final { cleanup-tree-dump cunroll } } */
+
+int f1[10];
+void test1 (short a[], short m, unsigned short l)
+{
+  int i = l;
+  for (i = i + 5; i  m; i++)
+f1[i] = a[i]++;
+}
+
+void test2 (short a[], short m, short l)
+{
+  int i;
+  if (m  5)
+m = 5;
+  for (i = m; i  l; i--)
+f1[i] = a[i]++;
+}
diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
index 919f5c0..1cd297d 100644
--- a/gcc/tree-ssa-loop-niter.c
+++ b/gcc/tree-ssa-loop-niter.c
@@ -2754,6 +2754,7 @@ record_nonwrapping_iv (struct loop *loop, tree base, tree 
step, gimple stmt,
 {
   tree niter_bound, extreme, delta;
   tree type = TREE_TYPE (base), unsigned_type;
+  tree orig_base = base;
 
   if (TREE_CODE (step) != INTEGER_CST || integer_zerop (step))
 return;
@@ -2777,16 +2778,30 @@ record_nonwrapping_iv (struct loop *loop, tree base, 
tree step, gimple stmt,
 
   if (tree_int_cst_sign_bit (step))
 {
+  wide_int min, max;
   extreme = fold_convert (unsigned_type, low);
-  if (TREE_CODE (base) != INTEGER_CST)
+  if (TREE_CODE (orig_base) == SSA_NAME
+  TREE_CODE (high) == INTEGER_CST
+  INTEGRAL_TYPE_P (TREE_TYPE (orig_base))
+  get_range_info (orig_base, min, max) == VR_RANGE
+  wi::gts_p (wide_int (high), max))
+   base = wide_int_to_tree (unsigned_type, max);
+  else if (TREE_CODE (base) != INTEGER_CST)
base = fold_convert (unsigned_type, high);
   delta = fold_build2 (MINUS_EXPR, unsigned_type, base, extreme);
   step = fold_build1 (NEGATE_EXPR, unsigned_type, step);
 }
   else
 {
+  wide_int min, max;
   extreme = fold_convert (unsigned_type, high);
-  if (TREE_CODE (base) != INTEGER_CST)
+  if (TREE_CODE (orig_base) == SSA_NAME
+  TREE_CODE (low) == INTEGER_CST
+  INTEGRAL_TYPE_P (TREE_TYPE (orig_base))
+  get_range_info (orig_base, min, max) == VR_RANGE
+  wi::gts_p (min, wide_int (low)))
+   base = wide_int_to_tree (unsigned_type, min);
+  else if (TREE_CODE (base) != INTEGER_CST)
base = fold_convert (unsigned_type, low);
   delta = fold_build2 (MINUS_EXPR, unsigned_type, extreme, base);
 }


Re: [PATCH] Fix PR64277

2015-01-27 Thread Richard Biener
On Tue, 27 Jan 2015, Ilya Enkovich wrote:

 2015-01-27 12:47 GMT+03:00 Richard Biener rguent...@suse.de:
  On Tue, 27 Jan 2015, Jakub Jelinek wrote:
 
  On Tue, Jan 27, 2015 at 10:25:48AM +0100, Richard Biener wrote:
  
   This disables array-bound warnings from VRP2 as discussed.
  
   Bootstrapped and tested on x86_64-unknown-linux-gnu - ok for trunk?
 
  So nothing in the testsuite needed to change?  Nice.
 
  Yes.
 
  Ok for trunk.
 
   I'll search for duplicates and add a few testcases.
 
  Thanks.
 
  Committed as follows (first testcase in PR59124 not fixed - it warns
  from the first pass).
 
 Are you going to port it to 4.9 branch?

I plan to do that (4.8 as well) after some time.

Richard.

 Thanks,
 Ilya
 
 
  2015-01-27  Richard Biener  rguent...@suse.de
 
  PR tree-optimization/56273
  PR tree-optimization/59124
  PR tree-optimization/64277
  * tree-vrp.c (vrp_finalize): Emit array-bound warnings only
  from the first VRP pass.
 
  * g++.dg/warn/Warray-bounds-6.C: New testcase.
  * gcc.dg/Warray-bounds-12.c: Likewise.
  * gcc.dg/Warray-bounds-13.c: Likewise.
 
  Index: gcc/tree-vrp.c
  ===
  *** gcc/tree-vrp.c.orig 2015-01-27 10:34:26.453743828 +0100
  --- gcc/tree-vrp.c  2015-01-27 10:43:04.970610102 +0100
  *** vrp_finalize (void)
  *** 10229,10235 
  substitute_and_fold (op_with_constant_singleton_value_range,
 vrp_fold_stmt, false);
 
  !   if (warn_array_bounds)
check_all_array_refs ();
 
  /* We must identify jump threading opportunities before we release
  --- 10229,10235 
  substitute_and_fold (op_with_constant_singleton_value_range,
 vrp_fold_stmt, false);
 
  !   if (warn_array_bounds  first_pass_instance)
check_all_array_refs ();
 
  /* We must identify jump threading opportunities before we release
  Index: gcc/testsuite/g++.dg/warn/Warray-bounds-6.C
  ===
  *** /dev/null   1970-01-01 00:00:00.0 +
  --- gcc/testsuite/g++.dg/warn/Warray-bounds-6.C 2015-01-27 
  10:40:31.311871855 +0100
  ***
  *** 0 
  --- 1,26 
  + // { dg-do compile }
  + // { dg-options -O3 -Warray-bounds }
  +
  + struct type {
  + bool a, b;
  + bool get_b() { return b; }
  + };
  +
  + type stuff[9u];
  +
  + void bar();
  +
  + void foo()
  + {
  +   for(unsigned i = 0u; i  9u; i++)
  + {
  +   if(!stuff[i].a)
  +   continue;
  +
  +   bar();
  +
  +   for(unsigned j = i + 1u; j  9u; j++)
  +   if(stuff[j].a  stuff[j].get_b()) // { dg-bogus above array 
  bounds }
  + return;
  + }
  + }
  Index: gcc/testsuite/gcc.dg/Warray-bounds-12.c
  ===
  *** /dev/null   1970-01-01 00:00:00.0 +
  --- gcc/testsuite/gcc.dg/Warray-bounds-12.c 2015-01-27 
  10:40:58.196175989 +0100
  ***
  *** 0 
  --- 1,26 
  + /* { dg-do compile } */
  + /* { dg-options -O3 -Warray-bounds } */
  + /* { dg-additional-options -mssse3 { target x86_64-*-* i?86-*-* } } */
  +
  + void foo(short a[], short m)
  + {
  +   int i, j;
  +   int f1[10];
  +   short nc;
  +
  +   nc = m + 1;
  +   if (nc  3)
  + {
  +   for (i = 0; i = nc; i++)
  +   {
  + f1[i] = f1[i] + 1;
  +   }
  + }
  +
  +   for (i = 0, j = m; i  nc; i++, j--)
  + {
  +   a[i] = f1[i]; /* { dg-bogus above array bounds } */
  +   a[j] = i;
  + }
  +   return;
  + }
  Index: gcc/testsuite/gcc.dg/Warray-bounds-13.c
  ===
  *** /dev/null   1970-01-01 00:00:00.0 +
  --- gcc/testsuite/gcc.dg/Warray-bounds-13.c 2015-01-27 
  10:42:43.738369929 +0100
  ***
  *** 0 
  --- 1,18 
  + /* { dg-do compile } */
  + /* { dg-options -O3 -Warray-bounds } */
  +
  + extern char *bar[17];
  +
  + int foo(int argc, char **argv)
  + {
  +   int i;
  +   int n = 0;
  +
  +   for (i = 0; i  argc; i++)
  + n++;
  +
  +   for (i = 0; i  argc; i++)
  + argv[i] = bar[i + n]; /* { dg-bogus above array bounds } */
  +
  +   return 0;
  + }
 
 

-- 
Richard Biener rguent...@suse.de
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild,
Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)


[PATCH] pr 64047 - explicitly handle target_option_default_node in rs6000_set_current_function

2015-01-27 Thread tbsaunde+gcc
From: Trevor Saunders tbsaunde+...@tbsaunde.org

Hi,

the compiler crashes on pr52429.c because this_target_ira_int gets initialized
with null x_init_costs and x_op_costs.  While I don't really understand this 
option handling mess r217659 made the analogous change to i386 when it broke 
this.  So it seems likely this is the right way to fix the regression.

bootstrapped + regtested ppc64-linux-gnu, without regression and pr52429.c is
fixed, ok?


Trev

gcc/

* config/rs6000/rs6000.c (rs6000_set_current_function): Handle
explicit default options.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 85eb0fd..207fc55 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -32609,7 +32609,7 @@ rs6000_set_current_function (tree fndecl)
   if (old_tree == new_tree)
;
 
-  else if (new_tree)
+  else if (new_tree  new_tree != target_option_default_node)
{
  cl_target_option_restore (global_options,
TREE_TARGET_OPTION (new_tree));
@@ -32620,7 +32620,7 @@ rs6000_set_current_function (tree fndecl)
  = save_target_globals_default_opts ();
}
 
-  else if (old_tree)
+  else if (old_tree  old_tree != target_option_default_node)
{
  new_tree = target_option_current_node;
  cl_target_option_restore (global_options,
-- 
2.1.4



Re: Merge current set of OpenACC changes from gomp-4_0-branch

2015-01-27 Thread Julian Brown
On Mon, 26 Jan 2015 14:44:19 +0100
Thomas Schwinge tho...@codesourcery.com wrote:

  On 17 Jan 02:16, Ilya Verbin wrote:
   Unfortunately, it broke offloading from shared libraries (I mean
   common libs with NEEDED entries, not dlopened).
 
 Sorry for that!
 
   Such things are not covered by the
   testsuite, that's why you missed this issue.  Here is a simple
   testcase:
 
 http://news.gmane.org/find-root.php?message_id=%3C20150116231632.GB48380%40msticlxl57.ims.intel.com%3E
 
 Probably a good motivation for adding such a test case.  ;-)
 
   So, you don't assume that a device can have multiple images from
   multiple libs?
  
  Ping?
 
 This probably is just a bug that we introduced with our changes?
 (Julian?)

AFAICR, we haven't yet figured out how to make (shared) libraries work
with PTX. Actually I'm not entirely sure if static libraries containing
PTX code will work either. But, multiple images (e.g. from different
object files) are supported, via the loop in gomp_target_init.

(The semantics of gomp_register_image_for_device were changed, but not
-- intentionally! -- to limit the number of offloaded images to one.)

  Also, could you please explain, why did you divide a device
  initialization into two functions -- gomp_init_device and
  gomp_init_tables?
 
 As I understand it (again, Julian, please correct me if I got that
 wrong), the reason is that for OpenACC support, we need these as two
 separate (independent) actions.  Is this causing problems for OpenMP
 offloading?

This was certainly necessary at some point, when the support for
multiple devices of the same type in the OpenACC runtime was delegated
entirely to target-dependent code. Later (after one round of
refactoring), the gomp_device_descr and the memory map were still
separate, with the former possibly representing a number of devices,
and the latter having independent copies for each instance of a device.

That's largely been refactored (again) away now though -- a
gomp_device_descr and its memory map are stored together, per-device
instance. So this separation of their initialisation can probably go
away, although some (somewhat delicate) code in oacc-init.c would need
to be tweaked.

Julian


RE: [PATCH RFA MIPS] Prohibit vector modes in accumulators

2015-01-27 Thread Matthew Fortune
Richard Sandiford rdsandif...@googlemail.com writes:
 Matthew Fortune matthew.fort...@imgtec.com writes:
  2015-01-23  Robert Suchanek  robert.sucha...@imgtec.com
 
 * config/mips/mips.c (mips_hard_regno_mode_ok_p): Prohibit
  accumulators
 for all vector modes.
 
  This seems like a genuine bug and although it can only be triggered by
  loongson or paired-single support it probably qualifies for fixing.
 
 Agreed FWIW.  We shouldn't mark something as valid for a mode if even
 the mode's move pattern can't handle it.
 
 I think this kind of thing should go in regardless of development stage.

Given that it was one of the pre-existing tests that failed I'm happy that
we are covering this issue. All of these LRA related issues are likely
to phase in and out with subtle changes to code-gen so I don't think we
can always get a test case that fails on trunk.

Since Catherine asked for further info then I will leave her to say if she
is happy to accept on this basis.

Matthew


Re: [PATCH] Fix PR64277

2015-01-27 Thread Ilya Enkovich
2015-01-27 13:59 GMT+03:00 Richard Biener rguent...@suse.de:
 On Tue, 27 Jan 2015, Ilya Enkovich wrote:

 2015-01-27 12:47 GMT+03:00 Richard Biener rguent...@suse.de:
  On Tue, 27 Jan 2015, Jakub Jelinek wrote:
 
  On Tue, Jan 27, 2015 at 10:25:48AM +0100, Richard Biener wrote:
  
   This disables array-bound warnings from VRP2 as discussed.
  
   Bootstrapped and tested on x86_64-unknown-linux-gnu - ok for trunk?
 
  So nothing in the testsuite needed to change?  Nice.
 
  Yes.
 
  Ok for trunk.
 
   I'll search for duplicates and add a few testcases.
 
  Thanks.
 
  Committed as follows (first testcase in PR59124 not fixed - it warns
  from the first pass).

 Are you going to port it to 4.9 branch?

 I plan to do that (4.8 as well) after some time.

Great, thanks!

Ilya


 Richard.

 Thanks,
 Ilya

 
  2015-01-27  Richard Biener  rguent...@suse.de
 
  PR tree-optimization/56273
  PR tree-optimization/59124
  PR tree-optimization/64277
  * tree-vrp.c (vrp_finalize): Emit array-bound warnings only
  from the first VRP pass.
 
  * g++.dg/warn/Warray-bounds-6.C: New testcase.
  * gcc.dg/Warray-bounds-12.c: Likewise.
  * gcc.dg/Warray-bounds-13.c: Likewise.
 
  Index: gcc/tree-vrp.c
  ===
  *** gcc/tree-vrp.c.orig 2015-01-27 10:34:26.453743828 +0100
  --- gcc/tree-vrp.c  2015-01-27 10:43:04.970610102 +0100
  *** vrp_finalize (void)
  *** 10229,10235 
  substitute_and_fold (op_with_constant_singleton_value_range,
 vrp_fold_stmt, false);
 
  !   if (warn_array_bounds)
check_all_array_refs ();
 
  /* We must identify jump threading opportunities before we release
  --- 10229,10235 
  substitute_and_fold (op_with_constant_singleton_value_range,
 vrp_fold_stmt, false);
 
  !   if (warn_array_bounds  first_pass_instance)
check_all_array_refs ();
 
  /* We must identify jump threading opportunities before we release
  Index: gcc/testsuite/g++.dg/warn/Warray-bounds-6.C
  ===
  *** /dev/null   1970-01-01 00:00:00.0 +
  --- gcc/testsuite/g++.dg/warn/Warray-bounds-6.C 2015-01-27 
  10:40:31.311871855 +0100
  ***
  *** 0 
  --- 1,26 
  + // { dg-do compile }
  + // { dg-options -O3 -Warray-bounds }
  +
  + struct type {
  + bool a, b;
  + bool get_b() { return b; }
  + };
  +
  + type stuff[9u];
  +
  + void bar();
  +
  + void foo()
  + {
  +   for(unsigned i = 0u; i  9u; i++)
  + {
  +   if(!stuff[i].a)
  +   continue;
  +
  +   bar();
  +
  +   for(unsigned j = i + 1u; j  9u; j++)
  +   if(stuff[j].a  stuff[j].get_b()) // { dg-bogus above array 
  bounds }
  + return;
  + }
  + }
  Index: gcc/testsuite/gcc.dg/Warray-bounds-12.c
  ===
  *** /dev/null   1970-01-01 00:00:00.0 +
  --- gcc/testsuite/gcc.dg/Warray-bounds-12.c 2015-01-27 
  10:40:58.196175989 +0100
  ***
  *** 0 
  --- 1,26 
  + /* { dg-do compile } */
  + /* { dg-options -O3 -Warray-bounds } */
  + /* { dg-additional-options -mssse3 { target x86_64-*-* i?86-*-* } } */
  +
  + void foo(short a[], short m)
  + {
  +   int i, j;
  +   int f1[10];
  +   short nc;
  +
  +   nc = m + 1;
  +   if (nc  3)
  + {
  +   for (i = 0; i = nc; i++)
  +   {
  + f1[i] = f1[i] + 1;
  +   }
  + }
  +
  +   for (i = 0, j = m; i  nc; i++, j--)
  + {
  +   a[i] = f1[i]; /* { dg-bogus above array bounds } */
  +   a[j] = i;
  + }
  +   return;
  + }
  Index: gcc/testsuite/gcc.dg/Warray-bounds-13.c
  ===
  *** /dev/null   1970-01-01 00:00:00.0 +
  --- gcc/testsuite/gcc.dg/Warray-bounds-13.c 2015-01-27 
  10:42:43.738369929 +0100
  ***
  *** 0 
  --- 1,18 
  + /* { dg-do compile } */
  + /* { dg-options -O3 -Warray-bounds } */
  +
  + extern char *bar[17];
  +
  + int foo(int argc, char **argv)
  + {
  +   int i;
  +   int n = 0;
  +
  +   for (i = 0; i  argc; i++)
  + n++;
  +
  +   for (i = 0; i  argc; i++)
  + argv[i] = bar[i + n]; /* { dg-bogus above array bounds } */
  +
  +   return 0;
  + }



 --
 Richard Biener rguent...@suse.de
 SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild,
 Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)


Re: [PATCH][2/2] Improve array-bound warnings and VRP

2015-01-27 Thread Richard Biener
On Mon, 26 Jan 2015, Jakub Jelinek wrote:

 On Mon, Jan 26, 2015 at 04:18:32PM +0100, Richard Biener wrote:
Ok for trunk?  Or should I delay this to GCC 6?
   
   Does this work even without the other patch?
  
  Yes, I've actually developed 2/2 first.  The other patch only ever
  emits more warnings...
 
 Then it probably should be ok.  I'm really afraid of emitting more warnings
 with such high false positive rate now.

As the patch also mitigates some of the code bloat we get with
the complete peeling (regression against 4.7) I have installed it.
It's also the easiest vehicle to verify range-info is not broken
by passes between vrp1 and vrp2.

Thanks,
Richard.


Re: [PATCH, PR tree-optimization/64277] Improve loop iterations count estimation

2015-01-27 Thread Richard Biener
On Tue, Jan 27, 2015 at 11:47 AM, Ilya Enkovich enkovich@gmail.com wrote:
 On 27 Jan 12:40, Ilya Enkovich wrote:
 Hi,

 This patch was supposed to fix PR tree-optimization/64277.  Tracker is now 
 fixed by warnings disabling but I think patch is still useful to avoid dead 
 code generated by complete unroll.

 Bootstrapped and tested on x86_64-unknown-linux-gnu.

 Thanks,
 Ilya
 --
 gcc/

 2015-01-27  Ilya Enkovich  ilya.enkov...@intel.com

   * tree-ssa-loop-niter.c (record_nonwrapping_iv): Use base
   range info when possible to refine estimation.

 gcc/testsuite/

 2015-01-27  Ilya Enkovich  ilya.enkov...@intel.com

   * gcc.dg/pr64277.c: New.



 Here is a new version fixed according to comments in the tracker.  I also 
 fixed a test to scan cunroll dumps.  Does it look OK?

Minor comments below.

 What are possible branches for this patch?

You can probably create a testcase that shows code-size regressions
against a version that didn't peel completely (GCC 4.7).  Thus I'd say
it would apply to 4.9 as well (4.8 doesn't have range information).

 Thanks,
 Ilya
 --
 diff --git a/gcc/testsuite/gcc.dg/pr64277.c b/gcc/testsuite/gcc.dg/pr64277.c
 new file mode 100644
 index 000..c6ef331
 --- /dev/null
 +++ b/gcc/testsuite/gcc.dg/pr64277.c
 @@ -0,0 +1,23 @@
 +/* PR tree-optimization/64277 */
 +/* { dg-do compile } */
 +/* { dg-options -O3 -Wall -Werror -fdump-tree-cunroll-details } */
 +/* { dg-final { scan-tree-dump loop with 5 iterations completely unrolled 
 cunroll } } */
 +/* { dg-final { scan-tree-dump loop with 6 iterations completely unrolled 
 cunroll } } */
 +/* { dg-final { cleanup-tree-dump cunroll } } */
 +
 +int f1[10];
 +void test1 (short a[], short m, unsigned short l)
 +{
 +  int i = l;
 +  for (i = i + 5; i  m; i++)
 +f1[i] = a[i]++;
 +}
 +
 +void test2 (short a[], short m, short l)
 +{
 +  int i;
 +  if (m  5)
 +m = 5;
 +  for (i = m; i  l; i--)
 +f1[i] = a[i]++;
 +}
 diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
 index 919f5c0..1cd297d 100644
 --- a/gcc/tree-ssa-loop-niter.c
 +++ b/gcc/tree-ssa-loop-niter.c
 @@ -2754,6 +2754,7 @@ record_nonwrapping_iv (struct loop *loop, tree base, 
 tree step, gimple stmt,
  {
tree niter_bound, extreme, delta;
tree type = TREE_TYPE (base), unsigned_type;
 +  tree orig_base = base;

if (TREE_CODE (step) != INTEGER_CST || integer_zerop (step))
  return;
 @@ -2777,16 +2778,30 @@ record_nonwrapping_iv (struct loop *loop, tree base, 
 tree step, gimple stmt,

if (tree_int_cst_sign_bit (step))
  {
 +  wide_int min, max;
extreme = fold_convert (unsigned_type, low);
 -  if (TREE_CODE (base) != INTEGER_CST)
 +  if (TREE_CODE (orig_base) == SSA_NAME
 +  TREE_CODE (high) == INTEGER_CST
 +  INTEGRAL_TYPE_P (TREE_TYPE (orig_base))
 +  get_range_info (orig_base, min, max) == VR_RANGE
 +  wi::gts_p (wide_int (high), max))

For me a simple wi::gts_p (high, max) worked fine.

 +   base = wide_int_to_tree (unsigned_type, max);
 +  else if (TREE_CODE (base) != INTEGER_CST)
 base = fold_convert (unsigned_type, high);
delta = fold_build2 (MINUS_EXPR, unsigned_type, base, extreme);
step = fold_build1 (NEGATE_EXPR, unsigned_type, step);
  }
else
  {
 +  wide_int min, max;
extreme = fold_convert (unsigned_type, high);
 -  if (TREE_CODE (base) != INTEGER_CST)
 +  if (TREE_CODE (orig_base) == SSA_NAME
 +  TREE_CODE (low) == INTEGER_CST
 +  INTEGRAL_TYPE_P (TREE_TYPE (orig_base))
 +  get_range_info (orig_base, min, max) == VR_RANGE
 +  wi::gts_p (min, wide_int (low)))

Likewise.

Ok for trunk with that changes.  For the 4.9 branch you need to adjust
the patch to not use wide-ints.  I'd leave it on trunk for a while and
eventually open a bugreport for the size regression to keep track of it.

Thanks,
Richard.

 +   base = wide_int_to_tree (unsigned_type, min);
 +  else if (TREE_CODE (base) != INTEGER_CST)
 base = fold_convert (unsigned_type, low);
delta = fold_build2 (MINUS_EXPR, unsigned_type, extreme, base);
  }


Re: [PATCH 6/8] Handle SCRATCH in decompose_address

2015-01-27 Thread Maxim Kuvyrkov
On Oct 23, 2014, at 4:18 AM, Jeff Law l...@redhat.com wrote:

 On 10/22/14 17:01, Maxim Kuvyrkov wrote:
 On Oct 23, 2014, at 9:02 AM, Jeff Law l...@redhat.com wrote:
 
 On 10/20/14 21:35, Maxim Kuvyrkov wrote:
 Hi,
 
 This patch is a simple fix to allow decompose_address to handle
 SCRATCH'es during 2nd scheduler pass. This patch is a
 prerequisite for a scheduler improvement that relies on
 decompose_address to parse insns.
 
 Bootstrapped and regtested on x86_64-linux-gnu and regtested on
 arm-linux-gnueabihf and aarch64-linux-gnu.
 I'd like to see some further discussion here.
 
 get_base_term is supposed to look at its argument as a base
 address. I'm curious under what circumstances you want to have a
 SCRATCH as a base address?
 
 I didn't see anything in patch #8 which obviously dependended on
 this, but maybe it's in there, but more subtle than expected.
 
 If you can justify why it's useful to handle scratch in here, then
 the patch will be fine.
 
 Without this patch decompose_address() ICEs during second scheduler
 pass on prologue instructions that usually have (clobber (mem:BLK
 (scratch)).  The only reason for this patch is to prevent that fault
 and enable use of decompose_address during 2nd scheduler pass.
 
 Does this answer your question, or are you looking for a more
 in-depth reason?
 Yea, that's everything I needed to know.  Patch approved.

Hi,

Turns out that the above patch applies without conflicts to two functions in 
rtlanal.c: get_base_term(), for which the patch is intended, and 
get_index_term(), for which the patch is not.

Due to git rebases and patch updates, I have accidentally pushed the patch 
twice and unintentionally changed get_index_term().  From what I can tell the 
change is benign, but, still, it is unnecessary.  The attached patch reverts 
the accidental commit.  It was bootstrapped arm-linux-gnueabihf.

OK for stage 1?  I'll regtest it before committing, just in case.

Thanks,

--
Maxim Kuvyrkov
www.linaro.org




0001-Revert-accidental-commit-get_base_index-was-the-inte.patch
Description: Binary data


Re: [PATCH][RFA][PR target/15184] Partial fix for direct byte access on x86

2015-01-27 Thread Jeff Law
I'm withdrawing the combine_simplify_rtx hunk of this patch.  While 
working cleaning up my improvements for the remaining of testcases I 
stumbled upon a simpler change which covers all the tests.


What's kind of funny is I'd been staring at the relevant code a goodly 
part of the weekend without seeing how easily it could be extended and 
that the result doesn't have to match a pattern as combine can split the 
horrid mess in such a way that we two matched insns which when combined 
with other nearby insns ultimately collapse into precisely what we want.




We're still going to need the changes to the heuristic to enable 4 insn 
combinations as we need to be giving nice big blobs of code to 
combine_simplify_rtx and its children.


It's actually kind of cool to see something like this flow into 
make_field_assignment:


(set (mem/c:HI (symbol_ref:SI (y) [flags 0x40] var_decl 
0x7670bcf0 y) [2 y+0 S2 A16])
(subreg:HI (ior:SI (zero_extend:SI (mem/c:QI (symbol_ref:SI (y) 
[flags 0x40] var_decl 0x7670bcf0 y) [2 y+0 S1 A16]))

(reg:SI 100 [ D.1569 ])) 0))

And make_field_assignment turns it into:

(set (mem/c:QI (const:SI (plus:SI (symbol_ref:SI (y) [flags 0x40] 
var_decl 0x7670bcf0 y)

(const_int 1 [0x1]))) [2 y+1 S1 A8])
(subreg:QI (lshiftrt:SI (reg:SI 100 [ D.1569 ])
(const_int 8 [0x8])) 0))

Combine then chooses the lshift expression as a split point and emits an 
insn for the lshift and substitutes a nice simple reg into that hunk of 
RTL above for the lshift


(set (mem/c:QI (const:SI (plus:SI (symbol_ref:SI (y) [flags 0x40] 
var_decl 0x7670bcf0 y)

(const_int 1 [0x1]))) [2 y+1 S1 A8])
(subreg:QI (reg:SI 103) 0))

oh, that looks perfect.  Now it's just a simple matter of 
cleanup :-)  Which is actually kindof fun to watch:


We'll have this after the combine  split step:

(insn 9 8 10 2 (parallel [
(set (reg:SI 99 [ D.1569 ])
(ashift:SI (reg:SI 96 [ c ])
(const_int 8 [0x8])))
(clobber (reg:CC 17 flags))
]) j.c:33 510 {*ashlsi3_1}
 (expr_list:REG_UNUSED (reg:CC 17 flags)
(expr_list:REG_DEAD (reg:SI 96 [ c ])
(nil

(insn 10 9 13 2 (parallel [
(set (reg:SI 100 [ D.1569 ])
(and:SI (reg:SI 99 [ D.1569 ])
(const_int 65280 [0xff00])))
(clobber (reg:CC 17 flags))
]) j.c:33 380 {*andsi_1}
 (expr_list:REG_UNUSED (reg:CC 17 flags)
(expr_list:REG_DEAD (reg:SI 99 [ D.1569 ])
(nil

(insn 13 10 14 2 (parallel [
(set (reg:SI 103)
(lshiftrt:SI (reg:SI 100 [ D.1569 ])
(const_int 8 [0x8])))
(clobber (reg:CC 17 flags))
]) j.c:33 543 {*lshrsi3_1}
 (expr_list:REG_UNUSED (reg:CC 17 flags)
(expr_list:REG_DEAD (reg:SI 100 [ D.1569 ])
(nil

(insn 14 13 0 2 (set (mem/c:QI (const:SI (plus:SI (symbol_ref:SI (y) 
[flags 0x40] var_decl 0x7670bcf0 y)

(const_int 1 [0x1]))) [2 y+1 S1 A8])
(subreg:QI (reg:SI 103) 0)) j.c:33 93 {*movqi_internal}
 (expr_list:REG_DEAD (reg:SI 103)
(nil)))

Eventually all those participate in combinations again resulting in just:

(insn 14 13 0 2 (set (mem/c:QI (const:SI (plus:SI (symbol_ref:SI (y) 
[flags 0x40]  var_decl 0x7fd247c16cf0 y)

(const_int 1 [0x1]))) [2 y+1 S1 A8])
(subreg:QI (reg:SI 96 [ c ]) 0)) j.c:33 93 {*movqi_internal}
 (expr_list:REG_DEAD (reg:SI 96 [ c ])
(nil)))

ie, movb %al, y+1


Sometimes I malign the combiner, but there are also days when I just 
have to say wow, not bad given its last major revamp was in 1990/1991 
(which brought in the splitting code noted above.


Anyway, onward bootstrapping and testing...


Jeff

On 01/26/15 20:07, Jeff Law wrote:

Segher: I know you're not officially noted as a maintainer or reviewer
for combine.c, but that's something I'd like to change if you're
interested in a larger role.  In the mean time, any feedback you have
would be appreciated.


So the issue mentioned in the BZ is that fairly obvious code sequences
that ought to use simple byte moves are expanding into hideous sequences
(load, store, couple bitwise logicals, maybe a shift or extension thrown
in for good measure).

As mentioned in the BZ, one of the issues is that combine is limited in
terms of how many insns it will look at.  As it turns out that was
addressed not terribly low ago and we can do 4 insn combinations. With
just a little work in combine.c we can get the desired code for the
first two testcases as well as two of my own.

The first issue is 4 insn combinations are (reasonably) guarded in such
a way as to avoid them if they are unlikely to succeed.  We basically
look at the operands of the 4 insns and try to guess if there's a
reasonable chance a combination would succeed.  If not, no 4 insn
combinations are 

Re: [patch, libobjc] export __objc_get_forward_imp, get_imp again

2015-01-27 Thread Matthias Klose
On 01/22/2015 05:09 PM, Matthias Klose wrote:
 On 01/22/2015 12:56 AM, Andrew Pinski wrote:
 On Wed, Jan 21, 2015 at 8:51 AM, Jakub Jelinek ja...@redhat.com wrote:
 On Wed, Jan 21, 2015 at 08:41:46AM -0800, pins...@gmail.com wrote:
 On Jan 21, 2015, at 1:02 AM, Matthias Klose d...@ubuntu.com wrote:

 __objc_get_forward_imp and get_imp were exported in libobjc since GCC 
 4.1, for
 some reason these are not exported anymore in GCC 5 (both declared 
 inline).  So
 either export these as before, or don't export them and bump the soname.  
 The
 latter seems to be unwanted, and at least gnustep-base is using the 
 get_imp
 function.  So better keep the references in GCC 5?

 Is this an intended change in GCC 5 to not to export inline methods 
 anymore?

 Just remove the inline instead.

 The comments like:

 /* The new name of get_imp().  */
 IMP
 class_getMethodImplementation (Class class_, SEL selector)
 {
   if (class_ == Nil  ||  selector == NULL)
 return NULL;

   /* get_imp is inlined, so we're good.  */
   return get_imp (class_, selector);
 }

 don't make me very confident in such a change.
 The extern prototypes really work with both -std=gnu89 and -std=gnu11 and
 thus will at least keep status quo.

 Let's do that then.
 
 get_imp was renamed to class_getMethodImplementation, which is exported from
 objc/runtime.h.  GNUstep-base uses get_imp to define it's own
 class_getMethodImplementation, so get_imp isn't really needed. So either make
 the two functions inline, and don't export them, or declare the prototypes.  
 For
 the latter I would suggest objc-private/runtime.h 
 (class_getMethodImplementation
 is declared in objc/runtime.h).

now commited the following patch after approval on IRC.

  Matthias


libobjc/

2015-01-27  Matthias Klose  d...@ubuntu.com

	* sendmsg.c: Add prototypes for __objc_get_forward_imp and get_imp.

Index: libobjc/sendmsg.c
===
--- libobjc/sendmsg.c	(revision 220167)
+++ libobjc/sendmsg.c	(working copy)
@@ -104,6 +104,10 @@
 struct objc_method * search_for_method_in_list (struct objc_method_list * list, SEL op);
 id nil_method (id, SEL);
 
+/* Make sure this inline function is exported regardless of GNU89 or C99
+   inlining semantics as it is part of the libobjc ABI.  */
+extern IMP __objc_get_forward_imp (id, SEL);
+
 /* Given a selector, return the proper forwarding implementation.  */
 inline
 IMP
@@ -320,6 +324,10 @@
   return res;
 }
 
+/* Make sure this inline function is exported regardless of GNU89 or C99
+   inlining semantics as it is part of the libobjc ABI.  */
+extern IMP get_imp (Class, SEL);
+
 inline
 IMP
 get_imp (Class class, SEL sel)


Re: [PATCH] Fix PR64798

2015-01-27 Thread Jonathan Wakely

On 27/01/15 14:43 +0100, Richard Biener wrote:


The new exceptional EH allocator failed to align exception objects
properly (it ended up aligning to __alignof__((std::size_t))).  The
following fixes that by aligning to what __attribute__((aligned))
would align to (this is what _Unwind_Exception is aligned to, a
member of __cxa_refcounted_exception).

Bootstrapped and tested on x86_64-unknown-linux-gnu - Rainer is
testing this on sparc-solaris where it broke
g++.old-deja/g++.eh/badalloc1.C.

Ok for trunk?


Yes, thanks.



[PATCH][AArch32] Testcase fix for __ATOMIC_CONSUME

2015-01-27 Thread Alex Velenko

Hi,

This patch fixes arm/atomic-op-consume.c test to expect safe LDAEX
instruction to be generated when __ATOMIC_CONSUME semantics is requested.

This patch was tested by running the modified test on arm-none-eabi and
arm-none-linux-gnueabi compilers.

Is this patch ok?

Alex

2015-01-27  Alex Velenko  alex.vele...@arm.com

gcc/testsuite/

  * gcc.target/arm/atomic-op-consume.c (scan-assember-times): Adjust
  scan-assembler-times pattern.

diff --git a/gcc/testsuite/gcc.target/arm/atomic-op-consume.c 
b/gcc/testsuite/gcc.target/arm/atomic-op-consume.c
index 0354717..cc6c028 100644
--- a/gcc/testsuite/gcc.target/arm/atomic-op-consume.c
+++ b/gcc/testsuite/gcc.target/arm/atomic-op-consume.c
@@ -5,6 +5,9 @@
 
 #include ../aarch64/atomic-op-consume.x
 
-/* { dg-final { scan-assembler-times ldrex\tr\[0-9\]+, \\\[r\[0-9\]+\\\] 6 } 
} */
+/* To workaround Bugzilla 59448 issue, a request for __ATOMIC_CONSUME is always
+   promoted to __ATOMIC_ACQUIRE, implemented as MEMMODEL_ACQUIRE.  This causes
+   LDAEX to be generated instead of LDREX.  */
+/* { dg-final { scan-assembler-times ldaex\tr\[0-9\]+, \\\[r\[0-9\]+\\\] 6 } 
} */
 /* { dg-final { scan-assembler-times strex\t...?, r\[0-9\]+, 
\\\[r\[0-9\]+\\\] 6 } } */
 /* { dg-final { scan-assembler-not dmb } } */

Re: [patch] libstdc++/64368 add configure check for timed mutex operations

2015-01-27 Thread Jonathan Wakely

This isn't related to the last patch for this bug, except that the PR
is currently being used for all darwin FAILs.

We need to check a configure macro before using
pthread_rwlock_timedrdlock because Darwin doesn't define the
_POSIX_TIMEOUTS option.

Tested x86_64-linux, committed to trunk.
commit d48fe00ea96b3515a6a1f7a6926dfe2ff7db643c
Author: Jonathan Wakely jwak...@redhat.com
Date:   Tue Jan 27 10:38:09 2015 +

	PR libstdc++/64368
	* include/std/shared_mutex (shared_timed_mutex::try_lock_for,
	shared_timed_mutex::try_lock_until): Only define when POSIX thread
	timeouts option is supported.
	(shared_timed_mutex::try_shared_lock_for,
	shared_timed_mutex::try_shared_lock_until): Likewise.

diff --git a/libstdc++-v3/include/std/shared_mutex b/libstdc++-v3/include/std/shared_mutex
index 643768c..47cfc64 100644
--- a/libstdc++-v3/include/std/shared_mutex
+++ b/libstdc++-v3/include/std/shared_mutex
@@ -108,6 +108,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   return true;
 }
 
+#if _GTHREAD_USE_MUTEX_TIMEDLOCK
 templatetypename _Rep, typename _Period
   bool
   try_lock_for(const chrono::duration_Rep, _Period __rel_time)
@@ -149,6 +150,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	const auto __s_atime = __s_entry + __delta;
 	return try_lock_until(__s_atime);
   }
+#endif
 
 void
 unlock()
@@ -186,6 +188,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   return true;
 }
 
+#if _GTHREAD_USE_MUTEX_TIMEDLOCK
 templatetypename _Rep, typename _Period
   bool
   try_lock_shared_for(const chrono::duration_Rep, _Period __rel_time)
@@ -230,6 +233,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	const auto __s_atime = __s_entry + __delta;
 	return try_lock_shared_until(__s_atime);
   }
+#endif
 
 void
 unlock_shared()


Re: [PATCH, PR tree-optimization/64277] Improve loop iterations count estimation

2015-01-27 Thread Ilya Enkovich
On 27 Jan 12:29, Richard Biener wrote:
 On Tue, Jan 27, 2015 at 11:47 AM, Ilya Enkovich enkovich@gmail.com 
 wrote:
  On 27 Jan 12:40, Ilya Enkovich wrote:
  Hi,
 
  This patch was supposed to fix PR tree-optimization/64277.  Tracker is now 
  fixed by warnings disabling but I think patch is still useful to avoid 
  dead code generated by complete unroll.
 
  Bootstrapped and tested on x86_64-unknown-linux-gnu.
 
  Thanks,
  Ilya
  --
  gcc/
 
  2015-01-27  Ilya Enkovich  ilya.enkov...@intel.com
 
* tree-ssa-loop-niter.c (record_nonwrapping_iv): Use base
range info when possible to refine estimation.
 
  gcc/testsuite/
 
  2015-01-27  Ilya Enkovich  ilya.enkov...@intel.com
 
* gcc.dg/pr64277.c: New.
 
 
 
  Here is a new version fixed according to comments in the tracker.  I also 
  fixed a test to scan cunroll dumps.  Does it look OK?
 
 Minor comments below.
 
  What are possible branches for this patch?
 
 You can probably create a testcase that shows code-size regressions
 against a version that didn't peel completely (GCC 4.7).  Thus I'd say
 it would apply to 4.9 as well (4.8 doesn't have range information).
 
  Thanks,
  Ilya
  --
  diff --git a/gcc/testsuite/gcc.dg/pr64277.c b/gcc/testsuite/gcc.dg/pr64277.c
  new file mode 100644
  index 000..c6ef331
  --- /dev/null
  +++ b/gcc/testsuite/gcc.dg/pr64277.c
  @@ -0,0 +1,23 @@
  +/* PR tree-optimization/64277 */
  +/* { dg-do compile } */
  +/* { dg-options -O3 -Wall -Werror -fdump-tree-cunroll-details } */
  +/* { dg-final { scan-tree-dump loop with 5 iterations completely 
  unrolled cunroll } } */
  +/* { dg-final { scan-tree-dump loop with 6 iterations completely 
  unrolled cunroll } } */
  +/* { dg-final { cleanup-tree-dump cunroll } } */
  +
  +int f1[10];
  +void test1 (short a[], short m, unsigned short l)
  +{
  +  int i = l;
  +  for (i = i + 5; i  m; i++)
  +f1[i] = a[i]++;
  +}
  +
  +void test2 (short a[], short m, short l)
  +{
  +  int i;
  +  if (m  5)
  +m = 5;
  +  for (i = m; i  l; i--)
  +f1[i] = a[i]++;
  +}
  diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
  index 919f5c0..1cd297d 100644
  --- a/gcc/tree-ssa-loop-niter.c
  +++ b/gcc/tree-ssa-loop-niter.c
  @@ -2754,6 +2754,7 @@ record_nonwrapping_iv (struct loop *loop, tree base, 
  tree step, gimple stmt,
   {
 tree niter_bound, extreme, delta;
 tree type = TREE_TYPE (base), unsigned_type;
  +  tree orig_base = base;
 
 if (TREE_CODE (step) != INTEGER_CST || integer_zerop (step))
   return;
  @@ -2777,16 +2778,30 @@ record_nonwrapping_iv (struct loop *loop, tree 
  base, tree step, gimple stmt,
 
 if (tree_int_cst_sign_bit (step))
   {
  +  wide_int min, max;
 extreme = fold_convert (unsigned_type, low);
  -  if (TREE_CODE (base) != INTEGER_CST)
  +  if (TREE_CODE (orig_base) == SSA_NAME
  +  TREE_CODE (high) == INTEGER_CST
  +  INTEGRAL_TYPE_P (TREE_TYPE (orig_base))
  +  get_range_info (orig_base, min, max) == VR_RANGE
  +  wi::gts_p (wide_int (high), max))
 
 For me a simple wi::gts_p (high, max) worked fine.
 
  +   base = wide_int_to_tree (unsigned_type, max);
  +  else if (TREE_CODE (base) != INTEGER_CST)
  base = fold_convert (unsigned_type, high);
 delta = fold_build2 (MINUS_EXPR, unsigned_type, base, extreme);
 step = fold_build1 (NEGATE_EXPR, unsigned_type, step);
   }
 else
   {
  +  wide_int min, max;
 extreme = fold_convert (unsigned_type, high);
  -  if (TREE_CODE (base) != INTEGER_CST)
  +  if (TREE_CODE (orig_base) == SSA_NAME
  +  TREE_CODE (low) == INTEGER_CST
  +  INTEGRAL_TYPE_P (TREE_TYPE (orig_base))
  +  get_range_info (orig_base, min, max) == VR_RANGE
  +  wi::gts_p (min, wide_int (low)))
 
 Likewise.
 
 Ok for trunk with that changes.  For the 4.9 branch you need to adjust
 the patch to not use wide-ints.  I'd leave it on trunk for a while and
 eventually open a bugreport for the size regression to keep track of it.
 
 Thanks,
 Richard.
 

Thanks a lot for review!  Here is a final version for GCC 5.0.  Will prepare 
4.9 version later.

Thanks,
Ilya
--
diff --git a/gcc/testsuite/gcc.dg/pr64277.c b/gcc/testsuite/gcc.dg/pr64277.c
new file mode 100644
index 000..c6ef331
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr64277.c
@@ -0,0 +1,23 @@
+/* PR tree-optimization/64277 */
+/* { dg-do compile } */
+/* { dg-options -O3 -Wall -Werror -fdump-tree-cunroll-details } */
+/* { dg-final { scan-tree-dump loop with 5 iterations completely unrolled 
cunroll } } */
+/* { dg-final { scan-tree-dump loop with 6 iterations completely unrolled 
cunroll } } */
+/* { dg-final { cleanup-tree-dump cunroll } } */
+
+int f1[10];
+void test1 (short a[], short m, unsigned short l)
+{
+  int i = l;
+  for (i = i + 5; i  m; i++)
+f1[i] = a[i]++;
+}
+
+void test2 (short a[], short m, short l)
+{
+  int i;
+  if (m  5)
+m = 5;
+  for (i = m; i  

[PATCH, CHKP] Fix PR middle-end/64805

2015-01-27 Thread Ilya Enkovich
Hi,

Some time ago removal of not instrumented version of funtion with 
'always_inline' was delayed to enable their inlining.  With this change we may 
have situations when we inline into a not instrumented version of a function 
which also has an instrumented version (happens when both of them have 
'always_inline').  It causes ICE in cgraph_node verifier because we clear all 
references before inlining and verifier expects IPA_REF_CHKP reference for all 
functions having instrumented version.  This patch fixes it by rebuilding 
IPA_REF_CHKP reference.

Bootstrapped and tested on x86_64-unknown-linux-gnu.  OK for trunk?

Thanks,
Ilya
--
2015-01-27  Ilya Enkovich  ilya.enkov...@intel.com

PR middle-end/64805
* ipa-inline.c (early_inliner): Rebuild IPA_REF_CHKP reference
to avoid error in cgraph node verification.

2015-01-27  Ilya Enkovich  ilya.enkov...@intel.com

PR middle-end/64805
* gcc.target/i386/pr64805.c: New.


diff --git a/gcc/ipa-inline.c b/gcc/ipa-inline.c
index c0ff329..d341619 100644
--- a/gcc/ipa-inline.c
+++ b/gcc/ipa-inline.c
@@ -2464,6 +2464,13 @@ early_inliner (function *fun)
 #endif
   node-remove_all_references ();
 
+  /* Rebuild this reference because it dosn't depend on
+ function's body and it's required to pass cgraph_node
+ verification.  */
+  if (node-instrumented_version
+   !node-instrumentation_clone)
+node-create_reference (node-instrumented_version, IPA_REF_CHKP, NULL);
+
   /* Even when not optimizing or not inlining inline always-inline
  functions.  */
   inlined = inline_always_inline_functions (node);
diff --git a/gcc/testsuite/gcc.target/i386/pr64805.c 
b/gcc/testsuite/gcc.target/i386/pr64805.c
new file mode 100644
index 000..8ba0a97
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr64805.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target mpx } */
+/* { dg-options -fcheck-pointer-bounds -mmpx } */
+
+#include stdio.h
+
+static inline void __attribute ((always_inline)) functionA(void)
+{
+  return;
+}
+
+static inline void __attribute ((always_inline)) functionB(void)
+{
+  functionA();
+}
+
+int test(void)
+{
+  functionB();
+
+  return 0;
+}


Re: [PATCH] Workaround -Wmaybe-uninitialized false positives during profiledbootstrap

2015-01-27 Thread Martin Liška

On 01/27/2015 05:23 AM, DJ Delorie wrote:

+/* Workaround -Wstrict-overflow false positive during profiledbootstrap.  */
+
+# if GCC_VERSION = 4004
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored -Wstrict-overflow
+#endif
+


#pragma diagnostic ignored was added in 4.4 but #pragma diagnostic
push/pop wasn't added until a later release (4.6 I think).  Attempts
to build with 4.4 (i.e. on RHEL 6) causes warnings on most files.


Hello.

Thank you for pointing out, thus changing to 4006 would be the right fix?

Thanks,
Martin


2010-06-21  DJ Delorie  d...@redhat.com

 * diagnostic.h (diagnostic_classification_change_t): New.
 (diagnostic_context): Add history and push/pop list.
 (diagnostic_push_diagnostics): Declare.
 (diagnostic_pop_diagnostics): Declare.
 * diagnostic.c (diagnostic_classify_diagnostic): Store changes
 from pragmas in a history chain instead of the global table.
 (diagnostic_push_diagnostics): New.
 (diagnostic_pop_diagnostics): New.
 (diagnostic_report_diagnostic): Scan history chain to find state
 of diagnostics as of the diagnostic location.
 * opts.c (set_option): Pass UNKNOWN_LOCATION to
 diagnostic_classify_diagnostic.
 (enable_warning_as_error): Likewise.
 * diagnostic-core.h (DK_POP): Add after real diagnostics, for
 use in the history chain.
 * doc/extend.texi: Document pragma GCC diagnostic changes.





Re: [Patch, ARM/Thumb1]Add a Thumb1 insn pattern to legalize the instruction that moves pc to low register

2015-01-27 Thread Ramana Radhakrishnan
On Fri, Jan 9, 2015 at 7:43 AM, Terry Guo terry@arm.com wrote:


 -Original Message-
 From: Richard Earnshaw
 Sent: Monday, December 08, 2014 7:31 PM
 To: Terry Guo; gcc-patches@gcc.gnu.org
 Cc: Ramana Radhakrishnan
 Subject: Re: [Patch, ARM/Thumb1]Add a Thumb1 insn pattern to legalize the
 instruction that moves pc to low register

 On 08/12/14 08:24, Terry Guo wrote:
  Hi there,
 
  When compile below simple code:
 
  terguo01@terry-pc01:mtpcs-frame$ cat test.c int main(void) {
  return 0;
  }
 
  I got ICE with option -mtpcs-leaf-frame (no error if remove this
 option).
 
  terguo01@terry-pc01:mtpcs-frame$
  /work/terguo01/tools/gcc-arm-none-eabi-5_0-2014q4/bin/arm-none-eabi-
 gc
  c -mtpcs-leaf-frame test.c -c -mcpu=cortex-m0plus -mthumb -da
  test.c: In function 'main':
  test.c:4:1: error: unrecognizable insn:
   }
   ^
  (insn 20 19 21 (set (reg:SI 2 r2)
  (reg:SI 15 pc)) test.c:2 -1
   (nil))
  test.c:4:1: internal compiler error: in extract_insn, at recog.c:2327
  Please submit a full bug report, with preprocessed source if
  appropriate.
  See http://gcc.gnu.org/bugs.html\ for instructions.
 
  This RTL is generated in function thumb1_expand_prologue. The expected
  insn pattern is thumb1_movsi_insn in thumb1.md. And instruction like
 mov r2, pc
  is a legal instruction. Because gcc returns NO_REG for PC register, so
  no valid pattern to match instruction that move pc to low register.
  This patch intends to add a new insn pattern to legalize such thing.
 
  Tested with GCC regression test. No regression. Is it OK to trunk?
 
  BR,
  Terry
 
  2014-12-08  Terry Guo  terry@arm.com
 
   * config/arm/predicates.md (pc_register): New to match PC register.
   * config/arm/thumb1.md (*thumb1_movpc_insn): New insn pattern.
 
  gcc/testsuite/ChangeLog:
  2014-12-08  Terry Guo  terry@arm.com
 
   * gcc.target/arm/thumb1-mov-pc.c: New test.
 
 
  thumb1-move-pc-v1.txt
 
 
  diff --git a/gcc/config/arm/predicates.md
  b/gcc/config/arm/predicates.md index 032808c..c5ef5ed 100644
  --- a/gcc/config/arm/predicates.md
  +++ b/gcc/config/arm/predicates.md
  @@ -361,6 +361,10 @@
 (and (match_code smin,smax,umin,umax)
  (match_test mode == GET_MODE (op
 
  +(define_special_predicate pc_register
  +  (and (match_code reg)
  +   (match_test REGNO (op) == PC_REGNUM)))
  +
   (define_special_predicate cc_register
 (and (match_code reg)
  (and (match_test REGNO (op) == CC_REGNUM) diff --git
  a/gcc/config/arm/thumb1.md b/gcc/config/arm/thumb1.md index
  ddedc39..8e6057c 100644
  --- a/gcc/config/arm/thumb1.md
  +++ b/gcc/config/arm/thumb1.md
  @@ -1780,6 +1780,16 @@
 
   )
 
  +(define_insn *thumb1_movpc_insn
  +  [(set (match_operand:SI 0 low_register_operand)

 This needs constraints.


 The constraint is used now. Is this one OK?


This is OK now.

Ramana

 BR,
 Terry

 2015-01-09  Terry Guo  terry@arm.com

  * config/arm/thumb1.md (*thumb1_movpc_insn): New insn pattern.


[Committed] S/390: Increase register move costs for FPR-GPR moves

2015-01-27 Thread Andreas Krebbel
Hi,

I've committed the attached patch which fixes a 4.8 vs 4.9/5.0
performance regression introduced with the aggressive use of FPRs as
spill slots.

Committed to mainline and 4.9 branch.

Bye,

-Andreas-

2015-01-27  Andreas Krebbel  andreas.kreb...@de.ibm.com

* config/s390/s390.c (s390_register_move_cost): Increase costs for
FPR-GPR moves.

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 36b547d..fcde638 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -2393,16 +2393,29 @@ s390_float_const_zero_p (rtx value)
 /* Implement TARGET_REGISTER_MOVE_COST.  */
 
 static int
-s390_register_move_cost (machine_mode mode ATTRIBUTE_UNUSED,
+s390_register_move_cost (machine_mode mode,
  reg_class_t from, reg_class_t to)
 {
-  /* On s390, copy between fprs and gprs is expensive as long as no
- ldgr/lgdr can be used.  */
-  if ((!TARGET_Z10 || GET_MODE_SIZE (mode) != 8)
-   ((reg_classes_intersect_p (from, GENERAL_REGS)
-   reg_classes_intersect_p (to, FP_REGS))
- || (reg_classes_intersect_p (from, FP_REGS)
-  reg_classes_intersect_p (to, GENERAL_REGS
+  /* On s390, copy between fprs and gprs is expensive.  */
+
+  /* It becomes somewhat faster having ldgr/lgdr.  */
+  if (TARGET_Z10  GET_MODE_SIZE (mode) == 8)
+{
+  /* ldgr is single cycle. */
+  if (reg_classes_intersect_p (from, GENERAL_REGS)
+  reg_classes_intersect_p (to, FP_REGS))
+   return 1;
+  /* lgdr needs 3 cycles. */
+  if (reg_classes_intersect_p (to, GENERAL_REGS)
+  reg_classes_intersect_p (from, FP_REGS))
+   return 3;
+}
+
+  /* Otherwise copying is done via memory.  */
+  if ((reg_classes_intersect_p (from, GENERAL_REGS)
+reg_classes_intersect_p (to, FP_REGS))
+  || (reg_classes_intersect_p (from, FP_REGS)
+  reg_classes_intersect_p (to, GENERAL_REGS)))
 return 10;
 
   return 1;



[Committed] S/390: Increase memory access costs

2015-01-27 Thread Andreas Krebbel
Hi,

I've committed the attached patch which fixes a 4.8 vs 4.9/5.0
performance regression introduced with the aggressive use of FPRs as
spill slots.

Committed to mainline and 4.9 branch.

Bye,

-Andreas-

2015-01-27  Andreas Krebbel  andreas.kreb...@de.ibm.com

* config/s390/s390.c (s390_memory_move_cost): Increase costs for
memory accesses.

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 1409fa8..9c67157 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -2434,7 +2434,7 @@ s390_memory_move_cost (machine_mode mode ATTRIBUTE_UNUSED,
   reg_class_t rclass ATTRIBUTE_UNUSED,
   bool in ATTRIBUTE_UNUSED)
 {
-  return 1;
+  return 2;
 }
 
 /* Compute a (partial) cost for rtx X.  Return true if the complete



Re: [Patch, Fortran] PR64771 - Fix coarray ICE

2015-01-27 Thread Rainer Orth
Jakub Jelinek ja...@redhat.com writes:

 The problem is (as so often) that algorithm was included *before*
 config.h.  Moving it after the other includes allows interface.c to
 compile without warnings.

 Why don't you use MAX macro instead of std::max as everywhere else
 in the gcc sources?

No idea, ask Tobias :-)  Anyway, the original patch would most likely
have worked: system.h already includes algorithm.

This one compiles just as well, of course.

Rainer


2015-01-27  Rainer Orth  r...@cebitec.uni-bielefeld.de

* interface.c: Remove algorithm.
(check_dummy_characteristics): Use MAX instead of std::max.

# HG changeset patch
# Parent a742f8ce2a00e481ddf92dbecaf8d1ee01448911
Avoid std::max

diff --git a/gcc/fortran/interface.c b/gcc/fortran/interface.c
--- a/gcc/fortran/interface.c
+++ b/gcc/fortran/interface.c
@@ -63,8 +63,6 @@ along with GCC; see the file COPYING3.  
formal argument list points to symbols within the same namespace as
the program unit name.  */
 
-#include algorithm  /* For std::max.  */
-
 #include config.h
 #include system.h
 #include coretypes.h
@@ -1215,7 +1213,7 @@ check_dummy_characteristics (gfc_symbol 
 	}
 
   if (s1-as-type == AS_EXPLICIT)
-	for (i = 0; i  s1-as-rank + std::max(0, s1-as-corank-1); i++)
+	for (i = 0; i  s1-as-rank + MAX (0, s1-as-corank-1); i++)
 	  {
 	shape1 = gfc_subtract (gfc_copy_expr (s1-as-upper[i]),
   gfc_copy_expr (s1-as-lower[i]));

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [Patch, Fortran] PR64771 - Fix coarray ICE

2015-01-27 Thread Tobias Burnus
Rainer Orth wrote:
  Why don't you use MAX macro instead of std::max as everywhere else
  in the gcc sources?

 No idea, ask Tobias :-)

No real reason - presumably, because I had MAX not in mind and thought
of the general move towards standard features.

 Anyway, the original patch would most likely
 have worked: system.h already includes algorithm.

It did on my system :-)

 This one compiles just as well, of course.

From my side, that patch (using MAX) is fine. Thanks for
bearing the bootstrap failure and for the patch.

Tobias


 2015-01-27  Rainer Orth  r...@cebitec.uni-bielefeld.de
 
   * interface.c: Remove algorithm.
   (check_dummy_characteristics): Use MAX instead of std::max.
 
 # HG changeset patch
 # Parent a742f8ce2a00e481ddf92dbecaf8d1ee01448911
 Avoid std::max
 
 diff --git a/gcc/fortran/interface.c b/gcc/fortran/interface.c
 --- a/gcc/fortran/interface.c
 +++ b/gcc/fortran/interface.c
 @@ -63,8 +63,6 @@ along with GCC; see the file COPYING3.  
 formal argument list points to symbols within the same namespace as
 the program unit name.  */
  
 -#include algorithm  /* For std::max.  */
 -
  #include config.h
  #include system.h
  #include coretypes.h
 @@ -1215,7 +1213,7 @@ check_dummy_characteristics (gfc_symbol 
   }
  
if (s1-as-type == AS_EXPLICIT)
 - for (i = 0; i  s1-as-rank + std::max(0, s1-as-corank-1); i++)
 + for (i = 0; i  s1-as-rank + MAX (0, s1-as-corank-1); i++)
 {
   shape1 = gfc_subtract (gfc_copy_expr (s1-as-upper[i]),
 gfc_copy_expr (s1-as-lower[i]));


Re: RFA: patch to fix a bad code generation for PR64110 -- new constraints addition

2015-01-27 Thread Vladimir Makarov
On 01/27/2015 09:08 AM, Richard Sandiford wrote:

 Yeah, but in practice that's only ever going to be a partial transition.
 Many port maintainers won't look at this, so we'll have to support both
 versions indefinitely, even if the new behaviour turns out to be the
 best for all cases.

 I just think we're going to regret having two sets of constraints with
 such subtly different meanings.

 Looking back at the original PR, Jakub said:

   The ! has been added by me for PR63594, so it isn't there from the era
   when i?86 backend was using reload.  If there is a better way to
   express that RA should prefer to use memory or xmm register and only
   use r constraint if it already is in a r register and doesn't need to
   be reloaded, I can use that.  Whether it is ?, ??? or something else.
   ! description in gcc docs just fitted most what I wanted...

 In some ways this seems to match the intention of *.  Originally I think
 it was just an RA-only thing and was ignored by reload, but LRA does take it
 into account too (which sounds like progress to me).
  I guess we don't need '*' in many cases.  It is overused.  Imho, IRA
should decide what class is better based on costs of alternatives and
the explicit exclusion of register class by using '*' is a bad practice.

  Saying that I believe we should do register class preferrencing
algorithm more alternative oriented.  The algorithm should choose first
an alternative (of may be subset of alternatives) and then register
classes.  I think it is more logical.  It would permits us to rid off
all such constraints including '*' and use only one like '?' which
increases the alternative cost.

  In perspective it is even better to rid of '?' too and have some hook
(or attribute) to get insn alternative costs which can be depended on
sub-target or other run-time characteristics.  Otherwise we need to
duplicate insn descriptions and put different insn guards.  I am going
to work on this.  But it is hard to say will it work well (may be I have
some performance issues with this).  This hook somehow (min or average
of the values for all alternatives) can be used in combiner and other
algorithms need an insn cost. That is how I see the solution of the
problem in a long perspective.
 If I revert the patch locally and change the *vec_dupmode pattern to
 use *, it passes both the test for PR64110 and the tests for PR63594.
 Would that be OK as an alternative?

  I don't think it will work in general case.  It probably works because
a different class is chosen in IRA.  If IRA for some reasons choose the
same class, we might see the same problem in LRA.  I also don't like
when register classes are excluded by '*' for IRA (see my thoughts above).




Fix ICE in ipa-devirt

2015-01-27 Thread Jan Hubicka
Hi,
the two testcases show somewhat crazy layout of C++ object that goes
in order base1,base2,virtual_base_of_base1
this confuses the walk in get_binfo_at_offset while looking for
virtual_base_of_base1 to look into base2 instead of base1.

It seems that in the case of virtual inheritance we simply want to do
fully recursive search for the given binfo - it is not that expensiv ebecause
bases are not many.

Bootstrapped/regtested x86_64-linux.  Will commit it today after rebuilding
firefox.

PR IPA/60871
PR IPA/64139
* tree.c (lookup_binfo_at_offset): New function.
(get_binfo_at_offset): Use it.

* g++.dg/torture/pr64139.C: New testcase.
* g++.dg/torture/pr60871.C: Likewise.
Index: tree.c
===
--- tree.c  (revision 220142)
+++ tree.c  (working copy)
@@ -11990,6 +11990,23 @@ type_in_anonymous_namespace_p (const_tre
   return (TYPE_STUB_DECL (t)  !TREE_PUBLIC (TYPE_STUB_DECL (t)));
 }
 
+/* Lookup sub-BINFO of BINFO of TYPE at offset POS.  */
+
+tree
+lookup_binfo_at_offset (tree binfo, tree type, HOST_WIDE_INT pos)
+{
+  unsigned int i;
+  tree base_binfo, b;
+
+  for (i = 0; BINFO_BASE_ITERATE (binfo, i, base_binfo); i++)
+if (pos == tree_to_shwi (BINFO_OFFSET (base_binfo))
+types_same_for_odr (TREE_TYPE (base_binfo), type))
+  return base_binfo;
+else if ((b = lookup_binfo_at_offset (base_binfo, type, pos)) != NULL)
+  return b;
+  return NULL;
+}
+
 /* Try to find a base info of BINFO that would have its field decl at offset
OFFSET within the BINFO type and which is of EXPECTED_TYPE.  If it can be
found, return, otherwise return NULL_TREE.  */
@@ -12027,42 +12044,22 @@ get_binfo_at_offset (tree binfo, HOST_WI
 represented in the binfo for the derived class.  */
   else if (offset != 0)
{
- tree base_binfo, binfo2 = binfo;
-
- /* Find BINFO corresponding to FLD.  This is bit harder
-by a fact that in virtual inheritance we may need to walk down
-the non-virtual inheritance chain.  */
- while (true)
-   {
- tree containing_binfo = NULL, found_binfo = NULL;
- for (i = 0; BINFO_BASE_ITERATE (binfo2, i, base_binfo); i++)
-   if (types_same_for_odr (TREE_TYPE (base_binfo), TREE_TYPE 
(fld)))
- {
-   found_binfo = base_binfo;
-   break;
- }
-   else
- if ((tree_to_shwi (BINFO_OFFSET (base_binfo)) 
-  - tree_to_shwi (BINFO_OFFSET (binfo)))
- * BITS_PER_UNIT  pos
- /* Rule out types with no virtual methods or we can get 
confused
-here by zero sized bases.  */
-  TYPE_BINFO (BINFO_TYPE (base_binfo))
-  BINFO_VTABLE (TYPE_BINFO (BINFO_TYPE (base_binfo)))
-  (!containing_binfo
- || (tree_to_shwi (BINFO_OFFSET (containing_binfo))
-  tree_to_shwi (BINFO_OFFSET (base_binfo)
-   containing_binfo = base_binfo;
- if (found_binfo)
-   {
- binfo = found_binfo;
- break;
-   }
- if (!containing_binfo)
-   return NULL_TREE;
- binfo2 = containing_binfo;
-   }
-   }
+ tree found_binfo = NULL, base_binfo;
+ int offset = (tree_to_shwi (BINFO_OFFSET (binfo)) + pos
+   / BITS_PER_UNIT);
+
+ for (i = 0; BINFO_BASE_ITERATE (binfo, i, base_binfo); i++)
+   if (tree_to_shwi (BINFO_OFFSET (base_binfo)) == offset
+types_same_for_odr (TREE_TYPE (base_binfo), TREE_TYPE (fld)))
+ {
+   found_binfo = base_binfo;
+   break;
+ }
+ if (found_binfo)
+   binfo = found_binfo;
+ else
+   binfo = lookup_binfo_at_offset (binfo, TREE_TYPE (fld), offset);
+}
 
   type = TREE_TYPE (fld);
   offset -= pos;
Index: testsuite/g++.dg/torture/pr64139.C
===
--- testsuite/g++.dg/torture/pr64139.C  (revision 0)
+++ testsuite/g++.dg/torture/pr64139.C  (revision 0)
@@ -0,0 +1,34 @@
+// { dg-do compile }
+class IObject {
+public:
+  virtual ~IObject();
+};
+class A {
+  virtual int m_fn1();
+};
+class B {
+public:
+  virtual int m_fn2(B) const;
+};
+class D : IObject, public virtual B {};
+class G : public D, A {
+public:
+  G(A);
+};
+class F : B {
+  friend class C;
+};
+class C {
+  void m_fn3(const IObject , int );
+  void m_fn4(const B , int );
+};
+A a;
+void C::m_fn3(const IObject , int p2) {
+  G r(a);
+  m_fn4(r, p2);
+}
+void C::m_fn4(const B p1, int ) {
+  F b;
+  p1.m_fn2(b);
+}
+
Index: testsuite/g++.dg/torture/pr60871.C

Re: [PATCH][AArch32] Testcase fix for __ATOMIC_CONSUME

2015-01-27 Thread Ramana Radhakrishnan
On Tue, Jan 27, 2015 at 4:06 PM, Alex Velenko alex.vele...@arm.com wrote:

 Hi,

 This patch fixes arm/atomic-op-consume.c test to expect safe LDAEX
 instruction to be generated when __ATOMIC_CONSUME semantics is requested.

 This patch was tested by running the modified test on arm-none-eabi and
 arm-none-linux-gnueabi compilers.

 Is this patch ok?

Ok. Please remember James's comments in the future about cover notes.

Ramana


 Alex

 2015-01-27  Alex Velenko  alex.vele...@arm.com

 gcc/testsuite/

   * gcc.target/arm/atomic-op-consume.c (scan-assember-times): Adjust
   scan-assembler-times pattern.

 diff --git a/gcc/testsuite/gcc.target/arm/atomic-op-consume.c 
 b/gcc/testsuite/gcc.target/arm/atomic-op-consume.c
 index 0354717..cc6c028 100644
 --- a/gcc/testsuite/gcc.target/arm/atomic-op-consume.c
 +++ b/gcc/testsuite/gcc.target/arm/atomic-op-consume.c
 @@ -5,6 +5,9 @@

  #include ../aarch64/atomic-op-consume.x

 -/* { dg-final { scan-assembler-times ldrex\tr\[0-9\]+, \\\[r\[0-9\]+\\\] 6 
 } } */
 +/* To workaround Bugzilla 59448 issue, a request for __ATOMIC_CONSUME is 
 always
 +   promoted to __ATOMIC_ACQUIRE, implemented as MEMMODEL_ACQUIRE.  This 
 causes
 +   LDAEX to be generated instead of LDREX.  */
 +/* { dg-final { scan-assembler-times ldaex\tr\[0-9\]+, \\\[r\[0-9\]+\\\] 6 
 } } */
  /* { dg-final { scan-assembler-times strex\t...?, r\[0-9\]+, 
 \\\[r\[0-9\]+\\\] 6 } } */
  /* { dg-final { scan-assembler-not dmb } } */


[PATCH][AArch64] Testcase fix for __ATOMIC_CONSUME

2015-01-27 Thread Alex Velenko

Hi,

This patch fixes aarch64/atomic-op-consume.c test to expect safe LDAXR
instruction to be generated when __ATOMIC_CONSUME semantics is requested.

This patch was tested by running the modified test on aarch64-none-elf
compiler.

Is this patch ok?

Alex

2015-01-27  Alex Velenko  alex.vele...@arm.com

gcc/testsuite/

  * gcc.target/aarch64/atomic-op-consume.c (scan-assember-times): Adjust
  scan-assembler-times pattern.

diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-op-consume.c 
b/gcc/testsuite/gcc.target/aarch64/atomic-op-consume.c
index 38d6c2c..7ece5b1 100644
--- a/gcc/testsuite/gcc.target/aarch64/atomic-op-consume.c
+++ b/gcc/testsuite/gcc.target/aarch64/atomic-op-consume.c
@@ -3,5 +3,8 @@
 
 #include atomic-op-consume.x
 
-/* { dg-final { scan-assembler-times ldxr\tw\[0-9\]+, \\\[x\[0-9\]+\\\] 6 } 
} */
+/* To workaround Bugzilla 59448 issue, a request for __ATOMIC_CONSUME is always
+   promoted to __ATOMIC_ACQUIRE, implemented as MEMMODEL_ACQUIRE.  This causes
+   LDAXR to be generated instead of LDXR.  */
+/* { dg-final { scan-assembler-times ldaxr\tw\[0-9\]+, \\\[x\[0-9\]+\\\] 6 } 
} */
 /* { dg-final { scan-assembler-times stxr\tw\[0-9\]+, w\[0-9\]+, 
\\\[x\[0-9\]+\\\] 6 } } */

[PATCH][AArch64][test][committed] Fix FAIL: gcc.target/aarch64/store-pair-1.c scan-assembler stp\tw[0-9]+, w[0-9]+

2015-01-27 Thread Kyrill Tkachov

Hi all,

I notice this test fails on aarch64-none-elf because the scan-assembler 
scans for wnum registers when one of them can be the wzr reg since we 
store a 0 into *a.

This patch updates the pattern that is scanned for.

Committed as obvious with r220176.

Thanks,
Kyrill

2015-01-27  Kyrylo Tkachov  kyrylo.tkac...@arm.com

* gcc.target/aarch64/store-pair-1.c: Update scan-assembler to check
for wzr reg.diff --git a/gcc/testsuite/gcc.target/aarch64/store-pair-1.c b/gcc/testsuite/gcc.target/aarch64/store-pair-1.c
index a726d64..a90fc61 100644
--- a/gcc/testsuite/gcc.target/aarch64/store-pair-1.c
+++ b/gcc/testsuite/gcc.target/aarch64/store-pair-1.c
@@ -9,4 +9,4 @@ int f(int *a, int b)
 }
 
 /* We should be able to produce store pair for the store of 28/29 store. */
-/* { dg-final { scan-assembler stp\tw\[0-9\]+, w\[0-9\]+ } } */
+/* { dg-final { scan-assembler stp\tw(\[0-9\]+)\|(zr), w\[0-9\]+ } } */

[PATCH] PR jit/64780: configure: --enable-host-shared and the jit

2015-01-27 Thread David Malcolm
Currently the jit requires you to specify --enable-host-shared, or the
build eventually fails with  linker errors (this is something of a FAQ
for people trying out the jit).

We seem to have two choices here:

(A) default to --enable-host-shared when jit is an enabled language
(B) have the toplevel configure reject jit as language if
--enable-host-shared is not supplied.

FWIW apparently Darwin defaults to position-independent code, so it's not
explicitly needed there.

I think (B) is the better option for us, since there is a performance
cost: there are people who perform benchmarking of GCC (and publish
their results on prominent websites).  If they turn on the jit and use
the same configuration to do their benchmarking of the rest of GCC,
they'll see GCC 5 be apparently slower than earlier releases.  This
is sufficiently subtle that I don't think it's reasonable to simply
document it and expect 3rd-party reviewers to see such a note in the
documentation before benchmarking.

The attached patch implements (B), with a note in the error message
recommending that people configure and build GCC twice to avoid the
performance hit, so that it can be self-documenting.

Tested by hand with various combinations of values for
--enable-host-shared and --enable-languages.

OK for stage 4?

PR jit/64780
* configure.ac: Require the user to explicitly specify
--enable-host-shared if the jit is enabled.
* configure: Regenerate.
---
 configure| 24 
 configure.ac | 24 
 2 files changed, 48 insertions(+)

diff --git a/configure b/configure
index 5860241..dd794db 100755
--- a/configure
+++ b/configure
@@ -14750,6 +14750,30 @@ fi
 
 
 
+# PR jit/64780: Require the user to explicitly specify
+# --enable-host-shared if the jit is enabled, hinting
+# that they might want to do a separate configure/build of
+# the jit, to avoid users from slowing down the rest of the
+# compiler by enabling the jit.
+if test ${host_shared} = no ; then
+  case ${enable_languages} in
+*jit*)
+  as_fn_error 
+Enabling language \jit\ requires --enable-host-shared.
+
+--enable-host-shared typically slows the rest of the compiler down by
+a few %, so you must explicitly enable it.
+
+If you want to build both the jit and the regular compiler, it is often
+best to do this via two separate configure/builds, in separate
+directories, to avoid imposing the performance cost of
+--enable-host-shared on the regular compiler. $LINENO 5
+  ;;
+*)
+  ;;
+  esac
+fi
+
 # Specify what files to not compare during bootstrap.
 
 compare_exclusions=gcc/cc*-checksum\$(objext) | gcc/ada/*tools/*
diff --git a/configure.ac b/configure.ac
index 267c8e6..4ea5e00 100644
--- a/configure.ac
+++ b/configure.ac
@@ -3467,6 +3467,30 @@ AC_ARG_ENABLE(host-shared,
 [host_shared=$enableval], [host_shared=no])
 AC_SUBST(host_shared)
 
+# PR jit/64780: Require the user to explicitly specify
+# --enable-host-shared if the jit is enabled, hinting
+# that they might want to do a separate configure/build of
+# the jit, to avoid users from slowing down the rest of the
+# compiler by enabling the jit.
+if test ${host_shared} = no ; then
+  case ${enable_languages} in
+*jit*)
+  AC_MSG_ERROR([
+Enabling language jit requires --enable-host-shared.
+
+--enable-host-shared typically slows the rest of the compiler down by
+a few %, so you must explicitly enable it.
+
+If you want to build both the jit and the regular compiler, it is often
+best to do this via two separate configure/builds, in separate
+directories, to avoid imposing the performance cost of
+--enable-host-shared on the regular compiler.])
+  ;;
+*)
+  ;;
+  esac
+fi
+
 # Specify what files to not compare during bootstrap.
 
 compare_exclusions=gcc/cc*-checksum\$(objext) | gcc/ada/*tools/*
-- 
1.8.5.3



Re: RFA: patch to fix a bad code generation for PR64110 -- new constraints addition

2015-01-27 Thread Richard Sandiford
Vladimir Makarov vmaka...@redhat.com writes:
 On 01/27/2015 09:08 AM, Richard Sandiford wrote:
 Yeah, but in practice that's only ever going to be a partial transition.
 Many port maintainers won't look at this, so we'll have to support both
 versions indefinitely, even if the new behaviour turns out to be the
 best for all cases.

 I just think we're going to regret having two sets of constraints with
 such subtly different meanings.

 Looking back at the original PR, Jakub said:

   The ! has been added by me for PR63594, so it isn't there from the era
   when i?86 backend was using reload.  If there is a better way to
   express that RA should prefer to use memory or xmm register and only
   use r constraint if it already is in a r register and doesn't need to
   be reloaded, I can use that.  Whether it is ?, ??? or something else.
   ! description in gcc docs just fitted most what I wanted...

 In some ways this seems to match the intention of *.  Originally I think
 it was just an RA-only thing and was ignored by reload, but LRA does take it
 into account too (which sounds like progress to me).
   I guess we don't need '*' in many cases.  It is overused.  Imho, IRA
 should decide what class is better based on costs of alternatives and
 the explicit exclusion of register class by using '*' is a bad practice.

   Saying that I believe we should do register class preferrencing
 algorithm more alternative oriented.  The algorithm should choose first
 an alternative (of may be subset of alternatives) and then register
 classes.  I think it is more logical.  It would permits us to rid off
 all such constraints including '*' and use only one like '?' which
 increases the alternative cost.

   In perspective it is even better to rid of '?' too and have some hook
 (or attribute) to get insn alternative costs which can be depended on
 sub-target or other run-time characteristics.  Otherwise we need to
 duplicate insn descriptions and put different insn guards.  I am going
 to work on this.  But it is hard to say will it work well (may be I have
 some performance issues with this).  This hook somehow (min or average
 of the values for all alternatives) can be used in combiner and other
 algorithms need an insn cost. That is how I see the solution of the
 problem in a long perspective.

Definitely agree that it'd be better to remove these constraints
in favour of a new attribute.  preferred_for_size and preferred_for_speed
give something similar, though they're much more stringent than what
we need here.

 If I revert the patch locally and change the *vec_dupmode pattern to
 use *, it passes both the test for PR64110 and the tests for PR63594.
 Would that be OK as an alternative?

   I don't think it will work in general case.  It probably works because
 a different class is chosen in IRA.  If IRA for some reasons choose the
 same class, we might see the same problem in LRA.

But isn't that the point of '*'?  It should stop IRA from using the 'r'
alternative as an indication that 'r' is a good choice for this instruction.
If IRA chooses 'r' anyway, it must be because other instructions that
use the same allocno strongly prefer 'r'.

And in those some circumstances -- i.e. if IRA does choose 'r' despite
the constraints in this instruction -- then I think we do want to use the
'r' alternative.  And AIUI that's also what the new constraint is designed
to do.  If IRA chooses 'r' anyway, the new constraint causes LRA to prefer
the 'r' alternative _even if_ another operand (the destination) has to
be reloaded, which is the fundamental difference between the new constraint
and '!'.

So I'm still not sure why '*' wouldn't do what we want.

  I also don't like when register classes are excluded by '*' for IRA
 (see my thoughts above).

Understood, and I agree it would be good to move to attributes.
But in a way, I think that's an even better reason to try to avoid
adding these new constraints.  It sounds like we're hoping to get rid
of them as soon as we've added them :-)

Thanks,
Richard



Re: [ping] Re: proper name of i386/x86-64/etc targets

2015-01-27 Thread Uros Bizjak
On Tue, Jan 27, 2015 at 2:56 AM, Sandra Loosemore
san...@codesourcery.com wrote:
 On 01/20/2015 12:02 PM, H.J. Lu wrote:

 On Tue, Jan 20, 2015 at 10:51 AM, Eric Botcazou ebotca...@adacore.com
 wrote:

 Ping?  Any thoughts?


 x86 for the family and x86-32/x86-64 for the 2 architectures?


 Works for me.


 [redirecting from gcc@ to gcc-patches@]

 OK, here is a patch that attempts to implement that convention.  I'd
 appreciate review from a target maintainer to check that I've correctly
 disambiguated places where i386 was referring to both 32- and 64-bit
 variants vs 32-bit only.  I've left alone some instances of i386 where it
 seemed appropriate to name a specific processor -- e.g. there are a bunch of
 examples in the inline asm section that are described as i386 code.

 If this is OK to commit, I will follow it up with another patch to
 re-alphabetize the renamed sections (i386 whatever to x86 whatever).
 Trying to do both the renaming and the shuffling in a single patch would
 have made it impossible to review the actual changes to content. When I was
 working on this I also realized that some of the x86-specific material in
 extend.texi really needs copy-editing; again, best to do that in a separate
 patch.

-@node i386 and x86-64 Options
-@subsection Intel 386 and AMD x86-64 Options
+@node x86 Options
+@subsection x86 Options
 @cindex i386 Options
-@cindex x86-64 Options
+@cindex x86 Options
+@cindex IA-32 Options
 @cindex Intel 386 Options
 @cindex AMD x86-64 Options

Let's go all the way and remove all but  @cindex x86 Options.

-These @samp{-m} options are defined for the i386 and x86-64 family of
-computers:
+These @samp{-m} options are defined for the x86 family of computers,
+including both x86-32 (IA-32 and Intel 386) and AMD x86-64:

Also here. ... the x86 family of computers.. Without the including ... part.

-@node i386 and x86-64 Windows Options
-@subsection i386 and x86-64 Windows Options
-@cindex i386 and x86-64 Windows Options
+@node x86 Windows Options
+@subsection x86 Windows Options
+@cindex x86 Windows Options
+@cindex i386 Windows Options
+@cindex Intel 386 Windows Options
+@cindex AMD x86-64 Windows Options
+@cindex Windows Options for x86

IMO, all but @cindex x86 Windows Options should be removed.

Others LGTM.

Thanks,
Uros.