date:20150127

On Mon, 26 Jan 2015, Jakub Jelinek wrote:

 Hi!
 
 On x86_64-darwin, we ICE on one of the pr64307.c testcase, because
 expand_thunk doesn't load non-gimple_val arguments into registers
 for the first argument, only for all the other ones.
 Supposedly normally thunks were meant to have this argument as pointer first
 and thus it wasn't an issue, but in the -O0 -fipa-icf case a thunk is
 created even for a non-method.
 
 This patch fixes it by special-casing the first argument only if
 this_adjusting - then we know it is a pointer that is being adjusted.
 
 Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.

Thanks,
Richard.

 2015-01-26  Jakub Jelinek  ja...@redhat.com
 
   PR ipa/64776
   * cgraphunit.c (cgraph_node::expand_thunk): If not this_adjusting,
   handle the first argument in the same loop as all the other arguments.
 
 --- gcc/cgraphunit.c.jj   2015-01-15 14:05:05.0 +0100
 +++ gcc/cgraphunit.c  2015-01-26 17:26:18.629818527 +0100
 @@ -1610,14 +1610,18 @@ cgraph_node::expand_thunk (bool output_a
for (arg = a; arg; arg = DECL_CHAIN (arg))
  nargs++;
auto_vec<tree> vargs (nargs);
 +  i = 0;
 +  arg = a;
if (this_adjusting)
 -vargs.quick_push (thunk_adjust (bsi, a, 1, fixed_offset,
 - virtual_offset));
 -  else if (nargs)
 -vargs.quick_push (a);
 + {
 +   vargs.quick_push (thunk_adjust (bsi, a, 1, fixed_offset,
 +   virtual_offset));
 +   arg = DECL_CHAIN (a);
 +   i = 1;
 + }
  
if (nargs)
 -for (i = 1, arg = DECL_CHAIN (a); i < nargs; i++, arg = DECL_CHAIN (arg))
 + for (; i < nargs; i++, arg = DECL_CHAIN (arg))
 {
   tree tmp = arg;
   if (!is_gimple_val (arg))
 
   Jakub
 
 

-- 
Richard Biener rguent...@suse.de
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild,
Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)

[PATCH] Fix PR64798


The new exceptional EH allocator failed to align exception objects
properly (it ended up aligning to __alignof__((std::size_t))).  The
following fixes that by aligning to what __attribute__((aligned))
would align to (this is what _Unwind_Exception is aligned to, a
member of __cxa_refcounted_exception).
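
A minimal sketch of the size and alignment arithmetic described above, written in C11 rather than the C++ of eh_alloc.cc. The names `struct entry`, `round_up` and `block_size` are hypothetical, and `_Alignas (max_align_t)` stands in for the GNU `__attribute__((aligned))` used by the patch; this is not the libstdc++ code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical entry header: a size field followed by maximally aligned
   payload bytes.  Illustrative only, not the libstdc++ definition.  */
struct entry
{
  size_t size;
  _Alignas (max_align_t) unsigned char data[];
};

/* Round SIZE up to a multiple of ALIGN, which must be a power of two.  */
static size_t
round_up (size_t size, size_t align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return (size + align - 1) & ~(align - 1);
}

/* Bytes to reserve for a request of PAYLOAD bytes: the header including
   its padding, rounded up so that the tail of the block stays aligned.  */
static size_t
block_size (size_t payload)
{
  return round_up (payload + offsetof (struct entry, data),
                   _Alignof (max_align_t));
}
```

Using `offsetof` instead of `sizeof (size_t)` is what accounts for the padding between the size field and the aligned payload; the same offset is then subtracted again when freeing.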

Bootstrapped and tested on x86_64-unknown-linux-gnu - Rainer is
testing this on sparc-solaris where it broke 
g++.old-deja/g++.eh/badalloc1.C.

Ok for trunk?

Thanks,
Richard.

2015-01-27  Richard Biener  rguent...@suse.de

PR libstdc++/64798
* libsupc++/eh_alloc.cc (struct allocated_entry): Align
data member.
(pool::allocate): Adjust allocation size and alignment to
that change.
(pool::free): Adjust pointer offsetting.

Index: libstdc++-v3/libsupc++/eh_alloc.cc
===
--- libstdc++-v3/libsupc++/eh_alloc.cc  (revision 220164)
+++ libstdc++-v3/libsupc++/eh_alloc.cc  (working copy)
@@ -94,7 +94,7 @@ namespace
   };
   struct allocated_entry {
std::size_t size;
-   char data[];
+   char data[] __attribute__((aligned));
   };
 
   // A single mutex controlling emergency allocations.
@@ -133,17 +133,18 @@ namespace
   void *pool::allocate (std::size_t size)
 {
   __gnu_cxx::__scoped_lock sentry(emergency_mutex);
-  // We need an additional size_t member.
-  size += sizeof (std::size_t);
+  // We need an additional size_t member plus the padding to
+  // ensure proper alignment of data.
+  size += offsetof (allocated_entry, data);
   // And we need to at least hand out objects of the size of
   // a freelist entry.
   if (size < sizeof (free_entry))
size = sizeof (free_entry);
-  // And we need to align objects we hand out to the required
-  // alignment of a freelist entry (this really aligns the
+  // And we need to align objects we hand out to the maximum
+  // alignment required on the target (this really aligns the
   // tail which will become a new freelist entry).
-  size = ((size + __alignof__(free_entry) - 1)
-          & ~(__alignof__(free_entry) - 1));
+  size = ((size + __alignof__ (allocated_entry::data) - 1)
+          & ~(__alignof__ (allocated_entry::data) - 1));
   // Search for an entry of proper size on the freelist.
   free_entry **e;
   for (e = first_free_entry;
@@ -185,7 +186,7 @@ namespace
 {
   __gnu_cxx::__scoped_lock sentry(emergency_mutex);
   allocated_entry *e = reinterpret_cast <allocated_entry *>
-   (reinterpret_cast <char *> (data) - sizeof (std::size_t));
+   (reinterpret_cast <char *> (data) - offsetof (allocated_entry, data));
   std::size_t sz = e->size;
   if (!first_free_entry)
{

Re: RFA: patch to fix a bad code generation for PR64110 -- new constraints addition

2015-01-27 Thread Richard Sandiford

Jeff Law l...@redhat.com writes:
 On 01/24/15 04:29, Richard Sandiford wrote:

 Yeah.  I expect in practice most people who used ? and ! attached
 them to a particular operand for a reason.  From a quick scan through
 386.exp it looked like almost all uses would either want this behaviour
 or wouldn't care.  An interesting exception is:

 (define_insn "extendsidi2_1"
   [(set (match_operand:DI 0 "nonimmediate_operand" "=*A,r,?r,?*o")
         (sign_extend:DI (match_operand:SI 1 "register_operand" "0,0,r,r")))
    (clobber (reg:CC FLAGS_REG))
    (clobber (match_scratch:SI 2 "=X,X,X,r"))]
   "!TARGET_64BIT"
   "#")

 I don't know how effective the third alternative is with LRA.  Surely
 a r-0 alternative is by definition a case where r-r is possible
 only with a ?-cost reload?  Seems to me we could just delete it.
 But assuming it does some good, I suppose the ? really does apply to
 the alternative as a whole.  If we had to reload operand 1 or operand 0,
 there's an extra cost if it can't use the same register as the other
 operand.

 Wouldn't it be better to make ? and ! behave the new way and only
 add new constraints if it turns out that the old behaviour really is
 useful in some cases?

 Maybe stage 4 isn't the time to be making that kind of change.
 Still, it'd be great if someone who's set up to do x86_64 benchmarking
 could measure the effect of making ? and ! behave like the
 new constraints.
 My worry isn't the x86_64 port, but all the others that folks don't test 
 as regularly.

 I'd rather go the other direction, have folks familiar with the port go 
 through it changing the constraints where it makes sense.  That just 
 seems a hell of a lot safer.

 A port maintainer could certainly hack something together for testing 
 purposes to guide them as to whether or not there's something to be 
 gained by converting many/most of the ?! to the new constraints.

Yeah, but in practice that's only ever going to be a partial transition.
Many port maintainers won't look at this, so we'll have to support both
versions indefinitely, even if the new behaviour turns out to be the
best for all cases.

I just think we're going to regret having two sets of constraints with
such subtly different meanings.

Looking back at the original PR, Jakub said:

  The ! has been added by me for PR63594, so it isn't there from the era
  when i?86 backend was using reload.  If there is a better way to
  express that RA should prefer to use memory or xmm register and only
  use r constraint if it already is in a r register and doesn't need to
  be reloaded, I can use that.  Whether it is ?, ??? or something else.
  ! description in gcc docs just fitted most what I wanted...

In some ways this seems to match the intention of *.  Originally I think
it was just an RA-only thing and was ignored by reload, but LRA does take it
into account too (which sounds like progress to me).

If I revert the patch locally and change the *vec_dup<mode> pattern to
use *, it passes both the test for PR64110 and the tests for PR63594.
Would that be OK as an alternative?

Thanks,
Richard

Re: Merge current set of OpenACC changes from gomp-4_0-branch

2015-01-27 Thread Julian Brown

On Mon, 26 Jan 2015 17:34:26 +0300
Ilya Verbin iver...@gmail.com wrote:

 Here is my current patch, it works for OpenMP-MIC, but obviously
 will not work for PTX, since it requires symmetrical changes in the
 plugin.  Could you please take a look, whether it is possible to
 support this new interface in PTX plugin?

I think it can probably be made to work. I'll have a look in more
detail.

Thanks,

Julian

Re: [Patch, Fortran, OOP] PR 64230: [4.9/5 Regression] Invalid memory reference in a compiler-generated finalizer for allocatable component

On Tue, Jan 27, 2015 at 07:20:10PM +0100, Janus Weil wrote:
 2015-01-27 10:30 GMT+01:00 Jakub Jelinek ja...@redhat.com:
  Yeah, if you want to add ubsan tests, you need to add gfortran.dg/ubsan/
  directory and hack up ubsan.exp in there
 
 Thanks for the remark, I was suspecting something like that. However,
 for this case it's not really worth the hassle. In fact the test case
 does not really need the sanitizer and should also work without it. So
 I'll just remove the -fsanitize option:
 
 Index: gcc/testsuite/gfortran.dg/class_allocate_18.f90
 ===
 --- gcc/testsuite/gfortran.dg/class_allocate_18.f90(Revision 220180)
 +++ gcc/testsuite/gfortran.dg/class_allocate_18.f90(Arbeitskopie)
 @@ -1,5 +1,4 @@
  ! { dg-do run }
 -! { dg-options "-fsanitize=undefined" }
  !
  ! PR 64230: [4.9/5 Regression] Invalid memory reference in a
 compiler-generated finalizer for allocatable component
  !

LGTM.

Jakub

Re: [PATCH][RFA][PR target/15184] Partial fix for direct byte access on x86


On 01/26/15 22:11, Segher Boessenkool wrote:

On Mon, Jan 26, 2015 at 08:07:29PM -0700, Jeff Law wrote:

The second change we need is an additional simplification.

If we have
(subreg:M1 (zero_extend:M2 (x)))

Where M1 > M2 and both are scalar integer modes.  It's advantageous to
strip the SUBREG and instead have a wider extension.


Should you also check M1 is not multiple registers?
We're generally working with pseudos, so we could estimate, but not know 
for sure if we're dealing with multiple hard regs.  But more 
importantly, I'm not sure what that check would buy us.






Bootstrapped and regression tested on x86_64-unknown-linux-gnu.
Thoughts?


It looks fine to me.  Well, some comments...


@@ -2643,6 +2644,24 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0,
   || GET_CODE (src) == LSHIFTRT)
nshift++;
}
+
+  /* If I0 loads a memory and I3 sets the same memory, then I2 and I3
+are likely manipulating its value.  Ideally we'll be able to combine
+all four insns into a bitfield insertion of some kind.
+
+Note the source in I0 might be inside a sign/zero extension and the
+memory modes in I0 and I3 might be different.  So extract the address
+from the destination of I3 and search for it in the source of I0.
+
+In the event that there's a match but the source/dest do not actually
+refer to the same memory, the worst that happens is we try some
+combinations that we wouldn't have otherwise.  */
+  if ((set0 = single_set (i0))
+      && (set3 = single_set (i3))
+      && GET_CODE (SET_DEST (set3)) == MEM
+      && rtx_referenced_p (XEXP (SET_DEST (set3), 0), SET_SRC (set0)))
+    ngood += 2;


I think you should test MEM_P (SET_SRC (set0)), too.  Or even just test
rtx_equal_p (SET_DEST (set3), SET_SRC (set0)) ?
Yea, we need a tighter test on set0 to ensure it's a MEM.  That code got 
twiddled before the last testrun.  I'll take care of that.


Earlier versions checked reg_equal_p on the MEM.  But that's often a 
mistake because the modes of the two memory references may be different. 
 I don't recall which of the various tests, but I was definitely seeing 
SImode in the load and HImode in the store.


Similarly you don't want to check reg_equal_p on the addresses as they 
aren't necessarily the same either (they're obviously related).


That's how I ultimately settled on rtx_referenced_p form you see above. 
 I'm still not sure that's 100% what I want, but I don't have any tests 
yet which require something more complex.








+
if (ngood < 2 && nshift < 2)
return 0;
  }
@@ -5663,6 +5682,25 @@ combine_simplify_rtx (rtx x, machine_mode op0_mode, int in_dest,
  return CONST0_RTX (mode);
}

+  /* If we have (subreg:M1 (zero_extend:M2 (x))) or
+(subreg:M1 (sign_extend: M2 (x))) where M1 is wider
+then M2, then go ahead and just widen the original extension.
+
+While the subreg is useful in saying "I don't care about those
+upper bits.  Squashing out the subreg results in simpler RTL that
+is more easily matched.  */


Closing quote missing.

Fixed locally.




+  if ((GET_CODE (SUBREG_REG (x)) == ZERO_EXTEND
+       || GET_CODE (SUBREG_REG (x)) == SIGN_EXTEND)
+      && SCALAR_INT_MODE_P (GET_MODE (x))
+      && SCALAR_INT_MODE_P (GET_MODE (SUBREG_REG (x)))
+      && GET_MODE (x) > GET_MODE (SUBREG_REG (x)))


GET_MODE_SIZE instead?
It works as-is, but using GET_MODE_SIZE shows the intent more clearly.
I'll fix that momentarily.





Does this do anything good for the dec mem thing on x86?  That would
be a nice bonus :-)
It might, but I haven't tested for that specifically.  If you've got 
sample code or a PR in mind, pass it along and I'll take a look.  I'd 
think dec mem would generally be handled by 3->1 insn combination code 
unless there's something else going on.


jeff

[RFC PATCH] Emit DW_LANG_Fortran{03,08}

Hi!

DW_LANG_Fortran03 and DW_LANG_Fortran08 DW_AT_language values were recently
accepted into DWARF5.  This patch changes GCC to handle those similarly to
how e.g. the -std=c++11, -std=c++14 or -std=c11 are handled.

As it will take some time for consumers to catch up, I'm enabling that
only if -gdwarf-5 is used for now.

2015-01-27  Jakub Jelinek  ja...@redhat.com

* dwarf2.h (enum dwarf_source_language): Add DW_LANG_Fortran03
and DW_LANG_Fortran08.
* dwarf2out.c (is_fortran): Also return true for DW_LANG_Fortran03
or DW_LANG_Fortran08.
(lower_bound_default): Return 1 for DW_LANG_Fortran03 or
DW_LANG_Fortran08.
(gen_compile_unit_die): Handle GNU Fortran2003 and
GNU Fortran2008 language strings.
* dbxout.c (get_lang_number): Use lang_GNU_Fortran.
* langhooks.h (lang_GNU_Fortran): New prototype.
* langhooks.c (lang_GNU_Fortran): New function.
fortran/
* options.c: Include langhooks.h.
(gfc_post_options): Change lang_hooks.name based on
selected -std= mode.

--- include/dwarf2.h.jj 2014-11-26 20:35:01.0 +0100
+++ include/dwarf2.h2015-01-27 17:55:18.086122137 +0100
@@ -312,6 +312,8 @@ enum dwarf_source_language
 DW_LANG_C_plus_plus_11 = 0x001a, /* dwarf5.20141029.pdf DRAFT */
 DW_LANG_C11 = 0x001d,
 DW_LANG_C_plus_plus_14 = 0x0021,
+DW_LANG_Fortran03 = 0x0022,
+DW_LANG_Fortran08 = 0x0023,
 
 DW_LANG_lo_user = 0x8000,  /* Implementation-defined range start.  */
 DW_LANG_hi_user = 0xffff,  /* Implementation-defined range start.  */
--- gcc/dwarf2out.c.jj  2015-01-27 17:54:13.0 +0100
+++ gcc/dwarf2out.c 2015-01-27 19:03:30.632411565 +0100
@@ -4736,7 +4736,9 @@ is_fortran (void)
 
   return (lang == DW_LANG_Fortran77
  || lang == DW_LANG_Fortran90
- || lang == DW_LANG_Fortran95);
+ || lang == DW_LANG_Fortran95
+ || lang == DW_LANG_Fortran03
+ || lang == DW_LANG_Fortran08);
 }
 
 /* Return TRUE if the language is Ada.  */
@@ -16720,6 +16722,8 @@ lower_bound_default (void)
 case DW_LANG_Fortran77:
 case DW_LANG_Fortran90:
 case DW_LANG_Fortran95:
+case DW_LANG_Fortran03:
+case DW_LANG_Fortran08:
   return 1;
 case DW_LANG_UPC:
 case DW_LANG_D:
@@ -19781,8 +19785,17 @@ gen_compile_unit_die (const char *filena
 {
   if (strcmp (language_string, "GNU Ada") == 0)
     language = DW_LANG_Ada95;
-  else if (strcmp (language_string, "GNU Fortran") == 0)
-   language = DW_LANG_Fortran95;
+  else if (strncmp (language_string, "GNU Fortran", 11) == 0)
+   {
+     language = DW_LANG_Fortran95;
+     if (dwarf_version >= 5 /* || !dwarf_strict */)
+       {
+         if (strcmp (language_string, "GNU Fortran2003") == 0)
+           language = DW_LANG_Fortran03;
+         else if (strcmp (language_string, "GNU Fortran2008") == 0)
+           language = DW_LANG_Fortran08;
+       }
+   }
   else if (strcmp (language_string, "GNU Java") == 0)
     language = DW_LANG_Java;
   else if (strcmp (language_string, "GNU Objective-C") == 0)
@@ -19796,7 +19809,7 @@ gen_compile_unit_die (const char *filena
}
 }
   /* Use a degraded Fortran setting in strict DWARF2 so is_fortran works.  */
-  else if (strcmp (language_string, "GNU Fortran") == 0)
+  else if (strncmp (language_string, "GNU Fortran", 11) == 0)
 language = DW_LANG_Fortran90;
 
   add_AT_unsigned (die, DW_AT_language, language);
@@ -19806,6 +19819,8 @@ gen_compile_unit_die (const char *filena
 case DW_LANG_Fortran77:
 case DW_LANG_Fortran90:
 case DW_LANG_Fortran95:
+case DW_LANG_Fortran03:
+case DW_LANG_Fortran08:
   /* Fortran has case insensitive identifiers and the front-end
 lowercases everything.  */
   add_AT_unsigned (die, DW_AT_identifier_case, DW_ID_down_case);
--- gcc/dbxout.c.jj 2015-01-15 20:25:30.0 +0100
+++ gcc/dbxout.c2015-01-27 18:58:58.286033152 +0100
@@ -967,7 +967,7 @@ get_lang_number (void)
 return N_SO_CC;
   else if (strcmp (language_string, "GNU F77") == 0)
     return N_SO_FORTRAN;
-  else if (strcmp (language_string, "GNU Fortran") == 0)
+  else if (lang_GNU_Fortran ())
     return N_SO_FORTRAN90; /* CHECKME */
   else if (strcmp (language_string, "GNU Pascal") == 0)
 return N_SO_PASCAL;
--- gcc/langhooks.c.jj  2015-01-09 21:59:54.0 +0100
+++ gcc/langhooks.c 2015-01-27 18:58:37.375387995 +0100
@@ -731,3 +731,11 @@ lang_GNU_CXX (void)
 {
   return strncmp (lang_hooks.name, "GNU C++", 7) == 0;
 }
+
+/* Returns true if the current lang_hooks represents the GNU Fortran frontend.  */
+
+bool
+lang_GNU_Fortran (void)
+{
+  return strncmp (lang_hooks.name, "GNU Fortran", 11) == 0;
+}
--- gcc/langhooks.h.jj  2015-01-05 13:07:13.0 +0100
+++ gcc/langhooks.h 2015-01-27 18:57:51.139172602 +0100
@@ -509,5 +509,6 @@ extern tree add_builtin_type (const char
 
 extern bool lang_GNU_C

Re: [RFC PATCH] Emit DW_LANG_Fortran{03,08}

2015-01-27 Thread David Malcolm

On Tue, 2015-01-27 at 19:19 +0100, Jakub Jelinek wrote:
 Hi!
 
 DW_LANG_Fortran03 and DW_LANG_Fortran08 DW_AT_language values were recently
 accepted into DWARF5.  This patch changes GCC to handle those similarly to
 how e.g. the -std=c++11, -std=c++14 or -std=c11 are handled.
 
 As it will take some time for consumers to catch up, I'm enabling that
 only if -gdwarf-5 is used for now.
 
 2015-01-27  Jakub Jelinek  ja...@redhat.com
 
   * dwarf2.h (enum dwarf_source_language): Add DW_LANG_Fortran03
   and DW_LANG_Fortran08.
   * dwarf2out.c (is_fortran): Also return true for DW_LANG_Fortran03
   or DW_LANG_Fortran08.
   (lower_bound_default): Return 1 for DW_LANG_Fortran03 or
   DW_LANG_Fortran08.
   (gen_compile_unit_die): Handle GNU Fortran2003 and
   GNU Fortran2008 language strings.
   * dbxout.c (get_lang_number): Use lang_GNU_Fortran.
   * langhooks.h (lang_GNU_Fortran): New prototype.
   * langhooks.c (lang_GNU_Fortran): New function.
 fortran/
   * options.c: Include langhooks.h.
   (gfc_post_options): Change lang_hooks.name based on
   selected -std= mode.

(...snip...)

 --- gcc/fortran/options.c.jj  2015-01-12 21:29:11.0 +0100
 +++ gcc/fortran/options.c 2015-01-27 19:07:33.729285229 +0100
 @@ -43,6 +43,7 @@ along with GCC; see the file COPYING3.
  #include "cpp.h"
  #include "diagnostic.h"  /* For global_dc.  */
  #include "tm.h"
 +#include "langhooks.h"
  
  gfc_option_t gfc_option;
  
 @@ -398,6 +399,11 @@ gfc_post_options (const char **pfilename
  
gfc_cpp_post_options ();
  
 +  if (gfc_option.allow_std & GFC_STD_F2008)
 +    lang_hooks.name = "GNU Fortran2008";
 +  else if (gfc_option.allow_std & GFC_STD_F2003)
 +    lang_hooks.name = "GNU Fortran2003";
 +

Did you test this on rs6000?

In particular, rs6000_output_function_epilogue has a:
  else if (! strcmp (language_string, "GNU F77")
           || ! strcmp (language_string, "GNU Fortran"))
    i = 1;

Does that conditional need updating to track the langhooks.name change
(maybe to use your new lang_GNU_Fortran function?)

Dave

Re: [PATCH][2/2] Improve array-bound warnings and VRP

2015-01-27 Thread Martin Uecker



Richard Biener wrote:

 On Mon, 26 Jan 2015, Jakub Jelinek wrote:

  Then it probably should be ok.  I'm really afraid of emitting more warnings
  with such high false positive rate now.

 As the patch also mitigates some of the code bloat we get with
 the complete peeling (regression against 4.7) I have installed it.
 It's also the easiest vehicle to verify range-info is not broken
 by passes between vrp1 and vrp2.

You could make warnings appear only for warn_array_bounds > 1
if there are concerns about false positives.

For what it's worth, I tested the old version of both patches on 
one of my projects (mostly numerical algorithms) and it did not 
produce additional warnings.

I really appreciate all improvements in this area.

Martin

Re: [RFC PATCH] Emit DW_LANG_Fortran{03,08}

2015-01-27 Thread Tobias Burnus


Jakub Jelinek wrote:

DW_LANG_Fortran03 and DW_LANG_Fortran08 DW_AT_language values were recently
accepted into DWARF5.  This patch changes GCC to handle those similarly to
how e.g. the -std=c++11, -std=c++14 or -std=c11 are handled.


For completeness: gfortran currently produces "GNU Fortran" and 
DW_LANG_Fortran95; GCC itself also handles ...Fortran77 and 
...Fortran90, but those are not produced with gfortran.


With the patch, it produces for -gdwarf-2/3/4 (4 is default) or 
-gdwarf-5 -std=f95 the same as above. For -std=f2003 -gdwarf-5, it 
yields "GNU Fortran2003" and DW_LANG_Fortran03. And for -gdwarf-5 and 
the rest of -std= (f2008, f2008ts, gnu, legacy), it produces "GNU 
Fortran2008" and DW_LANG_Fortran08.
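
The selection described above can be sketched roughly as follows. This is a standalone illustration in C, not the dwarf2out.c code: `is_gnu_fortran` and `pick_fortran_lang` are hypothetical names, the `LANG_*` constants mirror the DW_LANG values (0x0022 and 0x0023 come from the dwarf2.h hunk in this thread, 0x000e is DW_LANG_Fortran95), and the strict-DWARF2 fallback is left out.

```c
#include <assert.h>
#include <string.h>

/* Illustrative stand-ins for the DW_LANG_* enumerators.  */
enum
{
  LANG_FORTRAN95 = 0x000e,
  LANG_FORTRAN03 = 0x0022,
  LANG_FORTRAN08 = 0x0023
};

/* Prefix match, so "GNU Fortran", "GNU Fortran2003" and
   "GNU Fortran2008" are all recognized (11 == strlen ("GNU Fortran")).  */
static int
is_gnu_fortran (const char *name)
{
  return strncmp (name, "GNU Fortran", 11) == 0;
}

/* Return the language code for NAME, or -1 if NAME is not a GNU Fortran
   language string.  The newer codes are used only for DWARF 5 or later.  */
static int
pick_fortran_lang (const char *name, int dwarf_version)
{
  if (!is_gnu_fortran (name))
    return -1;
  if (dwarf_version >= 5)
    {
      if (strcmp (name, "GNU Fortran2003") == 0)
        return LANG_FORTRAN03;
      if (strcmp (name, "GNU Fortran2008") == 0)
        return LANG_FORTRAN08;
    }
  return LANG_FORTRAN95;
}
```

The prefix test is also why exact comparisons against "GNU Fortran" elsewhere in the tree stop matching once the language string carries a standard suffix.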


(In principle, they could have prepared for the future and added Fortran 
2015 as well.)



Regarding the change: it is fine with me. (However, I wonder how much 
will break once the || !dwarf_strict is enabled, knowing that 
compilers are often updated more frequently than debuggers, valgrind and 
similar programs. On the other hand, except for debuggers, most tools should 
not care much about the DW_LANG.)


Tobias

PS: Talking about DWARF5, do you know when it will be available as 
public draft? I am especially looking forward to 
http://dwarfstd.org/ShowIssue.php?issue=121221.1 (Allow DW_AT_type with 
DW_TAG_string_type), which would be a low-hanging fruit in terms of 
implementation. Contrary to the array additions of 130313.5.



As it will take some time for consumers to catch up, I'm enabling that
only if -gdwarf-5 is used for now.

2015-01-27  Jakub Jelinek  ja...@redhat.com

* dwarf2.h (enum dwarf_source_language): Add DW_LANG_Fortran03
and DW_LANG_Fortran08.
* dwarf2out.c (is_fortran): Also return true for DW_LANG_Fortran03
or DW_LANG_Fortran08.
(lower_bound_default): Return 1 for DW_LANG_Fortran03 or
DW_LANG_Fortran08.
(gen_compile_unit_die): Handle GNU Fortran2003 and
GNU Fortran2008 language strings.
* dbxout.c (get_lang_number): Use lang_GNU_Fortran.
* langhooks.h (lang_GNU_Fortran): New prototype.
* langhooks.c (lang_GNU_Fortran): New function.
fortran/
* options.c: Include langhooks.h.
(gfc_post_options): Change lang_hooks.name based on
selected -std= mode.

Re: [Patch, Fortran] PR64771 - Fix coarray ICE

2015-01-27 Thread Tobias Burnus


Tobias Burnus wrote:

This one compiles just as well, of course.

 From my side, that patch (using MAX) is fine. Thanks for
bearing the bootstrap failure and for the patch.


I have now committed it (i.e. Rainer's patch) as Rev. 220182.
I have also committed the fixed-up/combined patch to the 4.9 branch as 
Rev. 220184.


(BTW: The original patch was approved by Paul on IRC.)

Tobias

Re: [Patch, Fortran, OOP] PR 64230: [4.9/5 Regression] Invalid memory reference in a compiler-generated finalizer for allocatable component

2015-01-27 Thread Janus Weil

 On Tue, Jan 27, 2015 at 10:24:47AM +0100, Andreas Schwab wrote:

  2015-01-19  Janus Weil  ja...@gcc.gnu.org
 
  PR fortran/64230
  * gfortran.dg/class_allocate_18.f90: Extended.

 FAIL: gfortran.dg/class_allocate_18.f90   -O0  (test for excess errors)
 Excess errors:
 /usr/ia64-suse-linux/bin/ld: cannot find -lubsan

Sorry for the breakage, guys!


2015-01-27 10:30 GMT+01:00 Jakub Jelinek ja...@redhat.com:
 Yeah, if you want to add ubsan tests, you need to add gfortran.dg/ubsan/
 directory and hack up ubsan.exp in there

Thanks for the remark, I was suspecting something like that. However,
for this case it's not really worth the hassle. In fact the test case
does not really need the sanitizer and should also work without it. So
I'll just remove the -fsanitize option:

Index: gcc/testsuite/gfortran.dg/class_allocate_18.f90
===
--- gcc/testsuite/gfortran.dg/class_allocate_18.f90(Revision 220180)
+++ gcc/testsuite/gfortran.dg/class_allocate_18.f90(Arbeitskopie)
@@ -1,5 +1,4 @@
 ! { dg-do run }
-! { dg-options "-fsanitize=undefined" }
 !
 ! PR 64230: [4.9/5 Regression] Invalid memory reference in a
compiler-generated finalizer for allocatable component
 !


Cheers,
Janus

Re: Merge current set of OpenACC changes from gomp-4_0-branch

2015-01-27 Thread Jack Howarth

Thomas,
 Any plans to fix
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64635 soon? On x86_64
darwin, the OpenACC merge resulted in a huge number of failures in the
libgomp test suite…


=== libgomp Summary ===
# of expected passes 10628
# of unexpected failures 724
# of unsupported tests 562

which are resolved with a fix similar to
https://gcc.gnu.org/bugzilla/attachment.cgi?id=34480.
   Jack


On Mon, Jan 26, 2015 at 8:44 AM, Thomas Schwinge
tho...@codesourcery.com wrote:
 Hi!

 Sorry for the late answer -- I've been on sick leave, and just now
 returning to work.  Julian, would you please have a look at the following
 issues?

   In r219682, I have committed to trunk our current set of OpenACC changes,
   which we had prepared on gomp-4_0-branch.  Thanks to everyone who has
   been contributing!

 On Fri, 23 Jan 2015 20:20:53 +0300, Ilya Verbin iver...@gmail.com wrote:
 On 17 Jan 02:16, Ilya Verbin wrote:
  Unfortunately, it broke offloading from shared libraries (I mean common 
  libs
  with NEEDED entries, not dlopened).

 Sorry for that!

  Such things are not covered by the
  testsuite, that's why you missed this issue.  Here is a simple testcase:

 http://news.gmane.org/find-root.php?message_id=%3C20150116231632.GB48380%40msticlxl57.ims.intel.com%3E

 Probably a good motivation for adding such a test case.  ;-)

  So, you don't assume that a device can have multiple images from multiple 
  libs?

 Ping?

 This probably is just a bug that we introduced with our changes?
 (Julian?)


 Also, could you please explain, why did you divide a device initialization 
 into
 two functions -- gomp_init_device and gomp_init_tables?

 As I understand it (again, Julian, please correct me if I got that
 wrong), the reason is that for OpenACC support, we need these as two
 separate (independent) actions.  Is this causing problems for OpenMP
 offloading?


 Currently I'm trying to rebase on trunk my old patch, which fixes offloading
 from dlopened libraries: 
 http://gcc.gnu.org/ml/gcc-patches/2014-11/msg01604.html
 It works for OpenMP and MIC, but I don't know how not to break OpenACC and 
 PTX.


 Grüße,
  Thomas

Re: [Patch, Fortran, OOP] PR 64230: [4.9/5 Regression] Invalid memory reference in a compiler-generated finalizer for allocatable component

2015-01-27 Thread Janus Weil

2015-01-27 19:23 GMT+01:00 Jakub Jelinek ja...@redhat.com:
 On Tue, Jan 27, 2015 at 07:20:10PM +0100, Janus Weil wrote:
 2015-01-27 10:30 GMT+01:00 Jakub Jelinek ja...@redhat.com:
  Yeah, if you want to add ubsan tests, you need to add gfortran.dg/ubsan/
  directory and hack up ubsan.exp in there

 Thanks for the remark, I was suspecting something like that. However,
 for this case it's not really worth the hassle. In fact the test case
 does not really need the sanitizer and should also work without it. So
 I'll just remove the -fsanitize option:

 Index: gcc/testsuite/gfortran.dg/class_allocate_18.f90
 ===
 --- gcc/testsuite/gfortran.dg/class_allocate_18.f90(Revision 220180)
 +++ gcc/testsuite/gfortran.dg/class_allocate_18.f90(Arbeitskopie)
 @@ -1,5 +1,4 @@
  ! { dg-do run }
 -! { dg-options "-fsanitize=undefined" }
  !
  ! PR 64230: [4.9/5 Regression] Invalid memory reference in a
 compiler-generated finalizer for allocatable component
  !

 LGTM.

Good, committed as r220181. Since I had already backported the
original patch to 4.9 yesterday, I'll do the same there ...

Cheers,
Janus

Re: [RFC PATCH] Emit DW_LANG_Fortran{03,08}

On Tue, Jan 27, 2015 at 01:52:12PM -0500, David Malcolm wrote:
  @@ -398,6 +399,11 @@ gfc_post_options (const char **pfilename
   
 gfc_cpp_post_options ();
   
  +  if (gfc_option.allow_std & GFC_STD_F2008)
  +    lang_hooks.name = "GNU Fortran2008";
  +  else if (gfc_option.allow_std & GFC_STD_F2003)
  +    lang_hooks.name = "GNU Fortran2003";
  +
 
 Did you test this on rs6000?
 
 In particular, rs6000_output_function_epilogue has a:
   else if (! strcmp (language_string, "GNU F77")
            || ! strcmp (language_string, "GNU Fortran"))
   i = 1;

You're right, missed that.  Consider that changed to lang_GNU_Fortran ().

Jakub

C++ PATCH for c++/63889 (ICE with member variable template)

2015-01-27 Thread Jason Merrill

We were trying to instantiate is_ok with only the innermost set of 
template arguments; we need to make sure that the outer args are 
provided as well.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit e2df55ffbe254dfc15801a204af16d012aeb4cb5
Author: Jason Merrill ja...@redhat.com
Date:   Mon Jan 26 10:55:42 2015 -0500

	PR c++/63889
	* pt.c (finish_template_variable): Move from semantics.c.
	Handle multiple template arg levels.  Handle coercion here.
	(lookup_template_variable): Not here.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index bc26530..d377daa 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -8091,13 +8091,28 @@ tree
 lookup_template_variable (tree templ, tree arglist)
 {
   tree type = unknown_type_node;
-  tsubst_flags_t complain = tf_warning_or_error;
-  tree parms = INNERMOST_TEMPLATE_PARMS (DECL_TEMPLATE_PARMS (templ));
-  arglist = coerce_template_parms (parms, arglist, templ, complain,
-   /*req_all*/true, /*use_default*/true);
   return build2 (TEMPLATE_ID_EXPR, type, templ, arglist);
 }
 
+/* Instantiate a variable declaration from a TEMPLATE_ID_EXPR for use. */
+
+tree
+finish_template_variable (tree var)
+{
+  tree templ = TREE_OPERAND (var, 0);
+
+  tree arglist = TREE_OPERAND (var, 1);
+  tree tmpl_args = DECL_TI_ARGS (DECL_TEMPLATE_RESULT (templ));
+  arglist = add_outermost_template_args (tmpl_args, arglist);
+
+  tree parms = DECL_TEMPLATE_PARMS (templ);
+  tsubst_flags_t complain = tf_warning_or_error;
+  arglist = coerce_innermost_template_parms (parms, arglist, templ, complain,
+	 /*req_all*/true,
+	 /*use_default*/true);
+
+  return instantiate_template (templ, arglist, complain);
+}
 
 struct pair_fn_data
 {
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 915048d..75aa501 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -2454,15 +2454,6 @@ finish_call_expr (tree fn, vec<tree, va_gc> **args, bool disallow_virtual,
   return result;
 }
 
-/* Instantiate a variable declaration from a TEMPLATE_ID_EXPR for use. */
-
-tree
-finish_template_variable (tree var)
-{
-  return instantiate_template (TREE_OPERAND (var, 0), TREE_OPERAND (var, 1),
-   tf_error);
-}
-
 /* Finish a call to a postfix increment or decrement or EXPR.  (Which
is indicated by CODE, which should be POSTINCREMENT_EXPR or
POSTDECREMENT_EXPR.)  */
diff --git a/gcc/testsuite/g++.dg/cpp1y/var-templ22.C b/gcc/testsuite/g++.dg/cpp1y/var-templ22.C
new file mode 100644
index 0000000..9ddc925
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/var-templ22.C
@@ -0,0 +1,14 @@
+// PR c++/63889
+// { dg-do compile { target c++14 } }
+
+template<class T>
+struct A
+{
+  template<class>
+  static constexpr bool is_ok = true;
+
+  template<bool v = is_ok<T>>
+  A(T) { }
+};
+
+A<int> p(42);

Re: [Patch, Fortran] PR63861 - fix OpenMP/ACC's gfc_has_alloc_comps

On Tue, Jan 27, 2015 at 08:27:07AM +0100, Tobias Burnus wrote:
 2015-01-27  Tobias Burnus  bur...@net-b.de
 
   PR fortran/63861
 gcc/fortran/
   * trans-openmp.c (gfc_has_alloc_comps, gfc_trans_omp_clauses):
   Fix handling for scalar coarrays.
   * trans-types.c (gfc_get_element_type): Add comment.
 
 gcc/testsuite/
   * gfortran.dg/goacc/coarray_2.f90: New.

Ok, thanks.

Jakub

Re: [PATCH] Fix ICE during ipa dumping (PR ipa/64730)

On Mon, 26 Jan 2015, Jakub Jelinek wrote:

 Hi!
 
 On various targets, %s in fprintf can't handle NULL arguments,
 and even when edge->call_stmt is non-NULL, it still might have
 UNKNOWN_LOCATION or BUILTINS_LOCATION, which have NULL filename.
 In this particular case it is a fnsplit created call.
 
 Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
 ok for trunk?

Ok.

Thanks,
Richard.

 2015-01-26  Jakub Jelinek  ja...@redhat.com
 
   PR ipa/64730
   * ipa-inline.c (inline_small_functions): Print "unknown" even
   if edge->call_stmt is non-NULL, but has builtins or unknown
   location.
 
 --- gcc/ipa-inline.c.jj   2015-01-22 21:45:18.0 +0100
 +++ gcc/ipa-inline.c  2015-01-26 15:41:57.193640527 +0100
 @@ -1822,6 +1822,9 @@ inline_small_functions (void)
   "   Estimated badness is %f, frequency %.2f.\n",
  edge->caller->name (), edge->caller->order,
  edge->call_stmt
 + && (LOCATION_LOCUS (gimple_location ((const_gimple)
 + edge->call_stmt))
 + > BUILTINS_LOCATION)
  ? gimple_filename ((const_gimple) edge->call_stmt)
  : "unknown",
  edge->call_stmt
 
   Jakub
 
 

-- 
Richard Biener rguent...@suse.de
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild,
Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)
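
The NULL-filename problem discussed in this thread can be illustrated outside of GCC with a minimal C sketch. The helper names printable_filename and format_call_site are hypothetical and are not taken from ipa-inline.c; the sketch only shows the general pattern of substituting a placeholder before a possibly-NULL string reaches a %s conversion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not part of GCC): return a printable stand-in
   when a file name is missing, so the result is always safe for %s.  */
static const char *
printable_filename (const char *filename)
{
  return filename != NULL ? filename : "unknown";
}

/* Format "file:line" into BUF.  FILENAME may be NULL, as it is for
   locations that carry no file name.  Returns what snprintf returns.  */
static int
format_call_site (char *buf, size_t size, const char *filename, int line)
{
  assert (buf != NULL && size > 0);
  return snprintf (buf, size, "%s:%d", printable_filename (filename), line);
}
```

Passing a NULL file name to format_call_site yields the text "unknown:0" for line 0 instead of handing NULL to the formatting routine, which is undefined behavior in ISO C.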

Re: [Patch, Fortran, OOP] PR 64230: [4.9/5 Regression] Invalid memory reference in a compiler-generated finalizer for allocatable component

On Tue, Jan 27, 2015 at 10:24:47AM +0100, Andreas Schwab wrote:
 Janus Weil ja...@gcc.gnu.org writes:
 
  2015-01-19  Janus Weil  ja...@gcc.gnu.org
 
  PR fortran/64230
  * gfortran.dg/class_allocate_18.f90: Extended.
 
 FAIL: gfortran.dg/class_allocate_18.f90   -O0  (test for excess errors)
 Excess errors:
 /usr/ia64-suse-linux/bin/ld: cannot find -lubsan

Yeah, if you want to add ubsan tests, you need to add gfortran.dg/ubsan/
directory and hack up ubsan.exp in there, from say gcc.dg/ubsan/ubsan.exp
and gfortran.dg/dg.exp.

Jakub

Re: [PATCH] Update BBs in cleanup_barriers pass (PR rtl-optimization/61058)

2015-01-27 Thread Eric Botcazou

 Yes, they do, that is why it crashed during final.

OK.  Why wouldn't it work to call reorder_insns instead of reorder_insns_nobb?

-- 
Eric Botcazou

[PATCH] S/390: -mhotpatch v2

2015-01-27 Thread Dominik Vogt

The attached patch updates the -mhotpatch option and the hotpatch
function attribute with (incompatible) new semantics.  Please
refer to the commit in the patch for details.

--

2015-01-27  Dominik Vogt  v...@linux.vnet.ibm.com

* doc/extend.texi: s/390: Update documentation of hotpatch attribute.
* doc/invoke.texi (-mhotpatch): s/390: Update documentation of
-mhotpatch= option.
* config/s390/s390.opt (mhotpatch): s/390: Remove -mhotpatch and
-mno-hotpatch options.  Change syntax of -mhotpatch= option.
* config/s390/s390.c (s390_hotpatch_trampoline_halfwords_default):
Renamed.
(s390_hotpatch_trampoline_halfwords_max): Renamed.
(s390_hotpatch_hw_max): New name.
(s390_hotpatch_trampoline_halfwords): Renamed.
(s390_hotpatch_hw_before_label): New name.
(get_hotpatch_attribute): Removed.
(s390_hotpatch_hw_after_label): New name.
(s390_handle_hotpatch_attribute): Add second parameter to hotpatch
attribute.
(s390_attribute_table): Ditto.
(s390_function_num_hotpatch_trampoline_halfwords): Renamed.
(s390_function_num_hotpatch_hw): New name.
Remove special handling of inline functions and hotpatching.
Return number of nops before and after the function label.
(s390_can_inline_p): Removed.
(s390_asm_output_function_label): Emit a configurable number of nops
after the function label.
(s390_option_override): Update -mhotpatch= syntax and remove -mhotpatch.
(TARGET_CAN_INLINE_P): Removed.
(TARGET_FUNCTION_ATTRIBUTE_INLINABLE_P): New.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
From 9123265bb1d6e325f4edc99a2d1f33a862b3ba53 Mon Sep 17 00:00:00 2001
From: Dominik Vogt v...@linux.vnet.ibm.com
Date: Mon, 1 Dec 2014 15:59:42 +0100
Subject: [PATCH] S/390: -mhotpatch v2

Update the -mhotpatch option and the hotpatch function attribute to take
exactly two arguments.  The first is the number of halfwords to be filled with
two-byte-nops before the function label.  The second is the number of halfwords
to be filled with nops after the label (the biggest available nop instructions
are used).

Further changes are:

* Artificial functions and the main function are also patched.
* Functions selected for hotpatching can still be inlined.  It's the
  responsibility of the user to take care of this when patching, or to
  explicitly disable inlining.
---
 gcc/config/s390/s390.c | 227 -
 gcc/config/s390/s390.opt   |  12 +-
 gcc/doc/extend.texi|  17 +-
 gcc/doc/invoke.texi|  16 +-
 gcc/testsuite/gcc.target/s390/hotpatch-1.c |  14 +-
 gcc/testsuite/gcc.target/s390/hotpatch-10.c|  15 +-
 gcc/testsuite/gcc.target/s390/hotpatch-11.c|  12 +-
 gcc/testsuite/gcc.target/s390/hotpatch-12.c|  14 +-
 gcc/testsuite/gcc.target/s390/hotpatch-13.c|  17 ++
 gcc/testsuite/gcc.target/s390/hotpatch-14.c|  17 ++
 gcc/testsuite/gcc.target/s390/hotpatch-15.c|  17 ++
 gcc/testsuite/gcc.target/s390/hotpatch-16.c|  17 ++
 gcc/testsuite/gcc.target/s390/hotpatch-17.c|  17 ++
 gcc/testsuite/gcc.target/s390/hotpatch-18.c|  16 ++
 gcc/testsuite/gcc.target/s390/hotpatch-19.c|  23 +++
 gcc/testsuite/gcc.target/s390/hotpatch-2.c |  12 +-
 gcc/testsuite/gcc.target/s390/hotpatch-20.c|  20 ++
 gcc/testsuite/gcc.target/s390/hotpatch-3.c |  10 +-
 gcc/testsuite/gcc.target/s390/hotpatch-4.c |  18 +-
 gcc/testsuite/gcc.target/s390/hotpatch-5.c |  15 +-
 gcc/testsuite/gcc.target/s390/hotpatch-6.c |  13 +-
 gcc/testsuite/gcc.target/s390/hotpatch-7.c |  13 +-
 gcc/testsuite/gcc.target/s390/hotpatch-8.c |  24 +--
 gcc/testsuite/gcc.target/s390/hotpatch-9.c |  15 +-
 gcc/testsuite/gcc.target/s390/hotpatch-compile-1.c |  24 +--
 .../gcc.target/s390/hotpatch-compile-10.c  |  12 ++
 .../gcc.target/s390/hotpatch-compile-11.c  |  12 ++
 .../gcc.target/s390/hotpatch-compile-12.c  |  12 ++
 .../gcc.target/s390/hotpatch-compile-13.c  |  29 +++
 .../gcc.target/s390/hotpatch-compile-14.c  |  11 +
 .../gcc.target/s390/hotpatch-compile-15.c  |  43 
 .../gcc.target/s390/hotpatch-compile-16.c  |  24 +++
 gcc/testsuite/gcc.target/s390/hotpatch-compile-2.c |  24 +--
 gcc/testsuite/gcc.target/s390/hotpatch-compile-3.c |  24 +--
 gcc/testsuite/gcc.target/s390/hotpatch-compile-4.c |   2 +-
 gcc/testsuite/gcc.target/s390/hotpatch-compile-5.c |  23 +--
 gcc/testsuite/gcc.target/s390/hotpatch-compile-6.c |   4 +-
 gcc/testsuite/gcc.target/s390/hotpatch-compile-7.c |  66 +-
 gcc/testsuite/gcc.target/s390/hotpatch-compile-8.c |  23 +--
 gcc/testsuite/gcc.target/s390/hotpatch-compile-9.c |  12 ++
 40 files changed, 532 insertions(+), 404
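
As a rough illustration of the two-argument form described in the commit message above, the following C sketch marks one function for hotpatching. The macro name HOTPATCH, the function patchable_add and the counts 1 and 2 are made up for the example. The attribute only has an effect on S/390 targets, so it is compiled out elsewhere.

```c
/* Sketch only: the hotpatch attribute is specific to S/390 targets, so
   it is guarded here.  The counts 1 and 2 are arbitrary illustration
   values: halfwords of nops before and after the function label.  */
#if defined (__s390__) || defined (__s390x__)
# define HOTPATCH(before, after) __attribute__ ((hotpatch (before, after)))
#else
# define HOTPATCH(before, after)
#endif

/* Hypothetical function selected for hotpatching.  */
HOTPATCH (1, 2)
int
patchable_add (int a, int b)
{
  return a + b;
}
```

Since the patch allows such functions to be inlined, a user who needs every call to go through the patchable entry point would presumably also have to disable inlining for the function, as the commit message notes.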

Re: [PATCH] Fix PR64277

On Tue, 27 Jan 2015, Jakub Jelinek wrote:

 On Tue, Jan 27, 2015 at 10:25:48AM +0100, Richard Biener wrote:
  
  This disables array-bound warnings from VRP2 as discussed.
  
  Bootstrapped and tested on x86_64-unknown-linux-gnu - ok for trunk?
 
 So nothing in the testsuite needed to change?  Nice.

Yes.

 Ok for trunk.
 
  I'll search for duplicates and add a few testcases.
 
 Thanks.

Committed as follows (first testcase in PR59124 not fixed - it warns
from the first pass).

2015-01-27  Richard Biener  rguent...@suse.de

PR tree-optimization/56273
PR tree-optimization/59124
PR tree-optimization/64277
* tree-vrp.c (vrp_finalize): Emit array-bound warnings only
from the first VRP pass.

* g++.dg/warn/Warray-bounds-6.C: New testcase.
* gcc.dg/Warray-bounds-12.c: Likewise.
* gcc.dg/Warray-bounds-13.c: Likewise.

Index: gcc/tree-vrp.c
===
*** gcc/tree-vrp.c.orig 2015-01-27 10:34:26.453743828 +0100
--- gcc/tree-vrp.c  2015-01-27 10:43:04.970610102 +0100
*** vrp_finalize (void)
*** 10229,10235 
substitute_and_fold (op_with_constant_singleton_value_range,
   vrp_fold_stmt, false);
  
!   if (warn_array_bounds)
  check_all_array_refs ();
  
/* We must identify jump threading opportunities before we release
--- 10229,10235 
substitute_and_fold (op_with_constant_singleton_value_range,
   vrp_fold_stmt, false);
  
!   if (warn_array_bounds && first_pass_instance)
  check_all_array_refs ();
  
/* We must identify jump threading opportunities before we release
Index: gcc/testsuite/g++.dg/warn/Warray-bounds-6.C
===
*** /dev/null   1970-01-01 00:00:00.0 +
--- gcc/testsuite/g++.dg/warn/Warray-bounds-6.C 2015-01-27 10:40:31.311871855 +0100
***
*** 0 
--- 1,26 
+ // { dg-do compile }
+ // { dg-options "-O3 -Warray-bounds" }
+ 
+ struct type {
+     bool a, b;
+     bool get_b() { return b; }
+ };
+ 
+ type stuff[9u];
+ 
+ void bar();
+ 
+ void foo()
+ {
+   for(unsigned i = 0u; i < 9u; i++)
+     {
+       if(!stuff[i].a)
+         continue;
+ 
+       bar();
+ 
+       for(unsigned j = i + 1u; j < 9u; j++)
+         if(stuff[j].a && stuff[j].get_b()) // { dg-bogus "above array bounds" }
+           return;
+     }
+ }
Index: gcc/testsuite/gcc.dg/Warray-bounds-12.c
===
*** /dev/null   1970-01-01 00:00:00.0 +
--- gcc/testsuite/gcc.dg/Warray-bounds-12.c 2015-01-27 10:40:58.196175989 +0100
***
*** 0 
--- 1,26 
+ /* { dg-do compile } */
+ /* { dg-options "-O3 -Warray-bounds" } */
+ /* { dg-additional-options "-mssse3" { target x86_64-*-* i?86-*-* } } */
+ 
+ void foo(short a[], short m)
+ {
+   int i, j;
+   int f1[10];
+   short nc;
+ 
+   nc = m + 1;
+   if (nc > 3)
+     {
+       for (i = 0; i <= nc; i++)
+         {
+           f1[i] = f1[i] + 1;
+         }
+     }
+ 
+   for (i = 0, j = m; i < nc; i++, j--)
+     {
+       a[i] = f1[i]; /* { dg-bogus "above array bounds" } */
+       a[j] = i;
+     }
+   return;
+ }
Index: gcc/testsuite/gcc.dg/Warray-bounds-13.c
===
*** /dev/null   1970-01-01 00:00:00.0 +
--- gcc/testsuite/gcc.dg/Warray-bounds-13.c 2015-01-27 10:42:43.738369929 +0100
***
*** 0 
--- 1,18 
+ /* { dg-do compile } */
+ /* { dg-options "-O3 -Warray-bounds" } */
+ 
+ extern char *bar[17];
+ 
+ int foo(int argc, char **argv)
+ {
+   int i;
+   int n = 0;
+ 
+   for (i = 0; i < argc; i++)
+     n++;
+ 
+   for (i = 0; i < argc; i++)
+     argv[i] = bar[i + n]; /* { dg-bogus "above array bounds" } */
+ 
+   return 0;
+ }

Re: [PATCH][AArch64] Use target builtin instead of __builtin_sqrt for vsqrt_f64

2015-01-27 Thread Kyrill Tkachov



On 19/01/15 15:46, Kyrill Tkachov wrote:

On 19/01/15 15:44, James Greenhalgh wrote:

On Mon, Jan 12, 2015 at 05:30:46PM +, Andrew Pinski wrote:

On Mon, Jan 12, 2015 at 7:52 AM, Kyrill Tkachov kyrylo.tkac...@arm.com wrote:

Hi all,

As raised in https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01237.html and
discussed in that thread, using __builtin_sqrt for vsqrt_f64 may end up in a
call to the library sqrt at -O0. To avoid that this patch uses a target
builtin for sqrt on DF mode and uses that to implement the intrinsic.

With this patch I don't see sqrt calls being created at -O0 on a large
arm_neon.h testcase where they were generated before.
aarch64-none-elf testing and the intrinsics testsuite in particular are
clean.
Ok for trunk?

Maybe have a target fold which folds this into sqrt if -fno-math-errno
is supplied.  This might be useful in the -ffast-math case.
Maybe also fold it when a constant is supplied too.

Given that we are now in Stage 4, I'd rather see this fixed for GCC 5.0
in the way Kyrill proposed than languishing on a TODO list. Though an
IOU ticket on bugzilla for the missed optimization seems a good idea
to me.

Unless Kyrill already has something in the works to address your
comment, this looks like the right short-term solution to me
(Though Marcus/Richard will have to approve it).

Sorry, this slipped through the cracks.
I agree with James. A missed-optimization issue on bugzilla would be
helpful to keep track of this.


I've filed PR 64821 to keep track of this for GCC 6.
Can I ping https://gcc.gnu.org/ml/gcc-patches/2015-01/msg00710.html then?
It's a regression fix at -O0 so should be appropriate for stage4

Thanks,
Kyrill



Kyrill


Thanks,
James


2015-01-12  Kyrylo Tkachov  kyrylo.tkac...@arm.com

  * config/aarch64/aarch64-simd-builtins.def (sqrt): Use BUILTIN_VDQF_DF.
  * config/aarch64/arm_neon.h (vsqrt_f64): Use __builtin_aarch64_sqrtdf
  instead of __builtin_sqrt.

Re: [PATCH] Fix PR64277

2015-01-27 12:47 GMT+03:00 Richard Biener rguent...@suse.de:
 On Tue, 27 Jan 2015, Jakub Jelinek wrote:

 On Tue, Jan 27, 2015 at 10:25:48AM +0100, Richard Biener wrote:
 
  This disables array-bound warnings from VRP2 as discussed.
 
  Bootstrapped and tested on x86_64-unknown-linux-gnu - ok for trunk?

 So nothing in the testsuite needed to change?  Nice.

 Yes.

 Ok for trunk.

  I'll search for duplicates and add a few testcases.

 Thanks.

 Committed as follows (first testcase in PR59124 not fixed - it warns
 from the first pass).

Are you going to port it to 4.9 branch?

Thanks,
Ilya


 2015-01-27  Richard Biener  rguent...@suse.de

 PR tree-optimization/56273
 PR tree-optimization/59124
 PR tree-optimization/64277
 * tree-vrp.c (vrp_finalize): Emit array-bound warnings only
 from the first VRP pass.

 * g++.dg/warn/Warray-bounds-6.C: New testcase.
 * gcc.dg/Warray-bounds-12.c: Likewise.
 * gcc.dg/Warray-bounds-13.c: Likewise.

 Index: gcc/tree-vrp.c
 ===
 *** gcc/tree-vrp.c.orig 2015-01-27 10:34:26.453743828 +0100
 --- gcc/tree-vrp.c  2015-01-27 10:43:04.970610102 +0100
 *** vrp_finalize (void)
 *** 10229,10235 
 substitute_and_fold (op_with_constant_singleton_value_range,
vrp_fold_stmt, false);

 !   if (warn_array_bounds)
   check_all_array_refs ();

 /* We must identify jump threading opportunities before we release
 --- 10229,10235 
 substitute_and_fold (op_with_constant_singleton_value_range,
vrp_fold_stmt, false);

 !   if (warn_array_bounds && first_pass_instance)
   check_all_array_refs ();

 /* We must identify jump threading opportunities before we release
 Index: gcc/testsuite/g++.dg/warn/Warray-bounds-6.C
 ===
 *** /dev/null   1970-01-01 00:00:00.0 +
 --- gcc/testsuite/g++.dg/warn/Warray-bounds-6.C 2015-01-27 10:40:31.311871855 +0100
 ***
 *** 0 
 --- 1,26 
 + // { dg-do compile }
 + // { dg-options "-O3 -Warray-bounds" }
 +
 + struct type {
 + bool a, b;
 + bool get_b() { return b; }
 + };
 +
 + type stuff[9u];
 +
 + void bar();
 +
 + void foo()
 + {
 +   for(unsigned i = 0u; i < 9u; i++)
 + {
 +   if(!stuff[i].a)
 +   continue;
 +
 +   bar();
 +
 +   for(unsigned j = i + 1u; j < 9u; j++)
 +   if(stuff[j].a && stuff[j].get_b()) // { dg-bogus "above array bounds" }
 + return;
 + }
 + }
 Index: gcc/testsuite/gcc.dg/Warray-bounds-12.c
 ===
 *** /dev/null   1970-01-01 00:00:00.0 +
 --- gcc/testsuite/gcc.dg/Warray-bounds-12.c 2015-01-27 10:40:58.196175989 +0100
 ***
 *** 0 
 --- 1,26 
 + /* { dg-do compile } */
 + /* { dg-options "-O3 -Warray-bounds" } */
 + /* { dg-additional-options "-mssse3" { target x86_64-*-* i?86-*-* } } */
 +
 + void foo(short a[], short m)
 + {
 +   int i, j;
 +   int f1[10];
 +   short nc;
 +
 +   nc = m + 1;
 +   if (nc > 3)
 + {
 +   for (i = 0; i <= nc; i++)
 +   {
 + f1[i] = f1[i] + 1;
 +   }
 + }
 +
 +   for (i = 0, j = m; i < nc; i++, j--)
 + {
 +   a[i] = f1[i]; /* { dg-bogus "above array bounds" } */
 +   a[j] = i;
 + }
 +   return;
 + }
 Index: gcc/testsuite/gcc.dg/Warray-bounds-13.c
 ===
 *** /dev/null   1970-01-01 00:00:00.0 +
 --- gcc/testsuite/gcc.dg/Warray-bounds-13.c 2015-01-27 10:42:43.738369929 +0100
 ***
 *** 0 
 --- 1,18 
 + /* { dg-do compile } */
 + /* { dg-options "-O3 -Warray-bounds" } */
 +
 + extern char *bar[17];
 +
 + int foo(int argc, char **argv)
 + {
 +   int i;
 +   int n = 0;
 +
 +   for (i = 0; i < argc; i++)
 + n++;
 +
 +   for (i = 0; i < argc; i++)
 + argv[i] = bar[i + n]; /* { dg-bogus "above array bounds" } */
 +
 +   return 0;
 + }

Re: [PATCH] Fix for PR64741 (UBSan/ASan integration)

On Tue, Jan 27, 2015 at 09:19:20AM +0300, Yury Gribov wrote:
 As described in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64741 , ASan
 may currently report false positives for UBSan internal variables due to
 their incomplete type information. This patch fixes this.
 
 Bootstrapped and regtested on Linux x64. Ok to commit?
 
 -Y

 commit cf083510ece7b7bde1ab5a41e293b5a6a5bb4550
 Author: Yury Gribov y.gri...@samsung.com
 Date:   Mon Jan 26 10:19:03 2015 +0300
 
 2015-01-26  Yury Gribov  y.gri...@samsung.com
 
   PR ubsan/64741
 
   * ubsan.c (ubsan_type_descriptor): Update type size.

No extra newline between PR and * ubsan.c lines.

 --- a/gcc/ubsan.c
 +++ b/gcc/ubsan.c
 @@ -504,6 +504,14 @@ ubsan_type_descriptor (tree type, enum ubsan_print_style pstyle)
tinfo = get_ubsan_type_info_for_type (type);
  
/* Create a new VAR_DECL of type descriptor.  */
 +  const char *tmp = pp_formatted_text (pretty_name);
 +  size_t len = strlen (tmp);
 +  tree str = build_string (len + 1, tmp);
 +  TREE_TYPE (str) = build_array_type (char_type_node,
 +   build_index_type (size_int (len)));
 +  TREE_READONLY (str) = 1;
 +  TREE_STATIC (str) = 1;

While touching this, could you please rewrite it as:
  const char *tmp = pp_formatted_text (pretty_name);
  size_t len = strlen (tmp) + 1;
  tree str = build_string (len, tmp);
  TREE_TYPE (str) = build_array_type_nelts (char_type_node, len);
  TREE_READONLY (str) = 1;
  TREE_STATIC (str) = 1;
?  Or, if you want, do it as a follow-up.  There is another occurrence
of this in ubsan_source_location.

Ok for trunk with or without this change.

Jakub
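
The two spellings in the review above describe the same array size. A minimal C sketch of the arithmetic follows, under the assumption that build_index_type takes the largest valid index while build_array_type_nelts takes the number of elements. The helper names below are hypothetical and only model the counting; they are not GCC functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Number of elements in an array whose index runs from 0 to MAX_INDEX
   inclusive (the assumed convention of an index-type based spelling).  */
static size_t
nelts_from_max_index (size_t max_index)
{
  return max_index + 1;
}

/* Number of elements needed to store STR including its terminating
   NUL (the assumed convention of an element-count based spelling).  */
static size_t
string_nelts (const char *str)
{
  assert (str != NULL);
  return strlen (str) + 1;
}
```

For a type name such as "int", both helpers give 4 elements: three characters plus the terminating NUL. Folding the "+ 1" into len once, as suggested, avoids repeating that adjustment at each use.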

Re: [PATCH] Update BBs in cleanup_barriers pass (PR rtl-optimization/61058)

2015-01-27 Thread Eric Botcazou

 Because reorder_insns doesn't handle the case of moving a barrier into a
 middle of basic block.

Right, I should have read the audit trail. :-)  The patch is OK then, but add 
a ??? note at the end of the comment saying that the proper thing to do here 
is probably not to run cleanup_barrier for this back-end.

-- 
Eric Botcazou

Re: [PATCH] wide-int division fix (PR tree-optimization/64807)

On Mon, 26 Jan 2015, Jakub Jelinek wrote:

 Hi!
 
 On the following testcase we generate wrong code, because
 apparently divmod_internal_2 relies on 0 being the topmost
 element (at b_dividend[m]):
algorithm.  M is the number of significant elements of U however
there needs to be at least one extra element of B_DIVIDEND
allocated, N is the number of elements of B_DIVISOR.  */
 The comment talks just about allocation, but from the code
 it seems it really relies on it being 0.
 There is space for it:
   unsigned HOST_HALF_WIDE_INT
 b_dividend[(4 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_HALF_WIDE_INT) + 1];
   unsigned HOST_HALF_WIDE_INT
 b_divisor[4 * MAX_BITSIZE_MODE_ANY_INT / HOST_BITS_PER_HALF_WIDE_INT];
 (the + 1), and usually there already is a zero in there:
   m = dividend_blocks_needed;
   while (m > 1 && b_dividend[m - 1] == 0)
 m--;
 so the only problematic case is if m isn't decreased.
 
 Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
 ok for trunk?

Ok.

Thanks,
Richard.

 2015-01-26  Jakub Jelinek  ja...@redhat.com
 
   PR tree-optimization/64807
   * wide-int.cc (wi::divmod_internal): Clear
   b_dividend[dividend_blocks_needed].
 
   * gcc.dg/pr64807.c: New test.
 
 --- gcc/wide-int.cc.jj2015-01-09 21:59:38.0 +0100
 +++ gcc/wide-int.cc   2015-01-26 19:21:56.114316481 +0100
 @@ -1819,6 +1819,7 @@ wi::divmod_internal (HOST_WIDE_INT *quot
divisor_blocks_needed, divisor_prec, sgn);
  
m = dividend_blocks_needed;
 +  b_dividend[m] = 0;
   while (m > 1 && b_dividend[m - 1] == 0)
  m--;
  
 --- gcc/testsuite/gcc.dg/pr64807.c.jj 2015-01-26 19:24:13.612943033 +0100
 +++ gcc/testsuite/gcc.dg/pr64807.c2015-01-26 19:32:34.502237566 +0100
 @@ -0,0 +1,19 @@
 +/* PR tree-optimization/64807 */
 +/* { dg-do run { target int128 } } */
 +/* { dg-options "-O2" } */
 +
 +__uint128_t
 +foo (void)
 +{
 +  __uint128_t a = -1;
 +  __uint128_t b = -1;
 +  return a / b;
 +}
 +
 +int
 +main ()
 +{
 +  if (foo () != 1)
 +__builtin_abort ();
 +  return 0;
 +}
 
   Jakub
 
 

-- 
Richard Biener rguent...@suse.de
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild,
Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)
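
The fix in this thread can be modelled in isolation with a short C sketch. It is a simplified stand-in, not GCC's wi::divmod_internal: the name prepare_dividend and the use of uint32_t digits are assumptions made for the example. The point is that the spare slot above the top digit is cleared explicitly, so later code may rely on it being zero even when no leading zero digits were trimmed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified model: BLOCKS holds NEEDED digits, least significant
   first, followed by one spare slot at BLOCKS[NEEDED].  Returns the
   number of significant digits after trimming leading zeros.  */
static size_t
prepare_dividend (uint32_t *blocks, size_t needed)
{
  assert (blocks != NULL && needed >= 1);
  size_t m = needed;

  /* Clear the spare slot unconditionally.  Without this, it is zero
     only by luck when the trimming loop below never runs.  */
  blocks[m] = 0;

  /* Drop leading zero digits, but always keep at least one digit.  */
  while (m > 1 && blocks[m - 1] == 0)
    m--;
  return m;
}
```

With all four digits set to 0xffffffff (the analogue of the all-ones __uint128_t in the new test), m stays at 4 and the only thing that makes blocks[4] zero is the explicit store.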

[PATCH, PR tree-optimization/64277] Improve loop iterations count estimation

Hi,

This patch was supposed to fix PR tree-optimization/64277.  The PR is now 
fixed by disabling the warnings, but I think the patch is still useful to avoid 
dead code generated by complete unrolling.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

Thanks,
Ilya
--
gcc/

2015-01-27  Ilya Enkovich  ilya.enkov...@intel.com

* tree-ssa-loop-niter.c (record_nonwrapping_iv): Use base
range info when possible to refine estimation.

gcc/testsuite/

2015-01-27  Ilya Enkovich  ilya.enkov...@intel.com

* gcc.dg/pr64277.c: New.


diff --git a/gcc/testsuite/gcc.dg/pr64277.c b/gcc/testsuite/gcc.dg/pr64277.c
new file mode 100644
index 000..0d5ef11
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr64277.c
@@ -0,0 +1,21 @@
+/* PR tree-optimization/64277 */
+/* { dg-do compile } */
+/* { dg-options "-O3 -Wall -Werror" } */
+
+
+int f1[10];
+void test1 (short a[], short m, unsigned short l)
+{
+  int i = l;
+  for (i = i + 5; i < m; i++)
+f1[i] = a[i]++;
+}
+
+void test2 (short a[], short m, short l)
+{
+  int i;
+  if (m > 5)
+m = 5;
+  for (i = m; i > l; i--)
+f1[i] = a[i]++;
+}
diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
index 919f5c0..6a55c6f 100644
--- a/gcc/tree-ssa-loop-niter.c
+++ b/gcc/tree-ssa-loop-niter.c
@@ -2754,6 +2754,7 @@ record_nonwrapping_iv (struct loop *loop, tree base, tree step, gimple stmt,
 {
   tree niter_bound, extreme, delta;
   tree type = TREE_TYPE (base), unsigned_type;
+  tree orig_base = base;
 
   if (TREE_CODE (step) != INTEGER_CST || integer_zerop (step))
 return;
@@ -2777,16 +2778,32 @@ record_nonwrapping_iv (struct loop *loop, tree base, tree step, gimple stmt,
 
   if (tree_int_cst_sign_bit (step))
 {
+  wide_int min, max;
   extreme = fold_convert (unsigned_type, low);
-  if (TREE_CODE (base) != INTEGER_CST)
+  if (TREE_CODE (orig_base) == SSA_NAME
+      && TREE_CODE (high) == INTEGER_CST
+      && !POINTER_TYPE_P (TREE_TYPE (orig_base))
+      && SSA_NAME_RANGE_INFO (orig_base)
+      && get_range_info (orig_base, &min, &max) == VR_RANGE
+      && wi::gts_p (wide_int (high), max))
+   base = wide_int_to_tree (unsigned_type, max);
+  else if (TREE_CODE (base) != INTEGER_CST)
base = fold_convert (unsigned_type, high);
   delta = fold_build2 (MINUS_EXPR, unsigned_type, base, extreme);
   step = fold_build1 (NEGATE_EXPR, unsigned_type, step);
 }
   else
 {
+  wide_int min, max;
   extreme = fold_convert (unsigned_type, high);
-  if (TREE_CODE (base) != INTEGER_CST)
+  if (TREE_CODE (orig_base) == SSA_NAME
+      && TREE_CODE (low) == INTEGER_CST
+      && !POINTER_TYPE_P (TREE_TYPE (orig_base))
+      && SSA_NAME_RANGE_INFO (orig_base)
+      && get_range_info (orig_base, &min, &max) == VR_RANGE
+      && wi::gts_p (min, wide_int (low)))
+   base = wide_int_to_tree (unsigned_type, min);
+  else if (TREE_CODE (base) != INTEGER_CST)
base = fold_convert (unsigned_type, low);
   delta = fold_build2 (MINUS_EXPR, unsigned_type, extreme, base);
 }
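
The effect of the refinement in the patch above can be sketched with plain integers. This is a toy model under stated assumptions, not the code in tree-ssa-loop-niter.c: the names refine_base and max_iterations are hypothetical, the step is positive, and the bounds are ordinary long values rather than trees.

```c
#include <assert.h>

/* Pick the starting value used for the estimate: the known minimum of
   the base if range information is available and tighter than the
   generic lower bound, otherwise the generic lower bound itself.  */
static long
refine_base (long low, int have_range, long range_min)
{
  return (have_range && range_min > low) ? range_min : low;
}

/* Upper bound on how many times a variable that starts no lower than
   BASE can be increased by STEP before it would exceed EXTREME.  */
static unsigned long
max_iterations (long base, long extreme, unsigned long step)
{
  assert (step > 0 && extreme >= base);
  return ((unsigned long) extreme - (unsigned long) base) / step;
}
```

For test1 in the new testcase, the index into f1[10] starts at l + 5 with l unsigned, so its minimum is 5. Using 0 as the base gives a bound of 9 steps, while using the range minimum 5 gives 4, which is the kind of tighter estimate the patch aims for.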

Re: [PATCH] Update BBs in cleanup_barriers pass (PR rtl-optimization/61058)

On Tue, Jan 27, 2015 at 09:25:32AM +0100, Eric Botcazou wrote:
  Yes, they do, that is why it crashed during final.
 
 OK.  Why wouldn't it work to call reorder_insns instead of reorder_insns_nobb?

Because reorder_insns doesn't handle the case of moving a barrier into a
middle of basic block.

  if (!BARRIER_P (from)
      && (bb2 = BLOCK_FOR_INSN (from)))
{
  if (BB_END (bb2) == to)
BB_END (bb2) = prev;
  df_set_bb_dirty (bb2);
}

  if (BB_END (bb) == after)
BB_END (bb) = to;

  for (x = from; x != NEXT_INSN (to); x = NEXT_INSN (x))
if (!BARRIER_P (x))
  df_insn_change_bb (x, bb);

from == to is a BARRIER in this case, BB_END (bb) != after (BB_END
is actually PREV_INSN (from)), so this doesn't do anything at all.

While what we need is:

1) set BB_END to after
2) clear BLOCK_FOR_INSN on the notes after AFTER (after addition of
   barrier after FROM == TO) until former PREV_INSN (FROM) (inclusive)

Jakub

Re: [Patch, Fortran, OOP] PR 64230: [4.9/5 Regression] Invalid memory reference in a compiler-generated finalizer for allocatable component

2015-01-27 Thread Andreas Schwab

Janus Weil ja...@gcc.gnu.org writes:

 2015-01-19  Janus Weil  ja...@gcc.gnu.org

 PR fortran/64230
 * gfortran.dg/class_allocate_18.f90: Extended.

FAIL: gfortran.dg/class_allocate_18.f90   -O0  (test for excess errors)
Excess errors:
/usr/ia64-suse-linux/bin/ld: cannot find -lubsan

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
And now for something completely different.

[PATCH] Fix PR64277


This disables array-bound warnings from VRP2 as discussed.

Bootstrapped and tested on x86_64-unknown-linux-gnu - ok for trunk?

I'll search for duplicates and add a few testcases.

Thanks,
Richard.

2015-01-27  Richard Biener  rguent...@suse.de

PR tree-optimization/64277
* tree-vrp.c (vrp_finalize): Emit array-bound warnings only
from the first VRP pass.

Index: gcc/tree-vrp.c
===
--- gcc/tree-vrp.c  (revision 220107)
+++ gcc/tree-vrp.c  (working copy)
@@ -10229,7 +10197,7 @@ vrp_finalize (void)
   substitute_and_fold (op_with_constant_singleton_value_range,
   vrp_fold_stmt, false);
 
-  if (warn_array_bounds)
+  if (warn_array_bounds && first_pass_instance)
 check_all_array_refs ();
 
   /* We must identify jump threading opportunities before we release
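
The behavior change can be pictured with a toy model in C. Nothing below is GCC code: the function names ending in _model and the counter warnings_emitted are invented for the sketch, and the two calls stand in for the two VRP pass instances.

```c
#include <assert.h>
#include <stdbool.h>

/* Counts how many times the array-reference scan ran in the model.  */
static int warnings_emitted;

/* Stand-in for the scan that may issue array-bound warnings.  */
static void
check_all_array_refs_model (void)
{
  warnings_emitted++;
}

/* Mirror of the patched condition: scan only in the first instance.  */
static void
vrp_finalize_model (bool warn_array_bounds, bool first_pass_instance)
{
  if (warn_array_bounds && first_pass_instance)
    check_all_array_refs_model ();
}

/* Run the first and second pass instance, return the number of scans.  */
static int
run_both_vrp_passes (bool warn_array_bounds)
{
  warnings_emitted = 0;
  vrp_finalize_model (warn_array_bounds, true);   /* first instance */
  vrp_finalize_model (warn_array_bounds, false);  /* second instance */
  assert (warnings_emitted <= 1);
  return warnings_emitted;
}
```

With the warning enabled the scan runs once instead of twice, so code that only looks suspicious after later transformations is no longer diagnosed by the second instance.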

Re: [PATCH] Fix PR64277

On Tue, Jan 27, 2015 at 10:25:48AM +0100, Richard Biener wrote:
 
 This disables array-bound warnings from VRP2 as discussed.
 
 Bootstrapped and tested on x86_64-unknown-linux-gnu - ok for trunk?

So nothing in the testsuite needed to change?  Nice.

Ok for trunk.

 I'll search for duplicates and add a few testcases.

Thanks.

 2015-01-27  Richard Biener  rguent...@suse.de
 
   PR tree-optimization/64277
   * tree-vrp.c (vrp_finalize): Emit array-bound warnings only
   from the first VRP pass.
 
 Index: gcc/tree-vrp.c
 ===
 --- gcc/tree-vrp.c(revision 220107)
 +++ gcc/tree-vrp.c(working copy)
 @@ -10229,7 +10197,7 @@ vrp_finalize (void)
substitute_and_fold (op_with_constant_singleton_value_range,
  vrp_fold_stmt, false);
  
 -  if (warn_array_bounds)
 +  if (warn_array_bounds && first_pass_instance)
  check_all_array_refs ();
  
/* We must identify jump threading opportunities before we release

Jakub

Re: [debug-early] C++ clones and limbo DIEs

2015-01-27 Thread Jason Merrill


On 01/23/2015 01:45 PM, Aldy Hernandez wrote:

I would expect [the flush] to be before free_lang_data and LTO streaming.


The reason this wouldn't make a difference is because, as it stands,
dwarf for the clones is not generated until final.c:

   if (!DECL_IGNORED_P (current_function_decl))
 debug_hooks->function_decl (current_function_decl);

which happens after free_lang_data.


I agree that the current code doesn't have this effect, but we're 
talking about changing things, right? :)



Unfortunately, this sets DECL_ABSTRACT_P for the static_p above, and
refuses to unset it after the call to dwarf2out_decl.


Well, that sounds like a bug.  Why isn't it being unset?  Is it because 
DECL_ABSTRACT_P was already set for the function, so we don't call 
set_decl_abstract_flags (decl, 0)?  Perhaps a solution to that would be 
to avoid calling set_decl_abstract_flags (decl, 1) if the function is 
already marked as abstract.  Or to teach set_decl_abstract_flags not to 
mess with static local variables.


Jason

Re: [PATCH][RFA][PR target/15184] Partial fix for direct byte access on x86

2015-01-27 Thread Segher Boessenkool

On Tue, Jan 27, 2015 at 01:53:34PM -0700, Jeff Law wrote:
 I do have a specific PR in mind, but I cannot currently find it.  It was
 about x86, dec mem and then using the flags...  Must have sent 100 emails
 in that thread...  And cannot find it now!
 Are you referring to 61225?

That is the one, thanks.


Segher

C++ PATCH for c++/58597 (lambda in default arg)

2015-01-27 Thread Jason Merrill

Here, sometimes we can end up in maybe_add_lambda_conv_op with 
current_function_decl set but not cfun.  If we push_function_context in 
that case, the later pop doesn't clear cfun, but leaves it with a value 
that leads to a crash later on.  So let's avoid calling 
push_function_context in that case.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit ce65568ba19c4613c25f48064a0d5e66454265ac
Author: Jason Merrill ja...@redhat.com
Date:   Tue Jan 27 14:26:18 2015 -0500

	PR c++/58597
	* lambda.c (maybe_add_lambda_conv_op): Check cfun rather than
	current_function_decl.

diff --git a/gcc/cp/lambda.c b/gcc/cp/lambda.c
index 6c9e224..b160c8c 100644
--- a/gcc/cp/lambda.c
+++ b/gcc/cp/lambda.c
@@ -854,7 +854,7 @@ prepare_op_call (tree fn, int nargs)
 void
 maybe_add_lambda_conv_op (tree type)
 {
-  bool nested = (current_function_decl != NULL_TREE);
+  bool nested = (cfun != NULL);
   bool nested_def = decl_function_context (TYPE_MAIN_DECL (type));
   tree callop = lambda_function (type);
 
diff --git a/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-defarg6.C b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-defarg6.C
new file mode 100644
index 000..fe8767a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-defarg6.C
@@ -0,0 +1,9 @@
+// PR c++/58597
+// { dg-do compile { target c++11 } }
+
+template<typename> struct A
+{
+  template<typename T> A(T, int = []{ return 0; }()) {}
+};
+
+A<int> a = 0;

Bug 62044 - [4.8/4.9 Regression] ICE in USE statement with RENAME for extended derived type

2015-01-27 Thread Paul Richard Thomas

Dear All,

The highly embarrassing bug in mold = allocations to class entities
has been fixed in revisions 220140 and 220191 for trunk and 4.9
respectively. The PR has been set as RESOLVED.

Cheers

Paul

[Patch, fortran] PR63205 - [OOP] Wrongly rejects type = class (for identical declared type)

2015-01-27 Thread Paul Richard Thomas

Dear All,

This patch enables the passing of an allocatable class object, scalar
or array, to a derived type of the declared type, either in an
assignment or as an actual argument. Much of the effort went into
sorting out the finalization call so that the 'left over' allocatable
components added by the dynamic type do not leak memory. At the
moment, the existence of the finalization function is tested for. A
check to see if the dynamic type is the same as the declared type
could be added.

Note that adding the 'must_finalize' field to gfc_expr will be useful
in enabling the missing mandatory finalization calls.

There are still interrogation marks about the patch; especially in
build_class_array_ref, where I do not understand why the added code
does not work in general, except for hidden function results.
Nonetheless, the code does not leak memory, apart perhaps from the
compound derived type constructors, with allocatable components that
already show leaks elsewhere. It is also well ringfenced and so should
not cause any regressions... touch wood!

Bootstraps and regtests on x86_64/FC21 - OK for trunk?

Paul

2015-01-27  Paul Thomas  pa...@gcc.gnu.org

PR fortran/63205
* gfortran.h: Add 'must_finalize' field to gfc_expr and
prototypes for gfc_is_alloc_class_scalar_function and for
gfc_is_alloc_class_array_function.
* expr.c (gfc_is_alloc_class_scalar_function,
gfc_is_alloc_class_array_function): New functions.
* trans-array.c (gfc_add_loop_ss_code): Do not move the
expression for allocatable class scalar functions outside the
loop.
(conv_array_index_offset): Cope with deltas being NULL_TREE.
(build_class_array_ref): Do not return with allocatable class
array functions. Add code to pick out the returned class array.
Dereference if necessary and return if not a class object.
(gfc_conv_scalarized_array_ref): Cope with offsets being NULL.
(gfc_walk_function_expr): Return an array ss for the result of
an allocatable class array function.
* trans-expr.c (gfc_conv_subref_array_arg): Remove the assert
that the argument should be a variable. If an allocatable class
array function, set the offset to zero and skip the write-out
loop in this case.
(gfc_conv_procedure_call): Add allocatable class array function
to the assert. Call gfc_conv_subref_array_arg for allocatable
class array function arguments with derived type formal arg.
Add the code for handling allocatable class functions, including
finalization calls to prevent memory leaks.
(arrayfunc_assign_needs_temporary): Return if an allocatable
class array function.
(gfc_trans_assignment_1): Set must_finalize to rhs expression
for allocatable class functions. Set scalar_to_array as needed
for scalar class allocatable functions assigned to an array.
Nullify the allocatable components corresponding to the lhs
derived type so that the finalization does not free them.

2015-01-27  Paul Thomas  pa...@gcc.gnu.org

PR fortran/63205
* gfortran.dg/class_to_type_4.f90: New test
Index: gcc/fortran/gfortran.h
===
*** gcc/fortran/gfortran.h  (revision 208092)
--- gcc/fortran/gfortran.h  (working copy)
*** typedef struct gfc_expr
*** 1753,1758 
--- 1753,1761 
/* Mark an expression as being a MOLD argument of ALLOCATE.  */
unsigned int mold : 1;

+   /* Will require finalization after use.  */
+   unsigned int must_finalize : 1;
+
/* If an expression comes from a Hollerith constant or compile-time
   evaluation of a transfer statement, it may have a prescribed target-
   memory representation, and these cannot always be backformed from
*** bool gfc_expr_check_typed (gfc_expr*, gf
*** 2804,2809 
--- 2807,2814 

  gfc_component * gfc_get_proc_ptr_comp (gfc_expr *);
  bool gfc_is_proc_ptr_comp (gfc_expr *);
+ bool gfc_is_alloc_class_scalar_function (gfc_expr *);
+ bool gfc_is_alloc_class_array_function (gfc_expr *);

  bool gfc_ref_this_image (gfc_ref *ref);
  bool gfc_is_coindexed (gfc_expr *);
Index: gcc/fortran/expr.c
===
*** gcc/fortran/expr.c  (revision 208092)
--- gcc/fortran/expr.c  (working copy)
*** gfc_is_proc_ptr_comp (gfc_expr *expr)
*** 4274,4279 
--- 4274,4313 
  }


+ /* Determine if an expression is a function with an allocatable class scalar
+result.  */
+ bool
+ gfc_is_alloc_class_scalar_function (gfc_expr *expr)
+ {
+   if (expr->expr_type == EXPR_FUNCTION
+       && expr->value.function.esym
+       && expr->value.function.esym->result
+       && expr->value.function.esym->result->ts.type == BT_CLASS
+       && !CLASS_DATA (expr->value.function.esym->result)->attr.dimension
+       && CLASS_DATA (expr->value.function.esym->result)->attr.allocatable)
+     return true;
+
+   return false;
+ }
+
+
+ /* Determine if an expression is a

Re: [PATCH][RFA][PR target/15184] Partial fix for direct byte access on x86

2015-01-27 Thread Segher Boessenkool

On Tue, Jan 27, 2015 at 12:27:38PM -0700, Jeff Law wrote:
 On 01/26/15 22:11, Segher Boessenkool wrote:
 On Mon, Jan 26, 2015 at 08:07:29PM -0700, Jeff Law wrote:
 The second change we need is an additional simplification.
 
 If we have
 (subreg:M1 (zero_extend:M2 (x))
 
 Where M1 > M2 and both are scalar integer modes.  It's advantageous to
 strip the SUBREG and instead have a wider extension.
 
 Should you also check M1 is not multiple registers?
 We're generally working with pseudos, so we could estimate, but not know 
 for sure if we're dealing with multiple hard regs.  But more 
 importantly, I'm not sure what that check would buy us.

I mean e.g. DI on a 32-bit target.  My worry is that zero_extend:DI then
is more expensive -- if say, it is implemented as a split, combine itself
cannot get rid of the redundancy.

 Earlier versions checked reg_equal_p on the MEM.  But that's often a 
 mistake because the modes of the two memory references may be different. 
  I don't recall which of the various tests, but I was definitely seeing 
 SImode in the load and HImode in the store.
 
 Similarly you don't want to check reg_equal_p on the addresses as they 
 aren't necessarily the same either (they're obviously related).
 
 That's how I ultimately settled on rtx_referenced_p form you see above. 
  I'm still not sure that's 100% what I want, but I don't have any tests 
 yet which require something more complex.

Okay, if there are actual real cases like that :-)  All this code does
is cull cases that are not useful to try to combine, since without that
combining four insns is very expensive.

 Does this do anything good for the dec mem thing on x86?  That would
 be a nice bonus :-)
 It might, but I haven't tested for that specifically.  If you've got 
 sample code or a PR in mind, pass it along and I'll take a look.  I'd 
 think dec mem would generally be handled by 3->1 insn combination code 
 unless there's something else going on.

I do have a specific PR in mind, but I cannot currently find it.  It was
about x86, dec mem and then using the flags...  Must have sent 100 emails
in that thread...  And cannot find it now!


Segher

Re: [PATCH][RFA][PR target/15184] Partial fix for direct byte access on x86


On 01/27/15 13:36, Segher Boessenkool wrote:

On Tue, Jan 27, 2015 at 12:27:38PM -0700, Jeff Law wrote:

On 01/26/15 22:11, Segher Boessenkool wrote:

On Mon, Jan 26, 2015 at 08:07:29PM -0700, Jeff Law wrote:

The second change we need is an additional simplification.

If we have
(subreg:M1 (zero_extend:M2 (x))

Where M1 > M2 and both are scalar integer modes.  It's advantageous to
strip the SUBREG and instead have a wider extension.


Should you also check M1 is not multiple registers?

We're generally working with pseudos, so we could estimate, but not know
for sure if we're dealing with multiple hard regs.  But more
importantly, I'm not sure what that check would buy us.


I mean e.g. DI on a 32-bit target.  My worry is that zero_extend:DI then
is more expensive -- if say, it is implemented as a split, combine itself
cannot get rid of the redundancy.
We might lose for a case like (subreg:DI ({zero,sign}_extend:SI (x))) on 
a 32 bit target if something were able to recognize that the upper bits 
were don't cares.


The most likely place for that to happen would be at assembly output 
time -- but that would require the target to exploit the don't care 
semantics of those bits.  I don't recall any port doing that.


We could exploit this in generic splitting code, but I don't think we 
do.  lower-subreg slams in a zero when it finds a paradoxical subreg and 
we've asked for the high word.  I don't immediately see that does 
anything special when the operand of the subreg is anything other than 
another reg or mem.


combine exploits the don't care nature of those bits to eliminate 
masking and such.  It's not going to be able to eliminate the subreg 
entirely unless it folds into some later insn and we're ultimately able 
to narrow the operation back down to SImode.



Also note that while ports may not have special cases around the subreg 
variant, several have special cases for ZERO_EXTEND.  Basically they 
slam in a zero to the upper word, either via a splitter or during 
assembly code output.  Those special cases will be recognized more often 
now.



Jeff
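
To illustrate the point about the upper bits being don't cares, here is a
minimal C sketch.  It is not GCC code, and the function names are made up for
illustration.  It models (subreg:DI (zero_extend:SI (x))) as a 64-bit
container whose upper half holds arbitrary junk, and checks that its low 32
bits match a direct zero extension to 64 bits.  That agreement is what makes
the wider extension a valid replacement, at least in this simplified model.

```c
#include <assert.h>
#include <stdint.h>

/* Zero-extend an 8-bit value directly to 64 bits, loosely analogous
   to (zero_extend:DI (x)).  */
uint64_t
zext_wide (uint8_t x)
{
  return (uint64_t) x;
}

/* Zero-extend to 32 bits, then place the result in a 64-bit container.
   The upper 32 bits are filled with JUNK to model the don't-care bits
   of a paradoxical subreg.  */
uint64_t
zext_narrow_in_wide (uint8_t x, uint32_t junk)
{
  uint32_t narrow = (uint32_t) x;
  return ((uint64_t) junk << 32) | narrow;
}

/* Check every 8-bit input and return how many values were checked.  */
int
check_low_bits_agree (void)
{
  int checked = 0;
  for (unsigned v = 0; v <= UINT8_MAX; v++)
    {
      uint64_t wide = zext_wide ((uint8_t) v);
      uint64_t mixed = zext_narrow_in_wide ((uint8_t) v, 0xdeadbeefu);
      /* Only the low 32 bits are defined in the subreg form.  */
      assert ((uint32_t) wide == (uint32_t) mixed);
      checked++;
    }
  return checked;
}
```

The sketch says nothing about cost, which is the actual concern raised above:
on a 32-bit target the DImode extension may still be more expensive than the
subreg form.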

Re: [PATCH][RFA][PR target/15184] Partial fix for direct byte access on x86


On 01/27/15 13:36, Segher Boessenkool wrote:

I mean e.g. DI on a 32-bit target.  My worry is that zero_extend:DI then
is more expensive -- if say, it is implemented as a split, combine itself
cannot get rid of the redundancy.

OK.  Let me play with that a bit.



Okay, if there are actual real cases like that :-)  All this code does
is cull cases that are not useful to try to combine, since without that
combining four insns is very expensive.

There are :-)  It surprised me as well.




Does this do anything good for the dec mem thing on x86?  That would
be a nice bonus :-)

It might, but I haven't tested for that specifically.  If you've got
sample code or a PR in mind, pass it along and I'll take a look.  I'd
think dec mem would generally be handled by 3->1 insn combination code
unless there's something else going on.


I do have a specific PR in mind, but I cannot currently find it.  It was
about x86, dec mem and then using the flags...  Must have sent 100 emails
in that thread...  And cannot find it now!

Are you referring to 61225?

Jeff

Re: Fix 59828 - Broken assembly on ppc* with two -mcpu= options

2015-01-27 Thread Alan Modra

On Wed, Jan 21, 2015 at 02:01:44PM -0500, David Edelsohn wrote:
 I want to avoid duplicating the -mcpu parsing logic or the Rube
 Goldberg mechanism to re-generate the -mXXX assembler directive.

Oh well, I had fun writing the patch.  I thought it reasonably
elegant, meeting the goals you state above.  You think differently,
and I won't push my approach further.  The bug isn't important enough
to argue over.

-- 
Alan Modra
Australia Development Lab, IBM

RE: [Patch][wwwdocs]Deprecate the ARM TPCS related options in gcc 5.0

2015-01-27 Thread Terry Guo



 -Original Message-
 From: Gerald Pfeifer [mailto:ger...@pfeifer.com]
 Sent: Monday, January 26, 2015 7:34 PM
 To: Terry Guo
 Cc: gcc-patches@gcc.gnu.org; Richard Earnshaw; Ramana Radhakrishnan
 Subject: Re: [Patch][wwwdocs]Deprecate the ARM TPCS related options in
 gcc 5.0
 
 On Monday 2015-01-26 16:47, Terry Guo wrote:
  This patch intends to update gcc 5.0 change.html to deprecate TPCS
  related options because TPCS is obsoleted per the ABI document at
 
 http://infocenter.arm.com/help/topic/com.arm.doc.ihi0042e/IHI0042E_aapcs.pdf.
  Is it OK?
 
 From a language perspective I suggest to say "The options <list the
 options here> related to the old ABI..." or "The options related to the
 old ABI -- <list the options> -- ...", where I somewhat prefer the former.
 
 Please wait for Richard or Ramana for final review and approval.
 
 Gerald

Thanks Gerald. Patch is updated. Is this one OK?

BR,
Terry

Index: htdocs/gcc-5/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v
retrieving revision 1.73
diff -u -p -r1.73 changes.html
--- htdocs/gcc-5/changes.html   26 Jan 2015 09:40:03 -  1.73
+++ htdocs/gcc-5/changes.html   27 Jan 2015 09:35:32 -
@@ -513,8 +513,9 @@ void operator delete[] (void *, std::siz
   <li> The deprecated option <code>-mwords-little-endian</code> has been removed.
   </li>
-  <li> The options relating to the old ABI <code>-mapcs</code> and
-  <code>-mapcs-frame</code> have been deprecated.
+  <li> The options <code>-mapcs</code>, <code>-mapcs-frame</code>,
+  <code>-mtpcs-frame</code> and <code>-mtpcs-leaf-frame</code>
+  which are only applicable to the old ABI have been deprecated.
   </li>
   <li>The transitional options <code>-mlra</code> and <code>-mno-lra</code>
 have been removed. The ARM backend now uses the local register allocator

Re: Fix 59828 - Broken assembly on ppc* with two -mcpu= options

2015-01-27 Thread David Edelsohn

On Tue, Jan 27, 2015 at 7:27 PM, Alan Modra amo...@gmail.com wrote:
 On Wed, Jan 21, 2015 at 02:01:44PM -0500, David Edelsohn wrote:
 I want to avoid duplicating the -mcpu parsing logic or the Rube
 Goldberg mechanism to re-generate the -mXXX assembler directive.

 Oh well, I had fun writing the patch.  I thought it reasonably
 elegant, meeting the goals you state above.  You think differently,
 and I won't push my approach further.  The bug isn't important enough
 to argue over.

Alan,

I am sorry that you do not want to finish the patch.  I don't
understand why you find the command line argument so appealing when
the .machine pseudo-op was designed for this purpose.

Thanks, David

Re: [PATCH][AArch64] Improve bit-test-branch pattern to avoid unnecessary register clobber

On Tue, Jan 27, 2015 at 02:31:14PM +, Jiong Wang wrote:
 The testcase is changed to an execution test and moved to gcc.dg.  Compiling
 it now takes only several seconds.  (Previously I was using a cc1 built with
 -O0, which took at most 24s.)
 
 ok to install?

Ok for the testcase.
The config/aarch64/ bits I'll defer to aarch64 maintainers.

 2015-01-19  Ramana Radhakrishnan  ramana.radhakrish...@arm.com
 Jiong Wang  jiong.w...@arm.com
 
   gcc/
 * config/aarch64/aarch64.md (tb<optab><mode>1): Clobber CC reg instead of
 scratch reg.
 (cb<optab><mode>1): Likewise.
 * config/aarch64/iterators.md (bcond): New define_code_attr.
 
   gcc/testsuite/
 * gcc.dg/long_branch.c: New testcase.

Jakub

RE: [PATCH RFA MIPS] Prohibit vector modes in accumulators

2015-01-27 Thread Moore, Catherine

 -Original Message-
 From: Matthew Fortune [mailto:matthew.fort...@imgtec.com]
 Sent: Tuesday, January 27, 2015 7:19 AM
 To: Richard Sandiford
 Cc: Robert Suchanek; gcc-patches@gcc.gnu.org; Moore, Catherine
 Subject: RE: [PATCH RFA MIPS] Prohibit vector modes in accumulators

 Richard Sandiford rdsandif...@googlemail.com writes:
  Matthew Fortune matthew.fort...@imgtec.com writes:
   2015-01-23  Robert Suchanek  robert.sucha...@imgtec.com

    * config/mips/mips.c (mips_hard_regno_mode_ok_p): Prohibit accumulators
    for all vector modes.

   This seems like a genuine bug and although it can only be triggered
   by loongson or paired-single support it probably qualifies for fixing.

  Agreed FWIW.  We shouldn't mark something as valid for a mode if even
  the mode's move pattern can't handle it.

  I think this kind of thing should go in regardless of development stage.

 Given that it was one of the pre-existing tests that failed I'm happy that we
 are covering this issue. All of these LRA related issues are likely to phase
 in and out with subtle changes to code-gen so I don't think we can always get
 a test case that fails on trunk.

That's true.

 Since Catherine asked for further info then I will leave her to say if she is
 happy to accept on this basis.

I withdraw my request for a testcase.

Catherine

[RFC PATCH] Avoid most of the BUILT_IN_*_CHKP enum values

Hi!

I've grepped for BUILT_IN_.*_CHKP in the sources and we actually need
far fewer enum values than the 1204 that are being defined.

This patch requires builtins.def to say explicitly (by using
DEF_*BUILTIN_CHKP macro instead of corresponding DEF_*BUILTIN) which
ones need that, for all the others only space in the enum is reserved and
nothing else.

I'd hope this could work around the buggy AIX stabs handling, but even
on say x86_64-linux it has a benefit of decreasing cc1plus .debug_info
by about 2.7MB (of course, with dwz that benefit goes to almost nothing,
just the ~ 7000 bytes or so, plus .debug_str cost (that is merged even
without dwz between TUs)).  The cost without dwz is obviously mainly
from repeating that in most of the translation units.  But why declare
BUILT_IN_*_CHKP enums that are never used by anything...
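
For readers unfamiliar with the scheme, the following is a small
self-contained C sketch of the idea, not the actual builtins.def or
tree-core.h code.  All DEMO_* names are hypothetical.  Every entry gets its
ordinary enumerator, the checked range stays fully reserved, but a named
*_CHKP enumerator is only emitted for entries that ask for one, using the same
"plain value + BEGIN + 1" rule as in the patch.

```c
#include <assert.h>

/* Hypothetical miniature of a builtins.def-style list.  PLAIN entries
   only reserve a slot in the checked range; CHKP entries additionally
   get a named *_CHKP enumerator.  */
#define DEMO_BUILTINS(PLAIN, CHKP) \
  PLAIN (DEMO_ABS)                 \
  CHKP (DEMO_MEMCPY)               \
  PLAIN (DEMO_SQRT)                \
  CHKP (DEMO_STRLEN)

#define AS_ENUM(NAME) NAME,
#define AS_CHKP_ENUM(NAME) NAME##_CHKP = NAME + DEMO_BEGIN_CHKP + 1,
#define AS_NOTHING(NAME)

enum demo_builtin
{
  /* First pass: every builtin gets its ordinary enumerator.  */
  DEMO_BUILTINS (AS_ENUM, AS_ENUM)
  DEMO_BEGIN_CHKP,
  /* Second pass: named values only for the CHKP entries.  */
  DEMO_BUILTINS (AS_NOTHING, AS_CHKP_ENUM)
  /* The whole range stays reserved, whether or not a value is named.  */
  DEMO_END_CHKP = DEMO_BEGIN_CHKP * 2 + 1
};

/* Compile-time check of the offset rule for one named entry.  */
static_assert (DEMO_MEMCPY_CHKP == DEMO_MEMCPY + DEMO_BEGIN_CHKP + 1,
	       "CHKP value must be plain value + BEGIN + 1");

/* The checked counterpart of any plain value can still be computed,
   even when no enumerator names it.  */
int
demo_chkp_value (int plain)
{
  return plain + DEMO_BEGIN_CHKP + 1;
}
```

With this layout DEMO_SQRT has no DEMO_SQRT_CHKP name, yet
demo_chkp_value (DEMO_SQRT) still yields its reserved slot, which is the
property the patch relies on.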

2015-01-27  Jakub Jelinek  ja...@redhat.com

* builtins.def (DEF_BUILTIN_CHKP): Define if not defined.
(DEF_LIB_BUILTIN_CHKP, DEF_EXT_LIB_BUILTIN_CHKP): Redefine.
(DEF_CHKP_BUILTIN): Define using DEF_BUILTIN_CHKP instead
of DEF_BUILTIN.
(BUILT_IN_MEMCPY, BUILT_IN_MEMMOVE, BUILT_IN_MEMSET, BUILT_IN_STRCAT,
BUILT_IN_STRCHR, BUILT_IN_STRCPY, BUILT_IN_STRLEN): Use
DEF_LIB_BUILTIN_CHKP macro instead of DEF_LIB_BUILTIN.
(BUILT_IN_MEMCPY_CHK, BUILT_IN_MEMMOVE_CHK, BUILT_IN_MEMPCPY_CHK,
BUILT_IN_MEMPCPY, BUILT_IN_MEMSET_CHK, BUILT_IN_STPCPY_CHK,
BUILT_IN_STPCPY, BUILT_IN_STRCAT_CHK, BUILT_IN_STRCPY_CHK): Use
DEF_EXT_LIB_BUILTIN_CHKP macro instead of DEF_EXT_LIB_BUILTIN.
* tree-core.h (enum built_in_function): In between
BEGIN_CHKP_BUILTINS and END_CHKP_BUILTINS only define enum values
for builtins that use DEF_BUILTIN_CHKP macro.

--- gcc/builtins.def.jj 2015-01-15 23:39:10.0 +0100
+++ gcc/builtins.def	2015-01-27 15:04:44.860924664 +0100
@@ -63,6 +63,16 @@ along with GCC; see the file COPYING3.
 
The builtins is registered only if COND is true.  */
 
+/* A macro for builtins where the
+   BUILT_IN_*_CHKP = BUILT_IN_* + BEGIN_CHKP_BUILTINS + 1
+   enums should be defined too.  */
+#ifndef DEF_BUILTIN_CHKP
+#define DEF_BUILTIN_CHKP(ENUM, NAME, CLASS, TYPE, LIBTYPE, BOTH_P, \
+FALLBACK_P, NONANSI_P, ATTRS, IMPLICIT, COND)  \
+  DEF_BUILTIN(ENUM, NAME, CLASS, TYPE, LIBTYPE, BOTH_P, FALLBACK_P,\
+ NONANSI_P, ATTRS, IMPLICIT, COND)
+#endif
+
 /* A GCC builtin (like __builtin_saveregs) is provided by the
compiler, but does not correspond to a function in the standard
library.  */
@@ -87,6 +97,10 @@ along with GCC; see the file COPYING3.
 #define DEF_LIB_BUILTIN(ENUM, NAME, TYPE, ATTRS)   \
   DEF_BUILTIN (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE, TYPE,   \
   true, true, false, ATTRS, true, true)
+#undef DEF_LIB_BUILTIN_CHKP
+#define DEF_LIB_BUILTIN_CHKP(ENUM, NAME, TYPE, ATTRS)  \
+  DEF_BUILTIN_CHKP (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE,\
+   TYPE, true, true, false, ATTRS, true, true)
 
 /* Like DEF_LIB_BUILTIN, except that the function is not one that is
specified by ANSI/ISO C.  So, when we're being fully conformant we
@@ -96,6 +110,10 @@ along with GCC; see the file COPYING3.
 #define DEF_EXT_LIB_BUILTIN(ENUM, NAME, TYPE, ATTRS)   \
   DEF_BUILTIN (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE, TYPE,   \
   true, true, true, ATTRS, false, true)
+#undef DEF_EXT_LIB_BUILTIN_CHKP
+#define DEF_EXT_LIB_BUILTIN_CHKP(ENUM, NAME, TYPE, ATTRS)  \
+  DEF_BUILTIN_CHKP (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE,\
+   TYPE, true, true, true, ATTRS, false, true)
 
 /* Like DEF_LIB_BUILTIN, except that the function is only a part of
the standard in C94 or above.  */
@@ -199,8 +217,8 @@ along with GCC; see the file COPYING3.
 /* Builtin used by the implementation of Pointer Bounds Checker.  */
 #undef DEF_CHKP_BUILTIN
 #define DEF_CHKP_BUILTIN(ENUM, NAME, TYPE, ATTRS) \
-  DEF_BUILTIN (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE, TYPE,\
-  true, true, false, ATTRS, true, true)
+  DEF_BUILTIN_CHKP (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE,\
+   TYPE, true, true, false, ATTRS, true, true)
 
 /* Define an attribute list for math functions that are normally
impure because some of them may write into global memory for
@@ -595,22 +613,22 @@ DEF_EXT_LIB_BUILTIN(BUILT_IN_BZERO,
 DEF_EXT_LIB_BUILTIN(BUILT_IN_INDEX, "index", BT_FN_STRING_CONST_STRING_INT, ATTR_PURE_NOTHROW_NONNULL_LEAF)
 DEF_LIB_BUILTIN(BUILT_IN_MEMCHR, "memchr", BT_FN_PTR_CONST_PTR_INT_SIZE, ATTR_PURE_NOTHROW_NONNULL_LEAF)
 DEF_LIB_BUILTIN(BUILT_IN_MEMCMP, "memcmp", BT_FN_INT_CONST_PTR_CONST_PTR_SIZE, ATTR_PURE_NOTHROW_NONNULL_LEAF)
-DEF_LIB_BUILTIN(BUILT_IN_MEMCPY, "memcpy", BT_FN_PTR_PTR_CONST_PTR_SIZE, ATTR_RET1_NOTHROW_NONNULL_LEAF)
-DEF_LIB_BUILTIN(BUILT_IN_MEMMOVE, "memmove",

Re: RFA: patch to fix a bad code generation for PR64110 -- new constraints addition


On 01/27/15 07:08, Richard Sandiford wrote:


Yeah, but in practice that's only ever going to be a partial transition.
Many port maintainers won't look at this, so we'll have to support both
versions indefinitely, even if the new behaviour turns out to be the
best for all cases.
Yes, most likely.   I find myself pondering the related question of how 
we get ports to transition to LRA and if we could tie these together. 
Maintainers are going to need to transition to LRA if we're ever going 
to start removing blobs of reload.  As a part of that transition they're 
presumably going to be looking closely at their backend and could make 
the constraint transition.


In an ideal world, we'd declare release X.Y as a cut-off point.  Ports 
that haven't transitioned to LRA get deprecated at that point.  Those 
ports are the ones most likely not to make the constraint transition as 
well.  I think we would have to consider any uses of ?! that remain 
after that point as intentional.




I just think we're going to regret having two sets of constraints with
such subtly different meanings.
But isn't that inevitable?  While I suspect that most instances of ?! 
should be converted, there may be some that should not.  If that's the 
case then we're going to have both forever.





Looking back at the original PR, Jakub said:

   The ! has been added by me for PR63594, so it isn't there from the era
   when i?86 backend was using reload.  If there is a better way to
   express that RA should prefer to use memory or xmm register and only
   use r constraint if it already is in a r register and doesn't need to
   be reloaded, I can use that.  Whether it is ?, ??? or something else.
   ! description in gcc docs just fitted most what I wanted...

In some ways this seems to match the intention of *.  Originally I think
it was just an RA-only thing and was ignored by reload, but LRA does take it
into account too (which sounds like progress to me).

If I revert the patch locally and change the *vec_dup<mode> pattern to
use *, it passes both the test for PR64110 and the tests for PR63594.
Would that be OK as an alternative?

I think that's up to Uros and Jakub to sort out.

Jeff

Re: [PATCH] Fix ada bootstrap under cygwin-64

2015-01-27 Thread Arnaud Charlet

 this patch fixes the ada bootstrap under cygwin-64.
 
 Boot-strapped under x86_64-pc-cygwin.
 OK for trunk?

OK

Re: [RFC PATCH] Avoid most of the BUILT_IN_*_CHKP enum values

2015-01-27 17:27 GMT+03:00 Jakub Jelinek ja...@redhat.com:
 Hi!

 I've grepped for BUILT_IN_.*_CHKP in the sources and we actually need
 far fewer enum values than the 1204 that are being defined.

 This patch requires builtins.def to say explicitly (by using
 DEF_*BUILTIN_CHKP macro instead of corresponding DEF_*BUILTIN) which
 ones need that, for all the others only space in the enum is reserved and
 nothing else.

 I'd hope this could work around the buggy AIX stabs handling, but even
 on say x86_64-linux it has a benefit of decreasing cc1plus .debug_info
 by about 2.7MB (of course, with dwz that benefit goes to almost nothing,
 just the ~ 7000 bytes or so, plus .debug_str cost (that is merged even
 without dwz between TUs)).  The cost without dwz is obviously mainly
 from repeating that in most of the translation units.  But why declare
 BUILT_IN_*_CHKP enums that are never used by anything...

Enum values not mentioned in the code are not fully useless.  When we
have builtin functions defined as 'always_inline' functions, they are
instrumented and enum names may be used in dumps and debugging.
That's not a big value though.  Thanks a lot for taking care of it!

Ilya

 2015-01-27  Jakub Jelinek  ja...@redhat.com

 * builtins.def (DEF_BUILTIN_CHKP): Define if not defined.
 (DEF_LIB_BUILTIN_CHKP, DEF_EXT_LIB_BUILTIN_CHKP): Redefine.
 (DEF_CHKP_BUILTIN): Define using DEF_BUILTIN_CHKP instead
 of DEF_BUILTIN.
 (BUILT_IN_MEMCPY, BUILT_IN_MEMMOVE, BUILT_IN_MEMSET, BUILT_IN_STRCAT,
 BUILT_IN_STRCHR, BUILT_IN_STRCPY, BUILT_IN_STRLEN): Use
 DEF_LIB_BUILTIN_CHKP macro instead of DEF_LIB_BUILTIN.
 (BUILT_IN_MEMCPY_CHK, BUILT_IN_MEMMOVE_CHK, BUILT_IN_MEMPCPY_CHK,
 BUILT_IN_MEMPCPY, BUILT_IN_MEMSET_CHK, BUILT_IN_STPCPY_CHK,
 BUILT_IN_STPCPY, BUILT_IN_STRCAT_CHK, BUILT_IN_STRCPY_CHK): Use
 DEF_EXT_LIB_BUILTIN_CHKP macro instead of DEF_EXT_LIB_BUILTIN.
 * tree-core.h (enum built_in_function): In between
 BEGIN_CHKP_BUILTINS and END_CHKP_BUILTINS only define enum values
 for builtins that use DEF_BUILTIN_CHKP macro.

 --- gcc/builtins.def.jj 2015-01-15 23:39:10.0 +0100
 +++ gcc/builtins.def	2015-01-27 15:04:44.860924664 +0100
 @@ -63,6 +63,16 @@ along with GCC; see the file COPYING3.

 The builtins is registered only if COND is true.  */

 +/* A macro for builtins where the
 +   BUILT_IN_*_CHKP = BUILT_IN_* + BEGIN_CHKP_BUILTINS + 1
 +   enums should be defined too.  */
 +#ifndef DEF_BUILTIN_CHKP
 +#define DEF_BUILTIN_CHKP(ENUM, NAME, CLASS, TYPE, LIBTYPE, BOTH_P, \
 +FALLBACK_P, NONANSI_P, ATTRS, IMPLICIT, COND)  \
 +  DEF_BUILTIN(ENUM, NAME, CLASS, TYPE, LIBTYPE, BOTH_P, FALLBACK_P,\
 + NONANSI_P, ATTRS, IMPLICIT, COND)
 +#endif
 +
  /* A GCC builtin (like __builtin_saveregs) is provided by the
 compiler, but does not correspond to a function in the standard
 library.  */
 @@ -87,6 +97,10 @@ along with GCC; see the file COPYING3.
  #define DEF_LIB_BUILTIN(ENUM, NAME, TYPE, ATTRS)   \
DEF_BUILTIN (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE, TYPE,   \
true, true, false, ATTRS, true, true)
 +#undef DEF_LIB_BUILTIN_CHKP
 +#define DEF_LIB_BUILTIN_CHKP(ENUM, NAME, TYPE, ATTRS)  \
 +  DEF_BUILTIN_CHKP (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE,\
 +   TYPE, true, true, false, ATTRS, true, true)

  /* Like DEF_LIB_BUILTIN, except that the function is not one that is
 specified by ANSI/ISO C.  So, when we're being fully conformant we
 @@ -96,6 +110,10 @@ along with GCC; see the file COPYING3.
  #define DEF_EXT_LIB_BUILTIN(ENUM, NAME, TYPE, ATTRS)   \
DEF_BUILTIN (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE, TYPE,   \
true, true, true, ATTRS, false, true)
 +#undef DEF_EXT_LIB_BUILTIN_CHKP
 +#define DEF_EXT_LIB_BUILTIN_CHKP(ENUM, NAME, TYPE, ATTRS)  \
 +  DEF_BUILTIN_CHKP (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE,\
 +   TYPE, true, true, true, ATTRS, false, true)

  /* Like DEF_LIB_BUILTIN, except that the function is only a part of
 the standard in C94 or above.  */
 @@ -199,8 +217,8 @@ along with GCC; see the file COPYING3.
  /* Builtin used by the implementation of Pointer Bounds Checker.  */
  #undef DEF_CHKP_BUILTIN
  #define DEF_CHKP_BUILTIN(ENUM, NAME, TYPE, ATTRS) \
 -  DEF_BUILTIN (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE, TYPE,\
 -  true, true, false, ATTRS, true, true)
 +  DEF_BUILTIN_CHKP (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE,\
 +   TYPE, true, true, false, ATTRS, true, true)

  /* Define an attribute list for math functions that are normally
 impure because some of them may write into global memory for
 @@ -595,22 +613,22 @@ DEF_EXT_LIB_BUILTIN(BUILT_IN_BZERO,
  DEF_EXT_LIB_BUILTIN(BUILT_IN_INDEX, "index", BT_FN_STRING_CONST_STRING_INT, ATTR_PURE_NOTHROW_NONNULL_LEAF)

[PATCH] Add comdat_group effective target (PR bootstrap/64612)

Hi!

This patch introduces a new effective target check and uses it in pr64612.C:
if comdat groups aren't used, there is no guarantee that the D2 dtor will
always be emitted alongside the D1 dtor.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2015-01-27  Jakub Jelinek  ja...@redhat.com

PR bootstrap/64612
* lib/target-supports.exp (check_effective_target_comdat_group): New.
* g++.dg/ipa/pr64612.C: Guard scan-assembler test with
{ target comdat_group }.

* doc/sourcebuild.texi (comdat_group): Document.

--- gcc/testsuite/lib/target-supports.exp.jj	2015-01-15 23:39:06.0 +0100
+++ gcc/testsuite/lib/target-supports.exp	2015-01-26 15:24:55.325236098 +0100
@@ -6198,3 +6198,13 @@ proc check_effective_target_pie_copyrelo
 
 return $pie_copyreloc_available_saved
 }
+
+# Return 1 if the target uses comdat groups.
+
+proc check_effective_target_comdat_group {} {
+    return [check_no_messages_and_pattern comdat_group "\.section\[^\n\r]*,comdat" assembly {
+   // C++
+   inline int foo () { return 1; }
+   int (*fn) () = foo;
+}]
+}
--- gcc/testsuite/g++.dg/ipa/pr64612.C.jj	2015-01-26 15:25:43.301410027 +0100
+++ gcc/testsuite/g++.dg/ipa/pr64612.C  2015-01-26 15:23:11.380025863 +0100
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O3 -std=c++11" } */
-/* { dg-final { scan-assembler "_ZN5QListI7QStringED1Ev" } } */
+/* { dg-final { scan-assembler "_ZN5QListI7QStringED1Ev" { target comdat_group } } } */
 
 class A
 {
--- gcc/doc/sourcebuild.texi.jj 2015-01-15 23:39:02.0 +0100
+++ gcc/doc/sourcebuild.texi	2015-01-27 16:07:37.504081520 +0100
@@ -1930,6 +1930,9 @@ Target supports @code{wchar_t} that is c
 
 @item wchar_t_char32_t_compatible
 Target supports @code{wchar_t} that is compatible with @code{char32_t}.
+
+@item comdat_group
+Target uses comdat groups.
 @end table
 
 @subsubsection Local to tests in @code{gcc.target/i386}

Jakub

[PATCH PR64809]

2015-01-27 Thread Yuri Rumyantsev

Hi All,

Here is a simple patch that cures ICE - skip debug gimples.
Test is also included.

Bootstrap and regression testing did not show any new failures.

Is it OK for trunk?

ChangeLog:

2015-01-27  Yuri Rumyantsev  ysrum...@gmail.com

PR tree-optimization/64809
* cfgexpand.c (reorder_operands): Skip debug gimples.

gcc/testsuite/ChangeLog

* gcc.dg/pr64809.c: New test.


patch
Description: Binary data

Re: [Patch, Fortran] PR64771 - Fix coarray ICE

2015-01-27 Thread Rainer Orth

Steve Kargl s...@troutmask.apl.washington.edu writes:

 On Sat, Jan 24, 2015 at 06:13:04PM +0100, Tobias Burnus wrote:
    if (s1->as->type == AS_EXPLICIT)
 -    for (i = 0; i < s1->as->rank + s1->as->corank; i++)
 +    for (i = 0; i < s1->as->rank + std::max(0, s1->as->corank-1); i++)

 Doesn't this require '#include <algorithm>'?
 I suspect that you are depending on namespace pollution
 via some other header (coretypes.h?).

It was committed with that change, which unfortunately broke Solaris
bootstrap:

In file included from ./config.h:6:0,
 from /vol/gcc/src/hg/trunk/local/gcc/fortran/interface.c:68:
./auto-host.h:2055:0: error: _FILE_OFFSET_BITS redefined [-Werror]
 #define _FILE_OFFSET_BITS 64
 ^
In file included from /usr/include/iso/stdlib_iso.h:24:0,
 from /usr/include/stdlib.h:11,
                 from /var/gcc/regression/trunk/11-gcc/build/prev-i386-pc-solaris2.11/libstdc++-v3/include/cstdlib:72,
                 from /var/gcc/regression/trunk/11-gcc/build/prev-i386-pc-solaris2.11/libstdc++-v3/include/bits/stl_algo.h:59,
                 from /var/gcc/regression/trunk/11-gcc/build/prev-i386-pc-solaris2.11/libstdc++-v3/include/algorithm:62,
                 from /vol/gcc/src/hg/trunk/local/gcc/fortran/interface.c:66:
/var/gcc/regression/trunk/11-gcc/build/prev-gcc/include-fixed/sys/feature_tests.h:213:0: note: this is the location of the previous definition
 #define _FILE_OFFSET_BITS 32
 ^

The problem is (as so often) that <algorithm> was included *before*
config.h.  Moving it after the other includes allows interface.c to
compile without warnings.

Ok for mainline?

Rainer


2015-01-27  Rainer Orth  r...@cebitec.uni-bielefeld.de

gcc/fortran:
* interface.c: Include <algorithm> after config.h.

diff --git a/gcc/fortran/interface.c b/gcc/fortran/interface.c
--- a/gcc/fortran/interface.c
+++ b/gcc/fortran/interface.c
@@ -63,8 +63,6 @@ along with GCC; see the file COPYING3.  
formal argument list points to symbols within the same namespace as
the program unit name.  */
 
-#include <algorithm>  /* For std::max.  */
-
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"
@@ -73,6 +71,8 @@ along with GCC; see the file COPYING3.  
 #include "match.h"
 #include "arith.h"
 
+#include <algorithm>  /* For std::max.  */
+
 /* The current_interface structure holds information about the
interface currently being parsed.  This structure is saved and
restored during recursive interfaces.  */

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [Patch, Fortran] PR64771 - Fix coarray ICE

On Tue, Jan 27, 2015 at 03:55:17PM +0100, Rainer Orth wrote:
 Steve Kargl s...@troutmask.apl.washington.edu writes:
 
  On Sat, Jan 24, 2015 at 06:13:04PM +0100, Tobias Burnus wrote:
     if (s1->as->type == AS_EXPLICIT)
  -  for (i = 0; i < s1->as->rank + s1->as->corank; i++)
  +  for (i = 0; i < s1->as->rank + std::max(0, s1->as->corank-1); i++)
 
  Doesn't this require '#include <algorithm>'?
  I suspect that you are depending on namespace pollution
  via some other header (coretypes.h?).
 
 It was committed with that change, which unfortunately broke Solaris
 bootstrap:
 
 In file included from ./config.h:6:0,
  from /vol/gcc/src/hg/trunk/local/gcc/fortran/interface.c:68:
 ./auto-host.h:2055:0: error: _FILE_OFFSET_BITS redefined [-Werror]
  #define _FILE_OFFSET_BITS 64
  ^
 In file included from /usr/include/iso/stdlib_iso.h:24:0,
  from /usr/include/stdlib.h:11,
                  from /var/gcc/regression/trunk/11-gcc/build/prev-i386-pc-solaris2.11/libstdc++-v3/include/cstdlib:72,
                  from /var/gcc/regression/trunk/11-gcc/build/prev-i386-pc-solaris2.11/libstdc++-v3/include/bits/stl_algo.h:59,
                  from /var/gcc/regression/trunk/11-gcc/build/prev-i386-pc-solaris2.11/libstdc++-v3/include/algorithm:62,
                  from /vol/gcc/src/hg/trunk/local/gcc/fortran/interface.c:66:
 /var/gcc/regression/trunk/11-gcc/build/prev-gcc/include-fixed/sys/feature_tests.h:213:0: note: this is the location of the previous definition
  #define _FILE_OFFSET_BITS 32
  ^
 
 The problem is (as so often) that <algorithm> was included *before*
 config.h.  Moving it after the other includes allows interface.c to
 compile without warnings.

Why don't you use MAX macro instead of std::max as everywhere else
in the gcc sources?

Your change is wrong, you can't include system headers after including
system.h and other headers.

Jakub
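
As a rough illustration of the MAX suggestion, here is a standalone C sketch.
The struct and function names are hypothetical stand-ins, not the gfortran
types, and MAX is defined locally so the example compiles on its own; in the
GCC sources the macro is already available and no extra header is needed,
which sidesteps the include-order problem entirely.

```c
#include <assert.h>

/* A maximum macro in the style of the MAX used across the GCC sources;
   defined locally only so that this sketch stands alone.  */
#ifndef MAX
#define MAX(X, Y) ((X) > (Y) ? (X) : (Y))
#endif

/* Hypothetical stand-in for the array-spec fields used by the loop.  */
struct demo_array_spec
{
  int rank;
  int corank;
};

/* MAX clamps a negative operand to zero, so a corank of 0 adds nothing.  */
static_assert (MAX (0, -1) == 0, "negative corank - 1 clamps to zero");

/* Number of dimensions the loop visits: the full rank plus all but the
   last codimension, never less than the rank itself.  */
int
demo_loop_bound (const struct demo_array_spec *as)
{
  return as->rank + MAX (0, as->corank - 1);
}
```

The loop header would then read
"for (i = 0; i < s1->as->rank + MAX (0, s1->as->corank - 1); i++)",
assuming the surrounding code is otherwise unchanged.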

Re: [PATCH][AArch64] Improve bit-test-branch pattern to avoid unnecessary register clobber

2015-01-27 Thread Marcus Shawcroft

On 27 January 2015 at 14:31, Jiong Wang jiong.w...@arm.com wrote:

 2015-01-19  Ramana Radhakrishnan  ramana.radhakrish...@arm.com
 Jiong Wang  jiong.w...@arm.com

   gcc/
 * config/aarch64/aarch64.md (tb<optab><mode>1): Clobber CC reg instead
 of scratch reg.
 (cb<optab><mode>1): Likewise.
 * config/aarch64/iterators.md (bcond): New define_code_attr.

OK /Marcus


   gcc/testsuite/
 * gcc.dg/long_branch.c: New testcase.

Re: [PATCH][AArch64] Improve bit-test-branch pattern to avoid unnecessary register clobber

2015-01-27 Thread Jiong Wang


On 19/01/15 10:58, Jakub Jelinek wrote:


On Mon, Jan 19, 2015 at 10:52:14AM +, Ramana Radhakrishnan wrote:

What is aarch64 specific on the testcase?


The number of if-then-else's required to get the compiler to generate
cmp branch sequences rather than the tbnz instruction.

That doesn't mean the same testcase couldn't be tested on other targets and
perhaps find bugs in there.
That said, if the testcase is too expensive to compile (several seconds is
ok, minutes is not), then perhaps it shouldn't be included at all, or should
be guarded with run_expensive_tests target.

Jakub



The testcase has been changed to an execution test and moved to gcc.dg.
Compile time is now only several seconds (previously I was using a cc1
built at -O0, which took at most 24s).

OK to install?

Thanks.

2015-01-19  Ramana Radhakrishnan  ramana.radhakrish...@arm.com
Jiong Wang  jiong.w...@arm.com

  gcc/
* config/aarch64/aarch64.md (tb<optab><mode>1): Clobber CC reg instead of
scratch reg.
(cb<optab><mode>1): Likewise.
* config/aarch64/iterators.md (bcond): New define_code_attr.

  gcc/testsuite/
* gcc.dg/long_branch.c: New testcase.
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 597ff8c..1e00396 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -466,13 +466,17 @@
 		   (const_int 0))
 	 (label_ref (match_operand 2  ))
 	 (pc)))
-   (clobber (match_scratch:DI 3 =r))]
+   (clobber (reg:CC CC_REGNUM))]
   
-  *
-  if (get_attr_length (insn) == 8)
-return \ubfx\\t%w3, %w0, %1, #1\;cbz\\t%w3, %l2\;
-  return \tbz\\t%w0, %1, %l2\;
-  
+  {
+    if (get_attr_length (insn) == 8)
+      {
+	operands[1] = GEN_INT (HOST_WIDE_INT_1U << UINTVAL (operands[1]));
+	return "tst\t%<w>0, %1\;<bcond>\t%l2";
+      }
+    else
+      return "<tbz>\t%<w>0, %1, %l2";
+  }
   [(set_attr type branch)
(set (attr length)
 	(if_then_else (and (ge (minus (match_dup 2) (pc)) (const_int -32768))
@@ -486,13 +490,21 @@
  (const_int 0))
 			   (label_ref (match_operand 1  ))
 			   (pc)))
-   (clobber (match_scratch:DI 2 =r))]
+   (clobber (reg:CC CC_REGNUM))]
   
-  *
-  if (get_attr_length (insn) == 8)
-return \ubfx\\t%w2, %w0, sizem1, #1\;cbz\\t%w2, %l1\;
-  return \tbz\\t%w0, sizem1, %l1\;
-  
+  {
+    if (get_attr_length (insn) == 8)
+      {
+	char buf[64];
+	uint64_t val = ((uint64_t ) 1)
+			<< (GET_MODE_SIZE (<MODE>mode) * BITS_PER_UNIT - 1);
+	sprintf (buf, "tst\t%%<w>0, %"PRId64, val);
+	output_asm_insn (buf, operands);
+	return "<bcond>\t%l1";
+      }
+    else
+      return "<tbz>\t%<w>0, <sizem1>, %l1";
+  }
   [(set_attr type branch)
(set (attr length)
 	(if_then_else (and (ge (minus (match_dup 1) (pc)) (const_int -32768))
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 7dd3917..bd144f9 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -823,6 +823,9 @@
 		  (smax "s") (umax "u")
 		  (smin "s") (umin "u")])
 
+;; Emit conditional branch instructions.
+(define_code_attr bcond [(eq "beq") (ne "bne") (lt "bne") (ge "beq")])
+
 ;; Emit cbz/cbnz depending on comparison type.
 (define_code_attr cbz [(eq "cbz") (ne "cbnz") (lt "cbnz") (ge "cbz")])
 
diff --git a/gcc/testsuite/gcc.dg/long_branch.c b/gcc/testsuite/gcc.dg/long_branch.c
new file mode 100644
index 000..f388a80
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/long_branch.c
@@ -0,0 +1,198 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -fno-reorder-blocks" } */
+
+void abort ();
+
+__attribute__((noinline, noclone)) int
+restore (int a, int b)
+{
+  return a * b;
+}
+
+__attribute__((noinline, noclone)) void
+do_nothing (int *input)
+{
+  *input = restore (*input, 1);
+  return;
+}
+
+#define CASE_ENTRY(n) \
+  case n: \
+sum = sum / (n + 1); \
+sum = restore (sum, n + 1); \
+if (sum == (n + addend)) \
+  break;\
+sum = sum / (n + 2); \
+sum = restore (sum, n + 2); \
+sum = sum / (n + 3); \
+sum = restore (sum, n + 3); \
+sum = sum / (n + 4); \
+sum = restore (sum, n + 4); \
+sum = sum / (n + 5); \
+sum = restore (sum, n + 5); \
+sum = sum / (n + 6); \
+sum = restore (sum, n + 6); \
+sum = sum / (n + 7); \
+sum = restore (sum, n + 7); \
+sum = sum / (n + 8); \
+sum = restore (sum, n + 8); \
+sum = sum / (n + 9); \
+sum = restore (sum, n + 9); \
+sum = sum / (n + 10); \
+sum = restore (sum, n + 10); \
+sum = sum / (n + 11); \
+sum = restore (sum, n + 11); \
+sum = sum / (n + 12); \
+sum = restore (sum, n + 12); \
+sum = sum / (n + 13); \
+sum = restore (sum, n + 13); \
+sum = sum / (n + 14); \
+sum = restore (sum, n + 14); \
+sum = sum / (n + 15); \
+sum = restore (sum, n + 15); \
+sum = sum / (n + 16); \
+sum = restore (sum, n + 16); \
+sum = sum / (n + 17); \
+sum = restore (sum, n + 17); \
+sum = sum / (n + 18); \
+sum = restore (sum, n + 18); \
+sum = sum / (n + 19); \
+sum = restore (sum, n + 19);
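
For reference, the long-branch fallback in the aarch64.md hunks of this
patch replaces the bit number used by tbz with a one-bit mask suitable
for tst.  A minimal C sketch of that arithmetic follows; the function
names are illustrative and not part of the patch, and 8-bit bytes are
assumed.

```c
#include <stdint.h>

/* Mask selecting the single bit at position BITPOS (0-based), as a
   "tst" instruction would need it in place of tbz's bit number.  */
uint64_t
bit_test_mask (unsigned bitpos)
{
  return (uint64_t) 1 << bitpos;
}

/* Mask selecting the sign bit of a value that is SIZE_IN_BYTES wide,
   assuming 8-bit bytes.  */
uint64_t
sign_bit_mask (unsigned size_in_bytes)
{
  return (uint64_t) 1 << (size_in_bytes * 8 - 1);
}
```

For bit 3 the mask is 8; for a 4-byte value the sign-bit mask is
0x80000000.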

[PATCH] Fix ada bootstrap under cygwin-64

2015-01-27 Thread Bernd Edlinger

Hi,


This patch fixes the Ada bootstrap under 64-bit Cygwin.

Bootstrapped on x86_64-pc-cygwin.
OK for trunk?


Thanks
Bernd.
  2015-01-27  Bernd Edlinger  bernd.edlin...@hotmail.de

Fix build under cygwin/64.
* adaint.h: Add check for __CYGWIN__.
* mingw32.h: Prevent windows.h from including x86intrin.h in GCC.



patch-ada-cygwin64.diff
Description: Binary data

Re: RFA: patch to fix a bad code generation for PR64110 -- new constraints addition

2015-01-27 Thread Vladimir Makarov

On 01/27/2015 12:11 PM, Richard Sandiford wrote:
 Vladimir Makarov vmaka...@redhat.com writes:
 On 01/27/2015 09:08 AM, Richard Sandiford wrote:
 Yeah, but in practice that's only ever going to be a partial transition.
 Many port maintainers won't look at this, so we'll have to support both
 versions indefinitely, even if the new behaviour turns out to be the
 best for all cases.

 I just think we're going to regret having two sets of constraints with
 such subtly different meanings.

 Looking back at the original PR, Jakub said:

   The ! has been added by me for PR63594, so it isn't there from the era
   when i?86 backend was using reload.  If there is a better way to
   express that RA should prefer to use memory or xmm register and only
   use r constraint if it already is in a r register and doesn't need to
   be reloaded, I can use that.  Whether it is ?, ??? or something else.
   ! description in gcc docs just fitted most what I wanted...

 In some ways this seems to match the intention of *.  Originally I think
 it was just an RA-only thing and was ignored by reload, but LRA does take it
 into account too (which sounds like progress to me).
   I guess we don't need '*' in many cases.  It is overused.  IMHO, IRA
 should decide which class is better based on the costs of alternatives,
 and explicitly excluding a register class by using '*' is bad practice.

   That said, I believe we should make the register class preferencing
 algorithm more alternative-oriented.  The algorithm should first choose
 an alternative (or maybe a subset of alternatives) and then the register
 classes.  I think that is more logical.  It would permit us to get rid of
 all such constraints, including '*', and use only one, like '?', which
 increases the alternative cost.

   In the long run it would be even better to get rid of '?' too and have a
 hook (or attribute) to get insn alternative costs, which could depend on
 the sub-target or other run-time characteristics.  Otherwise we need to
 duplicate insn descriptions and give them different insn guards.  I am
 going to work on this.  But it is hard to say whether it will work well
 (I may have some performance issues with this).  This hook could somehow
 (as the min or average of the values for all alternatives) be used in the
 combiner and other algorithms that need an insn cost.  That is how I see
 the solution to the problem in the long term.
 Definitely agree that it'd be better to remove these constraints
 in favour of a new attribute.  preferred_for_size and preferred_for_speed
 give something similar, though they're much more stringent than what
 we need here.

 If I revert the patch locally and change the *vec_dupmode pattern to
 use *, it passes both the test for PR64110 and the tests for PR63594.
 Would that be OK as an alternative?

   I don't think it will work in the general case.  It probably works because
 a different class is chosen in IRA.  If IRA for some reason chooses the
 same class, we might see the same problem in LRA.
 But isn't that the point of '*'?  It should stop IRA from using the 'r'
 alternative as an indication that 'r' is a good choice for this instruction.
 If IRA chooses 'r' anyway, it must be because other instructions that
 use the same allocno strongly prefer 'r'.

 And in those same circumstances -- i.e. if IRA does choose 'r' despite
 the constraints in this instruction -- then I think we do want to use the
 'r' alternative.  And AIUI that's also what the new constraint is designed
 to do.  If IRA chooses 'r' anyway, the new constraint causes LRA to prefer
 the 'r' alternative _even if_ another operand (the destination) has to
 be reloaded, which is the fundamental difference between the new constraint
 and '!'.

 So I'm still not sure why '*' wouldn't do what we want.
Frequent use of '*' (and sometimes '!' for reload) means that this
alternative probably needs splitting into 2 or 3 insns.  Instead of using
'*', we would need to set up the costs of all these insns.  I believe just
ignoring the class with '*' is wrong.  There are some cases where we
need '*' to definitely avoid a register class, e.g. mmx when we use other
classes for fp values.  But I guess this solution is not reliable, and
without the constraints we could set the alternative cost very high to
get a reliable, correct solution.

  I also don't like when register classes are excluded by '*' for IRA
 (see my thoughts above).
 Understood, and I agree it would be good to move to attributes.
 But in a way, I think that's an even better reason to try to avoid
 adding these new constraints.  It sounds like we're hoping to get rid
 of them as soon as we've added them :-)


Sometimes, to get rid of something, you have to add more :)

But seriously, what I wrote cannot be implemented for GCC 5.0 (and
the performance of the generated code is still unknown for the proposed
approach).  I believe the current solution is more reliable than using
'*'.  Getting rid of the new constraints will be a much, much smaller
problem than getting rid of the other constraints.
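
The idea of deriving a single insn cost from per-alternative costs
("min or average of the values for all alternatives") can be sketched
as plain arithmetic.  This is a hypothetical illustration only: no such
hook exists in the sources discussed here, and the function names and
the cost representation are assumptions.

```c
#include <stddef.h>

/* Hypothetical: smallest cost among N alternative costs (N >= 1).  */
int
min_alternative_cost (const int *costs, size_t n)
{
  int best = costs[0];
  for (size_t i = 1; i < n; i++)
    if (costs[i] < best)
      best = costs[i];
  return best;
}

/* Hypothetical: average cost over N alternatives (N >= 1), rounded
   toward zero.  */
int
average_alternative_cost (const int *costs, size_t n)
{
  long sum = 0;
  for (size_t i = 0; i < n; i++)
    sum += costs[i];
  return (int) (sum / (long) n);
}
```

For alternative costs 4, 2 and 9 the minimum is 2 and the average is 5.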

Re: [PATCH][RFA][PR target/15184] Partial fix for direct byte access on x86


On 01/27/15 14:21, Segher Boessenkool wrote:

On Tue, Jan 27, 2015 at 01:53:34PM -0700, Jeff Law wrote:

I do have a specific PR in mind, but I cannot currently find it.  It was
about x86, dec mem and then using the flags...  Must have sent 100 emails
in that thread...  And cannot find it now!

Are you referring to 61225?


That is the one, thanks.

It's not going to help 61225.  The key insns in 61225 are:

(insn 6 3 7 2 (set (reg:SI 91 [ *x_3(D) ])
(mem:SI (reg/v/f:SI 90 [ x ]) [1 *x_3(D)+0 S4 A32])) k.c:11 90 
{*movsi_internal}

 (nil))
(insn 7 6 8 2 (parallel [
(set (reg:SI 88 [ D.1494 ])
(plus:SI (reg:SI 91 [ *x_3(D) ])
(const_int -1 [0x])))
(clobber (reg:CC 17 flags))
]) k.c:11 220 {*addsi_1}
 (expr_list:REG_DEAD (reg:SI 91 [ *x_3(D) ])
(expr_list:REG_UNUSED (reg:CC 17 flags)
(expr_list:REG_EQUAL (plus:SI (mem:SI (reg/v/f:SI 90 [ x ]) 
[1 *x_3(D)+0 S4 A32])

(const_int -1 [0x]))
(nil)
(insn 8 7 9 2 (set (mem:SI (reg/v/f:SI 90 [ x ]) [1 *x_3(D)+0 S4 A32])
(reg:SI 88 [ D.1494 ])) k.c:11 90 {*movsi_internal}
 (nil))
(insn 9 8 10 2 (set (reg:CCZ 17 flags)
(compare:CCZ (reg:SI 88 [ D.1494 ])
(const_int 0 [0]))) k.c:11 3 {*cmpsi_ccno_1}
 (expr_list:REG_DEAD (reg:SI 88 [ D.1494 ])
(nil)))


Note how REG:SI 88 has two uses.   Never do we pass a set of insns into 
try_combine that are useful to optimize this particular case (we never 
include insn #9 in any of the attempted combinations).


Even a bridge pattern isn't going to help here.


jeff
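
The RTL above has the shape of a load, an add of -1, a store back to the
same address, and a compare of the stored value against zero.  C source
of roughly the following form would be expected to expand to such a
sequence; this is an illustrative guess at the shape, not the actual
testcase from PR 61225, and the function name is made up.

```c
/* Decrement *X in memory and report whether the result is zero.
   The decremented value has two uses: the store back to *X and the
   comparison against zero.  */
int
dec_and_test (int *x)
{
  return --*x == 0;
}
```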

Re: [PATCH] Add comdat_group effective target (PR bootstrap/64612)

2015-01-27 Thread Mike Stump

On Jan 27, 2015, at 7:10 AM, Jakub Jelinek ja...@redhat.com wrote:
 
 This patch introduces a new effective target check and adds it to
 pr64612.C - if comdat groups aren't used, there is no guarantee that the
 D2 dtor will always be emitted alongside the D1 dtor.
 
 Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.

Re: [RFC PATCH] Avoid most of the BUILT_IN_*_CHKP enum values

On Tue, Jan 27, 2015 at 06:04:53PM +0300, Ilya Enkovich wrote:
 2015-01-27 17:27 GMT+03:00 Jakub Jelinek ja...@redhat.com:
  I've grepped for BUILT_IN_.*_CHKP in the sources and we actually need
  far fewer enum values than the 1204 that are being defined.
 
  This patch requires builtins.def to say explicitly (by using
  DEF_*BUILTIN_CHKP macro instead of corresponding DEF_*BUILTIN) which
  ones need that, for all the others only space in the enum is reserved and
  nothing else.
 
  I'd hope this could work around the buggy AIX stabs handling, but even
  on say x86_64-linux it has a benefit of decreasing cc1plus .debug_info
  by about 2.7MB (of course, with dwz that benefit goes to almost nothing,
  just the ~ 7000 bytes or so, plus .debug_str cost, which is merged even
  without dwz between TUs).  The cost without dwz is obviously mainly
  from repeating that in most of the translation units.  But why declare
  BUILT_IN_*_CHKP enums that are never used by anything...
 
 Enum values not mentioned in the code are not fully useless.  When we
 have builtin functions defined as 'always_inline' functions, they are
 instrumented and enum names may be used in dumps and debugging.
 That's not a big value though.  Thanks a lot for taking care of it!

Note, patch successfully bootstrapped/regtested on x86_64-linux and
i686-linux, and David said that on AIX it passed stage1 cc1 linking.

Ok for trunk?

As for the enums, I doubt it is worth the trouble.
What perhaps could be done (apparently a preexisting issue, because you
include builtins.def just once in built_in_names) would be to tweak
fprintf (file, " built-in %s:%s",
 built_in_class_names[(int) DECL_BUILT_IN_CLASS (node)],
 built_in_names[(int) DECL_FUNCTION_CODE (node)]);
so that if DECL_FUNCTION_CODE is in between
BEGIN_CHKP_BUILTINS and END_CHKP_BUILTINS you don't print (null) on glibc
there or crash (on various other hosts), but actually print
built_in_names[(int) DECL_FUNCTION_CODE (node) - (int) BEGIN_CHKP_BUILTINS - 1]
concatenated with "_CHKP".
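
The index arithmetic suggested above can be sketched with a toy enum.
Everything named toy_* or TOY_* here is hypothetical and only mirrors
the layout BUILT_IN_*_CHKP = BUILT_IN_* + BEGIN_CHKP_BUILTINS + 1; it is
not the real print-tree code.

```c
#include <stdio.h>

/* Toy stand-in for enum built_in_function: two ordinary builtins,
   then the reserved range for their _CHKP counterparts.  */
enum toy_builtin
{
  TOY_MEMCPY,
  TOY_STRLEN,
  TOY_BEGIN_CHKP,
  TOY_MEMCPY_CHKP,   /* == TOY_MEMCPY + TOY_BEGIN_CHKP + 1 */
  TOY_STRLEN_CHKP,   /* == TOY_STRLEN + TOY_BEGIN_CHKP + 1 */
  TOY_END_CHKP
};

/* Names exist only for the ordinary builtins, mirroring a name table
   that includes the .def file just once.  */
static const char *const toy_names[] = { "TOY_MEMCPY", "TOY_STRLEN" };

/* Write the printable name of CODE into BUF (of size LEN).  Codes in
   the CHKP range reuse the base name and append "_CHKP".  */
void
toy_builtin_name (enum toy_builtin code, char *buf, size_t len)
{
  if (code > TOY_BEGIN_CHKP && code < TOY_END_CHKP)
    snprintf (buf, len, "%s_CHKP",
	      toy_names[(int) code - (int) TOY_BEGIN_CHKP - 1]);
  else if (code < TOY_BEGIN_CHKP)
    snprintf (buf, len, "%s", toy_names[(int) code]);
  else
    snprintf (buf, len, "(unknown)");
}
```

The point of the design is that the name table never needs entries for
the reserved range, so no lookup there can hit a null pointer.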

  2015-01-27  Jakub Jelinek  ja...@redhat.com
 
  * builtins.def (DEF_BUILTIN_CHKP): Define if not defined.
  (DEF_LIB_BUILTIN_CHKP, DEF_EXT_LIB_BUILTIN_CHKP): Redefine.
  (DEF_CHKP_BUILTIN): Define using DEF_BUILTIN_CHKP instead
  of DEF_BUILTIN.
  (BUILT_IN_MEMCPY, BUILT_IN_MEMMOVE, BUILT_IN_MEMSET, 
  BUILT_IN_STRCAT,
  BUILT_IN_STRCHR, BUILT_IN_STRCPY, BUILT_IN_STRLEN): Use
  DEF_LIB_BUILTIN_CHKP macro instead of DEF_LIB_BUILTIN.
  (BUILT_IN_MEMCPY_CHK, BUILT_IN_MEMMOVE_CHK, BUILT_IN_MEMPCPY_CHK,
  BUILT_IN_MEMPCPY, BUILT_IN_MEMSET_CHK, BUILT_IN_STPCPY_CHK,
  BUILT_IN_STPCPY, BUILT_IN_STRCAT_CHK, BUILT_IN_STRCPY_CHK): Use
  DEF_EXT_LIB_BUILTIN_CHKP macro instead of DEF_EXT_LIB_BUILTIN.
  * tree-core.h (enum built_in_function): In between
  BEGIN_CHKP_BUILTINS and END_CHKP_BUILTINS only define enum values
  for builtins that use DEF_BUILTIN_CHKP macro.
 
  --- gcc/builtins.def.jj 2015-01-15 23:39:10.0 +0100
  +++ gcc/builtins.def2015-01-27 15:04:44.860924664 +0100
  @@ -63,6 +63,16 @@ along with GCC; see the file COPYING3.
 
  The builtins is registered only if COND is true.  */
 
  +/* A macro for builtins where the
  +   BUILT_IN_*_CHKP = BUILT_IN_* + BEGIN_CHKP_BUILTINS + 1
  +   enums should be defined too.  */
  +#ifndef DEF_BUILTIN_CHKP
  +#define DEF_BUILTIN_CHKP(ENUM, NAME, CLASS, TYPE, LIBTYPE, BOTH_P, \
  +FALLBACK_P, NONANSI_P, ATTRS, IMPLICIT, COND)  \
  +  DEF_BUILTIN(ENUM, NAME, CLASS, TYPE, LIBTYPE, BOTH_P, FALLBACK_P,\
  + NONANSI_P, ATTRS, IMPLICIT, COND)
  +#endif
  +
   /* A GCC builtin (like __builtin_saveregs) is provided by the
  compiler, but does not correspond to a function in the standard
  library.  */
  @@ -87,6 +97,10 @@ along with GCC; see the file COPYING3.
   #define DEF_LIB_BUILTIN(ENUM, NAME, TYPE, ATTRS)   \
 DEF_BUILTIN (ENUM, __builtin_ NAME, BUILT_IN_NORMAL, TYPE, TYPE,   \
 true, true, false, ATTRS, true, true)
  +#undef DEF_LIB_BUILTIN_CHKP
  +#define DEF_LIB_BUILTIN_CHKP(ENUM, NAME, TYPE, ATTRS)  \
  +  DEF_BUILTIN_CHKP (ENUM, __builtin_ NAME, BUILT_IN_NORMAL, TYPE,\
  +   TYPE, true, true, false, ATTRS, true, true)
 
   /* Like DEF_LIB_BUILTIN, except that the function is not one that is
  specified by ANSI/ISO C.  So, when we're being fully conformant we
  @@ -96,6 +110,10 @@ along with GCC; see the file COPYING3.
   #define DEF_EXT_LIB_BUILTIN(ENUM, NAME, TYPE, ATTRS)   \
 DEF_BUILTIN (ENUM, __builtin_ NAME, BUILT_IN_NORMAL, TYPE, TYPE,   \
 true, true, true, ATTRS, false, true)
  +#undef DEF_EXT_LIB_BUILTIN_CHKP
  +#define DEF_EXT_LIB_BUILTIN_CHKP(ENUM, NAME, TYPE, ATTRS)  \
  +  DEF_BUILTIN_CHKP (ENUM, __builtin_ NAME, BUILT_IN_NORMAL, TYPE,\
  +

Re: [PATCH, PR tree-optimization/64277] Improve loop iterations count estimation

On 27 Jan 12:40, Ilya Enkovich wrote:
 Hi,
 
 This patch was supposed to fix PR tree-optimization/64277.  The tracker
 issue is now fixed by disabling the warnings, but I think the patch is
 still useful to avoid dead code generated by complete unrolling.
 
 Bootstrapped and tested on x86_64-unknown-linux-gnu.
 
 Thanks,
 Ilya
 --
 gcc/
 
 2015-01-27  Ilya Enkovich  ilya.enkov...@intel.com
 
   * tree-ssa-loop-niter.c (record_nonwrapping_iv): Use base
   range info when possible to refine estimation.
 
 gcc/testsuite/
 
 2015-01-27  Ilya Enkovich  ilya.enkov...@intel.com
 
   * gcc.dg/pr64277.c: New.
 
 

Here is a new version, fixed according to the comments in the tracker.  I
also fixed the test to scan the cunroll dumps.  Does it look OK?

Which branches would this patch be suitable for?

Thanks,
Ilya
--
diff --git a/gcc/testsuite/gcc.dg/pr64277.c b/gcc/testsuite/gcc.dg/pr64277.c
new file mode 100644
index 000..c6ef331
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr64277.c
@@ -0,0 +1,23 @@
+/* PR tree-optimization/64277 */
+/* { dg-do compile } */
+/* { dg-options "-O3 -Wall -Werror -fdump-tree-cunroll-details" } */
+/* { dg-final { scan-tree-dump "loop with 5 iterations completely unrolled" "cunroll" } } */
+/* { dg-final { scan-tree-dump "loop with 6 iterations completely unrolled" "cunroll" } } */
+/* { dg-final { cleanup-tree-dump "cunroll" } } */
+
+int f1[10];
+void test1 (short a[], short m, unsigned short l)
+{
+  int i = l;
+  for (i = i + 5; i < m; i++)
+    f1[i] = a[i]++;
+}
+
+void test2 (short a[], short m, short l)
+{
+  int i;
+  if (m > 5)
+    m = 5;
+  for (i = m; i > l; i--)
+    f1[i] = a[i]++;
+}
diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
index 919f5c0..1cd297d 100644
--- a/gcc/tree-ssa-loop-niter.c
+++ b/gcc/tree-ssa-loop-niter.c
@@ -2754,6 +2754,7 @@ record_nonwrapping_iv (struct loop *loop, tree base, tree step, gimple stmt,
 {
   tree niter_bound, extreme, delta;
   tree type = TREE_TYPE (base), unsigned_type;
+  tree orig_base = base;
 
   if (TREE_CODE (step) != INTEGER_CST || integer_zerop (step))
 return;
@@ -2777,16 +2778,30 @@ record_nonwrapping_iv (struct loop *loop, tree base, tree step, gimple stmt,
 
   if (tree_int_cst_sign_bit (step))
 {
+  wide_int min, max;
   extreme = fold_convert (unsigned_type, low);
-  if (TREE_CODE (base) != INTEGER_CST)
+  if (TREE_CODE (orig_base) == SSA_NAME
+  && TREE_CODE (high) == INTEGER_CST
+  && INTEGRAL_TYPE_P (TREE_TYPE (orig_base))
+  && get_range_info (orig_base, &min, &max) == VR_RANGE
+  && wi::gts_p (wide_int (high), max))
+   base = wide_int_to_tree (unsigned_type, max);
+  else if (TREE_CODE (base) != INTEGER_CST)
base = fold_convert (unsigned_type, high);
   delta = fold_build2 (MINUS_EXPR, unsigned_type, base, extreme);
   step = fold_build1 (NEGATE_EXPR, unsigned_type, step);
 }
   else
 {
+  wide_int min, max;
   extreme = fold_convert (unsigned_type, high);
-  if (TREE_CODE (base) != INTEGER_CST)
+  if (TREE_CODE (orig_base) == SSA_NAME
+  && TREE_CODE (low) == INTEGER_CST
+  && INTEGRAL_TYPE_P (TREE_TYPE (orig_base))
+  && get_range_info (orig_base, &min, &max) == VR_RANGE
+  && wi::gts_p (min, wide_int (low)))
+   base = wide_int_to_tree (unsigned_type, min);
+  else if (TREE_CODE (base) != INTEGER_CST)
base = fold_convert (unsigned_type, low);
   delta = fold_build2 (MINUS_EXPR, unsigned_type, extreme, base);
 }
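
The effect of using range information for the base can be modelled with
simple arithmetic.  The sketch below is a simplified model and not the
GCC code: it assumes the array access f1[i] is what bounds the loop, and
the function name and parameters are made up for illustration.

```c
/* Simplified model, not GCC code: an induction variable starts at
   START (the tightest known bound on its initial value), moves by
   STEP_ABS (> 0) per iteration towards EXTREME, and must never pass
   EXTREME.  Returns the maximum number of times the statement using
   it can execute.  */
unsigned long
max_executions (long start, long extreme, unsigned long step_abs)
{
  unsigned long delta
    = start <= extreme ? (unsigned long) (extreme - start)
		       : (unsigned long) (start - extreme);
  return delta / step_abs + 1;
}
```

In test1 the index starts at l + 5 with l unsigned, so it is at least 5,
and f1 has 10 elements, so the index is at most 9: the model gives 5
executions.  In test2 the index starts at no more than 5 and counts down
to an index of at least 0: the model gives 6.  These match the two
iteration counts the testcase scans for.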

Re: [PATCH] Fix PR64277

On Tue, 27 Jan 2015, Ilya Enkovich wrote:

 2015-01-27 12:47 GMT+03:00 Richard Biener rguent...@suse.de:
  On Tue, 27 Jan 2015, Jakub Jelinek wrote:
 
  On Tue, Jan 27, 2015 at 10:25:48AM +0100, Richard Biener wrote:
  
   This disables array-bound warnings from VRP2 as discussed.
  
   Bootstrapped and tested on x86_64-unknown-linux-gnu - ok for trunk?
 
  So nothing in the testsuite needed to change?  Nice.
 
  Yes.
 
  Ok for trunk.
 
   I'll search for duplicates and add a few testcases.
 
  Thanks.
 
  Committed as follows (first testcase in PR59124 not fixed - it warns
  from the first pass).
 
 Are you going to port it to 4.9 branch?

I plan to do that (4.8 as well) after some time.

Richard.

 Thanks,
 Ilya
 
 
  2015-01-27  Richard Biener  rguent...@suse.de
 
  PR tree-optimization/56273
  PR tree-optimization/59124
  PR tree-optimization/64277
  * tree-vrp.c (vrp_finalize): Emit array-bound warnings only
  from the first VRP pass.
 
  * g++.dg/warn/Warray-bounds-6.C: New testcase.
  * gcc.dg/Warray-bounds-12.c: Likewise.
  * gcc.dg/Warray-bounds-13.c: Likewise.
 
  Index: gcc/tree-vrp.c
  ===
  *** gcc/tree-vrp.c.orig 2015-01-27 10:34:26.453743828 +0100
  --- gcc/tree-vrp.c  2015-01-27 10:43:04.970610102 +0100
  *** vrp_finalize (void)
  *** 10229,10235 
  substitute_and_fold (op_with_constant_singleton_value_range,
 vrp_fold_stmt, false);
 
  !   if (warn_array_bounds)
check_all_array_refs ();
 
  /* We must identify jump threading opportunities before we release
  --- 10229,10235 
  substitute_and_fold (op_with_constant_singleton_value_range,
 vrp_fold_stmt, false);
 
  !   if (warn_array_bounds && first_pass_instance)
check_all_array_refs ();
 
  /* We must identify jump threading opportunities before we release
  Index: gcc/testsuite/g++.dg/warn/Warray-bounds-6.C
  ===
  *** /dev/null   1970-01-01 00:00:00.0 +
  --- gcc/testsuite/g++.dg/warn/Warray-bounds-6.C 2015-01-27 
  10:40:31.311871855 +0100
  ***
  *** 0 
  --- 1,26 
  + // { dg-do compile }
  + // { dg-options -O3 -Warray-bounds }
  +
  + struct type {
  + bool a, b;
  + bool get_b() { return b; }
  + };
  +
  + type stuff[9u];
  +
  + void bar();
  +
  + void foo()
  + {
  +   for(unsigned i = 0u; i  9u; i++)
  + {
  +   if(!stuff[i].a)
  +   continue;
  +
  +   bar();
  +
  +   for(unsigned j = i + 1u; j  9u; j++)
  +   if(stuff[j].a  stuff[j].get_b()) // { dg-bogus above array 
  bounds }
  + return;
  + }
  + }
  Index: gcc/testsuite/gcc.dg/Warray-bounds-12.c
  ===
  *** /dev/null   1970-01-01 00:00:00.0 +
  --- gcc/testsuite/gcc.dg/Warray-bounds-12.c 2015-01-27 
  10:40:58.196175989 +0100
  ***
  *** 0 
  --- 1,26 
  + /* { dg-do compile } */
  + /* { dg-options -O3 -Warray-bounds } */
  + /* { dg-additional-options -mssse3 { target x86_64-*-* i?86-*-* } } */
  +
  + void foo(short a[], short m)
  + {
  +   int i, j;
  +   int f1[10];
  +   short nc;
  +
  +   nc = m + 1;
  +   if (nc  3)
  + {
  +   for (i = 0; i = nc; i++)
  +   {
  + f1[i] = f1[i] + 1;
  +   }
  + }
  +
  +   for (i = 0, j = m; i  nc; i++, j--)
  + {
  +   a[i] = f1[i]; /* { dg-bogus above array bounds } */
  +   a[j] = i;
  + }
  +   return;
  + }
  Index: gcc/testsuite/gcc.dg/Warray-bounds-13.c
  ===
  *** /dev/null   1970-01-01 00:00:00.0 +
  --- gcc/testsuite/gcc.dg/Warray-bounds-13.c 2015-01-27 
  10:42:43.738369929 +0100
  ***
  *** 0 
  --- 1,18 
  + /* { dg-do compile } */
  + /* { dg-options -O3 -Warray-bounds } */
  +
  + extern char *bar[17];
  +
  + int foo(int argc, char **argv)
  + {
  +   int i;
  +   int n = 0;
  +
  +   for (i = 0; i  argc; i++)
  + n++;
  +
  +   for (i = 0; i  argc; i++)
  + argv[i] = bar[i + n]; /* { dg-bogus above array bounds } */
  +
  +   return 0;
  + }
 
 


[PATCH] pr 64047 - explicitly handle target_option_default_node in rs6000_set_current_function

2015-01-27 Thread tbsaunde+gcc

From: Trevor Saunders tbsaunde+...@tbsaunde.org

Hi,

The compiler crashes on pr52429.c because this_target_ira_int gets initialized
with null x_init_costs and x_op_costs.  While I don't really understand this
option handling mess, r217659 made the analogous change to i386 when it broke
this, so it seems likely this is the right way to fix the regression.

Bootstrapped and regtested on ppc64-linux-gnu without regressions, and
pr52429.c is fixed.  OK?


Trev

gcc/

* config/rs6000/rs6000.c (rs6000_set_current_function): Handle
explicit default options.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 85eb0fd..207fc55 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -32609,7 +32609,7 @@ rs6000_set_current_function (tree fndecl)
   if (old_tree == new_tree)
;
 
-  else if (new_tree)
+  else if (new_tree && new_tree != target_option_default_node)
{
  cl_target_option_restore (global_options,
TREE_TARGET_OPTION (new_tree));
@@ -32620,7 +32620,7 @@ rs6000_set_current_function (tree fndecl)
  = save_target_globals_default_opts ();
}
 
-  else if (old_tree)
+  else if (old_tree && old_tree != target_option_default_node)
{
  new_tree = target_option_current_node;
  cl_target_option_restore (global_options,
-- 
2.1.4

Re: Merge current set of OpenACC changes from gomp-4_0-branch

2015-01-27 Thread Julian Brown

On Mon, 26 Jan 2015 14:44:19 +0100
Thomas Schwinge tho...@codesourcery.com wrote:

On 17 Jan 02:16, Ilya Verbin wrote:
Unfortunately, it broke offloading from shared libraries (I mean
common libs with NEEDED entries, not dlopened).

Sorry for that!

Such things are not covered by the
testsuite, that's why you missed this issue. Here is a simple
testcase:

http://news.gmane.org/find-root.php?message_id=%3C20150116231632.GB48380%40msticlxl57.ims.intel.com%3E

Probably a good motivation for adding such a test case. ;-)

So, you don't assume that a device can have multiple images from
multiple libs?

Ping?

This probably is just a bug that we introduced with our changes?
(Julian?)

AFAICR, we haven't yet figured out how to make (shared) libraries work
with PTX. Actually I'm not entirely sure if static libraries containing
PTX code will work either. But, multiple images (e.g. from different
object files) are supported, via the loop in gomp_target_init.

(The semantics of gomp_register_image_for_device were changed, but not
-- intentionally! -- to limit the number of offloaded images to one.)

Also, could you please explain, why did you divide a device
initialization into two functions -- gomp_init_device and
gomp_init_tables?

As I understand it (again, Julian, please correct me if I got that
wrong), the reason is that for OpenACC support, we need these as two
separate (independent) actions. Is this causing problems for OpenMP
offloading?

This was certainly necessary at some point, when the support for
multiple devices of the same type in the OpenACC runtime was delegated
entirely to target-dependent code. Later (after one round of
refactoring), the gomp_device_descr and the memory map were still
separate, with the former possibly representing a number of devices,
and the latter having independent copies for each instance of a device.

That's largely been refactored (again) away now though -- a
gomp_device_descr and its memory map are stored together, per-device
instance. So this separation of their initialisation can probably go
away, although some (somewhat delicate) code in oacc-init.c would need
to be tweaked.

Julian

RE: [PATCH RFA MIPS] Prohibit vector modes in accumulators

2015-01-27 Thread Matthew Fortune

Richard Sandiford rdsandif...@googlemail.com writes:
 Matthew Fortune matthew.fort...@imgtec.com writes:
  2015-01-23  Robert Suchanek  robert.sucha...@imgtec.com
 
 * config/mips/mips.c (mips_hard_regno_mode_ok_p): Prohibit
 accumulators for all vector modes.
 
  This seems like a genuine bug and although it can only be triggered by
  loongson or paired-single support it probably qualifies for fixing.
 
 Agreed FWIW.  We shouldn't mark something as valid for a mode if even
 the mode's move pattern can't handle it.
 
 I think this kind of thing should go in regardless of development stage.

Given that it was one of the pre-existing tests that failed I'm happy that
we are covering this issue. All of these LRA related issues are likely
to phase in and out with subtle changes to code-gen so I don't think we
can always get a test case that fails on trunk.

Since Catherine asked for further info, I will leave it to her to say
whether she is happy to accept on this basis.

Matthew

Re: [PATCH] Fix PR64277

2015-01-27 13:59 GMT+03:00 Richard Biener rguent...@suse.de:
 On Tue, 27 Jan 2015, Ilya Enkovich wrote:

 2015-01-27 12:47 GMT+03:00 Richard Biener rguent...@suse.de:
  On Tue, 27 Jan 2015, Jakub Jelinek wrote:
 
  On Tue, Jan 27, 2015 at 10:25:48AM +0100, Richard Biener wrote:
  
   This disables array-bound warnings from VRP2 as discussed.
  
   Bootstrapped and tested on x86_64-unknown-linux-gnu - ok for trunk?
 
  So nothing in the testsuite needed to change?  Nice.
 
  Yes.
 
  Ok for trunk.
 
   I'll search for duplicates and add a few testcases.
 
  Thanks.
 
  Committed as follows (first testcase in PR59124 not fixed - it warns
  from the first pass).

 Are you going to port it to 4.9 branch?

 I plan to do that (4.8 as well) after some time.

Great, thanks!

Ilya


 Richard.

 Thanks,
 Ilya

 
  2015-01-27  Richard Biener  rguent...@suse.de
 
  PR tree-optimization/56273
  PR tree-optimization/59124
  PR tree-optimization/64277
  * tree-vrp.c (vrp_finalize): Emit array-bound warnings only
  from the first VRP pass.
 
  * g++.dg/warn/Warray-bounds-6.C: New testcase.
  * gcc.dg/Warray-bounds-12.c: Likewise.
  * gcc.dg/Warray-bounds-13.c: Likewise.
 
  Index: gcc/tree-vrp.c
  ===
  *** gcc/tree-vrp.c.orig 2015-01-27 10:34:26.453743828 +0100
  --- gcc/tree-vrp.c  2015-01-27 10:43:04.970610102 +0100
  *** vrp_finalize (void)
  *** 10229,10235 
  substitute_and_fold (op_with_constant_singleton_value_range,
 vrp_fold_stmt, false);
 
  !   if (warn_array_bounds)
check_all_array_refs ();
 
  /* We must identify jump threading opportunities before we release
  --- 10229,10235 
  substitute_and_fold (op_with_constant_singleton_value_range,
 vrp_fold_stmt, false);
 
  !   if (warn_array_bounds  first_pass_instance)
check_all_array_refs ();
 
  /* We must identify jump threading opportunities before we release
  Index: gcc/testsuite/g++.dg/warn/Warray-bounds-6.C
  ===
  *** /dev/null   1970-01-01 00:00:00.0 +
  --- gcc/testsuite/g++.dg/warn/Warray-bounds-6.C 2015-01-27 
  10:40:31.311871855 +0100
  ***
  *** 0 
  --- 1,26 
  + // { dg-do compile }
  + // { dg-options -O3 -Warray-bounds }
  +
  + struct type {
  + bool a, b;
  + bool get_b() { return b; }
  + };
  +
  + type stuff[9u];
  +
  + void bar();
  +
  + void foo()
  + {
  +   for(unsigned i = 0u; i  9u; i++)
  + {
  +   if(!stuff[i].a)
  +   continue;
  +
  +   bar();
  +
  +   for(unsigned j = i + 1u; j  9u; j++)
  +   if(stuff[j].a  stuff[j].get_b()) // { dg-bogus above array 
  bounds }
  + return;
  + }
  + }
  Index: gcc/testsuite/gcc.dg/Warray-bounds-12.c
  ===
  *** /dev/null   1970-01-01 00:00:00.0 +
  --- gcc/testsuite/gcc.dg/Warray-bounds-12.c 2015-01-27 
  10:40:58.196175989 +0100
  ***
  *** 0 
  --- 1,26 
  + /* { dg-do compile } */
  + /* { dg-options "-O3 -Warray-bounds" } */
  + /* { dg-additional-options "-mssse3" { target x86_64-*-* i?86-*-* } } */
  +
  + void foo(short a[], short m)
  + {
  +   int i, j;
  +   int f1[10];
  +   short nc;
  +
  +   nc = m + 1;
  +   if (nc > 3)
  + {
  +   for (i = 0; i <= nc; i++)
  +   {
  + f1[i] = f1[i] + 1;
  +   }
  + }
  +
  +   for (i = 0, j = m; i < nc; i++, j--)
  + {
  +   a[i] = f1[i]; /* { dg-bogus "above array bounds" } */
  +   a[j] = i;
  + }
  +   return;
  + }
  Index: gcc/testsuite/gcc.dg/Warray-bounds-13.c
  ===
  *** /dev/null   1970-01-01 00:00:00.0 +
  --- gcc/testsuite/gcc.dg/Warray-bounds-13.c 2015-01-27 
  10:42:43.738369929 +0100
  ***
  *** 0 
  --- 1,18 
  + /* { dg-do compile } */
  + /* { dg-options "-O3 -Warray-bounds" } */
  +
  + extern char *bar[17];
  +
  + int foo(int argc, char **argv)
  + {
  +   int i;
  +   int n = 0;
  +
  +   for (i = 0; i < argc; i++)
  + n++;
  +
  +   for (i = 0; i < argc; i++)
  + argv[i] = bar[i + n]; /* { dg-bogus "above array bounds" } */
  +
  +   return 0;
  + }



 --
 Richard Biener rguent...@suse.de
 SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild,
 Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)

Re: [PATCH][2/2] Improve array-bound warnings and VRP

On Mon, 26 Jan 2015, Jakub Jelinek wrote:

 On Mon, Jan 26, 2015 at 04:18:32PM +0100, Richard Biener wrote:
Ok for trunk?  Or should I delay this to GCC 6?
   
   Does this work even without the other patch?
  
  Yes, I've actually developed 2/2 first.  The other patch only ever
  emits more warnings...
 
 Then it probably should be ok.  I'm really afraid of emitting more warnings
 with such high false positive rate now.

As the patch also mitigates some of the code bloat we get with
the complete peeling (regression against 4.7) I have installed it.
It's also the easiest vehicle to verify range-info is not broken
by passes between vrp1 and vrp2.

Thanks,
Richard.

Re: [PATCH, PR tree-optimization/64277] Improve loop iterations count estimation

On Tue, Jan 27, 2015 at 11:47 AM, Ilya Enkovich enkovich@gmail.com wrote:
 On 27 Jan 12:40, Ilya Enkovich wrote:
 Hi,

 This patch was supposed to fix PR tree-optimization/64277.  The PR is now
 fixed by disabling the warnings, but I think the patch is still useful to
 avoid dead code generated by complete unrolling.

 Bootstrapped and tested on x86_64-unknown-linux-gnu.

 Thanks,
 Ilya
 --
 gcc/

 2015-01-27  Ilya Enkovich  ilya.enkov...@intel.com

   * tree-ssa-loop-niter.c (record_nonwrapping_iv): Use base
   range info when possible to refine estimation.

 gcc/testsuite/

 2015-01-27  Ilya Enkovich  ilya.enkov...@intel.com

   * gcc.dg/pr64277.c: New.



 Here is a new version fixed according to comments in the tracker.  I also 
 fixed a test to scan cunroll dumps.  Does it look OK?

Minor comments below.

 What are possible branches for this patch?

You can probably create a testcase that shows code-size regressions
against a version that didn't peel completely (GCC 4.7).  Thus I'd say
it would apply to 4.9 as well (4.8 doesn't have range information).

 Thanks,
 Ilya
 --
 diff --git a/gcc/testsuite/gcc.dg/pr64277.c b/gcc/testsuite/gcc.dg/pr64277.c
 new file mode 100644
 index 000..c6ef331
 --- /dev/null
 +++ b/gcc/testsuite/gcc.dg/pr64277.c
 @@ -0,0 +1,23 @@
 +/* PR tree-optimization/64277 */
 +/* { dg-do compile } */
 +/* { dg-options "-O3 -Wall -Werror -fdump-tree-cunroll-details" } */
 +/* { dg-final { scan-tree-dump "loop with 5 iterations completely unrolled" "cunroll" } } */
 +/* { dg-final { scan-tree-dump "loop with 6 iterations completely unrolled" "cunroll" } } */
 +/* { dg-final { cleanup-tree-dump cunroll } } */
 +
 +int f1[10];
 +void test1 (short a[], short m, unsigned short l)
 +{
 +  int i = l;
 +  for (i = i + 5; i < m; i++)
 +f1[i] = a[i]++;
 +}
 +
 +void test2 (short a[], short m, short l)
 +{
 +  int i;
 +  if (m > 5)
 +m = 5;
 +  for (i = m; i > l; i--)
 +f1[i] = a[i]++;
 +}
 diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
 index 919f5c0..1cd297d 100644
 --- a/gcc/tree-ssa-loop-niter.c
 +++ b/gcc/tree-ssa-loop-niter.c
 @@ -2754,6 +2754,7 @@ record_nonwrapping_iv (struct loop *loop, tree base, 
 tree step, gimple stmt,
  {
tree niter_bound, extreme, delta;
tree type = TREE_TYPE (base), unsigned_type;
 +  tree orig_base = base;

if (TREE_CODE (step) != INTEGER_CST || integer_zerop (step))
  return;
 @@ -2777,16 +2778,30 @@ record_nonwrapping_iv (struct loop *loop, tree base, 
 tree step, gimple stmt,

if (tree_int_cst_sign_bit (step))
  {
 +  wide_int min, max;
extreme = fold_convert (unsigned_type, low);
 -  if (TREE_CODE (base) != INTEGER_CST)
 +  if (TREE_CODE (orig_base) == SSA_NAME
 +      && TREE_CODE (high) == INTEGER_CST
 +      && INTEGRAL_TYPE_P (TREE_TYPE (orig_base))
 +      && get_range_info (orig_base, &min, &max) == VR_RANGE
 +      && wi::gts_p (wide_int (high), max))

For me a simple wi::gts_p (high, max) worked fine.

 +   base = wide_int_to_tree (unsigned_type, max);
 +  else if (TREE_CODE (base) != INTEGER_CST)
 base = fold_convert (unsigned_type, high);
delta = fold_build2 (MINUS_EXPR, unsigned_type, base, extreme);
step = fold_build1 (NEGATE_EXPR, unsigned_type, step);
  }
else
  {
 +  wide_int min, max;
extreme = fold_convert (unsigned_type, high);
 -  if (TREE_CODE (base) != INTEGER_CST)
 +  if (TREE_CODE (orig_base) == SSA_NAME
 +      && TREE_CODE (low) == INTEGER_CST
 +      && INTEGRAL_TYPE_P (TREE_TYPE (orig_base))
 +      && get_range_info (orig_base, &min, &max) == VR_RANGE
 +      && wi::gts_p (min, wide_int (low)))

Likewise.

Ok for trunk with those changes.  For the 4.9 branch you need to adjust
the patch to not use wide-ints.  I'd leave it on trunk for a while and
eventually open a bugreport for the size regression to keep track of it.

Thanks,
Richard.

 +   base = wide_int_to_tree (unsigned_type, min);
 +  else if (TREE_CODE (base) != INTEGER_CST)
 base = fold_convert (unsigned_type, low);
delta = fold_build2 (MINUS_EXPR, unsigned_type, extreme, base);
  }
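
The effect of the change can be illustrated with a small standalone C
sketch.  It is not GCC code: the helper name steps_until_extreme and the
plain 64-bit integers are assumptions made for illustration, standing in
for the tree and wide_int arithmetic in record_nonwrapping_iv.  For an
increasing IV the bound is the distance from the base to the extreme value
divided by the step, so a tighter lower bound on the base (taken from range
info) should give a much smaller estimate than the minimum of the type.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of GCC): how many times an increasing
   induction variable starting at BASE can be advanced by STEP without
   exceeding EXTREME.  The subtraction is done in an unsigned type, in
   the same spirit as the patch's use of unsigned_type, so it cannot
   overflow.  */
uint64_t
steps_until_extreme (int64_t base, int64_t extreme, uint64_t step)
{
  assert (step != 0);
  if (base > extreme)
    return 0;
  uint64_t delta = (uint64_t) extreme - (uint64_t) base;
  return delta / step;
}
```

In test1 the IV starts at l + 5 with l unsigned, so the base is at least 5,
and f1 has 10 elements, so the largest valid index is 9:
steps_until_extreme (5, 9, 1) is 4, i.e. the store can run at most 5 times.
Starting from INT32_MIN instead, the same call returns 2147483657.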

Re: [PATCH 6/8] Handle SCRATCH in decompose_address

2015-01-27 Thread Maxim Kuvyrkov

On Oct 23, 2014, at 4:18 AM, Jeff Law l...@redhat.com wrote:

 On 10/22/14 17:01, Maxim Kuvyrkov wrote:
 On Oct 23, 2014, at 9:02 AM, Jeff Law l...@redhat.com wrote:
 
 On 10/20/14 21:35, Maxim Kuvyrkov wrote:
 Hi,
 
 This patch is a simple fix to allow decompose_address to handle
 SCRATCH'es during 2nd scheduler pass. This patch is a
 prerequisite for a scheduler improvement that relies on
 decompose_address to parse insns.
 
 Bootstrapped and regtested on x86_64-linux-gnu and regtested on
 arm-linux-gnueabihf and aarch64-linux-gnu.
 I'd like to see some further discussion here.
 
 get_base_term is supposed to look at its argument as a base
 address. I'm curious under what circumstances you want to have a
 SCRATCH as a base address?
 
 I didn't see anything in patch #8 which obviously depended on
 this, but maybe it's in there, but more subtle than expected.
 
 If you can justify why it's useful to handle scratch in here, then
 the patch will be fine.
 
 Without this patch decompose_address() ICEs during second scheduler
 pass on prologue instructions that usually have (clobber (mem:BLK
 (scratch)).  The only reason for this patch is to prevent that fault
 and enable use of decompose_address during 2nd scheduler pass.
 
 Does this answer your question, or are you looking for a more
 in-depth reason?
 Yea, that's everything I needed to know.  Patch approved.

Hi,

Turns out that the above patch applies without conflicts to two functions in 
rtlanal.c: get_base_term(), for which the patch is intended, and 
get_index_term(), for which the patch is not.

Due to git rebases and patch updates, I accidentally pushed the patch 
twice and unintentionally changed get_index_term().  From what I can tell the 
change is benign, but it is still unnecessary.  The attached patch reverts 
the accidental commit.  It was bootstrapped on arm-linux-gnueabihf.

OK for stage 1?  I'll regtest it before committing, just in case.

Thanks,

--
Maxim Kuvyrkov
www.linaro.org




0001-Revert-accidental-commit-get_base_index-was-the-inte.patch
Description: Binary data

Re: [PATCH][RFA][PR target/15184] Partial fix for direct byte access on x86

I'm withdrawing the combine_simplify_rtx hunk of this patch.  While 
cleaning up my improvements for the remaining testcases I 
stumbled upon a simpler change which covers all the tests.


What's kind of funny is I'd been staring at the relevant code a goodly 
part of the weekend without seeing how easily it could be extended and 
that the result doesn't have to match a pattern, as combine can split the 
horrid mess in such a way that we get two matched insns which, when combined 
with other nearby insns, ultimately collapse into precisely what we want.




We're still going to need the changes to the heuristic to enable 4 insn 
combinations as we need to be giving nice big blobs of code to 
combine_simplify_rtx and its children.


It's actually kind of cool to see something like this flow into 
make_field_assignment:


(set (mem/c:HI (symbol_ref:SI (y) [flags 0x40] var_decl 
0x7670bcf0 y) [2 y+0 S2 A16])
(subreg:HI (ior:SI (zero_extend:SI (mem/c:QI (symbol_ref:SI (y) 
[flags 0x40] var_decl 0x7670bcf0 y) [2 y+0 S1 A16]))

(reg:SI 100 [ D.1569 ])) 0))

And make_field_assignment turns it into:

(set (mem/c:QI (const:SI (plus:SI (symbol_ref:SI (y) [flags 0x40] 
var_decl 0x7670bcf0 y)

(const_int 1 [0x1]))) [2 y+1 S1 A8])
(subreg:QI (lshiftrt:SI (reg:SI 100 [ D.1569 ])
(const_int 8 [0x8])) 0))

Combine then chooses the lshiftrt expression as a split point, emits an 
insn for the lshiftrt, and substitutes a nice simple reg into that hunk of 
RTL above in its place:


(set (mem/c:QI (const:SI (plus:SI (symbol_ref:SI (y) [flags 0x40] 
var_decl 0x7670bcf0 y)

(const_int 1 [0x1]))) [2 y+1 S1 A8])
(subreg:QI (reg:SI 103) 0))

Oh, that looks perfect.  Now it's just a simple matter of 
cleanup :-)  Which is actually kind of fun to watch:


We'll have this after the combine  split step:

(insn 9 8 10 2 (parallel [
(set (reg:SI 99 [ D.1569 ])
(ashift:SI (reg:SI 96 [ c ])
(const_int 8 [0x8])))
(clobber (reg:CC 17 flags))
]) j.c:33 510 {*ashlsi3_1}
 (expr_list:REG_UNUSED (reg:CC 17 flags)
(expr_list:REG_DEAD (reg:SI 96 [ c ])
(nil

(insn 10 9 13 2 (parallel [
(set (reg:SI 100 [ D.1569 ])
(and:SI (reg:SI 99 [ D.1569 ])
(const_int 65280 [0xff00])))
(clobber (reg:CC 17 flags))
]) j.c:33 380 {*andsi_1}
 (expr_list:REG_UNUSED (reg:CC 17 flags)
(expr_list:REG_DEAD (reg:SI 99 [ D.1569 ])
(nil

(insn 13 10 14 2 (parallel [
(set (reg:SI 103)
(lshiftrt:SI (reg:SI 100 [ D.1569 ])
(const_int 8 [0x8])))
(clobber (reg:CC 17 flags))
]) j.c:33 543 {*lshrsi3_1}
 (expr_list:REG_UNUSED (reg:CC 17 flags)
(expr_list:REG_DEAD (reg:SI 100 [ D.1569 ])
(nil

(insn 14 13 0 2 (set (mem/c:QI (const:SI (plus:SI (symbol_ref:SI (y) 
[flags 0x40] var_decl 0x7670bcf0 y)

(const_int 1 [0x1]))) [2 y+1 S1 A8])
(subreg:QI (reg:SI 103) 0)) j.c:33 93 {*movqi_internal}
 (expr_list:REG_DEAD (reg:SI 103)
(nil)))

Eventually all those participate in combinations again resulting in just:

(insn 14 13 0 2 (set (mem/c:QI (const:SI (plus:SI (symbol_ref:SI (y) 
[flags 0x40]  var_decl 0x7fd247c16cf0 y)

(const_int 1 [0x1]))) [2 y+1 S1 A8])
(subreg:QI (reg:SI 96 [ c ]) 0)) j.c:33 93 {*movqi_internal}
 (expr_list:REG_DEAD (reg:SI 96 [ c ])
(nil)))

ie, movb %al, y+1
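
For context, the kind of source that leads to this RTL can be sketched in
C.  This is a minimal illustration, not the PR's actual testcase: the names
y, set_high_byte, set_high_byte_direct and high_byte_index are assumptions
chosen to mirror the dump.  On a little-endian target such as x86, the
masked-or update of the 16-bit object is equivalent to a single byte store
to y+1, which is what the final movb does.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 16-bit global, standing in for the `y' seen in the RTL.  */
uint16_t y;

/* Replace bits 8..15 of y with C, written the "long" way: keep the low
   byte, shift C into place, mask it, and OR the two together.  */
void
set_high_byte (uint8_t c)
{
  y = (uint16_t) ((y & 0x00ffu) | (((unsigned) c << 8) & 0xff00u));
}

/* Index of the byte within y that holds bits 8..15: 1 on little-endian
   targets such as x86, 0 on big-endian ones.  */
int
high_byte_index (void)
{
  const uint16_t probe = 0x0100;
  unsigned char bytes[sizeof probe];
  memcpy (bytes, &probe, sizeof probe);
  return bytes[1] == 0x01 ? 1 : 0;
}

/* The same update expressed as one byte store into the object.  */
void
set_high_byte_direct (uint8_t c)
{
  unsigned char bytes[sizeof y];
  memcpy (bytes, &y, sizeof y);
  bytes[high_byte_index ()] = c;
  memcpy (&y, bytes, sizeof y);
}

/* Check that both forms leave the same value in y.  */
void
check_byte_store_equivalence (void)
{
  y = 0x1234;
  set_high_byte (0xAB);
  uint16_t long_way = y;
  y = 0x1234;
  set_high_byte_direct (0xAB);
  assert (long_way == y);
}
```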


Sometimes I malign the combiner, but there are also days when I just 
have to say wow, not bad given its last major revamp was in 1990/1991 
(which brought in the splitting code noted above).


Anyway, onward bootstrapping and testing...


Jeff

On 01/26/15 20:07, Jeff Law wrote:

Segher: I know you're not officially noted as a maintainer or reviewer
for combine.c, but that's something I'd like to change if you're
interested in a larger role.  In the mean time, any feedback you have
would be appreciated.


So the issue mentioned in the BZ is that fairly obvious code sequences
that ought to use simple byte moves are expanding into hideous sequences
(load, store, couple bitwise logicals, maybe a shift or extension thrown
in for good measure).

As mentioned in the BZ, one of the issues is that combine is limited in
terms of how many insns it will look at.  As it turns out that was
addressed not terribly long ago and we can do 4 insn combinations. With
just a little work in combine.c we can get the desired code for the
first two testcases as well as two of my own.

The first issue is 4 insn combinations are (reasonably) guarded in such
a way as to avoid them if they are unlikely to succeed.  We basically
look at the operands of the 4 insns and try to guess if there's a
reasonable chance a combination would succeed.  If not, no 4 insn
combinations are

Re: [patch, libobjc] export __objc_get_forward_imp, get_imp again

2015-01-27 Thread Matthias Klose

On 01/22/2015 05:09 PM, Matthias Klose wrote:
 On 01/22/2015 12:56 AM, Andrew Pinski wrote:
 On Wed, Jan 21, 2015 at 8:51 AM, Jakub Jelinek ja...@redhat.com wrote:
 On Wed, Jan 21, 2015 at 08:41:46AM -0800, pins...@gmail.com wrote:
 On Jan 21, 2015, at 1:02 AM, Matthias Klose d...@ubuntu.com wrote:

 __objc_get_forward_imp and get_imp were exported in libobjc since GCC 
 4.1, for
 some reason these are not exported anymore in GCC 5 (both declared 
 inline).  So
 either export these as before, or don't export them and bump the soname.  
 The
 latter seems to be unwanted, and at least gnustep-base is using the 
 get_imp
 function.  So better keep the references in GCC 5?

 Is this an intended change in GCC 5 to not to export inline methods 
 anymore?

 Just remove the inline instead.

 The comments like:

 /* The new name of get_imp().  */
 IMP
 class_getMethodImplementation (Class class_, SEL selector)
 {
   if (class_ == Nil  ||  selector == NULL)
 return NULL;

   /* get_imp is inlined, so we're good.  */
   return get_imp (class_, selector);
 }

 don't make me very confident in such a change.
 The extern prototypes really work with both -std=gnu89 and -std=gnu11 and
 thus will at least keep status quo.

 Let's do that then.
 
 get_imp was renamed to class_getMethodImplementation, which is exported from
 objc/runtime.h.  GNUstep-base uses get_imp to define its own
 class_getMethodImplementation, so get_imp isn't really needed. So either make
 the two functions inline, and don't export them, or declare the prototypes.  
 For
 the latter I would suggest objc-private/runtime.h 
 (class_getMethodImplementation
 is declared in objc/runtime.h).

Now committed the following patch after approval on IRC.

  Matthias


libobjc/

2015-01-27  Matthias Klose  d...@ubuntu.com

	* sendmsg.c: Add prototypes for __objc_get_forward_imp and get_imp.

Index: libobjc/sendmsg.c
===
--- libobjc/sendmsg.c	(revision 220167)
+++ libobjc/sendmsg.c	(working copy)
@@ -104,6 +104,10 @@
 struct objc_method * search_for_method_in_list (struct objc_method_list * list, SEL op);
 id nil_method (id, SEL);
 
+/* Make sure this inline function is exported regardless of GNU89 or C99
+   inlining semantics as it is part of the libobjc ABI.  */
+extern IMP __objc_get_forward_imp (id, SEL);
+
 /* Given a selector, return the proper forwarding implementation.  */
 inline
 IMP
@@ -320,6 +324,10 @@
   return res;
 }
 
+/* Make sure this inline function is exported regardless of GNU89 or C99
+   inlining semantics as it is part of the libobjc ABI.  */
+extern IMP get_imp (Class, SEL);
+
 inline
 IMP
 get_imp (Class class, SEL sel)
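
The reason the extra prototype works in both modes can be shown with a
small standalone sketch.  The names add_one and add_one_ptr are
hypothetical and have nothing to do with libobjc.  Under GNU89 rules a
definition marked plain `inline' always emits an out-of-line copy; under
C99/C11 rules it only does so if some file-scope declaration in the
translation unit lacks `inline' (or uses `extern'), which is what the
prototype provides.

```c
#include <assert.h>

/* Plain file-scope declaration without `inline'.  With C99/C11 inline
   semantics this turns the definition below into an external
   definition; with GNU89 semantics the definition is external anyway.  */
extern int add_one (int);

inline
int
add_one (int x)
{
  return x + 1;
}

/* Taking the address needs the out-of-line symbol to exist, so this
   links in either mode only because add_one is really emitted.  */
int (*const add_one_ptr) (int) = add_one;

/* Small self-check: the direct and indirect calls must agree.  */
void
check_add_one (void)
{
  assert (add_one (41) == add_one_ptr (41));
}
```

Compiling this with -std=gnu89 and with -std=gnu11 should give an object
file that defines add_one in both cases.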

Re: [PATCH] Fix PR64798

2015-01-27 Thread Jonathan Wakely


On 27/01/15 14:43 +0100, Richard Biener wrote:


The new exceptional EH allocator failed to align exception objects
properly (it ended up aligning to __alignof__((std::size_t))).  The
following fixes that by aligning to what __attribute__((aligned))
would align to (this is what _Unwind_Exception is aligned to, a
member of __cxa_refcounted_exception).

Bootstrapped and tested on x86_64-unknown-linux-gnu - Rainer is
testing this on sparc-solaris where it broke
g++.old-deja/g++.eh/badalloc1.C.

Ok for trunk?


Yes, thanks.

[PATCH][AArch32] Testcase fix for __ATOMIC_CONSUME

2015-01-27 Thread Alex Velenko


Hi,

This patch fixes the arm/atomic-op-consume.c test to expect the safe LDAEX
instruction to be generated when __ATOMIC_CONSUME semantics are requested.
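
As an illustration of what "consume semantics requested" looks like at the
source level, here is a minimal C11 sketch.  It is not the contents of
atomic-op-consume.x; the names slot, payload, publish and consume_value are
assumptions made for this example.  A load written with
memory_order_consume is, per the comment added by the patch, treated by GCC
as an acquire load, which is why the test now expects LDAEX on ARM.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical shared state: a pointer published by one thread and
   read by another.  */
static _Atomic (int *) slot;
static int payload;

/* Writer side: fill in the payload, then publish its address.  */
void
publish (int value)
{
  payload = value;
  atomic_store_explicit (&slot, &payload, memory_order_release);
}

/* Reader side: the load asks for consume ordering.  Returns -1 if
   nothing has been published yet.  */
int
consume_value (void)
{
  int *p = atomic_load_explicit (&slot, memory_order_consume);
  return p != NULL ? *p : -1;
}

/* Single-threaded self-check of the publish/consume pair.  */
void
check_publish_consume (void)
{
  publish (7);
  assert (consume_value () == 7);
}
```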

This patch was tested by running the modified test on arm-none-eabi and
arm-none-linux-gnueabi compilers.

Is this patch ok?

Alex

2015-01-27  Alex Velenko  alex.vele...@arm.com

gcc/testsuite/

  * gcc.target/arm/atomic-op-consume.c (scan-assember-times): Adjust
  scan-assembler-times pattern.

diff --git a/gcc/testsuite/gcc.target/arm/atomic-op-consume.c 
b/gcc/testsuite/gcc.target/arm/atomic-op-consume.c
index 0354717..cc6c028 100644
--- a/gcc/testsuite/gcc.target/arm/atomic-op-consume.c
+++ b/gcc/testsuite/gcc.target/arm/atomic-op-consume.c
@@ -5,6 +5,9 @@
 
 #include "../aarch64/atomic-op-consume.x"
 
-/* { dg-final { scan-assembler-times "ldrex\tr\[0-9\]+, \\\[r\[0-9\]+\\\]" 6 } } */
+/* To workaround Bugzilla 59448 issue, a request for __ATOMIC_CONSUME is always
+   promoted to __ATOMIC_ACQUIRE, implemented as MEMMODEL_ACQUIRE.  This causes
+   LDAEX to be generated instead of LDREX.  */
+/* { dg-final { scan-assembler-times "ldaex\tr\[0-9\]+, \\\[r\[0-9\]+\\\]" 6 } } */
 /* { dg-final { scan-assembler-times "strex\t...?, r\[0-9\]+, \\\[r\[0-9\]+\\\]" 6 } } */
 /* { dg-final { scan-assembler-not "dmb" } } */

Re: [patch] libstdc++/64368 add configure check for timed mutex operations

2015-01-27 Thread Jonathan Wakely


This isn't related to the last patch for this bug, except that the PR
is currently being used for all darwin FAILs.

We need to check a configure macro before using
pthread_rwlock_timedrdlock because Darwin doesn't define the
_POSIX_TIMEOUTS option.
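
A rough C sketch of the same idea follows.  Note the substitution: the
actual patch guards the code with the libstdc++ configure macro
_GTHREAD_USE_MUTEX_TIMEDLOCK, whereas this example tests the POSIX
_POSIX_TIMEOUTS option macro directly, and the helper name
read_lock_with_timeout is made up for illustration.  Where the option is
not available it falls back to a non-blocking attempt instead of calling a
function that may not exist.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: try to take a read lock, waiting up to SECONDS.
   Returns 0 on success or an errno-style error code.  */
int
read_lock_with_timeout (pthread_rwlock_t *lock, time_t seconds)
{
#if defined (_POSIX_TIMEOUTS) && _POSIX_TIMEOUTS > 0
  /* Timed operations are available: build an absolute deadline.  */
  struct timespec deadline;
  if (clock_gettime (CLOCK_REALTIME, &deadline) != 0)
    return errno;
  deadline.tv_sec += seconds;
  return pthread_rwlock_timedrdlock (lock, &deadline);
#else
  /* No timeouts option (e.g. Darwin): only a non-blocking attempt.  */
  (void) seconds;
  return pthread_rwlock_tryrdlock (lock);
#endif
}

/* Self-check on an uncontended lock: both paths should succeed.  */
void
check_read_lock (void)
{
  pthread_rwlock_t lock = PTHREAD_RWLOCK_INITIALIZER;
  assert (read_lock_with_timeout (&lock, 1) == 0);
  assert (pthread_rwlock_unlock (&lock) == 0);
}
```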

Tested x86_64-linux, committed to trunk.
commit d48fe00ea96b3515a6a1f7a6926dfe2ff7db643c
Author: Jonathan Wakely jwak...@redhat.com
Date:   Tue Jan 27 10:38:09 2015 +

	PR libstdc++/64368
	* include/std/shared_mutex (shared_timed_mutex::try_lock_for,
	shared_timed_mutex::try_lock_until): Only define when POSIX thread
	timeouts option is supported.
	(shared_timed_mutex::try_shared_lock_for,
	shared_timed_mutex::try_shared_lock_until): Likewise.

diff --git a/libstdc++-v3/include/std/shared_mutex b/libstdc++-v3/include/std/shared_mutex
index 643768c..47cfc64 100644
--- a/libstdc++-v3/include/std/shared_mutex
+++ b/libstdc++-v3/include/std/shared_mutex
@@ -108,6 +108,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   return true;
 }
 
+#if _GTHREAD_USE_MUTEX_TIMEDLOCK
 template<typename _Rep, typename _Period>
   bool
   try_lock_for(const chrono::duration<_Rep, _Period>& __rel_time)
@@ -149,6 +150,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	const auto __s_atime = __s_entry + __delta;
 	return try_lock_until(__s_atime);
   }
+#endif
 
 void
 unlock()
@@ -186,6 +188,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   return true;
 }
 
+#if _GTHREAD_USE_MUTEX_TIMEDLOCK
 template<typename _Rep, typename _Period>
   bool
   try_lock_shared_for(const chrono::duration<_Rep, _Period>& __rel_time)
@@ -230,6 +233,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	const auto __s_atime = __s_entry + __delta;
 	return try_lock_shared_until(__s_atime);
   }
+#endif
 
 void
 unlock_shared()

Re: [PATCH, PR tree-optimization/64277] Improve loop iterations count estimation

On 27 Jan 12:29, Richard Biener wrote:
 On Tue, Jan 27, 2015 at 11:47 AM, Ilya Enkovich enkovich@gmail.com 
 wrote:
  On 27 Jan 12:40, Ilya Enkovich wrote:
  Hi,
 
  This patch was supposed to fix PR tree-optimization/64277.  The PR is now
  fixed by disabling the warnings, but I think the patch is still useful to
  avoid dead code generated by complete unrolling.
 
  Bootstrapped and tested on x86_64-unknown-linux-gnu.
 
  Thanks,
  Ilya
  --
  gcc/
 
  2015-01-27  Ilya Enkovich  ilya.enkov...@intel.com
 
* tree-ssa-loop-niter.c (record_nonwrapping_iv): Use base
range info when possible to refine estimation.
 
  gcc/testsuite/
 
  2015-01-27  Ilya Enkovich  ilya.enkov...@intel.com
 
* gcc.dg/pr64277.c: New.
 
 
 
  Here is a new version fixed according to comments in the tracker.  I also 
  fixed a test to scan cunroll dumps.  Does it look OK?
 
 Minor comments below.
 
  What are possible branches for this patch?
 
 You can probably create a testcase that shows code-size regressions
 against a version that didn't peel completely (GCC 4.7).  Thus I'd say
 it would apply to 4.9 as well (4.8 doesn't have range information).
 
  Thanks,
  Ilya
  --
  diff --git a/gcc/testsuite/gcc.dg/pr64277.c b/gcc/testsuite/gcc.dg/pr64277.c
  new file mode 100644
  index 000..c6ef331
  --- /dev/null
  +++ b/gcc/testsuite/gcc.dg/pr64277.c
  @@ -0,0 +1,23 @@
  +/* PR tree-optimization/64277 */
  +/* { dg-do compile } */
  +/* { dg-options "-O3 -Wall -Werror -fdump-tree-cunroll-details" } */
  +/* { dg-final { scan-tree-dump "loop with 5 iterations completely unrolled" "cunroll" } } */
  +/* { dg-final { scan-tree-dump "loop with 6 iterations completely unrolled" "cunroll" } } */
  +/* { dg-final { cleanup-tree-dump cunroll } } */
  +
  +int f1[10];
  +void test1 (short a[], short m, unsigned short l)
  +{
  +  int i = l;
  +  for (i = i + 5; i < m; i++)
  +f1[i] = a[i]++;
  +}
  +
  +void test2 (short a[], short m, short l)
  +{
  +  int i;
  +  if (m > 5)
  +m = 5;
  +  for (i = m; i > l; i--)
  +f1[i] = a[i]++;
  +}
  diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
  index 919f5c0..1cd297d 100644
  --- a/gcc/tree-ssa-loop-niter.c
  +++ b/gcc/tree-ssa-loop-niter.c
  @@ -2754,6 +2754,7 @@ record_nonwrapping_iv (struct loop *loop, tree base, 
  tree step, gimple stmt,
   {
 tree niter_bound, extreme, delta;
 tree type = TREE_TYPE (base), unsigned_type;
  +  tree orig_base = base;
 
 if (TREE_CODE (step) != INTEGER_CST || integer_zerop (step))
   return;
  @@ -2777,16 +2778,30 @@ record_nonwrapping_iv (struct loop *loop, tree 
  base, tree step, gimple stmt,
 
 if (tree_int_cst_sign_bit (step))
   {
  +  wide_int min, max;
 extreme = fold_convert (unsigned_type, low);
  -  if (TREE_CODE (base) != INTEGER_CST)
  +  if (TREE_CODE (orig_base) == SSA_NAME
  +      && TREE_CODE (high) == INTEGER_CST
  +      && INTEGRAL_TYPE_P (TREE_TYPE (orig_base))
  +      && get_range_info (orig_base, &min, &max) == VR_RANGE
  +      && wi::gts_p (wide_int (high), max))
 
 For me a simple wi::gts_p (high, max) worked fine.
 
  +   base = wide_int_to_tree (unsigned_type, max);
  +  else if (TREE_CODE (base) != INTEGER_CST)
  base = fold_convert (unsigned_type, high);
 delta = fold_build2 (MINUS_EXPR, unsigned_type, base, extreme);
 step = fold_build1 (NEGATE_EXPR, unsigned_type, step);
   }
 else
   {
  +  wide_int min, max;
 extreme = fold_convert (unsigned_type, high);
  -  if (TREE_CODE (base) != INTEGER_CST)
  +  if (TREE_CODE (orig_base) == SSA_NAME
  +      && TREE_CODE (low) == INTEGER_CST
  +      && INTEGRAL_TYPE_P (TREE_TYPE (orig_base))
  +      && get_range_info (orig_base, &min, &max) == VR_RANGE
  +      && wi::gts_p (min, wide_int (low)))
 
 Likewise.
 
 Ok for trunk with those changes.  For the 4.9 branch you need to adjust
 the patch to not use wide-ints.  I'd leave it on trunk for a while and
 eventually open a bugreport for the size regression to keep track of it.
 
 Thanks,
 Richard.
 

Thanks a lot for the review!  Here is the final version for GCC 5.0.  I will 
prepare a 4.9 version later.

Thanks,
Ilya
--
diff --git a/gcc/testsuite/gcc.dg/pr64277.c b/gcc/testsuite/gcc.dg/pr64277.c
new file mode 100644
index 000..c6ef331
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr64277.c
@@ -0,0 +1,23 @@
+/* PR tree-optimization/64277 */
+/* { dg-do compile } */
+/* { dg-options "-O3 -Wall -Werror -fdump-tree-cunroll-details" } */
+/* { dg-final { scan-tree-dump "loop with 5 iterations completely unrolled" "cunroll" } } */
+/* { dg-final { scan-tree-dump "loop with 6 iterations completely unrolled" "cunroll" } } */
+/* { dg-final { cleanup-tree-dump cunroll } } */
+
+int f1[10];
+void test1 (short a[], short m, unsigned short l)
+{
+  int i = l;
+  for (i = i + 5; i < m; i++)
+f1[i] = a[i]++;
+}
+
+void test2 (short a[], short m, short l)
+{
+  int i;
+  if (m > 5)
+m = 5;
+  for (i = m; i

[PATCH, CHKP] Fix PR middle-end/64805