[Bug target/66920] ICE in expand_debug_locations, at cfgexpand.c:3826

2015-07-20 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66920

Uroš Bizjak ubizjak at gmail dot com changed:

   What|Removed |Added

 Status|NEW |RESOLVED
  Component|debug   |target
 Resolution|--- |DUPLICATE

--- Comment #3 from Uroš Bizjak ubizjak at gmail dot com ---
(In reply to Jan Smets from comment #1)
> Likely related/identical to 66931

True, this ICE is due to an ISA mismatch; it just happens to show up in the
debug part. There is no vector register available, so BLKmode is used instead
of V4SImode.

Please note that gcc warns:

pr66920.C: At global scope:
pr66920.C:15:1: warning: always_inline function might not be inlinable
[-Wattributes]
 counter_inc(struct counter *, long, long) {

*** This bug has been marked as a duplicate of bug 66931 ***

[Bug middle-end/66931] ICE in convert_move, at expr.c:316

2015-07-20 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66931

--- Comment #1 from Uroš Bizjak ubizjak at gmail dot com ---
*** Bug 66920 has been marked as a duplicate of this bug. ***

[Bug middle-end/66931] ICE in convert_move, at expr.c:316

2015-07-20 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66931

Uroš Bizjak ubizjak at gmail dot com changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2015-07-20
 Ever confirmed|0   |1
  Known to fail||6.0

--- Comment #2 from Uroš Bizjak ubizjak at gmail dot com ---
Confirmed with [trunk revision 225993]:

~/cc1plus -m32 -mno-sse pr66931.C
pr66931.C:5:1: warning: always_inline function might not be inlinable
[-Wattributes]
 counter(tcounter *self, long, long) {
 ^
pr66931.C: In function ‘bool test_sp()’:
pr66931.C:6:25: internal compiler error: in convert_move, at expr.c:281
   *(v2di *)self + v2di{};
 ^
0xb83e07 convert_move(rtx_def*, rtx_def*, int)
/home/uros/gcc-svn/trunk/gcc/expr.c:281
0xb9aa13 store_expr_with_bounds(tree_node*, rtx_def*, int, bool, tree_node*)
/home/uros/gcc-svn/trunk/gcc/expr.c:5475
[...]


(gdb) f 2
#2  0x00b83e08 in convert_move (to=0x2e960eb8, from=0x2e9621e0,
unsignedp=0) at /home/uros/gcc-svn/trunk/gcc/expr.c:281
281   gcc_assert (from_mode != BLKmode);
(gdb) list
276   : (unsignedp ? ZERO_EXTEND : SIGN_EXTEND));
277
278
279   gcc_assert (to_real == from_real);
280   gcc_assert (to_mode != BLKmode);
281   gcc_assert (from_mode != BLKmode);
282
283   /* If the source and destination are already the same, then there's
284  nothing to do.  */
285   if (to == from)
(gdb) p debug_rtx (to)
(reg:V4SI 92 [ D.2412 ])
$1 = void
(gdb) p debug_rtx (from)
(mem/c:BLK (plus:SI (reg/f:SI 82 virtual-stack-vars)
(const_int -32 [0xffe0])) [0 D.2405+0 S16 A128])
$2 = void

This happens due to ISA mismatch, and the compiler warns with:

pr66931.C: At global scope:
pr66931.C:5:1: warning: always_inline function might not be inlinable
[-Wattributes]
 counter(tcounter *self, long, long) {

There is no V4SImode available without -msse, so BLKmode is used instead. IIRC,
we already have a PR about this issue.

[Bug c++/66943] New: GCC warns of Unknown Pragma for OpenMP, even though it support it.

2015-07-20 Thread noloader at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66943

Bug ID: 66943
   Summary: GCC warns of Unknown Pragma for OpenMP, even though it
support it.
   Product: gcc
   Version: 4.8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: noloader at gmail dot com
  Target Milestone: ---

The following source code results in a slew of Unknown Pragma messages. The
problem is, GCC supports OpenMP. GCC responds properly to -fopenmp, and even
defines _OPENMP when it encounters -fopenmp.

This is related to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53431, and
managing warnings with pragmas when using -Wall. The relation to 53431 is: I
can't seem to get pragma GCC diagnostic to work to ignore the warnings when
-Wall is in effect.

// Defines GCC_DIAGNOSTIC_AWARE if GCC 4.7 or above.
#define GCC_DIAGNOSTIC_AWARE 1

#if GCC_DIAGNOSTIC_AWARE
# pragma GCC diagnostic ignored "-Wunknown-pragmas"
#endif

...
Integer ModularRoot(const Integer a, const Integer dp, const Integer dq,
const Integer p, const Integer q, const Integer u)
{
Integer p2, q2;
#pragma omp parallel
#pragma omp sections
{
#pragma omp section
p2 = ModularExponentiation((a % p), dp, p);
#pragma omp section
q2 = ModularExponentiation((a % q), dq, q);
}
return CRT(p2, p, q2, q, u);
}
...

**

To duplicate:

git clone https://github.com/weidai11/cryptopp.git cryptopp-warn
cd cryptopp-warn
export CXXFLAGS="-g2 -O3 -DNDEBUG -Wall"
make

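For reference, the pattern the report is attempting can be sketched roughly as
follows. This is only an illustration: sum_of_squares, built_with_openmp and
self_check are made-up names, not Crypto++ code. Note that the pragma needs the
warning name as a quoted string. Per PR 53431, g++ may still print the warning
even with this in place, so the sketch shows the intended usage, not a
guaranteed workaround.

```cpp
#include <cassert>

// Hypothetical stand-in for the Crypto++ routine quoted in the report;
// the names and the arithmetic are invented for illustration only.
#if defined(__GNUC__)
# pragma GCC diagnostic push
# pragma GCC diagnostic ignored "-Wunknown-pragmas"
#endif

long sum_of_squares(long a, long b)
{
    long a2 = 0, b2 = 0;
    // Without -fopenmp these pragmas are ignored and the two statements
    // simply run one after the other, so the result is the same.
    #pragma omp parallel sections
    {
        #pragma omp section
        a2 = a * a;
        #pragma omp section
        b2 = b * b;
    }
    return a2 + b2;
}

#if defined(__GNUC__)
# pragma GCC diagnostic pop
#endif

// Reports whether this translation unit was built with OpenMP enabled,
// using the _OPENMP macro mentioned in the report.
bool built_with_openmp()
{
#ifdef _OPENMP
    return true;
#else
    return false;
#endif
}

// Small self-check usable from any caller.
void self_check()
{
    assert(sum_of_squares(3, 4) == 25);
}
```

The code behaves the same with and without -fopenmp; only the diagnostics and
the value returned by built_with_openmp() differ.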

[Bug c++/66943] GCC warns of Unknown Pragma for OpenMP, even though it support it.

2015-07-20 Thread noloader at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66943

--- Comment #1 from Jeffrey Walton noloader at gmail dot com ---
I've experienced this issue on Cygwin i386 and x86_64 running GCC 4.8.1; Fedora
21 and 22, i386 and x86_64 running GCC 4.9 and 5.1, and a few others.

So it appears to be a widespread issue, and not an isolated case.


[Bug fortran/66942] New: trans-expr.c:5701 runtime error: member call on null pointer of type 'struct vec'

2015-07-20 Thread zeccav at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66942

Bug ID: 66942
   Summary: trans-expr.c:5701 runtime error: member call on null
pointer of type 'struct vec'
   Product: gcc
   Version: 5.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zeccav at gmail dot com
  Target Milestone: ---

! gcc-5.2.0/gcc/fortran/trans-expr.c:5701:19: runtime error: member call on
null pointer of type 'struct vec'
! gfortran source line retargs->splice (arglist);
! retargs is NULL
! double check with gcc_assert(retargs); immediately before
  call sub 
  END


Re: [PATCH 1/3] tree-ssa-tail-merge: add IPA ICF infrastructure.

2015-07-20 Thread Martin Liška
On 07/16/2015 01:03 PM, Martin Liška wrote:
 On 07/09/2015 06:24 PM, Jeff Law wrote:
 On 07/09/2015 07:56 AM, mliska wrote:
 gcc/ChangeLog:

 2015-07-09  Martin Liska  mli...@suse.cz

 * dbgcnt.def: Add new debug counter.
 * ipa-icf-gimple.c (func_checker::compare_ssa_name): Add flag
 for strict mode.
 (func_checker::compare_memory_operand): Likewise.
 (func_checker::compare_cst_or_decl): Handle if we are in
 tail_merge_mode.
 (func_checker::compare_operand): Pass strict flag properly.
 (func_checker::stmt_local_def): New function.
 (func_checker::compare_phi_node): Move from sem_function class.
 (func_checker::compare_bb_tail_merge): New function.
 (func_checker::compare_bb): Improve STMT iteration.
 (func_checker::compare_gimple_call): Pass strict flag.
 (func_checker::compare_gimple_assign): Likewise.
 (func_checker::compare_gimple_label): Remove unused flag.
 (ssa_names_set): New class.
 (ssa_names_set::build): New function.
 * ipa-icf-gimple.h (func_checker::gsi_next_nonlocal): New
 function.
 (ssa_names_set::contains): New function.
 (ssa_names_set::add): Likewise.
 * ipa-icf.c (sem_function::equals_private): Use transformed
 function.
 (sem_function::compare_phi_node): Move to func_checker class.
 * ipa-icf.h: Add new declarations.
 * tree-ssa-tail-merge.c (check_edges_correspondence): New
 function.
 (find_duplicate): Add usage of IPA ICF gimple infrastructure.
 (find_clusters_1): Pass new sem_function argument.
 (find_clusters): Likewise.
 (tail_merge_optimize): Call IPA ICF comparison machinery.
 So a general question.  We're passing in STRICT to several routines, which 
 is fine.  But then we're also checking M_TAIL_MERGE_MODE.  What's the 
 difference between the two?  Can they be unified?
 
 Hello.
 
 I would say that STRICT is a fairly generic mechanism that was introduced some
 time ago. It is used, e.g., for checking THIS arguments of methods, and it
 makes checking more sensitive in situations that are somehow special.
 
 The newly added state is orthogonal to the previous one.
 



 -/* Verifies that trees T1 and T2 are equivalent from perspective of ICF.  
 */
 +/* Verifies that trees T1 and T2 are equivalent from perspective of ICF.
 +   If STRICT flag is true, versions must match strictly.  */

   bool
 -func_checker::compare_ssa_name (tree t1, tree t2)
 +func_checker::compare_ssa_name (tree t1, tree t2, bool strict)
 This (and other) functions would seem to be used more than just ICF at this 
 point.  A pass over the comments to update them as appropriate would be 
 appreciated.

 @@ -626,6 +648,136 @@ func_checker::parse_labels (sem_bb *bb)
   }
   }

 +/* Return true if gimple STMT is just a local difinition in a
 +   basic block.  Used SSA names are contained in SSA_NAMES_SET.  */
 s/difinition/definition/
 
 Thanks.
 

 I didn't find this comment particularly useful in understanding what this
 function does.  AFAICT the function looks at the uses of the LHS of STMT and
 verifies they're all in the same block as STMT, right?

 It also verifies that none of the operands within STMT are part of
 SSA_NAMES_SET.

 What role do those properties play in the meaning of local definition?
 
 I tried to explain the purpose of this function in more detail.
 




 @@ -1037,4 +1205,60 @@ func_checker::compare_gimple_asm (const gasm *g1, 
 const gasm *g2)
 return true;
   }

 +void
 +ssa_names_set::build (basic_block bb)
 Needs a function comment.  What are the important names we're collecting 
 here?

 Is a single forward and backward pass really sufficient to find all the 
 important names?

 In the backward pass, do you have to consider things like ASMs?  I guess 
 it's difficult to understand what you need to look at because it's not 
 entirely clear the set of SSA_NAMEs you're building.



 @@ -149,12 +153,20 @@ public:
mapping between basic blocks and labels.  */
 void parse_labels (sem_bb *bb);

 +  /* For given basic blocks BB1 and BB2 (from functions FUNC1 and FUNC),
 + true value is returned if phi nodes are semantically
 + equivalent in these blocks.  */
 +  bool compare_phi_node (sem_bb *sem_bb1, sem_bb *sem_bb2);
 Presumably in the case of tail merging, FUNC1 and FUNC will be the same :-)
 
 Yes, the function is not called from tail-merge pass.
 


 /* Verifies that trees T1 and T2 are equivalent from perspective of 
 ICF.  */
 -  bool compare_ssa_name (tree t1, tree t2);
 +  bool compare_ssa_name (tree t1, tree t2, bool strict = true);

 /* Verification function for edges E1 and E2.  */
 bool compare_edge (edge e1, edge e2);
 @@ -204,7 +216,7 @@ public:
 bool compare_tree_ssa_label (tree t1, tree t2);

 /* Function compare for equality given memory operands T1 and T2.  */
 -  bool compare_memory_operand (tree t1, tree t2);
 +  bool compare_memory_operand (tree t1, tree t2, bool strict = true);

 /* 

RE: [PATCH] [aarch64] Implemented reciprocal square root (rsqrt) estimation in -ffast-math

2015-07-20 Thread Kumar, Venkataramanan
Hi, 

I missed your email and noticed it this week.

What does column 2 test?  Are you trying to implement square root using the
reciprocal estimate and step?

But reciprocal square root using the reciprocal estimate and step (2 steps for
SP, 3 for DP) seems to be better than using fdiv and fsqrt in your case.
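To make the "estimate and step" scheme concrete, here is a rough sketch in
plain C++. It is not what the patch emits: rsqrt_estimate uses a well-known
bit-level guess purely as a software stand-in for a hardware estimate
instruction, and all function names are invented for this note. Each step is
one Newton-Raphson iteration, y' = y * (3 - x*y*y) / 2, which roughly squares
the relative error, which is why the number of steps decides the precision.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Crude software estimate of 1/sqrt(x) for x > 0. This is only a stand-in
// for a hardware estimate instruction (assumption for illustration).
float rsqrt_estimate(float x)
{
    assert(x > 0.0f);
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    bits = 0x5f3759dfu - (bits >> 1);   // classic bit-level initial guess
    float y;
    std::memcpy(&y, &bits, sizeof y);
    return y;
}

// One Newton-Raphson step for 1/sqrt(x):  y' = y * (3 - x*y*y) / 2.
float rsqrt_step(float x, float y)
{
    return y * (3.0f - x * y * y) * 0.5f;
}

// Initial estimate followed by `steps` refinement steps.
float rsqrt_refined(float x, int steps)
{
    float y = rsqrt_estimate(x);
    for (int i = 0; i < steps; ++i)
        y = rsqrt_step(x, y);
    return y;
}

// Relative error of the approximation against the libm reference.
double rsqrt_rel_error(float x, int steps)
{
    const double exact = 1.0 / std::sqrt(static_cast<double>(x));
    return std::fabs(rsqrt_refined(x, steps) - exact) / exact;
}
```

Calling rsqrt_rel_error(x, n) for n = 0, 1, 2 shows how quickly the error
shrinks per step; with a less precise initial estimate, more steps are needed
to reach the same number of correct bits.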

Regards,
Venkat.

 -Original Message-
 From: Evandro Menezes [mailto:e.mene...@samsung.com]
 Sent: Wednesday, July 15, 2015 3:45 AM
 To: Kumar, Venkataramanan; pins...@gmail.com; 'Dr. Philipp Tomsich'
 Cc: 'James Greenhalgh'; 'Benedikt Huber'; gcc-patches@gcc.gnu.org; 'Marcus
 Shawcroft'; 'Ramana Radhakrishnan'; 'Richard Earnshaw'
 Subject: RE: [PATCH] [aarch64] Implemented reciprocal square root (rsqrt)
 estimation in -ffast-math
 
 I ran a simple test on A57 rev. 0, looping a million times around sqrt{,f} and
 the respective series iterations with the values in the sequence 1..100
 and got these results:
 
 sqrt(x):    36593844/s     1/sqrt(x):   18283875/s
 3 Steps:    47922557/s     3 Steps:     49005194/s

 sqrtf(x):   143988480/s    1/sqrtf(x):  69516857/s
 2 Steps:    78740157/s     2 Steps:     80385852/s
 
 I'm a bit surprised that the 3-iteration series for DP is faster than sqrt(), 
 but
 not that it's much faster for the reciprocal of sqrt().  As for SP, the 
 2-iteration
 series is faster only for the reciprocal for sqrtf().
 
 There might still be some leg for this patch in real-world cases which I'd 
 like to
 investigate.
 
 --
 Evandro Menezes  Austin, TX
 
 
  -Original Message-
  From: gcc-patches-ow...@gcc.gnu.org
  [mailto:gcc-patches-ow...@gcc.gnu.org] On Behalf Of Kumar,
  Venkataramanan
  Sent: Monday, June 29, 2015 13:50
  To: pins...@gmail.com; Dr. Philipp Tomsich
  Cc: James Greenhalgh; Benedikt Huber; gcc-patches@gcc.gnu.org; Marcus
  Shawcroft; Ramana Radhakrishnan; Richard Earnshaw
  Subject: RE: [PATCH] [aarch64] Implemented reciprocal square root
  (rsqrt) estimation in -ffast-math
 
  Hi,
 
   -Original Message-
   From: pins...@gmail.com [mailto:pins...@gmail.com]
   Sent: Monday, June 29, 2015 10:23 PM
   To: Dr. Philipp Tomsich
   Cc: James Greenhalgh; Kumar, Venkataramanan; Benedikt Huber; gcc-
   patc...@gcc.gnu.org; Marcus Shawcroft; Ramana Radhakrishnan;
 Richard
   Earnshaw
   Subject: Re: [PATCH] [aarch64] Implemented reciprocal square root
   (rsqrt) estimation in -ffast-math
  
  
  
  
  
On Jun 29, 2015, at 4:44 AM, Dr. Philipp Tomsich
   philipp.toms...@theobroma-systems.com wrote:
   
James,
   
On 29 Jun 2015, at 13:36, James Greenhalgh
   james.greenha...@arm.com wrote:
   
On Mon, Jun 29, 2015 at 10:18:23AM +0100, Kumar, Venkataramanan
   wrote:
   
-Original Message-
From: Dr. Philipp Tomsich
[mailto:philipp.toms...@theobroma-systems.com]
Sent: Monday, June 29, 2015 2:17 PM
To: Kumar, Venkataramanan
Cc: pins...@gmail.com; Benedikt Huber; gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] [aarch64] Implemented reciprocal square
root
(rsqrt) estimation in -ffast-math
   
Kumar,
   
This does not come unexpected, as the initial estimation and
each iteration will add an architecturally-defined number of
bits of precision (ARMv8 guarantees only a minimum number of
bits
   provided
per operation… the exact number is specific to each micro-arch,
   though).
Depending on your architecture and on the required number of
precise bits by any given benchmark, one may see miscompares.
   
True.
   
I would be very uncomfortable with this approach.
   
Same here. The default must be safe. Always.
Unlike other architectures, we don’t have a problem with making
the proper defaults for “safety”, as the ARMv8 ISA guarantees a
minimum number of precise bits per iteration.
   
From Richard Biener's post in the thread Michael Matz linked
earlier in the thread:
   
  It would follow existing practice of things we allow in
  -funsafe-math-optimizations.  Existing practice in that we
  want to allow -ffast-math use with common benchmarks we care
  about.
   
  https://gcc.gnu.org/ml/gcc-patches/2009-11/msg00100.html
   
With the solution you seem to be converging on (2-steps for some
microarchitectures, 3 for others), a binary generated for one
micro-arch may drop below a minimum guarantee of precision when
run on another. This seems to go against the spirit of the
practice above. I would only support adding this optimization to
-Ofast if we could keep to architectural guarantees of precision
in the generated code
   (i.e. 3-steps everywhere).
   
I don't object to adding a -mlow-precision-recip-sqrt style
option, which would be off by default, would enable the 2-step
mode, and would need to be explicitly enabled (i.e. not implied
by
-mcpu=foo) but I don't see what this buys you beyond the Gromacs

[Bug c++/66943] GCC warns of Unknown Pragma for OpenMP, even though it support it.

2015-07-20 Thread noloader at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66943

--- Comment #2 from Jeffrey Walton noloader at gmail dot com ---
My bad... Here's the error message:

g++ -DNDEBUG -g2 -O3 -Wall -march=native -pipe -c nbtheory.cpp
nbtheory.cpp:655:0: warning: ignoring #pragma omp parallel [-Wunknown-pragmas]
  #pragma omp parallel
 ^
nbtheory.cpp:656:0: warning: ignoring #pragma omp sections [-Wunknown-pragmas]
   #pragma omp sections
 ^
...


[PATCH] PR debug/53927: fix value for DW_AT_static_link

2015-07-20 Thread Pierre-Marie de Rodat

Hello,

This patch fixes the static link description in DWARF to comply with the 
specification. In order to do so, it appends a field to all FRAME 
objects to hold the frame base address (DW_AT_frame_base) so that the 
nested subprograms can directly reference this field.


See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53927 for the context 
(in particular why we need this additional field in FRAME objects).


Bootstrapped and regtested successfully on x86_64-linux. Ok for trunk?
Thank you in advance!


gcc/ChangeLog:

PR debug/53927
* tree-nested.c (finalize_nesting_tree_1): Append a field to
hold the frame base address.
* dwarf2out.c (gen_subprogram_die): Generate for
DW_AT_static_link a location description that computes the value
of this field.

--
Pierre-Marie de Rodat


[Bug c++/53431] C++ preprocessor ignores #pragma GCC diagnostic

2015-07-20 Thread allan.chandler at oakton dot com.au
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53431

Allan Chandler allan.chandler at oakton dot com.au changed:

   What|Removed |Added

 CC||allan.chandler at oakton dot com.au

--- Comment #11 from Allan Chandler allan.chandler at oakton dot com.au ---
http://stackoverflow.com/questions/31509434/gcc-does-not-honor-pragma-gcc-diagnostic-to-silence-warnings

Now you've done it. This was reported over three years ago and now it's
affected someone on Stack Overflow. You guys are in for it now :-)


[Bug debug/66920] ICE in expand_debug_locations, at cfgexpand.c:3826

2015-07-20 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66920

Uroš Bizjak ubizjak at gmail dot com changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2015-07-20
  Component|middle-end  |debug
 Ever confirmed|0   |1
  Known to fail||6.0

--- Comment #2 from Uroš Bizjak ubizjak at gmail dot com ---
Confirmed with current mainline [trunk revision 225993]:
-
~/gcc-build/gcc/cc1plus -m32 -O1 -gdwarf-4 -mno-sse pr66920.C

#2  0x00a5b1e0 in expand_debug_locations () at
/home/uros/gcc-svn/trunk/gcc/cfgexpand.c:4992
#3  0x00a5c028 in (anonymous namespace)::pass_expand::execute
(this=0x25c4080, fun=0x2e9519d8) at
/home/uros/gcc-svn/trunk/gcc/cfgexpand.c:6026
#4  0x00f3c7c4 in execute_one_pass (pass=0x25c4080) at
/home/uros/gcc-svn/trunk/gcc/passes.c:2319

(gdb) f 2
#2  0x00a5b1e0 in expand_debug_locations () at
/home/uros/gcc-svn/trunk/gcc/cfgexpand.c:4992
4992        gcc_assert (mode == GET_MODE (val)
(gdb) list
4987          val = gen_rtx_UNKNOWN_VAR_LOC ();
4988        else
4989          {
4990            mode = GET_MODE (INSN_VAR_LOCATION (insn));
4991
4992            gcc_assert (mode == GET_MODE (val)
4993                        || (GET_MODE (val) == VOIDmode
4994                            && (CONST_SCALAR_INT_P (val)
4995                                || GET_CODE (val) == CONST_FIXED
4996                                || GET_CODE (val) == LABEL_REF)));

(gdb) p debug_rtx (val)
(concatn:V4SI [
(const_int 0 [0])
(const_int 0 [0])
(const_int 0 [0])
(const_int 0 [0])
])
$3 = void
(gdb) p debug_rtx (insn)
(debug_insn 28 27 29 2 (var_location:BLK v2di (simple_return:CC)) pr66920.C:16
-1
 (nil))
$4 = void

Re: [PATCH] Fix partial template specialization syntax in wide-int.h

2015-07-20 Thread Mikhail Maltsev
On 07/17/2015 07:46 PM, Mike Stump wrote:
 On Jul 17, 2015, at 2:28 AM, Mikhail Maltsev malts...@gmail.com wrote:
 The following code (reduced from wide-int.h) is rejected by Intel C++
 Compiler (EDG-based):
 
 So, could you test this with the top of the tree compiler and file a bug
 report against g++ for it, if it seems to not work right.  If that bug report
 is rejected, then I’d say file a bug report against clang and EDG.

In addition to usual bootstrap+regtest, I also checked that build succeeds with
GCC 4.3.6 (IIRC, this is now the minimal required version) as well as with
recent GCC snapshot used as stage 0. Committed as r225993.
I also filed this bugreport: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66941

 I think that the warning is correct, and template <> should not be used
 here. The attached patch should fix this issue. Bootstrapped and regtested
 on x86_64-linux. OK for trunk?
 
 Ok.  Does this need to go into the gcc-5 release branch as well?  If so, ok
 there too.  Thanks.
I think there is no need for it.

-- 
Regards,
Mikhail Maltsev


Re: [PATCH][AArch64] Use cinc for if_then_else of plus-immediates

2015-07-20 Thread Kyrill Tkachov


On 19/07/15 07:40, Oleg Endo wrote:

On 19 Jul 2015, at 12:13, Andrew Pinski pins...@gmail.com wrote:


On Thu, Jul 16, 2015 at 8:33 AM, Kyrill Tkachov kyrylo.tkac...@arm.com wrote:

Hi all,

This patch improves codegen for expressions of the form:
(x ? y + c1 : y + c2) when |c1 - c2| == 1

It matches the if_then_else of the two plus-immediates,
performs one of them, then generates a conditional increment
operation.

Thus, for the code in the testcase we generate a single add, compare
and cinc instruction rather than two adds, a compare and a csel.

Bootstrapped and tested on aarch64.

Ok for trunk?

Why isn't this done in the generic code already.  That is ifcvt?  It
seems better to have it optimize it there rather than having a target
specific patch for something which is not really target specific
except maybe the cost.

It'd be better if something transformed

   x  100 ? x - 2 : x - 1;

into

   x - 1 - (x  100)


This looks like a good idea.
Thanks for the suggestion.

Kyrill



This is much easier to handle with combine patterns without having to rely on 
conditional move patterns.  In this case, it seems that the if_then_else 
combine patterns will be formed by going through conditional move patterns.  So 
if the target doesn't define conditional moves, it will never be able to get 
there.

There are some other similar missed ifcvt cases, such as 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54236#c9
Maybe the existing addmodecc handling in ifcvt could be extended to do that.

Cheers,
Oleg
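The rewrite discussed in this thread can be checked with a small sketch. The
comparison in Oleg's example is taken as an opaque bool here, since the
identity does not depend on which condition is used. All function names are
invented for this note; this is a source-level illustration of the
equivalence, not the aarch64 pattern or any ifcvt code.

```cpp
#include <cassert>

// Select form: what the source expresses. The two arms differ by exactly 1.
constexpr int select_plus(bool cond, int y, int c)
{
    return cond ? y + (c + 1) : y + c;
}

// Increment form: one add, then add the 0/1 value of the condition.
constexpr int increment_plus(bool cond, int y, int c)
{
    return (y + c) + static_cast<int>(cond);
}

// Mirror case: the true arm is one less than the false arm.
constexpr int select_minus(bool cond, int x)
{
    return cond ? x - 2 : x - 1;
}

// Same value without a select: subtract the 0/1 value of the condition.
constexpr int subtract_cond(bool cond, int x)
{
    return x - 1 - static_cast<int>(cond);
}

static_assert(select_plus(true, 10, 4) == increment_plus(true, 10, 4), "true arm");
static_assert(select_minus(false, 10) == subtract_cond(false, 10), "false arm");

// Exhaustively compares both forms over a small range of inputs.
bool forms_agree(int lo, int hi)
{
    for (int x = lo; x <= hi; ++x)
        for (int b = 0; b < 2; ++b) {
            const bool cond = (b != 0);
            if (select_plus(cond, x, 3) != increment_plus(cond, x, 3))
                return false;
            if (select_minus(cond, x) != subtract_cond(cond, x))
                return false;
        }
    return true;
}

// Small self-check usable from any caller.
void self_check()
{
    assert(forms_agree(-100, 100));
}
```

Whether the increment form is actually cheaper depends on the target, which is
the cost question raised in the thread.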





[Bug target/66930] [5 Regression]: gengtype.c is miscompiled during stage2

2015-07-20 Thread glaubitz at physik dot fu-berlin.de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66930

--- Comment #2 from John Paul Adrian Glaubitz glaubitz at physik dot 
fu-berlin.de ---
(In reply to Oleg Endo from comment #1)
 (In reply to John Paul Adrian Glaubitz from comment #0)
  As previously discussed in private mail, I am now filing a bug report for
  the regression in gcc-5 that was introduced somewhere between r222550 and
  r225710 which leads to the miscompilation of gcc/gengtype.c when building a
  native compiler on SH [1]:
 
 Looking at the SVN log for gcc-5-branch:
svn log -r222550:r225710

r223346 isn't affected either, 5.1.1-6 is still building and already got past
gengtype.c in stage 2.

Adrian


[Bug middle-end/66915] [6 Regression] FAIL: gcc.dg/fixed-point/unary.c execution test on arm

2015-07-20 Thread ktkachov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66915

--- Comment #2 from ktkachov at gcc dot gnu.org ---
Created attachment 36015
  -- https://gcc.gnu.org/bugzilla/attachment.cgi?id=36015&action=edit
Diff between good and bad versions of the assembly


[Bug c++/66943] GCC warns of Unknown Pragma for OpenMP, even though it support it.

2015-07-20 Thread manu at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66943

Manuel López-Ibáñez manu at gcc dot gnu.org changed:

   What|Removed |Added

 CC||manu at gcc dot gnu.org

--- Comment #5 from Manuel López-Ibáñez manu at gcc dot gnu.org ---
(In reply to Jeffrey Walton from comment #4)
> (For what its worth, I understand the compiler writers are always right.
> They are demi-gods in my little corner of the universe :)

You can also be a compiler writer:
https://gcc.gnu.org/wiki/GettingStarted#Basics:_Contributing_to_GCC_in_10_easy_steps

For what it's worth, I understand Andrew's point that without -fopenmp the
#pragmas are effectively ignored, thus the warning seems useful. Perhaps a
specific -Wopenmp-pragmas warning would be more useful, one that says:

 warning: ignoring '#pragma omp parallel' without '-fopenmp' [-Wopenmp-pragmas]

But it seems more important to fix PR53431, if someone has time for that.

The zero-column :0 in the diagnostic is also a bug.

[Bug preprocessor/66932] Preprocessor includes wrong header file

2015-07-20 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66932

--- Comment #4 from Andrew Pinski pinskia at gcc dot gnu.org ---
An empty : causes the current directory to be added.


[Bug c++/53431] C++ preprocessor ignores #pragma GCC diagnostic

2015-07-20 Thread manu at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53431

--- Comment #12 from Manuel López-Ibáñez manu at gcc dot gnu.org ---
(In reply to Allan Chandler from comment #11)
> Now you've done it. This was reported over three years ago and now it's
> affected someone on Stack Overflow. You guys are in for it now :-)

Unfortunately, the C/C++ FEs in GCC have very few developers relative to
their importance and the amount of work they require. There is a patch in
comment #10, but it requires some additional work for which I do not have
enough free time.
If you or someone else has some free time to finish this work, this is how I
would proceed:

1. Try to figure out why the preprocessor removes the pragmas (and not other
#-directives)
2. If you cannot figure it out, ask in gcc@ with explicit CC to C/C++/libcpp
maintainers (see MAINTAINERS file).
3. Complete the patch, bootstrap and regression test, add a ChangeLog, submit
to gcc-patches and ping until it is approved.

More details:
https://gcc.gnu.org/wiki/GettingStarted#Basics:_Contributing_to_GCC_in_10_easy_steps

Otherwise, given that this hasn't been fixed in more than 4 years (see
PR48914), it seems likely that active developers have higher priority things to
work on and it will remain unfixed until some new volunteer steps up to the
task.

If/When I have a little free time to work on GCC, there are at least a couple
of other bugs I would rather fix before this one.

Re: [GSoC] Patches for shared_ptr array and polymorphic_allocator

2015-07-20 Thread Jonathan Wakely

On 18/07/15 00:02 -0700, Tim Shen wrote:

On Fri, Jul 17, 2015 at 7:16 PM, Fan You youfan.n...@gmail.com wrote:

Hi,

According to 
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4335.html#memory.smartptr

Here is my implementation of

[8.2] Extend shared_ptr to support arrays


Please don't resend the shared_ptr patch, since it's already tracked
in another thread.


Right, we can keep the two pieces of work entirely separate, I think.


[8.3] Type-Erased allocator


Please send a working patch with tests and (probably with Makefile.am changes).


Yes, the code looks really good but we need tests to know it works.

Tim's review is great, I will just add a few more comments.


#ifndef _GLIBCXX_MEMORY_RESOURCE
#define _GLIBCXX_MEMORY_RESOURCE 1


Please make this _GLIBCXX_EXPERIMENTAL_MEMORY_RESOURCE so that it
doesn't need to be renamed if the header is added to a later standard
and we have both memory_resource and experimental/memory_resource.


// Decleartion
 class memory_resource;


I don't think we need this comment, especially spelled wrong :-)


L70:
   static std::atomic<memory_resource*> s_default_resource;

naming: _S_default_resource.


Similarly, dont_care_type is not a reserved name, so the header could
not be used in this legal program:

#define dont_care_type oops this isn't going to compile
#include <experimental/memory_resource>

However, I think the dont_care_type and __constructor_helper should go
away completely, see below.

Also, the TS says "The name resource_adaptor_imp is for exposition
only and is not normative", which means we cannot use that name,
because users could define it as a macro. I suggest changing it to
__resource_adaptor_impl or similar.



L43:
   virtual ~memory_resource() { }

Please break the line after virtual/return type. This also applies for
other places in the patch.


Not essential for 'virtual' or generally for inline functions that fit
on a single line, so this is OK to leave as it is.


Consider using a more readable helper name, like
__uses_allocator_construction_helper and document it.


I don't think the helper type is needed at all. It is performing the
same job as the helper types in bits/uses_alloc.h, which I see get
used, but not correctly.

The point of the __uses_alloc0, __uses_alloc1 and __uses_alloc2 tag
types is that you overload on them to perform the correct type of
initialization according to the tag type.

The second construct() overload:

 // Specializations for pair using piecewise construction
 template <typename _Tp1, typename _Tp2,
           typename... _Args1, typename... _Args2>
   void construct(pair<_Tp1, _Tp2>* __p, piecewise_construct_t,
                  tuple<_Args1...> __x,
                  tuple<_Args2...> __y)

should not use __constructor_helper at all, because you never need to
pass an allocator to std::pair, instead you (conditionally) pass an
allocator to its members, but that's done by _M_construct_p(). So you
should not be using __constructor_helper for this overload. Just
construct a std::pair directly.

And the first construct() overload:

 template <typename _Tp1, typename... _Args> //used here
   void construct(_Tp1* __p, _Args&&... __args)
   {
     using _Ctor_imp = __constructor_helper<_Tp1>;
     ::new ((void*)__p) _Ctor_imp(allocator_arg,
                                  this->resource(),
                                  std::forward<_Args>(__args)...);
   }

also doesn't need __constructor_helper if you follow the same design
as used in scoped_allocator and dispatch to another function
(called _M_construct in std::scoped_allocator_adaptor) using the
appropriate __uses_alloc tag type.

Implementing uses-allocator construction that way is consistent with
our existing code and will work for this type, which I think would
fail with your __constructor_helper because only polymorphic_allocator
can call its constructor:

 class Unfriendly
 {
 private:
   Unfriendly() = default;

   template<typename T> friend class polymorphic_allocator;
 };
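
As a rough illustration of the tag-dispatch design described above, here is a
standalone sketch. It is not the libstdc++ implementation: no_alloc_tag,
leading_alloc_tag, trailing_alloc_tag, select_tag, construct_impl and
uses_allocator_construct are all hypothetical names standing in for the
__uses_alloc0/1/2 machinery in bits/uses_alloc.h. The point is only that each
kind of initialization gets its own overload, selected by the tag type, so no
wrapper type around the constructed object is needed.

```cpp
#include <memory>
#include <new>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical tag types, one per form of uses-allocator construction.
struct no_alloc_tag {};        // T does not use the allocator
struct leading_alloc_tag {};   // T(allocator_arg, alloc, args...)
struct trailing_alloc_tag {};  // T(args..., alloc)

// Pick the tag for constructing T from Args with allocator Alloc.
template<typename T, typename Alloc, typename... Args>
using select_tag = std::conditional_t<
  !std::uses_allocator<T, Alloc>::value,
  no_alloc_tag,
  std::conditional_t<
    std::is_constructible<T, std::allocator_arg_t,
                          const Alloc&, Args...>::value,
    leading_alloc_tag,
    trailing_alloc_tag>>;

// One overload per tag; each performs exactly one kind of initialization.
template<typename T, typename Alloc, typename... Args>
void construct_impl(no_alloc_tag, T* p, const Alloc&, Args&&... args)
{ ::new (static_cast<void*>(p)) T(std::forward<Args>(args)...); }

template<typename T, typename Alloc, typename... Args>
void construct_impl(leading_alloc_tag, T* p, const Alloc& a, Args&&... args)
{
  ::new (static_cast<void*>(p)) T(std::allocator_arg, a,
                                  std::forward<Args>(args)...);
}

template<typename T, typename Alloc, typename... Args>
void construct_impl(trailing_alloc_tag, T* p, const Alloc& a, Args&&... args)
{ ::new (static_cast<void*>(p)) T(std::forward<Args>(args)..., a); }

// Entry point: compute the tag once and dispatch on it.
template<typename T, typename Alloc, typename... Args>
void uses_allocator_construct(T* p, const Alloc& a, Args&&... args)
{
  construct_impl(select_tag<T, Alloc, Args...>{}, p, a,
                 std::forward<Args>(args)...);
}
```

In the allocator itself these would be member functions, so the placement new
is written inside polymorphic_allocator and a friend declaration like the one
in Unfriendly applies to it directly.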

Also in regard to the __uses_allocX tag types, these functions:

 template<typename... _Args>
   decltype(auto)
   _M_construct_p(__uses_alloc1_, tuple<_Args...>& __t)
   { return tuple_cat(make_tuple(allocator_arg, this->resources()),
                      std::move(__t)); }

 template<typename... _Args>
   decltype(auto)
   _M_construct_p(__uses_alloc2_, tuple<_Args...>& __t)
   { return tuple_cat(std::move(__t), make_tuple(this->resources())); }

could use the _M_a member of the tag type to get the resource, instead
of this-resources(). That will actually compile, because the member
function is called resource() not resources() (so I assume these
functions have never been tested :-)

All the calls to tuple_cat and make_tuple must be qualified with std::
so that they do not use ADL.



L73:
 bool operator==(const memory_resource& __a,
                 const memory_resource& __b) noexcept
 { return &__a == &__b || 

[Bug fortran/64986] class_to_type_4.f90: valgrind error: Invalid read/write of size 8

2015-07-20 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64986

--- Comment #10 from Uroš Bizjak ubizjak at gmail dot com ---
(In reply to Mikael Morin from comment #9)
 The components are deallocated after the containing object.
 Draft patch:

Yes, this fixes the testsuite failure for me.

[Bug libfortran/66936] io/unix.c gratuitously uses S_IRWXG and S_IRWXO on the basis that umask() is available

2015-07-20 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66936

--- Comment #9 from Andrew Pinski pinskia at gcc dot gnu.org ---
(In reply to Keith Marshall from comment #8)
 (In reply to kargl from comment #7)
  So add
  
  #define S_IRWXG 0
  #define S_IRWXO 0
  
  to the header file wherever mingw defines the available mask bits
  for umask(3).  The bug is clearly in mingw where it gratuitously maps
  umask() to _umask() without properly adding the mappings for the
  mode_t argument bits of umask(3).
 
 Absolutely not!  Those bits are utterly irrelevant for the windows (MinGW)
 platform; to add them would do nothing more than create confusion.  The
 mask bits for umask(), on the windows platform are S_IREAD | S_IWRITE; those
 are the only mask bits YOUR code should be passing to non-POSIX
 umask(5axxx3be.aspx).
 
 This is NOT a MinGW bug; it's a GCC bug, and that's where it should be
 fixed.  Until you do fix it, I have my work-around, (which I'm perfectly
 willing to publish in MinGW forked source for GCC, prominently commented as
 a ghastly hack to circumvent a gross upstream GCC bug).


Well, it is a libgfortran bug, yes.  What we could do is add to io/unix.c:
#if MINGW && !defined(S_IRWXG)
#define S_IRWXG 0
#endif
#if MINGW && !defined(S_IRWXO)
#define S_IRWXO 0
#endif

And that will allow it to work correctly.


Re: [PATCH][RTL-ifcvt] Make non-conditional execution if-conversion more aggressive

2015-07-20 Thread Kyrill Tkachov

Ping.

https://gcc.gnu.org/ml/gcc-patches/2015-07/msg01047.html

The go testsuite passes for me on x86_64-unknown-linux-gnu.
A third data point on testing would be appreciated...

Thanks,
Kyrill

On 13/07/15 15:03, Kyrill Tkachov wrote:

Hi Bernhard,

On 13/07/15 10:45, Kyrill Tkachov wrote:

PS: no -mbranch-cost and, a tad more seriously, no --param branch-cost either ;)
PPS: attached meant to illustrate comments above. Untested.

Thanks a lot! This is all very helpful.
I'll respin the patch.

Here it is. I've expanded the comments in the functions you mentioned,
moved the tests to gcc.dg and enabled them for aarch64 and x86 and changed
the types of the costs used to unsigned int.


Bootstrapped on aarch64 and x86_64.
The go testsuite passes on x86_64-unknown-linux-gnu for me...

Thanks,
Kyrill

2015-07-13  Kyrylo Tkachov  kyrylo.tkac...@arm.com

* ifcvt.c (struct noce_if_info): Add then_simple, else_simple,
then_cost, else_cost fields.  Change branch_cost field to unsigned int.
(end_ifcvt_sequence): Call set_used_flags on each insn in the
sequence.
(noce_simple_bbs): New function.
(noce_try_move): Bail if basic blocks are not simple.
(noce_try_store_flag): Likewise.
(noce_try_store_flag_constants): Likewise.
(noce_try_addcc): Likewise.
(noce_try_store_flag_mask): Likewise.
(noce_try_cmove): Likewise.
(noce_try_minmax): Likewise.
(noce_try_abs): Likewise.
(noce_try_sign_mask): Likewise.
(noce_try_bitop): Likewise.
(bbs_ok_for_cmove_arith): New function.
(noce_emit_all_but_last): Likewise.
(noce_emit_insn): Likewise.
(noce_emit_bb): Likewise.
(noce_try_cmove_arith): Handle non-simple basic blocks.
(insn_valid_noce_process_p): New function.
(bb_valid_for_noce_process_p): Likewise.
(noce_process_if_block): Allow non-simple basic blocks
where appropriate.


2015-07-13  Kyrylo Tkachov  kyrylo.tkac...@arm.com

* gcc.dg/ifcvt-1.c: New test.
* gcc.dg/ifcvt-2.c: Likewise.
* gcc.dg/ifcvt-3.c: Likewise.





Thanks,
Kyrill



cheers,




[Bug target/65952] [AArch64] Will not vectorize storing induction of pointer addresses for LP64

2015-07-20 Thread vekumar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65952

vekumar at gcc dot gnu.org changed:

   What|Removed |Added

 CC||vekumar at gcc dot gnu.org

--- Comment #9 from vekumar at gcc dot gnu.org ---
As per Richard's suggestion, I added a pattern in vector recog.
This seems to vectorize the test case in this PR.

However I need some help on the following  

(1) How do I check the shift amount while also taking care of type/signedness?
There could be different shift amounts allowed in the target architecture when
looking for power-of-2 constants.

(2) Do I need to check whether the target architecture supports vectorized
shifts before converting the pattern?
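
To make question (1) concrete, here is a minimal standalone sketch of the
arithmetic behind the rewrite, outside of GCC. The names is_pow2, log2_pow2 and
try_mult_as_shift are hypothetical and are not vectorizer APIs; the sketch
assumes a 32-bit unsigned element type so that the shift is well defined, and it
rejects shift counts that are not smaller than the element precision.

```cpp
#include <cassert>
#include <cstdint>

// True when c has exactly one bit set, i.e. c is a power of two.
constexpr bool is_pow2(std::uint64_t c)
{
  return c != 0 && (c & (c - 1)) == 0;
}

// log2 of a power of two: the shift count that replaces the multiply.
constexpr unsigned log2_pow2(std::uint64_t c)
{
  unsigned n = 0;
  while (c > 1)
    {
      c >>= 1;
      ++n;
    }
  return n;
}

// Try to compute x * c as x << k for a 32-bit unsigned element.
// Returns false (leaving result untouched) when c is not a power of two
// or when the shift count would not fit the element precision.
inline bool try_mult_as_shift(std::uint32_t x, std::uint64_t c,
                              std::uint32_t& result)
{
  const unsigned prec = 32;  // precision of the element type (assumption)
  if (!is_pow2(c))
    return false;
  const unsigned k = log2_pow2(c);
  if (k >= prec)
    return false;
  result = x << k;
  assert(result == static_cast<std::uint32_t>(x * c));
  return true;
}
```

For signed element types the same check on the shift count applies, but
overflow behaviour differs between a multiply and a shift, which is presumably
why the signedness test is still an open question in the patch.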

---Patch---
diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
index f034635..995c9b2 100644
--- a/gcc/tree-vect-patterns.c
+++ b/gcc/tree-vect-patterns.c
@@ -76,6 +76,10 @@ static gimple vect_recog_vector_vector_shift_pattern
(vec<gimple> *,
  tree *, tree *);
 static gimple vect_recog_divmod_pattern (vec<gimple> *,
 tree *, tree *);
+
+static gimple vect_recog_multconst_pattern (vec<gimple> *,
+ tree *, tree *);
+
 static gimple vect_recog_mixed_size_cond_pattern (vec<gimple> *,
  tree *, tree *);
 static gimple vect_recog_bool_pattern (vec<gimple> *, tree *, tree *);
@@ -90,6 +94,7 @@ static vect_recog_func_ptr
vect_vect_recog_func_ptrs[NUM_PATTERNS] = {
vect_recog_rotate_pattern,
vect_recog_vector_vector_shift_pattern,
vect_recog_divmod_pattern,
+vect_recog_multconst_pattern,
vect_recog_mixed_size_cond_pattern,
vect_recog_bool_pattern};

@@ -2147,6 +2152,90 @@ vect_recog_vector_vector_shift_pattern (vec<gimple>
*stmts,
   return pattern_stmt;
 }

+static gimple
+vect_recog_multconst_pattern (vec<gimple> *stmts,
+   tree *type_in, tree *type_out)
+{
+  gimple last_stmt = stmts->pop ();
+  tree oprnd0, oprnd1, vectype, itype, cond;
+  gimple pattern_stmt, def_stmt;
+  enum tree_code rhs_code;
+  stmt_vec_info stmt_vinfo = vinfo_for_stmt (last_stmt);
+  loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_vinfo);
+  bb_vec_info bb_vinfo = STMT_VINFO_BB_VINFO (stmt_vinfo);
+  optab optab;
+  tree q;
+  int dummy_int, prec;
+  stmt_vec_info def_stmt_vinfo;
+
+  if (!is_gimple_assign (last_stmt))
+return NULL;
+
+  rhs_code = gimple_assign_rhs_code (last_stmt);
+  switch (rhs_code)
+{
+case MULT_EXPR:
+  break;
+default:
+  return NULL;
+}
+
+  if (STMT_VINFO_IN_PATTERN_P (stmt_vinfo))
+return NULL;
+
+  oprnd0 = gimple_assign_rhs1 (last_stmt);
+  oprnd1 = gimple_assign_rhs2 (last_stmt);
+  itype = TREE_TYPE (oprnd0);
+  if (TREE_CODE (oprnd0) != SSA_NAME
+  || TREE_CODE (oprnd1) != INTEGER_CST
+  || TREE_CODE (itype) != INTEGER_TYPE
+  || TYPE_PRECISION (itype) != GET_MODE_PRECISION (TYPE_MODE (itype)))
+return NULL;
+  vectype = get_vectype_for_scalar_type (itype);
+  if (vectype == NULL_TREE)
+return NULL;
+
+  /* If the target can handle vectorized division or modulo natively,
+ don't attempt to optimize this.  */
+  optab = optab_for_tree_code (rhs_code, vectype, optab_default);
+  if (optab != unknown_optab)
+{
+  machine_mode vec_mode = TYPE_MODE (vectype);
+  int icode = (int) optab_handler (optab, vec_mode);
+  if (icode != CODE_FOR_nothing)
+return NULL;
+}
+
+  prec = TYPE_PRECISION (itype);
+  if (integer_pow2p (oprnd1))
+{
+  /*if (TYPE_UNSIGNED (itype) || tree_int_cst_sgn (oprnd1) != 1)
+return NULL;
+ */
+
+  /* Pattern detected.  */
+  if (dump_enabled_p ())
+dump_printf_loc (MSG_NOTE, vect_location,
+ "vect_recog_multconst_pattern: detected:\n");
+
+  tree shift;
+
+  shift = build_int_cst (itype, tree_log2 (oprnd1));
+  pattern_stmt
+= gimple_build_assign (vect_recog_temp_ssa_var (itype, NULL),
+   LSHIFT_EXPR, oprnd0, shift);
+  if (dump_enabled_p ())
+dump_gimple_stmt_loc (MSG_NOTE, vect_location, TDF_SLIM, pattern_stmt,
+  0);
+
+ stmts->safe_push (last_stmt);
+
+  *type_in = vectype;
+  *type_out = vectype;
+  return pattern_stmt;
+   } 
+return NULL;
+}
 /* Detect a signed division by a constant that wouldn't be
otherwise vectorized:

diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index 48c1f8d..833fe4b 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -1131,7 +1131,7 @@ extern void vect_slp_transform_bb (basic_block);
Additional pattern recognition functions can (and will) be added
in the future.  */
 typedef gimple (* vect_recog_func_ptr) (vec<gimple> *, tree 

[Bug c++/66943] GCC warns of Unknown Pragma for OpenMP, even though it support it.

2015-07-20 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66943

--- Comment #3 from Andrew Pinski pinskia at gcc dot gnu.org ---
The warning is correct though, maybe it should add a message about needing
-fopenmp to have them to be known.


[Bug inline-asm/49611] Inline asm should support input/output of flags

2015-07-20 Thread gccbugzilla at limegreensocks dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49611

--- Comment #16 from David gccbugzilla at limegreensocks dot com ---
I've tried it now and it seems to do good things.  This code:

int main(int argc, char *argv[])
{
   char x;

   asm("setc" : "=@ccc"(x));

   if (!x)
  return 6;
   else
  return argc;
}

produces this output (-O3):

movl    $6, %eax
/APP
 # 6 "./r.cpp" 1
setc
 # 0 "" 2
/NO_APP
cmovc   %ebx, %eax
addq    $32, %rsp
popq    %rbx
ret

Although a minor variation (change return argc to return 7) ends up doing
setc+cmpb, so it's not a perfect solution.

Still, if I were Richard, I'd be closing this bug.  If someone has optimization
issues with his solution, that's a new bug.


[Bug c++/66943] GCC warns of Unknown Pragma for OpenMP, even though it support it.

2015-07-20 Thread noloader at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66943

--- Comment #4 from Jeffrey Walton noloader at gmail dot com ---
(In reply to Andrew Pinski from comment #3)
 The warning is correct though, maybe it should add a message about needing
 -fopenmp to have them to be known.

From a dumb user's point of view (folks like me): that behavior squashes a lot
of the benefit of cross-platform sources and using parallel tasks.

I think the OpenMP folks feel about the same. From
http://openmp.org/wp/openmp-specifications/ and
http://www.openmp.org/mp-documents/OpenMP4.0.0.pdf:

Each directive starts with #pragma omp. The remainder of the
directive follows the conventions of the C and C++ standards
for compiler directives. In particular, white space can be
used before and after the #, and sometimes white space must
be used to separate the words in a directive. Preprocessing
tokens following the #pragma omp are subject to macro
replacement. 

There's no expectation that a conforming compiler will issue a warning for
#pragma omp when -fopenmp is not in effect. In fact, I can't find authority to
issue a warning from a conforming compiler.

I think it would be much better to always accept `#pragma omp` *if* the
compiler supports OpenMP, regardless of the status of `-fopenmp`. Conversely,
if the compiler does not support OpenMP, then always issue an unknown pragma
warning (modulo expected behavior of the diagnostic).
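
Until the behaviour changes, one workaround is to guard the directive with the
_OPENMP macro, which a conforming compiler defines only when OpenMP is enabled.
The following is a minimal sketch of that idea; sum_of_squares is a hypothetical
function used only for illustration. It produces the same result with or
without -fopenmp, and the pragma is never seen when OpenMP is off, so
-Wunknown-pragmas has nothing to report.

```cpp
#include <vector>

// Sum of squares of the elements of v.  With -fopenmp the loop is
// parallelized; without it the guarded pragma is removed by the
// preprocessor and the loop runs serially.
long sum_of_squares(const std::vector<int>& v)
{
  long total = 0;
  const long n = static_cast<long>(v.size());
#ifdef _OPENMP
#pragma omp parallel for reduction(+ : total)
#endif
  for (long i = 0; i < n; ++i)
    total += static_cast<long>(v[i]) * v[i];
  return total;
}
```

The cost is an extra #ifdef around every directive, which is exactly the kind
of source clutter the request above is trying to avoid.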

Speaking from experience, OpenBSD and Cygwin get into an odd area where they
advertise support for OpenMP by accepting -fopenmp and defining _OPENMP, but
then fail to compile the program. But I think that's a different issue.

(For what it's worth, I understand the compiler writers are always right. They
are demi-gods in my little corner of the universe :)


magic 8 constant (bits / byte maybe?) in GCC JIT memento_of_new_rvalue_from_const <long>::get_wide_int

2015-07-20 Thread Basile Starynkevitch
Hello All,

In GCC trunk svn 225726 the file gcc/jit/jit-recording.c contains the
following code near line 4168:


/* The get_wide_int specialization for long.  */

template <>
bool
memento_of_new_rvalue_from_const <long>::get_wide_int (wide_int *out) const
{
  *out = wi::shwi (m_value, sizeof (m_value) * 8);
  return true;
}


I am guessing that the magic constant 8 above (i.e. line 4170) is the
bits per char on the host machine, but I am not entirely sure of that.
(Maybe it is the bits per char on the target, but I guess not)

Is my understanding correct?

Do we care about running GCCJIT on weird host machines where char is
not an 8-bit byte?

Do we care about running GCCJIT for cross-compilation to weird target
machines where char is not an 8-bit byte?

Out of curiosity, what are today the systems on which GCC is hosted,
or which it targets, where char is not an 8-bit byte?
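
If the 8 is indeed meant as the host's bits per char, a sketch of how it could
be spelled without the magic number is below. This only illustrates the
standard CHAR_BIT macro; host_bit_width is a hypothetical helper, not something
that exists in jit-recording.c, and it says nothing about the target's byte
size.

```cpp
#include <climits>
#include <cstddef>

// Width in bits of the host object representation of T.  CHAR_BIT is
// the number of bits in a host char; the standard guarantees it is at
// least 8.  The result counts any padding bits in T as well.
template <typename T>
constexpr std::size_t host_bit_width()
{
  return sizeof(T) * CHAR_BIT;
}

static_assert(host_bit_width<char>() == CHAR_BIT,
              "sizeof (char) is 1 by definition");
```

With such a helper the call above would read
wi::shwi (m_value, host_bit_width<long> ()), assuming the precision argument
is meant to describe the host long.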

Regards.

-- 
Basile Starynkevitch  http://starynkevitch.net/Basile/
France


[Bug inline-asm/49611] Inline asm should support input/output of flags

2015-07-20 Thread gcc.hall at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49611

--- Comment #17 from Jeremy gcc.hall at gmail dot com ---
Did you mean stc rather than setc ???

But yes, it looks like it's working well.

On 20 July 2015 at 10:05, gccbugzilla at limegreensocks dot com 
gcc-bugzi...@gcc.gnu.org wrote:

 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49611

 --- Comment #16 from David gccbugzilla at limegreensocks dot com ---
 I've tried it now and it seems to do good things.  This code:

 int main(int argc, char *argv[])
 {
char x;

asm("setc" : "=@ccc"(x));

if (!x)
   return 6;
else
   return argc;
 }

 produces this output (-O3):

 movl    $6, %eax
 /APP
  # 6 "./r.cpp" 1
 setc
  # 0 "" 2
 /NO_APP
 cmovc   %ebx, %eax
 addq    $32, %rsp
 popq    %rbx
 ret

 Although a minor variation (change return argc to return 7) ends up
 doing
 setc+cmpb, so it's not a perfect solution.

 Still, if I were Richard, I'd be closing this bug.  If someone has
 optimization
 issues with his solution, that's a new bug.




[Bug target/65952] [AArch64] Will not vectorize storing induction of pointer addresses for LP64

2015-07-20 Thread vekumar at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65952

--- Comment #10 from vekumar at gcc dot gnu.org ---
With the patch I get 
loop:
adrp    x0, array
ldr q1, .LC0
ldr q2, .LC1
adrp    x1, ptrs
add x1, x1, :lo12:ptrs
ldr x0, [x0, #:lo12:array]
dup v0.2d, x0
add v1.2d, v0.2d, v1.2d <== vectorized
add v0.2d, v0.2d, v2.2d <== vectorized
str q1, [x1]
str q0, [x1, 16]
ret
.size   loop, .-loop
.align  4
.LC0:
.xword  0
.xword  16
.align  4
.LC1:
.xword  32
.xword  48


[Bug target/63304] Aarch64 pc-relative load offset out of range

2015-07-20 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

Andrew Pinski pinskia at gcc dot gnu.org changed:

   What|Removed |Added

   Assignee|pinskia at gcc dot gnu.org |unassigned at gcc dot 
gnu.org

--- Comment #15 from Andrew Pinski pinskia at gcc dot gnu.org ---
Not working on this any time soon.  But someone from ARM really should look
into fixing this as it blocks standard C/C++ code from HPC and distros.


Re: [gomp4] remove kernel-specific launch

2015-07-20 Thread Tom de Vries

On 19/07/15 23:08, Nathan Sidwell wrote:

On 07/19/15 16:30, Thomas Schwinge wrote:


 gcc/tree-parloops.c:/* Remove GOACC_kernels.  */
 libgomp/libgomp.map:GOACC_kernels;
 libgomp/libgomp_g.h:extern void GOACC_kernels (int, void (*)
(void *), size_t,


I fixed all but the parloops comment.  That comment didn't really make
sense to me -- it seems to be doing something with the pragma not the
call.   Perhaps Tom could correct/clarify it?



Committed as attached.


Does it make sense then to rename GOACC_kernels_internal to
GOACC_kernels?


I  agree with Tom.  But perhaps it should be an internal fn? IIUC those
are for pseudo-funcs that should be converted to something else before
the end of compilation.


Turning it into an internal fn will make it harder to convert a 
GOACC_kernels_internal call into a GOACC_parallel call, which we're 
doing here in omp-low.c:

...
  tree fndecl = builtin_decl_explicit (BUILT_IN_GOACC_PARALLEL);
  gimple_call_set_fndecl (call, fndecl);
  gimple_call_set_fntype (call, TREE_TYPE (fndecl));
  gimple_call_reset_alias_info (call);
 ...

Thanks,
- Tom

Update create_parallel_loop for remove GOACC_kernels

2015-07-20  Tom de Vries  t...@codesourcery.com

	* tree-parloops.c (create_parallel_loop): Update comments for removal of
	GOACC_kernels.  Rename goacc_kernels variable into
	goacc_kernels_internal.
---
 gcc/tree-parloops.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c
index 149c336..7111f93 100644
--- a/gcc/tree-parloops.c
+++ b/gcc/tree-parloops.c
@@ -2045,11 +2045,12 @@ create_parallel_loop (struct loop *loop, tree loop_fn, tree data,
 }
   else
 {
-  /* Create oacc parallel pragma based on oacc kernels pragma.  */
+  /* Create oacc parallel pragma based on oacc kernels pragma and
+	 GOAC_kernels_internal call.  */
   gomp_target *kernels = as_a <gomp_target *> (gsi_stmt (gsi));
 
   gsi_prev (gsi);
-  gcall *goacc_kernels = as_a <gcall *> (gsi_stmt (gsi));
+  gcall *goacc_kernels_internal = as_a <gcall *> (gsi_stmt (gsi));
 
   tree clauses = gimple_omp_target_clauses (kernels);
   /* FIXME: We need a more intelligent mapping onto vector, gangs,
@@ -2070,7 +2071,8 @@ create_parallel_loop (struct loop *loop, tree loop_fn, tree data,
   gimple_omp_target_set_child_fn (stmt, child_fn);
   tree data_arg = gimple_omp_target_data_arg (kernels);
   gimple_omp_target_set_data_arg (stmt, data_arg);
-  tree ganglocal_size = gimple_call_arg (goacc_kernels, /* TODO */ 9);
+  tree ganglocal_size
+	= gimple_call_arg (goacc_kernels_internal, /* TODO */ 9);
   gimple_omp_target_set_ganglocal_size (stmt, ganglocal_size);
 
   gimple_set_location (stmt, loc);
@@ -2085,7 +2087,7 @@ create_parallel_loop (struct loop *loop, tree loop_fn, tree data,
 	/* Insert pragma acc parallel.  */
 	gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
 
-	/* Remove GOACC_kernels.  */
+	/* Remove GOACC_kernels_internal call.  */
 	replace_uses_by (gimple_vdef (gsi_stmt (gsi2)),
 			 gimple_vuse (gsi_stmt (gsi2)));
 	gsi_remove (gsi2, true);
-- 
1.9.1



Re: [CHKP, GCC 5] Port a set of stability chkp patches to gcc-5-branch

2015-07-20 Thread Ilya Enkovich
Ping

2015-06-19 17:10 GMT+03:00 Ilya Enkovich enkovich@gmail.com:
 Hi,

 There was a set of stability fixes (mostly different ICEs) for Pointer Bounds 
 Checker done in GCC 6.  But only few of them were approved to be ported to 
 GCC 5.  Will it be OK to port other chkp specific stability fixes to GCC 5?  
 Here is a list of patches:

  https://gcc.gnu.org/ml/gcc-patches/2015-03/msg00995.html
  https://gcc.gnu.org/ml/gcc-patches/2015-05/msg01067.html
  https://gcc.gnu.org/ml/gcc-patches/2015-05/msg01065.html
  https://gcc.gnu.org/ml/gcc-patches/2015-05/msg01386.html
  https://gcc.gnu.org/ml/gcc-patches/2015-06/msg01248.html
  https://gcc.gnu.org/ml/gcc-patches/2015-06/msg01252.html
  https://gcc.gnu.org/ml/gcc-patches/2015-06/msg01253.html
  https://gcc.gnu.org/ml/gcc-patches/2015-06/msg01319.html

 Thanks,
 Ilya


[Bug c++/66943] GCC warns of Unknown Pragma for OpenMP, even though it support it.

2015-07-20 Thread noloader at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66943

--- Comment #8 from Jeffrey Walton noloader at gmail dot com ---
(In reply to Jeffrey Walton from comment #6)
  Maybe it could be in effect with `-Wextra`?
 
 That would just move the problem somewhere else instead of fixing it. Many
 people do compile with -Wall -Wextra (like GCC itself).

Yeah, but it works for me :) But more seriously, I understand what you are
saying. When I have the luxury of a new project, I use -Wall -Wextra
-Wconversion.

 
  Enabling Unknown Pragma warnings for #pragma omp under -Wall when the
  compiler supports it, coupled with the inability to manage warnings with
  'pragma GCC diagnostic` (Bug #53431), means we just turned OFF -Wall. We are
  moving in the wrong direction :(
 
 You could always use -Wall -Wno-unknown-pragmas, but yes, fixing PR53431
 seems the key here. I hope someone finds time to do that before GCC 6 closes
 for development.

-Wno-unknown-pragmas is just one of many we need. Others appear to include
-Wunused-variable, -Wunused-value and -Wunused-function. And we are no longer
managing the warnings in the source code through a GCC diagnostic block;
rather, we are polluting the command line.

We produce a library, so we not only pollute our command line, we pollute the
user's command line. That's after the user complains about it because GCC
diagnostic blocks just don't work.

Does GCC have a Bounty program? If so, I'd be happy to make a donation. I'd
even solicit for grants because `-Wall` and managing warnings is *that*
important. I feel awful that we yanked it for GCC.

On the good side, our sources are cross-compiler and cross-platform, so we are
effectively using -Wall for MSVC, Clang and ICC. But others don't have that
luxury. For example, the Asterisk project uses trampolines, so the code does
not compile under Clang (and it could never compile under MSVC).


[Bug target/63304] Aarch64 pc-relative load offset out of range

2015-07-20 Thread jiwang at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

Jiong Wang jiwang at gcc dot gnu.org changed:

   What|Removed |Added

 CC||jiwang at gcc dot gnu.org

--- Comment #16 from Jiong Wang jiwang at gcc dot gnu.org ---
I had a quick look at this; basic ideas to fix it:

  * generate a special pattern which initializes the literal pool start address.
  * implement TARGET_MACHINE_DEPENDENT_REORG to calculate whether the
pc-relative literal load is within range.
  * output the final instruction sequence which initializes the literal pool
start address based on the result of the reorg pass analysis. Use movk/z,
adrp + add, or a single adr for different distances.
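
As a rough sketch of the last step, the choice between the three sequences
could be driven by the distance alone. The code below is standalone C++, not
GCC code: addr_seq and choose_address_sequence are hypothetical names, the
addresses are assumed to be known byte addresses below 2^63, and the ranges
used are the architectural ones (adr reaches +/-1 MiB from the instruction,
adrp reaches +/-4 GiB in units of 4 KiB pages).

```cpp
#include <cstdint>

// Which instruction sequence materializes the literal pool address.
enum class addr_seq { adr, adrp_add, mov_wide };

// pc: address of the instruction; target: literal pool start address.
inline addr_seq choose_address_sequence(std::uint64_t pc, std::uint64_t target)
{
  // adr has a signed 21-bit byte offset.
  const std::int64_t range = std::int64_t{1} << 20;

  const std::int64_t delta
    = static_cast<std::int64_t>(target) - static_cast<std::int64_t>(pc);
  if (delta >= -range && delta < range)
    return addr_seq::adr;

  // adrp has a signed 21-bit offset counted in 4 KiB pages, relative
  // to the page containing the instruction; add supplies the low 12 bits.
  const std::int64_t page_delta
    = static_cast<std::int64_t>(target >> 12)
      - static_cast<std::int64_t>(pc >> 12);
  if (page_delta >= -range && page_delta < range)
    return addr_seq::adrp_add;

  // Anything further away needs the full address built with movz/movk.
  return addr_seq::mov_wide;
}
```

In the compiler the distance would only be known after the reorg pass has laid
out the function, which is why the analysis is proposed for
TARGET_MACHINE_DEPENDENT_REORG.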


[Bug c++/66943] GCC warns of Unknown Pragma for OpenMP, even though it support it.

2015-07-20 Thread noloader at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66943

--- Comment #6 from Jeffrey Walton noloader at gmail dot com ---
(In reply to Manuel López-Ibáñez from comment #5)
 ...
 For what is worth, I understand the point by Andrew that without -fopenmp,
 the #pragmas are effectively ignored, thus the warning seems useful. Perhaps
 it would be more useful a specific -Wopenmp-pragmas  that says:
 

Maybe it could be in effect with `-Wextra`?

Enabling Unknown Pragma warnings for #pragma omp under -Wall when the compiler
supports it, coupled with the inability to manage warnings with 'pragma GCC
diagnostic` (Bug #53431), means we just turned OFF -Wall. We are moving in the
wrong direction :(

[Bug c++/66943] GCC warns of Unknown Pragma for OpenMP, even though it support it.

2015-07-20 Thread manu at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66943

--- Comment #7 from Manuel López-Ibáñez manu at gcc dot gnu.org ---
(In reply to Jeffrey Walton from comment #6)
 Maybe it could be in effect with `-Wextra`?

That would just move the problem somewhere else instead of fixing it. Many
people do compile with -Wall -Wextra (like GCC itself).

 Enabling Unknown Pragma warnings for #pragma omp under -Wall when the
 compiler supports it, coupled with the inability to manage warnings with
 'pragma GCC diagnostic` (Bug #53431), means we just turned OFF -Wall. We are
 moving in the wrong direction :(

You could always use -Wall -Wno-unknown-pragmas, but yes, fixing PR53431 seems
the key here. I hope someone finds time to do that before GCC 6 closes for
development.

Re: [PATCH, PR ipa/66566] Fix ICE in early_inliner: internal compiler error: in operator[]

2015-07-20 Thread Ilya Enkovich
Ping

2015-07-13 11:47 GMT+03:00 Ilya Enkovich enkovich@gmail.com:
 Ping

 2015-06-18 12:54 GMT+03:00 Ilya Enkovich enkovich@gmail.com:
 Hi,

 In early_inliner we recompute inline summaries for edges after
 optimize_inline_calls, but first check that the summary exists in case new
 edges appear.  It then calls inline_update_overall_summary, which also walks
 the edges' inline summaries but without that check, causing a segfault.
 This patch fixes it.  Bootstrapped and regtested for
 x86_64-unknown-linux-gnu.  Is it OK for trunk and gcc-5-branch?

 Thanks,
 Ilya
 --
 gcc/

 2015-06-18  Ilya Enkovich  enkovich@gmail.com

 PR ipa/66566
 * ipa-inline-analysis.c (estimate_calls_size_and_time): Check
 edge summary is available.

 gcc/testsuite/

 2015-06-18  Ilya Enkovich  enkovich@gmail.com

 PR ipa/66566
 * gcc.target/i386/mpx/pr66566.c: New test.


 diff --git a/gcc/ipa-inline-analysis.c b/gcc/ipa-inline-analysis.c
 index bbde855..e910ac5 100644
 --- a/gcc/ipa-inline-analysis.c
 +++ b/gcc/ipa-inline-analysis.c
 @@ -3122,6 +3122,9 @@ estimate_calls_size_and_time (struct cgraph_node 
 *node, int *size,
struct cgraph_edge *e;
for (e = node->callees; e; e = e->next_callee)
  {
 +  if (inline_edge_summary_vec.length () <= (unsigned) e->uid)
 +   continue;
 +
struct inline_edge_summary *es = inline_edge_summary (e);

/* Do not care about zero sized builtins.  */
 @@ -3153,6 +3156,9 @@ estimate_calls_size_and_time (struct cgraph_node 
 *node, int *size,
  }
for (e = node->indirect_calls; e; e = e->next_callee)
  {
 +  if (inline_edge_summary_vec.length () <= (unsigned) e->uid)
 +   continue;
 +
struct inline_edge_summary *es = inline_edge_summary (e);
if (!es->predicate
   || evaluate_predicate (es->predicate, possible_truths))
 diff --git a/gcc/testsuite/gcc.target/i386/mpx/pr66566.c 
 b/gcc/testsuite/gcc.target/i386/mpx/pr66566.c
 new file mode 100644
 index 000..a405c20
 --- /dev/null
 +++ b/gcc/testsuite/gcc.target/i386/mpx/pr66566.c
 @@ -0,0 +1,12 @@
 +/* { dg-do compile } */
 +/* { dg-options "-O2 -fcheck-pointer-bounds -mmpx" } */
 +
 +union jsval_layout
 +{
 +  void *asPtr;
 +};
 +union jsval_layout a;
 +union jsval_layout b;
 +union jsval_layout __inline__ fn1() { return b; }
 +
 +void fn2() { a = fn1(); }
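
The guard added by the patch follows a common pattern: a side table indexed by
an object's uid may be shorter than the largest uid when new objects were
created after the table was last filled, so each lookup has to be preceded by a
length check. Below is a minimal standalone sketch of that pattern; edge,
edge_summary and total_call_size are hypothetical stand-ins, not the GCC types
or functions.

```cpp
#include <cstddef>
#include <vector>

struct edge { int uid; };                     // stand-in for cgraph_edge
struct edge_summary { int call_stmt_size; };  // stand-in for inline_edge_summary

// Sum the recorded sizes of all edges, skipping edges that were created
// after the summary table was last computed and so have no entry yet.
int total_call_size(const std::vector<edge>& edges,
                    const std::vector<edge_summary>& summaries)
{
  int size = 0;
  for (const edge& e : edges)
    {
      if (e.uid < 0 || summaries.size() <= static_cast<std::size_t>(e.uid))
        continue;  // no summary for this edge; indexing would be out of range
      size += summaries[static_cast<std::size_t>(e.uid)].call_stmt_size;
    }
  return size;
}
```

Skipping such edges means they contribute nothing to the estimate until the
summaries are recomputed.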


[Bug fortran/66929] [6 regression] ICE with iso_varying_string

2015-07-20 Thread mikael at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66929

--- Comment #3 from Mikael Morin mikael at gcc dot gnu.org ---
Reduced test:

module iso_varying_string
  type, public :: varying_string
 character(LEN=1), dimension(:), allocatable :: chars
  end type varying_string
  interface operator(/=)
 module procedure op_ne_VS_CH
  end interface operator (/=)
  interface trim
 module procedure trim_
  end interface
contains
  elemental function op_ne_VS_CH (string_a, string_b) result (op_ne)
type(varying_string), intent(in) :: string_a
character(LEN=*), intent(in) :: string_b
logical  :: op_ne
  end function op_ne_VS_CH
  elemental function trim_ (string) result (trim_string)
type(varying_string), intent(in) :: string
type(varying_string) :: trim_string
  end function trim_
end module iso_varying_string
module syntax_rules
  use iso_varying_string, string_t => varying_string
contains
  subroutine set_rule_type_and_key ()
type(string_t) :: key
if (trim (key) /= "") then
end if
  end subroutine set_rule_type_and_key
end module syntax_rules


[Bug target/63304] Aarch64 pc-relative load offset out of range

2015-07-20 Thread wdijkstr at arm dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304

Wilco wdijkstr at arm dot com changed:

   What|Removed |Added

 CC||wdijkstr at arm dot com

--- Comment #17 from Wilco wdijkstr at arm dot com ---
Well there seem to be 2 ways to address this:

* If a function is huge, emit literals as const data. This enables the use of
anchors and sharing of literals across all functions in a compilation unit.

* Reserve a register in the adr/ldr literal patterns and add a 2-instruction
sequence using adrp when out of range. Ideally the register should only be
reserved if a function is huge.


Re: [PATCH, PR66846] Mark inner loop for fixup in parloops

2015-07-20 Thread Tom de Vries

On 16/07/15 12:15, Richard Biener wrote:

On Thu, Jul 16, 2015 at 11:39 AM, Tom de Vries tom_devr...@mentor.com wrote:

On 16/07/15 10:44, Richard Biener wrote:


On Wed, Jul 15, 2015 at 9:36 PM, Tom de Vries tom_devr...@mentor.com
wrote:


Hi,

I.

In OpenMP expansion of loops, we make some effort to create matching
loops in the loop state of the child function, for instance in
expand_omp_for_generic:
...
struct loop *outer_loop;
if (seq_loop)
  outer_loop = l0_bb->loop_father;
else
  {
    outer_loop = alloc_loop ();
    outer_loop->header = l0_bb;
    outer_loop->latch = l2_bb;
    add_loop (outer_loop, l0_bb->loop_father);
  }

if (!gimple_omp_for_combined_p (fd->for_stmt))
  {
    struct loop *loop = alloc_loop ();
    loop->header = l1_bb;
    /* The loop may have multiple latches.  */
    add_loop (loop, outer_loop);
  }
...

And if that doesn't work out, we try to mark the loop state for fixup, in
expand_omp_taskreg and expand_omp_target:
...
/* When the OMP expansion process cannot guarantee an up-to-date
   loop tree arrange for the child function to fixup loops.  */
if (loops_state_satisfies_p (LOOPS_NEED_FIXUP))
  child_cfun->x_current_loops->state |= LOOPS_NEED_FIXUP;
...

and expand_omp_for:
...
else
  /* If there isn't a continue then this is a degerate case where
 the introduction of abnormal edges during lowering will prevent
 original loops from being detected.  Fix that up.  */
  loops_state_set (LOOPS_NEED_FIXUP);
...

However, loops are fixed up anyway, because the first pass we execute with
the new child function is pass_fixup_cfg.

The new child function contains a function call to
__builtin_omp_get_num_threads, which is marked with ECF_CONST, so
execute_fixup_cfg marks the function for TODO_cleanup_cfg, and subsequently
the loops with LOOPS_NEED_FIXUP.


II.

This patch adds a verification that at the end of the omp-expand processing
of the child function, either the loop structure is ok, or marked for
fixup.

This verification triggered a failure in parloops. When an outer loop is
being parallelized, both the outer and inner loop are cancelled. Then during
omp-expansion, we create a loop in the loop state for the outer loop (the
one that is transformed), but not for the inner, which causes the
verification failure:
...
outer-1.c:11:3: error: loop with header 5 not in loop tree
...

[ I ran into this verification failure with an openacc kernels testcase on
the gomp-4_0-branch, where parloops is called additionally from a different
location, and pass_fixup_cfg is not the first pass that the child function
is processed by. ]

The patch contains a bit that makes sure that the loop state of the child
function is marked for fixup in parloops. The bit is non-trivial since it
creates a loop state and sets the fixup flag on the loop state, but postpones
the init_loops_structure call till move_sese_region_to_fn, where it can
succeed.




SNIP


Can we fix the root-cause of the issue instead?  That is, build a valid loop
structure in the first place?



This patch manages to keep the loop structure, that is, to not cancel 
the loop tree in parloops, and guarantee a valid loop structure at the 
end of parloops.


The transformation to insert the omp_for invalidates the loop state 
properties LOOPS_HAVE_RECORDED_EXITS and LOOPS_HAVE_SIMPLE_LATCHES, so 
we drop those in parloops.


In expand_omp_for_static_nochunk, we detect the existing loop struct of 
the omp_for, and keep it.


Then by calling pass_tree_loop_init after pass_expand_omp_ssa, we get 
the loop state properties LOOPS_HAVE_RECORDED_EXITS and 
LOOPS_HAVE_SIMPLE_LATCHES back.
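
As a rough illustration of the property handling described above, the loop
state can be pictured as a bitmask: parloops clears the two properties that
the omp_for insertion invalidates, and a later pass_tree_loop_init
establishes them again. The sketch below is a standalone C++ model, not GCC
code. The flag values, the loop_state struct and the two helper functions
are assumptions made up for the example; only the property names come from
the description above.

```cpp
#include <cassert>

// Illustrative flag values only; the real ones are defined in GCC's cfgloop.h.
enum loop_property : unsigned
{
  LOOPS_HAVE_SIMPLE_LATCHES = 1u << 0,
  LOOPS_HAVE_RECORDED_EXITS = 1u << 1,
  LOOPS_NEED_FIXUP = 1u << 2
};

// Hypothetical stand-in for the per-function loop state.
struct loop_state
{
  unsigned flags = 0;

  void set (unsigned f) { flags |= f; }
  void clear (unsigned f) { flags &= ~f; }
  // True only if every bit in F is set.
  bool satisfies_p (unsigned f) const { return (flags & f) == f; }
};

// Models the parloops side: drop the properties the transformation breaks.
inline void after_parloops (loop_state &s)
{
  s.clear (LOOPS_HAVE_RECORDED_EXITS | LOOPS_HAVE_SIMPLE_LATCHES);
}

// Models the effect of running pass_tree_loop_init afterwards.
inline void after_loop_init (loop_state &s)
{
  s.set (LOOPS_HAVE_RECORDED_EXITS | LOOPS_HAVE_SIMPLE_LATCHES);
}
```

In this model LOOPS_NEED_FIXUP is never set along the way, which is the
point of keeping the loop tree valid instead of cancelling it.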


Tested by running:
- gcc dg.exp parloops*.c
- gcc autopar.exp
- target-libgomp c.exp

Currently bootstrapping and reg-testing on x86_64.

If that succeeds, OK for trunk?

Thanks,
- Tom


Don't cancel loop tree in parloops

2015-07-20  Tom de Vries  t...@codesourcery.com

	PR tree-optimization/66846
	* omp-low.c (expand_omp_taskreg) [ENABLE_CHECKING]: Call
	verify_loop_structure for child_cfun if !LOOPS_NEED_FIXUP.
	(expand_omp_target) [ENABLE_CHECKING]: Same.
	(execute_expand_omp)  [ENABLE_CHECKING]: Call verify_loop_structure for
	cfun if !LOOPS_NEED_FIXUP.
	(expand_omp_for_static_nochunk): Handle case that omp_for already has
	its own loop struct.
	* passes.def: Add pass_tree_loop_init after pass_expand_omp_ssa in
	pass_parallelize_loops.
	* tree-parloops.c (create_parallel_loop): Add comment.
	(gen_parallel_loop): Remove call to cancel_loop_tree.
	(parallelize_loops): Skip loops that are inner loops of parallelized
	loops.
	(pass_parallelize_loops::execute): Clear LOOPS_HAVE_RECORDED_EXITS and
	LOOPS_HAVE_SIMPLE_LATCHES on loop state.
	[ENABLE_CHECKING]: Call verify_loop_structure.
	* tree-ssa-loop.c (pass_tree_loop_init::clone): New function.
---
 gcc/omp-low.c   | 22 

[Bug fortran/64589] [OOP] Linking error due to undefined integer symbol with unlimited polymorphism

2015-07-20 Thread vehre at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64589

vehre at gcc dot gnu.org changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |FIXED

--- Comment #6 from vehre at gcc dot gnu.org ---
Fixed, closing.


[Bug fortran/66035] [5/6 Regression] gfortran ICE segfault

2015-07-20 Thread vehre at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66035

vehre at gcc dot gnu.org changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #7 from vehre at gcc dot gnu.org ---
Fixed, closing.


[Bug target/66912] Copy relocation against protected symbol doesn't work

2015-07-20 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66912

--- Comment #1 from Richard Earnshaw rearnsha at gcc dot gnu.org ---
Erm, isn't that the whole point of marking the symbol 'protected'?

From the ELF spec:

<quote>
STV_PROTECTED
A symbol defined in the current component is protected if it is visible in
other components but not preemptable, meaning that any reference to such a
symbol from within the defining component must be resolved to the definition in
that component, even if there is a definition in another component that would
preempt by the default rules. A symbol with STB_LOCAL binding may not have
STV_PROTECTED visibility. If a symbol definition with STV_PROTECTED visibility
from a shared object is taken as resolving a reference from an executable or
another shared object, the SHN_UNDEF symbol table entry created has STV_DEFAULT
visibility.
</quote>

If we know it will resolve to the definition inside this DSO, then we don't
need to indirect via the GOT to address it.


Re: [gomp4] remove kernel-specific launch

2015-07-20 Thread Nathan Sidwell

On 07/20/15 05:54, Tom de Vries wrote:


Turning it into an internal fn will make it harder to convert a
GOACC_kernels_internal call into a GOACC_parallel call, which we're doing here


I wondered ...



[Bug c/66918] Disable inline function declared but never defined warning

2015-07-20 Thread mpolacek at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66918

Marek Polacek mpolacek at gcc dot gnu.org changed:

   What|Removed |Added

 CC||mpolacek at gcc dot gnu.org

--- Comment #5 from Marek Polacek mpolacek at gcc dot gnu.org ---
I don't think the C++ FE has this warning; it's about C99 inlines.

I think this should not be a part of -Wunused-function, maybe just a part of
-Wpedantic warning.


[PATCH, i386, PR driver/66737] Don't pass '-z bndplt' to linker for 32bit target

2015-07-20 Thread Ilya Enkovich
Hi,

This patch adds a target filter for the '-z bndplt' linker option.  Bootstrapped
and regtested for x86_64-unknown-linux-gnu.  MPX tests at lto.exp are not
marked as unsupported for 32bit any more.  Going to commit it to trunk in a few
days if no objections appear.

Thanks,
Ilya
--
gcc/

2015-07-20  Ilya Enkovich  enkovich@gmail.com

PR target/66737
* config/i386/linux-common.h (MPX_SPEC): Use linker option
for 64bit target only.


diff --git a/gcc/config/i386/linux-common.h b/gcc/config/i386/linux-common.h
index 63dd8d8..da09d3d 100644
--- a/gcc/config/i386/linux-common.h
+++ b/gcc/config/i386/linux-common.h
@@ -72,7 +72,7 @@ along with GCC; see the file COPYING3.  If not see
 
 #ifndef MPX_SPEC
 #define MPX_SPEC \
- %{mmpx:%{fcheck-pointer-bounds:%{!static: LINK_MPX }}}
+ %{mmpx:%{fcheck-pointer-bounds:%{!static:%{ SPEC_64 : LINK_MPX 
 #endif
 
 #ifndef LIBMPX_SPEC


Re: [PATCH][combine][1/2] Try to simplify before substituting

2015-07-20 Thread Kyrill Tkachov


On 18/07/15 17:02, Segher Boessenkool wrote:

On Fri, Jul 17, 2015 at 02:47:34PM -0600, Jeff Law wrote:

I mean move the whole "if (BINARY_P ..." block to after the existing
simplify calls, to just before the "First see if we can apply" comment,
and not do a new simplify_rtx call at all.  Does that work?

Yes, and here's the patch.
It just moves the simplification block.
The effect on codegen in SPEC2006 on aarch64 looks sane in the same
way as the original patch I posted (i.e. many redundant zero_extends
eliminated)
and together with patch 2/2 this helps in the -abs testcase.

I'm bootstrapping this on aarch64, arm and x86.
Any other testing would be appreciated.

Is this version ok if testing comes clean?

Thanks,
Kyrill

2015-07-17  Kyrylo Tkachov  kyrylo.tkac...@arm.com

 * combine.c (combine_simplify_rtx): Move simplification step
 before various transformations/substitutions.

OK.
jeff

The patch improves generated code on most archs (or at least code size,
which strongly correlates for combine), or is neutral.  xtensa regresses
a tiny bit; powerpc64 and hppa64 regress more.  I analysed the powerpc64
differences, and it seems to be all down to code that is now expressed as

(set (reg:DI) (lt:DI (reg:SI) (const_int 0)))

where before it was a bit extract (of a subreg).  The newly generated
pattern is simpler alright, but the backend didn't recognise it.  With a
simple patch, it does, and the generated code is nicely better than
before.


Thanks for analyzing. So will you submit a powerpc patch
for this? I'm not familiar with the patterns there :)



The hppa64 problem looks similar.  Maybe other targets could use such
an improvement as well.

So yes, the patch is fine.  Thank you for working on it Kyrill :-)


x86_64, aarch64 and arm bootstrap passed successfully on my end
and testing looked clean too.
I've committed this patch with r225996 and 2/2 with r225997.

Thanks for helping me work through this!
Kyrill




Segher





Re: [gomp] Move openacc vector worker single handling to RTL

2015-07-20 Thread Nathan Sidwell

On 07/18/15 11:37, Thomas Schwinge wrote:

Hi Nathan!



For OpenACC nvptx offloading, there must still be something wrong; here's
a count of the (non-deterministic!) regressions of ten runs of the
libgomp testsuite.  As private-vars-loop-worker-5.c fails most often, it
probably makes sense to look into that one first.


I'll take a look. :(

nathan


[PATCH] Track indirect calls for call site information in debug info.

2015-07-20 Thread Pierre-Marie de Rodat

Hello,

On PowerPC targets with -mlongcall, most subprogram calls are turned 
into indirect calls: the call target is read from a register even though 
it is compile-time known. This makes it difficult for machine code 
static analysis engines to recover the callee information. The attached 
patch is an attempt to help such engines, generating 
DW_AT_abstract_origin attributes for all DW_TAG_GNU_call_site we are 
interested in.


Here is how it works:

   1. At -O0, the var-tracking pass is disabled, so in order to get a 
NOTE_INSN_CALL_ARG_LOCATION for each call we are interested in, this 
patch creates a new naive var-tracking pass. When optimizing, the 
regular var-tracking pass does this job and this new pass is disabled.


   2. The DWARF back-end (dwarf2out.c) first registers this RTL note 
(in dwarf2out_var_location, already existing code) and extracts the 
corresponding callee function symbol reference (new code to add handling 
for the case we are interested in).


  There, the patch also relaxes assertions in gen_subprogram_die: 
yes, we can have both a compile-time known call target and an indirect 
call. Already existing code in gen_subprogram_die calls 
gen_call_site_die and takes care of generating the corresponding debug 
information.


Bootstrapped and regtested on x86_64-pc-linux-gnu and powerpc-linux-gnu: 
no regression. Ok for trunk? Thank you in advance for your feedback!


gcc/ChangeLog:

* passes.def: Add a new pass: variable_tracking_no_opt.
* rtl.h (variable_tracking_no_opt_main): New.
* tree-pass.h (make_pass_variable_tracking_no_opt): New.
* var-tracking.c (variable_tracking_no_opt_main,
pass_data_variable_tracking_no_opt,
pass_variable_tracking_no_opt,
make_pass_variable_tracking_no_opt): New.  Implement the new
pass which adds notes for indirect calls.
* dwarf2out.c (dwarf2out_var_location): Set the symbol reference
for calls whose target is compile-time known but that are
indirect.
(gen_subprogram_die): Handle such calls.

--
Pierre-Marie de Rodat

From 1fee786f51baca25f1363cd82f207cd67f48e69f Mon Sep 17 00:00:00 2001
From: Pierre-Marie de Rodat dero...@adacore.com
Date: Thu, 13 Jun 2013 11:13:08 +0200
Subject: [PATCH] Track indirect calls for call site information in debug info.

gcc/ChangeLog:

	* passes.def: Add a new pass: variable_tracking_no_opt.
	* rtl.h (variable_tracking_no_opt_main): New.
	* tree-pass.h (make_pass_variable_tracking_no_opt): New.
	* var-tracking.c (variable_tracking_no_opt_main,
	pass_data_variable_tracking_no_opt,
	pass_variable_tracking_no_opt,
	make_pass_variable_tracking_no_opt): New.  Implement the new
	pass which adds notes for indirect calls.
	* dwarf2out.c (dwarf2out_var_location): Set the symbol reference
	for calls whose target is compile-time known but that are
	indirect.
	(gen_subprogram_die): Handle such calls.
---
 gcc/dwarf2out.c|  46 +++-
 gcc/passes.def |   1 +
 gcc/rtl.h  |   1 +
 gcc/tree-pass.h|   1 +
 gcc/var-tracking.c | 102 +
 5 files changed, 135 insertions(+), 16 deletions(-)

diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 2834d57..a6bcb48 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -19219,18 +19219,23 @@ gen_subprogram_die (tree decl, dw_die_ref context_die)
 		}
 		  if (mode == VOIDmode || mode == BLKmode)
 		continue;
-		  if (XEXP (XEXP (arg, 0), 0) == pc_rtx)
+		  /* Sometimes, the target of a call is compile-time known, but
+		 for various reasons, there is still an indirect call
+		 instruction: do not output redundant debug information for
+		 them.  */
+		  if (ca_loc->symbol_ref == NULL_RTX)
 		{
-		  gcc_assert (ca_loc->symbol_ref == NULL_RTX);
-		  tloc = XEXP (XEXP (arg, 0), 1);
-		  continue;
-		}
-		  else if (GET_CODE (XEXP (XEXP (arg, 0), 0)) == CLOBBER
-			   && XEXP (XEXP (XEXP (arg, 0), 0), 0) == pc_rtx)
-		{
-		  gcc_assert (ca_loc->symbol_ref == NULL_RTX);
-		  tlocc = XEXP (XEXP (arg, 0), 1);
-		  continue;
+		  if (XEXP (XEXP (arg, 0), 0) == pc_rtx)
+			{
+			  tloc = XEXP (XEXP (arg, 0), 1);
+			  continue;
+			}
+		  else if (GET_CODE (XEXP (XEXP (arg, 0), 0)) == CLOBBER
+			   && XEXP (XEXP (XEXP (arg, 0), 0), 0) == pc_rtx)
+			{
+			  tlocc = XEXP (XEXP (arg, 0), 1);
+			  continue;
+			}
 		}
 		  reg = NULL;
 		  if (REG_P (XEXP (XEXP (arg, 0), 0)))
@@ -22344,11 +22349,20 @@ dwarf2out_var_location (rtx_insn *loc_note)
   x = get_call_rtx_from (PATTERN (prev));
   if (x)
 	{
-	  x = XEXP (XEXP (x, 0), 0);
-	  if (GET_CODE (x) == SYMBOL_REF
-	      && SYMBOL_REF_DECL (x)
-	      && TREE_CODE (SYMBOL_REF_DECL (x)) == FUNCTION_DECL)
-	    ca_loc->symbol_ref = x;
+	  /* Try to get the call symbol, if any.  */
+	  if (MEM_P (XEXP (x, 0)))
+	x = XEXP (x, 0);
+	  /* First, look for a memory access to a symbol_ref.  */
+	  if 

[Bug c/66918] Disable inline function declared but never defined warning

2015-07-20 Thread manu at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66918

--- Comment #6 from Manuel López-Ibáñez manu at gcc dot gnu.org ---
(In reply to Marek Polacek from comment #5)
> I don't think the C++ FE has this warning; it's about C99 inlines.

If not, it has a very similar warning:

/home/manuel/test.cc:1:13: warning: inline function ‘void test()’ used but
never defined
 inline void test(void);
 ^

Re: [gomp4] New nvptx pattern and internal builtin

2015-07-20 Thread Nathan Sidwell

On 07/17/15 11:37, Bernd Schmidt wrote:

I've made this change at the request of Cesar who says it's needed for his
reductions work. It makes a new instruction to represent shfl.down, a thread
communication instruction, and some builtin functions for internal use to access
it.


I was looking at adding another target builtin, and found this code rather 
convoluted.  It seemed to have been cloned from somewhere more complicated -- 
for instance, nvptx_expand_binop_builtin's comment discusses a MACFLAG argument, 
which is nowhere to be seen.


I ended up reimplementing using a single array describing the builtins and 
allowing direct indexing using the builtin number, rather than iteration when 
expanding.
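
A minimal standalone sketch of that table-driven shape, assuming simplified
types: one array row per builtin, with the row index doubling as the builtin
code so that expansion can index directly instead of searching. The names
builtin_type, builtin_row, builtin_table and lookup_builtin are made up for
this illustration, and the insn code field of the real table is left out;
the actual patch follows below.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Function-type shapes (hypothetical names for this sketch).
enum builtin_type
{
  BT_UINT_UINT_INT,
  BT_ULL_ULL_INT,
  BT_FLT_FLT_INT,
  BT_MAX
};

// One row per builtin; the row index is the builtin code.
struct builtin_row
{
  const char *name;
  unsigned short type;
  unsigned short num_args;
};

static const builtin_row builtin_table[] =
{
  {"__builtin_nvptx_shuffle_down", BT_UINT_UINT_INT, 2},
  {"__builtin_nvptx_shuffle_downf", BT_FLT_FLT_INT, 2},
  {"__builtin_nvptx_shuffle_downll", BT_ULL_ULL_INT, 2},
};

// The table size replaces a hand-maintained "MAX" enumerator.
static const std::size_t builtin_count
  = sizeof (builtin_table) / sizeof (builtin_table[0]);

// Expansion-time lookup: direct indexing, no iteration over descriptors.
inline const builtin_row *lookup_builtin (unsigned code)
{
  return code < builtin_count ? &builtin_table[code] : nullptr;
}
```

Adding a builtin then means adding one row, and the count and the lookup
follow automatically.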


ok?

nathan
2015-07-20  Nathan Sidwell  nat...@codesourcery.com

	* config/nvptx/nvptx.c (nvptx_builtins): Delete enum.
	(nvptx_types): New enum.
	(builtin_description): Add type and num_args fields.
	(builtins): New array describing builtins.
	(NVPTX_BUILTIN_MAX): Define.
	(def_builtin): Delete.
	(nvptx_init_builtins): Reimplement using builtins array.
	(nvptx_expand_binop_builtin): Delete.
	(bdesc_2arg): Delete.
	(nvptx_expand_builtin): Reimplement using builtins array.

Index: config/nvptx/nvptx.c
===
--- config/nvptx/nvptx.c	(revision 225992)
+++ config/nvptx/nvptx.c	(working copy)
@@ -3058,16 +3058,34 @@ nvptx_file_end (void)
 }
 }
 
-/* Codes for all the NVPTX builtins.  */
-enum nvptx_builtins
+enum nvptx_types
+  {
+NT_UINT_UINT_INT,
+NT_ULL_ULL_INT,
+NT_FLT_FLT_INT,
+
+NT_MAX
+  };
+
+struct builtin_description
 {
-  NVPTX_BUILTIN_SHUFFLE_DOWN,
-  NVPTX_BUILTIN_SHUFFLE_DOWNF,
-  NVPTX_BUILTIN_SHUFFLE_DOWNLL,
+  const char *name;
+  enum insn_code icode;
+  unsigned short type;
+  unsigned short num_args;
+};
 
-  NVPTX_BUILTIN_MAX
+static const struct builtin_description builtins[] =
+{
+  {"__builtin_nvptx_shuffle_down", CODE_FOR_thread_shuffle_downsi,
+   NT_UINT_UINT_INT, 2},
+  {"__builtin_nvptx_shuffle_downf", CODE_FOR_thread_shuffle_downsf,
+   NT_FLT_FLT_INT, 2},
+  { "__builtin_nvptx_shuffle_downll", CODE_FOR_thread_shuffle_downdi,
+NT_ULL_ULL_INT, 2},
 };
 
+#define NVPTX_BUILTIN_MAX (sizeof (builtins) / sizeof (builtins[0]))
 
 static GTY(()) tree nvptx_builtin_decls[NVPTX_BUILTIN_MAX];
 
@@ -3081,92 +3099,30 @@ nvptx_builtin_decl (unsigned code, bool
   return nvptx_builtin_decls[code];
 }
 
-#define def_builtin(NAME, TYPE, CODE)	\
-do {	\
-  tree bdecl;\
-  bdecl = add_builtin_function ((NAME), (TYPE), (CODE), BUILT_IN_MD,	\
-NULL, NULL_TREE);			\
-  nvptx_builtin_decls[CODE] = bdecl;	\
-} while (0)
-
 /* Set up all builtin functions for this target.  */
 static void
 nvptx_init_builtins (void)
-{ 
-  tree uint_ftype_uint_int
+{
+  tree types[NT_MAX];
+  unsigned ix;
+
+  types[NT_UINT_UINT_INT]
 = build_function_type_list (unsigned_type_node, unsigned_type_node,
 integer_type_node, NULL_TREE);
-  tree ull_ftype_ull_int
+  types[NT_ULL_ULL_INT]
 = build_function_type_list (long_long_unsigned_type_node,
 long_long_unsigned_type_node,
 integer_type_node, NULL_TREE);
-  tree float_ftype_float_int
+  types[NT_FLT_FLT_INT]
 = build_function_type_list (float_type_node, float_type_node,
 integer_type_node, NULL_TREE);
-  def_builtin ("__builtin_nvptx_shuffle_down", uint_ftype_uint_int,
-	   NVPTX_BUILTIN_SHUFFLE_DOWN);
-  def_builtin ("__builtin_nvptx_shuffle_downf", float_ftype_float_int,
-	   NVPTX_BUILTIN_SHUFFLE_DOWNF);
-  def_builtin ("__builtin_nvptx_shuffle_downll", ull_ftype_ull_int,
-	   NVPTX_BUILTIN_SHUFFLE_DOWNLL);
-}
-
-/* Subroutine of nvptx_expand_builtin to take care of binop insns.  MACFLAG is -1
-   if this is a normal binary op, or one of the MACFLAG_xxx constants.  */
-
-static rtx
-nvptx_expand_binop_builtin (enum insn_code icode, tree exp, rtx target)
-{
-  rtx pat;
-  tree arg0 = CALL_EXPR_ARG (exp, 0);
-  tree arg1 = CALL_EXPR_ARG (exp, 1);
-  rtx op0 = expand_expr (arg0, NULL_RTX, VOIDmode, EXPAND_NORMAL);
-  rtx op1 = expand_expr (arg1, NULL_RTX, VOIDmode, EXPAND_NORMAL);
-  machine_mode op0mode = GET_MODE (op0);
-  machine_mode op1mode = GET_MODE (op1);
-  machine_mode tmode = insn_data[icode].operand[0].mode;
-  machine_mode mode0 = insn_data[icode].operand[1].mode;
-  machine_mode mode1 = insn_data[icode].operand[2].mode;
-  rtx ret = target;
-
-  if (! target
-  || GET_MODE (target) != tmode
-  || ! (*insn_data[icode].operand[0].predicate) (target, tmode))
-target = gen_reg_rtx (tmode);
-
-  gcc_assert ((op0mode == mode0 || op0mode == VOIDmode)
-	      && (op1mode == mode1 || op1mode == VOIDmode));
 
-  if (! (*insn_data[icode].operand[1].predicate) (op0, mode0))
-op0 = copy_to_mode_reg (mode0, op0);
-  if (! (*insn_data[icode].operand[2].predicate) (op1, mode1))
-op1 = copy_to_mode_reg (mode1, op1);
-
-  pat = GEN_FCN (icode) (target, op0, op1);
-
-  if (! pat)
-return 0;
-
-  emit_insn (pat);

[Bug target/66930] [5 Regression]: gengtype.c is miscompiled during stage2

2015-07-20 Thread glaubitz at physik dot fu-berlin.de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66930

--- Comment #3 from John Paul Adrian Glaubitz glaubitz at physik dot 
fu-berlin.de ---
(In reply to John Paul Adrian Glaubitz from comment #2)
> r223346 isn't affected either, 5.1.1-6 is still building and already got
> past gengtype.c in stage 2.

Going through the other logs while r223346 is still building, r224724 *is*
affected. So unless the bug is somewhere else, it has to be an issue with 
PR/65979 or PR/66611.

Will test that once the current build of r223346 has finished.

Adrian


Re: Fix two more memory leaks in threader

2015-07-20 Thread James Greenhalgh
On Wed, May 20, 2015 at 05:36:25PM +0100, Jeff Law wrote:
 
> These fix the remaining leaks in the threader that I'm aware of.  We
> failed to properly clean-up when we had to cancel certain jump threading
> opportunities.  So thankfully this wasn't a big leak.

Hi Jeff,

I don't have a reduced testcase to go on, but by inspection this patch
looks wrong to me... 

> diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c
> index c5b78a4..4bccad0 100644
> --- a/gcc/tree-ssa-threadupdate.c
> +++ b/gcc/tree-ssa-threadupdate.c
> @@ -2159,9 +2159,16 @@ mark_threaded_blocks (bitmap threaded_blocks)
>          {
>            /* Attach the path to the starting edge if none is yet recorded.  */
>            if ((*path)[0]->e->aux == NULL)
> -            (*path)[0]->e->aux = path;
> -          else if (dump_file && (dump_flags & TDF_DETAILS))
> -            dump_jump_thread_path (dump_file, *path, false);
> +            {
> +              (*path)[0]->e->aux = path;
> +            }
> +          else
> +            {
> +              paths.unordered_remove (i);

Here we are part-way through iterating through PATHS. With unordered_remove
we are going to move the end element of the vector to position 'i'. We'll
then move on and look at 'i + 1' and so never look at the element we just
moved.

This manifests as a lower number of cancelled jump-threads, and in
turn some extra jumps threaded - some of which may no longer be safe.

For a particular workload we've talked about before in relation to
jump-threading, dom1 ends up cancelling and threading these edges with
your patch applied:

  Cancelling jump thread: (28, 32) incoming edge;  (32, 36) joiner;  (36, 61) normal;
  Cancelling jump thread: (31, 7) incoming edge;  (7, 92) joiner;  (92, 8) normal;
  Cancelling jump thread: (63, 39) incoming edge;  (39, 89) joiner;  (89, 40) normal;
  Cancelling jump thread: (67, 68) incoming edge;  (68, 69) joiner;  (69, 93) normal;
  Cancelling jump thread: (4, 32) incoming edge;  (32, 36) joiner;  (36, 64) normal;
  Threaded jump 30 --> 28 to 299
  Threaded jump 91 --> 28 to 300
  Threaded jump 35 --> 36 to 302
  Threaded jump 88 --> 60 to 305
  Threaded jump 62 --> 60 to 306
  Threaded jump 32 --> 36 to 304

Reverting the patch we get these edges cancelled and threaded:

  Cancelling jump thread: (28, 32) incoming edge;  (32, 36) joiner;  (36, 61) normal;
  Cancelling jump thread: (31, 7) incoming edge;  (7, 92) joiner;  (92, 8) normal;
  Cancelling jump thread: (63, 39) incoming edge;  (39, 89) joiner;  (89, 40) normal;
  Cancelling jump thread: (67, 68) incoming edge;  (68, 69) joiner;  (69, 93) normal;
  Cancelling jump thread: (4, 32) incoming edge;  (32, 36) joiner;  (36, 64) normal;
  Cancelling jump thread: (4, 29) incoming edge;  (29, 91) joiner;  (91, 30) normal; (30, 31) nocopy;
  Cancelling jump thread: (32, 36) incoming edge;  (36, 64) joiner;  (64, 68) normal;
  Cancelling jump thread: (36, 61) incoming edge;  (61, 88) joiner;  (88, 62) normal; (62, 63) nocopy;
  Cancelling jump thread: (64, 68) incoming edge;  (68, 69) joiner;  (69, 93) normal;
  Threaded jump 30 --> 28 to 299
  Threaded jump 91 --> 28 to 300
  Threaded jump 35 --> 36 to 302
  Threaded jump 88 --> 60 to 303
  Threaded jump 62 --> 60 to 304

Note the extra thread of 32 --> 36 to 304 with this patch applied.

I think we either want to defer the unordered_remove until we're done
processing all the vector elements, or make sure to look at element 'i'
again after we've moved something new into it.

A testcase would need to expose at least two threads which we later want
to cancel, one of which ends up at the end of the vector of threading
opportunities. We should then see only the first of those threads
actually get cancelled, and the second skipped over. Reproducing these
conditions is quite tough, which has stopped me finding a useful example
for the list. I'll be sure to follow up on this thread if I do get to one.
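
The skipped-element effect can be reproduced outside GCC. The following is
a minimal sketch, assuming std::vector in place of GCC's vec and a
hand-written unordered_remove helper that mimics the swap-with-last
behaviour described above; cancel_buggy, cancel_fixed and is_negative are
hypothetical names for the illustration, not code from the threader.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Stand-in for vec::unordered_remove: move the last element into slot IX,
// then shrink the vector by one.
template <typename T>
void unordered_remove (std::vector<T> &v, std::size_t ix)
{
  if (ix + 1 != v.size ())
    v[ix] = std::move (v.back ());
  v.pop_back ();
}

// Buggy pattern: the index always advances, so the element that was just
// moved into slot I is never examined.
inline std::vector<int> cancel_buggy (std::vector<int> v, bool (*cancel) (int))
{
  for (std::size_t i = 0; i < v.size (); i++)
    if (cancel (v[i]))
      unordered_remove (v, i);
  return v;
}

// One possible fix: only advance when nothing was removed, so slot I is
// looked at again after a removal.
inline std::vector<int> cancel_fixed (std::vector<int> v, bool (*cancel) (int))
{
  for (std::size_t i = 0; i < v.size ();)
    if (cancel (v[i]))
      unordered_remove (v, i);
    else
      i++;
  return v;
}

// Example predicate: "cancel" every negative entry.
inline bool is_negative (int x) { return x < 0; }
```

With the input {1, -2, 3, -4}, cancel_buggy leaves -4 in the result because
it is moved into the slot of -2 and then skipped, while cancel_fixed removes
both negative entries. This matches the shape of the testcase described
above: two entries to cancel, one of them at the end of the vector.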

> +              if (dump_file && (dump_flags & TDF_DETAILS))
> +                dump_jump_thread_path (dump_file, *path, false);
> +              delete_jump_thread_path (path);
> +            }
>          }
>      }
>    /* Second, look for paths that have any other jump thread attached to
> @@ -2185,8 +2192,10 @@ mark_threaded_blocks (bitmap threaded_blocks)
>            else
>              {
>                e->aux = NULL;
> +              paths.unordered_remove (i);

Likewise here.

Thanks,
James

>                if (dump_file && (dump_flags & TDF_DETAILS))
>                  dump_jump_thread_path (dump_file, *path, false);
> +              delete_jump_thread_path (path);
>              }
>          }
>      }



[PATCH][match.pd] PR middle-end/66915 Restrict A - B -> A + (-B) to non-fixed-point types

2015-07-20 Thread Kyrill Tkachov

Hi all,

This patch fixes the PR in question, which is a miscompilation of
gcc.dg/fixed-point/unary.c on arm.
It simply disables the A - B -> A + (-B) transformation when the type is
fixed-point.
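
The thread does not spell out why the fold breaks fixed-point code. One
plausible illustration, assuming saturating arithmetic of the kind the _Sat
fixed-point types use, is that negating the minimum value saturates, so
A + (-B) can differ from A - B. The sketch below models this with a
saturating 8-bit integer; saturate, sat_add, sat_sub and sat_neg are
hypothetical helpers written for the example, not GCC or target functions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Clamp a wide intermediate result into the signed 8-bit range.
inline std::int8_t saturate (int v)
{
  return static_cast<std::int8_t> (std::max (-128, std::min (127, v)));
}

inline std::int8_t sat_add (std::int8_t a, std::int8_t b)
{
  return saturate (int (a) + int (b));
}

inline std::int8_t sat_sub (std::int8_t a, std::int8_t b)
{
  return saturate (int (a) - int (b));
}

// Negating -128 would give 128, which saturates to 127.
inline std::int8_t sat_neg (std::int8_t b)
{
  return saturate (-int (b));
}
```

For A = -1 and B = -128, sat_sub (A, B) yields 127, while
sat_add (A, sat_neg (B)) yields 126, because the negation has already lost
one unit to saturation. For values away from the minimum the two forms
agree.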

This fixes the testcase for me.
Is this the right approach?

Bootstrap and test on arm and x86 running.

Ok if testing is clean?

Thanks,
Kyrill


2015-07-20  Kyrylo Tkachov  kyrylo.tkac...@arm.com

PR middle-end/66915
* match.pd (A - B -> A + (-B)): Don't allow folding
when type is a fixed-point type.
commit c6669b5cde3d7b504aec388282e7af955af58681
Author: Kyrylo Tkachov kyrylo.tkac...@arm.com
Date:   Mon Jul 20 15:02:17 2015 +0100

[match.pd] PR middle-end/66915 Restrict A - B -> A + (-B) to non-fixed-point types

diff --git a/gcc/match.pd b/gcc/match.pd
index 4427000..3d7b32e 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -522,8 +522,8 @@ along with GCC; see the file COPYING3.  If not see
 /* A - B -> A + (-B) if B is easily negatable.  */
 (simplify
  (minus @0 negate_expr_p@1)
- (plus @0 (negate @1)))
-
+ (if (!FIXED_POINT_TYPE_P (type))
+ (plus @0 (negate @1
 
 /* Try to fold (type) X op CST - (type) (X op ((type-x) CST))
when profitable.


Re: Fix two more memory leaks in threader

2015-07-20 Thread Marek Polacek
On Mon, Jul 20, 2015 at 03:19:06PM +0100, James Greenhalgh wrote:
> On Wed, May 20, 2015 at 05:36:25PM +0100, Jeff Law wrote:
> >
> > These fix the remaining leaks in the threader that I'm aware of.  We
> > failed to properly clean-up when we had to cancel certain jump threading
> > opportunities.  So thankfully this wasn't a big leak.
>
> Hi Jeff,
>
> I don't have a reduced testcase to go on, but by inspection this patch
> looks wrong to me...
 
> I think we either want to defer the unordered_remove until we're done
> processing all the vector elements, or make sure to look at element 'i'
> again after we've moved something new into it.
>
> A testcase would need to expose at least two threads which we later want
> to cancel, one of which ends up at the end of the vector of threading
> opportunities. We should then see only the first of those threads
> actually get cancelled, and the second skipped over. Reproducing these
> conditions is quite tough, which has stopped me finding a useful example
> for the list. I'll be sure to follow up on this thread if I do get to one.
 
Yes, there's something wrong about this patch, see PR66372.
I guess what you wrote above is the problem in this PR.

Marek


Re: [PING] Re: [PATCH] New configure option to default enable Smart Stack Protection

2015-07-20 Thread Magnus Granberg
måndag 13 juli 2015 15.20.40 skrev  Magnus Granberg:
 söndag 05 juli 2015 23.59.32 skrev  Magnus Granberg:
  ChangeLogs
  /gcc
  2015-07-05  Magnus Granberg  zo...@gentoo.org
  
  * common.opt (fstack-protector): Initialize to -1.
  (fstack-protector-all): Likewise.
  (fstack-protector-strong): Likewise.
  (fstack-protector-explicit): Likewise.
  * configure.ac: Add --enable-default-ssp.
  * defaults.h (DEFAULT_FLAG_SSP): New.  Default SSP to strong.
  * opts.c (finish_options): Update opts->x_flag_stack_protect if it
  is -1.
  * doc/install.texi: Document --enable-default-ssp.
  
  * config.in: Regenerated.
  * configure: Likewise.
  
  /testsuite
  2015-07-05  Magnus Granberg  zo...@gentoo.org
  
  * lib/target-supports.exp
  (check_effective_target_fstack_protector_enabled): New test.
  * gcc.target/i386/ssp-default.c: New test.
 
 Patch updated and tested on x86_64-unknown-linux-gnu (Gentoo)
 
 ChangeLogs
 /gcc
 2015-07-05  Magnus Granberg  zo...@gentoo.org
 
 * common.opt (fstack-protector): Initialize to -1.
 (fstack-protector-all): Likewise.
 (fstack-protector-strong): Likewise.
 (fstack-protector-explicit): Likewise.
 * configure.ac: Add --enable-default-ssp.
 * defaults.h (DEFAULT_FLAG_SSP): New.  Default SSP to strong.
 * opts.c (finish_options): Update opts->x_flag_stack_protect if it
 is -1.
 * doc/install.texi: Document --enable-default-ssp.
 * config.in: Regenerated.
 * configure: Likewise.
 
 /testsuite
 2015-07-13  Magnus Granberg  zo...@gentoo.org
 
 * lib/target-supports.exp
 (check_effective_target_fstack_protector_enabled): New test.
 * gcc.target/i386/ssp-default.c: New test.
 ---
Ping
Can this be committed to trunk?



[Bug tree-optimization/66372] [6 Regression] ICE on valid code at -O3 on x86_64-linux-gnu

2015-07-20 Thread jgreenhalgh at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66372

James Greenhalgh jgreenhalgh at gcc dot gnu.org changed:

   What|Removed |Added

 CC||jgreenhalgh at gcc dot gnu.org

--- Comment #3 from James Greenhalgh jgreenhalgh at gcc dot gnu.org ---
Possibly related to this report on gcc-patches:
https://gcc.gnu.org/ml/gcc-patches/2015-07/msg01649.html


[COMMITTED][AArch64] Restrict got_mem_hoist_1.c with small memory model

2015-07-20 Thread Jiong Wang

This testcase fails on the tiny model because a single ldr is generated to
address the GOT entry of bar, and the early RTL PRE pass has already done the
hoisting, so the loop invariant pass has nothing left to do. We should
therefore restrict this testcase to the small model.

For the large model, -fpic is unsupported on AArch64, while for absolute
addressing an anchor is used, a single ldr is generated, and the hoisting is
again done by the PRE pass. In either case this testcase doesn't apply, so we
should skip it.

Committed the attached patch as obvious.

2015-07-20  Jiong Wang  jiong.w...@arm.com

gcc/testsuite/
  * gcc.target/aarch64/got_mem_hoist_1.c (dg-skip-if): Skip tiny and large
  model.
  
-- 
Regards,
Jiong

diff --git a/gcc/testsuite/gcc.target/aarch64/got_mem_hoist_1.c b/gcc/testsuite/gcc.target/aarch64/got_mem_hoist_1.c
index 6d29718..2d8c8ae 100644
--- a/gcc/testsuite/gcc.target/aarch64/got_mem_hoist_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/got_mem_hoist_1.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -fpic -fdump-rtl-loop2_invariant" } */
+/* { dg-skip-if "Load/Store hoisted by RTL PRE already" { aarch64-*-* }  { "-mcmodel=tiny" "-mcmodel=large" } { "" } } */
 
 int bar (int);
 int cal (void *);


[Bug target/66912] Copy relocation against protected symbol doesn't work

2015-07-20 Thread nsz at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66912

--- Comment #2 from nsz at gcc dot gnu.org ---
protected only means it cannot be overridden.

so we know the symbol will be resolved to the local one,
however it may be visible externally and then the address
must be the same in the other modules which is a problem
if the main executable has a copy reloc against this symbol.


Re: [gomp4] New nvptx pattern and internal builtin

2015-07-20 Thread Bernd Schmidt

On 07/20/2015 03:19 PM, Nathan Sidwell wrote:


I was looking at adding another target builtin, and found this code
rather convoluted.  It seemed to have been cloned from somewhere more
complicated -- for instance, nvptx_expand_binop_builtin's comment
discusses a MACFLAG argument, which is nowhere to be seen.


Okay, I admit to tuning out comments for code that I know, and I didn't 
notice that one. As for being convoluted - this is pretty much the 
standard structure for the machine specific builtins which is used in a 
lot of ports.



I ended up reimplementing using a single array describing the builtins
and allowing direct indexing using the builtin number, rather than
iteration when expanding.


If you really want to, that's fine, but note the point about consistency 
with other ports.



Bernd



[Bug c++/55095] Wshift-overflow

2015-07-20 Thread mpolacek at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55095

--- Comment #12 from Marek Polacek mpolacek at gcc dot gnu.org ---
Author: mpolacek
Date: Mon Jul 20 13:43:45 2015
New Revision: 225998

URL: https://gcc.gnu.org/viewcvs?rev=225998&root=gcc&view=rev
Log:
PR c++/55095
* c-common.c (c_fully_fold_internal): Warn about left shift overflows.
Use EXPR_LOC_OR_LOC.
(maybe_warn_shift_overflow): New function.
* c-common.h (maybe_warn_shift_overflow): Declare.
* c-opts.c (c_common_post_options): Set warn_shift_overflow.
* c.opt (Wshift-overflow): New option.

* c-typeck.c (digest_init): Pass OPT_Wpedantic to pedwarn_init.
(build_binary_op): Warn about left shift overflows.

* typeck.c (cp_build_binary_op): Warn about left shift overflows.

* doc/invoke.texi: Document -Wshift-overflow and -Wshift-overflow=.

* c-c++-common/Wshift-overflow-1.c: New test.
* c-c++-common/Wshift-overflow-2.c: New test.
* c-c++-common/Wshift-overflow-3.c: New test.
* c-c++-common/Wshift-overflow-4.c: New test.
* c-c++-common/Wshift-overflow-5.c: New test.
* g++.dg/cpp1y/left-shift-1.C: New test.
* gcc.dg/c90-left-shift-2.c: New test.
* gcc.dg/c90-left-shift-3.c: New test.
* gcc.dg/c99-left-shift-2.c: New test.
* gcc.dg/c99-left-shift-3.c: New test.
* gcc.dg/pr40501.c: Use -Wno-shift-overflow.
* gcc.c-torture/execute/pr40386.c: Likewise.
* gcc.dg/vect/pr33373.c: Likewise.
* gcc.dg/vect/vect-shift-2-big-array.c: Likewise.
* gcc.dg/vect/vect-shift-2.c: Likewise.

Added:
trunk/gcc/testsuite/c-c++-common/Wshift-overflow-1.c
trunk/gcc/testsuite/c-c++-common/Wshift-overflow-2.c
trunk/gcc/testsuite/c-c++-common/Wshift-overflow-3.c
trunk/gcc/testsuite/c-c++-common/Wshift-overflow-4.c
trunk/gcc/testsuite/c-c++-common/Wshift-overflow-5.c
trunk/gcc/testsuite/g++.dg/cpp1y/left-shift-1.C
trunk/gcc/testsuite/gcc.dg/c90-left-shift-2.c
trunk/gcc/testsuite/gcc.dg/c90-left-shift-3.c
trunk/gcc/testsuite/gcc.dg/c99-left-shift-2.c
trunk/gcc/testsuite/gcc.dg/c99-left-shift-3.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/c-family/ChangeLog
trunk/gcc/c-family/c-common.c
trunk/gcc/c-family/c-common.h
trunk/gcc/c-family/c-opts.c
trunk/gcc/c-family/c.opt
trunk/gcc/c/ChangeLog
trunk/gcc/c/c-typeck.c
trunk/gcc/cp/ChangeLog
trunk/gcc/cp/typeck.c
trunk/gcc/doc/invoke.texi
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.c-torture/execute/pr40386.c
trunk/gcc/testsuite/gcc.dg/pr40501.c
trunk/gcc/testsuite/gcc.dg/vect/pr33373.c
trunk/gcc/testsuite/gcc.dg/vect/vect-shift-2-big-array.c
trunk/gcc/testsuite/gcc.dg/vect/vect-shift-2.c


Re: [C/C++ PATCH] Implement -Wshift-overflow (PR c++/55095) (take 3)

2015-07-20 Thread Marek Polacek
On Fri, Jul 17, 2015 at 03:51:33PM -0600, Jeff Law wrote:
 I'll approve the C++ parts given how simple they are :-)

Thanks, I've committed the patch now after another regtest/bootstrap.

Marek


[Bug c++/55095] Wshift-overflow

2015-07-20 Thread mpolacek at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55095

Marek Polacek mpolacek at gcc dot gnu.org changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #13 from Marek Polacek mpolacek at gcc dot gnu.org ---
Implemented for GCC 6.
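
For readers unfamiliar with the condition the new option is about, here is a small sketch of "left shift overflow" in the arithmetic sense. It is an illustration only: left_shift_overflows is a hypothetical helper, not the maybe_warn_shift_overflow function added by the commit, and which cases GCC actually diagnoses depends on the language dialect and the -Wshift-overflow= level.

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper: returns true when `value << count` cannot be
// represented in a signed int.  Negative values and out-of-range shift
// counts are the business of other warnings; this sketch simply reports
// them as not representable.
bool left_shift_overflows(int value, unsigned count) {
  const unsigned width = sizeof(int) * CHAR_BIT;
  if (value < 0 || count >= width)
    return true;
  // For non-negative values the shift fits iff value <= INT_MAX >> count.
  return value > (INT_MAX >> count);
}
```

For example, shifting 1 by (width - 2) bits still fits in an int, while shifting it by (width - 1) bits lands on the sign bit and no longer does.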


[Bug target/66930] [5 Regression]: gengtype.c is miscompiled during stage2

2015-07-20 Thread glaubitz at physik dot fu-berlin.de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66930

--- Comment #4 from John Paul Adrian Glaubitz glaubitz at physik dot 
fu-berlin.de ---
(In reply to John Paul Adrian Glaubitz from comment #3)
 Will test that once the current build of r223346 has finished.

Another hint is a changelog entry for the Debian package 5.1.1-11:

 * Build with -O1 on sh4 (try to work around PR target/66358).

So it *might* be a combination of PR/65979 and using -O1 instead of -O2.

Adrian


[PATCH] Update decls in genemit.c

2015-07-20 Thread Marek Polacek
Since r225883, genemit.c's gen_insn/gen_expand/gen_split have a parameter
of type md_rtx_info *, not rtx, but the function prototypes weren't updated,
which results in 'declared but never defined' warnings.  The following
trivial patch updates the declarations.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2015-07-20  Marek Polacek  pola...@redhat.com

* genemit.c (gen_insn, gen_expand, gen_split): Update declaration.

diff --git gcc/genemit.c gcc/genemit.c
index 2d2fb62..2fa2062 100644
--- gcc/genemit.c
+++ gcc/genemit.c
@@ -51,9 +51,9 @@ struct clobber_ent
 
 static void print_code (RTX_CODE);
 static void gen_exp(rtx, enum rtx_code, char *);
-static void gen_insn   (rtx, int);
-static void gen_expand (rtx);
-static void gen_split  (rtx);
+static void gen_insn   (md_rtx_info *);
+static void gen_expand (md_rtx_info *);
+static void gen_split  (md_rtx_info *);
 static void output_add_clobbers(void);
 static void output_added_clobbers_hard_reg_p (void);
 static void gen_rtx_scratch(rtx, enum rtx_code);

Marek


[Bug tree-optimization/66926] [6 regression] FAIL: gfortran.dg/graphite/vect-pr40979.f90 -O (internal compiler error)

2015-07-20 Thread ysrumyan at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66926

--- Comment #2 from Yuri Rumyantsev ysrumyan at gmail dot com ---
Could somebody provide me with instructions on how to build a fresh trunk
compiler with graphite?

Thanks.


[gomp4.1] Pretty print '+' for sink positive offsets

2015-07-20 Thread Aldy Hernandez

Positive numbers get no '+' by default, so the dumps look like this:

#pragma omp ordered depend(sink:i1)

The attached patch fixes this oversight to print:

#pragma omp ordered depend(sink:i+1)

OK for branch?
commit fc23dda3b860931ca72bf00236e3f929b48c751e
Author: Aldy Hernandez al...@redhat.com
Date:   Mon Jul 20 07:44:52 2015 -0700

* tree-pretty-print.c (dump_omp_clause): Print '+' for positive
numbers.

diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index bd28844..d3cc245 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -566,8 +566,12 @@ dump_omp_clause (pretty_printer *pp, tree clause, int spc, 
int flags)
  {
dump_generic_node (pp, TREE_VALUE (t), spc, flags, false);
if (TREE_PURPOSE (t) != integer_zero_node)
- dump_generic_node (pp, TREE_PURPOSE (t), spc, flags,
-false);
+ {
+   if (!wi::neg_p (TREE_PURPOSE (t)))
+ pp_plus (pp);
+   dump_generic_node (pp, TREE_PURPOSE (t), spc, flags,
+  false);
+ }
if (TREE_CHAIN (t))
  pp_comma (pp);
  }
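
The intent of the change can be sketched independently of the pretty-printer. The following assumes a hypothetical format_sink helper working on plain strings and longs (the real code operates on trees and a pretty_printer): a zero offset prints nothing, a negative offset already carries its '-', and only a positive offset needs an explicit '+'.

```cpp
#include <cassert>
#include <string>

// Hypothetical formatter mirroring the intent of the patch, not its code.
std::string format_sink(const std::string &var, long offset) {
  std::string out = var;
  if (offset != 0) {
    if (offset > 0)
      out += '+';                    // positive numbers print no sign by default
    out += std::to_string(offset);   // negative numbers bring their own '-'
  }
  return out;
}
```

Under these assumptions format_sink("i", 1) gives "i+1" rather than the ambiguous "i1".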


Re: [gomp4.1] Pretty print '+' for sink positive offsets

2015-07-20 Thread Jakub Jelinek
On Mon, Jul 20, 2015 at 07:47:25AM -0700, Aldy Hernandez wrote:
 Positive numbers get no '+' by default, so the dumps look like this:
 
 #pragma omp ordered depend(sink:i1)
 
 The attached patch fixes this oversight to print:
 
 #pragma omp ordered depend(sink:i+1)
 
 OK for branch?

Ok, thanks.

 commit fc23dda3b860931ca72bf00236e3f929b48c751e
 Author: Aldy Hernandez al...@redhat.com
 Date:   Mon Jul 20 07:44:52 2015 -0700
 
   * tree-pretty-print.c (dump_omp_clause): Print '+' for positive
   numbers.
 
 diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
 index bd28844..d3cc245 100644
 --- a/gcc/tree-pretty-print.c
 +++ b/gcc/tree-pretty-print.c
 @@ -566,8 +566,12 @@ dump_omp_clause (pretty_printer *pp, tree clause, int 
 spc, int flags)
 {
   dump_generic_node (pp, TREE_VALUE (t), spc, flags, false);
   if (TREE_PURPOSE (t) != integer_zero_node)
 -   dump_generic_node (pp, TREE_PURPOSE (t), spc, flags,
 -  false);
 +   {
 + if (!wi::neg_p (TREE_PURPOSE (t)))
 +   pp_plus (pp);
 + dump_generic_node (pp, TREE_PURPOSE (t), spc, flags,
 +false);
 +   }
   if (TREE_CHAIN (t))
 pp_comma (pp);
 }


Jakub


[PATCH, rtl-opt, i386]: Backport fix for PR 58066, __tls_get_addr is called with misaligned stack on x86-64

2015-07-20 Thread Uros Bizjak
Attached patch backports fixes for PR 58066 to release branches.

2015-07-XX  Uros Bizjak  ubiz...@gmail.com

Backport from mainline:
2015-07-17  Uros Bizjak  ubiz...@gmail.com

PR rtl-optimization/66891
* calls.c (expand_call): Wrap precompute_register_parameters with
NO_DEFER_POP/OK_DEFER_POP to prevent deferred pops.

2015-07-15  Uros Bizjak  ubiz...@gmail.com

PR target/58066
* config/i386/i386.md (*tls_global_dynamic_64_<mode>): Depend on SP_REG.
(*tls_local_dynamic_base_64_<mode>): Ditto.
(*tls_local_dynamic_base_64_largepic): Ditto.
(tls_global_dynamic_64_<mode>): Update expander pattern.
(tls_local_dynamic_base_64_<mode>): Ditto.

2015-07-15  Uros Bizjak  ubiz...@gmail.com

PR rtl-optimization/58066
* calls.c (expand_call): Precompute register parameters before stack

testsuite/ChangeLog:

2015-07-XX  Uros Bizjak  ubiz...@gmail.com

Backport from mainline:
2015-07-17  Uros Bizjak  ubiz...@gmail.com

PR target/66891
* gcc.target/i386/pr66891.c: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}  for
all default languages, obj-c++ and go.

OK for branches?

Uros.
Index: calls.c
===
--- calls.c (revision 226001)
+++ calls.c (working copy)
@@ -3115,6 +3115,19 @@ expand_call (tree exp, rtx target, int ignore)
 
   compute_argument_addresses (args, argblock, num_actuals);
 
+  /* Stack is properly aligned, pops can't safely be deferred during
+the evaluation of the arguments.  */
+  NO_DEFER_POP;
+
+  /* Precompute all register parameters.  It isn't safe to compute
+anything once we have started filling any specific hard regs.
+TLS symbols sometimes need a call to resolve.  Precompute
+register parameters before any stack pointer manipulation
+to avoid unaligned stack in the called function.  */
+  precompute_register_parameters (num_actuals, args, reg_parm_seen);
+
+  OK_DEFER_POP;
+
   /* Perform stack alignment before the first push (the last arg).  */
   if (argblock == 0
       && adjusted_args_size.constant > reg_parm_stack_space
@@ -3155,10 +3168,6 @@ expand_call (tree exp, rtx target, int ignore)
 
   funexp = rtx_for_function_call (fndecl, addr);
 
-  /* Precompute all register parameters.  It isn't safe to compute anything
-once we have started filling any specific hard regs.  */
-  precompute_register_parameters (num_actuals, args, reg_parm_seen);
-
   if (CALL_EXPR_STATIC_CHAIN (exp))
static_chain_value = expand_normal (CALL_EXPR_STATIC_CHAIN (exp));
   else
Index: testsuite/gcc.target/i386/pr66891.c
===
--- testsuite/gcc.target/i386/pr66891.c (revision 0)
+++ testsuite/gcc.target/i386/pr66891.c (working copy)
@@ -0,0 +1,16 @@
+/* { dg-do compile { target ia32 } } */
+/* { dg-options -O2 } */
+
+__attribute__((__stdcall__)) void fn1();
+
+int a;
+
+static void fn2() {
+  for (;;)
+;
+}
+
+void fn3() {
+  fn1(0);
+  fn2(a == 0);
+}


[Bug tree-optimization/66372] [6 Regression] ICE on valid code at -O3 on x86_64-linux-gnu

2015-07-20 Thread jgreenhalgh at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66372

--- Comment #4 from James Greenhalgh jgreenhalgh at gcc dot gnu.org ---
I think this is the same issue as I spotted in the larger testcase.

Looking at the cancelled jumps:

  trunk/foo.c.096t.dom1:  Cancelling jump thread: (6, 8) incoming edge;  (8,
13) joiner;  (13, 9) normal;
  trunk/foo.c.096t.dom1:  Cancelling jump thread: (7, 9) incoming edge;  (9,
13) joiner;  (13, 10) normal;

With r223448 reverted:

  reverted/small.c.096t.dom1:  Cancelling jump thread: (6, 8) incoming edge; 
(8, 13) joiner;  (13, 9) normal;
  reverted/small.c.096t.dom1:  Cancelling jump thread: (7, 9) incoming edge; 
(9, 13) joiner;  (13, 10) normal;
  reverted/small.c.096t.dom1:  Cancelling jump thread: (2, 3) incoming edge; 
(3, 4) joiner;  (4, 5) normal;

If we fail to cancel 2 -- 3 to 5, we will probably end up in trouble given the
other blocks we've threaded through:

  trunk/small.c.096t.dom1:  Threaded jump 6 -- 10 to 16
  trunk/small.c.096t.dom1:  Threaded jump 7 -- 10 to 16
  trunk/small.c.096t.dom1:  Threaded jump 15 -- 13 to 17
  trunk/small.c.096t.dom1:  Threaded jump 8 -- 13 to 17
  trunk/small.c.096t.dom1:  Threaded jump 3 -- 4 to 15
  trunk/small.c.096t.dom1:  Threaded jump 14 -- 13 to 16
  trunk/small.c.096t.dom1:  Threaded jump 9 -- 13 to 16

And taking a look at the path we're processing just before the ICE, it does
start (2, 3).


Re: [PATCH] Simple optimization for MASK_STORE.

2015-07-20 Thread Yuri Rumyantsev
Hi Jeff!

Thanks for your detailed comments.

You asked:
How are conditionals on vectors usually handled?  You should try to
mimick that and report, with detail, why that's not working.

In the current implementation of the vectorization pass, all comparisons
are handled through COND_EXPR, i.e. the vect-pattern pass transforms all
comparisons producing bool values into conditional expressions such as
a[i] != 0 --> a[i] != 0 ? 1 : 0, which the vectorizer transforms into a
vect-cond-expr. So we don't have operations with vector operands and a
scalar (bool) result.
To implement such operations I introduced a target hook. Could you
propose another way of implementing it?

Thanks.
Yuri.
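
To make the distinction concrete: a lane-wise compare yields another vector, whereas a branch needs one scalar answer. Below is a rough sketch using std::array as a stand-in for a vector register; kLanes, Vec, not_equal_zero, mask_is_zero and masked_store are all hypothetical names for illustration and do not correspond to GIMPLE or to the proposed hook.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t kLanes = 4;  // assumed vector width for this sketch
using Vec = std::array<int, kLanes>;

// Lane-wise compare in the "cond ? 1 : 0" form: the result is itself a
// vector, not a single bool.
Vec not_equal_zero(const Vec &a) {
  Vec m{};
  for (std::size_t i = 0; i < kLanes; ++i)
    m[i] = (a[i] != 0) ? 1 : 0;
  return m;
}

// Reduction of the mask to one scalar: the value a conditional branch needs.
bool mask_is_zero(const Vec &m) {
  for (int lane : m)
    if (lane != 0)
      return false;
  return true;
}

// Masked store guarded by the zero-mask test.  Returns false when the
// whole store was skipped.
bool masked_store(Vec &dst, const Vec &src, const Vec &mask) {
  if (mask_is_zero(mask))
    return false;
  for (std::size_t i = 0; i < kLanes; ++i)
    if (mask[i] != 0)
      dst[i] = src[i];
  return true;
}
```

The question in this thread is essentially how the mask_is_zero step should be expressed at the GIMPLE level without a target-specific hook.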

2015-07-10 8:51 GMT+03:00 Jeff Law l...@redhat.com:
 On 06/18/2015 08:32 AM, Yuri Rumyantsev wrote:

 Richard,

 Here is updated patch which does not include your proposal related to
 the target hook deletion.
 You wrote:

 I still don't understand why you need the new target hook.  If we have a
 masked
 load/store then the mask is computed by an assignment with a
 VEC_COND_EXPR
 (in your example) and thus a test for a zero mask is expressible as just

if (vect__ifc__41.17_167 == { 0, 0, 0, 0... })


 or am I missing something?

 Such a vector compare produces a vector and does not set up the cc flags
 required for a conditional branch (GIMPLE_COND).
 If we use such a comparison for GIMPLE_COND we get an error message, so I
 used a target hook which does set up the cc flags (aka the ptest
 instruction), and I left this part.

 I think we need to look for something better.  I really don't like the idea
 of using a target hook like this within the gimple optimizers unless it's
 absolutely necessary.

 How are conditionals on vectors usually handled?  You should try to mimick
 that and report, with detail, why that's not working.

 I'm also not happy with the mechanisms to determine whether or not we should
 make this transformation.  I'm willing to hash through other details and
 come back to this issue once we've got some of the other stuff figured out.
 I guess a new flag with the target adjusting is the fallback if we can't
 come up with some reasonable way to select this on or off.

 The reason I don't like having the target files make this kind of decision
 is it makes more gimple transformations target dependent. Based on the
 history with RTL, that ultimately leads to an unwieldy mess.

 And yes, I know gimple isn't 100% target independent -- but that doesn't
 mean we should keep adding more target dependencies.  Each one we add needs
 to be well vetted.


 patch.3

 diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
 index 44a8624..e90de32 100644
 --- a/gcc/config/i386/i386.c
 +++ b/gcc/config/i386/i386.c
 @@ -100,6 +100,8 @@ along with GCC; see the file COPYING3.  If not see
  #include "tree-iterator.h"
  #include "tree-chkp.h"
  #include "rtl-chkp.h"
 +#include "stringpool.h"
 +#include "tree-ssanames.h"
 So ideally we'll figure out why you're unable to generate a suitable
 conditional at the gimple level and the need for the x86 backend to generate
 the vector zero test will go away.  And if it does go away, we want those
 #includes to disappear.

 Additionally, instead of including stringpool.h and tree-ssanames.h, include
 ssa.h -- as a general rule.


  static rtx legitimize_dllimport_symbol (rtx, bool);
  static rtx legitimize_pe_coff_extern_decl (rtx, bool);
 @@ -41100,6 +41102,47 @@ ix86_vectorize_builtin_gather (const_tree
 mem_vectype,
return ix86_get_builtin (code);
  }

 +/* Returns true if SOURCE type is supported by builtin ptest.
 +   NAME is lhs of created ptest call.  All created statements are added
 +   to GS.  */
 +
 +static bool
 +ix86_vectorize_build_zero_vector_test (tree source, tree name,

 Given the stated goal of not doing this in the target files, I'm not going
 to actually look at this routine or any of the infrastructure for this
 target hook.


 diff --git a/gcc/params.def b/gcc/params.def
 index 3e4ba3a..9e8de11 100644
 --- a/gcc/params.def
 +++ b/gcc/params.def
 @@ -1069,6 +1069,12 @@ DEFPARAM (PARAM_MAX_STORES_TO_SINK,
Maximum number of conditional store pairs that can be sunk,
2, 0, 0)

 +/* Enable inserion test on zero mask for masked stores if non-zero.  */
 s/inserion/insertion/

 +DEFPARAM (PARAM_ZERO_TEST_FOR_STORE_MASK,
 + zero-test-for-store-mask,
 + Enable insertion of test on zero mask for masked stores,
 + 1, 0, 1)
 I'm resisting the temptation to bike-shed...  I don't like the name, but I
 don't have a better one yet.  Similarly for the short description.


 +/* { dg-final { scan-tree-dump-times Move MASK_STORE to new bb 2 vect }
 } */
 +/* { dg-final { cleanup-tree-dump vect } } */
 cleanup-tree-dump is no longer needed.


 diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
 index 7ba0d8f..e31479b 100644
 --- a/gcc/tree-vect-stmts.c
 +++ b/gcc/tree-vect-stmts.c
 @@ -2072,6 +2072,7 @@ vectorizable_mask_load_store (gimple stmt,
 gimple_stmt_iterator *gsi,
  {
tree 

Re: [gomp] Move openacc vector worker single handling to RTL

2015-07-20 Thread Nathan Sidwell

On 07/20/15 09:01, Nathan Sidwell wrote:

On 07/18/15 11:37, Thomas Schwinge wrote:

Hi Nathan!



For OpenACC nvptx offloading, there must still be something wrong; here's
a count of the (non-deterministic!) regressions of ten runs of the
libgomp testsuite.  As private-vars-loop-worker-5.c fails most often, it
probably makes sense to look into that one first.


I'll take a look. :(


Having difficulty reproducing it (preprocessed source compiled at -O0 works for 
me).  Do you have an exact recipe?



nathan


Re: [RFC, PR66873] Use graphite for parloops

2015-07-20 Thread Sebastian Pop
Tom de Vries wrote:
 So I wondered, why not always use the graphite dependency analysis
 in parloops. (Of course you could use -floop-parallelize-all, but
 that also changes the heuristic). So I wrote a patch for parloops to
 use graphite dependency analysis by default (so without
 -floop-parallelize-all), but while testing found out that all the
 reduction test-cases started failing because the modifications
 graphite makes to the code messes up the parloops reduction
 analysis.
 
 Then I came up with this patch, which:
 - first runs a parloops pass, restricted to reduction loops only,

I would prefer to fix graphite to catch the reduction loop and avoid running an
extra pass before graphite for that case.  Can you please specify which files
fail to be parallelized?  Are they all the testcases whose flags you update?

Also it seems to me that you are missing -ffast-math to parallelize all these
loops: without that flag graphite would not mark reductions as
associative/commutative operations and they would not be recognized as parallel.
Is that something the current parloops detection is not too strict about?

Thanks,
Sebastian
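
As background for the -ffast-math remark: a floating-point sum is not associative, so splitting a reduction across threads can change the result. This is a minimal sketch with two hypothetical helpers, where sum_two_halves mimics how two threads might each reduce half of the array before the partial sums are combined:

```cpp
#include <cassert>

// Sum left to right, as the sequential loop is written.
double sum_sequential(const double *x, int n) {
  double s = 0.0;
  for (int i = 0; i < n; ++i)
    s += x[i];
  return s;
}

// Reduce each half separately and then combine, the way a two-thread
// parallel reduction would.
double sum_two_halves(const double *x, int n) {
  double lo = 0.0, hi = 0.0;
  for (int i = 0; i < n / 2; ++i)
    lo += x[i];
  for (int i = n / 2; i < n; ++i)
    hi += x[i];
  return lo + hi;
}
```

For the input {1e30, 1.0, -1e30, 1.0} the sequential sum is 1.0, while the two-halves sum is 0.0, because each 1.0 is absorbed by the large value in its own half. Without a flag permitting reassociation, the compiler has to preserve the sequential result.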

 - then runs graphite dependency analysis
 - followed by a normal parloops pass run.
 
 This way, we get to both:
 - compile the reduction testcases as before, and
 - profit from the better graphite dependency analysis otherwise.
 
 A point worth noting is that I stopped running pass_iv_canon before
 parloops (only in case of -ftree-parallelize-loops > 1) because
 running it before graphite makes the graphite scop detection fail.
 
 Bootstrapped and reg-tested on x86_64.
 
 Any comments?
 
 Thanks,
 - Tom

 Use graphite for parloops
 
 2015-07-15  Tom de Vries  t...@codesourcery.com
 
   PR tree-optimization/66873
   * graphite-isl-ast-to-gimple.c (translate_isl_ast_for_loop):
   (scop_to_isl_ast): Handle flag_tree_parallelize_loops.
   * graphite-poly.c (apply_poly_transforms): Same.
   * graphite.c (gate_graphite_transforms): Remove static.
   (pass_graphite_parloops): New pass.
   (make_pass_graphite_parloops): New function.
   (pass_graphite_transforms2): New pass.
   (make_pass_graphite_transforms2): New function.
   * omp-low.c (pass_expand_omp_ssa::clone): Same.
   * passes.def: Add pass groups pass_parallelize_reductions and
   pass_graphite_parloops.
   * tree-parloops.c (gen_parallel_loop): Add debug print for alternative
   exit-first loop transform.
   (parallelize_loops): Add reductions_only parameter.
   (pass_parallelize_loops::execute): Call parallelize_loops with extra
   argument.
   (pass_parallelize_reductions): New pass.
   (pass_parallelize_reductions::execute)
   (make_pass_parallelize_reductions): New function.
   * tree-pass.h (make_pass_graphite_parloops)
   (make_pass_parallelize_reductions, make_pass_graphite_transforms2)
   (gate_graphite_transforms): Declare.
   * tree-ssa-loop-ivcanon.c (pass_iv_canon::gate): Return false if
   flag_tree_parallelize_loops > 1.
 
   * gcc.dg/autopar/outer-6.c: Update for new pass parloopsred.
   * gcc.dg/autopar/reduc-1.c: Same.
   * gcc.dg/autopar/reduc-1char.c: Same.
   * gcc.dg/autopar/reduc-1short.c: Same.
   * gcc.dg/autopar/reduc-2.c: Same.
   * gcc.dg/autopar/reduc-2char.c: Same.
   * gcc.dg/autopar/reduc-2short.c: Same.
   * gcc.dg/autopar/reduc-3.c: Same.
   * gcc.dg/autopar/reduc-6.c: Same.
   * gcc.dg/autopar/reduc-7.c: Same.
   * gcc.dg/autopar/reduc-8.c: Same.
   * gcc.dg/autopar/reduc-9.c: Same.
   * gcc.dg/parloops-exit-first-loop-alt-2.c: Same.
   * gcc.dg/parloops-exit-first-loop-alt-3.c: Same.
   * gcc.dg/parloops-exit-first-loop-alt-4.c: Same.
   * gcc.dg/parloops-exit-first-loop-alt-5.c: Same.
   * gcc.dg/parloops-exit-first-loop-alt-6.c: Same.
   * gcc.dg/parloops-exit-first-loop-alt-7.c: Same.
   * gcc.dg/parloops-exit-first-loop-alt-pr66652.c: Same.
   * gcc.dg/parloops-exit-first-loop-alt.c: Same.
   * gfortran.dg/parloops-exit-first-loop-alt-2.f95: Same.
   * gfortran.dg/parloops-exit-first-loop-alt.f95: Same.
   * gfortran.dg/parloops-outer-1.f95: New test.
 ---
  gcc/graphite-isl-ast-to-gimple.c            |  6 +-
  gcc/graphite-poly.c                         |  3 +-
  gcc/graphite.c                              | 83 ++-
  gcc/omp-low.c                               |  1 +
  gcc/passes.def                              | 11 +++
  gcc/testsuite/gcc.dg/autopar/outer-6.c      |  6 +-
  gcc/testsuite/gcc.dg/autopar/reduc-1.c      |  7 +-
  gcc/testsuite/gcc.dg/autopar/reduc-1char.c  |  7 +-
  gcc/testsuite/gcc.dg/autopar/reduc-1short.c |  7 +-
  gcc/testsuite/gcc.dg/autopar/reduc-2.c      |  7 +-
  gcc/testsuite/gcc.dg/autopar/reduc-2char.c  |  7 +-
  gcc/testsuite/gcc.dg/autopar/reduc-2short.c |  7 +-
 

[Bug c++/66944] New: static thread_local member in class template may cause compilation to fail

2015-07-20 Thread zhykzhykzhyk at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66944

Bug ID: 66944
   Summary: static thread_local member in class template may cause
compilation to fail
   Product: gcc
   Version: 5.2.0
Status: UNCONFIRMED
  Keywords: rejects-valid
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zhykzhykzhyk at gmail dot com
  Target Milestone: ---

Affected versions: 4.9.3, 5.1.0, 5.2.0

Expected behavior:
  compiles correctly (like clang does)

Compiler output:
bug.cc:14:34: error: redefinition of 'bool __tls_guard'
 thread_local std::vector<void *> A<T>::v;
  ^
bug.cc:14:34: note: 'bool __tls_guard' previously declared here
bug.cc:14: confused by earlier errors, bailing out

== code ==
struct Heavy { Heavy(){} };

template <typename T>
struct A {
  virtual void foo() { v; }
  static thread_local Heavy v;
};

template <typename T>
thread_local Heavy A<T>::v;

struct E {};
struct F {};

int main() {
  A<E> foo;
  A<F> bar;
  bar.v;
}
==


[Bug tree-optimization/66946] New: Spurious uninitialized warning

2015-07-20 Thread wdijkstr at arm dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66946

Bug ID: 66946
   Summary: Spurious uninitialized warning
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: wdijkstr at arm dot com
  Target Milestone: ---

Created attachment 36016
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=36016&action=edit
preprocessed iso-2022-cn-ext.c

Recently (since around May), GCC 6 has started to emit spurious uninitialized
warnings. An example from building GLIBC (the attached preprocessed file
compiled with -O2 -Wall -c out.i):

In file included from iso-2022-cn-ext.c:659:0:
../iconv/skeleton.c: In function 'gconv':
../iconv/loop.c:325:295: warning: '*((void *)buf+1)' is used uninitialized in
this function [-Wuninitialized]
../iconv/loop.c:325:295: warning: 'buf' is used uninitialized in this function
[-Wuninitialized]
../iconv/loop.c:325:295: warning: '*((void *)buf+1)' is used uninitialized in
this function [-Wuninitialized]
../iconv/loop.c:325:295: warning: 'buf' is used uninitialized in this function
[-Wuninitialized]
In file included from iso-2022-cn-ext.c:659:0:
../iconv/loop.c:435:295: warning: '*((void *)buf+1)' is used uninitialized in
this function [-Wuninitialized]
../iconv/loop.c:435:295: warning: 'buf' is used uninitialized in this function
[-Wuninitialized]

I checked the control flow and it seems correct.
glibc/iconvdata/iso-2022-cn-ext.c calls one of ucs4_to_gb2312, ucs4_to_isoir165
and ucs4_to_cns11643l1 at lines 432-447 - buf is not written if
__UNKNOWN_10646_CHAR is returned. In that case however the if statement is
executed which does eventually set buf[0] and buf[1]. There is no warning if
inlining is disabled, so it is caused by the ucs4_* inline functions not always
setting buf.
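
The shape of the code described above can be reduced to a small sketch. This is not the glibc source: convert, encode and kUnknown are hypothetical stand-ins for the ucs4_* helpers, the calling loop and __UNKNOWN_10646_CHAR, and the sketch is not claimed to reproduce the warning. It only shows why every path reaching the final read has initialized buf.

```cpp
#include <cassert>

constexpr int kUnknown = -1;  // hypothetical stand-in for __UNKNOWN_10646_CHAR

// Hypothetical inline converter: writes buf only on success.
inline int convert(unsigned ch, unsigned char *buf) {
  if (ch >= 0x80)
    return kUnknown;  // buf is left untouched on this path
  buf[0] = static_cast<unsigned char>(ch);
  buf[1] = 0;
  return 2;
}

int encode(unsigned ch) {
  unsigned char buf[2];  // deliberately not initialized at declaration
  int written = convert(ch, buf);
  if (written == kUnknown) {
    // The fallback path sets both bytes itself.
    buf[0] = '?';
    buf[1] = 0;
  }
  // Both paths above have written buf[0] and buf[1] before this read.
  return buf[0] + buf[1] * 256;
}
```

Once convert is inlined, the compiler sees a path on which buf is not written by the callee, and has to correlate that with the later test on the return value to prove the read is safe.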


[Bug fortran/66929] [6 regression] ICE with iso_varying_string

2015-07-20 Thread juergen.reuter at desy dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66929

Jürgen Reuter juergen.reuter at desy dot de changed:

   What|Removed |Added

 CC||juergen.reuter at desy dot de

--- Comment #4 from Jürgen Reuter juergen.reuter at desy dot de ---
Also just stumbled over this one. Happens for us with r225979:

libtool: compile:  /var/lib/jenkins/slave/opt/gcc-6.0-225979/bin/gfortran -g
-O2 -c ../../../omega/src/iso_varying_string.f90  -fPIC -o
.libs/iso_varying_string.o
../../../omega/src/iso_varying_string.f90:2467:0:

   rep_string = substring//extract(work_string,
start=i_target+length_target)//rep_string
1
internal compiler error: Segmentation fault
0xbde39f crash_signal
../../source/gcc/toplev.c:352
0x6aa154 expr_may_alias_variables
../../source/gcc/fortran/trans-expr.c:4566
0x6b0d71 gfc_conv_procedure_call(gfc_se*, gfc_symbol*, gfc_actual_arglist*,
gfc_expr*, vec<tree_node*, va_gc, vl_embed>*)
../../source/gcc/fortran/trans-expr.c:5405
0x6b5a12 gfc_conv_expr(gfc_se*, gfc_expr*)
../../source/gcc/fortran/trans-expr.c:7482
0x6bc4a8 gfc_conv_expr_reference(gfc_se*, gfc_expr*)
../../source/gcc/fortran/trans-expr.c:7617
0x6b140f gfc_conv_procedure_call(gfc_se*, gfc_symbol*, gfc_actual_arglist*,
gfc_expr*, vec<tree_node*, va_gc, vl_embed>*)
../../source/gcc/fortran/trans-expr.c:5055
0x6b5a12 gfc_conv_expr(gfc_se*, gfc_expr*)
../../source/gcc/fortran/trans-expr.c:7482
0x6bd840 gfc_trans_assignment_1
../../source/gcc/fortran/trans-expr.c:9204
0x678d35 trans_code
../../source/gcc/fortran/trans.c:1674
0x6eea43 gfc_trans_if_1
../../source/gcc/fortran/trans-stmt.c:1106
0x6f58ea gfc_trans_if(gfc_code*)
../../source/gcc/fortran/trans-stmt.c:1137
0x678b57 trans_code
../../source/gcc/fortran/trans.c:1771
0x6f78d4 gfc_trans_do_while(gfc_code*)
../../source/gcc/fortran/trans-stmt.c:2083
0x678b07 trans_code
../../source/gcc/fortran/trans.c:1791
0x6a69bf gfc_generate_function_code(gfc_namespace*)
../../source/gcc/fortran/trans-decl.c:5884
0x67d1d1 gfc_generate_module_code(gfc_namespace*)
../../source/gcc/fortran/trans.c:2045
0x63449d translate_all_program_units
../../source/gcc/fortran/parse.c:5506
0x63449d gfc_parse_file()
../../source/gcc/fortran/parse.c:5724
0x675792 gfc_be_parse_file
../../source/gcc/fortran/f95-lang.c:209

Re: revised and updated new-if-converter patch… [PATCH] fix PR46029: reimplement if conversion of loads and stores

2015-07-20 Thread Alan Lawrence

Abe wrote:


diff --git a/gcc/testsuite/gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c 
b/gcc/testsuite/gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c
index 71f2db3..2b159d7 100644
--- a/gcc/testsuite/gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c
+++ b/gcc/testsuite/gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c
@@ -65,4 +65,12 @@ main (void)
return 0;
  }

-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect"  { xfail { { 
vect_no_align && { ! vect_hw_misalign } } || { ! vect_strided2 } } } } } */
+/* foo() is not vectorized FOR NOW because gather-load
+   cannot handle conditional gather loads as of June 2015.
+
+   The old way of if-converting loads was unsafe because
+   it resulted in thread-unsafe code where the as-written code was OK.
+   The old if conversion did not contain indirection in the loads,
+   so it was simpler, therefor the vectorizer was able to pick it up.  */
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 0 "vect"  { xfail { { 
vect_no_align && { ! vect_hw_misalign } } || { ! vect_strided2 } } } } } */


Would a testsuite predicate for targets that support gather loads let you
run this test on those architectures? I'd expect one to be useful in a few
other places too in time, if it doesn't exist already...


--Alan



[Bug debug/66468] [6 Regression] ICE in in check_die, at dwarf2out.c:5719

2015-07-20 Thread jason at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66468

--- Comment #7 from Jason Merrill jason at gcc dot gnu.org ---
(In reply to Aldy Hernandez from comment #5)
 It was my understanding that DW_AT_inline cannot appear in conjunction
 with location info.

Right.

A debugging information entry that is a member of an abstract instance tree
should not contain any attributes which describe aspects of the subroutine
which vary between distinct inlined expansions or distinct out-of-line
expansions.

I'll take a look.


[Bug sanitizer/66908] Uninitialized variable when compiled with UBsan

2015-07-20 Thread y.gribov at samsung dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66908

--- Comment #1 from Yury Gribov y.gribov at samsung dot com ---
Looks like -fsanitize=bounds may introduce uninitialized variables when run
after shift.


[PATCH, i386]: Fix asm flag output to DImode for 32bit targets

2015-07-20 Thread Uros Bizjak
2015-07-20  Uros Bizjak  ubiz...@gmail.com

* config/i386/i386.c (ix86_md_asm_adjust): Handle DImode dest_mode
for !TARGET_64BIT.

testsuite/ChangeLog:

2015-07-20  Uros Bizjak  ubiz...@gmail.com

* gcc.target/i386/asm-flag-5.c (f_ll): New.

Bootstrapped and regression tested on x86_64-linux-gnu {,m32}.

Committed to mainline SVN.

Uros.
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 225982)
+++ config/i386/i386.c  (working copy)
@@ -45861,6 +45861,10 @@ ix86_md_asm_adjust (vecrtx outputs, vecrtx /
  error ("invalid type for asm flag output");
  continue;
}
+
+  if (dest_mode == DImode && !TARGET_64BIT)
+   dest_mode = SImode;
+
   if (dest_mode != QImode)
{
  rtx destqi = gen_reg_rtx (QImode);
@@ -45877,7 +45881,16 @@ ix86_md_asm_adjust (vecrtx outputs, vecrtx /
  else
x = gen_rtx_ZERO_EXTEND (dest_mode, destqi);
}
-  emit_insn (gen_rtx_SET (dest, x));
+
+  if (dest_mode != GET_MODE (dest))
+   {
+ rtx tmp = gen_reg_rtx (SImode);
+
+ emit_insn (gen_rtx_SET (tmp, x));
+ emit_insn (gen_zero_extendsidi2 (dest, tmp));
+   }
+  else
+   emit_insn (gen_rtx_SET (dest, x));
 }
   rtx_insn *seq = get_insns ();
   end_sequence ();
Index: testsuite/gcc.target/i386/asm-flag-5.c
===
--- testsuite/gcc.target/i386/asm-flag-5.c  (revision 225982)
+++ testsuite/gcc.target/i386/asm-flag-5.c  (working copy)
@@ -7,6 +7,7 @@ void f_c(void) { char x; asm("" : "=@ccc"(x)); }
 void f_s(void) { short x; asm("" : "=@ccc"(x)); }
 void f_i(void) { int x; asm("" : "=@ccc"(x)); }
 void f_l(void) { long x; asm("" : "=@ccc"(x)); }
+void f_ll(void) { long long x; asm("" : "=@ccc"(x)); }
 
 void f_f(void)
 {


Re: [gomp4.1] Initial support for some OpenMP 4.1 construct parsing

2015-07-20 Thread Ilya Verbin
On Fri, Jul 17, 2015 at 18:43:06 +0200, Jakub Jelinek wrote:
 On Fri, Jul 17, 2015 at 07:31:36PM +0300, Ilya Verbin wrote:
  One big question is who will maintain the list of scheduled job, its
  dependencies, etc. - libgomp or each target plugin?
  
  
  OpenACC has async queues:
  #pragma acc parallel async(2) wait(1)
  
  But it's not possible to have 2 waits like:
  #pragma acc parallel async(3) wait(1) wait(2)
  
  (GOMP_OFFLOAD_openacc_async_wait_async has only one argument with the 
  number of
  queue to wait)
  
  Thomas, please correct me if I'm wrong.
  
  In this regard, OpenMP is more complicated, since it allows e.g.:
  #pragma omp target nowait depend(in: a, b) depend(out: c, d)
 
 If it is each plugin, then supposedly it should use (if possible) some
 common libgomp routine to maintain the queues, duplicating the dependency
 graph handling code in each plugins might be too ugly.
 
  Currently I'm trying to figure out what liboffloadmic can do.

Latest liboffloadmic (I'm preparing an update for trunk) can take some pointer
*ptr* as argument of __offload_offload, which is used for execution and data
transfer.  When given job is finished, it will call some callback in libgomp on
host, passing *ptr* back to it, thus libgomp can distinguish which job has
been finished.  BTW, which word to use here to avoid confusion? (task? job?)

I'm going to prototype something in libgomp using this interface.
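
To make the callback idea above concrete, here is a minimal sketch in plain C.
It is not liboffloadmic or libgomp code: struct job, completion_fn,
fake_offload and job_done are all hypothetical names, and the "offload" runs
synchronously.  It only shows how handing the same opaque pointer back lets
the host tell which job has finished.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical job record kept by the host runtime; not a libgomp type.  */
struct job
{
  int id;
  int finished;
};

/* Hypothetical completion callback type: the offload layer hands back
   the same opaque pointer it was given at submission time.  */
typedef void (*completion_fn) (void *ptr);

/* Stand-in for the offload library: it "runs" the job synchronously and
   then reports completion through CB, passing PTR back untouched.  */
void
fake_offload (void *ptr, completion_fn cb)
{
  /* ... execution and data transfer would happen here ...  */
  cb (ptr);
}

/* Host-side callback: PTR identifies which job has finished.  */
void
job_done (void *ptr)
{
  struct job *j = (struct job *) ptr;
  j->finished = 1;
}
```

A caller would submit with fake_offload (&some_job, job_done) and later look
at some_job.finished; a real runtime would presumably update its dependency
graph in the callback instead of setting a flag.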

  -- Ilya


Re: [gomp4.1] Initial support for some OpenMP 4.1 construct parsing

2015-07-20 Thread Jakub Jelinek
Hi!

And here is untested incremental libgomp side of the proposed
GOMP_MAP_FIRSTPRIVATE_POINTER.
I'll probably tweak it so that if GOMP_MAP_FIRSTPRIVATE
is followed by one or more GOMP_MAP_FIRSTPRIVATE_POINTER where
the pointers fall into the extents of the firstprivate object,
it will make the whole object firstprivate and then overwrite
the pointers in it (somewhat similar to GOMP_MAP_TO_PSET).

--- include/gomp-constants.h.jj 2015-07-20 12:27:58.0 +0200
+++ include/gomp-constants.h2015-07-20 19:43:47.734326985 +0200
@@ -74,6 +74,8 @@ enum gomp_map_kind
 GOMP_MAP_FORCE_DEVICEPTR = (GOMP_MAP_FLAG_SPECIAL_1 | 0),
 /* Do not map, copy bits for firstprivate instead.  */
 GOMP_MAP_FIRSTPRIVATE =(GOMP_MAP_FLAG_SPECIAL | 0),
+/* Do not map, but pointer assign a pointer instead.  */
+GOMP_MAP_FIRSTPRIVATE_POINTER =(GOMP_MAP_FLAG_SPECIAL | 1),
 /* Allocate.  */
 GOMP_MAP_FORCE_ALLOC = (GOMP_MAP_FLAG_FORCE | GOMP_MAP_ALLOC),
 /* ..., and copy to device.  */
--- libgomp/target.c.jj 2015-07-20 16:03:20.0 +0200
+++ libgomp/target.c2015-07-20 19:57:38.735556137 +0200
@@ -276,12 +276,8 @@ gomp_map_vars (struct gomp_device_descr
  tgt->list[i].key = NULL;
  continue;
}
-  cur_node.host_start = (uintptr_t) hostaddrs[i];
-  if (!GOMP_MAP_POINTER_P (kind & typemask))
-   cur_node.host_end = cur_node.host_start + sizes[i];
-  else
-   cur_node.host_end = cur_node.host_start + sizeof (void *);
-  if ((kind & typemask) == GOMP_MAP_FIRSTPRIVATE)
+  if ((kind & typemask) == GOMP_MAP_FIRSTPRIVATE
+ || (kind & typemask) == GOMP_MAP_FIRSTPRIVATE_POINTER)
{
  tgt->list[i].key = NULL;
 
@@ -289,10 +285,18 @@ gomp_map_vars (struct gomp_device_descr
  if (tgt_align < align)
tgt_align = align;
  tgt_size = (tgt_size + align - 1) & ~(align - 1);
- tgt_size += cur_node.host_end - cur_node.host_start;
+ if ((kind & typemask) == GOMP_MAP_FIRSTPRIVATE_POINTER)
+   tgt_size += sizeof (void *);
+ else
+   tgt_size += sizes[i];
  has_firstprivate = true;
  continue;
}
+  cur_node.host_start = (uintptr_t) hostaddrs[i];
+  if (!GOMP_MAP_POINTER_P (kind & typemask))
+   cur_node.host_end = cur_node.host_start + sizes[i];
+  else
+   cur_node.host_end = cur_node.host_start + sizeof (void *);
   splay_tree_key n = splay_tree_lookup (mem_map, &cur_node);
   if (n)
gomp_map_vars_existing (devicep, n, &cur_node, &tgt->list[i],
@@ -374,15 +378,28 @@ gomp_map_vars (struct gomp_device_descr
int kind = get_kind (short_mapkind, kinds, i);
if (hostaddrs[i] == NULL)
  continue;
-   if ((kind & typemask) == GOMP_MAP_FIRSTPRIVATE)
+   if ((kind & typemask) == GOMP_MAP_FIRSTPRIVATE
+   || (kind & typemask) == GOMP_MAP_FIRSTPRIVATE_POINTER)
  {
size_t align = (size_t) 1 << (kind >> rshift);
tgt_size = (tgt_size + align - 1) & ~(align - 1);
tgt->list[i].offset = tgt_size;
-   size_t len = sizes[i];
-   devicep->host2dev_func (devicep->target_id,
-   (void *) (tgt->tgt_start + tgt_size),
-   (void *) hostaddrs[i], len);
+   size_t len;
+   if ((kind & typemask) == GOMP_MAP_FIRSTPRIVATE_POINTER)
+ {
+   len = sizeof (void *);
+   gomp_map_pointer (tgt, (uintptr_t)
+  *(void **) (hostaddrs[i]),
+ tgt_size, sizes[i]);
+ }
+   else
+ {
+   len = sizes[i];
+   devicep->host2dev_func (devicep->target_id,
+   (void *) (tgt->tgt_start
+ + tgt_size),
+   (void *) hostaddrs[i], len);
+ }
tgt_size += len;
continue;
  }


Jakub


[Bug tree-optimization/66768] address space gets lost on literal pointer

2015-07-20 Thread gjl at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66768

Georg-Johann Lay gjl at gcc dot gnu.org changed:

   What|Removed |Added

   Keywords||addr-space
 CC||gjl at gcc dot gnu.org

--- Comment #10 from Georg-Johann Lay gjl at gcc dot gnu.org ---
Guess the target should be x86?


Re: [RFC, PR66873] Use graphite for parloops

2015-07-20 Thread Sebastian Pop
Tom de Vries wrote:
 graphite dependence analysis is too slow to be enabled unconditionally.
 (read: hours in some simple cases - see bugzilla)
 
 Haha, cool!  ;-)
 
 Maybe it is still reasonable to use graphite to analyze the code inside
 OpenACC kernels regions -- maybe such code can reasonably be expected to
 not have the properties that make its analysis lengthy?  So, Tom, could
 you please identify and check such PRs, to get an understanding of what
 these properties are?
 
 Like the one in PR62113 or 53852 or 59121.
 
 PR62113 and PR59121 do not reproduce for me on trunk.
 
 PR53852 does reproduce for me (to the point that I had to reset my laptop).

ISL has a way to count the number of operations: once a watermark is
exceeded, it outputs an error code that we can use to leave graphite; see the
documentation of isl_ctx_set_max_operations().  With that mechanism we can set
a goal for graphite of at most (say) 10% overhead on the whole compilation time.
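
The watermark idea can be sketched without isl at all.  The code below is a
self-contained illustration only: struct op_budget, op_budget_charge and
analyze are hypothetical names, not isl API, and isl keeps the equivalent
state inside its isl_ctx.  The point is just that the analysis gives up with
an error once the budget is spent, so the caller can skip the transformation.

```c
#include <assert.h>

/* Hypothetical budget tracker, for illustration only.  */
struct op_budget
{
  unsigned long used;
  unsigned long max;   /* 0 means "no limit".  */
};

/* Charge one operation; return 0 once the budget is exhausted.  */
int
op_budget_charge (struct op_budget *b)
{
  if (b->max != 0 && b->used >= b->max)
    return 0;
  b->used++;
  return 1;
}

/* Stand-in for an expensive analysis that needs WORK operations.
   Returns 1 on success, 0 if it gave up because the budget ran out.  */
int
analyze (struct op_budget *b, unsigned long work)
{
  for (unsigned long i = 0; i < work; i++)
    if (!op_budget_charge (b))
      return 0;
  return 1;
}
```

A pass would set max from whatever overhead it is willing to pay, call
analyze, and fall back to leaving the code untouched when it returns 0.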





Re: [DWARF] Tracking uninitialized variables

2015-07-20 Thread Michael Eager

On 07/20/2015 09:55 AM, Nikolai Bozhenov wrote:


On 07/17/2015 08:31 PM, Michael Eager wrote:



A related issue is where the breakpoint is taken.  GCC sets breakpoints
at the first instruction generated for a statement, which in this case,
appears to be before any of the arguments to bar are evaluated.  A
possibly better location would be after arguments are evaluated, before
the call is executed.



In this case GDB set the breakpoint at the instruction at 0x0d where
evaluation of the first argument for the call is performed. I'm not
sure that there is a less confusing way to choose an address to set a
breakpoint. For example I don't think it is a good idea to ignore
evaluation of function arguments and set a breakpoint right at the
call instruction. But even if there is a better way, such new feature
is likely to be implemented in GDB rather than in GCC.


Debugging optimized programs is difficult, and deciding the best
location to set a breakpoint is a matter for some thought.  (See
Caroline Tice's dissertation.)  If none of the arguments is evaluated
until after the first instruction of a function call, then you cannot
display the argument values if that is your breakpoint.

No matter what location you feel is the best for a breakpoint, it is
the compiler which generates the line number table and indicates where
a breakpoint should be placed, not the debugger.  Don't expect anything
in GDB which will do something when GCC provides incomplete or
inaccurate information.


--
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077


Re: [PATCH 3/4] S390 -march=native related fixes

2015-07-20 Thread Dominik Vogt
On Sat, Jul 18, 2015 at 01:09:26AM +0200, Ulrich Weigand wrote:
 Dominik Vogt wrote:
 
  +  opt_esa_zarch = (has_highgprs) ? " -mzarch" : " -mesa";
 
 This will force -mesa on old machines *even in -m64 mode*,
 which is wrong and will cause compilation to fail.
 
 
  -/* Defaulting rules.  */
   #ifdef DEFAULT_TARGET_64BIT
  -#define DRIVER_SELF_SPECS  \
 
 This completely removes use of DRIVER_SELF_SPECS for defaulting,
 which I introduced back here:
 https://gcc.gnu.org/ml/gcc-patches/2003-06/msg03369.html
...

New version of the patch and ChangeLog attached, as discussed
internally:

 * -march=native never sets -mvx or -mhtm but disables these
   features if the cpu does not support them.  This always wins
   against any options set on the command line.

 * -march=native never sets -mesa, and appends -mzarch to the
   command line if the highgprs cpu flag is present and the command
   line does not specify -mesa (or -mzarch).
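
The two rules above, written out as a small sketch.  This is not the code
from the attached patch: native_esa_zarch_option and native_vx_option are
hypothetical helpers, and the sketch assumes the feature is disabled by
appending the -mno-* form of the option.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: decide what -march=native appends for the
   esa/zarch choice.  HAS_HIGHGPRS is non-zero if the highgprs cpu flag
   was found; USER_CHOSE_MODE is non-zero if the command line already
   has -mesa or -mzarch.  */
const char *
native_esa_zarch_option (int has_highgprs, int user_chose_mode)
{
  /* Never emit -mesa; only add -mzarch when the cpu supports it and
     the user has not decided already.  */
  if (has_highgprs && !user_chose_mode)
    return " -mzarch";
  return "";
}

/* Hypothetical helper: -mvx is never switched on, only off.  */
const char *
native_vx_option (int has_vx)
{
  return has_vx ? "" : " -mno-vx";
}
```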

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
gcc/ChangeLog

* config/s390/driver-native.c (s390_host_detect_local_cpu): Handle
processor capabilities with -march=native.
* config/s390/s390.h (MARCH_MTUNE_NATIVE_SPECS): Likewise.
(DRIVER_SELF_SPECS): Likewise.  Join specs for 31 and 64 bit.
* (S390_TARGET_BITS_STRING): Macro to simplify specs.
From 2120164854639bee14053d7cd418fdbb7ebfc8b7 Mon Sep 17 00:00:00 2001
From: Dominik Vogt v...@linux.vnet.ibm.com
Date: Mon, 6 Jul 2015 16:28:32 +0100
Subject: [PATCH 3/4] S390: Handle processor capabilities with -march=native.

---
 gcc/config/s390/driver-native.c | 143 
 gcc/config/s390/s390.h  |  28 
 2 files changed, 129 insertions(+), 42 deletions(-)

diff --git a/gcc/config/s390/driver-native.c b/gcc/config/s390/driver-native.c
index 88c76bd..5f7fe0a 100644
--- a/gcc/config/s390/driver-native.c
+++ b/gcc/config/s390/driver-native.c
@@ -42,6 +42,16 @@ s390_host_detect_local_cpu (int argc, const char **argv)
   char buf[256];
   FILE *f;
   bool arch;
+  const char *options = "";
+  unsigned int has_features;
+  unsigned int has_processor;
+  unsigned int is_cpu_z9_109 = 0;
+  unsigned int has_highgprs = 0;
+  unsigned int has_dfp = 0;
+  unsigned int has_te = 0;
+  unsigned int has_vx = 0;
+  unsigned int has_opt_esa_zarch = 0;
+  int i;
 
   if (argc < 1)
 return NULL;
@@ -49,43 +59,120 @@ s390_host_detect_local_cpu (int argc, const char **argv)
   arch = strcmp (argv[0], "arch") == 0;
   if (!arch && strcmp (argv[0], "tune"))
 return NULL;
+  for (i = 1; i < argc; i++)
+if (strcmp (argv[i], "mesa_mzarch") == 0)
+  has_opt_esa_zarch = 1;
 
   f = fopen ("/proc/cpuinfo", "r");
   if (f == NULL)
 return NULL;
 
-  while (fgets (buf, sizeof (buf), f) != NULL)
-if (strncmp (buf, "processor", sizeof ("processor") - 1) == 0)
-  {
-	if (strstr (buf, "machine = 9672") != NULL)
-	  cpu = "g5";
-	else if (strstr (buf, "machine = 2064") != NULL
-		 || strstr (buf, "machine = 2066") != NULL)
-	  cpu = "z900";
-	else if (strstr (buf, "machine = 2084") != NULL
-		 || strstr (buf, "machine = 2086") != NULL)
-	  cpu = "z990";
-	else if (strstr (buf, "machine = 2094") != NULL
-		 || strstr (buf, "machine = 2096") != NULL)
-	  cpu = "z9-109";
-	else if (strstr (buf, "machine = 2097") != NULL
-		 || strstr (buf, "machine = 2098") != NULL)
-	  cpu = "z10";
-	else if (strstr (buf, "machine = 2817") != NULL
-		 || strstr (buf, "machine = 2818") != NULL)
-	  cpu = "z196";
-	else if (strstr (buf, "machine = 2827") != NULL
-		 || strstr (buf, "machine = 2828") != NULL)
-	  cpu = "zEC12";
-	else if (strstr (buf, "machine = 2964") != NULL)
-	  cpu = "z13";
-	break;
-  }
+  for (has_features = 0, has_processor = 0;
+   (has_features == 0 || has_processor == 0)
+	 && fgets (buf, sizeof (buf), f) != NULL; )
+{
+  if (has_processor == 0 && strncmp (buf, "processor", 9) == 0)
+	{
+	  const char *p;
+	  long machine_id;
+
+	  p = strstr (buf, "machine = ");
+	  if (p == NULL)
+	continue;
+	  p += 10;
+	  has_processor = 1;
+	  machine_id = strtol (p, NULL, 16);
+	  switch (machine_id)
+	{
+	case 0x9672:
+	  cpu = "g5";
+	  break;
+	case 0x2064:
+	case 0x2066:
+	  cpu = "z900";
+	  break;
+	case 0x2084:
+	case 0x2086:
+	  cpu = "z990";
+	  break;
+	case 0x2094:
+	case 0x2096:
+	  cpu = "z9-109";
+	  is_cpu_z9_109 = 1;
+	  break;
+	case 0x2097:
+	case 0x2098:
+	  cpu = "z10";
+	  break;
+	case 0x2817:
+	case 0x2818:
+	  cpu = "z196";
+	  break;
+	case 0x2827:
+	case 0x2828:
+	  cpu = "zEC12";
+	  break;
+	case 0x2964:
+	  cpu = "z13";
+	  break;
+	}
+	}
+  if (has_features == 0 && strncmp (buf, "features", 8) == 0)
+	{
+	  const char *p;
+
+	  p = strchr (buf, ':');
+	  if (p == NULL)
+	continue;
+	  p++;
+	  while (*p != 0)
+	{
+	  int i;
+
+	  while (ISSPACE (*p))
+		p++;
+	  for (i = 0; !ISSPACE (p[i]) && p[i] != 0; i++)
+		

[Bug target/66217] PowerPC rotate/shift/mask instructions not optimal

2015-07-20 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66217

--- Comment #2 from Segher Boessenkool segher at gcc dot gnu.org ---
Author: segher
Date: Mon Jul 20 16:30:56 2015
New Revision: 226005

URL: https://gcc.gnu.org/viewcvs?rev=226005&root=gcc&view=rev
Log:
PR target/66217
* config/rs6000/constraints.md (S, T, t): Delete.  Update
available letters comment.
* config/rs6000/predicates.md (mask_operand, mask_operand_wrap,
mask64_operand, mask64_2_operand, any_mask_operand, and64_2_operand,
and_2rld_operand):  Delete.
(and_operand): Adjust.
(rotate_mask_operator): New.
* config/rs6000/rs6000-protos.h (build_mask64_2_operands,
includes_lshift_p, includes_rshift_p, includes_rldic_lshift_p,
includes_rldicr_lshift_p, insvdi_rshift_rlwimi_p, extract_MB,
extract_ME): Delete.
(rs6000_is_valid_mask, rs6000_is_valid_and_mask,
rs6000_is_valid_shift_mask, rs6000_is_valid_insert_mask,
rs6000_insn_for_and_mask, rs6000_insn_for_shift_mask,
rs6000_insn_for_insert_mask, rs6000_is_valid_2insn_and,
rs6000_emit_2insn_and): New.
* config/rs6000/rs6000.c (num_insns_constant): Adjust.
(build_mask64_2_operands, includes_lshift_p, includes_rshift_p,
includes_rldic_lshift_p, includes_rldicr_lshift_p,
insvdi_rshift_rlwimi_p, extract_MB, extract_ME): Delete.
(rs6000_is_valid_mask, rs6000_is_valid_and_mask,
rs6000_insn_for_and_mask, rs6000_is_valid_shift_mask,
rs6000_insn_for_shift_mask, rs6000_is_valid_insert_mask,
rs6000_insn_for_insert_mask, rs6000_is_valid_2insn_and,
rs6000_emit_2insn_and): New.
(print_operand) <'b', 'B', 'm', 'M', 's', 'S', 'W'>: Delete.
(rs6000_rtx_costs) <CONST_INT>: Delete mask_operand and mask64_operand
handling.
<NOT>: Don't fall through to next case.
<AND>: Handle the various rotate-and-mask cases directly.
<IOR>: Always cost as one insn.
* config/rs6000/rs6000.md (splitter for bswap:SI): Adjust.
(and<mode>3): Adjust expander for the new patterns.
(and<mode>3_imm, and<mode>3_imm_dot, and<mode>3_imm_dot2,
and<mode>3_imm_mask_dot, and<mode>3_imm_mask_dot2): Adjust condition.
(*and<mode>3_imm_dot_shifted): New.
(*and<mode>3_mask): Delete, rewrite as ...
(and<mode>3_mask): ... New.
(*and<mode>3_mask_dot, *and<mode>3_mask_dot): Rewrite.
(andsi3_internal0_nomc): Delete.
(*andsi3_internal6): Delete.
(*and<mode>3_2insn): New.
(insv, insvsi_internal, *insvsi_internal1, *insvsi_internal2,
*insvsi_internal3, *insvsi_internal4, *insvsi_internal5,
*insvsi_internal6, insvdi_internal, *insvdi_internal2,
*insvdi_internal3): Delete.
(*rotl<mode>3_mask, *rotl<mode>3_mask_dot, *rotl<mode>3_mask_dot2,
*rotl<mode>3_insert, *rotl<mode>3_insert_2, *rotl<mode>3_insert_3,
*rotl<mode>3_insert_4, two splitters for multi-precision shifts,
*ior<mode>_mask): New.
(extzv, extzvdi_internal, *extzvdi_internal1, *extzvdi_internal2,
*rotlsi3_mask, *rotlsi3_mask_dot, *rotlsi3_mask_dot2,
*ashlsi3_imm_mask, *ashlsi3_imm_mask_dot, *ashlsi3_imm_mask_dot2,
*lshrsi3_imm_mask, *lshrsi3_imm_mask_dot, *lshrsi3_imm_mask_dot2):
Delete.
(ashr<mode>3): Delete expander.
(*ashr<mode>3): Rename to ...
(ashr<mode>3): ... This.
(ashrdi3_no_power, *ashrdisi3_noppc64be): Delete.
(*rotldi3_internal4, *rotldi3_internal5 and split,
*rotldi3_internal6 and split, *ashldi3_internal4, ashldi3_internal5
and split, *ashldi3_internal6 and split, *ashldi3_internal7,
ashldi3_internal8 and split, *ashldi3_internal9 and split): Delete.
(*anddi3_2rld, *anddi3_2rld_dot, *anddi3_2rld_dot2): Delete.
(splitter for loading a mask): Adjust.
* doc/md.texi (Machine Constraints): Remove q, S, T, t constraints.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/rs6000/constraints.md
trunk/gcc/config/rs6000/predicates.md
trunk/gcc/config/rs6000/rs6000-protos.h
trunk/gcc/config/rs6000/rs6000.c
trunk/gcc/config/rs6000/rs6000.md
trunk/gcc/doc/md.texi


Re: [PATCH] Refactor graphite-isl-ast-to-gimple.c

2015-07-20 Thread Tobias Grosser

On 07/20/2015 06:24 PM, Aditya Kumar wrote:

From: Aditya Kumar hiradi...@msn.com

Refactor graphite-isl-ast-to-gimple.c:
Refactor so that each function can access 'region'. This will help
maintain a parameter rename_map within a region. No functional change intended.
This patch will be followed by another set of patches
where translate_isl_ast_to_gimple::region is used to keep parameters which needs


which need


renaming. Since we are planning to remove limit_scops, we now have to maintain a
set of parameters which needs renaming. This refactoring helps avoid passing
`region' to all the functions in this file.

It passes bootstrap and regtest.


LGTM.

Best,
Tobias



[Bug bootstrap/66947] New: Bootstrap error: Extraneous text after `else' directive

2015-07-20 Thread skunk at iskunk dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66947

Bug ID: 66947
   Summary: Bootstrap error: Extraneous text after `else'
directive
   Product: gcc
   Version: 5.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: bootstrap
  Assignee: unassigned at gcc dot gnu.org
  Reporter: skunk at iskunk dot org
  Target Milestone: ---
  Host: x86_64-unknown-linux-gnu
Target: x86_64-unknown-linux-gnu
 Build: x86_64-unknown-linux-gnu

Bootstrapping 5.2.0 on a 64-bit Linux system:

[...]
config.status: linking /home/src/gcc-5.2.0/libgcc/config/i386/sfp-machine.h to
sfp-machine.h
config.status: linking /home/src/gcc-5.2.0/libgcc/gthr-posix.h to
gthr-default.h
config.status: executing default commands
gmake[3]: Entering directory `/tmp/gcc-build/x86_64-unknown-linux-gnu/libgcc'
/home/src/gcc-5.2.0/libgcc/config/t-softfp:106: Extraneous text after `else'
directive
/home/src/gcc-5.2.0/libgcc/config/t-softfp:113: *** only one `else' per
conditional.  Stop.
gmake[3]: Leaving directory `/tmp/gcc-build/x86_64-unknown-linux-gnu/libgcc'
gmake[2]: *** [all-stage1-target-libgcc] Error 2
gmake[2]: Leaving directory `/tmp/gcc-build'
gmake[1]: *** [stage1-bubble] Error 2
gmake[1]: Leaving directory `/tmp/gcc-build'
gmake: *** [bootstrap-lean] Error 2


$ gmake --version
GNU Make 3.80
Copyright (C) 2002  Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.


I think the else ifneq ... construct in the t-softfp file requires a newer
version of GNU Make. (The GCC Prerequisites page indicates that version 3.80
is sufficient.)


Re: [PATCH][AArch64][3/14] Refactor option override code

2015-07-20 Thread James Greenhalgh
On Thu, Jul 16, 2015 at 04:20:37PM +0100, Kyrill Tkachov wrote:
 Hi all,
 
 This one is more meaty than the previous ones. It buffs up the parsing 
 functions for
 the mcpu, march, mtune options, decouples them and makes them return an enum 
 describing
 the errors that may occur.  This will allow us to use these functions in 
 other contexts
 beyond aarch64_override_options.
 
 aarch64_override_options itself gets an overhaul and is split up into code 
 that must run
 only once after the command line option have been processed, and code that 
 has to be run
 every time the backend-specific state changes (after SWITCHABLE_TARGET is 
 implemented).
 
 The stuff that must be run every time the backend state changes is put into
 aarch64_override_options_internal.
 
 Also, this patch deletes the declarations of aarch64_{arch,cpu,tune}_string 
 from aarch64.opt
 as they are superfluous since the march, mtune and mcpu option specification 
 implicitly
 declares these variables.
 
 This patch looks large, but a lot of it is moving code around...
 
 Bootstrapped and tested as part of the series on aarch64.
 
 Ok for trunk?

I'm a bit hazy on the logic of one part, that is the refactoring of
aarch64_override_options_after_change into
aarch64_override_options_after_change_1.

It seems like if we have this hunk:

 diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
 index 32b974a..5ea65e3 100644
 --- a/gcc/config/aarch64/aarch64.c
 +++ b/gcc/config/aarch64/aarch64.c
 @@ -7473,85 +7507,253 @@ aarch64_parse_override_string (const char* 
 input_string,
free (string_root);
  }
  
 -/* Implement TARGET_OPTION_OVERRIDE.  */
  
  static void
 -aarch64_override_options (void)
 +aarch64_override_options_after_change_1 (struct gcc_options *opts)
  {
 -  /* -mcpu=CPU is shorthand for -march=ARCH_FOR_CPU, -mtune=CPU.
 - If either of -march or -mtune is given, they override their
 - respective component of -mcpu.
 +  if (opts->x_flag_omit_frame_pointer)
 +opts->x_flag_omit_leaf_frame_pointer = false;
 +  else if (opts->x_flag_omit_leaf_frame_pointer)
 +opts->x_flag_omit_frame_pointer = true;
  
 - So, first parse AARCH64_CPU_STRING, then the others, be careful
 - with -march as, if -mcpu is not present on the command line, march
 - must set a sensible default CPU.  */
 -  if (aarch64_cpu_string)
 +  /* If not opzimizing for size, set the default
 + alignment to what the target wants.  */
 +  if (!opts->x_optimize_size)
  {
 -  aarch64_parse_cpu ();
 +  if (opts->x_align_loops <= 0)
 + opts->x_align_loops = aarch64_tune_params.loop_align;
 +  if (opts->x_align_jumps <= 0)
 + opts->x_align_jumps = aarch64_tune_params.jump_align;
 +  if (opts->x_align_functions <= 0)
 + opts->x_align_functions = aarch64_tune_params.function_align;
  }
 +}

Then this code left behind in aarch64_override_options_after_change :

  if (flag_omit_frame_pointer)
flag_omit_leaf_frame_pointer = false;
  else if (flag_omit_leaf_frame_pointer)
flag_omit_frame_pointer = true;

  /* If not optimizing for size, set the default
 alignment to what the target wants */
  if (!optimize_size)
{
  if (align_loops <= 0)
   align_loops = aarch64_tune_params.loop_align;
  if (align_jumps <= 0)
   align_jumps = aarch64_tune_params.jump_align;
  if (align_functions <= 0)
   align_functions = aarch64_tune_params.function_align;
}

is redundant/misleading.
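
To illustrate the concern, here is a cut-down sketch of the shape I would
have expected, with the hook only forwarding to the _1 worker.  It is not
taken from the patch: struct options, global_opts, after_change_1 and
after_change are stand-ins for struct gcc_options, global_options and the
two aarch64 functions, and only the frame pointer logic is kept.

```c
#include <assert.h>
#include <stdbool.h>

/* Cut-down stand-in for struct gcc_options, just for this sketch.  */
struct options
{
  bool omit_frame_pointer;
  bool omit_leaf_frame_pointer;
};

/* Stand-in for the global option state.  */
struct options global_opts;

/* Worker operating on an explicit options struct, in the spirit of
   aarch64_override_options_after_change_1.  */
void
after_change_1 (struct options *opts)
{
  if (opts->omit_frame_pointer)
    opts->omit_leaf_frame_pointer = false;
  else if (opts->omit_leaf_frame_pointer)
    opts->omit_frame_pointer = true;
}

/* The hook keeps no copy of the logic; it only forwards the global
   state to the worker.  */
void
after_change (void)
{
  after_change_1 (&global_opts);
}
```

With that shape there is a single copy of the logic, so the two cannot drift
apart.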

 diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
 index 32b974a..5ea65e3 100644
 --- a/gcc/config/aarch64/aarch64.c
 +++ b/gcc/config/aarch64/aarch64.c
 @@ -7101,12 +7101,27 @@ aarch64_add_stmt_cost (void *data, int count, enum 
 vect_cost_for_stmt kind,
return retval;
  }
  
 -static void initialize_aarch64_code_model (void);
 +static void initialize_aarch64_code_model (struct gcc_options *);
  
 -/* Parse the architecture extension string.  */
 +/* Enum describing the various ways that the
 +   aarch64_parse_{arch,tune,cpu,extension} functions can fail.
 +   This way their callers can choose what kind of error to give.  */
  
 -static void
 -aarch64_parse_extension (char *str)
 +enum aarch64_parse_opt_result
 +{
 +  AARCH64_PARSE_OK,  /* Parsing was successful.  */
 +  AARCH64_PARSE_MISSING_ARG, /* Missing argument.  */
 +  AARCH64_PARSE_INVALID_FEATURE, /* Invalid feature modifier.  */
 +  AARCH64_PARSE_INVALID_ARG  /* Invalid arch, tune, cpu arg.  */
 +};
 +
 +

Extra newline here.

 -/* Parse the ARCH string.  */
 +/* Parse the TO_PARSE string and put the architecture struct that it
 +   selects into RES and the architectural features into ISA_FLAGS.
 +   Return an aarch64_parse_opt_result describing the parse result.
 +   If there is an error parsing, RES and ISA_FLAGS are left unchanged.  */
  
 -static void
 -aarch64_parse_arch (void)
 +static enum aarch64_parse_opt_result
 +aarch64_parse_arch (const char *to_parse, const struct 

[Bug fortran/66942] trans-expr.c:5701 runtime error: member call on null pointer of type 'struct vec'

2015-07-20 Thread kargl at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66942

kargl at gcc dot gnu.org changed:

   What|Removed |Added

 CC||kargl at gcc dot gnu.org

--- Comment #1 from kargl at gcc dot gnu.org ---
(In reply to Vittorio Zecca from comment #0)
 ! gcc-5.2.0/gcc/fortran/trans-expr.c:5701:19: runtime error: member call on
 null pointer of type 'struct vec'
 ! gfortran source line retargs->splice (arglist);
 ! retargs is NULL
 ! double check with gcc_assert(retargs); immediately before
   call sub 
   END

What compiler options?  What target?  What is sub?

With 

subroutine sub
end subroutine sub

gfortran 6.0 and 5.2.1 compile and link your code, and it executes
without an error.


Re: Refactor openacc wait routine

2015-07-20 Thread Nathan Sidwell

On 07/19/15 08:37, Nathan Sidwell wrote:

this trunk patch refactors libgomp's goacc_wait, which is used for two different
purposes.

1) when openacc pragmas specify a (non-zero) number of waits.

2) when the wait pragma itself specifies a zero number of waits.

this leads to #2 calling goacc_wait with num_waits=0, and forces #1 to never do
that.

Fixed by breaking out the num_waits == 0 handling from goacc_wait into
GOACC_wait, the wait pragma handler.  I have kept the num_wait=0 checks
elsewhere, but they are now for efficiency rather than correctness.
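
The shape of the split, as a stand-alone sketch rather than the libgomp
code: wait_on_queues and wait_pragma are hypothetical stand-ins for
goacc_wait and GOACC_wait, the counters replace real synchronisation, and
zero waits is assumed here to mean "wait for everything".

```c
#include <assert.h>

/* Counters standing in for real synchronisation, for the sketch only.  */
int waited_all;
int waited_some;

/* Helper: callers guarantee NUM_WAITS is non-zero.  */
void
wait_on_queues (int num_waits)
{
  waited_some += num_waits;
}

/* Handler for the wait pragma itself: the only caller for which zero
   waits is meaningful.  */
void
wait_pragma (int num_waits)
{
  if (num_waits == 0)
    waited_all++;
  else
    wait_on_queues (num_waits);
}
```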


I committed to trunk  gomp with following ChangeLog:


2015-07-20  Nathan Sidwell  nat...@codesourcery.com

* oacc-parallel.c (GOACC_parallel): Move variadic handling into
wait=-specific if.
(GOACC_enter_exit_data, GOACC_update): Use consistent num_waits
!=0 condition.
(goacc_waits): Move !num_waits handling to ...
(GOACC_wait): ... here, the only caller that might have zero waits.



[PATCH 0/3] rs6000: Some updates for rotate etc.

2015-07-20 Thread Segher Boessenkool
Hi all,

Two updates for the rotate revamp, and a third patch that won't apply
without it.  I'll fold the first two together with the big patch, if
approved.

Everything bootstrapped and tested on powerpc64-linux, as usual; no
regressions.


Segher


Segher Boessenkool (3):
  Doc fixes for rot
  Fix shift amount (GPR-SI)
  lt0_disi

 gcc/config/rs6000/constraints.md |  2 +-
 gcc/config/rs6000/rs6000.md  | 35 +++
 gcc/doc/md.texi  | 12 
 3 files changed, 24 insertions(+), 25 deletions(-)

-- 
1.8.1.4



Re: [PING][PATCH, 1/2] Merge rewrite_virtuals_into_loop_closed_ssa from gomp4 branch

2015-07-20 Thread Tom de Vries

On 09/07/15 13:04, Richard Biener wrote:

On Thu, 9 Jul 2015, Tom de Vries wrote:


On 07/07/15 17:58, Tom de Vries wrote:

If you can
handle one exit edge I also can't see the difficulty in handling
all exit edges.



Agreed, that doesn't look too complicated. I could call
rewrite_virtuals_into_loop_closed_ssa for all loops in
rewrite_virtuals_into_loop_closed_ssa, to get non-single_dom_exit loops
exercising the code, and fix what breaks.


Hmm, I just realised, it's more complicated than I thought.

In loops with single_dom_exit, the exit dominates the uses outside the loop,
so I can replace the uses of the def with the uses of the exit phi result.

If !single_dom_exit, the exit(s) may not dominate all uses, and I need to
insert non-loop-exit phi nodes to deal with that.


Yes.  This is why I originally suggested to amend the regular
loop-close-SSA rewriting code.



This patch renames rewrite_into_loop_closed_ssa to 
rewrite_into_loop_closed_ssa_1, and adds arguments:

- a loop argument, to limit the defs for which the uses are
  rewritten
- a use_flags argument, to determine the type of uses rewritten:
  SSA_OP_USE/SSA_OP_VIRTUAL_USES/SSA_OP_ALL_USES

The original rewrite_into_loop_closed_ssa is reimplemented using 
rewrite_into_loop_closed_ssa_1.


And the !single_dom_exit case of rewrite_into_loop_closed_ssa is 
implemented using rewrite_into_loop_closed_ssa_1. [ The patch was tested 
as attached, always using rewrite_into_loop_closed_ssa_1, otherwise it 
would not be triggered. ]


Bootstrapped and reg-tested on x86_64.

Is this sort of what you had in mind?

Thanks,
- Tom

Implement rewrite_virtuals_into_loop_closed_ssa for !single_dom_exit

---
 gcc/tree-ssa-loop-manip.c | 146 --
 1 file changed, 129 insertions(+), 17 deletions(-)

diff --git a/gcc/tree-ssa-loop-manip.c b/gcc/tree-ssa-loop-manip.c
index cb762df..2b085df 100644
--- a/gcc/tree-ssa-loop-manip.c
+++ b/gcc/tree-ssa-loop-manip.c
@@ -407,7 +407,8 @@ find_uses_to_rename_use (basic_block bb, tree use, bitmap *use_blocks,
NEED_PHIS.  */
 
 static void
-find_uses_to_rename_stmt (gimple stmt, bitmap *use_blocks, bitmap need_phis)
+find_uses_to_rename_stmt (gimple stmt, bitmap *use_blocks, bitmap need_phis,
+			  int use_flags)
 {
   ssa_op_iter iter;
   tree var;
@@ -416,8 +417,16 @@ find_uses_to_rename_stmt (gimple stmt, bitmap *use_blocks, bitmap need_phis)
   if (is_gimple_debug (stmt))
 return;
 
-  FOR_EACH_SSA_TREE_OPERAND (var, stmt, iter, SSA_OP_USE)
-find_uses_to_rename_use (bb, var, use_blocks, need_phis);
+  /* Iterator does not allow SSA_OP_VIRTUAL_USES only.  */
+  if (use_flags == SSA_OP_VIRTUAL_USES)
+{
+  tree vuse = gimple_vuse (stmt);
+  if (vuse != NULL_TREE)
+	find_uses_to_rename_use (bb, gimple_vuse (stmt), use_blocks, need_phis);
+}
+  else
+FOR_EACH_SSA_TREE_OPERAND (var, stmt, iter, use_flags)
+  find_uses_to_rename_use (bb, var, use_blocks, need_phis);
 }
 
 /* Marks names that are used in BB and outside of the loop they are
@@ -426,24 +435,30 @@ find_uses_to_rename_stmt (gimple stmt, bitmap *use_blocks, bitmap need_phis)
need exit PHIs in NEED_PHIS.  */
 
 static void
-find_uses_to_rename_bb (basic_block bb, bitmap *use_blocks, bitmap need_phis)
+find_uses_to_rename_bb (basic_block bb, bitmap *use_blocks, bitmap need_phis,
+			int use_flags)
 {
   edge e;
   edge_iterator ei;
+  bool do_virtuals = (use_flags & SSA_OP_VIRTUAL_USES) != 0;
+  bool do_nonvirtuals = (use_flags & SSA_OP_USE) != 0;
 
   FOR_EACH_EDGE (e, ei, bb->succs)
 for (gphi_iterator bsi = gsi_start_phis (e->dest); !gsi_end_p (bsi);
 	 gsi_next (&bsi))
   {
 gphi *phi = bsi.phi ();
-	if (! virtual_operand_p (gimple_phi_result (phi)))
+	bool virtual_p = virtual_operand_p (gimple_phi_result (phi));
+	if ((virtual_p && do_virtuals)
+	|| (!virtual_p && do_nonvirtuals))
 	  find_uses_to_rename_use (bb, PHI_ARG_DEF_FROM_EDGE (phi, e),
    use_blocks, need_phis);
   }
 
   for (gimple_stmt_iterator bsi = gsi_start_bb (bb); !gsi_end_p (bsi);
gsi_next (&bsi))
-find_uses_to_rename_stmt (gsi_stmt (bsi), use_blocks, need_phis);
+find_uses_to_rename_stmt (gsi_stmt (bsi), use_blocks, need_phis,
+			  use_flags);
 }
 
 /* Marks names that are used outside of the loop they are defined in
@@ -452,7 +467,8 @@ find_uses_to_rename_bb (basic_block bb, bitmap *use_blocks, bitmap need_phis)
scan only blocks in this set.  */
 
 static void
-find_uses_to_rename (bitmap changed_bbs, bitmap *use_blocks, bitmap need_phis)
+find_uses_to_rename (bitmap changed_bbs, bitmap *use_blocks, bitmap need_phis,
+		 int use_flags)
 {
   basic_block bb;
   unsigned index;
@@ -460,10 +476,76 @@ find_uses_to_rename (bitmap changed_bbs, bitmap *use_blocks, bitmap need_phis)
 
   if (changed_bbs)
 EXECUTE_IF_SET_IN_BITMAP (changed_bbs, 0, index, bi)
-  find_uses_to_rename_bb (BASIC_BLOCK_FOR_FN (cfun, index), use_blocks, need_phis);
+  

Re: making the new if-converter not mangle IR that is already vectorizer-friendly

2015-07-20 Thread Alan Lawrence

Abe wrote:



of course this says nothing about whether there is *some* other ISA that gets 
regressed!


After finishing fixing the known regressions, I intend/plan to reg-test for
AArch64; after that, I think I'm going to need some community help to
reg-test for other ISAs.


OK, I'm confused. When you write "making the new if-converter not mangle 
IR"...does "the new if-converter" mean your scratchpad fix to PR46029, or is 
there some other new if-conversion phase that you are still working on and 
haven't posted yet? If the latter, does this replace the existing 
tree-if-conv.c, or is it an extra stage before that? I haven't yet understood 
what you mean about vectorizer-friendly IR being mangled; is the problem that 
your new phase transforms IR that can currently be if-converted by the existing 
phase, into IR that can't? (Example?) Then I might (only might, sorry!) be 
able to help...


Cheers, Alan



[Bug c++/66944] static thread_local member in class template may cause compilation to fail

2015-07-20 Thread zhykzhykzhyk at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66944

zhykzhykzhyk at gmail dot com changed:

   What|Removed |Added

   Severity|normal  |major


Re: [gomp4.1] Initial support for some OpenMP 4.1 construct parsing

2015-07-20 Thread Jakub Jelinek
On Fri, Jul 17, 2015 at 06:43:06PM +0200, Jakub Jelinek wrote:
> > BTW, do you plan to remove GOMP_MAP_POINTER mappings from array sections?
> > The enter/exit patch for libgomp depends on this change.
>
> My current plan (for Monday and onwards) is to first implement firstprivate
> on target construct, once that works hack on the GOMP_MAP_POINTER
> replacement, and then rewrite the gimplification rules for target construct
> for the new 2.15.5 rules (so that this one does not really break all the
> target tests we need the first two working somehow).

Ok, so here is the first part of that, GOMP_MAP_FIRSTPRIVATE support as a
way to support firstprivate/is_device_ptr clauses on target construct (and
private clause too, though that is compiler only change).
firstprivate VLAs aren't supported yet, but that will be a compiler only
change.
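
For readers unfamiliar with the clause, a minimal sketch of firstprivate on a target construct might look like the following. It assumes a compiler that accepts firstprivate on target (as standardized in later OpenMP releases); add_one_on_target is a made-up name for this example and is not part of the patch or its tests.

```c
/* Compute BASE + 1 inside a target region.  BASE is firstprivate, so
   the region works on its own copy initialized from the host value;
   RESULT is mapped back to the host when the region ends.  */
int
add_one_on_target (int base)
{
  int result = 0;

  /* If OpenMP is not enabled, the pragma is ignored and the block
     simply runs on the host, producing the same value.  */
#pragma omp target firstprivate(base) map(from: result)
  {
    result = base + 1;
  }

  return result;
}
```

The sketch only reads base inside the region, so it behaves the same whether the region is offloaded, run on the host as a fallback, or compiled without OpenMP at all.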

I'll commit this patch tomorrow.

2015-07-20  Jakub Jelinek  ja...@redhat.com

gcc/
* omp-low.c (scan_sharing_clauses): Handle firstprivate
and is_device_ptr clauses on target region.
(lower_omp_target): Handle OMP_CLAUSE_FIRSTPRIVATE,
OMP_CLAUSE_IS_DEVICE_PTR and OMP_CLAUSE_PRIVATE.
include/
* gomp-constants.h (enum gomp_map_kind): Add GOMP_MAP_FIRSTPRIVATE.
libgomp/
* target.c (gomp_map_vars): Handle GOMP_MAP_FIRSTPRIVATE.
* testsuite/libgomp.c/target-13.c: New test.
* testsuite/libgomp.c/target-14.c: New test.
* testsuite/libgomp.c++/target-5.C: New test.
* testsuite/libgomp.c++/target-6.C: New test.

--- gcc/omp-low.c.jj   2015-07-16 18:09:25.0 +0200
+++ gcc/omp-low.c   2015-07-20 17:43:33.271401254 +0200
@@ -1930,6 +1930,10 @@ scan_sharing_clauses (tree clauses, omp_
  else if (!global)
install_var_field (decl, by_ref, 3, ctx);
}
+ else if ((OMP_CLAUSE_CODE (c) == OMP_CLAUSE_FIRSTPRIVATE
+   || OMP_CLAUSE_CODE (c) == OMP_CLAUSE_IS_DEVICE_PTR)
+   && is_gimple_omp_offloaded (ctx->stmt))
+   install_var_field (decl, !is_reference (decl), 3, ctx);
  install_var_local (decl, ctx);
  if (is_gimple_omp_oacc (ctx->stmt)
   && OMP_CLAUSE_CODE (c) == OMP_CLAUSE_REDUCTION)
@@ -12929,6 +12933,21 @@ lower_omp_target (gimple_stmt_iterator *
DECL_HAS_VALUE_EXPR_P (new_var) = 1;
  }
map_cnt++;
+   break;
+
+  case OMP_CLAUSE_FIRSTPRIVATE:
+  case OMP_CLAUSE_IS_DEVICE_PTR:
+   map_cnt++;
+   var = OMP_CLAUSE_DECL (c);
+   if (!is_reference (var)
+   && !is_gimple_reg_type (TREE_TYPE (var)))
+ {
+   x = build_receiver_ref (var, true, ctx);
+   tree new_var = lookup_decl (var, ctx);
+   SET_DECL_VALUE_EXPR (new_var, x);
+   DECL_HAS_VALUE_EXPR_P (new_var) = 1;
+ }
+   break;
   }
 
   if (offloaded)
@@ -12994,7 +13013,8 @@ lower_omp_target (gimple_stmt_iterator *
   for (c = clauses; c ; c = OMP_CLAUSE_CHAIN (c))
switch (OMP_CLAUSE_CODE (c))
  {
-   tree ovar, nc;
+   tree ovar, nc, s, purpose, var, x;
+   unsigned int talign;
 
  default:
break;
@@ -13037,13 +13057,13 @@ lower_omp_target (gimple_stmt_iterator *
  continue;
  }
 
-   unsigned int talign = TYPE_ALIGN_UNIT (TREE_TYPE (ovar));
+   talign = TYPE_ALIGN_UNIT (TREE_TYPE (ovar));
if (DECL_P (ovar) && DECL_ALIGN_UNIT (ovar) > talign)
  talign = DECL_ALIGN_UNIT (ovar);
if (nc)
  {
-   tree var = lookup_decl_in_outer_ctx (ovar, ctx);
-   tree x = build_sender_ref (ovar, ctx);
+   var = lookup_decl_in_outer_ctx (ovar, ctx);
+   x = build_sender_ref (ovar, ctx);
if (maybe_lookup_oacc_reduction (var, ctx))
  {
gcc_checking_assert (offloaded
@@ -13092,11 +13112,11 @@ lower_omp_target (gimple_stmt_iterator *
gimplify_assign (x, var, ilist);
  }
  }
-   tree s = OMP_CLAUSE_SIZE (c);
+   s = OMP_CLAUSE_SIZE (c);
if (s == NULL_TREE)
  s = TYPE_SIZE_UNIT (TREE_TYPE (ovar));
s = fold_convert (size_type_node, s);
-   tree purpose = size_int (map_idx++);
+   purpose = size_int (map_idx++);
CONSTRUCTOR_APPEND_ELT (vsize, purpose, s);
if (TREE_CODE (s) != INTEGER_CST)
  TREE_STATIC (TREE_VEC_ELT (t, 1)) = 0;
@@ -13126,6 +13146,52 @@ lower_omp_target (gimple_stmt_iterator *
build_int_cstu (tkind_type, tkind));
if (nc && nc != c)
  c = nc;
+   break;
+
+ case OMP_CLAUSE_FIRSTPRIVATE:
+ case OMP_CLAUSE_IS_DEVICE_PTR:
+   ovar = OMP_CLAUSE_DECL (c);
+   if (is_reference (ovar))
+ talign = TYPE_ALIGN_UNIT (TREE_TYPE (TREE_TYPE (ovar)));
+  

Re: [PATCH 2/3] Fix shift amount (GPR-SI)

2015-07-20 Thread David Edelsohn
On Mon, Jul 20, 2015 at 12:04 PM, Segher Boessenkool
seg...@kernel.crashing.org wrote:
> This changes the shift amount to always be SI (as it should be), not GPR.
> It doesn't matter for constant shifts, but there are some variable shifts
> as well, and consistency is good.
>
> No changelog, I'll fold it into the previous big patch, if approved.

This is okay.

Thanks, David

