Re: [PATCH] Fix declaration of pthread-structs in s-osinte-rtems.ads (ada/68169)

2015-11-30 Thread Jan Sommer
Could someone with write access please commit the patch?
The paperwork with the FSF has gone through. If something else is missing, 
please tell me.
I won't be available next week.

Best regards,

   Jan

On Tuesday 24 November 2015, 08:47:49, Jan Sommer wrote:
> It has gone through.
> That was why I resubmitted the patch.
> Joel can confirm. Apparently he is on a respective list and saw my paperwork 
> being cleared.
> 
> Best regards,
> 
>Jan
> 
> On Tuesday 24 November 2015, 07:45:30, Sebastian Huber wrote:
> > Hello Jan,
> > 
> > On 23/11/15 23:15, Jan Sommer wrote:
> > > If someone with commit rights could check and push the patches we might 
> > > get it into the next release.
> > 
> > what is the status of your copyright assignment to the FSF which is 
> > required to integrate changes into GCC?
> > 
> > 
> 



Re: regrename/i386: ROP vs df and stack-regs

2015-11-30 Thread Bernd Schmidt

On 11/27/2015 10:02 AM, Bernd Schmidt wrote:

This is a patch for PRs 68471 and 68472, which show problems with the
ROP mitigation:
  * reg-stack doesn't call df_insn_update when it makes changes, and
if df checking is enabled, any subsequent df_analyze call will
abort
  * Using -mcmodel=medium fails because of a pattern that has lea type
and needs its modrm_class overridden.

Both of these are fixed in the i386 backend. As a further safety
measure, I've added some extra code to regrename to ignore stack regs
after regstack_complete - they can't be dealt with anymore.

Bootstrapped and tested on x86_64-linux, with -mmitigate-rop forced on. Ok?



PR target/68471
PR target/68472
* config/i386/i386.c (ix86_mitigate_rop): Don't call
compute_bb_for_insn again.  Call df_insn_rescan_all.
* config/i386/i386.md (set_got_rex64): Override modrm_class.

* regrename.c (build_def_use): Ignore stack regs if regstack_completed.

testsuite/
* gcc.target/i386/rop1.c: New test.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 2ac6c25..14c99eb 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -45243,8 +45243,9 @@ ix86_mitigate_rop (void)
   COPY_HARD_REG_SET (inout_risky, input_risky);
   IOR_HARD_REG_SET (inout_risky, output_risky);

-  compute_bb_for_insn ();
   df_note_add_problem ();
+  /* Fix up what stack-regs did.  */
+  df_insn_rescan_all ();
   df_analyze ();

   regrename_init (true);
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index a57d165..671580d 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -12418,6 +12418,7 @@
   "lea{q}\t{_GLOBAL_OFFSET_TABLE_(%%rip), %0|%0, _GLOBAL_OFFSET_TABLE_[rip]}"
   [(set_attr "type" "lea")
    (set_attr "length_address" "4")
+   (set_attr "modrm_class" "unknown")
    (set_attr "mode" "DI")])

 (define_insn "set_rip_rex64"
--- /dev/null   2015-11-23 12:05:22.553607702 +0100
+++ gcc/testsuite/gcc.target/i386/rop1.c	2015-11-24 15:40:04.381086953 +0100
@@ -0,0 +1,7 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-options "-mcmodel=medium -mmitigate-rop" } */
+void
+foo (void)
+{
+}


Ccing Uros for the i386 bits.


Bernd



Re: regrename/i386: ROP vs df and stack-regs

2015-11-30 Thread Uros Bizjak
On Mon, Nov 30, 2015 at 10:38 PM, Bernd Schmidt wrote:
> On 11/27/2015 10:02 AM, Bernd Schmidt wrote:
>>
>> This is a patch for PRs 68471 and 68472, which show problems with the
>> ROP mitigation:
>>   * reg-stack doesn't call df_insn_update when it makes changes, and
>> if df checking is enabled, any subsequent df_analyze call will
>> abort
>>   * Using -mcmodel=medium fails because of a pattern that has lea type
>> and needs its modrm_class overridden.
>>
>> Both of these are fixed in the i386 backend. As a further safety
>> measure, I've added some extra code to regrename to ignore stack regs
>> after regstack_complete - they can't be dealt with anymore.
>>
>> Bootstrapped and tested on x86_64-linux, with -mmitigate-rop forced on.
>> Ok?
>
>
>> PR target/68471
>> PR target/68472
>> * config/i386/i386.c (ix86_mitigate_rop): Don't call
>> compute_bb_for_insn again.  Call df_insn_rescan_all.
>> * config/i386/i386.md (set_got_rex64): Override modrm_class.
>>
>> * regrename.c (build_def_use): Ignore stack regs if
>> regstack_completed.
>>
>> testsuite/
>> * gcc.target/i386/rop1.c: New test.
>>
>> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
>> index 2ac6c25..14c99eb 100644
>> --- a/gcc/config/i386/i386.c
>> +++ b/gcc/config/i386/i386.c
>> @@ -45243,8 +45243,9 @@ ix86_mitigate_rop (void)
>>    COPY_HARD_REG_SET (inout_risky, input_risky);
>>    IOR_HARD_REG_SET (inout_risky, output_risky);
>>
>> -  compute_bb_for_insn ();
>>    df_note_add_problem ();
>> +  /* Fix up what stack-regs did.  */
>> +  df_insn_rescan_all ();
>>    df_analyze ();
>>
>>    regrename_init (true);
>> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
>> index a57d165..671580d 100644
>> --- a/gcc/config/i386/i386.md
>> +++ b/gcc/config/i386/i386.md
>> @@ -12418,6 +12418,7 @@
>>    "lea{q}\t{_GLOBAL_OFFSET_TABLE_(%%rip), %0|%0, _GLOBAL_OFFSET_TABLE_[rip]}"
>>    [(set_attr "type" "lea")
>>     (set_attr "length_address" "4")
>> +   (set_attr "modrm_class" "unknown")
>>     (set_attr "mode" "DI")])
>>
>>  (define_insn "set_rip_rex64"
>> --- /dev/null   2015-11-23 12:05:22.553607702 +0100
>> +++ gcc/testsuite/gcc.target/i386/rop1.c	2015-11-24 15:40:04.381086953 +0100
>> @@ -0,0 +1,7 @@
>> +/* { dg-do compile } */
>> +/* { dg-require-effective-target lp64 } */
>> +/* { dg-options "-mcmodel=medium -mmitigate-rop" } */
>> +void
>> +foo (void)
>> +{
>> +}
>
>
> Ccing Uros for the i386 bits.

These are OK.

Thanks,
Uros.


Re: [PATCH] Fix PR68067

2015-11-30 Thread Richard Biener
On Fri, 27 Nov 2015, Alan Lawrence wrote:

> On 27/11/15 15:07, Alan Lawrence wrote:
> > On 23/11/15 09:43, Richard Biener wrote:
> > > On Fri, 20 Nov 2015, Alan Lawrence wrote:
> > > 
> > > > ...the asserts
> > > > you suggested in
> > > > (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68117#c27)...
> > > >
> > > > So I have to ask, how sure are you that those assertions are(/should
> > > > be!) "correct"? :)
> > > 
> > > Ideally they should be correct but they happen to be not (and I think
> > > the intent was that this should be harmless).  Basically I tried
> > > to assert that nobody creates stale edge redirect data that is not
> > > later consumed or cleared.  Happens to be too optimistic :/
> > 
> > Maybe so, but it looks like the edge_var_redirect_map is still suspect here.
> > On
> > the ~~28th call to loop_version, from tree_unswitch_loop, the call to
> > lv_flush_pending_stmts executes (tree-cfg.c flush_pending_stmts):
> > 
> > def = redirect_edge_var_map_def (vm);
> > add_phi_arg (phi, def, e, redirect_edge_var_map_location(vm));
> > 
> > and BLOCK_LOCATION (redirect_edge_var_map_location(vm)) is
> > 
> > < 0x7fb7704a80 side-effects addressable asm_written used
> > protected static visited tree_0 tree_2 tree_5>
> > 
> > so yeah, next question, how'd that get there...
> > 
> > A.
> 
> Well, pass_dominator::execute calls redirect_edge_var_map with that edge
> pointer, at which time the edge is from block 32 (0x7fb79cc6e8) to block 20
> (0x7fb7485e38), and locus is 2147483884; and then again, with locus 0.
> 
> With no intervening calls to redirect_edge_var_map_clear for that edge,
> loop_version's call to flush_pending_stmts then reads
> redirect_edge_var_map_vector for that edge pointer - which is now an edge from
> block 126 (0x7fb7485af8) to 117 (0x7fb74856e8). It sees those locations
> (2147483884 and 0)...
> 
> Clearing the edge redirect map at the end of pass_dominator fixes the ICE (as
> would clearing it at the end of each stage, or clearing it at the beginning of
> loop_unswitch, I guess).
> 
> I'll post a patch after more testing, but obviously I'm keen to hear if there
> are obvious problems with the approach?
> 
> And coming up with a testcase, well, heh - this broke because of identical
> pointers to structures allocated at different times, with intervening
> free...ideas welcome of course!

Yeah.  I've pondered clearing the hashmap after each pass
(and hoping no IPA pass would redirect edges).  Or, even more aggressively,
clearing the hashmap as well when we do set_cfun ().

Maybe you can try that?

And no, I don't think any pass expects this stuff to be live across
passes.

Richard.
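
The failure mode discussed in this thread, a side table keyed by pointer that
outlives the object it describes, can be reproduced in isolation.  The sketch
below is not GCC code: struct edge, the one-slot allocator and the map_*
helpers are made up for illustration, and the allocator deliberately returns
the same address every time so that the address reuse (which malloc/free only
produces occasionally) is deterministic.  The block numbers and the locus
value are the ones quoted in the report.

```c
#include <stddef.h>

/* Hypothetical stand-in for a CFG edge; only its address matters here.  */
struct edge { int src, dest; };

/* One-slot "allocator": every allocation reuses the same storage.  */
static struct edge pool_slot;

static struct edge *
edge_alloc (int src, int dest)
{
  pool_slot.src = src;
  pool_slot.dest = dest;
  return &pool_slot;
}

/* Side table keyed by edge address, like a pointer-keyed hash map.  */
#define MAP_SIZE 8
static struct { const struct edge *key; unsigned locus; } map[MAP_SIZE];
static size_t map_len;

static void
map_put (const struct edge *e, unsigned locus)
{
  if (map_len < MAP_SIZE)
    {
      map[map_len].key = e;
      map[map_len].locus = locus;
      map_len++;
    }
}

/* Return 1 and store the locus if E has an entry, else return 0.  */
static int
map_get (const struct edge *e, unsigned *locus)
{
  for (size_t i = 0; i < map_len; i++)
    if (map[i].key == e)
      {
        *locus = map[i].locus;
        return 1;
      }
  return 0;
}

static void
map_clear (void)
{
  map_len = 0;
}

/* "Pass 1" records data for an edge and never consumes it; "pass 2" then
   looks up a different edge that happens to live at the same address.
   Returns 1 if pass 2 sees the stale entry left behind by pass 1.  */
int
later_pass_sees_stale_entry (int clear_between_passes)
{
  unsigned locus = 0;

  map_clear ();
  struct edge *old_edge = edge_alloc (32, 20);
  map_put (old_edge, 2147483884u);

  if (clear_between_passes)
    map_clear ();

  struct edge *new_edge = edge_alloc (126, 117);
  return map_get (new_edge, &locus) && locus == 2147483884u;
}
```

Calling later_pass_sees_stale_entry (0) models the ICE scenario; passing 1
models clearing the map at the end of the pass that filled it.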


Re: [PATCH PR68529]Fix not recognized scev by computing no-overflow info for loop with NE_EXPR exit condition

2015-11-30 Thread Richard Biener
On Sat, Nov 28, 2015 at 6:50 AM, Bin.Cheng wrote:
> On Fri, Nov 27, 2015 at 8:51 PM, Richard Biener wrote:
>> On Fri, Nov 27, 2015 at 12:44 PM, Bin Cheng wrote:
>>> Hi,
>>> This patch is to fix PR68529.  In my previous scev/niter overflow patches, I
>>> only computed no-overflow information for control iv in simple loops with
>>> LT_EXPR as exit condition code.  This bug is about loop with NE_EXPR as exit
>>> condition code.  Given below example:
>>>
>>> #include 
>>> #include 
>>>
>>> int main(){
>>> char c[1]={};
>>> unsigned int nchar=;
>>>
>>> while(nchar--!=0){
>>>c[nchar]='A';
>>>   }
>>>
>>> printf("%s\n",c);
>>> return 0;
>>> }
>>> nchar, used as an index into array 'c', doesn't overflow during loop iterations.
>>> Thus &c[nchar] acts as a scev.  GCC currently fails to recognize that.  With
>>> this patch, this issue is fixed.
>>>
>>> Furthermore, the computation of no-overflow information could be improved by
>>> using TREE_OVERFLOW_UNDEFINED semantic of signed type for C/C++.  I didn't
>>> do that because:
>>> 1) I doubt how useful it could be because I have already changed scev to use
>>> the semantic whenever possible.  It doesn't need loop niter analysis' help.
>>> 2) To do that, I need to expose chrec_convert_aggressive information out of
>>> scev in function simple_iv, because that function could corrupt
>>> TREE_OVERFLOW_UNDEFINED semantic assumption.  This isn't appropriate for
>>> Stage3.
>>>
>>> Bootstrap and test on x86_64 and x86.  I don't expect any issue on aarch64
>>> either.  Is it OK?
>>
>> +  if (integer_onep (e)
>> +  && (integer_onep (s)
>> + || (TREE_CODE (c) == INTEGER_CST
>> + && TREE_CODE (s) == INTEGER_CST
>> + && wi::mod_trunc (c, s, TYPE_SIGN (type)) == 0)))
>>
>> the only thing I'm looking at here is the modulo sign.  Considering
>> we're looking at the sign bit of the step to normalize 'c' and 's' what
>> happens for
>>
>>   for (unsigned int i = 0; i != 1000; --i)
>>
>> ?  I suppose we get s == 1 and c == -1000U and you'll say the control
>> IV doesn't wrap.  Similar for i -= 2 where even when we use a signed
>> modulo (signed)-1000U % 2 is still 0.
>>
>> So I think you need to remember whether we consider the step
>> to be negative and compare iv->base and final as well.
> I think the patch does the monotonic check w.r.t. the sign of the step with the code below:
>
> +  if (tree_int_cst_sign_bit (iv->step))
> +    e = fold_build2 (GE_EXPR, boolean_type_node, iv->base, final);
> +  else
> +    e = fold_build2 (LE_EXPR, boolean_type_node, iv->base, final);
> +  e = simplify_using_initial_conditions (loop, e);
> +  if (integer_onep (e)
>
> It acts as expected with your example.
>
>>
>> Bonus points for a wrong-code testcase with the above.
>>
>> I'd also like to see a testcase exercising step != 1.
> I added two new tests each for "step != 1" and the previous case.  I
> also tuned original pr68529-3.c a little.  Actually for the case in
> the original patch as below:
> +void bar(char *s);
> +int foo(unsigned short l)
> +{
> +  char c[1] = {};
> +  unsigned short nchar = ;
> +
> +  if (nchar < l)
> +    return -1;
> +
> +  while(nchar-- != l)
> +    {
> +      c[nchar] = 'A';
> +    }
> +
> +  bar (c);
> +  return 0;
> +}
>
> The offset IS an affine.  GCC can't detect that because condition
> "nchar (==) < l" is split into two conditions: "l_8 > " and
> "l_8 != ".  For now simplify_using_initial_conditions can't merge
> range information from two different conditions.  Maybe jump threading
> can merge the two condition/jumps, or VRP improvement discussed before
> can handle that.
>
> Here is the updated patch.  Is it OK?

Ok.

Thanks,
Richard.

> Thanks,
> bin
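
The counter-example from the review can be checked on a small scale.  The
sketch below is only an illustration, not part of the patch: it uses a 16-bit
counter (an assumption made so the wrap-around is cheap to run) and the helper
names are hypothetical.  It shows that "the step divides the distance" holds
for the wrapping loop, so that test alone cannot prove the control IV does not
wrap, while a base-versus-final comparison that depends on the sign of the
step rejects it.

```c
#include <stdint.h>

/* Count iterations of "for (i = 0; i != 1000; --i)" with a 16-bit
   unsigned counter.  The counter wraps from 0 to 65535 on the first
   decrement and then walks down to 1000.  */
unsigned long
count_down_from_zero (void)
{
  unsigned long n = 0;
  for (uint16_t i = 0; i != 1000; --i)
    n++;
  return n;
}

/* The divisibility test on its own: does STEP divide DISTANCE?  */
int
step_divides_distance (uint32_t distance, uint32_t step)
{
  return distance % step == 0;
}

/* Hypothetical helper mirroring the monotonic check: a decreasing IV
   must start at or above FINAL, an increasing one at or below it.  */
int
base_on_right_side (uint32_t base, uint32_t final, int step_negative)
{
  return step_negative ? base >= final : base <= final;
}
```

For the loop in question, base is 0, final is 1000 and the step is negative,
so base_on_right_side (0, 1000, 1) is false even though
step_divides_distance (-1000u, 2) is true.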


[PATCH, PR46032] Handle BUILT_IN_GOMP_PARALLEL in ipa-pta

2015-11-30 Thread Tom de Vries

Hi,

this patch fixes PR46032.

It handles a call:
...
  __builtin_GOMP_parallel (fn, data, num_threads, flags)
...
as:
...
  fn (data)
...
in ipa-pta.

This improves ipa-pta alias analysis in the parallelized function fn, 
and allows vectorization in the testcase without a runtime alias test.
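
In source terms, the modelling described above looks roughly like the sketch
below.  This is an illustration under assumptions, not the code GCC
generates: struct omp_data, outlined_fn and fake_parallel are hypothetical
names, and fake_parallel is only a stand-in that calls fn (data) once, whereas
the real runtime entry point starts a team of threads.  The point is that once
the analysis treats the call as fn (data), it can see that the pointers stored
in the data block refer to two distinct local arrays.

```c
#define N_EVENTS 1000

/* Data block built for the outlined region (hypothetical layout).  */
struct omp_data
{
  unsigned *results;
  const unsigned *pData;
  unsigned coeff;
};

/* Outlined loop body: this plays the role of the "fn" argument.  */
static void
outlined_fn (void *arg)
{
  struct omp_data *d = arg;
  for (int idx = 0; idx < N_EVENTS; idx++)
    d->results[idx] = d->coeff * d->pData[idx];
}

/* Stand-in for the runtime call; NOT the libgomp implementation.
   It ignores the thread count and flags and just runs fn (data).  */
static void
fake_parallel (void (*fn) (void *), void *data,
               unsigned num_threads, unsigned flags)
{
  (void) num_threads;
  (void) flags;
  fn (data);
}

/* Same computation as the testcase: fill pData, run the region,
   and return the sum of the results.  */
unsigned
run_region (void)
{
  static unsigned results[N_EVENTS], pData[N_EVENTS];

  for (unsigned i = 0; i < N_EVENTS; i++)
    pData[i] = i % 3;

  struct omp_data d = { results, pData, 2 };
  fake_parallel (outlined_fn, &d, 0, 0);

  unsigned sum = 0;
  for (int idx = 0; idx < N_EVENTS; idx++)
    sum += results[idx];
  return sum;
}
```

With an opaque call, &d would simply escape and results and pData would have
to be assumed to alias inside outlined_fn.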


Bootstrapped and reg-tested on x86_64.

OK for stage3 trunk?

Thanks,
- Tom
Handle BUILT_IN_GOMP_PARALLEL in pta

2015-11-30  Tom de Vries  

	PR tree-optimization/46032
	* tree-ssa-structalias.c (find_func_aliases_for_builtin_call)
	(find_func_clobbers): Handle BUILT_IN_GOMP_PARALLEL.
	(ipa_pta_execute): Same.  Handle node->parallelized_function as a local
	function.

	* gcc.dg/pr46032.c: New test.

	* testsuite/libgomp.c/pr46032.c: New test.

---
 gcc/testsuite/gcc.dg/pr46032.c| 47 ++
 gcc/tree-ssa-structalias.c| 73 ++-
 libgomp/testsuite/libgomp.c/pr46032.c | 44 +
 3 files changed, 162 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/pr46032.c b/gcc/testsuite/gcc.dg/pr46032.c
new file mode 100644
index 000..b91190e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr46032.c
@@ -0,0 +1,47 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fopenmp -ftree-vectorize -std=c99 -fipa-pta -fdump-tree-vect-all" } */
+
+extern void abort (void);
+
+#define nEvents 1000
+
+static void __attribute__((noinline, noclone, optimize("-fno-tree-vectorize")))
+init (unsigned *results, unsigned *pData)
+{
+  unsigned int i;
+  for (i = 0; i < nEvents; ++i)
+    pData[i] = i % 3;
+}
+
+static void __attribute__((noinline, noclone, optimize("-fno-tree-vectorize")))
+check (unsigned *results)
+{
+  unsigned sum = 0;
+  for (int idx = 0; idx < (int)nEvents; idx++)
+    sum += results[idx];
+
+  if (sum != 1998)
+    abort ();
+}
+
+int
+main (void)
+{
+  unsigned results[nEvents];
+  unsigned pData[nEvents];
+  unsigned coeff = 2;
+
+  init (&results[0], &pData[0]);
+
+#pragma omp parallel for
+  for (int idx = 0; idx < (int)nEvents; idx++)
+    results[idx] = coeff * pData[idx];
+
+  check (&results[0]);
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "note: vectorized 1 loop" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-not "versioning for alias required" "vect" } } */
+
diff --git a/gcc/tree-ssa-structalias.c b/gcc/tree-ssa-structalias.c
index f24ebeb..3fe538b 100644
--- a/gcc/tree-ssa-structalias.c
+++ b/gcc/tree-ssa-structalias.c
@@ -4488,6 +4488,39 @@ find_func_aliases_for_builtin_call (struct function *fn, gcall *t)
 	}
 	  return true;
 	}
+  case BUILT_IN_GOMP_PARALLEL:
+	{
+	  /* Handle
+	   __builtin_GOMP_parallel (fn, data, num_threads, flags).  */
+	  if (in_ipa_mode)
+	{
+	  tree fnarg = gimple_call_arg (t, 0);
+	  gcc_assert (TREE_CODE (fnarg) == ADDR_EXPR);
+	  tree fndecl = TREE_OPERAND (fnarg, 0);
+	  varinfo_t fi = get_vi_for_tree (fndecl);
+	  tree arg = gimple_call_arg (t, 1);
+	  gcc_assert (TREE_CODE (arg) == ADDR_EXPR);
+
+	  /* Assign the passed argument to the appropriate incoming
+		 parameter of the function.  */
+	  struct constraint_expr lhs ;
+	  lhs = get_function_part_constraint (fi, fi_parm_base + 0);
+	  auto_vec rhsc;
+	  struct constraint_expr *rhsp;
+	  get_constraint_for_rhs (arg, &rhsc);
+	  while (rhsc.length () != 0)
+		{
+		  rhsp = &rhsc.last ();
+		  process_constraint (new_constraint (lhs, *rhsp));
+		  rhsc.pop ();
+		}
+
+	  return true;
+	}
+	  /* Else fallthru to generic handling which will let
+	 the frame escape.  */
+	  break;
+	}
   /* printf-style functions may have hooks to set pointers to
 	 point to somewhere into the generated string.  Leave them
 	 for a later exercise...  */
@@ -5036,6 +5069,37 @@ find_func_clobbers (struct function *fn, gimple *origt)
 	  case BUILT_IN_VA_START:
 	  case BUILT_IN_VA_END:
 	return;
+	  case BUILT_IN_GOMP_PARALLEL:
+	{
+	  /* Handle
+		   __builtin_GOMP_parallel (fn, data, num_threads, flags).  */
+	  tree fnarg = gimple_call_arg (t, 0);
+	  gcc_assert (TREE_CODE (fnarg) == ADDR_EXPR);
+	  tree fndecl = TREE_OPERAND (fnarg, 0);
+	  varinfo_t cfi = get_vi_for_tree (fndecl);
+	  tree arg = gimple_call_arg (t, 1);
+	  gcc_assert (TREE_CODE (arg) == ADDR_EXPR);
+
+	  /* Parameter passed by value is used.  */
+	  lhs = get_function_part_constraint (fi, fi_uses);
+	  struct constraint_expr *rhsp;
+	  get_constraint_for_address_of (arg, &rhsc);
+	  FOR_EACH_VEC_ELT (rhsc, j, rhsp)
+		process_constraint (new_constraint (lhs, *rhsp));
+	  rhsc.truncate (0);
+
+	  /* The caller clobbers what the callee does.  */
+	  lhs = get_function_part_constraint (fi, fi_clobbers);
+	  rhs = get_function_part_constraint (cfi, fi_clobbers);
+	  process_constraint (new_constraint (lhs, rhs));
+
+	  /* The caller uses what the callee does.  */
+	  lhs = get_function_part_constraint (fi, fi_uses);
+	

Re: [patch] c/c++ asan tests for FreeBSD

2015-11-30 Thread Bernd Schmidt

On 11/29/2015 08:32 PM, Andreas Tobler wrote:

Hi all,

the attached patch prepares the testsuite, c and c++, for the upcoming
ASAN support for FreeBSD (x86_64 first).

I tested the patch on CentOS7.1 x86_64 and on FreeBSD x86_64.
Results can be seen on the list.

Is this ok for trunk?

-/* { dg-do run { target { *-*-linux* } } } */
+/* { dg-do run { target { *-*-linux* *-*-freebsd* } } } */


I see a patch from you to add asan support to x86 freebsd, but what 
about other architectures?



Bernd


Re: [patch] c/c++ asan tests for FreeBSD

2015-11-30 Thread Jakub Jelinek
On Mon, Nov 30, 2015 at 05:17:29PM +0100, Bernd Schmidt wrote:
> On 11/30/2015 01:12 PM, Andreas Tobler wrote:
> >On 30.11.15 11:28, Bernd Schmidt wrote:
> >>On 11/29/2015 08:32 PM, Andreas Tobler wrote:
> >>>-/* { dg-do run { target { *-*-linux* } } } */
> >>>+/* { dg-do run { target { *-*-linux* *-*-freebsd* } } } */
> >>
> >>I see a patch from you to add asan support to x86 freebsd, but what
> >>about other architectures?
> >
> >You mean because of the wildcard? I'll add them as I have time to port
> >them.
> >
> >For now they are UNSUPPORTED.
> 
> Is that how they show up, or do you get FAILs on other FreeBSDs?

This is inside of asan.exp, which is guarded with
check_effective_target_fsanitize_address
and therefore should not be run at all on non-asan targets.
I think the testsuite changes are fine, but it IMHO doesn't make sense to
commit it until the FreeBSD asan support lands (which is dependent on
the upstream libsanitizer change I believe).  Once it happens, it can be
cherry-picked from there, the config/i386 part looks reasonable.

Jakub


Re: [patch] c/c++ asan tests for FreeBSD

2015-11-30 Thread Bernd Schmidt

On 11/30/2015 01:12 PM, Andreas Tobler wrote:

On 30.11.15 11:28, Bernd Schmidt wrote:

On 11/29/2015 08:32 PM, Andreas Tobler wrote:

-/* { dg-do run { target { *-*-linux* } } } */
+/* { dg-do run { target { *-*-linux* *-*-freebsd* } } } */


I see a patch from you to add asan support to x86 freebsd, but what
about other architectures?


You mean because of the wildcard? I'll add them as I have time to port
them.

For now they are UNSUPPORTED.


Is that how they show up, or do you get FAILs on other FreeBSDs?


Does every *-*-linux* have asan support?


Probably not, but I guess the main ones people tend to test.


Bernd


Re: [PATCH] [ARC] Add support for atomic memory built-in.

2015-11-30 Thread Claudiu Zissulescu
Ping. This patch has been waiting for two weeks.

Thanks,
Claudiu

On Mon, Nov 16, 2015 at 11:18 AM, Claudiu Zissulescu
 wrote:
> This patch adds support for atomic memory built-in for ARCHS and ARC700. 
> Tested with dg.exp.
>
> OK to apply?
>
> Thanks,
> Claudiu
>
> ChangeLogs:
> gcc/
>
> 2015-11-12  Claudiu Zissulescu  
>
> * config/arc/arc-protos.h (arc_expand_atomic_op): Prototype.
> (arc_split_compare_and_swap): Likewise.
> (arc_expand_compare_and_swap): Likewise.
> * config/arc/arc.c (arc_init): Check usage atomic option.
> (arc_pre_atomic_barrier): New function.
> (arc_post_atomic_barrier): Likewise.
> (emit_unlikely_jump): Likewise.
> (arc_expand_compare_and_swap_qh): Likewise.
> (arc_expand_compare_and_swap): Likewise.
> (arc_split_compare_and_swap): Likewise.
> (arc_expand_atomic_op): Likewise.
> * config/arc/arc.h (TARGET_CPU_CPP_BUILTINS): New C macro.
> (ASM_SPEC): Enable mlock option when matomic is used.
> * config/arc/arc.md (UNSPEC_ARC_MEMBAR): Define.
> (VUNSPEC_ARC_CAS): Likewise.
> (VUNSPEC_ARC_LL): Likewise.
> (VUNSPEC_ARC_SC): Likewise.
> (VUNSPEC_ARC_EX): Likewise.
> * config/arc/arc.opt (matomic): New option.
> * config/arc/constraints.md (ATO): New constraint.
> * config/arc/predicates.md (mem_noofs_operand): New predicate.
> * doc/invoke.texi: Document -matomic.
> * config/arc/atomic.md: New file.
>
> gcc/testsuite
>
> 2015-11-12  Claudiu Zissulescu  
>
> * lib/target-supports.exp (check_effective_target_arc_atomic): New
> function.
> (check_effective_target_sync_int_long): Add checks for ARC atomic
> feature.
> (check_effective_target_sync_char_short): Likewise.
> ---
>  gcc/config/arc/arc-protos.h   |   4 +
>  gcc/config/arc/arc.c  | 391 ++
>  gcc/config/arc/arc.h  |   6 +-
>  gcc/config/arc/arc.md |   9 +
>  gcc/config/arc/arc.opt|   3 +
>  gcc/config/arc/atomic.md  | 235 
>  gcc/config/arc/constraints.md |   6 +
>  gcc/config/arc/predicates.md  |   4 +
>  gcc/doc/invoke.texi   |   8 +-
>  gcc/testsuite/lib/target-supports.exp |  11 +
>  10 files changed, 675 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/config/arc/atomic.md
>
> diff --git a/gcc/config/arc/arc-protos.h b/gcc/config/arc/arc-protos.h
> index 6e04351..3581bb0 100644
> --- a/gcc/config/arc/arc-protos.h
> +++ b/gcc/config/arc/arc-protos.h
> @@ -41,6 +41,10 @@ extern int arc_output_commutative_cond_exec (rtx *operands, bool);
>  extern bool arc_expand_movmem (rtx *operands);
>  extern bool prepare_move_operands (rtx *operands, machine_mode mode);
>  extern void emit_shift (enum rtx_code, rtx, rtx, rtx);
> +extern void arc_expand_atomic_op (enum rtx_code, rtx, rtx, rtx, rtx, rtx);
> +extern void arc_split_compare_and_swap (rtx *);
> +extern void arc_expand_compare_and_swap (rtx *);
> +
>  #endif /* RTX_CODE */
>
>  #ifdef TREE_CODE
> diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
> index 8bb0969..d47bbe4 100644
> --- a/gcc/config/arc/arc.c
> +++ b/gcc/config/arc/arc.c
> @@ -61,6 +61,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "context.h"
>  #include "builtins.h"
>  #include "rtl-iter.h"
> +#include "alias.h"
>
>  /* Which cpu we're compiling for (ARC600, ARC601, ARC700).  */
>  static const char *arc_cpu_string = "";
> @@ -884,6 +885,9 @@ arc_init (void)
>flag_pic = 0;
>  }
>
> +  if (TARGET_ATOMIC && !(TARGET_ARC700 || TARGET_HS))
> +error ("-matomic is only supported for ARC700 or ARC HS cores");
> +
>arc_init_reg_tables ();
>
>/* Initialize array for PRINT_OPERAND_PUNCT_VALID_P.  */
> @@ -9650,6 +9654,393 @@ arc_use_by_pieces_infrastructure_p (unsigned HOST_WIDE_INT size,
>return default_use_by_pieces_infrastructure_p (size, align, op, speed_p);
>  }
>
> +/* Emit a (pre) memory barrier around an atomic sequence according to
> +   MODEL.  */
> +
> +static void
> +arc_pre_atomic_barrier (enum memmodel model)
> +{
> + switch (model & MEMMODEL_MASK)
> +{
> +case MEMMODEL_RELAXED:
> +case MEMMODEL_CONSUME:
> +case MEMMODEL_ACQUIRE:
> +case MEMMODEL_SYNC_ACQUIRE:
> +  break;
> +case MEMMODEL_RELEASE:
> +case MEMMODEL_ACQ_REL:
> +case MEMMODEL_SYNC_RELEASE:
> +  emit_insn (gen_membar (const0_rtx));
> +  break;
> +case MEMMODEL_SEQ_CST:
> +case MEMMODEL_SYNC_SEQ_CST:
> +  emit_insn (gen_sync (const1_rtx));
> +  break;
> +default:
> +  gcc_unreachable ();
> +}
> +}
> +
> +/* Emit a (post) memory barrier around an atomic sequence according to
> +   MODEL.  */
> +
> +static void
> +arc_post_atomic_barrier (enum 
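
For readers unfamiliar with what such a port has to expand, here is a minimal
sketch of user code built on GCC's generic __atomic built-ins.  It is not
taken from the patch, and the function names are hypothetical.  Going by the
quoted arc_pre_atomic_barrier, a relaxed, consume or acquire operation would
get no leading barrier, a release or acq_rel operation a membar, and a seq_cst
operation a sync; how the rest of the sequence is expanded is not visible in
the truncated quote above.

```c
/* Atomically add DELTA to *P using a compare-and-swap loop and
   return the value *P held before the addition.  */
int
fetch_add_via_cas (int *p, int delta)
{
  int old = __atomic_load_n (p, __ATOMIC_RELAXED);

  /* On failure the built-in writes the current value of *P into OLD,
     so the loop retries with the refreshed value.  */
  while (!__atomic_compare_exchange_n (p, &old, old + delta,
                                       0 /* strong */,
                                       __ATOMIC_SEQ_CST,
                                       __ATOMIC_RELAXED))
    ;

  return old;
}

/* Increment a counter five times through the CAS loop.  */
int
demo_counter (void)
{
  int counter = 0;
  for (int i = 0; i < 5; i++)
    fetch_add_via_cas (&counter, 1);
  return counter;
}
```

The success ordering here is seq_cst and the failure ordering is relaxed;
other combinations are possible as long as the failure ordering is no stronger
than the success ordering.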

Re: [PATCH] [PR68603] Associate conditional C++ loop's back-jump with start, not body

2015-11-30 Thread Jason Merrill

OK.

Jason


Re: Fix verify_type ICE during Ada bootstrap

2015-11-30 Thread Jan Hubicka
> 
> I think you are doing too many things in one patch.  I'm fine with
> dropping the zero-alias-set streaming (but I'd rather not assert
> as FE get_alias_set langhook may assign zero to random tree nodes).

Ok, the assert was there mostly to double check that all zero alias
sets rematerialize correctly in LTO which I tested so it can go.
> 
> I'm also fine with handling flag_strict_aliasing conservatively
> during inlining - but the condition you placed on this handling
> needs a comment.  I couldn't decipher it ;)

OK, there is a symmetric condition in ipa-inline-analysis; I will add a comment on it.
It can indeed go in separately.
> 
> > +  if (dump_file)
> > + fprintf (dump_file, "Dropping flag_strict_aliasing on %s:%i\n",
> > +  to->name (), to->order);
> 
> So I wonder if it makes sense to pessimize such inlining as well.

I don't know - even for Firefox, which heavily mixes -fstrict-aliasing
and -fno-strict-aliasing units, this seems to be quite a rare occasion, and it
is hard to judge the effect of dropping flag_strict_aliasing.
> 
> The two above should be enough to fix the correctness issue.

We also need to prevent ipa-icf and fold_const from optimizing functions
early in a way that is not compatible with inlining a -fno-strict-aliasing
comdat into a -fstrict-aliasing function.

Honza
> 
> The parse_optimize_options hack looks indeed interesting, but we solved
> the issue differently by
> 
> 2014-11-27  Richard Biener  
> 
> PR middle-end/63704
> * alias.c (mems_in_disjoint_alias_sets_p): Remove assert
> and instead return false when !fstrict-aliasing.
> 
> So the hack can be removed as a separate commit after the first one
> above.  This should make optimize("fno-strict-aliasing") work.
> 
> 
> I don't really see why we need all the other changes and IMHO the
> get_alias_set interface change is ugly and fragile.  And this doesn't
> look like sth for stage3.
> 
> Thus please split the patch up.
> 
> Thanks,
> Richard.
> 
> > Honza
> > 
> > * tree.c (free_lang_data): Pass true to get_alias_set.
> > * tree-streamer-in.c (unpack_ts_type_common_value_fields): Do not stream
> > alias set.
> > * tree-ssa-alias.c (ao_ref_base_alias_set, ao_ref_alias_set): Pass true
> > to get_alias_set; comment.
> > (same_type_for_tbaa): Likewise.
> > * alias.c (alias_set_subset_of, alias_sets_conflict_p): When strict
> > aliasing is disabled, return true.
> > (get_alias_set): New parameter strict.
> > (new_alias_set): Always produce new alias set.
> > (record_component_aliases): Pass true to get_alias_set.
> > * alias.h (get_alias_set): New optional parameter STRICT.
> > * lto-streamer-out.c (hash_tree): Do not hash alias set.
> > * ipa-inline-transform.c (inline_call): Drop strict aliasing of
> > caller if needed.
> > * ipa-icf-gimple.c (func_checker::compatible_types_p): Pass true
> > to get_alias_set.
> > * tree-streamer-out.c (pack_ts_type_common_value_fields): Do not
> > stream TYPE_ALIAS_SET; sanity check that alias set 0 at LTO time will
> > match what frontends do.
> > * fold-const.c (operand_equal_p): Be careful about TBAA info before
> > inlining even with -fno-strict-aliasing.
> > * gimple.c (gimple_get_alias_set): Pass true to get_alias_set.
> > 
> > * misc.c (gnat_get_alias_set): Pass true to get_alias_set.
> > * utils.c (relate_alias_sets): Likewise.
> > * trans.c (validate_unchecked_conversion): Likewise.
> > 
> > * lto-symtab.c (warn_type_compatibility_p): Pass true to get_alias_set.
> > * lto.c (compare_tree_sccs_1): Do not compare TYPE_ALIAS_SET.
> > 
> > * gcc.c-torture/execute/alias-1.c: New testcase.
> > * gcc.dg/lto/alias-1_0.c: New testcase.
> > * gcc.dg/lto/alias-1_1.c: New testcase.
> > 
> > * c-common.c (parse_optimize_options): Remove hack about
> > flag_strict_aliasing.
> > (convert_vector_to_pointer_for_subscript): Pass true to get_alias_set.
> > 
> > * cp-objcp-common.c (cxx_get_alias_set): Pass true to get_alias_set.
> > 
> > * rtti.c (typeid_ok_p): Pass true to get_alias_set.
> > Index: tree.c
> > ===
> > --- tree.c  (revision 231020)
> > +++ tree.c  (working copy)
> > @@ -5971,7 +5971,8 @@ free_lang_data (void)
> >   while the slots are still in the way the frontends generated them.  */
> >for (i = 0; i < itk_none; ++i)
> >  if (integer_types[i])
> > -  TYPE_ALIAS_SET (integer_types[i]) = get_alias_set (integer_types[i]);
> > +  TYPE_ALIAS_SET (integer_types[i]) = get_alias_set (integer_types[i],
> > +true);
> >  
> >/* Traverse the IL resetting language specific information for
> >   operands, expressions, etc.  */
> > Index: cp/rtti.c
> > ===
> > --- cp/rtti.c   (revision 231020)
> > +++ cp/rtti.c   

Re: [PATCH, PR46032] Handle BUILT_IN_GOMP_PARALLEL in ipa-pta

2015-11-30 Thread Tom de Vries

On 30/11/15 14:24, Richard Biener wrote:

On Mon, 30 Nov 2015, Tom de Vries wrote:


On 30/11/15 10:16, Richard Biener wrote:

On Mon, 30 Nov 2015, Tom de Vries wrote:


Hi,

this patch fixes PR46032.

It handles a call:
...
__builtin_GOMP_parallel (fn, data, num_threads, flags)
...
as:
...
fn (data)
...
in ipa-pta.

This improves ipa-pta alias analysis in the parallelized function fn, and
allows vectorization in the testcase without a runtime alias test.

Bootstrapped and reg-tested on x86_64.

OK for stage3 trunk?


+ /* Assign the passed argument to the appropriate incoming
+parameter of the function.  */
+ struct constraint_expr lhs ;
+ lhs = get_function_part_constraint (fi, fi_parm_base + 0);
+ auto_vec rhsc;
+ struct constraint_expr *rhsp;
+ get_constraint_for_rhs (arg, &rhsc);
+ while (rhsc.length () != 0)
+   {
+ rhsp = &rhsc.last ();
+ process_constraint (new_constraint (lhs, *rhsp));
+ rhsc.pop ();
+   }

please use style used elsewhere with

   FOR_EACH_VEC_ELT (rhsc, j, rhsp)
 process_constraint (new_constraint (lhs, *rhsp));
   rhsc.truncate (0);



That code was copied from find_func_aliases_for_call.
I've factored out the bit that I copied as find_func_aliases_for_call_arg, and
fixed the style there (and dropped 'rhsc.truncate (0)' since AFAIU it's
redundant at the end of a function).


+ /* Parameter passed by value is used.  */
+ lhs = get_function_part_constraint (fi, fi_uses);
+ struct constraint_expr *rhsp;
+ get_constraint_for_address_of (arg, &rhsc);

This isn't correct - you want to use get_constraint_for (arg, &rhsc).
After all rhs is already an ADDR_EXPR.



Can we add an assert somewhere to detect this incorrect usage?


+ FOR_EACH_VEC_ELT (rhsc, j, rhsp)
+   process_constraint (new_constraint (lhs, *rhsp));
+ rhsc.truncate (0);
+
+ /* The caller clobbers what the callee does.  */
+ lhs = get_function_part_constraint (fi, fi_clobbers);
+ rhs = get_function_part_constraint (cfi, fi_clobbers);
+ process_constraint (new_constraint (lhs, rhs));
+
+ /* The caller uses what the callee does.  */
+ lhs = get_function_part_constraint (fi, fi_uses);
+ rhs = get_function_part_constraint (cfi, fi_uses);
+ process_constraint (new_constraint (lhs, rhs));

I don't see why you need those.  The solver should compute these
in even better precision (context sensitive on the call side).

The same is true for the function parameter.  That is, the only
needed part of the patch should be that making sure we see
the "direct" call and assign parameters correctly.



Dropped this bit.

OK for stage3 trunk if bootstrap and reg-test succeeds?


-|| node->address_taken);
+|| (node->address_taken
+&& !node->parallelized_function));

please add a comment here on why this is safe.

Ok with this change.


Updated with comment, committed as attached.

Thanks,
- Tom


Handle BUILT_IN_GOMP_PARALLEL in ipa-pta

2015-11-30  Tom de Vries  

	PR tree-optimization/46032
	* tree-ssa-structalias.c (find_func_aliases_for_call_arg): New function,
	factored out of ...
	(find_func_aliases_for_call): ... here.
	(find_func_aliases_for_builtin_call, find_func_clobbers): Handle
	BUILT_IN_GOMP_PARALLEL.
	(ipa_pta_execute): Same.  Handle node->parallelized_function as a local
	function.

	* gcc.dg/pr46032.c: New test.

	* testsuite/libgomp.c/pr46032.c: New test.

---
 gcc/testsuite/gcc.dg/pr46032.c| 47 +++
 gcc/tree-ssa-structalias.c| 71 ---
 libgomp/testsuite/libgomp.c/pr46032.c | 44 ++
 3 files changed, 149 insertions(+), 13 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/pr46032.c b/gcc/testsuite/gcc.dg/pr46032.c
new file mode 100644
index 000..b91190e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr46032.c
@@ -0,0 +1,47 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fopenmp -ftree-vectorize -std=c99 -fipa-pta -fdump-tree-vect-all" } */
+
+extern void abort (void);
+
+#define nEvents 1000
+
+static void __attribute__((noinline, noclone, optimize("-fno-tree-vectorize")))
+init (unsigned *results, unsigned *pData)
+{
+  unsigned int i;
+  for (i = 0; i < nEvents; ++i)
+pData[i] = i % 3;
+}
+
+static void __attribute__((noinline, noclone, optimize("-fno-tree-vectorize")))
+check (unsigned *results)
+{
+  unsigned sum = 0;
+  for (int idx = 0; idx < (int)nEvents; idx++)
+sum += results[idx];
+
+  if (sum != 1998)
+abort ();
+}
+
+int
+main (void)
+{
+  unsigned results[nEvents];
+  unsigned pData[nEvents];
+  unsigned coeff = 2;
+
+  init (&results[0], &pData[0]);
+

[PATCH] [PR68603] Associate conditional C++ loop's back-jump with start, not body

2015-11-30 Thread Andreas Arnez
SVN commit r230979 always associates a loop's back-jump with the start
of the loop body.  This caused a regression for gcov with conditional
loops, because then the loop body appears to be covered twice per
iteration.

gcc/cp/ChangeLog:

PR gcov-profile/68603
* cp-gimplify.c (genericize_cp_loop): For the back-jump's location
use the start of the loop body only if the loop is unconditional.
---
 gcc/cp/cp-gimplify.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/gcc/cp/cp-gimplify.c b/gcc/cp/cp-gimplify.c
index a9a34cd..3c89f1b 100644
--- a/gcc/cp/cp-gimplify.c
+++ b/gcc/cp/cp-gimplify.c
@@ -264,7 +264,9 @@ genericize_cp_loop (tree *stmt_p, location_t start_locus, 
tree cond, tree body,
 }
   else
 {
-  location_t loc = EXPR_LOCATION (expr_first (body));
+  location_t loc = start_locus;
+  if (!cond || integer_nonzerop (cond))
+   loc = EXPR_LOCATION (expr_first (body));
   if (loc == UNKNOWN_LOCATION)
loc = start_locus;
   loop = build1_loc (loc, LOOP_EXPR, void_type_node, stmt_list);
-- 
2.5.0
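
To make the distinction concrete, here is a minimal sketch (not part of the patch) of the two loop shapes that the new condition separates. The function names are made up for illustration, and the snippet sticks to the common C/C++ subset, although the change itself only touches the C++ front end. The comments restate the intent of the ChangeLog entry above; they are not verified gcov output.

```c
/* Conditional loop: `cond` is a real test, so with the patch the
   back-jump is given the location of the loop start (start_locus).  */
unsigned
count_conditional (unsigned n)
{
  unsigned hits = 0;
  while (hits < n)   /* back-jump is associated with this line */
    hits++;          /* body: meant to be counted once per iteration */
  return hits;
}

/* Unconditional loop: there is no condition (or it is a nonzero
   constant), so the back-jump keeps the location of the first
   statement of the body.  */
unsigned
count_unconditional (unsigned n)
{
  unsigned hits = 0;
  for (;;)
    {
      if (hits >= n)   /* first statement of the body */
        break;
      hits++;
    }
  return hits;
}
```

Building such a file with coverage instrumentation and comparing the per-line counts of the two bodies would be one way to check the regression by hand.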



Re: [PATCH] rs6000_adjust_cost old thinko

2015-11-30 Thread Eric Botcazou
> FYI, the function should test recog_memoized (dep_insn) also.

I don't think that's needed as it doesn't call get_attr_type on dep_insn.

-- 
Eric Botcazou


Re: [PATCH] Fix PR68067

2015-11-30 Thread Jeff Law

On 11/30/2015 01:42 AM, Richard Biener wrote:


Yeah.  I've pondered with clearing the hashmap after each pass
(and hope no IPA pass would redirect edges).  Or even more aggressive,
clear the hashmap as well when we do set_cfun ().

Maybe you can try that?

And no, I don't think any pass expects this stuff to be live across
passes.
I'd argue that any pass that expects this stuff to be live across a pass 
is fundamentally broken.


jeff



Re: S/390: Fix warnings in "*setmem_long..." patterns.

2015-11-30 Thread Ulrich Weigand
Andreas Krebbel wrote:
> On 11/30/2015 04:11 PM, Dominik Vogt wrote:
> > > The attached patch fixes some warnings generated by the setmem...
> > > patterns in s390.md during build and adds test cases for the
> > > patterns.  The patch is to be added on top of the movstr patch:
> > https://gcc.gnu.org/ml/gcc-patches/2015-11/msg03485.html
> > 
> > The test cases validate that the patterns are actually used, but
> > at the moment the setmem_long_and pattern is never actually used
> > and thus the test case would fail.  So I've split the patch in two
> > (both attached to this message) to activate this part of the test
> > once we've fixed that.
> > 
> > The patch has passed the SPEC2006 testsuite without any measurable
> > changes in performance.
> 
> Shouldn't we instead describe the whole setmem operation as unspec including 
> the other operands as
> well? The semantics of the introduced UNSPEC_P_TO_BLK operation is not clear 
> to me.  It suggests to
> be some kind of "cast" which it isn't. In fact it is not able to do its job 
> without the length which
> is specified as use outside the unspec.

Well, I guess I suggested to Dominik to leave the basic
[parallel
  (set (dst:BLK) (src:BLK))
  (use (length))]
structure in place; my understanding is that the middle-end recognizes
this as a block move.  As "source" in this case we'd use a BLKmode
operand that consists of the same byte replicated a number of times.

If we were to use just a single UNSPEC, how would we indicate to the
middle-end that a block of memory is modified, without using too coarse-
grained clobbers?

However, I agree that UNSPEC_P_TO_BLK really should also get the length
as input, to make it have precisely defined semantics.  Also, I'd rather
use a more descriptive name, like UNSPEC_REPLICATE_BYTE or the like.

What would you think about something like the following?

(define_insn "*setmem_long"
  [(clobber (match_operand: 0 "register_operand" "=d"))
   (set (mem:BLK (subreg:P (match_operand: 3 "register_operand" "0") 0))
(unspec:BLK [(match_operand:P 2 "shift_count_or_setmem_operand" "Y")
 (subreg:P (match_dup 3) 1)] UNSPEC_REPLICATE_BYTE))
   (use (match_operand: 1 "register_operand" "d"))
   (clobber (reg:CC CC_REGNUM))]

[ Not sure if we'd need an extra (use (match_dup 3)) any more. ]

B.t.w. this is certainly wrong and cannot be generated by common code:
(and:BLK (unspec:BLK
  [(match_operand:P 2 "shift_count_or_setmem_operand" "Y")]
  UNSPEC_P_TO_BLK)
 (match_operand 4 "const_int_operand" "n"))
(This explains why the pattern would never match.)

The AND should be on the filler byte instead:
(unspec:BLK [(and:P (match_operand:P 2 "shift_count_or_setmem_operand" 
"Y")
(match_operand:P 4 "const_int_operand" 
"n"))
 (subreg:P (match_dup 3) 1)] UNSPEC_REPLICATE_BYTE))
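
As a plain C model of the semantics being discussed (a sketch only, with made-up function names; it says nothing about how the RTL is matched): the block source is the low byte of the filler operand replicated over the given length, and in the "_and" variant the mask is applied to the filler before replication, not to the block.

```c
#include <stddef.h>
#include <string.h>

/* Model of the plain pattern: fill LEN bytes at DST with
   (FILLER % 256), i.e. the replicated low byte of the operand.  */
void
setmem_long_model (unsigned char *dst, size_t len, unsigned long filler)
{
  memset (dst, (int) (filler & 0xff), len);
}

/* Model of the "_and" variant as proposed above: the AND is applied
   to the filler value, and the masked byte is then replicated.  */
void
setmem_long_and_model (unsigned char *dst, size_t len,
                       unsigned long filler, unsigned long mask)
{
  setmem_long_model (dst, len, filler & mask);
}
```

Note that the length is an input to the replication in both functions, which is the point about giving the unspec precisely defined semantics.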

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com



Re: [PATCH, PR46032] Handle BUILT_IN_GOMP_PARALLEL in ipa-pta

2015-11-30 Thread Jakub Jelinek
On Mon, Nov 30, 2015 at 05:36:25PM +0100, Tom de Vries wrote:
> +int
> +main (void)
> +{
> +  unsigned results[nEvents];
> +  unsigned pData[nEvents];
> +  unsigned coeff = 2;
> +
> +  init (&results[0], &pData[0]);
> +
> +#pragma omp parallel for
> +  for (int idx = 0; idx < (int)nEvents; idx++)
> +results[idx] = coeff * pData[idx];

Could you please add another testcase, where you have say pData
and some other pointer that init sets to alias with pData, and verify
that such loop (would need to be say normal loop inside #pragma omp single
or master) is not vectorized?
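
A rough sketch of the kind of testcase meant here, under several assumptions: the names init_alias and run_aliased_loop are invented, the dg- directives and the scan for "not vectorized" are left out, and the array is kept small. The init function makes a second pointer alias pData shifted by one element, so the loop carries a dependence from one iteration to the next.

```c
#define N_EVENTS 16

/* Hypothetical init: after the call, alias[i] and pData[i + 1] are
   the same object.  */
static void __attribute__((noinline))
init_alias (unsigned *pData, unsigned **alias)
{
  pData[0] = 1;
  for (int i = 1; i < N_EVENTS; i++)
    pData[i] = 0;
  *alias = pData + 1;
}

unsigned
run_aliased_loop (void)
{
  unsigned pData[N_EVENTS];
  unsigned *alias;
  unsigned coeff = 2;

  init_alias (pData, &alias);

  /* A normal loop inside a single region.  Each iteration reads the
     value written by the previous one through the other pointer.  */
#pragma omp parallel
#pragma omp single
  for (int idx = 0; idx < N_EVENTS - 1; idx++)
    alias[idx] = coeff * pData[idx];

  /* Executed in order, this is pData[i] = 2^i.  */
  return pData[N_EVENTS - 1];
}
```

If the loop were vectorized on the assumption that alias and pData are independent, the last element would most likely not end up as 2^15, so a runtime check complements the dump scan.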

Jakub


Re: [PATCH 3/4] [ARM] PR63870 Add test cases

2015-11-30 Thread Charles Baylis
Applied to trunk as r231077.

On 26 November 2015 at 09:43, James Greenhalgh  wrote:
> On Thu, Nov 26, 2015 at 09:41:15AM +, Charles Baylis wrote:
>> Hi James,
>>
>> Ping. This needs an ack from an AArch64 reviewer/maintainer
>
> Fine by me, it will considerably clean up my test results for ARM!
>
> Thanks,
> James
>
>


Re: [RFC] Combine vectorized loops with its scalar remainder.

2015-11-30 Thread Yuri Rumyantsev
Richard,

Thanks a lot for your detailed comments!

A few words about the 436.cactusADM gain. The loop that was transformed for
avx2 is very large; it is the last innermost loop in routine
Bench_StaggeredLeapfrog2 (StaggeredLeapfrog2.F #366). If you don't
have the sources, let me know.

Yuri.

2015-11-27 16:45 GMT+03:00 Richard Biener :
> On Fri, Nov 13, 2015 at 11:35 AM, Yuri Rumyantsev  wrote:
>> Hi Richard,
>>
>> Here is an updated version of the patch which (1) is in sync with the trunk
>> compiler and (2) contains a simple cost model to estimate the profitability
>> of scalar epilogue elimination. The part related to vectorization of
>> loops with a small trip count is still being developed. Note that the
>> implemented cost model was not tuned well for HASWELL and KNL, but we
>> got a ~6% speed-up on 436.cactusADM from the spec2006 suite for HASWELL.
>
> Ok, so I don't know where to start with this.
>
> First of all, while I wanted the actual stmt processing to be done as
> post-processing on the vectorized loop body, I didn't want to have this
> completely separated from vectorizing.
>
> So, do combine_vect_loop_remainder () from vect_transform_loop, not by 
> iterating
> over all (vectorized) loops at the end.
>
> Second, all the adjustments of the number of iterations for the vector
> loop should
> be integrated into the main vectorization scheme as should determining the
> cost of the predication.  So you'll end up adding a
> LOOP_VINFO_MASK_MAIN_LOOP_FOR_EPILOGUE flag, determined during
> cost analysis and during code generation adjust vector iteration computation
> accordingly and _not_ generate the epilogue loop (or wire it up correctly in
> the first place).
>
> The actual stmt processing should then still happen in a similar way as you 
> do.
>
> So I'm going to comment on that part only as I expect the rest will look a lot
> different.
>
> +/* Generate induction_vector which will be used to mask evaluation.  */
> +
> +static tree
> +gen_vec_induction (loop_vec_info loop_vinfo, unsigned elem_size, unsigned 
> size)
> +{
>
> please make use of create_iv.  Add more comments.  I reverse-engineered
> that you add a { { 0, ..., vf }, +, {vf, ... vf } } IV which you use
> in gen_mask_for_remainder
> by comparing it against { niter, ..., niter }.
>
> +  gsi = gsi_after_labels (loop->header);
> +  niters = LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo)
> +  ? LOOP_VINFO_NITERS (loop_vinfo)
> +  : LOOP_VINFO_NITERS_UNCHANGED (loop_vinfo);
>
> that's either wrong or unnecessary.  if ! peeling for alignment
> loop-vinfo-niters
> is equal to loop-vinfo-niters-unchanged.
>
> +  ptr = build_int_cst (reference_alias_ptr_type (ref), 0);
> +  if (!SSA_NAME_PTR_INFO (addr))
> +   copy_ref_info (build2 (MEM_REF, TREE_TYPE (ref), addr, ptr), ref);
>
> vect_duplicate_ssa_name_ptr_info.
>
> +
> +static void
> +fix_mask_for_masked_ld_st (vec *masked_stmt, tree mask)
> +{
> +  gimple *stmt, *new_stmt;
> +  tree old, lhs, vectype, var, n_lhs;
>
> no comment?  what's this for.
>
> +/* Convert vectorized reductions to VEC_COND statements to preserve
> +   reduction semantic:
> +   s1 = x + s2 --> t = x + s2; s1 = (mask)? t : s2.  */
> +
> +static void
> +convert_reductions (loop_vec_info loop_vinfo, tree mask)
> +{
>
> for reductions it looks like preserving the last iteration x plus the mask
> could avoid predicating it this way and compensate in the reduction
> epilogue by "subtracting" x & mask?  With true predication support
> that'll likely be more expensive of course.
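
For readers following along, the masking scheme and the VEC_COND reduction described above can be modelled in scalar C roughly as follows. This is only a sketch: VF and masked_sum are made-up names, the "vector" is a plain array of lanes, and a masked-off load is modelled as yielding 0.

```c
#include <stddef.h>

#define VF 4   /* hypothetical vectorization factor */

/* Sum A[0..NITER-1] with a VF-lane loop and no scalar epilogue.  */
unsigned
masked_sum (const unsigned *a, size_t niter)
{
  unsigned acc[VF] = { 0 };
  size_t iv[VF];

  /* Induction vector starts as { 0, 1, ..., VF-1 }.  */
  for (size_t lane = 0; lane < VF; lane++)
    iv[lane] = lane;

  for (size_t base = 0; base < niter; base += VF)
    for (size_t lane = 0; lane < VF; lane++)
      {
        /* mask = iv < { niter, ..., niter }  */
        int mask = iv[lane] < niter;
        /* Masked load: inactive lanes do not touch memory.  */
        unsigned x = mask ? a[iv[lane]] : 0;
        /* s1 = x + s2  -->  t = x + s2; s1 = mask ? t : s2  */
        unsigned t = x + acc[lane];
        acc[lane] = mask ? t : acc[lane];
        /* iv += { VF, ..., VF }  */
        iv[lane] += VF;
      }

  /* Reduction epilogue: combine the lanes.  */
  unsigned sum = 0;
  for (size_t lane = 0; lane < VF; lane++)
    sum += acc[lane];
  return sum;
}
```

With niter = 10 and VF = 4 the last vector iteration runs with only two active lanes, which is the case the scalar epilogue would otherwise handle.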
>
> +  /* Generate new VEC_COND expr.  */
> +  vec_cond_expr = build3 (VEC_COND_EXPR, vectype, mask, new_lhs, rhs);
> +  new_stmt = gimple_build_assign (lhs, vec_cond_expr);
>
> gimple_build_assign (lhs, VEC_COND_EXPR, vectype, mask, new_lhs, rhs);
>
> +/* Return true if MEM_REF is incremented by vector size and false
> otherwise.  */
> +
> +static bool
> +mem_ref_is_vec_size_incremented (loop_vec_info loop_vinfo, tree lhs)
> +{
> +  struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
>
> what?!  Just look at DR_STEP of the store?
>
>
> +void
> +combine_vect_loop_remainder (loop_vec_info loop_vinfo)
> +{
> +  struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
> +  auto_vec loads;
> +  auto_vec stores;
>
> so you need to re-structure this in a way that it computes
>
>   a) whether it can perform the operation - and you need to do that
>   reliably before the operation has taken place
>   b) its cost
>
> instead of looking at def types or gimple_assign_load/store_p predicates
> please look at STMT_VINFO_TYPE instead.
>
> I don't like the new target hook for the costing.  We do need some major
> re-structuring in the vectorizer cost model implementation, this doesn't go
> into the right direction.
>
> A simplistic hook following the current scheme would have used
> the vect_cost_for_stmt as argument and mirror builtin_vectorization_cost.
>
> There is not a single testcase in the 

S/390: Fix warnings in "*setmem_long..." patterns.

2015-11-30 Thread Dominik Vogt
The attached patch fixes some warnings generated by the setmem...
patterns in s390.md during build and adds test cases for the
patterns.  The patch is to be added on top of the movstr patch:
https://gcc.gnu.org/ml/gcc-patches/2015-11/msg03485.html

The test cases validate that the patterns are actually used, but
at the moment the setmem_long_and pattern is never actually used
and thus the test case would fail.  So I've split the patch in two
(both attached to this message) to activate this part of the test
once we've fixed that.

The patch has passed the SPEC2006 testsuite without any measurable
changes in performance.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
gcc/ChangeLog

* config/s390/s390.c (s390_expand_setmem): Use new expanders.
* config/s390/s390.md ("*setmem_long")
("*setmem_long_and", "*setmem_long_31z"): Fix warnings.
("setmem_long_"): New expanders.
("setmem_long"): Removed.

gcc/testsuite/ChangeLog

* gcc.target/s390/md/setmem_long-1.c: New test.
* gcc.target/s390/md/setmem_long-2.c: New test.
From 6b484cd8a9f39a38b3e990b4ac160c8254c03f6b Mon Sep 17 00:00:00 2001
From: Dominik Vogt 
Date: Wed, 4 Nov 2015 03:16:24 +0100
Subject: [PATCH 1/1.5] S/390: Fix warnings in "*setmem_long..." patterns.

---
 gcc/config/s390/s390.c   |  7 ++-
 gcc/config/s390/s390.md  | 18 +-
 gcc/testsuite/gcc.target/s390/md/setmem_long-1.c | 20 
 gcc/testsuite/gcc.target/s390/md/setmem_long-2.c | 20 
 4 files changed, 59 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/md/setmem_long-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/md/setmem_long-2.c

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 40ee2f7..8f2396f 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -5178,7 +5178,12 @@ s390_expand_setmem (rtx dst, rtx len, rtx val)
   else if (TARGET_MVCLE)
 {
   val = force_not_mem (convert_modes (Pmode, QImode, val, 1));
-  emit_insn (gen_setmem_long (dst, convert_to_mode (Pmode, len, 1), val));
+  if (TARGET_64BIT)
+	emit_insn (gen_setmem_long_di (dst, convert_to_mode (Pmode, len, 1),
+   val));
+  else
+	emit_insn (gen_setmem_long_si (dst, convert_to_mode (Pmode, len, 1),
+   val));
 }
 
   else
diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index 75e9af7..ed98101 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -70,6 +70,9 @@
; Copy CC as is into the lower 2 bits of an integer register
UNSPEC_CC_TO_INT
 
+   ; Convert Pmode to BLKmode
+   UNSPEC_P_TO_BLK
+
; GOT/PLT and lt-relative accesses
UNSPEC_LTREL_OFFSET
UNSPEC_LTREL_BASE
@@ -3281,11 +3284,12 @@
 
 ; Initialize a block of arbitrary length with (operands[2] % 256).
 
-(define_expand "setmem_long"
+(define_expand "setmem_long_"
   [(parallel
 [(clobber (match_dup 1))
  (set (match_operand:BLK 0 "memory_operand" "")
-  (match_operand 2 "shift_count_or_setmem_operand" ""))
+	  (unspec:BLK [(match_operand:P 2 "shift_count_or_setmem_operand" "")]
+		  UNSPEC_P_TO_BLK))
  (use (match_operand 1 "general_operand" ""))
  (use (match_dup 3))
  (clobber (reg:CC CC_REGNUM))])]
@@ -3312,7 +3316,8 @@
 (define_insn "*setmem_long"
   [(clobber (match_operand: 0 "register_operand" "=d"))
(set (mem:BLK (subreg:P (match_operand: 3 "register_operand" "0") 0))
-(match_operand 2 "shift_count_or_setmem_operand" "Y"))
+(unspec:BLK [(match_operand:P 2 "shift_count_or_setmem_operand" "Y")]
+		UNSPEC_P_TO_BLK))
(use (match_dup 3))
(use (match_operand: 1 "register_operand" "d"))
(clobber (reg:CC CC_REGNUM))]
@@ -3324,7 +3329,9 @@
 (define_insn "*setmem_long_and"
   [(clobber (match_operand: 0 "register_operand" "=d"))
(set (mem:BLK (subreg:P (match_operand: 3 "register_operand" "0") 0))
-(and (match_operand 2 "shift_count_or_setmem_operand" "Y")
+(and:BLK (unspec:BLK
+	  [(match_operand:P 2 "shift_count_or_setmem_operand" "Y")]
+	  UNSPEC_P_TO_BLK)
 	 (match_operand 4 "const_int_operand" "n")))
(use (match_dup 3))
(use (match_operand: 1 "register_operand" "d"))
@@ -3338,7 +3345,8 @@
 (define_insn "*setmem_long_31z"
   [(clobber (match_operand:TI 0 "register_operand" "=d"))
(set (mem:BLK (subreg:SI (match_operand:TI 3 "register_operand" "0") 4))
-(match_operand 2 "shift_count_or_setmem_operand" "Y"))
+(unspec:BLK [(match_operand:P 2 "shift_count_or_setmem_operand" "Y")]
+		UNSPEC_P_TO_BLK))
(use (match_dup 3))
(use (match_operand:TI 1 "register_operand" "d"))
(clobber (reg:CC CC_REGNUM))]
diff --git a/gcc/testsuite/gcc.target/s390/md/setmem_long-1.c b/gcc/testsuite/gcc.target/s390/md/setmem_long-1.c
new file mode 100644
index 000..9a926ce
--- /dev/null
+++ 

Re: [PATCH] Add save_expr langhook (PR c/68513)

2015-11-30 Thread Marek Polacek
On Sat, Nov 28, 2015 at 04:05:30PM +, Joseph Myers wrote:
> On Sat, 28 Nov 2015, Richard Biener wrote:
> 
> > Different approach: after the FE folds (unexpectedly?), scan the result 
> > for SAVE_EXPRs and if found, drop the folding.
> 
> Or, if conversions are going to fold from language-independent code (which 
> is the underlying problem here - a conversion without folding would be 
> preferred once the fallout from that can be resolved), make the front end 
> fold with c_fully_fold before doing the conversion, and wrap the result of 
> the conversion in a C_MAYBE_CONST_EXPR with c_wrap_maybe_const in the same 
> way as done in other places that fold early (if either c_fully_fold 
> indicates it can't occur in a constant expression, or the result of 
> folding / conversion is not an INTEGER_CST).

Unfortunately, even this doesn't seem to work :(; I'm getting leaked
C_MAYBE_CONST_EXPRs e.g. when converting to (_Complex float), and a bunch of
missing warnings resulting in big testsuite fallout.

Marek


Re: [PATCH] Add save_expr langhook (PR c/68513)

2015-11-30 Thread Richard Biener
On Mon, 30 Nov 2015, Richard Biener wrote:

> On Mon, 30 Nov 2015, Marek Polacek wrote:
> 
> > On Sat, Nov 28, 2015 at 08:50:12AM +0100, Richard Biener wrote:
> > > Different approach: after the FE folds (unexpectedly?), scan the result 
> > > for
> > > SAVE_EXPRs and if found, drop the folding.
> > 
> > This doesn't fix the problem completely either, because we simply don't know
> > where
> > those SAVE_EXPRs might be introduced: it might be convert(), but e.g. when I
> > changed the original testcase a tiny bit (added -), then those SAVE_EXPRs 
> > were
> > introduced in a different spot (via c_process_stmt_expr -> c_fully_fold).
> 
> So the following "disables" save_expr generation from generic-match.c
> by failing to simplify if save_expr would end up not returning a
> non-save_expr.
> 
> I expect this will make fixing PR68590 difficult (w/o re-introducing
> some fold-const.c code or changing genmatch to "special-case"
> things).
> 
> The other option for this PR is to re-introduce the TREE_SIDE_EFFECTS
> check I removed earlier (to avoid un-CSEing large expressions at
> -O0 for example) and thus only FAIL if the save_expr were needed
> for correctness.

And the following will avoid quite some fallout (eventually).  Testing
as desired change independently.

Richard.

Index: gcc/match.pd
===
--- gcc/match.pd(revision 231065)
+++ gcc/match.pd(working copy)
@@ -1828,15 +1828,14 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  
 /* Simplify comparison of something with itself.  For IEEE
floating-point, we can only do some of these simplifications.  */
-(simplify
- (eq @0 @0)
- (if (! FLOAT_TYPE_P (TREE_TYPE (@0))
-  || ! HONOR_NANS (TYPE_MODE (TREE_TYPE (@0))))
-  { constant_boolean_node (true, type); }))
-(for cmp (ge le)
+(for cmp (eq ge le)
  (simplify
   (cmp @0 @0)
-  (eq @0 @0)))
+  (if (! FLOAT_TYPE_P (TREE_TYPE (@0))
+   || ! HONOR_NANS (TYPE_MODE (TREE_TYPE (@0))))
+   { constant_boolean_node (true, type); }
+   (if (cmp != EQ_EXPR)
+(eq @0 @0)))))
 (for cmp (ne gt lt)
  (simplify
   (cmp @0 @0)


> Richard.
> 
> Index: gcc/tree.c
> ===
> --- gcc/tree.c(revision 231065)
> +++ gcc/tree.c(working copy)
> @@ -3231,8 +3231,6 @@ decl_address_ip_invariant_p (const_tree
> not handle arithmetic; that's handled in skip_simple_arithmetic and
> tree_invariant_p).  */
>  
> -static bool tree_invariant_p (tree t);
> -
>  static bool
>  tree_invariant_p_1 (tree t)
>  {
> @@ -3282,7 +3280,7 @@ tree_invariant_p_1 (tree t)
>  
>  /* Return true if T is function-invariant.  */
>  
> -static bool
> +bool
>  tree_invariant_p (tree t)
>  {
>tree inner = skip_simple_arithmetic (t);
> Index: gcc/tree.h
> ===
> --- gcc/tree.h(revision 231065)
> +++ gcc/tree.h(working copy)
> @@ -4320,6 +4320,10 @@ extern tree staticp (tree);
>  
>  extern tree save_expr (tree);
>  
> +/* Return true if T is function-invariant.  */
> +
> +extern bool tree_invariant_p (tree);
> +
>  /* Look inside EXPR into any simple arithmetic operations.  Return the
> outermost non-arithmetic or non-invariant node.  */
>  
> Index: gcc/genmatch.c
> ===
> --- gcc/genmatch.c(revision 231065)
> +++ gcc/genmatch.c(working copy)
> @@ -3106,7 +3106,9 @@ dt_simplify::gen_1 (FILE *f, int indent,
> else if (is_a  (opr))
>   is_predicate = true;
> /* Search for captures used multiple times in the result expression
> -  and dependent on TREE_SIDE_EFFECTS emit a SAVE_EXPR.  */
> +  and check if we can safely evaluate it multiple times.  Otherwise
> +  fail, avoiding a SAVE_EXPR because that confuses the C FE
> +  const expression folding.  */
> if (!is_predicate)
>   for (int i = 0; i < s->capture_max + 1; ++i)
> {
> @@ -3114,8 +3116,8 @@ dt_simplify::gen_1 (FILE *f, int indent,
> continue;
>   if (cinfo.info[i].result_use_count > 1)
> fprintf_indent (f, indent,
> -   "captures[%d] = save_expr (captures[%d]);\n",
> -   i, i);
> +   "if (! tree_invariant_p (captures[%d])) "
> +   "return NULL_TREE;\n", i);
> }
> for (unsigned j = 0; j < e->ops.length (); ++j)
>   {
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


Re: [PATCH 3/4] Add libgomp plugin for Intel MIC

2015-11-30 Thread Aleksander Ivanyushenko
On Wed, Nov 11, 2015 at 17:56:15 +0300, Aleksander Ivanyushenko wrote:
> On Mon, Aug 24, 2015 at 10:45:03 +0200, Jakub Jelinek wrote:
> > On Thu, Aug 06, 2015 at 05:34:56PM +0300, Maxim Blumental wrote:
> > >  Applied the idea with python script alternative. Review, please.
> > 
> > > 2015-07-28  Maxim Blumenthal  
> > > 
> > >   * configure.ac: Add a check for xxd or python presence when the target
> > >   is intelmic or intelmicemul.
> > >   * configure: Regenerate.
> > >   * liboffloadmic/plugin/Makefile.am: Add a condition into
> > >   make_target_image.h generating code.  This condition performs an
> > >   action with either xxd or a special python script during the
> > >   generating.
> > >   * liboffloadmic/plugin/xxd.py: New file.
> > >   * liboffloadmic/plugin/Makefile.in: Regenerate.
> > 
> > I still don't like this, there should be no `which ...` uses in the
> > Makefile.
> > Instead, use AC_CHECK_PROG/AC_CHECK_PROGS in configure.ac, for python
> > perhaps search for python python2 python3 or what is common in the python
> > land.  And prepare the command line to use in the Makefile.am in configure
> > too, then AC_SUBST it and use the variable in there (and the variable will
> > use $@ etc.).
> Maxim has left Intel so I have fixed this issue. I tried to build with and
> without xxd, so everything works fine. ok for trunk?
> 
> 2015-11-10  Aleksander Ivanushenko  
>   Maxim Blumenthal  
> 
>   * configure.ac: Add xxd and python check for intelmic and
>   intelmicemul.
>   * configure: Regenerate.
> 
> liboffloadmic/
> 2015-11-10  Aleksander Ivanushenko  
>   Maxim Blumenthal  
>   David Malcolm  
> 
>   * plugin/xxd.py: New file.
>   * plugin/configure.ac: Add searching for xxd and python paths.
>   * plugin/Makefile.am: Add python script usage in case when xxd is not
>   available.
>   * plugin/configure: Regenerate.
>   * plugin/Makefile.in: Regenerate.
> 
>
Ping. 


Re: [PATCH] Add save_expr langhook (PR c/68513)

2015-11-30 Thread Richard Biener
On Mon, 30 Nov 2015, Marek Polacek wrote:

> On Sat, Nov 28, 2015 at 08:50:12AM +0100, Richard Biener wrote:
> > Different approach: after the FE folds (unexpectedly?), scan the result for
> > SAVE_EXPRs and if found, drop the folding.
> 
> This doesn't fix the problem completely either, because we simply don't know where
> those SAVE_EXPRs might be introduced: it might be convert(), but e.g. when I
> changed the original testcase a tiny bit (added -), then those SAVE_EXPRs were
> introduced in a different spot (via c_process_stmt_expr -> c_fully_fold).

So the following "disables" save_expr generation from generic-match.c
by failing to simplify if save_expr would end up not returning a
non-save_expr.

I expect this will make fixing PR68590 difficult (w/o re-introducing
some fold-const.c code or changing genmatch to "special-case"
things).

The other option for this PR is to re-introduce the TREE_SIDE_EFFECTS
check I removed earlier (to avoid un-CSEing large expressions at
-O0 for example) and thus only FAIL if the save_expr were needed
for correctness.

Richard.

Index: gcc/tree.c
===
--- gcc/tree.c  (revision 231065)
+++ gcc/tree.c  (working copy)
@@ -3231,8 +3231,6 @@ decl_address_ip_invariant_p (const_tree
not handle arithmetic; that's handled in skip_simple_arithmetic and
tree_invariant_p).  */
 
-static bool tree_invariant_p (tree t);
-
 static bool
 tree_invariant_p_1 (tree t)
 {
@@ -3282,7 +3280,7 @@ tree_invariant_p_1 (tree t)
 
 /* Return true if T is function-invariant.  */
 
-static bool
+bool
 tree_invariant_p (tree t)
 {
   tree inner = skip_simple_arithmetic (t);
Index: gcc/tree.h
===
--- gcc/tree.h  (revision 231065)
+++ gcc/tree.h  (working copy)
@@ -4320,6 +4320,10 @@ extern tree staticp (tree);
 
 extern tree save_expr (tree);
 
+/* Return true if T is function-invariant.  */
+
+extern bool tree_invariant_p (tree);
+
 /* Look inside EXPR into any simple arithmetic operations.  Return the
outermost non-arithmetic or non-invariant node.  */
 
Index: gcc/genmatch.c
===
--- gcc/genmatch.c  (revision 231065)
+++ gcc/genmatch.c  (working copy)
@@ -3106,7 +3106,9 @@ dt_simplify::gen_1 (FILE *f, int indent,
  else if (is_a  (opr))
is_predicate = true;
  /* Search for captures used multiple times in the result expression
-and dependent on TREE_SIDE_EFFECTS emit a SAVE_EXPR.  */
+and check if we can safely evaluate it multiple times.  Otherwise
+fail, avoiding a SAVE_EXPR because that confuses the C FE
+const expression folding.  */
  if (!is_predicate)
for (int i = 0; i < s->capture_max + 1; ++i)
  {
@@ -3114,8 +3116,8 @@ dt_simplify::gen_1 (FILE *f, int indent,
  continue;
if (cinfo.info[i].result_use_count > 1)
  fprintf_indent (f, indent,
- "captures[%d] = save_expr (captures[%d]);\n",
- i, i);
+ "if (! tree_invariant_p (captures[%d])) "
+ "return NULL_TREE;\n", i);
  }
  for (unsigned j = 0; j < e->ops.length (); ++j)
{


Re: [PATCH] Fix vector rsqrt discovery (PR tree-optimization/68501)

2015-11-30 Thread Richard Biener
On Mon, 30 Nov 2015, Jakub Jelinek wrote:

> On Mon, Nov 30, 2015 at 02:30:04PM +, Richard Sandiford wrote:
> > > keep the builtin_reciprocal hook (perhaps renamed to builtin_rsqrt)
> > > for the purpose of this condition and nothing else (i.e. return a
> > > boolean) and let the rest be determined from the optab, just commit
> > > the already posted patch, something else?
> > 
> > ...I suppose the problem with adding extra conditions to the expander
> > is that it would break cases where the expander is used for target
> > built-ins too.
> > 
> > Maybe optabs shouldn't be used for built-ins if the usage conditions
> > aren't the same.  But if that's fighting too much against existing usage,
> > the hook "hack" could check these conditions too.
> 
> Yeah, I'm aware that the target builtins use those expanders with the
> current conditions and so would need to be renamed to something different
> if we take the approach of adding the conditions to all rsqrt* expanders.
> 
> So, maybe it is best if I just apply my original patch right away so that
> the bug is fixed and we can continue discussions on how we want to handle
> it.

Yes, I've seen the IFN idea as a followup improvement and go with
your original patch for now.

Richard.


Re: [PATCH 3/4] Add libgomp plugin for Intel MIC

2015-11-30 Thread Jakub Jelinek
On Wed, Nov 11, 2015 at 05:56:15PM +0300, Aleksander Ivanyushenko wrote:
> diff --git a/configure.ac b/configure.ac
> index 9241261..b997646 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -494,6 +494,18 @@ else
>  fi])
>  AC_SUBST(extra_liboffloadmic_configure_flags)
>  
> +# Intelmic and intelmicemul require xxd or python.
> +case "${target}" in
> +  *-intelmic-* | *-intelmicemul-*)
> +AC_CHECK_PROG(xxd_present, xxd, "yes", "no")
> +AC_CHECK_PROG(python2_present, python2, "yes", "no")
> +AC_CHECK_PROG(python3_present, python3, "yes", "no")
> +if test "$xxd_present$python2_present$python3_present" = "nonono"; then
> +  AC_MSG_ERROR([cannot find neither xxd nor python])
> +fi
> +;;
> +esac

Why here?  I'd do something like that only in
liboffloadmic/plugin/configure.ac.  Furthermore, it is inconsistent
with what you actually use in liboffloadmic/plugin (where you look only
for python and above you only look for python[23]).

> @@ -73,7 +75,7 @@ main_target_image.h: offload_target_main
>   @echo "};" >> $@
>   @echo "extern \"C\" const MainTargetImage main_target_image = {" >> $@
>   @echo "  image_size, \"offload_target_main\"," >> $@
> - @cat $< | xxd -include >> $@
> + @if test "x$(xxd_path)" != "xno"; then cat $< | $(xxd_path) -include >> 
> $@; else $(python_path) $(XXD_PY) $< >> $@; fi;
>   @echo "};" >> $@

I'd prefer to use $(XXD) and $(PYTHON) instead of $(xxd_path) and 
$(python_path),
that is more consistent with dozens of other variables for other tools.

> --- a/liboffloadmic/plugin/configure.ac
> +++ b/liboffloadmic/plugin/configure.ac
> @@ -124,6 +124,10 @@ case ${enable_version_specific_runtime_libs} in
>  ;;
>  esac
>  
> +# Find path to xxd or python
> +AC_PATH_PROG(xxd_path, xxd, "no")
> +AC_PATH_PROG(python_path, python, "no")

I'd use
+AC_PATH_PROG(XXD, xxd, no)
+AC_PATH_PROGS(PYTHON, python python2 python3, no)
and then add the conditional AC_MSG_ERROR if
x$XXD = xno && x$PYTHON = xno

Jakub


Re: [PATCH] Add save_expr langhook (PR c/68513)

2015-11-30 Thread Marek Polacek
On Sat, Nov 28, 2015 at 08:50:12AM +0100, Richard Biener wrote:
> Different approach: after the FE folds (unexpectedly?), scan the result for
> SAVE_EXPRs and if found, drop the folding.

This doesn't fix the problem completely either, because we simply don't know where
those SAVE_EXPRs might be introduced: it might be convert(), but e.g. when I
changed the original testcase a tiny bit (added -), then those SAVE_EXPRs were
introduced in a different spot (via c_process_stmt_expr -> c_fully_fold).

Marek


Re: [RFC] [Patch] PR67326 - relax trap assumption by looking at similar DRS

2015-11-30 Thread H.J. Lu
On Fri, Nov 27, 2015 at 12:24 AM, Kumar, Venkataramanan
 wrote:
> Hi Richard,
>
>> -Original Message-
>> From: Richard Biener [mailto:richard.guent...@gmail.com]
>> Sent: Tuesday, November 24, 2015 9:07 PM
>> To: Kumar, Venkataramanan
>> Cc: Jakub Jelinek (ja...@redhat.com); gcc-patches@gcc.gnu.org
>> Subject: Re: [RFC] [Patch] PR67326 - relax trap assumption by looking at
>> similar DRS
>>
>> On Fri, Nov 20, 2015 at 1:02 PM, Kumar, Venkataramanan
>>  wrote:
>> > Hi Richard,
>> >
>> > As per Jakub suggestion in
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67326, the below patch fixes
>> the regression in tree if conversion.
>> > Basically allowing if conversion to happen for a candidate DR, if we find
>> similar DR with same dimensions  and that DR will not trap.
>> >
>> > To find similar DRs, a hash table is used to hash the offset and DR pairs.
>> > Also reusing  read/written information that was stored for reference tree.
>> >
>> > Also.
>> > (1) I guard these checks for  -ftree-loop-if-convert-stores and -fno-
>> common.
>> > Sometimes vectorization flags also triggers if conversion.
>> > (2) Also hashing base DRs for writes only.
>> >
>> > gcc/ChangeLog
>> > 2015-11-19  Venkataramanan  
>> >
>> > PR tree-optimization/67326
>> > * tree-if-conv.c  (offset_DR_map): Define.
>> > (struct ifc_dr): Add new tree base_predicate field.
>> > (hash_memrefs_baserefs_and_store_DRs_read_written_info): Hash
>> offsets, DR pairs
>> > and hash base ref,  DR pairs  for write type DRs.
>> > (ifcvt_memrefs_wont_trap):  Guard checks with -ftree-loop-if-
>> convert-stores flag.
>> >Check for similar DR that are accessed unconditionally.
>> >(if_convertible_loop_p_1):  Initialize and delete offset hash
>> > maps
>> >
>> > gcc/testsuite/ChangeLog
>> > 2015-11-19  Venkataramanan  
>> > * gcc.dg/tree-ssa/ifc-pr67326.c:  Add new.
>> >
>> > Regstrapped on x86_64, Ok for trunk?
>>
>> +  if (offset)
>> +{
>> +  offset_master_dr = &offset_DR_map->get_or_insert (offset, &exist3);
>> +  if (!exist3)
>> +   *offset_master_dr = a;
>> +
>> +  if (DR_RW_UNCONDITIONALLY (*offset_master_dr) != 1)
>> +   DR_RW_UNCONDITIONALLY (*offset_master_dr)
>> +   = DR_RW_UNCONDITIONALLY (*master_dr);
>>
>> this is fishy - as far as I can see offset_master globs all _candidates_ and
>>
>> +  else if (DR_OFFSET (a))
>> +{
>> +  offset_dr = offset_DR_map->get (DR_OFFSET (a));
>> +  if ((DR_RW_UNCONDITIONALLY (*offset_dr) == 1)
>> +  && DR_NUM_DIMENSIONS (a) == DR_NUM_DIMENSIONS
>> (*offset_dr))
>> +   {
>> + tree base_tree = get_base_address (DR_REF (a));
>> + if (DECL_P (base_tree)
>> + && flag_tree_loop_if_convert_stores
>> + && decl_binds_to_current_def_p (base_tree)
>> + && !TREE_READONLY (base_tree))
>> +   return true;
>> +   }
>> +}
>>
>> where with this that actually checks something (DR_NUM_DIMENSIONS is
>> not something you can use to identify two arrays with the same domain) will
>> then consider DR_RW_UNCONDITIONALLY ORed from all _candidates_ but
>> not only from those which really have the same domain.
>>
>> You need to do the domain check as part of the hash-map
>> hashing/comparing.
>>
>> Note that there is no bounds info in the data ref info so you need to
>>   a) consider DR_OFFSET + DR_INIT
>>   b) verify the access size is the same (TYPE_SIZE_UNIT (TREE_TYPE (dr-
>> >ref)))
>>   c) verify the base objects are of the same size - note this is somewhat
>> difficult as the base object for DR_OFFSET/INIT is starting at
>> DR_BASE_ADDRESS so maybe restrict this to ADDR_EXPR 
>> DR_BASE_ADDRESS cases where you can look at DECL_SIZE (decl) of both
>> candidates
>>
>> You can also try using indices (DR_BASE_OBJECT plus DR_ACCESS_FNS when
>> DR_UNCONSTRAINED_BASE is false).  If the size of DR_BASE_OBJECT
>> matches and all access functions are equal it should be a compatible enough
>> case as well.
>
> OK, I will take some time to figure out the domain analysis part.
>
>>
>> I'd say you should split out the base_predicate introduction into a separate
>> patch (this change looks ok).
>>
>
> Attached patch has the  "base_predicate" introduction part alone.
> It does the predicate folding  and hashes base references for only write type 
> DRs while hashing.
> I have not added any new test case since we already have  ifc-8.c
>
> Also fixed formatting issues Jakub  pointed out for this patch.
>
> Bootstrapped on x86_64.
>
> Ok to upstream if it passes regression tests?
>
> gcc/ChangeLog
> 2015-11-27  Venkataramanan Kumar  
>
> * tree-if-conv.c (struct ifc_dr): Add new tree
> base_predicate field.
> (hash_memrefs_baserefs_and_store_DRs_read_written_info): Hash
> base ref, DR pairs 

Re: S/390: Fix warnings in "*setmem_long..." patterns.

2015-11-30 Thread Andreas Krebbel
On 11/30/2015 04:11 PM, Dominik Vogt wrote:
> The attached patch fixes some warnings generated by the setmem...
> patterns in s390.md during build and add test cases for the
> patterns.  The patch is to be added on top of the movstr patch:
> https://gcc.gnu.org/ml/gcc-patches/2015-11/msg03485.html
> 
> The test cases validate that the patterns are actually used, but
> at the moment the setmem_long_and pattern is never actually used
> and thus the test case would fail.  So I've split the patch in two
> (both attached to this message) to activate this part of the test
> once we've fixed that.
> 
> The patch has passed the SPEC2006 testsuite without any measurable
> changes in performance.

Shouldn't we instead describe the whole setmem operation as an unspec,
including the other operands as well?  The semantics of the introduced
UNSPEC_P_TO_BLK operation are not clear to me.  The name suggests some
kind of "cast", which it isn't.  In fact it is not able to do its job
without the length, which is specified as a use outside the unspec.

Bye,

-Andreas-



Re: S/390: Fix warnings in "*setmem_long..." patterns.

2015-11-30 Thread Andreas Krebbel
On 11/30/2015 06:11 PM, Ulrich Weigand wrote:
...
> However, I agree that UNSPEC_P_TO_BLK really should also get the length
> as input, to make it have precisely defined semantics.  Also, I'd rather
> use a more descriptive name, like UNSPEC_REPLICATE_BYTE or the like.
> 
> What would you think about something like the following?
> 
> (define_insn "*setmem_long"
>   [(clobber (match_operand: 0 "register_operand" "=d"))
>(set (mem:BLK (subreg:P (match_operand: 3 "register_operand" "0") 0))
> (unspec:BLK [(match_operand:P 2 "shift_count_or_setmem_operand" "Y")
>  (subreg:P (match_dup 3) 1)] UNSPEC_REPLICATE_BYTE))
>(use (match_operand: 1 "register_operand" "d"))
>(clobber (reg:CC CC_REGNUM))]

Fine with me. Thanks!

Bye,

-Andreas-
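
For readers following the thread, the point about the length can be
sketched in plain C.  This is only an illustration under assumptions,
not the s390 pattern: replicate_byte is a hypothetical helper name, and
the sketch merely shows that a "replicate byte" operation is fully
defined only once the fill byte and the length are both inputs.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of GCC or the s390 backend.
   Fills LEN bytes at DST with the low byte of VALUE.  Without LEN
   the result of the operation would not be well defined.  */
void
replicate_byte (unsigned char *dst, unsigned int value, size_t len)
{
  memset (dst, (int) (value & 0xff), len);
}
```

Only the first LEN bytes of the destination change; everything past
that is left alone, which is the part a length-less "cast" cannot
express.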



[gomp4] Re: [PATCH, 10/16] Add pass_oacc_kernels pass group in passes.def

2015-11-30 Thread Thomas Schwinge
Hi!

On Wed, 25 Nov 2015 11:43:14 +0100 (CET), Richard Biener  
wrote:
> On Tue, 24 Nov 2015, Tom de Vries wrote:
> > > [...]
> > 
> > Reposting using the in_loop_pipeline style in pass_lim.
> 
> Ok.

I merged trunk r230907 into gomp-4_0-branch in a very simplistic way,
basically just moving pass_fre in between pass_oacc_kernels and the (new)
pass_oacc_kernels2 pass groups.  We'll want to clean this up later (on
gomp-4_0-branch), once we're more clear on what difference will remain
between the trunk and gomp-4_0-branch pass structures (if any); for now
this makes sure we don't regress OpenACC kernels functionality on
gomp-4_0-branch.  In gomp-4_0-branch r231078, I effectively applied the
following:

commit ffae8a36e195172327a233bd397a4230a7939681
Merge: 8249e60 e1e1688
Author: tschwinge 
Date:   Mon Nov 30 17:28:07 2015 +

svn merge -r 230906:230907 svn+ssh://gcc.gnu.org/svn/gcc/trunk


git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@231078 
138bc75d-0d04-0410-961f-82ee72b054a4

 gcc/ChangeLog   |  6 
 gcc/passes.def  | 13 +++--
 gcc/testsuite/ChangeLog | 76 +
 3 files changed, 92 insertions(+), 3 deletions(-)

[diff --git gcc/ChangeLog gcc/ChangeLog]
diff --git gcc/passes.def gcc/passes.def
index f4eb235..9fe4fec 100644
--- gcc/passes.def
+++ gcc/passes.def
@@ -84,36 +84,43 @@ along with GCC; see the file COPYING3.  If not see
  /* After CCP we rewrite no longer addressed locals into SSA
 form if possible.  */
  NEXT_PASS (pass_forwprop);
  NEXT_PASS (pass_sra_early);
  /* pass_build_ealias is a dummy pass that ensures that we
 execute TODO_rebuild_alias at this point.  */
  NEXT_PASS (pass_build_ealias);
- /* Pass group that runs when there are oacc kernels in the
-function.  */
+ /* Pass group that runs when the function is an offloaded function
+containing oacc kernels loops.  Part 1.  */
  NEXT_PASS (pass_oacc_kernels);
  PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels)
  NEXT_PASS (pass_dominator, false /* may_peel_loop_headers_p */);
  NEXT_PASS (pass_ch);
  NEXT_PASS (pass_dominator, false /* may_peel_loop_headers_p */);
+ POP_INSERT_PASSES ()
+ NEXT_PASS (pass_fre);
+ /* Pass group that runs when the function is an offloaded function
+containing oacc kernels loops.  Part 2.  */
+ NEXT_PASS (pass_oacc_kernels2);
+ PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels2)
+ /* We use pass_lim to rewrite in-memory iteration and reduction
+variable accesses in loops into local variables accesses.  */
  NEXT_PASS (pass_tree_loop_init);
  NEXT_PASS (pass_lim);
  NEXT_PASS (pass_copy_prop);
  NEXT_PASS (pass_lim);
  NEXT_PASS (pass_copy_prop);
  NEXT_PASS (pass_scev_cprop);
  NEXT_PASS (pass_tree_loop_done);
  NEXT_PASS (pass_dominator, false /* may_peel_loop_headers_p */);
  NEXT_PASS (pass_dce);
  NEXT_PASS (pass_tree_loop_init);
  NEXT_PASS (pass_parallelize_loops_oacc_kernels);
  NEXT_PASS (pass_expand_omp_ssa);
  NEXT_PASS (pass_tree_loop_done);
  POP_INSERT_PASSES ()
- NEXT_PASS (pass_fre);
  NEXT_PASS (pass_merge_phi);
   NEXT_PASS (pass_dse);
  NEXT_PASS (pass_cd_dce);
  NEXT_PASS (pass_early_ipa_sra);
  NEXT_PASS (pass_tail_recursion);
  NEXT_PASS (pass_convert_switch);
  NEXT_PASS (pass_cleanup_eh);
[diff --git gcc/testsuite/ChangeLog gcc/testsuite/ChangeLog]

..., so the following difference from trunk to gomp-4_0-branch remains to
be resolved/reduced (plus the corresponding testsuite tree dump scanning
changes):

--- gcc/passes.def
+++ gcc/passes.def
@@ -89,25 +89,36 @@ along with GCC; see the file COPYING3.  If not see
 execute TODO_rebuild_alias at this point.  */
  NEXT_PASS (pass_build_ealias);
  /* Pass group that runs when the function is an offloaded function
 containing oacc kernels loops.  Part 1.  */
  NEXT_PASS (pass_oacc_kernels);
  PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels)
+ NEXT_PASS (pass_dominator, false /* may_peel_loop_headers_p */);
  NEXT_PASS (pass_ch);
+ NEXT_PASS (pass_dominator, false /* may_peel_loop_headers_p */);
  POP_INSERT_PASSES ()
  NEXT_PASS (pass_fre);
  /* Pass group that runs when the function is an offloaded function
 containing oacc kernels loops.  Part 2.  */
  NEXT_PASS (pass_oacc_kernels2);
  PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels2)
  /* We use pass_lim to rewrite 

Re: [PATCH, PR46032] Handle BUILT_IN_GOMP_PARALLEL in ipa-pta

2015-11-30 Thread Tom de Vries

On 30/11/15 17:48, Jakub Jelinek wrote:

On Mon, Nov 30, 2015 at 05:36:25PM +0100, Tom de Vries wrote:

+int
+main (void)
+{
+  unsigned results[nEvents];
+  unsigned pData[nEvents];
+  unsigned coeff = 2;
+
+  init ([0], [0]);
+
+#pragma omp parallel for
+  for (int idx = 0; idx < (int)nEvents; idx++)
+results[idx] = coeff * pData[idx];


Could you please add another testcase, where you have say pData
and some other pointer that init sets to alias with pData, and verify
that such loop (would need to be say normal loop inside #pragma omp single
or master) is not vectorized?


I've:
- added a simpler (not vectorizer-based) version of the testcase as
  pr46032-2.c, and
- copied pr46032-2.c to pr46032-3.c and modified it such that two
  pointers are aliasing

Committed to trunk.

Thanks,
- Tom

Add gcc.dg/pr46032-{2,3}.c test-cases

2015-11-30  Tom de Vries  

	* gcc.dg/pr46032-2.c: New test.
	* gcc.dg/pr46032-3.c: New test.

---
 gcc/testsuite/gcc.dg/pr46032-2.c | 29 +
 gcc/testsuite/gcc.dg/pr46032-3.c | 28 
 2 files changed, 57 insertions(+)

diff --git a/gcc/testsuite/gcc.dg/pr46032-2.c b/gcc/testsuite/gcc.dg/pr46032-2.c
new file mode 100644
index 000..e110880
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr46032-2.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fopenmp -std=c99 -fipa-pta -fdump-tree-optimized" } */
+
+#define N 2
+
+int
+foo (void)
+{
+  int a[N], b[N], c[N];
+  int *ap = &a[0];
+  int *bp = &b[0];
+  int *cp = &c[0];
+
+#pragma omp parallel for
+  for (unsigned int idx = 0; idx < N; idx++)
+{
+  ap[idx] = 1;
+  bp[idx] = 2;
+  cp[idx] = ap[idx];
+}
+
+  return *cp;
+}
+
+/* { dg-final { scan-tree-dump-times "\\] = 1;" 2 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "\\] = 2;" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "\\] = _\[0-9\]*;" 0 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "\\] = " 3 "optimized" } } */
+
diff --git a/gcc/testsuite/gcc.dg/pr46032-3.c b/gcc/testsuite/gcc.dg/pr46032-3.c
new file mode 100644
index 000..a4af7ec
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr46032-3.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fopenmp -std=c99 -fipa-pta -fdump-tree-optimized" } */
+
+#define N 2
+
+int
+foo (void)
+{
+  int a[N], c[N];
+  int *ap = &a[0];
+  int *bp = &a[0];
+  int *cp = &c[0];
+
+#pragma omp parallel for
+  for (unsigned int idx = 0; idx < N; idx++)
+{
+  ap[idx] = 1;
+  bp[idx] = 2;
+  cp[idx] = ap[idx];
+}
+
+  return *cp;
+}
+
+/* { dg-final { scan-tree-dump-times "\\] = 1;" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "\\] = 2;" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "\\] = _\[0-9\]*;" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "\\] = " 3 "optimized" } } */
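
Why the two tests expect different dumps can be seen from a serial
version of the loop.  The following is a minimal sketch, assuming that
in the aliasing variant bp points into the same array as ap; the
function names run_loop, distinct_case and aliased_case are made up for
illustration and do not appear in the testsuite.

```c
#include <stddef.h>

#define N 2

/* Serial version of the loop body shared by both tests.  */
static int
run_loop (int *ap, int *bp, int *cp)
{
  for (size_t idx = 0; idx < N; idx++)
    {
      ap[idx] = 1;
      bp[idx] = 2;
      cp[idx] = ap[idx];	/* Reloads ap[idx] after the store via bp.  */
    }
  return *cp;
}

/* Three distinct arrays: the store through bp cannot change ap[idx],
   so the value copied to cp is known to be 1.  */
int
distinct_case (void)
{
  int a[N], b[N], c[N];
  return run_loop (a, b, c);
}

/* bp aliases ap: the store through bp overwrites ap[idx], so the value
   copied to cp has to be loaded again.  */
int
aliased_case (void)
{
  int a[N], c[N];
  return run_loop (a, a, c);
}
```

With distinct arrays the copy can be replaced by the constant 1; with
aliasing pointers it cannot, which matches the single SSA-name store
the second test scans for.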


[openacc] fortran loop clauses and splitting

2015-11-30 Thread Cesar Philippidis
This patch contains the following bug fixes:

 * Teaches gfortran to accept both num and static gang arguments inside
   the same clause. E.g. gang(num:10, static:30). Currently, gfortran only
   allows one of those arguments to appear in a gang clause.

 * Makes the diagnostics reported by resolve_oacc_positive_int_expr more
   accurate for worker and vector clauses.

 * Updates how combined loops are split to account for the renamed gang
   clause members in gfc_omp_clauses.  Also corrects a bug that Tom
   discovered in the C front end where combined reductions were being
   attached to kernels and parallel constructs. Now, they are only
   associated with the split acc loop.
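
The accepted argument forms of the gang clause can be sketched as a
small standalone parser.  This is a rough sketch under assumptions, not
the gfortran matcher: struct gang_args and parse_gang_args are
hypothetical names, only integer literals are handled instead of full
expressions, and whitespace handling is simplified.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type; not the gfc_omp_clauses layout.  */
struct gang_args
{
  bool has_num;
  long num;
  bool has_static;
  bool static_star;	/* True for "static:*".  */
  long static_val;
};

static const char *
skip_ws (const char *p)
{
  while (isspace ((unsigned char) *p))
    p++;
  return p;
}

/* Parse a non-negative integer literal; return NULL if none is found.  */
static const char *
parse_long (const char *p, long *out)
{
  char *end;
  p = skip_ws (p);
  if (!isdigit ((unsigned char) *p))
    return NULL;
  *out = strtol (p, &end, 10);
  return end;
}

/* Parse the text between the parentheses of gang(...).  Accepts num and
   static in either order, with "num:" optional.  Returns 0 on success
   and -1 on malformed input or a repeated argument.  */
int
parse_gang_args (const char *p, struct gang_args *g)
{
  memset (g, 0, sizeof *g);
  for (;;)
    {
      p = skip_ws (p);
      if (strncmp (p, "static:", 7) == 0)
	{
	  if (g->has_static)
	    return -1;
	  g->has_static = true;
	  p = skip_ws (p + 7);
	  if (*p == '*')
	    {
	      g->static_star = true;
	      p++;
	    }
	  else if (!(p = parse_long (p, &g->static_val)))
	    return -1;
	}
      else
	{
	  if (g->has_num)
	    return -1;
	  if (strncmp (p, "num:", 4) == 0)
	    p += 4;
	  if (!(p = parse_long (p, &g->num)))
	    return -1;
	  g->has_num = true;
	}
      p = skip_ws (p);
      if (*p == '\0')
	return 0;
      if (*p != ',')
	return -1;
      p++;
    }
}
```

For example, "num:10, static:30" and "static:*, 5" are both accepted,
while "num:1, num:2" is rejected as a duplicate.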

Is this OK for trunk?

Cesar
2015-11-30  Cesar Philippidis  

	gcc/fortran/
	* dump-parse-tree.c (show_omp_clauses): Handle optional num and static
	arguments for the gang clause.
	* gfortran.h (gfc_omp_clauses): Rename gang_expr as gang_num_expr.
	Add gang_static_expr.
	* openmp.c (gfc_free_omp_clauses): Update to free gang_num_expr and
	gang_static_expr.
	(match_oacc_clause_gang): Update to support both num and static in
	the same clause.
	(resolve_omp_clauses): Formatting.  Also handle gang_num_expr and
	gang_static_expr.
	(resolve_oacc_params_in_parallel): New const char arg argument.
	Use it to report more accurate gang, worker and vector clause errors.
	(resolve_oacc_loop_blocks): Update calls to
	resolve_oacc_params_in_parallel.
	* trans-openmp.c (gfc_trans_omp_clauses): Update the gimplification of
	the gang clause.
	(gfc_trans_oacc_combined_directive): Make use of gang_num_expr and
	gang_static_expr.  Remove OMP_LIST_REDUCTION from construct_clauses.

	gcc/testsuite/
	* gfortran.dg/goacc/gang-static.f95: Add tests for gang num arguments.
	* gfortran.dg/goacc/loop-2.f95: Update expected diagnostics.
	* gfortran.dg/goacc/loop-6.f95: Likewise.
	* gfortran.dg/goacc/loop-7.f95: New test.
	* gfortran.dg/goacc/reduction-2.f95: New test.

diff --git a/gcc/fortran/dump-parse-tree.c b/gcc/fortran/dump-parse-tree.c
index 48476af..f9abf40 100644
--- a/gcc/fortran/dump-parse-tree.c
+++ b/gcc/fortran/dump-parse-tree.c
@@ -1146,10 +1146,24 @@ show_omp_clauses (gfc_omp_clauses *omp_clauses)
   if (omp_clauses->gang)
 {
   fputs (" GANG", dumpfile);
-  if (omp_clauses->gang_expr)
+  if (omp_clauses->gang_num_expr || omp_clauses->gang_static_expr)
 	{
 	  fputc ('(', dumpfile);
-	  show_expr (omp_clauses->gang_expr);
+	  if (omp_clauses->gang_num_expr)
+	{
+	  fprintf (dumpfile, "num:");
+	  show_expr (omp_clauses->gang_num_expr);
+	}
+	  if (omp_clauses->gang_num_expr && omp_clauses->gang_static)
+	fputc (',', dumpfile);
+	  if (omp_clauses->gang_static)
+	{
+	  fprintf (dumpfile, "static:");
+	  if (omp_clauses->gang_static_expr)
+		show_expr (omp_clauses->gang_static_expr);
+	  else
+		fputc ('*', dumpfile);
+	}
 	  fputc (')', dumpfile);
 	}
 }
diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index 5487c93..90b03ef 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -1226,7 +1226,8 @@ typedef struct gfc_omp_clauses
 
   /* OpenACC. */
   struct gfc_expr *async_expr;
-  struct gfc_expr *gang_expr;
+  struct gfc_expr *gang_static_expr;
+  struct gfc_expr *gang_num_expr;
   struct gfc_expr *worker_expr;
   struct gfc_expr *vector_expr;
   struct gfc_expr *num_gangs_expr;
diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index a07cee1..2941ad4 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -77,7 +77,8 @@ gfc_free_omp_clauses (gfc_omp_clauses *c)
   gfc_free_expr (c->thread_limit);
   gfc_free_expr (c->dist_chunk_size);
   gfc_free_expr (c->async_expr);
-  gfc_free_expr (c->gang_expr);
+  gfc_free_expr (c->gang_num_expr);
+  gfc_free_expr (c->gang_static_expr);
   gfc_free_expr (c->worker_expr);
   gfc_free_expr (c->vector_expr);
   gfc_free_expr (c->num_gangs_expr);
@@ -395,21 +396,41 @@ cleanup:
 static match
 match_oacc_clause_gang (gfc_omp_clauses *cp)
 {
-  if (gfc_match_char ('(') != MATCH_YES)
+  match ret = MATCH_YES;
+
+  if (gfc_match (" ( ") != MATCH_YES)
 return MATCH_NO;
-  if (gfc_match (" num :") == MATCH_YES)
-{
-  cp->gang_static = false;
-  return gfc_match (" %e )", &cp->gang_expr);
-}
-  if (gfc_match (" static :") == MATCH_YES)
+
+  /* The gang clause accepts two optional arguments, num and static.
+ The num argument may either be explicit (num: ) or
+ implicit without ( without num:).  */
+
+  while (ret == MATCH_YES)
 {
-  cp->gang_static = true;
-  if (gfc_match (" * )") != MATCH_YES)
-	return gfc_match (" %e )", &cp->gang_expr);
-  return MATCH_YES;
+  if (gfc_match (" static :") == MATCH_YES)
+	{
+	  if (cp->gang_static)
+	return MATCH_ERROR;
+	  else
+	cp->gang_static = true;
+	  if (gfc_match_char ('*') == MATCH_YES)
+	cp->gang_static_expr = NULL;
+	  else if (gfc_match (" %e ", &cp->gang_static_expr) != MATCH_YES)
+	return 

Re: [OpenACC 0/7] host_data construct

2015-11-30 Thread Julian Brown
On Thu, 19 Nov 2015 16:57:23 +0100
Jakub Jelinek  wrote:

> If it is unclear, I think disallowing acc {parallel,kernels} inside of
> acc host_data might be too big hammer, but perhaps just erroring out
> or warning during gimplification that if you (explicitly or
> implicitly) try to map a var that is in use_device clause in some
> outer context, it is either wrong, unsupported or will not do what
> users think?

I think we can only assume that trying to map a variable declared in
a surrounding use_device clause is undefined behaviour. I haven't had
any response to my questions about host_data & deviceptr on the OpenACC
list.

> > #pragma acc host_data use_device(x)
> > {
> >   target_primitive(x);
> >   #pragma acc parallel deviceptr(x)
> >   {
> > ...
> >   }
> > }
> 
> Is deviceptr as above meant to work?  That is the OpenACC counterpart
> of is_device_ptr, right?  If yes, then I'd suggest just warning if you
> try to implicitly or explicitly map something use_device in outer
> contexts, and just make sure you don't ICE on the cases where you
> warn. If the standard does not say what it means, then it is
> unspecified behavior...

A problem with deviceptr, unlike is_device_ptr, is that it turns out to
be defined only to work with pointers, not arrays (OpenACC 2.0a
2.6.5.2), and there are no rules describing the latter decaying to the
former. So at least if 'x' is an array, it appears the answer is "no".

So, the attached patch disallows (via raising an error):

* Variables being declared in explicit mapping clauses that are
  declared in enclosing host_data regions.

* Variables being implicitly used (mapped) in offloaded regions that
  are declared in enclosing host_data regions.

It's otherwise equivalent to the previously-posted version, but without
the hacks to {maybe_,}lookup_decl_in_outer_ctx. I added checks for the
above conditions during gimplification, which seemed to be about the
same phase that other similar kinds of errors are diagnosed.
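
The kind of check described above can be sketched outside of GCC as a
walk over enclosing regions.  This is a minimal sketch under
assumptions: struct region, enum region_kind and
in_enclosing_use_device are hypothetical names for illustration and are
not the data structures used in gimplify.c.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical region kinds; not GCC's omp_region_type.  */
enum region_kind { REGION_HOST_DATA, REGION_OFFLOAD };

struct region
{
  enum region_kind kind;
  const char *const *use_device;	/* Names listed in use_device.  */
  size_t n_use_device;
  const struct region *outer;		/* Enclosing region or NULL.  */
};

/* Return true if NAME appears in a use_device clause of any host_data
   region that encloses CTX.  A mapping of such a variable in CTX,
   explicit or implicit, is what would be diagnosed.  */
bool
in_enclosing_use_device (const struct region *ctx, const char *name)
{
  for (const struct region *r = ctx->outer; r != NULL; r = r->outer)
    {
      if (r->kind != REGION_HOST_DATA)
	continue;
      for (size_t i = 0; i < r->n_use_device; i++)
	if (strcmp (r->use_device[i], name) == 0)
	  return true;
    }
  return false;
}
```

A parallel region nested in host_data use_device(x) would report x, but
not an unrelated variable y.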

Tests look OK (libgomp/gcc/g++/libstdc++), and the new ones pass.

OK for mainline?

Thanks,

Julian

ChangeLog

Julian Brown  
Cesar Philippidis  
James Norris  

gcc/
* c-family/c-pragma.c (oacc_pragmas): Add PRAGMA_OACC_HOST_DATA.
* c-family/c-pragma.h (pragma_kind): Add PRAGMA_OACC_HOST_DATA.
(pragma_omp_clause): Add PRAGMA_OACC_CLAUSE_USE_DEVICE.
* c/c-parser.c (c_parser_omp_clause_name): Add use_device support.
(c_parser_oacc_clause_use_device): New function.
(c_parser_oacc_all_clauses): Add use_device support.
(OACC_HOST_DATA_CLAUSE_MASK): New macro.
(c_parser_oacc_host_data): New function.
(c_parser_omp_construct): Add host_data support.
* c/c-tree.h (c_finish_oacc_host_data): Add prototype.
* c/c-typeck.c (c_finish_oacc_host_data): New function.
(c_finish_omp_clauses): Add use_device support.
* cp/cp-tree.h (finish_oacc_host_data): Add prototype.
* cp/parser.c (cp_parser_omp_clause_name): Add use_device support.
(cp_parser_oacc_all_clauses): Add use_device support.
(OACC_HOST_DATA_CLAUSE_MASK): New macro.
(cp_parser_oacc_host_data): New function.
(cp_parser_omp_construct): Add host_data support.
(cp_parser_pragma): Add host_data support.
* cp/semantics.c (finish_omp_clauses): Add use_device support.
(finish_oacc_host_data): New function.
* gimple-pretty-print.c (dump_gimple_omp_target): Add host_data
support.
* gimple.h (gf_mask): Add GF_OMP_TARGET_KIND_OACC_HOST_DATA.
(is_gimple_omp_oacc): Add support for above.
* gimplify.c (omp_region_type): Add ORT_ACC_HOST_DATA.
(omp_notice_variable): Diagnose undefined implicit uses of
use_device variables in offloaded regions.
(gimplify_scan_omp_clauses): Add host_data, use_device
support. Diagnose undefined mapping of use_device variables in
OpenACC clauses.
(gimplify_omp_workshare): Add host_data support.
(gimplify_expr): Likewise.
* omp-builtins.def (BUILT_IN_GOACC_HOST_DATA): New.
* omp-low.c (lookup_decl_in_outer_ctx)
(maybe_lookup_decl_in_outer_ctx): Add optional argument to skip
host_data regions.
(scan_sharing_clauses): Support use_device.
(check_omp_nesting_restrictions): Support host_data.
(expand_omp_target): Support host_data.
(lower_omp_target): Skip over outer host_data regions when looking
up decls. Support use_device.
(make_gimple_omp_edges): Support host_data.
* tree-nested.c (convert_nonlocal_omp_clauses): Add use_device
clause.

libgomp/
* oacc-parallel.c (GOACC_host_data): New function.
* libgomp.map (GOACC_host_data): Add to GOACC_2.0.1.
* testsuite/libgomp.oacc-c-c++-common/host_data-1.c: New test.
* testsuite/libgomp.oacc-c-c++-common/host_data-2.c: New test.
* testsuite/libgomp.oacc-c-c++-common/host_data-3.c: New test.
* 

Re: [PATCH 1/2][ARM] PR/65956 AAPCS update for alignment attribute

2015-11-30 Thread Florian Weimer
On 11/27/2015 06:55 PM, Eric Botcazou wrote:

> There is no official ABI for Ada so I guess that's not really a problem as 
> long as it's documented on https://gcc.gnu.org/gcc-5/changes.html.

It's still surprising to make such a far-reaching change in a minor
release, I think.

Florian



[PATCH 1/2] [graphite] always print parameter names as P_{SSA_NAME_VERSION}

2015-11-30 Thread Sebastian Pop
---
 gcc/graphite-isl-ast-to-gimple.c  |  4 ++--
 gcc/graphite-scop-detection.c | 31 ---
 gcc/graphite-sese-to-poly.c   | 16 +++-
 gcc/testsuite/gcc.dg/graphite/pr35356-1.c |  2 +-
 4 files changed, 22 insertions(+), 31 deletions(-)

diff --git a/gcc/graphite-isl-ast-to-gimple.c b/gcc/graphite-isl-ast-to-gimple.c
index 33423dd..16cb5fa 100644
--- a/gcc/graphite-isl-ast-to-gimple.c
+++ b/gcc/graphite-isl-ast-to-gimple.c
@@ -2220,7 +2220,7 @@ translate_isl_ast_to_gimple::copy_loop_close_phi_args 
(basic_block old_bb,
   get_loc (old_name));
   if (dump_file)
{
- fprintf (dump_file, "[codegen] Adding loop-closed phi: ");
+ fprintf (dump_file, "[codegen] Adding loop close phi: ");
  print_gimple_stmt (dump_file, new_close_phi, 0, 0);
}
 
@@ -2265,7 +2265,7 @@ translate_isl_ast_to_gimple::copy_loop_close_phi_nodes 
(basic_block old_bb,
basic_block new_bb)
 {
   if (dump_file)
-fprintf (dump_file, "[codegen] copying loop closed phi nodes in bb_%d.\n",
+fprintf (dump_file, "[codegen] copying loop close phi nodes in bb_%d.\n",
 new_bb->index);
   /* Loop close phi nodes should have only one argument.  */
   gcc_assert (1 == EDGE_COUNT (old_bb->preds));
diff --git a/gcc/graphite-scop-detection.c b/gcc/graphite-scop-detection.c
index 1f8fc76..2f4231a 100644
--- a/gcc/graphite-scop-detection.c
+++ b/gcc/graphite-scop-detection.c
@@ -382,7 +382,7 @@ canonicalize_loop_closed_ssa (loop_p loop)
   if (single_pred_p (bb))
 {
   e = split_block_after_labels (bb);
-  DEBUG_PRINT (dp << "\nSplitting bb_" << bb->index);
+  DEBUG_PRINT (dp << "Splitting bb_" << bb->index << ".\n");
   make_close_phi_nodes_unique (e->src);
 }
   else
@@ -391,7 +391,7 @@ canonicalize_loop_closed_ssa (loop_p loop)
   basic_block close = split_edge (e);
 
   e = single_succ_edge (close);
-  DEBUG_PRINT (dp << "\nSplitting edge (" << e->src->index << ","
+  DEBUG_PRINT (dp << "Splitting edge (" << e->src->index << ","
  << e->dest->index << ")\n");
 
   for (psi = gsi_start_phis (bb); !gsi_end_p (psi); gsi_next (&psi))
@@ -846,7 +846,7 @@ scop_detection::merge_sese (sese_l first, sese_l second) 
const
combined.exit = single_succ_edge (imm_succ);
   else
{
- DEBUG_PRINT (dp << "\n[scop-detection-fail] Discarding SCoP because "
+ DEBUG_PRINT (dp << "[scop-detection-fail] Discarding SCoP because "
  << "no single exit (empty succ) for sese exit";
   print_sese (dump_file, combined));
  return invalid_sese;
@@ -870,7 +870,7 @@ scop_detection::build_scop_depth (sese_l s, loop_p loop)
   if (!loop)
 return s;
 
-  DEBUG_PRINT (dp << "\n[Depth loop_" << loop->num << "]");
+  DEBUG_PRINT (dp << "[Depth loop_" << loop->num << "]\n");
   s = build_scop_depth (s, loop->inner);
 
   sese_l s2 = merge_sese (s, get_sese (loop));
@@ -895,7 +895,7 @@ scop_detection::build_scop_breadth (sese_l s1, loop_p loop)
 {
   if (!loop)
 return s1;
-  DEBUG_PRINT (dp << "\n[Breadth loop_" << loop->num << "]");
+  DEBUG_PRINT (dp << "[Breadth loop_" << loop->num << "]\n");
   gcc_assert (s1);
 
   loop_p l = loop;
@@ -981,7 +981,7 @@ scop_detection::loop_is_valid_scop (loop_p loop, sese_l 
scop) const
   if (loop_body_is_valid_scop (loop, scop))
 {
   DEBUG_PRINT (dp << "[valid-scop] loop_" << loop->num
- << "is a valid scop.\n");
+ << " is a valid scop.\n");
   return true;
 }
   return false;
@@ -1013,15 +1013,15 @@ scop_detection::add_scop (sese_l s)
   /* Do not add scops with only one loop.  */
   if (region_has_one_loop (s))
 {
-  DEBUG_PRINT (dp << "\n[scop-detection-fail] Discarding one loop SCoP";
+  DEBUG_PRINT (dp << "[scop-detection-fail] Discarding one loop SCoP.\n";
   print_sese (dump_file, s));
   return;
 }
 
   if (get_exit_bb (s) == EXIT_BLOCK_PTR_FOR_FN (cfun))
 {
-  DEBUG_PRINT (dp << "\n[scop-detection-fail] "
- << "Discarding SCoP exiting to return";
+  DEBUG_PRINT (dp << "[scop-detection-fail] "
+ << "Discarding SCoP exiting to return.";
   print_sese (dump_file, s));
   return;
 }
@@ -1033,7 +1033,7 @@ scop_detection::add_scop (sese_l s)
   remove_intersecting_scops (s);
 
   scops.safe_push (s);
-  DEBUG_PRINT (dp << "\nAdding SCoP "; print_sese (dump_file, s));
+  DEBUG_PRINT (dp << "Adding SCoP "; print_sese (dump_file, s));
 }
 
 /* Return true when a statement in SCOP cannot be represented by Graphite.
@@ -1047,7 +1047,7 @@ scop_detection::harmful_stmt_in_region (sese_l scop) const
   basic_block exit_bb = get_exit_bb (scop);
   basic_block entry_bb = get_entry_bb (scop);
 
-  DEBUG_PRINT (dp << "\n[checking-harmful-bbs] ";

Re: [patch] c/c++ asan tests for FreeBSD

2015-11-30 Thread Andreas Tobler

On 30.11.15 17:22, Jakub Jelinek wrote:

On Mon, Nov 30, 2015 at 05:17:29PM +0100, Bernd Schmidt wrote:

On 11/30/2015 01:12 PM, Andreas Tobler wrote:

On 30.11.15 11:28, Bernd Schmidt wrote:

On 11/29/2015 08:32 PM, Andreas Tobler wrote:

-/* { dg-do run { target { *-*-linux* } } } */
+/* { dg-do run { target { *-*-linux* *-*-freebsd* } } } */


I see a patch from you to add asan support to x86 freebsd, but what
about other architectures?


You mean because of the wildcard? I'll add them as I have time to port
them.

For now they are UNSUPPORTED.


Is that how they show up, or do you get FAILs on other FreeBSDs?


This is inside of asan.exp, which is guarded with
check_effective_target_fsanitize_address
and therefore should not be run at all on non-asan targets.


It manifests this way:

/usr/local/bin/ld: cannot find libasan_preinit.o: No such file or directory
/usr/local/bin/ld: cannot find -lasan
collect2: error: ld returned 1 exit status

Then it bails out and the asan tests are skipped.

...
testsuite/gcc.dg/asan/asan.exp completed in 1 seconds
...

There is no UNSUPPORTED in the log file.


I think the testsuite changes are fine, but it IMHO doesn't make sense to
commit it until the FreeBSD asan support lands (which is dependent on
the upstream libsanitizer change I believe).  Once it happens, it can be
cherry-picked from there, the config/i386 part looks reasonable.


I agree that it doesn't make much sense to commit for the public, but 
I'd have a patch less on the table ;)


But, no problem at all.

This is the cherry I'd like to pick once it has landed :)

http://reviews.llvm.org/D15049

The part for lib/asan/asan_linux.cc.

Thanks for the comments!
Andreas


[PATCH 2/2] [graphite] check for ISL generated code that leads to division by zero

2015-11-30 Thread Sebastian Pop
We used to generate modulo and division by zero because ISL uses big numbers
that translate to zero in modulo arithmetic.  The patch also improves error
handling and bails out early in case of wrong code generation.
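
The underlying arithmetic can be reproduced in a few lines of C.  This
is a minimal sketch, assuming a 64-bit unsigned type stands in for the
GCC tree type; pow2_mod_2_64 and checked_div are hypothetical helpers,
not functions from graphite.

```c
#include <stdbool.h>
#include <stdint.h>

/* Reduce 2^EXP modulo 2^64, the way a 64-bit unsigned type would.
   2^64 itself becomes 0.  */
uint64_t
pow2_mod_2_64 (unsigned exp)
{
  return exp >= 64 ? 0 : (uint64_t) 1 << exp;
}

/* Guarded division: report failure instead of dividing by zero.  This
   is analogous to setting an error flag and returning early.  */
bool
checked_div (uint64_t lhs, uint64_t rhs, uint64_t *out)
{
  if (rhs == 0)
    return false;
  *out = lhs / rhs;
  return true;
}
```

A divisor of 2^64 is exact in arbitrary precision but reduces to zero
in 64 bits, so checked_div (10, pow2_mod_2_64 (64), &q) fails rather
than trapping.
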
---
 gcc/graphite-isl-ast-to-gimple.c | 85 +++-
 1 file changed, 83 insertions(+), 2 deletions(-)

diff --git a/gcc/graphite-isl-ast-to-gimple.c b/gcc/graphite-isl-ast-to-gimple.c
index 16cb5fa..bfce316 100644
--- a/gcc/graphite-isl-ast-to-gimple.c
+++ b/gcc/graphite-isl-ast-to-gimple.c
@@ -502,7 +502,7 @@ private:
 tree
 translate_isl_ast_to_gimple::
 gcc_expression_from_isl_ast_expr_id (tree type,
-__isl_keep isl_ast_expr *expr_id,
+__isl_take isl_ast_expr *expr_id,
 ivs_params &ip)
 {
   gcc_assert (isl_ast_expr_get_type (expr_id) == isl_ast_expr_id);
@@ -550,8 +550,13 @@ binary_op_to_tree (tree type, __isl_take isl_ast_expr 
*expr, ivs_params &ip)
   tree tree_lhs_expr = gcc_expression_from_isl_expression (type, arg_expr, ip);
   arg_expr = isl_ast_expr_get_op_arg (expr, 1);
   tree tree_rhs_expr = gcc_expression_from_isl_expression (type, arg_expr, ip);
+
   enum isl_ast_op_type expr_type = isl_ast_expr_get_op_type (expr);
   isl_ast_expr_free (expr);
+
+  if (codegen_error)
+return NULL_TREE;
+
   switch (expr_type)
 {
 case isl_ast_op_add:
@@ -564,15 +569,43 @@ binary_op_to_tree (tree type, __isl_take isl_ast_expr 
*expr, ivs_params &ip)
   return fold_build2 (MULT_EXPR, type, tree_lhs_expr, tree_rhs_expr);
 
 case isl_ast_op_div:
+  /* As ISL operates on arbitrary precision numbers, we may end up with
+division by 2^64 that is folded to 0.  */
+  if (integer_zerop (tree_rhs_expr))
+   {
+ codegen_error = true;
+ return NULL_TREE;
+   }
   return fold_build2 (EXACT_DIV_EXPR, type, tree_lhs_expr, tree_rhs_expr);
 
 case isl_ast_op_pdiv_q:
+  /* As ISL operates on arbitrary precision numbers, we may end up with
+division by 2^64 that is folded to 0.  */
+  if (integer_zerop (tree_rhs_expr))
+   {
+ codegen_error = true;
+ return NULL_TREE;
+   }
   return fold_build2 (TRUNC_DIV_EXPR, type, tree_lhs_expr, tree_rhs_expr);
 
 case isl_ast_op_pdiv_r:
+  /* As ISL operates on arbitrary precision numbers, we may end up with
+division by 2^64 that is folded to 0.  */
+  if (integer_zerop (tree_rhs_expr))
+   {
+ codegen_error = true;
+ return NULL_TREE;
+   }
   return fold_build2 (TRUNC_MOD_EXPR, type, tree_lhs_expr, tree_rhs_expr);
 
 case isl_ast_op_fdiv_q:
+  /* As ISL operates on arbitrary precision numbers, we may end up with
+division by 2^64 that is folded to 0.  */
+  if (integer_zerop (tree_rhs_expr))
+   {
+ codegen_error = true;
+ return NULL_TREE;
+   }
   return fold_build2 (FLOOR_DIV_EXPR, type, tree_lhs_expr, tree_rhs_expr);
 
 case isl_ast_op_and:
@@ -620,6 +653,9 @@ ternary_op_to_tree (tree type, __isl_take isl_ast_expr *expr, ivs_params )
   tree tree_third_expr
 = gcc_expression_from_isl_expression (type, arg_expr, ip);
   isl_ast_expr_free (expr);
+
+  if (codegen_error)
+return NULL_TREE;
   return fold_build3 (COND_EXPR, type, tree_first_expr,
  tree_second_expr, tree_third_expr);
 }
@@ -635,7 +671,7 @@ unary_op_to_tree (tree type, __isl_take isl_ast_expr *expr, ivs_params )
   isl_ast_expr *arg_expr = isl_ast_expr_get_op_arg (expr, 0);
   tree tree_expr = gcc_expression_from_isl_expression (type, arg_expr, ip);
   isl_ast_expr_free (expr);
-  return fold_build1 (NEGATE_EXPR, type, tree_expr);
+  return codegen_error ? NULL_TREE : fold_build1 (NEGATE_EXPR, type, tree_expr);
 }
 
 /* Converts an isl_ast_expr_op expression E with unknown number of arguments
@@ -661,11 +697,25 @@ nary_op_to_tree (tree type, __isl_take isl_ast_expr *expr, ivs_params )
 }
   isl_ast_expr *arg_expr = isl_ast_expr_get_op_arg (expr, 0);
   tree res = gcc_expression_from_isl_expression (type, arg_expr, ip);
+
+  if (codegen_error)
+{
+  isl_ast_expr_free (expr);
+  return NULL_TREE;
+}
+
   int i;
   for (i = 1; i < isl_ast_expr_get_op_n_arg (expr); i++)
 {
   arg_expr = isl_ast_expr_get_op_arg (expr, i);
   tree t = gcc_expression_from_isl_expression (type, arg_expr, ip);
+
+  if (codegen_error)
+   {
+ isl_ast_expr_free (expr);
+ return NULL_TREE;
+   }
+
   res = fold_build2 (op_code, type, res, t);
 }
   isl_ast_expr_free (expr);
@@ -680,6 +730,12 @@ translate_isl_ast_to_gimple::
 gcc_expression_from_isl_expr_op (tree type, __isl_take isl_ast_expr *expr,
 ivs_params )
 {
+  if (codegen_error)
+{
+  isl_ast_expr_free (expr);
+  return NULL_TREE;
+}
+
   gcc_assert (isl_ast_expr_get_type (expr) == isl_ast_expr_op);
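
The division guards above can be illustrated outside of GCC.  What follows is only a minimal sketch, assuming a 128-bit constant stored as two 64-bit halves as a stand-in for ISL's arbitrary precision; struct wide_const, fold_to_u64 and checked_div are hypothetical names, and only codegen_error is taken from the patch.  It shows how a divisor of 2^64 folds to 0 in a 64-bit type and how a sticky error flag avoids building the division.

```c
#include <stdbool.h>
#include <stdint.h>

/* Sticky error flag, playing the role of codegen_error in the patch.  */
static bool codegen_error = false;

/* Hypothetical 128-bit constant, stored as two 64-bit halves.  */
struct wide_const
{
  uint64_t hi;
  uint64_t lo;
};

/* Fold the constant into a 64-bit type by dropping the high half.  */
static uint64_t
fold_to_u64 (struct wide_const c)
{
  return c.lo;
}

/* Return LHS divided by the folded divisor.  If the divisor folds to
   zero, set codegen_error and return 0 instead of dividing.  */
static uint64_t
checked_div (uint64_t lhs, struct wide_const divisor)
{
  uint64_t rhs = fold_to_u64 (divisor);
  if (rhs == 0)
    {
      codegen_error = true;
      return 0;
    }
  return lhs / rhs;
}
```

Here a divisor of { 1, 0 } stands for 2^64: its low half is zero, so checked_div reports the error rather than dividing by zero.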
 

Re: [gomp4.5] Handle #pragma omp declare target link

2015-11-30 Thread Ilya Verbin
On Mon, Nov 30, 2015 at 13:04:59 +0100, Jakub Jelinek wrote:
> On Fri, Nov 27, 2015 at 07:50:09PM +0300, Ilya Verbin wrote:
> > + /* Most significant bit of the size marks such vars.  */
> > + unsigned HOST_WIDE_INT isize = tree_to_uhwi (size);
> > + isize |= 1ULL << (int_size_in_bytes (const_ptr_type_node) * 8 - 1);
> 
> That supposedly should be BITS_PER_UNIT instead of 8.

Fixed.

> > diff --git a/gcc/varpool.c b/gcc/varpool.c
> > index 36f19a6..cbd1e05 100644
> > --- a/gcc/varpool.c
> > +++ b/gcc/varpool.c
> > @@ -561,17 +561,21 @@ varpool_node::assemble_decl (void)
> >   are not real variables, but just info for debugging and codegen.
> >   Unfortunately at the moment emutls is not updating varpool correctly
> >   after turning real vars into value_expr vars.  */
> > +#ifndef ACCEL_COMPILER
> >if (DECL_HAS_VALUE_EXPR_P (decl)
> >&& !targetm.have_tls)
> >  return false;
> > +#endif
> >  
> >/* Hard register vars do not need to be output.  */
> >if (DECL_HARD_REGISTER (decl))
> >  return false;
> >  
> > +#ifndef ACCEL_COMPILER
> >gcc_checking_assert (!TREE_ASM_WRITTEN (decl)
> >&& TREE_CODE (decl) == VAR_DECL
> >&& !DECL_HAS_VALUE_EXPR_P (decl));
> > +#endif
> 
> This looks wrong, both of these clearly could affect anything with
> DECL_HAS_VALUE_EXPR_P, not just the link vars.
> So, if you need to handle the "omp declare target link" vars specially,
> you should only handle those specially and nothing else.  And please try to
> explain why.

Actually these ifndefs are not needed, because assemble_decl never will be
called by accel compiler for original link vars.  I've added a check into
output_in_order, but missed a second place where assemble_decl is called -
symbol_table::output_variables.  So, fixed now.

> > @@ -1005,13 +1026,18 @@ gomp_load_image_to_device (struct gomp_device_descr 
> > *devicep, unsigned version,
> >for (i = 0; i < num_vars; i++)
> >  {
> >struct addr_pair *target_var = _table[num_funcs + i];
> > -  if (target_var->end - target_var->start
> > - != (uintptr_t) host_var_table[i * 2 + 1])
> > +  uintptr_t target_size = target_var->end - target_var->start;
> > +
> > +  /* Most significant bit of the size marks "omp declare target link"
> > +variables.  */
> > +  bool is_link = target_size & (1ULL << (sizeof (uintptr_t) * 8 - 1));
> 
> __CHAR_BIT__ here instead of 8?

Fixed.
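
For reference, the marking scheme under discussion can be sketched in plain C.  This is only an illustration, not the libgomp code: LINK_BIT and the three helpers are hypothetical names, and the standard CHAR_BIT from <limits.h> stands in for __CHAR_BIT__.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Most significant bit of a uintptr_t, computed from CHAR_BIT instead
   of a hard-coded 8.  */
#define LINK_BIT ((uintptr_t) 1 << (sizeof (uintptr_t) * CHAR_BIT - 1))

/* Mark SIZE as belonging to an "omp declare target link" variable.  */
static uintptr_t
mark_link_size (uintptr_t size)
{
  /* A real object size is assumed never to use the top bit.  */
  assert ((size & LINK_BIT) == 0);
  return size | LINK_BIT;
}

/* Return true if SIZE carries the link marker.  */
static bool
size_is_link (uintptr_t size)
{
  return (size & LINK_BIT) != 0;
}

/* Return SIZE with the link marker stripped.  */
static uintptr_t
size_without_mark (uintptr_t size)
{
  return size & ~LINK_BIT;
}
```

Any comparison of sizes between the host and target tables would have to strip the marker first, which is what size_without_mark is for.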

> > @@ -1019,7 +1045,7 @@ gomp_load_image_to_device (struct gomp_device_descr 
> > *devicep, unsigned version,
> >k->host_end = k->host_start + (uintptr_t) host_var_table[i * 2 + 1];
> >k->tgt = tgt;
> >k->tgt_offset = target_var->start;
> > -  k->refcount = REFCOUNT_INFINITY;
> > +  k->refcount = is_link ? REFCOUNT_LINK : REFCOUNT_INFINITY;
> >k->async_refcount = 0;
> >array->left = NULL;
> >array->right = NULL;
> 
> Do we need to do anything in gomp_unload_image_from_device ?
> I mean at least in questionable programs that for link vars don't decrement
> the refcount of the var that replaced the link var to 0 first before
> dlclosing the library.
> At least host_var_table[j * 2 + 1] will have the MSB set, so we need to
> handle it differently.  Perhaps for that case perform a lookup, and if we
> get something which has link_map non-NULL, first perform as if there is
> target exit data delete (var) on it first?

You're right, it doesn't deallocate memory on the device if a DSO leaves a
nonzero refcount.  And currently the host compiler doesn't set the MSB in
host_var_table; it's set only by the accel compiler.  But it's possible to do
splay_tree_lookup for each var to determine whether it is linked or not, like
in the patch below.
Or do you prefer to set the bit in the host compiler too?  It requires
lookup_attribute ("omp declare target link") for all vars in the table during
compilation, but allows doing splay_tree_lookup at run time only for vars with
the MSB set in host_var_table.
Unfortunately, calling gomp_exit_data from gomp_unload_image_from_device works
only for a DSO; it crashes when an executable leaves a nonzero refcount,
because the target device may already be uninitialized from the plugin's
__run_exit_handlers (and it is in the case of intelmic), so gomp_exit_data
cannot run free_func.
Is it possible to add some atexit (...) to libgomp, which will set a
shutting_down flag, and just do nothing in gomp_unload_image_from_device if it
is set?
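
The atexit idea could look roughly like the sketch below.  It is a toy model under stated assumptions, not libgomp code: runtime_init, unload_image and device_frees are hypothetical names, and the counter stands in for calling the device's free_func.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Set once process shutdown has begun.  */
static bool shutting_down = false;

/* Counts simulated device deallocations (stand-in for free_func).  */
static int device_frees = 0;

/* Exit handler: from now on the device must not be touched.  */
static void
mark_shutting_down (void)
{
  shutting_down = true;
}

/* Hypothetical runtime init hook.  Returns 0 on success, like atexit.  */
static int
runtime_init (void)
{
  return atexit (mark_shutting_down);
}

/* Hypothetical unload routine: do nothing once shutdown has begun.  */
static void
unload_image (void)
{
  if (shutting_down)
    return;
  device_frees++;
}
```

One caveat: atexit handlers run in reverse order of registration, so this only helps if the flag is set before the plugin's own exit handlers tear the device down.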


diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 369574f..b73caa1 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -822,6 +822,8 @@ const struct attribute_spec c_common_attribute_table[] =
  handle_simd_attribute, false },
   { "omp declare target", 0, 0, true, false, false,
  handle_omp_declare_target_attribute, false },
+  { "omp declare target link", 0, 0, true, false, false,
+ 

Re: [Patch,SLP]: Correction in the comment for SLP vectorization profitable case.

2015-11-30 Thread Jeff Law

On 11/30/2015 02:00 AM, Ajit Kumar Agarwal wrote:

This patch corrects the comment for the SLP vectorization profitability case.

The current comment contradicts the condition
vec_outside_cost + vec_inside_cost > scalar_cost.

ChangeLog:
2015-11-30  Ajit Agarwal  

 * tree-vect-slp.c
 (vect_bb_vectorization_profitable_p): Correction in the comment.

OK.  Please install.

Thanks,
Jeff



RE: [PR68001, CilkPlus] Fix for PR68001

2015-11-30 Thread Zamyatin, Igor
> 
> FAIL: obj-c++.dg/property/dotsyntax-11.mm -fgnu-runtime  (test for errors,
> line 51)
> FAIL: obj-c++.dg/property/dotsyntax-11.mm -fgnu-runtime  (test for errors,
> line 56)
> FAIL: obj-c++.dg/property/dotsyntax-11.mm -fgnu-runtime  (test for errors,
> line 59)
> 
> Andreas.

Here is the patch that properly limits the GS_ERROR exit to the case of an
error in cilk spawn detection.

Bootstrapped and regtested on x86_64, ok for trunk?

Thanks,
Igor

cp/Changelog

2015-11-27  Igor Zamyatin  

PR c++/68001
* cp-gimplify.c (cp_gimplify_expr): Limit GS_ERROR only in case of
error in cilk spawn detection.



diff --git a/gcc/cp/cp-gimplify.c b/gcc/cp/cp-gimplify.c
index 09ee5ff..3dbbd7f 100644
--- a/gcc/cp/cp-gimplify.c
+++ b/gcc/cp/cp-gimplify.c
@@ -559,6 +559,7 @@ int
 cp_gimplify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p)
 {
   int saved_stmts_are_full_exprs_p = 0;
+  bool is_spawn_detected = true;
   enum tree_code code = TREE_CODE (*expr_p);
   enum gimplify_status ret;
 
@@ -614,12 +615,12 @@ cp_gimplify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p)
 25979.  */
 case INIT_EXPR:
   if (fn_contains_cilk_spawn_p (cfun)
- && cilk_detect_spawn_and_unwrap (expr_p))
+ && (is_spawn_detected = cilk_detect_spawn_and_unwrap (expr_p)))
{
  cilk_cp_gimplify_call_params_in_spawned_fn (expr_p, pre_p, post_p);
  return (enum gimplify_status) gimplify_cilk_spawn (expr_p);
}
-  if (seen_error ())
+  if (!is_spawn_detected && seen_error ())
return GS_ERROR;
 
   cp_gimplify_init_expr (expr_p);
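
The control flow of the INIT_EXPR change can be modelled in isolation.  This is only a sketch with hypothetical names: the three booleans stand for the results of fn_contains_cilk_spawn_p, cilk_detect_spawn_and_unwrap and seen_error, and the enum replaces the real gimplify_status values.

```c
#include <stdbool.h>

enum status { ST_SPAWN, ST_ERROR, ST_CONTINUE };

/* Model of the patched INIT_EXPR case.  DETECT_RESULT is consulted only
   when FN_HAS_SPAWN is true, because && short-circuits.  */
static enum status
gimplify_init (bool fn_has_spawn, bool detect_result, bool seen_error)
{
  bool is_spawn_detected = true;

  if (fn_has_spawn && (is_spawn_detected = detect_result))
    return ST_SPAWN;

  /* Bail out only if spawn detection actually ran and failed.  */
  if (!is_spawn_detected && seen_error)
    return ST_ERROR;

  return ST_CONTINUE;
}
```

Because is_spawn_detected starts out true, a function without any cilk spawn falls through to the normal INIT_EXPR handling even when an unrelated error was reported earlier.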





Re: [gomp4.5] Handle #pragma omp declare target link

2015-11-30 Thread Jakub Jelinek
On Mon, Nov 30, 2015 at 11:29:34PM +0300, Ilya Verbin wrote:
> > This looks wrong, both of these clearly could affect anything with
> > DECL_HAS_VALUE_EXPR_P, not just the link vars.
> > So, if you need to handle the "omp declare target link" vars specially,
> > you should only handle those specially and nothing else.  And please try to
> > explain why.
> 
> Actually these ifndefs are not needed, because assemble_decl never will be
> called by accel compiler for original link vars.  I've added a check into
> output_in_order, but missed a second place where assemble_decl is called -
> symbol_table::output_variables.  So, fixed now.

Great.

> > Do we need to do anything in gomp_unload_image_from_device ?
> > I mean at least in questionable programs that for link vars don't decrement
> > the refcount of the var that replaced the link var to 0 first before
> > dlclosing the library.
> > At least host_var_table[j * 2 + 1] will have the MSB set, so we need to
> > handle it differently.  Perhaps for that case perform a lookup, and if we
> > get something which has link_map non-NULL, first perform as if there is
> > target exit data delete (var) on it first?
> 
> You're right, it doesn't deallocate memory on the device if a DSO leaves a
> nonzero refcount.  And currently the host compiler doesn't set the MSB in
> host_var_table; it's set only by the accel compiler.  But it's possible to do
> splay_tree_lookup for each var to determine whether it is linked or not, like
> in the patch below.
> Or do you prefer to set the bit in the host compiler too?  It requires
> lookup_attribute ("omp declare target link") for all vars in the table during
> compilation, but allows doing splay_tree_lookup at run time only for vars with
> the MSB set in host_var_table.
> Unfortunately, calling gomp_exit_data from gomp_unload_image_from_device works
> only for a DSO; it crashes when an executable leaves a nonzero refcount,
> because the target device may already be uninitialized from the plugin's
> __run_exit_handlers (and it is in the case of intelmic), so gomp_exit_data
> cannot run free_func.
> Is it possible to add some atexit (...) to libgomp, which will set a
> shutting_down flag, and just do nothing in gomp_unload_image_from_device if
> it is set?

Sorry, I didn't mean you should call gomp_exit_data; what I meant was that
you perform the same action as delete(var) would do in that case.
Calling gomp_exit_data would, e.g., look it up again etc.
Having the MSB in the host table too is presumably useful, so if you could
handle that, it would be nice.  And do the splay_tree_lookup only if the MSB
is set.
So,
if (!host_data_has_msb_set)
  splay_tree_remove (>mem_map, );
else
  {
splay_tree_key n = splay_tree_lookup (>mem_map, );
if (n->link_key)
  {
n->refcount = 0;
n->link_key = NULL;
splay_tree_remove (>mem_map, n);
if (n->tgt->refcount > 1)
  n->tgt->refcount--;
else
  gomp_unmap_tgt (n->tgt);
  }
else
  splay_tree_remove (>mem_map, n);
  }
or so.

Jakub


Re: [gomp4.5] Handle #pragma omp declare target link

2015-11-30 Thread Ilya Verbin
On Mon, Nov 30, 2015 at 21:49:02 +0100, Jakub Jelinek wrote:
> On Mon, Nov 30, 2015 at 11:29:34PM +0300, Ilya Verbin wrote:
> > You're right, it doesn't deallocate memory on the device if a DSO leaves a
> > nonzero refcount.  And currently the host compiler doesn't set the MSB in
> > host_var_table; it's set only by the accel compiler.  But it's possible to
> > do splay_tree_lookup for each var to determine whether it is linked or not,
> > like in the patch below.
> > Or do you prefer to set the bit in the host compiler too?  It requires
> > lookup_attribute ("omp declare target link") for all vars in the table
> > during compilation, but allows doing splay_tree_lookup at run time only for
> > vars with the MSB set in host_var_table.
> > Unfortunately, calling gomp_exit_data from gomp_unload_image_from_device
> > works only for a DSO; it crashes when an executable leaves a nonzero
> > refcount, because the target device may already be uninitialized from the
> > plugin's __run_exit_handlers (and it is in the case of intelmic), so
> > gomp_exit_data cannot run free_func.
> > Is it possible to add some atexit (...) to libgomp, which will set a
> > shutting_down flag, and just do nothing in gomp_unload_image_from_device if
> > it is set?
> 
> Sorry, I didn't mean you should call gomp_exit_data, what I meant was that
> you perform the same action as would delete(var) do in that case.
> Calling gomp_exit_data e.g. looks it up again etc.
> Supposedly having the MSB in host table too is useful, so if you could
> handle that, it would be nice.  And splay_tree_lookup only if the MSB is
> set.
> So,
> if (!host_data_has_msb_set)
>   splay_tree_remove (>mem_map, );
> else
>   {
> splay_tree_key n = splay_tree_lookup (>mem_map, );
> if (n->link_key)
> {
>   n->refcount = 0;
>   n->link_key = NULL;
>   splay_tree_remove (>mem_map, n);
>   if (n->tgt->refcount > 1)
> n->tgt->refcount--;
>   else
> gomp_unmap_tgt (n->tgt);
> }
>   else
> splay_tree_remove (>mem_map, n);
>   }
> or so.

Ok, but it doesn't solve the issue with doing it for the executable, because
gomp_unmap_tgt (n->tgt) will want to run free_func on uninitialized device.

  -- Ilya


Re: [patch] RFC asan support for i?86/x86_64-*freebsd*

2015-11-30 Thread Jeff Law

On 11/29/2015 03:10 PM, Andreas Tobler wrote:

All,

this patch adds support for asan for i?86/x86_64-*freebsd*.

Test results can be found on the list.

These modifications belong only to gcc. There is one modification to
asan/asan_linux.cc, this one is sent upstream. Until this one is in, my
patch is on hold.

One thing to note, FreeBSD does not need to link against -ldl. That is
why I added an extra config check.

But nevertheless I'd like to get some comments on the patch.

Thanks to Jakub and Dan McGregor.

Thanks,
Andreas


2015-11-29  Andreas Tobler  

 * config/i386/i386.h: Define two new macros:
 SUBTARGET_SHADOW_OFFSET_64 and SUBTARGET_SHADOW_OFFSET_32.
 * config/i386/i386.c (ix86_asan_shadow_offset): Use these macros.
 * config/i386/darwin.h: Override the SUBTARGET_SHADOW_OFFSET_64
 macro.
 * config/i386/freebsd.h: Override the SUBTARGET_SHADOW_OFFSET_64
 and the SUBTARGET_SHADOW_OFFSET_32 macro.
 * config/freebsd.h (LIBASAN_EARLY_SPEC): Define.
 (LIBTSAN_EARLY_SPEC): Likewise.
 (LIBLSAN_EARLY_SPEC): Likewise.

2015-11-29  Andreas Tobler  

 * configure.ac: Replace the hard-coded -ldl requirement for
 link_sanitizer_common with a configure time check for -ldl.
 * configure: Regenerate.
 * configure.tgt: Add x86_64- and i?86-*-freebsd* targets.

The configury bits are fine.  Uros would own the review of the x86-specific
changes.


jeff



Re: [PATCH] fix PR65726

2015-11-30 Thread Jeff Law

On 11/26/2015 11:49 AM, Andreas Tobler wrote:

Hi all,

the attached patch fixes the build issue from this ticket if bootstrap
is disabled.

Tested on x86_64-*-linux* and on x86_64-*-freebsd* with gcc and clang.

Ok for trunk?

And 5.3?

Thanks,
Andreas

2015-11-26  Andreas Tobler  

 PR libffi/65726
 * Makefile.def (lang_env_dependencies): Make libffi depend
 on cxx.
 * Makefile.in: Regenerate.


OK.
jeff


[hsa] Use proper accesses to gimple_omp_for

2015-11-30 Thread Martin Jambor
Hi,

when looking at the attempt_target_gridification function I realized I
forgot to replace some of the early code with proper gimple
statement access function calls.  This patch addresses that.
Committed to the branch.

Thanks,

Martin


2015-11-30  Martin Jambor  

* omp-low.c (attempt_target_gridification): Use proper access into
iter array of the inner loop.
---
 gcc/omp-low.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 5933c60..bdf6539 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -17484,21 +17484,21 @@ attempt_target_gridification (gomp_target *target, gimple_stmt_iterator *gsi,
   size_t collapse = gimple_omp_for_collapse (inner_loop);
   for (size_t i = 0; i < collapse; i++)
 {
-  gimple_omp_for_iter iter = inner_loop->iter[i];
-  walk_tree (, remap_prebody_decls, , NULL);
-  walk_tree (, remap_prebody_decls, , NULL);
-
-  tree itype, type = TREE_TYPE (iter.index);
+  tree itype, type = TREE_TYPE (gimple_omp_for_index (inner_loop, i));
   if (POINTER_TYPE_P (type))
itype = signed_type_for (type);
   else
itype = type;
 
-  enum tree_code cond_code = iter.cond;
-  tree n1 = iter.initial;
-  tree n2 = iter.final;
+  enum tree_code cond_code = gimple_omp_for_cond (inner_loop, i);
+  tree n1 = unshare_expr (gimple_omp_for_initial (inner_loop, i));
+  walk_tree (, remap_prebody_decls, , NULL);
+  tree n2 = unshare_expr (gimple_omp_for_final (inner_loop, i));
+  walk_tree (, remap_prebody_decls, , NULL);
   adjust_for_condition (loc, _code, );
-  tree step = get_omp_for_step_from_incr (loc, iter.incr);
+  tree step;
+  step = get_omp_for_step_from_incr (loc,
+gimple_omp_for_incr (inner_loop, i));
   n1 = force_gimple_operand_gsi (gsi, fold_convert (type, n1), true,
 NULL_TREE, true, GSI_SAME_STMT);
   n2 = force_gimple_operand_gsi (gsi, fold_convert (itype, n2), true,
-- 
2.6.0



Re: [PATCH] Fix declaration of pthread-structs in s-osinte-rtems.ads (ada/68169)

2015-11-30 Thread Jeff Law

On 11/30/2015 03:06 PM, Jan Sommer wrote:

Could someone with write access please commit the patch?
The paperwork with the FSF has gone through. If something else is missing, 
please tell me.
I won't be available next week.

I'm not sure what you built your patches against, but I can't apply them
to the trunk.  Can you resend the patch as a diff against the trunk?


Often I can fix things by hand, but this is Ada and I'd be much more 
likely to botch something.



jeff




Re: [RFA] Implement incremental IL linking

2015-11-30 Thread Jan Hubicka
> Hi,
> this is a polished version of the patch to implement IL-level incremental
> linking.
> -flinker-output is now documented and can be specified to the GCC driver.
> In this case the plugin gets the option -linker-output-known and it stops
> attempting to detect it from info passed down by the linker. I also added
> doc for the flag to invoke.texi.
> 
> Modulo the testsuite compensation, the rest of the patch is basically
> unchanged since the earlier version: lto-wrapper looks for the linker-output
> flag and switches to non-WPA mode (because we do not want to execute ltrans
> compilations) and the lto frontends configure the compiler to output IL and
> possibly a flat lto binary to the object file.
> 
> Bootstrapped/regtested x86_64-linux, OK?
> Bootstrapped/regtested x86_64-linux, OK?
Hmm, and now for the fun part.  I just noticed that the patch works well with
both GNU LD and Gold from my system installation, which is

GNU gold (GNU Binutils 2.24.51.20140405) 1.11

while newer version:

GNU gold (GNU Binutils 2.25.51.20150520) 1.11

fails with:

/tmp/ccPIuUSA.lto.o: plugin needed to handle lto object

in the final stage of incremental linking.  This seems like a binutils bug: the
message should be output only if there are LTO objects not claimed by the linker
before invoking the plugin. There is no need to error out when the plugin itself
produces IL for incremental linking.

I will check if a newer version fixes it and file a PR.  I suppose I can
whitelist ld versions in the plugin and enable -flinker-output=rel only on
binutils versions where this works. There is LDPT_GOLD_VERSION, which tells me
the info.  I will update the patch accordingly and check what version range
refuses to finish the link.

Honza

> 
> Honza
> 
>   * lto-plugin.c: Document options; add -linker-output-known;
>   determine when to use rel and when nolto-rel output.
> 
>   * lto-wrapper.c (run_gcc): Look for -flinker-output=rel also in the
>   list of options passed from the driver.
>   * passes.c (ipa_write_summaries): Only modify statements if body
>   is in memory.
>   * cgraphunit.c (ipa_passes): Also produce intermediate code when
>   incrementally linking.
>   (ipa_passes): Likewise.
>   * lto-cgraph.c (lto_output_node): When incrementally linking do not
>   pass down resolution info.
>   * common.opt (flag_incremental_link): Update info.
>   * gcc.c (plugin specs): Turn flinker-output=* to
>   -plugin-opt=-linker-output-known
>   * toplev.c (compile_file): Also cut compilation when doing incremental
>   link.
>   * flag-types.h (enum lto_partition_model): Add
>   LTO_LINKER_OUTPUT_NOLTOREL.
>   (invoke.texi): Add -flinker-output docs.
> 
>   * lang.opt (lto_linker_output): Add nolto-rel.
>   * lto-lang.c (lto_post_options): Handle LTO_LINKER_OUTPUT_REL
>   and LTO_LINKER_OUTPUT_NOLTOREL.
>   (lto_init): Generate lto when doing incremental link.
> 
>   * gcc.dg/lto/20081120-2_0.c: Add -flinker-output=nolto-rel
>   * gcc.dg/lto/20090126-1_0.c: Likewise.
>   * gcc.dg/lto/20091020-2_0.c: Likewise.
>   * gcc.dg/lto/20081204-2_0.c: Likewise.
>   * gcc.dg/lto/20091015-1_0.c: Likewise.
>   * gcc.dg/lto/20090126-2_0.c: Likewise.
>   * gcc.dg/lto/20090116_0.c: Likewise.
>   * gcc.dg/lto/20081224_0.c: Likewise.
>   * gcc.dg/lto/20091027-1_0.c: Likewise.
>   * gcc.dg/lto/20090219_0.c: Likewise.
>   * gcc.dg/lto/20081212-1_0.c: Likewise.
>   * gcc.dg/lto/20091013-1_0.c: Likewise.
>   * gcc.dg/lto/20081126_0.c: Likewise.
>   * gcc.dg/lto/20090206-1_0.c: Likewise.
>   * gcc.dg/lto/20091016-1_0.c: Likewise.
>   * gcc.dg/lto/20081120-1_0.c: Likewise.
>   * gcc.dg/lto/20091020-1_0.c: Likewise.
>   * gcc.dg/lto/20100426_0.c: Likewise.
>   * gcc.dg/lto/20081204-1_0.c: Likewise.
>   * gcc.dg/lto/20091014-1_0.c: Likewise.
>   * g++.dg/lto/20081109-1_0.C: Likewise.
>   * g++.dg/lto/20100724-1_0.C: Likewise.
>   * g++.dg/lto/20081204-1_0.C: Likewise.
>   * g++.dg/lto/pr45679-2_0.C: Likewise.
>   * g++.dg/lto/20110311-1_0.C: Likewise.
>   * g++.dg/lto/20090302_0.C: Likewise.
>   * g++.dg/lto/20081118_0.C: Likewise.
>   * g++.dg/lto/20091002-2_0.C: Likewise.
>   * g++.dg/lto/20081120-2_0.C: Likewise.
>   * g++.dg/lto/20081123_0.C: Likewise.
>   * g++.dg/lto/20090313_0.C: Likewise.
>   * g++.dg/lto/pr54625-1_0.c: Likewise.
>   * g++.dg/lto/pr48354-1_0.C: Likewise.
>   * g++.dg/lto/20081219_0.C: Likewise.
>   * g++.dg/lto/pr48042_0.C: Likewise.
>   * g++.dg/lto/20101015-2_0.C: Likewise.
>   * g++.dg/lto/pr45679-1_0.C: Likewise.
>   * g++.dg/lto/20091026-1_0.C: Likewise.
>   * g++.dg/lto/pr45621_0.C: Likewise.
>   * g++.dg/lto/20081119-1_0.C: Likewise.
>   * g++.dg/lto/20101010-4_0.C: Likewise.
>   * g++.dg/lto/20081120-1_0.C: Likewise.
>   * g++.dg/lto/20091002-1_0.C: Likewise.
>   * g++.dg/lto/20091002-3_0.C: Likewise.
>   * gfortran.dg/lto/20091016-1_0.f90: 

Re: [PATCH AArch64]Handle REG+REG+CONST and REG+NON_REG+CONST in legitimize address

2015-11-30 Thread Bin.Cheng
On Tue, Nov 24, 2015 at 6:18 PM, Richard Earnshaw
 wrote:
> On 24/11/15 09:56, Richard Earnshaw wrote:
>> On 24/11/15 02:51, Bin.Cheng wrote:
> The aarch64's problem is we don't define addptr3 pattern, and we don't
>>> have direct insn pattern describing the "x + y << z".  According to
>>> gcc internal:
>>>
>>> ‘addptrm3’
>>> Like addm3 but is guaranteed to only be used for address calculations.
>>> The expanded code is not allowed to clobber the condition code. It
>>> only needs to be defined if addm3 sets the condition code.
>
> addm3 on aarch64 does not set the condition codes, so by this rule we
> shouldn't need to define this pattern.
>>> Hi Richard,
>>> I think that rule has a prerequisite that backend needs to support
>>> register shifted addition in addm3 pattern.
>>
>> addm3 is a named pattern and its format is well defined.  It does not
>> take a shifted operand and never has.
>>
>>> Apparently for AArch64,
>>> addm3 only supports "reg+reg" or "reg+imm".  Also we don't really
>>> "does not set the condition codes" actually, because both
>>> "adds_shift_imm_*" and "adds_mul_imm_*" do set the condition flags.
>>
>> You appear to be confusing named patterns (used by expand) with
>> recognizers.  Anyway, we have
>>
>> (define_insn "*add__"
>>   [(set (match_operand:GPI 0 "register_operand" "=r")
>> (plus:GPI (ASHIFT:GPI (match_operand:GPI 1 "register_operand" "r")
>>   (match_operand:QI 2
>> "aarch64_shift_imm_" "n"))
>>   (match_operand:GPI 3 "register_operand" "r")))]
>>
>> Which is a non-flag setting add with shifted operand.
>>
>>> Either way I think it is another backend issue, so do you approve that
>>> I commit this patch now?
>>
>> Not yet.  I think there's something fundamental amiss here.
>>
>> BTW, it looks to me as though addptr3 should have exactly the same
>> operand rules as add3 (documentation reads "like add3"), so a
>> shifted operand shouldn't be supported there either.  If that isn't the
>> case then that should be clearly called out in the documentation.
>>
>> R.
>>
>
> PS.
>
> I presume you are aware of the canonicalization rules for add?  That is,
> for a shift-and-add operation, the shift operand must appear first.  Ie.
>
> (plus (shift (op, op)), op)
>
> not
>
> (plus (op, (shift (op, op))

Hi Richard,
Thanks for the comments.  I realized that the not-recognized insn
issue is because the original patch built non-canonical expressions.
When reloading the address expression, LRA generates a non-canonical
register-scaled insn, which can't be recognized by the aarch64 backend.
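
To make the ordering rule concrete, here is a toy model in C.  It is only a sketch and does not use GCC's rtx: struct expr, enum code and canonicalize_plus are hypothetical, and GCC's real canonicalization rules cover far more than this one case.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy expression codes, loosely mirroring REG, a shift and PLUS.  */
enum code { REG, SHIFT, PLUS };

/* Toy binary expression node; leaves keep both operands NULL.  */
struct expr
{
  enum code code;
  struct expr *op0;
  struct expr *op1;
};

/* If E is a PLUS whose second operand is a shift and whose first is
   not, swap the operands so the shift comes first.  Return true if
   the operands were swapped.  */
static bool
canonicalize_plus (struct expr *e)
{
  if (e->code != PLUS || e->op0 == NULL || e->op1 == NULL)
    return false;

  if (e->op1->code == SHIFT && e->op0->code != SHIFT)
    {
      struct expr *tmp = e->op0;
      e->op0 = e->op1;
      e->op1 = tmp;
      return true;
    }
  return false;
}
```

In this model, (plus (reg) (shift ...)) is rewritten to (plus (shift ...) (reg)), which is the operand order the recognizer quoted above expects.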

Here is the updated patch using the canonical form pattern; it passes
bootstrap and regression testing.  Well, the ivo failure still exists,
but it was analyzed in the original message.

Is this patch OK?

As for Jiong's concern about the additional extension instruction, I
think this only applies to atomic load/store instructions.  For
general loads and stores, AArch64 supports zext/sext in the register-scaling
addressing mode, so the additional instruction can be forward-propagated
into the memory reference.  The problem for atomic load/store is that AArch64
only supports the direct register addressing mode.  After LRA reloads the
address expression out of the memory reference, there is no combine/fwprop
optimizer to merge the instructions.  The underlying problem is that
atomic_store's predicate doesn't match its constraint.  The predicate used for
atomic_store is memory_operand, while all other atomic patterns
use aarch64_sync_memory_operand.  I think this might be a typo.  With
this changed, expand will no longer generate an addressing mode requiring
reload.  I will test another patch fixing this.

Thanks,
bin
>
> R.
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 3fe2f0f..5b3e3c4 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -4757,13 +4757,65 @@ aarch64_legitimize_address (rtx x, rtx /* orig_x  */, machine_mode mode)
  We try to pick as large a range for the offset as possible to
  maximize the chance of a CSE.  However, for aligned addresses
  we limit the range to 4k so that structures with different sized
- elements are likely to use the same base.  */
+ elements are likely to use the same base.  We need to be careful
+ not split CONST for some forms address expressions, otherwise it
+ will generate sub-optimal code.  */
 
   if (GET_CODE (x) == PLUS && CONST_INT_P (XEXP (x, 1)))
 {
   HOST_WIDE_INT offset = INTVAL (XEXP (x, 1));
   HOST_WIDE_INT base_offset;
 
+  if (GET_CODE (XEXP (x, 0)) == PLUS)
+   {
+ rtx op0 = XEXP (XEXP (x, 0), 0);
+ rtx op1 = XEXP (XEXP (x, 0), 1);
+
+ /* For addr expression in the form like "r1 + r2 + 0x3ffc".
+Since the offset is within range supported by addressing
+mode "reg+offset", we don't split the const and legalize
+it into below insn and 

Re: [RFC, Patch]: Optimized changes in the register used inside loop for LICM and IVOPTS.

2015-11-30 Thread Jeff Law

On 11/29/2015 09:24 AM, Ajit Kumar Agarwal wrote:


I agree with the above.  To add to that, we only need to calculate the set of
objects (SSA_NAMEs) that are live at the birth or the header of the loop.
We don't need to calculate liveness throughout the loop by considering the
live-in and live-out sets of all the basic blocks of the loop. This is because
the set of objects (SSA_NAMEs) that are live-in at the birth or header of the
loop will be live-in at every node in the loop.

If a variable v is live-in at the header of the loop, then v is live-in at
every node in the loop. To prove this, consider a loop L with header h such
that the variable v defined at d is live-in at h. Since v is live at h, d is
not part of L. This follows from the dominance property, i.e. h is strictly
dominated by d. Furthermore, there exists a path from h to a use of v which
does not go through d. For every node p of the loop, since the loop is a
strongly connected component of the CFG, there exists a path, consisting only
of nodes of L, from p to h. Concatenating those two paths proves that v is
live-in and live-out of p.
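
The claim can be checked on a small example with a standard backward liveness iteration.  This is only a sketch: the five-node CFG, the single variable v and all names in it are made up for illustration and have nothing to do with GCC's data structures.

```c
#include <stdbool.h>

#define N_NODES 5
#define MAX_SUCC 2

/* Toy CFG: 0 = d (defines v), 1 = loop header h, 2 = loop body,
   3 = latch (back edge to h), 4 = exit block (uses v).
   A successor of -1 means "no edge".  */
static const int succ[N_NODES][MAX_SUCC] = {
  { 1, -1 },
  { 2, 4 },
  { 3, -1 },
  { 1, -1 },
  { -1, -1 },
};

static const bool defs[N_NODES] = { true, false, false, false, false };
static const bool uses[N_NODES] = { false, false, false, false, true };

/* Compute live-in for v at every node by iterating
   live_in[n] = uses[n] || (live_out[n] && !defs[n])
   until nothing changes, where live_out[n] is the union of the
   live-in sets of n's successors.  */
static void
compute_live_in (bool live_in[N_NODES])
{
  for (int n = 0; n < N_NODES; n++)
    live_in[n] = false;

  bool changed = true;
  while (changed)
    {
      changed = false;
      for (int n = N_NODES - 1; n >= 0; n--)
        {
          bool live_out = false;
          for (int s = 0; s < MAX_SUCC; s++)
            if (succ[n][s] >= 0 && live_in[succ[n][s]])
              live_out = true;

          bool in = uses[n] || (live_out && !defs[n]);
          if (in != live_in[n])
            {
              live_in[n] = in;
              changed = true;
            }
        }
    }
}
```

In this CFG v is live-in at the header (node 1), and the iteration also marks it live-in at the body and the latch (nodes 2 and 3), even though neither of them mentions v.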

On top of the live-in set at the birth or header of the loop, as proven above,
we can calculate the live-out set of the exit block of the loop and the live-in
set at the destination of the exit edge. This accounts for liveness outside of
the loop.
The above two cases forms the basis of better estimator for register pressure 
as far as LICM is concerned.

If you agree with the above, I will implement add the above in the patch for 
register_used estimates for better estimate of register pressure for LICM.

Yes, I think we're in agreement.

jeff



[gomp4] fortran routine backports

2015-11-30 Thread Cesar Philippidis
This patch backports the recent fortran routine support changes I've
made in trunk to gomp-4_0-branch. Nothing changed in the fortran front
end, but I corrected a couple of problems with the way that gang, worker
and vector were handled in tree-nested.c. And there's a new test case to
exercise those changes.

This patch has been applied to gomp-4_0-branch.

Cesar
2015-11-30  Cesar Philippidis  

	gcc/
	* tree-nested.c (convert_nonlocal_omp_clauses): Handle optional
	arguments for OMP_CLAUSE_{GANG,WORKER,VECTOR}.
	(convert_local_omp_clauses): Likewise

	gcc/testsuite/
	* gfortran.dg/goacc/subroutines.f90: New test.

diff --git a/gcc/testsuite/gfortran.dg/goacc/subroutines.f90 b/gcc/testsuite/gfortran.dg/goacc/subroutines.f90
new file mode 100644
index 000..6cab798
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/subroutines.f90
@@ -0,0 +1,73 @@
+! Exercise how tree-nested.c handles gang, worker vector and seq.
+
+! { dg-do compile } 
+
+program main
+  integer, parameter :: N = 100
+  integer :: nonlocal_arg
+  integer :: nonlocal_a(N)
+  integer :: nonlocal_i
+  integer :: nonlocal_j
+  
+  nonlocal_a (:) = 5
+  nonlocal_arg = 5
+  
+  call local ()
+  call nonlocal ()
+
+contains
+
+  subroutine local ()
+integer :: local_i
+integer :: local_arg
+integer :: local_a(N)
+integer :: local_j
+
+local_a (:) = 5
+local_arg = 5
+
+!$acc kernels loop gang(num:local_arg) worker(local_arg) vector(local_arg)
+do local_i = 1, N
+   local_a(local_i) = 100
+   !$acc loop seq
+   do local_j = 1, N
+   enddo
+enddo
+!$acc end kernels loop
+
+!$acc kernels loop gang(static:local_arg) worker(local_arg) &
+!$acc vector(local_arg)
+do local_i = 1, N
+   local_a(local_i) = 100
+   !$acc loop seq
+   do local_j = 1, N
+   enddo
+enddo
+!$acc end kernels loop
+  end subroutine local
+
+  subroutine nonlocal ()
+nonlocal_a (:) = 5
+nonlocal_arg = 5
+  
+!$acc kernels loop gang(num:nonlocal_arg) worker(nonlocal_arg) &
+!$acc vector(nonlocal_arg)
+do nonlocal_i = 1, N
+   nonlocal_a(nonlocal_i) = 100
+   !$acc loop seq
+   do nonlocal_j = 1, N
+   enddo
+enddo
+!$acc end kernels loop
+
+!$acc kernels loop gang(static:nonlocal_arg) worker(nonlocal_arg) &
+!$acc vector(nonlocal_arg)
+do nonlocal_i = 1, N
+   nonlocal_a(nonlocal_i) = 100
+   !$acc loop seq
+   do nonlocal_j = 1, N
+   enddo
+enddo
+!$acc end kernels loop
+  end subroutine nonlocal
+end program main
diff --git a/gcc/tree-nested.c b/gcc/tree-nested.c
index e321072..1c9849b 100644
--- a/gcc/tree-nested.c
+++ b/gcc/tree-nested.c
@@ -1109,10 +1109,28 @@ convert_nonlocal_omp_clauses (tree *pclauses, struct walk_stmt_info *wi)
 	case OMP_CLAUSE_NUM_GANGS:
 	case OMP_CLAUSE_NUM_WORKERS:
 	case OMP_CLAUSE_VECTOR_LENGTH:
-	  wi->val_only = true;
-	  wi->is_lhs = false;
-	  convert_nonlocal_reference_op (&OMP_CLAUSE_OPERAND (clause, 0),
-	 &dummy, wi);
+	case OMP_CLAUSE_GANG:
+	case OMP_CLAUSE_WORKER:
+	case OMP_CLAUSE_VECTOR:
+	  /* Several OpenACC clauses have optional arguments.  Check if they
+	 are present.  */
+	  if (OMP_CLAUSE_OPERAND (clause, 0))
+	{
+	  wi->val_only = true;
+	  wi->is_lhs = false;
+	  convert_nonlocal_reference_op (&OMP_CLAUSE_OPERAND (clause, 0),
+	 &dummy, wi);
+	}
+
+	  /* The gang clause accepts two arguments.  */
+	  if (OMP_CLAUSE_CODE (clause) == OMP_CLAUSE_GANG
+	  && OMP_CLAUSE_GANG_STATIC_EXPR (clause))
+	{
+		wi->val_only = true;
+		wi->is_lhs = false;
+		convert_nonlocal_reference_op
+		  (&OMP_CLAUSE_GANG_STATIC_EXPR (clause), &dummy, wi);
+	}
 	  break;
 
 	case OMP_CLAUSE_DIST_SCHEDULE:
@@ -1176,9 +1194,6 @@ convert_nonlocal_omp_clauses (tree *pclauses, struct walk_stmt_info *wi)
 	case OMP_CLAUSE_THREADS:
 	case OMP_CLAUSE_SIMD:
 	case OMP_CLAUSE_DEFAULTMAP:
-	case OMP_CLAUSE_GANG:
-	case OMP_CLAUSE_WORKER:
-	case OMP_CLAUSE_VECTOR:
 	case OMP_CLAUSE_SEQ:
 	  break;
 
@@ -1768,10 +1783,28 @@ convert_local_omp_clauses (tree *pclauses, struct walk_stmt_info *wi)
 	case OMP_CLAUSE_NUM_GANGS:
 	case OMP_CLAUSE_NUM_WORKERS:
 	case OMP_CLAUSE_VECTOR_LENGTH:
-	  wi->val_only = true;
-	  wi->is_lhs = false;
-	  convert_local_reference_op (&OMP_CLAUSE_OPERAND (clause, 0), &dummy,
-  wi);
+	case OMP_CLAUSE_GANG:
+	case OMP_CLAUSE_WORKER:
+	case OMP_CLAUSE_VECTOR:
+	  /* Several OpenACC clauses have optional arguments.  Check if they
+	 are present.  */
+	  if (OMP_CLAUSE_OPERAND (clause, 0))
+	{
+	  wi->val_only = true;
+	  wi->is_lhs = false;
+	  convert_local_reference_op (&OMP_CLAUSE_OPERAND (clause, 0),
+	  &dummy, wi);
+	}
+
+	  /* The gang clause accepts two arguments.  */
+	  if (OMP_CLAUSE_CODE (clause) == OMP_CLAUSE_GANG
+	  && OMP_CLAUSE_GANG_STATIC_EXPR (clause))
+	{
+		wi->val_only = true;
+		wi->is_lhs = false;
+		convert_nonlocal_reference_op
+		  (_CLAUSE_GANG_STATIC_EXPR (clause), , 

-fstrict-aliasing fixes 1/5: propagate -fno-strict-aliasing in the inliner

2015-11-30 Thread Jan Hubicka
Hi,
this is the first patch in the broken-up series.  It adds logic to
ipa-inline-transform that drops -fstrict-aliasing from the caller when a
-fno-strict-aliasing function is inlined into it.  For now I do this
unconditionally, until we find a way to make early optimizations safe WRT
this transform.
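
As a rough illustration of the rule described above, here is a minimal
sketch in plain C, outside of GCC.  The names struct fn_opts and
merge_strict_aliasing are hypothetical; GCC itself keeps the per-function
options in DECL_FUNCTION_SPECIFIC_OPTIMIZATION, which this model does not
attempt to reproduce.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-function optimization options.  */
struct fn_opts
{
  bool strict_aliasing;
};

/* Model of inlining CALLEE into CALLER.  If the callee was compiled
   without strict aliasing and the caller with it, the caller loses the
   flag, because the inlined body may contain type punning.
   Returns true when the flag was dropped.  */
bool
merge_strict_aliasing (struct fn_opts *caller, const struct fn_opts *callee)
{
  assert (caller && callee);
  if (!callee->strict_aliasing && caller->strict_aliasing)
    {
      caller->strict_aliasing = false;
      return true;
    }
  return false;
}
```

The merge only ever weakens the caller: inlining a -fstrict-aliasing
callee into a -fno-strict-aliasing caller leaves the caller unchanged.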

The testcase triggers with GCC 5.0/4.9 too; older compilers pass if
-fstrict-aliasing is used at link time and fail otherwise.

Bootstrapped/regtested x86_64-linux, will commit it after re-testing on
Firefox.

Honza

* ipa-inline-transform.c (inline_call): Drop -fstrict-aliasing when
inlining -fno-strict-aliasing into -fstrict-aliasing body.
* gcc.dg/lto/alias-1_0.c: New testcase.
* gcc.dg/lto/alias-1_1.c: New testcase.
Index: ipa-inline-transform.c
===
--- ipa-inline-transform.c  (revision 231081)
+++ ipa-inline-transform.c  (working copy)
@@ -322,6 +322,21 @@ inline_call (struct cgraph_edge *e, bool
   if (DECL_FUNCTION_PERSONALITY (callee->decl))
 DECL_FUNCTION_PERSONALITY (to->decl)
   = DECL_FUNCTION_PERSONALITY (callee->decl);
+  if (!opt_for_fn (callee->decl, flag_strict_aliasing)
+  && opt_for_fn (to->decl, flag_strict_aliasing))
+{
+  struct gcc_options opts = global_options;
+
+  cl_optimization_restore (&opts,
+TREE_OPTIMIZATION (DECL_FUNCTION_SPECIFIC_OPTIMIZATION (to->decl)));
+  opts.x_flag_strict_aliasing = false;
+  if (dump_file)
+   fprintf (dump_file, "Dropping flag_strict_aliasing on %s:%i\n",
+to->name (), to->order);
+  build_optimization_node (&opts);
+  DECL_FUNCTION_SPECIFIC_OPTIMIZATION (to->decl)
+= build_optimization_node (&opts);
+}
 
   /* If aliases are involved, redirect edge to the actual destination and
  possibly remove the aliases.  */
Index: testsuite/gcc.dg/lto/alias-1_0.c
===
--- testsuite/gcc.dg/lto/alias-1_0.c(revision 0)
+++ testsuite/gcc.dg/lto/alias-1_0.c(revision 0)
@@ -0,0 +1,23 @@
+/* { dg-lto-do run } */
+/* { dg-lto-options { { -O2 -flto } } } */
+int val;
+
+__attribute__ ((used))
+int *ptr = &val;
+__attribute__ ((used))
+float *ptr2 = (void *)&val;
+
+extern void typefun(float val);
+
+void link_error (void);
+
+int
+main()
+{ 
+  *ptr=1;
+  typefun (0);
+  if (*ptr)
+__builtin_abort ();
+  return 0;
+}
+
Index: testsuite/gcc.dg/lto/alias-1_1.c
===
--- testsuite/gcc.dg/lto/alias-1_1.c(revision 0)
+++ testsuite/gcc.dg/lto/alias-1_1.c(revision 0)
@@ -0,0 +1,7 @@
+/* { dg-options "-fno-strict-aliasing" } */
+extern float *ptr2;
+void
+typefun (float val)
+{ 
+  *ptr2=val;
+}


Re: [PATCH 01/15] Selftest framework (unittests v4)

2015-11-30 Thread Jeff Law

On 11/26/2015 05:37 AM, Bernd Schmidt wrote:
> On 11/25/2015 11:47 PM, David Malcolm wrote:
>> FWIW, the reason I special-cased the linked list was to avoid any
>> dynamic memory allocation: the ctors run before main, so I wanted to
>> keep them as simple as possible.
>
> Is there any particular reason for this? C++ doesn't disallow memory
> allocation in global constructors, does it?

I'm not aware of any such restriction, but I'm not a C++ guru.

David, what's the reason for avoiding dynamic memory allocation here?

>> I do want some level of determinism over test ordering, for the sake of
>> everyone's sanity.  It's probably simplest to either hardcode the order,
>> or have priority levels.  I favor the former (and right now am leaning
>> towards a very explicit no-magic approach with no auto-registration,
>> given the linker issues I've been seeing with auto-registration).
>
> I guess that works too. Certainly explicit function calls are
> preferable over #including other C files as a workaround for such a
> problem.

My problem with priorities is that it's really just a poor man's
substitute for dependency analysis. And in my experience, it usually
fails.

> I still wish others would chime in on the rest of the issues we've
> discussed (run to first failure vs. providing elaborate test summaries).
> I want to make my preference clear but I don't want to dictate it.

I favor run-all over run-to-first-failure as long as we don't have good
dependency analysis to order the tests.   That in turn tends to imply
that each test ought to have a pass/fail indicator.

If we had good dependency analysis, then run-to-first-failure would be
my preference.
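
To make the two policies concrete, here is a small sketch in plain C.  It
is not the selftest framework from the patch series; the struct, the
three sample tests and both runner functions are hypothetical.  The tests
sit in a hardcoded, statically allocated table, so the order is
deterministic and nothing is allocated dynamically.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* A selftest returns nonzero on success.  */
typedef int (*selftest_fn) (void);

struct selftest
{
  const char *name;
  selftest_fn fn;
};

/* Three hypothetical tests; the middle one fails on purpose.  */
int test_addition (void) { return 1 + 1 == 2; }
int test_always_fails (void) { return 0; }
int test_subtraction (void) { return 3 - 1 == 2; }

/* Run-all policy: run every test in table order, print a pass/fail
   indicator for each, and return the number of failures.  */
size_t
run_all_selftests (const struct selftest *tests, size_t n)
{
  size_t failures = 0;
  for (size_t i = 0; i < n; i++)
    {
      int ok = tests[i].fn ();
      printf ("%s: %s\n", ok ? "PASS" : "FAIL", tests[i].name);
      if (!ok)
	failures++;
    }
  return failures;
}

/* Run-to-first-failure policy: stop at the first failing test and
   return its index, or N if every test passed.  */
size_t
run_to_first_failure (const struct selftest *tests, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (!tests[i].fn ())
      return i;
  return n;
}
```

With a table of { addition, always_fails, subtraction }, the run-all
runner still reports on subtraction after the failure, while the other
runner never reaches it.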


Jeff


[RFA] Implement incremental IL linking

2015-11-30 Thread Jan Hubicka
Hi,
this is a polished version of the patch to implement IL-level incremental
linking.  -flinker-output is now documented and can be specified to the GCC
driver.  In this case the plugin gets the option -linker-output-known and
stops attempting to detect the output type from the info passed down by the
linker.  I also added documentation for the flag to invoke.texi.

Modulo the testsuite compensation, the rest of the patch is basically
unchanged since the earlier version: lto-wrapper looks for the linker-output
flag and switches to non-WPA mode (because we do not want to execute ltrans
compilations), and the lto frontend configures the compiler to output IL and
possibly a flat lto binary to the object file.

Bootstrapped/regtested x86_64-linux, OK?

Honza

* lto-plugin.c: Document options; add -linker-output-known;
determine when to use rel and when nolto-rel output.

* lto-wrapper.c (run_gcc): Look for -flinker-output=rel also in the
list of options passed from the driver.
* passes.c (ipa_write_summaries): Only modify statements if body
is in memory.
* cgraphunit.c (ipa_passes): Also produce intermediate code when
incrementally linking.
(ipa_passes): Likewise.
* lto-cgraph.c (lto_output_node): When incrementally linking do not
pass down resolution info.
* common.opt (flag_incremental_link): Update info.
* gcc.c (plugin specs): Turn flinker-output=* to
-plugin-opt=-linker-output-known
* toplev.c (compile_file): Also cut compilation when doing incremental
link.
* flag-types.h (enum lto_partition_model): Add
LTO_LINKER_OUTPUT_NOLTOREL.
(invoke.texi): Add -flinker-output docs.

* lang.opt (lto_linker_output): Add nolto-rel.
* lto-lang.c (lto_post_options): Handle LTO_LINKER_OUTPUT_REL
and LTO_LINKER_OUTPUT_NOLTOREL.
(lto_init): Generate lto when doing incremental link.

* gcc.dg/lto/20081120-2_0.c: Add -flinker-output=nolto-rel
* gcc.dg/lto/20090126-1_0.c: Likewise.
* gcc.dg/lto/20091020-2_0.c: Likewise.
* gcc.dg/lto/20081204-2_0.c: Likewise.
* gcc.dg/lto/20091015-1_0.c: Likewise.
* gcc.dg/lto/20090126-2_0.c: Likewise.
* gcc.dg/lto/20090116_0.c: Likewise.
* gcc.dg/lto/20081224_0.c: Likewise.
* gcc.dg/lto/20091027-1_0.c: Likewise.
* gcc.dg/lto/20090219_0.c: Likewise.
* gcc.dg/lto/20081212-1_0.c: Likewise.
* gcc.dg/lto/20091013-1_0.c: Likewise.
* gcc.dg/lto/20081126_0.c: Likewise.
* gcc.dg/lto/20090206-1_0.c: Likewise.
* gcc.dg/lto/20091016-1_0.c: Likewise.
* gcc.dg/lto/20081120-1_0.c: Likewise.
* gcc.dg/lto/20091020-1_0.c: Likewise.
* gcc.dg/lto/20100426_0.c: Likewise.
* gcc.dg/lto/20081204-1_0.c: Likewise.
* gcc.dg/lto/20091014-1_0.c: Likewise.
* g++.dg/lto/20081109-1_0.C: Likewise.
* g++.dg/lto/20100724-1_0.C: Likewise.
* g++.dg/lto/20081204-1_0.C: Likewise.
* g++.dg/lto/pr45679-2_0.C: Likewise.
* g++.dg/lto/20110311-1_0.C: Likewise.
* g++.dg/lto/20090302_0.C: Likewise.
* g++.dg/lto/20081118_0.C: Likewise.
* g++.dg/lto/20091002-2_0.C: Likewise.
* g++.dg/lto/20081120-2_0.C: Likewise.
* g++.dg/lto/20081123_0.C: Likewise.
* g++.dg/lto/20090313_0.C: Likewise.
* g++.dg/lto/pr54625-1_0.c: Likewise.
* g++.dg/lto/pr48354-1_0.C: Likewise.
* g++.dg/lto/20081219_0.C: Likewise.
* g++.dg/lto/pr48042_0.C: Likewise.
* g++.dg/lto/20101015-2_0.C: Likewise.
* g++.dg/lto/pr45679-1_0.C: Likewise.
* g++.dg/lto/20091026-1_0.C: Likewise.
* g++.dg/lto/pr45621_0.C: Likewise.
* g++.dg/lto/20081119-1_0.C: Likewise.
* g++.dg/lto/20101010-4_0.C: Likewise.
* g++.dg/lto/20081120-1_0.C: Likewise.
* g++.dg/lto/20091002-1_0.C: Likewise.
* g++.dg/lto/20091002-3_0.C: Likewise.
* gfortran.dg/lto/20091016-1_0.f90: Likewise.
* gfortran.dg/lto/pr47839_0.f90: Likewise.
* gfortran.dg/lto/pr46911_0.f: Likewise.
* gfortran.dg/lto/20091028-1_0.f90: Likewise.
* gfortran.dg/lto/20091028-2_0.f90: Likewise.
Index: lto-plugin/lto-plugin.c
===
--- lto-plugin/lto-plugin.c (revision 231081)
+++ lto-plugin/lto-plugin.c (working copy)
@@ -27,10 +27,13 @@ along with this program; see the file CO
More information at http://gcc.gnu.org/wiki/whopr/driver.
 
This plugin should be passed the lto-wrapper options and will forward them.
-   It also has 2 options of its own:
+   It also has options at his own:
-debug: Print the command line used to run lto-wrapper.
-nop: Instead of running lto-wrapper, pass the original to the plugin. This
-   only works if the input files are hybrid.  */
+   only works if the input files are hybrid. 
+   -linker-output-known: Do not determine 

[hsa] Describe grid with target clauses

2015-11-30 Thread Martin Jambor
Hi,

Jakub requested that I remove the grid description from the new fields of
the classes representing gimple omp statements and put it into
special artificial clauses instead.  This patch implements that, with
one target clause per dimension (so up to three clauses), each one
describing both the grid size and the group size along that dimension
(hence the new clause type has two parameters).

Committed to the branch, I will be preparing a new diff against the
trunk shortly.
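
For readers unfamiliar with the clause-chain representation, here is a
simplified model in plain C.  It is only a sketch: struct griddim_clause
and the two helpers are hypothetical and stand in for the tree-based
OMP_CLAUSE__GRIDDIM_ nodes, which carry a dimension subcode and two
operands (grid size and group size).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified model of one artificial grid clause.  */
struct griddim_clause
{
  unsigned dimension;		/* 0, 1 or 2.  */
  unsigned long grid_size;	/* Grid size along this dimension.  */
  unsigned long group_size;	/* Work-group size along this dimension.  */
  struct griddim_clause *chain;	/* Next clause, or NULL.  */
};

/* Count how many dimensions the clause chain describes.  */
unsigned
count_grid_dimensions (const struct griddim_clause *c)
{
  unsigned n = 0;
  for (; c; c = c->chain)
    n++;
  return n;
}

/* Return the clause describing DIMENSION, or NULL if there is none.  */
const struct griddim_clause *
find_griddim (const struct griddim_clause *c, unsigned dimension)
{
  for (; c; c = c->chain)
    if (c->dimension == dimension)
      return c;
  return NULL;
}
```

Whether a target is gridified can then be recognized by the presence of
at least one such clause, rather than by a separate field on the
statement.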

Thanks,

Martin


2015-11-30  Martin Jambor  

* gimple.c (gimple_omp_target_init_dimensions): Removed.
* gimple.h (gimple_statement_omp_parallel_layout): Removed fields
dimensions and kernel_dim.
(gimple_omp_target_dimensions): Removed.
(gimple_omp_target_grid_size): Likewise.
(gimple_omp_target_grid_size_ptr): Likewise.
(gimple_omp_target_set_grid_size): Likewise.
(gimple_omp_target_workgroup_size): Likewise.
(gimple_omp_target_workgroup_size_ptr): Likewise.
(gimple_omp_target_set_workgroup_size): Likewise.
* omp-low.c (scan_sharing_clauses): Handle OMP_CLAUSE__GRIDDIM_.
(scan_omp_target): Do not scan kernel_dim.
(region_needs_kernel_p): Use clauses to recognize gridified kernels.
(get_kernel_launch_attributes): Generate launch attributes from
clauses.
(get_target_arguments): Use clauses to recognize gridified kernels.
(expand_target_kernel_body): Likewise.
(attempt_target_gridification): Record grid description into clauses.
* tree-core.h (omp_clause_code): New element OMP_CLAUSE__GRIDDIM_.
(tree_omp_clause): New subcode dimension.
* tree-pretty-print.c (dump_omp_clause): Handle OMP_CLAUSE__GRIDDIM_.
* tree.c (omp_clause_num_ops): Add number of operands of
OMP_CLAUSE__GRIDDIM_.
(omp_clause_code_name): Add name of OMP_CLAUSE__GRIDDIM_.
(walk_tree_1): Handle OMP_CLAUSE__GRIDDIM_.
* tree.h (OMP_CLAUSE_GRIDDIM_DIMENSION): New.
(OMP_CLAUSE_SET_GRIDDIM_DIMENSION): Likewise.
(OMP_CLAUSE_GRIDDIM_SIZE): Likewise.
(OMP_CLAUSE_GRIDDIM_GROUP): Likewise.
---
 gcc/gimple.c| 11 ---
 gcc/gimple.h| 82 -
 gcc/omp-low.c   | 72 ++-
 gcc/tree-core.h |  9 +-
 gcc/tree-pretty-print.c | 12 
 gcc/tree.c  |  5 ++-
 gcc/tree.h  | 11 +++
 7 files changed, 79 insertions(+), 123 deletions(-)

diff --git a/gcc/gimple.c b/gcc/gimple.c
index d876e90..4658f29 100644
--- a/gcc/gimple.c
+++ b/gcc/gimple.c
@@ -1098,17 +1098,6 @@ gimple_build_omp_target (gimple_seq body, int kind, tree clauses)
   return p;
 }
 
-/* Set dimensions of TARGET to NUM and allocate kernel_dim array of the
-   statement with the appropriate number of elements.  */
-
-void
-gimple_omp_target_init_dimensions (gomp_target *target, size_t num)
-{
-  gcc_assert (num > 0);
-  target->dimensions = num;
-  target->kernel_dim = ggc_cleared_vec_alloc (num);
-}
-
 /* Build a GIMPLE_OMP_TEAMS statement.
 
BODY is the sequence of statements that will be executed.
diff --git a/gcc/gimple.h b/gcc/gimple.h
index 14e6cf6..4c4c799 100644
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -661,21 +661,7 @@ struct GTY((tag("GSS_OMP_PARALLEL_LAYOUT")))
  Shared data argument.  */
   tree data_arg;
 
-  /* TODO: Revisit placement of the following two fields.  On one hand, we
- currently only use them on target construct.  On the other, use on
- parallel construct is also possible in the future.  */
-
   /* [ WORD 11 ] */
-  /* Number of elements in kernel_iter array.  */
-  size_t dimensions;
-
-  /* [ WORD 12 ] */
-  /* If target also contains a GPU kernel, it should be run with the
- following grid sizes.  */
-  struct gimple_omp_target_grid_dim
-* GTY((length ("%h.dimensions"))) kernel_dim;
-
-  /* [ WORD 13 ] */
   /* If set, this statement is part of a gridified kernel, its clauses need to
  be scanned and lowered but the statement should be discarded after
  lowering.  */
@@ -1504,7 +1490,6 @@ gomp_sections *gimple_build_omp_sections (gimple_seq, tree);
 gimple *gimple_build_omp_sections_switch (void);
 gomp_single *gimple_build_omp_single (gimple_seq, tree);
 gomp_target *gimple_build_omp_target (gimple_seq, int, tree);
-void gimple_omp_target_init_dimensions (gomp_target *, size_t);
 gomp_teams *gimple_build_omp_teams (gimple_seq, tree);
 gomp_atomic_load *gimple_build_omp_atomic_load (tree, tree);
 gomp_atomic_store *gimple_build_omp_atomic_store (tree);
@@ -5683,73 +5668,6 @@ gimple_omp_target_set_data_arg (gomp_target *omp_target_stmt,
   omp_target_stmt->data_arg = data_arg;
 }
 
-/* Return the number of dimensions of kernel grid.  */
-
-static inline size_t
-gimple_omp_target_dimensions (gomp_target *omp_target_stmt)
-{
-  return omp_target_stmt->dimensions;
-}
-

Re: [PATCH] Add save_expr langhook (PR c/68513)

2015-11-30 Thread Joseph Myers
On Mon, 30 Nov 2015, Marek Polacek wrote:

> On Sat, Nov 28, 2015 at 08:50:12AM +0100, Richard Biener wrote:
> > Different approach: after the FE folds (unexpectedly?), scan the result for
> > SAVE_EXPRs and if found, drop the folding.
> 
> This doesn't fix the problem completely either, because we simply don't know where
> those SAVE_EXPRs might be introduced: it might be convert(), but e.g. when I
> changed the original testcase a tiny bit (added -), then those SAVE_EXPRs were
> introduced in a different spot (via c_process_stmt_expr -> c_fully_fold).

Well, c_fully_fold should eliminate all C_MAYBE_CONST_EXPRs in its 
argument and never pass anything containing them to the 
language-independent folders.  So it shouldn't matter if something called 
by c_fully_fold introduces a SAVE_EXPR.  If it does matter, that indicates 
the problem was earlier (something earlier putting a tree that 
c_fully_fold doesn't fold around a tree containing a C_MAYBE_CONST_EXPR, 
without folding first).

-- 
Joseph S. Myers
jos...@codesourcery.com


[hsa] Use gimplify_expr in gridification

2015-11-30 Thread Martin Jambor
Hi,

doing some more testing of the branch and combining two of my
testcases I came across a bug where temporaries created by
force_gimple_operand_gsi were not added to the proper bind and thus
were subsequently re-mapped to error_mark when the target construct
was within some other omp construct.  Fixed with this patch, where
pop_gimplify_context does the right thing like at other places in
omp-low.c.  Committed to the branch.

Thanks,

Martin



2015-11-30  Martin Jambor  

* omp-low.c (attempt_target_gridification): Use gimplify_expr.
---
 gcc/omp-low.c | 27 +++
 1 file changed, 15 insertions(+), 12 deletions(-)

diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index bdf6539..7fbdcdf 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -17481,6 +17481,7 @@ attempt_target_gridification (gomp_target *target, gimple_stmt_iterator *gsi,
  gpukernel);
 
   walk_tree (_size, remap_prebody_decls, , NULL);
+  push_gimplify_context ();
   size_t collapse = gimple_omp_for_collapse (inner_loop);
   for (size_t i = 0; i < collapse; i++)
 {
@@ -17499,30 +17500,32 @@ attempt_target_gridification (gomp_target *target, gimple_stmt_iterator *gsi,
   tree step;
   step = get_omp_for_step_from_incr (loc,
 gimple_omp_for_incr (inner_loop, i));
-  n1 = force_gimple_operand_gsi (gsi, fold_convert (type, n1), true,
-NULL_TREE, true, GSI_SAME_STMT);
-  n2 = force_gimple_operand_gsi (gsi, fold_convert (itype, n2), true,
-NULL_TREE,
-true, GSI_SAME_STMT);
+  gimple_seq tmpseq = NULL;
+  n1 = fold_convert (itype, n1);
+  n2 = fold_convert (itype, n2);
   tree t = build_int_cst (itype, (cond_code == LT_EXPR ? -1 : 1));
   t = fold_build2 (PLUS_EXPR, itype, step, t);
   t = fold_build2 (PLUS_EXPR, itype, t, n2);
-  t = fold_build2 (MINUS_EXPR, itype, t, fold_convert (itype, n1));
+  t = fold_build2 (MINUS_EXPR, itype, t, n1);
   if (TYPE_UNSIGNED (itype) && cond_code == GT_EXPR)
t = fold_build2 (TRUNC_DIV_EXPR, itype,
 fold_build1 (NEGATE_EXPR, itype, t),
 fold_build1 (NEGATE_EXPR, itype, step));
   else
t = fold_build2 (TRUNC_DIV_EXPR, itype, t, step);
-  t = fold_convert (uint32_type_node, t);
-  tree gs = force_gimple_operand_gsi (gsi, t, true, NULL_TREE, true,
- GSI_SAME_STMT);
+  tree gs = fold_convert (uint32_type_node, t);
+  gimplify_expr (&gs, &tmpseq, NULL, is_gimple_val, fb_rvalue);
+  if (!gimple_seq_empty_p (tmpseq))
+   gsi_insert_seq_before (gsi, tmpseq, GSI_SAME_STMT);
+
   tree ws;
   if (i == 0 && group_size)
{
  ws = fold_convert (uint32_type_node, group_size);
- ws = force_gimple_operand_gsi (gsi, ws, true, NULL_TREE, true,
-GSI_SAME_STMT);
+ tmpseq = NULL;
+ gimplify_expr (&ws, &tmpseq, NULL, is_gimple_val, fb_rvalue);
+ if (!gimple_seq_empty_p (tmpseq))
+   gsi_insert_seq_before (gsi, tmpseq, GSI_SAME_STMT);
}
   else
ws = build_zero_cst (uint32_type_node);
@@ -17534,7 +17537,7 @@ attempt_target_gridification (gomp_target *target, gimple_stmt_iterator *gsi,
   OMP_CLAUSE_CHAIN (c) = gimple_omp_target_clauses (target);
   gimple_omp_target_set_clauses (target, c);
 }
-
+  pop_gimplify_context (tgt_bind);
   delete declmap;
   return;
 }
-- 
2.6.0
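
The hunk above builds, for each collapsed loop dimension, the expression
that becomes the grid size.  As a rough illustration only, the same
arithmetic for the signed case can be written in plain C.  The names
grid_trip_count and enum loop_cond are hypothetical, and the special case
the patch has for unsigned types with GT_EXPR is not modelled here.

```c
#include <assert.h>

/* Hypothetical stand-in for the LT_EXPR / GT_EXPR loop conditions.  */
enum loop_cond { COND_LT, COND_GT };

/* Number of iterations of
     for (i = n1; i < n2; i += step)    when COND is COND_LT, or
     for (i = n1; i > n2; i += step)    when COND is COND_GT,
   computed as (step + adj + n2 - n1) / step with adj = -1 for "<"
   and +1 for ">".  STEP must be nonzero and point from N1 towards N2.  */
long
grid_trip_count (long n1, long n2, long step, enum loop_cond cond)
{
  assert (step != 0);
  long adj = (cond == COND_LT) ? -1 : 1;
  return (step + adj + n2 - n1) / step;
}
```

For example, a loop from 0 while i < 10 with step 3 visits 0, 3, 6 and 9,
and the formula gives (3 - 1 + 10 - 0) / 3 = 4.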



-fstrict-aliasing fixes 3/5: Do not ignore -fstrict-aliasing changes when parsing optimization attribute

2015-11-30 Thread Jan Hubicka
Hi,
this is the third part, which enables us to change -fstrict-aliasing using
the optimize attribute.  This ought to work safely now because the inliner
propagates the flag.

Bootstrapped/regtested x86_64-linux.

Honza

* gcc.c-torture/execute/alias-1.c: New testcase.
* c-common.c: Do not silently ignore -fstrict-aliasing changes.
Index: testsuite/gcc.c-torture/execute/alias-1.c
===
--- testsuite/gcc.c-torture/execute/alias-1.c   (revision 0)
+++ testsuite/gcc.c-torture/execute/alias-1.c   (revision 0)
@@ -0,0 +1,19 @@
+int val;
+
+int *ptr = &val;
+float *ptr2 = &val;
+
+__attribute__((optimize ("-fno-strict-aliasing")))
+typepun ()
+{
+  *ptr2=0;
+}
+
+main()
+{
+  *ptr=1;
+  typepun ();
+  if (*ptr)
+__builtin_abort ();
+}
+
Index: c-family/c-common.c
===
--- c-family/c-common.c (revision 231097)
+++ c-family/c-common.c (working copy)
@@ -9988,7 +9988,6 @@ parse_optimize_options (tree args, bool
   bool ret = true;
   unsigned opt_argc;
   unsigned i;
-  int saved_flag_strict_aliasing;
   const char **opt_argv;
   struct cl_decoded_option *decoded_options;
   unsigned int decoded_options_count;
@@ -10081,8 +10080,6 @@ parse_optimize_options (tree args, bool
   for (i = 1; i < opt_argc; i++)
 opt_argv[i] = (*optimize_args)[i];
 
-  saved_flag_strict_aliasing = flag_strict_aliasing;
-
   /* Now parse the options.  */
   decode_cmdline_options_to_array_default_mask (opt_argc, opt_argv,
&decoded_options,
@@ -10093,9 +10090,6 @@ parse_optimize_options (tree args, bool
 
   targetm.override_options_after_change();
 
-  /* Don't allow changing -fstrict-aliasing.  */
-  flag_strict_aliasing = saved_flag_strict_aliasing;
-
   optimize_args->truncate (0);
   return ret;
 }


[UPC 08/22] target - Darwin

2015-11-30 Thread Gary Funck

Background
----------

An overview email, describing the UPC-related changes is here:
  https://gcc.gnu.org/ml/gcc-patches/2015-12/msg5.html

The GUPC branch is described here:
  http://gcc.gnu.org/projects/gupc.html

The UPC-related source code differences are summarized here:
  http://gccupc.org/gupc-changes

All languages (c, c++, fortran, go, lto, objc, obj-c++) have been
bootstrapped; no test suite regressions were introduced,
relative to the GCC trunk.

If you are on the cc-list, your name was chosen either
because you are listed as a maintainer for the area that
applies to the patches described in this email, or you
were a frequent contributor of patches made to files listed
in this email.

In the change log entries included in each patch, the directory
containing the affected files is listed, followed by the files.
When the patches are applied, the change log entries will be
distributed to the appropriate ChangeLog file.

Overview
--------

For Darwin, if -fupc is given, then define various UPC-specific spec's.
Also, override default section names for the UPC-related linker sections.

2015-11-30  Gary Funck  

gcc/config/
* darwin.h (LINK_COMMAND_SPEC_A): If -fupc is asserted:
add UPC start/end files, add include of libgupc.spec.
(UPC_SHARED_SECTION_NAME, UPC_PGM_INFO_SECTION_NAME,
UPC_INIT_ARRAY_SECTION_NAME): New.  Override default section names.

Index: gcc/config/darwin.h
===
--- gcc/config/darwin.h (.../trunk) (revision 231059)
+++ gcc/config/darwin.h (.../branches/gupc) (revision 231080)
@@ -176,16 +176,19 @@ extern GTY(()) int darwin_ms_struct;
 %{e*} %{r} \
 %{o*}%{!o:-o a.out} \
 %{!nostdlib:%{!nostartfiles:%S}} \
+%{!nostdlib:%{!nostartfiles:%{fupc:%:include(upc-crtbegin.spec)%(upc_crtbegin)}}}\
 %{L*} %(link_libgcc) %o %{fprofile-arcs|fprofile-generate*|coverage:-lgcov} \
 %{fopenacc|fopenmp|ftree-parallelize-loops=*: \
   %{static|static-libgcc|static-libstdc++|static-libgfortran: libgomp.a%s; : -lgomp } } \
 %{fgnu-tm: \
   %{static|static-libgcc|static-libstdc++|static-libgfortran: libitm.a%s; : -litm } } \
+%{fupc:%:include(libgupc.spec)%(link_upc)} \
 %{!nostdlib:%{!nodefaultlibs:\
   %{%:sanitize(address): -lasan } \
   %{%:sanitize(undefined): -lubsan } \
   %(link_ssp) %(link_gcc_c_sequence)\
 }}\
+%{!nostdlib:%{!nostartfiles:%{fupc:%:include(upc-crtend.spec)%(upc_crtend)}}}\
 %{!nostdlib:%{!nostartfiles:%E}} %{T*} %{F*} }}}"
 
 #define DSYMUTIL "\ndsymutil"
@@ -922,6 +925,11 @@ extern void darwin_driver_init (unsigned
 #undef SUPPORTS_INIT_PRIORITY
 #define SUPPORTS_INIT_PRIORITY 0
 
+/* UPC section names */
+#define UPC_SHARED_SECTION_NAME "__DATA,upc_shared"
+#define UPC_PGM_INFO_SECTION_NAME "__DATA,upc_pgm_info"
+#define UPC_INIT_ARRAY_SECTION_NAME "__DATA,upc_init_array"
+
 /* When building cross-compilers (and native crosses) we shall default to 
providing an osx-version-min of this unless overridden by the User.
10.5 is the only version that fully supports all our archs so that's the


[UPC 06/22] target hooks

2015-11-30 Thread Gary Funck

Background
----------

An overview email, describing the UPC-related changes is here:
  https://gcc.gnu.org/ml/gcc-patches/2015-12/msg5.html

The GUPC branch is described here:
  http://gcc.gnu.org/projects/gupc.html

The UPC-related source code differences are summarized here:
  http://gccupc.org/gupc-changes

All languages (c, c++, fortran, go, lto, objc, obj-c++) have been
bootstrapped; no test suite regressions were introduced,
relative to the GCC trunk.

If you are on the cc-list, your name was chosen either
because you are listed as a maintainer for the area that
applies to the patches described in this email, or you
were a frequent contributor of patches made to files listed
in this email.

In the change log entries included in each patch, the directory
containing the affected files is listed, followed by the files.
When the patches are applied, the change log entries will be
distributed to the appropriate ChangeLog file.

Overview
--------

Four new target hooks are defined for UPC.  They relate to naming
the various linker sections used by UPC as well as testing for
the availability of the "UPC linker script" feature.
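
The pattern behind these hooks can be sketched in standalone C as
follows.  This is an illustration under assumptions, not the GCC target
hook machinery: struct upc_hooks, default_upc_hooks and
upc_is_shared_section are hypothetical names, while the macro names, the
default strings and the default_upc_* function names are taken from the
patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* A target header may define these first; otherwise the defaults apply.  */
#ifndef UPC_SHARED_SECTION_NAME
#define UPC_SHARED_SECTION_NAME "upc_shared"
#endif
#ifndef UPC_PGM_INFO_SECTION_NAME
#define UPC_PGM_INFO_SECTION_NAME "upc_pgm_info"
#endif
#ifndef UPC_INIT_ARRAY_SECTION_NAME
#define UPC_INIT_ARRAY_SECTION_NAME "upc_init_array"
#endif

/* Hypothetical table of the four hooks.  */
struct upc_hooks
{
  bool (*link_script_p) (void);
  const char *(*shared_section_name) (void);
  const char *(*pgm_info_section_name) (void);
  const char *(*init_array_section_name) (void);
};

bool
default_upc_link_script_p (void)
{
#ifdef HAVE_UPC_LINK_SCRIPT
  return true;
#else
  return false;
#endif
}

const char *
default_upc_shared_section_name (void)
{
  return UPC_SHARED_SECTION_NAME;
}

const char *
default_upc_pgm_info_section_name (void)
{
  return UPC_PGM_INFO_SECTION_NAME;
}

const char *
default_upc_init_array_section_name (void)
{
  return UPC_INIT_ARRAY_SECTION_NAME;
}

/* Default hook table; a target could supply its own functions instead.  */
const struct upc_hooks default_upc_hooks = {
  default_upc_link_script_p,
  default_upc_shared_section_name,
  default_upc_pgm_info_section_name,
  default_upc_init_array_section_name
};

/* Hypothetical helper: does NAME match the shared section of HOOKS?  */
bool
upc_is_shared_section (const struct upc_hooks *hooks, const char *name)
{
  assert (hooks && name);
  return strcmp (hooks->shared_section_name (), name) == 0;
}
```

A target such as Darwin only needs to define the macros differently (for
example with a "__DATA," segment prefix) for the default hooks to return
its section names.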

2015-11-30  Gary Funck  

gcc/
* defaults.h (UPC_SHARED_SECTION_NAME): New macro.
(UPC_PGM_INFO_SECTION_NAME): New macro.
(UPC_INIT_ARRAY_SECTION_NAME): New macro.
* target.def (upc): New hook prefix.
(link_script_p, shared_section_name,
pgm_info_section_name, init_array_section_name):
New target hook definitions.
* targhooks.c (default_upc_link_script_p,
default_upc_shared_section_name, default_upc_pgm_info_section_name,
default_upc_init_array_section_name): New default target hooks.
* targhooks.h (default_upc_link_script_p,
default_upc_shared_section_name, default_upc_pgm_info_section_name,
default_upc_init_array_section_name): New target hook prototypes.

Index: gcc/defaults.h
===
--- gcc/defaults.h  (.../trunk) (revision 231059)
+++ gcc/defaults.h  (.../branches/gupc) (revision 231080)
@@ -1488,4 +1488,23 @@ see the files COPYING3 and COPYING.RUNTI
 
 #endif /* GCC_INSN_FLAGS_H  */
 
+/* UPC section names.  */
+
+/* Name of section used to assign addresses to shared data items.  */
+#ifndef UPC_SHARED_SECTION_NAME
+#define UPC_SHARED_SECTION_NAME "upc_shared"
+#endif
+
+/* Name of section used to hold info. describing how
+   a UPC source file was compiled.  */
+#ifndef UPC_PGM_INFO_SECTION_NAME
+#define UPC_PGM_INFO_SECTION_NAME "upc_pgm_info"
+#endif
+
+/* Name of section that holds an array of addresses that points to 
+   the UPC initialization routines.  */
+#ifndef UPC_INIT_ARRAY_SECTION_NAME
+#define UPC_INIT_ARRAY_SECTION_NAME "upc_init_array"
+#endif
+
 #endif  /* ! GCC_DEFAULTS_H */
Index: gcc/target.def
===
--- gcc/target.def  (.../trunk) (revision 231059)
+++ gcc/target.def  (.../branches/gupc) (revision 231080)
@@ -5496,6 +5496,41 @@ DEFHOOK
 
 HOOK_VECTOR_END (cxx)
 
+/* Functions and data for UPC support.  */
+#undef HOOK_PREFIX
+#define HOOK_PREFIX "TARGET_UPC_"
+HOOK_VECTOR (TARGET_UPC, upc)
+
+DEFHOOK
+(link_script_p,
+"This hook returns true if a linker script will be used to\
+ origin the UPC shared section at 0.",
+ bool, (void),
+ default_upc_link_script_p)
+
+DEFHOOK
+(shared_section_name,
+"This hook returns the name of the section used to assign addresses to\
+ UPC shared data items.",
+ const char *, (void),
+ default_upc_shared_section_name)
+
+DEFHOOK
+(pgm_info_section_name,
+"This hook returns the name of the section used to hold information\
+ describing how a UPC source file was compiled.",
+ const char *, (void),
+ default_upc_pgm_info_section_name)
+
+DEFHOOK
+(init_array_section_name,
+"This hook returns the name of the section used to hold an array\
+ of addresses of UPC initialization routines.",
+ const char *, (void),
+ default_upc_init_array_section_name)
+
+HOOK_VECTOR_END (upc)
+
 /* Functions and data for emulated TLS support.  */
 #undef HOOK_PREFIX
 #define HOOK_PREFIX "TARGET_EMUTLS_"
Index: gcc/targhooks.c
===
--- gcc/targhooks.c (.../trunk) (revision 231059)
+++ gcc/targhooks.c (.../branches/gupc) (revision 231080)
@@ -1955,4 +1955,32 @@ can_use_doloop_if_innermost (const wides
   return loop_depth == 1;
 }
 
+bool
+default_upc_link_script_p (void)
+{
+#ifdef HAVE_UPC_LINK_SCRIPT
+  return true;
+#else
+  return false;
+#endif
+}
+
+const char *
+default_upc_shared_section_name (void)
+{
+  return UPC_SHARED_SECTION_NAME;
+}
+
+const char *
+default_upc_pgm_info_section_name (void)
+{
+  return UPC_PGM_INFO_SECTION_NAME;
+}
+
+const char *
+default_upc_init_array_section_name (void)
+{
+  return UPC_INIT_ARRAY_SECTION_NAME;
+}
+
 #include "gt-targhooks.h"
Index: 

[UPC 09/22] target - x86

2015-11-30 Thread Gary Funck

Background
----------

An overview email, describing the UPC-related changes is here:
  https://gcc.gnu.org/ml/gcc-patches/2015-12/msg5.html

The GUPC branch is described here:
  http://gcc.gnu.org/projects/gupc.html

The UPC-related source code differences are summarized here:
  http://gccupc.org/gupc-changes

All languages (c, c++, fortran, go, lto, objc, obj-c++) have been
bootstrapped; no test suite regressions were introduced,
relative to the GCC trunk.

If you are on the cc-list, your name was chosen either
because you are listed as a maintainer for the area that
applies to the patches described in this email, or you
were a frequent contributor of patches made to files listed
in this email.

In the change log entries included in each patch, the directory
containing the affected files is listed, followed by the files.
When the patches are applied, the change log entries will be
distributed to the appropriate ChangeLog file.

Overview
--------

UPC pointers-to-shared use a struct to describe their internal
representation.  For efficiency and correctness, ensure that when the
struct's mode is TImode, a pointer-to-shared parameter is passed in
registers.  Note that the parameter passing logic forces "C" pointer type
parameters to be 'word mode', but that rule doesn't apply to UPC
pointers-to-shared due to their "fat" struct representation.
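
To picture what a "fat" pointer of this size looks like, here is a
sketch in plain C.  The field names and widths are assumptions made for
illustration (struct upc_pts_rep and upc_pts_equal are hypothetical);
the only property relied on is that the struct occupies 16 bytes, the
size of TImode on a 64-bit target.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout of a UPC pointer-to-shared: an address plus
   two 32-bit bookkeeping fields.  */
struct upc_pts_rep
{
  uint64_t vaddr;	/* Address within the shared section.  */
  uint32_t thread;	/* Owning thread (assumed field).  */
  uint32_t phase;	/* Position within a block (assumed field).  */
};

/* 16 bytes = 128 bits, i.e. two 64-bit integer registers.  */
_Static_assert (sizeof (struct upc_pts_rep) == 16,
		"pointer-to-shared model must be 16 bytes");

/* Takes both arguments by value, the way a pointer-to-shared
   parameter would be passed.  Returns 1 if all fields match.  */
int
upc_pts_equal (struct upc_pts_rep a, struct upc_pts_rep b)
{
  return a.vaddr == b.vaddr && a.thread == b.thread && a.phase == b.phase;
}
```

On x86-64 with the System V ABI, a 16-byte struct made only of integer
fields is classified as two INTEGER eightbytes, which matches the two
X86_64_INTEGER_CLASS entries the patch assigns.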

2015-11-30  Gary Funck  

gcc/config/i386/
* i386.c (classify_argument): Check for UPC pointer-to-shared
on 64-bit target.
(function_value_64): Do not force UPC pointers-to-shared
to be returned in word mode.

Index: gcc/config/i386/i386.c
===
--- gcc/config/i386/i386.c  (.../trunk) (revision 231059)
+++ gcc/config/i386/i386.c  (.../branches/gupc) (revision 231080)
@@ -7943,6 +7943,15 @@ classify_argument (machine_mode mode, co
   && targetm.calls.must_pass_in_stack (mode, type))
 return 0;
 
+  /* Special case check for pointer to shared, on 64-bit target.  */
+  if (TARGET_64BIT && mode == TImode
+  && type && TREE_CODE (type) == POINTER_TYPE
+  && SHARED_TYPE_P (TREE_TYPE (type)))
+{
+  classes[0] = classes[1] = X86_64_INTEGER_CLASS;
+  return 2;
+}
+
   if (type && AGGREGATE_TYPE_P (type))
 {
   int i;
@@ -9536,7 +9545,8 @@ function_value_64 (machine_mode orig_mod
 
   return gen_rtx_REG (mode, regno);
 }
-  else if (POINTER_TYPE_P (valtype))
+  else if (POINTER_TYPE_P (valtype)
+   && !SHARED_TYPE_P (TREE_TYPE (valtype)))
 {
   /* Pointers are always returned in word_mode.  */
   mode = word_mode;
@@ -9680,6 +9690,11 @@ ix86_promote_function_mode (const_tree t
 {
   if (type != NULL_TREE && POINTER_TYPE_P (type))
 {
+  if (SHARED_TYPE_P (TREE_TYPE (type)))
+{
+  *punsignedp = 1;
+  return TYPE_MODE (upc_pts_rep_type_node);
+   }
   *punsignedp = POINTERS_EXTEND_UNSIGNED;
   return word_mode;
 }


[UPC 15/22] RTL changes

2015-11-30 Thread Gary Funck

Background
--

An overview email, describing the UPC-related changes is here:
  https://gcc.gnu.org/ml/gcc-patches/2015-12/msg5.html

The GUPC branch is described here:
  http://gcc.gnu.org/projects/gupc.html

The UPC-related source code differences are summarized here:
  http://gccupc.org/gupc-changes

All languages (c, c++, fortran, go, lto, objc, obj-c++) have been
bootstrapped; no test suite regressions were introduced,
relative to the GCC trunk.

If you are on the cc-list, your name was chosen either
because you are listed as a maintainer for the area that
applies to the patches described in this email, or you
were a frequent contributor of patches made to files listed
in this email.

In the change log entries included in each patch, the directory
containing the affected files is listed, followed by the files.
When the patches are applied, the change log entries will be
distributed to the appropriate ChangeLog file.

Overview


UPC pointers-to-shared have an internal representation which is
defined as a 'struct' with three fields.  Special logic is
needed in promote_mode() to handle this case.
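
The decision added to promote_mode() can be modelled with a toy
sketch.  Everything below (the enum, the struct and the function
name) is hypothetical and only mirrors the shape of the patch: an
ordinary pointer keeps the address mode, while a pointer whose target
type is UPC shared gets the mode of the representation type.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for GCC machine modes; these names do not exist in
   GCC.  */
enum mode_sketch { MODE_ADDRESS, MODE_PTS_REP };

/* Toy stand-in for a pointer type node.  */
struct type_sketch
{
  bool is_pointer;        /* models POINTER_TYPE / REFERENCE_TYPE */
  bool target_is_shared;  /* models SHARED_TYPE_P (TREE_TYPE (type)) */
};

/* Models only the pointer case of promote_mode() after the patch.  */
static enum mode_sketch
promote_pointer_mode_sketch (const struct type_sketch *type)
{
  assert (type->is_pointer);
  if (type->target_is_shared)
    return MODE_PTS_REP;   /* mode of the struct representation */
  return MODE_ADDRESS;     /* mode chosen by the address space */
}
```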

2015-11-30  Gary Funck  

gcc/
* explow.c (promote_mode): For UPC pointer-to-shared values,
return the mode of the UPC PTS representation type.

Index: gcc/explow.c
===
--- gcc/explow.c(.../trunk) (revision 231059)
+++ gcc/explow.c(.../branches/gupc) (revision 231080)
@@ -794,6 +794,8 @@ promote_mode (const_tree type ATTRIBUTE_
 case REFERENCE_TYPE:
 case POINTER_TYPE:
   *punsignedp = POINTERS_EXTEND_UNSIGNED;
+  if (SHARED_TYPE_P (TREE_TYPE (type)))
+return TYPE_MODE (upc_pts_type_node);
   return targetm.addr_space.address_mode
   (TYPE_ADDR_SPACE (TREE_TYPE (type)));
   break;


[UPC 04/22] Make, Config changes

2015-11-30 Thread Gary Funck

Overview


UPC introduces a new runtime library, libgupc, and a new compiler driver, gupc.
These are defined in the top-level Makefile.def and Makefile.tpl files.

The top-level configure script will disable building the libgupc runtime
library on unsupported targets.  For builds where the target is the
same as the host, configure will check if "UPC linker scripts" can be
supported; this check can be overridden by the --enable-upc-link-script
switch.  The check runs a 'perl' script, so it is only performed if the
host has perl installed.

2015-11-30  Gary Funck  

* Makefile.def (libgupc):  New.  Define libgupc module.
* Makefile.in: Re-generate.
* Makefile.tpl (BUILD_EXPORTS, EXTRA_TARGET_FLAGS):
Add GUPC and GUPCFLAGS.
(BASE_TARGET_EXPORTS, EXTRA_HOST_FLAGS): Add GUPC.
(GUPC_FOR_BUILD, GUPCFLAGS,
GUPC_FOR_TARGET, GUPCFLAGS_FOR_TARGET): New.
* configure: Re-generate.
* configure.ac (target_libraries): Add target-libgupc.
Disable libgupc on unsupported systems.
Add check for 'gupc' as target tool.
(GUPC_FOR_BUILD): New.  Define 'gupc' as a target tool.
contrib/
* gcc_update (libgupc/aclocal.m4, libgupc/config.h.in,
libgupc/configure, libgupc/Makefile.in,
libgupc/testsuite/Makefile.in): New.  Define libgupc targets.
* update-copyright.py: Add libgupc library to copyright scan list.
(skip_extensions): Add .upc.
(GCCCopyright): Add external authors for
contributors to UPC-related additions.
gcc/
* config.in (HAVE_UPC_LINK_SCRIPT): New. Re-generate.
* configure: Re-generate.
* configure.ac (enable-upc-link-script): Add check for UPC
linker script support.
* Makefile.in (INFOFILES): Add doc/gupc.info.
(MANFILES): Add doc/gupc.1.
gcc/c/
* Make-lang.in (gupc): Add rules to build and install the
'gupc' executable.  Add rule to symlink 'upc' to 'gupc' executable.
* config-lang.in (gtfiles): Add UPC garbage collection
support files to gtfiles.

Index: Makefile.def
===
--- Makefile.def(.../trunk) (revision 231059)
+++ Makefile.def(.../branches/gupc) (revision 231080)
@@ -154,6 +154,7 @@ target_modules = { module= libbacktrace;
 target_modules = { module= libquadmath; };
 target_modules = { module= libgfortran; };
 target_modules = { module= libobjc; };
+target_modules = { module= libgupc; };
 target_modules = { module= libgo; };
 target_modules = { module= libtermcap; no_check=true;
missing=mostlyclean;
@@ -284,6 +285,8 @@ flags_to_pass = { flag= GCJ_FOR_TARGET ;
 flags_to_pass = { flag= GFORTRAN_FOR_TARGET ; };
 flags_to_pass = { flag= GOC_FOR_TARGET ; };
 flags_to_pass = { flag= GOCFLAGS_FOR_TARGET ; };
+flags_to_pass = { flag= GUPC_FOR_TARGET ; };
+flags_to_pass = { flag= GUPCFLAGS_FOR_TARGET ; };
 flags_to_pass = { flag= LD_FOR_TARGET ; };
 flags_to_pass = { flag= LIPO_FOR_TARGET ; };
 flags_to_pass = { flag= LDFLAGS_FOR_TARGET ; };
@@ -561,6 +564,8 @@ dependencies = { module=all-target-libja
 dependencies = { module=all-target-libjava; on=all-target-libffi; };
 dependencies = { module=configure-target-libobjc; on=configure-target-boehm-gc; };
 dependencies = { module=all-target-libobjc; on=all-target-boehm-gc; };
+dependencies = { module=all-target-libgupc; on=all-target-libbacktrace; };
+dependencies = { module=all-target-libgupc; on=all-target-libatomic; };
 dependencies = { module=configure-target-libstdc++-v3; on=configure-target-libgomp; };
 dependencies = { module=configure-target-liboffloadmic; on=configure-target-libgomp; };
 dependencies = { module=configure-target-libsanitizer; on=all-target-libstdc++-v3; };
@@ -569,6 +574,9 @@ dependencies = { module=configure-target
 // generated by the libgomp configure.  Unfortunately, due to the use of
 //  recursive make, we can't be that 

[UPC 07/22] lowering, pointer-to-shared ops

2015-11-30 Thread Gary Funck

Overview


The UPC lowering pass traverses the current function tree
and rewrites UPC-related statements and operations into GENERIC.
The resulting GENERIC tree code will retain UPC pointers-to-shared (PTS)
types, but all operations such as 'get' and 'put' which indirect
through a pointer-to-shared have been lowered to use the internal
representation type.  Most of these operations on UPC pointers-to-shared
are implemented in c/c-upc-pts-ops.c.

The UPC lowering pass is implemented by upc_genericize() in
c/c-upc-low.c.  upc_genericize() is called from finish_function()
in c/c-decl.c. It is called just prior to calling c_genericize(),
if -fupc has been asserted.
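
As a rough illustration of what "lowering" means here, the sketch
below replaces an indirection through a pointer-to-shared with
explicit 'get' and 'put' calls.  The function names, the handle type
and the simulated memory are all hypothetical; the real runtime entry
points are the ones named in c/c-upc-rts-names.h and are not
reproduced here.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_THREADS 4
#define SKETCH_WORDS   8

/* Simulated per-thread shared memory; a real runtime manages this.  */
static int32_t sketch_shared[SKETCH_THREADS][SKETCH_WORDS];

/* Hypothetical pointer-to-shared handle: owning thread plus a word
   index.  The phase field is omitted for brevity.  */
struct pts_handle_sketch
{
  uint32_t thread;
  uint32_t index;
};

/* Hypothetical stand-in for a runtime 'get' of a 32-bit value.  */
static int32_t
sketch_get_si (struct pts_handle_sketch p)
{
  assert (p.thread < SKETCH_THREADS && p.index < SKETCH_WORDS);
  return sketch_shared[p.thread][p.index];
}

/* Hypothetical stand-in for a runtime 'put' of a 32-bit value.  */
static void
sketch_put_si (struct pts_handle_sketch p, int32_t value)
{
  assert (p.thread < SKETCH_THREADS && p.index < SKETCH_WORDS);
  sketch_shared[p.thread][p.index] = value;
}

/* Source form:   *p = *p + 1;
   Lowered form:  one 'get' followed by one 'put'.  */
static void
sketch_increment (struct pts_handle_sketch p)
{
  sketch_put_si (p, sketch_get_si (p) + 1);
}
```

The point of the sketch is only that, after lowering, no direct
dereference of the pointer-to-shared remains in the function body.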

The file c/c-upc-rts-names.h defines the names of the UPC runtime
entry points and variables that implement the runtime ABI.
To date, there has been no need to implement target dependent names,
perhaps partly because UPC is supported primarily on POSIX-compliant targets.

UPC requires some special logic for handling file scoped initializations.
This is due to the fact that UPC shared addresses are not known
until runtime and therefore cannot be statically initialized
in the usual way.  For example, 'addr_x' below must be initialized
at runtime.

  shared int x;
  shared int *addr_x = &x;

The routine, upc_check_decl_init(), checks an initialization
statement to determine if it needs special handling.
It is called from store_init_value().  If an initialization
refers to UPC-related constructs that require initialization
at runtime, then upc_decl_init() is called to save the
initialization statement on a list.  This list is
processed by upc_write_global_declarations(), which
is called via a UPC-specific language hook from
c_common_parse_file(), just after calling c_parse_file().
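
The save-now, process-later pattern described above can be sketched
in plain C.  This is an analogy only: the compiler saves tree
statements rather than pointer pairs, and how it emits the
initialization code is not shown in this message.  All names below
are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_INITS 16

/* One deferred initialization: store SOURCE into *TARGET later.  */
struct deferred_init_sketch
{
  int **target;
  int *source;
};

static struct deferred_init_sketch sketch_init_list[SKETCH_MAX_INITS];
static size_t sketch_init_count;

/* Analogue of saving an initialization statement on a list.  */
static void
sketch_defer_init (int **target, int *source)
{
  assert (sketch_init_count < SKETCH_MAX_INITS);
  sketch_init_list[sketch_init_count].target = target;
  sketch_init_list[sketch_init_count].source = source;
  sketch_init_count++;
}

/* Analogue of processing the list: perform every saved
   initialization in the order recorded, then empty the list.
   Returns the number of initializations performed.  */
static size_t
sketch_run_deferred_inits (void)
{
  size_t done = sketch_init_count;
  for (size_t i = 0; i < sketch_init_count; i++)
    *sketch_init_list[i].target = sketch_init_list[i].source;
  sketch_init_count = 0;
  return done;
}
```

In this analogy sketch_defer_init() plays the role of
upc_decl_init(), and sketch_run_deferred_inits() corresponds to the
work triggered by upc_write_global_declarations().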


2015-11-30  Gary Funck  

gcc/c-family/
* c-upc-pts.h: New.  Define the sizes and types of fields
in the UPC pointer-to-shared representation.
gcc/c/
* c-upc-low.c: New.  Lower UPC constructs to GENERIC.
* c-upc-low.h: New.  Prototypes for c-upc-low.c.
* c-upc-pts-ops.c: New. Implement UPC pointer-to-shared operations.
* c-upc-pts-ops.h: New. Prototypes for c-upc-pts-ops.c.
* c-upc-rts-names.h: New.  Names of some functions in the UPC runtime.

Index: gcc/c-family/c-upc-pts.h
===
--- gcc/c-family/c-upc-pts.h(.../trunk) (revision 0)
+++ gcc/c-family/c-upc-pts.h(.../branches/gupc) (revision 231080)
@@ -0,0 +1,40 @@
+/* Define UPC pointer-to-shared representation characteristics.
+   Copyright (C) 2008-2015 Free Software Foundation, Inc.
+   Contributed by Gary Funck 
+ and Nenad Vukicevic .
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef GCC_C_FAMILY_UPC_PTS_H
+#define GCC_C_FAMILY_UPC_PTS_H 1
+
+#define UPC_PTS_SIZE        (LONG_TYPE_SIZE + POINTER_SIZE)
+#define UPC_PTS_PHASE_SIZE  (LONG_TYPE_SIZE / 2)
+#define UPC_PTS_THREAD_SIZE (LONG_TYPE_SIZE / 2)
+#define UPC_PTS_VADDR_SIZE  POINTER_SIZE
+#define UPC_PTS_PHASE_TYPE  ((LONG_TYPE_SIZE == 64) \
+   ? "uint32_t" : "uint16_t")
+#define UPC_PTS_THREAD_TYPE ((LONG_TYPE_SIZE == 64) \
+   ? "uint32_t" : "uint16_t")
+#define UPC_PTS_VADDR_TYPE  "char *"
+
+#define UPC_MAX_THREADS (1 << (((UPC_PTS_THREAD_SIZE) 

[UPC 21/22] gcc.dg test suite

2015-11-30 Thread Gary Funck

Overview


The test suite additions under gcc/testsuite/gcc.dg/gupc
test most of the "negative" front-end errors generated by GNU UPC.
These are compile-only tests and can be safely run as part of
the gcc.dg test suite.

There are also some tests which test code generation for 'gets'
and 'puts' to UPC shared memory.  These code generation tests
scan the '.original' tree dump for expected UPC runtime calls.

A new gupc.exp file is introduced under gcc/testsuite/gcc.dg/gupc.
It checks that the compiler supports the -fupc switch; if not, the
tests will not be run.  gupc.exp sets the compilation flags
to -fno-upc-pre-include by default.  This removes any dependency upon
the libgupc runtime library, but requires that the tests declare
various runtime APIs and variables exported by the UPC runtime library
that would have otherwise been declared in the gcc-upc.h file built
under libgupc/include.

2015-11-30  Gary Funck  

gcc/testsuite/gcc.dg/gupc/
* addr-of-shared-bit-field.upc: New.
* assign-local-ptr-to-pts.upc: New.
* assign-pts-to-local-ptr.upc: New.
* assign-pts-with-diff-block-factors-no-cast.upc: New.
* barrier-notify-wait.upc: New.
* block-factor-applied-to-void-type.upc: New.
* block-factor-incompatible-with-ref-type.upc: New.
* block-factor-not-int-constant.upc: New.
* cast-int-to-pts.upc: New.
* cast-local-ptr-to-pts.upc: New.
* cmp-pts-and-local-ptr.upc: New.
* cmp-pts-eq-diff-block-factor-1.upc: New.
* cmp-pts-eq-diff-block-factor-2.upc: New.
* cmp-pts-eq-diff-block-factor-3.upc: New.
* cmp-pts-gt-diff-block-factor-1.upc: New.
* cmp-pts-gt-diff-block-factor-2.upc: New.
* cmp-pts-gt-diff-block-factor-3.upc: New.
* decl-multiple-layout-quals.upc: New.
* deprecated-barrier-notify-stmt.upc: New.
* deprecated-barrier-stmt.upc: New.
* deprecated-barrier-wait-stmt.upc: New.
* diff-pts-and-local-ptr.upc: New.
* dyn-array-decl-threads-more-than-once.upc: New.
* dyn-array-dim-not-simple-multiple-of-threads.upc: New.
* dyn-star-layout-dim-not-multiple-of-threads.upc: New.
* dyn-threads-more-than-once.upc: New.
* dyn-threads-with-indef-block-size.upc: New.
* field-decl-with-shared-qual.upc: New.
* func-decl-has-shared_qual.upc: New.
* get-blk-relaxed.upc: New.
* get-blk-strict.upc: New.
* get-df-relaxed.upc: New.
* get-df-strict.upc: New.
* get-di-relaxed.upc: New.
* get-di-strict.upc: New.
* get-hi-relaxed.upc: New.
* get-hi-strict.upc: New.
* get-qi-relaxed.upc: New.
* get-qi-strict.upc: New.
* get-sf-relaxed.upc: New.
* get-sf-strict.upc: New.
* get-si-relaxed.upc: New.
* get-si-strict.upc: New.
* get-tf-relaxed.upc: New.
* get-tf-strict.upc: New.
* get-ti-relaxed.upc: New.
* get-ti-strict.upc: New.
* getaddr.upc: New.
* gupc.exp: New.  Compile all *.upc tests in this directory.
* init-makes-pts-from-int.upc: New.
* invalid-local-ptr-to-void-arith.upc: New.
* invalid-sizeof-shared-void.upc: New.
* invalid-sizeof-void.upc: New.
* lt-pts-and-local-ptr.upc: New.
* max-block-size-exceeded.upc: New.
* no-closing-layout-qual-bracket.upc: New.
* parm-decl-with-shared-qual.upc: New.
* passing-arg-makes-pts-from-int.upc: New.
* pts-to-void-in-arith.upc: New.
* put-blk-relaxed.upc: New.
* put-blk-strict.upc: New.
* put-df-relaxed.upc: New.
* put-df-strict.upc: New.
* put-di-relaxed.upc: New.
* put-di-strict.upc: New.
* put-hi-relaxed.upc: New.
* put-hi-strict.upc: New.
* put-qi-relaxed.upc: New.
* put-qi-strict.upc: New.
* put-sf-relaxed.upc: New.
* put-sf-strict.upc: New.
* 

[UPC 11/22] documentation

2015-11-30 Thread Gary Funck

Overview


For UPC, a new gupc.texi file is introduced to describe the
stand-alone 'gupc' command, a driver similar to gfortran that
invokes 'gcc' with the -fupc switch asserted and compiles
any .c files on the command line as if they were .upc files.
In addition, it describes how to run UPC programs, along with details
on the command line switches processed by the UPC runtime.

2015-11-30  Gary Funck  

gcc/doc/
* gupc.texi: New.
* install.texi (disable-libgupc, enable-upc-link-script):
New. Describe UPC-specific configure options.
* invoke.texi (fupc, fupc-threads, fupc-pthreads-model-tls,
fupc-inline-lib, fupc-pre-include, fupc-debug, dwarf-2-upc,
fupc-instrument, fupc-instrument-functions):
New. Describe UPC-specific compiler options.
* passes.texi: Describe the UPC lowering pass.
* sourcebuild.texi (libgupc): Add libgupc to list of libraries.
Also make note that target support for UPC is enabled via -fupc.
* tm.texi: Re-generate.
* tm.texi.in (TARGET_UPC_LINK_SCRIPT_P,
TARGET_UPC_SHARED_SECTION_NAME, TARGET_UPC_PGM_INFO_SECTION_NAME,
TARGET_UPC_INIT_ARRAY_SECTION_NAME): Refer to new UPC target hooks.
libgupc/
* libgupc.texi: New.

Index: gcc/doc/gupc.texi
===
--- gcc/doc/gupc.texi   (.../trunk) (revision 0)
+++ gcc/doc/gupc.texi   (.../branches/gupc) (revision 231080)
@@ -0,0 +1,394 @@
+\input texinfo @c -*-texinfo-*-
+@setfilename gupc
+@settitle GNU project UPC compiler
+
+@c Merge the standard indexes into a single one.
+@syncodeindex fn cp
+@syncodeindex vr cp
+@syncodeindex ky cp
+@syncodeindex pg cp
+@syncodeindex tp cp
+
+@include gcc-common.texi
+
+@c Copyright (C) 2001-2015 Free Software Foundation, Inc.
+@c Contributed by Gary Funck 
+@c   and Nenad Vukicevic .
+@c Based on original implementation
+@c   by Jesse M. Draper 
+@c   and William W. Carlson .
+
+@copying
+@c man begin COPYRIGHT
+Copyright @copyright{} 2001-2015 Free Software Foundation, Inc.
+
+Permission is granted to copy, distribute and/or modify this document
+under the terms of the GNU Free Documentation License, Version 1.3 or
+any later version published by the Free Software Foundation; with the
+Invariant Sections being ``GNU General Public License'' and ``Funding
+Free Software'', the Front-Cover texts being (a) (see below), and with
+the Back-Cover Texts being (b) (see below).  A copy of the license is
+included in the
+@c man end
+section entitled ``GNU Free Documentation License''.
+@ignore
+@c man begin COPYRIGHT
+man page gfdl(7).
+@c man end
+@end ignore
+@c man begin COPYRIGHT
+
+(a) The FSF's Front-Cover Text is:
+
+ A GNU Manual
+
+(b) The FSF's Back-Cover Text is:
+
+ You have freedom to copy and modify this GNU Manual, like GNU
+ software.  Copies published by the Free Software Foundation raise
+ funds for GNU development.
+@c man end
+@end copying
+@c Set file name and title for the man page.
+
+@ifinfo
+@dircategory Software development
+@direntry
+* GNU UPC: (gupc).   A GCC-based compiler for the UPC language
+@end direntry
+
+@insertcopying
+@end ifinfo
+
+@titlepage
+@title The GNU UPC Compiler
+@versionsubtitle
+@author Gary Funck and Nenad Vukicevic
+
+@page
+@vskip 0pt plus 1filll
+Published by the Free Software Foundation @*
+51 Franklin Street, Fifth Floor@*
+Boston, MA 02110-1301, USA@*
+@sp 1
+@insertcopying
+@end titlepage
+@contents
+@page
+
+@node Top
+@chapter @command{gupc}--- UPC compiler for parallel computers
+
+@command{gupc} provides a compilation and execution environment for
+programs written in the UPC (Unified Parallel C) language.
+
+@menu
+* GUPC Intro:: Introduction to gupc.
+* Threads::Number of Execution Threads.
+* Invoking GUPC::  How to use gupc.
+* GUPC Options::   GUPC 

[UPC 16/22] gimple/gimplify changes

2015-11-30 Thread Gary Funck

Overview


In gimple-expr.c, logic is added to useless_type_conversion_p() to
handle conversions involving UPC pointers-to-shared.
lang_hooks.types_compatible_p() is called to check conversions
between UPC pointers-to-shared.  This will in turn call c_types_compatible_p()
which will call upc_types_compatible_p() if -fupc is asserted.

The hook is needed here because the gimple-related routines are
defined at the top-level of the GCC tree and can be linked with
other front-ends.
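
The two UPC-specific checks can be modelled with a small sketch.  The
types and function names are hypothetical, and the compatibility test
is a deliberately crude stand-in for lang_hooks.types_compatible_p()
that compares blocking factors only.  The sketch answers just one
question: do the UPC checks force the conversion to be kept?  The
remaining checks of useless_type_conversion_p() are not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy pointer type: is the target UPC shared, and with which
   blocking factor?  */
struct ptr_type_sketch
{
  bool target_shared;
  unsigned block_factor;
};

/* Crude stand-in for lang_hooks.types_compatible_p(): two
   pointers-to-shared are treated as compatible when their blocking
   factors match.  This rule is an assumption of the sketch.  */
static bool
sketch_types_compatible_p (const struct ptr_type_sketch *a,
                           const struct ptr_type_sketch *b)
{
  return a->block_factor == b->block_factor;
}

/* Return true if the UPC-specific checks require the conversion from
   INNER to OUTER to be retained.  */
static bool
sketch_must_retain_conversion_p (const struct ptr_type_sketch *outer,
                                 const struct ptr_type_sketch *inner)
{
  assert (outer != NULL && inner != NULL);

  /* Pointer-to-shared converted to a regular C pointer.  */
  if (!outer->target_shared && inner->target_shared)
    return true;

  /* Conversion between incompatible pointers-to-shared.  */
  if (outer->target_shared && inner->target_shared
      && !sketch_types_compatible_p (inner, outer))
    return true;

  return false;
}
```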

In gimplify.c, flag_instrument_functions_exclude_p() is exported
as an external function rather than being defined as a static function.
It is called from upc_genericize_function() defined in c/c-upc-low.c,
when -fupc-instrument-functions is asserted.

2015-11-30  Gary Funck  

gcc/
* gimple-expr.c: #include "langhooks.h".
(useless_type_conversion_p): Retain conversions from a UPC
pointer-to-shared to a regular C pointer.
Retain conversions between incompatible UPC pointers-to-shared.
Call lang_hooks.types_compatible_p() to check type
compatibility between UPC pointers-to-shared.
* gimplify.c (flag_instrument_functions_exclude_p): Make it into
an external function.
* gimplify.h (flag_instrument_functions_exclude_p): New prototype.

Index: gcc/gimple-expr.c
===
--- gcc/gimple-expr.c   (.../trunk) (revision 231059)
+++ gcc/gimple-expr.c   (.../branches/gupc) (revision 231080)
@@ -29,6 +29,7 @@ along with GCC; see the file COPYING3.
 #include "gimple-ssa.h"
 #include "fold-const.h"
 #include "tree-eh.h"
+#include "langhooks.h"
 #include "gimplify.h"
 #include "stor-layout.h"
 #include "demangle.h"
@@ -67,6 +68,19 @@ useless_type_conversion_p (tree outer_ty
   if (POINTER_TYPE_P (inner_type)
   && POINTER_TYPE_P (outer_type))
 {
+  int i_shared = SHARED_TYPE_P (TREE_TYPE (inner_type));
+  int o_shared = SHARED_TYPE_P (TREE_TYPE (outer_type));
+
+  /* Retain conversions from a UPC shared pointer to
+ a regular C pointer.  */
+  if (!o_shared && i_shared)
+return false;
+
+  /* Retain conversions between incompatible UPC shared pointers.  */
+  if (o_shared && i_shared
+ && !lang_hooks.types_compatible_p (inner_type, outer_type))
+return false;
+
   /* Do not lose casts between pointers to different address spaces.  */
   if (TYPE_ADDR_SPACE (TREE_TYPE (outer_type))
  != TYPE_ADDR_SPACE (TREE_TYPE (inner_type)))
Index: gcc/gimplify.c
===
--- gcc/gimplify.c  (.../trunk) (revision 231059)
+++ gcc/gimplify.c  (.../branches/gupc) (revision 231080)
@@ -11269,7 +11269,7 @@ typedef char *char_p; /* For DEF_VEC_P.
 
 /* Return whether we should exclude FNDECL from instrumentation.  */
 
-static bool
+bool
 flag_instrument_functions_exclude_p (tree fndecl)
 {
   vec<char_p> *v;
Index: gcc/gimplify.h
===
--- gcc/gimplify.h  (.../trunk) (revision 231059)
+++ gcc/gimplify.h  (.../branches/gupc) (revision 231080)
@@ -77,6 +77,7 @@ extern enum gimplify_status gimplify_exp
 extern void gimplify_type_sizes (tree, gimple_seq *);
 extern void gimplify_one_sizepos (tree *, gimple_seq *);
 extern gbind *gimplify_body (tree, bool);
+extern bool flag_instrument_functions_exclude_p (tree);
 extern enum gimplify_status gimplify_arg (tree *, gimple_seq *, location_t);
 extern void gimplify_function_tree (tree);
 extern enum gimplify_status gimplify_va_arg_expr (tree *, gimple_seq *,


[UPC 05/22] language hooks changes

2015-11-30 Thread Gary Funck

Overview


Two new UPC-specific 'decl' language hooks are defined and then called from
layout_decl() in stor-layout.c.  The layout_decl_p() hook tests whether
the declaration is a UPC shared array declaration that requires special
handling.  If it is, then the layout_decl() hook is called.

A few new UPC-specific language hooks are defined in a 'upc' sub-structure
of the language hooks structure.  They are defined as
hooks because they are called from code in the 'c-family/' directory,
but are implemented in the 'c/' directory.
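
The hook arrangement follows the usual table-of-function-pointers
pattern: generic code calls through the table, a do-nothing default
is installed for every language, and the C front-end overrides it.
The sketch below is a simplified model with hypothetical names; the
real hooks take two tree arguments, and the sketch assumes that the
generic layout is skipped once the hook has handled the declaration,
which the excerpt below does not show.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy declaration node.  */
struct decl_sketch
{
  bool shared_array;      /* UPC shared array declaration?  */
  bool laid_out_by_hook;  /* set by the language-specific layout */
  unsigned size;          /* set by the generic layout */
};

/* Toy hook table, modelling the two new 'decl' hooks.  */
struct decl_hooks_sketch
{
  bool (*layout_decl_p) (const struct decl_sketch *);
  void (*layout_decl) (struct decl_sketch *);
};

/* Defaults used by every front-end that does not override them.  */
static bool
sketch_default_layout_decl_p (const struct decl_sketch *decl)
{
  (void) decl;
  return false;
}

static void
sketch_default_layout_decl (struct decl_sketch *decl)
{
  (void) decl;
}

/* Overrides that a UPC-aware front-end would install.  */
static bool
sketch_upc_layout_decl_p (const struct decl_sketch *decl)
{
  return decl->shared_array;
}

static void
sketch_upc_layout_decl (struct decl_sketch *decl)
{
  decl->laid_out_by_hook = true;
}

/* Generic code: ask the predicate hook first, and fall back to the
   generic layout only when it declines.  */
static void
sketch_layout (const struct decl_hooks_sketch *hooks,
               struct decl_sketch *decl, unsigned generic_size)
{
  if (hooks->layout_decl_p (decl))
    {
      hooks->layout_decl (decl);
      return;
    }
  decl->size = generic_size;
}
```

With the default table every declaration takes the generic path, so
front-ends other than C are unaffected by the new hooks.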

2015-11-30  Gary Funck  

gcc/
* langhooks-def.h (lhd_do_nothing_b, lhd_do_nothing_t_t):
New do nothing hook prototypes.
(LANG_HOOKS_UPC_TOGGLE_KEYWORDS,
LANG_HOOKS_UPC_PTS_INIT_TYPE, LANG_HOOKS_UPC_BUILD_INIT_FUNC,
LANG_HOOKS_UPC_WRITE_GLOBAL_DECLS): New default UPC hooks.
* langhooks-def.h (LANG_HOOKS_LAYOUT_DECL_P, LANG_HOOKS_LAYOUT_DECL):
New language hook defaults.
(LANG_HOOKS_UPC): New.  Define UPC hooks structure.
* langhooks.c (lhd_do_nothing_b, lhd_do_nothing_t_t):
New do nothing hooks.
* langhooks.h (layout_decl_p, layout_decl): New language hooks.
(lang_hooks_for_upc): New UPC language hooks structure.
* stor-layout.c (layout_decl): Call the layout_decl_p() and
layout_decl() hooks.
gcc/c/
* c-lang.c: #include "c-upc-lang.h".
#include "c-upc-low.h".
(LANG_HOOKS_UPC_TOGGLE_KEYWORDS, LANG_HOOKS_UPC_PTS_INIT_TYPE,
LANG_HOOKS_UPC_BUILD_INIT_FUNC, LANG_HOOKS_UPC_WRITE_GLOBAL_DECLS,
LANG_HOOKS_LAYOUT_DECL_P, LANG_HOOKS_LAYOUT_DECL):
Override defaults.  Define UPC-specific hook routines.
* c-upc-lang.c: New.  Implement UPC-specific hook routines.
* c-upc-lang.h: New.  Define UPC-specific hook prototypes.

Index: gcc/langhooks-def.h
===
--- gcc/langhooks-def.h (.../trunk) (revision 231059)
+++ gcc/langhooks-def.h (.../branches/gupc) (revision 231080)
@@ -35,7 +35,9 @@ struct diagnostic_info;
 /* See langhooks.h for the definition and documentation of each hook.  */
 
 extern void lhd_do_nothing (void);
+extern void lhd_do_nothing_b (bool);
 extern void lhd_do_nothing_t (tree);
+extern void lhd_do_nothing_t_t (tree, tree);
 extern void lhd_do_nothing_f (struct function *);
 extern tree lhd_pass_through_t (tree);
 extern bool lhd_post_options (const char **);
@@ -175,6 +177,10 @@ extern tree lhd_make_node (enum tree_cod
 #define LANG_HOOKS_GET_SUBRANGE_BOUNDS NULL
 #define LANG_HOOKS_DESCRIPTIVE_TYPE NULL
 #define LANG_HOOKS_RECONSTRUCT_COMPLEX_TYPE reconstruct_complex_type
+#define LANG_HOOKS_UPC_TOGGLE_KEYWORDS  lhd_do_nothing_b
+#define LANG_HOOKS_UPC_PTS_INIT_TYPE  lhd_do_nothing
+#define LANG_HOOKS_UPC_BUILD_INIT_FUNC lhd_do_nothing_t
+#define LANG_HOOKS_UPC_WRITE_GLOBAL_DECLS lhd_do_nothing
 #define LANG_HOOKS_ENUM_UNDERLYING_BASE_TYPE lhd_enum_underlying_base_type
 
 #define LANG_HOOKS_FOR_TYPES_INITIALIZER { \
@@ -219,6 +225,8 @@ extern tree lhd_make_node (enum tree_cod
 #define LANG_HOOKS_OMP_CLAUSE_LINEAR_CTOR NULL
 #define LANG_HOOKS_OMP_CLAUSE_DTOR hook_tree_tree_tree_null
 #define LANG_HOOKS_OMP_FINISH_CLAUSE lhd_omp_finish_clause
+#define LANG_HOOKS_LAYOUT_DECL_P hook_bool_tree_tree_false
+#define LANG_HOOKS_LAYOUT_DECL lhd_do_nothing_t_t
 
 #define LANG_HOOKS_DECLS { \
   LANG_HOOKS_GLOBAL_BINDINGS_P, \
@@ -243,7 +251,9 @@ extern tree lhd_make_node (enum tree_cod
   LANG_HOOKS_OMP_CLAUSE_ASSIGN_OP, \
   LANG_HOOKS_OMP_CLAUSE_LINEAR_CTOR, \
   LANG_HOOKS_OMP_CLAUSE_DTOR, \
-  LANG_HOOKS_OMP_FINISH_CLAUSE \
+  LANG_HOOKS_OMP_FINISH_CLAUSE, \
+  LANG_HOOKS_LAYOUT_DECL_P, \
+  LANG_HOOKS_LAYOUT_DECL \
 }
 
 /* LTO hooks.  */
@@ -261,6 +271,13 @@ extern void lhd_end_section (void);
   LANG_HOOKS_END_SECTION \
 }
 
+#define LANG_HOOKS_UPC { \
+  LANG_HOOKS_UPC_TOGGLE_KEYWORDS, \
+  LANG_HOOKS_UPC_PTS_INIT_TYPE, \
+  LANG_HOOKS_UPC_BUILD_INIT_FUNC, \
+  

[UPC 13/22] C++ changes

2015-11-30 Thread Gary Funck

Overview


Although UPC is an extension to "C" and not "C++", these changes
are needed to accommodate changes to the common tree-related
code that handles qualified types, and to accommodate UPC's
"layout qualifier" (blocking factor).

In tree.h, check_qualified_type() was changed to accept an
extra argument, block_factor.

/* Check whether CAND is suitable to be returned from get_qualified_type
   (BASE, TYPE_QUALS, BLOCK_FACTOR).  */

extern bool check_qualified_type (const_tree cand, const_tree base,
  int type_quals, tree block_factor);

and the c_build_qualified_type() procedure was renamed to
c_build_qualified_type_1().  c_build_qualified_type was changed
into a macro.

/* Return a version of the TYPE, qualified as indicated by the
   TYPE_QUALS and BLOCK_FACTOR, if one exists.
   If no qualified version exists yet, return NULL_TREE.  */

extern tree get_qualified_type_1 (tree type, int type_quals,
  tree block_factor);
#define get_qualified_type(TYPE, QUALS) \
  get_qualified_type_1 (TYPE, QUALS, 0)

This patch adjusts the C++ front-end so that it works with
the changes described above.
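
The rename-plus-macro technique keeps existing two-argument callers
compiling while new callers can pass a blocking factor.  A minimal
model of it, with hypothetical names and a toy table in place of the
real type variants, might look like this:

```c
#include <assert.h>
#include <stddef.h>

/* Toy type variant: qualifier bits plus a blocking factor, where 0
   means "no blocking factor".  */
struct variant_sketch
{
  int quals;
  int block_factor;
};

static const struct variant_sketch sketch_variants[] = {
  { 0, 0 },
  { 1, 0 },
  { 1, 4 },
};

/* Extended lookup: return the index of the variant matching both
   QUALS and BLOCK_FACTOR, or -1 if no such variant exists yet.  */
static int
sketch_get_qualified_1 (int quals, int block_factor)
{
  size_t n = sizeof (sketch_variants) / sizeof (sketch_variants[0]);
  for (size_t i = 0; i < n; i++)
    if (sketch_variants[i].quals == quals
        && sketch_variants[i].block_factor == block_factor)
      return (int) i;
  return -1;
}

/* The old two-argument spelling survives as a macro that supplies a
   null blocking factor, so existing callers need no changes.  */
#define sketch_get_qualified(QUALS) \
  sketch_get_qualified_1 (QUALS, 0)
```

Callers that know nothing about blocking factors, such as the C++
front-end, keep the old behaviour because the extra argument is
always null for them.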

2015-11-30  Gary Funck  

gcc/cp/
* lex.c (init_reswords): Disable UPC keywords.
* tree.c (c_build_qualified_type_1): Rename.
Was: c_build_qualified_type.  
(cp_check_qualified_type): Adjust call to check_qualified_type
to pass a null UPC blocking factor.
Index: gcc/cp/lex.c
===
--- gcc/cp/lex.c(.../trunk) (revision 231059)
+++ gcc/cp/lex.c(.../branches/gupc) (revision 231080)
@@ -179,6 +179,9 @@ init_reswords (void)
   /* The Objective-C keywords are all context-dependent.  */
   mask |= D_OBJC;
 
+  /* UPC constructs are not supported in C++.  */
+  mask |= D_UPC;
+
   ridpointers = ggc_cleared_vec_alloc ((int) RID_MAX);
   for (i = 0; i < num_c_common_reswords; i++)
 {
Index: gcc/cp/tree.c
===
--- gcc/cp/tree.c   (.../trunk) (revision 231059)
+++ gcc/cp/tree.c   (.../branches/gupc) (revision 231080)
@@ -995,7 +995,8 @@ move (tree expr)
the C version of this function does not properly maintain canonical
types (which are not used in C).  */
 tree
-c_build_qualified_type (tree type, int type_quals)
+c_build_qualified_type_1 (tree type, int type_quals,
+ tree ARG_UNUSED (layout_qualifier))
 {
   return cp_build_qualified_type (type, type_quals);
 }
@@ -1867,7 +1868,7 @@ static bool
 cp_check_qualified_type (const_tree cand, const_tree base, int type_quals,
 cp_ref_qualifier rqual, tree raises)
 {
-  return (check_qualified_type (cand, base, type_quals)
+  return (check_qualified_type (cand, base, type_quals, NULL_TREE)
  && comp_except_specs (raises, TYPE_RAISES_EXCEPTIONS (cand),
ce_exact)
  && type_memfn_rqual (cand) == rqual);


[UPC 14/22] constant folding changes

2015-11-30 Thread Gary Funck

Background
--

An overview email, describing the UPC-related changes is here:
  https://gcc.gnu.org/ml/gcc-patches/2015-12/msg5.html

The GUPC branch is described here:
  http://gcc.gnu.org/projects/gupc.html

The UPC-related source code differences are summarized here:
  http://gccupc.org/gupc-changes

All languages (c, c++, fortran, go, lto, objc, obj-c++) have been
bootstrapped; no test suite regressions were introduced,
relative to the GCC trunk.

If you are on the cc-list, your name was chosen either
because you are listed as a maintainer for the area that
applies to the patches described in this email, or you
were a frequent contributor of patches made to files listed
in this email.

In the change log entries included in each patch, the directory
containing the affected files is listed, followed by the files.
When the patches are applied, the change log entries will be
distributed to the appropriate ChangeLog file.

Overview


UPC pointers-to-shared (aka shared pointers) are not interchangeable
with integers as they are in regular "C".  Therefore, addition
and subtraction operations that involve UPC shared pointers
should not be further simplified.
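
To illustrate why integer-style folding is unsafe here, the following
sketch adds an element offset to a pointer-to-shared.  It assumes a
representation with address, thread, and phase fields and follows the
phase/thread rules of the UPC language specification for a definite,
non-zero blocking factor and a non-negative offset.  The struct layout,
field names, and demo_* helper are assumptions made for illustration,
not the GUPC runtime's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pointer-to-shared representation.  */
typedef struct
{
  uint64_t vaddr;   /* offset within the owning thread's shared area */
  uint32_t thread;  /* thread that has affinity to the element */
  uint32_t phase;   /* position within the current block */
} demo_pts;

/* Advance P by N elements of ELEM_SIZE bytes for a layout with BLOCK
   elements per block distributed round-robin over THREADS threads.  */
static demo_pts
demo_pts_add (demo_pts p, uint64_t n, uint32_t block,
              uint32_t elem_size, uint32_t threads)
{
  assert (block > 0 && threads > 0);
  assert (p.phase < block && p.thread < threads);

  uint64_t ph = p.phase + n;          /* phase before wrapping */
  uint64_t blocks = ph / block;       /* whole blocks crossed */
  uint64_t th = p.thread + blocks;    /* thread before wrapping */
  uint64_t rows = th / threads;       /* wraps past the last thread */

  demo_pts r;
  r.phase = (uint32_t) (ph % block);
  r.thread = (uint32_t) (th % threads);
  /* The local address only moves by the phase change plus one block
     per full trip around all threads.  */
  r.vaddr = p.vaddr + rows * block * elem_size
            + (uint64_t) r.phase * elem_size
            - (uint64_t) p.phase * elem_size;
  return r;
}
```

With block = 2, elem_size = 4 and threads = 3, advancing element 0 by 7
lands on thread 0 at local offset 12, whereas plain integer arithmetic
on the address field would give 28.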

2015-11-30  Gary Funck  

gcc/
* fold-const.c (fold_unary_loc): Do not perform this simplification
if either of the types are UPC pointer-to-shared types.
(fold_binary_loc): Disable optimizations involving UPC
pointers-to-shared because integers are not interoperable
with UPC pointers-to-shared.
* match.pd: Do not simplify POINTER_PLUS operations which
involve UPC pointers-to-shared.  Do not simplify integral
conversions involving UPC pointers-to-shared.  For a chain
of two conversions, do not simplify conversions involving
UPC pointers-to-shared unless they meet specific criteria.

Index: gcc/fold-const.c
===
--- gcc/fold-const.c(.../trunk) (revision 231059)
+++ gcc/fold-const.c(.../branches/gupc) (revision 231080)
@@ -7805,10 +7805,16 @@ fold_unary_loc (location_t loc, enum tre
 
   /* Convert (T1)(X p+ Y) into ((T1)X p+ Y), for pointer type, when the new
 cast (T1)X will fold away.  We assume that this happens when X itself
-is a cast.  */
+is a cast.
+
+Do not perform this simplification if either of the types 
+are UPC pointer-to-shared types.  */
   if (POINTER_TYPE_P (type)
  && TREE_CODE (arg0) == POINTER_PLUS_EXPR
- && CONVERT_EXPR_P (TREE_OPERAND (arg0, 0)))
+ && CONVERT_EXPR_P (TREE_OPERAND (arg0, 0))
+ && !SHARED_TYPE_P (TREE_TYPE (type))
+ && !SHARED_TYPE_P (TREE_TYPE (
+  TREE_TYPE (TREE_OPERAND (arg0, 0)))))
{
  tree arg00 = TREE_OPERAND (arg0, 0);
  tree arg01 = TREE_OPERAND (arg0, 1);
@@ -9271,6 +9277,14 @@ fold_binary_loc (location_t loc,
   return NULL_TREE;
 
 case PLUS_EXPR:
+  /* Disable further optimizations involving UPC shared pointers,
+ because integers are not interoperable with shared pointers.  */
+  if ((TREE_TYPE (arg0) && POINTER_TYPE_P (TREE_TYPE (arg0))
+  && SHARED_TYPE_P (TREE_TYPE (TREE_TYPE (arg0))))
+ || (TREE_TYPE (arg1) && POINTER_TYPE_P (TREE_TYPE (arg1))
+ && SHARED_TYPE_P (TREE_TYPE (TREE_TYPE (arg1)))))
+return NULL_TREE;
+
   if (INTEGRAL_TYPE_P (type) || VECTOR_INTEGER_TYPE_P (type))
{
  /* X + (X / CST) * -CST is X % CST.  */
@@ -9679,6 +9693,16 @@ fold_binary_loc (location_t loc,
   return NULL_TREE;
 
 case MINUS_EXPR:
+
+  /* Disable further optimizations involving UPC shared pointers,
+ because integers are not interoperable with shared pointers.
+(The test below also detects pointer difference between
+shared pointers, which cannot be folded.  */
+
+  if (TREE_TYPE (arg0) && POINTER_TYPE_P (TREE_TYPE (arg0))
+  && SHARED_TYPE_P (TREE_TYPE (TREE_TYPE (arg0))))
+return NULL_TREE;
+
   /* (-A) - B -> (-B) - A  where B is easily negated and we can swap.  */
   if (TREE_CODE (arg0) == NEGATE_EXPR
  && negate_expr_p (op1)
Index: gcc/match.pd
===
--- gcc/match.pd(.../trunk) (revision 231059)
+++ gcc/match.pd(.../branches/gupc) (revision 231080)
@@ -931,10 +931,13 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
(if (!fail && wi::bit_and (@1, zero_mask_not) == 0)
 (inner_op @2 { wide_int_to_tree (type, cst_emit); }))
 
-/* Associate (p +p off1) +p off2 as (p +p (off1 + off2)).  */
-(simplify
-  (pointer_plus (pointer_plus:s @0 @1) @3)
-  (pointer_plus @0 (plus @1 @3)))
+/* Associate (p +p off1) +p off2 as (p +p (off1 + off2)).
+   (Do not apply this simplification to UPC pointers-to-shared
+   because they are not 

[UPC 03/22] options processing, driver

2015-11-30 Thread Gary Funck

Overview


UPC language support requires some extensions to the GCC driver.
Most of the new UPC-specific spec's will be triggered by the presence
of -fupc on the gcc command line.  Further, -fupc will be asserted
when source files ending in .upc are compiled.
The linker spec, LINK_COMMAND_SPEC, is extended to bring in
UPC start/end files and to link with libgupc when -fupc is asserted.

Some new UPC-specific command line options are defined in c.opt.
These new UPC-specific options will only have an effect when -fupc
is asserted, and will be detected as an error otherwise.

c_common_parse_file() will call the UPC-specific language hook,
lang_hooks.upc.write_global_declarations() just after
parsing the source file via c_parse_file().  It is called
there to generate an initialization routine
within the scope of the current compilation unit.
This initialization routine will initialize file scope
UPC shared variables, and initialize pointers-to-shared
as needed.  The address of this initialization routine 
is placed in a special linker section named by
targetm.upc.init_array_section_name().

A new UPC-specific driver program called 'gupc' is implemented by
gcc/c/gupcspec.c.  It will be installed as both a 'gupc' and 'upc'
executable ('upc' is a symlink to 'gupc').  This is a convenience
driver similar to gfortran.  It asserts -fupc and
will cause .c source files to be compiled as UPC source files.
This implicit handling of .c files as .upc files provides compatibility
with other UPC compilers.

2015-11-30  Gary Funck  

gcc/
* gcc.c (upc_crtbegin_spec, link_upc_spec, upc_crtend_spec,
upc_options): New.  Define UPC-related spec's.
(default_compilers): Add support for .upc files.
(static_specs): Initialize UPC-specific spec's.
(LINK_COMMAND_SPEC): Add UPC-specific linker spec's.
* timevar.def (TV_TREE_UPC_GENERICIZE): New.  Define a new
time variable for the 'UPC genericize' pass.
gcc/c-family/
* c-opts.c: #include "c-upc-pts.h" to bring in UPC_MAX_THREADS.
(upc_init_options, upc_handle_option): New.
(c_common_init_options):
Call upc_init_options() if -fupc is asserted.
(c_common_handle_option): Call upc_handle_option
to handle UPC-specific options.
(c_common_parse_file):
Call lang_hooks.upc.write_global_declarations() if -fupc is asserted.
* c.opt (dwarf-2-upc, fupc, fupc-debug, fupc-inline-lib,
fupc-pre-include, fupc-pthreads-model-tls, fupc-threads,
fupc-instrument, fupc-instrument-functions): New.
Define UPC-specific command line options.
gcc/c/
* gupcspec.c: New.  Implement the 'gupc' driver program.

Index: gcc/gcc.c
===
--- gcc/gcc.c   (.../trunk) (revision 231059)
+++ gcc/gcc.c   (.../branches/gupc) (revision 231080)
@@ -1016,16 +1016,20 @@ proper position among the other output f
 %{flto} %{fno-lto} %{flto=*} %l " LINK_PIE_SPEC \
"%{fuse-ld=*:-fuse-ld=%*} " LINK_COMPRESS_DEBUG_SPEC \
"%X %{o*} %{e*} %{N} %{n} %{r}\
-%{s} %{t} %{u*} %{z} %{Z} %{!nostdlib:%{!nostartfiles:%S}} \
+%{s} %{t} %{u*} %{z} %{Z}\
+%{!nostdlib:%{!nostartfiles:%{fupc:%:include(upc-crtbegin.spec)%(upc_crtbegin)}}}\
+%{!nostdlib:%{!nostartfiles:%S}} \
 %{static:} %{L*} %(mfwrap) %(link_libgcc) " \
 VTABLE_VERIFICATION_SPEC " " SANITIZER_EARLY_SPEC " %o " CHKP_SPEC " \
 %{fopenacc|fopenmp|%:gt(%{ftree-parallelize-loops=*} 1):\
%:include(libgomp.spec)%(link_gomp)}\
 %{fcilkplus:%:include(libcilkrts.spec)%(link_cilkrts)}\
 %{fgnu-tm:%:include(libitm.spec)%(link_itm)}\
+%{fupc:%:include(libgupc.spec)%(link_upc)}\
 %(mflib) " STACK_SPLIT_SPEC "\
 %{fprofile-arcs|fprofile-generate*|coverage:-lgcov} " SANITIZER_SPEC " \
 %{!nostdlib:%{!nodefaultlibs:%(link_ssp) %(link_gcc_c_sequence)}}\
+

[UPC 18/22] libatomic changes

2015-11-30 Thread Gary Funck

Overview


The UPC language specification defines atomic operations on
UPC shared data, implemented by a set of library routines.

The UPC runtime library, targeting SMP (symmetric multiprocessor) systems,
uses GCC builtin atomic operations to implement atomic operations
on UPC shared values.  GCC's builtin atomic operations use libatomic
to handle various situations where direct hardware support is unavailable.
During testing, we noticed that when some operations or types are
unsupported, the library calls an internal lock routine, and
this lock routine calls pthread_mutex_lock().  That doesn't work well for
UPC because by default a UPC "thread" maps to an OS process.
We discussed this issue in this bug report:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60790.

To work around this locking issue, we build a statically linked
"convenience" library, libatomic_convenience_no_lock.a.
This is the same as the libatomic_convenience library built for libgo,
except it doesn't include lock.o.  In libgupc/smp, the source
file upc_libat_lock.c defines the same entry points as lock.c,
but implements them using a spin lock.
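
As a rough illustration of the spin-lock approach, the sketch below
builds a lock from a C11 atomic flag.  Because it relies only on atomic
operations on the flag's own memory, such a lock can in principle live
in memory shared between processes.  The demo_* names are hypothetical;
this is not the code in upc_libat_lock.c and does not reproduce
libatomic's entry points.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical spin lock; the flag is set while the lock is held.  */
typedef struct
{
  atomic_flag held;
} demo_spinlock;

static void
demo_spin_init (demo_spinlock *l)
{
  assert (l != NULL);
  atomic_flag_clear (&l->held);
}

/* Try once; return true if the lock was acquired.  */
static bool
demo_spin_trylock (demo_spinlock *l)
{
  return !atomic_flag_test_and_set_explicit (&l->held,
                                             memory_order_acquire);
}

/* Busy-wait until the lock is acquired.  */
static void
demo_spin_lock (demo_spinlock *l)
{
  while (!demo_spin_trylock (l))
    ;
}

static void
demo_spin_unlock (demo_spinlock *l)
{
  atomic_flag_clear_explicit (&l->held, memory_order_release);
}
```

A real implementation would likely add a pause or back-off in the wait
loop; that is omitted here to keep the sketch short.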

2015-11-30  Gary Funck  

libatomic/
* Makefile.am (LIBAT_SRC_NO_LOCK, libatomic_convenience_no_lock*):
New.  Add rules to build libatomic_convenience_no_lock.a,
used by libgupc.
* Makefile.in: Re-generate.

Index: libatomic/Makefile.am
===
--- libatomic/Makefile.am   (.../trunk) (revision 231059)
+++ libatomic/Makefile.am   (.../branches/gupc) (revision 231080)
@@ -40,7 +40,8 @@ AM_CCASFLAGS = $(XCFLAGS)
 AM_LDFLAGS = $(XLDFLAGS) $(SECTION_LDFLAGS) $(OPT_LDFLAGS)
 
 toolexeclib_LTLIBRARIES = libatomic.la
-noinst_LTLIBRARIES = libatomic_convenience.la
+noinst_LTLIBRARIES = libatomic_convenience.la \
+ libatomic_convenience_no_lock.la 
 
 if LIBAT_BUILD_VERSIONED_SHLIB
 if LIBAT_BUILD_VERSIONED_SHLIB_GNU
@@ -67,8 +68,9 @@ endif
 libatomic_version_info = -version-info $(libtool_VERSION)
 
 libatomic_la_LDFLAGS = $(libatomic_version_info) $(libatomic_version_script) $(lt_host_flags)
-libatomic_la_SOURCES = gload.c gstore.c gcas.c gexch.c glfree.c lock.c init.c \
+LIBAT_SRC_NO_LOCK = gload.c gstore.c gcas.c gexch.c glfree.c init.c \
fenv.c fence.c flag.c
+libatomic_la_SOURCES = $(LIBAT_SRC_NO_LOCK) lock.c
 
 SIZEOBJS = load store cas exch fadd fsub fand fior fxor fnand tas
 SIZES = @SIZES@
@@ -139,3 +141,9 @@ endif
 
 libatomic_convenience_la_SOURCES = $(libatomic_la_SOURCES)
 libatomic_convenience_la_LIBADD = $(libatomic_la_LIBADD)
+
+# The "no lock" convenience library is used by libgupc to
+# avoid lock.c's use of pthread_mutex, which won't work
+# for processes using atomics on shared memory.
+libatomic_convenience_no_lock_la_SOURCES = $(LIBAT_SRC_NO_LOCK)
+libatomic_convenience_no_lock_la_LIBADD = $(libatomic_la_LIBADD)


[UPC 12/22] DWARF support

2015-11-30 Thread Gary Funck

Overview


The DWARF4 specification defines extensions which add support for UPC.
See: http://dwarfstd.org/doc/DWARF4.pdf for details.  These extensions
are defined in /include/dwarf2.def.  The patch below
implements UPC debugging support.  This support is enabled via
the -dwarf-2-upc compilation switch.  It is not enabled by default,
because some older versions of GDB will abort when encountering
the UPC-related DWARF extensions.
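
The blocking-factor handling in the patch below can be summarized by a
small standalone sketch: a missing blocking factor defaults to 1, and a
zero (indefinite) blocking factor produces no DW_AT_count attribute.
The function name and parameters are hypothetical; only the decision
logic mirrors the dwarf2out.c hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Decide whether a DW_AT_count attribute should be emitted for a
   shared type.  Returns true and stores the value in *COUNT when the
   attribute is needed, false when it must be omitted.  */
static bool
demo_upc_shared_count (bool has_block_factor, unsigned long block_factor,
                       unsigned long *count)
{
  assert (count != NULL);

  /* "shared int x;" has no explicit factor, which means 1.  */
  unsigned long effective = has_block_factor ? block_factor : 1;

  /* "shared [] int *p;" has an indefinite (zero) factor: no count.  */
  if (effective == 0)
    return false;

  *count = effective;
  return true;
}
```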

A few years back, we added support to GDB for UPC, though that
support was experimental and not pushed back into the mainline.
A couple of commercial parallel debuggers implemented support
for GNU UPC, utilizing these DWARF extensions.

2015-11-30  Gary Funck  

gcc/
* dwarf2out.c (modified_type_die): If the type is shared qualified,
generate UPC debugging information as defined in
the DWARF4 specification.
(add_subscript_info): If the array index is "THREADS scaled",
add the DW_AT_upc_threads_scaled attribute to the subrange DIE.
(gen_compile_unit_die): If -fupc is asserted,
set the language to DW_LANG_Upc.

Index: gcc/dwarf2out.c
===
--- gcc/dwarf2out.c (.../trunk) (revision 231059)
+++ gcc/dwarf2out.c (.../branches/gupc) (revision 231080)
@@ -10899,6 +10899,50 @@ modified_type_die (tree type, int cv_qua
mod_type_die = d;
  }
 }
+  else if (use_upc_dwarf2_extensions
+   && (cv_quals & TYPE_QUAL_SHARED))
+{
+  HOST_WIDE_INT block_factor = 1;
+
+  /* Inside the compiler,
+ "shared int x;" TYPE_BLOCK_FACTOR is null.
+ "shared [] int *p;" TYPE_BLOCK_FACTOR is zero.
+ "shared [10] int x[50];" TYPE_BLOCK_FACTOR is 10 * bitsize(int)
+ The DWARF2 encoding is as follows:
+ "shared int x;"  DW_AT_count: 1
+ "shared [] int *p;" (no DW_AT_count attribute)
+ "shared [10] int x[50];" DW_AT_count: 10
+ The logic below handles these various contingencies.  */
+
+  mod_type_die = new_die (DW_TAG_upc_shared_type,
+  comp_unit_die (), type);
+
+  if (TYPE_HAS_BLOCK_FACTOR (type))
+block_factor = TREE_INT_CST_LOW (TYPE_BLOCK_FACTOR (type));
+
+  if (block_factor != 0)
+add_AT_unsigned (mod_type_die, DW_AT_count, block_factor);
+
+  sub_die = modified_type_die (type,
+   cv_quals & ~TYPE_QUAL_SHARED,
+   context_die);
+}
+  else if (use_upc_dwarf2_extensions && cv_quals & TYPE_QUAL_STRICT)
+{
+  mod_type_die = new_die (DW_TAG_upc_strict_type,
+  comp_unit_die (), type);
+  sub_die = modified_type_die (type,
+   cv_quals & ~TYPE_QUAL_STRICT,
+   context_die);
+}
+  else if (use_upc_dwarf2_extensions && cv_quals & TYPE_QUAL_RELAXED)
+{
+  mod_type_die = new_die (DW_TAG_upc_relaxed_type,
+  comp_unit_die (), type);
+  sub_die = modified_type_die (type,
+   cv_quals & ~TYPE_QUAL_RELAXED,
+   context_die);
+}
   else if (code == POINTER_TYPE || code == REFERENCE_TYPE)
 {
   dwarf_tag tag = DW_TAG_pointer_type;
@@ -16992,6 +17036,12 @@ add_subscript_info (dw_die_ref type_die,
   if (!subrange_die)
subrange_die = new_die (DW_TAG_subrange_type, type_die, NULL);
 
+
+  if (use_upc_dwarf2_extensions && TYPE_HAS_THREADS_FACTOR (type))
+{
+ add_AT_flag (subrange_die, DW_AT_upc_threads_scaled, 1);
+   }
+
   if (domain)
{
  /* We have an array type with specified bounds.  */
@@ -20279,6 +20329,10 @@ gen_compile_unit_die (const char *filena
  if (dwarf_version >= 5 /* || !dwarf_strict */)
if (strcmp (language_string, "GNU C11") == 0)
  language = DW_LANG_C11;
+
+  if (use_upc_dwarf2_extensions && flag_upc)
+language 

[UPC 10/22] target - rs6000

2015-11-30 Thread Gary Funck

Overview


UPC pointers-to-shared have an internal representation that is a 'struct'.
GCC generally assumes that pointers can be targeted into registers.
However, various ABI's will special case how struct's are passed.

In order to ensure that UPC pointers-to-shared can be passed to
the UPC runtime, which is written in "C", the convention used
to pass the UPC pointer-to-shared must agree with that of a struct,
because the runtime will describe the internal representation
as a struct.

The code below checks for UPC pointers-to-shared that are
represented as an aggregate type and ensures that these
pointers are passed as struct's.
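
For illustration, the runtime-side view of such a pointer might look
like the sketch below: a 16-byte, 16-byte-aligned struct passed by
value to a runtime routine.  The field names, their sizes, and the
demo_* function are assumptions made for this example; the point is
only that the compiler must pass the value using the same convention
the C runtime expects for this struct.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical "struct" representation of a pointer-to-shared.  */
typedef struct
{
  _Alignas (16) uint64_t vaddr;
  uint32_t thread;
  uint32_t phase;
} demo_pts_rep;

/* The layout the compiler and the runtime must agree on.  */
_Static_assert (sizeof (demo_pts_rep) == 16, "128-bit representation");
_Static_assert (_Alignof (demo_pts_rep) == 16, "16-byte alignment");

/* Runtime routine written in plain C: it receives the pointer as a
   struct argument, so the caller must follow the struct ABI.  */
static uint32_t
demo_runtime_threadof (demo_pts_rep p, uint32_t threads)
{
  assert (p.thread < threads);
  return p.thread;
}
```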

2015-11-30  Gary Funck  

gcc/config/rs6000/
* rs6000.c (rs6000_return_in_memory):
If TYPE is a UPC PTS type with a "struct" internal representation,
handle it as an aggregate type.
(rs6000_function_arg_boundary): For UPC pointers-to-shared with
alignment > 64 that have an internal "struct" representation,
return 128 and skip the ABI warning.
(rs6000_pass_by_reference): If TYPE is a UPC PTS type with
a "struct" internal representation, handle it as an aggregate type.
(rs6000_pass_by_reference): Exclude UPC pointers-to-shared
from the logic that returns pointers in either SImode or DImode.

Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (.../trunk) (revision 231059)
+++ gcc/config/rs6000/rs6000.c  (.../branches/gupc) (revision 231080)
@@ -9709,12 +9709,21 @@ rs6000_return_in_memory (const_tree type
 NULL, NULL))
 return false;
 
+  /* TRUE if TYPE is a UPC pointer-to-shared type
+ and its underlying representation is an aggregate.  */
+  bool upc_struct_pts_p = (POINTER_TYPE_P (type)
+&& SHARED_TYPE_P (TREE_TYPE (type)))
+  && AGGREGATE_TYPE_P (upc_pts_rep_type_node);
+  /* If TYPE is a UPC struct PTS type, handle it as an aggregate type.  */
+  bool aggregate_p = AGGREGATE_TYPE_P (type)
+ || upc_struct_pts_p;
+
   /* The ELFv2 ABI returns aggregates up to 16B in registers */
-  if (DEFAULT_ABI == ABI_ELFv2 && AGGREGATE_TYPE_P (type)
+  if (DEFAULT_ABI == ABI_ELFv2 && aggregate_p
   && (unsigned HOST_WIDE_INT) int_size_in_bytes (type) <= 16)
 return false;
 
-  if (AGGREGATE_TYPE_P (type)
+  if (aggregate_p
   && (aix_struct_return
  || (unsigned HOST_WIDE_INT) int_size_in_bytes (type) > 8))
 return true;
@@ -10040,6 +10049,18 @@ rs6000_function_arg_boundary (machine_mo
|| DEFAULT_ABI == ABI_ELFv2)
   && type && TYPE_ALIGN (type) > 64)
 {
+
+  /* If the underlying UPC pointer-to-shared representation
+ is an aggregate, and TYPE is either a pointer-to-shared
+or the PTS representation type, then return 16-byte
+alignment and skip the ABI warning.  */
+  if (upc_pts_rep_type_node
+  && AGGREGATE_TYPE_P (upc_pts_rep_type_node)
+  && ((POINTER_TYPE_P (type)
+  && SHARED_TYPE_P (TREE_TYPE (type)))
+  || (TYPE_MAIN_VARIANT (type) == upc_pts_rep_type_node)))
+   return 128;
+
   /* "Aggregate" means any AGGREGATE_TYPE except for single-element
  or homogeneous float/vector aggregates here.  We already handled
  vector aggregates above, but still need to check for float here. */
@@ -11320,7 +11341,16 @@ rs6000_pass_by_reference (cumulative_arg
   return 1;
 }
 
-  if (DEFAULT_ABI == ABI_V4 && AGGREGATE_TYPE_P (type))
+  /* TRUE if TYPE is a UPC pointer-to-shared type
+ and its underlying representation is an aggregate.  */
+  bool upc_struct_pts_p = (POINTER_TYPE_P (type)
+ && SHARED_TYPE_P (TREE_TYPE (type)))
+   && AGGREGATE_TYPE_P (upc_pts_rep_type_node);
+  /* If TYPE is a UPC struct PTS type, handle it as an aggregate type.  */

[UPC 17/22] misc/common changes

2015-11-30 Thread Gary Funck

Overview


Given that UPC pointers-to-shared (PTS's) have special arithmetic rules
and their internal representation is a structure with
three separate fields, they are not meaningfully convertible to integers
and pointer arithmetic involving PTS's cannot be optimized in
the same fashion as normal "C" pointer arithmetic.  Further,
the representation of a NULL pointer-to-shared is different from
a "C" null pointer.  Logic has been added to convert.c and dojump.c
to handle operations involving UPC PTS's.  In function.c,
UPC pointers-to-shared which have an internal representation that
is a 'struct' are treated as aggregates.  Also in function.c
logic is added that prevents marking them as potential
pointer register values.

In varasm.c, a check is added for the linker section used by
UPC to coalesce file scoped UPC shared variables.  This section
is used only to assign offsets into UPC's shared data area for
the UPC shared variables.  When UPC linker scripts are supported,
this shared section is not loaded and has an origin of 0.

2015-11-30  Gary Funck  

gcc/
* convert.c (convert_to_pointer): Add check for null
UPC pointer-to-shared.
(convert_to_integer): Do not optimize pointer
subtraction for UPC pointers-to-shared.
(convert_to_integer): Issue error for an attempt
to convert a UPC pointer-to-shared to an integer.
* dojump.c (do_jump): If a UPC pointer-to-shared conversion
can change representation, it must be compared in the result type.
* function.c (aggregate_value_p): Handle 'struct' pointer-to-shared
values as an aggregate when passing them as a return value.
(assign_parm_setup_reg): Do not target UPC pointers-to-shared that are
represented as a 'struct' into a pointer register.
* varasm.c (default_section_type_flags): Handle UPC's shared
section as BSS, and if a UPC link script is supported,
make it a non-loadable, read-only section.

Index: gcc/convert.c
===
--- gcc/convert.c   (.../trunk) (revision 231059)
+++ gcc/convert.c   (.../branches/gupc) (revision 231080)
@@ -53,6 +53,14 @@ convert_to_pointer_1 (tree type, tree ex
   if (TREE_TYPE (expr) == type)
 return expr;
 
+  if (integer_zerop (expr) && POINTER_TYPE_P (type)
+  && SHARED_TYPE_P (TREE_TYPE (type)))
+{
+  expr = copy_node (upc_null_pts_node);
+  TREE_TYPE (expr) = build_unshared_type (type);
+  return expr;
+}
+
   switch (TREE_CODE (TREE_TYPE (expr)))
 {
 case POINTER_TYPE:
@@ -437,6 +445,16 @@ convert_to_integer_1 (tree type, tree ex
   return error_mark_node;
 }
 
+  /* Can't optimize the conversion of UPC shared pointer difference.  */
+  if (ex_form == MINUS_EXPR
+  && POINTER_TYPE_P (TREE_TYPE (TREE_OPERAND (expr, 0)))
+  && POINTER_TYPE_P (TREE_TYPE (TREE_OPERAND (expr, 1)))
+  && SHARED_TYPE_P (TREE_TYPE (TREE_TYPE (TREE_OPERAND (expr, 0))))
+  && SHARED_TYPE_P (TREE_TYPE (TREE_TYPE (TREE_OPERAND (expr, 1)))))
+  {
+  return build1 (CONVERT_EXPR, type, expr);
+  }
+
   if (ex_form == COMPOUND_EXPR)
 {
   tree t = convert_to_integer_1 (type, TREE_OPERAND (expr, 1), dofold);
@@ -581,6 +599,12 @@ convert_to_integer_1 (tree type, tree ex
 {
 case POINTER_TYPE:
 case REFERENCE_TYPE:
+  if (SHARED_TYPE_P (TREE_TYPE (intype)))
+{
+  error ("invalid conversion from a UPC pointer-to-shared "
+"to an integer");
+ expr = integer_zero_node;
+}
   if (integer_zerop (expr))
return build_int_cst (type, 0);
 
Index: gcc/dojump.c
===
--- gcc/dojump.c(.../trunk) (revision 231059)
+++ gcc/dojump.c(.../branches/gupc) (revision 231080)
@@ -468,6 +468,10 @@ do_jump (tree exp, rtx_code_label *if_fa
< TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (exp, 

-fstrict-aliasing fixes 2/5: drop alias set 0 streaming

2015-11-30 Thread Jan Hubicka
Hi,
this patch stops streaming the alias set 0 flag and adds a comment explaining why.

Bootstrapped/regtested x86_64-linux, OK?

Honza

* lto-streamer-out.c (hash_tree): Do not stream TYPE_ALIAS_SET.
* tree-streamer-out.c (pack_ts_type_common_value_fields): Do not
stream TYPE_ALIAS_SET.
* tree-streamer-in.c (unpack_ts_type_common_value_fields): Do not
stream TYPE_ALIAS_SET.

* lto.c (compare_tree_sccs_1): Do not compare TYPE_ALIAS_SET.
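
Note that the writer and the reader have to change together: the
stream is a sequence of unlabeled bits, so a field dropped on the
output side must also be dropped on the input side.  The sketch below
shows that symmetry with a toy bit packer.  The demo_* names and the
three-flag record are hypothetical and are not GCC's bitpack API.

```c
#include <assert.h>
#include <stdint.h>

/* Toy bit packer: fields are appended from bit 0 upwards.  */
typedef struct
{
  uint64_t word;
  unsigned pos;
} demo_bitpack;

static void
demo_pack (demo_bitpack *bp, uint64_t val, unsigned nbits)
{
  assert (nbits > 0 && nbits < 64 && bp->pos + nbits <= 64);
  bp->word |= (val & ((UINT64_C (1) << nbits) - 1)) << bp->pos;
  bp->pos += nbits;
}

static uint64_t
demo_unpack (demo_bitpack *bp, unsigned nbits)
{
  assert (nbits > 0 && nbits < 64 && bp->pos + nbits <= 64);
  uint64_t val = (bp->word >> bp->pos) & ((UINT64_C (1) << nbits) - 1);
  bp->pos += nbits;
  return val;
}

typedef struct
{
  unsigned restrict_p;
  unsigned user_align;
  unsigned readonly;
} demo_flags;

/* Writer: packs the flags in a fixed order.  */
static uint64_t
demo_stream_out (const demo_flags *f)
{
  demo_bitpack bp = { 0, 0 };
  demo_pack (&bp, f->restrict_p, 1);
  demo_pack (&bp, f->user_align, 1);
  demo_pack (&bp, f->readonly, 1);
  return bp.word;
}

/* Reader: must unpack exactly the same fields in the same order.  */
static demo_flags
demo_stream_in (uint64_t word)
{
  demo_bitpack bp = { word, 0 };
  demo_flags f;
  f.restrict_p = (unsigned) demo_unpack (&bp, 1);
  f.user_align = (unsigned) demo_unpack (&bp, 1);
  f.readonly = (unsigned) demo_unpack (&bp, 1);
  return f;
}
```

If the writer packed an extra one-bit field between user_align and
readonly and the reader did not consume it, every later field would be
read from the wrong bit position.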

Index: lto-streamer-out.c
===
--- lto-streamer-out.c  (revision 231081)
+++ lto-streamer-out.c  (working copy)
@@ -1109,10 +1109,6 @@ hash_tree (struct streamer_tree_cache_d
   hstate.commit_flag ();
   hstate.add_int (TYPE_PRECISION (t));
   hstate.add_int (TYPE_ALIGN (t));
-  hstate.add_int ((TYPE_ALIAS_SET (t) == 0
-|| (!in_lto_p
-&& get_alias_set (t) == 0))
-   ? 0 : -1);
 }
 
   if (CODE_CONTAINS_STRUCT (code, TS_TRANSLATION_UNIT_DECL))
Index: lto/lto.c
===
--- lto/lto.c   (revision 231081)
+++ lto/lto.c   (working copy)
@@ -1166,7 +1166,9 @@ compare_tree_sccs_1 (tree t1, tree t2, t
   compare_values (TYPE_READONLY);
   compare_values (TYPE_PRECISION);
   compare_values (TYPE_ALIGN);
-  compare_values (TYPE_ALIAS_SET);
+  /* Do not compare TYPE_ALIAS_SET.  Doing so introduces ordering issues
+ with calls to get_alias_set which may initialize it for streamed
+in types.  */
 }
 
   /* We don't want to compare locations, so there is nothing do compare
Index: tree-streamer-out.c
===
--- tree-streamer-out.c (revision 231081)
+++ tree-streamer-out.c (working copy)
@@ -317,13 +317,9 @@ pack_ts_type_common_value_fields (struct
   bp_pack_value (bp, TYPE_RESTRICT (expr), 1);
   bp_pack_value (bp, TYPE_USER_ALIGN (expr), 1);
   bp_pack_value (bp, TYPE_READONLY (expr), 1);
-  /* Make sure to preserve the fact whether the frontend would assign
- alias-set zero to this type.  Do that only for main variants, because
- type variants alias sets are never computed.
- FIXME:  This does not work for pre-streamed builtin types.  */
-  bp_pack_value (bp, (TYPE_ALIAS_SET (expr) == 0
- || (!in_lto_p && TYPE_MAIN_VARIANT (expr) == expr
- && get_alias_set (expr) == 0)), 1);
+  /* We used to stream TYPE_ALIAS_SET == 0 information to let frontends mark
+ types that are opaque for TBAA.  This however did not work as intended,
+ because TYPE_ALIAS_SET == 0 was regularly lost in canonical type merging.  */
   if (RECORD_OR_UNION_TYPE_P (expr))
 {
   bp_pack_value (bp, TYPE_TRANSPARENT_AGGR (expr), 1);
Index: tree-streamer-in.c
===
--- tree-streamer-in.c  (revision 231081)
+++ tree-streamer-in.c  (working copy)
@@ -366,7 +366,6 @@ unpack_ts_type_common_value_fields (stru
   TYPE_RESTRICT (expr) = (unsigned) bp_unpack_value (bp, 1);
   TYPE_USER_ALIGN (expr) = (unsigned) bp_unpack_value (bp, 1);
   TYPE_READONLY (expr) = (unsigned) bp_unpack_value (bp, 1);
-  TYPE_ALIAS_SET (expr) = bp_unpack_value (bp, 1) ? 0 : -1;
   if (RECORD_OR_UNION_TYPE_P (expr))
 {
   TYPE_TRANSPARENT_AGGR (expr) = (unsigned) bp_unpack_value (bp, 1);


Go patch committed: Don't set TYPE_STRING_FLAG on a type variant

2015-11-30 Thread Ian Lance Taylor
PR 68477 observes that gccgo crashes when using -flto because a type
variant has TYPE_STRING_FLAG set.  So, don't do that.
TYPE_STRING_FLAG doesn't really do anything, as far as I can tell,
since all the relevant tests in dwarf2out.c also test is_fortran().
But, it seems like the right thing to do.  Bootstrapped and ran Go
testsuite on x86_64-pc-linux-gnu.  Committed to mainline.

Ian

2015-11-30  Ian Lance Taylor  

PR go/68477
* go-gcc.cc (Gcc_backend::string_constant_expression): Don't set
TYPE_STRING_FLAG on a variant type.
Index: gcc/go/go-gcc.cc
===
--- gcc/go/go-gcc.cc(revision 230759)
+++ gcc/go/go-gcc.cc(working copy)
@@ -1279,7 +1279,6 @@ Gcc_backend::string_constant_expression(
   tree const_char_type = build_qualified_type(unsigned_char_type_node,
  TYPE_QUAL_CONST);
   tree string_type = build_array_type(const_char_type, index_type);
-  string_type = build_variant_type_copy(string_type);
   TYPE_STRING_FLAG(string_type) = 1;
   tree string_val = build_string(val.length(), val.data());
   TREE_TYPE(string_val) = string_type;


Re: [PATCH] Fix vector rsqrt discovery (PR tree-optimization/68501)

2015-11-30 Thread Bin.Cheng
On Sat, Nov 28, 2015 at 3:40 AM, Jakub Jelinek  wrote:
> Hi!
>
> The recent changes where vector sqrt is represented in the IL using
> IFN_SQRT instead of target specific builtins broke the discovery
> of vector rsqrt, as targetm.builtin_reciprocal is called only
> on builtin functions (not internal functions).  Furthermore,
> for internal fns, not only the IFN_* is significant, but also the
> types (modes actually) of the lhs and/or arguments.
>
> This patch adjusts the target hook, so that the backends can just inspect
> the call (builtin or internal function), whatever it is.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2015-11-27  Jakub Jelinek  
>
> PR tree-optimization/68501
> * target.def (builtin_reciprocal): Replace the 3 arguments with
> a gcall * one, adjust description.
> * targhooks.h (default_builtin_reciprocal): Replace the 3 arguments
> with a gcall * one.
> * targhooks.c (default_builtin_reciprocal): Likewise.
> * tree-ssa-math-opts.c (pass_cse_reciprocals::execute): Use
> targetm.builtin_reciprocal even on internal functions, adjust
> the arguments and allow replacing an internal function with normal
> built-in.
> * config/i386/i386.c (ix86_builtin_reciprocal): Replace the 3 
> arguments
> with a gcall * one.  Handle internal fns too.
> * config/rs6000/rs6000.c (rs6000_builtin_reciprocal): Likewise.
> * config/aarch64/aarch64.c (aarch64_builtin_reciprocal): Likewise.
> * doc/tm.texi (builtin_reciprocal): Document.
>
> --- gcc/target.def.jj   2015-11-18 11:19:19.0 +0100
> +++ gcc/target.def  2015-11-27 16:37:07.870823670 +0100
> @@ -2463,13 +2463,9 @@ identical versions.",
>  DEFHOOK
>  (builtin_reciprocal,
>   "This hook should return the DECL of a function that implements reciprocal 
> of\n\
> -the builtin function with builtin function code @var{fn}, or\n\
> -@code{NULL_TREE} if such a function is not available.  @var{md_fn} is true\n\
> -when @var{fn} is a code of a machine-dependent builtin function.  When\n\
> -@var{sqrt} is true, additional optimizations that apply only to the 
> reciprocal\n\
> -of a square root function are performed, and only reciprocals of 
> @code{sqrt}\n\
> -function are valid.",
> - tree, (unsigned fn, bool md_fn, bool sqrt),
> +the builtin or internal function call @var{call}, or\n\
> +@code{NULL_TREE} if such a function is not available.",
> + tree, (gcall *call),
>   default_builtin_reciprocal)
>
>  /* For a vendor-specific TYPE, return a pointer to a statically-allocated
> --- gcc/targhooks.h.jj  2015-11-18 11:19:17.0 +0100
> +++ gcc/targhooks.h 2015-11-27 16:37:44.828301093 +0100
> @@ -90,7 +90,7 @@ extern tree default_builtin_vectorized_c
>
>  extern int default_builtin_vectorization_cost (enum vect_cost_for_stmt, 
> tree, int);
>
> -extern tree default_builtin_reciprocal (unsigned int, bool, bool);
> +extern tree default_builtin_reciprocal (gcall *);
>
>  extern HOST_WIDE_INT default_vector_alignment (const_tree);
>
> --- gcc/targhooks.c.jj  2015-11-18 11:19:17.0 +0100
> +++ gcc/targhooks.c 2015-11-27 16:38:21.461783097 +0100
> @@ -600,9 +600,7 @@ default_builtin_vectorization_cost (enum
>  /* Reciprocal.  */
>
>  tree
> -default_builtin_reciprocal (unsigned int fn ATTRIBUTE_UNUSED,
> -   bool md_fn ATTRIBUTE_UNUSED,
> -   bool sqrt ATTRIBUTE_UNUSED)
> +default_builtin_reciprocal (gcall *)
>  {
>return NULL_TREE;
>  }
> --- gcc/tree-ssa-math-opts.c.jj 2015-11-25 09:57:47.0 +0100
> +++ gcc/tree-ssa-math-opts.c2015-11-27 17:07:22.756162308 +0100
> @@ -601,19 +601,17 @@ pass_cse_reciprocals::execute (function
>
>   if (is_gimple_call (stmt1)
>   && gimple_call_lhs (stmt1)
> - && (fndecl = gimple_call_fndecl (stmt1))
> - && (DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_NORMAL
> - || DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_MD))
> + && (gimple_call_internal_p (stmt1)
> + || ((fndecl = gimple_call_fndecl (stmt1))
> + && (DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_NORMAL
> + || (DECL_BUILT_IN_CLASS (fndecl)
> + == BUILT_IN_MD)))))
> {
> - enum built_in_function code;
> - bool md_code, fail;
> + bool fail;
>   imm_use_iterator ui;
>   use_operand_p use_p;
>
> - code = DECL_FUNCTION_CODE (fndecl);
> - md_code = DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_MD;
> -
> - fndecl = targetm.builtin_reciprocal (code, md_code, false);
> + fndecl = targetm.builtin_reciprocal (as_a <gcall *> (stmt1));
>   if (!fndecl)
> 

RFC: Merge the GUPC branch into the GCC 6.0 trunk

2015-11-30 Thread Gary Funck

Some time ago, we submitted an RFC for the introduction of
UPC support into GCC.  During the intervening time period,
we have continued to keep the 'gupc' (GNU UPC) branch in sync
with the GCC trunk and have incorporated feedback and contributions from
various GCC developers (Joseph Myers, Tom Tromey, Jakub Jelinek,
Richard Henderson, Meador Inge, and others).  We have also implemented
various bug fixes and improvements.

At this time, we would like to re-submit the UPC patches for comment
with the goal of introducing these changes into GCC 6.0.

This email provides an overview of UPC and summarizes the
impact of UPC changes on the GCC front-end.

Subsequent emails will include various patch sets which are grouped
by the area of GCC that they impact (front-end, generic, documentation,
build, test, target-specific, and so on), so that they can receive
a more focused review by their respective maintainers.

The main review-related changes are:

* GUPC is no longer implemented as a separate language
(e.g., Objective-C or C++) compiler.  Rather, a new -fupc switch
has been added, which enables UPC support in the C compiler.

* The UPC blocking factor now only uses two of the tree's
"spare" bits.  If the UPC blocking factor is not the default
value of 1 or the "indefinite" value of 0, then it is recorded
in a separate hash table, indexed by the tree node.

* UPC-specific tree support has been integrated into
gcc/c-family/c-common.def and gcc/c-family/c-common.h.

* The number of UPC-specific configuration options
has been reduced.

* The UPC pointer-to-shared format per-target configuration
has been simplified.  Before, both a "packed" and a "struct"
pointer-to-shared representation was supported.  Now, only
the "struct" format is supported and various configuration
options for tweaking field sizes and such have been removed.

* In keeping with current GCC development guidelines,
target macros are no longer used.  Rather, where needed,
target hooks are defined and used.

* FIXME's and TODO's were either fixed or cleaned up.

* The copyright and license notices were updated.

* The code was reviewed for conformance to coding standards and updated.

* Diagnostics now use appropriate format strings rather than building
up the strings with sprintf().

* Files in c-family/ no longer include c-tree.h to conform with modularization
improvements.

* Most of the #ifdef conditionals have been removed.  Some target hooks
have been defined and documented in tm.texi.

* The code was reviewed to verify that it conforms with
current GCC coding practices and that it incorporates cleanups
done in the past several years.

* Comments were added to most new functions, and typos and
spelling errors in comments were fixed.

* Changes that appeared in the diff's that were unrelated to UPC
were removed or incorporated into the trunk.

* The linkage to the libgupc library was changed to use the newly
defined method (used in libgomp/libgo for example) of including
library 'spec' files.  This led to a simplification where we no
longer needed to add UPC-specific spec. files in various
target-specific config. directories.
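
The blocking-factor item in the list above can be sketched in plain C.
This is only a toy model under stated assumptions, not code from the
GUPC branch: the struct names, the side-table type, and the function
block_factor are hypothetical, and a linear scan stands in for the
hash table indexed by tree node.  The two flag names follow the
tree_base fields added by the tree-related patch.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a tree node; only the two flag bits matter here.  */
struct node_model
{
  unsigned block_factor_0 : 1;	/* blocking factor is 0 ("indefinite") */
  unsigned block_factor_x : 1;	/* blocking factor is stored out of line */
};

/* Toy side table used in place of the hash table indexed by tree node.  */
struct side_entry
{
  const struct node_model *node;
  unsigned long factor;
};

/* Return the blocking factor recorded for node N.  */
static unsigned long
block_factor (const struct node_model *n,
	      const struct side_entry *table, size_t n_entries)
{
  size_t i;

  if (n->block_factor_0)
    return 0;
  if (!n->block_factor_x)
    return 1;			/* the default blocking factor */
  for (i = 0; i < n_entries; i++)
    if (table[i].node == n)
      return table[i].factor;
  assert (!"block_factor_x set but no side-table entry");
  return 1;
}
```

The point of the encoding is that the common cases (1 and 0) cost only
two bits per node, and only unusual blocking factors pay for a table
entry.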

Introduction: UPC-related Changes
---------------------------------

Below, various UPC-related changes are summarized.
This introduction is provided as background for review of the UPC
changes implemented in the GUPC branch.  Each individual change will be
discussed in more detail in the patch sets found in the following emails.

The current GUPC branch is based upon a recent version of the GCC trunk
and has been bootstrapped on x86_64/i686 Linux, x86_64
Darwin, IA64/Altix Linux, PowerPC Power7 (big endian), and Power8
(little endian).  Some testing has also been done on various flavors
of BSD and Solaris; in the past, MIPS was tested and supported.

All languages (c, c++, fortran, go, lto, objc, obj-c++) have been
bootstrapped; no test suite regressions were introduced,
relative to the GCC trunk.

The GUPC branch is described here:
  http://gcc.gnu.org/projects/gupc.html

The UPC-related source code differences are summarized here:
  http://gccupc.org/gupc-changes

In the discussion below, some changes are excerpted in order to
highlight important aspects of the changes.

UPC's Shared Qualifier and Layout Qualifier
-------------------------------------------

The UPC language specification describes
the language syntax and semantics:
  http://upc.lbl.gov/publications/upc-spec-1.3.pdf

UPC introduces a new qualifier, "shared", that indicates that the
qualified object is located in a global shared address space that is
accessible by all UPC threads.  Additional qualifiers ("strict" and
"relaxed") further specify the semantics of accesses to
UPC shared objects.

In UPC, a shared qualified array can optionally specify a "layout
qualifier" that indicates how the shared data is blocked and
distributed across UPC threads.

There are two language pre-defined identifiers that indicate the
number of threads that will be 

[UPC 02/22] tree-related changes

2015-11-30 Thread Gary Funck

Background
----------

An overview email describing the UPC-related changes is here:
  https://gcc.gnu.org/ml/gcc-patches/2015-12/msg5.html

The GUPC branch is described here:
  http://gcc.gnu.org/projects/gupc.html

The UPC-related source code differences are summarized here:
  http://gccupc.org/gupc-changes

All languages (c, c++, fortran, go, lto, objc, obj-c++) have been
bootstrapped; no test suite regressions were introduced,
relative to the GCC trunk.

If you are on the cc-list, your name was chosen either
because you are listed as a maintainer for the area that
applies to the patches described in this email, or you
were a frequent contributor of patches made to files listed
in this email.

In the change log entries included in each patch, the directory
containing the affected files is listed, followed by the files.
When the patches are applied, the change log entries will be
distributed to the appropriate ChangeLog file.

Overview
--------

UPC introduces a new qualifier, "shared", that indicates that the
qualified object is located in a global shared address space that is
accessible by all UPC threads.  Additional qualifiers ("strict" and
"relaxed") further specify the semantics of accesses to
UPC shared objects.

In UPC, a shared qualified array can further specify a "layout
qualifier" (blocking factor) that indicates how the shared data
is blocked and distributed.

The following example illustrates the use of the UPC "shared" qualifier
combined with a layout qualifier.

#define BLKSIZE 5
#define N_PER_THREAD (4 * BLKSIZE)
shared [BLKSIZE] double A[N_PER_THREAD*THREADS];

Above, the "[BLKSIZE]" construct is the UPC layout qualifier; this
specifies that the shared array, A, distributes its elements across
each thread in blocks of 5 elements.  If the program is run with two
threads, then A is distributed as shown below:

Thread 0    Thread 1
---------   ---------
A[ 0.. 4]   A[ 5.. 9]
A[10..14]   A[15..19]
A[20..24]   A[25..29]
A[30..34]   A[35..39]

Above, the elements shown for thread 0 are defined as having "affinity"
to thread 0.  Similarly, those elements shown for thread 1 have
affinity to thread 1.  In UPC, a pointer to a shared object can be
cast to a thread local pointer (a "C" pointer), when the designated
shared object has affinity to the referencing thread.
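
As a rough illustration of the layout shown above (a sketch, not code
from the GUPC branch), the thread that has affinity to a given element
follows from the block size and the thread count, because blocks are
dealt out to threads round-robin.  The helper name upc_owner_thread is
hypothetical, and a non-zero blocking factor is assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of GUPC): return the thread that has
   affinity to element INDEX of a shared array declared with layout
   qualifier [BLKSIZE], when the program runs with NTHREADS threads.  */
static size_t
upc_owner_thread (size_t index, size_t blksize, size_t nthreads)
{
  assert (blksize > 0 && nthreads > 0);
  return (index / blksize) % nthreads;
}
```

With BLKSIZE 5 and two threads this reproduces the table: A[7] and
A[39] land on thread 1, while A[12] lands on thread 0.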

A UPC "pointer-to-shared" (PTS) is a pointer that references a UPC
shared object.  A UPC pointer-to-shared is a "fat" pointer with the
following logical fields:
   (virt_addr, thread, phase)

The virtual address (virt_addr) field is combined with the thread
number (thread) to derive the location of the referenced object
within the UPC shared address space.  The phase field is used to
keep track of the current block offset for PTS's that have a
blocking factor that is greater than one.
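
To make the three logical fields concrete, here is a minimal sketch
that derives them for element INDEX of a block-distributed array.  It
is only a model: struct pts_model and pts_for_index are hypothetical
names, virt_addr is represented as a byte offset from the start of each
thread's portion of the array rather than a real address, and a
non-zero blocking factor is assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative model of the logical pointer-to-shared fields.  */
struct pts_model
{
  size_t virt_addr;	/* byte offset within the owning thread's data */
  size_t thread;	/* thread with affinity to the element */
  size_t phase;		/* element offset within the current block */
};

/* Compute the model fields for element INDEX of an array that is
   distributed in blocks of BLKSIZE elements over NTHREADS threads,
   each element being ELEM_SIZE bytes.  */
static struct pts_model
pts_for_index (size_t index, size_t blksize, size_t nthreads,
	       size_t elem_size)
{
  struct pts_model p;
  size_t block, row;

  assert (blksize > 0 && nthreads > 0);
  block = index / blksize;	/* global block number */
  row = block / nthreads;	/* how many blocks precede it on its thread */
  p.thread = block % nthreads;
  p.phase = index % blksize;
  p.virt_addr = (row * blksize + p.phase) * elem_size;
  return p;
}
```

For the two-thread example above, A[17] has thread 1 and phase 2, and
is the eighth element stored on thread 1 (after A[5..9] and A[15..16]).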

GUPC implements pointer-to-shared objects using a "struct" internal
representation.  Until recently, GUPC also supported a "packed"
representation, which is more space efficient, but limits the range of
various fields in the UPC pointer-to-shared representation.  We have
decided to support only the "struct" representation so that the
compiler uses a single ABI that supports the full range of addresses,
threads, and blocking factors.

GCC's internal tree representation is extended to record the UPC
"shared", "strict", "relaxed" qualifiers, and the layout qualifier.

--- gcc/tree-core.h (.../trunk) (revision 228959)
+++ gcc/tree-core.h (.../branches/gupc) (revision 229159)
@@ -470,7 +470,11 @@ enum cv_qualifier {
   TYPE_QUAL_CONST= 0x1,
   TYPE_QUAL_VOLATILE = 0x2,
   TYPE_QUAL_RESTRICT = 0x4,
-  TYPE_QUAL_ATOMIC   = 0x8
+  TYPE_QUAL_ATOMIC   = 0x8,
+  /* UPC qualifiers */
+  TYPE_QUAL_SHARED   = 0x10,
+  TYPE_QUAL_RELAXED  = 0x20,
+  TYPE_QUAL_STRICT   = 0x40
 };
[...]
@@ -857,9 +875,14 @@ struct GTY(()) tree_base {
   unsigned user_align : 1;
   unsigned nameless_flag : 1;
   unsigned atomic_flag : 1;
-  unsigned spare0 : 3;
-
-  unsigned spare1 : 8;
+  unsigned shared_flag : 1;
+  unsigned strict_flag : 1;
+  unsigned relaxed_flag : 1;
+
+  unsigned threads_factor_flag : 1;
+  unsigned block_factor_0 : 1;
+  unsigned block_factor_x : 1;
+  unsigned spare1 : 5;

A given type is a UPC shared type if its 'shared_flag' is set.
However, for array types, the shared_flag of the *element type*
must be checked.  Thus,

/* Return TRUE if TYPE is a shared type.  For arrays,
   the element type must be queried, because array types
   are never qualified.  */
#define SHARED_TYPE_P(TYPE) \
  ((TYPE) && TYPE_P (TYPE) \
   && TYPE_SHARED ((TREE_CODE (TYPE) != ARRAY_TYPE \
? (TYPE) : strip_array_types (TYPE))))
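
The same rule can be shown outside of GCC with a toy type
representation.  This is a sketch only: type_model, strip_arrays, and
shared_type_p are hypothetical stand-ins for GCC's tree types,
strip_array_types, and SHARED_TYPE_P, and they model nothing beyond the
"look through arrays to the element type" step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum kind_model { KIND_SCALAR, KIND_ARRAY };

struct type_model
{
  enum kind_model kind;
  bool shared;				/* models 'shared_flag' */
  const struct type_model *elem;	/* element type for KIND_ARRAY */
};

/* Peel off array layers until a non-array element type is reached.  */
static const struct type_model *
strip_arrays (const struct type_model *t)
{
  while (t->kind == KIND_ARRAY)
    {
      assert (t->elem != NULL);
      t = t->elem;
    }
  return t;
}

/* Return true if T is shared; for arrays, ask the element type,
   because the array type itself carries no qualifier.  */
static bool
shared_type_p (const struct type_model *t)
{
  return t != NULL && strip_arrays (t)->shared;
}
```

A two-dimensional array of shared doubles is therefore reported as
shared even though neither array layer has its own flag set.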

By default, a type has a blocking factor of 1.  If the blocking factor is 0
(known as "indefinite") then 'block_factor_0' is set. If the blocking
factor is neither 0 nor 1, then 'block_factor_x' is set and 

[gomp4] Merge trunk r231075 (2015-11-30) into gomp-4_0-branch

2015-11-30 Thread Thomas Schwinge
Hi!

Committed to gomp-4_0-branch in r231099:

commit 4f88f92b308151aa2c2592102da20c417df69c27
Merge: 24e5942 851c1b0
Author: tschwinge 
Date:   Tue Dec 1 07:44:27 2015 +

svn merge -r 230907:231075 svn+ssh://gcc.gnu.org/svn/gcc/trunk


git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@231099 
138bc75d-0d04-0410-961f-82ee72b054a4


Grüße
 Thomas


[UPC 20/22] libgupc runtime library [4/9]

2015-11-30 Thread Gary Funck
[NOTE: Due to email list size limits, this patch is broken into 9 parts.]

Background
----------

An overview email describing the UPC-related changes is here:
  https://gcc.gnu.org/ml/gcc-patches/2015-12/msg5.html

The GUPC branch is described here:
  http://gcc.gnu.org/projects/gupc.html

The UPC-related source code differences are summarized here:
  http://gccupc.org/gupc-changes

All languages (c, c++, fortran, go, lto, objc, obj-c++) have been
bootstrapped; no test suite regressions were introduced,
relative to the GCC trunk.

If you are on the cc-list, your name was chosen either
because you are listed as a maintainer for the area that
applies to the patches described in this email, or you
were a frequent contributor of patches made to files listed
in this email.

In the change log entries included in each patch, the directory
containing the affected files is listed, followed by the files.
When the patches are applied, the change log entries will be
distributed to the appropriate ChangeLog file.

Overview
--------

Libgupc is the UPC runtime library for GUPC.  The configuration,
makefile, and documentation related changes have been broken out into
separate patches.

As noted in the ChangeLog entry below, this is all new code.
Two communication layers are supported: (1) SMP, via 'mmap'
or (2) the Portals4 library API, which supports multi-node
operation.  Libgupc generally requires a POSIX-compliant target OS.

The 'smp' runtime is the default runtime.  The 'portals4'
runtime is experimental; it supports multi-node operation
using the Portals4 communications library.

Most of the libgupc/include/ directory contains standard headers
defined by the UPC language specification. 'make install' will
install these headers in the directory where other "C"
header files are located.

2015-11-30  Gary Funck  

libgupc/collectives/
* upc_coll.h: New.
* upc_coll_broadcast.upc: New.
* upc_coll_err.upc: New.
* upc_coll_exchange.upc: New.
* upc_coll_gather.upc: New.
* upc_coll_gather_all.upc: New.
* upc_coll_init.upc: New.

Index: libgupc/collectives/upc_coll.h
===================================================================
--- libgupc/collectives/upc_coll.h  (.../trunk) (revision 0)
+++ libgupc/collectives/upc_coll.h  (.../branches/gupc) (revision 
231080)
@@ -0,0 +1,67 @@
+/* Copyright (C) 2012-2015 Free Software Foundation, Inc.
+   This file is part of the UPC runtime library.
+   Written by Gary Funck 
+   and Nenad Vukicevic 
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+.  */
+
+/*/
+/*   */
+/*  Copyright (c) 2004, Michigan Technological University*/
+/*  All rights reserved. */
+/*   */
+/*  Redistribution and use in source and binary forms, with or without   */
+/*  modification, are permitted provided that the following conditions   */
+/*  are met: */
+/*   */
+/*  * Redistributions of source code must retain the above copyright */
+/*  notice, this list of conditions and the following disclaimer.*/
+/*  * Redistributions in binary form must reproduce the above*/
+/*  copyright notice, this list of conditions and the following  */
+/*  disclaimer in the documentation and/or other materials provided  */
+/*  with the distribution.   */
+/*  * Neither the name of the Michigan Technological University  */
+/*  nor the names of its contributors may be used to endorse or promote  */
+/*  products derived from this software without specific prior written   */
+/*  

[UPC 20/22] libgupc runtime library [1/9]

2015-11-30 Thread Gary Funck
[NOTE: Due to email list size limits, this patch is broken into 9 parts.]

Background
----------

An overview email describing the UPC-related changes is here:
  https://gcc.gnu.org/ml/gcc-patches/2015-12/msg5.html

The GUPC branch is described here:
  http://gcc.gnu.org/projects/gupc.html

The UPC-related source code differences are summarized here:
  http://gccupc.org/gupc-changes

All languages (c, c++, fortran, go, lto, objc, obj-c++) have been
bootstrapped; no test suite regressions were introduced,
relative to the GCC trunk.

If you are on the cc-list, your name was chosen either
because you are listed as a maintainer for the area that
applies to the patches described in this email, or you
were a frequent contributor of patches made to files listed
in this email.

In the change log entries included in each patch, the directory
containing the affected files is listed, followed by the files.
When the patches are applied, the change log entries will be
distributed to the appropriate ChangeLog file.

Overview
--------

Libgupc is the UPC runtime library for GUPC.  The configuration,
makefile, and documentation related changes have been broken out into
separate patches.

As noted in the ChangeLog entry below, this is all new code.
Two communication layers are supported: (1) SMP, via 'mmap'
or (2) the Portals4 library API, which supports multi-node
operation.  Libgupc generally requires a POSIX-compliant target OS.

The 'smp' runtime is the default runtime.  The 'portals4'
runtime is experimental; it supports multi-node operation
using the Portals4 communications library.

Most of the libgupc/include/ directory contains standard headers
defined by the UPC language specification. 'make install' will
install these headers in the directory where other "C"
header files are located.

2015-11-30  Gary Funck  

libgupc/
* upc-crtstuff.c: New.
libgupc/include/
* gasp.h: New.
* gasp_upc.h: New.
* gcc-upc.h: New.
* pupc.h: New.
* upc.h: New.
* upc_atomic.h: New.
* upc_castable.h: New.
* upc_collective.h: New.
* upc_nb.h: New.
* upc_relaxed.h: New.
* upc_strict.h: New.
* upc_tick.h: New.
* upc_types.h: New.

Index: libgupc/upc-crtstuff.c
===================================================================
--- libgupc/upc-crtstuff.c  (.../trunk) (revision 0)
+++ libgupc/upc-crtstuff.c  (.../branches/gupc) (revision 231080)
@@ -0,0 +1,66 @@
+/* upc-crtstuff.c: UPC specific "C Runtime Support"
+   Copyright (C) 2009-2015 Free Software Foundation, Inc.
+   Contributed by Gary Funck 
+ and Nenad Vukicevic .
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+.  */
+
+#include "config.h"
+#include "upc-crt-config.h"
+#include "upc-crt-begin-end.h"
+
+/* Only define section start/end if no link script is used.   */
+
+#ifdef CRT_BEGIN
+
+/* Shared begin is always defined in order to allocate space
+   at the beginning of the section.  */
+#ifdef UPC_SHARED_SECTION_BEGIN
+/* Establish a symbol at the beginning of the data section.  */
+UPC_SHARED_SECTION_BEGIN
+#endif /* UPC_SHARED_SECTION_BEGIN */
+
+#ifndef HAVE_UPC_LINK_SCRIPT
+#ifdef UPC_PGM_INFO_SECTION_BEGIN
+/* Establish a symbol at the beginning of the program info data section.  */
+UPC_PGM_INFO_SECTION_BEGIN
+#endif /* UPC_PGM_INFO_SECTION_BEGIN */
+#ifdef UPC_INIT_ARRAY_SECTION_BEGIN
+/* Establish a symbol at the beginning of the initialization array section.  */
+UPC_INIT_ARRAY_SECTION_BEGIN
+#endif /* UPC_INIT_ARRAY_SECTION_BEGIN */
+#endif /* !HAVE_UPC_LINK_SCRIPT */
+
+#elif defined(CRT_END) /* ! CRT_BEGIN */
+
+#ifndef HAVE_UPC_LINK_SCRIPT
+#ifdef UPC_SHARED_SECTION_END
+/* Establish a symbol at the end of the shared data section.  */
+UPC_SHARED_SECTION_END
+#endif /* UPC_SHARED_SECTION_END */
+#ifdef UPC_PGM_INFO_SECTION_END
+/* Establish a symbol at the end of the program info data section.  */
+UPC_PGM_INFO_SECTION_END
+#endif /* UPC_PGM_INFO_SECTION_END */
+#ifdef UPC_INIT_ARRAY_SECTION_END
+/* Establish a symbol at the end of the initialization array section.  */
+UPC_INIT_ARRAY_SECTION_END
+#endif /* UPC_INIT_ARRAY_SECTION_END */
+#endif /* !HAVE_UPC_LINK_SCRIPT */
+#else /* ! CRT_BEGIN && ! CRT_END */
+#error "One of CRT_BEGIN or CRT_END must be defined."
+#endif

When not optimizing do not compute RTX memory attributes

2015-11-30 Thread Jan Hubicka
Hi,
memory attributes are currently optimized and attached to RTL even when not
optimizing. This is obviously just a wasted effort.

Bootstrapped/regtested x86_64-linux, OK?

Honza
* emit-rtl.c (set_mem_attrs, set_mem_attributes_minus_bitpos):
Do not compute memory attributes when not optimizing.

Index: emit-rtl.c
===================================================================
--- emit-rtl.c  (revision 231081)
+++ emit-rtl.c  (working copy)
@@ -336,7 +336,8 @@ static void
 set_mem_attrs (rtx mem, mem_attrs *attrs)
 {
   /* If everything is the default, we can just clear the attributes.  */
-  if (mem_attrs_eq_p (attrs, mode_mem_attrs[(int) GET_MODE (mem)]))
+  if (!optimize
+  || mem_attrs_eq_p (attrs, mode_mem_attrs[(int) GET_MODE (mem)]))
 {
   MEM_ATTRS (mem) = 0;
   return;
@@ -1749,6 +1750,9 @@ set_mem_attributes_minus_bitpos (rtx ref
   struct mem_attrs attrs, *defattrs, *refattrs;
   addr_space_t as;
 
+  if (!optimize)
+return;
+
   /* It can happen that type_for_mode was given a mode for which there
  is no language-level type.  In which case it returns NULL, which
  we can see here.  */


[UPC 20/22] libgupc runtime library [6/9]

2015-11-30 Thread Gary Funck
[NOTE: Due to email list size limits, this patch is broken into 9 parts.]

Background
----------

An overview email describing the UPC-related changes is here:
  https://gcc.gnu.org/ml/gcc-patches/2015-12/msg5.html

The GUPC branch is described here:
  http://gcc.gnu.org/projects/gupc.html

The UPC-related source code differences are summarized here:
  http://gccupc.org/gupc-changes

All languages (c, c++, fortran, go, lto, objc, obj-c++) have been
bootstrapped; no test suite regressions were introduced,
relative to the GCC trunk.

If you are on the cc-list, your name was chosen either
because you are listed as a maintainer for the area that
applies to the patches described in this email, or you
were a frequent contributor of patches made to files listed
in this email.

In the change log entries included in each patch, the directory
containing the affected files is listed, followed by the files.
When the patches are applied, the change log entries will be
distributed to the appropriate ChangeLog file.

Overview
--------

Libgupc is the UPC runtime library for GUPC.  The configuration,
makefile, and documentation related changes have been broken out into
separate patches.

As noted in the ChangeLog entry below, this is all new code.
Two communication layers are supported: (1) SMP, via 'mmap'
or (2) the Portals4 library API, which supports multi-node
operation.  Libgupc generally requires a POSIX-compliant target OS.

The 'smp' runtime is the default runtime.  The 'portals4'
runtime is experimental; it supports multi-node operation
using the Portals4 communications library.

Most of the libgupc/include/ directory contains standard headers
defined by the UPC language specification. 'make install' will
install these headers in the directory where other "C"
header files are located.

2015-11-30  Gary Funck  

libgupc/collectives/
* upc_coll_reduce.upc: New.
* upc_coll_scatter.upc: New.
* upc_coll_sort.upc: New.

Index: libgupc/collectives/upc_coll_reduce.upc
===================================================================
--- libgupc/collectives/upc_coll_reduce.upc (.../trunk) (revision 0)
+++ libgupc/collectives/upc_coll_reduce.upc (.../branches/gupc) 
(revision 231080)
@@ -0,0 +1,4296 @@
+/* Copyright (C) 2012-2015 Free Software Foundation, Inc.
+   This file is part of the UPC runtime library.
+   Written by Gary Funck 
+   and Nenad Vukicevic 
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+.  */
+
+/*/
+/*   */
+/*  Copyright (c) 2004, Michigan Technological University*/
+/*  All rights reserved. */
+/*   */
+/*  Redistribution and use in source and binary forms, with or without   */
+/*  modification, are permitted provided that the following conditions   */
+/*  are met: */
+/*   */
+/*  * Redistributions of source code must retain the above copyright */
+/*  notice, this list of conditions and the following disclaimer.*/
+/*  * Redistributions in binary form must reproduce the above*/
+/*  copyright notice, this list of conditions and the following  */
+/*  disclaimer in the documentation and/or other materials provided  */
+/*  with the distribution.   */
+/*  * Neither the name of the Michigan Technological University  */
+/*  nor the names of its contributors may be used to endorse or promote  */
+/*  products derived from this software without specific prior written   */
+/*  permission.  */
+/* 

Re: [PATCH] GCC system.h and Graphite header order

2015-11-30 Thread Thomas Schwinge
Hi!

On Fri, 27 Nov 2015 11:51:11 -0500, David Edelsohn  wrote:
> On Fri, Nov 27, 2015 at 11:24 AM, Thomas Schwinge
>  wrote:
> > On Tue, 24 Nov 2015 10:32:12 +, Alan Lawrence  
> > wrote:
> >> I note doc/install.texi says that gcc uses "ISL Library version 0.15,
> >> 0.14, 0.13, or 0.12.2". This patch breaks the build with 0.12.2 (a
> >> subset of errors below)
> >
> >  has been filed.  I set you guys on CC.
> >
> >> but seems fine with 0.14. I haven't tested
> >> 0.13. Do we want to update install.texi ?
> >
> > I have a slight preference to keep ISL 0.12.2 supported, but can adapt to
> > a newer version, if necessary.
> 
> I updated the install document yesterday.
> 
> I don't object to support for ISL 0.12.2, but someone has to implement
> an appropriate header file incantation for the Graphite source files
> WITHOUT reordering it again nor including ISL header files first --
> before system.h.  Some GCC header files must be included first in GCC
> source files.

I'm not too much interested in "living in the past", so I went the easier
route and installed ISL 0.15 packages.


Grüße
 Thomas


GCC 5 branch now frozen for GCC 5.3 RC1

2015-11-30 Thread Richard Biener

The GCC 5 branch is now frozen for creating a first release candidate
for GCC 5.3.  All changes from now on require release manager approval.

Thanks,
Richard.


Re: [PATCH] nvptx: implement automatic storage in custom stacks

2015-11-30 Thread Jakub Jelinek
On Thu, Nov 12, 2015 at 04:58:21PM +0300, Alexander Monakov wrote:
> I'm proposing the following patch as a step towards resolving the issue with
> inaccessibility of stack storage (.local memory) in PTX to other threads than
> the one using that stack.  The idea is to have preallocated stacks, and have
> __nvptx_stacks[] array in shared memory hold current stack pointers.  Each
> thread is maintaining __nvptx_stacks[tid.y] as its stack pointer, thus for
> OpenMP the intent is to preallocate on a per-warp basis (not per-thread).
> For OpenMP SIMD regions we'll have to ensure that conflicting accesses are not
> introduced.
> 
> I've exposed a new command-line option -msoft-stack to ease testing, but for
> OpenMP we'll have to automatically flip it based on function attributes.
> Right now it's not easy because OpenMP and OpenACC both use "omp declare
> target".  Jakub, I seem to recall a discussion about OpenACC changing to use a
> separate attribute, but I cannot find it now.  Any advice here?

I believe OpenACC has acc routine {gang,worker,seq} that would roughly match
whether certain OpenMP declare target function (or ompfn region) is/can be
called within the target/teams/distribute context, or parallel context, or
simd context.  For OpenMP we have no such pragmas, so we need some analysis
to help the PTX (and, as Martin said on IRC, HSA apparently too) and add
attributes accordingly.
For the .ompfn* outlined region it is easy, there we know from which
construct it is; for other functions I bet we want to do some IPA analysis for
this: start with the .ompfn* functions marked, walk the cgraph, and for
declare target functions not callable from outside try to determine whether
they are only called from parallel contexts or not.
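
A minimal sketch of the kind of propagation described above, assuming a toy call graph held in a fixed-size adjacency matrix. The struct, the function names in the table, and compute_parallel_only are hypothetical and are not GCC's cgraph API; the sketch only shows the fixed-point idea of starting from the marked .ompfn* functions and demoting any function that has a caller outside a parallel context.

```c
#include <stdbool.h>

#define NFUNCS 5

/* Hypothetical toy call graph; none of these names exist in GCC.  */
struct func
{
  const char *name;
  bool is_ompfn_parallel;    /* outlined body of a parallel construct */
  bool externally_callable;  /* may be called from outside this unit */
};

static const struct func funcs[NFUNCS] = {
  { "main.ompfn.0", true,  false },
  { "helper_a",     false, false },
  { "helper_b",     false, false },
  { "exported",     false, true  },
  { "main",         false, true  },
};

/* calls[i][j] is true when function i calls function j.  */
static const bool calls[NFUNCS][NFUNCS] = {
  /* main.ompfn.0 */ { false, true,  false, true,  false },
  /* helper_a     */ { false, false, true,  false, false },
  /* helper_b     */ { false, false, false, false, false },
  /* exported     */ { false, false, false, false, false },
  /* main         */ { false, false, true,  false, false },
};

/* Fill PARALLEL_ONLY[i] with true when function i is only reachable
   through parallel contexts in this toy graph.  */
static void
compute_parallel_only (bool parallel_only[NFUNCS])
{
  /* Optimistic start: everything that cannot be called from outside
     is a candidate; the marked .ompfn functions are the roots.  */
  for (int i = 0; i < NFUNCS; i++)
    parallel_only[i] = (funcs[i].is_ompfn_parallel
                        || !funcs[i].externally_callable);

  bool changed = true;
  while (changed)
    {
      changed = false;
      for (int i = 0; i < NFUNCS; i++)
        {
          if (!parallel_only[i] || funcs[i].is_ompfn_parallel)
            continue;

          bool has_caller = false;
          bool all_callers_parallel = true;
          for (int j = 0; j < NFUNCS; j++)
            if (calls[j][i])
              {
                has_caller = true;
                if (!parallel_only[j])
                  all_callers_parallel = false;
              }

          /* Demote functions with no known caller or with any caller
             that is not itself confined to parallel contexts.  */
          if (!has_caller || !all_callers_parallel)
            {
              parallel_only[i] = false;
              changed = true;
            }
        }
    }
}
```

In this toy graph helper_a stays marked because its only caller is the outlined region, while helper_b is demoted because main also calls it. A real implementation would presumably need to treat indirect calls and address-taken functions conservatively.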

Does your patch affect all the stack allocations within a certain function
(i.e. is there no way to select on a per-variable basis which stack to
allocate it on)?  Without any detailed analysis, e.g. spilled (non-addressable)
vars could at least go to the local stack.  But PTX doesn't have any spills,
right?  Not sure about HSA.  If it is a per-function thing only, then it
isn't worth doing more detailed analysis at the ompexp time.

BTW, surely it will be an advantage if PTX can support alloca through this,
it could e.g. turn on -msoft-stack for all functions that use alloca/VLAs
automatically.

Jakub


Re: [PATCH] Update TARGET_FUNCTION_INCOMING_ARG documentation

2015-11-30 Thread Bernd Schmidt

On 11/29/2015 06:14 PM, H.J. Lu wrote:

Is this safe for stage 3?


Is there a reason to do it now? This doesn't include a testcase.


* function.c (assign_parm_setup_stack): Force source into a
register if needed.
* target.def (function_incoming_arg): Update documentation to
allow arbitrary address computation based on hard register.
* doc/tm.texi: Regenerated.



Bernd



Re: [PATCH, PING*4] Track indirect calls for call site information in debug info.

2015-11-30 Thread Pierre-Marie de Rodat

Hello Jakub,

On 11/24/2015 06:10 PM, Jakub Jelinek wrote:

The new pass is IMNSHO completely useless and undesirable, both for compile
time (another whole IL traversal) reasons and for the unnecessary creation
of memory allocations.
[…]


Thank you for your detailed answer! This is just to say that I’m working 
on this matter: I hope I’ll be able to produce a patch implementing your 
proposal before the end of this week.


--
Pierre-Marie de Rodat


Re: [PATCH] nvptx: implement automatic storage in custom stacks

2015-11-30 Thread Alexander Monakov
On Mon, 30 Nov 2015, Jakub Jelinek wrote:
> Does your patch affect all the stack allocations within a certain function
> (i.e. is there no way to select on a per-variable basis which stack to
> allocate it on)?  Without any detailed analysis, e.g. spilled (non-addressable)
> vars could at least go to the local stack.  But PTX doesn't have any spills,
> right?  Not sure about HSA.  If it is a per-function thing only, then it
> isn't worth doing more detailed analysis at the ompexp time.

Yes, there's no register allocation and thus no spills on PTX.
 
> BTW, surely it will be an advantage if PTX can support alloca through this,
> it could e.g. turn on -msoft-stack for all functions that use alloca/VLAs
> automatically.

Yes, I'm going to support alloca on soft stacks, but -msoft-stack has a
prerequisite of soft stacks being initially set up.  Therefore I'm treating it
as an ABI variant (together with another option to handle atomics and
"syscalls" outside of simd regions), and building a separate multilib for
that.  So I see it the other way around: it's not safe for the compiler to
always use soft-stacks for alloca (because OpenACC wouldn't set up soft
stacks), but if soft stacks are enabled, alloca can use them.

In the multilib variant that I'm introducing, all addressable vars go to soft
stacks, and classic .local stacks are used rarely, e.g. for stdarg passing,
and implicitly for calls/returns (and after JIT, they'll service register
spills too).
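
A minimal host-side sketch of the soft-stack idea, assuming one preallocated stack per warp and an array of current stack pointers modelled on the __nvptx_stacks[] array from the patch description. The names soft_stacks, soft_stacks_init, soft_alloca and soft_stack_restore are hypothetical, and this is plain C simulating the bookkeeping, not PTX or the patch's actual code.

```c
#include <stddef.h>

#define NUM_WARPS 4
#define STACK_BYTES 1024

/* Backing storage, one preallocated stack per warp.  On a GPU this
   would live in memory that other threads can address.  */
static _Alignas (16) unsigned char stack_storage[NUM_WARPS][STACK_BYTES];

/* Current stack pointer of each warp (hypothetical stand-in for
   __nvptx_stacks[]).  */
static unsigned char *soft_stacks[NUM_WARPS];

/* The prerequisite mentioned above: someone has to set the stacks up
   before any code built for soft stacks runs.  */
static void
soft_stacks_init (void)
{
  for (int w = 0; w < NUM_WARPS; w++)
    soft_stacks[w] = stack_storage[w] + STACK_BYTES;  /* grows down */
}

/* alloca-like allocation on the soft stack of WARP.  SIZE is rounded
   up to a multiple of 16 bytes; returns NULL if the stack is full.  */
static void *
soft_alloca (int warp, size_t size)
{
  if (size > STACK_BYTES)
    return NULL;

  size_t rounded = (size + 15) & ~(size_t) 15;
  size_t avail = (size_t) (soft_stacks[warp] - stack_storage[warp]);
  if (rounded > avail)
    return NULL;

  soft_stacks[warp] -= rounded;
  return soft_stacks[warp];
}

/* Pop everything allocated since SAVED was read, as a function
   epilogue would.  */
static void
soft_stack_restore (int warp, unsigned char *saved)
{
  soft_stacks[warp] = saved;
}
```

A caller would read soft_stacks[warp] on entry, allocate with soft_alloca, and pass the saved value to soft_stack_restore on exit. Calling soft_alloca before soft_stacks_init would fail, which mirrors why the option is treated here as an ABI variant rather than something the compiler can enable per function.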

Alexander


[Ada] Add missing type conversions

2015-11-30 Thread Eric Botcazou
Self-explanatory, tested on x86_64-suse-linux, applied on the mainline.


2015-11-30  Eric Botcazou  

* gcc-interface/utils2.c (gnat_invariant_expr): Add type conversions.


2015-11-30  Eric Botcazou  

* gnat.dg/loop_optimization22.ad[sb]: New test.


-- 
Eric Botcazou

-- { dg-do compile }
-- { dg-options "-O" }

pragma Overflow_Mode (Minimized);

package body Loop_Optimization22 is

   procedure Foo (X : Discrim_Type) is
      H : array (1 .. Integer (X.Count) + 1) of Float;
   begin
      for I in 1 .. X.Count loop
         H (Integer(I) + 1):= 0.0;
      end loop;
   end;

end Loop_Optimization22;

package Loop_Optimization22 is

   type Discrim_Type (Count : Positive) is null record;

   procedure Foo (X : Discrim_Type);

end Loop_Optimization22;

Index: gcc-interface/utils2.c
===
--- gcc-interface/utils2.c	(revision 231061)
+++ gcc-interface/utils2.c	(working copy)
@@ -2860,7 +2860,9 @@ gnat_invariant_expr (tree expr)
   tree op0 = gnat_invariant_expr (TREE_OPERAND (expr, 0));
   tree op1 = TREE_OPERAND (expr, 1);
   if (op0 && TREE_CONSTANT (op1))
-	return fold_build2 (TREE_CODE (expr), type, op0, op1);
+	return
+	  fold_build2 (TREE_CODE (expr), type,
+		   fold_convert (type, op0), fold_convert (type, op1));
   else
 	return NULL_TREE;
 }


Re: [PATCH] nvptx: implement automatic storage in custom stacks

2015-11-30 Thread Alexander Monakov
On Mon, 30 Nov 2015, Jakub Jelinek wrote:
> Does it really have to be a full multilib?  I mean, the only precondition is
> that something sets up the var, right?  Would the OpenACC folks be willing
> to set it up too?  For stuff like libc, functions that are ECF_LEAF builtins
> IMHO really don't care whether they are built as -msoft-stack or not, they
> shouldn't be passing addresses of local vars to code that could use OpenMP.
> The only question is if say qsort or other functions that call user callbacks
> could be passing addresses of local vars to those callbacks, or whether they
> only pass addresses passed from callers, or addresses of heap objects.

Well, in that full multilib there's also a second option enabled, to handle
atomics/syscalls outside of simd regions, where the cost is additional code in
the prologue, and one "shuffle" instruction after each atomic/syscall.

It doesn't have to be a multilib, but doing it as a multilib is a safe choice
w.r.t. OpenACC work.

Alexander


[Ada] Fix simple C interfacing issues

2015-11-30 Thread Eric Botcazou
This fixes the simple C interfacing issues recently reported by Jan (the 
signedness issue of char will probably be fixed for GCC 6, the duality 
pointer/System.Address probably _not_ unfortunately).

Tested on x86_64-suse-linux, applied on the mainline.


2015-11-30  Eric Botcazou  

* osint.adb: Add use type clause for CRTL.size_t.
(C_String_Length): Return CRTL.size_t instead of Integer.
(To_Path_String_Access): Take CRTL.size_t instead of Integer.
(Get_Libraries_From_Registry): Use CRTL throughout.
(To_Canonical_Dir_Spec): Use CRTL.size_t instead of Integer.
(To_Canonical_File_List): Likewise.
(To_Canonical_File_Spec): Likewise.
(To_Canonical_Path_Spec): Likewise.
(To_Host_Dir_Spec): Likewise.
(To_Host_File_Spec): Likewise.
(Update_Path): Use CRTL throughout.
* s-shasto.adb: Add with clause for System.CRTL.
(Initialize): Rename CRTL.strncpy instead of importing it manually.


-- 
Eric Botcazou

Index: osint.adb
===
--- osint.adb	(revision 231010)
+++ osint.adb	(working copy)
@@ -46,6 +46,8 @@ with GNAT.HTable;
 
 package body Osint is
 
+   use type CRTL.size_t;
+
Running_Program : Program_Type := Unspecified;
--  comment required here ???
 
@@ -135,12 +137,12 @@ package body Osint is
--  A version of Smart_Find_File that also returns a cache of the file
--  attributes for later reuse
 
-   function C_String_Length (S : Address) return Integer;
+   function C_String_Length (S : Address) return CRTL.size_t;
--  Returns length of a C string (zero for a null address)
 
function To_Path_String_Access
  (Path_Addr : Address;
-  Path_Len  : Integer) return String_Access;
+  Path_Len  : CRTL.size_t) return String_Access;
--  Converts a C String to an Ada String. Are we doing this to avoid withing
--  Interfaces.C.Strings ???
--  Caller must free result.
@@ -419,27 +421,18 @@ package body Osint is
  pragma Import (C, C_Get_Libraries_From_Registry,
 "__gnat_get_libraries_from_registry");
 
- function Strlen (Str : Address) return Integer;
- pragma Import (C, Strlen, "strlen");
-
- procedure Strncpy (X : Address; Y : Address; Length : Integer);
- pragma Import (C, Strncpy, "strncpy");
-
- procedure C_Free (Str : Address);
- pragma Import (C, C_Free, "free");
-
  Result_Ptr: Address;
- Result_Length : Integer;
+ Result_Length : CRTL.size_t;
  Out_String: String_Ptr;
 
   begin
  Result_Ptr := C_Get_Libraries_From_Registry;
- Result_Length := Strlen (Result_Ptr);
+ Result_Length := CRTL.strlen (Result_Ptr);
 
- Out_String := new String (1 .. Result_Length);
- Strncpy (Out_String.all'Address, Result_Ptr, Result_Length);
+ Out_String := new String (1 .. Integer (Result_Length));
+ CRTL.strncpy (Out_String.all'Address, Result_Ptr, Result_Length);
 
- C_Free (Result_Ptr);
+ CRTL.free (Result_Ptr);
 
  return Out_String;
   end Get_Libraries_From_Registry;
@@ -673,14 +666,12 @@ package body Osint is
-- C_String_Length --
-
 
-   function C_String_Length (S : Address) return Integer is
-  function Strlen (S : Address) return Integer;
-  pragma Import (C, Strlen, "strlen");
+   function C_String_Length (S : Address) return CRTL.size_t is
begin
   if S = Null_Address then
  return 0;
   else
- return Strlen (S);
+ return CRTL.strlen (S);
   end if;
end C_String_Length;
 
@@ -2959,7 +2950,7 @@ package body Osint is
 
   C_Host_Dir : String (1 .. Host_Dir'Length + 1);
   Canonical_Dir_Addr : Address;
-  Canonical_Dir_Len  : Integer;
+  Canonical_Dir_Len  : CRTL.size_t;
 
begin
   C_Host_Dir (1 .. Host_Dir'Length) := Host_Dir;
@@ -3023,7 +3014,7 @@ package body Osint is
   declare
  Canonical_File_List : String_Access_List (1 .. Num_Files);
  Canonical_File_Addr : Address;
- Canonical_File_Len  : Integer;
+ Canonical_File_Len  : CRTL.size_t;
 
   begin
  --  Retrieve the expanded directory names and build the list
@@ -3056,7 +3047,7 @@ package body Osint is
 
   C_Host_File : String (1 .. Host_File'Length + 1);
   Canonical_File_Addr : Address;
-  Canonical_File_Len  : Integer;
+  Canonical_File_Len  : CRTL.size_t;
 
begin
   C_Host_File (1 .. Host_File'Length) := Host_File;
@@ -3091,7 +3082,7 @@ package body Osint is
 
   C_Host_Path : String (1 .. Host_Path'Length + 1);
   Canonical_Path_Addr : Address;
-  Canonical_Path_Len  : Integer;
+  Canonical_Path_Len  : CRTL.size_t;
 
begin
   C_Host_Path (1 .. Host_Path'Length) := Host_Path;
@@ -3126,7 +3117,7 @@ package 

Re: [gomp4.5] Handle #pragma omp declare target link

2015-11-30 Thread Jakub Jelinek
On Fri, Nov 27, 2015 at 07:50:09PM +0300, Ilya Verbin wrote:
> On Thu, Nov 19, 2015 at 16:31:15 +0100, Jakub Jelinek wrote:
> > On Mon, Nov 16, 2015 at 06:40:43PM +0300, Ilya Verbin wrote:
> > > @@ -2009,7 +2010,8 @@ scan_sharing_clauses (tree clauses, omp_context 
> > > *ctx)
> > > decl = OMP_CLAUSE_DECL (c);
> > > /* Global variables with "omp declare target" attribute
> > >don't need to be copied, the receiver side will use them
> > > -  directly.  */
> > > +  directly.  However, global variables with "omp declare target link"
> > > +  attribute need to be copied.  */
> > > if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
> > > && DECL_P (decl)
> > > && ((OMP_CLAUSE_MAP_KIND (c) != GOMP_MAP_FIRSTPRIVATE_POINTER
> > > @@ -2017,7 +2019,9 @@ scan_sharing_clauses (tree clauses, omp_context 
> > > *ctx)
> > >  != GOMP_MAP_FIRSTPRIVATE_REFERENCE))
> > > || TREE_CODE (TREE_TYPE (decl)) == ARRAY_TYPE)
> > > && is_global_var (maybe_lookup_decl_in_outer_ctx (decl, ctx))
> > > -   && varpool_node::get_create (decl)->offloadable)
> > > +   && varpool_node::get_create (decl)->offloadable
> > > +   && !lookup_attribute ("omp declare target link",
> > > + DECL_ATTRIBUTES (decl)))
> > 
> > I wonder if Honza/Richi wouldn't prefer to have this info also
> > in cgraph, instead of looking up the attribute in each case.
> 
> So should I add a new flag into cgraph?
> Also it is used in gimplify_adjust_omp_clauses.

Richi said on IRC that lookup_attribute is ok, so let's keep it that way for
now.

> +   /* Most significant bit of the size marks such vars.  */
> +   unsigned HOST_WIDE_INT isize = tree_to_uhwi (size);
> +   isize |= 1ULL << (int_size_in_bytes (const_ptr_type_node) * 8 - 1);

That supposedly should be BITS_PER_UNIT instead of 8.

> diff --git a/gcc/varpool.c b/gcc/varpool.c
> index 36f19a6..cbd1e05 100644
> --- a/gcc/varpool.c
> +++ b/gcc/varpool.c
> @@ -561,17 +561,21 @@ varpool_node::assemble_decl (void)
>   are not real variables, but just info for debugging and codegen.
>   Unfortunately at the moment emutls is not updating varpool correctly
>   after turning real vars into value_expr vars.  */
> +#ifndef ACCEL_COMPILER
>if (DECL_HAS_VALUE_EXPR_P (decl)
>&& !targetm.have_tls)
>  return false;
> +#endif
>  
>/* Hard register vars do not need to be output.  */
>if (DECL_HARD_REGISTER (decl))
>  return false;
>  
> +#ifndef ACCEL_COMPILER
>gcc_checking_assert (!TREE_ASM_WRITTEN (decl)
>  && TREE_CODE (decl) == VAR_DECL
>  && !DECL_HAS_VALUE_EXPR_P (decl));
> +#endif

This looks wrong, both of these clearly could affect anything with
DECL_HAS_VALUE_EXPR_P, not just the link vars.
So, if you need to handle the "omp declare target link" vars specially,
you should only handle those specially and nothing else.  And please try to
explain why.

> @@ -1005,13 +1026,18 @@ gomp_load_image_to_device (struct gomp_device_descr 
> *devicep, unsigned version,
>for (i = 0; i < num_vars; i++)
>  {
>struct addr_pair *target_var = &target_table[num_funcs + i];
> -  if (target_var->end - target_var->start
> -   != (uintptr_t) host_var_table[i * 2 + 1])
> +  uintptr_t target_size = target_var->end - target_var->start;
> +
> +  /* Most significant bit of the size marks "omp declare target link"
> +  variables.  */
> +  bool is_link = target_size & (1ULL << (sizeof (uintptr_t) * 8 - 1));

__CHAR_BIT__ here instead of 8?
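
A small sketch of the marking scheme under discussion, assuming the most significant bit of a uintptr_t size is used as the "link" flag. It uses the standard CHAR_BIT from <limits.h>, which on GCC has the same value as the predefined __CHAR_BIT__ macro; the names LINK_BIT, mark_link, is_link and real_size are hypothetical and do not come from libgomp.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Most significant bit of a uintptr_t, computed from CHAR_BIT rather
   than a hard-coded 8 bits per byte.  */
#define LINK_BIT ((uintptr_t) 1 << (sizeof (uintptr_t) * CHAR_BIT - 1))

/* Tag SIZE as belonging to a "link" variable.  */
static uintptr_t
mark_link (uintptr_t size)
{
  return size | LINK_BIT;
}

/* Test whether SIZE carries the tag.  */
static bool
is_link (uintptr_t size)
{
  return (size & LINK_BIT) != 0;
}

/* Strip the tag to recover the size itself.  */
static uintptr_t
real_size (uintptr_t size)
{
  return size & ~LINK_BIT;
}
```

Any consumer that compares sizes, as the size check in the quoted hunk does, would have to strip the tag first, otherwise a tagged size never matches an untagged one.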

> @@ -1019,7 +1045,7 @@ gomp_load_image_to_device (struct gomp_device_descr 
> *devicep, unsigned version,
>k->host_end = k->host_start + (uintptr_t) host_var_table[i * 2 + 1];
>k->tgt = tgt;
>k->tgt_offset = target_var->start;
> -  k->refcount = REFCOUNT_INFINITY;
> +  k->refcount = is_link ? REFCOUNT_LINK : REFCOUNT_INFINITY;
>k->async_refcount = 0;
>array->left = NULL;
>array->right = NULL;

Do we need to do anything in gomp_unload_image_from_device ?
I mean at least for questionable programs that, for link vars, don't decrement
the refcount of the var that replaced the link var to 0 before
dlclosing the library.
At least host_var_table[j * 2 + 1] will have the MSB set, so we need to
handle it differently.  Perhaps for that case perform a lookup, and if we
get something which has link_map non-NULL, first act as if there were a
target exit data delete (var) on it?

Jakub


Re: [patch] link libgccjit using LDFLAGS

2015-11-30 Thread Bernd Schmidt

On 11/30/2015 01:00 AM, Matthias Klose wrote:

link libgccjit using LDFLAGS (which is empty by default), but could be
used to pass hardening options like -Wl,-z,relro.


Ok when stage 1 opens.


Bernd



Re: [patch] c/c++ asan tests for FreeBSD

2015-11-30 Thread Andreas Tobler

On 30.11.15 11:28, Bernd Schmidt wrote:

On 11/29/2015 08:32 PM, Andreas Tobler wrote:

Hi all,

the attached patch prepares the testsuite, c and c++, for the upcoming
ASAN support for FreeBSD (x86_64 first).

I tested the patch on CentOS7.1 x86_64 and on FreeBSD x86_64.
Results can be seen on the list.

Is this ok for trunk?

-/* { dg-do run { target { *-*-linux* } } } */
+/* { dg-do run { target { *-*-linux* *-*-freebsd* } } } */


I see a patch from you to add asan support to x86 freebsd, but what
about other architectures?


You mean because of the wildcard? I'll add them as I have time to port them.

For now they are UNSUPPORTED.

Does every *-*-linux* have asan support?

Andreas



Re: [PATCH, PR46032] Handle BUILT_IN_GOMP_PARALLEL in ipa-pta

2015-11-30 Thread Tom de Vries

On 30/11/15 10:16, Richard Biener wrote:

On Mon, 30 Nov 2015, Tom de Vries wrote:


Hi,

this patch fixes PR46032.

It handles a call:
...
   __builtin_GOMP_parallel (fn, data, num_threads, flags)
...
as:
...
   fn (data)
...
in ipa-pta.

This improves ipa-pta alias analysis in the parallelized function fn, and
allows vectorization in the testcase without a runtime alias test.
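
A minimal sketch of the equivalence being modelled, assuming a single-threaded stand-in for the runtime entry point. parallel_stub, struct omp_data and body are hypothetical names; the stub only mirrors the argument order shown above and does not spawn threads the way the real runtime does.

```c
/* Hypothetical stand-in: from the point of view of alias analysis,
   the call behaves like a direct call fn (data).  */
static void
parallel_stub (void (*fn) (void *), void *data,
               unsigned num_threads, unsigned flags)
{
  (void) num_threads;
  (void) flags;
  fn (data);
}

/* The block the caller fills in and passes as DATA; it carries the
   pointers that the outlined function dereferences.  */
struct omp_data
{
  unsigned *results;
  const unsigned *pData;
  unsigned coeff;
  unsigned n;
};

/* Outlined loop body, playing the role of fn.  */
static void
body (void *p)
{
  struct omp_data *d = p;
  for (unsigned i = 0; i < d->n; i++)
    d->results[i] = d->coeff * d->pData[i];
}
```

Because the pointers reach body only through the data block, an analysis that treats the runtime call as opaque has to assume results and pData may alias. Seeing the call as fn (data) lets it connect each field back to the distinct arrays in the caller.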

Bootstrapped and reg-tested on x86_64.

OK for stage3 trunk?


+ /* Assign the passed argument to the appropriate incoming
+parameter of the function.  */
+ struct constraint_expr lhs ;
+ lhs = get_function_part_constraint (fi, fi_parm_base + 0);
+ auto_vec rhsc;
+ struct constraint_expr *rhsp;
+ get_constraint_for_rhs (arg, &rhsc);
+ while (rhsc.length () != 0)
+   {
+ rhsp = &rhsc.last ();
+ process_constraint (new_constraint (lhs, *rhsp));
+ rhsc.pop ();
+   }

please use style used elsewhere with

  FOR_EACH_VEC_ELT (rhsc, j, rhsp)
process_constraint (new_constraint (lhs, *rhsp));
  rhsc.truncate (0);



That code was copied from find_func_aliases_for_call.
I've factored out the bit that I copied as 
find_func_aliases_for_call_arg, and fixed the style there (and dropped 
'rhsc.truncate (0)' since AFAIU it's redundant at the end of a function).



+ /* Parameter passed by value is used.  */
+ lhs = get_function_part_constraint (fi, fi_uses);
+ struct constraint_expr *rhsp;
+ get_constraint_for_address_of (arg, &rhsc);

This isn't correct - you want to use get_constraint_for (arg, &rhsc).
After all rhs is already an ADDR_EXPR.



Can we add an assert somewhere to detect this incorrect usage?


+ FOR_EACH_VEC_ELT (rhsc, j, rhsp)
+   process_constraint (new_constraint (lhs, *rhsp));
+ rhsc.truncate (0);
+
+ /* The caller clobbers what the callee does.  */
+ lhs = get_function_part_constraint (fi, fi_clobbers);
+ rhs = get_function_part_constraint (cfi, fi_clobbers);
+ process_constraint (new_constraint (lhs, rhs));
+
+ /* The caller uses what the callee does.  */
+ lhs = get_function_part_constraint (fi, fi_uses);
+ rhs = get_function_part_constraint (cfi, fi_uses);
+ process_constraint (new_constraint (lhs, rhs));

I don't see why you need those.  The solver should compute these
in even better precision (context sensitive on the call side).

The same is true for the function parameter.  That is, the only
needed part of the patch should be that making sure we see
the "direct" call and assign parameters correctly.



Dropped this bit.

OK for stage3 trunk if bootstrap and reg-test succeeds?

Thanks,
- Tom


Handle BUILT_IN_GOMP_PARALLEL in ipa-pta

2015-11-30  Tom de Vries  

	PR tree-optimization/46032
	* tree-ssa-structalias.c (find_func_aliases_for_call_arg): New function,
	factored out of ...
	(find_func_aliases_for_call): ... here.
	(find_func_aliases_for_builtin_call, find_func_clobbers): Handle
	BUILT_IN_GOMP_PARALLEL.
	(ipa_pta_execute): Same.  Handle node->parallelized_function as a local
	function.

	* gcc.dg/pr46032.c: New test.

	* testsuite/libgomp.c/pr46032.c: New test.

---
 gcc/testsuite/gcc.dg/pr46032.c| 47 +++
 gcc/tree-ssa-structalias.c| 60 +++
 libgomp/testsuite/libgomp.c/pr46032.c | 44 +
 3 files changed, 138 insertions(+), 13 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/pr46032.c b/gcc/testsuite/gcc.dg/pr46032.c
new file mode 100644
index 000..b91190e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr46032.c
@@ -0,0 +1,47 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fopenmp -ftree-vectorize -std=c99 -fipa-pta -fdump-tree-vect-all" } */
+
+extern void abort (void);
+
+#define nEvents 1000
+
+static void __attribute__((noinline, noclone, optimize("-fno-tree-vectorize")))
+init (unsigned *results, unsigned *pData)
+{
+  unsigned int i;
+  for (i = 0; i < nEvents; ++i)
+pData[i] = i % 3;
+}
+
+static void __attribute__((noinline, noclone, optimize("-fno-tree-vectorize")))
+check (unsigned *results)
+{
+  unsigned sum = 0;
+  for (int idx = 0; idx < (int)nEvents; idx++)
+sum += results[idx];
+
+  if (sum != 1998)
+abort ();
+}
+
+int
+main (void)
+{
+  unsigned results[nEvents];
+  unsigned pData[nEvents];
+  unsigned coeff = 2;
+
+  init (&results[0], &pData[0]);
+
+#pragma omp parallel for
+  for (int idx = 0; idx < (int)nEvents; idx++)
+results[idx] = coeff * pData[idx];
+
+  check (&results[0]);
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "note: vectorized 1 loop" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-not "versioning for alias required" "vect" } } */
+
diff --git a/gcc/tree-ssa-structalias.c 

[c-family] Fix -fdump-ada-spec ordering issue in C++

2015-11-30 Thread Eric Botcazou
This fixes an ordering issue in the Ada code generated by the -fdump-ada-spec 
option with the C++ compiler on structures/unions with nested anonymous arrays 
of structures/unions.

Given that this only affects the Ada code generated by -fdump-ada-spec and has 
no effect whatsoever on the C and C++ compilers, I have already installed it.

Tested on x86_64-suse-linux, applied on the mainline.

2015-11-30  Eric Botcazou  

c-family/
* c-ada-spec.c (print_ada_macros): Remove redundant blank line.
(decl_sloc_common): Delete and move bulk of processing to...
(decl_sloc): ...here.
(pp_ada_tree_identifier): Remove reference to QUAL_UNION_TYPE.
(dump_ada_double_name): Remove S parameter and compute the suffix.
(dump_ada_array_type): Add PARENT parameter.  Simplify computation of
element type and deal with an anonymous one.
(dump_ada_template): Use RECORD_OR_UNION_TYPE_P macro.
(dump_generic_ada_node): Tweak.  Adjust call to dump_ada_array_type
and remove reference to QUAL_UNION_TYPE.
(dump_nested_types): Make 2 passes on the fields and move bulk to...
(dump_nested_type): ...here.  New function extracted from above.
Generate a full declaration for anonymous element type of arrays.
(print_ada_declaration): Really skip anonymous declarations.  Remove
references to QUAL_UNION_TYPE.  Adjust call to dump_ada_array_type.
Clean up processing of declarations of array types and objects.
(print_ada_struct_decl): Remove reference to QUAL_UNION_TYPE.
Remove obsolete code and tidy up.


2015-11-30  Eric Botcazou  

* gcc.dg/dump-ada-spec-1.c: Move to...
* c-c++-common/dump-ada-spec-1.c: ...here.
* c-c++-common/dump-ada-spec-2.c: New test.

-- 
Eric Botcazou

Index: c-ada-spec.c
===
--- c-ada-spec.c	(revision 231010)
+++ c-ada-spec.c	(working copy)
@@ -375,7 +375,7 @@ print_ada_macros (pretty_printer *pp, cp
 	{
 	  expanded_location sloc = expand_location (macro->line);
 
-	  if (sloc.line != prev_line + 1)
+	  if (sloc.line != prev_line + 1 && prev_line > 0)
 	pp_newline (pp);
 
 	  num_macros++;
@@ -500,39 +500,28 @@ dump_ada_macros (pretty_printer *pp, con
 
 static const char *source_file_base;
 
-/* Compare the declaration (DECL) of struct-like types based on the sloc of
-   their last field (if LAST is true), so that more nested types collate before
-   less nested ones.
-   If ORIG_TYPE is true, also consider struct with a DECL_ORIGINAL_TYPE.  */
+/* Return sloc of DECL, using sloc of last field if LAST is true.  */
 
-static location_t
-decl_sloc_common (const_tree decl, bool last, bool orig_type)
+location_t
+decl_sloc (const_tree decl, bool last)
 {
-  tree type = TREE_TYPE (decl);
+  tree field;
 
+  /* Compare the declaration of struct-like types based on the sloc of their
+ last field (if LAST is true), so that more nested types collate before
+ less nested ones.  */
   if (TREE_CODE (decl) == TYPE_DECL
-  && (orig_type || !DECL_ORIGINAL_TYPE (decl))
-  && RECORD_OR_UNION_TYPE_P (type)
-  && TYPE_FIELDS (type))
+  && !DECL_ORIGINAL_TYPE (decl)
+  && RECORD_OR_UNION_TYPE_P (TREE_TYPE (decl))
+  && (field = TYPE_FIELDS (TREE_TYPE (decl))))
 {
-  tree f = TYPE_FIELDS (type);
-
   if (last)
-	while (TREE_CHAIN (f))
-	  f = TREE_CHAIN (f);
-
-  return DECL_SOURCE_LOCATION (f);
+	while (DECL_CHAIN (field))
+	  field = DECL_CHAIN (field);
+  return DECL_SOURCE_LOCATION (field);
 }
-  else
-return DECL_SOURCE_LOCATION (decl);
-}
 
-/* Return sloc of DECL, using sloc of last field if LAST is true.  */
-
-location_t
-decl_sloc (const_tree decl, bool last)
-{
-  return decl_sloc_common (decl, last, false);
+  return DECL_SOURCE_LOCATION (decl);
 }
 
 /* Compare two locations LHS and RHS.  */
@@ -1258,7 +1247,6 @@ pp_ada_tree_identifier (pretty_printer *
 		  case ARRAY_TYPE:
 		  case RECORD_TYPE:
 		  case UNION_TYPE:
-		  case QUAL_UNION_TYPE:
 		  case TYPE_DECL:
 		if (package_prefix)
 		  {
@@ -1373,10 +1361,10 @@ dump_ada_decl_name (pretty_printer *buff
 }
 }
 
-/* Dump in BUFFER a name based on both T1 and T2, followed by S.  */
+/* Dump in BUFFER a name based on both T1 and T2 followed by a suffix.  */
 
 static void
-dump_ada_double_name (pretty_printer *buffer, tree t1, tree t2, const char *s)
+dump_ada_double_name (pretty_printer *buffer, tree t1, tree t2)
 {
   if (DECL_NAME (t1))
 pp_ada_tree_identifier (buffer, DECL_NAME (t1), t1, false);
@@ -1396,7 +1384,21 @@ dump_ada_double_name (pretty_printer *bu
   pp_scalar (buffer, "%d", TYPE_UID (TREE_TYPE (t2)));
 }
 
-  pp_string (buffer, s);
+  switch (TREE_CODE (TREE_TYPE (t2)))
+{
+case ARRAY_TYPE:
+  pp_string (buffer, "_array");
+  break;
+case RECORD_TYPE:
+  pp_string (buffer, 
