Re: Type representation in CTF and DWARF

2019-10-08 Thread Indu Bhagat




On 10/08/2019 08:37 AM, Pedro Alves wrote:

On 10/4/19 8:23 PM, Indu Bhagat wrote:

Hello,

At GNU Tools Cauldron this year, some folks were curious to know more on how
the "type representation" in CTF compares vis-a-vis DWARF.

I was one of those, and I brought this up to Jose, after your
presentation.  Glad to see the follow up!  Thanks much for this.

In your Cauldron presentation we saw CTF compared to full blown DWARF
as justification for CTF,


Hmm. I thought I had made the effort required to clarify my position: comparing
full-blown DWARF sizes to type-only CTF section sizes is not appropriate, let
alone usable as a justification for CTF. My intention in showing those numbers
was only to give some perspective to users curious about the sizes of CTF debug
info (as generated by dwarf2ctf), because these sections will ideally not be
stripped out of shipped binaries.

The justification for CTF is, and will remain, a compact, faster debug format
for type information, with support for some online debugging use cases (like
backtraces) in the future.


but I was more interested in a comparison between
CTF and a DWARF subset containing exactly only what you have available in
CTF.  Because if DWARF with everything-you-don't-need stripped out
is in the same ballpark, then I am puzzled on why add/maintain a new
Debug format, with all the duplication of effort that entails going
forward.


I shared some numbers on this in the previous emails in this thread. I thought
comparing against DWARF's de-duplication-amenable offering (using
-fdebug-types-section) would be useful in this context.

For binaries compiled with -fdebug-types-section -gdwarf-4, here is some data.
The CTF sections are generated with dwarf2ctf because link-time CTF de-dup is
still being worked on. The end result of link-time CTF de-dup is expected
to be on par with these .ctf section sizes.

The .ctf section sizes below include the CTF string table (.debug_str is
excluded from the calculations however):

(coreutils-0.22)
         .debug_info(D1) | .debug_abbrev(D2) | .debug_str | .debug_types(D3) | .ctf (uncompressed) | ratio (.ctf/(D1+D2+D3))
ls                109806 |             18876 |      22042 |            12413 |               26240 | 0.18
pwd                27902 |              7914 |      10851 |             5753 |               13929 | 0.33
groups             26920 |              8173 |      10674 |             5070 |               13378 | 0.33

(emacs-26.3)
         .debug_info(D1) | .debug_abbrev(D2) | .debug_str | .debug_types(D3) | .ctf (uncompressed) | ratio (.ctf/(D1+D2+D3))
emacs            3755083 |            202926 |     431926 |           143462 |              273910 | 0.06
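
For anyone who wants to recompute the ratio column, here is a minimal sketch of
the arithmetic. The struct and function names are made up for illustration;
only the formula (.ctf / (D1 + D2 + D3), with .debug_str left out) and the byte
counts come from the table. The table values appear to be truncated to two
decimals (for example, ls works out to roughly 0.186).

```cpp
#include <cassert>
#include <cmath>

// Section sizes in bytes for one binary, as listed in the table.
// (Hypothetical helper type, used only for this illustration.)
struct section_sizes
{
  double debug_info;    // D1
  double debug_abbrev;  // D2
  double debug_types;   // D3
  double ctf;           // uncompressed .ctf, CTF string table included
};

// ratio = .ctf / (D1 + D2 + D3); .debug_str is deliberately excluded.
double ctf_ratio(const section_sizes& s)
{
  return s.ctf / (s.debug_info + s.debug_abbrev + s.debug_types);
}
```

Calling ctf_ratio with the ls row (109806, 18876, 12413, 26240) gives a value
just under 0.19, and the emacs row gives a value just under 0.07.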


It is not easy to get an estimate of 'DWARF with everything-you-don't-need
stripped out'. At this time, I don't know of an easy way to make this comparison
more meaningful. Any suggestions?


Also, it's my understanding that the current CTF format doesn't yet
support C++, Vector registers, etc., maybe other things, so if DWARF
was sufficient for your needs, then in the long run it sounds like
a better option to me, as then you wouldn't have to extend CTF _and_
DWARF whenever some feature is needed.


Yes, CTF does not support C++ at this time. To cover all of C (including
GNU C extensions), we need to add representations for things like vector types,
non-IEEE floats, etc. (somewhat infrequently occurring constructs).

The issue is not that DWARF cannot represent the required type information.
First, DWARF is voluminous; second, the current workflow to get to CTF from
source programs without direct toolchain support is tiresome and lengthy.

For current and future users of CTF, having the support for the format in the
toolchain is the best way to promote adoption and enhance community experience.


Maybe it would make sense to work on integrating CTF into the DWARF
standard itself, not sure?

I was also curious on your plans for adding unwinding support to CTF,
while the kernel (the main CTF user, IIUC), already has plans to
use its own unwinding format (ORC)?


The kernel's unwinding format (ORC) helps generate backtraces with function
identifiers. For some (ORCL) internal customers, the requirement is to go beyond
that and support input argument values. The requirement there is to generate
backtraces quickly, without relying on DWARF.


So with all those questions, I came out of the presentation
thinking that I could not really justify CTF if I were asked to.


Thanks for discussing this openly. I believe there are other GCC
maintainers who are undecided as well :)

I hope I have answered some of your concerns.


(Side note: the Cauldron page is missing slides for your
presentation, so I couldn't go and recheck some things
mentioned above.)

Thanks,
Pedro Alves


I mailed the organizers my slides. They should be online soon.

Thanks



Re: copy/copy_backward/fill/fill_n/equal rework

2019-10-08 Thread François Dumont
Following the recently committed patches, some changes that couldn't be
committed then are now part of this patch.


Moreover, while testing the istreambuf_iterator std::copy changes I realized
that this specialization was broken because the order of function declarations
in stl_algobase.h was wrong. I'll check whether I can find a way to confirm that
a given overload is indeed being called.


So here is this patch again.

François

On 9/27/19 11:14 PM, François Dumont wrote:

On 9/27/19 2:28 PM, Jonathan Wakely wrote:

On 09/09/19 20:34 +0200, François Dumont wrote:

Hi

    This patch improves stl_algobase.h 
copy/copy_backward/fill/fill_n/equal implementations. The 
improvements are:


- activation of algo specialization for __gnu_debug::_Safe_iterator 
(w/o _GLIBCXX_DEBUG mode)


- activation of algo specialization for _Deque_iterator even if 
mixed with another kind of iterator.


- activation of algo specializations __copy_move_a2 for something 
else than pointers. For example this code:


std::vector<char> v { 'a', 'b',  };

ostreambuf_iterator<char> out(std::cout);

std::copy(v.begin(), v.end(), out);

is not calling the specialization __copy_move_a2(const char*, const 
char*, ostreambuf_iterator<>);
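
For reference, a self-contained variant of the snippet above might look like
the sketch below. It writes into a std::ostringstream rather than std::cout so
the result can be inspected; copy_to_stream is a made-up name. Which internal
overload std::copy ends up selecting is an implementation detail of the
library and cannot be observed from this code.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Copies a vector<char> into a stream buffer through an
// ostreambuf_iterator and returns what was written.
// (Hypothetical helper, for illustration only.)
std::string copy_to_stream(const std::vector<char>& v)
{
  std::ostringstream os;
  std::ostreambuf_iterator<char> out(os);
  std::copy(v.begin(), v.end(), out);
  return os.str();
}
```

With v holding 'a', 'b', 'c' the function returns "abc".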


It also fixes a _GLIBCXX_DEBUG issue where the __niter_base 
specialization was wrongly removing the _Safe_iterator<> layer. The 
testsuite/25_algorithms/copy/debug/1_neg.cc test case was failing on 
a debug assertion because _after_ the copy we were trying to 
increment the vector iterator past the end. Of course the 
problem is the _after_: Debug mode should detect this _before_ it 
takes place, which it does now.


Note that std::fill_n is now making use of std::fill for some 
optimizations dealing with random access iterators.
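
The idea of routing fill_n through fill for random access iterators can be
sketched roughly as follows. This is not the libstdc++ code from the patch:
fill_n_sketch and fill_n_impl are hypothetical names, and the dispatch is done
with std::is_base_of on the iterator category purely for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <list>
#include <type_traits>
#include <vector>

// Random access iterators: the end of the range is first + n, so the
// work can be handed to std::fill and benefit from its optimizations.
template<typename It, typename Size, typename T>
It fill_n_impl(It first, Size n, const T& value, std::true_type)
{
  if (n <= 0)
    return first;
  It last = first + n;
  std::fill(first, last, value);
  return last;
}

// Any other iterator category: plain element-by-element loop.
template<typename It, typename Size, typename T>
It fill_n_impl(It first, Size n, const T& value, std::false_type)
{
  for (; n > 0; --n, ++first)
    *first = value;
  return first;
}

// Entry point: picks one of the two overloads at compile time.
template<typename It, typename Size, typename T>
It fill_n_sketch(It first, Size n, const T& value)
{
  typedef typename std::iterator_traits<It>::iterator_category category;
  return fill_n_impl(first, n, value,
                     std::is_base_of<std::random_access_iterator_tag,
                                     category>());
}
```

A std::vector iterator takes the std::fill path, while a std::list iterator
falls back to the loop; both return the iterator one past the last element
written.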


Performances are very good:


This looks good, but I'm unable to apply the patch:


error: patch failed: libstdc++-v3/include/bits/deque.tcc:967
error: libstdc++-v3/include/bits/deque.tcc: patch does not apply
error: patch failed: libstdc++-v3/include/bits/stl_algobase.h:499
error: libstdc++-v3/include/bits/stl_algobase.h: patch does not apply

Could you regenerate the patch (against a clean master tree) and
resend? Thanks.


Here it is, thanks.



diff --git a/libstdc++-v3/include/bits/deque.tcc b/libstdc++-v3/include/bits/deque.tcc
index 3f77b4f079c..ab996ca52fa 100644
--- a/libstdc++-v3/include/bits/deque.tcc
+++ b/libstdc++-v3/include/bits/deque.tcc
@@ -967,155 +967,247 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   this->_M_impl._M_finish._M_set_node(__new_nstart + __old_num_nodes - 1);
 }
 
+_GLIBCXX_END_NAMESPACE_CONTAINER
+
   // Overload for deque::iterators, exploiting the "segmented-iterator
   // optimization".
-  template
+  template
 void
-fill(const _Deque_iterator<_Tp, _Tp&, _Tp*>& __first,
-	 const _Deque_iterator<_Tp, _Tp&, _Tp*>& __last, const _Tp& __value)
+__fill_a1(const _GLIBCXX_STD_C::_Deque_iterator<_Tp, _Tp&, _Tp*>& __first,
+	  const _GLIBCXX_STD_C::_Deque_iterator<_Tp, _Tp&, _Tp*>& __last,
+	  const _VTp& __value)
 {
-  typedef typename _Deque_iterator<_Tp, _Tp&, _Tp*>::_Self _Self;
+  typedef _GLIBCXX_STD_C::_Deque_iterator<_Tp, _Tp&, _Tp*> _Iter;
+  if (__first._M_node != __last._M_node)
+	{
+	  std::__fill_a1(__first._M_cur, __first._M_last, __value);
 
-  for (typename _Self::_Map_pointer __node = __first._M_node + 1;
-   __node < __last._M_node; ++__node)
-	std::fill(*__node, *__node + _Self::_S_buffer_size(), __value);
+	  for (typename _Iter::_Map_pointer __node = __first._M_node + 1;
+	   __node < __last._M_node; ++__node)
+	std::__fill_a1(*__node, *__node + _Iter::_S_buffer_size(), __value);
+
+	  std::__fill_a1(__last._M_first, __last._M_cur, __value);
+	}
+  else
+	std::__fill_a1(__first._M_cur, __last._M_cur, __value);
+}
 
+  template
+_OI
+__copy_move_dit(_GLIBCXX_STD_C::_Deque_iterator<_Tp, _Ref, _Ptr> __first,
+		_GLIBCXX_STD_C::_Deque_iterator<_Tp, _Ref, _Ptr> __last,
+		_OI __result)
+{
+  typedef _GLIBCXX_STD_C::_Deque_iterator<_Tp, _Ref, _Ptr> _Iter;
   if (__first._M_node != __last._M_node)
 	{
-	  std::fill(__first._M_cur, __first._M_last, __value);
-	  std::fill(__last._M_first, __last._M_cur, __value);
+	  __result
+	= std::__copy_move_a1<_IsMove>(__first._M_cur, __first._M_last,
+	   __result);
+
+	  for (typename _Iter::_Map_pointer __node = __first._M_node + 1;
+	   __node != __last._M_node; ++__node)
+	__result
+	  = std::__copy_move_a1<_IsMove>(*__node,
+	 *__node + _Iter::_S_buffer_size(),
+	 __result);
+
+	  return std::__copy_move_a1<_IsMove>(__last._M_first, __last._M_cur,
+	  __result);
 	}
-  else
-	std::fill(__first._M_cur, __last._M_cur, __value);
+
+  return std::__copy_move_a1<_IsMove>(__first._M_cur, __last._M_cur,
+	  __result);
 }
 
-  template
-_Deque_iterator<_Tp, _Tp&, _Tp*>
-copy(_Deque_iterator<_Tp, const _Tp&, const _Tp*> __first,
-	 _Deque_iterator<_Tp, const _Tp&, 

[PATCH] Review std::copy istreambuf_iterator specialization

2019-10-08 Thread François Dumont

Hi

    Following what has been done for std::copy_n I think we could 
simplify the __copy_move_a2 overload to also use sgetn. Code is simpler 
and we avoid a friend declaration.
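
As an illustration of the in_avail/sgetn approach using only the public
streambuf interface, here is a minimal sketch. drain_available is a made-up
name and the function returns a std::string instead of writing through a raw
pointer, so it is not the code in the patch. Note that in_avail() only reports
characters the buffer already knows to be available, so the loop stops as soon
as that count drops to zero or below.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <streambuf>
#include <string>

// Reads everything the stream buffer reports as available, in chunks,
// using only in_avail() and sgetn().  (Hypothetical helper.)
std::string drain_available(std::streambuf& sb)
{
  std::string out;
  std::streamsize avail = sb.in_avail();
  while (avail > 0)
    {
      const std::size_t old_size = out.size();
      out.resize(old_size + static_cast<std::size_t>(avail));
      // sgetn returns the number of characters actually copied.
      const std::streamsize got = sb.sgetn(&out[old_size], avail);
      out.resize(old_size + static_cast<std::size_t>(got));
      if (got == 0)
        break;
      avail = sb.in_avail();
    }
  return out;
}
```

For a std::stringbuf initialised with "hello world", the first call returns the
whole string and a second call returns an empty string.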


    Tested under Linux x86_64.


    * include/std/streambuf (__copy_move_a2): Remove friend declaration.
    * include/bits/streambuf_iterator.h (__copy_move_a2): Re-implement
    using streambuf in_avail and sgetn.

    Ok to commit?

François

diff --git a/libstdc++-v3/include/bits/streambuf_iterator.h b/libstdc++-v3/include/bits/streambuf_iterator.h
index e3e8736e768..134b3486b9a 100644
--- a/libstdc++-v3/include/bits/streambuf_iterator.h
+++ b/libstdc++-v3/include/bits/streambuf_iterator.h
@@ -345,31 +345,19 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 		   istreambuf_iterator<_CharT> __last, _CharT* __result)
 {
   typedef istreambuf_iterator<_CharT>		   __is_iterator_type;
-  typedef typename __is_iterator_type::traits_type	   traits_type;
   typedef typename __is_iterator_type::streambuf_type  streambuf_type;
-  typedef typename traits_type::int_type		   int_type;
 
   if (__first._M_sbuf && !__last._M_sbuf)
 	{
 	  streambuf_type* __sb = __first._M_sbuf;
-	  int_type __c = __sb->sgetc();
-	  while (!traits_type::eq_int_type(__c, traits_type::eof()))
+	  std::streamsize __avail = __sb->in_avail();
+	  while (__avail > 0)
 	{
-	  const streamsize __n = __sb->egptr() - __sb->gptr();
-	  if (__n > 1)
-		{
-		  traits_type::copy(__result, __sb->gptr(), __n);
-		  __sb->__safe_gbump(__n);
-		  __result += __n;
-		  __c = __sb->underflow();
-		}
-	  else
-		{
-		  *__result++ = traits_type::to_char_type(__c);
-		  __c = __sb->snextc();
-		}
+	  __result += __sb->sgetn(__result, __avail);
+	  __avail = __sb->in_avail();
 	}
 	}
+
   return __result;
 }
 
diff --git a/libstdc++-v3/include/std/streambuf b/libstdc++-v3/include/std/streambuf
index d9ca981d704..3442f19bd78 100644
--- a/libstdc++-v3/include/std/streambuf
+++ b/libstdc++-v3/include/std/streambuf
@@ -149,12 +149,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   friend streamsize
   __copy_streambufs_eof<>(basic_streambuf*, basic_streambuf*, bool&);
 
-  template
-friend typename __gnu_cxx::__enable_if<__is_char<_CharT2>::__value,
-	   _CharT2*>::__type
-__copy_move_a2(istreambuf_iterator<_CharT2>,
-		   istreambuf_iterator<_CharT2>, _CharT2*);
-
   template
 friend typename __gnu_cxx::__enable_if<__is_char<_CharT2>::__value,
   istreambuf_iterator<_CharT2> >::__type


Ping: [PATCH V2] Loop split upon semi-invariant condition (PR tree-optimization/89134)

2019-10-08 Thread Feng Xue OS
Hi, Michael,

   Would you please take a look at this modified version?

Thanks,
Feng


From: Feng Xue OS 
Sent: Thursday, September 12, 2019 6:21 PM
To: Michael Matz
Cc: Richard Biener; gcc-patches@gcc.gnu.org
Subject: Re: Ping agian: [PATCH V2] Loop split upon semi-invariant condition (PR tree-optimization/89134)

Hi, Michael,

  Since I was involved in other tasks, this reply is a little late. Sorry
for that. I have composed a new version with your suggestions. Please review it
at your convenience.

> Generally I do like the idea of the transformation, and the basic building
> blocks seem to be sound.  But I dislike it being a separate pass, so
> please integrate the code you have written into the existing loop split
> pass.  Most building blocks can be used as is, except the main driver.
This new transformation has been integrated into the existing loop split pass.

>> +@item max-cond-loop-split-insns
>> +The maximum number of insns to be increased due to loop split on
>> +semi-invariant condition statement.

> "to be increased" --> "to be generated" (or "added")
Done.

>> +@item min-cond-loop-split-prob
>> +The minimum threshold for probability of semi-invaraint condition
>> +statement to trigger loop split.

> typo, semi-invariant
Done.

> I think somewhere in the docs your definition of semi-invariant needs
> to be stated in some form (can be short, doesn't need to reproduce the
> diagram or such), so don't just replicate the short info from the
> params.def file.
Done.

>> +DEFPARAM(PARAM_MIN_COND_LOOP_SPLIT_PROB,
>> +   "min-cond-loop-split-prob",
>> +   "The minimum threshold for probability of semi-invaraint condition "
>> +   "statement to trigger loop split.",

> Same typo: "semi-invariant".
Done.

>> -/* This file implements loop splitting, i.e. transformation of loops like
>> +/* This file implements two kind of loop splitting.

> kind_s_, plural
Done.

>> +/* Another transformation of loops like:
>> +
>> +   for (i = INIT (); CHECK (i); i = NEXT ())
>> + {
>> +   if (expr (a_1, a_2, ..., a_n))
>> + a_j = ...;  // change at least one a_j
>> +   else
>> + S;  // not change any a_j
>> + }

> You should mention that 'expr' needs to be pure, i.e. once it
> becomes false and the inputs don't change, that it remains false.
Done.
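
To make the shape of the transformation concrete, here is a hand-written
before/after sketch at the source level. The function names and the particular
condition (a < limit) are invented for illustration, and the real pass works on
GIMPLE, not on source code. The condition depends only on a, which is changed
only in the true branch, so once it evaluates to false it stays false and the
remaining iterations can run in a second loop without the check.

```cpp
#include <cassert>

// Original loop: the condition is semi-invariant because 'a' is only
// modified when the condition is true.
long before_split(int n, int limit)
{
  int a = 0;
  long sum = 0;
  for (int i = 0; i < n; ++i)
    {
      if (a < limit)
        a += 2;      // changes a
      else
        sum += i;    // does not change a
    }
  return sum + a;
}

// Hand-split version: the first loop runs until the condition first
// becomes false, the second loop runs the rest without testing it.
long after_split(int n, int limit)
{
  int a = 0;
  long sum = 0;
  int i = 0;
  for (; i < n; ++i)
    {
      if (a < limit)
        a += 2;
      else
        {
          sum += i;
          ++i;       // this iteration is done; resume at the next one
          break;
        }
    }
  for (; i < n; ++i)
    sum += i;        // condition known to be false from here on
  return sum + a;
}
```

Both functions return the same value for any n and limit; for example, with
n = 5 and limit = 4 each returns 13.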

>> +static bool
>> +branch_removable_p (basic_block branch_bb)
>> +{
>> +  if (single_pred_p (branch_bb))
>> +return true;
>> +
>> +  edge e;
>> +  edge_iterator ei;
>> +
>> +  FOR_EACH_EDGE (e, ei, branch_bb->preds)
>> +{
>> +  if (dominated_by_p (CDI_DOMINATORS, e->src, branch_bb))
>> +   continue;
>> +
>> +  if (dominated_by_p (CDI_DOMINATORS, branch_bb, e->src))
>> +   continue;

> My gut feeling is surprised by this.  So one of the predecessors of
> branch_bb dominates it.  Why should that indicate that branch_bb
> can be safely removed?
>
> Think about something like this:
>
>   esrc --> cond_bb --> branch_bb
>   '---^

If all predecessors of branch_bb dominate it, these predecessors must also
be in a dominance relationship with one another, the conditional statement must
be branch_bb's immediate dominator, and branch_bb is removable.

In your example, for "esrc" the loop simply continues and nothing is affected.
But in the next iteration we encounter "cond_bb", which does not dominate
"branch_bb", so the function returns false in the following return statement.

> (cond_bb is the callers bb of the cond statement in question).  Now esrc
> dominates branch_bb but still you can't simply remove it, even if
> the cond_bb->branch_bb edge becomes unexecutable.


>> +static int
>> +get_cond_invariant_branch (struct loop *loop, gcond *cond)

> Please return an edge here, not an edge index (which removes the using of
> -1).  I think the name (and comment) should consistently talk about
> semi-invariant, not invariant.  For when you really need an edge index
> later, you could use "EDGE_SUCC(bb, 0) != edge".  But you probably don't
> really need it, e.g. instead of using the gimple pass-local-flag on a
> statement you can just as well also store the edge in your info structure.
Done.

>> +static bool
>> +is_cond_in_hidden_loop (const struct loop *loop, basic_block cond_bb,
>> +   int branch)

> With above change in get_cond_invariant_branch, this also should
> take an edge, not a bb+edge-index.
Done.

>> +static int
>> +compute_added_num_insns (struct loop *loop, const_basic_block cond_bb,
>> +int branch)

> This should take an edge as well.
Done.

>> +  for (unsigned i = 0; i < loop->num_nodes; i++)
>> +{
>> +  /* Do no count basic blocks only in opposite branch.  */
>> +  if (dominated_by_p (CDI_DOMINATORS, bbs[i], targ_bb_var))
>> +   continue;
>> +
>> +  for (gimple_stmt_iterator gsi = gsi_start_bb (bbs[i]); !gsi_end_p 
>> (gsi);
>> +  gsi_next ())
>> +   num += 

Re: [SVE] PR86753

2019-10-08 Thread Prathamesh Kulkarni
On Tue, 8 Oct 2019 at 13:21, Richard Sandiford
 wrote:
>
> Leaving the main review to Richard, just some comments...
>
> Prathamesh Kulkarni  writes:
> > @@ -9774,6 +9777,10 @@ vect_is_simple_cond (tree cond, vec_info *vinfo,
> >
> > When STMT_INFO is vectorized as a nested cycle, for_reduction is true.
> >
> > +   For COND_EXPR if T comes from masked load, and is conditional
> > +   on C, we apply loop mask to result of vector comparison, if it's 
> > present.
> > +   Similarly for E, if it is conditional on !C.
> > +
> > Return true if STMT_INFO is vectorizable in this way.  */
> >
> >  bool
>
> I think this is a bit misleading.  But IMO it'd be better not to have
> a comment here and just rely on the one in the main function body.
> This optimisation isn't really changing the vectorisation strategy,
> and the comment could easily get forgotten if things change in future.
>
> > [...]
> > @@ -,6 +10006,35 @@ vectorizable_condition (stmt_vec_info stmt_info, 
> > gimple_stmt_iterator *gsi,
> >/* Handle cond expr.  */
> >for (j = 0; j < ncopies; j++)
> >  {
> > +  tree loop_mask = NULL_TREE;
> > +  bool swap_cond_operands = false;
> > +
> > +  /* Look up if there is a loop mask associated with the
> > +  scalar cond, or it's inverse.  */
>
> Maybe:
>
>See whether another part of the vectorized code applies a loop
>mask to the condition, or to its inverse.
>
> > +
> > +  if (loop_vinfo && LOOP_VINFO_FULLY_MASKED_P (loop_vinfo))
> > + {
> > +   scalar_cond_masked_key cond (cond_expr, ncopies);
> > +   if (loop_vinfo->scalar_cond_masked_set.contains (cond))
> > + {
> > +   vec_loop_masks *masks = &LOOP_VINFO_MASKS (loop_vinfo);
> > +   loop_mask = vect_get_loop_mask (gsi, masks, ncopies, vectype, 
> > j);
> > + }
> > +   else
> > + {
> > +   bool honor_nans = HONOR_NANS (TREE_TYPE (cond.op0));
> > +   cond.code = invert_tree_comparison (cond.code, honor_nans);
> > +   if (loop_vinfo->scalar_cond_masked_set.contains (cond))
> > + {
> > +   vec_loop_masks *masks = &LOOP_VINFO_MASKS (loop_vinfo);
> > +   loop_mask = vect_get_loop_mask (gsi, masks, ncopies,
> > +   vectype, j);
> > +   cond_code = cond.code;
> > +   swap_cond_operands = true;
> > + }
> > + }
> > + }
> > +
> >stmt_vec_info new_stmt_info = NULL;
> >if (j == 0)
> >   {
> > @@ -10114,6 +10153,47 @@ vectorizable_condition (stmt_vec_info stmt_info, 
> > gimple_stmt_iterator *gsi,
> >   }
> >   }
> >   }
> > +
> > +   /* If loop mask is present, then AND it with
>
> Maybe "If we decided to apply a loop mask, ..."
>
> > +  result of vec comparison, so later passes (fre4)
>
> Probably better not to name the pass -- could easily change in future.
>
> > +  will reuse the same condition used in masked load.
>
> Could be a masked store, or potentially other things too.
> So maybe just "will reuse the masked condition"?
>
> > +
> > +  For example:
> > +  for (int i = 0; i < 100; ++i)
> > +x[i] = y[i] ? z[i] : 10;
> > +
> > +  results in following optimized GIMPLE:
> > +
> > +  mask__35.8_43 = vect__4.7_41 != { 0, ... };
> > +  vec_mask_and_46 = loop_mask_40 & mask__35.8_43;
> > +  _19 = &MEM[base: z_12(D), index: ivtmp_56, step: 4, offset: 0B];
> > +  vect_iftmp.11_47 = .MASK_LOAD (_19, 4B, vec_mask_and_46);
> > +  vect_iftmp.12_52 = VEC_COND_EXPR <vec_mask_and_46,
> > +                                    vect_iftmp.11_47, { 10, ... }>;
> > +
> > +  instead of recomputing vec != { 0, ... } in vec_cond_expr  */
>
> That's true, but gives the impression that avoiding the vec != { 0, ... }
> is the main goal, whereas we could do that just by forcing a three-operand
> COND_EXPR.  It's really more about making sure that vec != { 0, ... }
> and its masked form aren't both live at the same time.  So maybe:
>
>  instead of using a masked and unmasked forms of
>  vect__4.7_41 != { 0, ... } (masked in the MASK_LOAD,
>  unmasked in the VEC_COND_EXPR).  */
>
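
A scalar emulation of the masked form discussed above may help readers who are
not familiar with fully-masked loops. It is only a sketch: kLanes, select_masked
and the per-lane arrays are invented for illustration, real SVE code uses
predicate registers, and z is assumed to be at least as long as y. The point it
tries to show is that one combined mask (loop mask AND comparison) feeds both
the masked load and the select, so the unmasked comparison result is not needed
separately.

```cpp
#include <cassert>
#include <vector>

const int kLanes = 4;  // hypothetical vector length, in elements

// Emulates x[i] = y[i] ? z[i] : 10 one "vector" of kLanes at a time.
std::vector<int> select_masked(const std::vector<int>& y,
                               const std::vector<int>& z)
{
  const int n = static_cast<int>(y.size());
  std::vector<int> x(y.size());
  for (int base = 0; base < n; base += kLanes)
    {
      bool loop_mask[kLanes];
      bool mask_and[kLanes];
      int loaded[kLanes];
      for (int l = 0; l < kLanes; ++l)
        {
          // Lanes past the end of the arrays are inactive.
          loop_mask[l] = base + l < n;
          // vec_mask_and = loop_mask & (y != 0)
          mask_and[l] = loop_mask[l] && y[base + l] != 0;
          // Masked load: inactive lanes do not touch memory.
          loaded[l] = mask_and[l] ? z[base + l] : 0;
        }
      for (int l = 0; l < kLanes; ++l)
        if (loop_mask[l])
          // The select reuses mask_and instead of the raw comparison.
          x[base + l] = mask_and[l] ? loaded[l] : 10;
    }
  return x;
}
```

With y = {1, 0, 2, 0, 0, 3} and z = {5, 6, 7, 8, 9, 11} the result is
{5, 10, 7, 10, 10, 11}.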
Hi Richard,
Thanks for the suggestions, I have updated comments in the attached patch.

Thanks,
Prathamesh
> Thanks,
> Richard
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_cnot_2.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_cnot_2.c
index d689e21dc11..3df2431be38 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/cond_cnot_2.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_cnot_2.c
@@ -32,4 +32,4 @@ TEST_ALL (DEF_LOOP)
 /* { dg-final { scan-assembler-not {\tmov\tz} } } */
 /* { dg-final { scan-assembler-not {\tmovprfx\t} } } */
 /* Currently we canonicalize the ?: so that !b[i] is the "false" value.  */
-/* { dg-final { scan-assembler-not {\tsel\t} { xfail *-*-* } } } */
+/* { dg-final 

Re: [EXTERNAL]Re: Re: [PATCH] [MIPS] Fix PR target/91769

2019-10-08 Thread Paul Hua
Hi,

Thanks for explaining that.
Adding isa_rev=2 and -mfpxx to dg-options fixes the fallout.

On Sun, Oct 6, 2019 at 8:03 PM Dragan Mladjenovic
 wrote:
>
>
>
> On 06.10.2019. 08:43, Paul Hua wrote:
> > Hi:
> >
> > The testsuite has a typo in "dg-final scan-assembler", s/mthc1/mtc1/.
> >
>
>
> Hi,
>
> I think I know what is happening here. My testing setup defaults to
> -mfpxx and yours probably to -mfp32. I should have probably tightened
> the test up to require R2 isa as well.
> Does adding isa_rev=2 and -mfpxx to dg-options fix the fallout from this
> test? I cannot check it right now, but I can send the fix for this
> tomorrow. Sorry for the inconvenience.
>
> Best regards,
> Dragan


Re: GCC wwwdocs move to git done

2019-10-08 Thread Frank Ch. Eigler
Hi -

Thanks - good job with moving this to git!

> Note 1: someone with the right access needs to create the symlink 
> /sourceware/git/gcc-wwwdocs.git -> 
> /sourceware/projects/gcc-home/wwwdocs.git (and anything else needed for 
> anonymous git access to that repository).

Done.

- FChE


GCC wwwdocs move to git done

2019-10-08 Thread Joseph Myers
I've done the move of GCC wwwdocs to git (using the previously posted and 
discussed scripts), including setting up the post-receive hook to do the 
same things previously covered by the old CVS hooks, and minimal updates 
to the web pages dealing with the CVS setup for wwwdocs.

Note 1: someone with the right access needs to create the symlink 
/sourceware/git/gcc-wwwdocs.git -> 
/sourceware/projects/gcc-home/wwwdocs.git (and anything else needed for 
anonymous git access to that repository).  Once that's done I'll simplify 
the path given for git checkouts in about.html and restore instructions 
for anonymous checkout.

Note 2: changes may be needed to the process for updating www.gnu.org and 
Gerald's validator.

Note 3: I don't see any reason the automated process (see 
maintainer-scripts/update_web_docs_svn in the main GCC sources repository) 
for generating install/ documentation should be broken by this, but that 
will still need watching carefully to see if it works as expected.

diff --git a/bin/post-receive b/bin/post-receive
new file mode 100755
index ..1114efcb
--- /dev/null
+++ b/bin/post-receive
@@ -0,0 +1,55 @@
+#!/bin/bash
+
+# The post-receive hook receives, on standard input, a series of lines
+# of the form:
+#
+#   <old-value> SP <new-value> SP <ref-name> LF
+#
+# describing ref updates that have just occurred.  In the case of this
+# repository, only updates to refs/heads/master are expected, and only
+# such updates should result in automatic website updates.
+
+TOP_DIR=/www/gcc
+
+exec >> "$TOP_DIR/updatelog" 2>&1
+
+export QMAILHOST=gcc.gnu.org
+
+tmp=$(mktemp)
+cat > "$tmp"
+
+# Send commit emails.  Appropriate config values should be set in the
+# git repository for this.
+/sourceware/libre/infra/bin/post-receive-email < "$tmp"
+
+# Update web page checkouts, if applicable.
+while read old_value new_value ref_name; do
+if [ "$ref_name" != "refs/heads/master" ]; then
+   continue
+fi
+unset GIT_DIR
+unset GIT_WORK_TREE
+cd "$TOP_DIR/wwwdocs-checkout"
+git pull --quiet
+# $TOP_DIR/bin, $TOP_DIR/cgi-bin and $TOP_DIR/htdocs-preformatted
+# should be symlinks into wwwdocs-checkout.
+git diff --name-only "$old_value..$new_value" | while read file; do
+   case "$file" in
+   (htdocs/*)
+   ;;
+   (*)
+   continue
+   ;;
+   esac
+   dir="${file%/*}"
+   if ! [ -d "$TOP_DIR/$dir" ]; then
+   mkdir "$TOP_DIR/$dir"
+   chmod 2775 "$TOP_DIR/$dir"
+   fi
+   if [ -f "$file" ]; then
+   /www/gcc/bin/preprocess "${file#htdocs/}"
+   fi
+done
+done < "$tmp"
+
+rm "$tmp"
diff --git a/bin/preprocess b/bin/preprocess
index 2d09d548..56f83838 100755
--- a/bin/preprocess
+++ b/bin/preprocess
@@ -160,7 +160,7 @@ process_file()
 fi
 
 case $f in
-*/CVS|*\.cvsignore)
+*/.git|*\.gitignore)
 ;;
 *\.ihtml|*\.mhtml)
 ;;
@@ -248,8 +248,8 @@ if [ $# -gt 0 ]; then
 done
 else 
 # Process all files in the source tree, excluding files/directories
-# called CVS.
-for f in `find . \( -name CVS -prune \) -o -type f -print` ; do
+# called .git.
+for f in `find . \( -name .git -prune \) -o -type f -print` ; do
 process_file $f
 done
 fi
diff --git a/htdocs/about.html b/htdocs/about.html
index 19dd080c..30a5c943 100644
--- a/htdocs/about.html
+++ b/htdocs/about.html
@@ -19,7 +19,7 @@ many
 https://gcc.gnu.org/onlinedocs/gcc/Contributors.html;>contributors
 .
 
-The web pages are under CVS control.
+The web pages are under git control.
 The pages on gcc.gnu.org are updated directly after a
 change has been committed. www.gnu.org is updated once a day at 4:00 -0700
 (PDT).
@@ -48,20 +48,16 @@ a higher chance of being implemented soon. ;-)
 
 
 
-Using the CVS repository
+Using the git repository
 
-Assuming you have both CVS 
+Assuming you have both git 
 and SSH installed, you can check out the web pages as follows:
 
 
- Set CVS_RSH in your environment to ssh.
- cvs -q -d :ext:username@gcc.gnu.org:/cvs/gcc checkout
--P wwwdocs where username is your user name at gcc.gnu.org
+ git clone 
git+ssh://username@gcc.gnu.org/sourceware/projects/gcc-home/wwwdocs.git
+ where username is your user name at gcc.gnu.org
 
 
-For anonymous access, use
--d :pserver:c...@gcc.gnu.org:/cvs/gcc instead.
-
 
 Validating a change
 
@@ -76,14 +72,14 @@ the https://validator.w3.org;>W3 Validator. 
Just use the
 and prefer that each checkin be of a complete, single logical change.
 
 
-Sync your sources with the master repository via "cvs
-update".
+Sync your sources with the master repository via "git pull".
 This will also identify any files in your local
 tree that you have modified.
 
-We recommend reviewing the output of "cvs diff".
+We recommend reviewing the output of "git diff".
 
-Use "cvs commit" to check in the patch.
+Use "git commit" and "git push origin
+master" to check in the patch.
 
 Upon checkin a message will be 

[PATCH] implement -Wrestrict for sprintf (PR 83688)

2019-10-08 Thread Martin Sebor

Attached is a resubmission of the -Wrestrict implementation for
the sprintf family of functions.  The original patch was posted
in 2017 but never approved.  This revision makes only a few minor
changes to the original code, mostly necessitated by rebasing on
the top of trunk.

The description from the original posting still applies today:

  The enhancement works by first determining the base object (or
  pointer) from the destination of the sprintf call, the constant
  offset into the object (and subobject for struct members), and
  its size.  For each %s argument, it then computes the same data.
  If it determines that overlap between the two is possible it
  stores the data for the directive argument (including the size
  of the argument) for later processing.  After the whole call and
  format string have been processed, the code then iterates over
  the stored directives and their arguments and compares the size
  and length of the argument against the function's overall output.
  If they overlap it issues a warning.
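
The final comparison step can be pictured with a small sketch. It assumes both
the destination and the %s argument have already been reduced to a constant
offset and a byte count relative to the same base object; byte_range and
may_overlap are hypothetical names, and the actual code in the patch deals with
ranges of possible offsets and sizes, which this sketch does not attempt.

```cpp
#include <cassert>
#include <cstddef>

// A run of bytes inside one base object: [offset, offset + size).
// (Hypothetical type, for illustration only.)
struct byte_range
{
  std::size_t offset;
  std::size_t size;
};

// True if the bytes written to the destination and the bytes read from
// the argument share at least one byte.
bool may_overlap(byte_range dst, byte_range src)
{
  if (dst.size == 0 || src.size == 0)
    return false;
  return dst.offset < src.offset + src.size
         && src.offset < dst.offset + dst.size;
}
```

For a call like sprintf (buf, "%s", buf + 2) where the argument is a
three-character string, the output occupies bytes [0, 4) and the argument
bytes [2, 6) including the terminating nul, so may_overlap returns true.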

The solution is pretty simple.  The only details that might be
worth calling out are the addition of a few utility functions some
of which at first glance look like they could be replaced by calls
to existing utilities:

 *  array_elt_at_offset
 *  field_at_offset
 *  get_origin_and_offset
 *  alias_offset

Specifically, get_origin_and_offset looks like a dead ringer for
get_addr_base_and_unit_offset, except since the former is only
used for warnings it is less conservative.  It also works with
SSA_NAMEs.  This is also the function I expect to need to make
changes to (and fix bugs in).  The rest of the functions are
general utilities that could perhaps be moved to tree.c at some
point when there is a use for them elsewhere (I have some work
in progress that might need at least one of them).

Another likely question worth addressing is why the sprintf
overlap detection isn't handled in gimple-ssa-warn-restrict.c.
There is an opportunity for code sharing between the two "passes"
but it will require some fairly intrusive changes to the latter.
Those feel out of scope for the initial solution.

Finally, because of new dependencies between existing classes in
the file, some code had to be moved around within it a bit.  That
contributed to the size of the patch making the changes seem more
extensive than they really are.

Tested on x86_64-linux with Binutils/GDB and Glibc.

Martin

The original submission:
https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00036.html
PR middle-end/83688 - check if buffers may overlap when copying strings using sprintf

gcc/ChangeLog:

	PR middle-end/83688
	* gimple-ssa-sprintf.c (format_result::alias_info): New struct.
	(directive::argno): New member.
	(format_result::aliases, format_result::alias_count): New data members.
	(format_result::append_alias): New member function.
	(fmtresult::dst_offset): New data member.
	(pass_sprintf_length::call_info::dst_origin): New data member.
	(pass_sprintf_length::call_info::dst_field, dst_offset): Same.
	(char_type_p, array_elt_at_offset, field_at_offset): New functions.
	(get_origin_and_offset): Same.
	(format_string): Call it.
	(format_directive): Call append_alias and set directive argument
	number.
	(maybe_warn_overlap): New function.
	(pass_sprintf_length::compute_format_length): Call it.
	(pass_sprintf_length::handle_gimple_call): Initialize new members.
	* gcc/tree-ssa-strlen.c (): Also enable when -Wrestrict is on.

gcc/testsuite/ChangeLog:

	PR tree-optimization/35503
	* gcc.dg/tree-ssa/builtin-sprintf-warn-23.c: New test.

diff --git a/gcc/gimple-ssa-sprintf.c b/gcc/gimple-ssa-sprintf.c
index b11d7989d5e..b47ed019615 100644
--- a/gcc/gimple-ssa-sprintf.c
+++ b/gcc/gimple-ssa-sprintf.c
@@ -86,6 +86,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "alloc-pool.h"
 #include "vr-values.h"
 #include "tree-ssa-strlen.h"
+#include "tree-dfa.h"
 
 /* The likely worst case value of MB_LEN_MAX for the target, large enough
for UTF-8.  Ideally, this would be obtained by a target hook if it were
@@ -106,9 +107,6 @@ namespace {
 
 static int warn_level;
 
-struct call_info;
-struct format_result;
-
 /* The minimum, maximum, likely, and unlikely maximum number of bytes
of output either a formatting function or an individual directive
can result in.  */
@@ -133,80 +131,6 @@ struct result_range
   unsigned HOST_WIDE_INT unlikely;
 };
 
-/* The result of a call to a formatted function.  */
-
-struct format_result
-{
-  /* Range of characters written by the formatted function.
- Setting the minimum to HOST_WIDE_INT_MAX disables all
- length tracking for the remainder of the format string.  */
-  result_range range;
-
-  /* True when the range above is obtained from known values of
- directive arguments, or bounds on the amount of output such
- as width and precision, and not the result of  heuristics that
- depend on warning levels.  It's used to issue stricter diagnostics
- in 

[PATCH] PR fortran/92018 -- BOZ the gift that keeps giving

2019-10-08 Thread Steve Kargl
Tested on x86_64-*-freebsd.  OK to commit?

A BOZ literal constant can be an actual argument in a
very limited number of intrinsic subprograms.  For those
intrinsic subprograms, the BOZ literal constant is converted
either during checking (see check.c) or simplification 
(see simplify.c).  In resolve.c (resolve_function), I added
code that would walk the actual argument list to check for a
BOZ, but that code was restricted to functions with the EXTERNAL
attribute.

The new testcase, pr92018.f90, demonstrates a situation
where neither the INTRINSIC nor the EXTERNAL attribute is set
and the actual argument list contains a BOZ.  This led to
an ICE.  The patch removes the previous restriction, so
the actual arguments of all functions are checked.
This works, except that it exposed a deficiency in the checking
routines.  If something was rejected (e.g., IAND(Z'12',Z'34')),
the BOZs were passed on to resolve_function() and run-on errors
were reported.  To avoid these additional error messages, I have
added the reset_boz() function, which converts a rejected
BOZ to a default-kind integer with value 0.
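
The recovery idea can be sketched outside of gfortran.  The following is
only a model with hypothetical types (expr, kind_t), not the gfc_expr
API: once an operand has been diagnosed, it is rewritten into a harmless
integer zero so that a later pass that looks for leftover BOZ operands
has nothing further to report.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for the front end's expression node.
enum class kind_t { boz, integer };

struct expr
{
  kind_t kind;
  std::string boz_digits;  // Only meaningful while kind == kind_t::boz.
  long value;              // Only meaningful while kind == kind_t::integer.
};

// Turn a rejected BOZ into an integer zero, mirroring the intent of
// reset_boz(): clear the BOZ payload and give the node an ordinary type.
inline void
reset_to_zero (expr &e)
{
  e.boz_digits.clear ();
  e.kind = kind_t::integer;
  e.value = 0;
}

// An early check: diagnose a BOZ operand once, then neutralize it.
// Returns false if the operand was rejected.
inline bool
check_operand (expr &e, std::vector<std::string> &errors)
{
  if (e.kind != kind_t::boz)
    return true;
  errors.push_back ("BOZ not allowed here");
  reset_to_zero (e);
  return false;
}

// A later pass that would otherwise report the same operand again.
inline void
resolve_operand (const expr &e, std::vector<std::string> &errors)
{
  if (e.kind == kind_t::boz)
    errors.push_back ("BOZ cannot be an actual argument");
}
```

Running check_operand and then resolve_operand on the same BOZ operand
records a single error in this model; without the reset step the second
function would add a run-on error for the same operand.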

2019-10-09  Steven G. Kargl  

PR fortran/92018
* check.c (reset_boz): New function.
(illegal_boz_arg, boz_args_check, gfc_check_complex, gfc_check_float,
gfc_check_transfer): Use it.
(gfc_check_dshift): Use reset_boz, and re-arrange the checking to
help suppress possible run-on errors.
(gfc_check_and): Restore checks for valid argument types.  Use
reset_boz, and re-arrange the checking to help suppress possible
run-on errors.
* resolve.c (resolve_function): Actual arguments cannot be BOZ in
a function reference.

2019-10-09  Steven G. Kargl  

PR fortran/92018
* gfortran.dg/gnu_logical_2.f90: Update dg-error regex.
* gfortran.dg/pr81509_2.f90: Ditto.
* gfortran.dg/pr92018.f90: New test.

-- 
Steve
Index: gcc/fortran/check.c
===
--- gcc/fortran/check.c	(revision 276705)
+++ gcc/fortran/check.c	(working copy)
@@ -30,10 +30,29 @@ along with GCC; see the file COPYING3.  If not see
 #include "coretypes.h"
 #include "options.h"
 #include "gfortran.h"
+#include "arith.h"
 #include "intrinsic.h"
 #include "constructor.h"
 #include "target-memory.h"
 
+
+/* Reset a BOZ to a zero value.  This is used to prevent run-on errors
+   from resolve.c(resolve_function).  */
+
+static void
+reset_boz (gfc_expr *x)
+{
+  /* Clear boz info.  */
+  x->boz.rdx = 0;
+  x->boz.len = 0;
+  free (x->boz.str);
+
+  x->ts.type = BT_INTEGER;
+  x->ts.kind = gfc_default_integer_kind;
+  mpz_init (x->value.integer);
+  mpz_set_ui (x->value.integer, 0);
+}
+
 /* A BOZ literal constant can appear in a limited number of contexts.
gfc_invalid_boz() is a helper function to simplify error/warning
generation.  gfortran accepts the nonstandard 'X' for 'Z', and gfortran
@@ -63,6 +82,7 @@ illegal_boz_arg (gfc_expr *x)
 {
   gfc_error ("BOZ literal constant at %L cannot be an actual argument "
 		 "to %qs", &x->where, gfc_current_intrinsic);
+  reset_boz (x);
   return true;
 }
 
@@ -79,6 +99,8 @@ boz_args_check(gfc_expr *i, gfc_expr *j)
   gfc_error ("Arguments of %qs at %L and %L cannot both be BOZ "
 		 "literal constants", gfc_current_intrinsic, &i->where,
 		 &j->where);
+  reset_boz (i);
+  reset_boz (j);
   return false;
 
 }
@@ -2399,7 +2421,10 @@ gfc_check_complex (gfc_expr *x, gfc_expr *y)
 {
   if (gfc_invalid_boz ("BOZ constant at %L cannot appear in the COMPLEX "
 			   "intrinsic subprogram", &x->where))
-	return false;
+	{
+	  reset_boz (x);
+	  return false;
+}
   if (y->ts.type == BT_INTEGER && !gfc_boz2int (x, y->ts.kind))
 	return false;
   if (y->ts.type == BT_REAL && !gfc_boz2real (x, y->ts.kind))
@@ -2410,7 +2435,10 @@ gfc_check_complex (gfc_expr *x, gfc_expr *y)
 {
   if (gfc_invalid_boz ("BOZ constant at %L cannot appear in the COMPLEX "
 			   "intrinsic subprogram", &y->where))
-	return false;
+	{
+	  reset_boz (y);
+	  return false;
+	}
   if (x->ts.type == BT_INTEGER && !gfc_boz2int (y, x->ts.kind))
 	return false;
   if (x->ts.type == BT_REAL && !gfc_boz2real (y, x->ts.kind))
@@ -2674,22 +2702,34 @@ gfc_check_dshift (gfc_expr *i, gfc_expr *j, gfc_expr *
   if (!boz_args_check (i, j))
 return false;
 
-  /* If i is BOZ and j is integer, convert i to type of j.  */
-  if (i->ts.type == BT_BOZ && j->ts.type == BT_INTEGER
-  && !gfc_boz2int (i, j->ts.kind))
-return false;
+  /* If i is BOZ and j is integer, convert i to type of j.  If j is not
+ an integer, clear the BOZ; otherwise, check that i is an integer.  */
+  if (i->ts.type == BT_BOZ)
+{
+  if (j->ts.type != BT_INTEGER)
+reset_boz (i);
+  else if (!gfc_boz2int (i, j->ts.kind))
+	return false;
+}
+  else if (!type_check (i, 0, BT_INTEGER))
+{
+  if (j->ts.type == BT_BOZ)
+	reset_boz (j);
+ 

C++ PATCH for c++/92032 - DR 1601: Promotion of enum with fixed underlying type

2019-10-08 Thread Marek Polacek
I've been messing with compare_ics recently and noticed that we don't
implement CWG 1601, which should be fairly easy.

The motivating example is 

  enum E : char { e };
  void f(char);
  void f(int);
  void g() {
    f(e);
  }

where the call to f was ambiguous but we should choose f(char).

Currently we give f(int) cr_promotion in standard_conversion, while
f(char) remains cr_std, which is worse than cr_promotion.  So I thought
I'd give it cr_promotion also and then add a tiebreaker to compare_ics.
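
For reference, here is a self-contained variant of the motivating
example, written so that the selected overload can be observed.  This
assumes a compiler that implements CWG 1601; one that does not will
reject the calls as ambiguous.  The names pick and pick2 and their
return values are only for illustration.

```cpp
// Enumerations with fixed underlying types.
enum E : char { e };
enum S : short { s };

// Overloads distinguished by return value so the selection is visible.
inline int pick (char) { return 1; }   // E -> its underlying type
inline int pick (int)  { return 2; }   // E -> promoted underlying type

inline int pick2 (short) { return 1; } // S -> its underlying type
inline int pick2 (int)   { return 2; } // S -> promoted underlying type
```

With the tiebreaker in place, pick (e) and pick2 (s) both select the
overload taking the underlying type.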

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2019-10-08  Marek Polacek  

PR c++/92032 - DR 1601: Promotion of enum with fixed underlying type.
* call.c (standard_conversion): When converting an enumeration with
a fixed underlying type to the underlying type, give it the cr_promotion
rank.
(compare_ics): Implement a tiebreaker as per CWG 1601.

* g++.dg/cpp0x/scoped_enum10.C: New test.
* g++.dg/cpp0x/scoped_enum11.C: New test.

diff --git gcc/cp/call.c gcc/cp/call.c
index 6c9acac4614..10172855d4d 100644
--- gcc/cp/call.c
+++ gcc/cp/call.c
@@ -1484,8 +1484,18 @@ standard_conversion (tree to, tree from, tree expr, bool c_cast_p,
 
   conv = build_conv (ck_std, to, conv);
 
-  /* Give this a better rank if it's a promotion.  */
-  if (same_type_p (to, type_promotes_to (from))
+  tree underlying_type = NULL_TREE;
+  if (TREE_CODE (from) == ENUMERAL_TYPE
+ && ENUM_FIXED_UNDERLYING_TYPE_P (from))
+   underlying_type = ENUM_UNDERLYING_TYPE (from);
+
+  /* Give this a better rank if it's a promotion.
+
+To handle CWG 1601, also bump the rank if we are converting
+an enumeration with a fixed underlying type to the underlying
+type.  */
+  if ((same_type_p (to, type_promotes_to (from))
+  || (underlying_type && same_type_p (to, underlying_type)))
  && next_conversion (conv)->rank <= cr_promotion)
conv->rank = cr_promotion;
 }
@@ -10506,6 +10516,31 @@ compare_ics (conversion *ics1, conversion *ics2)
}
 }
 
+  /* [over.ics.rank]
+
+ Per CWG 1601:
+ -- A conversion that promotes an enumeration whose underlying type
+ is fixed to its underlying type is better than one that promotes to
+ the promoted underlying type, if the two are different.  */
+  if (ics1->rank == cr_promotion
+  && ics2->rank == cr_promotion
+  && UNSCOPED_ENUM_P (from_type1)
+  && ENUM_FIXED_UNDERLYING_TYPE_P (from_type1)
+  && same_type_p (from_type1, from_type2))
+{
+  tree utype = ENUM_UNDERLYING_TYPE (from_type1);
+  tree prom = type_promotes_to (from_type1);
+  if (!same_type_p (utype, prom))
+   {
+ if (same_type_p (to_type1, utype)
+ && same_type_p (to_type2, prom))
+   return 1;
+ else if (same_type_p (to_type2, utype)
+  && same_type_p (to_type1, prom))
+   return -1;
+   }
+}
+
   /* Neither conversion sequence is better than the other.  */
   return 0;
 }
diff --git gcc/testsuite/g++.dg/cpp0x/scoped_enum10.C gcc/testsuite/g++.dg/cpp0x/scoped_enum10.C
new file mode 100644
index 000..b588581cd3e
--- /dev/null
+++ gcc/testsuite/g++.dg/cpp0x/scoped_enum10.C
@@ -0,0 +1,37 @@
+// PR c++/92032 - DR 1601: Promotion of enumeration with fixed underlying type.
+// { dg-do compile { target c++11 } }
+
+enum E : char { e };
+enum F : int { f };
+enum G : long { g };
+enum H : unsigned { h };
+
+int f1(char);
+void f1(int);
+
+void f2(int);
+int f2(char);
+
+int f3(int);
+void f3(short);
+
+int f4(long);
+void f4(int);
+
+void f5(unsigned);
+int f5(int);
+
+int f6(unsigned);
+void f6(int);
+
+void
+test ()
+{
+  int r = 0;
+  r += f1 (e);
+  r += f2 (e);
+  r += f3 (f);
+  r += f4 (g);
+  r += f5 (f);
+  r += f6 (h);
+}
diff --git gcc/testsuite/g++.dg/cpp0x/scoped_enum11.C gcc/testsuite/g++.dg/cpp0x/scoped_enum11.C
new file mode 100644
index 000..e6dcfbac9d8
--- /dev/null
+++ gcc/testsuite/g++.dg/cpp0x/scoped_enum11.C
@@ -0,0 +1,35 @@
+// PR c++/92032 - DR 1601: Promotion of enumeration with fixed underlying type.
+// { dg-do compile { target c++11 } }
+
+enum E1 : long { e1 };
+enum E2 : short { e2 };
+
+int f1(short);
+void f1(int);
+
+void f2(int);
+int f2(short);
+
+void f3(int);
+int f3(long);
+
+int f4(short);
+void f4(long);
+
+int f5(int);
+void f5(long);
+
+int f6(unsigned int); // { dg-message "candidate" }
+void f6(long); // { dg-message "candidate" }
+
+void
+fn ()
+{
+  int r = 0;
+  r += f1 (e2);
+  r += f2 (e2);
+  r += f3 (e1);
+  r += f4 (e2);
+  r += f5 (e2);
+  r += f6 (e2); // { dg-error "ambiguous" }
+}


Re: [PATCH] PR tree-optimization/90836 Missing popcount pattern matching

2019-10-08 Thread Steve Ellcey
On Mon, 2019-10-07 at 12:04 +0200, Richard Biener wrote:
> On Tue, Oct 1, 2019 at 1:48 PM Dmitrij Pochepko
>  wrote:
> > 
> > Hi Richard,
> > 
> > I updated patch according to all your comments.
> > Also bootstrapped and tested again on x86_64-pc-linux-gnu and
> > aarch64-linux-gnu, which took some time.
> > 
> > attached v3.
> 
> OK.
> 
> Thanks,
> Richard.

Dmitrij,

I checked in this patch for you.

Steve Ellcey
sell...@marvell.com


Re: [patch, fortran] Fix PR 92004, restore Lapack compilation

2019-10-08 Thread Tobias Burnus

Hi Thomas,

On 10/6/19 5:26 PM, Thomas Koenig wrote:

+/* Under certain conditions, a scalar actual argument can be passed
+   to an array dummy argument - see F2018, 15.5.2.4, clause 14.  This
+   functin returns true for these conditions so that an error or


function ("o" missing); I think it is not clause 14 but paragraph 14.



+   warning for this can be suppressed later.  */
+
+bool
+maybe_dummy_array_arg (gfc_expr *e)
+{
+  gfc_symbol *s;
+
+  if (e->rank > 0)
+return false;
+
+  if (e->ts.type == BT_CHARACTER && e->ts.kind == 1)
+return true;
+
+  if (e->expr_type != EXPR_VARIABLE)
+return false;


What about PARAMETER? :-)



+  s = e->symtree->n.sym;
+  if (s->as == NULL)
+return false;


This looks wrong. You also want to permit dt%array(1) – but not dt(1)%scalar


+  if (s->ts.type == BT_CLASS || s->as->type == AS_ASSUMED_SHAPE
+  || s->attr.pointer)
+return false;


dt%foo – again, "foo" can be an allocatable of polymorphic type or a 
pointer, but at least, it cannot be of assumed shape.


Otherwise it looks good at a glance.

Tobias




Re: [PATCH] handle arrays in -Wclass-memaccess (PR 92001)

2019-10-08 Thread Jason Merrill

On 10/7/19 8:56 PM, Martin Sebor wrote:

-Wclass-memaccess doesn't trigger for access to arrays of
objects whose type it's designed to trigger for.  It looks
to me like a simple oversight in maybe_warn_class_memaccess.
Attached is a trivial patch to correct it tested on
x86_64-linux.


OK, thanks.

Jason




Ping ** (2./7.) Fix PR 92004, restore Lapack compilation

2019-10-08 Thread Thomas Koenig

Hi,

this patch fixes an overzealous interpretation of F2018 15.5.2.4, where
an idiom of passing an array element to an array was rejected. This
also restores Lapack compilation without warning.

Regression-tested. OK for trunk?


Would it be possible to get a speedy review on this?  I'd like to get
this working again as soon as possible.

Regards

Thomas


Re: [PATCH] Come up with ipa passes introduction in gccint documentation

2019-10-08 Thread Sandra Loosemore

On 10/8/19 2:52 AM, luoxhu wrote:

Hi,

This is the formal documentation patch for IPA passes.  Thanks.


None of the IPA passes are documented in passes.texi.  This patch adds
a section IPA passes just before GIMPLE passes and RTL passes in
Chapter 9 "Passes and Files of the Compiler".  Also, a short description
for each IPA pass is provided.
gccint.pdf can be produced without errors.

ChangeLog:
PR middle-end/26241
* doc/lto.texi (IPA): Reference to the IPA passes.
* doc/passes.texi (Pass manager): Add node IPA passes and
  description for each IPA pass.


Thanks for submitting this documentation patch!  The content looks 
helpful to me, but I see that it has quite a few grammar bugs (I 
understand how hard English is even for native speakers), plus some 
issues like indexing, cross-referencing, use of jargon without defining 
it, etc.  I think it would be more efficient for me to take over 
polishing the text some more than to mark it up for you to fix, but I'd 
like to give others a few days to comment on technical content first.


-Sandra


Re: [PATCH] avoid a spurious -Wstringop-overflow due to multiple MEM_REFs (PR 92014)

2019-10-08 Thread Martin Sebor

On 10/7/19 6:58 PM, Martin Sebor wrote:

Last week's enhancement to detect one-byte buffer overflows exposed
a bug that let the warning use the size of a prior MEM_REF access
and "override" the size of the actual store to the character array.
When the store was smaller than the prior access (e.g., one byte,
vs an 8-byte null pointer read such as in a PHI), this would lead
to a false positive.

The attached patch has the function fail after it has determined
the size of the store from a MEM_REF if one of its recursive
invocations finds another MEM_REF.

Tested on x86_64-linux.  Since the bug is causing trouble in Glibc
builds I will plan on committing the fix tomorrow.


I have committed this patch in r276711 along with an additional
minor tweak to take care of bug 92026 that was raised overnight
for test suite failures on a few targets.

Martin


[Darwin, committed] Remove code deprecated in 4.x.

2019-10-08 Thread Iain Sandoe
This removes some code that should be dead.
Given no reported problems from the warning since 4.6 this seems reasonable.

It also happens to clear another build warning by deleting the offending code.

tested on x86_64-darwin16, applied to mainline,
thanks
Iain

gcc/ChangeLog:

2019-10-08  Iain Sandoe  

* config/darwin.c (machopic_select_section): Remove dead code for
old Objective-C section selection method, replace with unreachable.

diff --git a/gcc/config/darwin.c b/gcc/config/darwin.c
index f8d70596d0..679e0c2291 100644
--- a/gcc/config/darwin.c
+++ b/gcc/config/darwin.c
@@ -1696,83 +1696,14 @@ machopic_select_section (tree decl,
   else
return base_section;
 }
-  /* c) legacy meta-data selection.  */
-  else if (TREE_CODE (decl) == VAR_DECL
+  else if (flag_next_runtime
+  && VAR_P (decl)
   && DECL_NAME (decl)
   && TREE_CODE (DECL_NAME (decl)) == IDENTIFIER_NODE
   && IDENTIFIER_POINTER (DECL_NAME (decl))
-  && flag_next_runtime
   && !strncmp (IDENTIFIER_POINTER (DECL_NAME (decl)), "_OBJC_", 6))
-{
-  const char *name = IDENTIFIER_POINTER (DECL_NAME (decl));
-  static bool warned_objc_46 = false;
-  /* We shall assert that zero-sized objects are an error in ObjC
- meta-data.  */
-  gcc_assert (tree_to_uhwi (DECL_SIZE_UNIT (decl)) != 0);
-
-  /* ??? This mechanism for determining the metadata section is
-broken when LTO is in use, since the frontend that generated
-the data is not identified.  We will keep the capability for
-the short term - in case any non-Objective-C programs are using
-it to place data in specified sections.  */
-  if (!warned_objc_46)
-   {
- location_t loc = DECL_SOURCE_LOCATION (decl);
- warning_at (loc, 0, "the use of _OBJC_-prefixed variable names"
- " to select meta-data sections is deprecated at 4.6"
- " and will be removed in 4.7");
- warned_objc_46 = true;
-   }
-
-  if (!strncmp (name, "_OBJC_CLASS_METHODS_", 20))
-return darwin_sections[objc_cls_meth_section];
-  else if (!strncmp (name, "_OBJC_INSTANCE_METHODS_", 23))
-return darwin_sections[objc_inst_meth_section];
-  else if (!strncmp (name, "_OBJC_CATEGORY_CLASS_METHODS_", 29))
-return darwin_sections[objc_cat_cls_meth_section];
-  else if (!strncmp (name, "_OBJC_CATEGORY_INSTANCE_METHODS_", 32))
-return darwin_sections[objc_cat_inst_meth_section];
-  else if (!strncmp (name, "_OBJC_CLASS_VARIABLES_", 22))
-return darwin_sections[objc_class_vars_section];
-  else if (!strncmp (name, "_OBJC_INSTANCE_VARIABLES_", 25))
-return darwin_sections[objc_instance_vars_section];
-  else if (!strncmp (name, "_OBJC_CLASS_PROTOCOLS_", 22))
-return darwin_sections[objc_cat_cls_meth_section];
-  else if (!strncmp (name, "_OBJC_CLASS_NAME_", 17))
-return darwin_sections[objc_class_names_section];
-  else if (!strncmp (name, "_OBJC_METH_VAR_NAME_", 20))
-return darwin_sections[objc_meth_var_names_section];
-  else if (!strncmp (name, "_OBJC_METH_VAR_TYPE_", 20))
-return darwin_sections[objc_meth_var_types_section];
-  else if (!strncmp (name, "_OBJC_CLASS_REFERENCES", 22))
-return darwin_sections[objc_cls_refs_section];
-  else if (!strncmp (name, "_OBJC_CLASS_", 12))
-return darwin_sections[objc_class_section];
-  else if (!strncmp (name, "_OBJC_METACLASS_", 16))
-return darwin_sections[objc_meta_class_section];
-  else if (!strncmp (name, "_OBJC_CATEGORY_", 15))
-return darwin_sections[objc_category_section];
-  else if (!strncmp (name, "_OBJC_SELECTOR_REFERENCES", 25))
-return darwin_sections[objc_selector_refs_section];
-  else if (!strncmp (name, "_OBJC_SELECTOR_FIXUP", 20))
-return darwin_sections[objc_selector_fixup_section];
-  else if (!strncmp (name, "_OBJC_SYMBOLS", 13))
-return darwin_sections[objc_symbols_section];
-  else if (!strncmp (name, "_OBJC_MODULES", 13))
-return darwin_sections[objc_module_info_section];
-  else if (!strncmp (name, "_OBJC_IMAGE_INFO", 16))
-return darwin_sections[objc_image_info_section];
-  else if (!strncmp (name, "_OBJC_PROTOCOL_INSTANCE_METHODS_", 32))
-return darwin_sections[objc_cat_inst_meth_section];
-  else if (!strncmp (name, "_OBJC_PROTOCOL_CLASS_METHODS_", 29))
-return darwin_sections[objc_cat_cls_meth_section];
-  else if (!strncmp (name, "_OBJC_PROTOCOL_REFS_", 20))
-return darwin_sections[objc_cat_cls_meth_section];
-  else if (!strncmp (name, "_OBJC_PROTOCOL_", 15))
-return darwin_sections[objc_protocol_section];
-  else
-return base_section;
-}
+/* c) legacy meta-data selection was deprecated at 4.6, removed now.  */
+gcc_unreachable ();
 
   return base_section;

[Darwin, machopic 2/n, committed] Compute and cache indirection rules.

2019-10-08 Thread Iain Sandoe
This caches a check for the requirement to indirect a symbol in the Darwin
ABI, and uses it where needed.  We also ensure that we place the indirection
pointers into the non-lazy symbol pointers section.  Other placements have
occurred with various platform toolchains - but these seem to have been
unintentional - here we match current platform toolchains.
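
The caching pattern itself is simple.  The sketch below is a simplified
model with hypothetical names (symbol, FLAG_*, encode_flags), not the
Darwin back end: the "must indirect" predicate is evaluated once when a
symbol is encoded, stored as a flag bit, and later queried with a cheap
mask test.

```cpp
#include <cstdint>

// Hypothetical flag bits; the real values in darwin.h differ.
constexpr std::uint32_t FLAG_HIDDEN_VIS    = 1u << 0;
constexpr std::uint32_t FLAG_MUST_INDIRECT = 1u << 1;

// Hypothetical, much reduced stand-in for a declaration and its symbol.
struct symbol
{
  bool is_common;        // Common storage requested.
  bool has_initializer;  // An initializer makes it not "really" common.
  bool is_weak;
  bool hidden;
  bool weakref;
  std::uint32_t flags;   // Cached properties.
};

// Evaluate the rules once and cache the answers in the flags word.
inline void
encode_flags (symbol &sym)
{
  sym.flags = 0;
  if (sym.hidden)
    sym.flags |= FLAG_HIDDEN_VIS;

  const bool really_common = sym.is_common && !sym.has_initializer;
  if (really_common
      || (sym.is_weak && !(sym.flags & FLAG_HIDDEN_VIS))
      || sym.weakref)
    sym.flags |= FLAG_MUST_INDIRECT;
}

// Cheap query used wherever the decision is needed later.
inline bool
must_indirect_p (const symbol &sym)
{
  return (sym.flags & FLAG_MUST_INDIRECT) != 0;
}
```

In this model a weak symbol with default visibility and an
uninitialised common symbol are both marked, while a weak symbol with
hidden visibility and a common symbol with an initialiser are not.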

I fixed a bad indent at the same time (thanks to Segher for pointing it out).

tested on x86_64-darwin16, applied to mainline,
thanks
Iain

gcc/ChangeLog:

2019-10-08  Iain Sandoe  

* config/darwin.c (machopic_indirect_data_reference): Check for
required indirections before making direct access to defined
values.
(machopic_output_indirection): Place the indirected pointers for
required indirections into the non-lazy symbol pointers section.
(darwin_encode_section_info): Compute and cache the 'must indirect'
flag for symbols.
* config/darwin.h (MACHO_SYMBOL_FLAG_MUST_INDIRECT): New.
(MACHO_SYMBOL_MUST_INDIRECT_P): New.


diff --git a/gcc/config/darwin.c b/gcc/config/darwin.c
index 869e850c57..f8d70596d0 100644
--- a/gcc/config/darwin.c
+++ b/gcc/config/darwin.c
@@ -665,7 +665,7 @@ machopic_indirect_data_reference (rtx orig, rtx reg)
   /* some other cpu -- writeme!  */
   gcc_unreachable ();
}
-  else if (defined)
+  else if (defined && ! MACHO_SYMBOL_MUST_INDIRECT_P (orig))
{
  rtx offset = NULL;
  if (DARWIN_PPC || HAVE_lo_sum)
@@ -1120,6 +1120,7 @@ machopic_output_indirection (machopic_indirection **slot, FILE *asm_out_file)
   machopic_output_stub (asm_out_file, sym, stub);
 }
   else if (! indirect_data (symbol)
+  && ! MACHO_SYMBOL_MUST_INDIRECT_P (symbol)
   && ! MACHO_SYMBOL_HIDDEN_VIS_P (symbol)
   && (machopic_symbol_defined_p (symbol)
   || SYMBOL_REF_LOCAL_P (symbol)))
@@ -1238,11 +1239,17 @@ darwin_encode_section_info (tree decl, rtx rtl, int first)
   if (VAR_P (decl))
 SYMBOL_REF_FLAGS (sym_ref) |= MACHO_SYMBOL_FLAG_VARIABLE;
 
+  /* Only really common if there's no initialiser.  */
+  bool really_common_p = (DECL_COMMON (decl)
+ && (DECL_INITIAL (decl) == NULL
+ || (!in_lto_p
+ && DECL_INITIAL (decl) == error_mark_node)));
+
   /* For Darwin, if we have specified visibility and it's not the default
  that's counted 'hidden'.  */
   if (DECL_VISIBILITY_SPECIFIED (decl)
   && DECL_VISIBILITY (decl) != VISIBILITY_DEFAULT)
- SYMBOL_REF_FLAGS (sym_ref) |= MACHO_SYMBOL_FLAG_HIDDEN_VIS;
+SYMBOL_REF_FLAGS (sym_ref) |= MACHO_SYMBOL_FLAG_HIDDEN_VIS;
 
   if (!DECL_EXTERNAL (decl)
   && (!TREE_PUBLIC (decl) || !DECL_WEAK (decl))
@@ -1255,6 +1262,12 @@ darwin_encode_section_info (tree decl, rtx rtl, int first)
 
   if (! TREE_PUBLIC (decl))
 SYMBOL_REF_FLAGS (sym_ref) |= MACHO_SYMBOL_FLAG_STATIC;
+
+  /* Short cut check for Darwin 'must indirect' rules.  */
+  if (really_common_p
+  || (DECL_WEAK (decl) && ! MACHO_SYMBOL_HIDDEN_VIS_P (sym_ref))
+  || lookup_attribute ("weakref", DECL_ATTRIBUTES (decl)))
+ SYMBOL_REF_FLAGS (sym_ref) |= MACHO_SYMBOL_FLAG_MUST_INDIRECT;
 }
 
 void
diff --git a/gcc/config/darwin.h b/gcc/config/darwin.h
index 87e1eb63b1..7fab8694f0 100644
--- a/gcc/config/darwin.h
+++ b/gcc/config/darwin.h
@@ -820,6 +820,15 @@ extern GTY(()) section * darwin_sections[NUM_DARWIN_SECTIONS];
 #define MACHO_SYMBOL_VARIABLE_P(RTX) \
   ((SYMBOL_REF_FLAGS (RTX) & MACHO_SYMBOL_FLAG_VARIABLE) != 0)
 
+/* Set on a symbol that must be indirected, even when there is a
+   definition in the TU.  The ABI mandates that common symbols are so
+   indirected, as are weak.  If 'fix-and-continue' is operational then
+   data symbols might also be.  */
+
+#define MACHO_SYMBOL_FLAG_MUST_INDIRECT ((SYMBOL_FLAG_SUBT_DEP) << 1)
+#define MACHO_SYMBOL_MUST_INDIRECT_P(RTX) \
+  ((SYMBOL_REF_FLAGS (RTX) & MACHO_SYMBOL_FLAG_MUST_INDIRECT) != 0)
+
 /* Set on a symbol with SYMBOL_FLAG_FUNCTION or MACHO_SYMBOL_FLAG_VARIABLE
to indicate that the function or variable is considered defined in this
translation unit.  */



[libgomp][testsuite] PR testsuite/91884 Add -lquadmath if available

2019-10-08 Thread Tobias Burnus
The test cases in GCC's libraries run via DejaGNU's "[find_gcc]" – 
unless the compiler has been explicitly provided. [1] That means that by 
default everything runs with "/gcc/xgcc" – at least for an 
in-build-dir test run.


As that's the C driver, specific libraries such as "-lgfortran" or 
"-lstdc++" have to be added explicitly.


The problem with Fortran is: gfortran is built with libquadmath support 
on several platforms – in that case, libgfortran depends on libquadmath. 
But libquadmath support can be disabled or is not present at all on a 
given platform. – With the "gfortran" driver, that's solved by using a 
.spec file. But for xgcc …


For build-tree test runs, it's simple: As we check for the libquadmath 
libraries explicitly, we know an .a or .so file exists and can add 
"-lquadmath" – but for out-of-tree runs, we cannot. – Hence, I propose a 
link test.


What do you think of the attached patch? (Lightly tested on x86_64.)


Background (see also PR):

On x86_64 the issue is not visible. As Joseph pointed out internally 
(half a year ago): The ELF linker has a (mis)feature where it 
automatically looks for libraries named in DT_NEEDED, emulating how the 
dynamic linker searches and using directories specified in -rpath-link 
if necessary. See ld/emultempl/elf32.em.


But on systems like PowerPC, the linker explicitly requires "-lquadmath" 
in order that the linking succeeds.


Tobias

[1] By contrast, those tests under gcc/testsuite/ call functions like 
"gfortran_init" of "./lib/gfortran.exp" instead to get the proper 
driver (gfortran, g++ etc.). But this function assumes that the compiler 
is located relative to the current working directory (base_dir) at 
$base_dir/../../. That's the case for the tests under 
gcc/testsuite but not for the testsuite in the libraries.


PS: In principle, the same issue occurs with offloading – and has to be 
fixed there as well. However, nvptx and AMD GCN unsurprisingly don't 
have a libquadmath; though, Intel MIC might …



	libgomp/
	* testsuite/libgomp.fortran/fortran.exp: Add -lquadmath if available.
	* testsuite/libgomp.oacc-fortran/fortran.exp: Ditto.

diff --git a/libgomp/testsuite/libgomp.fortran/fortran.exp b/libgomp/testsuite/libgomp.fortran/fortran.exp
index d848ed4d47f..caffbfe0346 100644
--- a/libgomp/testsuite/libgomp.fortran/fortran.exp
+++ b/libgomp/testsuite/libgomp.fortran/fortran.exp
@@ -54,11 +54,15 @@ if { $lang_test_file_found } {
 	# Allow for spec subsitution.
 	lappend ALWAYS_CFLAGS "additional_flags=-B${blddir}/${quadmath_library_path}/"
 	set ld_library_path "$always_ld_library_path:${blddir}/${lang_library_path}:${blddir}/${quadmath_library_path}"
+	append lang_link_flags " -lquadmath"
 	} else {
 	set ld_library_path "$always_ld_library_path:${blddir}/${lang_library_path}"
 	}
 } else {
 set ld_library_path "$always_ld_library_path"
+if { [check_no_compiler_messages has_libquadmath executable {int main() {return 0;}} "-lgfortran -lquadmath"] } then {
+append lang_link_flags " -lquadmath"
+}
 }
 append ld_library_path [gcc-set-multilib-library-path $GCC_UNDER_TEST]
 set_ld_library_path_env_vars
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/fortran.exp b/libgomp/testsuite/libgomp.oacc-fortran/fortran.exp
index af25a22a522..393518d53f9 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/fortran.exp
+++ b/libgomp/testsuite/libgomp.oacc-fortran/fortran.exp
@@ -56,11 +56,15 @@ if { $lang_test_file_found } {
 	# Allow for spec subsitution.
 	lappend ALWAYS_CFLAGS "additional_flags=-B${blddir}/${quadmath_library_path}/"
 	set ld_library_path "$always_ld_library_path:${blddir}/${lang_library_path}:${blddir}/${quadmath_library_path}"
+	append lang_link_flags " -lquadmath"
 	} else {
 	set ld_library_path "$always_ld_library_path:${blddir}/${lang_library_path}"
 	}
 } else {
 set ld_library_path "$always_ld_library_path"
+if { [check_no_compiler_messages has_libquadmath executable {int main() {return 0;}} "-lgfortran -lquadmath"] } then {
+append lang_link_flags " -lquadmath"
+}
 }
 append ld_library_path [gcc-set-multilib-library-path $GCC_UNDER_TEST]
 set_ld_library_path_env_vars


Re: [04/32] [x86] Robustify vzeroupper handling across calls

2019-10-08 Thread Uros Bizjak
The following patch uses the correct SSE register class; vzeroupper
operates only on the lower 16 (8 on 32-bit targets) SSE registers.
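
As a rough model of the check (using std::bitset rather than GCC's
HARD_REG_SET, with a hypothetical register numbering in which bits 0-15
are the lower SSE registers and bits 16-31 the upper ones): the state
after a call counts as clean only if every register in the class of
interest is clobbered by the callee.

```cpp
#include <bitset>

using reg_set = std::bitset<32>;

// Hypothetical numbering: bits 0-15 model xmm0-xmm15, bits 16-31 model
// xmm16-xmm31.  These masks only mimic SSE_REGS and ALL_SSE_REGS.
inline reg_set lower_sse_regs () { return reg_set (0x0000ffffu); }
inline reg_set all_sse_regs ()   { return reg_set (0xffffffffu); }

// True if every register in SUB is also in SUPER.
inline bool
subset_p (const reg_set &sub, const reg_set &super)
{
  return (sub & ~super).none ();
}

enum avx_u128_state { AVX_U128_CLEAN, AVX_U128_ANY };

// The upper halves are known dead after the call only if the callee
// clobbers all registers in REGS_OF_INTEREST.
inline avx_u128_state
state_after_call (const reg_set &regs_of_interest, const reg_set &clobbered)
{
  return subset_p (regs_of_interest, clobbered) ? AVX_U128_CLEAN
                                                : AVX_U128_ANY;
}
```

With a callee that clobbers only the lower 16 registers, checking
against the full class yields AVX_U128_ANY, while checking against the
lower class yields AVX_U128_CLEAN; that is the difference the one-line
change makes in this model.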

2019-10-08  Uroš Bizjak  

PR target/91994
* config/i386/i386.c (ix86_avx_u128_mode_needed): Use SSE_REGS
instead of ALL_SSE_REGS to check if function call preserves some
256-bit SSE registers.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.

On Tue, Oct 1, 2019 at 12:14 PM Uros Bizjak  wrote:
>
> On Wed, Sep 25, 2019 at 5:48 PM Richard Sandiford
>  wrote:
>
> > > The comment suggests that this code is only needed for Win64 and that
> > > not testing for Win64 is just a simplification.  But in practice it was
> > > needed for correctness on GNU/Linux and other targets too, since without
> > > it the RA would be able to keep 256-bit and 512-bit values in SSE
> > > registers across calls that are known not to clobber them.
> > >
> > > This patch conservatively treats calls as AVX_U128_ANY if the RA can see
> > > that some SSE registers are not touched by a call.  There are then no
> > > regressions if the ix86_hard_regno_call_part_clobbered check is disabled
> > > for GNU/Linux (not something we should do, was just for testing).
>
> If RA can see that some SSE regs are not touched by the call, then we
> are sure that the called function is part of the current TU. In this
> case, the called function will be compiled using VEX instructions,
> where there is no AVX-SSE transition penalty. So, skipping VZEROUPPER
> is beneficial here.
>
> Uros.
>
> > > If in fact we want -fipa-ra to pretend that all functions clobber
> > > SSE registers above 128 bits, it'd certainly be possible to arrange
> > > that.  But IMO that would be an optimisation decision, whereas what
> > > the patch is fixing is a correctness decision.  So I think we should
> > > have this check even so.
> >
> > 2019-09-25  Richard Sandiford  
> >
> > gcc/
> > * config/i386/i386.c: Include function-abi.h.
> > (ix86_avx_u128_mode_needed): Treat function calls as AVX_U128_ANY
> > if they preserve some 256-bit or 512-bit SSE registers.
> >
> > Index: gcc/config/i386/i386.c
> > ===
> > --- gcc/config/i386/i386.c  2019-09-25 16:47:48.0 +0100
> > +++ gcc/config/i386/i386.c  2019-09-25 16:47:49.089962608 +0100
> > @@ -95,6 +95,7 @@ #define IN_TARGET_CODE 1
> >  #include "i386-builtins.h"
> >  #include "i386-expand.h"
> >  #include "i386-features.h"
> > +#include "function-abi.h"
> >
> >  /* This file should be included last.  */
> >  #include "target-def.h"
> > @@ -13511,6 +13512,15 @@ ix86_avx_u128_mode_needed (rtx_insn *ins
> > }
> > }
> >
> > +  /* If the function is known to preserve some SSE registers,
> > +RA and previous passes can legitimately rely on that for
> > +modes wider than 256 bits.  It's only safe to issue a
> > +vzeroupper if all SSE registers are clobbered.  */
> > +  const function_abi &abi = insn_callee_abi (insn);
> > +  if (!hard_reg_set_subset_p (reg_class_contents[ALL_SSE_REGS],
> > + abi.mode_clobbers (V4DImode)))
> > +   return AVX_U128_ANY;
> > +
> >return AVX_U128_CLEAN;
> >  }
> >
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 276677)
+++ config/i386/i386.c  (working copy)
@@ -13530,7 +13530,7 @@ ix86_avx_u128_mode_needed (rtx_insn *insn)
 modes wider than 256 bits.  It's only safe to issue a
 vzeroupper if all SSE registers are clobbered.  */
   const function_abi &abi = insn_callee_abi (insn);
-  if (!hard_reg_set_subset_p (reg_class_contents[ALL_SSE_REGS],
+  if (!hard_reg_set_subset_p (reg_class_contents[SSE_REGS],
  abi.mode_clobbers (V4DImode)))
return AVX_U128_ANY;
 


Re: C++ PATCH for C++20 P0388R4 (conversions to arrays of unknown bounds) and CWG 1307 (c++/91364, c++/69531)

2019-10-08 Thread Jason Merrill

On 10/7/19 7:26 PM, Marek Polacek wrote:

On Mon, Oct 07, 2019 at 02:56:10PM -0400, Jason Merrill wrote:

@@ -1378,7 +1381,7 @@ standard_conversion (tree to, tree from, tree expr, bool c_cast_p,
 if (same_type_p (from, to))
/* OK */;
-  else if (c_cast_p && comp_ptr_ttypes_const (to, from))
+  else if (c_cast_p && comp_ptr_ttypes_const (to, from, bounds_none))
/* In a C-style cast, we ignore CV-qualification because we
   are allowed to perform a static_cast followed by a
   const_cast.  */


Hmm, I'd expect bounds_either for a C-style cast.


Makes sense, it's just that const_cast shouldn't drop the bounds.


@@ -1670,7 +1673,14 @@ reference_binding (tree rto, tree rfrom, tree expr, bool c_cast_p, int flags,
 maybe_warn_cpp0x (CPP0X_INITIALIZER_LISTS);
 /* DR 1288: Otherwise, if the initializer list has a single element
 of type E and ... [T's] referenced type is reference-related to E,
-the object or reference is initialized from that element... */
+the object or reference is initialized from that element...
+
+??? With P0388R4, we should bind 't' directly to U{}:
+  using U = A[2];
+  A (&)[] = {U{}};
+because A[] and A[2] are reference-related.  But we don't do it
+because grok_reference_init has deduced the array size (to 1), and
+A[1] and A[2] aren't reference-related.  */


That sounds like a bug in grok_reference_init; it isn't properly
implementing

"Otherwise, if the initializer list has a single element of type E and
either T is not a reference type or its
referenced type is reference-related to E, the object or reference is
initialized from that element"


Can that be fixed in a follow up?


Sure.


 if (CONSTRUCTOR_NELTS (expr) == 1)
{
  tree elt = CONSTRUCTOR_ELT (expr, 0)->value;
@@ -6982,6 +6992,27 @@ maybe_inform_about_fndecl_for_bogus_argument_init (tree 
fn, int argnum)
"  initializing argument %P of %qD", argnum, fn);
   }
+/* Maybe warn about C++20 Conversions to arrays of unknown bound.  C is
+   the conversion, EXPR is the expression we're converting.  */
+
+static void
+maybe_warn_array_conv (location_t loc, conversion *c, tree expr)
+{
+  if (cxx_dialect >= cxx2a)
+return;
+
+  tree type = TREE_TYPE (expr);
+  type = strip_pointer_operator (type);
+
+  if (TREE_CODE (type) != ARRAY_TYPE
+  || TYPE_DOMAIN (type) == NULL_TREE)
+return;
+
+  if (conv_binds_to_array_of_unknown_bound (c))
+pedwarn (loc, OPT_Wpedantic, "conversions to arrays of unknown bound "
+"are only available with %<-std=c++2a%> or %<-std=gnu++2a%>");
+}
+
   /* Perform the conversions in CONVS on the expression EXPR.  FN and
  ARGNUM are used for diagnostics.  ARGNUM is zero based, -1
  indicates the `this' argument of a method.  INNER is nonzero when
@@ -7401,8 +7432,20 @@ convert_like_real (conversion *convs, tree expr, tree 
fn, int argnum,
  error_at (loc, "cannot bind non-const lvalue reference of "
"type %qH to an rvalue of type %qI", totype, extype);
else if (!reference_compatible_p (TREE_TYPE (totype), extype))
- error_at (loc, "binding reference of type %qH to %qI "
-   "discards qualifiers", totype, extype);
+ {
+   /* If we're converting from T[] to T[N], don't talk
+  about discarding qualifiers.  (Converting from T[N] to
+  T[] is allowed by P0388R4.)  */
+   if (TREE_CODE (extype) == ARRAY_TYPE
+   && TYPE_DOMAIN (extype) == NULL_TREE
+   && TREE_CODE (TREE_TYPE (totype)) == ARRAY_TYPE
+   && TYPE_DOMAIN (TREE_TYPE (totype)) != NULL_TREE)
+ error_at (loc, "binding reference of type %qH to %qI "
+   "discards array bounds", totype, extype);


If we're converting to T[N], that would be adding, not discarding, array
bounds?


True, I've reworded the error message.


+   else
+ error_at (loc, "binding reference of type %qH to %qI "
+   "discards qualifiers", totype, extype);
+ }
else
  gcc_unreachable ();
maybe_print_user_conv_context (convs);
@@ -7410,6 +7453,8 @@ convert_like_real (conversion *convs, tree expr, tree fn, 
int argnum,
return error_mark_node;
  }
+   else if (complain & tf_warning)
+ maybe_warn_array_conv (loc, convs, expr);
/* If necessary, create a temporary.
@@ -7493,7 +7538,10 @@ convert_like_real (conversion *convs, tree expr, tree 
fn, int argnum,
   case ck_qual:
 /* Warn about deprecated conversion if appropriate.  */
 if (complain & tf_warning)
-   string_conv_p (totype, expr, 1);
+   {
+ string_conv_p (totype, expr, 1);
+ maybe_warn_array_conv (loc, convs, 

[PR 70929] IPA call type compatibility fix/workaround

2019-10-08 Thread Martin Jambor
Hi,

I've been looking at PR 70929 and at the newly reported duplicate PR
91988 and would like to propose the following patch to address them.
Basically, the idea is that if the types of arguments in TYPE_ARG_TYPES
(as opposed to DECL_ARGUMENTS) from both the type from the target fndecl
and from call statement fntype match, then we can assume that whatever
promotions happened to the arguments they are the same in all
compilation units and the calls will be compatible.  I inserted this
test in the middle of gimple_check_call_args and it works for testcase
from both bugs.
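
To make the idea concrete, here is a minimal sketch of the lockstep comparison. It is not the GCC code: ArgTypes, arg_types_compatible and the use of type-name strings are hypothetical stand-ins for the two TYPE_ARG_TYPES chains, and a trailing "void" is assumed to mark the end of a prototyped, non-variadic list. The per-argument fold_convertible_p check from the patch is left out.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a TYPE_ARG_TYPES chain: a list of type names,
// ending in "void" when the function is prototyped and not variadic.
using ArgTypes = std::vector<std::string>;

// Walk both lists in lockstep.  Return true only if they name the same
// sequence of types and the terminating "void" is reached exactly after
// NARGS actual arguments.
bool arg_types_compatible (const ArgTypes &callee, const ArgTypes &call_fntype,
                           std::size_t nargs)
{
  for (std::size_t i = 0;; ++i)
    {
      // One list ended without the "void" terminator: cannot verify.
      if (i >= callee.size () || i >= call_fntype.size ())
        return false;
      // End of the callee's list: the call's list must end here too, and
      // the number of actual arguments must match.
      if (callee[i] == "void")
        return call_fntype[i] == "void" && i == nargs;
      // More formal types than actual arguments, or differing types.
      if (i >= nargs || callee[i] != call_fntype[i])
        return false;
    }
}
```

In this toy model, a call with two arguments matches only when both lists read {"int", "long", "void"}; any difference in length or in a single type makes the check fail, which mirrors the conservative "return false if we cannot verify" behaviour described above.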

The new test of course can be fooled with programs with clearly
undefined behavior, for example by having an indirect call which early
optimizations would discover to call an incompatible function - but the
change considered in comment #5 of the bug would be too.  Moreover,
unless we stream argument types one will always be able to fool the
compiler and I find being careful about those and not inlining valid
calls with references to TREE_ADDRESSABLE classes a bad tradeoff.

I verified that - at least on gcc/libstdc++ testsuites - the new
gimple_check_call_args never returns false when the old one would return
true.  I can imagine us not doing the

  fold_convertible_p (TREE_VALUE (f), arg)

bit or returning false whenever we reach the line

  tree p;

and the function has any parameters at all.  This would make the
function return false for un-prototyped and/or K&R function
declarations, but perhaps we don't care too much?

In any event, I have bootstrapped and tested this on x86_64-linux, is it
perhaps OK for trunk?

Martin


2019-10-03  Martin Jambor  

PR lto/70929
* cgraph.c (gimple_check_call_args): Also compare types of argument
types and call statement fntype types.

testsuite/
* g++.dg/lto/pr70929_[01].C: New test.
---
 gcc/cgraph.c | 83 ++--
 gcc/testsuite/g++.dg/lto/pr70929_0.C | 18 ++
 gcc/testsuite/g++.dg/lto/pr70929_1.C | 10 
 3 files changed, 95 insertions(+), 16 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/lto/pr70929_0.C
 create mode 100644 gcc/testsuite/g++.dg/lto/pr70929_1.C

diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index 0c3c6e7cac4..5a4c5253b49 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -3636,26 +3636,19 @@ cgraph_node::get_fun () const
 static bool
 gimple_check_call_args (gimple *stmt, tree fndecl, bool args_count_match)
 {
-  tree parms, p;
-  unsigned int i, nargs;
-
   /* Calls to internal functions always match their signature.  */
   if (gimple_call_internal_p (stmt))
 return true;
 
-  nargs = gimple_call_num_args (stmt);
+  unsigned int nargs = gimple_call_num_args (stmt);
 
-  /* Get argument types for verification.  */
-  if (fndecl)
-parms = TYPE_ARG_TYPES (TREE_TYPE (fndecl));
-  else
-parms = TYPE_ARG_TYPES (gimple_call_fntype (stmt));
-
-  /* Verify if the type of the argument matches that of the function
- declaration.  If we cannot verify this or there is a mismatch,
- return false.  */
+  /* If we have access to the declarations of formal parameters, match against
+ those.  */
   if (fndecl && DECL_ARGUMENTS (fndecl))
 {
+  unsigned int i;
+  tree p;
+
   for (i = 0, p = DECL_ARGUMENTS (fndecl);
   i < nargs;
   i++, p = DECL_CHAIN (p))
@@ -3676,17 +3669,75 @@ gimple_check_call_args (gimple *stmt, tree fndecl, bool 
args_count_match)
}
   if (args_count_match && p)
return false;
+
+  return true;
 }
-  else if (parms)
+
+  /* If we don't have decls of formal parameters, see if we have both the type
+ of the callee arguments in the fntype of the call statement and compare
+ those.  We rely on the fact that whatever promotions happened to
+ declarations for exactly the same sequence of types of parameters also
+ happened on the callee side.  */
+  if (fndecl
+  && TYPE_ARG_TYPES (TREE_TYPE (fndecl))
+  && TYPE_ARG_TYPES (gimple_call_fntype (stmt)))
 {
-  for (i = 0, p = parms; i < nargs; i++, p = TREE_CHAIN (p))
+  unsigned int arg_idx = 0;
+  for (tree f = TYPE_ARG_TYPES (TREE_TYPE (fndecl)),
+s = TYPE_ARG_TYPES (gimple_call_fntype (stmt));
+  f || s;
+  f = TREE_CHAIN (f), s = TREE_CHAIN (s), arg_idx++)
{
+ if (!f
+ || !s
+ || TREE_VALUE (f) == error_mark_node
+ || TREE_VALUE (s) == error_mark_node)
+   return false;
+ if (TREE_CODE (TREE_VALUE (f)) == VOID_TYPE)
+   {
+ if (TREE_CODE (TREE_VALUE (s)) != VOID_TYPE
+ || arg_idx != nargs)
+   return false;
+ else
+   break;
+   }
+
  tree arg;
+
+ if (arg_idx >= nargs
+ || (arg = gimple_call_arg (stmt, arg_idx)) == error_mark_node)
+   return false;
+
+ if (TREE_CODE (TREE_VALUE (s)) == VOID_TYPE
+ 

Re: libgomp test report with nvidia offloading

2019-10-08 Thread Jakub Jelinek
On Tue, Oct 08, 2019 at 06:30:36PM +0200, Tom de Vries wrote:
> I just built gcc trunk with nvptx offloading, and ran libgomp on a
> Quadro M1200 with driver version 390.87.
> 
> On the OpenMP side, there are the known errors plus a new internal
> compiler error.

One issue is the usleep problem which is unimplemented in nvptx newlib,
that might get fixed through declare variant directive (have parsing for
that roughly done, but not more than that so far).
Another thing is unimplemented link support in NVPTX offloading, that just
needs work.

> On the OpenACC side, there's the usual lib-81.c timeout, plus a large
> number of new failures in host_data test-cases.

From my POV for the OpenACC tests, it would be nice if tests that actually
include  or even link against -lcuda, if they don't work with the
--without-cuda-driver setup, would be just UNSUPPORTED, otherwise it is too
much clutter.

Jakub


Re: [PATCH 1/2][GCC][RFC][middle-end]: Add SLP pattern matcher.

2019-10-08 Thread Tamar Christina
Hi Richi,

Thanks for the review, I've added some comments inline.

The 10/07/2019 12:15, Richard Biener wrote:
> On Tue, 1 Oct 2019, Tamar Christina wrote:
> 
> > Hi All,
> > 
> > This adds a framework to allow pattern matchers to be written based on
> > the
> > SLP tree.  The difference between this one and the one in 
> > tree-vect-patterns is
> > that this matcher allows the matching of an arbitrary number of parallel
> > statements and replacing of an arbitrary number of children or statements.
> > 
> > Any relationship created by the SLP pattern matcher will be undone if SLP 
> > fails.
> > 
> > The pattern matcher can also cancel all permutes depending on what the 
> > pattern
> > requested it to do.  As soon as one pattern requests the permutes to be
> > cancelled all permutes are cancelled.
> > 
> > Compared to the previous pattern matcher this one will work for an arbitrary
> > group size and will match at any arbitrary node in the SLP tree.  The only
> > requirement is that the entire node is matched or rejected.
> > 
> > vect_build_slp_tree_1 is a bit more lenient in what it accepts as 
> > "compatible
> > operations" but the matcher cannot be because in cases where you match the 
> > order
> > of the operands may be changed.  So all operands must be changed or none.
> > 
> > Furthermore the matcher relies on canonicalization of the operations inside the
> > SLP tree and on the fact that floating math operations are not commutative.
> > This means that matching a pattern does not need to try all alternatives or
> > combinations and that arguments will always be in the same order if it's the
> > same operation.
> > 
> > The pattern matcher also ignores uninteresting nodes such as type casts,
> > loads
> > and stores.  Doing so is essential to keep the runtime down.
> > 
> > Each matcher is allowed a post condition that can be run to perform any 
> > changes
> > to the SLP tree as needed before the patterns are created and may also abort
> > the creation of the patterns.
> > 
> > When a pattern is matched it is not immediately created but instead it is
> > deferred until all statements in the node have been analyzed.  Only if all 
> > post
> > conditions are true, and all statements will be replaced will the patterns 
> > be
> > created in batch.  This allows us to not have to undo any work if the 
> > pattern
> > fails but also makes it so we traverse the tree only once.
> > 
> > When a new pattern is created it is marked as a pattern for the statement
> > it is replacing and is marked as used in the current SLP scope.  If SLP
> > fails then the relationship is undone and the relevancy restored.
> > 
> > Each pattern matcher can detect any number of patterns it wants.  The only
> > constraint is that the optabs they produce must all have the same arity.
> > 
> > The pattern matcher supports instructions that have no scalar form as they
> > are added as pattern statements to the stmt.  The BB is left untouched and
> > so the scalar loop is untouched.
> > 
> > Bootstrapped on aarch64-none-linux-gnu and no issues.
> > No regression testing done yet.
> 
> If you split out the introduction of SLP_TREE_REF_COUNT you can commit
> that right now (sorry for being too lazy there...).
> 

I'll split those off :)

> One overall comment - you do pattern matching after SLP tree
> creation (good) but still do it before the whole SLP graph is
> created (bad).  Would it be possible to instead do it as a separate
> phase in vect_analyze_slp, looping over all instances (the instances
> present entries into the single unified SLP graph now), avoiding
> to visit "duplicates"?
> 

It should be, the only issue I can see is that build SLP may fail because of
an unsupported permute, or because it can use load lanes.  If I'm understanding
it correctly you wouldn't get SLP vectorization in those cases so then the
matching can't work? So it would limit it a bit more.

> In patch 1/2 I see references (mostly in variable names) to
> "complex" which is IMHO too specific.
> 

Sorry, missed those during my cleanup.

> I'd also think that a useful first pattern to support would be
> what we do with SLP_TREE_TWO_OPERATORS, where code generation
> inserts extra blends.  I have yet to dive into the complex pattern
> details to see if that's feasible or if you maybe re-use that...

Oh, I hadn't thought of that. I'll take a look.

> You seem to at least rely on the SLP build succeeding with
> the mixed plus/minus cases?  Which also restricts the kind of
> patterns we can recognize I guess.  Plus it shows the chicken-and-egg
> issue here - we'd like to detect the patterns _first_ and then
> build the SLP trees (or rather combine lanes into larger groups).
> Doing it after the fact makes it no more powerful than doing
> it as folding post vectorization?

It's true that I do rely on build SLP succeeding, and there is one case I know
of where SLP build fails when I didn't expect it to.  But I was operating under
the impression that 

Re: libgomp test report with nvidia offloading

2019-10-08 Thread Thomas Schwinge
Hi Tom!

On 2019-10-08T18:30:36+0200, Tom de Vries  wrote:
> I just built gcc trunk with nvptx offloading, and ran libgomp on a
> Quadro M1200 with driver version 390.87.
>
> On the OpenMP side, there are the known errors plus a new internal
> compiler error.
>
> On the OpenACC side, there's the usual lib-81.c timeout, plus a large
> number of new failures in host_data test-cases.
>
> Can you reproduce these new failures? If so, are these already filed
> somewhere?

Reproduced, and just filed earlier today:

> FAIL: libgomp.fortran/pr90779.f90   -O  (internal compiler error)
> FAIL: libgomp.fortran/pr90779.f90   -O  (test for excess errors)

 "Regression: 'libgomp.fortran/pr90779.f90'
ICE for nvptx offloading".

> FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/host_data-1.c
> -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0
> execution test
> FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/host_data-1.c
> -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2
> execution test
> FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/host_data-2.c
> -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0
> execution test
> FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/host_data-2.c
> -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2
> execution test
> FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/host_data-4.c
> -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0
> execution test
> FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/host_data-4.c
> -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2
> execution test
> FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/host_data-5.c
> -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0
> execution test
> FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/host_data-5.c
> -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2
> execution test

> FAIL: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/host_data-1.c
> -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0
> execution test
> FAIL: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/host_data-1.c
> -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2
> execution test
> FAIL: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/host_data-2.c
> -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0
> execution test
> FAIL: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/host_data-2.c
> -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2
> execution test
> FAIL: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/host_data-4.c
> -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0
> execution test
> FAIL: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/host_data-4.c
> -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2
> execution test
> FAIL: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/host_data-5.c
> -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0
> execution test
> FAIL: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/host_data-5.c
> -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2
> execution test

> FAIL: libgomp.oacc-fortran/host_data-1.f90 -DACC_DEVICE_TYPE_nvidia=1
> -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0  execution test
> FAIL: libgomp.oacc-fortran/host_data-1.f90 -DACC_DEVICE_TYPE_nvidia=1
> -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O1  execution test
> FAIL: libgomp.oacc-fortran/host_data-1.f90 -DACC_DEVICE_TYPE_nvidia=1
> -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2  execution test
> FAIL: libgomp.oacc-fortran/host_data-1.f90 -DACC_DEVICE_TYPE_nvidia=1
> -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O3 -fomit-frame-pointer
> -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
> FAIL: libgomp.oacc-fortran/host_data-1.f90 -DACC_DEVICE_TYPE_nvidia=1
> -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O3 -g  execution test
> FAIL: libgomp.oacc-fortran/host_data-1.f90 -DACC_DEVICE_TYPE_nvidia=1
> -DACC_MEM_SHARED=0 -foffload=nvptx-none  -Os  execution test
> FAIL: libgomp.oacc-fortran/host_data-2.f90 -DACC_DEVICE_TYPE_nvidia=1
> -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0  execution test
> FAIL: libgomp.oacc-fortran/host_data-2.f90 -DACC_DEVICE_TYPE_nvidia=1
> -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O1  execution test
> FAIL: libgomp.oacc-fortran/host_data-2.f90 -DACC_DEVICE_TYPE_nvidia=1
> -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2  execution test
> FAIL: libgomp.oacc-fortran/host_data-2.f90 -DACC_DEVICE_TYPE_nvidia=1
> -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O3 -fomit-frame-pointer
> -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
> FAIL: libgomp.oacc-fortran/host_data-2.f90 -DACC_DEVICE_TYPE_nvidia=1
> -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O3 -g  execution test
> FAIL: libgomp.oacc-fortran/host_data-2.f90 

libgomp test report with nvidia offloading

2019-10-08 Thread Tom de Vries
Hi,

I just built gcc trunk with nvptx offloading, and ran libgomp on a
Quadro M1200 with driver version 390.87.

On the OpenMP side, there are the known errors plus a new internal
compiler error.

On the OpenACC side, there's the usual lib-81.c timeout, plus a large
number of new failures in host_data test-cases.

Can you reproduce these new failures? If so, are these already filed
somewhere?

Thanks,
- Tom

--- OPENMP ---

FAIL: libgomp.c/target-32.c (test for excess errors)
FAIL: libgomp.c/target-33.c execution test
FAIL: libgomp.c/target-34.c execution test
FAIL: libgomp.c/target-link-1.c execution test
FAIL: libgomp.c/thread-limit-2.c (test for excess errors)
FAIL: libgomp.fortran/pr90779.f90   -O  (internal compiler error)
FAIL: libgomp.fortran/pr90779.f90   -O  (test for excess errors)

--- OPENACC ---

FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/host_data-1.c
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0
execution test
FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/host_data-1.c
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2
execution test
FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/host_data-2.c
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0
execution test
FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/host_data-2.c
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2
execution test
FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/host_data-4.c
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0
execution test
FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/host_data-4.c
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2
execution test
FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/host_data-5.c
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0
execution test
FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/host_data-5.c
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2
execution test
FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/lib-81.c
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0
execution test
FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/lib-81.c
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2
execution test
FAIL: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/host_data-1.c
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0
execution test
FAIL: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/host_data-1.c
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2
execution test
FAIL: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/host_data-2.c
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0
execution test
FAIL: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/host_data-2.c
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2
execution test
FAIL: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/host_data-4.c
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0
execution test
FAIL: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/host_data-4.c
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2
execution test
FAIL: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/host_data-5.c
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0
execution test
FAIL: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/host_data-5.c
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2
execution test
FAIL: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/lib-81.c
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0
execution test
FAIL: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/lib-81.c
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2
execution test
FAIL: libgomp.oacc-fortran/host_data-1.f90 -DACC_DEVICE_TYPE_nvidia=1
-DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0  execution test
FAIL: libgomp.oacc-fortran/host_data-1.f90 -DACC_DEVICE_TYPE_nvidia=1
-DACC_MEM_SHARED=0 -foffload=nvptx-none  -O1  execution test
FAIL: libgomp.oacc-fortran/host_data-1.f90 -DACC_DEVICE_TYPE_nvidia=1
-DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2  execution test
FAIL: libgomp.oacc-fortran/host_data-1.f90 -DACC_DEVICE_TYPE_nvidia=1
-DACC_MEM_SHARED=0 -foffload=nvptx-none  -O3 -fomit-frame-pointer
-funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
FAIL: libgomp.oacc-fortran/host_data-1.f90 -DACC_DEVICE_TYPE_nvidia=1
-DACC_MEM_SHARED=0 -foffload=nvptx-none  -O3 -g  execution test
FAIL: libgomp.oacc-fortran/host_data-1.f90 -DACC_DEVICE_TYPE_nvidia=1
-DACC_MEM_SHARED=0 -foffload=nvptx-none  -Os  execution test
FAIL: libgomp.oacc-fortran/host_data-2.f90 -DACC_DEVICE_TYPE_nvidia=1
-DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0  execution test
FAIL: libgomp.oacc-fortran/host_data-2.f90 -DACC_DEVICE_TYPE_nvidia=1
-DACC_MEM_SHARED=0 -foffload=nvptx-none  -O1  execution test
FAIL: 

Re: [RFA][1/3] Remove Cell Broadband Engine SPU targets

2019-10-08 Thread Ulrich Weigand
Thomas Schwinge wrote:

> In r276692, I committed the attached to "Add back Trevor Smigiel; move
> into Write After Approval section", assuming that it was just an
> oversight that he got dropped from the file, as he -- in contrast to
> David and Ulrich -- doesn't remain listed with other roles.

Ah, you're right -- thanks for catching this!

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com



Re: [Patch 1/2][ipa] Add target hook to sanitize cloned declaration's attributes

2019-10-08 Thread Martin Jambor
Hi,

On Tue, Oct 08 2019, Andre Vieira (lists) wrote:
> Hi,
>
> This patch adds a new target hook called 
> TARGET_HOOK_SANITIZE_CLONE_ATTRIBUTES.  This hook is meant to give each 
> target the ability to sanitize a cloned declaration's attribute list.
>
>
> Is this OK for trunk?
>
> Cheers,
> Andre
>
> gcc/ChangeLog:
>
> 2019-10-08  Andre Vieira  
>
>   * ipa-cp.c(create_specialized_node): Call new target hook when
>  creating a new node.

I'm quite sure you don't want to call the hook here but from within
cgraph_node::create_virtual_clone (and perhaps also
cgraph_node::create_version_clone_with_body?), if only now there are two
passes creating clones and there may easily be more in the future.

Thanks,

Martin

>  * target.def(sanitize_clone_attributes): New target hook.
>  * targhooks.c(default_sanitize_clone_attributes): New.
>  * targhooks.h(default_sanitize_clone_attributes): New.
>  * doc/tm.texi.in: Document new target hook.
>  * doc/tm.texi: Regenerated


C++ PATCH to add test for DR 685

2019-10-08 Thread Marek Polacek
I noticed we've fixed this DR already.  It added [conv.prom]/4 and
type_promotes_to implements it:

  /* ISO C++17, 7.6/4.  A prvalue of an unscoped enumeration type
 whose underlying type is fixed (10.2) can be converted to a
 prvalue of its underlying type. Moreover, if integral promotion
 can be applied to its underlying type, a prvalue of an unscoped
 enumeration type whose underlying type is fixed can also be 
 converted to a prvalue of the promoted underlying type.  */

Tested on x86_64-linux, applying to trunk.

2019-10-08  Marek Polacek  

DR 685 - Integral promotion of enum ignores fixed underlying type.
* g++.dg/cpp0x/scoped_enum9.C: New test.

diff --git gcc/testsuite/g++.dg/cpp0x/scoped_enum9.C 
gcc/testsuite/g++.dg/cpp0x/scoped_enum9.C
new file mode 100644
index 000..f38f26dc35b
--- /dev/null
+++ gcc/testsuite/g++.dg/cpp0x/scoped_enum9.C
@@ -0,0 +1,11 @@
+// DR 685 - Integral promotion of enumeration ignores fixed underlying type.
+// { dg-do compile { target c++11 } }
+
+enum E: long { e };
+
+void f(int);
+int f(long);
+
+void g() {
+  int k = f(e);
+}


Re: Type representation in CTF and DWARF

2019-10-08 Thread Pedro Alves
On 10/4/19 8:23 PM, Indu Bhagat wrote:
> Hello,
> 
> At GNU Tools Cauldron this year, some folks were curious to know more on how
> the "type representation" in CTF compares vis-a-vis DWARF.

I was one of those, and I brought this up to Jose, after your
presentation.  Glad to see the follow up!  Thanks much for this.

In your Cauldron presentation we saw CTF compared to full blown DWARF
as justification for CTF, but I was more interested in a comparison between
CTF and a DWARF subset containing exactly only what you have available in
CTF.  Because if DWARF with everything-you-don't-need stripped out
is in the same ballpark, then I am puzzled on why add/maintain a new
Debug format, with all the duplication of effort that entails going
forward.

Also, it's my understanding that the current CTF format doesn't yet
support C++, Vector registers, etc., maybe other things, so if DWARF
was sufficient for your needs, then in the long run it sounds like
a better option to me, as then you wouldn't have to extend CTF _and_
DWARF whenever some feature is needed.

Maybe it would make sense to work on integrating CTF into the DWARF
standard itself, not sure?

I was also curious on your plans for adding unwinding support to CTF,
while the kernel (the main CTF user, IIUC), already has plans to 
use its own unwinding format (ORC)?

So with all those questions, I came out of the presentation
thinking that I could not really justify CTF if I were asked to.

(Side note: the Cauldron page is missing slides for your
presentation, so I couldn't go and recheck some things
mentioned above.)

Thanks,
Pedro Alves



[Patch 2/2][Arm] Implement TARGET_HOOK_SANITIZE_CLONE_ATTRIBUTES to remove cmse_nonsecure_entry

2019-10-08 Thread Andre Vieira (lists)

Hi,

This patch implements the TARGET_HOOK_SANITIZE_CLONE_ATTRIBUTES for the
arm backend to remove the 'cmse_nonsecure_entry' attribute from cloned
declarations.
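
As a rough illustration of what the hook is meant to achieve, the following toy model strips the attribute when a clone is made. ToyNode, toy_sanitize_clone_attributes and toy_make_clone are hypothetical names and do not correspond to GCC's cgraph_node API; attributes are modelled as plain strings.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for a cgraph node: a name plus an attribute list.
struct ToyNode
{
  std::string name;
  std::vector<std::string> attributes;
};

// Hypothetical analogue of the target hook: drop attributes that must not
// be inherited by a clone.
void toy_sanitize_clone_attributes (ToyNode &clone)
{
  auto &attrs = clone.attributes;
  attrs.erase (std::remove (attrs.begin (), attrs.end (),
                            std::string ("cmse_nonsecure_entry")),
               attrs.end ());
}

// Create a clone the way a cloning pass might: copy, rename, sanitize.
// The original node keeps its attributes untouched.
ToyNode toy_make_clone (const ToyNode &orig, const std::string &suffix)
{
  ToyNode clone = orig;
  clone.name += suffix;
  toy_sanitize_clone_attributes (clone);
  return clone;
}
```

Cloning a node named "foo" with the suffix ".constprop.0" gives a clone without the entry attribute while "foo" itself keeps it, which is the behaviour the scan-assembler directives in the new test check for.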


Bootstrapped the series on x86_64 and built arm-none-eabi, running the 
cmse testsuite for armv8-m.main and armv8-m.base.


Is this OK for trunk?

Cheers,
Andre

gcc/ChangeLog:

2019-10-08  Andre Vieira  

* config/arm/arm.c (TARGET_SANITIZE_CLONE_ATTRIBUTES): Define.
(arm_sanitize_clone_attributes): New.

gcc/testsuite/ChangeLog:
2019-10-08  Andre Vieira  

* gcc.target/arm/cmse/ipa-clone.c: New test.
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 9f0975dc0710626ef46abecaa3a205e993821118..173172bd28303469faded6b7a84a0b353b62de18 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -811,6 +811,9 @@ static const struct attribute_spec arm_attribute_table[] =
 
 #undef TARGET_CONSTANT_ALIGNMENT
 #define TARGET_CONSTANT_ALIGNMENT arm_constant_alignment
+
+#undef TARGET_SANITIZE_CLONE_ATTRIBUTES
+#define TARGET_SANITIZE_CLONE_ATTRIBUTES arm_sanitize_clone_attributes
 
 /* Obstack for minipool constant handling.  */
 static struct obstack minipool_obstack;
@@ -31999,6 +32002,15 @@ arm_constant_alignment (const_tree exp, HOST_WIDE_INT align)
   return align;
 }
 
+static void
+arm_sanitize_clone_attributes (struct cgraph_node * node)
+{
+  tree attrs = DECL_ATTRIBUTES (node->decl);
+  if (lookup_attribute ("cmse_nonsecure_entry", attrs))
+attrs = remove_attribute ("cmse_nonsecure_entry", attrs);
+  DECL_ATTRIBUTES (node->decl) = attrs;
+}
+
 /* Emit a speculation barrier on target architectures that do not have
DSB/ISB directly.  Such systems probably don't need a barrier
themselves, but if the code is ever run on a later architecture, it
diff --git a/gcc/testsuite/gcc.target/arm/cmse/ipa-clone.c b/gcc/testsuite/gcc.target/arm/cmse/ipa-clone.c
new file mode 100644
index ..6ab4c34f7499f9615b5d44c633bb5f9d69e88d39
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/cmse/ipa-clone.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-options "-mcmse" }  */
+/* { dg-additional-options "-Ofast" } */
+
+#include 
+
+int __attribute__ ((cmse_nonsecure_entry))
+foo (int a)
+{
+return 1;
+}
+
+int __attribute__ ((cmse_nonsecure_entry))
+bar (int a)
+{
+return 1;
+}
+
+int main (void)
+{
+return 0;
+}
+
+/* { dg-final { scan-assembler "foo.constprop.0:" } } */
+/* { dg-final { scan-assembler-not "__acle_se_foo.constprop.0:" } } */
+/* { dg-final { scan-assembler "foo:" } } */
+/* { dg-final { scan-assembler "__acle_se_foo:" } } */
+


[Patch 1/2][ipa] Add target hook to sanitize cloned declaration's attributes

2019-10-08 Thread Andre Vieira (lists)

Hi,

This patch adds a new target hook called 
TARGET_HOOK_SANITIZE_CLONE_ATTRIBUTES.  This hook is meant to give each 
target the ability to sanitize a cloned declaration's attribute list.



Is this OK for trunk?

Cheers,
Andre

gcc/ChangeLog:

2019-10-08  Andre Vieira  

* ipa-cp.c(create_specialized_node): Call new target hook when
creating a new node.
* target.def(sanitize_clone_attributes): New target hook.
* targhooks.c(default_sanitize_clone_attributes): New.
* targhooks.h(default_sanitize_clone_attributes): New.
* doc/tm.texi.in: Document new target hook.
* doc/tm.texi: Regenerated
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index a86c210d4fe390bd0356b6e50ba7c6c34a36239a..5c1149ccf47576d8d803d1ab712f6f1b9342d74c 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -12039,6 +12039,11 @@ If defined, this function returns an appropriate alignment in bits for an atomic
 ISO C11 requires atomic compound assignments that may raise floating-point exceptions to raise exceptions corresponding to the arithmetic operation whose result was successfully stored in a compare-and-exchange sequence.  This requires code equivalent to calls to @code{feholdexcept}, @code{feclearexcept} and @code{feupdateenv} to be generated at appropriate points in the compare-and-exchange sequence.  This hook should set @code{*@var{hold}} to an expression equivalent to the call to @code{feholdexcept}, @code{*@var{clear}} to an expression equivalent to the call to @code{feclearexcept} and @code{*@var{update}} to an expression equivalent to the call to @code{feupdateenv}.  The three expressions are @code{NULL_TREE} on entry to the hook and may be left as @code{NULL_TREE} if no code is required in a particular place.  The default implementation leaves all three expressions as @code{NULL_TREE}.  The @code{__atomic_feraiseexcept} function from @code{libatomic} may be of use as part of the code generated in @code{*@var{update}}.
 @end deftypefn
 
+@deftypefn {Target Hook} void TARGET_SANITIZE_CLONE_ATTRIBUTES (struct cgraph_node *@var{})
+This target hook sanitizes the cloned @code{*@var{node}} declaration
+attributes.
+@end deftypefn
+
 @deftypefn {Target Hook} void TARGET_RECORD_OFFLOAD_SYMBOL (tree)
 Used when offloaded functions are seen in the compilation unit and no named
 sections are available.  It is called once for each symbol that must be
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 06dfcda35abea7396c288a59c38ee4ef57c6fef6..f4ef164a595ca1c2e286a7d26020b63b4c966728 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -8134,6 +8134,8 @@ and the associated definitions of those functions.
 
 @hook TARGET_ATOMIC_ASSIGN_EXPAND_FENV
 
+@hook TARGET_SANITIZE_CLONE_ATTRIBUTES
+
 @hook TARGET_RECORD_OFFLOAD_SYMBOL
 
 @hook TARGET_OFFLOAD_OPTIONS
diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index b4fb74e097e2c9f80aa38f8e150be593200d9bdf..9c2154fa78d59873243fc7d7717fb9cab83101ac 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -124,6 +124,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-ssa-ccp.h"
 #include "stringpool.h"
 #include "attribs.h"
+#include "target.h"
 
 template  class ipcp_value;
 
@@ -4017,6 +4018,9 @@ create_specialized_node (struct cgraph_node *node,
   ipcp_discover_new_direct_edges (new_node, known_csts, known_contexts, aggvals);
 
   callers.release ();
+
+  targetm.sanitize_clone_attributes (new_node);
+
   return new_node;
 }
 
diff --git a/gcc/target.def b/gcc/target.def
index f9446fa05a22c79154c2ef36d3d8aea48a5efcc6..18301565e5131fa8b2f194bc73aa0c55d30cb9ef 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -6582,6 +6582,13 @@ DEFHOOK
  void, (tree *hold, tree *clear, tree *update),
  default_atomic_assign_expand_fenv)
 
+DEFHOOK
+(sanitize_clone_attributes,
+"This target hook sanitizes the cloned @code{*@var{node}} declaration\n\
+attributes.",
+void, (struct cgraph_node *),
+default_sanitize_clone_attributes)
+
 /* Leave the boolean fields at the end.  */
 
 /* True if we can create zeroed data by switching to a BSS section
diff --git a/gcc/targhooks.h b/gcc/targhooks.h
index 5aba67660f85406b9fd475e75a3cc65b0d1952f5..0191d0ed36eb60fc79bf718b9782ef6c99c77fc4 100644
--- a/gcc/targhooks.h
+++ b/gcc/targhooks.h
@@ -284,4 +284,6 @@ extern rtx default_speculation_safe_value (machine_mode, rtx, rtx, rtx);
 extern void default_remove_extra_call_preserved_regs (rtx_insn *,
 		  HARD_REG_SET *);
 
+extern void default_sanitize_clone_attributes (struct cgraph_node *);
+
 #endif /* GCC_TARGHOOKS_H */
diff --git a/gcc/targhooks.c b/gcc/targhooks.c
index ed77afb1da57e59bc0725dc0d6fac477391bae03..35d2114956bd27f250335d918d0b3d81d962472b 100644
--- a/gcc/targhooks.c
+++ b/gcc/targhooks.c
@@ -2120,6 +2120,12 @@ default_atomic_assign_expand_fenv (tree *, tree *, tree *)
 {
 }
 
+/* Default implementation of TARGET_SANITIZE_CLONE_ATTRIBUTES.  */
+
+void default_sanitize_clone_attributes (struct cgraph_node *node)
+{
+}
+
 

[Patch 0/2][ipa, Arm] Fix cloning of 'cmse_nonsecure_entry' functions

2019-10-08 Thread Andre Vieira (lists)

Hi,

This patch series is aimed at fixing an issue when cloning a function 
with 'cmse_nonsecure_entry' attribute.
Currently if gcc determines to clone a function with this attribute, 
both the cloned and original declarations will be compiled as nonsecure 
entry functions.  The linker eventually complains about a non-global 
cmse entry function and errors out.


It is legal to clone these functions, since the clone will only be used 
directly by the same translation unit as the original cmse entry 
function.  However, the clones should not be marked as entry functions. 
So in patch 1 I'll add a target hook that allows each target to sanitize 
the cloned declaration's attributes and in patch 2 I'll make the arm's 
implementation of it remove 'cmse_nonsecure_entry' from the cloned 
declaration's attribute list.
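
As a rough model of what removing the attribute from the clone means, the
sketch below filters one name out of a singly linked attribute list.  It is
an assumption-laden toy, not GCC's tree representation: 'toy_attr',
'toy_remove_attribute' and 'toy_has_attribute' are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Toy attribute list entry; GCC uses tree lists instead.  */
struct toy_attr
{
  const char *name;
  struct toy_attr *next;
};

/* Unlink every entry called NAME from LIST and return the new head.
   Entries are not freed, since the caller may own them.  */
struct toy_attr *
toy_remove_attribute (const char *name, struct toy_attr *list)
{
  struct toy_attr **p = &list;
  while (*p != NULL)
    {
      if (strcmp ((*p)->name, name) == 0)
        *p = (*p)->next;
      else
        p = &(*p)->next;
    }
  return list;
}

/* Return nonzero if LIST contains an entry called NAME.  */
int
toy_has_attribute (const char *name, const struct toy_attr *list)
{
  for (; list != NULL; list = list->next)
    if (strcmp (list->name, name) == 0)
      return 1;
  return 0;
}
```

Only the named attribute is dropped; anything else on the clone's list
stays in place.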


Andre Vieira (2)
[Patch 1/2][ipa] Add target hook to sanitize cloned declaration's attributes
[Patch 2/2][Arm] Implement TARGET_HOOK_SANITIZE_CLONE_ATTRIBUTES to 
remove cmse_nonsecure_entry


Bootstrapped the series on x86_64 and built arm-none-eabi, running the 
cmse testsuite for armv8-m.main and armv8-m.base.


Cheers,
Andre


Re: [PATCHv2] Change the library search path when using --with-advance-toolchain

2019-10-08 Thread Segher Boessenkool
On Fri, Oct 04, 2019 at 06:31:34PM -0300, Tulio Magno Quites Machado Filho 
wrote:
> Remove all -L directories from LINK_OS_EXTRA_SPEC32 and
> LINK_OS_EXTRA_SPEC64 so that user directories specified at
> build time have higher preference over the advance toolchain libraries.
> 
> Set MD_STARTFILE_PREFIX to $prefix/lib/ and MD_STARTFILE_PREFIX_1 to
> $at/lib/ so that a compiler library has preference over the Advance
> Toolchain libraries.
> 
> 2019-10-04  Tulio Magno Quites Machado Filho  
> 
>   * config.gcc: Move -L usage from LINK_OS_EXTRA_SPEC32 and
>   LINK_OS_EXTRA_SPEC64 to MD_STARTFILE_PREFIX and
>   MD_STARTFILE_PREFIX_1 when using --with-advance-toolchain.

Committed ( https://gcc.gnu.org/r276702 ).  Thanks!


Segher


Re: [PATCH 3/3] Implementation of -Wclobbered on tree-ssa

2019-10-08 Thread Alexander Monakov
On Thu, 3 Oct 2019, Jeff Law wrote:

> You may want to review the 2018 discussion:
> 
> https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg185287.html
> 
> The 2018 discussion was primarily concerned with fixing the CFG
> inaccuracies which would in turn fix 61118.  But would not fix 21161.

Well, PR 61118 was incorrectly triaged to begin with.

Let me show starting from a simplified testcase:

void bar(jmp_buf);
void foo(int n)
{
jmp_buf jb;
for (; n; n--)
if (setjmp(jb))
bar(jb);
}

Here we warn that "parameter 'n' might be clobbered..." despite that
"obviously" there is no way 'n' might be modified between setjmp and
bar... right?

Now suppose 'bar' looks like

void bar(jmp_buf jb)
{
static int cnt;
static jmp_buf stash;
if ((cnt ^= -1))
memcpy(stash, jb, sizeof stash);
else
longjmp(stash, 1);
}

So every other iteration of the loop, 'bar' will longjmp to the state
stashed on the previous iteration. Since 'n' is modified between iterations,
this is exactly the situation the warning aims to catch: if 'n' is restored
from jmp_buf, the last write to 'n' is lost.

Still, I can imagine one can argue that the warning in the PR is bogus, as
the loop assigns invariant values to problematic variables:

void foo(struct ctx *arg)
{
while (arg->cond) {
void *var1 = &arg->field;
void (*var2)(void*) = globalfn;
jmp_buf jb;
if (!setjmp(jb))
var2(var1);
}
etc(arg);
}

and since "obviously" &arg->field is invariant, there's no way setjmp will
overwrite 'var1' with a different value, right?.. Except:

If the while loop is not entered, a longjmp from inside 'etc' would restore
default (uninitialized) values; note the compiler doesn't pay attention to the
lifetime of jmp_buf, it simply assumes 'etc' may cause setjmp to return
abnormally (this is correct in case of other returns_twice functions such as
vfork).

(granted, if the loop is not entered, setjmp is not called either and cannot
possibly return normally, let alone abnormally, but our IR does not model that).

So the bottom line here is that setjmps in loops are extra tricky, warnings for
them that superficially appear to be false positives may indicate a real issue,
and we can do a better job documenting that and recommending safer patterns.
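
As a minimal illustration of one such safer pattern (not taken from any
patch; the function names are made up for the example): a local that is
modified between setjmp and a later longjmp can be declared volatile, so
that its value after the abnormal return is well defined rather than
indeterminate.

```c
#include <setjmp.h>

static jmp_buf toy_env;

/* Hypothetical helper: jump back to the setjmp below until the third
   attempt, then return normally.  */
static void
toy_fail_until_third (int attempt)
{
  if (attempt < 3)
    longjmp (toy_env, 1);
}

/* Return the number of attempts made.  'attempts' is written after
   setjmp and read after longjmp, so it is volatile; a plain 'int'
   would have an indeterminate value after the abnormal return.  */
int
toy_count_attempts (void)
{
  volatile int attempts = 0;

  (void) setjmp (toy_env);
  attempts++;
  toy_fail_until_third (attempts);
  return attempts;
}
```

Here the frame that called setjmp is still live at every longjmp, and the
counter survives each abnormal return, so toy_count_attempts returns 3.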


Now regarding claims that we emit "incorrect" CFG that is causing extra phi
nodes, and that "fixing CFG" would magically eliminate those and help with
false positives.

While in principle it is perhaps nicer to have IR that more closely models
the abnormal flow, I don't see how it would reduce phi nodes; we'd go from

BB2:   BB99:
|  ABNORMAL_DISPATCHER()
v  v--/
BB3:
x_1 = phi(x_2(BB2), x_3(BB99)
ret_1 = setjmp()

to
   BB2:   BB99:
   |  ABNORMAL_DISPATCHER()
   v  |
   BB3:   |
   ret_1 = setjmp()   |
   |\ v
   | >BB4:
   |  ret_2 = __builtin_setjmp_receiver()
   v v---/
   BB4:
   x_1 = phi(x_2(BB2), x_3(BB4)
   ret_3 = phi(ret_1(BB2), ret_2(BB4)

so the phi node for 'x' simply goes to the new join BB, which also needs a 
new phi for setjmp's return value 'ret'.

Plus, we'd need to somehow ensure that BB4 stays empty apart from the receiver,
and we have no infrastructure to ensure that.

So as far as I can tell it's not readily implementable and in fact brings no
benefit for setjmp and vfork. I can see a theoretical benefit for other
functions that have 'returns_twice' attribute and receive arguments: this
would eliminate bogus phi nodes for arguments. But for vfork and setjmp this
doesn't buy anything.

You even said yourself that phi nodes would still be there, just in a different
block:

> You might instead be able to look at the PHI nodes of the successor block 


So in my opinion our CFG is good enough, the real issues with -Wclobbered false
positives are not due to phi nodes but other effects.

If you agree: what would be the next steps?

Alexander


Re: [PATCH, nvptx] Expand OpenACC child function arguments to use CUDA params space

2019-10-08 Thread Thomas Schwinge
Hi Chung-Lin!

While we're all waiting for Tom to comment on this ;-) -- here's another
item I realized:

On 2019-09-10T19:41:59+0800, Chung-Lin Tang  wrote:
> The libgomp nvptx plugin changes are also quite contained, with lots of
> now unneeded [...] code deleted (since we no longer first cuAlloc a
> buffer for the argument record before cuLaunchKernel)

It would be nice ;-) -- but unless I'm confused, it's not that simple: we
either have to reject (force host-fallback execution) or keep supporting
"old-style" nvptx offloading code: new-libgomp has to continue to work
with nvptx offloading code once generated by old-GCC.  Possibly even a
mixture of old and new nvptx offloading code, if libraries are involved,
huh!

I have not completely thought that through, but I suppose this could be
addressed by adding a flag to the 'struct nvptx_fn' (or similar) that's
synthesized by nvptx 'mkoffload'?

Maybe in fact the 'enum id_map_flag' machinery that I once added for
'Un-parallelized OpenACC kernels constructs with nvptx offloading: "avoid
offloading"'?  (That's part of og8 commit
2d42fbf7e989e4bb76727b32ef11deb5845d5ab1 -- not present on og9, huh?!)
The 'enum id_map_flag' machinery serves the purpose of transporting
information from the offload compiler to libgomp, which seems what's
needed here?  (But please verify.)
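
To make the idea a bit more concrete, here is a rough sketch of how the
plugin side might branch on such a flag.  Everything in it is hypothetical
(the 'toy_' names, the flag values, and the helper itself are not libgomp
code); it only models the choice between one pointer to an argument record
and one parameter per mapping.

```c
#include <stddef.h>

/* Hypothetical per-function flag, imagined as emitted by mkoffload.  */
enum toy_fn_flags
{
  TOY_FN_ARGS_IN_RECORD = 0,  /* Old style: single pointer to a record.  */
  TOY_FN_ARGS_AS_PARAMS = 1   /* New style: one parameter per mapping.  */
};

struct toy_fn
{
  const char *name;
  unsigned flags;
};

/* Fill KARGS for a kernel launch and return the number of entries.
   KARGS must have room for at least MAPNUM entries, and at least one.
   RECORD is the address of the handle of the old-style argument buffer.  */
size_t
toy_build_kernel_args (const struct toy_fn *fn, size_t mapnum,
                       void **hostaddrs, void **devaddrs,
                       void **record, void **kargs)
{
  if (fn->flags & TOY_FN_ARGS_AS_PARAMS)
    {
      for (size_t i = 0; i < mapnum; i++)
        kargs[i] = devaddrs[i] ? (void *) &devaddrs[i]
                               : (void *) &hostaddrs[i];
      return mapnum;
    }

  /* Old-style code expects exactly one argument: the record.  */
  kargs[0] = (void *) record;
  return 1;
}
```

In this model, code built by an older compiler would carry no flag and fall
through to the record path, which the plugin would then still have to
allocate and fill as before.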

For reference, your proposed changes:

> --- libgomp/plugin/plugin-nvptx.c (revision 275493)
> +++ libgomp/plugin/plugin-nvptx.c (working copy)
> @@ -696,16 +696,24 @@ link_ptx (CUmodule *module, const struct targ_ptx_
>  
>  static void
>  nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs, void **devaddrs,
> - unsigned *dims, void *targ_mem_desc,
> - CUdeviceptr dp, CUstream stream)
> + unsigned *dims, CUstream stream)
>  {
>struct targ_fn_descriptor *targ_fn = (struct targ_fn_descriptor *) fn;
>CUfunction function;
>int i;
> -  void *kargs[1];
>struct nvptx_thread *nvthd = nvptx_thread ();
>int warp_size = nvthd->ptx_dev->warp_size;
> +  void **kernel_args = NULL;
>  
> +  GOMP_PLUGIN_debug (0, "prepare mappings (mapnum: %u)\n", (unsigned) 
> mapnum);
> +
> +  if (mapnum > 0)
> +{
> +  kernel_args = alloca (mapnum * sizeof (void *));
> +  for (int i = 0; i < mapnum; i++)
> + kernel_args[i] = (devaddrs[i] ? &devaddrs[i] : &hostaddrs[i]);
> +}
> +  
>function = targ_fn->fn;
>  
>/* Initialize the launch dimensions.  Typically this is constant,
> @@ -937,11 +945,10 @@ nvptx_exec (void (*fn), size_t mapnum, void **host
>   api_info);
>  }
>  
> -  kargs[0] = &dp;
>CUDA_CALL_ASSERT (cuLaunchKernel, function,
>   dims[GOMP_DIM_GANG], 1, 1,
>   dims[GOMP_DIM_VECTOR], dims[GOMP_DIM_WORKER], 1,
> - 0, stream, kargs, 0);
> + 0, stream, kernel_args, 0);
>  
>if (profiling_p)
>  {
> @@ -1350,67 +1357,8 @@ GOMP_OFFLOAD_openacc_exec (void (*fn) (void *), si
>  void **hostaddrs, void **devaddrs,
>  unsigned *dims, void *targ_mem_desc)
>  {
> -  GOMP_PLUGIN_debug (0, "  %s: prepare mappings\n", __FUNCTION__);
> +  nvptx_exec (fn, mapnum, hostaddrs, devaddrs, dims, NULL);
>  
> -  struct goacc_thread *thr = GOMP_PLUGIN_goacc_thread ();
> -  acc_prof_info *prof_info = thr->prof_info;
> -  acc_event_info data_event_info;
> -  acc_api_info *api_info = thr->api_info;
> -  bool profiling_p = __builtin_expect (prof_info != NULL, false);
> -
> -  void **hp = NULL;
> -  CUdeviceptr dp = 0;
> -
> -  if (mapnum > 0)
> -{
> -  size_t s = mapnum * sizeof (void *);
> -  hp = alloca (s);
> -  for (int i = 0; i < mapnum; i++)
> - hp[i] = (devaddrs[i] ? devaddrs[i] : hostaddrs[i]);
> -  CUDA_CALL_ASSERT (cuMemAlloc, &dp, s);
> -  if (profiling_p)
> - goacc_profiling_acc_ev_alloc (thr, (void *) dp, s);
> -}
> -
> -  /* Copy the (device) pointers to arguments to the device (dp and hp might 
> in
> - fact have the same value on a unified-memory system).  */
> -  if (mapnum > 0)
> -{
> -  if (profiling_p)
> - {
> -   prof_info->event_type = acc_ev_enqueue_upload_start;
> -
> -   data_event_info.data_event.event_type = prof_info->event_type;
> -   data_event_info.data_event.valid_bytes
> - = _ACC_DATA_EVENT_INFO_VALID_BYTES;
> -   data_event_info.data_event.parent_construct
> - = acc_construct_parallel;
> -   data_event_info.data_event.implicit = 1; /* Always implicit.  */
> -   data_event_info.data_event.tool_info = NULL;
> -   data_event_info.data_event.var_name = NULL;
> -   data_event_info.data_event.bytes = mapnum * sizeof (void *);
> -   data_event_info.data_event.host_ptr = hp;
> -   data_event_info.data_event.device_ptr = (const void *) dp;
> -
> -   api_info->device_api = acc_device_api_cuda;
> -
> -   GOMP_PLUGIN_goacc_profiling_dispatch (prof_info, &data_event_info,
> -

[PATCH] Fortran polymorphic class-type support for OpenACC

2019-10-08 Thread Julian Brown
This patch provides basic support for Fortran (2003) polymorphic class
pointers. Such pointers have a descriptor that is somewhat like an array
descriptor, so I re-used the GOMP_MAP_TO_PSET mapping to transfer such
class descriptors from the host to the target. That seems to work well,
though I don't know at present how to exhaustively test sophisticated
uses of polymorphic types.

This patch builds on top of the manual deep copy patch posted here:

  https://gcc.gnu.org/ml/gcc-patches/2019-10/msg00444.html

Tested with offloading to nvptx (and bootstrapped on x86_64). The new
tests pass, and a much larger test program using polymorphic types also
works with this patch.

During development of this patch (and the derived-type parts of the
previously-posted manual deep copy patch) I made a few notes on the
various pointer mapping kinds used by the OpenACC support (and, to a lesser
extent, the OpenMP support) in GCC/libgomp. I've now put those up here:

  https://gcc.gnu.org/wiki/LibgompPointerMappingKinds

An example of how a Fortran class-pointer type is mapped with OpenACC
is given there, under GOMP_MAP_TO_PSET.

OK for trunk?

Thanks,

Julian

ChangeLog

gcc/fortran/
* openmp.c (resolve_oacc_data_clauses): Don't disallow allocatable
polymorphic types for OpenACC.
* trans-openmp.c (gfc_trans_omp_clauses): Support polymorphic class
types.

libgomp/
* testsuite/libgomp.oacc-fortran/class-ptr-param.f95: New test.
* testsuite/libgomp.oacc-fortran/classtypes-1.f95: New test.
* testsuite/libgomp.oacc-fortran/classtypes-2.f95: New test.
---
 gcc/fortran/openmp.c  |   6 -
 gcc/fortran/trans-openmp.c|  69 +---
 .../libgomp.oacc-fortran/class-ptr-param.f95  |  34 ++
 .../libgomp.oacc-fortran/classtypes-1.f95 |  48 
 .../libgomp.oacc-fortran/classtypes-2.f95 | 106 ++
 5 files changed, 244 insertions(+), 19 deletions(-)
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/class-ptr-param.f95
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/classtypes-1.f95
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/classtypes-2.f95

diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index f08f77ce940..cf7612c6750 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -3883,12 +3883,6 @@ check_array_not_assumed (gfc_symbol *sym, locus loc, 
const char *name)
 static void
 resolve_oacc_data_clauses (gfc_symbol *sym, locus loc, const char *name)
 {
-  if ((sym->ts.type == BT_ASSUMED && sym->attr.allocatable)
-  || (sym->ts.type == BT_CLASS && CLASS_DATA (sym)
- && CLASS_DATA (sym)->attr.allocatable))
-gfc_error ("ALLOCATABLE object %qs of polymorphic type "
-  "in %s clause at %L", sym->name, name, &loc);
-  check_symbol_not_pointer (sym, loc, name);
   check_array_not_assumed (sym, loc, name);
 }
 
diff --git a/gcc/fortran/trans-openmp.c b/gcc/fortran/trans-openmp.c
index 775548fe9af..892bb1752b9 100644
--- a/gcc/fortran/trans-openmp.c
+++ b/gcc/fortran/trans-openmp.c
@@ -2244,14 +2244,42 @@ gfc_trans_omp_clauses (stmtblock_t *block, 
gfc_omp_clauses *clauses,
TREE_ADDRESSABLE (decl) = 1;
  if (n->expr == NULL || n->expr->ref->u.ar.type == AR_FULL)
{
- if (POINTER_TYPE_P (TREE_TYPE (decl))
- && (gfc_omp_privatize_by_reference (decl)
- || GFC_DECL_GET_SCALAR_POINTER (decl)
- || GFC_DECL_GET_SCALAR_ALLOCATABLE (decl)
- || GFC_DECL_CRAY_POINTEE (decl)
- || GFC_DESCRIPTOR_TYPE_P
-   (TREE_TYPE (TREE_TYPE (decl)))
- || n->sym->ts.type == BT_DERIVED))
+ if (n->sym->ts.type == BT_CLASS)
+   {
+ tree type = TREE_TYPE (decl);
+ if (n->sym->attr.optional)
+   sorry ("optional class parameter");
+ if (POINTER_TYPE_P (type))
+   {
+ node4 = build_omp_clause (input_location,
+   OMP_CLAUSE_MAP);
+ OMP_CLAUSE_SET_MAP_KIND (node4, GOMP_MAP_POINTER);
+ OMP_CLAUSE_DECL (node4) = decl;
+ OMP_CLAUSE_SIZE (node4) = size_int (0);
+ decl = build_fold_indirect_ref (decl);
+   }
+ tree ptr = gfc_class_data_get (decl);
+ ptr = build_fold_indirect_ref (ptr);
+ OMP_CLAUSE_DECL (node) = ptr;
+ OMP_CLAUSE_SIZE (node) = gfc_class_vtab_size_get (decl);
+ node2 = build_omp_clause (input_location, OMP_CLAUSE_MAP);
+ OMP_CLAUSE_SET_MAP_KIND (node2, GOMP_MAP_TO_PSET);
+ 

Re: [PATCH 1/2][vect]PR 88915: Vectorize epilogues when versioning loops

2019-10-08 Thread Andre Vieira (lists)

Hi Richard,

As I mentioned in the IRC channel, I managed to get "most" of the 
regression testsuite working for x86_64 (avx512) and aarch64.


On x86_64 I get a failure that I can't explain; I was hoping you might be 
able to have a look with me:

"PASS->FAIL: gcc.target/i386/vect-perm-odd-1.c execution test"

vect-perm-odd-1.exe segfaults and when I gdb it seems to be the first 
iteration of the main loop.  The tree dumps look alright, but I do 
notice the stack usage seems to change between --param 
vect-epilogue-nomask={0,1}.


Am I failing to update some field that later determines the amount of 
stack being used? I am confused; it could very well be that I am missing 
something obvious, as I am not too familiar with x86's ISA. I will try to 
investigate further.


This patch needs further clean-up and more comments (or comment 
updates), but I thought I'd share current state to see if you can help 
me unblock.


Cheers,
Andre
diff --git a/gcc/cfgloop.h b/gcc/cfgloop.h
index 0b0154ffd7bf031a005de993b101d9db6dd98c43..d01512ea46467f1cf77793bdc75b48e71b0b9641 100644
--- a/gcc/cfgloop.h
+++ b/gcc/cfgloop.h
@@ -21,6 +21,7 @@ along with GCC; see the file COPYING3.  If not see
 #define GCC_CFGLOOP_H
 
 #include "cfgloopmanip.h"
+#include "target.h"
 
 /* Structure to hold decision about unrolling/peeling.  */
 enum lpt_dec
@@ -268,6 +269,9 @@ public:
  the basic-block from being collected but its index can still be
  reused.  */
   basic_block former_header;
+
+  /* Keep track of vector sizes we know we can vectorize the epilogue with.  */
+  vector_sizes epilogue_vsizes;
 };
 
 /* Set if the loop is known to be infinite.  */
diff --git a/gcc/cfgloop.c b/gcc/cfgloop.c
index 4ad1f658708f83dbd8789666c26d4bd056837bc6..f3e81bcd00b3f125389aa15b12dc5201b3578d20 100644
--- a/gcc/cfgloop.c
+++ b/gcc/cfgloop.c
@@ -198,6 +198,7 @@ flow_loop_free (class loop *loop)
   exit->prev = exit;
 }
 
+  loop->epilogue_vsizes.release();
   ggc_free (loop->exits);
   ggc_free (loop);
 }
@@ -355,6 +356,7 @@ alloc_loop (void)
   loop->nb_iterations_upper_bound = 0;
   loop->nb_iterations_likely_upper_bound = 0;
   loop->nb_iterations_estimate = 0;
+  loop->epilogue_vsizes.create(8);
   return loop;
 }
 
diff --git a/gcc/gengtype.c b/gcc/gengtype.c
index 53317337cf8c8e8caefd6b819d28b3bba301e755..80fb6ef71465b24e034fa45d69fec56be6b2e7f8 100644
--- a/gcc/gengtype.c
+++ b/gcc/gengtype.c
@@ -5197,6 +5197,7 @@ main (int argc, char **argv)
   POS_HERE (do_scalar_typedef ("widest_int", &pos));
   POS_HERE (do_scalar_typedef ("int64_t", &pos));
   POS_HERE (do_scalar_typedef ("poly_int64", &pos));
+  POS_HERE (do_scalar_typedef ("poly_uint64", &pos));
   POS_HERE (do_scalar_typedef ("uint64_t", &pos));
   POS_HERE (do_scalar_typedef ("uint8", &pos));
   POS_HERE (do_scalar_typedef ("uintptr_t", &pos));
@@ -5206,6 +5207,7 @@ main (int argc, char **argv)
   POS_HERE (do_scalar_typedef ("machine_mode", &pos));
   POS_HERE (do_scalar_typedef ("fixed_size_mode", &pos));
   POS_HERE (do_scalar_typedef ("CONSTEXPR", &pos));
+  POS_HERE (do_scalar_typedef ("vector_sizes", &pos));
   POS_HERE (do_typedef ("PTR", 
 			create_pointer (resolve_typedef ("void", &pos)),
 			&pos));
diff --git a/gcc/tree-vect-loop-manip.c b/gcc/tree-vect-loop-manip.c
index 5c25441c70a271f04730486e513437fffa75b7e3..189f7458b1b20be06a9a20d3ee05e74bc176434c 100644
--- a/gcc/tree-vect-loop-manip.c
+++ b/gcc/tree-vect-loop-manip.c
@@ -26,6 +26,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree.h"
 #include "gimple.h"
 #include "cfghooks.h"
+#include "tree-if-conv.h"
 #include "tree-pass.h"
 #include "ssa.h"
 #include "fold-const.h"
@@ -1724,7 +1725,7 @@ vect_update_init_of_dr (struct data_reference *dr, tree niters, tree_code code)
Apply vect_update_inits_of_dr to all accesses in LOOP_VINFO.
CODE and NITERS are as for vect_update_inits_of_dr.  */
 
-static void
+void
 vect_update_inits_of_drs (loop_vec_info loop_vinfo, tree niters,
 			  tree_code code)
 {
@@ -1736,19 +1737,7 @@ vect_update_inits_of_drs (loop_vec_info loop_vinfo, tree niters,
 
   /* Adjust niters to sizetype and insert stmts on loop preheader edge.  */
   if (!types_compatible_p (sizetype, TREE_TYPE (niters)))
-{
-  gimple_seq seq;
-  edge pe = loop_preheader_edge (LOOP_VINFO_LOOP (loop_vinfo));
-  tree var = create_tmp_var (sizetype, "prolog_loop_adjusted_niters");
-
-  niters = fold_convert (sizetype, niters);
-  niters = force_gimple_operand (niters, &seq, false, var);
-  if (seq)
-	{
-	  basic_block new_bb = gsi_insert_seq_on_edge_immediate (pe, seq);
-	  gcc_assert (!new_bb);
-	}
-}
+niters = fold_convert (sizetype, niters);
 
   FOR_EACH_VEC_ELT (datarefs, i, dr)
 {
@@ -2401,14 +2390,18 @@ class loop *
 vect_do_peeling (loop_vec_info loop_vinfo, tree niters, tree nitersm1,
 		 tree *niters_vector, tree *step_vector,
 		 tree *niters_vector_mult_vf_var, int th,
-		 bool check_profitability, bool niters_no_overflow)
+		 

[OG9][committed] was: [patch][Fortran] Actually permit OpenMP's 'target simd'

2019-10-08 Thread Tobias Burnus
Trunk (r276698): I have tested it with nvptx, it worked; hence, I moved 
it to libgomp as run-time test.


OG9: I have also committed the patch to the OG9 / openacc-gcc-9 branch. 
(54fbada7d4d38e420efb5a10d39e03b02533b1e7)


Thanks,

Tobias

On 10/8/19 2:12 PM, Jakub Jelinek wrote:

On Tue, Oct 08, 2019 at 02:04:17PM +0200, Tobias Burnus wrote:

Seemingly, 'target simd' was forgotten – which yielded the error:
"Unexpected !$OMP TARGET SIMD statement"

OK for the trunk?

Tobias

PS: The test case should also work as 'dg-do run' test, if it makes more
sense. (Only tested on a system w/o offloading, but I would test it with
nvptx before committing it.)

gfortran.dg/gomp/ shouldn't contain dg-do link or dg-do run tests,
those should be in libgomp/testsuite/libgomp.fortran/
but then there is no point duplicating the test in gfortran.dg/gomp/.


fortran/
* parse.c (parse_executable): Add missing ST_OMP_TARGET_SIMD.

testsuite/
* gfortran.dg/gomp/target-simd.f90: New.

Ok, with moving the test to libgomp.fortran/ and removing the dg-do compile
line, or without.


--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/gomp/target-simd.f90
@@ -0,0 +1,26 @@
+! { dg-do compile }
+
+program test
+  implicit none
+  real, allocatable :: a(:), b(:)
+  integer :: i
+
+  a = [(i, i = 1, 100)]
+  allocate(b, mold=a)
+  b = 0
+
+  !$omp target simd map(to:a) map(from:b)
+  do i = 0, size(a)
+b(i) = 5.0 * a(i)
+  end do
+
+  if (any (b - 5.0 *a > 10.0*epsilon(a))) call abort()
+
+  !$omp target simd map(to:a) map(from:b)
+  do i = 0, size(a)
+b(i) = 2.0 * a(i)
+  end do
+  !$omp end target simd
+
+  if (any (b - 2.0 *a > 10.0*epsilon(a))) call abort()
+end program test


Jakub


Re: [patch][Fortran] Actually permit OpenMP's 'target simd'

2019-10-08 Thread Jakub Jelinek
On Tue, Oct 08, 2019 at 02:04:17PM +0200, Tobias Burnus wrote:
> Seemingly, 'target simd' was forgotten – which yielded the error:
> "Unexpected !$OMP TARGET SIMD statement"
> 
> OK for the trunk?
> 
> Tobias
> 
> PS: The test case should also work as 'dg-do run' test, if it makes more
> sense. (Only tested on a system w/o offloading, but I would test it with
> nvptx before committing it.)

gfortran.dg/gomp/ shouldn't contain dg-do link or dg-do run tests,
those should be in libgomp/testsuite/libgomp.fortran/
but then there is no point duplicating the test in gfortran.dg/gomp/.

> 

>   fortran/
>   * parse.c (parse_executable): Add missing ST_OMP_TARGET_SIMD.
> 
>   testsuite/
>   * gfortran.dg/gomp/target-simd.f90: New.

Ok, with moving the test to libgomp.fortran/ and removing the dg-do compile
line, or without.

> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/gomp/target-simd.f90
> @@ -0,0 +1,26 @@
> +! { dg-do compile }
> +
> +program test
> +  implicit none
> +  real, allocatable :: a(:), b(:)
> +  integer :: i
> +
> +  a = [(i, i = 1, 100)]
> +  allocate(b, mold=a)
> +  b = 0
> +
> +  !$omp target simd map(to:a) map(from:b)
> +  do i = 0, size(a)
> +b(i) = 5.0 * a(i)
> +  end do
> +
> +  if (any (b - 5.0 *a > 10.0*epsilon(a))) call abort()
> +
> +  !$omp target simd map(to:a) map(from:b)
> +  do i = 0, size(a)
> +b(i) = 2.0 * a(i)
> +  end do
> +  !$omp end target simd
> +
> +  if (any (b - 2.0 *a > 10.0*epsilon(a))) call abort()
> +end program test


Jakub


[OG9][committed] was: [Patch][Fortran] Improve OpenMP/OpenACC diagnostic if junk comes after a directive

2019-10-08 Thread Tobias Burnus

I have now also committed the patch to the OG9 branch.

Tobias

PS: I added the review comment to add more test cases to my to-do list.

On 10/8/19 11:53 AM, Tobias Burnus wrote:

Simple patch for better diagnostics:

Before:

   17 |   !$omp end target simd hjgfhg
  |    1
Error: Unclassifiable OpenMP directive at (1)

Now:
   17 |   !$omp end target simd hjgfhg
  |    1
Error: Unexpected junk at (1)
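
The behaviour change can be modelled with a small stand-alone sketch.  It
is a toy, not the gfortran matcher: the 'toy_' names and the message
out-parameter are inventions for the example, and it works on a plain C
string rather than on the parser's source buffer.

```c
#include <stddef.h>

enum toy_match { TOY_MATCH_NO, TOY_MATCH_YES, TOY_MATCH_ERROR };

/* End of directive: optional blanks, then end of line or a '!' comment.  */
enum toy_match
toy_match_eos (const char *rest)
{
  while (*rest == ' ' || *rest == '\t')
    rest++;
  if (*rest == '\0' || *rest == '\n' || *rest == '!')
    return TOY_MATCH_YES;
  return TOY_MATCH_NO;
}

/* Same check, but trailing junk becomes a hard error with a message,
   so the caller stops trying other directives.  */
enum toy_match
toy_match_eos_error (const char *rest, const char **msg)
{
  if (toy_match_eos (rest) == TOY_MATCH_YES)
    {
      *msg = NULL;
      return TOY_MATCH_YES;
    }
  *msg = "Unexpected junk";
  return TOY_MATCH_ERROR;
}
```

Returning an error instead of "no match" is what lets the caller report the
junk itself, rather than falling through to a generic "unclassifiable
directive" diagnostic.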


OK for the trunk?

Cheers,

Tobias

commit 55858edc1aab472abe850a74b22302dcfa735715
Author: Tobias Burnus 
Date:   Tue Oct 8 14:08:49 2019 +0200

Fortran - Improve OpenMP/OpenACC diagnostic

Backported from mainline.

gcc/fortran/
* match.h (gfc_match_omp_eos_error): Renamed from gfc_match_omp_eos.
* openmp.c (gfc_match_omp_eos): Make static.
(gfc_match_omp_eos_error): New.
* parse.c (matchs, matchdo, matchds): Do as done for 'matcho' -
if error occurred after OpenMP/OpenACC directive matched, do not
try other directives.
(decode_oacc_directive, decode_omp_directive): Call new function
instead.

testsuite/
* gfortran.dg/goacc/continuation-free-form.f95: Update dg-error.

diff --git a/gcc/fortran/ChangeLog.openacc b/gcc/fortran/ChangeLog.openacc
index 6882429ff2a..fe2cf26c117 100644
--- a/gcc/fortran/ChangeLog.openacc
+++ b/gcc/fortran/ChangeLog.openacc
@@ -1,3 +1,17 @@
+2019-10-08  Tobias Burnus  
+
+	Backported from mainline
+	2019-10-08  Tobias Burnus  
+
+	* match.h (gfc_match_omp_eos_error): Renamed from gfc_match_omp_eos.
+	* openmp.c (gfc_match_omp_eos): Make static.
+	(gfc_match_omp_eos_error): New.
+	* parse.c (matchs, matchdo, matchds): Do as done for 'matcho' -
+	if error occurred after OpenMP/OpenACC directive matched, do not
+	try other directives.
+	(decode_oacc_directive, decode_omp_directive): Call new function
+	instead.
+
 2019-10-02  Tobias Burnus  
 
 	Backported from mainline
diff --git a/gcc/fortran/match.h b/gcc/fortran/match.h
index 3a5b866f120..f99abc13916 100644
--- a/gcc/fortran/match.h
+++ b/gcc/fortran/match.h
@@ -151,7 +151,7 @@ match gfc_match_oacc_exit_data (void);
 match gfc_match_oacc_routine (void);
 
 /* OpenMP directive matchers.  */
-match gfc_match_omp_eos (void);
+match gfc_match_omp_eos_error (void);
 match gfc_match_omp_atomic (void);
 match gfc_match_omp_barrier (void);
 match gfc_match_omp_cancel (void);
diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index b4ab752943d..533e0f4ea49 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -31,7 +31,7 @@ along with GCC; see the file COPYING3.  If not see
 /* Match an end of OpenMP directive.  End of OpenMP directive is optional
whitespace, followed by '\n' or comment '!'.  */
 
-match
+static match
 gfc_match_omp_eos (void)
 {
   locus old_loc;
@@ -57,6 +57,17 @@ gfc_match_omp_eos (void)
   return MATCH_NO;
 }
 
+match
+gfc_match_omp_eos_error (void)
+{
+  if (gfc_match_omp_eos() == MATCH_YES)
+return MATCH_YES;
+
+  gfc_error ("Unexpected junk at %C");
+  return MATCH_ERROR;
+}
+
+
 /* Free an omp_clauses structure.  */
 
 void
diff --git a/gcc/fortran/parse.c b/gcc/fortran/parse.c
index 4f22542e9a7..3b50c2174c9 100644
--- a/gcc/fortran/parse.c
+++ b/gcc/fortran/parse.c
@@ -672,15 +672,15 @@ decode_oacc_directive (void)
   match ("declare", gfc_match_oacc_declare, ST_OACC_DECLARE);
   break;
 case 'e':
-  matcha ("end atomic", gfc_match_omp_eos, ST_OACC_END_ATOMIC);
-  matcha ("end data", gfc_match_omp_eos, ST_OACC_END_DATA);
-  matcha ("end host_data", gfc_match_omp_eos, ST_OACC_END_HOST_DATA);
-  matcha ("end kernels loop", gfc_match_omp_eos, ST_OACC_END_KERNELS_LOOP);
-  matcha ("end kernels", gfc_match_omp_eos, ST_OACC_END_KERNELS);
-  matcha ("end loop", gfc_match_omp_eos, ST_OACC_END_LOOP);
-  matcha ("end parallel loop", gfc_match_omp_eos,
+  matcha ("end atomic", gfc_match_omp_eos_error, ST_OACC_END_ATOMIC);
+  matcha ("end data", gfc_match_omp_eos_error, ST_OACC_END_DATA);
+  matcha ("end host_data", gfc_match_omp_eos_error, ST_OACC_END_HOST_DATA);
+  matcha ("end kernels loop", gfc_match_omp_eos_error, ST_OACC_END_KERNELS_LOOP);
+  matcha ("end kernels", gfc_match_omp_eos_error, ST_OACC_END_KERNELS);
+  matcha ("end loop", gfc_match_omp_eos_error, ST_OACC_END_LOOP);
+  matcha ("end parallel loop", gfc_match_omp_eos_error,
 	  ST_OACC_END_PARALLEL_LOOP);
-  matcha ("end parallel", gfc_match_omp_eos, ST_OACC_END_PARALLEL);
+  matcha ("end parallel", gfc_match_omp_eos_error, ST_OACC_END_PARALLEL);
   matcha ("enter data", gfc_match_oacc_enter_data, ST_OACC_ENTER_DATA);
   matcha ("exit data", gfc_match_oacc_exit_data, ST_OACC_EXIT_DATA);
   break;
@@ -740,14 +740,17 @@ decode_oacc_directive (void)
and if spec_only, goto do_spec_only without actually matching.  */
 #define matchs(keyword, subr, st)

[patch][Fortran] Actually permit OpenMP's 'target simd'

2019-10-08 Thread Tobias Burnus

Seemingly, 'target simd' was forgotten – which yielded the error:
"Unexpected !$OMP TARGET SIMD statement"

OK for the trunk?

Tobias

PS: The test case should also work as 'dg-do run' test, if it makes more 
sense. (Only tested on a system w/o offloading, but I would test it with 
nvptx before committing it.)


	fortran/
	* parse.c (parse_executable): Add missing ST_OMP_TARGET_SIMD.

	testsuite/
	* gfortran.dg/gomp/target-simd.f90: New.

diff --git a/gcc/fortran/parse.c b/gcc/fortran/parse.c
index 03fc716dbf5..15f6bf2937c 100644
--- a/gcc/fortran/parse.c
+++ b/gcc/fortran/parse.c
@@ -5534,6 +5534,7 @@ parse_executable (gfc_statement st)
 	case ST_OMP_SIMD:
 	case ST_OMP_TARGET_PARALLEL_DO:
 	case ST_OMP_TARGET_PARALLEL_DO_SIMD:
+	case ST_OMP_TARGET_SIMD:
 	case ST_OMP_TARGET_TEAMS_DISTRIBUTE:
 	case ST_OMP_TARGET_TEAMS_DISTRIBUTE_PARALLEL_DO:
 	case ST_OMP_TARGET_TEAMS_DISTRIBUTE_PARALLEL_DO_SIMD:
diff --git a/gcc/testsuite/gfortran.dg/gomp/target-simd.f90 b/gcc/testsuite/gfortran.dg/gomp/target-simd.f90
new file mode 100644
index 000..733420f4cc7
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/gomp/target-simd.f90
@@ -0,0 +1,26 @@
+! { dg-do compile }
+
+program test
+  implicit none
+  real, allocatable :: a(:), b(:)
+  integer :: i
+
+  a = [(i, i = 1, 100)]
+  allocate(b, mold=a)
+  b = 0
+
+  !$omp target simd map(to:a) map(from:b)
+  do i = 1, size(a)
+    b(i) = 5.0 * a(i)
+  end do
+
+  if (any (b - 5.0 *a > 10.0*epsilon(a))) call abort()
+
+  !$omp target simd map(to:a) map(from:b)
+  do i = 1, size(a)
+    b(i) = 2.0 * a(i)
+  end do
+  !$omp end target simd
+
+  if (any (b - 2.0 *a > 10.0*epsilon(a))) call abort()
+end program test


Re: [patch,commit] Remove ">>>>>> .r..." from a ChangeLog

2019-10-08 Thread Florian Weimer
* Tobias Burnus:

> Both my mailer and
> https://gcc.gnu.org/ml/gcc-patches/2019-10/msg00528.html show an empty
> line before the "" – as does the file itself.
>
> Thus, there is already one – which should be enough.

Inline patches with

Content-Type: text/plain; charset="utf-8"; format=flowed

are inherently ambiguous.  If you use format=flowed, you need to post
them as attachments (without format=flowed).
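
To illustrate the ambiguity, here is a minimal sketch of how a receiving
client might decode format=flowed text, loosely following RFC 3676.  The
function name unflow is made up for this example, and quote depth and DelSp
handling are left out, so treat it as an approximation rather than what any
particular mailer does.

```cpp
#include <string>
#include <vector>

// Simplified format=flowed decoder (illustrative only; 'unflow' is a
// hypothetical helper, not part of any mail library).
std::vector<std::string> unflow(const std::vector<std::string>& lines) {
    std::vector<std::string> out;
    std::string pending;
    for (std::string line : lines) {
        // Undo space-stuffing: one leading space is removed on receipt.
        if (!line.empty() && line[0] == ' ')
            line.erase(0, 1);
        // A line that still ends in a space is a soft break ("flowed"),
        // except for the signature separator "-- ".
        bool flowed = !line.empty() && line.back() == ' ' && line != "-- ";
        pending += line;
        if (!flowed) {
            out.push_back(pending);
            pending.clear();
        }
    }
    if (!pending.empty())
        out.push_back(pending);
    return out;
}
```

In this model, a diff context line consisting of two spaces is joined with
the line after it, while a context line of a single space decodes to an
empty line.  In both cases the leading context space of the diff is lost,
which is why such patches are safer as attachments.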

Thanks,
Florian


Re: [patch,commit] Remove ">>>>>> .r..." from a ChangeLog

2019-10-08 Thread Jakub Jelinek
On Tue, Oct 08, 2019 at 01:52:17PM +0200, Tobias Burnus wrote:
> Both my mailer and https://gcc.gnu.org/ml/gcc-patches/2019-10/msg00528.html
> show an empty line before the "" – as does the file itself.
> 
> Thus, there is already one – which should be enough.

Guess some issue in the mutt viewer; if I save it to a file and look it over
in less, it looks good (well, there are 2 spaces instead of one, so the
previous line is not empty, but has trailing space).

The patch is obvious in any case.

Jakub


Re: [patch,commit] Remove ">>>>>> .r..." from a ChangeLog

2019-10-08 Thread Tobias Burnus
Both my mailer and 
https://gcc.gnu.org/ml/gcc-patches/2019-10/msg00528.html show an empty 
line before the "" – as does the file itself.


Thus, there is already one – which should be enough.

Cheers,

Tobias,

who also thinks that email clients could be better in terms of line 
breaks and spacing.


On 10/8/19 12:50 PM, Jakub Jelinek wrote:

On Tue, Oct 08, 2019 at 12:46:13PM +0200, Tobias Burnus wrote:

commit 4958da0be3f08d5f715dc4b74a8e93db18ca1a9e
Author: burnus 
Date:   Tue Oct 8 09:35:56 2019 +

 Remove '>>>' merge marker from changelog
 git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@276689 
138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/gcc/fortran/ChangeLog b/gcc/fortran/ChangeLog
index 84ee818a49d..6e98130dd96 100644
--- a/gcc/fortran/ChangeLog
+++ b/gcc/fortran/ChangeLog
@@ -153,7 +153,6 @@
 (lang_decl): Add new optional_arg field.
 (GFC_DECL_OPTIONAL_ARGUMENT): New macro.
->>> .r276463
  2019-10-01  David Malcolm  

That should be replaced by empty line...


 * error.c (gfc_diagnostic_starter): Clear the prefix before


Jakub


Re: [patch] disentangle range_fold_*ary_expr into various pieces

2019-10-08 Thread Martin Jambor
Hi,

On Tue, Oct 08 2019, Marc Glisse wrote:
> On Mon, 7 Oct 2019, Aldy Hernandez wrote:
>>
>> In testing this patch in isolation from the non-zero canonicalization patch, 
>> I found one regression due to the fact that:
>>
>> a) As discussed, two non-zero representations currently exist for unsigned 
>> ranges.
>>
>> b) ipa-prop.c has its own hacked up value_range structure (ipa_vr) which 
>> doesn't use any API.  Since there is no agreed upon non-zero, range-ops can 
>> sometimes (correctly) create an unsigned [1,MAX], and ipa-prop.c is 
>> open-coding the check for a pointer non-zero to ~[0,0]. This seems like a 
>> latent bug.
>>
>> I really have no idea, nor do I care (*), what we do with ipa-prop's lack of 
>> API.  For now, I have implemented ipa_vr::nonzero_p(), and used it.  When we 
>> agree on the non-zero normalization we can adjust this method if necessary.
>>
>> +bool
>> +ipa_vr::nonzero_p (tree expr_type) const
>> +{
>> +  if (type == VR_ANTI_RANGE && wi::eq_p (min, 0) && wi::eq_p (max, 0))
>> +return true;
>> +
>> +  unsigned prec = TYPE_PRECISION (expr_type);
>> +  return (type == VR_RANGE
>> + && wi::eq_p (min, wi::one (prec))
>> + && wi::eq_p (max, wi::max_value (prec, TYPE_SIGN (expr_type))));
>> +}
>>
>> ...
>>
>>   else if (POINTER_TYPE_P (TREE_TYPE (ddef))
>> -  && vr[i].type == VR_ANTI_RANGE
>> -  && wi::eq_p (vr[i].min, 0)
>> -  && wi::eq_p (vr[i].max, 0))
>> +  && vr[i].nonzero_p (TREE_TYPE (ddef)))
>>
>> Attached is the final adjusted patch I have committed to trunk.
>
> I wonder why we would ever want to ask "is this interval the one that 
> misses exactly the value 0" instead of "does this interval contain the 
> value 0". I naively believe there shouldn't even be any API for the first 
> question. Or if pointers really only have 2 possible intervals (besides 
> varying and undefined), aka [0,0] and ~[0,0], using intervals seems like 
> overkill for them...
>

The only use of this code is to see if we can do

  set_ptr_nonnull (ddef)

where ddef is the default definition of a pointer argument described by
the value range.  For integer arguments, we use the values in ipa_vr to
set_range_info of the default definition, so even if it is an overkill
for pointers, the data structure cannot be replaced with just a flag.

While I know that ~[0,0] is by far the most common pointer ipa_vr, I
will have a look whether it makes sense to rewrite the test as you
suggested after I pull the changes from trunk.
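
For reference, the two questions Marc contrasts can be sketched on a toy
range type.  SimpleRange and both helper names below are hypothetical
stand-ins for unsigned ranges only; this is not ipa_vr or value_range, and
it assumes min <= max.

```cpp
#include <cstdint>

// Hypothetical, simplified value range over an unsigned type.
enum class Kind { Range, AntiRange };

struct SimpleRange {
    Kind kind;
    std::uint64_t min;
    std::uint64_t max;
};

// "Is this the interval that misses exactly the value 0?"
// Two encodings exist for unsigned types: ~[0,0] and [1,type_max].
bool is_exactly_nonzero(const SimpleRange& r, std::uint64_t type_max) {
    if (r.kind == Kind::AntiRange)
        return r.min == 0 && r.max == 0;
    return r.min == 1 && r.max == type_max;
}

// "Does this interval leave out the value 0?"
bool excludes_zero(const SimpleRange& r) {
    if (r.kind == Kind::AntiRange)
        return r.min == 0;   // ~[0,max] removes 0 from the set
    return r.min > 0;        // [min,max] with min >= 1
}
```

The second test also accepts narrower ranges such as [5,10], for which a
non-null conclusion would be just as valid.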

Thanks for pointing it out,

Martin



[PATCH] Add makefile target to update HTML files in source tree

2019-10-08 Thread Jonathan Wakely

Also remove the creation of the html/ext sub-directory, which has been
unused since revision r245258.

* doc/Makefile.am (doc-html-docbook-regenerate): New target.
(${docbook_outdir}/html): Do not create unused 'html/ext' directory.
* doc/Makefile.in: Regenerate.
* doc/xml/manual/documentation_hacking.xml: Document new target.
* doc/html/*: Regenerate.

If you have the docbook stylesheets installed then this allows
updating the libstdc++-v3/doc/html/ contents with a single command.

Committed to trunk.

commit f1aa075fb7778fcd6e0bad1df0e88251dab5d641
Author: Jonathan Wakely 
Date:   Tue Oct 8 11:47:01 2019 +0100

Add makefile target to update HTML files in source tree

Also remove the creation of the html/ext sub-directory, which has been
unused since revision r245258.

* doc/Makefile.am (doc-html-docbook-regenerate): New target.
(${docbook_outdir}/html): Do not create unused 'html/ext' directory.
* doc/Makefile.in: Regenerate.
* doc/xml/manual/documentation_hacking.xml: Document new target.
* doc/html/*: Regenerate.

diff --git a/libstdc++-v3/doc/Makefile.am b/libstdc++-v3/doc/Makefile.am
index b9aca381b74..9b385edecff 100644
--- a/libstdc++-v3/doc/Makefile.am
+++ b/libstdc++-v3/doc/Makefile.am
@@ -476,7 +476,6 @@ ${docbook_outdir}/fo:
 
 ${docbook_outdir}/html:
mkdir -p ${docbook_outdir}/html
-   mkdir -p ${docbook_outdir}/html/ext
mkdir -p ${docbook_outdir}/html/images
mkdir -p ${docbook_outdir}/html/manual
 
@@ -545,6 +544,12 @@ stamp-html-docbook: $(xml_sources) ${docbook_outdir}/html
 
 doc-html-docbook: stamp-html-docbook-data
 
+# Generate the HTML pages and copy them back to the source tree.
+doc-html-docbook-regenerate: doc-html-docbook
+   $(INSTALL_DATA) ${docbook_outdir}/html/*.html ${top_srcdir}/doc/html
+   $(INSTALL_DATA) ${docbook_outdir}/html/images/* ${top_srcdir}/doc/html/images
+   $(INSTALL_DATA) ${docbook_outdir}/html/manual/*.html ${top_srcdir}/doc/html/manual
+
 # HTML, all one page
 # NB: Have to generate customization XSL for UTF-8 output.
 manual_html = ${docbook_outdir}/html/libstdc++-manual-single.html
diff --git a/libstdc++-v3/doc/xml/manual/documentation_hacking.xml b/libstdc++-v3/doc/xml/manual/documentation_hacking.xml
index e0990a28516..7db776794c2 100644
--- a/libstdc++-v3/doc/xml/manual/documentation_hacking.xml
+++ b/libstdc++-v3/doc/xml/manual/documentation_hacking.xml
@@ -807,13 +807,20 @@
   
 
   
-   Generated files are output into separate sub directores of
+   Generated files are output into separate sub-directores of
doc/docbook/ in the
build directory, based on the output format. For instance, the
HTML docs will be in doc/docbook/html.
   
 
+  
+   The doc-html-docbook-regenerate target will generate
+   the HTML files and copy them back to the libstdc++ source tree.
+   This can be used to update the HTML files that are checked in to
+   version control.
+  
+
   
If the Docbook stylesheets are installed in a custom location,
one can use the variable XSL_STYLE_DIR to


[PATCH] Refactor vectorizer reduction more

2019-10-08 Thread Richard Biener


This builds upon the previous refactorings and does the following

 1) move the reduction meta to the outermost PHI stmt_info (from the
inner loop computation stmt), the new info_for_reduction gets
you to that.
 2) Merge STMT_VINFO_VEC_REDUCTION_TYPE and STMT_VINFO_REDUC_TYPE
into the latter.
 3) Apart from single-def-use, lane-reducing ops and fold-left
reductions code generation is no longer done by
vect_transform_reduction but by individual vectorizable_*
routines.  In particular this gets rid of calling
vectorizable_condition and vectorizable_shift from
vectorizable_reduction and vect_transform_reduction.
 4) Remove easy to remove restrictions for pure nested cycles.
(there are still some left in vect_is_simple_reduction)

While I developed and tested this in baby-steps those are too ugly
in isolation and thus here's a combined patch for all of the above.

Bootstrap & regtest in progress on x86_64-unknown-linux-gnu.

Richard.

2019-10-08  Richard Biener  

* tree-vectorizer.h (_stmt_vec_info::v_reduc_type): Remove.
(_stmt_vec_info::is_reduc_info): Add.
(STMT_VINFO_VEC_REDUCTION_TYPE): Remove.
(vectorizable_condition): Remove.
(vectorizable_shift): Likewise.
(vectorizable_reduction): Adjust.
(info_for_reduction): New.
* tree-vect-loop.c (vect_force_simple_reduction): Fold into...
(vect_analyze_scalar_cycles_1): ... here.
(vect_analyze_loop_operations): Adjust.
(needs_fold_left_reduction_p): Simplify for single caller.
(vect_is_simple_reduction): Likewise.  Remove stmt restriction
for nested cycles not part of double reductions.
(vect_model_reduction_cost): Pass in the reduction type.
(info_for_reduction): New function.
(vect_create_epilog_for_reduction): Use it, access reduction
meta off the stmt info it returns.  Use STMT_VINFO_REDUC_TYPE
instead of STMT_VINFO_VEC_REDUCTION_TYPE.
(vectorize_fold_left_reduction): Remove pointless assert.
(vectorizable_reduction): Analyze the full reduction when
visiting the outermost PHI.  Simplify.  Use STMT_VINFO_REDUC_TYPE
instead of STMT_VINFO_VEC_REDUCTION_TYPE.  Direct reduction
stmt code-generation to vectorizable_* in most cases.  Verify
code-generation only for cases handled by
vect_transform_reduction.
(vect_transform_reduction): Use info_for_reduction to get at
reduction meta.  Simplify.
(vect_transform_cycle_phi): Likewise.
(vectorizable_live_operation): Likewise.
* tree-vect-patterns.c (vect_reassociating_reduction_p): Look
at the PHI node for STMT_VINFO_REDUC_TYPE.
* tree-vect-slp.c (vect_schedule_slp_instance): Remove no
longer necessary code.
* tree-vect-stmts.c (vectorizable_shift): Make static again.
(vectorizable_condition): Likewise.  Get at reduction related
info via info_for_reduction.
(vect_analyze_stmt): Adjust.
(vect_transform_stmt): Likewise.
* tree-vectorizer.c (vec_info::new_stmt_vec_info): Initialize
STMT_VINFO_REDUC_TYPE instead of STMT_VINFO_VEC_REDUCTION_TYPE.

* gcc.dg/vect/pr65947-1.c: Adjust.
* gcc.dg/vect/pr65947-13.c: Likewise.
* gcc.dg/vect/pr65947-14.c: Likewise.
* gcc.dg/vect/pr65947-4.c: Likewise.
* gcc.dg/vect/pr80631-1.c: Likewise.
* gcc.dg/vect/pr80631-2.c: Likewise.

diff --git a/gcc/testsuite/gcc.dg/vect/pr65947-1.c b/gcc/testsuite/gcc.dg/vect/pr65947-1.c
index 879819d576a..b81baed914c 100644
--- a/gcc/testsuite/gcc.dg/vect/pr65947-1.c
+++ b/gcc/testsuite/gcc.dg/vect/pr65947-1.c
@@ -42,4 +42,4 @@ main (void)
 
 /* { dg-final { scan-tree-dump-times "LOOP VECTORIZED" 2 "vect" } } */
 /* { dg-final { scan-tree-dump-times "optimizing condition reduction with FOLD_EXTRACT_LAST" 4 "vect" { target vect_fold_extract_last } } } */
-/* { dg-final { scan-tree-dump-times "condition expression based on integer induction." 4 "vect" { target { ! vect_fold_extract_last } } } } */
+/* { dg-final { scan-tree-dump-times "condition expression based on integer induction." 2 "vect" { target { ! vect_fold_extract_last } } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/pr65947-13.c b/gcc/testsuite/gcc.dg/vect/pr65947-13.c
index e1d3ff52f5c..4ad5262019a 100644
--- a/gcc/testsuite/gcc.dg/vect/pr65947-13.c
+++ b/gcc/testsuite/gcc.dg/vect/pr65947-13.c
@@ -41,5 +41,5 @@ main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "LOOP VECTORIZED" 2 "vect" } } */
-/* { dg-final { scan-tree-dump-times "condition expression based on integer induction." 4 "vect" { xfail vect_fold_extract_last } } } */
+/* { dg-final { scan-tree-dump-times "condition expression based on integer induction." 2 "vect" { xfail vect_fold_extract_last } } } */
 /* { dg-final { scan-tree-dump-times "optimizing condition reduction with 
FOLD_EXTRACT_LAST" 4 "vect" { target 

Re: [patch,commit] Remove ">>>>>> .r..." from a ChangeLog

2019-10-08 Thread Jakub Jelinek
On Tue, Oct 08, 2019 at 12:46:13PM +0200, Tobias Burnus wrote:
> commit 4958da0be3f08d5f715dc4b74a8e93db18ca1a9e
> Author: burnus 
> Date:   Tue Oct 8 09:35:56 2019 +
> 
> Remove '>>>' merge marker from changelog
> git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@276689 
> 138bc75d-0d04-0410-961f-82ee72b054a4
> 
> diff --git a/gcc/fortran/ChangeLog b/gcc/fortran/ChangeLog
> index 84ee818a49d..6e98130dd96 100644
> --- a/gcc/fortran/ChangeLog
> +++ b/gcc/fortran/ChangeLog
> @@ -153,7 +153,6 @@
> (lang_decl): Add new optional_arg field.
> (GFC_DECL_OPTIONAL_ARGUMENT): New macro.
> ->>> .r276463
>  2019-10-01  David Malcolm  

That should be replaced by empty line...

> * error.c (gfc_diagnostic_starter): Clear the prefix before
> 

Jakub


Re: [PATCH] Fix dump message issue

2019-10-08 Thread Martin Jambor
Hi,

On Tue, Oct 08 2019, luoxhu wrote:
> '}' is missed at the end.

heh, yeah, I wonder for how long.

If it irritates you, I'd say the patch is obvious (though note that I
cannot approve a patch in this area).

Thanks,

Martin


>
> gcc/ChangeLog:
>   tree-sra.c (dump_access): Add missing braces.
> ---
>  gcc/tree-sra.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
> index 48589323a1e..cb59b91f20e 100644
> --- a/gcc/tree-sra.c
> +++ b/gcc/tree-sra.c
> @@ -403,7 +403,7 @@ dump_access (FILE *f, struct access *access, bool grp)
>"grp_hint = %d, grp_covered = %d, "
>"grp_unscalarizable_region = %d, grp_unscalarized_data = %d, "
>"grp_same_access_path = %d, grp_partial_lhs = %d, "
> -  "grp_to_be_replaced = %d, grp_to_be_debug_replaced = %d\n",
> +  "grp_to_be_replaced = %d, grp_to_be_debug_replaced = %d}\n",
>access->grp_read, access->grp_write, access->grp_assignment_read,
>access->grp_assignment_write, access->grp_scalar_read,
>access->grp_scalar_write, access->grp_total_scalarization,
> @@ -413,7 +413,7 @@ dump_access (FILE *f, struct access *access, bool grp)
>access->grp_to_be_replaced, access->grp_to_be_debug_replaced);
>else
>  fprintf (f, ", write = %d, grp_total_scalarization = %d, "
> -  "grp_partial_lhs = %d\n",
> +  "grp_partial_lhs = %d}\n",
>access->write, access->grp_total_scalarization,
>access->grp_partial_lhs);
>  }
> -- 
> 2.21.0.777.g83232e3864


[patch,commit] Remove ">>>>>> .r..." from a ChangeLog

2019-10-08 Thread Tobias Burnus

commit 4958da0be3f08d5f715dc4b74a8e93db18ca1a9e
Author: burnus 
Date:   Tue Oct 8 09:35:56 2019 +

Remove '>>>' merge marker from changelog


git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@276689 138bc75d-0d04-0410-961f-82ee72b054a4


diff --git a/gcc/fortran/ChangeLog b/gcc/fortran/ChangeLog
index 84ee818a49d..6e98130dd96 100644
--- a/gcc/fortran/ChangeLog
+++ b/gcc/fortran/ChangeLog
@@ -153,7 +153,6 @@
(lang_decl): Add new optional_arg field.
(GFC_DECL_OPTIONAL_ARGUMENT): New macro.
 
->>> .r276463

 2019-10-01  David Malcolm  
 
* error.c (gfc_diagnostic_starter): Clear the prefix before





Re: [PATCH] Remove broken URL from libstdc++ manual

2019-10-08 Thread Jonathan Wakely

On 07/10/19 20:54 +0200, Thomas Schwinge wrote:

Hi!

On 2019-09-05T08:45:50+0100, Jonathan Wakely  wrote:

Committed to trunk. I think I'll backport this too, so we don't keep a
non-working link in the docs on release branches.



commit 45a605e970ea6db474e40c02aef6b18993fea05c
Author: Jonathan Wakely 
Date:   Thu Sep 5 08:40:35 2019 +0100

Remove broken URL from libstdc++ manual

The URL for the "What Are Allocators Good For?" article has been a
recurring source of problems. It moved from the C/C++ Users Journal
website to the Dr Dobbs site after CUJ shut down, and the original
domain changed hands, leaving old links pointing to nefarious sites.

Now the URL to the copy on drdobbs.com no longer works either and I
can't find a (legal) copy of the article online. The simplest solution
is to remove the URL.

* doc/xml/manual/allocator.xml: Remove URL for bibliography entry.
* doc/html/*: Regenerate.

diff --git a/libstdc++-v3/doc/xml/manual/allocator.xml b/libstdc++-v3/doc/xml/manual/allocator.xml
index 0de1be9465a..922bc49091c 100644
--- a/libstdc++-v3/doc/xml/manual/allocator.xml
+++ b/libstdc++-v3/doc/xml/manual/allocator.xml
@@ -482,12 +482,9 @@
   

   
-  
-   http://www.w3.org/1999/xlink;
- 
xlink:href="http://www.drdobbs.com/the-standard-librarian-what-are-allocato/184403759;>


For what it's worth: see
,
and

seems to be the latest revision they've got.


Good idea, thanks!

I've committed this patch to trunk.


commit 9bc329fa19e859ba535e1ebbefab83b96accf4b9
Author: Jonathan Wakely 
Date:   Tue Oct 8 11:10:08 2019 +0100

Restore URL for Austern article on allocators

This reverts "Remove broken URL from libstdc++ manual" by restoring the
link, but using an archived copy from the Wayback Machine.

* doc/xml/manual/allocator.xml: Use archived copy of CUJ article.
* doc/html/*: Regenerate.

diff --git a/libstdc++-v3/doc/xml/manual/allocator.xml b/libstdc++-v3/doc/xml/manual/allocator.xml
index 922bc49091c..d8a255ca213 100644
--- a/libstdc++-v3/doc/xml/manual/allocator.xml
+++ b/libstdc++-v3/doc/xml/manual/allocator.xml
@@ -483,7 +483,10 @@
 
   
 
+  http://www.w3.org/1999/xlink;
+	xlink:href="https://web.archive.org/web/20190622154249/http://www.drdobbs.com/the-standard-librarian-what-are-allocato/184403759;>
   The Standard Librarian: What Are Allocators Good For?
+  
 
 
 MattAustern


[PATCH 2/2][MSP430] Optimize zero_extend insns and PSImode pointer manipulation

2019-10-08 Thread Jozef Lawrynowicz
This patch has the functional changes to optimize zero_extend insns and pointer
manipulation in the large memory model.

From f8156e115c4743ce94a86835ffa5601b6d28a555 Mon Sep 17 00:00:00 2001
From: Jozef Lawrynowicz 
Date: Mon, 7 Oct 2019 11:44:16 +0100
Subject: [PATCH 2/2] MSP430: PSImode pointer manipulation and zero extend insn
 optimizations

gcc/ChangeLog:

2019-10-08  Jozef Lawrynowicz  

	* config/msp430/msp430.md (movqipsi): New.
	(zero_extendqipsi2): New.
	(zero_extendqisi2): Optimize case where src register and base dst
	register are the same.
	(zero_extendhipsi2): Don't use 430X insn for rYs->r case.
	(zero_extendpsisi2): Optimize r->m case.
	Add unnamed insn patterns to catch insns combine searches for when
	optimizing pointer manipulation.
---
 gcc/config/msp430/msp430.md | 135 +++-
 1 file changed, 117 insertions(+), 18 deletions(-)

diff --git a/gcc/config/msp430/msp430.md b/gcc/config/msp430/msp430.md
index 2e8e8326232..cb0b3f16dc5 100644
--- a/gcc/config/msp430/msp430.md
+++ b/gcc/config/msp430/msp430.md
@@ -182,6 +182,15 @@
MOV%X1.B\t%1, %0"
 )
 
+(define_insn "movqipsi"
+  [(set (match_operand:PSI		   0 "register_operand" "=r,r")
+	(zero_extend:PSI (match_operand:QI 1 "general_operand" "rYs,m")))]
+  "msp430x"
+  "@
+   MOV.B\t%1, %0
+   MOV%X1.B\t%1, %0"
+)
+
 (define_insn "movqi_topbyte"
   [(set (match_operand:QI 0 "msp430_general_dst_operand" "=r")
 	(subreg:QI (match_operand:PSI 1 "msp430_general_operand" "r") 2))]
@@ -553,6 +562,16 @@
SXT%X0\t%0"
 )
 
+;; 
+;; ZERO EXTEND INSTRUCTIONS
+;; Byte-writes to registers clear bits 19:8
+;;   * Byte-writes to memory do not affect bits 15:8
+;; Word-writes to registers clear bits 19:16
+;; PSImode writes to memory clear bits 15:4 of the second memory word
+;; We define all possible insns since that results in better code than if
+;; they are inferred.
+;; 
+
 (define_insn "zero_extendqihi2"
   [(set (match_operand:HI		  0 "msp430_general_dst_operand" "=rYs,r,r,m")
 	(zero_extend:HI (match_operand:QI 1 "msp430_general_operand" "0,rYs,m,0")))]
@@ -564,19 +583,31 @@
AND%X0\t#0xff, %0"
 )
 
+(define_insn "zero_extendqipsi2"
+  [(set (match_operand:PSI		   0 "register_operand" "=r,r")
+	(zero_extend:PSI (match_operand:QI 1 "general_operand" "rYs,m")))]
+  "msp430x"
+  "@
+   MOV.B\t%1, %0
+   MOV%X1.B\t%1, %0"
+)
+
 (define_insn "zero_extendqisi2"
-  [(set (match_operand:SI 0 "msp430_general_dst_nonv_operand" "=r")
-	(zero_extend:SI (match_operand:QI 1 "nonimmediate_operand" "rm")))]
+  [(set (match_operand:SI 0 "msp430_general_dst_nonv_operand" "=r,r")
+	(zero_extend:SI (match_operand:QI 1 "nonimmediate_operand" "0,rm")))]
   ""
-  "MOV%X1.B\t%1,%L0 { CLR\t%H0"
+  "@
+  CLR\t%H0
+  MOV%X1.B\t%1,%L0 { CLR\t%H0"
 )
 
 (define_insn "zero_extendhipsi2"
-  [(set (match_operand:PSI		   0 "msp430_general_dst_operand" "=r,m")
-	(zero_extend:PSI (match_operand:HI 1 "msp430_general_operand" "rm,r")))]
-  ""
+  [(set (match_operand:PSI		   0 "msp430_general_dst_operand" "=r,r,m")
+	(zero_extend:PSI (match_operand:HI 1 "msp430_general_operand" "rYs,m,r")))]
+  "msp430x"
   "@
-  MOVX\t%1, %0
+  MOV.W\t%1, %0
+  MOV%X1\t%1, %0
   MOVX.A\t%1, %0"
 )
 
@@ -616,22 +647,90 @@
 ; the pair is unused and so it can clobber it.  Try compiling 20050826-2.c
 ; at -O2 to see this.
 
+; FIXME we can use MOVA for r->m if m is  or z16(rdst)
 (define_insn "zero_extendpsisi2"
-  [(set (match_operand:SI  0 "register_operand" "+r")
-	(zero_extend:SI (match_operand:PSI 1 "register_operand" "r")))]
+  [(set (match_operand:SI		   0 "register_operand" "+r,m")
+	(zero_extend:SI (match_operand:PSI 1 "register_operand" "r,r")))]
   ""
-  "*
-if (REGNO (operands[1]) == SP_REGNO)
-  /* If the source register is the stack pointer, the value
- stored in the stack slot will be the value *after* the
-	 stack pointer has been decremented.  So allow for that
-	 here.  */
-  return \"PUSHM.A\t#1, %1 { ADDX.W\t#4, @r1 { POPX.W\t%L0 { POPX.W\t%H0 ; get stack pointer into %L0:%H0\";
-else
+  "@
+  * if (REGNO (operands[1]) == SP_REGNO) \
+  /* If the source register is the stack pointer, the value \
+	 stored in the stack slot will be the value *after* the \
+	 stack pointer has been decremented.  So allow for that \
+	 here.  */ \
+  return \"PUSHM.A\t#1, %1 { ADDX.W\t#4, @r1 { POPX.W\t%L0 { POPX.W\t%H0 ; get stack pointer into %L0:%H0\"; \
+else \
   return \"PUSHM.A\t#1, %1 { POPX.W\t%L0 { POPX.W\t%H0 ; move pointer in %1 into reg-pair %L0:%H0\";
-  "
+  MOVX.A %1, %0"
+)
+
+;; Below are unnamed insn patterns to catch pointer manipulation insns
+;; generated by combine.
+;; We get large code size bloat when a PSImode pointer is stored in
+;; memory, so we try to avoid that where possible and keep pointer manipulation
+;; between registers.
+; FIXME many of these should be unnecessary once combine deals with
+; (sign_extend (zero_extend)) or 

[PATCH 1/2][MSP430] Reorder and group zero_extend insns in msp430.md

2019-10-08 Thread Jozef Lawrynowicz
This is an "obvious" mechanical patch which just reorders and groups the
zero_extend insns in msp430.md, in preparation for the next functional patch.

From 8810aa7a19569d7e49e898613736d793c43c20a1 Mon Sep 17 00:00:00 2001
From: Jozef Lawrynowicz 
Date: Mon, 7 Oct 2019 11:24:31 +0100
Subject: [PATCH 1/2] MSP430: Reorder and group zero_extend insns in msp430.md

gcc/ChangeLog:

2019-10-08  Jozef Lawrynowicz  

	* config/msp430/msp430.md: Group zero_extend* insns together.
---
 gcc/config/msp430/msp430.md | 117 ++--
 1 file changed, 59 insertions(+), 58 deletions(-)

diff --git a/gcc/config/msp430/msp430.md b/gcc/config/msp430/msp430.md
index a533efa1656..2e8e8326232 100644
--- a/gcc/config/msp430/msp430.md
+++ b/gcc/config/msp430/msp430.md
@@ -564,15 +564,11 @@
AND%X0\t#0xff, %0"
 )
 
-;; Eliminate extraneous zero-extends mysteriously created by gcc.
-(define_peephole2
-  [(set (match_operand:HI 0 "register_operand")
-	(zero_extend:HI (match_operand:QI 1 "general_operand")))
-   (set (match_operand:HI 2 "register_operand")
-	(zero_extend:HI (match_operand:QI 3 "register_operand")))]
-  "REGNO (operands[0]) == REGNO (operands[2]) && REGNO (operands[2]) == REGNO (operands[3])"
-  [(set (match_dup 0)
-	(zero_extend:HI (match_dup 1)))]
+(define_insn "zero_extendqisi2"
+  [(set (match_operand:SI 0 "msp430_general_dst_nonv_operand" "=r")
+	(zero_extend:SI (match_operand:QI 1 "nonimmediate_operand" "rm")))]
+  ""
+  "MOV%X1.B\t%1,%L0 { CLR\t%H0"
 )
 
 (define_insn "zero_extendhipsi2"
@@ -584,39 +580,6 @@
   MOVX.A\t%1, %0"
 )
 
-(define_insn "truncpsihi2"
-  [(set (match_operand:HI		0 "msp430_general_dst_operand" "=rm")
-	(truncate:HI (match_operand:PSI 1 "register_operand"  "r")))]
-  ""
-  "MOVX\t%1, %0"
-)
-
-(define_insn "extendhisi2"
-  [(set (match_operand:SI 0 "msp430_general_dst_nonv_operand" "=r")
-	(sign_extend:SI (match_operand:HI 1 "nonimmediate_operand" "r")))]
-  ""
-  { return msp430x_extendhisi (operands); }
-)
-
-(define_insn "extendhipsi2"
-  [(set (match_operand:PSI 0 "msp430_general_dst_nonv_operand" "=r")
-	(subreg:PSI (sign_extend:SI (match_operand:HI 1 "general_operand" "0")) 0))]
-  "msp430x"
-  "RLAM.A #4, %0 { RRAM.A #4, %0"
-)
-
-;; Look for cases where integer/pointer conversions are suboptimal due
-;; to missing patterns, despite us not having opcodes for these
-;; patterns.  Doing these manually allows for alternate optimization
-;; paths.
-
-(define_insn "zero_extendqisi2"
-  [(set (match_operand:SI 0 "msp430_general_dst_nonv_operand" "=r")
-	(zero_extend:SI (match_operand:QI 1 "nonimmediate_operand" "rm")))]
-  ""
-  "MOV%X1.B\t%1,%L0 { CLR\t%H0"
-)
-
 (define_insn "zero_extendhisi2"
   [(set (match_operand:SI 0 "msp430_general_dst_nonv_operand" "=rm,r")
 	(zero_extend:SI (match_operand:HI 1 "general_operand" "0,r")))]
@@ -635,22 +598,6 @@
MOV.W\t%1,%0"
 )
 
-(define_insn "extend_and_shift1_hipsi2"
-  [(set (subreg:SI (match_operand:PSI 0 "msp430_general_dst_nonv_operand" "=r") 0)
-	(ashift:SI (sign_extend:SI (match_operand:HI 1 "general_operand" "0"))
-		   (const_int 1)))]
-  "msp430x"
-  "RLAM.A #4, %0 { RRAM.A #3, %0"
-)
-
-(define_insn "extend_and_shift2_hipsi2"
-  [(set (subreg:SI (match_operand:PSI 0 "msp430_general_dst_nonv_operand" "=r") 0)
-	(ashift:SI (sign_extend:SI (match_operand:HI 1 "general_operand" "0"))
-		   (const_int 2)))]
-  "msp430x"
-  "RLAM.A #4, %0 { RRAM.A #2, %0"
-)
-
 ; Nasty - we are sign-extending a 20-bit PSI value in one register into
 ; two adjacent 16-bit registers to make an SI value.  There is no MSP430X
 ; instruction that will do this, so we push the 20-bit value onto the stack
@@ -685,6 +632,60 @@
   "
 )
 
+
+;; Eliminate extraneous zero-extends mysteriously created by gcc.
+(define_peephole2
+  [(set (match_operand:HI 0 "register_operand")
+	(zero_extend:HI (match_operand:QI 1 "general_operand")))
+   (set (match_operand:HI 2 "register_operand")
+	(zero_extend:HI (match_operand:QI 3 "register_operand")))]
+  "REGNO (operands[0]) == REGNO (operands[2]) && REGNO (operands[2]) == REGNO (operands[3])"
+  [(set (match_dup 0)
+	(zero_extend:HI (match_dup 1)))]
+)
+
+(define_insn "truncpsihi2"
+  [(set (match_operand:HI		0 "msp430_general_dst_operand" "=rm")
+	(truncate:HI (match_operand:PSI 1 "register_operand"  "r")))]
+  ""
+  "MOVX\t%1, %0"
+)
+
+(define_insn "extendhisi2"
+  [(set (match_operand:SI 0 "msp430_general_dst_nonv_operand" "=r")
+	(sign_extend:SI (match_operand:HI 1 "nonimmediate_operand" "r")))]
+  ""
+  { return msp430x_extendhisi (operands); }
+)
+
+(define_insn "extendhipsi2"
+  [(set (match_operand:PSI 0 "msp430_general_dst_nonv_operand" "=r")
+	(subreg:PSI (sign_extend:SI (match_operand:HI 1 "general_operand" "0")) 0))]
+  "msp430x"
+  "RLAM.A #4, %0 { RRAM.A #4, %0"
+)
+
+;; Look for cases where integer/pointer conversions are suboptimal due
+;; to missing patterns, despite us not having opcodes for these
+;; patterns.  Doing these manually allows for alternate optimization

[PATCH 0/2][MSP430] Optimize zero_extend insns and PSImode pointer manipulation

2019-10-08 Thread Jozef Lawrynowicz
In the large memory model, MSP430 instructions have some useful properties when
performing byte, word or address-word writes to registers or memory:
- Byte-writes to registers clear bits 19:8
- Word-writes to registers clear bits 19:16
- PSImode writes to memory clear bits 15:4 of the second memory word

This patch makes use of these properties to optimize some zero_extend
instructions.
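
As a rough illustration of why these properties make a separate
zero-extension redundant, here is a toy model of a 20-bit register.  The
function names are invented for this sketch and it models only the two
register-write properties listed above; it is not generated code or part of
the patch.

```cpp
#include <cstdint>

// Toy model: a register holds 20 bits (19:0).  All names are hypothetical.

// Byte write to a register: bits 19:8 of the destination end up clear,
// regardless of what the register held before.
constexpr std::uint32_t write_byte(std::uint32_t /*old_reg*/, std::uint32_t value) {
    return value & 0xFFu;
}

// Word write to a register: bits 19:16 of the destination end up clear.
constexpr std::uint32_t write_word(std::uint32_t /*old_reg*/, std::uint32_t value) {
    return value & 0xFFFFu;
}

// What an explicit zero-extension from QImode (e.g. an AND with #0xff)
// would compute.
constexpr std::uint32_t zero_extend_byte(std::uint32_t value) {
    return value & 0xFFu;
}
```

In this model the result of write_byte already equals zero_extend_byte of
the source, so a single byte move can stand in for a move followed by a
mask.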

There are some "synonyms" for these zero_extend instructions that combine
searches for when optimizing code which manipulates PSImode pointers. The patch
adds a number of these unnamed RTL insns.

The first patch is an "obvious" patch with no functional changes, which just
reorders the zero_extend insns in the md file so we get them in one place.
The second patch has functional changes.

(Note that the patches will not apply cleanly unless the recently submitted
patch to implement post increment addressing has been applied:
https://gcc.gnu.org/ml/gcc-patches/2019-10/msg00492.html)

Successfully regtested on trunk in the small and large memory models.

Ok for trunk?

Jozef Lawrynowicz (2):
  MSP430: Reorder and group zero_extend insns in msp430.md
  MSP430: PSImode pointer manipulation and zero extend insn
optimizations

 gcc/config/msp430/msp430.md | 236 +---
 1 file changed, 168 insertions(+), 68 deletions(-)


Re: [RFA][1/3] Remove Cell Broadband Engine SPU targets

2019-10-08 Thread Thomas Schwinge
Hi!

On 2019-09-02T22:16:52+0200, "Ulrich Weigand"  wrote:
>   * MAINTAINERS: Remove spu port maintainers.

> --- MAINTAINERS   (revision 275321)
> +++ MAINTAINERS   (working copy)
> @@ -109,9 +109,6 @@
>  sh port  Oleg Endo   
>  sparc port   David S. Miller 
>  sparc port   Eric Botcazou   
> -spu port Trevor Smigiel  
> -spu port David Edelsohn  
> -spu port Ulrich Weigand  
>  tilegx port  Walter Lee  
>  tilepro port Walter Lee  
>  v850 portNick Clifton

In r276692, I committed the attached to "Add back Trevor Smigiel; move
into Write After Approval section", assuming that it was just an
oversight that he got dropped from the file, as he -- in contrast to
David and Ulrich -- doesn't remain listed with other roles.


Grüße
 Thomas


From 1be3a79f2a6fbc0cb095474a3ab8f96398efa24c Mon Sep 17 00:00:00 2001
From: tschwinge 
Date: Tue, 8 Oct 2019 10:20:50 +
Subject: [PATCH 3/3] Remove Cell Broadband Engine SPU targets

Follow-up to trunk 275343:

	* MAINTAINERS: Add back Trevor Smigiel; move into Write After
	Approval section.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@276692 138bc75d-0d04-0410-961f-82ee72b054a4
---
 ChangeLog   | 5 +
 MAINTAINERS | 1 +
 2 files changed, 6 insertions(+)

diff --git a/ChangeLog b/ChangeLog
index fb901e4a39e..90413f57284 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,8 @@
+2019-10-08  Thomas Schwinge  
+
+	* MAINTAINERS: Add back Trevor Smigiel; move into Write After
+	Approval section.
+
 2019-10-01  Frederik Harwath 
 
 	* MAINTAINERS: Add myself to Write After Approval
diff --git a/MAINTAINERS b/MAINTAINERS
index bd6a7d44c5a..78f17c35e9e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -585,6 +585,7 @@ Sharad Singhai	
 Johannes Singler
 Franz Sirl	
 Jan Sjodin	
+Trevor Smigiel	
 Edward Smith-Rowland<3dw...@verizon.net>
 Jayant Sonar	
 Anatoly Sokolov	
-- 
2.17.1





Re: [PATCH] PR fortran/68401 Improve allocation error message

2019-10-08 Thread Thomas Schwinge
Hi!

On 2019-08-16T17:31:36+0300, Janne Blomqvist  wrote:
> Improve the error message that is printed when a memory allocation
> fails, by including the location, and the size of the allocation that
> failed.

>   * runtime/error.c (os_error_at): New function.

Committed the attached in r276691 to "Extend
'libgfortran/runtime/minimal.c' per r274599 "PR fortran/68401 Improve
allocation error message"".
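
For readers unfamiliar with the shape of such a function, here is a
minimal, self-contained sketch of the "locus prefix plus printf-style
message" pattern.  It formats into a caller-supplied buffer instead of
writing to stderr and exiting, and format_error_at is a hypothetical
name for this sketch, not a libgfortran function; the example strings
are likewise invented.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build "<where>: <formatted message>\n" in BUF.
   Returns the number of characters written, or -1 if BUF is too small.  */
static int
format_error_at (char *buf, size_t size, const char *where,
		 const char *message, ...)
{
  va_list ap;
  int used, n;

  /* Locus prefix first, as os_error_at does with estr_write.  */
  used = snprintf (buf, size, "%s: ", where);
  if (used < 0 || (size_t) used >= size)
    return -1;

  /* Then the printf-style message with its variable arguments.  */
  va_start (ap, message);
  n = vsnprintf (buf + used, size - (size_t) used, message, ap);
  va_end (ap);
  if (n < 0 || (size_t) n >= size - (size_t) used)
    return -1;
  used += n;

  /* Finally the trailing newline.  */
  if ((size_t) used + 1 >= size)
    return -1;
  buf[used++] = '\n';
  buf[used] = '\0';
  return used;
}

/* Small self-check with made-up strings.  */
void
check_format_error_at (void)
{
  char buf[64];

  assert (format_error_at (buf, sizeof buf, "where", "size %d", 8) > 0);
  assert (strcmp (buf, "where: size 8\n") == 0);
}
```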


Grüße
 Thomas


From 19c0ab5ba623bfe5926f3be04306399f9fc8dd8e Mon Sep 17 00:00:00 2001
From: tschwinge 
Date: Tue, 8 Oct 2019 10:20:41 +
Subject: [PATCH 2/3] Extend 'libgfortran/runtime/minimal.c' per r274599 "PR
 fortran/68401 Improve allocation error message"

	libgfortran/
	PR fortran/68401
	* runtime/minimal.c (os_error_at): New function.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@276691 138bc75d-0d04-0410-961f-82ee72b054a4
---
 libgfortran/ChangeLog |  3 +++
 libgfortran/runtime/minimal.c | 23 ++-
 2 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/libgfortran/ChangeLog b/libgfortran/ChangeLog
index 9e3b1f8bad8..c5a45333042 100644
--- a/libgfortran/ChangeLog
+++ b/libgfortran/ChangeLog
@@ -1,5 +1,8 @@
 2019-10-08  Thomas Schwinge  
 
+	PR fortran/68401
+	* runtime/minimal.c (os_error_at): New function.
+
 	* runtime/minimal.c: Revise.
 
 2019-10-05  Paul Thomas  
diff --git a/libgfortran/runtime/minimal.c b/libgfortran/runtime/minimal.c
index a633bc1ce0f..bdaf878ffcb 100644
--- a/libgfortran/runtime/minimal.c
+++ b/libgfortran/runtime/minimal.c
@@ -215,7 +215,28 @@ os_error (const char *message)
   estr_write ("\n");
   exit_error (1);
 }
-iexport(os_error);
+iexport(os_error); /* TODO, DEPRECATED, ABI: Should not be exported
+		  anymore when bumping so version.  */
+
+
+/* Improved version of os_error with a printf style format string and
+   a locus.  */
+
+void
+os_error_at (const char *where, const char *message, ...)
+{
+  va_list ap;
+
+  recursion_check ();
+  estr_write (where);
+  estr_write (": ");
+  va_start (ap, message);
+  estr_vprintf (message, ap);
+  va_end (ap);
+  estr_write ("\n");
+  exit_error (1);
+}
+iexport(os_error_at);
 
 
 /* void runtime_error()-- These are errors associated with an
-- 
2.17.1





Re: RFC: Building a minimal libgfortran for nvptx

2019-10-08 Thread Thomas Schwinge
Hi!

On 2015-03-11T22:48:22+0100, I wrote:
> On Fri, 14 Nov 2014 18:08:32 +0100, Bernd Schmidt  
> wrote:
>> New patch below, [...]
>
> ... got committed.  I now committed the following in r221363:

> libgfortran LIBGFOR_MINIMAL enhancements.

..., and in r276690 have now committed the attached to "Revise
'libgfortran/runtime/minimal.c' to better conform to the original
sources".
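
One idiom in the patch below is wrapping a statement-like macro in
do { ... } while (0), as done for the nvptx mapping of exit to abort.
The following sketch shows why that wrapper matters; it is only an
illustration, with a hypothetical record_termination standing in for
abort so that the effect can be observed without ending the program.

```c
#include <assert.h>

/* Counts how often the stand-in handler ran.  */
static int handler_calls;

/* Hypothetical stand-in for abort (), so the sketch can keep running.  */
static void
record_termination (void)
{
  handler_calls++;
}

/* Same shape as the macro in the patch: evaluate and discard STATUS,
   then call the handler.  The do/while (0) wrapper makes the expansion
   a single statement.  */
#define terminate(status) \
  do { (void) (status); record_termination (); } while (0)

/* Because the macro expands to one statement, it can sit in an
   unbraced if/else without capturing the else branch.  */
static int
run_step (int failed)
{
  if (failed)
    terminate (1);
  else
    return 0;
  return -1;
}

/* Small self-check.  */
void
check_terminate_macro (void)
{
  int before = handler_calls;

  assert (run_step (0) == 0);
  assert (handler_calls == before);
}
```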


Grüße
 Thomas


From c6c3841de5ec2b37d8b579bd7bce54bef811c064 Mon Sep 17 00:00:00 2001
From: tschwinge 
Date: Tue, 8 Oct 2019 10:20:31 +
Subject: [PATCH 1/3] Revise 'libgfortran/runtime/minimal.c' to better conform
 to the original sources

	libgfortran/
	* runtime/minimal.c: Revise.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@276690 138bc75d-0d04-0410-961f-82ee72b054a4
---
 libgfortran/ChangeLog |   4 +
 libgfortran/runtime/minimal.c | 237 +++---
 2 files changed, 169 insertions(+), 72 deletions(-)

diff --git a/libgfortran/ChangeLog b/libgfortran/ChangeLog
index 7736e5da937..9e3b1f8bad8 100644
--- a/libgfortran/ChangeLog
+++ b/libgfortran/ChangeLog
@@ -1,3 +1,7 @@
+2019-10-08  Thomas Schwinge  
+
+	* runtime/minimal.c: Revise.
+
 2019-10-05  Paul Thomas  
 
 	PR fortran/91926
diff --git a/libgfortran/runtime/minimal.c b/libgfortran/runtime/minimal.c
index c1993b99be7..a633bc1ce0f 100644
--- a/libgfortran/runtime/minimal.c
+++ b/libgfortran/runtime/minimal.c
@@ -23,13 +23,38 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 .  */
 
 #include "libgfortran.h"
-#include 
 
+#include 
 
 #ifdef HAVE_UNISTD_H
 #include 
 #endif
 
+
+#if __nvptx__
+/* Map "exit" to "abort"; see PR85463 '[nvptx] "exit" in offloaded region
+   doesn't terminate process'.  */
+# undef exit
+# define exit(status) do { (void) (status); abort (); } while (0)
+#endif
+
+
+#if __nvptx__
+/* 'printf' is all we have.  */
+# undef estr_vprintf
+# define estr_vprintf vprintf
+#else
+# error TODO
+#endif
+
+
+/* runtime/environ.c */
+
+options_t options;
+
+
+/* runtime/main.c */
+
 /* Stupid function to be sure the constructor is always linked in, even
in the case of static linking.  See PR libfortran/22298 for details.  */
 void
@@ -38,11 +63,126 @@ stupid_function_name_for_static_linking (void)
   return;
 }
 
-options_t options;
 
 static int argc_save;
 static char **argv_save;
 
+
+/* Set the saved values of the command line arguments.  */
+
+void
+set_args (int argc, char **argv)
+{
+  argc_save = argc;
+  argv_save = argv;
+}
+iexport(set_args);
+
+
+/* Retrieve the saved values of the command line arguments.  */
+
+void
+get_args (int *argc, char ***argv)
+{
+  *argc = argc_save;
+  *argv = argv_save;
+}
+
+
+/* runtime/error.c */
+
+/* Write a null-terminated C string to standard error. This function
+   is async-signal-safe.  */
+
+ssize_t
+estr_write (const char *str)
+{
+  return write (STDERR_FILENO, str, strlen (str));
+}
+
+
+/* printf() like function for printing to stderr.  Uses a stack
+   allocated buffer and doesn't lock stderr, so it should be safe to
+   use from within a signal handler.  */
+
+int
+st_printf (const char * format, ...)
+{
+  int written;
+  va_list ap;
+  va_start (ap, format);
+  written = estr_vprintf (format, ap);
+  va_end (ap);
+  return written;
+}
+
+
+/* sys_abort()-- Terminate the program showing backtrace and dumping
+   core.  */
+
+void
+sys_abort (void)
+{
+  /* If backtracing is enabled, print backtrace and disable signal
+ handler for ABRT.  */
+  if (options.backtrace == 1
+  || (options.backtrace == -1 && compile_options.backtrace == 1))
+{
+  estr_write ("\nProgram aborted.\n");
+}
+
+  abort();
+}
+
+
+/* Exit in case of error termination. If backtracing is enabled, print
+   backtrace, then exit.  */
+
+void
+exit_error (int status)
+{
+  if (options.backtrace == 1
+  || (options.backtrace == -1 && compile_options.backtrace == 1))
+{
+  estr_write ("\nError termination.\n");
+}
+  exit (status);
+}
+
+
+/* show_locus()-- Print a line number and filename describing where
+ * something went wrong */
+
+void
+show_locus (st_parameter_common *cmp)
+{
+  char *filename;
+
+  if (!options.locus || cmp == NULL || cmp->filename == NULL)
+return;
+  
+  if (cmp->unit > 0)
+{
+  filename = /* TODO filename_from_unit (cmp->unit) */ NULL;
+
+  if (filename != NULL)
+	{
+	  st_printf ("At line %d of file %s (unit = %d, file = '%s')\n",
+		   (int) cmp->line, cmp->filename, (int) cmp->unit, filename);
+	  free (filename);
+	}
+  else
+	{
+	  st_printf ("At line %d of file %s (unit = %d)\n",
+		   (int) cmp->line, cmp->filename, (int) cmp->unit);
+	}
+  return;
+}
+
+  st_printf ("At line %d of file %s\n", (int) cmp->line, cmp->filename);
+}
+
+
 /* recursion_check()-- It's possible for additional errors to occur
  * during fatal error processing.  We detect this condition here and
  * exit with code 4 immediately. */
@@ -70,9 +210,10 @@ void
 

Re: [Patch][Fortran] Improve OpenMP/OpenACC diagnostic if junk comes after a directive

2019-10-08 Thread Jakub Jelinek
On Tue, Oct 08, 2019 at 11:53:52AM +0200, Tobias Burnus wrote:
> Simple patch for better diagnostics:
> 
> Before:
> 
>17 |   !$omp end target simd hjgfhg
>   |1
> Error: Unclassifiable OpenMP directive at (1)
> 
> Now:
>17 |   !$omp end target simd hjgfhg
>   |1
> Error: Unexpected junk at (1)
> 
> 
> OK for the trunk?
> 
> Cheers,
> 
> Tobias
> 

>   gcc/fortran/
>   * match.h (gfc_match_omp_eos_error): Renamed from gfc_match_omp_eos.
>   * openmp.c (gfc_match_omp_eos): Make static.
>   (gfc_match_omp_eos_error): New.
>   * parse.c (matchs, matchdo, matchds): Do as done for 'matcho' -
> if error occurred after OpenMP/OpenACC directive matched, do not
>   try other directives.
>   (decode_oacc_directive, decode_omp_directive): Call new function
>   instead.
> 
>   testsuite/
>   * gfortran.dg/goacc/continuation-free-form.f95: Update dg-error.

Ok, though it might be useful to have a testcase with some of these
diagnostics at least for a couple of constructs too (not just in the single
line you've changed).

Jakub


[Patch][Fortran] Improve OpenMP/OpenACC diagnostic if junk comes after a directive

2019-10-08 Thread Tobias Burnus

Simple patch for better diagnostics:

Before:

   17 |   !$omp end target simd hjgfhg
  |1
Error: Unclassifiable OpenMP directive at (1)

Now:
   17 |   !$omp end target simd hjgfhg
  |1
Error: Unexpected junk at (1)
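
The idea can be sketched outside of gfortran as follows.  This is a
simplified model, not the parser code: match_eos, match_eos_error and
decode_directive are stand-ins written for this illustration, the
keyword table is invented, and the real code reports the error through
gfc_error rather than a return value alone.

```c
#include <assert.h>
#include <string.h>

typedef enum { MATCH_NO, MATCH_YES, MATCH_ERROR } match_result;

/* End of directive: optional blanks, then end of line or a '!' comment.  */
static match_result
match_eos (const char *rest)
{
  while (*rest == ' ' || *rest == '\t')
    rest++;
  if (*rest == '\0' || *rest == '\n' || *rest == '!')
    return MATCH_YES;
  return MATCH_NO;
}

/* Same check, but trailing junk becomes a hard error instead of
   "no match", so the caller stops trying other directives.  */
static match_result
match_eos_error (const char *rest)
{
  if (match_eos (rest) == MATCH_YES)
    return MATCH_YES;
  return MATCH_ERROR;		/* The real code also emits a diagnostic.  */
}

/* Returns the index of the matched keyword, -1 if the line is not a
   known directive, or -2 if a directive matched but junk followed.  */
static int
decode_directive (const char *line)
{
  static const char *const keywords[] = { "end target simd", "end target" };

  for (size_t i = 0; i < sizeof keywords / sizeof keywords[0]; i++)
    {
      size_t len = strlen (keywords[i]);

      if (strncmp (line, keywords[i], len) != 0)
	continue;

      match_result m = match_eos_error (line + len);
      if (m == MATCH_YES)
	return (int) i;
      if (m == MATCH_ERROR)
	return -2;		/* Do not fall through to other keywords.  */
    }
  return -1;
}

/* Small self-check.  */
void
check_decode_directive (void)
{
  assert (decode_directive ("end target simd") == 0);
  assert (decode_directive ("end target simd hjgfhg") == -2);
}
```

With a plain "no match" result, the caller would go on trying other
keywords and eventually report the directive as unclassifiable; the
error result stops at the point where the junk was seen.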


OK for the trunk?

Cheers,

Tobias

	gcc/fortran/
	* match.h (gfc_match_omp_eos_error): Renamed from gfc_match_omp_eos.
	* openmp.c (gfc_match_omp_eos): Make static.
	(gfc_match_omp_eos_error): New.
	* parse.c (matchs, matchdo, matchds): Do as done for 'matcho' -
if error occurred after OpenMP/OpenACC directive matched, do not
	try other directives.
	(decode_oacc_directive, decode_omp_directive): Call new function
	instead.

	testsuite/
	* gfortran.dg/goacc/continuation-free-form.f95: Update dg-error.

diff --git a/gcc/fortran/match.h b/gcc/fortran/match.h
index 1bd78b14338..611d7964645 100644
--- a/gcc/fortran/match.h
+++ b/gcc/fortran/match.h
@@ -151,7 +151,7 @@ match gfc_match_oacc_exit_data (void);
 match gfc_match_oacc_routine (void);
 
 /* OpenMP directive matchers.  */
-match gfc_match_omp_eos (void);
+match gfc_match_omp_eos_error (void);
 match gfc_match_omp_atomic (void);
 match gfc_match_omp_barrier (void);
 match gfc_match_omp_cancel (void);
diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index 7df7384c187..cd28384589c 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -31,7 +31,7 @@ along with GCC; see the file COPYING3.  If not see
 /* Match an end of OpenMP directive.  End of OpenMP directive is optional
whitespace, followed by '\n' or comment '!'.  */
 
-match
+static match
 gfc_match_omp_eos (void)
 {
   locus old_loc;
@@ -57,6 +57,17 @@ gfc_match_omp_eos (void)
   return MATCH_NO;
 }
 
+match
+gfc_match_omp_eos_error (void)
+{
+  if (gfc_match_omp_eos() == MATCH_YES)
+return MATCH_YES;
+
+  gfc_error ("Unexpected junk at %C");
+  return MATCH_ERROR;
+}
+
+
 /* Free an omp_clauses structure.  */
 
 void
diff --git a/gcc/fortran/parse.c b/gcc/fortran/parse.c
index 4d343450555..03fc716dbf5 100644
--- a/gcc/fortran/parse.c
+++ b/gcc/fortran/parse.c
@@ -674,15 +674,15 @@ decode_oacc_directive (void)
   match ("declare", gfc_match_oacc_declare, ST_OACC_DECLARE);
   break;
 case 'e':
-  matcha ("end atomic", gfc_match_omp_eos, ST_OACC_END_ATOMIC);
-  matcha ("end data", gfc_match_omp_eos, ST_OACC_END_DATA);
-  matcha ("end host_data", gfc_match_omp_eos, ST_OACC_END_HOST_DATA);
-  matcha ("end kernels loop", gfc_match_omp_eos, ST_OACC_END_KERNELS_LOOP);
-  matcha ("end kernels", gfc_match_omp_eos, ST_OACC_END_KERNELS);
-  matcha ("end loop", gfc_match_omp_eos, ST_OACC_END_LOOP);
-  matcha ("end parallel loop", gfc_match_omp_eos,
+  matcha ("end atomic", gfc_match_omp_eos_error, ST_OACC_END_ATOMIC);
+  matcha ("end data", gfc_match_omp_eos_error, ST_OACC_END_DATA);
+  matcha ("end host_data", gfc_match_omp_eos_error, ST_OACC_END_HOST_DATA);
+  matcha ("end kernels loop", gfc_match_omp_eos_error, ST_OACC_END_KERNELS_LOOP);
+  matcha ("end kernels", gfc_match_omp_eos_error, ST_OACC_END_KERNELS);
+  matcha ("end loop", gfc_match_omp_eos_error, ST_OACC_END_LOOP);
+  matcha ("end parallel loop", gfc_match_omp_eos_error,
 	  ST_OACC_END_PARALLEL_LOOP);
-  matcha ("end parallel", gfc_match_omp_eos, ST_OACC_END_PARALLEL);
+  matcha ("end parallel", gfc_match_omp_eos_error, ST_OACC_END_PARALLEL);
   matcha ("enter data", gfc_match_oacc_enter_data, ST_OACC_ENTER_DATA);
   matcha ("exit data", gfc_match_oacc_exit_data, ST_OACC_EXIT_DATA);
   break;
@@ -738,14 +738,17 @@ decode_oacc_directive (void)
and if spec_only, goto do_spec_only without actually matching.  */
 #define matchs(keyword, subr, st)\
 do {			\
+  match m2;			\
   if (spec_only && gfc_match (keyword) == MATCH_YES)	\
 	goto do_spec_only;	\
-  if (match_word_omp_simd (keyword, subr, &old_locus,	\
-			   &simd_matched) == MATCH_YES)	\
+  if ((m2 = match_word_omp_simd (keyword, subr, &old_locus,	\
+			   &simd_matched)) == MATCH_YES)	\
 	{			\
 	  ret = st;		\
 	  goto finish;		\
 	}			\
+  else if (m2 == MATCH_ERROR)\
+	goto error_handling;	\
   else			\
 	undo_new_statement ();  	\
 } while (0)
@@ -776,12 +779,15 @@ decode_oacc_directive (void)
 /* Like match, but set a flag simd_matched if keyword matched.  */
 #define matchds(keyword, subr, st)\
 do {			\
-  if (match_word_omp_simd (keyword, subr, &old_locus,	\
-			   &simd_matched) == MATCH_YES)	\
+  match m2;			\
+  if ((m2 = match_word_omp_simd (keyword, subr, &old_locus,	\
+			   &simd_matched)) == MATCH_YES)	\
 	{			\
 	  ret = st;		\
 	  goto finish;		\
 	}			\
+  else if (m2 == MATCH_ERROR)\
+	goto error_handling;	\
   else			\
 	undo_new_statement ();  	\
 } while (0)
@@ -789,14 +795,17 @@ decode_oacc_directive (void)
 /* Like match, but don't match anything if not 

[PATCH] Fix dump message issue

2019-10-08 Thread luoxhu
The closing '}' is missing at the end of the dump message.

gcc/ChangeLog:
tree-sra.c (dump_access): Add missing braces.
---
 gcc/tree-sra.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index 48589323a1e..cb59b91f20e 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -403,7 +403,7 @@ dump_access (FILE *f, struct access *access, bool grp)
 "grp_hint = %d, grp_covered = %d, "
 "grp_unscalarizable_region = %d, grp_unscalarized_data = %d, "
 "grp_same_access_path = %d, grp_partial_lhs = %d, "
-"grp_to_be_replaced = %d, grp_to_be_debug_replaced = %d\n",
+"grp_to_be_replaced = %d, grp_to_be_debug_replaced = %d}\n",
 access->grp_read, access->grp_write, access->grp_assignment_read,
 access->grp_assignment_write, access->grp_scalar_read,
 access->grp_scalar_write, access->grp_total_scalarization,
@@ -413,7 +413,7 @@ dump_access (FILE *f, struct access *access, bool grp)
 access->grp_to_be_replaced, access->grp_to_be_debug_replaced);
   else
 fprintf (f, ", write = %d, grp_total_scalarization = %d, "
-"grp_partial_lhs = %d\n",
+"grp_partial_lhs = %d}\n",
 access->write, access->grp_total_scalarization,
 access->grp_partial_lhs);
 }
-- 
2.21.0.777.g83232e3864



Re: [PATCH] Come up with ipa passes introduction in gccint documentation

2019-10-08 Thread luoxhu
Hi,

This is the formal documentation patch for IPA passes.  Thanks.


None of the IPA passes are documented in passes.texi.  This patch adds
an "IPA passes" section just before the GIMPLE passes and RTL passes
sections in Chapter 9, "Passes and Files of the Compiler", with a short
description of each IPA pass.
gccint.pdf can be produced without errors.

ChangeLog:
PR middle-end/26241
* doc/lto.texi (IPA): Reference to the IPA passes.
* doc/passes.texi (Pass manager): Add node IPA passes and
  description for each IPA pass.
---
 gcc/doc/lto.texi|   7 +-
 gcc/doc/passes.texi | 241 
 2 files changed, 245 insertions(+), 3 deletions(-)

diff --git a/gcc/doc/lto.texi b/gcc/doc/lto.texi
index 771e8278e50..1afb6eb4a98 100644
--- a/gcc/doc/lto.texi
+++ b/gcc/doc/lto.texi
@@ -350,8 +350,10 @@ while the @emph{Read summary}, @emph{Execute}, and
 @end itemize
 
 To simplify development, the GCC pass manager differentiates
-between normal inter-procedural passes and small inter-procedural
-passes.  A @emph{small inter-procedural pass}
+between normal inter-procedural passes (@pxref{All regular IPA passes}),
+small inter-procedural passes (@pxref{All small IPA passes})
+and late inter-procedural passes (@pxref{All late IPA passes}).
+A @emph{small inter-procedural pass}
 (@code{SIMPLE_IPA_PASS}) is a pass that does
 everything at once and thus it cannot be executed during WPA in
 WHOPR mode.  It defines only the @emph{Execute} stage and during
@@ -362,7 +364,6 @@ object files.  The simple inter-procedural passes can also be used
 for easier prototyping and development of a new inter-procedural
 pass.
 
-
 @subsection Virtual clones
 
 One of the main challenges of introducing the WHOPR compilation
diff --git a/gcc/doc/passes.texi b/gcc/doc/passes.texi
index 6edb9a0bfb7..3f9106d60c3 100644
--- a/gcc/doc/passes.texi
+++ b/gcc/doc/passes.texi
@@ -20,6 +20,7 @@ where near complete.
 * Parsing pass:: The language front end turns text into bits.
 * Gimplification pass::  The bits are turned into something we can optimize.
 * Pass manager:: Sequencing the optimization passes.
+* IPA passes::   Inter-procedural optimizations.
 * Tree SSA passes::  Optimizations on a high-level representation.
 * RTL passes::   Optimizations on a low-level representation.
 * Optimization info::Dumping optimization information from passes.
@@ -178,6 +179,246 @@ TODO: describe the global variables set up by the pass manager,
 and a brief description of how a new pass should use it.
 I need to look at what info RTL passes use first@enddots{}
 
+@node IPA passes
+@section IPA passes
+@cindex IPA passes
+The following briefly describes the IPA optimization passes including
+all small IPA passes, all regular IPA passes and all late IPA passes.
+
+@node All small IPA passes
+@subsection All small IPA passes
+@cindex all small ipa passes
+A small IPA pass is a pass derived from @code{simple_ipa_opt_pass}
+that does everything at once and thus it cannot be executed during WPA
+in WHOPR mode.  It defines only the @emph{Execute} stage and during this
+stage it accesses and modifies the function bodies.  No @code{generate_summary},
+@code{read_summary} or @code{write_summary} hooks are defined.
+
+@itemize @bullet
+@item IPA free lang data
+
+This pass frees resources that are used by FE but are not needed once
+they are done.  It is located in @file{tree.c} and is described by
+@code{pass_ipa_free_lang_data}.
+
+@item IPA function and variable visibility
+
+This is a local function pass handling visibilities of all symbols.  This
+happens before LTO streaming so in particular -fwhole-program should be ignored
+at this level.  It is located in @file{ipa-visibility.c} and is described by
+@code{pass_ipa_function_and_variable_visibility}.
+
+@item IPA remove symbols
+
+This pass performs reachability analysis and reclaims all unreachable nodes.
+It is located in @file{passes.c} and is described by
+@code{pass_ipa_remove_symbols}.
+
+@item IPA oacc
+
+It is located in @file{tree-ssa-loop.c} and is described by
+@code{pass_ipa_oacc}.
+
+@item IPA pta
+
+This is a tree based points-to analysis pass. The idea behind this analyzer
+is to generate set constraints from the program, then solve the resulting
+constraints in order to generate the points-to sets.  It is located in 
+@file{tree-ssa-structalias.c} and is described by @code{pass_ipa_pta}.
+
+@item IPA oacc kernels
+
+This pass is for processing oacc kernels.  It is located in @file{tree-ssa-loop.c}
+and is described by @code{pass_ipa_oacc_kernels}.
+
+@item Target clone
+
+This is a pass for parsing functions with multiple target attributes.
+It is located in @file{multiple_target.c} and is described by
+@code{pass_target_clone}.
+
+@item IPA auto profile
+
+This pass uses AutoFDO profile to annotate the control flow graph.
+It is located in @file{auto-profile.c} and is described by

Re: [AArch64] Allow shrink-wrapping of non-leaf vector PCS functions

2019-10-08 Thread Richard Sandiford
Christophe Lyon  writes:
> On Mon, 30 Sep 2019 at 18:48, Richard Sandiford 
> wrote:
>
> Richard Sandiford  writes:
> > [This follows on from:
> >  https://gcc.gnu.org/ml/gcc-patches/2019-09/msg00778.html
> >  https://gcc.gnu.org/ml/gcc-patches/2019-09/msg01456.html]
> >
> > With the function ABI stuff, we can now support shrink-wrapping of
> > non-leaf vector PCS functions.  This is particularly useful if the
> > vector PCS function calls an ordinary function on an error path,
> > since we can then keep the extra saves and restores specific to
> > that path too.
> >
> > Tested on aarch64-linux-gnu.  OK to install?
>
> Now self-approved :-), applied as r276340.
>
>
> Hi Richard,
>
> As you may have noticed from gcc-testresults, the new test simd-abi-9.c fails
> with -mabi=ilp32.

Thanks for the heads up.  The reason for the failure is that on ILP32
targets, there's an extra UXTW instruction to extend the incoming
pointer before the load.  It doesn't seem worth complicating the test
for that, since all we're checking is that an optimisation takes place,
and that optimisation isn't related to pointer size.

So I think the best fix is just to restrict that part of the test to LP64.
Tested on aarch64-linux-gnu, with and without -mabi=ilp32, applied as
r276688.

Richard


2019-10-08  Richard Sandiford  

gcc/testsuite/
* gcc.target/aarch64/torture/simd-abi-9.c: Require LP64 for
the function body test.

Index: gcc/testsuite/gcc.target/aarch64/torture/simd-abi-9.c
===
--- gcc/testsuite/gcc.target/aarch64/torture/simd-abi-9.c   2019-09-30 17:46:44.107513831 +0100
+++ gcc/testsuite/gcc.target/aarch64/torture/simd-abi-9.c   2019-10-08 09:23:54.594365710 +0100
@@ -6,7 +6,7 @@
 int callee (void);
 
 /*
-** caller:
+** caller: { target lp64 }
 ** ldr (w[0-9]+), \[x0\]
 ** cbn?z   \1, [^\n]*
 ** ...


Re: [SVE] PR86753

2019-10-08 Thread Richard Sandiford
Leaving the main review to Richard, just some comments...

Prathamesh Kulkarni  writes:
> @@ -9774,6 +9777,10 @@ vect_is_simple_cond (tree cond, vec_info *vinfo,
>  
> When STMT_INFO is vectorized as a nested cycle, for_reduction is true.
>  
> +   For COND_EXPR if T comes from masked load, and is conditional
> +   on C, we apply loop mask to result of vector comparison, if it's present.
> +   Similarly for E, if it is conditional on !C.
> +
> Return true if STMT_INFO is vectorizable in this way.  */
>  
>  bool

I think this is a bit misleading.  But IMO it'd be better not to have
a comment here and just rely on the one in the main function body.
This optimisation isn't really changing the vectorisation strategy,
and the comment could easily get forgotten if things change in future.

> [...]
> @@ -,6 +10006,35 @@ vectorizable_condition (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi,
>/* Handle cond expr.  */
>for (j = 0; j < ncopies; j++)
>  {
> +  tree loop_mask = NULL_TREE;
> +  bool swap_cond_operands = false;
> +
> +  /* Look up if there is a loop mask associated with the
> +  scalar cond, or it's inverse.  */

Maybe:

   See whether another part of the vectorized code applies a loop
   mask to the condition, or to its inverse.

> +
> +  if (loop_vinfo && LOOP_VINFO_FULLY_MASKED_P (loop_vinfo))
> + {
> +   scalar_cond_masked_key cond (cond_expr, ncopies);
> +   if (loop_vinfo->scalar_cond_masked_set.contains (cond))
> + {
> +   vec_loop_masks *masks = &LOOP_VINFO_MASKS (loop_vinfo);
> +   loop_mask = vect_get_loop_mask (gsi, masks, ncopies, vectype, j);
> + }
> +   else
> + {
> +   bool honor_nans = HONOR_NANS (TREE_TYPE (cond.op0));
> +   cond.code = invert_tree_comparison (cond.code, honor_nans);
> +   if (loop_vinfo->scalar_cond_masked_set.contains (cond))
> + {
> +   vec_loop_masks *masks = &LOOP_VINFO_MASKS (loop_vinfo);
> +   loop_mask = vect_get_loop_mask (gsi, masks, ncopies,
> +   vectype, j);
> +   cond_code = cond.code;
> +   swap_cond_operands = true;
> + }
> + }
> + }
> +
>stmt_vec_info new_stmt_info = NULL;
>if (j == 0)
>   {
> @@ -10114,6 +10153,47 @@ vectorizable_condition (stmt_vec_info stmt_info, gimple_stmt_iterator *gsi,
>   }
>   }
>   }
> +
> +   /* If loop mask is present, then AND it with

Maybe "If we decided to apply a loop mask, ..."

> +  result of vec comparison, so later passes (fre4)

Probably better not to name the pass -- could easily change in future.

> +  will reuse the same condition used in masked load.

Could be a masked store, or potentially other things too.
So maybe just "will reuse the masked condition"?

> +
> +  For example:
> +  for (int i = 0; i < 100; ++i)
> +x[i] = y[i] ? z[i] : 10;
> +
> +  results in following optimized GIMPLE: 
> +
> +  mask__35.8_43 = vect__4.7_41 != { 0, ... };
> +  vec_mask_and_46 = loop_mask_40 & mask__35.8_43;
> +  _19 = [base: z_12(D), index: ivtmp_56, step: 4, offset: 0B];
> +  vect_iftmp.11_47 = .MASK_LOAD (_19, 4B, vec_mask_and_46);
> +  vect_iftmp.12_52 = VEC_COND_EXPR <vec_mask_and_46,
> +vect_iftmp.11_47, { 10, ... }>;
> +
> +  instead of recomputing vec != { 0, ... } in vec_cond_expr  */

That's true, but gives the impression that avoiding the vec != { 0, ... }
is the main goal, whereas we could do that just by forcing a three-operand
COND_EXPR.  It's really more about making sure that vec != { 0, ... }
and its masked form aren't both live at the same time.  So maybe:

 instead of using both masked and unmasked forms of
 vect__4.7_41 != { 0, ... } (masked in the MASK_LOAD,
 unmasked in the VEC_COND_EXPR).  */
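
To make the point concrete, here is a scalar C model of the example
loop, processed in chunks as a fully-masked vector loop would.  It is a
sketch under assumptions: VL is an arbitrary chunk size, select_masked
and select_scalar are names invented here, and the per-lane arrays only
imitate what the vectorizer emits.  The single mask_and value per lane
feeds both the modelled MASK_LOAD and the modelled VEC_COND_EXPR, so
the unmasked comparison never needs to be kept around.

```c
#include <assert.h>
#include <stdbool.h>

#define VL 4			/* Hypothetical number of lanes per chunk.  */

/* Reference: the loop from the comment under review.  */
static void
select_scalar (int *x, const int *y, const int *z, int n)
{
  for (int i = 0; i < n; ++i)
    x[i] = y[i] ? z[i] : 10;
}

/* Chunked model of the fully-masked vector loop.  */
static void
select_masked (int *x, const int *y, const int *z, int n)
{
  for (int base = 0; base < n; base += VL)
    {
      bool loop_mask[VL], mask_and[VL];
      int loaded[VL];

      for (int lane = 0; lane < VL; ++lane)
	{
	  /* Loop mask: lane is active only while inside the array.  */
	  loop_mask[lane] = base + lane < n;
	  /* Masked load of y, then the comparison y != 0.  */
	  int y_lane = loop_mask[lane] ? y[base + lane] : 0;
	  bool cmp = y_lane != 0;
	  /* The one combined mask, reused by both consumers below.  */
	  mask_and[lane] = loop_mask[lane] && cmp;
	}

      /* Models .MASK_LOAD of z under the combined mask.  */
      for (int lane = 0; lane < VL; ++lane)
	loaded[lane] = mask_and[lane] ? z[base + lane] : 0;

      /* Models VEC_COND_EXPR on the same mask, then a masked store.  */
      for (int lane = 0; lane < VL; ++lane)
	if (loop_mask[lane])
	  x[base + lane] = mask_and[lane] ? loaded[lane] : 10;
    }
}

/* Small self-check on a length that is not a multiple of VL.  */
void
check_select_masked (void)
{
  int y[5] = { 0, 4, 0, 0, 1 };
  int z[5] = { 1, 2, 3, 4, 5 };
  int a[5], b[5];

  select_scalar (a, y, z, 5);
  select_masked (b, y, z, 5);
  for (int i = 0; i < 5; ++i)
    assert (a[i] == b[i]);
}
```

Inactive lanes select the constant in the model, but the masked store
discards them, so the result matches the scalar loop.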

Thanks,
Richard


Re: [AArch64] Allow shrink-wrapping of non-leaf vector PCS functions

2019-10-08 Thread Christophe Lyon
On Mon, 30 Sep 2019 at 18:48, Richard Sandiford 
wrote:

> Richard Sandiford  writes:
> > [This follows on from:
> >  https://gcc.gnu.org/ml/gcc-patches/2019-09/msg00778.html
> >  https://gcc.gnu.org/ml/gcc-patches/2019-09/msg01456.html]
> >
> > With the function ABI stuff, we can now support shrink-wrapping of
> > non-leaf vector PCS functions.  This is particularly useful if the
> > vector PCS function calls an ordinary function on an error path,
> > since we can then keep the extra saves and restores specific to
> > that path too.
> >
> > Tested on aarch64-linux-gnu.  OK to install?
>
> Now self-approved :-), applied as r276340.
>

Hi Richard,

As you may have noticed from gcc-testresults, the new test simd-abi-9.c
fails with -mabi=ilp32.

Christophe



>
> Richard
>


Re: use call-clobbered reg to disalign the stack

2019-10-08 Thread Uros Bizjak
> Some x86 tests of stack realignment, that disaligned the stack with
> pushes and pops, failed when the compiler was configured to tune for a
> target that preferred to accumulate outgoing arguments: the stack
> space is reserved before the asm push, the call sequence overwrites
> the saved register, and then the asm pop restores the overwritten
> value.  Since that's a call-preserved register in 32-bit mode, it
> should be preserved unchanged, but isn't.
>
> Merely changing the register to a call-clobbered one would be enough,
> but the tests would remain fragile and prone to failure due to other
> optimizations, so I arranged for the compiler to be made aware of the
> register used for the push and the pop, so it won't use it for
> something else, and forced the function to use a frame pointer, so
> that it won't use stack pointer offsets for local variables: the
> offsets would likely be wrong between the asm push and pop.
>
> Tested on x86_64-linux-gnu with -m64 and -m32.  Ok to install?
>
>
> for  gcc/testsuite/ChangeLog
>
> * gcc.target/i386/20060512-1.c (sse2_test): Use a
> call-clobbered register variable for stack-disaligning push
> and pop.  Require a frame pointer.
> * gcc.target/i386/20060512-3.c (sse2_test): Likewise.

OK.

Thanks,
Uros.