Re: [patch, libfortran] PR95647 operator(.eq.) and operator(==) treated differently

2021-02-11 Thread Tobias Burnus

Hi Jerry,

On 12.02.21 04:02, Jerry DeLisle wrote:

The attached patch is another provided from Steve Kargle in the PR
report.
I have created a test case and regression tested the result.

OK for trunk?

LGTM except:

libgfortran/ChangeLog:

PR libfortran 95647


Syntax is "PR /
* ieee/ieee_arithmetic.F90: Flip interfaces of operators .eq. to
== and .ne. to /= .

gcc/testsuite/ChangeLog:

PR libfortran 95647
* gfortran.dg/ieee/ieee_arithmetic: New test.

-
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München 
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank 
Thürauf


Re: Re: [PATCH] plug memory leaks in warn_parm_array_mismatch (PR 99055)

2021-02-11 Thread Richard Biener via Gcc-patches
On Thu, Feb 11, 2021 at 7:35 PM Martin Sebor  wrote:
>
> On 2/11/21 12:59 AM, Richard Biener wrote:
> > On Wed, Feb 10, 2021 at 6:16 PM Martin Sebor  wrote:
> >>
> >> The attached patch replaces calls to print_generic_expr_to_str() with
> >> a helper function that returns a std::string and releases the caller
> >> from the responsibility to explicitly free memory.
> >
> > I don't like this.  What's the reason to use generic_expr_as_string
> > anyway?  We have %E pretty-printers after all.
>
> [Reposting; my first reply was too big.]
>
> The VLA bound is either an expression or the asterisk (*) when newbnd
> is null.  The reason for using the function is to avoid duplicating
> the warning call and making one with %E and another with "*".
>
> > Couldn't you have
> > fixed the leak by doing
> >
> > if (newbnd)
> >free (newbndstr);
> >
> > etc.?
>
> Yes, I could have.  I considered it and chose not to because this
> is much simpler, safer (if newbnd were to change after newbndstr
> is set), and more in the "C++ spirit" (RAII).  This code already
> uses std::string (attr_access::array_as_string) and so my change
> is in line with it.
>
> I understand that you prefer a more C-ish coding style so if you
> consider it a prerequisite for accepting this fix I can make
> the change conform to it.  See the attached revision (although
> I hope you'll see why I chose not to go this route).

I can understand but I still disagree ;)

> For what it's worth, print_generic_expr_to_str() would be improved
> by returning std::string.  It would mean including  in
> most sources in the compiler, which I suspect may not be a popular
> suggestion with everyone here, and which is why I didn't make it
> to begin with.  But if you're open to it I'm happy to make that
> change either for GCC 12 or even now.

Well, looking at print_generic_expr_to_str I see

/* Print a single expression T to string, and return it.  */

char *
print_generic_expr_to_str (tree t)
{
  pretty_printer pp;
  dump_generic_node (, t, 0, TDF_VOPS|TDF_MEMSYMS, false);
  return xstrdup (pp_formatted_text ());
}

which makes me think that this instead should be sth like

class generic_expr_as_str : public pretty_printer
{
   generic_expr_as_str (tree t) { dump_generic_node (, t, 0,
TDF_VOPS|TDF_MEMSYMS, false); }
   operator const char *() { return pp_formatted_text (); }
   pretty_printer pp;
};

As we do seem to have a RAII pretty-printer class already.  That said,
I won't mind using

   pretty_printer pp;
   dump_generic_node (, t, 0, TDF_VOPS|TDF_MEMSYMS, false);

and pp_formatted_text () in the users either.

That is, print_generic_expr_to_str looks just like a bad designed hack.
Using std::string there doesn't make it any better.

Don't you agree to that?

So your 2nd variant patch is OK but you might consider just using
a pretty-printer manually and even do away with the xstrdup in that
way entirely!

> >
> >> With the patch installed, Valgrind shows more leaks in this code that
> >> I'm not sure what to do about:
> >>
> >> 1) A tree built by build_type_attribute_qual_variant() called from
> >>  attr_access::array_as_string() to build a temporary type only
> >>  for the purposes of formatting it.
> >>
> >> 2) A tree (an attribute list) built by tree_cons() called from
> >>  build_attr_access_from_parms() that's used only for the duration
> >>  of the caller.
> >>
> >> Do these temporary trees need to be released somehow or are the leaks
> >> expected?
> >
> > You should configure GCC with --enable-valgrind-annotations to make
> > it aware of our GC.
>
> I did configure with that option:
>
> $ /src/gcc/master/configure --enable-checking=yes
> --enable-languages=all,jit,lto --enable-host-shared
> --enable-valgrind-annotations
>
> Attached to pr99055 is my valgrind output for gcc.dg/Wvla-parameter.c
> with the top of trunk and the patch applied:
>
> $ /build/gcc-master/gcc/xgcc -B /build/gcc-master/gcc -S -Wall
> /src/gcc/master/gcc/testsuite/gcc.dg/Wvla-parameter.c -wrapper
> valgrind,--leak-check=full,--show-leak-kinds=all,--track-origins=yes,--log-file=valgrind-out.txt
>
> Do you not see the same leaks?

Err, well.  --show-leak-kinds=all is probably the cause.  We
definitely do not force-release
all reachable GC allocated memory at program end.  Not sure if
valgrind annotations can
make that obvious to valgrind.  I'm just using --leak-check=full and
thus look for
unreleased and no longer reachable memory.

Richard.

>
> Martin
>


[PING, PATCH V2] Add conversions between _Float128 and Decimal.

2021-02-11 Thread Michael Meissner via Gcc-patches
Ping patch

| Subject: [PATCH, V2] Add conversions between _Float128 and Decimal.
| Message-ID: <20210209073505.ga11...@ibm-toto.the-meissners.org>
https://gcc.gnu.org/pipermail/gcc-patches/2021-February/565009.html

I was asked to resubmit the patch with the copyright dates fixed, and Will's
comments addressed.  I have done this (Will's comments other than the copyright
dates were all about the check-in message, and not the patch itself).  I have
attempted to make the check-in message clearer.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


[PATCH RFA] cgraph: flatten and same_body aliases [PR96078]

2021-02-11 Thread Jason Merrill via Gcc-patches
The patch for PR92372 made us start warning about a flatten attribute on an
alias.  But in the case of C++ 'tor base/complete variants, the user didn't
create the alias, so we shouldn't warn.

I could also remove the attribute in maybe_clone_body, but here seems a bit
better.

Tested x86_64-pc-linux-gnu.  OK for trunk?

gcc/ChangeLog:

PR c++/96078
* cgraph.c (cgraph_node::create_same_body_alias): Remove flatten
attribute from alias.

gcc/testsuite/ChangeLog:

PR c++/96078
* g++.dg/ext/attr-flatten1.C: New test.
---
 gcc/cgraph.c | 3 +++
 gcc/testsuite/g++.dg/ext/attr-flatten1.C | 9 +
 2 files changed, 12 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ext/attr-flatten1.C

diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index db038306e19..ab0fc592b46 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -603,6 +603,9 @@ cgraph_node::create_same_body_alias (tree alias, tree decl)
   n->cpp_implicit_alias = true;
   if (symtab->cpp_implicit_aliases_done)
 n->resolve_alias (cgraph_node::get (decl));
+  /* Avoid warning in process_function_and_variable_attributes.  */
+  DECL_ATTRIBUTES (alias)
+= remove_attribute ("flatten", DECL_ATTRIBUTES (alias));
   return n;
 }
 
diff --git a/gcc/testsuite/g++.dg/ext/attr-flatten1.C 
b/gcc/testsuite/g++.dg/ext/attr-flatten1.C
new file mode 100644
index 000..5bcbfb6f4aa
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/attr-flatten1.C
@@ -0,0 +1,9 @@
+// PR c++/96078
+// { dg-do compile { target c++11 } }
+
+struct A {
+[[gnu::flatten]] A() {}
+[[gnu::flatten]] ~A() {}
+};
+
+A a;

base-commit: ac001ddd0cb635dec0145bf577ac796894bda398
-- 
2.27.0



Re: [PATCH v7] Practical improvement to libgcc complex divide

2021-02-11 Thread Patrick McGehearty via Gcc-patches

Ping - Submitted Feb 2, 2021
I believe this version fixes all issues raised
in previous submissions.

- Patrick McGehearty

On 2/2/2021 4:25 PM, Patrick McGehearty via Gcc-patches wrote:

Changes in this version from Version 6:
Updated copyrights for following three files to -2021.
 gcc/c-family/c-cppbuiltin.c
Moved code for setting LIBGCC modename to FUNC_EXT section.
Added code to set modename for any additional modes such as
FLT128 or FLT64X.

 libgcc/libgcc2.c
Changed RMIN2 & RMINSCAL for XF and TF modes to use the matching
LIBGCC_?F_EPSILON instead of LIBGCC_DF_EPSILON

 libgcc/config/rs6000/_divkc3.c
Changed RMIN2 & RMINSCAL for TF mode to use the matching
LIBGCC_TF_EPSILON instead of LIBGCC_DF_EPSILON

Resubmitted with correction to following link. No other changes
from previous PATCH v7.

Correctness and performance test programs used during development of
this project may be found in the attachment to:
https://gcc.gnu.org/pipermail/gcc-patches/2021-January/564239.html
with the attachment at
https://gcc.gnu.org/pipermail/gcc-patches/attachments/20210125/b5a7df3b/attachment-0001.bin

Summary of Purpose

This patch to libgcc/libgcc2.c __divdc3 provides an
opportunity to gain important improvements to the quality of answers
for the default complex divide routine (half, float, double, extended,
long double precisions) when dealing with very large or very small exponents.

The current code correctly implements Smith's method (1962) [2]
further modified by c99's requirements for dealing with NaN (not a
number) results. When working with input values where the exponents
are greater than *_MAX_EXP/2 or less than -(*_MAX_EXP)/2, results are
substantially different from the answers provided by quad precision
more than 1% of the time. This error rate may be unacceptable for many
applications that cannot a priori restrict their computations to the
safe range. The proposed method reduces the frequency of
"substantially different" answers by more than 99% for double
precision at a modest cost of performance.

Differences between current gcc methods and the new method will be
described. Then accuracy and performance differences will be discussed.

Background

This project started with an investigation related to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59714.  Study of Beebe[1]
provided an overview of past and recent practice for computing complex
divide. The current glibc implementation is based on Robert Smith's
algorithm [2] from 1962.  A google search found the paper by Baudin
and Smith [3] (same Robert Smith) published in 2012. Elen Kalda's
proposed patch [4] is based on that paper.

I developed two sets of test data by randomly distributing values over
a restricted range and the full range of input values. The current
complex divide handled the restricted range well enough, but failed on
the full range more than 1% of the time. Baudin and Smith's primary
test for "ratio" equals zero reduced the cases with 16 or more error
bits by a factor of 5, but still left too many flawed answers. Adding
debug print out to cases with substantial errors allowed me to see the
intermediate calculations for test values that failed. I noted that
for many of the failures, "ratio" was a subnormal. Changing the
"ratio" test from check for zero to check for subnormal reduced the 16
bit error rate by another factor of 12. This single modified test
provides the greatest benefit for the least cost, but the percentage
of cases with greater than 16 bit errors (double precision data) is
still greater than 0.027% (2.7 in 10,000).

Continued examination of remaining errors and their intermediate
computations led to the various tests of input value tests and scaling
to avoid under/overflow. The current patch does not handle some of the
rare and most extreme combinations of input values, but the random
test data is only showing 1 case in 10 million that has an error of
greater than 12 bits. That case has 18 bits of error and is due to
subtraction cancellation. These results are significantly better
than the results reported by Baudin and Smith.

Support for half, float, double, extended, and long double precision
is included as all are handled with suitable preprocessor symbols in a
single source routine. Since half precision is computed with float
precision as per current libgcc practice, the enhanced algorithm
provides no benefit for half precision and would cost performance.
Further investigation showed changing the half precision algorithm
to use the simple formula (real=a*c+b*d imag=b*c-a*d) caused no
loss of precision and modest improvement in performance.

The existing constants for each precision:
float: FLT_MAX, FLT_MIN;
double: DBL_MAX, DBL_MIN;
extended and/or long double: LDBL_MAX, LDBL_MIN
are used for avoiding the more common overflow/underflow cases.  This
use is made generic by defining appropriate __LIBGCC2_* macros in
c-cppbuiltin.c.


[patch, libfortran] PR95647 operator(.eq.) and operator(==) treated differently

2021-02-11 Thread Jerry DeLisle

The attached patch is another provided from Steve Kargle in the PR report.

I have created a test case and regression tested the result.

OK for trunk?

Regards,

Jerry

libgfortran: Fix PR95647 by changing the interfaces of operators .eq. 
and .ne.


The FE converts the old school .eq. to ==,
and then tracks the ==.  The module starts with == and so it does not
properly overload the .eq.  Reversing the interfaces fixes this.

2021-02-11  Steve Kargl 

libgfortran/ChangeLog:

    PR libfortran 95647
    * ieee/ieee_arithmetic.F90: Flip interfaces of operators .eq. to
    == and .ne. to /= .

gcc/testsuite/ChangeLog:

    PR libfortran 95647
    * gfortran.dg/ieee/ieee_arithmetic: New test.

diff --git a/gcc/testsuite/gfortran.dg/ieee/ieee_12.f90 b/gcc/testsuite/gfortran.dg/ieee/ieee_12.f90
new file mode 100644
index 000..139a70142b8
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/ieee/ieee_12.f90
@@ -0,0 +1,24 @@
+! { dg-do run }
+! PR95647 operator(.eq.) and operator(==) treated differently
+program test
+  use, intrinsic :: ieee_arithmetic, only : &
+&ieee_class,   &
+&ieee_class_type,  &
+&ieee_negative_normal, &
+&ieee_positive_normal, &
+&operator(.eq.), operator(.ne.)
+  integer :: good
+  real(4) r4
+  type(ieee_class_type) class1
+  good = 0
+  r4 = 1.0
+  class1 = ieee_class(r4)
+  if (class1 .eq. ieee_positive_normal) good = good + 1
+  if (class1 .ne. ieee_negative_normal) good = good + 1
+  r4 = -1.0
+  class1 = ieee_class(r4)
+  if (class1 .eq. ieee_negative_normal) good = good + 1
+  if (class1 .ne. ieee_positive_normal) good = good + 1
+  if (good /= 4) call abort
+end program test
+
diff --git a/libgfortran/ieee/ieee_arithmetic.F90 b/libgfortran/ieee/ieee_arithmetic.F90
index 55992232ce2..35a16938f8e 100644
--- a/libgfortran/ieee/ieee_arithmetic.F90
+++ b/libgfortran/ieee/ieee_arithmetic.F90
@@ -77,15 +77,16 @@ module IEEE_ARITHMETIC
 
 
   ! Equality operators on the derived types
-  interface operator (==)
+  ! Note, the FE overloads .eq. to == and .ne. to /=
+  interface operator (.eq.)
 module procedure IEEE_CLASS_TYPE_EQ, IEEE_ROUND_TYPE_EQ
   end interface
-  public :: operator(==)
+  public :: operator(.eq.)
 
-  interface operator (/=)
+  interface operator (.ne.)
 module procedure IEEE_CLASS_TYPE_NE, IEEE_ROUND_TYPE_NE
   end interface
-  public :: operator (/=)
+  public :: operator (.ne.)
 
 
   ! IEEE_IS_FINITE


[pushed] c++: variadic lambda template and empty pack [PR97246]

2021-02-11 Thread Jason Merrill via Gcc-patches
In get<0>, Is is empty, so the first parameter pack of the lambda is empty,
but after the fix for PR94546 we were wrongly associating it with the
partial instantiation of 'v'.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog:

PR c++/97246
PR c++/94546
* pt.c (extract_fnparm_pack): Check DECL_PACK_P here.
(register_parameter_specializations): Not here.

gcc/testsuite/ChangeLog:

PR c++/97246
* g++.dg/cpp2a/lambda-generic-variadic21.C: New test.
---
 gcc/cp/pt.c   | 38 ---
 .../g++.dg/cpp2a/lambda-generic-variadic21.C  | 19 ++
 2 files changed, 43 insertions(+), 14 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/lambda-generic-variadic21.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 37c45a297c2..50cdb00310f 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -12302,29 +12302,40 @@ extract_fnparm_pack (tree tmpl_parm, tree *spec_p)
 {
   /* Collect all of the extra "packed" parameters into an
  argument pack.  */
-  tree parmvec;
-  tree argpack = make_node (NONTYPE_ARGUMENT_PACK);
+  tree argpack;
   tree spec_parm = *spec_p;
-  int i, len;
+  int len;
 
   for (len = 0; spec_parm; ++len, spec_parm = TREE_CHAIN (spec_parm))
 if (tmpl_parm
&& !function_parameter_expanded_from_pack_p (spec_parm, tmpl_parm))
   break;
 
-  /* Fill in PARMVEC and PARMTYPEVEC with all of the parameters.  */
-  parmvec = make_tree_vec (len);
   spec_parm = *spec_p;
-  for (i = 0; i < len; i++, spec_parm = DECL_CHAIN (spec_parm))
+  if (len == 1 && DECL_PACK_P (spec_parm))
 {
-  tree elt = spec_parm;
-  if (DECL_PACK_P (elt))
-   elt = make_pack_expansion (elt);
-  TREE_VEC_ELT (parmvec, i) = elt;
+  /* The instantiation is still a parameter pack; don't wrap it in a
+NONTYPE_ARGUMENT_PACK.  */
+  argpack = spec_parm;
+  spec_parm = DECL_CHAIN (spec_parm);
 }
+  else
+{
+  /* Fill in PARMVEC with all of the parameters.  */
+  tree parmvec = make_tree_vec (len);
+  argpack = make_node (NONTYPE_ARGUMENT_PACK);
+  for (int i = 0; i < len; i++)
+   {
+ tree elt = spec_parm;
+ if (DECL_PACK_P (elt))
+   elt = make_pack_expansion (elt);
+ TREE_VEC_ELT (parmvec, i) = elt;
+ spec_parm = DECL_CHAIN (spec_parm);
+   }
 
-  /* Build the argument packs.  */
-  SET_ARGUMENT_PACK_ARGS (argpack, parmvec);
+  /* Build the argument packs.  */
+  SET_ARGUMENT_PACK_ARGS (argpack, parmvec);
+}
   *spec_p = spec_parm;
 
   return argpack;
@@ -25716,8 +25727,7 @@ register_parameter_specializations (tree pattern, tree 
inst)
 }
   for (; tmpl_parm; tmpl_parm = DECL_CHAIN (tmpl_parm))
 {
-  if (!DECL_PACK_P (tmpl_parm)
- || (spec_parm && DECL_PACK_P (spec_parm)))
+  if (!DECL_PACK_P (tmpl_parm))
{
  register_local_specialization (spec_parm, tmpl_parm);
  spec_parm = DECL_CHAIN (spec_parm);
diff --git a/gcc/testsuite/g++.dg/cpp2a/lambda-generic-variadic21.C 
b/gcc/testsuite/g++.dg/cpp2a/lambda-generic-variadic21.C
new file mode 100644
index 000..d6b5656e5db
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/lambda-generic-variadic21.C
@@ -0,0 +1,19 @@
+// PR c++/97246
+// { dg-do compile { target c++20 } }
+
+template 
+T arg_T(decltype(Is)..., T, ...);
+
+template 
+inline constexpr auto get =
+ [](decltype(Is)..., T... v, ...) {
+   static_assert( sizeof...(T) == sizeof...(v) );
+   if constexpr ( sizeof...(T) == 1 )
+ return (v,...);
+   else {
+ using V = decltype(arg_T<__integer_pack(I)...>(v...));
+ return get.template operator()(v...);
+   }
+ };
+
+static_assert( get<0>('\0', short{1}, 2, long{3}) == 0 );

base-commit: 3e2f329e94830e7c1861c590f661c25f7465bb7c
-- 
2.27.0



libbacktrace patch committed: Use objcopy --help to check for option

2021-02-11 Thread Ian Lance Taylor via Gcc-patches
This patch changes the libbacktrace configure script to check for
whether objcopy supports --add-gnu-debuglink (a test that only affects
the libbacktrace testsuite) to look at the objcopy --help option
rather than trying to apply --add-gnu-debuglink to /bin/ls.  The
latter can trigger a warning if /bin/ls already has a separate debug
file.  Bootstrapped and ran libbacktrace testsuite on
x86_64-pc-linux-gnu.  Committed to mainline.

Ian

* configure.ac: Check for objcopy --add-gnu-debuglink by using
objcopy --help.
* configure: Regenerate
diff --git a/libbacktrace/configure.ac b/libbacktrace/configure.ac
index 83d4733509a..43a33a66b82 100644
--- a/libbacktrace/configure.ac
+++ b/libbacktrace/configure.ac
@@ -500,8 +500,7 @@ AC_CACHE_CHECK([whether objcopy supports debuglink],
   libbacktrace_cv_objcopy_debuglink=no
 elif ! test -n "${OBJCOPY}"; then
   libbacktrace_cv_objcopy_debuglink=no
-elif ${OBJCOPY} --add-gnu-debuglink=x /bin/ls /tmp/ls$$; then
-  rm -f /tmp/ls$$
+elif ${OBJCOPY} --help | fgrep add-gnu-debuglink >/dev/null 2>&1; then
   libbacktrace_cv_objcopy_debuglink=yes
 else
   libbacktrace_cv_objcopy_debuglink=no


[committed] analyzer: fix ICE in print_mem_ref [PR98969]

2021-02-11 Thread David Malcolm via Gcc-patches
PR analyzer/98969 and PR analyzer/99064 describes ICEs, in both cases
within print_mem_ref, when falsely reporting memory leaks - though it
is possible to generate the ICE on other diagnostics (which I added
in one of the test cases).

This patch fixes the ICE, leaving the fix for the leak false positives
as followup work.

The analyzer uses region_model::get_representative_path_var and
region_model::get_representative_tree to map back from its svalue
and region classes to the tree type used by the rest of the compiler,
and, in particular, for diagnostics.

The root cause of the ICE is sloppiness about types within those
functions; specifically when casts were stripped off svalues.  To
track these down I added wrapper functions that verify that the
types of the results are correct, and in doing so found various
other type-safety issues, which the patch also fixes.

Doing so led to various changes in diagnostics messages due to
more accurate types, but I felt that these changes weren't
desirable.
For example, the warning at CVE-2005-1689-minimal.c line 48
which expects:
  double-'free' of 'inbuf.data'
changed fo
  double-'free' of '(char *)inbuf.data'

So I added stripping of top-level casts where necessary to avoid
cluttering diagnostics.

Finally, the more accurate types led to worse results from
readability_comparator, where e.g. the event message at line 50
of sensitive-1.c regressed from the precise:
  passing sensitive value 'password' in call to 'called_by_test_5' from 'test_5'
to the vaguer:
  calling 'called_by_test_5' from 'test_5'
This was due to erroneously picking the initial value of "password"
in the caller frame as the best value within the *callee* frame, due to
"char *" vs "const char *", which confuses the logic for tracking values
that pass along callgraph edges.  The patch fixes this by combining the
readability tests for tree and stack depth, rather than performing
them in sequence, so that it favors the value in the deepest frame.

As noted above, the patch fixes the ICEs, but does not fix the
leak false positives.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r11-7202-g467a48205279cab368dbeb02879e4b721516.

gcc/analyzer/ChangeLog:
PR analyzer/98969
* engine.cc (readability): Add names for the various arbitrary
values.  Handle NOP_EXPR and INTEGER_CST.
(readability_comparator): Combine the readability tests for
tree and stack depth, rather than performing them sequentially.
(impl_region_model_context::on_state_leak): Strip off top-level
casts.
* region-model.cc (region_model::get_representative_path_var): Add
type-checking, moving the bulk of the implementation to...
(region_model::get_representative_path_var_1): ...here.  Respect
types in casts by recursing and re-adding the cast, rather than
merely stripping them off.  Use the correct type when handling
region_svalue.
(region_model::get_representative_tree): Strip off any top-level
cast.
(region_model::get_representative_path_var): Add type-checking,
moving the bulk of the implementation to...
(region_model::get_representative_path_var_1): ...here.
* region-model.h (region_model::get_representative_path_var_1):
New decl
(region_model::get_representative_path_var_1): New decl.
* store.cc (append_pathvar_with_type): New.
(binding_cluster::get_representative_path_vars): Cast path_vars
to the correct type when adding them to *OUT_PVS.

gcc/testsuite/ChangeLog:
PR analyzer/98969
* g++.dg/analyzer/pr99064.C: New test.
* gcc.dg/analyzer/pr98969.c: New test.
---
 gcc/analyzer/engine.cc  | 43 +--
 gcc/analyzer/region-model.cc| 98 +
 gcc/analyzer/region-model.h |  7 ++
 gcc/analyzer/store.cc   | 19 -
 gcc/testsuite/g++.dg/analyzer/pr99064.C | 39 ++
 gcc/testsuite/gcc.dg/analyzer/pr98969.c | 18 +
 6 files changed, 200 insertions(+), 24 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/analyzer/pr99064.C
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/pr98969.c

diff --git a/gcc/analyzer/engine.cc b/gcc/analyzer/engine.cc
index 45aed8f0d37..89b3be22ecb 100644
--- a/gcc/analyzer/engine.cc
+++ b/gcc/analyzer/engine.cc
@@ -456,6 +456,9 @@ private:
 static int
 readability (const_tree expr)
 {
+  /* Arbitrarily-chosen "high readability" value.  */
+  const int HIGH_READABILITY = 65536;
+
   gcc_assert (expr);
   switch (TREE_CODE (expr))
 {
@@ -479,8 +482,7 @@ readability (const_tree expr)
 case PARM_DECL:
 case VAR_DECL:
   if (DECL_NAME (expr))
-   /* Arbitrarily-chosen "high readability" value.  */
-   return 65536;
+   return HIGH_READABILITY;
   else
/* We don't want to print temporaries.  For example, the C FE
   prints them as e.g. 

[stage 1 patch] remove unreachable code in expand_expr_real_1 (PR 21433)

2021-02-11 Thread Martin Sebor via Gcc-patches

While trawling through old bugs I came across one from 2005: PR 21433
- The COMPONENT_REF case of expand_expr_real_1 is probably wrong.

The report looks correct in that argument 0 in COMPONENT_REF cannot
be a CONSTRUCTOR.  In my tests it's only been one of the following
codes:

  array_ref
  component_ref
  mem_ref
  parm_decl
  result_decl
  var_decl

The attached patch removes the CONSTRUCTOR code and replaces it with
an assert verifying it doesn't come up there.  Besides testing on
x86_64-linux, the change is supported by comments in code and also
in the internals manual (although that looks incorrect and should
be changed to avoid suggesting the first operand is a decl).

tree.def:

/* Value is structure or union component.
   Operand 0 is the structure or union (an expression).
   Operand 1 is the field (a node of type FIELD_DECL).
   Operand 2, if present, is the value of DECL_FIELD_OFFSET, measured
   in units of DECL_OFFSET_ALIGN / BITS_PER_UNIT.  */
DEFTREECODE (COMPONENT_REF, "component_ref", tcc_reference, 3)

generic.texi:

@item COMPONENT_REF
These nodes represent non-static data member accesses.  The first
operand is the object (rather than a pointer to it); the second operand
is the @code{FIELD_DECL} for the data member.  The third operand represents
the byte offset of the field, but should not be used directly; call
@code{component_ref_field_offset} instead.
PR middle-end/21433 - The COMPONENT_REF case of expand_expr_real_1 is probably wrong

gcc/ChangeLog:

	PR middle-end/21433
	* expr.c (expand_expr_real_1): Replace unreachable code with an assert.

diff --git a/gcc/expr.c b/gcc/expr.c
index 86dc1b6c973..7d5efe5722a 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -10767,61 +10767,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
   goto normal_inner_ref;
 
 case COMPONENT_REF:
-  /* If the operand is a CONSTRUCTOR, we can just extract the
-	 appropriate field if it is present.  */
-  if (TREE_CODE (treeop0) == CONSTRUCTOR)
-	{
-	  unsigned HOST_WIDE_INT idx;
-	  tree field, value;
-	  scalar_int_mode field_mode;
-
-	  FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (treeop0),
-idx, field, value)
-	if (field == treeop1
-		/* We can normally use the value of the field in the
-		   CONSTRUCTOR.  However, if this is a bitfield in
-		   an integral mode that we can fit in a HOST_WIDE_INT,
-		   we must mask only the number of bits in the bitfield,
-		   since this is done implicitly by the constructor.  If
-		   the bitfield does not meet either of those conditions,
-		   we can't do this optimization.  */
-		&& (! DECL_BIT_FIELD (field)
-		|| (is_int_mode (DECL_MODE (field), _mode)
-			&& (GET_MODE_PRECISION (field_mode)
-			<= HOST_BITS_PER_WIDE_INT
-	  {
-		if (DECL_BIT_FIELD (field)
-		&& modifier == EXPAND_STACK_PARM)
-		  target = 0;
-		op0 = expand_expr (value, target, tmode, modifier);
-		if (DECL_BIT_FIELD (field))
-		  {
-		HOST_WIDE_INT bitsize = TREE_INT_CST_LOW (DECL_SIZE (field));
-		scalar_int_mode imode
-		  = SCALAR_INT_TYPE_MODE (TREE_TYPE (field));
-
-		if (TYPE_UNSIGNED (TREE_TYPE (field)))
-		  {
-			op1 = gen_int_mode ((HOST_WIDE_INT_1 << bitsize) - 1,
-	imode);
-			op0 = expand_and (imode, op0, op1, target);
-		  }
-		else
-		  {
-			int count = GET_MODE_PRECISION (imode) - bitsize;
-
-			op0 = expand_shift (LSHIFT_EXPR, imode, op0, count,
-	target, 0);
-			op0 = expand_shift (RSHIFT_EXPR, imode, op0, count,
-	target, 0);
-		  }
-		  }
-
-		return op0;
-	  }
-	}
-  goto normal_inner_ref;
-
+  gcc_assert (TREE_CODE (treeop0) != CONSTRUCTOR);
+  /* Fall through.  */
 case BIT_FIELD_REF:
 case ARRAY_RANGE_REF:
 normal_inner_ref:


[PATCH] libstdc++: ifdef rtti specific function __throw_ios_failure() by __cpp_rtti

2021-02-11 Thread Mirko Vogt

Hello,

ran into the following when building libstdc++ without rtti support:

libstdc++-v3/src/c++11/cxx11-ios_failure.cc:174:54: error: no matching 
function for call to 'std::ios_base::failure::failure(const char*&, int&)'


Attached patch does as follows:

ifdef rtti specific function __throw_ios_failure() by __cpp_rtti


Overloaded __throw_ios_failure(const char*, int) got introduced in

484e936e88e5, however __ios_failure() with respective signature is only

defined if __cpp_rtti is defined, hence should only be used within

contexts also guarded by __cpp_rtti.



This was done correctly for the c++98 part and probably just forgotten

for c++11.

Thanks

  mirko

PS: Shouldn't this have been covered by any tests?
>From 46aa7ca34b832695b3f13167034f6aa19f44d074 Mon Sep 17 00:00:00 2001
From: Mirko Vogt 
Date: Thu, 11 Feb 2021 23:18:28 +
Subject: [PATCH] ifdef rtti specific function __throw_ios_failure() by
 __cpp_rtti

Overloaded __throw_ios_failure(const char*, int) got introduced in
484e936e88e5, however __ios_failure() with respective signature is only
defined if __cpp_rtti is defined, hence should only be used within
contexts also guarded by __cpp_rtti.

This was done correctly for the c++98 part and probably just forgotten
for c++11.
---
 libstdc++-v3/src/c++11/cxx11-ios_failure.cc | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/libstdc++-v3/src/c++11/cxx11-ios_failure.cc b/libstdc++-v3/src/c++11/cxx11-ios_failure.cc
index e82c1aaf63b..7edde29f0e8 100644
--- a/libstdc++-v3/src/c++11/cxx11-ios_failure.cc
+++ b/libstdc++-v3/src/c++11/cxx11-ios_failure.cc
@@ -168,10 +168,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   __throw_ios_failure(const char* __s __attribute__((unused)))
   { _GLIBCXX_THROW_OR_ABORT(__ios_failure(_(__s))); }
 
+#if __cpp_rtti
   void
   __throw_ios_failure(const char* str __attribute__((unused)),
 		  int err __attribute__((unused)))
   { _GLIBCXX_THROW_OR_ABORT(__ios_failure(_(str), err)); }
+#endif
 
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace
-- 
2.20.1



Re: [PATCH,rs6000] Optimize pcrel access of globals [ping]

2021-02-11 Thread Segher Boessenkool
Hi!

On Wed, Dec 09, 2020 at 11:04:44AM -0600, acsaw...@linux.ibm.com wrote:
> This patch implements a RTL pass that looks for pc-relative loads of the
> address of an external variable using the PCREL_GOT relocation and a
> single load or store that uses that external address.

> --- a/gcc/config.gcc
> +++ b/gcc/config.gcc
> @@ -509,7 +509,7 @@ or1k*-*-*)
>   ;;
>  powerpc*-*-*)
>   cpu_type=rs6000
> - extra_objs="rs6000-string.o rs6000-p8swap.o rs6000-logue.o 
> rs6000-call.o"
> + extra_objs="rs6000-string.o rs6000-p8swap.o rs6000-logue.o 
> rs6000-call.o pcrel-opt.o"

Make this fit on its line?  Just like extra_headers for example:

>   extra_headers="ppc-asm.h altivec.h htmintrin.h htmxlintrin.h"
>   extra_headers="${extra_headers} bmi2intrin.h bmiintrin.h"
>   extra_headers="${extra_headers} xmmintrin.h mm_malloc.h emmintrin.h"


> +/* This file implements a RTL pass that looks for pc-relative loads of the
> +   address of an external variable using the PCREL_GOT relocation and a 
> single
> +   load that uses that external address.  If that is found we create the
> +   PCREL_OPT relocation to possibly convert:
> +
> + pld addr_reg,var@pcrel@got
> +
> + 
> +
> + lwz data_reg,0(addr_reg)
> +
> +   into:
> +
> + plwz data_reg,var@pcrel
> +
> + 
> +
> + nop

The nop first seems much simpler, but you cannot replace a 4-byte insn
with an 8-byte one (without huge effort).  Pity.  Maybe mention that
somewhere?

> +   If the variable is not defined in the main program or the code using it is
> +   not in the main program, the linker puts the address in the .got section 
> and
> +   generates:
> +
> + .section .got
> + .Lvar_got:
> + .dword var
> +
> + .section .text
> + pld addr_reg,.Lvar_got@pcrel
> +
> + 
> +
> + lwz data_reg,0(addr_reg)

What is the advantage of this, over what we started with?

> +   We look for a single usage in the basic block where the external
> +   address is loaded.  Multiple uses or references in another basic block 
> will
> +   force us to not use the PCREL_OPT relocation.
> +
> +   We also optimize stores to the address of an external variable using the
> +   PCREL_GOT relocation and a single store that uses that external address.  
> If
> +   that is found we create the PCREL_OPT relocation to possibly convert:
> +
> + pld addr_reg,var@pcrel@got
> +
> + 
> +
> + stw data_reg,0(addr_reg)
> +
> +   into:
> +
> + pstw data_reg,var@pcrel
> +
> + 
> +
> + nop
> +
> +   If the variable is not defined in the main program or the code using it is
> +   not in the main program, the linker put the address in the .got section 
> and
> +   do:

"puts and does"?  Or "will put and will do"?

> + .section .got
> + .Lvar_got:
> + .dword var
> +
> + .section .text
> + pld addr_reg,.Lvar_got@pcrel
> +
> + 
> +
> + stw data_reg,0(addr_reg)
> +
> +   We only look for a single usage in the basic block where the external
> +   address is loaded.  Multiple uses or references in another basic block 
> will
> +   force us to not use the PCREL_OPT relocation.  */

That sounds like it is a restriction, but you cannot do better at all,
not without communicating a lot more info to the linker anyway.

> +#include "config.h"
> +#include "system.h"
> +#include "coretypes.h"
> +#include "backend.h"
> +#include "rtl.h"
> +#include "tree.h"
> +#include "memmodel.h"
> +#include "expmed.h"
> +#include "optabs.h"
> +#include "recog.h"
> +#include "df.h"
> +#include "tm_p.h"
> +#include "ira.h"
> +#include "print-tree.h"
> +#include "varasm.h"
> +#include "explow.h"
> +#include "expr.h"
> +#include "output.h"
> +#include "tree-pass.h"
> +#include "rtx-vector-builder.h"
> +#include "print-rtl.h"
> +#include "insn-attr.h"
> +#include "insn-codes.h"

Do you need all these header files?  It looks quite reduced, and in the
right order, but did you check :-)

> +/* Various counters.  */
> +static struct {
> +  unsigned long extern_addrs;
> +  unsigned long loads;
> +  unsigned long adjacent_loads;
> +  unsigned long failed_loads;
> +  unsigned long stores;
> +  unsigned long adjacent_stores;
> +  unsigned long failed_stores;
> +} counters;

There is the whole statistics.[ch] you could use for this.  I've never
used it, no idea if there are actual advantages to it :-)

> +/* Return a marker to identify the PCREL_OPT load address and
> +   load/store instruction.  We use a unique integer which is appended
> +   to ".Lpcrel" to make the label.  */
> +
> +static rtx
> +pcrel_opt_next_marker (void)
> +{
> +  static unsigned int pcrel_opt_next_num;
> +
> +  pcrel_opt_next_num++;
> +  return GEN_INT (pcrel_opt_next_num);
> +}

You could just open-code it the whole two places you use this, that is
more straightforward.

> +/* Optimize a PC-relative load address to be used in a load.
> +
> +   If the 

[x86] Fix -freorder-blocks-and-partition glitch with Windows SEH

2021-02-11 Thread Eric Botcazou
Since GCC 8, the -freorder-blocks-and-partition pass can split a function into 
hot and cold parts, thus generating 2 CIEs for a single function in DWARF for 
exception purposes and doing an equivalent trick for Windows SEH on x86-64.

Now the Windows system unwinder is picky when it comes to the boundary between 
an active EH region and the end of the function and, therefore, a nop may need 
to be added in specific cases, see e.g. ix86_seh_fixup_eh_fallthru.

I overlooked that when implementing the -freorder-blocks-and-partition support 
back in 2018 and this results in e.g. a non-functioning Ada compiler when you 
do a profiled bootstrap.

Bootstrapped on x86-64/Windows, applied on all active branches as obvious.


2021-02-11  Eric Botcazou 

* config/i386/winnt.c (i386_pe_seh_unwind_emit): When switching to
the cold section, emit a nop before the directive if the previous
active instruction can throw.

-- 
Eric Botcazoudiff --git a/gcc/config/i386/winnt.c b/gcc/config/i386/winnt.c
index 962c88e3c1b..adc3f36f13b 100644
--- a/gcc/config/i386/winnt.c
+++ b/gcc/config/i386/winnt.c
@@ -1231,6 +1231,10 @@ i386_pe_seh_unwind_emit (FILE *asm_out_file, rtx_insn *insn)
   seh = cfun->machine->seh;
   if (NOTE_P (insn) && NOTE_KIND (insn) == NOTE_INSN_SWITCH_TEXT_SECTIONS)
 {
+  /* See ix86_seh_fixup_eh_fallthru for the rationale.  */
+  rtx_insn *prev = prev_active_insn (insn);
+  if (prev && !insn_nothrow_p (prev))
+	fputs ("\tnop\n", asm_out_file);
   fputs ("\t.seh_endproc\n", asm_out_file);
   seh->in_cold_section = true;
   return;


[PATCH] libgomp/i386: Revert the type of syscall wrappers output back to long.

2021-02-11 Thread Uros Bizjak via Gcc-patches
Linux man-pages 5.07 wrongly declares syscall output type as int.  This error
was fixed in release 5.10, so this patch reverts my recent change.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline as partial revert.

2021-02-11  Uroš Bizjak  

libgomp/
* config/linux/x86/futex.h (__futex_wait):
Revert output type back to long.
(__futex_wake): Ditto.
(futex_wait): Update for revert.
(futex_wake): Ditto.

Uros.
diff --git a/libgomp/config/linux/x86/futex.h b/libgomp/config/linux/x86/futex.h
index 7807e88e2e0..e7f53399a4e 100644
--- a/libgomp/config/linux/x86/futex.h
+++ b/libgomp/config/linux/x86/futex.h
@@ -30,10 +30,10 @@
 #  define SYS_futex202
 # endif
 
-static inline int
+static inline long
 __futex_wait (int *addr, int futex_op, int val)
 {
-  int res;
+  long res;
 
   register void *timeout __asm ("r10") = NULL;
   __asm volatile ("syscall"
@@ -44,10 +44,10 @@ __futex_wait (int *addr, int futex_op, int val)
   return res;
 }
 
-static inline int
+static inline long
 __futex_wake (int *addr, int futex_op, int count)
 {
-  int res;
+  long res;
 
   __asm volatile ("syscall"
  : "=a" (res)
@@ -61,10 +61,10 @@ __futex_wake (int *addr, int futex_op, int count)
 #  define SYS_futex240
 # endif
 
-static inline int
+static inline long
 __futex_wait (int *addr, int futex_op, int val)
 {
-  int res;
+  long res;
 
   void *timeout = NULL;
   __asm volatile ("int $0x80"
@@ -75,10 +75,10 @@ __futex_wait (int *addr, int futex_op, int val)
   return res;
 }
 
-static inline int
+static inline long
 __futex_wake (int *addr, int futex_op, int count)
 {
-  int res;
+  long res;
 
   __asm volatile ("int $0x80"
  : "=a" (res)
@@ -92,9 +92,9 @@ __futex_wake (int *addr, int futex_op, int count)
 static inline void
 futex_wait (int *addr, int val)
 {
-  int res = __futex_wait (addr, gomp_futex_wait, val);
+  long err = __futex_wait (addr, gomp_futex_wait, val);
 
-  if (__builtin_expect (res == -ENOSYS, 0))
+  if (__builtin_expect (err == -ENOSYS, 0))
 {
   gomp_futex_wait &= ~FUTEX_PRIVATE_FLAG;
   gomp_futex_wake &= ~FUTEX_PRIVATE_FLAG;
@@ -106,9 +106,9 @@ futex_wait (int *addr, int val)
 static inline void
 futex_wake (int *addr, int count)
 {
-  int res = __futex_wake (addr, gomp_futex_wake, count);
+  long err = __futex_wake (addr, gomp_futex_wake, count);
 
-  if (__builtin_expect (res == -ENOSYS, 0))
+  if (__builtin_expect (err == -ENOSYS, 0))
 {
   gomp_futex_wait &= ~FUTEX_PRIVATE_FLAG;
   gomp_futex_wake &= ~FUTEX_PRIVATE_FLAG;


Re: [PATCH] adjust "partly out of bounds" warning (PR 98503)

2021-02-11 Thread Martin Sebor via Gcc-patches

On 2/11/21 1:09 AM, Richard Biener wrote:

On Wed, Feb 10, 2021 at 7:03 PM Martin Sebor  wrote:


On 2/10/21 3:39 AM, Richard Biener wrote:

On Tue, Feb 9, 2021 at 4:37 PM Martin Sebor  wrote:


On 2/9/21 12:41 AM, Richard Biener wrote:

On Tue, Feb 9, 2021 at 1:04 AM Martin Sebor via Gcc-patches
 wrote:


On 2/8/21 12:09 PM, Jeff Law wrote:



On 2/3/21 3:45 PM, Martin Sebor wrote:

On 2/3/21 2:57 PM, Jeff Law wrote:



On 1/28/21 4:03 PM, Martin Sebor wrote:

The GCC 11 -Warray-bounds enhancement to diagnose accesses whose
leading offset is in bounds but whose trailing offset is not has
been causing some confusion.  When the warning is issued for
an access to an in-bounds member via a pointer to a struct that's
larger than the pointed-to object, phrasing this strictly invalid
access in terms of array subscripts can be misleading, especially
when the source code doesn't involve any arrays or indexing.

Since the problem boils down to an aliasing violation much more
so than an actual out-of-bounds access, the attached patch adjusts
the code to issue a -Wstrict-aliasing warning in these cases instead
of -Warray-bounds.  In addition, since the aliasing assumptions in
GCC can be disabled by -fno-strict-aliasing, the patch also makes
these instances of the warning conditional on -fstrict-aliasing
being in effect.

Martin

gcc-98503.diff

PR middle-end/98503 -Warray-bounds when -Wstrict-aliasing would be
more appropriate

gcc/ChangeLog:

PR middle-end/98503
* gimple-array-bounds.cc (array_bounds_checker::check_mem_ref):
Issue -Wstrict-aliasing for a subset of violations.
(array_bounds_checker::check_array_bounds):  Set new member.
* gimple-array-bounds.h (array_bounds_checker::cref_of_mref): New
data member.

gcc/testsuite/ChangeLog:

PR middle-end/98503
* g++.dg/warn/Warray-bounds-10.C: Adjust text of expected warnings.
* g++.dg/warn/Warray-bounds-11.C: Same.
* g++.dg/warn/Warray-bounds-12.C: Same.
* g++.dg/warn/Warray-bounds-13.C: Same.
* gcc.dg/Warray-bounds-63.c: Avoid -Wstrict-aliasing.  Adjust text
of expected warnings.
* gcc.dg/Warray-bounds-66.c: Adjust text of expected warnings.
* gcc.dg/Wstrict-aliasing-2.c: New test.
* gcc.dg/Wstrict-aliasing-3.c: New test.

What I don't like here is the strict-aliasing warnings inside the file
and analysis phase for array bounds checking.

ISTM that catching this at cast time would be better.  So perhaps in
build_c_cast and the C++ equivalent?

It would mean we're warning at the cast site rather than the
dereference, but that may ultimately be better for the warning anyway.
I'm not sure.


I had actually experimented with a this (in the middle end, and only
for accesses) but even that caused so many warnings that I abandoned
it, at least for now.  -Warray-bounds has the benefit of flow analysis
(and dead code elimination).  In the front end it would have neither
and be both excessively noisy and ineffective at the same time.  For
example:

I think we agree that this really is an aliasing issue and I would go
further and claim that it has nothing to do with array bounds checking.
Therefore warning for it within gimple-array-bounds just seems wrong.

I realize that you like having DCE run and the ability to look at
offsets and such, but it really feels like the wrong place to do this.
Aliasing issues are also more of a front-end thing since the language
defines what is and what is not valid aliasing -- one might even argue
that any aliasing warning needs to be identified by the front-ends
exclusively (though possibly pruned away later).


The aliasing restrictions are on accesses, which are [defined in
C as] execution-time actions on objects.  Analyzing execution-time
actions unavoidably depends on flow analysis which the front ends
have only very limited support for (simple expressions only).
I gave one example showing how the current -Wstrict-aliasing in
the front end is ineffective against all but the most simplistic
bugs, but there are many more.  For instance:

  int i;
  void *p = // valid
  float *q = p;// valid
  *q = 0;  // aliasing violation

This bug is easily detectable in the middle end but impossible
to do in the front end (same as all other invalid accesses).


But the code is valid in GIMPLE which allows to re-use the 'int i' storage
with storing using a different type.


Presumably you're referring to using placement new?


No, I'm refering to the rules the GIMPLE IL follows.  GIMPLE is no
longer C or C++ and thus what is invalid in C or C++ doesn't
necessarily have to be invalid in GIMPLE.


   The warning
would have to be aware of it and either run before placement new
is inlined.  Alternatively, the inlining could add some annotation
into the IL that the warning would then use to differentiate valid
code from invalid.


At least in old versions of the C++ standard a simple "re-use" of

Re: [PATCH 4/4] c++: dependent constraint on placeholder 'auto' [PR96443]

2021-02-11 Thread Patrick Palka via Gcc-patches
On Thu, 11 Feb 2021, Jason Merrill wrote:

> On 2/8/21 2:03 PM, Patrick Palka wrote:
> > This fixes the way we check satisfaction of constraints on placeholder
> > types in various contexts, and in particular when the constraint is
> > dependent.
> > 
> > Firstly, when evaluating the return type requirement of a compound
> > requirement, we currently substitute the outer template arguments into
> > the constraint before checking satisfaction. But we should instead be
> > passing in the complete set of template arguments to satisfaction and
> > not do a prior separate substitution.  Our current approach leads to us
> > incorrectly rejecting the testcase concepts-return-req2.C below.
> > 
> > Secondly, when checking the constraints on a placeholder variable or
> > return type, we don't substitute the template arguments of the enclosing
> > context at all.  This leads to bogus errors during satisfaction when the
> > constraint is dependent as in the testcase concepts-placeholder3.C
> > below.
> > 
> > In order to fix these two issues, we need to be able to properly
> > normalize the constraints on a placeholder 'auto', which in turn
> > requires us to know the template parameters that were in-scope where an
> > 'auto' was introduced.  This information currently doesn't seem to be
> > easily available when we need it, so this patch adds an auxiliary hash
> > table that keeps track of the value of current_template_parms when each
> > constrained 'auto' was formed.
> > 
> > This patch also removes some seemingly wrong handling of placeholder
> > type arguments from tsubst_parameter_mapping.  The code doesn't trigger
> > with the example used in the comments, because type_uses_auto doesn't
> > look inside non-deduced contexts such as the operand of decltype.  And
> > the call to do_auto_deduction seems confused because if 'arg' is a type,
> > then so is 'parm', and therefore 'init' too is a type, but
> > do_auto_deduction expects it to be an expression.  Before this patch,
> > this code was dead (as far as our testsuite can tell), but now it breaks
> > other parts of this patch, so let's remove it.
> > 
> > gcc/cp/ChangeLog:
> > 
> > PR c++/96443
> > * constraint.cc (type_deducible_p): Don't substitute into the
> > constraints, and instead just pass 'args' to do_auto_deduction
> > as the outer template arguments.
> > (tsubst_parameter_mapping): Remove confused code for handling
> > placeholder type arguments.
> > (normalize_placeholder_type_constraint): Define.
> > (satisfy_constraint_expression): Use it to handle placeholder
> > 'auto' types.
> > * cp-tree.h (get_constrained_auto_context): Declare.
> > * pt.c (constrained_auto_context_map): Define.
> > (get_placeholder_type_constraint_context): Define.
> > (set_placeholder_type_constraints): Define.
> > (copy_placeholder_type_constraints): Define.
> > (tsubst) : Use
> > copy_placeholder_type_constraints.
> > (make_constrained_placeholder_type): Use
> > set_placeholder_type_constraints.
> > (do_auto_deduction): Clarify comments about the outer_targs
> > parameter.  Rework satisfaction of a placeholder type constraint
> > to pass in the complete set of template arguments directly to
> > constraints_satisfied_p.
> > (splice_late_return_type): Use copy_placeholder_type_constraints.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > PR c++/96443
> > * g++.dg/cpp2a/concepts-placeholder3.C: New test.
> > * g++.dg/cpp2a/concepts-return-req2.C: New test.
> > * g++.dg/concepts-ts1.C: Add dg-bogus directive to the call to
> > f15 that we expect to accept.
> > ---
> >   gcc/cp/constraint.cc  | 106 --
> >   gcc/cp/cp-tree.h  |   1 +
> >   gcc/cp/pt.c   | 101 +++--
> >   .../g++.dg/cpp2a/concepts-placeholder3.C  |  19 
> >   .../g++.dg/cpp2a/concepts-return-req2.C   |  13 +++
> >   gcc/testsuite/g++.dg/cpp2a/concepts-ts1.C |   2 +-
> >   6 files changed, 146 insertions(+), 96 deletions(-)
> >   create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-placeholder3.C
> >   create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-return-req2.C
> > 
> > diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
> > index 56134f8b2bf..53588047d44 100644
> > --- a/gcc/cp/constraint.cc
> > +++ b/gcc/cp/constraint.cc
> > @@ -2007,39 +2007,19 @@ type_deducible_p (tree expr, tree type, tree
> > placeholder, tree args,
> >references are preserved in the result.  */
> > expr = force_paren_expr_uneval (expr);
> >   -  /* Replace the constraints with the instantiated constraints. This
> > - substitutes args into any template parameters in the trailing
> > - result type.  */
> > -  tree saved_constr = PLACEHOLDER_TYPE_CONSTRAINTS (placeholder);
> > -  tree subst_constr
> > -= tsubst_constraint (saved_constr,
> > -args,
> > - 

Re: [PATCH v2] c++: Endless loop with targ deduction in member tmpl [PR95888]

2021-02-11 Thread Jason Merrill via Gcc-patches

On 2/11/21 3:06 PM, Marek Polacek wrote:

On Thu, Feb 11, 2021 at 02:24:22PM -0500, Marek Polacek via Gcc-patches wrote:

On Thu, Feb 11, 2021 at 11:30:07AM -0500, Jason Merrill via Gcc-patches wrote:

On 2/9/21 5:41 PM, Marek Polacek wrote:

My r10-7007 patch tweaked tsubst not to reduce the template level of
template parameters when tf_partial.  That caused infinite looping in
is_specialization_of: we ended up with a class template specialization
whose TREE_TYPE (CLASSTYPE_TI_TEMPLATE (t)) == t, so the second for
loop in is_specialization_of never finished.

There's a lot going on in this test, but essentially: the template fn
here has two template parameters, we call it with one explicitly
provided, the other one has to be deduced.  So we'll find ourselves
in fn_type_unification which uses tf_partial when tsubsting the
*explicit* template arguments into the function type.  That leads to
tsubstituting the return type, C.  C is a member template; its
most general template is

template template struct B::C

we figure out (tsubst_template_args) that the template argument list
is .  They come from different levels, one comes from B,
the other one from fn.

So now we lookup_template_class to see if we have C.  We
do the
/* This is a full instantiation of a member template.  Find
   the partial instantiation of which this is an instance.  */
TREE_VEC_LENGTH (arglist)--;
// arglist is now , not 
found = tsubst (gen_tmpl, arglist, complain, NULL_TREE);
TREE_VEC_LENGTH (arglist)++;

magic which is looking for the partial instantiation, in this case,
that would be template struct B::C.  Note we're still
in a tf_partial context!  So the tsubst_template_args in the tsubst
(which tries to substitute  into ) returns , but
V's template level hasn't been reduced!  After tsubst_template_args,
tsubst_template_decl looks to see if we already have this specialization:

// t = template_decl C
// full_args = 
spec = retrieve_specialization (t, full_args, hash);

but doesn't find the one we created a while ago, when processing
B b; in the test, because V's levels don't match.  Whereupon
tsubst_template_decl creates a new TEMPLATE_DECL, one that leads to
the infinite looping problem.

I think let's clear tf_partial when looking for an existing partial
instantiation.

It also occurred to me that I should be able to trigger a similar
problem with 'auto', since r10-7007 removed an is_auto check.  And lo,
I constructed deduce10.C which exhibits the same issue with pre-r10-7007
compilers.  This patch fixes that problem as well.  I'm ecstatic.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/10?  Also
built cmcstl2.


Perhaps the mistake here is using the complain parm at all; this
substitution is not in the "immediate context" of deduction.  Either tf_none
or tf_warning_or_error should be fine, as we know this substitution has
previously succeeded.


Yeah, that makes sense to me too:

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/10?

-- >8 --
My r10-7007 patch tweaked tsubst not to reduce the template level of
template parameters when tf_partial.  That caused infinite looping in
is_specialization_of: we ended up with a class template specialization
whose TREE_TYPE (CLASSTYPE_TI_TEMPLATE (t)) == t, so the second for
loop in is_specialization_of never finished.

There's a lot going on in this test, but essentially: the template fn
here has two template parameters, we call it with one explicitly
provided, the other one has to be deduced.  So we'll find ourselves
in fn_type_unification which uses tf_partial when tsubsting the
*explicit* template arguments into the function type.  That leads to
tsubstituting the return type, C.  C is a member template; its
most general template is

   template template struct B::C

we figure out (tsubst_template_args) that the template argument list
is .  They come from different levels, one comes from B,
the other one from fn.

So now we lookup_template_class to see if we have C.  We
do the
   /* This is a full instantiation of a member template.  Find
  the partial instantiation of which this is an instance.  */
   TREE_VEC_LENGTH (arglist)--;
   // arglist is now , not 
   found = tsubst (gen_tmpl, arglist, complain, NULL_TREE);
   TREE_VEC_LENGTH (arglist)++;

magic which is looking for the partial instantiation, in this case,
that would be template struct B::C.  Note we're still
in a tf_partial context!  So the tsubst_template_args in the tsubst
(which tries to substitute  into ) returns , but
V's template level hasn't been reduced!  After tsubst_template_args,
tsubst_template_decl looks to see if we already have this specialization:

   // t = template_decl C
   // full_args = 
   spec = retrieve_specialization (t, full_args, hash);

but doesn't find the one we created a while ago, when processing
B b; in the test, because V's levels don't match.  Whereupon
tsubst_template_decl creates a new TEMPLATE_DECL, one that leads to

New Swedish PO file for 'cpplib' (version 11.1-b20210207)

2021-02-11 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'cpplib' has been submitted
by the Swedish team of translators.  The file is available at:

https://translationproject.org/latest/cpplib/sv.po

(This file, 'cpplib-11.1-b20210207.sv.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

https://translationproject.org/latest/cpplib/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

https://translationproject.org/domain/cpplib.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.




Contents of PO file 'cpplib-11.1-b20210207.sv.po'

2021-02-11 Thread Translation Project Robot


cpplib-11.1-b20210207.sv.po.gz
Description: Binary data
The Translation Project robot, in the
name of your translation coordinator.



[PATCH, pushed] rs6000: Fix invalid address used in MMA built-in function

2021-02-11 Thread Peter Bergner via Gcc-patches
The mma_assemble_input_operand predicate is too lenient on the memory
operands it will accept, leading to an ICE when illegitimate addresses
are passed in.  The solution is to only accept memory operands with
addresses that are valid for quad word memory accesses.  The test case
is a minimized test case from the Eigen library.  The creduced test case
is very noisy with respect to warnings, so the test case has added -w to
silence them.

Thanks to Jakub for diagnosing the bug!

This passed bootstrap and regtesting with no regressions on powerpc64le-linux.
Approved offline by Segher.  Committed and pushed.

Peter


gcc/
PR target/99041
* config/rs6000/predicates.md (mma_assemble_input_operand): Restrict
memory addresses that are legal for quad word accesses.

gcc/testsuite/
PR target/99041
* g++.target/powerpc/pr99041.C: New test.

diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index 76328ecff3d..bd26c62b3a4 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -1156,7 +1156,9 @@
 ;; Return 1 if this operand is valid for a MMA assemble accumulator insn.
 (define_special_predicate "mma_assemble_input_operand"
   (match_test "(mode == V16QImode
-   && (vsx_register_operand (op, mode) || MEM_P (op)))"))
+   && (vsx_register_operand (op, mode)
+   || (MEM_P (op)
+   && quad_address_p (XEXP (op, 0), mode, false"))
 
 ;; Return 1 if this operand is valid for an MMA disassemble insn.
 (define_predicate "mma_disassemble_output_operand"
diff --git a/gcc/testsuite/g++.target/powerpc/pr99041.C 
b/gcc/testsuite/g++.target/powerpc/pr99041.C
new file mode 100644
index 000..c83f9806570
--- /dev/null
+++ b/gcc/testsuite/g++.target/powerpc/pr99041.C
@@ -0,0 +1,84 @@
+/* PR target/99041 */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=power10 -w" } */
+
+/* Verify we do not ICE on the following source.  */
+
+long a, b, c, d, e;
+ enum { f };
+ enum { g, aa };
+ enum { h };
+ template < typename > struct ai;
+ template < typename > struct ad;
+ template < typename, int, int ah, int = 0, int = f, int = ah > class ab;
+ template < typename > class ar;
+ template < typename > class ak;
+ template < typename, int, int = 1, bool = false > class aj;
+ template < typename > class an;
+ template < typename, typename > class al;
+ template < typename, typename, int = h > class am;
+ template < typename, unsigned > class ao;
+ template < typename > struct ap;
+ template < typename aq > struct av { typedef ar< aq > ac; };
+ template < typename > struct as {   typedef av< am< al< int, an< aj< ab< 
double, 5, 2 >, false > > >, int > >::ac   ac; };
+ template < typename at > at au(const typename ap< at >::ac *);
+ template < typename at, int > at cc(typename ap< at >::ac *aw) {   return au< 
at >(aw); }
+ typedef __attribute__((altivec(vector__))) double ax;
+ template <> struct ap< ax > { typedef double ac; };
+ template <> ax au(const double *aw) { return __builtin_vec_vsx_ld(0, aw); }
+ template < typename > struct ay {};
+ template < typename aq > class ae : public ad< aq > { public:   typedef 
typename ai< aq >::ba ba; };
+ template < typename aq > class ar : public ae< aq > { public:   ak< aq > 
bk(); };
+ template < typename > struct ad {};
+ template < int > class bc;
+ template < typename bd, typename bf, int bg > struct ai< am< bd, bf, bg > > { 
  typedef typename bd::ba ba; };
+ template < typename bh, typename bj, int bg > class am : public bc< bg > { 
public:   bh bu();   bj k(); };
+ template < int bg > class bc : public as< am< int, int, bg > >::ac {};
+ template < typename, typename, typename > struct l;
+ template < typename m, typename bl, typename ag > void n(m bm, bl bn, ag bo) 
{   l< m, bl, ag >::bz(bm, bn, bo); }
+ template < typename, typename, typename, typename > struct bp;
+ class o { public:   o(double *, long);   double ()(long i, long j) { 
return p[aa ? j + i * w : w]; }   template < typename q, int t > q br(long i, 
long j) { double  = operator()(i, j); return cc< q, t >();   }   
double *p;   long w; };
+ class bt : public o { public:   bt(double *cf, long bv) : o(cf, bv) {} };
+ struct bw {   enum { bx }; };
+ template < typename bq > class ak { public:   template < typename by > void 
operator=(by) { am< al< const an< const aj< aj< ab< double, 5, 5, 2 >, -1, 
-1 >, -1 > >, const an< const aj< aj< ab< double, 5, 5, 2 >, -1, -1 
>, -1 > > >, ao< aj< ab< double, 5, 0 >, 1 >, 0 > > ck; 
n(ca, ck, ay< typename by::ba >());   }   bq ca; };
+ template < typename aq > class dn : public av< aq >::ac {};
+ template < typename cd, int af, int ah, int ce, int co, int cg > struct ai< 
ab< cd, af, ah, ce, co, cg > > {   typedef cd ba;   enum { bs }; };
+ template < typename, int, int, int, int co, int cg > class ab : public dn< 
ab< int, co, cg > > { public:  

Re: [PATCH v2] c++: Endless loop with targ deduction in member tmpl [PR95888]

2021-02-11 Thread Marek Polacek via Gcc-patches
On Thu, Feb 11, 2021 at 02:24:22PM -0500, Marek Polacek via Gcc-patches wrote:
> On Thu, Feb 11, 2021 at 11:30:07AM -0500, Jason Merrill via Gcc-patches wrote:
> > On 2/9/21 5:41 PM, Marek Polacek wrote:
> > > My r10-7007 patch tweaked tsubst not to reduce the template level of
> > > template parameters when tf_partial.  That caused infinite looping in
> > > is_specialization_of: we ended up with a class template specialization
> > > whose TREE_TYPE (CLASSTYPE_TI_TEMPLATE (t)) == t, so the second for
> > > loop in is_specialization_of never finished.
> > > 
> > > There's a lot going on in this test, but essentially: the template fn
> > > here has two template parameters, we call it with one explicitly
> > > provided, the other one has to be deduced.  So we'll find ourselves
> > > in fn_type_unification which uses tf_partial when tsubsting the
> > > *explicit* template arguments into the function type.  That leads to
> > > tsubstituting the return type, C.  C is a member template; its
> > > most general template is
> > > 
> > >template template struct B::C
> > > 
> > > we figure out (tsubst_template_args) that the template argument list
> > > is .  They come from different levels, one comes from B,
> > > the other one from fn.
> > > 
> > > So now we lookup_template_class to see if we have C.  We
> > > do the
> > >/* This is a full instantiation of a member template.  Find
> > >   the partial instantiation of which this is an instance.  */
> > >TREE_VEC_LENGTH (arglist)--;
> > >// arglist is now , not 
> > >found = tsubst (gen_tmpl, arglist, complain, NULL_TREE);
> > >TREE_VEC_LENGTH (arglist)++;
> > > 
> > > magic which is looking for the partial instantiation, in this case,
> > > that would be template struct B::C.  Note we're still
> > > in a tf_partial context!  So the tsubst_template_args in the tsubst
> > > (which tries to substitute  into ) returns , but
> > > V's template level hasn't been reduced!  After tsubst_template_args,
> > > tsubst_template_decl looks to see if we already have this specialization:
> > > 
> > >// t = template_decl C
> > >// full_args = 
> > >spec = retrieve_specialization (t, full_args, hash);
> > > 
> > > but doesn't find the one we created a while ago, when processing
> > > B b; in the test, because V's levels don't match.  Whereupon
> > > tsubst_template_decl creates a new TEMPLATE_DECL, one that leads to
> > > the infinite looping problem.
> > > 
> > > I think let's clear tf_partial when looking for an existing partial
> > > instantiation.
> > > 
> > > It also occurred to me that I should be able to trigger a similar
> > > problem with 'auto', since r10-7007 removed an is_auto check.  And lo,
> > > I constructed deduce10.C which exhibits the same issue with pre-r10-7007
> > > compilers.  This patch fixes that problem as well.  I'm ecstatic.
> > > 
> > > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/10?  Also
> > > built cmcstl2.
> > 
> > Perhaps the mistake here is using the complain parm at all; this
> > substitution is not in the "immediate context" of deduction.  Either tf_none
> > or tf_warning_or_error should be fine, as we know this substitution has
> > previously succeeded.
> 
> Yeah, that makes sense to me too:
> 
> Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/10?
> 
> -- >8 --
> My r10-7007 patch tweaked tsubst not to reduce the template level of
> template parameters when tf_partial.  That caused infinite looping in
> is_specialization_of: we ended up with a class template specialization
> whose TREE_TYPE (CLASSTYPE_TI_TEMPLATE (t)) == t, so the second for
> loop in is_specialization_of never finished.
> 
> There's a lot going on in this test, but essentially: the template fn
> here has two template parameters, we call it with one explicitly
> provided, the other one has to be deduced.  So we'll find ourselves
> in fn_type_unification which uses tf_partial when tsubsting the
> *explicit* template arguments into the function type.  That leads to
> tsubstituting the return type, C.  C is a member template; its
> most general template is
> 
>   template template struct B::C
> 
> we figure out (tsubst_template_args) that the template argument list
> is .  They come from different levels, one comes from B,
> the other one from fn.
> 
> So now we lookup_template_class to see if we have C.  We
> do the
>   /* This is a full instantiation of a member template.  Find
>  the partial instantiation of which this is an instance.  */
>   TREE_VEC_LENGTH (arglist)--;
>   // arglist is now , not 
>   found = tsubst (gen_tmpl, arglist, complain, NULL_TREE);
>   TREE_VEC_LENGTH (arglist)++;
> 
> magic which is looking for the partial instantiation, in this case,
> that would be template struct B::C.  Note we're still
> in a tf_partial context!  So the tsubst_template_args in the tsubst
> (which tries to substitute  into ) returns , but
> V's template level hasn't been reduced!  After tsubst_template_args,
> 

Re: [PATCH v2] c++: Endless loop with targ deduction in member tmpl [PR95888]

2021-02-11 Thread Marek Polacek via Gcc-patches
On Thu, Feb 11, 2021 at 11:30:07AM -0500, Jason Merrill via Gcc-patches wrote:
> On 2/9/21 5:41 PM, Marek Polacek wrote:
> > My r10-7007 patch tweaked tsubst not to reduce the template level of
> > template parameters when tf_partial.  That caused infinite looping in
> > is_specialization_of: we ended up with a class template specialization
> > whose TREE_TYPE (CLASSTYPE_TI_TEMPLATE (t)) == t, so the second for
> > loop in is_specialization_of never finished.
> > 
> > There's a lot going on in this test, but essentially: the template fn
> > here has two template parameters, we call it with one explicitly
> > provided, the other one has to be deduced.  So we'll find ourselves
> > in fn_type_unification which uses tf_partial when tsubsting the
> > *explicit* template arguments into the function type.  That leads to
> > tsubstituting the return type, C.  C is a member template; its
> > most general template is
> > 
> >template template struct B::C
> > 
> > we figure out (tsubst_template_args) that the template argument list
> > is .  They come from different levels, one comes from B,
> > the other one from fn.
> > 
> > So now we lookup_template_class to see if we have C.  We
> > do the
> >/* This is a full instantiation of a member template.  Find
> >   the partial instantiation of which this is an instance.  */
> >TREE_VEC_LENGTH (arglist)--;
> >// arglist is now , not 
> >found = tsubst (gen_tmpl, arglist, complain, NULL_TREE);
> >TREE_VEC_LENGTH (arglist)++;
> > 
> > magic which is looking for the partial instantiation, in this case,
> > that would be template struct B::C.  Note we're still
> > in a tf_partial context!  So the tsubst_template_args in the tsubst
> > (which tries to substitute  into ) returns , but
> > V's template level hasn't been reduced!  After tsubst_template_args,
> > tsubst_template_decl looks to see if we already have this specialization:
> > 
> >// t = template_decl C
> >// full_args = 
> >spec = retrieve_specialization (t, full_args, hash);
> > 
> > but doesn't find the one we created a while ago, when processing
> > B b; in the test, because V's levels don't match.  Whereupon
> > tsubst_template_decl creates a new TEMPLATE_DECL, one that leads to
> > the infinite looping problem.
> > 
> > I think let's clear tf_partial when looking for an existing partial
> > instantiation.
> > 
> > It also occurred to me that I should be able to trigger a similar
> > problem with 'auto', since r10-7007 removed an is_auto check.  And lo,
> > I constructed deduce10.C which exhibits the same issue with pre-r10-7007
> > compilers.  This patch fixes that problem as well.  I'm ecstatic.
> > 
> > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/10?  Also
> > built cmcstl2.
> 
> Perhaps the mistake here is using the complain parm at all; this
> substitution is not in the "immediate context" of deduction.  Either tf_none
> or tf_warning_or_error should be fine, as we know this substitution has
> previously succeeded.

Yeah, that makes sense to me too:

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/10?

-- >8 --
My r10-7007 patch tweaked tsubst not to reduce the template level of
template parameters when tf_partial.  That caused infinite looping in
is_specialization_of: we ended up with a class template specialization
whose TREE_TYPE (CLASSTYPE_TI_TEMPLATE (t)) == t, so the second for
loop in is_specialization_of never finished.

There's a lot going on in this test, but essentially: the template fn
here has two template parameters, we call it with one explicitly
provided, the other one has to be deduced.  So we'll find ourselves
in fn_type_unification which uses tf_partial when tsubsting the
*explicit* template arguments into the function type.  That leads to
tsubstituting the return type, C.  C is a member template; its
most general template is

  template template struct B::C

we figure out (tsubst_template_args) that the template argument list
is .  They come from different levels, one comes from B,
the other one from fn.

So now we lookup_template_class to see if we have C.  We
do the
  /* This is a full instantiation of a member template.  Find
 the partial instantiation of which this is an instance.  */
  TREE_VEC_LENGTH (arglist)--;
  // arglist is now , not 
  found = tsubst (gen_tmpl, arglist, complain, NULL_TREE);
  TREE_VEC_LENGTH (arglist)++;

magic which is looking for the partial instantiation, in this case,
that would be template struct B::C.  Note we're still
in a tf_partial context!  So the tsubst_template_args in the tsubst
(which tries to substitute  into ) returns , but
V's template level hasn't been reduced!  After tsubst_template_args,
tsubst_template_decl looks to see if we already have this specialization:

  // t = template_decl C
  // full_args = 
  spec = retrieve_specialization (t, full_args, hash);

but doesn't find the one we created a while ago, when processing
B b; in the test, because V's levels don't 

Re: [PATCH] libgomp/i386: Move syscall asms to static inline wrapper.

2021-02-11 Thread Jakub Jelinek via Gcc-patches
On Thu, Feb 11, 2021 at 07:44:53PM +0100, Uros Bizjak wrote:
> 2021-02-11  Uroš Bizjak  
> 
> libgomp/
> * config/linux/x86/futex.h (__futex_wait): New static inline
> wrapper function.  Correct output type to int and
> timeout type to void *.
> (__futex_wake): New static inline wrapper function.
> Correct output type to int.
> (futex_wait): Use __futex_wait.
> (futex_wake): Use __futex_wake.
> 
> Uros.

> diff --git a/libgomp/config/linux/x86/futex.h 
> b/libgomp/config/linux/x86/futex.h
> index 1e01b681401..1e5f9d4357d 100644
> --- a/libgomp/config/linux/x86/futex.h
> +++ b/libgomp/config/linux/x86/futex.h
> -static inline void
> -futex_wait (int *addr, int val)
> +static inline int
> +__futex_wait (int *uaddr, int futex_op, int val)

Please call the argument just addr rather than uaddr, there is no need
to stress that it is a user address, all pointers outside of kernel are user
addresses.  And addr is consistent with all the other futex.h headers.

Ok for trunk with those changes.

Jakub



[PATCH] libgomp/i386: Move syscall asms to static inline wrapper.

2021-02-11 Thread Uros Bizjak via Gcc-patches
Move syscall asms to static inline wrapper functions to improve #ifdeffery.
Also correct output type to int and timeout type to void *.

No functional changes.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

OK for mainline?

2021-02-11  Uroš Bizjak  

libgomp/
* config/linux/x86/futex.h (__futex_wait): New static inline
wrapper function.  Correct output type to int and
timeout type to void *.
(__futex_wake): New static inline wrapper function.
Correct output type to int.
(futex_wait): Use __futex_wait.
(futex_wake): Use __futex_wake.

Uros.
diff --git a/libgomp/config/linux/x86/futex.h b/libgomp/config/linux/x86/futex.h
index 1e01b681401..1e5f9d4357d 100644
--- a/libgomp/config/linux/x86/futex.h
+++ b/libgomp/config/linux/x86/futex.h
@@ -30,99 +30,92 @@
 #  define SYS_futex202
 # endif
 
-static inline void
-futex_wait (int *addr, int val)
+static inline int
+__futex_wait (int *uaddr, int futex_op, int val)
 {
-  long res;
+  int res;
 
-  register long r10 __asm__("%r10") = 0;
+  register void *timeout __asm ("r10") = NULL;
   __asm volatile ("syscall"
  : "=a" (res)
- : "0" (SYS_futex), "D" (addr), "S" (gomp_futex_wait),
-   "d" (val), "r" (r10)
+ : "0" (SYS_futex), "D" (uaddr), "S" (futex_op),
+   "d" (val), "r" (timeout)
  : "r11", "rcx", "memory");
-  if (__builtin_expect (res == -ENOSYS, 0))
-{
-  gomp_futex_wait &= ~FUTEX_PRIVATE_FLAG;
-  gomp_futex_wake &= ~FUTEX_PRIVATE_FLAG;
-  __asm volatile ("syscall"
- : "=a" (res)
- : "0" (SYS_futex), "D" (addr), "S" (gomp_futex_wait),
-   "d" (val), "r" (r10)
- : "r11", "rcx", "memory");
-}
+  return res;
 }
 
-static inline void
-futex_wake (int *addr, int count)
+static inline int
+__futex_wake (int *uaddr, int futex_op, int count)
 {
-  long res;
+  int res;
 
   __asm volatile ("syscall"
  : "=a" (res)
- : "0" (SYS_futex), "D" (addr), "S" (gomp_futex_wake),
+ : "0" (SYS_futex), "D" (uaddr), "S" (futex_op),
"d" (count)
  : "r11", "rcx", "memory");
-  if (__builtin_expect (res == -ENOSYS, 0))
-{
-  gomp_futex_wait &= ~FUTEX_PRIVATE_FLAG;
-  gomp_futex_wake &= ~FUTEX_PRIVATE_FLAG;
-  __asm volatile ("syscall"
- : "=a" (res)
- : "0" (SYS_futex), "D" (addr), "S" (gomp_futex_wake),
-   "d" (count)
- : "r11", "rcx", "memory");
-}
+  return res;
 }
 #else
 # ifndef SYS_futex
 #  define SYS_futex240
 # endif
 
-static inline void
-futex_wait (int *addr, int val)
+static inline int
+__futex_wait (int *uaddr, int futex_op, int val)
+{
+  int res;
+
+  void *timeout = NULL;
+  __asm volatile ("int $0x80"
+ : "=a" (res)
+ : "0" (SYS_futex), "b" (uaddr), "c" (futex_op),
+   "d" (val), "S" (timeout)
+ : "memory");
+  return res;
+}
+
+static inline int
+__futex_wake (int *uaddr, int futex_op, int count)
 {
-  long res;
+  int res;
 
   __asm volatile ("int $0x80"
  : "=a" (res)
- : "0" (SYS_futex), "b" (addr), "c" (gomp_futex_wait),
-   "d" (val), "S" (0)
+ : "0" (SYS_futex), "b" (uaddr), "c" (futex_op),
+   "d" (count)
  : "memory");
+  return res;
+}
+#endif /* __x86_64__ */
+
+static inline void
+futex_wait (int *addr, int val)
+{
+  int res = __futex_wait (addr, gomp_futex_wait, val);
+
   if (__builtin_expect (res == -ENOSYS, 0))
 {
   gomp_futex_wait &= ~FUTEX_PRIVATE_FLAG;
   gomp_futex_wake &= ~FUTEX_PRIVATE_FLAG;
-  __asm volatile ("int $0x80"
- : "=a" (res)
- : "0" (SYS_futex), "b" (addr), "c" (gomp_futex_wait),
-   "d" (val), "S" (0)
- : "memory");
+
+  __futex_wait (addr, gomp_futex_wait, val);
 }
 }
 
 static inline void
 futex_wake (int *addr, int count)
 {
-  long res;
+  int res = __futex_wake (addr, gomp_futex_wake, count);
 
-  __asm volatile ("int $0x80"
- : "=a" (res)
- : "0" (SYS_futex), "b" (addr), "c" (gomp_futex_wake),
-   "d" (count)
- : "memory");
   if (__builtin_expect (res == -ENOSYS, 0))
 {
   gomp_futex_wait &= ~FUTEX_PRIVATE_FLAG;
   gomp_futex_wake &= ~FUTEX_PRIVATE_FLAG;
-  __asm volatile ("int $0x80"
- : "=a" (res)
- : "0" (SYS_futex), "b" (addr), "c" (gomp_futex_wake),
-   "d" (count)
- : "memory");
+
+  __futex_wake (addr, gomp_futex_wake, count);
 }
 }
-#endif /* __x86_64__ */
 
 static inline void
 cpu_relax (void)


Fwd: Re: [PATCH] plug memory leaks in warn_parm_array_mismatch (PR 99055)

2021-02-11 Thread Martin Sebor via Gcc-patches

On 2/11/21 12:59 AM, Richard Biener wrote:

On Wed, Feb 10, 2021 at 6:16 PM Martin Sebor  wrote:


The attached patch replaces calls to print_generic_expr_to_str() with
a helper function that returns a std::string and releases the caller
from the responsibility to explicitly free memory.


I don't like this.  What's the reason to use generic_expr_as_string
anyway?  We have %E pretty-printers after all.


[Reposting; my first reply was too big.]

The VLA bound is either an expression or the asterisk (*) when newbnd
is null.  The reason for using the function is to avoid duplicating
the warning call and making one with %E and another with "*".


Couldn't you have
fixed the leak by doing

if (newbnd)
   free (newbndstr);

etc.?


Yes, I could have.  I considered it and chose not to because this
is much simpler, safer (if newbnd were to change after newbndstr
is set), and more in the "C++ spirit" (RAII).  This code already
uses std::string (attr_access::array_as_string) and so my change
is in line with it.

I understand that you prefer a more C-ish coding style so if you
consider it a prerequisite for accepting this fix I can make
the change conform to it.  See the attached revision (although
I hope you'll see why I chose not to go this route).

For what it's worth, print_generic_expr_to_str() would be improved
by returning std::string.  It would mean including  in
most sources in the compiler, which I suspect may not be a popular
suggestion with everyone here, and which is why I didn't make it
to begin with.  But if you're open to it I'm happy to make that
change either for GCC 12 or even now.




With the patch installed, Valgrind shows more leaks in this code that
I'm not sure what to do about:

1) A tree built by build_type_attribute_qual_variant() called from
 attr_access::array_as_string() to build a temporary type only
 for the purposes of formatting it.

2) A tree (an attribute list) built by tree_cons() called from
 build_attr_access_from_parms() that's used only for the duration
 of the caller.

Do these temporary trees need to be released somehow or are the leaks
expected?


You should configure GCC with --enable-valgrind-annotations to make
it aware of our GC.


I did configure with that option:

$ /src/gcc/master/configure --enable-checking=yes 
--enable-languages=all,jit,lto --enable-host-shared 
--enable-valgrind-annotations


Attached to pr99055 is my valgrind output for gcc.dg/Wvla-parameter.c
with the top of trunk and the patch applied:

$ /build/gcc-master/gcc/xgcc -B /build/gcc-master/gcc -S -Wall 
/src/gcc/master/gcc/testsuite/gcc.dg/Wvla-parameter.c -wrapper 
valgrind,--leak-check=full,--show-leak-kinds=all,--track-origins=yes,--log-file=valgrind-out.txt


Do you not see the same leaks?

Martin

PR c/99055 - memory leak in warn_parm_array_mismatch

gcc/c-family/ChangeLog:

	PR c/99055
	* c-warn.c (warn_parm_array_mismatch): Free strings returned from
	print_generic_expr_to_str.

gcc/ChangeLog:

	* tree-pretty-print.c (print_generic_expr_to_str): Update comment.

diff --git a/gcc/c-family/c-warn.c b/gcc/c-family/c-warn.c
index e6e28d9b139..27c7baa61bb 100644
--- a/gcc/c-family/c-warn.c
+++ b/gcc/c-family/c-warn.c
@@ -3636,6 +3636,10 @@ warn_parm_array_mismatch (location_t origloc, tree fndecl, tree newparms)
 		inform (, "previously declared as %s with bound "
 			"%<%s%>", curparmstr.c_str (), curbndstr);
 
+		  if (newbnd)
+		free (const_cast (newbndstr));
+		  if (curbnd)
+		free (const_cast (curbndstr));
 		  continue;
 		}
 	}
@@ -3646,7 +3650,13 @@ warn_parm_array_mismatch (location_t origloc, tree fndecl, tree newparms)
 		 Compare them lexicographically to detect gross mismatches
 		 such as between T[foo()] and T[bar()].  */
 	  if (operand_equal_p (newbnd, curbnd, OEP_LEXICOGRAPHIC))
-		continue;
+		{
+		  if (newbnd)
+		free (const_cast (newbndstr));
+		  if (curbnd)
+		free (const_cast (curbndstr));
+		  continue;
+		}
 
 	  if (warning_at (newloc, OPT_Wvla_parameter,
 			  "argument %u of type %s declared with "
@@ -3654,8 +3664,18 @@ warn_parm_array_mismatch (location_t origloc, tree fndecl, tree newparms)
 			  parmpos + 1, newparmstr.c_str (), newbndstr))
 		inform (origloc, "previously declared as %s with bound %qs",
 			curparmstr.c_str (), curbndstr);
+
+	  if (newbnd)
+		free (const_cast (newbndstr));
+	  if (curbnd)
+		free (const_cast (curbndstr));
 	  continue;
 	}
+
+	  if (newbnd)
+	free (const_cast (newbndstr));
+	  if (curbnd)
+	free (const_cast (curbndstr));
 	}
 
   if (newa->minsize == cura->minsize
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index aabe6bb23b9..986f75d1d5f 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -169,7 +169,8 @@ print_generic_expr (FILE *file, tree t, dump_flags_t flags)
   pp_flush (tree_pp);
 }
 
-/* Print a single expression T to string, and return it.  */
+/* Print a single expression T to string, and 

[Patch] Fortran: Fix rank of assumed-rank array [PR99043]

2021-02-11 Thread Tobias Burnus

In the Fortran standard, I think it is best explained
in the description of the RANK intrinsic:

"Example. If X is an assumed-rank dummy argument and
its associated effective argument is an array of rank,
RANK(X) has the value 3."

That's already well tested in assumed_rank_16.f90;
however, as the PR shows, this should not be reset
to "-1" when passing it further on as actual argument to
another assumed-rank dummy argument.

OK for mainline?
Reported against GCC 10, not a regression but simple wrong-code fix;
does it make sense to apply there was well?

Tobias

-
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München 
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank 
Thürauf
Fortran: Fix rank of assumed-rank array [PR99043]

gcc/fortran/ChangeLog:

	PR fortran/99043
	* trans-expr.c (gfc_conv_procedure_call): Don't reset
	rank of assumed-rank array.

gcc/testsuite/ChangeLog:

	PR fortran/99043
	* gfortran.dg/assumed_rank_20.f90: New test.

 gcc/fortran/trans-expr.c  |  5 ++--
 gcc/testsuite/gfortran.dg/assumed_rank_20.f90 | 36 +++
 2 files changed, 39 insertions(+), 2 deletions(-)

diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c
index b0c8d577ca5..103cb31c664 100644
--- a/gcc/fortran/trans-expr.c
+++ b/gcc/fortran/trans-expr.c
@@ -6403,9 +6403,10 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
 
 	  /* Unallocated allocatable arrays and unassociated pointer arrays
 		 need their dtype setting if they are argument associated with
-		 assumed rank dummies.  */
+		 assumed rank dummies, unless already assumed rank.  */
 	  if (!sym->attr.is_bind_c && e && fsym && fsym->as
-		  && fsym->as->type == AS_ASSUMED_RANK)
+		  && fsym->as->type == AS_ASSUMED_RANK
+		  && e->rank != -1)
 		{
 		  if (gfc_expr_attr (e).pointer
 		  || gfc_expr_attr (e).allocatable)
diff --git a/gcc/testsuite/gfortran.dg/assumed_rank_20.f90 b/gcc/testsuite/gfortran.dg/assumed_rank_20.f90
new file mode 100644
index 000..10ad1fc8e89
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/assumed_rank_20.f90
@@ -0,0 +1,36 @@
+! { dg-do run }
+!
+! PR fortran/99043
+!
+module assumed_rank_module
+implicit none
+private
+
+public :: rank_of_pointer_level1
+contains
+subroutine rank_of_pointer_level1(ap,aa)
+real, dimension(..), intent(in), pointer :: ap
+real, dimension(..), intent(in), allocatable :: aa
+if (rank(ap) /= 3) stop 1
+if (rank(aa) /= 3) stop 2
+call rank_of_pointer_level2(ap, aa)
+end subroutine rank_of_pointer_level1
+
+subroutine rank_of_pointer_level2(ap,aa)
+real, dimension(..), intent(in), pointer :: ap
+real, dimension(..), intent(in), allocatable :: aa
+
+if (rank(ap) /= 3) stop 3
+if (rank(aa) /= 3) stop 4
+end subroutine rank_of_pointer_level2
+end module assumed_rank_module
+
+program assumed_rank
+use :: assumed_rank_module, only : rank_of_pointer_level1
+implicit none
+real, dimension(:,:,:), pointer :: ap
+real, dimension(:,:,:), allocatable :: aa
+
+ap => null()
+call rank_of_pointer_level1(ap, aa)
+end program assumed_rank


[PATCH][AArch64] Leveraging the use of STP instruction for vec_duplicate

2021-02-11 Thread Victor Do Nascimento via Gcc-patches
Dear GCC community,

The backend pattern for storing a pair of identical values in 32 and 64-bit 
modes with the machine instruction STP was missing, and multiple instructions 
were needed to reproduce this behavior as a result of failed RTL pattern match 
in combine pass. 

For the test case :

typedef long long v2di __attribute__((vector_size (16))); typedef int v2si 
__attribute__((vector_size (8)));

void
foo (v2di *x, long long a)
{
  v2di tmp = {a, a};
  *x = tmp;
}

void
foo2 (v2si *x, int a)
{
  v2si tmp = {a, a};
  *x = tmp;
}

at -O2 on aarch64 gives:

foo:
    stp x1, x1, [x0]
    ret
foo2:
    stp w1, w1, [x0]
    ret

instead of:

foo:
    dup v0.2d, x1
    str q0, [x0]
    ret
foo2:
    dup v0.2s, w1
    str d0, [x0]
    ret
    
Added new RTL template, unittest and checked for regressions on bootstrapped 
aarch64-none-linux-gnu.

gcc/ChangeLog

2021-02-04 victor Do Nascimento 

* config/aarch64/aarch64-simd.md: Implement RTX pattern for
mapping 'vec_duplicate' RTX onto 'STP' ASM insn.
* config/aarch64/iterators.md: Implement ldpstp_vel_sz iterator
to map STP/LDP vector element mode to correct suffix in 
attribute type definition of aarch64_simd_stp pattern.
    
gcc/testsuite/ChangeLog
    
2021-02-04 Victor Do Nascimento 

* gcc.target/stp_vec-dup_32_64-1.c: New.

Regards,
Victor

diff --git a/gcc/config/aarch64/aarch64-simd.md 
b/gcc/config/aarch64/aarch64-simd.md
index 68baf416045..4623cbb95f4 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -205,6 +205,16 @@
   [(set_attr "type" "neon_stp")]
 )
 
+(define_insn "aarch64_simd_stp"
+[(set (match_operand:VP_2E 0 "aarch64_mem_pair_operand" "=Ump,Ump")
+  (vec_duplicate:VP_2E (match_operand: 1 "register_operand" "w,r")))]
+"TARGET_SIMD"
+"@
+ stp\\t%1, %1, %z0
+ stp\\t%1, %1, %z0"
+[(set_attr "type" "neon_stp, store_")]
+)
+
 (define_insn "load_pair"
   [(set (match_operand:VQ 0 "register_operand" "=w")
(match_operand:VQ 1 "aarch64_mem_pair_operand" "Ump"))
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index fb1426b7752..aac6e0b5bd9 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -880,6 +880,9 @@
 ;; Likewise for load/store pair.
 (define_mode_attr ldpstp_sz [(SI "8") (DI "16")])
 
+;; Size of element access for STP/LDP-generated vectors.
+(define_mode_attr ldpstp_vel_sz [(V2SI "8") (V2SF "8") (V2DI "16") (V2DF 
"16")])
+
 ;; For inequal width int to float conversion
 (define_mode_attr w1 [(HF "w") (SF "w") (DF "x")])
 (define_mode_attr w2 [(HF "x") (SF "x") (DF "w")])
diff --git a/gcc/testsuite/gcc.target/aarch64/stp_vec_dup_32_64-1.c 
b/gcc/testsuite/gcc.target/aarch64/stp_vec_dup_32_64-1.c
new file mode 100644
index 000..a37c903dfd4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/stp_vec_dup_32_64-1.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+typedef long long v2di __attribute__((vector_size (16)));
+typedef int v2si __attribute__((vector_size (8)));
+
+void
+foo (v2di *x, long long a)
+{
+  v2di tmp = {a, a};
+  *x = tmp;
+}
+
+void
+foo2 (v2si *x, int a)
+{
+  v2si tmp = {a, a};
+  *x = tmp;
+}
+
+/* { dg-final { scan-assembler-times "stp\t" 2 } } */
+/* { dg-final { scan-assembler-not "dup\t" } } */


Re: [PATCH] Fix version namespace build

2021-02-11 Thread Jonathan Wakely via Gcc-patches

On 11/02/21 15:23 +, Jonathan Wakely wrote:

On 09/02/21 22:05 +0100, François Dumont via Libstdc++ wrote:

    libstdc++: Fix versioned namespace build

    libstdc++-v3/ChangeLog:

    * libsupc++/eh_ptr.cc (exception_ptr(__safe_bool)): Define if
    _GLIBCXX_EH_PTR_COMPAT is defined.
    (exception_ptr::_M_safe_bool_dummy()): Likewise.
    (exception_ptr::operator!()): Likewise.
    (exception_ptr::opeerator __safe_bool()): Likewise.

Ok to commit ?


No, this is not correct. Those functions are declared in the
header if included from C++98 code:

#if (__cplusplus < 201103L) || defined (_GLIBCXX_EH_PTR_COMPAT)
 typedef void (exception_ptr::*__safe_bool)();

 // For construction from nullptr or 0.
 exception_ptr(__safe_bool) _GLIBCXX_USE_NOEXCEPT;
#endif

With your change, C++98 code that calls those functions will be unable
to link to libstdc++.so.8 because the functions won't be defined in
the library.

The simplest fix would be to revert this change:

--- a/libstdc++-v3/libsupc++/eh_ptr.cc
+++ b/libstdc++-v3/libsupc++/eh_ptr.cc
@@ -25,7 +25,12 @@
#include 
#include "eh_atomics.h"

+#if ! _GLIBCXX_INLINE_VERSION
+// This macro causes exception_ptr to declare an older API (with corresponding
+// definitions in this file) and to mark some inline functions as "used" so
+// that definitions will be emitted in this translation unit.
#define _GLIBCXX_EH_PTR_COMPAT
+#endif

#include 
#include 

But that leaves the unwanted operator== definitions in libstdc++.so.8
so the attached patch (only tested with unversioned namespace) is
probably what we want to do. Does it fix the problem?


I've now tested the patch below and committed it.



diff --git a/libstdc++-v3/libsupc++/eh_ptr.cc b/libstdc++-v3/libsupc++/eh_ptr.cc
index 5d8ac1d879a..5c4685606fe 100644
--- a/libstdc++-v3/libsupc++/eh_ptr.cc
+++ b/libstdc++-v3/libsupc++/eh_ptr.cc
@@ -25,11 +25,15 @@
#include 
#include "eh_atomics.h"

-#if ! _GLIBCXX_INLINE_VERSION
// This macro causes exception_ptr to declare an older API (with corresponding
-// definitions in this file) and to mark some inline functions as "used" so
-// that definitions will be emitted in this translation unit.
+// definitions in this file).
#define _GLIBCXX_EH_PTR_COMPAT
+
+#if ! _GLIBCXX_INLINE_VERSION
+// This macro causes some inline functions in exception_ptr to be marked
+// as "used" so that definitions will be emitted in this translation unit.
+// We need this because those functions were not inline in previous releases.
+#define _GLIBCXX_EH_PTR_RELOPS_COMPAT
#endif

#include 
diff --git a/libstdc++-v3/libsupc++/exception_ptr.h 
b/libstdc++-v3/libsupc++/exception_ptr.h
index 6d41b34fabe..9943668d9b3 100644
--- a/libstdc++-v3/libsupc++/exception_ptr.h
+++ b/libstdc++-v3/libsupc++/exception_ptr.h
@@ -39,7 +39,7 @@
#include 
#include 

-#ifdef _GLIBCXX_EH_PTR_COMPAT
+#ifdef _GLIBCXX_EH_PTR_RELOPS_COMPAT
# define _GLIBCXX_EH_PTR_USED __attribute__((__used__))
#else
# define _GLIBCXX_EH_PTR_USED
@@ -153,7 +153,7 @@ namespace std
#endif

#if __cpp_impl_three_way_comparison >= 201907L \
-  && ! defined _GLIBCXX_EH_PTR_COMPAT
+  && ! defined _GLIBCXX_EH_PTR_RELOPS_COMPAT
  friend bool
  operator==(const exception_ptr&, const exception_ptr&) noexcept = default;
#else




[committed] libstdc++: Document when C++11/14/17 support became stable [PR 99058]

2021-02-11 Thread Jonathan Wakely via Gcc-patches
libstdc++-v3/ChangeLog:

PR libstdc++/99058
* doc/xml/manual/status_cxx2011.xml: Document when support
became stable.
* doc/xml/manual/status_cxx2014.xml: Likewise.
* doc/xml/manual/status_cxx2017.xml: Likewise.
* doc/html/manual/status.html: Regenerate.

Tested powerpc64le-linux. Committed to trunk.

commit ce43c906049b828c0472d8499b52ac6233c869d0
Author: Jonathan Wakely 
Date:   Thu Feb 11 15:35:23 2021

libstdc++: Document when C++11/14/17 support became stable [PR 99058]

libstdc++-v3/ChangeLog:

PR libstdc++/99058
* doc/xml/manual/status_cxx2011.xml: Document when support
became stable.
* doc/xml/manual/status_cxx2014.xml: Likewise.
* doc/xml/manual/status_cxx2017.xml: Likewise.
* doc/html/manual/status.html: Regenerate.

diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2011.xml 
b/libstdc++-v3/doc/xml/manual/status_cxx2011.xml
index be873962597..88844f8f0cc 100644
--- a/libstdc++-v3/doc/xml/manual/status_cxx2011.xml
+++ b/libstdc++-v3/doc/xml/manual/status_cxx2011.xml
@@ -24,6 +24,9 @@ features. See dialect
 options. The pre-defined symbol
 __cplusplus is used to check for the
 presence of the required flag.
+GCC 5.1 was the first release with non-experimental C++11 support,
+so the API and ABI of features added in C++11 is only stable
+since that release.
 
 
 
diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2014.xml 
b/libstdc++-v3/doc/xml/manual/status_cxx2014.xml
index 61bea5adad5..2cf5f629efb 100644
--- a/libstdc++-v3/doc/xml/manual/status_cxx2014.xml
+++ b/libstdc++-v3/doc/xml/manual/status_cxx2014.xml
@@ -17,6 +17,9 @@ features. See dialect
 options. The pre-defined symbol
 __cplusplus is used to check for the
 presence of the required flag.
+GCC 6.1 was the first release with non-experimental C++14 support,
+so the API and ABI of features added in C++14 is only stable
+since that release.
 
 
 
diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2017.xml 
b/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
index aa34b8c3cf5..b1c12bd3799 100644
--- a/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
+++ b/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
@@ -17,6 +17,9 @@ features. See dialect
 options. The pre-defined symbol
 __cplusplus is used to check for the
 presence of the required flag.
+GCC 9.1 was the first release with non-experimental C++17 support,
+so the API and ABI of features added in C++17 is only stable
+since that release.
 
 
 


Re: [PATCH] c++: Fix endless errors on invalid requirement seq [PR97742]

2021-02-11 Thread Jason Merrill via Gcc-patches

On 2/11/21 11:35 AM, Marek Polacek wrote:

On Thu, Feb 11, 2021 at 05:30:24PM +0100, Jakub Jelinek via Gcc-patches wrote:

Hi!

As the testcase shows, if we reach CPP_EOF during parsing of requirement
sequence, we end up with endless loop where we always report invalid
requirement expression, don't consume any token (as we are at eof) and
repeat.

This patch stops the loop when we reach CPP_EOF.

Ok for trunk if it passes bootstrap/regtest?


LGTM.


OK.


2021-02-11  Jakub Jelinek  

PR c++/97742
* parser.c (cp_parser_requirement_seq): Stop iterating after reaching
CPP_EOF.

* g++.dg/cpp2a/concepts-requires24.C: New test.

--- gcc/cp/parser.c.jj  2021-02-03 17:14:01.0 +0100
+++ gcc/cp/parser.c 2021-02-11 14:44:05.877357556 +0100
@@ -28816,7 +28816,9 @@ cp_parser_requirement_seq (cp_parser *pa
tree req = cp_parser_requirement (parser);
if (req != error_mark_node)
result = tree_cons (NULL_TREE, req, result);
-} while (cp_lexer_next_token_is_not (parser->lexer, CPP_CLOSE_BRACE));
+}
+  while (cp_lexer_next_token_is_not (parser->lexer, CPP_CLOSE_BRACE)
+&& cp_lexer_next_token_is_not (parser->lexer, CPP_EOF));
  
/* If there are no valid requirements, this is not a valid expression. */

if (!result)
--- gcc/testsuite/g++.dg/cpp2a/concepts-requires24.C.jj 2021-02-11 
14:40:28.548815748 +0100
+++ gcc/testsuite/g++.dg/cpp2a/concepts-requires24.C2021-02-11 
14:39:40.209362526 +0100
@@ -0,0 +1,4 @@
+// PR c++/97742
+// { dg-do compile { target c++20 } }
+
+template 

Marek





[PATCH] Fix producer string memory leaks

2021-02-11 Thread Martin Liška

Hello.

This fixes 2 memory leaks I noticed.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

* opts-common.c (decode_cmdline_option): Release werror_arg.
* opts.c (gen_producer_string): Release output of
gen_command_line_string.
---
 gcc/opts-common.c | 1 +
 gcc/opts.c| 7 +--
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/gcc/opts-common.c b/gcc/opts-common.c
index 6cb5602896d..5e10edeb4cf 100644
--- a/gcc/opts-common.c
+++ b/gcc/opts-common.c
@@ -766,6 +766,7 @@ decode_cmdline_option (const char *const *argv, unsigned 
int lang_mask,
   werror_arg[0] = 'W';
 
   size_t warning_index = find_opt (werror_arg, lang_mask);

+  free (werror_arg);
   if (warning_index != OPT_SPECIAL_unknown)
{
  const struct cl_option *warning_option
diff --git a/gcc/opts.c b/gcc/opts.c
index fc5f61e13cc..24bb64198c8 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -3401,8 +3401,11 @@ char *
 gen_producer_string (const char *language_string, cl_decoded_option *options,
 unsigned int options_count)
 {
-  return concat (language_string, " ", version_string, " ",
-gen_command_line_string (options, options_count), NULL);
+  char *cmdline = gen_command_line_string (options, options_count);
+  char *combined = concat (language_string, " ", version_string, " ",
+  cmdline, NULL);
+  free (cmdline);
+  return combined;
 }
 
 #if CHECKING_P

--
2.30.0



Re: [PATCH] c++: Fix endless errors on invalid requirement seq [PR97742]

2021-02-11 Thread Marek Polacek via Gcc-patches
On Thu, Feb 11, 2021 at 05:30:24PM +0100, Jakub Jelinek via Gcc-patches wrote:
> Hi!
> 
> As the testcase shows, if we reach CPP_EOF during parsing of requirement
> sequence, we end up with endless loop where we always report invalid
> requirement expression, don't consume any token (as we are at eof) and
> repeat.
> 
> This patch stops the loop when we reach CPP_EOF.
> 
> Ok for trunk if it passes bootstrap/regtest?

LGTM.

> 2021-02-11  Jakub Jelinek  
> 
>   PR c++/97742
>   * parser.c (cp_parser_requirement_seq): Stop iterating after reaching
>   CPP_EOF.
> 
>   * g++.dg/cpp2a/concepts-requires24.C: New test.
> 
> --- gcc/cp/parser.c.jj2021-02-03 17:14:01.0 +0100
> +++ gcc/cp/parser.c   2021-02-11 14:44:05.877357556 +0100
> @@ -28816,7 +28816,9 @@ cp_parser_requirement_seq (cp_parser *pa
>tree req = cp_parser_requirement (parser);
>if (req != error_mark_node)
>   result = tree_cons (NULL_TREE, req, result);
> -} while (cp_lexer_next_token_is_not (parser->lexer, CPP_CLOSE_BRACE));
> +}
> +  while (cp_lexer_next_token_is_not (parser->lexer, CPP_CLOSE_BRACE)
> +  && cp_lexer_next_token_is_not (parser->lexer, CPP_EOF));
>  
>/* If there are no valid requirements, this is not a valid expression. */
>if (!result)
> --- gcc/testsuite/g++.dg/cpp2a/concepts-requires24.C.jj   2021-02-11 
> 14:40:28.548815748 +0100
> +++ gcc/testsuite/g++.dg/cpp2a/concepts-requires24.C  2021-02-11 
> 14:39:40.209362526 +0100
> @@ -0,0 +1,4 @@
> +// PR c++/97742
> +// { dg-do compile { target c++20 } }
> +
> +template  
>   Jakub
> 

Marek



[PATCH] c++: Fix endless errors on invalid requirement seq [PR97742]

2021-02-11 Thread Jakub Jelinek via Gcc-patches
Hi!

As the testcase shows, if we reach CPP_EOF during parsing of requirement
sequence, we end up with endless loop where we always report invalid
requirement expression, don't consume any token (as we are at eof) and
repeat.

This patch stops the loop when we reach CPP_EOF.

Ok for trunk if it passes bootstrap/regtest?

2021-02-11  Jakub Jelinek  

PR c++/97742
* parser.c (cp_parser_requirement_seq): Stop iterating after reaching
CPP_EOF.

* g++.dg/cpp2a/concepts-requires24.C: New test.

--- gcc/cp/parser.c.jj  2021-02-03 17:14:01.0 +0100
+++ gcc/cp/parser.c 2021-02-11 14:44:05.877357556 +0100
@@ -28816,7 +28816,9 @@ cp_parser_requirement_seq (cp_parser *pa
   tree req = cp_parser_requirement (parser);
   if (req != error_mark_node)
result = tree_cons (NULL_TREE, req, result);
-} while (cp_lexer_next_token_is_not (parser->lexer, CPP_CLOSE_BRACE));
+}
+  while (cp_lexer_next_token_is_not (parser->lexer, CPP_CLOSE_BRACE)
+&& cp_lexer_next_token_is_not (parser->lexer, CPP_EOF));
 
   /* If there are no valid requirements, this is not a valid expression. */
   if (!result)
--- gcc/testsuite/g++.dg/cpp2a/concepts-requires24.C.jj 2021-02-11 
14:40:28.548815748 +0100
+++ gcc/testsuite/g++.dg/cpp2a/concepts-requires24.C2021-02-11 
14:39:40.209362526 +0100
@@ -0,0 +1,4 @@
+// PR c++/97742
+// { dg-do compile { target c++20 } }
+
+template 

Re: [PATCH] c++: Endless loop with targ deduction in member tmpl [PR95888]

2021-02-11 Thread Jason Merrill via Gcc-patches

On 2/9/21 5:41 PM, Marek Polacek wrote:

My r10-7007 patch tweaked tsubst not to reduce the template level of
template parameters when tf_partial.  That caused infinite looping in
is_specialization_of: we ended up with a class template specialization
whose TREE_TYPE (CLASSTYPE_TI_TEMPLATE (t)) == t, so the second for
loop in is_specialization_of never finished.

There's a lot going on in this test, but essentially: the template fn
here has two template parameters, we call it with one explicitly
provided, the other one has to be deduced.  So we'll find ourselves
in fn_type_unification which uses tf_partial when tsubsting the
*explicit* template arguments into the function type.  That leads to
tsubstituting the return type, C.  C is a member template; its
most general template is

   template template struct B::C

we figure out (tsubst_template_args) that the template argument list
is .  They come from different levels, one comes from B,
the other one from fn.

So now we lookup_template_class to see if we have C.  We
do the
   /* This is a full instantiation of a member template.  Find
  the partial instantiation of which this is an instance.  */
   TREE_VEC_LENGTH (arglist)--;
   // arglist is now , not 
   found = tsubst (gen_tmpl, arglist, complain, NULL_TREE);
   TREE_VEC_LENGTH (arglist)++;

magic which is looking for the partial instantiation, in this case,
that would be template struct B::C.  Note we're still
in a tf_partial context!  So the tsubst_template_args in the tsubst
(which tries to substitute  into ) returns , but
V's template level hasn't been reduced!  After tsubst_template_args,
tsubst_template_decl looks to see if we already have this specialization:

   // t = template_decl C
   // full_args = 
   spec = retrieve_specialization (t, full_args, hash);

but doesn't find the one we created a while ago, when processing
B b; in the test, because V's levels don't match.  Whereupon
tsubst_template_decl creates a new TEMPLATE_DECL, one that leads to
the infinite looping problem.

I think let's clear tf_partial when looking for an existing partial
instantiation.

It also occurred to me that I should be able to trigger a similar
problem with 'auto', since r10-7007 removed an is_auto check.  And lo,
I constructed deduce10.C which exhibits the same issue with pre-r10-7007
compilers.  This patch fixes that problem as well.  I'm ecstatic.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/10?  Also
built cmcstl2.


Perhaps the mistake here is using the complain parm at all; this 
substitution is not in the "immediate context" of deduction.  Either 
tf_none or tf_warning_or_error should be fine, as we know this 
substitution has previously succeeded.



gcc/cp/ChangeLog:

PR c++/95888
* pt.c (lookup_template_class_1): Clear tf_partial when looking for
the partial instantiation.

gcc/testsuite/ChangeLog:

PR c++/95888
* g++.dg/template/deduce10.C: New test.
* g++.dg/template/deduce9.C: New test.
---
  gcc/cp/pt.c  | 10 +-
  gcc/testsuite/g++.dg/template/deduce10.C | 23 +++
  gcc/testsuite/g++.dg/template/deduce9.C  | 23 +++
  3 files changed, 55 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/template/deduce10.C
  create mode 100644 gcc/testsuite/g++.dg/template/deduce9.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index f73deb3aee3..5b8a08a30e0 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -10137,7 +10137,15 @@ lookup_template_class_1 (tree d1, tree arglist, tree 
in_decl, tree context,
  /* Temporarily reduce by one the number of levels in the ARGLIST
 so as to avoid comparing the last set of arguments.  */
  TREE_VEC_LENGTH (arglist)--;
- found = tsubst (gen_tmpl, arglist, complain, NULL_TREE);
+ /* Clear tf_partial, which can be set because we might be at the
+beginning of template argument deduction when any explicitly
+specified template arguments are substituted into the function
+type.  If we don't clear tf_partial, we won't find the partial
+instantiation that might have been created outside tf_partial
+context, because the levels of template parameters wouldn't
+match, because in a tf_partial context, tsubst doesn't reduce
+TEMPLATE_PARM_LEVEL.  */
+ found = tsubst (gen_tmpl, arglist, complain & ~tf_partial, NULL_TREE);
  TREE_VEC_LENGTH (arglist)++;
  /* FOUND is either a proper class type, or an alias
 template specialization.  In the later case, it's a
diff --git a/gcc/testsuite/g++.dg/template/deduce10.C 
b/gcc/testsuite/g++.dg/template/deduce10.C
new file mode 100644
index 000..165ff195728
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/deduce10.C
@@ -0,0 +1,23 @@
+// PR c++/95888
+// { dg-do compile { target c++17 } }
+
+template  class A {
+  A(int, int);
+  

Re: [PATCH 4/4] c++: dependent constraint on placeholder 'auto' [PR96443]

2021-02-11 Thread Jason Merrill via Gcc-patches

On 2/8/21 2:03 PM, Patrick Palka wrote:

This fixes the way we check satisfaction of constraints on placeholder
types in various contexts, and in particular when the constraint is
dependent.

Firstly, when evaluating the return type requirement of a compound
requirement, we currently substitute the outer template arguments into
the constraint before checking satisfaction. But we should instead be
passing in the complete set of template arguments to satisfaction and
not do a prior separate substitution.  Our current approach leads to us
incorrectly rejecting the testcase concepts-return-req2.C below.

Secondly, when checking the constraints on a placeholder variable or
return type, we don't substitute the template arguments of the enclosing
context at all.  This leads to bogus errors during satisfaction when the
constraint is dependent as in the testcase concepts-placeholder3.C
below.

In order to fix these two issues, we need to be able to properly
normalize the constraints on a placeholder 'auto', which in turn
requires us to know the template parameters that were in-scope where an
'auto' was introduced.  This information currently doesn't seem to be
easily available when we need it, so this patch adds an auxiliary hash
table that keeps track of the value of current_template_parms when each
constrained 'auto' was formed.

This patch also removes some seemingly wrong handling of placeholder
type arguments from tsubst_parameter_mapping.  The code doesn't trigger
with the example used in the comments, because type_uses_auto doesn't
look inside non-deduced contexts such as the operand of decltype.  And
the call to do_auto_deduction seems confused because if 'arg' is a type,
then so is 'parm', and therefore 'init' too is a type, but
do_auto_deduction expects it to be an expression.  Before this patch,
this code was dead (as far as our testsuite can tell), but now it breaks
other parts of this patch, so let's remove it.

gcc/cp/ChangeLog:

PR c++/96443
* constraint.cc (type_deducible_p): Don't substitute into the
constraints, and instead just pass 'args' to do_auto_deduction
as the outer template arguments.
(tsubst_parameter_mapping): Remove confused code for handling
placeholder type arguments.
(normalize_placeholder_type_constraint): Define.
(satisfy_constraint_expression): Use it to handle placeholder
'auto' types.
* cp-tree.h (get_constrained_auto_context): Declare.
* pt.c (constrained_auto_context_map): Define.
(get_placeholder_type_constraint_context): Define.
(set_placeholder_type_constraints): Define.
(copy_placeholder_type_constraints): Define.
(tsubst) : Use
copy_placeholder_type_constraints.
(make_constrained_placeholder_type): Use
set_placeholder_type_constraints.
(do_auto_deduction): Clarify comments about the outer_targs
parameter.  Rework satisfaction of a placeholder type constraint
to pass in the complete set of template arguments directly to
constraints_satisfied_p.
(splice_late_return_type): Use copy_placeholder_type_constraints.

gcc/testsuite/ChangeLog:

PR c++/96443
* g++.dg/cpp2a/concepts-placeholder3.C: New test.
* g++.dg/cpp2a/concepts-return-req2.C: New test.
* g++.dg/concepts-ts1.C: Add dg-bogus directive to the call to
f15 that we expect to accept.
---
  gcc/cp/constraint.cc  | 106 --
  gcc/cp/cp-tree.h  |   1 +
  gcc/cp/pt.c   | 101 +++--
  .../g++.dg/cpp2a/concepts-placeholder3.C  |  19 
  .../g++.dg/cpp2a/concepts-return-req2.C   |  13 +++
  gcc/testsuite/g++.dg/cpp2a/concepts-ts1.C |   2 +-
  6 files changed, 146 insertions(+), 96 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-placeholder3.C
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-return-req2.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index 56134f8b2bf..53588047d44 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -2007,39 +2007,19 @@ type_deducible_p (tree expr, tree type, tree 
placeholder, tree args,
   references are preserved in the result.  */
expr = force_paren_expr_uneval (expr);
  
-  /* Replace the constraints with the instantiated constraints. This

- substitutes args into any template parameters in the trailing
- result type.  */
-  tree saved_constr = PLACEHOLDER_TYPE_CONSTRAINTS (placeholder);
-  tree subst_constr
-= tsubst_constraint (saved_constr,
-args,
-info.complain | tf_partial,
-info.in_decl);
-
-  if (subst_constr == error_mark_node)
-return false;
-
-  PLACEHOLDER_TYPE_CONSTRAINTS (placeholder) = subst_constr;
-
-  /* Temporarily unlink the canonical type.  */
-  tree saved_type = TYPE_CANONICAL 

Re: [PATCH] c++: Fix zero initialization of flexible array members [PR99033]

2021-02-11 Thread Jason Merrill via Gcc-patches

On 2/11/21 3:50 AM, Jakub Jelinek wrote:

Hi!

array_type_nelts returns error_mark_node for type of flexible array members
and build_zero_init_1 was placing an error_mark_node into the CONSTRUCTOR,
on which e.g. varasm ICEs.  I think there is nothing erroneous on zero
initialization of flexible array members though, such arrays should simply
get no elements, like they do if such classes are constructed (everything
except when some larger initializer comes from an explicit initializer).

So, this patch handles [] arrays in zero initialization like [0] arrays
and fixes handling of the [0] arrays - the
tree_int_cst_equal (max_index, integer_minus_one_node) check
didn't do what it thought it would do, max_index is typically unsigned
integer (sizetype) and so it is never equal to a -1.


I wonder about using build_minus_one_cst and integer_minus_onep instead. 
OK either way.



What the patch doesn't do and maybe would be desirable is if it returns
error_mark_node for other reasons let the recursive callers not stick that
into CONSTRUCTOR but return error_mark_node instead.  But I don't have a
testcase where that would be needed right now.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-02-11  Jakub Jelinek  

PR c++/99033
* init.c (build_zero_init_1): Handle zero initialiation of
flexible array members like initialization of [0] arrays.
Use integer_all_onesp instead of comparison to integer_minus_one_node
and integer_zerop instead of comparison against size_zero_node.
Formatting fixes.

* g++.dg/ext/flexary38.C: New test.

--- gcc/cp/init.c.jj2021-01-27 10:05:35.273942499 +0100
+++ gcc/cp/init.c   2021-02-10 15:03:02.668042115 +0100
@@ -247,9 +247,12 @@ build_zero_init_1 (tree type, tree nelts
  
/* Iterate over the array elements, building initializations.  */

if (nelts)
-   max_index = fold_build2_loc (input_location,
-MINUS_EXPR, TREE_TYPE (nelts),
-nelts, integer_one_node);
+   max_index = fold_build2_loc (input_location, MINUS_EXPR,
+TREE_TYPE (nelts), nelts,
+build_one_cst (TREE_TYPE (nelts)));
+  /* Treat flexible array members like [0] arrays.  */
+  else if (TYPE_DOMAIN (type) == NULL_TREE)
+   max_index = build_all_ones_cst (sizetype);
else
max_index = array_type_nelts (type);
  
@@ -261,20 +264,19 @@ build_zero_init_1 (tree type, tree nelts
  
/* A zero-sized array, which is accepted as an extension, will

 have an upper bound of -1.  */
-  if (!tree_int_cst_equal (max_index, integer_minus_one_node))
+  if (!integer_all_onesp (max_index))
{
  constructor_elt ce;
  
  	  /* If this is a one element array, we just use a regular init.  */

- if (tree_int_cst_equal (size_zero_node, max_index))
+ if (integer_zerop (max_index))
ce.index = size_zero_node;
  else
ce.index = build2 (RANGE_EXPR, sizetype, size_zero_node,
-   max_index);
+  max_index);
  
-	  ce.value = build_zero_init_1 (TREE_TYPE (type),

-/*nelts=*/NULL_TREE,
-static_storage_p, NULL_TREE);
+ ce.value = build_zero_init_1 (TREE_TYPE (type), /*nelts=*/NULL_TREE,
+   static_storage_p, NULL_TREE);
  if (ce.value)
{
  vec_alloc (v, 1);
--- gcc/testsuite/g++.dg/ext/flexary38.C.jj 2021-02-10 15:16:40.700102841 
+0100
+++ gcc/testsuite/g++.dg/ext/flexary38.C2021-02-10 15:16:22.243304533 
+0100
@@ -0,0 +1,18 @@
+// PR c++/99033
+// { dg-do compile }
+// { dg-options "" }
+
+struct T { int t; };
+struct S { char c; int T::*b[]; } a;
+struct U { char c; int T::*b[0]; } b;
+struct V { char c; int T::*b[1]; } c;
+struct W { char c; int T::*b[2]; } d;
+
+void
+foo ()
+{
+  a.c = 1;
+  b.c = 2;
+  c.c = 3;
+  d.c = 4;
+}

Jakub





Re: [PATCH] c++: ICE with unexpanded pack in do-while [PR99063]

2021-02-11 Thread Jason Merrill via Gcc-patches

On 2/10/21 11:47 PM, Marek Polacek wrote:

Here an unexpanded parameter pack snuck into prep_operand which doesn't
expect to see an operand without a type, and since r247842
NONTYPE_ARGUMENT_PACK doesn't have a type anymore.

This only happens with the do-while loop whose condition may not
contain a declaration so we never called finish_cond which checks
for unexpanded parameter packs.  So use check_for_bare_parameter_packs
to remedy that.


I wonder if we want to move that check into maybe_convert_cond?  OK 
either way.



Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

gcc/cp/ChangeLog:

PR c++/99063
* semantics.c (finish_do_stmt): Check for unexpanded parameter packs.

gcc/testsuite/ChangeLog:

PR c++/99063
* g++.dg/cpp0x/variadic-crash6.C: New test.
---
  gcc/cp/semantics.c   |  5 +
  gcc/testsuite/g++.dg/cpp0x/variadic-crash6.C | 16 
  2 files changed, 21 insertions(+)
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/variadic-crash6.C

diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 73834467fca..30dd206c899 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -1028,6 +1028,11 @@ finish_do_stmt (tree cond, tree do_stmt, bool ivdep, 
unsigned short unroll)
  {
cond = maybe_convert_cond (cond);
end_maybe_infinite_loop (cond);
+  /* Unlike other iteration statements, the condition may not contain
+ a declaration, so we don't call finish_cond which checks for
+ unexpanded parameter packs.  */
+  if (check_for_bare_parameter_packs (cond))
+cond = error_mark_node;
if (ivdep && cond != error_mark_node)
  cond = build3 (ANNOTATE_EXPR, TREE_TYPE (cond), cond,
   build_int_cst (integer_type_node, annot_expr_ivdep_kind),
diff --git a/gcc/testsuite/g++.dg/cpp0x/variadic-crash6.C 
b/gcc/testsuite/g++.dg/cpp0x/variadic-crash6.C
new file mode 100644
index 000..88009b765c2
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/variadic-crash6.C
@@ -0,0 +1,16 @@
+// PR c++/99063
+// { dg-do compile { target c++11 } }
+
+template 
+void f (T... n)
+{
+  do
+{
+}
+  while (--n); // { dg-error "parameter packs not expanded with '...'" }
+}
+
+void g ()
+{
+  f(3);
+}

base-commit: 27a804bc62805aedb1b097a00eb2c0059244680a





Re: [PATCH v3] arm: Low overhead loop handle long range branches [PR98931]

2021-02-11 Thread Andrea Corallo via Gcc-patches
"Richard Earnshaw (lists)"  writes:

[...]

> +  [(set (attr "length")
> +(if_then_else
> +(lt (minus (pc) (match_dup 0)) (const_int 1024))
> + (const_int 4)
> + (const_int 6)))
> +   (set_attr "type" "branch")])
>
> Shouldn't that be using "ltu" rather than "lt", so that if, for some
> reason, the branch has been retargeted to come after the branch, then
> the test will still fail and we'll get the comparison variant back.

Right good point

> Otherwise OK.
>
> R.

Installed into trunk with the suggestion as 38c5703449c.

Thanks!

  Andrea


Re: [PATCH] Fix version namespace build

2021-02-11 Thread Jonathan Wakely via Gcc-patches

On 09/02/21 22:05 +0100, François Dumont via Libstdc++ wrote:

    libstdc++: Fix versioned namespace build

    libstdc++-v3/ChangeLog:

    * libsupc++/eh_ptr.cc (exception_ptr(__safe_bool)): Define if
    _GLIBCXX_EH_PTR_COMPAT is defined.
    (exception_ptr::_M_safe_bool_dummy()): Likewise.
    (exception_ptr::operator!()): Likewise.
    (exception_ptr::opeerator __safe_bool()): Likewise.

Ok to commit ?


No, this is not correct. Those functions are declared in the
header if included from C++98 code:

#if (__cplusplus < 201103L) || defined (_GLIBCXX_EH_PTR_COMPAT)
  typedef void (exception_ptr::*__safe_bool)();

  // For construction from nullptr or 0.
  exception_ptr(__safe_bool) _GLIBCXX_USE_NOEXCEPT;
#endif

With your change, C++98 code that calls those functions will be unable
to link to libstdc++.so.8 because the functions won't be defined in
the library.

The simplest fix would be to revert this change:

--- a/libstdc++-v3/libsupc++/eh_ptr.cc
+++ b/libstdc++-v3/libsupc++/eh_ptr.cc
@@ -25,7 +25,12 @@
 #include 
 #include "eh_atomics.h"

+#if ! _GLIBCXX_INLINE_VERSION
+// This macro causes exception_ptr to declare an older API (with corresponding
+// definitions in this file) and to mark some inline functions as "used" so
+// that definitions will be emitted in this translation unit.
 #define _GLIBCXX_EH_PTR_COMPAT
+#endif

 #include 
 #include 

But that leaves the unwanted operator== definitions in libstdc++.so.8
so the attached patch (only tested with unversioned namespace) is
probably what we want to do. Does it fix the problem?


diff --git a/libstdc++-v3/libsupc++/eh_ptr.cc b/libstdc++-v3/libsupc++/eh_ptr.cc
index 5d8ac1d879a..5c4685606fe 100644
--- a/libstdc++-v3/libsupc++/eh_ptr.cc
+++ b/libstdc++-v3/libsupc++/eh_ptr.cc
@@ -25,11 +25,15 @@
 #include 
 #include "eh_atomics.h"
 
-#if ! _GLIBCXX_INLINE_VERSION
 // This macro causes exception_ptr to declare an older API (with corresponding
-// definitions in this file) and to mark some inline functions as "used" so
-// that definitions will be emitted in this translation unit.
+// definitions in this file).
 #define _GLIBCXX_EH_PTR_COMPAT
+
+#if ! _GLIBCXX_INLINE_VERSION
+// This macro causes some inline functions in exception_ptr to be marked
+// as "used" so that definitions will be emitted in this translation unit.
+// We need this because those functions were not inline in previous releases.
+#define _GLIBCXX_EH_PTR_RELOPS_COMPAT
 #endif
 
 #include 
diff --git a/libstdc++-v3/libsupc++/exception_ptr.h b/libstdc++-v3/libsupc++/exception_ptr.h
index 6d41b34fabe..9943668d9b3 100644
--- a/libstdc++-v3/libsupc++/exception_ptr.h
+++ b/libstdc++-v3/libsupc++/exception_ptr.h
@@ -39,7 +39,7 @@
 #include 
 #include 
 
-#ifdef _GLIBCXX_EH_PTR_COMPAT
+#ifdef _GLIBCXX_EH_PTR_RELOPS_COMPAT
 # define _GLIBCXX_EH_PTR_USED __attribute__((__used__))
 #else
 # define _GLIBCXX_EH_PTR_USED
@@ -153,7 +153,7 @@ namespace std
 #endif
 
 #if __cpp_impl_three_way_comparison >= 201907L \
-  && ! defined _GLIBCXX_EH_PTR_COMPAT
+  && ! defined _GLIBCXX_EH_PTR_RELOPS_COMPAT
   friend bool
   operator==(const exception_ptr&, const exception_ptr&) noexcept = default;
 #else


[PATCH] df: Record all definitions in DF_LR_BB_INFO->def [PR98863]

2021-02-11 Thread Richard Sandiford via Gcc-patches
df_lr_bb_local_compute has:

  FOR_EACH_INSN_INFO_DEF (def, insn_info)
/* If the def is to only part of the reg, it does
   not kill the other defs that reach here.  */
if (!(DF_REF_FLAGS (def) & (DF_REF_PARTIAL | DF_REF_CONDITIONAL)))

However, as noted in the comment in the patch and below, almost all
partial definitions have an associated use.  This means that the
confluence function:

  IN = (OUT & ~DEF) | USE

is unaffected by whether partial definitions are in DEF or not.

Even though the choice doesn't matter for the LR problem itself,
it's IMO much more convenient for consumers if DEF contains all the
definitions in the block.  The only pre-RTL-SSA code that tries to
consume DEF directly is shrink-wrap.c, which already has to work
around the incompleteness of the information:

  /* DF_LR_BB_INFO (bb)->def does not comprise the DF_REF_PARTIAL and
 DF_REF_CONDITIONAL defs.  So if DF_LIVE doesn't exist, i.e.
 at -O1, just give up searching NEXT_BLOCK.  */

I hit the same problem when trying to fix the RTL-SSA part of PR98863.

This patch treats partial definitions as both a def and a use,
just like the df_ref records almost always do.

To show that partial definitions almost always have uses:

  DF_REF_CONDITIONAL:

Added by:

  case COND_EXEC:
df_defs_record (collection_rec, COND_EXEC_CODE (x),
bb, insn_info, DF_REF_CONDITIONAL);
break;

Later, df_get_conditional_uses creates uses for all DF_REF_CONDITIONAL
definitions.

  DF_REF_PARTIAL:

In total, there are 4 locations at which we add partial definitions.

Case 1:

  if (GET_CODE (dst) == STRICT_LOW_PART)
{
  flags |= DF_REF_READ_WRITE | DF_REF_PARTIAL | DF_REF_STRICT_LOW_PART;

  loc =  (dst, 0);
  dst = *loc;
}

Corresponding use:

  case STRICT_LOW_PART:
{
  rtx *temp =  (dst, 0);
  /* A strict_low_part uses the whole REG and not just the
 SUBREG.  */
  dst = XEXP (dst, 0);
  df_uses_record (collection_rec,
  (GET_CODE (dst) == SUBREG) ? _REG (dst) : temp,
  DF_REF_REG_USE, bb, insn_info,
  DF_REF_READ_WRITE | DF_REF_STRICT_LOW_PART);
}
break;

Case 2:

  if (GET_CODE (dst) == ZERO_EXTRACT)
{
  flags |= DF_REF_READ_WRITE | DF_REF_PARTIAL | DF_REF_ZERO_EXTRACT;

  loc =  (dst, 0);
  dst = *loc;
}

Corresponding use:

  case ZERO_EXTRACT:
{
  df_uses_record (collection_rec,  (dst, 1),
  DF_REF_REG_USE, bb, insn_info, flags);
  df_uses_record (collection_rec,  (dst, 2),
  DF_REF_REG_USE, bb, insn_info, flags);
  if (GET_CODE (XEXP (dst,0)) == MEM)
df_uses_record (collection_rec,  (dst, 0),
DF_REF_REG_USE, bb, insn_info,
flags);
  else
df_uses_record (collection_rec,  (dst, 0),
DF_REF_REG_USE, bb, insn_info,
DF_REF_READ_WRITE | DF_REF_ZERO_EXTRACT);
^
}
break;

Case 3:

  else if (GET_CODE (dst) == SUBREG && REG_P (SUBREG_REG (dst)))
{
  if (read_modify_subreg_p (dst))
flags |= DF_REF_READ_WRITE | DF_REF_PARTIAL;

  flags |= DF_REF_SUBREG;

  df_ref_record (DF_REF_REGULAR, collection_rec,
 dst, loc, bb, insn_info, DF_REF_REG_DEF, flags);
}

Corresponding use:

  case SUBREG:
if (read_modify_subreg_p (dst))
  {
df_uses_record (collection_rec, _REG (dst),
DF_REF_REG_USE, bb, insn_info,
flags | DF_REF_READ_WRITE | DF_REF_SUBREG);
break;
  }

Case 4:

  /*  If this is a multiword hardreg, we create some extra
  datastructures that will enable us to easily build REG_DEAD
  and REG_UNUSED notes.  */
  if (collection_rec
  && (endregno != regno + 1) && insn_info)
{
  /* Sets to a subreg of a multiword register are partial.
 Sets to a non-subreg of a multiword register are not.  */
  if (GET_CODE (reg) == SUBREG)
ref_flags |= DF_REF_PARTIAL;
  ref_flags |= DF_REF_MW_HARDREG;

Corresponding use:

  None.  However, this case should be rare to non-existent on most
  targets, and the current handling seems suspect.  See the comment
  in the patch for more details.

Tested so far on aarch64-linux-gnu, both individually and with
the follow-on rtl-ssa patch.  WDYT?  I realise it's not ideal
to be applying this in stage 4, but it would be expensive
to work around.

FWIW, I also tried a variation:

   df_insn_info 

Re: [PATCH v3] arm: Low overhead loop handle long range branches [PR98931]

2021-02-11 Thread Richard Earnshaw (lists) via Gcc-patches
On 10/02/2021 17:44, Andrea Corallo via Gcc-patches wrote:
> Andrea Corallo via Gcc-patches  writes:
> 
>> "Richard Earnshaw (lists)"  writes:
>>
>>> On 09/02/2021 16:27, Andrea Corallo via Gcc-patches wrote:
 Jakub Jelinek  writes:

> On Tue, Feb 09, 2021 at 03:09:43PM +0100, Jakub Jelinek via Gcc-patches 
> wrote:
>>>"TARGET_32BIT && TARGET_HAVE_LOB"
>>> -  "le\t%|lr, %l0")
>>> +  "*
>>> +  if (get_attr_length (insn) == 4)
>>> +return \"le\\t%|lr, %l0\";
>>> +  else
>>> +return \"subs\\t%|lr, #1\;bne\\t%l0\";
>>> +  "
>>
>> Why not
>> {
>>   if (get_attr_length (insn) == 4)
>> return "le\t%|lr, %l0";
>>   else
>> return "subs\t%|lr, #1;bne\t%l0";
>> }
>> instead?  Seems the arm backend uses "*..." more than the more modern {},
>> but one needs to backslash prefix a lot which makes it less readable?
>
> Where "more modern" is introduced 19.5 years ago ;)
>
>   Jakub

 I guess we really like traditions :)

 Attached second version addressing this.

 Thanks

   Andrea

>>>
>>> You're missing a clobber of the condition codes in the RTL.  This wasn't
>>> needed before, but is now.
>>>
>>> R.
>>
>> Hi Richard,
>>
>> thanks for reviewing, I guess this is going to be a good learning moment
>> for me.
>>
>> What we originally expand is:
>>
>> (insn 2396 2360 2397 3 (parallel [
>> (set (reg:CC_NZ 100 cc)
>> (compare:CC_NZ (plus:SI (reg:SI 14 lr)
>> (const_int -1 [0x]))
>> (const_int 0 [0])))
>> (set (reg:SI 14 lr)
>> (plus:SI (reg:SI 14 lr)
>> (const_int -1 [0x])))
>> ]) "p1.c":4:21 -1
>>  (nil))
>> (jump_insn 2397 2396 2365 3 (set (pc)
>> (if_then_else (ne (reg:CC_NZ 100 cc)
>> (const_int 0 [0]))
>> (label_ref:SI 2361)
>> (pc))) "p1.c":4:21 273 {arm_cond_branch}
>>  (expr_list:REG_DEAD (reg:CC_NZ 100 cc)
>> (int_list:REG_BR_PROB 1062895996 (nil)))
>>  -> 2361)
>>
>> Combine recognizing cc:CC_NZ as a dead reg and rewriting the two insns
>> as:
>>
>> (jump_insn 2397 2396 2365 3 (parallel [
>> (set (pc)
>> (if_then_else (ne (reg:SI 14 lr)
>> (const_int 1 [0x1]))
>> (label_ref:SI 2361)
>> (pc)))
>> (set (reg:SI 14 lr)
>> (plus:SI (reg:SI 14 lr)
>> (const_int -1 [0x])))
>> ]) "p1.c":4:21 1047 {*doloop_end_internal}
>>  (int_list:REG_BR_PROB 1062895996 (nil))
>>  -> 2361)
>>
>> I originally thought that because the write of reg:CC_NZ is explicit in
>> the first pattern we expand this was sufficient, but I now understand
>> I'm wrong and combine should produce a pattern still expressing this.
>> Now the question is how to do that.
>>
>> If I add the clobber '(clobber (reg:CC CC_REGNUM))' inside the parallel
>> of *doloop_end_internal as last element of the vector we ICE in
>> 'add_clobbers' called during combine, apparently 'add_clobbers' does not
>> handle the insn_code_number.
>>
>> If I add it as second element of the parallel combine is not combining
>> the two insns.
>>
>> If I place the clobber outside the parallel as a second element of the
>> insn vector combine is crashing in 'recog_for_combine_1'.
>>
>> So the question is probably: where should the clobber be positioned
>> canonically to have this working?
>>
>> Thanks!
>>
>>   Andrea
> 
> Righ, I've been explained by a knowledgeable colleague that the
> 'parallel' is implicit in the 'define_insn' and there's no need to
> express it (interestgly this is confusing the code generating
> 'add_clobbers').
> 
> The attached patch rewrites the pattern as such and adds the missing
> clobber.
> 
> Tests are running, okay for trunk when done with these?
> 
> Regards
> 
>   Andrea
> 
+  [(set (attr "length")
+(if_then_else
+(lt (minus (pc) (match_dup 0)) (const_int 1024))
+   (const_int 4)
+   (const_int 6)))
+   (set_attr "type" "branch")])

Shouldn't that be using "ltu" rather than "lt", so that if, for some
reason, the branch has been retargeted to come after the branch, then
the test will still fail and we'll get the comparison variant back.

Otherwise OK.

R.



[Paatch, fortran] PR99060 - [9/10/11 Regression] ICE in gfc_match_varspec, at fortran/primary.c:2411

2021-02-11 Thread Paul Richard Thomas via Gcc-patches
Thanks to Steve for an 'obvious' patch for this one. It is fixed on all
three branches.

Paul


Re: [aarch64][vect] Support V8QI->V8HI WIDEN_ patterns

2021-02-11 Thread Richard Sandiford via Gcc-patches
One more formatting nit, sorry:

Joel Hutton  writes:
> +bool
> +supportable_half_widening_operation (enum tree_code code,
> +tree vectype_out, tree vectype_in,
> +enum tree_code *code1)

The arguments need reindenting for the new function name, so that
they line up under “enum”.  (Obviously doesn't need a full retest.)

OK with that change, thanks.

Richard

> +{
> +  machine_mode m1,m2;
> +  enum tree_code dummy_code;
> +  optab op;
> +
> +  gcc_assert (VECTOR_TYPE_P (vectype_out) && VECTOR_TYPE_P (vectype_in));
> +
> +  m1 = TYPE_MODE (vectype_out);
> +  m2 = TYPE_MODE (vectype_in);
> +
> +  if (!VECTOR_MODE_P (m1) || !VECTOR_MODE_P (m2))
> +return false;
> +
> +  if (maybe_ne (TYPE_VECTOR_SUBPARTS (vectype_in),
> +   TYPE_VECTOR_SUBPARTS (vectype_out)))
> +return false;
> +
> +  switch (code)
> +{
> +case WIDEN_LSHIFT_EXPR:
> +  *code1 = LSHIFT_EXPR;
> +  break;
> +case WIDEN_MINUS_EXPR:
> +  *code1 = MINUS_EXPR;
> +  break;
> +case WIDEN_PLUS_EXPR:
> +  *code1 = PLUS_EXPR;
> +  break;
> +case WIDEN_MULT_EXPR:
> +  *code1 = MULT_EXPR;
> +  break;
> +default:
> +  return false;
> +}
> +
> +  if (!supportable_convert_operation (NOP_EXPR, vectype_out, vectype_in,
> +  _code))
> +return false;
> +
> +  op = optab_for_tree_code (*code1, vectype_out, optab_vector);
> +  return (optab_handler (op, TYPE_MODE (vectype_out)) != CODE_FOR_nothing);
> +}
> +
>  /* Function supportable_convert_operation
>  
> Check whether an operation represented by the code CODE is a
> diff --git a/gcc/optabs-tree.h b/gcc/optabs-tree.h
> index 173da0d3bd2..c3aaa1a4169 100644
> --- a/gcc/optabs-tree.h
> +++ b/gcc/optabs-tree.h
> @@ -36,6 +36,9 @@ enum optab_subtype
> the second argument.  The third argument distinguishes between the types 
> of
> vector shifts and rotates.  */
>  optab optab_for_tree_code (enum tree_code, const_tree, enum optab_subtype);
> +bool
> +supportable_half_widening_operation (enum tree_code, tree, tree,
> + enum tree_code *);
>  bool supportable_convert_operation (enum tree_code, tree, tree,
>   enum tree_code *);
>  bool expand_vec_cmp_expr_p (tree, tree, enum tree_code);
> diff --git a/gcc/testsuite/gcc.target/aarch64/pr98772.c 
> b/gcc/testsuite/gcc.target/aarch64/pr98772.c
> new file mode 100644
> index 000..663221514e9
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/pr98772.c
> @@ -0,0 +1,155 @@
> +/* { dg-do run } */
> +/* { dg-options "-O3 -save-temps" } */
> +#include 
> +#include 
> +
> +#define DSIZE 16
> +#define PIXSIZE 64
> +
> +extern void
> +wplus (uint16_t *d, uint8_t *restrict pix1, uint8_t *restrict pix2 )
> +{
> +for (int y = 0; y < 4; y++ )
> +{
> + for (int x = 0; x < 4; x++ )
> + d[x + y*4] = pix1[x] + pix2[x];
> + pix1 += 16;
> + pix2 += 16;
> +}
> +}
> +extern void __attribute__((optimize (0)))
> +wplus_no_opt (uint16_t *d, uint8_t *restrict pix1, uint8_t *restrict pix2 )
> +{
> +for (int y = 0; y < 4; y++ )
> +{
> + for (int x = 0; x < 4; x++ )
> + d[x + y*4] = pix1[x] + pix2[x];
> + pix1 += 16;
> + pix2 += 16;
> +}
> +}
> +
> +extern void
> +wminus (uint16_t *d, uint8_t *restrict pix1, uint8_t *restrict pix2 )
> +{
> +for (int y = 0; y < 4; y++ )
> +{
> + for (int x = 0; x < 4; x++ )
> + d[x + y*4] = pix1[x] - pix2[x];
> + pix1 += 16;
> + pix2 += 16;
> +}
> +}
> +extern void __attribute__((optimize (0)))
> +wminus_no_opt (uint16_t *d, uint8_t *restrict pix1, uint8_t *restrict pix2 )
> +{
> +for (int y = 0; y < 4; y++ )
> +{
> + for (int x = 0; x < 4; x++ )
> + d[x + y*4] = pix1[x] - pix2[x];
> + pix1 += 16;
> + pix2 += 16;
> +}
> +}
> +
> +extern void
> +wmult (uint16_t *d, uint8_t *restrict pix1, uint8_t *restrict pix2 )
> +{
> +for (int y = 0; y < 4; y++ )
> +{
> + for (int x = 0; x < 4; x++ )
> + d[x + y*4] = pix1[x] * pix2[x];
> + pix1 += 16;
> + pix2 += 16;
> +}
> +}
> +extern void __attribute__((optimize (0)))
> +wmult_no_opt (uint16_t *d, uint8_t *restrict pix1, uint8_t *restrict pix2 )
> +{
> +for (int y = 0; y < 4; y++ )
> +{
> + for (int x = 0; x < 4; x++ )
> + d[x + y*4] = pix1[x] * pix2[x];
> + pix1 += 16;
> + pix2 += 16;
> +}
> +}
> +
> +extern void
> +wlshift (uint16_t *d, uint8_t *restrict pix1)
> +
> +{
> +for (int y = 0; y < 4; y++ )
> +{
> + for (int x = 0; x < 4; x++ )
> + d[x + y*4] = pix1[x] << 8;
> + pix1 += 16;
> +}
> +}
> +extern void __attribute__((optimize (0)))
> +wlshift_no_opt (uint16_t *d, uint8_t *restrict pix1)
> +
> +{
> +for (int y = 0; y < 4; y++ )
> +{
> + for (int x = 0; x < 4; x++ )
> + d[x + y*4] = pix1[x] << 8;
> + pix1 += 16;
> +}
> 

Re: [aarch64][vect] Support V8QI->V8HI WIDEN_ patterns

2021-02-11 Thread Joel Hutton via Gcc-patches
Hi Richard,

I've revised the patch, sorry about sloppy formatting in the previous one.

Full bootstrap/regression tests are still running, but the changes are pretty 
trivial.

Ok for trunk assuming tests finish clean?

>Joel Hutton  writes:
>> @@ -277,6 +277,81 @@ optab_for_tree_code (enum tree_code code, const_tree 
>> type,
>>  }
>>  }
>>
>> +/* Function supportable_half_widening_operation
>> +
>
>I realise existing (related) functions do have a “Function foo” line,
>but it's not generally the GCC style, so I think it would be better
>to leave it out.
Done.

>> +   Check whether an operation represented by code is a 'half' widening 
>> operation
>
>Nit: CODE should be in caps since it's a parameter name.
Done.

>> +   using only half of the full vector width for the inputs. This allows the
>
>I think this could be misread as saying that we only look at half the
>input vector.  Maybe it would be clearer to say something like “in which
>the input vectors contain half as many bits as the output vectors”
>instead of “using…”.
Done.

>> +   output to be stored in a single vector, as opposed to using the full 
>> width
>> +   for input, where half the elements must be processed at a time into
>> +   respective full-width vector outputs i.e. a 'hi'/'lo' pair using codes 
>> such
>> +   as VEC_WIDEN_MINUS_HI/LO. This is handled by widening the inputs and 
>> using
>> +   non-widening instructions. RTL fusing converts these to the widening 
>> hardware
>> +   instructions if supported.
>
>And here the contrast is with cases in which the input vectors have
>the same number of bits as the output vectors, meaning that we need
>a lo/hi pair to generate twice the amount of output data.
Done.

> Might be clearer to make “This is handled…” a new paragraph.
Done.

>> +
>> +   Supported widening operations:
>> +WIDEN_MINUS_EXPR
>> +WIDEN_PLUS_EXPR
>> +WIDEN_MULT_EXPR
>> +WIDEN_LSHIFT_EXPR
>> +
>> +   Output:
>> +   - CODE1 - The non-widened code, which will be used after the inputs are
>> + converted to the wide type.
>> +   */
>
>Formatting nit, */ should be on the previous line, i.e. “.” + two spaces
>+ “*/”.  (Sorry for all the formatting comments.)
Done (sorry for the formatting errors)

>> +bool
>> +supportable_half_widening_operation (enum tree_code code,
>> +tree vectype_out, tree vectype_in,
>> +enum tree_code *code1)
>> +{
>> +  machine_mode m1,m2;
>> +  bool truncp;
>> +  enum tree_code dummy_code;
>> +  optab op;
>> +
>> +  gcc_assert (VECTOR_TYPE_P (vectype_out) && VECTOR_TYPE_P (vectype_in));
>> +
>> +  m1 = TYPE_MODE (vectype_out);
>> +  m2 = TYPE_MODE (vectype_in);
>> +
>> +  if (!VECTOR_MODE_P (m1) || !VECTOR_MODE_P (m2))
>> +return false;
>> +
>> +  if(!known_eq (TYPE_VECTOR_SUBPARTS (vectype_in),
>> +   TYPE_VECTOR_SUBPARTS (vectype_out)))
>> +  return false;
>
>Formatting nit: should be a space between “if” and “(”.  IMO
>maybe_ne is more readable than !known_eq.  The return statement
>should only be indented by 4 columns.
Done.

>> +  if(!(code == WIDEN_MINUS_EXPR
>> +   || code == WIDEN_PLUS_EXPR
>> +   || code == WIDEN_MULT_EXPR
>> +   || code == WIDEN_LSHIFT_EXPR))
>> +return false;
>> +
>> +  switch (code)
>> +  {
>> +case WIDEN_LSHIFT_EXPR:
>> +  *code1 = LSHIFT_EXPR;
>> +  break;
>> +case WIDEN_MINUS_EXPR:
>> +  *code1 = MINUS_EXPR;
>> +  break;
>> +case WIDEN_PLUS_EXPR:
>> +  *code1 = PLUS_EXPR;
>> +  break;
>> +case WIDEN_MULT_EXPR:
>> +  *code1 = MULT_EXPR;
>> +  break;
>> +default:
>> +  gcc_unreachable ();
>
>I think it would be better to return false in the default case and remove
>the “if” above, so that we only have one copy of the list of codes.
Done.

>> +  }
>
>Formatting nit: the { and } lines should be indented two spaces more.
>(The contents of the switch are OK as-is.)
Done.

>> +
>> +  if (!supportable_convert_operation(NOP_EXPR, vectype_out, vectype_in,
>> +  _code))
>
>Nit: should be a space between the function name and ”(”.
Done.

>> +return false;
>> +
>> +  op = optab_for_tree_code (*code1, vectype_out, optab_vector);
>> +  return (optab_handler (op, TYPE_MODE (vectype_out)) != CODE_FOR_nothing);
>> +}
>> +
>>  /* Function supportable_convert_operation
>>
>> Check whether an operation represented by the code CODE is a
>
>> […]
>> @@ -4697,7 +4755,13 @@ vectorizable_conversion (vec_info *vinfo,
>>nunits_in = TYPE_VECTOR_SUBPARTS (vectype_in);
>>nunits_out = TYPE_VECTOR_SUBPARTS (vectype_out);
>>if (known_eq (nunits_out, nunits_in))
>> -modifier = NONE;
>> +if (code == WIDEN_MINUS_EXPR
>> + || code == WIDEN_PLUS_EXPR
>> + || code == WIDEN_LSHIFT_EXPR
>> + || code == WIDEN_MULT_EXPR)
>> + modifier = WIDEN;
>
>The “modifier = WIDEN;” line is indented by a tab (i.e. 8 columns),
>but it should only be indented by 6 spaces.
Done.

>> + 

Re: [RFC] test builtin ratio for loop distribution

2021-02-11 Thread Alexandre Oliva
On Feb 11, 2021, Alexandre Oliva  wrote:

> How does this look?


> for  gcc/ChangeLog

>   PR tree-optimization/94092
>   * builtins.c (try_store_by_multiple_pieces): New.
>   (expand_builtin_memset_args): Use it.  If target_char_cast
>   fails, proceed as for non-constant val.  Pass len's ctz to...
>   * expr.c (clear_storage_hints): ... this.  Try store by
>   multiple pieces after setmem.
>   (clear_storage): Adjust.
>   * expr.h (clear_storage_hints): Likewise.
>   (try_store_by_multiple_pieces): Declare.
>   * tree-loop-distribution.c: Include builtins.h.
>   (generate_memset_builtin): Propagate dst_base alignmen to mem.
>   * tree-ssanames.c (get_nonzero_bits): Zero out low bits of
>   integral types, when a MULT_EXPR INTEGER_CST operand ensures
>   the result will be a multiple of a power of two.

I forgot to mention this passed regstrap on x86_64-linux-gnu, as well as
some cross testing of riscv32-elf.

I've also regstrapped it on x86_64-linux-gnu along with a patch for
testing purposes, that tried try_store_by_multiple_pieces before setmem
in all 3 locations where they are called, which gives me some confidence
that the implementation is reasonably robust.


Is this ok to install?  (if not right now, perhaps in stage1)

-- 
Alexandre Oliva, happy hacker  https://FSFLA.org/blogs/lxo/
   Free Software Activist GNU Toolchain Engineer
Vim, Vi, Voltei pro Emacs -- GNUlius Caesar


Re: [PATCH] reduce sparseset memory requirement

2021-02-11 Thread Richard Biener via Gcc-patches
On Tue, Feb 9, 2021 at 12:38 PM Richard Biener  wrote:
>
> Currently we use HOST_WIDEST_FAST_INT for the sparseset element
> type which maps to a 64bit type on 64bit hosts.  That's excessive
> for the only current sparseset users which are LRA and IRA and
> which store register numbers in it which are unsigned int.  The
> following changes the sparseset element type to unsigned int.
>
> This was changed (in accident?) in 0263463dd114 and the following
> just reverts that bit.
>
> Bootstrap / regtest pending on x86_64-unknown-linux-gnu, OK?

I've now verified that all callers of sparseset_alloc call it with
an int or unsigned int argument and thus I've pushed the change.

Richard.

> Thanks,
> Richard.
>
> 2021-02-09  Richard Biener  
>
> * sparseset.h (SPARSESET_ELT_BITS): Remove.
> (SPARSESET_ELT_TYPE): Use unsigned int.
> * fwprop.c: Do not include sparseset.h.
> ---
>  gcc/fwprop.c| 1 -
>  gcc/sparseset.h | 3 +--
>  2 files changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/gcc/fwprop.c b/gcc/fwprop.c
> index 123cc228630..4b8a554e823 100644
> --- a/gcc/fwprop.c
> +++ b/gcc/fwprop.c
> @@ -28,7 +28,6 @@ along with GCC; see the file COPYING3.  If not see
>  #include "df.h"
>  #include "rtl-ssa.h"
>
> -#include "sparseset.h"
>  #include "predict.h"
>  #include "cfgrtl.h"
>  #include "cfgcleanup.h"
> diff --git a/gcc/sparseset.h b/gcc/sparseset.h
> index c72b4fe8aed..536d35c51af 100644
> --- a/gcc/sparseset.h
> +++ b/gcc/sparseset.h
> @@ -83,8 +83,7 @@ along with GCC; see the file COPYING3.  If not see
>
>  /* Data Structure used for the SparseSet representation.  */
>
> -#define SPARSESET_ELT_BITS ((unsigned) HOST_BITS_PER_WIDEST_FAST_INT)
> -#define SPARSESET_ELT_TYPE unsigned HOST_WIDEST_FAST_INT
> +#define SPARSESET_ELT_TYPE unsigned int
>
>  typedef struct sparseset_def
>  {
> --
> 2.26.2


Re: [PATCH] tree-optimization/38474 - fix store-merging compile-time regression

2021-02-11 Thread Jakub Jelinek via Gcc-patches
On Thu, Feb 11, 2021 at 11:55:30AM +0100, Richard Biener wrote:
> 2021-02-11  Richard Biener  
> 
>   PR tree-optimization/38474
>   * params.opt (-param=max-store-chains-to-track=): New param.
>   (-param=max-stores-to-track=): Likewise.
>   * doc/invoke.texi (max-store-chains-to-track): Document.
>   (max-stores-to-track): Likewise.
>   * gimple-ssa-store-merging.c (pass_store_merging::m_n_chains):
>   New.
>   (pass_store_merging::m_n_stores): Likewise.
>   (pass_store_merging::terminate_and_process_chain): Update
>   m_n_stores and m_n_chains.
>   (pass_store_merging::process_store): Likewise.   Terminate
>   oldest chains if the number of stores or chains get too large.
>   (imm_store_chain_info::terminate_and_process_chain): Dump
>   chain length.

LGTM.

Jakub



[PATCH] tree-optimization/38474 - fix store-merging compile-time regression

2021-02-11 Thread Richard Biener
The following puts a limit on the number of alias tests we do in
terminate_all_aliasing_chains which is quadratic in the number of
overall stores currentrly tracked.  There is already a limit in
place on the maximum number of stores in a single chain so the
following adds a limit on the number of chains tracked.  The
worst number of overall stores tracked from the defaults (64 and 64)
is then 4096 which when imposed as the sole limit for the testcase
still causes

 store merging  :  71.65 ( 56%)

because the testcase is somewhat degenerate with most chains
consisting only of a single store (and 25% of exactly three stores).
The single stores are all CLOBBERs at the point variables go out of
scope.  Note unpatched we have

 store merging  : 308.60 ( 84%)

Limiting the number of chains to 64 brings this down to

 store merging  :   1.52 (  3%)

which is more reasonable.  There are ideas on how to make
terminate_all_aliasing_chains cheaper but for this degenerate case
they would not have any effect so I'll defer for GCC 12 for those.

I'm not sure we want to have both --params, just keeping the
more to-the-point max-stores-to-track works but makes the
degenerate case above slower.
I made the current default 1024 (a million alias disambiguations)
which for the testcasse (without limiting chains) results in 25%
compile time and 20s putting it in the same ballpart as the next
offender (which is PTA).

This is a regression on trunk and the GCC 10 branch btw.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

Any opinions?

Thanks,
Richard.

2021-02-11  Richard Biener  

PR tree-optimization/38474
* params.opt (-param=max-store-chains-to-track=): New param.
(-param=max-stores-to-track=): Likewise.
* doc/invoke.texi (max-store-chains-to-track): Document.
(max-stores-to-track): Likewise.
* gimple-ssa-store-merging.c (pass_store_merging::m_n_chains):
New.
(pass_store_merging::m_n_stores): Likewise.
(pass_store_merging::terminate_and_process_chain): Update
m_n_stores and m_n_chains.
(pass_store_merging::process_store): Likewise.   Terminate
oldest chains if the number of stores or chains get too large.
(imm_store_chain_info::terminate_and_process_chain): Dump
chain length.
---
 gcc/doc/invoke.texi|  8 
 gcc/gimple-ssa-store-merging.c | 88 ++
 gcc/params.opt |  8 
 3 files changed, 83 insertions(+), 21 deletions(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 3751bc3ac7c..e8baa545eee 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -13308,6 +13308,14 @@ do so.
 The maximum number of stores to attempt to merge into wider stores in the store
 merging pass.
 
+@item max-store-chains-to-track
+The maximum number of store chains to track at the same time in the attempt
+to merge them into wider stores in the store merging pass.
+
+@item max-stores-to-track
+The maximum number of stores to track at the same time in the attemt to
+to merge them into wider stores in the store merging pass.
+
 @item max-unrolled-insns
 The maximum number of instructions that a loop may have to be unrolled.
 If a loop is unrolled, this parameter also determines how many times
diff --git a/gcc/gimple-ssa-store-merging.c b/gcc/gimple-ssa-store-merging.c
index f0f4a068de5..b4c5e8eb9a8 100644
--- a/gcc/gimple-ssa-store-merging.c
+++ b/gcc/gimple-ssa-store-merging.c
@@ -2324,7 +2324,8 @@ class pass_store_merging : public gimple_opt_pass
 {
 public:
   pass_store_merging (gcc::context *ctxt)
-: gimple_opt_pass (pass_data_tree_store_merging, ctxt), m_stores_head ()
+: gimple_opt_pass (pass_data_tree_store_merging, ctxt), m_stores_head (),
+  m_n_chains (0), m_n_stores (0)
   {
   }
 
@@ -2356,6 +2357,11 @@ private:
  decisions when going out of SSA).  */
   imm_store_chain_info *m_stores_head;
 
+  /* The number of store chains currently tracked.  */
+  unsigned m_n_chains;
+  /* The number of stores currently tracked.  */
+  unsigned m_n_stores;
+
   bool process_store (gimple *);
   bool terminate_and_process_chain (imm_store_chain_info *);
   bool terminate_all_aliasing_chains (imm_store_chain_info **, gimple *);
@@ -2435,6 +2441,8 @@ pass_store_merging::terminate_all_aliasing_chains 
(imm_store_chain_info
 bool
 pass_store_merging::terminate_and_process_chain (imm_store_chain_info 
*chain_info)
 {
+  m_n_stores -= chain_info->m_store_info.length ();
+  m_n_chains--;
   bool ret = chain_info->terminate_and_process_chain ();
   m_stores.remove (chain_info->base_addr);
   delete chain_info;
@@ -4711,6 +4719,9 @@ imm_store_chain_info::output_merged_stores ()
 bool
 imm_store_chain_info::terminate_and_process_chain ()
 {
+  if (dump_file && (dump_flags & TDF_DETAILS))
+fprintf (dump_file, "Terminating chain with %u stores\n",
+m_store_info.length ());
 

Re: [RFC] test builtin ratio for loop distribution

2021-02-11 Thread Alexandre Oliva
On Feb  4, 2021, Alexandre Oliva  wrote:

> On Feb  4, 2021, Richard Biener  wrote:
>>> >  b) if expansion would use BY_PIECES then expand to an unrolled loop
>>> 
>>> Why would that be better than keeping the constant-length memset call,
>>> that would be turned into an unrolled loop during expand?

>> Well, because of the possibly lost ctz and alignment info.

> Funny you should mention that.  I got started with the expand-time
> expansion yesterday, and found out that we're not using the alignment
> information that is available.  Though the pointer is known to point to
> an aligned object, we are going for 8-bit alignment for some reason.

> The strategy I used there was to first check whether by_pieces would
> expand inline a constant length near the max known length, then loop
> over the bits in the variable length, expand in each iteration a
> constant-length store-by-pieces for the fixed length corresponding to
> that bit, and a test comparing the variable length with the fixed length
> guarding the expansion of the store-by-pieces.  We may get larger code
> this way (no loops), but only O(log(len)) compares.

> I've also fixed some bugs in the ldist expander, so now it bootstraps,
> but with a few regressions in the testsuite, that I'm yet to look into.

A few more fixes later, this seems to work.

It encompasses the patch to recover tree_ctz from a mult_expr def, it
adds code to set up the alignment data for the ldist-generated memset
dest, and then it expands memset as described above if length is not
constant, setmem is not available, but the by-pieces machinery would
still be used for nearby constants.

How does this look?


[PR94092] introduce try store by multiple pieces

From: Alexandre Oliva 

The ldist pass turns even very short loops into memset calls.  E.g.,
the TFmode emulation calls end with a loop of up to 3 iterations, to
zero out trailing words, and the loop distribution pass turns them
into calls of the memset builtin.

Though short constant-length memsets are usually dealt with
efficiently, for non-constant-length ones, the options are setmemM, or
a function calls.

RISC-V doesn't have any setmemM pattern, so the loops above end up
"optimized" into memset calls, incurring not only the overhead of an
explicit call, but also discarding the information the compiler has
about the alignment of the destination, and that the length is a
multiple of the word alignment.

This patch handles variable lengths with multiple conditional
power-of-2-constant-sized stores-by-pieces, so as to reduce the
overhead of length compares.

It also propagates the alignment of the memset dest, introduced by
loop-distribution, to the SSA pointer used to hold it, so that it is
not discarded right away, and may be recovered by the memset builtin
expander.

Finally, it computes the ctz of the length, and uses it to omit blocks
for stores of small byte counts when we're storing whole words.
tree_ctz is improved so as to look at the length DEF stmt, and zero
out the least-significant bits if it's a multiply by a power of two.


for  gcc/ChangeLog

PR tree-optimization/94092
* builtins.c (try_store_by_multiple_pieces): New.
(expand_builtin_memset_args): Use it.  If target_char_cast
fails, proceed as for non-constant val.  Pass len's ctz to...
* expr.c (clear_storage_hints): ... this.  Try store by
multiple pieces after setmem.
(clear_storage): Adjust.
* expr.h (clear_storage_hints): Likewise.
(try_store_by_multiple_pieces): Declare.
* tree-loop-distribution.c: Include builtins.h.
(generate_memset_builtin): Propagate dst_base alignmen to mem.
* tree-ssanames.c (get_nonzero_bits): Zero out low bits of
integral types, when a MULT_EXPR INTEGER_CST operand ensures
the result will be a multiple of a power of two.
---
 gcc/builtins.c   |  113 +++---
 gcc/expr.c   |9 +++
 gcc/expr.h   |   12 
 gcc/tree-loop-distribution.c |9 +++
 gcc/tree-ssanames.c  |   23 -
 5 files changed, 154 insertions(+), 12 deletions(-)

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 0aed008687ccc..44f3af92536a5 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -6600,6 +6600,97 @@ expand_builtin_memset (tree exp, rtx target, 
machine_mode mode)
   return expand_builtin_memset_args (dest, val, len, target, mode, exp);
 }
 
+/* Try to store VAL (or, if NULL_RTX, VALC) in LEN bytes starting at TO.
+   Return TRUE if successful, FALSE otherwise.  TO is assumed to be
+   aligned at an ALIGN boundary.  LEN is assumed to be a multiple of
+   1<= ctz_len; i--)
+{
+  unsigned HOST_WIDE_INT blksize = HOST_WIDE_INT_1U << i;
+  rtx_code_label *label = NULL;
+
+  if (i <= shr_bits)
+   {
+ label = gen_label_rtx ();
+ emit_cmp_and_jump_insns (rem, GEN_INT (blksize), LT, NULL,
+  

Re: [RFC] test builtin ratio for loop distribution

2021-02-11 Thread Alexandre Oliva
On Feb  4, 2021, Jim Wilson  wrote:

> FYI we have a bug report for this for a coremark regression which sounds
> like the same problem.
>  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94092

Indeed, thanks!

-- 
Alexandre Oliva, happy hacker  https://FSFLA.org/blogs/lxo/
   Free Software Activist GNU Toolchain Engineer
Vim, Vi, Voltei pro Emacs -- GNUlius Caesar


[PATCH] c++: Fix zero initialization of flexible array members [PR99033]

2021-02-11 Thread Jakub Jelinek via Gcc-patches
Hi!

array_type_nelts returns error_mark_node for type of flexible array members
and build_zero_init_1 was placing an error_mark_node into the CONSTRUCTOR,
on which e.g. varasm ICEs.  I think there is nothing erroneous on zero
initialization of flexible array members though, such arrays should simply
get no elements, like they do if such classes are constructed (everything
except when some larger initializer comes from an explicit initializer).

So, this patch handles [] arrays in zero initialization like [0] arrays
and fixes handling of the [0] arrays - the
tree_int_cst_equal (max_index, integer_minus_one_node) check
didn't do what it thought it would do, max_index is typically unsigned
integer (sizetype) and so it is never equal to a -1.

What the patch doesn't do and maybe would be desirable is if it returns
error_mark_node for other reasons let the recursive callers not stick that
into CONSTRUCTOR but return error_mark_node instead.  But I don't have a
testcase where that would be needed right now.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-02-11  Jakub Jelinek  

PR c++/99033
* init.c (build_zero_init_1): Handle zero initialiation of
flexible array members like initialization of [0] arrays.
Use integer_all_onesp instead of comparison to integer_minus_one_node
and integer_zerop instead of comparison against size_zero_node.
Formatting fixes.

* g++.dg/ext/flexary38.C: New test.

--- gcc/cp/init.c.jj2021-01-27 10:05:35.273942499 +0100
+++ gcc/cp/init.c   2021-02-10 15:03:02.668042115 +0100
@@ -247,9 +247,12 @@ build_zero_init_1 (tree type, tree nelts
 
   /* Iterate over the array elements, building initializations.  */
   if (nelts)
-   max_index = fold_build2_loc (input_location,
-MINUS_EXPR, TREE_TYPE (nelts),
-nelts, integer_one_node);
+   max_index = fold_build2_loc (input_location, MINUS_EXPR,
+TREE_TYPE (nelts), nelts,
+build_one_cst (TREE_TYPE (nelts)));
+  /* Treat flexible array members like [0] arrays.  */
+  else if (TYPE_DOMAIN (type) == NULL_TREE)
+   max_index = build_all_ones_cst (sizetype);
   else
max_index = array_type_nelts (type);
 
@@ -261,20 +264,19 @@ build_zero_init_1 (tree type, tree nelts
 
   /* A zero-sized array, which is accepted as an extension, will
 have an upper bound of -1.  */
-  if (!tree_int_cst_equal (max_index, integer_minus_one_node))
+  if (!integer_all_onesp (max_index))
{
  constructor_elt ce;
 
  /* If this is a one element array, we just use a regular init.  */
- if (tree_int_cst_equal (size_zero_node, max_index))
+ if (integer_zerop (max_index))
ce.index = size_zero_node;
  else
ce.index = build2 (RANGE_EXPR, sizetype, size_zero_node,
-   max_index);
+  max_index);
 
- ce.value = build_zero_init_1 (TREE_TYPE (type),
-/*nelts=*/NULL_TREE,
-static_storage_p, NULL_TREE);
+ ce.value = build_zero_init_1 (TREE_TYPE (type), /*nelts=*/NULL_TREE,
+   static_storage_p, NULL_TREE);
  if (ce.value)
{
  vec_alloc (v, 1);
--- gcc/testsuite/g++.dg/ext/flexary38.C.jj 2021-02-10 15:16:40.700102841 
+0100
+++ gcc/testsuite/g++.dg/ext/flexary38.C2021-02-10 15:16:22.243304533 
+0100
@@ -0,0 +1,18 @@
+// PR c++/99033
+// { dg-do compile }
+// { dg-options "" }
+
+struct T { int t; };
+struct S { char c; int T::*b[]; } a;
+struct U { char c; int T::*b[0]; } b;
+struct V { char c; int T::*b[1]; } c;
+struct W { char c; int T::*b[2]; } d;
+
+void
+foo ()
+{
+  a.c = 1;
+  b.c = 2;
+  c.c = 3;
+  d.c = 4;
+}

Jakub



Re: [PATCH] adjust "partly out of bounds" warning (PR 98503)

2021-02-11 Thread Richard Biener via Gcc-patches
On Wed, Feb 10, 2021 at 7:03 PM Martin Sebor  wrote:
>
> On 2/10/21 3:39 AM, Richard Biener wrote:
> > On Tue, Feb 9, 2021 at 4:37 PM Martin Sebor  wrote:
> >>
> >> On 2/9/21 12:41 AM, Richard Biener wrote:
> >>> On Tue, Feb 9, 2021 at 1:04 AM Martin Sebor via Gcc-patches
> >>>  wrote:
> 
>  On 2/8/21 12:09 PM, Jeff Law wrote:
> >
> >
> > On 2/3/21 3:45 PM, Martin Sebor wrote:
> >> On 2/3/21 2:57 PM, Jeff Law wrote:
> >>>
> >>>
> >>> On 1/28/21 4:03 PM, Martin Sebor wrote:
>  The GCC 11 -Warray-bounds enhancement to diagnose accesses whose
>  leading offset is in bounds but whose trailing offset is not has
>  been causing some confusion.  When the warning is issued for
>  an access to an in-bounds member via a pointer to a struct that's
>  larger than the pointed-to object, phrasing this strictly invalid
>  access in terms of array subscripts can be misleading, especially
>  when the source code doesn't involve any arrays or indexing.
> 
>  Since the problem boils down to an aliasing violation much more
>  so than an actual out-of-bounds access, the attached patch adjusts
>  the code to issue a -Wstrict-aliasing warning in these cases instead
>  of -Warray-bounds.  In addition, since the aliasing assumptions in
>  GCC can be disabled by -fno-strict-aliasing, the patch also makes
>  these instances of the warning conditional on -fstrict-aliasing
>  being in effect.
> 
>  Martin
> 
>  gcc-98503.diff
> 
>  PR middle-end/98503 -Warray-bounds when -Wstrict-aliasing would be
>  more appropriate
> 
>  gcc/ChangeLog:
> 
> PR middle-end/98503
> * gimple-array-bounds.cc 
>  (array_bounds_checker::check_mem_ref):
> Issue -Wstrict-aliasing for a subset of violations.
> (array_bounds_checker::check_array_bounds):  Set new member.
> * gimple-array-bounds.h (array_bounds_checker::cref_of_mref): 
>  New
> data member.
> 
>  gcc/testsuite/ChangeLog:
> 
> PR middle-end/98503
> * g++.dg/warn/Warray-bounds-10.C: Adjust text of expected 
>  warnings.
> * g++.dg/warn/Warray-bounds-11.C: Same.
> * g++.dg/warn/Warray-bounds-12.C: Same.
> * g++.dg/warn/Warray-bounds-13.C: Same.
> * gcc.dg/Warray-bounds-63.c: Avoid -Wstrict-aliasing.  Adjust 
>  text
> of expected warnings.
> * gcc.dg/Warray-bounds-66.c: Adjust text of expected warnings.
> * gcc.dg/Wstrict-aliasing-2.c: New test.
> * gcc.dg/Wstrict-aliasing-3.c: New test.
> >>> What I don't like here is the strict-aliasing warnings inside the file
> >>> and analysis phase for array bounds checking.
> >>>
> >>> ISTM that catching this at cast time would be better.  So perhaps in
> >>> build_c_cast and the C++ equivalent?
> >>>
> >>> It would mean we're warning at the cast site rather than the
> >>> dereference, but that may ultimately be better for the warning anyway.
> >>> I'm not sure.
> >>
> >> I had actually experimented with a this (in the middle end, and only
> >> for accesses) but even that caused so many warnings that I abandoned
> >> it, at least for now.  -Warray-bounds has the benefit of flow analysis
> >> (and dead code elimination).  In the front end it would have neither
> >> and be both excessively noisy and ineffective at the same time.  For
> >> example:
> > I think we agree that this really is an aliasing issue and I would go
> > further and claim that it has nothing to do with array bounds checking.
> > Therefore warning for it within gimple-array-bounds just seems wrong.
> >
> > I realize that you like having DCE run and the ability to look at
> > offsets and such, but it really feels like the wrong place to do this.
> > Aliasing issues are also more of a front-end thing since the language
> > defines what is and what is not valid aliasing -- one might even argue
> > that any aliasing warning needs to be identified by the front-ends
> > exclusively (though possibly pruned away later).
> 
>  The aliasing restrictions are on accesses, which are [defined in
>  C as] execution-time actions on objects.  Analyzing execution-time
>  actions unavoidably depends on flow analysis which the front ends
>  have only very limited support for (simple expressions only).
>  I gave one example showing how the current -Wstrict-aliasing in
>  the front end is ineffective against all but the most simplistic
>  bugs, but there are many more.  For instance:
> 
>   int i;
>   void *p = //