Re: [PATCH 2/2] vec: Add array_slice::bsearch

2022-08-26 Thread Richard Biener via Gcc-patches



> On 26.08.2022 at 23:45, Martin Jambor wrote:
> 
> Hi,
> 
>> On Fri, Aug 26 2022, Richard Sandiford wrote:
>> Richard Biener  writes:
>>>> On 26.08.2022 at 18:40, Martin Jambor wrote:
>>>>
>>>> Hi,
>>>>
>>>> This adds a method to binary search in a sorted array_slice.
>>>>
>>>> The implementation is a direct copy of vec::bsearch.  Moreover, to only
>>>> copy it once and not twice, I used const_cast in the non-const
>>>> variants to be able to use the const variants.  I hope that is
>>>> acceptable abuse of const_cast but I'll be happy to change that if
>>>> not.
>>>>
>>>> Bootstrapped and tested along code that actually uses it on
>>>> x86_64-linux.  OK for trunk?
>>> 
>>> Can you avoid the copying by using array slice bsearch from the vec<> 
>>> bsearch?
>> 
>> IMO it would be better to transition to using <algorithm> routines
>> for this kind of thing (for new code).  In this case that would be
>> std::lower_bound.
>> 
>> Using std::lower_bound is more convenient because it avoids the void *
>> thing (you can use lambdas to capture any number of variables instead)
>> and because it works on subranges, not just whole ranges.
>> 
> 
> OK, I can use std::lower_bound with simple lambdas too.  The semantics
> of returning the first element matching a criterion actually allows me to use it
> one more time.

Can you try to compare generated code?

> Martin


Re: [PATCH][DOCS] Mention removed ports in GCC 13.

2022-08-26 Thread Gerald Pfeifer
On Fri, 26 Aug 2022, Martin Liška wrote:
> Ready for master?

Looks fine. Thank you, Martin!

Gerald


[PATCH] driver: Extend 'getenv' function to allow default value

2022-08-26 Thread Keith Packard via Gcc-patches
Right now, a missing environment variable provided to the 'getenv'
function in a .specs file causes a fatal error. That makes writing a
spec file that uses the GCC_EXEC_PREFIX value difficult as that
variable is only set when the driver has been relocated, but not when
run from the defined location. This makes building a relocatable
toolchain difficult to extend to other ancillary pieces which use specs
files to locate header and library files adjacent to the toolchain.

This patch adds an optional third argument to the getenv function that
can be used to fall back to the standard installation path when the
driver hasn't set GCC_EXEC_PREFIX in the environment.

For example, if an alternate C library is installed in
${prefix}/extra, then this change allows the specs file to locate that
relative to the gcc directory, if gcc is located in the original
installation directory (which would leave GCC_EXEC_PREFIX unset), or
if the gcc tree has been moved to a different location (where gcc
would set GCC_EXEC_PREFIX itself):

*cpp:
-isystem %:getenv(GCC_EXEC_PREFIX ../../extra/include ${prefix}/extra/include)

I considered changing the behavior of either the %R sequence so that
it had a defined behavior when there was no sysroot defined, or making
the driver always set the GCC_EXEC_PREFIX environment variable and
decided that the approach of adding functionality to getenv where it
was previously invalid would cause the least potential for affecting
existing usage.

Signed-off-by: Keith Packard 
---
 gcc/doc/invoke.texi | 18 +++---
 gcc/gcc.cc  | 10 +-
 2 files changed, 20 insertions(+), 8 deletions(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 6131bfa7acf..669c28a609a 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -34223,17 +34223,21 @@ The following built-in spec functions are provided:
 
 @table @code
 @item @code{getenv}
-The @code{getenv} spec function takes two arguments: an environment
-variable name and a string.  If the environment variable is not
-defined, a fatal error is issued.  Otherwise, the return value is the
-value of the environment variable concatenated with the string.  For
-example, if @env{TOPDIR} is defined as @file{/path/to/top}, then:
+
+The @code{getenv} spec function takes two or three arguments: an
+environment variable name, a string and an optional default value.  If
+the environment variable is not defined and a default value is
+provided, that is used as the return value; otherwise a fatal error is
+issued.  Otherwise, the return value is the value of the environment
+variable concatenated with the string.  For example, if @env{TOPDIR}
+is defined as @file{/path/to/top}, then:
 
 @smallexample
-%:getenv(TOPDIR /include)
+%:getenv(TOPDIR /include /path/to/default/include)
 @end smallexample
 
-expands to @file{/path/to/top/include}.
+expands to @file{/path/to/top/include}. If @env{TOPDIR} is not
+defined, then this expands to @file{/path/to/default/include}.
 
 @item @code{if-exists}
 The @code{if-exists} spec function takes one argument, an absolute
diff --git a/gcc/gcc.cc b/gcc/gcc.cc
index b6d562a92f0..16295702db7 100644
--- a/gcc/gcc.cc
+++ b/gcc/gcc.cc
@@ -10169,12 +10169,20 @@ getenv_spec_function (int argc, const char **argv)
   char *ptr;
   size_t len;
 
-  if (argc != 2)
+  if (argc != 2 && argc != 3)
 return NULL;
 
   varname = argv[0];
   value = env.get (varname);
 
+  if (!value && argc == 3)
+{
+  value = argv[2];
+  result = XNEWVAR(char, strlen(value) + 1);
+  strcpy(result, value);
+  return result;
+}
+
   /* If the variable isn't defined and this is allowed, craft our expected
  return value.  Assume variable names used in specs strings don't contain
  any active spec character so don't need escaping.  */
-- 
2.36.1
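
For readers who want to experiment with the rule outside the driver, here
is a minimal standalone sketch of the semantics described in the
invoke.texi hunk above.  The helper name getenv_or_default and its
signature are hypothetical; this is not the driver's
getenv_spec_function and it does not touch spec parsing.  It only models
the three outcomes: value plus suffix when the variable is set, the
default returned verbatim when it is not, and an error otherwise.

```cpp
#include <cassert>
#include <cstdlib>
#include <optional>
#include <string>

// Hypothetical model of %:getenv(NAME SUFFIX [DEFAULT]); an empty
// optional stands in for the fatal error the driver would issue.
std::optional<std::string>
getenv_or_default (const char *name, const std::string &suffix,
                   const std::optional<std::string> &fallback = std::nullopt)
{
  assert (name != nullptr);

  // Variable is set: return its value with the suffix appended.
  if (const char *value = std::getenv (name))
    return std::string (value) + suffix;

  // Variable is unset: the default is used as-is, without the suffix.
  if (fallback)
    return *fallback;

  // Unset and no default: the two-argument form still fails.
  return std::nullopt;
}
```

Note that in this model, as in the documentation text, the default
replaces the whole result rather than just the variable's value.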



[PATCH] rs6000: Don't ICE when we disassemble an MMA variable [PR101322]

2022-08-26 Thread Peter Bergner via Gcc-patches
When we expand an MMA disassemble built-in with C++ using a pointer that
is cast to a valid MMA type, the type isn't passed down to the expand
is casted to a valid MMA type, the type isn't passed down to the expand
machinery and we end up using the base type of the pointer which leads to
an ICE.  This patch enforces we always use the correct MMA type regardless
of the pointer type being used.

This passed bootstrap and regtesting on powerpc64le-linux with no regressions.
Ok for trunk and backports after some burn-in time?

Peter

gcc/
PR target/101322
* config/rs6000/rs6000-builtin.cc (rs6000_gimple_fold_mma_builtin):
Enforce the use of a valid MMA pointer type.

gcc/testsuite/
PR target/101322
* g++.target/powerpc/pr101322.C: New test.

diff --git a/gcc/config/rs6000/rs6000-builtin.cc b/gcc/config/rs6000/rs6000-builtin.cc
index 12afa86854c..e796e74f072 100644
--- a/gcc/config/rs6000/rs6000-builtin.cc
+++ b/gcc/config/rs6000/rs6000-builtin.cc
@@ -1085,7 +1085,12 @@ rs6000_gimple_fold_mma_builtin (gimple_stmt_iterator *gsi,
   unsigned nvec = (fncode == RS6000_BIF_DISASSEMBLE_ACC) ? 4 : 2;
   tree dst_ptr = gimple_call_arg (stmt, 0);
   tree src_ptr = gimple_call_arg (stmt, 1);
-  tree src_type = TREE_TYPE (src_ptr);
+  tree src_type = (fncode == RS6000_BIF_DISASSEMBLE_ACC)
+ ? build_pointer_type (vector_quad_type_node)
+ : build_pointer_type (vector_pair_type_node);
+  if (TREE_TYPE (TREE_TYPE (src_ptr)) != src_type)
+   src_ptr = build1 (VIEW_CONVERT_EXPR, src_type, src_ptr);
+
   tree src = create_tmp_reg_or_ssa_name (TREE_TYPE (src_type));
   gimplify_assign (src, build_simple_mem_ref (src_ptr), &new_seq);
 
diff --git a/gcc/testsuite/g++.target/powerpc/pr101322.C b/gcc/testsuite/g++.target/powerpc/pr101322.C
new file mode 100644
index 000..59e71e8eb89
--- /dev/null
+++ b/gcc/testsuite/g++.target/powerpc/pr101322.C
@@ -0,0 +1,17 @@
+/* PR target/101322 */
+/* { dg-options "-O2 -mdejagnu-cpu=power10" } */
+/* { dg-require-effective-target power10_ok } */
+
+/* Verify we don't ICE on the following test cases.  */
+
+void
+foo (char *resp, char *vpp)
+{
+  __builtin_vsx_disassemble_pair (resp, (__vector_pair *) vpp);
+}
+
+void
+bar (char *resp, char *vpp)
+{
+  __builtin_mma_disassemble_acc (resp, (__vector_quad *)vpp);
+}


Re: [EXTERNAL] [PATCH] contrib: modernize gen_autofdo_event.py

2022-08-26 Thread Xi Ruoyao via Gcc-patches
On Fri, 2022-08-26 at 22:20 +, Eugene Rozenfeld wrote:
> The patch is approved.

Thanks, pushed r13-2234.
-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


[PATCH] c++: Fix C++11 attribute propagation [PR106712]

2022-08-26 Thread Marek Polacek via Gcc-patches
When we have

  [[noreturn]] int fn1 [[nodiscard]](), fn2();

"noreturn" should apply to both fn1 and fn2 but "nodiscard" only to fn1:
[dcl.pre]/3: "The attribute-specifier-seq appertains to each of
the entities declared by the declarators of the init-declarator-list."
[dcl.spec.general]: "The attribute-specifier-seq affects the type
only for the declaration it appears in, not other declarations involving
the same type."

As Ed Catmur correctly analyzed, this is because, for the test above,
we call start_decl with prefix_attributes=noreturn, but this line:

  attributes = attr_chainon (attributes, prefix_attributes);

results in attributes == prefix_attributes, because chainon sees
that attributes is null so it just returns prefix_attributes.  Then
in grokdeclarator we reach

  *attrlist = attr_chainon (*attrlist, declarator->std_attributes);

which modifies prefix_attributes so now it's "noreturn, nodiscard"
and so fn2 is wrongly marked nodiscard as well.  Fixed by copying
prefix_attributes.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

PR c++/106712

gcc/cp/ChangeLog:

* decl.cc (start_decl): Copy prefix_attributes.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/gen-attrs-77.C: New test.
---
 gcc/cp/decl.cc|  3 ++-
 gcc/testsuite/g++.dg/cpp0x/gen-attrs-77.C | 17 +
 2 files changed, 19 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/gen-attrs-77.C

diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index d46a347a6c7..9fa80b926d5 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -5566,7 +5566,8 @@ start_decl (const cp_declarator *declarator,
   *pushed_scope_p = NULL_TREE;
 
   if (prefix_attributes != error_mark_node)
-attributes = attr_chainon (attributes, prefix_attributes);
+/* Don't let grokdeclarator modify prefix_attributes.  */
+attributes = attr_chainon (attributes, copy_list (prefix_attributes));
 
   decl = grokdeclarator (declarator, declspecs, NORMAL, initialized,
 &attributes);
diff --git a/gcc/testsuite/g++.dg/cpp0x/gen-attrs-77.C b/gcc/testsuite/g++.dg/cpp0x/gen-attrs-77.C
new file mode 100644
index 000..2c41c62f33b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/gen-attrs-77.C
@@ -0,0 +1,17 @@
+// PR c++/106712
+// { dg-do compile { target c++11 } }
+
+[[noreturn]] int f1 [[nodiscard]](), f2 ();
+[[nodiscard]] int f3 (), f4 ();
+int f5 [[nodiscard]](), f6 ();
+
+int
+main ()
+{
+  f1 (); // { dg-warning "ignoring" }
+  f2 ();
+  f3 (); // { dg-warning "ignoring" }
+  f4 (); // { dg-warning "ignoring" }
+  f5 (); // { dg-warning "ignoring" }
+  f6 ();
+}

base-commit: 390f94eee1ae694901f896ac45bfb148f8126baa
-- 
2.37.2
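
The aliasing problem described in the commit message can be reproduced
with a toy linked list.  The sketch below is only an illustration: Attr,
Arena, chainon, copy_list and declare are simplified stand-ins written
for this note, not GCC's tree, attr_chainon or start_decl.  It shows
that appending to the result of chainon (NULL, prefix) modifies the
prefix chain itself, and that copying the prefix first avoids this.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Toy stand-in for an attribute chain node; not GCC's tree type.
struct Attr
{
  std::string name;
  Attr *next = nullptr;
};

// Owns every node so the raw pointers used below stay valid.
struct Arena
{
  std::vector<std::unique_ptr<Attr>> nodes;

  Attr *make (const std::string &name)
  {
    nodes.push_back (std::make_unique<Attr> (Attr{name, nullptr}));
    return nodes.back ().get ();
  }

  // Copy a whole chain node by node, playing the role of copy_list.
  Attr *copy_list (const Attr *list)
  {
    Attr *head = nullptr, *tail = nullptr;
    for (; list; list = list->next)
      {
        Attr *n = make (list->name);
        if (tail)
          tail->next = n;
        else
          head = n;
        tail = n;
      }
    return head;
  }
};

// Destructively append B to A; returns B itself when A is empty.
Attr *chainon (Attr *a, Attr *b)
{
  if (!a)
    return b;
  Attr *last = a;
  while (last->next)
    last = last->next;
  last->next = b;
  return a;
}

std::size_t length (const Attr *list)
{
  std::size_t n = 0;
  for (; list; list = list->next)
    ++n;
  return n;
}

// Process one declarator: merge the shared prefix attributes, then the
// declarator's own ones.  Returns the length of PREFIX afterwards.
std::size_t declare (Arena &arena, Attr *prefix, Attr *own, bool copy_prefix)
{
  Attr *attributes = nullptr;
  attributes = chainon (attributes,
                        copy_prefix ? arena.copy_list (prefix) : prefix);
  attributes = chainon (attributes, own);
  assert (attributes != nullptr);
  return length (prefix);
}
```

With copy_prefix false, the prefix chain grows by the first declarator's
own attribute, so a later declarator sharing that prefix would see it
too; with copy_prefix true the prefix keeps its original length.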



RE: [EXTERNAL] [PATCH] contrib: modernize gen_autofdo_event.py

2022-08-26 Thread Eugene Rozenfeld via Gcc-patches
The patch is approved.

Eugene

-----Original Message-----
From: Andi Kleen  
Sent: Friday, August 05, 2022 11:29 PM
To: Eugene Rozenfeld ; Xi Ruoyao 
; gcc-patches@gcc.gnu.org
Subject: Re: [EXTERNAL] [PATCH] contrib: modernize gen_autofdo_event.py


On 8/6/2022 1:07 AM, Eugene Rozenfeld wrote:
> The changes look good to me. Also adding Andi, the author of the script.


Looks all good to me too.


-Andi




Re: [PATCH 2/2] vec: Add array_slice::bsearch

2022-08-26 Thread Martin Jambor
Hi,

On Fri, Aug 26 2022, Richard Sandiford wrote:
> Richard Biener  writes:
>>> On 26.08.2022 at 18:40, Martin Jambor wrote:
>>> 
>>> Hi,
>>> 
>>> This adds a method to binary search in a sorted array_slice.
>>> 
>>> The implementation is a direct copy of vec::bsearch.  Moreover, to only
>>> copy it once and not twice, I used const_cast in the non-const
>>> variants to be able to use the const variants.  I hope that is
>>> acceptable abuse of const_cast but I'll be happy to change that if
>>> not.
>>> 
>>> Bootstrapped and tested along code that actually uses it on
>>> x86_64-linux.  OK for trunk?
>>
>> Can you avoid the copying by using array slice bsearch from the vec<> 
>> bsearch?
>
> IMO it would be better to transition to using <algorithm> routines
> for this kind of thing (for new code).  In this case that would be
> std::lower_bound.
>
> Using std::lower_bound is more convenient because it avoids the void *
> thing (you can use lambdas to capture any number of variables instead)
> and because it works on subranges, not just whole ranges.
>

OK, I can use std::lower_bound with simple lambdas too.  The semantics
of returning the first element matching a criterion actually allows me to use it
one more time.

Martin


New Swedish PO file for 'gcc' (version 12.2.0)

2022-08-26 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the Swedish team of translators.  The file is available at:

https://translationproject.org/latest/gcc/sv.po

(This file, 'gcc-12.2.0.sv.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

https://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

https://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.




Re: [PATCH 2/2] vec: Add array_slice::bsearch

2022-08-26 Thread Martin Jambor
Hi,

On Fri, Aug 26 2022, Richard Biener wrote:
>> On 26.08.2022 at 18:40, Martin Jambor wrote:
>> 
>> Hi,
>> 
>> This adds a method to binary search in a sorted array_slice.
>> 
>> The implementation is a direct copy of vec::bsearch.  Moreover, to only
>> copy it once and not twice, I used const_cast in the non-const
>> variants to be able to use the const variants.  I hope that is
>> acceptable abuse of const_cast but I'll be happy to change that if
>> not.
>> 
>> Bootstrapped and tested along code that actually uses it on
>> x86_64-linux.  OK for trunk?
>
> Can you avoid the copying by using array slice bsearch from the vec<> bsearch?

It would be easy to just move the implementation to array_slice and then
implement vec<...>::bsearch by calling into that.

But I still think I need more constructors for that ;-)

Martin

>> 
>> 
>> gcc/ChangeLog:
>> 
>> 2022-08-08  Martin Jambor  
>> 
>>* vec.h (array_slice::bsearch): New methods.
>> ---
>> gcc/vec.h | 94 +++
>> 1 file changed, 94 insertions(+)
>> 
>> diff --git a/gcc/vec.h b/gcc/vec.h
>> index b0477e1044c..61ebdc4ca13 100644
>> --- a/gcc/vec.h
>> +++ b/gcc/vec.h
>> @@ -2301,6 +2301,14 @@ public:
>>   // True if the array is valid, false if it is an array like INVALID.
>>   bool is_valid () const { return m_base || m_size == 0; }
>> 
>> +  /* Methods for binary search in sorted array_slice.  */
>> +  const T *bsearch (const void *key, int (*compar)(const void *,
>> +   const void *)) const;
>> +  T *bsearch (const void *key, int (*compar)(const void *, const void *));
>> +  const T *bsearch (const void *key,
>> +  int (*compar)(const void *, const void *, void *), void *) const;
>> +  T *bsearch (const void *key,
>> +  int (*compar)(const void *, const void *, void *), void *);
>> private:
>>   iterator m_base;
>>   unsigned int m_size;
>> @@ -2361,6 +2369,92 @@ make_array_slice (T *base, unsigned int size)
>>   return array_slice (base, size);
>> }
>> 
>> +/* Search the contents of the sorted array_slice with a binary search.  CMP is
>> +   the comparison function to pass to bsearch.  */
>> +
>> +template
>> +inline const T *
>> +array_slice::bsearch (const void *key,
>> + int (*compar) (const void *, const void *)) const
>> +{
>> +  const void *base = this->m_base;
>> +  size_t nmemb = this->size ();
>> +  size_t size = sizeof (T);
>> +  /* The following is a copy of glibc stdlib-bsearch.h.  */
>> +  size_t l, u, idx;
>> +  const void *p;
>> +  int comparison;
>> +
>> +  l = 0;
>> +  u = nmemb;
>> +  while (l < u)
>> +{
>> +  idx = (l + u) / 2;
>> +  p = (const void *) (((const char *) base) + (idx * size));
>> +  comparison = (*compar) (key, p);
>> +  if (comparison < 0)
>> +u = idx;
>> +  else if (comparison > 0)
>> +l = idx + 1;
>> +  else
>> +return (T *)const_cast(p);
>> +}
>> +
>> +  return NULL;
>> +}
>> +
>> +template
>> +inline T *
>> +array_slice::bsearch (const void *key,
>> + int (*compar) (const void *, const void *))
>> +{
>> +  return const_cast(bsearch (key, compar));
>> +}
>> +
>> +/* Search the contents of the sorted array_slice with a binary search.  CMP is
>> +   the comparison function to pass to bsearch.  */
>> +
>> +template
>> +inline const T *
>> +array_slice::bsearch (const void *key,
>> + int (*compar) (const void *, const void *, void *),
>> + void *data) const
>> +{
>> +  const void *base = this->m_base;
>> +  size_t nmemb = this->size ();
>> +  size_t size = sizeof (T);
>> +  /* The following is a copy of glibc stdlib-bsearch.h.  */
>> +  size_t l, u, idx;
>> +  const void *p;
>> +  int comparison;
>> +
>> +  l = 0;
>> +  u = nmemb;
>> +  while (l < u)
>> +{
>> +  idx = (l + u) / 2;
>> +  p = (const void *) (((const char *) base) + (idx * size));
>> +  comparison = (*compar) (key, p, data);
>> +  if (comparison < 0)
>> +u = idx;
>> +  else if (comparison > 0)
>> +l = idx + 1;
>> +  else
>> +return (T *)const_cast(p);
>> +}
>> +
>> +  return NULL;
>> +}
>> +
>> +template
>> +inline T *
>> +array_slice::bsearch (const void *key,
>> + int (*compar) (const void *, const void *, void *),
>> + void *data)
>> +{
>> +  return const_cast (bsearch (key, compar, data));
>> +}
>> +
>> #if (GCC_VERSION >= 3000)
>> # pragma GCC poison m_vec m_vecpfx m_vecdata
>> #endif
>> -- 
>> 2.37.2
>> 
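
As a side note on the const_cast approach discussed in this message, the
following is a small sketch of one common way to let a non-const
overload reuse a const implementation.  The slice class and its find
member are hypothetical and much simpler than array_slice; the search is
typed rather than void *-based.  The point it illustrates is that the
forwarding call has to go through a const view of the object (here via
std::as_const), since an unqualified call from the non-const member
would resolve to that same non-const member.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical minimal slice type; not GCC's array_slice.
template<typename T>
class slice
{
public:
  slice (T *base, std::size_t size) : m_base (base), m_size (size)
  {
    assert (base != nullptr || size == 0);
  }

  // The only real implementation: binary search over sorted elements.
  const T *find (const T &key) const
  {
    std::size_t l = 0, u = m_size;
    while (l < u)
      {
        std::size_t idx = l + (u - l) / 2;
        if (key < m_base[idx])
          u = idx;
        else if (m_base[idx] < key)
          l = idx + 1;
        else
          return &m_base[idx];
      }
    return nullptr;
  }

  // Non-const overload forwards to the const one.  std::as_const makes
  // overload resolution pick the const member function.
  T *find (const T &key)
  {
    return const_cast<T *> (std::as_const (*this).find (key));
  }

private:
  T *m_base;
  std::size_t m_size;
};
```

Casting the constness away is safe in this arrangement because the
non-const overload can only be called on a non-const slice, whose
elements were reachable through a T * to begin with.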


Re: [PATCH 2/2] vec: Add array_slice::bsearch

2022-08-26 Thread Richard Sandiford via Gcc-patches
Richard Biener  writes:
>> On 26.08.2022 at 18:40, Martin Jambor wrote:
>> 
>> Hi,
>> 
>> This adds a method to binary search in a sorted array_slice.
>> 
>> The implementation is a direct copy of vec::bsearch.  Moreover, to only
>> copy it once and not twice, I used const_cast in the non-const
>> variants to be able to use the const variants.  I hope that is
>> acceptable abuse of const_cast but I'll be happy to change that if
>> not.
>> 
>> Bootstrapped and tested along code that actually uses it on
>> x86_64-linux.  OK for trunk?
>
> Can you avoid the copying by using array slice bsearch from the vec<> bsearch?

IMO it would be better to transition to using <algorithm> routines
for this kind of thing (for new code).  In this case that would be
std::lower_bound.

Using std::lower_bound is more convenient because it avoids the void *
thing (you can use lambdas to capture any number of variables instead)
and because it works on subranges, not just whole ranges.

Thanks,
Richard
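
To make the two points above concrete, here is a minimal sketch of
std::lower_bound used with a capturing lambda on a pointer range.  The
entry struct, the bias parameter and the function name are invented for
the example and do not come from the patch.  The capture takes the place
of the extra void * data argument, and because the function accepts any
[first, last) pair it can be pointed at a subrange as easily as at the
whole array.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical element type, sorted by offset.
struct entry
{
  int offset;
  int size;
};

// Return the first entry in [first, last) whose biased offset is not
// less than WANTED, or LAST if there is none.
const entry *
first_not_before (const entry *first, const entry *last, int wanted, int bias)
{
  assert (first <= last);
  // The lambda captures BIAS directly; no void * cookie is needed.
  return std::lower_bound (first, last, wanted,
                           [bias] (const entry &e, int off)
                           { return e.offset + bias < off; });
}
```

Unlike a bsearch-style routine, the result is the first element that is
not less than the key, so the caller still has to check for equality
when an exact match is required.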



[PATCH] libstdc++: Add [[nodiscard]] attribute to <string> and <string_view>

2022-08-26 Thread Jonathan Wakely via Gcc-patches
I plan to commit this after testing finishes, probably on Monday.

Basic testing on powerpc64le-linux is clean, but I'm still running debug
mode and other variations of tests.

-- >8 --

libstdc++-v3/ChangeLog:

* include/bits/basic_string.h (basic_string): Add nodiscard
attribute to all relevant functions.
* include/std/string_view (basic_string_view): Likewise.
* testsuite/21_strings/basic_string/capacity/1.cc: Cast unused
results to void.
* testsuite/21_strings/basic_string/capacity/char/1.cc:
Likewise.
* testsuite/21_strings/basic_string/capacity/wchar_t/1.cc:
Likewise.
* testsuite/21_strings/basic_string/cons/char/self_move.cc:
Likewise.
* testsuite/21_strings/basic_string/element_access/char/1.cc:
Likewise.
* testsuite/21_strings/basic_string/element_access/char/21674.cc:
Likewise.
* testsuite/21_strings/basic_string/element_access/wchar_t/1.cc:
Likewise.
* testsuite/21_strings/basic_string/element_access/wchar_t/21674.cc:
Likewise.
* testsuite/21_strings/basic_string/lwg2758.cc: Likewise.
* testsuite/21_strings/basic_string/lwg2946.cc: Likewise.
* testsuite/21_strings/basic_string/operations/contains/nonnull.cc:
Add -Wno-unused-result to options.
* testsuite/21_strings/basic_string/operations/ends_with/nonnull.cc:
Likewise.
* testsuite/21_strings/basic_string/operations/starts_with/nonnull.cc:
Likewise.
* testsuite/21_strings/basic_string/operators/char/1.cc: Cast
unused results to void.
* testsuite/21_strings/basic_string/operators/wchar_t/1.cc:
Likewise.
* testsuite/21_strings/basic_string_view/capacity/1.cc:
Likewise.
* testsuite/21_strings/basic_string_view/element_access/char/1.cc:
Likewise.
* testsuite/21_strings/basic_string_view/element_access/char/2.cc:
Likewise.
* testsuite/21_strings/basic_string_view/element_access/wchar_t/1.cc:
Likewise.
* testsuite/21_strings/basic_string_view/element_access/wchar_t/2.cc:
Likewise.
* testsuite/21_strings/basic_string_view/operations/contains/nonnull.cc:
Likewise.
Add -Wno-unused-result to options.
* testsuite/21_strings/basic_string_view/operations/ends_with/nonnull.cc:
Likewise.
* testsuite/21_strings/basic_string_view/operations/starts_with/nonnull.cc:
Likewise.
* testsuite/27_io/basic_stringbuf/sputbackc/char/1.cc: Cast
unused results to void.
* testsuite/27_io/basic_stringbuf/sputbackc/wchar_t/1.cc:
Likewise.
* testsuite/27_io/basic_stringbuf/sungetc/char/1.cc: Likewise.
* testsuite/27_io/basic_stringbuf/sungetc/wchar_t/1.cc:
Likewise.
---
 libstdc++-v3/include/bits/basic_string.h  | 211 +++---
 libstdc++-v3/include/std/string_view  | 107 +++--
 .../21_strings/basic_string/capacity/1.cc |   2 +-
 .../basic_string/capacity/char/1.cc   |   6 +-
 .../basic_string/capacity/wchar_t/1.cc|   6 +-
 .../basic_string/cons/char/self_move.cc   |   2 +-
 .../basic_string/element_access/char/1.cc |   4 +-
 .../basic_string/element_access/char/21674.cc |   2 +-
 .../basic_string/element_access/wchar_t/1.cc  |   4 +-
 .../element_access/wchar_t/21674.cc   |   2 +-
 .../21_strings/basic_string/lwg2758.cc|   4 +-
 .../21_strings/basic_string/lwg2946.cc|  16 +-
 .../operations/contains/nonnull.cc|   2 +-
 .../operations/ends_with/nonnull.cc   |   2 +-
 .../operations/starts_with/nonnull.cc |   2 +-
 .../basic_string/operators/char/1.cc  |   4 +-
 .../basic_string/operators/wchar_t/1.cc   |   4 +-
 .../basic_string_view/capacity/1.cc   |   2 +-
 .../element_access/char/1.cc  |   2 +-
 .../element_access/char/2.cc  |   2 +-
 .../element_access/wchar_t/1.cc   |   2 +-
 .../element_access/wchar_t/2.cc   |   2 +-
 .../operations/contains/nonnull.cc|   2 +-
 .../operations/ends_with/nonnull.cc   |   2 +-
 .../operations/starts_with/nonnull.cc |   2 +-
 .../27_io/basic_stringbuf/sputbackc/char/1.cc |  10 +-
 .../basic_stringbuf/sputbackc/wchar_t/1.cc|  10 +-
 .../27_io/basic_stringbuf/sungetc/char/1.cc   |  10 +-
 .../basic_stringbuf/sungetc/wchar_t/1.cc  |  10 +-
 29 files changed, 279 insertions(+), 157 deletions(-)

diff --git a/libstdc++-v3/include/bits/basic_string.h b/libstdc++-v3/include/bits/basic_string.h
index 9d8b415302b..0df64ea98ca 100644
--- a/libstdc++-v3/include/bits/basic_string.h
+++ b/libstdc++-v3/include/bits/basic_string.h
@@ -942,7 +942,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
*  Returns a read/write iterator that points to the first character in
*  the %string.
*/
-  _GLIBCXX20_

Re: [PATCH 3/3] libstdc++: Implement ranges::zip_view from P2321R2

2022-08-26 Thread Jonathan Wakely via Gcc-patches
On Tue, 23 Aug 2022 at 02:38, Patrick Palka via Libstdc++
 wrote:
> +template
> +  constexpr void
> +  __tuple_for_each(_Fp&& __f, _Tuple&& __tuple)
> +  {
> +   std::apply([&](_Ts&&... __elts) {
> + (std::__invoke(__f, std::forward<_Ts>(__elts)), ...);
> +   }, std::forward<_Tuple>(__tuple));


Nicole Mazzuca noticed a problem with the definition of
tuple-for-each, which I've just added to the issues list:
https://cplusplus.github.io/LWG/issue3755
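
For readers unfamiliar with the technique in the quoted hunk, here is a
standalone C++17 sketch of applying a callable to every tuple element
with std::apply and a fold over the comma operator.  The names
tuple_for_each_sketch and sum_of are made up for this note, and the code
is not the libstdc++ implementation.  The cast to void is this sketch's
own precaution, so that an overloaded comma operator on the callable's
return type cannot take part in the fold; it is not copied from the
patch.

```cpp
#include <functional>
#include <tuple>
#include <utility>

// Call F once per element of TUPLE, in order.
template<typename F, typename Tuple>
void
tuple_for_each_sketch (F &&f, Tuple &&tuple)
{
  std::apply ([&] (auto &&...elts) {
    // Fold over the built-in comma operator: one call per element,
    // evaluated left to right.
    (static_cast<void> (std::invoke (f, std::forward<decltype (elts)> (elts))),
     ...);
  }, std::forward<Tuple> (tuple));
}

// Usage: add up the elements of a heterogeneous tuple.
inline long
sum_of (const std::tuple<int, long, short> &t)
{
  long sum = 0;
  tuple_for_each_sketch ([&sum] (auto x) { sum += x; }, t);
  return sum;
}
```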



Re: [PATCH 1/2] vec: Add array_slice constructors from non-const and gc vectors

2022-08-26 Thread Martin Jambor
Hi,

On Fri, Aug 26 2022, Richard Biener wrote:
>> On 26.08.2022 at 18:39, Martin Jambor wrote:
>>
>> Hi,
>>
>> This patch adds constructors of array_slice that are required to
>> create them from non-const (heap or auto) vectors or from GC vectors.
>>
>> The use of non-const array_slices is somewhat limited, as creating one
>> from const vec still leads to array_slice,
>> so I eventually also only resorted to having read-only array_slices.
>> But I do need the constructor from the gc vector.
>>
>> Bootstrapped and tested along code that actually uses it on
>> x86_64-linux.  OK for trunk?
>>
>> Thanks,
>>
>> Martin
>>
>>
>> gcc/ChangeLog:
>>
>> 2022-08-08  Martin Jambor  
>>
>>* vec.h (array_slice): Add constructors for non-const reference to
>>heap vector and pointers to heap vectors.
>> ---
>> gcc/vec.h | 12 
>> 1 file changed, 12 insertions(+)
>>
>> diff --git a/gcc/vec.h b/gcc/vec.h
>> index eed075addc9..b0477e1044c 100644
>> --- a/gcc/vec.h
>> +++ b/gcc/vec.h
>> @@ -2264,6 +2264,18 @@ public:
>>   array_slice (const vec &v)
>> : m_base (v.address ()), m_size (v.length ()) {}
>>
>> +  template
>> +  array_slice (vec &v)
>> +: m_base (v.address ()), m_size (v.length ()) {}
>> +
>> +  template
>> +  array_slice (const vec *v)
>> +: m_base (v ? v->address () : nullptr), m_size (v ? v->length () : 0) {}
>> +
>> +  template
>> +  array_slice (vec *v)
>> +: m_base (v ? v->address () : nullptr), m_size (v ? v->length () : 0) {}
>> +
>
> I don’t quite understand why the generic ctor doesn’t cover the GC case.  It 
> looks more like reference vs pointer?
>

If you think that this should work:

  vec *heh = cfun->local_decls;
  array_slice arr_slice (*heh);

then it does not:

  /home/mjambor/gcc/mine/src/gcc/ipa-cp.cc:6693:36: error: no matching function 
for call to ‘array_slice::array_slice(vec&)’
   6693 |   array_slice arr_slice (*heh);
|^
  In file included from /home/mjambor/gcc/mine/src/gcc/hash-table.h:248,
   from /home/mjambor/gcc/mine/src/gcc/coretypes.h:486,
   from /home/mjambor/gcc/mine/src/gcc/ipa-cp.cc:105:
  /home/mjambor/gcc/mine/src/gcc/vec.h:2264:3: note: candidate: ‘template array_slice::array_slice(const vec&) [with T = tree_node*]’
   2264 |   array_slice (const vec &v)
|   ^~~
  /home/mjambor/gcc/mine/src/gcc/vec.h:2264:3: note:   template argument 
deduction/substitution failed:
  /home/mjambor/gcc/mine/src/gcc/ipa-cp.cc:6693:36: note:   mismatched types 
‘va_heap’ and ‘va_gc’
   6693 |   array_slice arr_slice (*heh);
|^

  [... I trimmed notes about all other candidates...]

Or did you mean something else?

Thanks,

Martin
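
The deduction failure in the diagnostic above can be modelled with a
much smaller vector type.  Everything in the sketch below (the toy vec,
heap_only_slice, any_alloc_slice) is hypothetical and heavily simplified
compared with vec.h; it only assumes that the allocation strategy is a
defaulted template parameter of the vector.  A constructor written
against vec<U> then means vec<U, va_heap> and cannot be deduced from a
va_gc vector, whereas a constructor that also deduces the allocator
accepts both.

```cpp
#include <type_traits>

struct va_heap {};
struct va_gc {};

// Toy vector: the allocation strategy is part of the type, heap by default.
template<typename T, typename A = va_heap>
struct vec
{
  T *m_data = nullptr;
  unsigned m_len = 0;

  const T *address () const { return m_data; }
  unsigned length () const { return m_len; }
};

// Read-only slice whose only constructor names vec<U>, i.e. vec<U, va_heap>.
template<typename T>
class heap_only_slice
{
public:
  template<typename U>
  heap_only_slice (const vec<U> &v)
    : m_base (v.address ()), m_size (v.length ()) {}

private:
  const T *m_base;
  unsigned m_size;
};

// Read-only slice that deduces the allocator as well.
template<typename T>
class any_alloc_slice
{
public:
  template<typename A>
  any_alloc_slice (const vec<T, A> &v)
    : m_base (v.address ()), m_size (v.length ()) {}

  // Pointer form that tolerates a null vector pointer.
  template<typename A>
  any_alloc_slice (const vec<T, A> *v)
    : m_base (v ? v->address () : nullptr), m_size (v ? v->length () : 0) {}

  unsigned size () const { return m_size; }

private:
  const T *m_base;
  unsigned m_size;
};

static_assert (std::is_constructible<heap_only_slice<int>,
                                     vec<int, va_heap> &>::value,
               "heap vectors match the defaulted allocator");
static_assert (!std::is_constructible<heap_only_slice<int>,
                                      vec<int, va_gc> &>::value,
               "va_gc does not match va_heap, so deduction fails");
static_assert (std::is_constructible<any_alloc_slice<int>,
                                     vec<int, va_gc> &>::value,
               "deducing the allocator accepts GC vectors too");
```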


Re: [PATCH 1/2] libstdc++: Implement ranges::zip_transform_view from P2321R2

2022-08-26 Thread Jonathan Wakely via Gcc-patches
On Thu, 25 Aug 2022 at 16:40, Patrick Palka via Libstdc++
 wrote:
>
> Tested on x86_64-pc-linux-gnu, does this look OK for trunk?

OK, thanks.

>
> libstdc++-v3/ChangeLog:
>
> * include/std/ranges (zip_view::_Iterator): Befriend
> zip_transform_view.
> (__detail::__range_iter_cat): Define.
> (zip_transform_view): Define.
> (zip_transform_view::_Iterator): Define.
> (zip_transform_view::_Sentinel): Define.
> (views::__detail::__can_zip_transform_view): Define.
> (views::_ZipTransform): Define.
> (views::zip_transform): Define.
> * testsuite/std/ranges/zip_transform/1.cc: New test.
> ---
>  libstdc++-v3/include/std/ranges   | 343 ++
>  .../testsuite/std/ranges/zip_transform/1.cc   | 108 ++
>  2 files changed, 451 insertions(+)
>  create mode 100644 libstdc++-v3/testsuite/std/ranges/zip_transform/1.cc
>
> diff --git a/libstdc++-v3/include/std/ranges b/libstdc++-v3/include/std/ranges
> index fb815c48f99..d748cb73346 100644
> --- a/libstdc++-v3/include/std/ranges
> +++ b/libstdc++-v3/include/std/ranges
> @@ -4502,6 +4502,12 @@ namespace views::__adaptor
> return input_iterator_tag{};
>  }
>
> +template
> +  requires (view<_Ws> && ...) && (sizeof...(_Ws) > 0) && is_object_v<_Fp>
> +   && regular_invocable<_Fp&, range_reference_t<_Ws>...>
> +   && std::__detail::__can_reference range_reference_t<_Ws>...>>
> +  friend class zip_transform_view;
> +
>public:
>  // iterator_category defined in __zip_view_iter_cat
>  using iterator_concept = decltype(_S_iter_concept());
> @@ -4781,6 +4787,343 @@ namespace views::__adaptor
>
>  inline constexpr _Zip zip;
>}
> +
> +  namespace __detail
> +  {
> +template
> +  using __range_iter_cat
> +   = typename iterator_traits _Range>>>::iterator_category;
> +  }
> +
> +  template
> +requires (view<_Vs> && ...) && (sizeof...(_Vs) > 0) && is_object_v<_Fp>
> +  && regular_invocable<_Fp&, range_reference_t<_Vs>...>
> +  && std::__detail::__can_reference range_reference_t<_Vs>...>>
> +  class zip_transform_view : public view_interface _Vs...>>
> +  {
> +[[no_unique_address]] __detail::__box<_Fp> _M_fun;
> +zip_view<_Vs...> _M_zip;
> +
> +using _InnerView = zip_view<_Vs...>;
> +
> +template
> +  using __ziperator = iterator_t<__detail::__maybe_const_t<_Const, 
> _InnerView>>;
> +
> +template
> +  using __zentinel = sentinel_t<__detail::__maybe_const_t<_Const, 
> _InnerView>>;
> +
> +template
> +  using _Base = __detail::__maybe_const_t<_Const, _InnerView>;
> +
> +template
> +  struct __iter_cat
> +  { };
> +
> +template
> +  requires forward_range<_Base<_Const>>
> +  struct __iter_cat<_Const>
> +  {
> +  private:
> +   static auto
> +   _S_iter_cat()
> +   {
> + using __detail::__maybe_const_t;
> + using __detail::__range_iter_cat;
> + using _Res = invoke_result_t<__maybe_const_t<_Const, _Fp>&,
> +  
> range_reference_t<__maybe_const_t<_Const, _Vs>>...>;
> + if constexpr (!is_lvalue_reference_v<_Res>)
> +   return input_iterator_tag{};
> + else if constexpr ((derived_from<__range_iter_cat<_Vs, _Const>,
> +  random_access_iterator_tag> && 
> ...))
> +   return random_access_iterator_tag{};
> + else if constexpr ((derived_from<__range_iter_cat<_Vs, _Const>,
> +  bidirectional_iterator_tag> && 
> ...))
> +   return bidirectional_iterator_tag{};
> + else if constexpr ((derived_from<__range_iter_cat<_Vs, _Const>,
> +  forward_iterator_tag> && ...))
> +   return forward_iterator_tag{};
> + else
> +   return input_iterator_tag{};
> +   }
> +  public:
> +   using iterator_category = decltype(_S_iter_cat());
> +  };
> +
> +template class _Iterator;
> +template class _Sentinel;
> +
> +  public:
> +zip_transform_view() = default;
> +
> +constexpr explicit
> +zip_transform_view(_Fp __fun, _Vs... __views)
> +  : _M_fun(std::move(__fun)), _M_zip(std::move(__views)...)
> +{ }
> +
> +constexpr auto
> +begin()
> +{ return _Iterator(*this, _M_zip.begin()); }
> +
> +constexpr auto
> +begin() const
> +  requires range
> +   && regular_invocable...>
> +{ return _Iterator(*this, _M_zip.begin()); }
> +
> +constexpr auto
> +end()
> +{
> +  if constexpr (common_range<_InnerView>)
> +   return _Iterator(*this, _M_zip.end());
> +  else
> +   return _Sentinel(_M_zip.end());
> +}
> +
> +constexpr auto
> +end() const
> +  requires range
> +   && regular_invocable...>
> +{
> +  if constexpr (common_range)
> +   return _Iterator(*this, _M_zip.end());
> +  else
> +

Re: [PATCH 2/2] libstdc++: Implement LWG 3692/3702 changes to zip_/zip_transform_view

2022-08-26 Thread Jonathan Wakely via Gcc-patches
On Thu, 25 Aug 2022 at 16:40, Patrick Palka via Libstdc++
 wrote:
>
> Tested on x86_64-pc-linux-gnu, does this look OK for trunk?

OK, thanks.


>
> libstdc++-v3/ChangeLog:
>
> * include/std/ranges (zip_view::_Iterator::operator<): Remove.
> (zip_view::_Iterator::operator>): Remove.
> (zip_view::_Iterator::operator<=): Remove.
> (zip_view::_Iterator::operator>=): Remove.
> (zip_view::_Iterator::operator<=>): Remove three_way_comparable
> constraint.
> (zip_transform_view::_Iterator): Ditto.
> ---
>  libstdc++-v3/include/std/ranges | 43 +
>  1 file changed, 1 insertion(+), 42 deletions(-)
>
> diff --git a/libstdc++-v3/include/std/ranges b/libstdc++-v3/include/std/ranges
> index d748cb73346..bf0ff29c38c 100644
> --- a/libstdc++-v3/include/std/ranges
> +++ b/libstdc++-v3/include/std/ranges
> @@ -4616,30 +4616,9 @@ namespace views::__adaptor
> }(make_index_sequence{});
>  }
>
> -friend constexpr bool
> -operator<(const _Iterator& __x, const _Iterator& __y)
> -  requires __detail::__all_random_access<_Const, _Vs...>
> -{ return __x._M_current < __y._M_current; }
> -
> -friend constexpr bool
> -operator>(const _Iterator& __x, const _Iterator& __y)
> -  requires __detail::__all_random_access<_Const, _Vs...>
> -{ return __y < __x; }
> -
> -friend constexpr bool
> -operator<=(const _Iterator& __x, const _Iterator& __y)
> -  requires __detail::__all_random_access<_Const, _Vs...>
> -{ return !(__y < __x); }
> -
> -friend constexpr bool
> -operator>=(const _Iterator& __x, const _Iterator& __y)
> -  requires __detail::__all_random_access<_Const, _Vs...>
> -{ return !(__x < __y); }
> -
>  friend constexpr auto
>  operator<=>(const _Iterator& __x, const _Iterator& __y)
>requires __detail::__all_random_access<_Const, _Vs...>
> -   && (three_way_comparable _Vs>>> && ...)
>  { return __x._M_current <=> __y._M_current; }
>
>  friend constexpr _Iterator
> @@ -5008,29 +4987,9 @@ namespace views::__adaptor
>requires equality_comparable<__ziperator<_Const>>
>  { return __x._M_inner == __y._M_inner; }
>
> -friend constexpr bool
> -operator<(const _Iterator& __x, const _Iterator& __y)
> -  requires random_access_range<_Base<_Const>>
> -{ return __x._M_inner < __y._M_inner; }
> -
> -friend constexpr bool
> -operator>(const _Iterator& __x, const _Iterator& __y)
> -  requires random_access_range<_Base<_Const>>
> -{ return __x._M_inner > __y._M_inner; }
> -
> -friend constexpr bool
> -operator<=(const _Iterator& __x, const _Iterator& __y)
> -  requires random_access_range<_Base<_Const>>
> -{ return __x._M_inner <= __y._M_inner; }
> -
> -friend constexpr bool
> -operator>=(const _Iterator& __x, const _Iterator& __y)
> -  requires random_access_range<_Base<_Const>>
> -{ return __x._M_inner >= __y._M_inner; }
> -
>  friend constexpr auto
>  operator<=>(const _Iterator& __x, const _Iterator& __y)
> -  requires random_access_range<_Base<_Const>> && three_way_comparable<__ziperator<_Const>>
> +  requires random_access_range<_Base<_Const>>
>  { return __x._M_inner <=> __y._M_inner; }
>
>  friend constexpr _Iterator
> --
> 2.37.2.382.g795ea8776b
>



Re: [PATCH] Add support for floating point endpoints to frange.

2022-08-26 Thread Andrew Pinski via Gcc-patches
On Fri, Aug 26, 2022 at 12:16 PM Aldy Hernandez  wrote:
>
> On Fri, Aug 26, 2022 at 7:40 PM Andrew Pinski  wrote:
> >
> > On Fri, Aug 26, 2022 at 8:55 AM Aldy Hernandez  wrote:
> > >
> > > [pinskia: I'm CCing you as the author of the match.pd pattern.]
> > >
> > > So, as I wrap up the work here (latest patch attached), I see there's
> > > another phiopt regression (not spaceship related).  I was hoping
> > > someone could either give me a hand, or offer some guidance.
> > >
> > > The failure is in gcc.dg/tree-ssa/phi-opt-24.c.
> > >
> > > We fail to transform the following into -A:
> > >
> > > /* { dg-options "-O2 -fno-signed-zeros -fdump-tree-phiopt" } */
> > >
> > > float f0(float A)
> > > {
> > >   // A == 0? A : -Asame as -A
> > >   if (A == 0)  return A;
> > >   return -A;
> > > }
> > >
> > > This is because the abs/negative match.pd pattern here:
> > >
> > > /* abs/negative simplifications moved from fold_cond_expr_with_comparison,
> > >Need to handle (A - B) case as fold_cond_expr_with_comparison does.
> > >Need to handle UN* comparisons.
> > >...
> > >...
> > >
> > > Sees IL that has the 0.0 propagated.
> > >
> > > Instead of:
> > >
> > >[local count: 1073741824]:
> > >   if (A_2(D) == 0.0)
> > > goto ; [34.00%]
> > >   else
> > > goto ; [66.00%]
> > >
> > >[local count: 708669601]:
> > >   _3 = -A_2(D);
> > >
> > >[local count: 1073741824]:
> > >   # _1 = PHI 
> > >
> > > It now sees:
> > >
> > >[local count: 1073741824]:
> > >   # _1 = PHI <0.0(2), _3(3)>
> > >
> > > which it leaves untouched, causing the if conditional to survive.
> > >
> > > Is this something that can be done by improving the match.pd pattern,
> > > or should be done elsewhere?
> >
> > Oh the pattern which is supposed to catch this does:
> >   (simplify
> >(cnd (cmp @0 zerop) integer_zerop (negate@1 @0))
> > (if (!HONOR_SIGNED_ZEROS (type))
> >  @1))
>
> On trunk without any patches, for the following snippet with -O2
> -fno-signed-zeros -fdump-tree-phiopt-folding...
>
> float f0(float A)
> {
>   // A == 0? A : -A    same as -A
>   if (A == 0)  return A;
>   return -A;
> }
>
> ...the phiopt2 dump file has:
>
> Applying pattern match.pd:4805, gimple-match.cc:69291, which
> corresponds to the aforementioned pattern.  So it looks like that was
> the pattern that was matching that isn't any more?
>
> Are you saying this pattern should only work with integers?

I am saying that the pattern right after the one that matches
(without your patch) currently works for integers only.
You could change integer_zerop to zerop in that pattern, but I am not
100% sure that is a valid thing to do.
Note that a few other patterns in that for loop also use
integer_zerop and might need to use zerop too.

Thanks,
Andrew Pinski

>
> Aldy
>


Re: [PATCH] Add support for floating point endpoints to frange.

2022-08-26 Thread Aldy Hernandez via Gcc-patches
On Fri, Aug 26, 2022 at 7:40 PM Andrew Pinski  wrote:
>
> On Fri, Aug 26, 2022 at 8:55 AM Aldy Hernandez  wrote:
> >
> > [pinskia: I'm CCing you as the author of the match.pd pattern.]
> >
> > So, as I wrap up the work here (latest patch attached), I see there's
> > another phiopt regression (not spaceship related).  I was hoping
> > someone could either give me a hand, or offer some guidance.
> >
> > The failure is in gcc.dg/tree-ssa/phi-opt-24.c.
> >
> > We fail to transform the following into -A:
> >
> > /* { dg-options "-O2 -fno-signed-zeros -fdump-tree-phiopt" } */
> >
> > float f0(float A)
> > {
> >   // A == 0? A : -Asame as -A
> >   if (A == 0)  return A;
> >   return -A;
> > }
> >
> > This is because the abs/negative match.pd pattern here:
> >
> > /* abs/negative simplifications moved from fold_cond_expr_with_comparison,
> >Need to handle (A - B) case as fold_cond_expr_with_comparison does.
> >Need to handle UN* comparisons.
> >...
> >...
> >
> > Sees IL that has the 0.0 propagated.
> >
> > Instead of:
> >
> >[local count: 1073741824]:
> >   if (A_2(D) == 0.0)
> > goto ; [34.00%]
> >   else
> > goto ; [66.00%]
> >
> >[local count: 708669601]:
> >   _3 = -A_2(D);
> >
> >[local count: 1073741824]:
> >   # _1 = PHI 
> >
> > It now sees:
> >
> >[local count: 1073741824]:
> >   # _1 = PHI <0.0(2), _3(3)>
> >
> > which it leaves untouched, causing the if conditional to survive.
> >
> > Is this something that can be done by improving the match.pd pattern,
> > or should be done elsewhere?
>
> Oh the pattern which is supposed to catch this does:
>   (simplify
>(cnd (cmp @0 zerop) integer_zerop (negate@1 @0))
> (if (!HONOR_SIGNED_ZEROS (type))
>  @1))

On trunk without any patches, for the following snippet with -O2
-fno-signed-zeros -fdump-tree-phiopt-folding...

float f0(float A)
{
  // A == 0? A : -A    same as -A
  if (A == 0)  return A;
  return -A;
}

...the phiopt2 dump file has:

Applying pattern match.pd:4805, gimple-match.cc:69291, which
corresponds to the aforementioned pattern.  So it looks like that was
the pattern that was matching that isn't any more?

Are you saying this pattern should only work with integers?

Aldy



Re: [PATCH] 32-bit PA-RISC with HP-UX: remove deprecated ports

2022-08-26 Thread John David Anglin

On 2022-08-26 3:15 a.m., Martin Liška wrote:

fixincludes/ChangeLog:

* configure: Regenerate.
* inclhack.def: Remove patterns for deprecated ports.
* fixincl.x: Regenerate.

There are problems with the fixincludes changes.  These break the build on
hppa64-hpux11.

configure:4015: checking whether we are cross compiling
configure:4023: /home/dave/gnu/gcc/objdir64/./gcc/xgcc -B/home/dave/gnu/gcc/objdir64/./gcc/ -B/opt/gnu64/gcc/gcc-12/hppa64-hp-hpux11.11/bin/ -B/opt/gnu64/gcc/gcc-12/hppa64-hp-hpux11.11/lib/ -isystem /opt/gnu64/gcc/gcc-12/hppa64-hp-hpux11.11/include -isystem /opt/gnu64/gcc/gcc-12/hppa64-hp-hpux11.11/sys-include   -fno-checking -o conftest -g -O2   conftest.c  >&5
In file included from /home/dave/gnu/gcc/objdir64/gcc/include-fixed/stdio.h:21,
 from conftest.c:11:
/home/dave/gnu/gcc/objdir64/gcc/include-fixed/stdio.h:608:59: error: unknown type name 'va_list'
  608 |  extern int vsnprintf(char *, __size_t, const char *, __va__list);
  | ^~
/home/dave/gnu/gcc/objdir64/gcc/include-fixed/stdio.h:116:1: note: 'va_list' is defined in header ''; did you forget to '#include 
'?

  115 | #   include 

Some hpux11-specific hacks are removed.  Some hpux hacks that are not
system-specific are also removed.
One can't trust comments or hack names to infer applicability.

Dave

--
John David Anglin  dave.ang...@bell.net



Re: [PATCH 2/2] vec: Add array_slice::bsearch

2022-08-26 Thread Richard Biener via Gcc-patches



> On 26.08.2022 at 18:40, Martin Jambor wrote:
> 
> Hi,
> 
> This adds a method to binary search in a sorted array_slice.
> 
> The implementation is direct copy of vec:bsearch.  Moreover, to only
> copy it once and not twice, I used const_cast in the non-const
> variants to be able to use the const variants.  I hope that is
> acceptable abuse of const_cast but I'll be happy to change that if
> not.
> 
> Bootstrapped and tested along code that actually uses it on
> x86_64-linux.  OK for trunk?

Can you avoid the copying by using array slice bsearch from the vec<> bsearch?
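
For readers following along, the const_cast delegation described in the quoted text can be sketched roughly as below. This is only an illustration under stated assumptions: slice_sketch and compare_ints are hypothetical names, not the array_slice API from vec.h, and the comparator takes typed arguments rather than the void * callbacks used in the patch.

```cpp
#include <cstddef>

// Hypothetical stand-in for a slice type; NOT the real array_slice.
template <typename T>
class slice_sketch
{
public:
  slice_sketch (T *base, std::size_t size) : m_base (base), m_size (size) {}

  // The const overload holds the only copy of the search loop.
  // CMP returns <0, 0 or >0, like a bsearch comparator.
  template <typename Key, typename Cmp>
  const T *bsearch (const Key &key, Cmp cmp) const
  {
    std::size_t l = 0, u = m_size;
    while (l < u)
      {
        std::size_t idx = l + (u - l) / 2;
        int comparison = cmp (key, m_base[idx]);
        if (comparison < 0)
          u = idx;
        else if (comparison > 0)
          l = idx + 1;
        else
          return &m_base[idx];
      }
    return nullptr;
  }

  // The non-const overload forwards to the const one and casts the
  // constness back away, so the loop is not written twice.
  template <typename Key, typename Cmp>
  T *bsearch (const Key &key, Cmp cmp)
  {
    const slice_sketch &self = *this;
    return const_cast<T *> (self.bsearch (key, cmp));
  }

private:
  T *m_base;
  std::size_t m_size;
};

// Hypothetical three-way comparator for the usage note.
inline int compare_ints (int key, const int &elem)
{
  return (key > elem) - (key < elem);
}
```

With a sorted int array data of five elements, slice_sketch<int> (data, 5).bsearch (7, compare_ints) would return a pointer to the matching element, or nullptr if none matches.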


> Thanks,
> 
> Martin
> 
> 
> gcc/ChangeLog:
> 
> 2022-08-08  Martin Jambor  
> 
>* vec.h (array_slice::bsearch): New methods.
> ---
> gcc/vec.h | 94 +++
> 1 file changed, 94 insertions(+)
> 
> diff --git a/gcc/vec.h b/gcc/vec.h
> index b0477e1044c..61ebdc4ca13 100644
> --- a/gcc/vec.h
> +++ b/gcc/vec.h
> @@ -2301,6 +2301,14 @@ public:
>   // True if the array is valid, false if it is an array like INVALID.
>   bool is_valid () const { return m_base || m_size == 0; }
> 
> +  /* Methods for binary search in sorted array_slice.  */
> +  const T *bsearch (const void *key, int (*compar)(const void *,
> +   const void *)) const;
> +  T *bsearch (const void *key, int (*compar)(const void *, const void *));
> +  const T *bsearch (const void *key,
> +  int (*compar)(const void *, const void *, void *), void *) const;
> +  T *bsearch (const void *key,
> +  int (*compar)(const void *, const void *, void *), void *);
> private:
>   iterator m_base;
>   unsigned int m_size;
> @@ -2361,6 +2369,92 @@ make_array_slice (T *base, unsigned int size)
>   return array_slice (base, size);
> }
> 
> +/* Search the contents of the sorted array_slice with a binary search.  CMP is
> +   the comparison function to pass to bsearch.  */
> +
> +template
> +inline const T *
> +array_slice::bsearch (const void *key,
> + int (*compar) (const void *, const void *)) const
> +{
> +  const void *base = this->m_base;
> +  size_t nmemb = this->size ();
> +  size_t size = sizeof (T);
> +  /* The following is a copy of glibc stdlib-bsearch.h.  */
> +  size_t l, u, idx;
> +  const void *p;
> +  int comparison;
> +
> +  l = 0;
> +  u = nmemb;
> +  while (l < u)
> +{
> +  idx = (l + u) / 2;
> +  p = (const void *) (((const char *) base) + (idx * size));
> +  comparison = (*compar) (key, p);
> +  if (comparison < 0)
> +u = idx;
> +  else if (comparison > 0)
> +l = idx + 1;
> +  else
> +return (T *)const_cast(p);
> +}
> +
> +  return NULL;
> +}
> +
> +template
> +inline T *
> +array_slice::bsearch (const void *key,
> + int (*compar) (const void *, const void *))
> +{
> +  return const_cast(bsearch (key, compar));
> +}
> +
> +/* Search the contents of the sorted array_slice with a binary search.  CMP is
> +   the comparison function to pass to bsearch.  */
> +
> +template
> +inline const T *
> +array_slice::bsearch (const void *key,
> + int (*compar) (const void *, const void *, void *),
> + void *data) const
> +{
> +  const void *base = this->m_base;
> +  size_t nmemb = this->size ();
> +  size_t size = sizeof (T);
> +  /* The following is a copy of glibc stdlib-bsearch.h.  */
> +  size_t l, u, idx;
> +  const void *p;
> +  int comparison;
> +
> +  l = 0;
> +  u = nmemb;
> +  while (l < u)
> +{
> +  idx = (l + u) / 2;
> +  p = (const void *) (((const char *) base) + (idx * size));
> +  comparison = (*compar) (key, p, data);
> +  if (comparison < 0)
> +u = idx;
> +  else if (comparison > 0)
> +l = idx + 1;
> +  else
> +return (T *)const_cast(p);
> +}
> +
> +  return NULL;
> +}
> +
> +template
> +inline T *
> +array_slice::bsearch (const void *key,
> + int (*compar) (const void *, const void *, void *),
> + void *data)
> +{
> +  return const_cast (bsearch (key, compar, data));
> +}
> +
> #if (GCC_VERSION >= 3000)
> # pragma GCC poison m_vec m_vecpfx m_vecdata
> #endif
> -- 
> 2.37.2
> 


Re: [PATCH 1/2] vec: Add array_slice constructors from non-const and gc vectors

2022-08-26 Thread Richard Biener via Gcc-patches



> On 26.08.2022 at 18:39, Martin Jambor wrote:
> 
> Hi,
> 
> This patch adds constructors of array_slice that are required to
> create them from non-const (heap or auto) vectors or from GC vectors.
> 
> The use of non-const array_slices is somewhat limited, as creating one
> from const vec still leads to array_slice,
> so I eventually also only resorted to having read-only array_slices.
> But I do need the constructor from the gc vector.
> 
> Bootstrapped and tested along code that actually uses it on
> x86_64-linux.  OK for trunk?
> 
> Thanks,
> 
> Martin
> 
> 
> gcc/ChangeLog:
> 
> 2022-08-08  Martin Jambor  
> 
>* vec.h (array_slice): Add constructors for non-const reference to
>heap vector and pointers to heap vectors.
> ---
> gcc/vec.h | 12 
> 1 file changed, 12 insertions(+)
> 
> diff --git a/gcc/vec.h b/gcc/vec.h
> index eed075addc9..b0477e1044c 100644
> --- a/gcc/vec.h
> +++ b/gcc/vec.h
> @@ -2264,6 +2264,18 @@ public:
>   array_slice (const vec &v)
> : m_base (v.address ()), m_size (v.length ()) {}
> 
> +  template
> +  array_slice (vec &v)
> +: m_base (v.address ()), m_size (v.length ()) {}
> +
> +  template
> +  array_slice (const vec *v)
> +: m_base (v ? v->address () : nullptr), m_size (v ? v->length () : 0) {}
> +
> +  template
> +  array_slice (vec *v)
> +: m_base (v ? v->address () : nullptr), m_size (v ? v->length () : 0) {}
> +

I don’t quite understand why the generic ctor doesn’t cover the GC case.  It
looks more like a reference vs. pointer difference?
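
To make the reference-versus-pointer distinction concrete, here is a rough sketch. It assumes std::vector as a stand-in for GCC's vec and uses a hypothetical view_sketch class, so it says nothing about how the vec layouts themselves differ; it only shows why a pointer constructor needs the null check that a reference constructor does not.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical view type; std::vector stands in for GCC's vec here.
template <typename T>
class view_sketch
{
public:
  // From a reference: the container always exists.
  view_sketch (std::vector<T> &v)
    : m_base (v.data ()), m_size (v.size ()) {}

  // From a pointer: the container may be absent, so a null pointer
  // yields an empty view.
  view_sketch (std::vector<T> *v)
    : m_base (v ? v->data () : nullptr), m_size (v ? v->size () : 0) {}

  T *begin () const { return m_base; }
  T *end () const { return m_base + m_size; }
  std::size_t size () const { return m_size; }

private:
  T *m_base;
  std::size_t m_size;
};
```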

>   iterator begin () { return m_base; }
>   iterator end () { return m_base + m_size; }
> 
> -- 
> 2.37.2
> 


[Patch] OpenMP/Fortran: Permit end-clause on directive

2022-08-26 Thread Tobias Burnus

I ran into some issues that seemed related to this; they turned out to be
unrelated, but I ended up implementing this feature anyway.

Side remark: 'omp parallel workshare' seems to actually permit 'nowait'
now, but I guess that's an unintended change due to the
syntax-representation change. Hence, it is now tracked as Spec Issue
3338 and I do not permit it.

OK for mainline?

Tobias
-
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634
München; Limited liability company; Managing Directors: Thomas
Heurung, Frank Thürauf; Registered office: München; Registry court:
München, HRB 106955
OpenMP/Fortran: Permit end-clause on directive

gcc/fortran/ChangeLog:

	* openmp.cc (OMP_DO_CLAUSES, OMP_SCOPE_CLAUSES,
	OMP_SECTIONS_CLAUSES, OMP_SINGLE_CLAUSES): Add 'nowait'.
	(gfc_match_omp_distribute_parallel_do,
	gfc_match_omp_distribute_parallel_do_simd,
	gfc_match_omp_parallel_do,
	gfc_match_omp_parallel_do_simd,
	gfc_match_omp_parallel_sections,
	gfc_match_omp_teams_distribute_parallel_do,
	gfc_match_omp_teams_distribute_parallel_do_simd): Disallow 'nowait'.
	(gfc_match_omp_workshare): Match 'nowait' clause.
	(gfc_match_omp_end_single): Use clause matcher for 'nowait'.
	(resolve_omp_clauses): Reject 'nowait' + 'copyprivate'.
	* parse.cc (decode_omp_directive): Break too long line.
	(parse_omp_do, parse_omp_structured_block): Diagnose duplicated
	'nowait' clause.

libgomp/ChangeLog:

	* libgomp.texi (OpenMP 5.2): Mark end-directive as Y.

gcc/testsuite/ChangeLog:

	* gfortran.dg/gomp/copyprivate-1.f90: New test.
	* gfortran.dg/gomp/copyprivate-2.f90: New test.
	* gfortran.dg/gomp/nowait-4.f90: New test.
	* gfortran.dg/gomp/nowait-5.f90: New test.
	* gfortran.dg/gomp/nowait-6.f90: New test.

 gcc/fortran/openmp.cc|  55 ++--
 gcc/fortran/parse.cc |  33 ++-
 gcc/testsuite/gfortran.dg/gomp/copyprivate-1.f90 |  21 ++
 gcc/testsuite/gfortran.dg/gomp/copyprivate-2.f90 |  69 +
 gcc/testsuite/gfortran.dg/gomp/nowait-4.f90  | 315 +++
 gcc/testsuite/gfortran.dg/gomp/nowait-5.f90  | 118 +
 gcc/testsuite/gfortran.dg/gomp/nowait-6.f90  |  92 +++
 libgomp/libgomp.texi |   2 +-
 8 files changed, 673 insertions(+), 32 deletions(-)

diff --git a/gcc/fortran/openmp.cc b/gcc/fortran/openmp.cc
index 594907714ff..b3b6fc17828 100644
--- a/gcc/fortran/openmp.cc
+++ b/gcc/fortran/openmp.cc
@@ -3795,17 +3795,19 @@ cleanup:
   (omp_mask (OMP_CLAUSE_PRIVATE) | OMP_CLAUSE_FIRSTPRIVATE		\
| OMP_CLAUSE_LASTPRIVATE | OMP_CLAUSE_REDUCTION			\
| OMP_CLAUSE_SCHEDULE | OMP_CLAUSE_ORDERED | OMP_CLAUSE_COLLAPSE	\
-   | OMP_CLAUSE_LINEAR | OMP_CLAUSE_ORDER | OMP_CLAUSE_ALLOCATE)
+   | OMP_CLAUSE_LINEAR | OMP_CLAUSE_ORDER | OMP_CLAUSE_ALLOCATE		\
+   | OMP_CLAUSE_NOWAIT)
 #define OMP_LOOP_CLAUSES \
   (omp_mask (OMP_CLAUSE_BIND) | OMP_CLAUSE_COLLAPSE | OMP_CLAUSE_ORDER	\
| OMP_CLAUSE_PRIVATE | OMP_CLAUSE_LASTPRIVATE | OMP_CLAUSE_REDUCTION)
 
 #define OMP_SCOPE_CLAUSES \
   (omp_mask (OMP_CLAUSE_PRIVATE) |OMP_CLAUSE_FIRSTPRIVATE		\
-   | OMP_CLAUSE_REDUCTION | OMP_CLAUSE_ALLOCATE)
+   | OMP_CLAUSE_REDUCTION | OMP_CLAUSE_ALLOCATE | OMP_CLAUSE_NOWAIT)
 #define OMP_SECTIONS_CLAUSES \
   (omp_mask (OMP_CLAUSE_PRIVATE) | OMP_CLAUSE_FIRSTPRIVATE		\
-   | OMP_CLAUSE_LASTPRIVATE | OMP_CLAUSE_REDUCTION | OMP_CLAUSE_ALLOCATE)
+   | OMP_CLAUSE_LASTPRIVATE | OMP_CLAUSE_REDUCTION			\
+   | OMP_CLAUSE_ALLOCATE | OMP_CLAUSE_NOWAIT)
 #define OMP_SIMD_CLAUSES \
   (omp_mask (OMP_CLAUSE_PRIVATE) | OMP_CLAUSE_LASTPRIVATE		\
| OMP_CLAUSE_REDUCTION | OMP_CLAUSE_COLLAPSE | OMP_CLAUSE_SAFELEN	\
@@ -3855,7 +3857,7 @@ cleanup:
| OMP_CLAUSE_ORDER | OMP_CLAUSE_ALLOCATE)
 #define OMP_SINGLE_CLAUSES \
   (omp_mask (OMP_CLAUSE_PRIVATE) | OMP_CLAUSE_FIRSTPRIVATE		\
-   | OMP_CLAUSE_ALLOCATE)
+   | OMP_CLAUSE_ALLOCATE | OMP_CLAUSE_NOWAIT | OMP_CLAUSE_COPYPRIVATE)
 #define OMP_ORDERED_CLAUSES \
   (omp_mask (OMP_CLAUSE_THREADS) | OMP_CLAUSE_SIMD)
 #define OMP_DECLARE_TARGET_CLAUSES \
@@ -4020,8 +4022,8 @@ gfc_match_omp_distribute_parallel_do (void)
   return match_omp (EXEC_OMP_DISTRIBUTE_PARALLEL_DO,
 		(OMP_DISTRIBUTE_CLAUSES | OMP_PARALLEL_CLAUSES
 		 | OMP_DO_CLAUSES)
-		& ~(omp_mask (OMP_CLAUSE_ORDERED))
-		& ~(omp_mask (OMP_CLAUSE_LINEAR)));
+		& ~(omp_mask (OMP_CLAUSE_ORDERED)
+			| OMP_CLAUSE_LINEAR | OMP_CLAUSE_NOWAIT));
 }
 
 
@@ -4031,7 +4033,7 @@ gfc_match_omp_distribute_parallel_do_simd (void)
   return match_omp (EXEC_OMP_DISTRIBUTE_PARALLEL_DO_SIMD,
 		(OMP_DISTRIBUTE_CLAUSES | OMP_PARALLEL_CLAUSES
 		 | OMP_DO_CLAUSES | OMP_SIMD_CLAUSES)
-		& ~(omp_mask (OMP_CLAUSE_ORDERED)));
+		& ~(omp_mask (OMP_CLAUSE_ORDERED) | OMP_CLAUSE_NOWAIT));
 }
 
 
@@ -5332,7 +5334,8 @@ match
 gfc_match_omp_parallel_do (void)
 {
   return match_omp (EXEC_OMP_PARALLEL_DO,
-		OMP_PARALLEL_CLAUSES | OMP_DO_CLAUSES);
+		(OMP_PARALLEL_CLAUSES | OMP_DO_CLAUSES)
+

Re: [PATCH] Add support for floating point endpoints to frange.

2022-08-26 Thread Aldy Hernandez via Gcc-patches
On Fri, Aug 26, 2022, 19:40 Andrew Pinski  wrote:

> On Fri, Aug 26, 2022 at 8:55 AM Aldy Hernandez  wrote:
> >
> > [pinskia: I'm CCing you as the author of the match.pd pattern.]
> >
> > So, as I wrap up the work here (latest patch attached), I see there's
> > another phiopt regression (not spaceship related).  I was hoping
> > someone could either give me a hand, or offer some guidance.
> >
> > The failure is in gcc.dg/tree-ssa/phi-opt-24.c.
> >
> > We fail to transform the following into -A:
> >
> > /* { dg-options "-O2 -fno-signed-zeros -fdump-tree-phiopt" } */
> >
> > float f0(float A)
> > {
> >   // A == 0? A : -A    same as -A
> >   if (A == 0)  return A;
> >   return -A;
> > }
> >
> > This is because the abs/negative match.pd pattern here:
> >
> > /* abs/negative simplifications moved from fold_cond_expr_with_comparison,
> >Need to handle (A - B) case as fold_cond_expr_with_comparison does.
> >Need to handle UN* comparisons.
> >...
> >...
> >
> > Sees IL that has the 0.0 propagated.
> >
> > Instead of:
> >
> >[local count: 1073741824]:
> >   if (A_2(D) == 0.0)
> > goto ; [34.00%]
> >   else
> > goto ; [66.00%]
> >
> >[local count: 708669601]:
> >   _3 = -A_2(D);
> >
> >[local count: 1073741824]:
> >   # _1 = PHI 
> >
> > It now sees:
> >
> >[local count: 1073741824]:
> >   # _1 = PHI <0.0(2), _3(3)>
> >
> > which it leaves untouched, causing the if conditional to survive.
> >
> > Is this something that can be done by improving the match.pd pattern,
> > or should be done elsewhere?
>
> Oh the pattern which is supposed to catch this does:
>   (simplify
>(cnd (cmp @0 zerop) integer_zerop (negate@1 @0))
> (if (!HONOR_SIGNED_ZEROS (type))
>  @1))
>
> Notice the integer_zerop here.
> fold_cond_expr_with_comparison has integer_zerop there too.
> I am not 100% sure you can replace A_2 with 0.0 where you are doing it
> as mentioned in another thread.
>

Are you sure we can't make the replacement, given that the test runs with
-fno-signed-zeros?

Aldy


> Thanks,
> Andrew Pinski
>
>
> >
> > Thanks.
> > Aldy
>
>


Re: [PATCH v5] c++: Implement -Wself-move warning [PR81159]

2022-08-26 Thread Jason Merrill via Gcc-patches

On 8/26/22 13:04, Marek Polacek wrote:

On Thu, Aug 25, 2022 at 08:52:58PM -0400, Jason Merrill wrote:

On 8/25/22 17:49, Marek Polacek wrote:

On Thu, Aug 25, 2022 at 09:25:43AM -0400, Jason Merrill wrote:

On 8/24/22 17:30, Marek Polacek wrote:

On Tue, Aug 23, 2022 at 05:27:00PM -0400, Jason Merrill wrote:

On 8/23/22 09:39, Marek Polacek wrote:

+  tree arg = CALL_EXPR_ARG (fn, 0);
+  extract_op (arg);
+  if (TREE_CODE (arg) == ADDR_EXPR)
+arg = TREE_OPERAND (arg, 0);
+  tree type = TREE_TYPE (lhs);
+  lhs = maybe_undo_parenthesized_ref (lhs);
+  STRIP_ANY_LOCATION_WRAPPER (lhs);
+  const bool print_var_p = (DECL_P (lhs)
+   || REFERENCE_REF_P (lhs)
+   || TREE_CODE (lhs) == COMPONENT_REF);


Why include REFERENCE_REF_P and COMPONENT_REF?  Reference refs should be
stripped before this test, member refs aren't variables.


I'm checking REFERENCE_REF_P and COMPONENT_REF to say "moving a variable"
in #1 and #3.  The REFERENCE_REF_P check means that we also say "variable"
for #2.  Sure, "A variable is introduced by the declaration of a reference
other than a non-static data member", but I'm not sure if users care about
that here?

If I strip REFERENCE_REFs before the check then the result will be the
same.


That's what I was suggesting, yes: Strip the REFERENCE_REF so DECL_P can see
the decl.


Ok, I've added the REFERENCE_REF stripping.  But I've still left the
COMPONENT_REF in.  Perhaps we could say "moving a member" to itself for
COMPONENT_REFs.  Or just say "moving 'x' of type 'int' to itself" and
avoid all of this.  :)


Sure, that would be simpler.  In any case, we shouldn't call member
references variables.
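
As a rough illustration of the three shapes being discussed (a variable, a reference, and a member), the sketch below shows self-moves that a compiler with this patch would presumably flag with the "moving ... to itself" wording. The function and struct names are made up for this note and are not taken from the Wself-move1.C test; for int the self-move is harmless, so the code only triggers the diagnostic.

```cpp
#include <utility>

// #1: plain variable moved to itself.
int self_move_variable ()
{
  int x = 1;
  x = std::move (x);   // candidate for -Wself-move
  return x;
}

// #2: reference moved to itself.
int self_move_reference ()
{
  int v = 2;
  int &r = v;
  r = std::move (r);   // candidate for -Wself-move
  return v;
}

// #3: data member moved to itself; this is a member reference,
// not a variable.
struct member_holder
{
  int x = 3;
  void self_move_member () { x = std::move (x); }  // candidate for -Wself-move
};
```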


Here's the "moving 'x' of type 'int' to itself" version then:

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?


OK.


-- >8 --
About 5 years ago we got a request to implement -Wself-move, which
warns about useless moves like this:

   int x;
   x = std::move (x);

This patch implements that warning.

PR c++/81159

gcc/c-family/ChangeLog:

* c.opt (Wself-move): New option.

gcc/cp/ChangeLog:

* typeck.cc (maybe_warn_self_move): New.
(cp_build_modify_expr): Call maybe_warn_self_move.

gcc/ChangeLog:

* doc/invoke.texi: Document -Wself-move.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wself-move1.C: New test.
---
  gcc/c-family/c.opt  |   4 +
  gcc/cp/typeck.cc|  53 +-
  gcc/doc/invoke.texi |  23 -
  gcc/testsuite/g++.dg/warn/Wself-move1.C | 125 
  4 files changed, 203 insertions(+), 2 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/warn/Wself-move1.C

diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index dfdebd596ef..f776efd39d8 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -1229,6 +1229,10 @@ Wselector
  ObjC ObjC++ Var(warn_selector) Warning
  Warn if a selector has multiple methods.
  
+Wself-move

+C++ ObjC++ Var(warn_self_move) Warning LangEnabledBy(C++ ObjC++, Wall)
+Warn when a value is moved to itself with std::move.
+
  Wsequence-point
  C ObjC C++ ObjC++ Var(warn_sequence_point) Warning LangEnabledBy(C ObjC C++ ObjC++,Wall)
  Warn about possible violations of sequence point rules.
diff --git a/gcc/cp/typeck.cc b/gcc/cp/typeck.cc
index 7fde65adaa4..b99947c10fd 100644
--- a/gcc/cp/typeck.cc
+++ b/gcc/cp/typeck.cc
@@ -8897,7 +8897,56 @@ cp_build_c_cast (location_t loc, tree type, tree expr,
  
return error_mark_node;

  }
-
+
+/* Warn when a value is moved to itself with std::move.  LHS is the target,
+   RHS may be the std::move call, and LOC is the location of the whole
+   assignment.  */
+
+static void
+maybe_warn_self_move (location_t loc, tree lhs, tree rhs)
+{
+  if (!warn_self_move)
+return;
+
+  /* C++98 doesn't know move.  */
+  if (cxx_dialect < cxx11)
+return;
+
+  if (processing_template_decl)
+return;
+
+  if (!REFERENCE_REF_P (rhs)
+  || TREE_CODE (TREE_OPERAND (rhs, 0)) != CALL_EXPR)
+return;
+  tree fn = TREE_OPERAND (rhs, 0);
+  if (!is_std_move_p (fn))
+return;
+
+  /* Just a little helper to strip * and various NOPs.  */
+  auto extract_op = [] (tree &op) {
+STRIP_NOPS (op);
+while (INDIRECT_REF_P (op))
+  op = TREE_OPERAND (op, 0);
+op = maybe_undo_parenthesized_ref (op);
+STRIP_ANY_LOCATION_WRAPPER (op);
+  };
+
+  tree arg = CALL_EXPR_ARG (fn, 0);
+  extract_op (arg);
+  if (TREE_CODE (arg) == ADDR_EXPR)
+arg = TREE_OPERAND (arg, 0);
+  tree type = TREE_TYPE (lhs);
+  tree orig_lhs = lhs;
+  extract_op (lhs);
+  if (cp_tree_equal (lhs, arg))
+{
+  auto_diagnostic_group d;
+  if (warning_at (loc, OPT_Wself_move,
+ "moving %qE of type %qT to itself", orig_lhs, type))
+   inform (loc, "remove % call");
+}
+}
+
  /* For use from the C common bits.  */
  tree
  build_modify_expr (location_t location,
@@ -9101,6 +9150,8 @@ cp_build_modify_expr (l

Re: [PATCH] Add support for floating point endpoints to frange.

2022-08-26 Thread Andrew Pinski via Gcc-patches
On Fri, Aug 26, 2022 at 8:55 AM Aldy Hernandez  wrote:
>
> [pinskia: I'm CCing you as the author of the match.pd pattern.]
>
> So, as I wrap up the work here (latest patch attached), I see there's
> another phiopt regression (not spaceship related).  I was hoping
> someone could either give me a hand, or offer some guidance.
>
> The failure is in gcc.dg/tree-ssa/phi-opt-24.c.
>
> We fail to transform the following into -A:
>
> /* { dg-options "-O2 -fno-signed-zeros -fdump-tree-phiopt" } */
>
> float f0(float A)
> {
>   // A == 0? A : -A    same as -A
>   if (A == 0)  return A;
>   return -A;
> }
>
> This is because the abs/negative match.pd pattern here:
>
> /* abs/negative simplifications moved from fold_cond_expr_with_comparison,
>Need to handle (A - B) case as fold_cond_expr_with_comparison does.
>Need to handle UN* comparisons.
>...
>...
>
> Sees IL that has the 0.0 propagated.
>
> Instead of:
>
>[local count: 1073741824]:
>   if (A_2(D) == 0.0)
> goto ; [34.00%]
>   else
> goto ; [66.00%]
>
>[local count: 708669601]:
>   _3 = -A_2(D);
>
>[local count: 1073741824]:
>   # _1 = PHI 
>
> It now sees:
>
>[local count: 1073741824]:
>   # _1 = PHI <0.0(2), _3(3)>
>
> which it leaves untouched, causing the if conditional to survive.
>
> Is this something that can be done by improving the match.pd pattern,
> or should be done elsewhere?

Oh the pattern which is supposed to catch this does:
  (simplify
   (cnd (cmp @0 zerop) integer_zerop (negate@1 @0))
(if (!HONOR_SIGNED_ZEROS (type))
 @1))

Notice the integer_zerop here.
fold_cond_expr_with_comparison has integer_zerop there too.
I am not 100% sure you can replace A_2 with 0.0 where you are doing it
as mentioned in another thread.
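
A small sketch of why signed zeros matter for this fold, for anyone skimming the thread. The function names are made up for illustration, and the code assumes it is compiled with default flags (signed zeros honored, no -ffast-math): the two forms compare equal for every input, but they differ in the sign of the zero they return, which is exactly what -fno-signed-zeros allows the compiler to ignore.

```cpp
#include <cmath>

// The source-level form from the test case.
float cond_negate (float a)
{
  if (a == 0)
    return a;
  return -a;
}

// The form the fold is expected to produce under -fno-signed-zeros.
float plain_negate (float a)
{
  return -a;
}

// True if the two forms agree by value; +0.0 and -0.0 compare equal.
bool same_value (float a)
{
  return cond_negate (a) == plain_negate (a);
}

// True if the two forms also agree on the sign bit of the result.
bool same_sign (float a)
{
  return std::signbit (cond_negate (a)) == std::signbit (plain_negate (a));
}
```

For nonzero inputs both helpers return true; for a zero input only same_value does, since one form preserves the sign of the zero and the other flips it.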

Thanks,
Andrew Pinski


>
> Thanks.
> Aldy


[PATCH] x86: Handle V16BF in ix86_avx256_split_vector_move_misalign

2022-08-26 Thread H.J. Lu via Gcc-patches
Handle E_V16BFmode in ix86_avx256_split_vector_move_misalign and add
V16BF to V_256H iterator.

gcc/

PR target/106748
* config/i386/i386-expand.cc
(ix86_avx256_split_vector_move_misalign): Handle E_V16BFmode.
* config/i386/sse.md (V_256H): Add V16BF.

gcc/testsuite/

PR target/106748
* gcc.target/i386/pr106748.c: New test.
---
 gcc/config/i386/i386-expand.cc   |  4 
 gcc/config/i386/sse.md   |  4 ++--
 gcc/testsuite/gcc.target/i386/pr106748.c | 20 
 3 files changed, 26 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr106748.c

diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
index 4b216308a18..836ebc82d67 100644
--- a/gcc/config/i386/i386-expand.cc
+++ b/gcc/config/i386/i386-expand.cc
@@ -745,6 +745,10 @@ ix86_avx256_split_vector_move_misalign (rtx op0, rtx op1)
   extract = gen_avx_vextractf128v32qi;
   mode = V16QImode;
   break;
+case E_V16BFmode:
+  extract = gen_avx_vextractf128v16bf;
+  mode = V8BFmode;
+  break;
 case E_V16HFmode:
   extract = gen_avx_vextractf128v16hf;
   mode = V8HFmode;
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index e6ab3c92dcf..259048481b6 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -297,9 +297,9 @@ (define_mode_iterator V_128
 (define_mode_iterator V_256
   [V32QI V16HI V8SI V4DI V8SF V4DF])
 
-;; All 256bit vector modes including HF vector mode
+;; All 256bit vector modes including HF/BF vector modes
 (define_mode_iterator V_256H
-  [V32QI V16HI V8SI V4DI V8SF V4DF V16HF])
+  [V32QI V16HI V8SI V4DI V8SF V4DF V16HF V16BF])
 
 ;; All 128bit and 256bit vector modes
 (define_mode_iterator V_128_256
diff --git a/gcc/testsuite/gcc.target/i386/pr106748.c 
b/gcc/testsuite/gcc.target/i386/pr106748.c
new file mode 100644
index 000..6388b1deb23
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr106748.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O0 -mavx256-split-unaligned-store -mavx -fpack-struct" } */
+
+typedef __bf16 __m256bf16 __attribute__((__vector_size__(32)));
+typedef struct {
+  __m256bf16 _m256bf16[1];
+} YMM_T;
+
+struct {
+  YMM_T ymm0;
+} fregs;
+
+__m256bf16 do_test_u3b_0_0;
+int do_test_i;
+
+void
+do_test()
+{
+  (&fregs.ymm0)[do_test_i]._m256bf16[0] = do_test_u3b_0_0;
+}
-- 
2.37.2



Re: [PATCH] Implement __builtin_issignaling

2022-08-26 Thread Michael Meissner via Gcc-patches
On Thu, Aug 25, 2022 at 09:56:18PM +0200, Jakub Jelinek wrote:
> On Thu, Aug 25, 2022 at 03:23:12PM -0400, Michael Meissner wrote:
> > On Mon, Aug 15, 2022 at 12:12:02PM +0200, Jakub Jelinek via Gcc-patches wrote:
> > > Hi!
> > > 
> > > The following patch implements a new builtin, __builtin_issignaling,
> > > which can be used to implement the ISO/IEC TS 18661-1  issignaling
> > > macro.
> > 
> > I haven't looked in detail at the patch, but from the description I think it
> > needs a way for machine dependent parts to optimize this for a given mode
> > when various switches are used.
> 
> There is an optab which takes priority over the generic code.
> So if you have something better than the generic expansion, feel free to add
> the named pattern(s).
> Currently it is used just on x86 for xf mode.

That is good to know.  When I looked into it before, I didn't notice there was
an optab for this case.
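
For readers following the thread, here is a minimal sketch of what a generic
(non-optab) signaling-NaN test amounts to for IEEE binary64.  It assumes the
common encoding in which a set most-significant mantissa bit marks a quiet NaN
(true on x86 and most current targets, not on every legacy target).  The helper
names are made up for illustration; this is not the builtin's actual expansion.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: classify a binary64 bit pattern as a signaling NaN.
// Assumption: quiet NaNs have the top mantissa bit set.
inline bool is_signaling_bits (std::uint64_t bits)
{
  const std::uint64_t exp_mask  = 0x7ff0000000000000ULL;
  const std::uint64_t mant_mask = 0x000fffffffffffffULL;
  const std::uint64_t quiet_bit = 0x0008000000000000ULL;

  // A NaN has an all-ones exponent and a non-zero mantissa.
  bool is_nan = (bits & exp_mask) == exp_mask && (bits & mant_mask) != 0;
  return is_nan && (bits & quiet_bit) == 0;
}

// Hypothetical wrapper: inspect the stored representation without doing
// arithmetic on the value, since arithmetic may quiet a signaling NaN.
inline bool is_signaling (const double &d)
{
  std::uint64_t bits;
  static_assert (sizeof bits == sizeof d, "sketch expects a 64-bit double");
  std::memcpy (&bits, &d, sizeof bits);
  return is_signaling_bits (bits);
}
```

A target-specific named pattern would replace this kind of bit test wherever
the machine can do better, which is the hook being discussed above.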

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com


[PATCH v5] c++: Implement -Wself-move warning [PR81159]

2022-08-26 Thread Marek Polacek via Gcc-patches
On Thu, Aug 25, 2022 at 08:52:58PM -0400, Jason Merrill wrote:
> On 8/25/22 17:49, Marek Polacek wrote:
> > On Thu, Aug 25, 2022 at 09:25:43AM -0400, Jason Merrill wrote:
> > > On 8/24/22 17:30, Marek Polacek wrote:
> > > > On Tue, Aug 23, 2022 at 05:27:00PM -0400, Jason Merrill wrote:
> > > > > On 8/23/22 09:39, Marek Polacek wrote:
> > > > > > +  tree arg = CALL_EXPR_ARG (fn, 0);
> > > > > > +  extract_op (arg);
> > > > > > +  if (TREE_CODE (arg) == ADDR_EXPR)
> > > > > > +arg = TREE_OPERAND (arg, 0);
> > > > > > +  tree type = TREE_TYPE (lhs);
> > > > > > +  lhs = maybe_undo_parenthesized_ref (lhs);
> > > > > > +  STRIP_ANY_LOCATION_WRAPPER (lhs);
> > > > > > +  const bool print_var_p = (DECL_P (lhs)
> > > > > > +   || REFERENCE_REF_P (lhs)
> > > > > > +   || TREE_CODE (lhs) == COMPONENT_REF);
> > > > > 
> > > > > Why include REFERENCE_REF_P and COMPONENT_REF?  Reference refs should
> > > > > be stripped before this test, member refs aren't variables.
> > > > 
> > > > I'm checking REFERENCE_REF_P and COMPONENT_REF to say "moving a
> > > > variable" in #1 and #3.  The REFERENCE_REF_P check means that we also
> > > > say "variable" for #2.  Sure, "A variable is introduced by the
> > > > declaration of a reference other than a non-static data member", but
> > > > I'm not sure if users care about that here?
> > > > 
> > > > If I strip REFERENCE_REFs before the check then the result will be the
> > > > same.
> > > 
> > > That's what I was suggesting, yes: Strip the REFERENCE_REF so DECL_P can
> > > see the decl.
> > 
> > Ok, I've added the REFERENCE_REF stripping.  But I've still left the
> > COMPONENT_REF in.  Perhaps we could say "moving a member" to itself for
> > COMPONENT_REFs.  Or just say "moving 'x' of type 'int' to itself" and
> > avoid all of this.  :)
> 
> Sure, that would be simpler.  In any case, we shouldn't call member
> references variables.

Here's the "moving 'x' of type 'int' to itself" version then:

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
About 5 years ago we got a request to implement -Wself-move, which
warns about useless moves like this:

  int x;
  x = std::move (x);

This patch implements that warning.
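
As an illustration of why a self-move is worth flagging, here is a small
sketch with a hypothetical type (Buffer and assign_from are not part of the
patch).  The direct form `b = std::move (b);` is the pattern the warning is
aimed at; the helper below hides the same self-move behind two references,
which, judging from the cp_tree_equal comparison in the posted code, would
presumably not be diagnosed, so a class may still want a runtime guard.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical type whose move assignment tolerates self-move.
struct Buffer
{
  std::vector<int> data;

  Buffer () = default;
  explicit Buffer (std::vector<int> d) : data (std::move (d)) {}

  Buffer &operator= (Buffer &&other) noexcept
  {
    // Without this guard the outcome of a self-move depends on the
    // members' own move assignment operators.
    if (this != &other)
      data = std::move (other.data);
    return *this;
  }
};

// The self-move is invisible here: DST and SRC are distinct expressions
// that may refer to the same object at run time.
inline void assign_from (Buffer &dst, Buffer &src)
{
  dst = std::move (src);
}
```

Calling `assign_from (b, b)` leaves `b` unchanged thanks to the guard.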

PR c++/81159

gcc/c-family/ChangeLog:

* c.opt (Wself-move): New option.

gcc/cp/ChangeLog:

* typeck.cc (maybe_warn_self_move): New.
(cp_build_modify_expr): Call maybe_warn_self_move.

gcc/ChangeLog:

* doc/invoke.texi: Document -Wself-move.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wself-move1.C: New test.
---
 gcc/c-family/c.opt  |   4 +
 gcc/cp/typeck.cc|  53 +-
 gcc/doc/invoke.texi |  23 -
 gcc/testsuite/g++.dg/warn/Wself-move1.C | 125 
 4 files changed, 203 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/warn/Wself-move1.C

diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index dfdebd596ef..f776efd39d8 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -1229,6 +1229,10 @@ Wselector
 ObjC ObjC++ Var(warn_selector) Warning
 Warn if a selector has multiple methods.
 
+Wself-move
+C++ ObjC++ Var(warn_self_move) Warning LangEnabledBy(C++ ObjC++, Wall)
+Warn when a value is moved to itself with std::move.
+
 Wsequence-point
 C ObjC C++ ObjC++ Var(warn_sequence_point) Warning LangEnabledBy(C ObjC C++ ObjC++,Wall)
 Warn about possible violations of sequence point rules.
diff --git a/gcc/cp/typeck.cc b/gcc/cp/typeck.cc
index 7fde65adaa4..b99947c10fd 100644
--- a/gcc/cp/typeck.cc
+++ b/gcc/cp/typeck.cc
@@ -8897,7 +8897,56 @@ cp_build_c_cast (location_t loc, tree type, tree expr,
 
   return error_mark_node;
 }
-
+
+/* Warn when a value is moved to itself with std::move.  LHS is the target,
+   RHS may be the std::move call, and LOC is the location of the whole
+   assignment.  */
+
+static void
+maybe_warn_self_move (location_t loc, tree lhs, tree rhs)
+{
+  if (!warn_self_move)
+return;
+
+  /* C++98 doesn't know move.  */
+  if (cxx_dialect < cxx11)
+return;
+
+  if (processing_template_decl)
+return;
+
+  if (!REFERENCE_REF_P (rhs)
+  || TREE_CODE (TREE_OPERAND (rhs, 0)) != CALL_EXPR)
+return;
+  tree fn = TREE_OPERAND (rhs, 0);
+  if (!is_std_move_p (fn))
+return;
+
+  /* Just a little helper to strip * and various NOPs.  */
+  auto extract_op = [] (tree &op) {
+STRIP_NOPS (op);
+while (INDIRECT_REF_P (op))
+  op = TREE_OPERAND (op, 0);
+op = maybe_undo_parenthesized_ref (op);
+STRIP_ANY_LOCATION_WRAPPER (op);
+  };
+
+  tree arg = CALL_EXPR_ARG (fn, 0);
+  extract_op (arg);
+  if (TREE_CODE (arg) == ADDR_EXPR)
+arg = TREE_OPERAND (arg, 0);
+  tree type = TREE_TYPE (lhs);
+  tree orig_lhs = lhs;
+  extract_op (lhs);
+  if (cp_tree_equal (lhs, arg))
+{
+  auto_diagnostic_group d;
+  if (warning_at (loc, OPT_Wself_move

Re: [PATCH] x86: Cast stride to __PTRDIFF_TYPE__ in AMX intrinsics

2022-08-26 Thread H.J. Lu via Gcc-patches
On Mon, Aug 22, 2022 at 7:05 PM Hongtao Liu  wrote:
>
> On Tue, Aug 23, 2022 at 1:02 AM H.J. Lu  wrote:
> >
> > On 64-bit Windows, long is 32 bits and can't be used as stride in memory
> > operand when base is a pointer which is 64 bits.  Cast stride to
> > __PTRDIFF_TYPE__, instead of long.
> Ok.
> >
> > PR target/106714
> > * config/i386/amxtileintrin.h (_tile_loadd_internal): Cast to
> > __PTRDIFF_TYPE__.
> > (_tile_stream_loadd_internal): Likewise.
> > (_tile_stored_internal): Likewise.
> > ---
> >  gcc/config/i386/amxtileintrin.h | 6 +++---
> >  1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/gcc/config/i386/amxtileintrin.h b/gcc/config/i386/amxtileintrin.h
> > index 7b5a39eba72..06f18aa9bfb 100644
> > --- a/gcc/config/i386/amxtileintrin.h
> > +++ b/gcc/config/i386/amxtileintrin.h
> > @@ -62,7 +62,7 @@ _tile_release (void)
> >  #define _tile_loadd_internal(dst,base,stride)  \
> >__asm__ volatile \
> >("{tileloadd\t(%0,%1,1), %%tmm"#dst"|tileloadd\t%%tmm"#dst", [%0+%1*1]}" \
> > -   :: "r" ((const void*) (base)), "r" ((long) (stride)))
> > +   :: "r" ((const void*) (base)), "r" ((__PTRDIFF_TYPE__) (stride)))
> >
> >  #define _tile_stream_loadd(dst,base,stride)\
> >_tile_stream_loadd_internal (dst, base, stride)
> > @@ -70,7 +70,7 @@ _tile_release (void)
> >  #define _tile_stream_loadd_internal(dst,base,stride)   \
> >__asm__ volatile \
> >("{tileloaddt1\t(%0,%1,1), %%tmm"#dst"|tileloaddt1\t%%tmm"#dst", [%0+%1*1]}" \
> > -   :: "r" ((const void*) (base)), "r" ((long) (stride)))
> > +   :: "r" ((const void*) (base)), "r" ((__PTRDIFF_TYPE__) (stride)))
> >
> >  #define _tile_stored(dst,base,stride)  \
> >_tile_stored_internal (dst, base, stride)
> > @@ -78,7 +78,7 @@ _tile_release (void)
> >  #define _tile_stored_internal(src,base,stride) \
> >__asm__ volatile \
> >("{tilestored\t%%tmm"#src", (%0,%1,1)|tilestored\t[%0+%1*1], %%tmm"#src"}" \
> > -   :: "r" ((void*) (base)), "r" ((long) (stride)) \
> > +   :: "r" ((void*) (base)), "r" ((__PTRDIFF_TYPE__) (stride)) \
> > : "memory")
> >
> >  #define _tile_zero(dst)\
> > --
> > 2.37.2
> >
>
>
> --
> BR,
> Hongtao

OK to backport it to GCC 12 branch?
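
For context, a rough sketch of why the stride wants a pointer-sized type.  The
row addressing shown (row i at base + i * stride) is my reading of how the
stride operand is used and is an assumption here, as are the helper names.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: address of row ROW when rows are STRIDE bytes apart.
// The stride is combined with a pointer, so it should be as wide as one;
// std::ptrdiff_t plays the role of __PTRDIFF_TYPE__ in this sketch.
inline const char *
row_address (const void *base, std::ptrdiff_t stride, std::size_t row)
{
  return (static_cast<const char *> (base)
          + static_cast<std::ptrdiff_t> (row) * stride);
}

// False on LLP64 targets such as 64-bit Windows, where long stays 32 bits
// while pointers are 64 bits.
constexpr bool long_is_pointer_sized ()
{
  return sizeof (long) == sizeof (void *);
}
```

On LP64 targets the old cast to long happened to be wide enough, which is
presumably why the problem only showed up on 64-bit Windows.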


-- 
H.J.


Re: [PATCH] [ranger] x == -0.0 does not mean we can replace x with -0.0

2022-08-26 Thread Jakub Jelinek via Gcc-patches
On Fri, Aug 26, 2022 at 05:46:06PM +0200, Aldy Hernandez wrote:
> On the true side of x == -0.0, we can't just blindly value propagate
> the -0.0 into every use of x because x could be +0.0 (or vice versa).
> 
> With this change, we only allow the transformation if
> !HONOR_SIGNED_ZEROS or if the range is known not to contain 0.
> 
> Will commit after tests complete.
> 
> gcc/ChangeLog:
> 
>   * range-op-float.cc (foperator_equal::op1_range): Do not blindly
>   copy op2 range when honoring signed zeros.
> ---
>  gcc/range-op-float.cc | 17 +++--
>  1 file changed, 15 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/range-op-float.cc b/gcc/range-op-float.cc
> index ad2fae578d2..ff9fe312acf 100644
> --- a/gcc/range-op-float.cc
> +++ b/gcc/range-op-float.cc
> @@ -252,8 +252,21 @@ foperator_equal::op1_range (frange &r, tree type,
>switch (get_bool_state (r, lhs, type))
>  {
>  case BRS_TRUE:
> -  // If it's true, the result is the same as OP2.
> -  r = op2;
> +  if (HONOR_SIGNED_ZEROS (type)
> +   && op2.contains_p (build_zero_cst (type)))

What exactly does op2.contains_p do for zero?
Does it use real_compare/real_equal under the hood, so it is
equivalent to op2 == 0.0 or op2 == -0.0, where both will be
true whether op2 is -0.0 or 0.0?  Or is it more strict and
checks whether it is actually a positive zero?
In any case, for HONOR_SIGNED_ZEROS, VARYING is unnecessary, all you
can do is extend the r range to contain both -0.0 and +0.0 if it contains
at least one of them.
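
To make the underlying problem concrete, a small sketch in plain C++ (not
ranger code; the function name is hypothetical): the two zeros compare equal,
yet substituting one for the other is observable through the sign bit.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical check: true when X == CONSTANT holds but replacing X with
// CONSTANT would still change observable behaviour (the sign bit differs).
inline bool substitution_is_observable (double x, double constant)
{
  return x == constant && std::signbit (x) != std::signbit (constant);
}
```

For x = +0.0 and the constant -0.0 this returns true, so on the true side of
x == -0.0 the most that can be concluded is that x lies in [-0.0, +0.0].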

> + {
> +   // With signed zeros, x == -0.0 does not mean we can replace
> +   // x with -0.0, because x may be either +0.0 or -0.0.
> +   r.set_varying (type);
> + }
> +  else
> + {
> +   // If it's true, the result is the same as OP2.
> +   //
> +   // If the range does not actually contain zeros, this should
> +   // always be OK.
> +   r = op2;
> + }

!HONOR_SIGNED_ZEROS doesn't imply that negative zeros won't appear,
but says that the user doesn't care whether they get a positive or negative
zero (unless !MODE_HAS_SIGNED_ZEROS - in that case -0.0 doesn't exist
and one doesn't need to bother with it).

Now, if all the code setting franges makes sure that, for
MODE_HAS_SIGNED_ZEROS && !HONOR_SIGNED_ZEROS, whenever +0.0 or -0.0 is
inside a range both -0.0 and +0.0 are in the range, then yes,
you can use r = op2;

>// The TRUE side of op1 == op2 implies op1 is !NAN.
>r.set_nan (fp_prop::NO);
>break;

Jakub



[PATCH 2/2] vec: Add array_slice::bsearch

2022-08-26 Thread Martin Jambor
Hi,

This adds a method to binary search in a sorted array_slice.

The implementation is a direct copy of vec::bsearch.  Moreover, to only
copy it once and not twice, I used const_cast in the non-const
variants to be able to use the const variants.  I hope that is
an acceptable abuse of const_cast but I'll be happy to change that if
not.
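
The delegation idea can be sketched as follows.  This is a simplified,
hypothetical stand-in (class slice, compare_int), not the vec.h code.  Note
that in this sketch the non-const overload reaches the const one through a
const-qualified pointer; calling bsearch directly on a non-const object would
select the non-const overload again.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, minimal stand-in for array_slice.
template <typename T>
class slice
{
public:
  slice (T *base, std::size_t size) : m_base (base), m_size (size) {}

  // The single real implementation: classic halving binary search.
  const T *bsearch (const void *key,
                    int (*compar) (const void *, const void *)) const
  {
    std::size_t l = 0, u = m_size;
    while (l < u)
      {
        std::size_t idx = l + (u - l) / 2;
        const T *p = m_base + idx;
        int comparison = compar (key, p);
        if (comparison < 0)
          u = idx;
        else if (comparison > 0)
          l = idx + 1;
        else
          return p;
      }
    return nullptr;
  }

  // Non-const variant: delegate to the const one, then cast the result back.
  T *bsearch (const void *key, int (*compar) (const void *, const void *))
  {
    const slice *self = this;
    return const_cast<T *> (self->bsearch (key, compar));
  }

private:
  T *m_base;
  std::size_t m_size;
};

// Example comparator for int elements; KEY points to the value sought.
inline int compare_int (const void *key, const void *elt)
{
  int k = *static_cast<const int *> (key);
  int e = *static_cast<const int *> (elt);
  return (k > e) - (k < e);
}
```

The cast is safe in the sense that the object was non-const to begin with;
only the result of the shared const implementation has its constness removed.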

Bootstrapped and tested along with code that actually uses it on
x86_64-linux.  OK for trunk?

Thanks,

Martin


gcc/ChangeLog:

2022-08-08  Martin Jambor  

* vec.h (array_slice::bsearch): New methods.
---
 gcc/vec.h | 94 +++
 1 file changed, 94 insertions(+)

diff --git a/gcc/vec.h b/gcc/vec.h
index b0477e1044c..61ebdc4ca13 100644
--- a/gcc/vec.h
+++ b/gcc/vec.h
@@ -2301,6 +2301,14 @@ public:
   // True if the array is valid, false if it is an array like INVALID.
   bool is_valid () const { return m_base || m_size == 0; }
 
+  /* Methods for binary search in sorted array_slice.  */
+  const T *bsearch (const void *key, int (*compar)(const void *,
+  const void *)) const;
+  T *bsearch (const void *key, int (*compar)(const void *, const void *));
+  const T *bsearch (const void *key,
+ int (*compar)(const void *, const void *, void *), void *) const;
+  T *bsearch (const void *key,
+ int (*compar)(const void *, const void *, void *), void *);
 private:
   iterator m_base;
   unsigned int m_size;
@@ -2361,6 +2369,92 @@ make_array_slice (T *base, unsigned int size)
   return array_slice (base, size);
 }
 
+/* Search the contents of the sorted array_slice with a binary search.  CMP is
+   the comparison function to pass to bsearch.  */
+
+template
+inline const T *
+array_slice::bsearch (const void *key,
+int (*compar) (const void *, const void *)) const
+{
+  const void *base = this->m_base;
+  size_t nmemb = this->size ();
+  size_t size = sizeof (T);
+  /* The following is a copy of glibc stdlib-bsearch.h.  */
+  size_t l, u, idx;
+  const void *p;
+  int comparison;
+
+  l = 0;
+  u = nmemb;
+  while (l < u)
+{
+  idx = (l + u) / 2;
+  p = (const void *) (((const char *) base) + (idx * size));
+  comparison = (*compar) (key, p);
+  if (comparison < 0)
+   u = idx;
+  else if (comparison > 0)
+   l = idx + 1;
+  else
+   return (T *)const_cast(p);
+}
+
+  return NULL;
+}
+
+template
+inline T *
+array_slice::bsearch (const void *key,
+int (*compar) (const void *, const void *))
+{
+  return const_cast(bsearch (key, compar));
+}
+
+/* Search the contents of the sorted array_slice with a binary search.  CMP is
+   the comparison function to pass to bsearch.  */
+
+template
+inline const T *
+array_slice::bsearch (const void *key,
+int (*compar) (const void *, const void *, void *),
+void *data) const
+{
+  const void *base = this->m_base;
+  size_t nmemb = this->size ();
+  size_t size = sizeof (T);
+  /* The following is a copy of glibc stdlib-bsearch.h.  */
+  size_t l, u, idx;
+  const void *p;
+  int comparison;
+
+  l = 0;
+  u = nmemb;
+  while (l < u)
+{
+  idx = (l + u) / 2;
+  p = (const void *) (((const char *) base) + (idx * size));
+  comparison = (*compar) (key, p, data);
+  if (comparison < 0)
+   u = idx;
+  else if (comparison > 0)
+   l = idx + 1;
+  else
+   return (T *)const_cast(p);
+}
+
+  return NULL;
+}
+
+template
+inline T *
+array_slice::bsearch (const void *key,
+int (*compar) (const void *, const void *, void *),
+void *data)
+{
+  return const_cast (bsearch (key, compar, data));
+}
+
 #if (GCC_VERSION >= 3000)
 # pragma GCC poison m_vec m_vecpfx m_vecdata
 #endif
-- 
2.37.2



Re: [PATCH] Add set/get functions for negative infinity in real.*

2022-08-26 Thread Aldy Hernandez via Gcc-patches
On Fri, Aug 26, 2022 at 6:34 PM Jeff Law  wrote:
>
>
>
> On 8/26/2022 10:24 AM, Aldy Hernandez wrote:
> > On Fri, Aug 26, 2022 at 6:08 PM Jeff Law via Gcc-patches
> >  wrote:
> >>
> >>
> >> On 8/23/2022 4:33 AM, Aldy Hernandez via Gcc-patches wrote:
> >>> For the frange implementation with endpoints I'm about to contribute,
> >>> we need to set REAL_VALUE_TYPEs with negative infinity.  The support
> >>> is already there in real.cc, but it is awkward to get at.  One could
> >>> call real_inf() and then negate the value, but I've added the ability
> >>> to pass the sign argument like many of the existing real.* functions.
> >>>
> >>> I've declared the functions in such a way to avoid changes to the
> >>> existing code base:
> >>>
> >>> // Unchanged function returning true for either +-INF.
> >>> bool real_isinf (const REAL_VALUE_TYPE *r);
> >>> // New overload to be able to specify the sign.
> >>> bool real_isinf (const REAL_VALUE_TYPE *r, int sign);
> >>> // Replacement function for setting INF, defaults to +INF.
> >>> void real_inf (REAL_VALUE_TYPE *, int sign = 0);
> >>>
> >>> Tested on x86-64 Linux.
> >>>
> >>> OK?
> >>>
> >>> gcc/ChangeLog:
> >>>
> >>>* real.cc (real_isinf): New overload.
> >>>(real_inf): Add sign argument.
> >>>* real.h (real_isinf): New overload.
> >>>(real_inf): Add sign argument.
> >> Funny in that I've fairly recently had the desire to do something a bit
> >> similar.  Let's consider 0.5, which we have a dconsthalf, but we don't
> >> have dconstmhalf for -0.5.  To get that value I create a dconsthalf
> >> object and flip its sign.  Similarly for a variety of other special
> >> constants (particularly powers of two, but a few others as well).
> > Ugh, yeah.  I've been doing a lot of gymnastics in this space because
> > franges will have REAL_VALUE_TYPE endpoints.
> In our case we have instructions that can make use of various FP constants,
> some of which may be negative.  So we need to be able to recognize those
> constants.  Leading to having to do similar gymnastics as what you're doing.

It seems real.* needs some minor TLC.  For one, a lot of these
functions should be inlined.  I suppose it wasn't meant to be abused
the way we're about to in range-op-float.cc :-).

Thanks.
Aldy



[PATCH 1/2] vec: Add array_slice constructors from non-const and gc vectors

2022-08-26 Thread Martin Jambor
Hi,

This patch adds constructors of array_slice that are required to
create them from non-const (heap or auto) vectors or from GC vectors.

The use of non-const array_slices is somewhat limited, as creating one
from const vec still leads to array_slice,
so in the end I resorted to having only read-only array_slices.
But I do need the constructor from the gc vector.

Bootstrapped and tested along with code that actually uses it on
x86_64-linux.  OK for trunk?

Thanks,

Martin


gcc/ChangeLog:

2022-08-08  Martin Jambor  

* vec.h (array_slice): Add constructors for non-const reference to
heap vector and pointers to heap vectors.
---
 gcc/vec.h | 12 
 1 file changed, 12 insertions(+)

diff --git a/gcc/vec.h b/gcc/vec.h
index eed075addc9..b0477e1044c 100644
--- a/gcc/vec.h
+++ b/gcc/vec.h
@@ -2264,6 +2264,18 @@ public:
   array_slice (const vec &v)
 : m_base (v.address ()), m_size (v.length ()) {}
 
+  template
+  array_slice (vec &v)
+: m_base (v.address ()), m_size (v.length ()) {}
+
+  template
+  array_slice (const vec *v)
+: m_base (v ? v->address () : nullptr), m_size (v ? v->length () : 0) {}
+
+  template
+  array_slice (vec *v)
+: m_base (v ? v->address () : nullptr), m_size (v ? v->length () : 0) {}
+
   iterator begin () { return m_base; }
   iterator end () { return m_base + m_size; }
 
-- 
2.37.2



Re: [PATCH] Remove uninit_analysis::use_cannot_happen

2022-08-26 Thread Jeff Law via Gcc-patches




On 8/26/2022 12:55 AM, Richard Biener via Gcc-patches wrote:

As written earlier uninit_analysis::use_cannot_happen is duplicate
functionality implemented in a complement way, not adhering to
the idea of disproving a may-uninit use and eventually (I have not
yet found a testcase it helps to avoid false positives) avoiding
false positives because of this or different ways it imposes limits
on the predicate computations.

This patch removes it.

I've bootstrapped and tested this before, but re-bootstrap/test
on x86_64-unknown-linux-gnu is running now.

* gimple-predicate-analysis.h
(uninit_analysis::use_cannot_happen): Remove.
* gimple-predicate-analysis.cc (can_be_invalidated_p): Remove.
(uninit_analysis::use_cannot_happen): Likewise.
(uninit_analysis::is_use_guarded): Do not call
use_cannot_happen.
ACK.  IIRC Aldy added this several years back to deal with an uninit
regression PR.  If it's not triggering anymore, I'm not going to lose
any sleep over losing it.


jeff



Re: [PATCH] Add set/get functions for negative infinity in real.*

2022-08-26 Thread Jeff Law via Gcc-patches




On 8/26/2022 10:25 AM, Aldy Hernandez wrote:

On Fri, Aug 26, 2022 at 6:09 PM Jeff Law via Gcc-patches
 wrote:



On 8/26/2022 9:43 AM, Aldy Hernandez via Gcc-patches wrote:

Another real.* tweak.  This time, adding a real_iszero() entry point
to match the real_isnegzero.  I could combine this patch with the
negative infinity one in this thread if y'all would prefer.

OK pending tests?

p.s. I'd really like to go and benchmark all this real.* stuff
(later), because I think we could inline a bunch of these functions in
the header file.

Same comments as the INF patch.

OK pending tests?

Yes.
Jeff


Re: [PATCH] Add set/get functions for negative infinity in real.*

2022-08-26 Thread Jeff Law via Gcc-patches




On 8/26/2022 10:24 AM, Aldy Hernandez wrote:

On Fri, Aug 26, 2022 at 6:08 PM Jeff Law via Gcc-patches
 wrote:



On 8/23/2022 4:33 AM, Aldy Hernandez via Gcc-patches wrote:

For the frange implementation with endpoints I'm about to contribute,
we need to set REAL_VALUE_TYPEs with negative infinity.  The support
is already there in real.cc, but it is awkward to get at.  One could
call real_inf() and then negate the value, but I've added the ability
to pass the sign argument like many of the existing real.* functions.

I've declared the functions in such a way to avoid changes to the
existing code base:

// Unchanged function returning true for either +-INF.
bool real_isinf (const REAL_VALUE_TYPE *r);
// New overload to be able to specify the sign.
bool real_isinf (const REAL_VALUE_TYPE *r, int sign);
// Replacement function for setting INF, defaults to +INF.
void real_inf (REAL_VALUE_TYPE *, int sign = 0);

Tested on x86-64 Linux.

OK?

gcc/ChangeLog:

   * real.cc (real_isinf): New overload.
   (real_inf): Add sign argument.
   * real.h (real_isinf): New overload.
   (real_inf): Add sign argument.

Funny in that I've fairly recently had the desire to do something a bit
similar.  Let's consider 0.5, which we have a dconsthalf, but we don't
have dconstmhalf for -0.5.  To get that value I create a dconsthalf
object and flip its sign.  Similarly for a variety of other special
constants (particularly powers of two, but a few others as well).

Ugh, yeah.  I've been doing a lot of gymnastics in this space because
franges will have REAL_VALUE_TYPE endpoints.
In our case we have instructions that can make use of various FP constants,
some of which may be negative.  So we need to be able to recognize those
constants.  Leading to having to do similar gymnastics as what you're doing.

  or


Consider making the "sign" argument a boolean.  It's defined as a single
bit bitfield in the real_value structure.   We don't want folks to pass
in values outside [0..1] for the sign if we can easily avoid it:-)

I was trying to follow all the other functions in real.cc which have
"int sign", though I suppose none of them are exported in the header
file.
They probably pre-date using bool types in GCC.  Feel free to update 
them if you need a mindless task at some point.




OK pending tests?

Of course.  I should have noted that with such a change it'd be pre-approved.

jeff


Re: [PATCH 6/6] Extend SLP permutation optimisations

2022-08-26 Thread Jeff Law via Gcc-patches




On 8/25/2022 7:07 AM, Richard Sandiford via Gcc-patches wrote:

Currently SLP tries to force permute operations "down" the graph
from loads in the hope of reducing the total number of permutations
needed or (in the best case) removing the need for the permutations
entirely.  This patch tries to extend it as follows:

- Allow loads to take a different permutation from the one they
   started with, rather than choosing between "original permutation"
   and "no permutation".

- Allow changes in both directions, if the target supports the
   reverse permutation.

- Treat the placement of permutations as a two-way dataflow problem:
   after propagating information from leaves to roots (as now), propagate
   information back up the graph.

- Take execution frequency into account when optimising for speed,
   so that (for example) permutations inside loops have a higher
   cost than permutations outside loops.

- Try to reduce the total number of permutations when optimising for
   size, even if that increases the number of permutations on a given
   execution path.

See the big block comment above vect_optimize_slp_pass for
a detailed description.

The original motivation for doing this was to add a framework that would
allow other layout differences in future.  The two main ones are:

- Make it easier to represent predicated operations, including
   predicated operations with gaps.  E.g.:

  a[0] += 1;
  a[1] += 1;
  a[3] += 1;

   could be a single load/add/store for SVE.  We could handle this
   by representing a layout such as { 0, 1, _, 2 } or { 0, 1, _, 3 }
   (depending on what's being counted).  We might need to move
   elements between lanes at various points, like with permutes.

   (This would first mean adding support for stores with gaps.)

- Make it easier to switch between an even/odd and unpermuted layout
   when switching between wide and narrow elements.  E.g. if a widening
   operation produces an even vector and an odd vector, we should try
   to keep operations on the wide elements in that order rather than
   force them to be permuted back "in order".
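
As a side note on the "both directions" point, here is a tiny sketch of what
a lane permutation and its reverse look like.  The function names and the
out[i] = in[perm[i]] convention are assumptions made for this illustration;
this is not the vectorizer's representation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Apply a lane permutation: out[i] = in[perm[i]].
template <typename T>
std::vector<T>
apply_perm (const std::vector<T> &in, const std::vector<std::size_t> &perm)
{
  std::vector<T> out (perm.size ());
  for (std::size_t i = 0; i < perm.size (); ++i)
    out[i] = in[perm[i]];
  return out;
}

// Build the reverse permutation, so that applying PERM and then the
// result restores the original lane order.
inline std::vector<std::size_t>
reverse_perm (const std::vector<std::size_t> &perm)
{
  std::vector<std::size_t> inv (perm.size ());
  for (std::size_t i = 0; i < perm.size (); ++i)
    inv[perm[i]] = i;
  return inv;
}
```

Moving a permutation across a node in either direction only pays off if the
reverse shuffle is itself cheap on the target, hence the condition quoted
above.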
Very cool.  Richi and I discussed this a bit a year or so ago -- 
basically noting that bi-directional movement is really the way to go 
and that the worst thing to do is push a permute down into the *middle* 
of a computation chain since that will tend to break FMA generation.  
Moving to the loads or stores or to another permute point ought to be 
the goal.


It looks like you've covered that excessively well :-)

Jeff



Re: [PATCH] Add set/get functions for negative infinity in real.*

2022-08-26 Thread Aldy Hernandez via Gcc-patches
On Fri, Aug 26, 2022 at 6:09 PM Jeff Law via Gcc-patches
 wrote:
>
>
>
> On 8/26/2022 9:43 AM, Aldy Hernandez via Gcc-patches wrote:
> > Another real.* tweak.  This time, adding a real_iszero() entry point
> > to match the real_isnegzero.  I could combine this patch with the
> > negative infinity one in this thread if y'all would prefer.
> >
> > OK pending tests?
> >
> > p.s. I'd really like to go and benchmark all this real.* stuff
> > (later), because I think we could inline a bunch of these functions in
> > the header file.
> Same comments as the INF patch.

OK pending tests?
Aldy
From 23338ea7f2f079a35df52b240e7422ca2d449acd Mon Sep 17 00:00:00 2001
From: Aldy Hernandez 
Date: Fri, 26 Aug 2022 16:57:09 +0200
Subject: [PATCH] Add real_iszero to real.*

We have real_isnegzero but no real_iszero.  We could memcmp with 0,
but that's just ugly.

gcc/ChangeLog:

	* real.cc (real_iszero): New.
	* real.h (real_iszero): New.
---
 gcc/real.cc | 16 
 gcc/real.h  |  6 ++
 2 files changed, 22 insertions(+)

diff --git a/gcc/real.cc b/gcc/real.cc
index dcf41b79f60..96f05ec68ca 100644
--- a/gcc/real.cc
+++ b/gcc/real.cc
@@ -1272,6 +1272,22 @@ real_isneg (const REAL_VALUE_TYPE *r)
   return r->sign;
 }
 
+/* Determine whether a floating-point value X is plus or minus zero.  */
+
+bool
+real_iszero (const REAL_VALUE_TYPE *r)
+{
+  return r->cl == rvc_zero;
+}
+
+/* Determine whether a floating-point value X is zero with SIGN.  */
+
+bool
+real_iszero (const REAL_VALUE_TYPE *r, bool sign)
+{
+  return real_iszero (r) && r->sign == sign;
+}
+
 /* Determine whether a floating-point value X is minus zero.  */
 
 bool
diff --git a/gcc/real.h b/gcc/real.h
index e01f9ed7282..ec78e8a8932 100644
--- a/gcc/real.h
+++ b/gcc/real.h
@@ -295,6 +295,12 @@ extern bool real_isneg (const REAL_VALUE_TYPE *);
 /* Determine whether a floating-point value X is minus zero.  */
 extern bool real_isnegzero (const REAL_VALUE_TYPE *);
 
+/* Determine whether a floating-point value X is plus or minus zero.  */
+extern bool real_iszero (const REAL_VALUE_TYPE *);
+
+/* Determine whether a floating-point value X is zero with SIGN.  */
+extern bool real_iszero (const REAL_VALUE_TYPE *, bool sign);
+
 /* Test relationships between reals.  */
 extern bool real_identical (const REAL_VALUE_TYPE *, const REAL_VALUE_TYPE *);
 extern bool real_equal (const REAL_VALUE_TYPE *, const REAL_VALUE_TYPE *);
-- 
2.37.1



Re: [PATCH] Add set/get functions for negative infinity in real.*

2022-08-26 Thread Aldy Hernandez via Gcc-patches
On Fri, Aug 26, 2022 at 6:08 PM Jeff Law via Gcc-patches
 wrote:
>
>
>
> On 8/23/2022 4:33 AM, Aldy Hernandez via Gcc-patches wrote:
> > For the frange implementation with endpoints I'm about to contribute,
> > we need to set REAL_VALUE_TYPEs with negative infinity.  The support
> > is already there in real.cc, but it is awkward to get at.  One could
> > call real_inf() and then negate the value, but I've added the ability
> > to pass the sign argument like many of the existing real.* functions.
> >
> > I've declared the functions in such a way to avoid changes to the
> > existing code base:
> >
> > // Unchanged function returning true for either +-INF.
> > bool real_isinf (const REAL_VALUE_TYPE *r);
> > // New overload to be able to specify the sign.
> > bool real_isinf (const REAL_VALUE_TYPE *r, int sign);
> > // Replacement function for setting INF, defaults to +INF.
> > void real_inf (REAL_VALUE_TYPE *, int sign = 0);
> >
> > Tested on x86-64 Linux.
> >
> > OK?
> >
> > gcc/ChangeLog:
> >
> >   * real.cc (real_isinf): New overload.
> >   (real_inf): Add sign argument.
> >   * real.h (real_isinf): New overload.
> >   (real_inf): Add sign argument.
>
> Funny in that I've fairly recently had the desire to do something a bit
> similar.  Let's consider 0.5, which we have a dconsthalf, but we don't
> have dconstmhalf for -0.5.  To get that value I create a dconsthalf
> object and flip its sign.  Similarly for a variety of other special
> constants (particularly powers of two, but a few others as well).

Ugh, yeah.  I've been doing a lot of gymnastics in this space because
franges will have REAL_VALUE_TYPE endpoints.

>
> Consider making the "sign" argument a boolean.  It's defined as a single
> bit bitfield in the real_value structure.   We don't want folks to pass
> in values outside [0..1] for the sign if we can easily avoid it:-)

I was trying to follow all the other functions in real.cc which have
"int sign", though I suppose none of them are exported in the header
file.

OK pending tests?
Aldy
From 86664496f62f5d4d33da0347f5d5c017bd204513 Mon Sep 17 00:00:00 2001
From: Aldy Hernandez 
Date: Sat, 20 Aug 2022 12:41:53 +0200
Subject: [PATCH] Add set/get functions for negative infinity in real.*

For the frange implementation with endpoints I'm about to contribute,
we need to set REAL_VALUE_TYPEs with negative infinity.  The support
is already there in real.cc, but it is awkward to get at.  One could
call real_inf() and then negate the value, but I've added the ability
to pass the sign argument like many of the existing real.* functions.

I've declared the functions in such a way to avoid changes to the
existing code base:

// Unchanged function returning true for either +-INF.
bool real_isinf (const REAL_VALUE_TYPE *r);
// New overload to be able to specify the sign.
bool real_isinf (const REAL_VALUE_TYPE *r, bool sign);
// Replacement function for setting INF, defaults to +INF.
void real_inf (REAL_VALUE_TYPE *, bool sign = false);
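
A self-contained sketch of the overload-plus-default-argument shape described
above.  The mini_real type and the function names are hypothetical stand-ins,
not the real.h definitions; the point is only that existing callers keep
compiling while new callers can pass a sign.

```cpp
#include <cassert>

// Hypothetical, heavily simplified stand-in for the real value record.
enum value_class { vc_zero, vc_normal, vc_inf, vc_nan };

struct mini_real
{
  value_class cl;
  unsigned int sign : 1;   // single-bit sign, as mentioned in the review
};

// Existing entry point: true for either infinity.
inline bool is_inf (const mini_real *r)
{
  return r->cl == vc_inf;
}

// New overload: true only for the infinity with the requested sign.
inline bool is_inf (const mini_real *r, bool sign)
{
  return is_inf (r) && r->sign == sign;
}

// Setter with a defaulted sign, so old call sites still produce +Inf.
inline void set_inf (mini_real *r, bool sign = false)
{
  r->cl = vc_inf;
  r->sign = sign;
}
```

Using bool for the sign keeps callers from passing anything outside [0..1].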

gcc/ChangeLog:

	* real.cc (real_isinf): New overload.
	(real_inf): Add sign argument.
	* real.h (real_isinf): New overload.
	(real_inf): Add sign argument.
---
 gcc/real.cc | 14 +++---
 gcc/real.h  |  5 -
 2 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/gcc/real.cc b/gcc/real.cc
index 4e63b1449c5..dcf41b79f60 100644
--- a/gcc/real.cc
+++ b/gcc/real.cc
@@ -1234,6 +1234,14 @@ real_isinf (const REAL_VALUE_TYPE *r)
   return (r->cl == rvc_inf);
 }
 
+/* Determine whether a floating-point value X is infinite with SIGN.  */
+
+bool
+real_isinf (const REAL_VALUE_TYPE *r, bool sign)
+{
+  return real_isinf (r) && r->sign == sign;
+}
+
 /* Determine whether a floating-point value X is a NaN.  */
 
 bool
@@ -2484,12 +2492,12 @@ dconst_sqrt2_ptr (void)
   return &value;
 }
 
-/* Fills R with +Inf.  */
+/* Fills R with Inf with SIGN.  */
 
 void
-real_inf (REAL_VALUE_TYPE *r)
+real_inf (REAL_VALUE_TYPE *r, bool sign)
 {
-  get_inf (r, 0);
+  get_inf (r, sign);
 }
 
 /* Fills R with a NaN whose significand is described by STR.  If QUIET,
diff --git a/gcc/real.h b/gcc/real.h
index 845ef29e3a4..e01f9ed7282 100644
--- a/gcc/real.h
+++ b/gcc/real.h
@@ -277,6 +277,9 @@ extern bool real_compare (int, const REAL_VALUE_TYPE *, const REAL_VALUE_TYPE *)
 /* Determine whether a floating-point value X is infinite.  */
 extern bool real_isinf (const REAL_VALUE_TYPE *);
 
+/* Determine whether a floating-point value X is infinite with SIGN.  */
+extern bool real_isinf (const REAL_VALUE_TYPE *, bool sign);
+
 /* Determine whether a floating-point value X is a NaN.  */
 extern bool real_isnan (const REAL_VALUE_TYPE *);
 
@@ -331,7 +334,7 @@ extern long real_to_target (long *, const REAL_VALUE_TYPE *, format_helper);
 extern void real_from_target (REAL_VALUE_TYPE *, const long *,
 			  format_helper);
 
-extern void real_inf (REAL_VALUE_TYPE *);
+extern void real_inf (REAL_VALUE_TYPE *, bool sign = false);
 
 extern bool real_nan (REAL_VALUE_TYPE *, const char *, i

Re: [PATCH 1/2] Improve compute_control_dep_chain documentation

2022-08-26 Thread Jeff Law via Gcc-patches




On 8/26/2022 6:34 AM, Richard Biener via Gcc-patches wrote:

In the quest to understand how compute_control_dep_chain works I've
produced the following two changes, documenting PR106754 on the
way.

Bootstrap and regtest is running on x86_64-unknown-linux-gnu,
the changes should be no-ops but hopefully improve understanding
of the code.

Thanks so much for this.  Every time I've had to get into that code it
makes my head hurt -- anything that clarifies is an improvement.


Jeff



Re: [PATCH] Add set/get functions for negative infinity in real.*

2022-08-26 Thread Jeff Law via Gcc-patches




On 8/26/2022 9:43 AM, Aldy Hernandez via Gcc-patches wrote:

Another real.* tweak.  This time, adding a real_iszero() entry point
to match the real_isnegzero.  I could combine this patch with the
negative infinity one in this thread if y'all would prefer.

OK pending tests?

p.s. I'd really like to go and benchmark all this real.* stuff
(later), because I think we could inline a bunch of these functions in
the header file.

Same comments as the INF patch.

jeff



Re: [PATCH] Add set/get functions for negative infinity in real.*

2022-08-26 Thread Jeff Law via Gcc-patches




On 8/23/2022 4:33 AM, Aldy Hernandez via Gcc-patches wrote:

For the frange implementation with endpoints I'm about to contribute,
we need to set REAL_VALUE_TYPEs with negative infinity.  The support
is already there in real.cc, but it is awkward to get at.  One could
call real_inf() and then negate the value, but I've added the ability
to pass the sign argument like many of the existing real.* functions.

I've declared the functions in such a way to avoid changes to the
existing code base:

// Unchanged function returning true for either +-INF.
bool real_isinf (const REAL_VALUE_TYPE *r);
// New overload to be able to specify the sign.
bool real_isinf (const REAL_VALUE_TYPE *r, int sign);
// Replacement function for setting INF, defaults to +INF.
void real_inf (REAL_VALUE_TYPE *, int sign = 0);

Tested on x86-64 Linux.

OK?

gcc/ChangeLog:

* real.cc (real_isinf): New overload.
(real_inf): Add sign argument.
* real.h (real_isinf): New overload.
(real_inf): Add sign argument.


Funny, in that I've fairly recently wanted to do something a bit
similar.  Consider 0.5: we have dconsthalf for it, but we don't
have a dconstmhalf for -0.5.  To get that value I create a dconsthalf
object and flip its sign.  The same goes for a variety of other special
constants (particularly powers of two, but a few others as well).


Consider making the "sign" argument a boolean.  It's defined as a single 
bit bitfield in the real_value structure.   We don't want folks to pass 
in values outside [0..1] for the sign if we can easily avoid it:-)
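
To make the concern concrete, here is a minimal standalone sketch.  The
toy_real struct and both helpers are hypothetical stand-ins, not the
real.h definitions; only the one-bit sign field mirrors what is described
above.  With an int parameter, an out-of-range value such as 2 silently
never matches; with a bool parameter it is normalized to 1 first.

```cpp
// Hypothetical stand-in for REAL_VALUE_TYPE; only the 1-bit sign matters.
struct toy_real
{
  unsigned int is_inf : 1;
  unsigned int sign : 1;
};

// int version: a caller passing 2 (or -1) can never match the 1-bit field.
inline bool toy_isinf_int (const toy_real *r, int sign)
{
  return r->is_inf && static_cast<int> (r->sign) == sign;
}

// bool version: any nonzero argument becomes true before the comparison.
inline bool toy_isinf_bool (const toy_real *r, bool sign)
{
  return r->is_inf && r->sign == sign;
}
```

Whether the int behaviour counts as a bug or merely a trap is a judgment
call; the bool signature simply removes the question.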




jeff



Re: [PATCH] Add support for floating point endpoints to frange.

2022-08-26 Thread Aldy Hernandez via Gcc-patches
[pinskia: I'm CCing you as the author of the match.pd pattern.]

So, as I wrap up the work here (latest patch attached), I see there's
another phiopt regression (not spaceship related).  I was hoping
someone could either give me a hand, or offer some guidance.

The failure is in gcc.dg/tree-ssa/phi-opt-24.c.

We fail to transform the following into -A:

/* { dg-options "-O2 -fno-signed-zeros -fdump-tree-phiopt" } */

float f0(float A)
{
  // A == 0 ? A : -A is the same as -A
  if (A == 0)  return A;
  return -A;
}

This is because the abs/negative match.pd pattern here:

/* abs/negative simplifications moved from fold_cond_expr_with_comparison,
   Need to handle (A - B) case as fold_cond_expr_with_comparison does.
   Need to handle UN* comparisons.
   ...
   ...

Sees IL that has the 0.0 propagated.

Instead of:

  <bb 2> [local count: 1073741824]:
  if (A_2(D) == 0.0)
    goto <bb 4>; [34.00%]
  else
    goto <bb 3>; [66.00%]

  <bb 3> [local count: 708669601]:
  _3 = -A_2(D);

  <bb 4> [local count: 1073741824]:
  # _1 = PHI <A_2(D)(2), _3(3)>

It now sees:

  <bb 4> [local count: 1073741824]:
  # _1 = PHI <0.0(2), _3(3)>

which it leaves untouched, causing the if conditional to survive.

Is this something that can be done by improving the match.pd pattern,
or should be done elsewhere?

Thanks.
Aldy
From a1f64017defaf6d5841b54e8bc867d63738a9f5c Mon Sep 17 00:00:00 2001
From: Aldy Hernandez 
Date: Tue, 23 Aug 2022 13:36:33 +0200
Subject: [PATCH] Add support for floating point endpoints to frange.

The current implementation of frange is just a type with some bits to
represent NAN and INF.  We can do better and represent endpoints to
ultimately solve longstanding PRs such as PR24021.  This patch adds
these endpoints.  In follow-up patches I will add support for
relational operators using them, as well as a bare bones PLUS_EXPR
range-op-float entry to solve the PR.

I have chosen to use REAL_VALUE_TYPEs for the endpoints, since that's
what we use underneath the trees.  This will be somewhat analogous to
our eventual use of wide-ints in the irange.  No sense going through
added levels of indirection if we can avoid it.  That, plus real.*
already has a nice API for dealing with floats.

With this patch, ranges will be closed floating-point intervals, which
makes the implementation simpler, since we don't have to keep track of
open/closed intervals.  This is conservative enough for use in the
ranger world, as we'd rather err on the side of more elements in a
range than fewer.

For example, even though we cannot precisely represent the open
interval (3.0, 5.0) with this approach, it is perfectly reasonable
to represent it as [3.0, 5.0] since the closed interval is a superset
of the open one.  In the VRP/ranger world, it is always safe to
err on the side of a larger range.  After all, when we don't know
anything about a range, we just use VARYING, which is a fancy term
for a range spanning the entire domain.

Since REAL_VALUE_TYPEs have properly defined infinity and NAN
semantics, all the math can be made to work:

[-INF, 3.0] !NAN	=> Numbers <= 3.0 (NAN cannot happen)
[3.0, +INF]		=> Numbers >= 3.0 (NAN is possible)
[-INF, +INF]		=> VARYING (NAN is possible)
[-INF, +INF] !NAN	=> Entire domain.  NAN cannot happen.
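
As a rough illustration of the notation above (not the frange code from
this patch), a closed interval plus a separate NAN flag could be modelled
as follows.  toy_frange, contains and at_most_three are hypothetical
names, and double stands in for REAL_VALUE_TYPE.

```cpp
#include <cmath>
#include <limits>

// Hypothetical miniature of a float range: closed endpoints plus a NaN flag.
struct toy_frange
{
  double lo;
  double hi;
  bool maybe_nan;

  // NaN is unordered, so it is tracked by the flag, not by the endpoints.
  bool contains (double v) const
  {
    if (std::isnan (v))
      return maybe_nan;
    return lo <= v && v <= hi;
  }
};

// Models "[-INF, 3.0] !NAN": numbers <= 3.0, NaN cannot happen.
inline toy_frange at_most_three ()
{
  return { -std::numeric_limits<double>::infinity (), 3.0, false };
}
```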

Also, since REAL_VALUE_TYPEs can represent the minimum and maximum
representable values of a TYPE_MODE, we can disambiguate between them
and negative and positive infinity (see get_max_float in real.cc).

This also makes the math all work.  For example, suppose we know
nothing about x and y (VARYING).  On the TRUE side of x > y, we can
deduce that:

	(a) x cannot be NAN
	(b) y cannot be NAN
	(c) y cannot be +INF.

(c) means that we can drop the upper bound of "y" from +INF to the
maximum representable value for its type.
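
A hedged sketch of deductions (b) and (c), again using double in place of
REAL_VALUE_TYPE.  toy_bounds, varying and refine_y_on_x_gt_y are made-up
names for illustration and do not appear in the patch.

```cpp
#include <limits>

// Hypothetical closed range with a NaN flag.
struct toy_bounds
{
  double lo;
  double hi;
  bool maybe_nan;
};

// VARYING: the whole domain, NaN possible.
inline toy_bounds varying ()
{
  const double inf = std::numeric_limits<double>::infinity ();
  return { -inf, inf, true };
}

// Refine what is known about y on the TRUE side of x > y.
inline toy_bounds refine_y_on_x_gt_y (toy_bounds y)
{
  const double max_finite = std::numeric_limits<double>::max ();
  y.maybe_nan = false;		// (b): a comparison with NaN is false.
  if (y.hi > max_finite)	// (c): nothing is greater than +INF.
    y.hi = max_finite;
  return y;
}
```

This only works because the largest finite value and +INF are distinct
endpoints, which is the point made above.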

Having endpoints with different representations for infinity and the
maximum representable values means we can drop the +-INF properties
we currently have in the frange.  I will do this as a follow-up patch,
as well as contributing more detailed relational operators.

gcc/ChangeLog:

	* value-range-pretty-print.cc (vrange_printer::visit): Adapt for
	endpoints.
	* value-range.cc (frange::set): Same.
	(frange::normalize_kind): Same.
	(frange::operator=): Same.
	(frange::verify_range): Same.
	(frange::union_): Adapt for endpoints and special case NAN
	ranges.
	(frange::intersect): Same.
	(frange::operator==): Special case NAN ranges.
	(frange::contains_p): New.
	(frange::singleton_p): New.
	(frange_float): New.
	(frange_nan): New.
	(early_nan_resolve): New.
	(real_max_representable): New.
	(real_min_representable): New.
	(range_tests_floats): New.
	(range_tests): Call range_tests_floats.
	* value-range.h (frange::lower_bound): New.
	(frange::upper_bound): New.
	(vrp_val_min): Use real_inf with a sign argument.
	(frange::frange): New.
	(frange::set_varying): Adapt for endpoints.
	(frange::set_undefined): Adapt for endpoints.
---
 gcc/range-op-float.cc   |  16 +-
 gcc/value-range-pretty-print.cc |  12 +-
 gcc/va

Re: [PATCH V2] Add warning options -W[no-]compare-distinct-pointer-types

2022-08-26 Thread Jose E. Marchesi via Gcc-patches


> On Thu, 18 Aug 2022, Jose E. Marchesi via Gcc-patches wrote:
>
>> diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
>> index de8780a1502..04af02add37 100644
>> --- a/gcc/c/c-typeck.cc
>> +++ b/gcc/c/c-typeck.cc
>> @@ -12397,7 +12397,8 @@ build_binary_op (location_t location, enum tree_code 
>> code,
>>  }
>>else
>>  /* Avoid warning about the volatile ObjC EH puts on decls.  */
>> -if (!objc_ok)
>> +if (!objc_ok
>> +&& warn_compare_distinct_pointer_types)
>>pedwarn (location, 0,
>> "comparison of distinct pointer types lacks a cast");
>>  
>> @@ -12517,8 +12518,9 @@ build_binary_op (location_t location, enum tree_code 
>> code,
>>int qual = ENCODE_QUAL_ADDR_SPACE (as_common);
>>result_type = build_pointer_type
>>(build_qualified_type (void_type_node, qual));
>> -  pedwarn (location, 0,
>> -   "comparison of distinct pointer types lacks a cast");
>> +  if (warn_compare_distinct_pointer_types)
>> +pedwarn (location, 0,
>> + "comparison of distinct pointer types lacks a 
>> cast");
>
> I think this should use OPT_Wcompare_distinct_pointer_types in place of 0, 
> and then you shouldn't need to check warn_compare_distinct_pointer_types 
> (as well as the diagnostic then automatically telling the user what option 
> controls it).

Ouch, yes, better to use pedwarn the way it is intended to be used.
Sorry for the silly oversight :)

Sending a V3 with this modification.


[PATCH] Make all default vrange setters set VARYING.

2022-08-26 Thread Aldy Hernandez via Gcc-patches
frange is using some of the default vrange setters, some of which are
leaving the range in an undefined state.  We hadn't noticed this
because frange and unsupported_range, both of which use some of
the default implementations, weren't being used much.

We can never go wrong with setting VARYING ;-).
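
The idea, in a deliberately tiny sketch: a default setter that cannot
represent the requested property falls back to VARYING, which is imprecise
but still a correct over-approximation.  All names below (toy_kind,
toy_vrange, toy_irange) are hypothetical and only loosely modelled on the
vrange hierarchy.

```cpp
// Hypothetical miniature of a range class hierarchy.
enum class toy_kind { undefined, varying, nonnegative };

struct toy_vrange
{
  toy_kind kind = toy_kind::undefined;
  virtual ~toy_vrange () = default;

  void set_varying () { kind = toy_kind::varying; }

  // Default: no way to express "nonnegative" here, so fall back to
  // VARYING instead of leaving the object in an undefined state.
  virtual void set_nonnegative () { set_varying (); }
};

struct toy_irange : toy_vrange
{
  // A class that can represent the property overrides the default.
  void set_nonnegative () override { kind = toy_kind::nonnegative; }
};
```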

Will commit after tests complete.

gcc/ChangeLog:

* value-range.cc (vrange::set): Set varying.
(vrange::set_nonzero): Same.
(vrange::set_zero): Same.
(vrange::set_nonnegative): Same.
---
 gcc/value-range.cc | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/gcc/value-range.cc b/gcc/value-range.cc
index d056f7356e1..edd10bf5794 100644
--- a/gcc/value-range.cc
+++ b/gcc/value-range.cc
@@ -94,8 +94,9 @@ vrange::singleton_p (tree *) const
 }
 
 void
-vrange::set (tree, tree, value_range_kind)
+vrange::set (tree min, tree, value_range_kind)
 {
+  set_varying (TREE_TYPE (min));
 }
 
 tree
@@ -168,18 +169,21 @@ vrange::nonzero_p () const
 }
 
 void
-vrange::set_nonzero (tree)
+vrange::set_nonzero (tree type)
 {
+  set_varying (type);
 }
 
 void
-vrange::set_zero (tree)
+vrange::set_zero (tree type)
 {
+  set_varying (type);
 }
 
 void
-vrange::set_nonnegative (tree)
+vrange::set_nonnegative (tree type)
 {
+  set_varying (type);
 }
 
 bool
-- 
2.37.1



[PATCH] [ranger] x == -0.0 does not mean we can replace x with -0.0

2022-08-26 Thread Aldy Hernandez via Gcc-patches
On the true side of x == -0.0, we can't just blindly value propagate
the -0.0 into every use of x because x could be +0.0 (or vice versa).

With this change, we only allow the transformation if
!HONOR_SIGNED_ZEROS or if the range is known not to contain 0.
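
For readers less familiar with signed zeros, a small standalone sketch of
why the substitution is unsafe.  The helper names are made up for this
example; the only library calls are std::signbit and std::copysign.

```cpp
#include <cmath>

// True when a == b holds although the two values differ in their sign bit.
inline bool equal_but_different_sign (double a, double b)
{
  return a == b && std::signbit (a) != std::signbit (b);
}

// A use of x whose result depends on the sign of zero.
inline double sign_of (double x)
{
  return std::copysign (1.0, x);
}
```

Since +0.0 == -0.0 compares true, knowing x == -0.0 does not tell us which
zero x is, and sign_of (x) differs between the two.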

Will commit after tests complete.

gcc/ChangeLog:

* range-op-float.cc (foperator_equal::op1_range): Do not blindly
copy op2 range when honoring signed zeros.
---
 gcc/range-op-float.cc | 17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/gcc/range-op-float.cc b/gcc/range-op-float.cc
index ad2fae578d2..ff9fe312acf 100644
--- a/gcc/range-op-float.cc
+++ b/gcc/range-op-float.cc
@@ -252,8 +252,21 @@ foperator_equal::op1_range (frange &r, tree type,
   switch (get_bool_state (r, lhs, type))
 {
 case BRS_TRUE:
-  // If it's true, the result is the same as OP2.
-  r = op2;
+  if (HONOR_SIGNED_ZEROS (type)
+ && op2.contains_p (build_zero_cst (type)))
+   {
+ // With signed zeros, x == -0.0 does not mean we can replace
+ // x with -0.0, because x may be either +0.0 or -0.0.
+ r.set_varying (type);
+   }
+  else
+   {
+ // If it's true, the result is the same as OP2.
+ //
+ // If the range does not actually contain zeros, this should
+ // always be OK.
+ r = op2;
+   }
   // The TRUE side of op1 == op2 implies op1 is !NAN.
   r.set_nan (fp_prop::NO);
   break;
-- 
2.37.1



Re: [PATCH] Add set/get functions for negative infinity in real.*

2022-08-26 Thread Aldy Hernandez via Gcc-patches
Another real.* tweak.  This time, adding a real_iszero() entry point
to match the real_isnegzero.  I could combine this patch with the
negative infinity one in this thread if y'all would prefer.

OK pending tests?

p.s. I'd really like to go and benchmark all this real.* stuff
(later), because I think we could inline a bunch of these functions in
the header file.

On Tue, Aug 23, 2022 at 12:33 PM Aldy Hernandez  wrote:
>
> For the frange implementation with endpoints I'm about to contribute,
> we need to set REAL_VALUE_TYPEs with negative infinity.  The support
> is already there in real.cc, but it is awkward to get at.  One could
> call real_inf() and then negate the value, but I've added the ability
> to pass the sign argument like many of the existing real.* functions.
>
> I've declared the functions in such a way to avoid changes to the
> existing code base:
>
> // Unchanged function returning true for either +-INF.
> bool real_isinf (const REAL_VALUE_TYPE *r);
> // New overload to be able to specify the sign.
> bool real_isinf (const REAL_VALUE_TYPE *r, int sign);
> // Replacement function for setting INF, defaults to +INF.
> void real_inf (REAL_VALUE_TYPE *, int sign = 0);
>
> Tested on x86-64 Linux.
>
> OK?
>
> gcc/ChangeLog:
>
> * real.cc (real_isinf): New overload.
> (real_inf): Add sign argument.
> * real.h (real_isinf): New overload.
> (real_inf): Add sign argument.
> ---
>  gcc/real.cc | 14 +++---
>  gcc/real.h  |  5 -
>  2 files changed, 15 insertions(+), 4 deletions(-)
>
> diff --git a/gcc/real.cc b/gcc/real.cc
> index 4e63b1449c5..f570ca8e85b 100644
> --- a/gcc/real.cc
> +++ b/gcc/real.cc
> @@ -1234,6 +1234,14 @@ real_isinf (const REAL_VALUE_TYPE *r)
>return (r->cl == rvc_inf);
>  }
>
> +/* Determine whether a floating-point value X is infinite with SIGN.  */
> +
> +bool
> +real_isinf (const REAL_VALUE_TYPE *r, int sign)
> +{
> +  return real_isinf (r) && r->sign == sign;
> +}
> +
>  /* Determine whether a floating-point value X is a NaN.  */
>
>  bool
> @@ -2484,12 +2492,12 @@ dconst_sqrt2_ptr (void)
>return &value;
>  }
>
> -/* Fills R with +Inf.  */
> +/* Fills R with Inf with SIGN.  */
>
>  void
> -real_inf (REAL_VALUE_TYPE *r)
> +real_inf (REAL_VALUE_TYPE *r, int sign)
>  {
> -  get_inf (r, 0);
> +  get_inf (r, sign);
>  }
>
>  /* Fills R with a NaN whose significand is described by STR.  If QUIET,
> diff --git a/gcc/real.h b/gcc/real.h
> index 845ef29e3a4..76360b603fb 100644
> --- a/gcc/real.h
> +++ b/gcc/real.h
> @@ -277,6 +277,9 @@ extern bool real_compare (int, const REAL_VALUE_TYPE *, 
> const REAL_VALUE_TYPE *)
>  /* Determine whether a floating-point value X is infinite.  */
>  extern bool real_isinf (const REAL_VALUE_TYPE *);
>
> +/* Determine whether a floating-point value X is infinite with SIGN.  */
> +extern bool real_isinf (const REAL_VALUE_TYPE *, int sign);
> +
>  /* Determine whether a floating-point value X is a NaN.  */
>  extern bool real_isnan (const REAL_VALUE_TYPE *);
>
> @@ -331,7 +334,7 @@ extern long real_to_target (long *, const REAL_VALUE_TYPE 
> *, format_helper);
>  extern void real_from_target (REAL_VALUE_TYPE *, const long *,
>   format_helper);
>
> -extern void real_inf (REAL_VALUE_TYPE *);
> +extern void real_inf (REAL_VALUE_TYPE *, int sign = 0);
>
>  extern bool real_nan (REAL_VALUE_TYPE *, const char *, int, format_helper);
>
> --
> 2.37.1
>
From 8bedd64a6b8df23caf78f5f411dc17e30ad35e88 Mon Sep 17 00:00:00 2001
From: Aldy Hernandez 
Date: Fri, 26 Aug 2022 16:57:09 +0200
Subject: [PATCH] Add real_iszero to real.*

We have real_isnegzero but no real_iszero.  We could memcmp with 0,
but that's just ugly.

gcc/ChangeLog:

	* real.cc (real_iszero): New.
	* real.h (real_iszero): New.
---
 gcc/real.cc | 16 
 gcc/real.h  |  6 ++
 2 files changed, 22 insertions(+)

diff --git a/gcc/real.cc b/gcc/real.cc
index f570ca8e85b..d362680fe9b 100644
--- a/gcc/real.cc
+++ b/gcc/real.cc
@@ -1272,6 +1272,22 @@ real_isneg (const REAL_VALUE_TYPE *r)
   return r->sign;
 }
 
+/* Determine whether a floating-point value X is plus or minus zero.  */
+
+bool
+real_iszero (const REAL_VALUE_TYPE *r)
+{
+  return r->cl == rvc_zero;
+}
+
+/* Determine whether a floating-point value X is zero with SIGN.  */
+
+bool
+real_iszero (const REAL_VALUE_TYPE *r, int sign)
+{
+  return real_iszero (r) && r->sign == sign;
+}
+
 /* Determine whether a floating-point value X is minus zero.  */
 
 bool
diff --git a/gcc/real.h b/gcc/real.h
index 76360b603fb..20d85378855 100644
--- a/gcc/real.h
+++ b/gcc/real.h
@@ -295,6 +295,12 @@ extern bool real_isneg (const REAL_VALUE_TYPE *);
 /* Determine whether a floating-point value X is minus zero.  */
 extern bool real_isnegzero (const REAL_VALUE_TYPE *);
 
+/* Determine whether a floating-point value X is plus or minus zero.  */
+extern bool real_iszero (const REAL_VALUE_TYPE *);
+
+/* Determine whether a floating-point value X is zero with SIGN.  */
+ext

[committed] libstdc++: Simplify std::error_code and std::error_condition

2022-08-26 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux, pushed to trunk.

-- >8 --

This removes the redundant operator=(E) from std::error_code and
std::error_condition. Without that overload, assignment from a custom
type will use the templated constructor to create a temporary and then
use the trivial copy assignment operator. With the overloaded
assignment, we have to check the constraints twice as often, because
that overload and its constraints are checked for simple copy
assignments (including the one in the overloaded assignment operator
itself!)
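
A user-side sketch of what this means in practice.  The app namespace, its
errc enum and category are invented for the example; only std::error_code,
std::error_category and std::is_error_code_enum are real library names.
Assignment from the enum should compile whether or not the library
provides the dedicated operator=, because the converting constructor plus
copy assignment covers it.

```cpp
#include <string>
#include <system_error>

namespace app
{
  enum class errc { broken = 1 };

  struct category_impl : std::error_category
  {
    const char* name () const noexcept override { return "app"; }
    std::string message (int) const override { return "broken"; }
  };

  inline const std::error_category& category ()
  {
    static const category_impl instance{};
    return instance;
  }

  // Found by argument-dependent lookup from std::error_code's constructor.
  inline std::error_code make_error_code (errc e)
  {
    return std::error_code (static_cast<int> (e), category ());
  }
}

namespace std
{
  template<> struct is_error_code_enum<app::errc> : true_type { };
}

// Assigns the enum: a temporary error_code is built via the templated
// constructor and then copy-assigned.
inline std::error_code assign_from_enum ()
{
  std::error_code ec;
  ec = app::errc::broken;
  return ec;
}
```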

Also add tests that ADL is used as per LWG 3629.

libstdc++-v3/ChangeLog:

* include/std/system_error (error_code::_Check): New alias
template for constructor SFINAE constraint.
(error_code::error_code(ErrorCodeEnum)): Use it.
(error_code::operator=(ErrorCodeEnum)): Remove.
(error_condition::_Check): New alias template for constraint.
(error_condition::error_condition(ErrorConditionEnum)): Use it.
(error_condition::operator=(ErrorConditionEnum)): Remove.
* testsuite/19_diagnostics/error_code/cons/1.cc: Check
constructor taking user-defined error enum.
* testsuite/19_diagnostics/error_condition/cons/1.cc: Likewise.
---
 libstdc++-v3/include/std/system_error | 38 ++-
 .../19_diagnostics/error_code/cons/1.cc   | 31 +--
 .../19_diagnostics/error_condition/cons/1.cc  | 31 +--
 3 files changed, 75 insertions(+), 25 deletions(-)

diff --git a/libstdc++-v3/include/std/system_error 
b/libstdc++-v3/include/std/system_error
index 87cf720f6e3..050439427cc 100644
--- a/libstdc++-v3/include/std/system_error
+++ b/libstdc++-v3/include/std/system_error
@@ -216,6 +216,10 @@ _GLIBCXX_END_INLINE_ABI_NAMESPACE(_V2)
*/
   class error_code
   {
+template<typename _ErrorCodeEnum>
+  using _Check
+   = __enable_if_t<is_error_code_enum<_ErrorCodeEnum>::value>;
+
   public:
 error_code() noexcept
 : _M_value(0), _M_cat(&system_category()) { }
@@ -223,11 +227,15 @@ _GLIBCXX_END_INLINE_ABI_NAMESPACE(_V2)
 error_code(int __v, const error_category& __cat) noexcept
 : _M_value(__v), _M_cat(&__cat) { }
 
-template::value>::type>
+/// Initialize with a user-defined type, by calling make_error_code.
+template>
   error_code(_ErrorCodeEnum __e) noexcept
   { *this = make_error_code(__e); }
 
+error_code(const error_code&) = default;
+error_code& operator=(const error_code&) = default;
+
 void
 assign(int __v, const error_category& __cat) noexcept
 {
@@ -239,13 +247,6 @@ _GLIBCXX_END_INLINE_ABI_NAMESPACE(_V2)
 clear() noexcept
 { assign(0, system_category()); }
 
-// DR 804.
-template
-  typename enable_if::value,
-error_code&>::type
-  operator=(_ErrorCodeEnum __e) noexcept
-  { return *this = make_error_code(__e); }
-
 /// The error value.
 [[__nodiscard__]]
 int
@@ -345,6 +346,10 @@ _GLIBCXX_END_INLINE_ABI_NAMESPACE(_V2)
*/
   class error_condition
   {
+template<typename _ErrorConditionEnum>
+  using _Check
+   = __enable_if_t<is_error_condition_enum<_ErrorConditionEnum>::value>;
+
   public:
 /// Initialize with a zero (no error) value and the generic category.
 error_condition() noexcept
@@ -354,11 +359,15 @@ _GLIBCXX_END_INLINE_ABI_NAMESPACE(_V2)
 error_condition(int __v, const error_category& __cat) noexcept
 : _M_value(__v), _M_cat(&__cat) { }
 
-template::value>::type>
+/// Initialize with a user-defined type, by calling make_error_condition.
+template>
   error_condition(_ErrorConditionEnum __e) noexcept
   { *this = make_error_condition(__e); }
 
+error_condition(const error_condition&) = default;
+error_condition& operator=(const error_condition&) = default;
+
 /// Set the value and category.
 void
 assign(int __v, const error_category& __cat) noexcept
@@ -367,13 +376,6 @@ _GLIBCXX_END_INLINE_ABI_NAMESPACE(_V2)
   _M_cat = &__cat;
 }
 
-// DR 804.
-template
-  typename enable_if::value, error_condition&>::type
-  operator=(_ErrorConditionEnum __e) noexcept
-  { return *this = make_error_condition(__e); }
-
 /// Reset the value and category to the default-constructed state.
 void
 clear() noexcept
diff --git a/libstdc++-v3/testsuite/19_diagnostics/error_code/cons/1.cc 
b/libstdc++-v3/testsuite/19_diagnostics/error_code/cons/1.cc
index f454fc270d8..d48d5c1ffe8 100644
--- a/libstdc++-v3/testsuite/19_diagnostics/error_code/cons/1.cc
+++ b/libstdc++-v3/testsuite/19_diagnostics/error_code/cons/1.cc
@@ -22,23 +22,46 @@
 #include 
 #include 
 
+namespace adl
+{
+  struct Error { };
+
+  const Error err;
+
+  struct category : std::error_category
+  {
+const char* name() const noexcept override { return "adl"; }
+std::string message(int) const { return ""; }
+  };
+
+  const category cat;
+
+  std::error_code
+  make_error_code(Error) { return std::error_code(999, cat); }
+}
+
+template<> struct std::is_error_code_enum<adl::Error> : std::true_type { };
+
 int main()
 {
-  // 1
+ 

[committed] libstdc++: Add nonnull to starts_with/ends_with/contains string members

2022-08-26 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux, pushed to trunk.

-- >8 --

Ideally this wouldn't be needed, because eventually these pointers all
get passed to either the basic_string_view(const CharT*) constructor, or
to basic_string_view::find(const CharT*), both of which already have the
attribute. But for that to work requires optimization, so that the null
value gets propagated through the call chain.

Adding it explicitly to each member that requires a non-null pointer
makes the diagnostics more reliable even without optimization. It's
better to give a diagnostic earlier anyway, at the actual problematic
call in the user's code.
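
A minimal sketch of the same idea on a free function.  toy_starts_with is
a made-up helper, not the libstdc++ member; [[gnu::nonnull]] is a GNU
extension understood by GCC and Clang, and other compilers may ignore it
with a warning.  With the attribute on the outermost function, a call that
passes a literal null pointer can be diagnosed at that call without
relying on inlining.

```cpp
#include <string_view>

// With no argument list, the attribute applies to every pointer parameter.
[[gnu::nonnull]]
inline bool toy_starts_with (std::string_view s, const char *prefix)
{
  const std::string_view p (prefix);
  return s.substr (0, p.size ()) == p;
}
```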

libstdc++-v3/ChangeLog:

* include/bits/basic_string.h (starts_with, ends_with, contains):
Add nonnull attribute.
* include/bits/cow_string.h (starts_with, ends_with, contains):
Likewise.
* include/std/string_view (starts_with, ends_with, contains):
Likewise.
* testsuite/21_strings/basic_string/operations/contains/nonnull.cc
* testsuite/21_strings/basic_string/operations/ends_with/nonnull.cc
* testsuite/21_strings/basic_string/operations/starts_with/nonnull.cc
* testsuite/21_strings/basic_string_view/operations/contains/nonnull.cc
* testsuite/21_strings/basic_string_view/operations/ends_with/nonnull.cc
* 
testsuite/21_strings/basic_string_view/operations/starts_with/nonnull.cc
---
 libstdc++-v3/include/bits/basic_string.h |  3 +++
 libstdc++-v3/include/bits/cow_string.h   |  3 +++
 libstdc++-v3/include/std/string_view |  3 +++
 .../basic_string/operations/contains/nonnull.cc  | 12 
 .../basic_string/operations/ends_with/nonnull.cc | 12 
 .../basic_string/operations/starts_with/nonnull.cc   | 12 
 .../basic_string_view/operations/contains/nonnull.cc | 12 
 .../operations/ends_with/nonnull.cc  | 12 
 .../operations/starts_with/nonnull.cc| 12 
 9 files changed, 81 insertions(+)
 create mode 100644 
libstdc++-v3/testsuite/21_strings/basic_string/operations/contains/nonnull.cc
 create mode 100644 
libstdc++-v3/testsuite/21_strings/basic_string/operations/ends_with/nonnull.cc
 create mode 100644 
libstdc++-v3/testsuite/21_strings/basic_string/operations/starts_with/nonnull.cc
 create mode 100644 
libstdc++-v3/testsuite/21_strings/basic_string_view/operations/contains/nonnull.cc
 create mode 100644 
libstdc++-v3/testsuite/21_strings/basic_string_view/operations/ends_with/nonnull.cc
 create mode 100644 
libstdc++-v3/testsuite/21_strings/basic_string_view/operations/starts_with/nonnull.cc

diff --git a/libstdc++-v3/include/bits/basic_string.h 
b/libstdc++-v3/include/bits/basic_string.h
index b04fba95678..9d8b415302b 100644
--- a/libstdc++-v3/include/bits/basic_string.h
+++ b/libstdc++-v3/include/bits/basic_string.h
@@ -3400,6 +3400,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
   starts_with(_CharT __x) const noexcept
   { return __sv_type(this->data(), this->size()).starts_with(__x); }
 
+  [[__gnu__::__nonnull__]]
   constexpr bool
   starts_with(const _CharT* __x) const noexcept
   { return __sv_type(this->data(), this->size()).starts_with(__x); }
@@ -3412,6 +3413,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
   ends_with(_CharT __x) const noexcept
   { return __sv_type(this->data(), this->size()).ends_with(__x); }
 
+  [[__gnu__::__nonnull__]]
   constexpr bool
   ends_with(const _CharT* __x) const noexcept
   { return __sv_type(this->data(), this->size()).ends_with(__x); }
@@ -3426,6 +3428,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
   contains(_CharT __x) const noexcept
   { return __sv_type(this->data(), this->size()).contains(__x); }
 
+  [[__gnu__::__nonnull__]]
   constexpr bool
   contains(const _CharT* __x) const noexcept
   { return __sv_type(this->data(), this->size()).contains(__x); }
diff --git a/libstdc++-v3/include/bits/cow_string.h 
b/libstdc++-v3/include/bits/cow_string.h
index f16e33ac1ef..f5f03338eec 100644
--- a/libstdc++-v3/include/bits/cow_string.h
+++ b/libstdc++-v3/include/bits/cow_string.h
@@ -3012,6 +3012,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   starts_with(_CharT __x) const noexcept
   { return __sv_type(this->data(), this->size()).starts_with(__x); }
 
+  [[__gnu__::__nonnull__]]
   bool
   starts_with(const _CharT* __x) const noexcept
   { return __sv_type(this->data(), this->size()).starts_with(__x); }
@@ -3024,6 +3025,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   ends_with(_CharT __x) const noexcept
   { return __sv_type(this->data(), this->size()).ends_with(__x); }
 
+  [[__gnu__::__nonnull__]]
   bool
   ends_with(const _CharT* __x) const noexcept
   { return __sv_type(this->data(), this->size()).ends_with(__x); }
@@ -3038,6 +3040,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   contains(_CharT __x) const noexcept
   { return __sv_type(this

[PATCH][RFC] libstdc++: Optimize std::chrono::abs

2022-08-26 Thread Jonathan Wakely via Gcc-patches
While looking into LWG 3741 I came up with this small change to
chrono::abs, which reduces how much work the compiler does to compile
it, but makes the code less clear. The current implementation is very
easy to understand, and compiling chrono::abs probably isn't a hotspot
in anybody's build. Is this worth it?

-- >8 --

This change manually inlines the call to duration::zero, the comparison
using chrono::operator< with duration arguments, the call to
duration::operator- (and the common_type instantiation it does). This
also avoids calling the duration(const duration&) constructor
(and its constraint checks).

By performing the arithmetic operations directly on the Rep value we
improve compilation throughput and also runtime performance for
unoptimized builds.
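
For anyone who wants to experiment outside the library, the same shape can
be written as a free function in user code.  abs_on_rep is a hypothetical
name, the conditional noexcept of the patch is omitted, and C++17 is
assumed for the if-with-initializer.

```cpp
#include <chrono>
#include <limits>
#include <type_traits>

// Works on the Rep value directly instead of going through duration's
// comparison and unary minus operators.
template<typename Rep, typename Period>
constexpr std::enable_if_t<std::numeric_limits<Rep>::is_signed,
                           std::chrono::duration<Rep, Period>>
abs_on_rep (std::chrono::duration<Rep, Period> d)
{
  if (Rep c = d.count (); c < std::chrono::duration_values<Rep>::zero ())
    return std::chrono::duration<Rep, Period> (-c);
  return d;
}
```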

libstdc++-v3/ChangeLog:

* include/bits/chrono.h (chrono::abs): Optimize.
---
 libstdc++-v3/include/bits/chrono.h | 42 +++---
 1 file changed, 21 insertions(+), 21 deletions(-)

diff --git a/libstdc++-v3/include/bits/chrono.h 
b/libstdc++-v3/include/bits/chrono.h
index 05987ca09df..99d47503af3 100644
--- a/libstdc++-v3/include/bits/chrono.h
+++ b/libstdc++-v3/include/bits/chrono.h
@@ -317,6 +317,23 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   { };
 #endif // C++20
 
+/// duration_values
+template<typename _Rep>
+  struct duration_values
+  {
+   static constexpr _Rep
+   zero() noexcept
+   { return _Rep(0); }
+
+   static constexpr _Rep
+   max() noexcept
+   { return numeric_limits<_Rep>::max(); }
+
+   static constexpr _Rep
+   min() noexcept
+   { return numeric_limits<_Rep>::lowest(); }
+  };
+
 #if __cplusplus >= 201703L
 # define __cpp_lib_chrono 201611L
 
@@ -365,11 +382,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 template<typename _Rep, typename _Period>
   constexpr
   enable_if_t<numeric_limits<_Rep>::is_signed, duration<_Rep, _Period>>
-  abs(duration<_Rep, _Period> __d)
+  abs(duration<_Rep, _Period> __d) noexcept(is_arithmetic_v<_Rep>)
   {
-   if (__d >= __d.zero())
- return __d;
-   return -__d;
+   if (_Rep __c = __d.count(); __c < duration_values<_Rep>::zero())
+ return duration<_Rep, _Period>(-__c);
+   return __d;
   }
 
 // Make chrono::ceil also usable as chrono::__detail::ceil.
@@ -399,23 +416,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 }
 #endif // C++17
 
-/// duration_values
-template<typename _Rep>
-  struct duration_values
-  {
-   static constexpr _Rep
-   zero() noexcept
-   { return _Rep(0); }
-
-   static constexpr _Rep
-   max() noexcept
-   { return numeric_limits<_Rep>::max(); }
-
-   static constexpr _Rep
-   min() noexcept
-   { return numeric_limits<_Rep>::lowest(); }
-  };
-
 /// @cond undocumented
 
 template
-- 
2.37.2



Re: [Patch] libgomp/nvptx: Prepare for reverse-offload callback handling

2022-08-26 Thread Alexander Monakov via Gcc-patches


On Fri, 26 Aug 2022, Tobias Burnus wrote:

> @Tom and Alexander: Better suggestions are welcome for the busy loop in
> libgomp/plugin/plugin-nvptx.c regarding the variable placement and checking
> its value.

I think to do that without polling you can use PTX 'brkpt' instruction on the
device and CUDA Debugger API on the host (but you'd have to be careful about
interactions with the real debugger).

What did the standardization process for this feature look like?  How did it
pass if it's not efficiently implementable for the major offloading targets?

Alexander


Re: [PATCH] libstdc++: Optimize std::con/disjunction, __and_/__or_, etc

2022-08-26 Thread Jonathan Wakely via Gcc-patches
On Fri, 26 Aug 2022 at 14:45, Patrick Palka via Libstdc++
 wrote:
>
> On Wed, 24 Aug 2022, Patrick Palka wrote:
>
> > The internal type-level logical operations __and_ and __or_ are
> > currently quite slow to compile for a couple of reasons:
> >
> >   1. They are drop-in replacements for std::con/disjunction, which
> >  are rigidly specified to form a type that derives from the first
> >  type argument that caused the overall computation to short-circuit.
> >  In practice this inheritance property seems to be rarely needed;
> >  usually all we care about is the value of the overall expression.
> >   2. Their recursive implementations instantiate up to ~N class templates
> >  and form up to a depth ~N inheritance chain.
> >
> > This patch does away with this inheritance property of __and_ and __or_
> > (which seems to be unneeded in the library except indirectly by
> > std::con/disjunction) and redefines them as alias templates that yield
> > either false_type or true_type via SFINAE and overload resolution of a
> > pair of function templates.
>
> Another difference between this implementation of __and_/__or_  and
> std::con/disjunction is the handling of invalid/non-"truthy" operands.
> The standard makes this ill-formed ([meta.logical]/4), whereas this
> implementation of __and_/__or_ silently treats such an operand as if
> it were false_type/true_type respectively.
>
> Thus e.g. std::conjunction_v and std::disjunction_v are both
> ill-formed

The standard probably *should* make it ill-formed, but currently it
makes it undefined, because it just says "shall be". Violating that
rule has undefined behaviour, so isn't required to be ill-formed. Our
implementations make it ill-formed, because that's just what happens
if we try to access Bi::value outside a SFINAE context, but I think

> whereas __and_v/__or_v are false/true respectively
> with this implementation (somewhat nonsensically).

Which is actually fine for something that the standard says is undefined.

Ill-formed is more user-friendly for the standardized
std::{con,dis}junction APIs than (potentially unbounded) UB, but
"somewhat nonsensical, but entirely well-defined" is perfectly fine
for our own internal helpers.

>  Though I'm not sure
> if this corner case is relevant for our current internal uses of
> __and_/__or_, which all seem to pass in "truthy" operands.

Yes, I *think* it's the case that we always pass sensible types into
them. There's a small risk in that I've sometimes used __and_ where
the standard requires conjunction, just because it saved one class
template instantiation to do so. But I think even in those cases, all
the type args are traits like is_assignable, which will always be
truthy. We should keep this edge case difference in mind for the
future though, and use std::{con,dis}junction if the difference might
matter.

The new implementations of __and_ and __or_ are very impressive. The
patch is OK for trunk, thanks!

Yesterday you mentioned changing conjunction_v to be defined in terms
of the cheaper __and_v instead of conjunction::value. If we decide
there are some edge cases that would make that not equivalent, what we
could do is:

template
  inline constexpr conjunction_v = __detail::__conjunction_impl::value;

i.e. skip the step of instantiating std::conjunction::type and just
use that type directly.

But I think we should be able to just define it as __and_v
because [meta.logical]/4 makes that equivalent to
conjunction::value.
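
For readers following the thread, here is a minimal sketch of the technique described in the quoted patch summary (alias templates that pick true_type or false_type via SFINAE and overload resolution of two function templates). Everything in namespace sketch is an illustrative stand-in written for this note, not the libstdc++ code itself. It also shows the corner case discussed above: an operand without a usable ::value is silently treated as false by the "and" and as true by the "or".

```cpp
#include <type_traits>

namespace sketch  // hypothetical names, not libstdc++'s reserved ones
{
  // Variadic alias that resolves to its first argument; the remaining
  // arguments exist only so that substituting them can fail (SFINAE).
  template<typename T, typename...>
    using first_t = T;

  // Declared but deliberately never defined: only the return types matter.
  // Preferred overload (int beats ...), viable only if every operand is truthy.
  template<typename... Bn>
    auto and_fn(int)
      -> first_t<std::true_type,
                 typename std::enable_if<bool(Bn::value)>::type...>;

  // Fallback: chosen when any enable_if above fails to substitute.
  template<typename... Bn>
    auto and_fn(...) -> std::false_type;

  // Preferred overload is viable only if every operand is falsy.
  template<typename... Bn>
    auto or_fn(int)
      -> first_t<std::false_type,
                 typename std::enable_if<!bool(Bn::value)>::type...>;

  template<typename... Bn>
    auto or_fn(...) -> std::true_type;

  // No class template is instantiated per operand and no inheritance
  // chain is formed; the result is always exactly true_type or false_type.
  template<typename... Bn>
    using and_t = decltype(and_fn<Bn...>(0));

  template<typename... Bn>
    using or_t = decltype(or_fn<Bn...>(0));
}
```

With this sketch, sketch::and_t<int> is false_type and sketch::or_t<int> is true_type, because int::value fails to substitute and overload resolution falls back to the (...) overload. That matches the "somewhat nonsensical, but entirely well-defined" behaviour described above.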



Re: [PATCH] libcpp: Implement P2362R3 - Remove non-encodable wide character literals and multicharacter [PR106647]

2022-08-26 Thread Jason Merrill via Gcc-patches

On 8/26/22 03:35, Jakub Jelinek wrote:

Hi!

My understanding of the paper is that we just want to promote the CPP_WCHAR
"character constant too long for its type" warning to error as it is already
error for u8, u and U literals.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?


OK.


2022-08-26  Jakub Jelinek  

PR c++/106647
* charset.cc (wide_str_to_charconst): Implement P2362R3 - Remove
non-encodable wide character literals and multicharacter.  For
C++23 use CPP_DL_ERROR instead of CPP_DL_WARNING for
"character constant too long for its type" diagnostics on CPP_WCHAR
literals.

* g++.dg/cpp23/wchar-multi1.C: New test.
* g++.dg/cpp23/wchar-multi2.C: New test.
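
As a reading aid, the severity decision made by the hunk below can be modelled as a small pure function. This is only a sketch under stated assumptions: the enum and function names are hypothetical, and the cxx23 flag stands in for whatever the driver uses to detect C++23 (the patch itself tests the size_t_literals option for that purpose).

```cpp
// Hypothetical model of the diagnostic level chosen for
// "character constant too long for its type".
enum class LiteralKind { WChar, Char16, Char32 };
enum class Severity { Warning, Error };

constexpr Severity
too_long_severity (bool cplusplus, bool cxx23, LiteralKind kind)
{
  // u'..' and U'..' are already errors in C++; after the patch L'..'
  // joins them, but only when compiling as C++23.  C keeps the warning.
  return (cplusplus
          && (kind == LiteralKind::Char16
              || kind == LiteralKind::Char32
              || (kind == LiteralKind::WChar && cxx23)))
         ? Severity::Error : Severity::Warning;
}
```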

--- libcpp/charset.cc.jj	2022-08-25 11:54:38.849924475 +0200
+++ libcpp/charset.cc   2022-08-25 18:36:20.650415220 +0200
@@ -2170,7 +2170,11 @@ wide_str_to_charconst (cpp_reader *pfile
   character constant is guaranteed to overflow.  */
if (str.len > nbwc * 2)
  cpp_error (pfile, (CPP_OPTION (pfile, cplusplus)
-  && (type == CPP_CHAR16 || type == CPP_CHAR32))
+  && (type == CPP_CHAR16
+  || type == CPP_CHAR32
+  /* In C++23 this is error even for L'ab'.  */
+  || (type == CPP_WCHAR
+  && CPP_OPTION (pfile, size_t_literals))))
  ? CPP_DL_ERROR : CPP_DL_WARNING,
   "character constant too long for its type");
  
--- gcc/testsuite/g++.dg/cpp23/wchar-multi1.C.jj	2022-08-25 18:08:01.973426155 +0200
+++ gcc/testsuite/g++.dg/cpp23/wchar-multi1.C	2022-08-25 18:51:30.476687112 +0200
@@ -0,0 +1,42 @@
+// P2362R3 - Remove non-encodable wide character literals and multicharacter
+// wide character literals.
+// { dg-do compile }
+
+char a = 'a';
+int b = 'ab';  // { dg-warning "multi-character character constant" }
+int c = '\u05D9';  // { dg-warning "multi-character character constant" }
+#if __SIZEOF_INT__ > 2
+int d = '\U0001F525';  // { dg-warning "multi-character character constant" "" { target int32 } }
+#endif
+int e = 'abcd';  // { dg-warning "multi-character character constant" }
+wchar_t f = L'f';
+wchar_t g = L'gh'; // { dg-error "character constant too long for its type" "" { target c++23 } }
+   // { dg-warning "character constant too long for its type" "" { target c++20_down } .-1 }
+wchar_t h = L'ijkl';   // { dg-error "character constant too long for its type" "" { target c++23 } }
+   // { dg-warning "character constant too long for its type" "" { target c++20_down } .-1 }
+wchar_t i = L'\U0001F525'; // { dg-error "character constant too long for its type" "" { target { c++23 && { ! 4byte_wchar_t } } } }
+   // { dg-warning "character constant too long for its type" "" { target { c++20_down && { ! 4byte_wchar_t } } } .-1 }
+#ifdef __cpp_char8_t
+typedef char8_t u8;
+#else
+typedef char u8;
+#endif
+#if __cpp_unicode_characters >= 201411
+u8 j = u8'j';
+u8 k = u8'kl'; // { dg-error "character constant too long for its type" "" { target c++17 } }
+u8 l = u8'\U0001F525'; // { dg-error "character constant too long for its type" "" { target c++17 } }
+#endif
+#if __cpp_unicode_characters >= 200704
+char16_t m = u'm';
+char16_t n = u'no';  // { dg-error "character constant too long for its type" "" { target c++11 } }
+char16_t o = u'\u05D9';
+char16_t p = u'\U0001F525';  // { dg-error "character constant too long for its type" "" { target c++11 } }
+char32_t q = U'm';
+char32_t r = U'no';  // { dg-error "character constant too long for its type" "" { target c++11 } }
+char32_t s = U'\u05D9';
+char32_t t = U'\U0001F525';
+#endif
+wchar_t u = L'\u0065\u0301';   // { dg-error "character constant too long for its type" "" { target c++23 } }
+   // { dg-warning "character constant too long for its type" "" { target c++20_down } .-1 }
+wchar_t v = L'é'; // { dg-error "character constant too long for its type" "" { target c++23 } }
+   // { dg-warning "character constant too long for its type" "" { target c++20_down } .-1 }
--- gcc/testsuite/g++.dg/cpp23/wchar-multi2.C.jj	2022-08-25 18:51:53.744386945 +0200
+++ gcc/testsuite/g++.dg/cpp23/wchar-multi2.C	2022-08-25 18:53:03.317489442 +0200
@@ -0,0 +1,43 @@
+// P2362R3 - Remove non-encodable wide character literals and multicharacter
+// wide character literals.
+// { dg-do compile }
+// { dg-options "-fshort-wchar" }
+
+char a = 'a';
+int b = 'ab';  // { dg-warning "multi-character character constant" }
+int c = '\u05D9';  // { dg-warning "multi-character character constant" }
+#if __SI

Re: [[GCC13][Patch][V3] 1/2] Add a new option -fstrict-flex-array[=n] and new attribute strict_flex_array

2022-08-26 Thread Qing Zhao via Gcc-patches


> On Aug 26, 2022, at 4:48 AM, Richard Biener  wrote:
> 
> On Wed, 17 Aug 2022, Qing Zhao wrote:
> 
>> Add the following new option -fstrict-flex-array[=n] and a corresponding
>> attribute strict_flex_array to GCC:
>> 
>> '-fstrict-flex-array'
>> Treat the trailing array of a structure as a flexible array member
>> in a stricter way.  The positive form is equivalent to
>> '-fstrict-flex-array=3', which is the strictest.  A trailing array
>> is treated as a flexible array member only when it is declared as a
>> flexible array member per C99 standard onwards.  The negative form
>> is equivalent to '-fstrict-flex-array=0', which is the least
>> strict.  All trailing arrays of structures are treated as flexible
>> array members.
>> 
>> '-fstrict-flex-array=LEVEL'
>> Treat the trailing array of a structure as a flexible array member
>> in a stricter way.  The value of LEVEL controls the level of
>> strictness.
>> 
>> The possible values of LEVEL are the same as for the
>> 'strict_flex_array' attribute (*note Variable Attributes::).
>> 
>> You can control this behavior for a specific trailing array field
>> of a structure by using the variable attribute 'strict_flex_array'
>> (*note Variable Attributes::).
>> 
>> This option is only valid when flexible array member is supported in the
>> language. For ISO C before C99 and ISO C++, there is no language support
>> for flexible array members at all, so this option is invalid and a
>> warning is issued.  When -std=gnu89 is specified, or for C++ with GNU
>> extensions, only the zero-length array extension and one-size arrays are
>> supported; as a result, LEVEL=3 is invalid and a warning is issued.
>> 
>> 'strict_flex_array (LEVEL)'
>> The 'strict_flex_array' attribute should be attached to the
>> trailing array field of a structure.  It specifies the level of
>> strictness of treating the trailing array field of a structure as a
>> flexible array member.  LEVEL must be an integer between 0 and 3.
>> 
>> LEVEL=0 is the least strict level, all trailing arrays of
>> structures are treated as flexible array members.  LEVEL=3 is the
>> strictest level, only when the trailing array is declared as a
>> flexible array member per C99 standard onwards ([]), it is treated
>> as a flexible array member.
>> 
>> There are two more levels in between 0 and 3, which are provided to
>> support older codes that use GCC zero-length array extension ([0])
>> or one-size array as flexible array member ([1]): When LEVEL is 1,
>> the trailing array is treated as a flexible array member when it is
>> declared as either [], [0], or [1]; When LEVEL is 2, the trailing
>> array is treated as a flexible array member when it is declared as
>> either [], or [0].
>> 
>> This attribute can be used with or without '-fstrict-flex-array'.
>> When both the attribute and the option are present at the same time,
>> the level of the strictness for the specific trailing array field
>> is determined by the attribute.
>> 
>> This attribute is only valid when flexible array member is supported in
>> the language. For ISO C before C99 and ISO C++, there is no language
>> support for flexible array members at all, so this attribute is invalid
>> and a warning is issued.  When -std=gnu89 is specified, or for C++ with
>> GNU extensions, only the zero-length array extension and one-size arrays
>> are supported; as a result, LEVEL=3 is invalid and a warning is issued.
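
The four levels described in the quoted documentation can be summarised as a small decision table. The sketch below is only a model written for this note: the enum and function names are hypothetical and are not part of the patch, and treating out-of-range levels like the nearest valid level is an assumption made here (the documentation only defines 0 to 3).

```cpp
// How the trailing array field was declared.
enum class Trailing
{
  Flexible,    // int c[];   C99 flexible array member
  ZeroLength,  // int c[0];  GNU zero-length array extension
  OneElement,  // int c[1];  legacy one-size idiom
  Other        // int c[N] with N > 1
};

// Returns true if a trailing array declared as T is treated as a
// flexible array member at the given strictness LEVEL.
constexpr bool
treated_as_flexible (int level, Trailing t)
{
  return level <= 0 ? true                                   // all trailing arrays
       : level == 1 ? t != Trailing::Other                   // [], [0], [1]
       : level == 2 ? (t == Trailing::Flexible
                       || t == Trailing::ZeroLength)         // [], [0]
       : t == Trailing::Flexible;                            // [] only
}
```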
>> 
>> gcc/c-family/ChangeLog:
>> 
>>  * c-attribs.cc (handle_strict_flex_arrays_attribute): New function.
>>  (c_common_attribute_table): New item for strict_flex_array.
>>  * c-opts.cc (c_common_post_options): Handle the combination of
>>  -fstrict-flex-arrays and -std specially.
>>  * c.opt: (fstrict-flex-array): New option.
>>  (fstrict-flex-array=): New option.
>> 
>> gcc/c/ChangeLog:
>> 
>>  * c-decl.cc (flexible_array_member_type_p): New function.
>>  (one_element_array_type_p): Likewise.
>>  (zero_length_array_type_p): Likewise.
>>  (add_flexible_array_elts_to_size): Call new utility
>>  routine flexible_array_member_type_p.
>>  (is_flexible_array_member_p): New function.
>>  (finish_struct): Set the new DECL_NOT_FLEXARRAY flag.
>> 
>> gcc/cp/ChangeLog:
>> 
>>  * module.cc (trees_out::core_bools): Stream out new bit
>>  decl_not_flexarray.
>>  (trees_in::core_bools): Stream in new bit decl_not_flexarray.
>> 
>> gcc/ChangeLog:
>> 
>>  * doc/extend.texi: Document strict_flex_array attribute.
>>  * doc/invoke.texi: Document -fstrict-flex-array[=n] option.
>>  * print-tree.cc (print_node): Print new bit decl_not_flexarray.
>>  * tree-core.h (struct tree_decl_common): New bit field
>>  decl_not_flexarray.
>>  * tree-streamer-in.cc (unpack_ts_decl_common_

Re: [PATCH] libstdc++: Optimize std::con/disjunction, __and_/__or_, etc

2022-08-26 Thread Patrick Palka via Gcc-patches
On Wed, 24 Aug 2022, Patrick Palka wrote:

> The internal type-level logical operations __and_ and __or_ are
> currently quite slow to compile for a couple of reasons:
> 
>   1. They are drop-in replacements for std::con/disjunction, which
>  are rigidly specified to form a type that derives from the first
>  type argument that caused the overall computation to short-circuit.
>  In practice this inheritance property seems to be rarely needed;
>  usually all we care about is the value of the overall expression.
>   2. Their recursive implementations instantiate up to ~N class templates
>  and form up to a depth ~N inheritance chain.
> 
> This patch does away with this inheritance property of __and_ and __or_
> (which seems to be unneeded in the library except indirectly by
> std::con/disjunction) and redefines them as alias templates that yield
> either false_type or true_type via SFINAE and overload resolution of a
> pair of function templates.

Another difference between this implementation of __and_/__or_  and
std::con/disjunction is the handling of invalid/non-"truthy" operands.
The standard makes this ill-formed ([meta.logical]/4), whereas this
implementation of __and_/__or_ silently treats such an operand as if
it were false_type/true_type respectively.

Thus e.g. std::conjunction_v and std::disjunction_v are both
ill-formed whereas __and_v/__or_v are false/true respectively
with this implementation (somewhat nonsensically).  Though I'm not sure
if this corner case is relevant for our current internal uses of
__and_/__or_, which all seem to pass in "truthy" operands.

> 
> As for std::con/disjunction, it seems we need to keep defining them in
> terms of recursive class templates for sake of the inheritance property.
> But in the recursive step, instead of using inheritance which would
> yield a depth ~N inheritance chain, use a recursive member typedef which
> gets immediately flattened.  Thus a specialization of conjunction and
> disjunction now has depth ~1 instead of up to ~N.
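
The "recursive member typedef instead of recursive inheritance" idea from the paragraph above can be sketched as follows. The names in namespace sketch are illustrative and the code is a reconstruction written for this note, not the patch text: the recursion happens through a nested ::type, so the final class derives directly from the selected operand rather than through a chain of N intermediate bases.

```cpp
#include <type_traits>

namespace sketch  // hypothetical names
{
  // Primary template: stop here and select B1.  Used when B1 is falsy
  // or when B1 is the last operand.
  template<typename /* = void */, typename B1, typename... Bn>
    struct conjunction_impl
    { using type = B1; };

  // B1 is truthy and more operands follow: recurse via a member typedef,
  // which is flattened immediately (no inheritance in this step).
  template<typename B1, typename B2, typename... Bn>
    struct conjunction_impl<typename std::enable_if<bool(B1::value)>::type,
                            B1, B2, Bn...>
    { using type = typename conjunction_impl<void, B2, Bn...>::type; };

  // Inheritance depth is 1: the selected operand is the direct base.
  template<typename... Bn>
    struct conjunction
    : conjunction_impl<void, Bn...>::type
    { };

  template<>
    struct conjunction<>
    : std::true_type
    { };
}
```

The inheritance property required of std::conjunction is preserved: the result derives from the first falsy operand, or from the last operand if all are truthy.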
> 
> In passing, redefine __not_ as an alias template for consistency with
> __and_ and __or_, and to remove a layer of indirection.
> 
> Together these changes have a substantial effect on compile time and
> memory usage for code that indirectly makes heavy use of these internal
> type traits.  For example, for the following which tests constructibility
> between two compatible 257-element tuple types
> 
>   #include <tuple>
> 
>   struct A { A(int); };
> 
>   #define M(x) x, x
> 
>   using ty1 = std::tuple;
>   using ty2 = std::tuple;
> 
>   static_assert(std::is_constructible_v);
> 
> memory usage improves by ~27% from 440MB to 320MB and compile time
> improves by ~20% from ~2s to ~1.6s (with -std=c++23).
> 
> Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> trunk?
> 
> libstdc++-v3/ChangeLog:
> 
>   * include/std/type_traits (enable_if, __enable_if_t): Move up
>   their definitions.
>   (__detail::__first_t): Define.
>   (__detail::__or_fn, __detail::__and_fn): Declare.
>   (__or_, __and_): Redefine as alias templates in terms of __or_fn
>   and __and_fn.
>   (__not_): Redefine as an alias template.
>   (__detail::__disjunction_impl, __detail::__conjunction_impl):
>   Define.
>   (conjunction, disjunction): Redefine in terms of __disjunction_impl
>   and __conjunction_impl.
> ---
>  libstdc++-v3/include/std/type_traits | 152 ---
>  1 file changed, 93 insertions(+), 59 deletions(-)
> 
> diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits
> index 14b029cec64..07a50a31e86 100644
> --- a/libstdc++-v3/include/std/type_traits
> +++ b/libstdc++-v3/include/std/type_traits
> @@ -100,6 +100,21 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>  
>// Metaprogramming helper types.
>  
> +  // Primary template.
> +  /// Define a member typedef `type` only if a boolean constant is true.
> +  template
> +struct enable_if
> +{ };
> +
> +  // Partial specialization for true.
> +  template
> +struct enable_if
> +{ typedef _Tp type; };
> +
> +  // __enable_if_t (std::enable_if_t for C++11)
> +  template
> +using __enable_if_t = typename enable_if<_Cond, _Tp>::type;
> +
>template
>  struct __conditional
>  {
> @@ -127,56 +142,39 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>template
>  using __type_identity_t = typename __type_identity<_Tp>::type;
>  
> -  template
> -struct __or_;
> -
> -  template<>
> -struct __or_<>
> -: public false_type
> -{ };
> +  namespace __detail
> +  {
> +// A variadic alias template that resolves to its first argument.
> +template
> +  using __first_t = _Tp;
>  
> -  template
> -struct __or_<_B1>
> -: public _B1
> -{ };
> +// These are deliberately not defined.
> +template
> +  auto __or_fn(int) -> __first_t +  __enable_if_t...>;
>  
> -  template
> -struct __or_

Re: [[GCC13][Patch][V3] 2/2] Use array_at_struct_end_p in __builtin_object_size [PR101836]

2022-08-26 Thread Qing Zhao via Gcc-patches



> On Aug 26, 2022, at 4:49 AM, Richard Biener  wrote:
> 
> On Wed, 17 Aug 2022, Qing Zhao wrote:
> 
>> Use array_at_struct_end_p to determine whether the trailing array
>> of a structure is flexible array member in __builtin_object_size.
> 
> With the discussion about what array_at_struct_end_p actually computes,
> is this now the correct utility for __builtin_object_size or should
> it use a stricter variant?

My understanding is that -fstrict-flex-arrays should control all GCC analyses
consistently, including __builtin_object_size.
A standard-conforming application can use -fstrict-flex-arrays=3 to get the
most efficient code.

Kees, is it okay for the Linux kernel to add -fstrict-flex-arrays so that
FORTIFY_SOURCE works better for regular trailing arrays?

Thanks.

Qing
> 
>> gcc/ChangeLog:
>> 
>>  PR tree-optimization/101836
>>  * tree-object-size.cc (addr_object_size): Use array_at_struct_end_p
>>  to determine a flexible array member reference.
>> 
>> gcc/testsuite/ChangeLog:
>> 
>>  PR tree-optimization/101836
>>  * gcc.dg/pr101836.c: New test.
>>  * gcc.dg/pr101836_1.c: New test.
>>  * gcc.dg/pr101836_2.c: New test.
>>  * gcc.dg/pr101836_3.c: New test.
>>  * gcc.dg/pr101836_4.c: New test.
>>  * gcc.dg/pr101836_5.c: New test.
>>  * gcc.dg/strict-flex-array-5.c: New test.
>>  * gcc.dg/strict-flex-array-6.c: New test.
>> ---
>> gcc/testsuite/gcc.dg/pr101836.c| 60 ++
>> gcc/testsuite/gcc.dg/pr101836_1.c  | 60 ++
>> gcc/testsuite/gcc.dg/pr101836_2.c  | 60 ++
>> gcc/testsuite/gcc.dg/pr101836_3.c  | 60 ++
>> gcc/testsuite/gcc.dg/pr101836_4.c  | 60 ++
>> gcc/testsuite/gcc.dg/pr101836_5.c  | 60 ++
>> gcc/testsuite/gcc.dg/strict-flex-array-5.c | 60 ++
>> gcc/testsuite/gcc.dg/strict-flex-array-6.c | 60 ++
>> gcc/tree-object-size.cc| 16 +++---
>> 9 files changed, 487 insertions(+), 9 deletions(-)
>> create mode 100644 gcc/testsuite/gcc.dg/pr101836.c
>> create mode 100644 gcc/testsuite/gcc.dg/pr101836_1.c
>> create mode 100644 gcc/testsuite/gcc.dg/pr101836_2.c
>> create mode 100644 gcc/testsuite/gcc.dg/pr101836_3.c
>> create mode 100644 gcc/testsuite/gcc.dg/pr101836_4.c
>> create mode 100644 gcc/testsuite/gcc.dg/pr101836_5.c
>> create mode 100644 gcc/testsuite/gcc.dg/strict-flex-array-5.c
>> create mode 100644 gcc/testsuite/gcc.dg/strict-flex-array-6.c
>> 
>> diff --git a/gcc/testsuite/gcc.dg/pr101836.c b/gcc/testsuite/gcc.dg/pr101836.c
>> new file mode 100644
>> index ..efad02cfe899
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/pr101836.c
>> @@ -0,0 +1,60 @@
>> +/* -fstrict-flex-arrays is aliased with -fstrict-flex-arrays=3, which is the
>> +   strictest, only [] is treated as flexible array.  */ 
>> +/* PR tree-optimization/101836 */
>> +/* { dg-do run } */
>> +/* { dg-options "-O2 -fstrict-flex-arrays" } */
>> +
>> +#include <stdio.h>
>> +
>> +#define expect(p, _v) do { \
>> +size_t v = _v; \
>> +if (p == v) \
>> +printf("ok:  %s == %zd\n", #p, p); \
>> +else \
>> +{  \
>> +  printf("WAT: %s == %zd (expected %zd)\n", #p, p, v); \
>> +  __builtin_abort (); \
>> +} \
>> +} while (0);
>> +
>> +struct trailing_array_1 {
>> +int a;
>> +int b;
>> +int c[4];
>> +};
>> +
>> +struct trailing_array_2 {
>> +int a;
>> +int b;
>> +int c[1];
>> +};
>> +
>> +struct trailing_array_3 {
>> +int a;
>> +int b;
>> +int c[0];
>> +};
>> +struct trailing_array_4 {
>> +int a;
>> +int b;
>> +int c[];
>> +};
>> +
>> +void __attribute__((__noinline__)) stuff(
>> +struct trailing_array_1 *normal,
>> +struct trailing_array_2 *trailing_1,
>> +struct trailing_array_3 *trailing_0,
>> +struct trailing_array_4 *trailing_flex)
>> +{
>> +expect(__builtin_object_size(normal->c, 1), 16);
>> +expect(__builtin_object_size(trailing_1->c, 1), 4);
>> +expect(__builtin_object_size(trailing_0->c, 1), 0);
>> +expect(__builtin_object_size(trailing_flex->c, 1), -1);
>> +}
>> +
>> +int main(int argc, char *argv[])
>> +{
>> +stuff((void *)argv[0], (void *)argv[0], (void *)argv[0], (void *)argv[0]);
>> +
>> +return 0;
>> +}
>> diff --git a/gcc/testsuite/gcc.dg/pr101836_1.c b/gcc/testsuite/gcc.dg/pr101836_1.c
>> new file mode 100644
>> index ..e2931ce1012e
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/pr101836_1.c
>> @@ -0,0 +1,60 @@
>> +/* -fstrict-flex-arrays=3 is the strictest, only [] is treated as
>> +   flexible array.  */ 
>> +/* PR tree-optimization/101836 */
>> +/* { dg-do run } */
>> +/* { dg-options "-O2 -fstrict-flex-arrays=3" } */
>> +
>> +#include <stdio.h>
>> +
>> +#define expect(p, _v) do { \
>> +size_t v = _v; \
>> +if (p == v) \
>> +printf("ok:  %s == %zd\n", #p, p); \
>> +else \
>> +{  \

Re: [PATCH v2] ipa-visibility: Optimize TLS access [PR99619]

2022-08-26 Thread Alexander Monakov via Gcc-patches
On Fri, 26 Aug 2022, Martin Jambor wrote:

> > +/* Check if promoting general-dynamic TLS access model to local-dynamic is
> > +   desirable for DECL.  */
> > +
> > +static bool
> > +optimize_dyn_tls_for_decl_p (const_tree decl)
> > +{
> > +  if (optimize)
> > +return true;
> 
> ...this.  This is again an access to optimize which in LTO IPA phase is
> not really meaningful.  Can the test be simply removed?  If not (but
> please look at why), I'd suggest to test the overall optimize level only
> if there is a non-NULL cfun.

I'd prefer to keep it. This code is also called from the front-ends when
assigning initial TLS access model (during parsing, at the point where
visibility attributes, if present, have not been processed yet). There
we don't have IPA structures, but 'optimize' is set up.

I also want to avoid iterating over referring functions in non-LTO mode
where trusting 'optimize' should be fine during this IPA pass I think?

Alexander


Re: [PATCH, rs6000] Change insn condition from TARGET_64BIT to TARGET_POWERPC64 for VSX scalar extract/insert instructions

2022-08-26 Thread David Edelsohn via Gcc-patches
On Thu, Aug 25, 2022 at 10:42 PM HAO CHEN GUI  wrote:
>
> Hi David,
>
> On 25/8/2022 10:01 PM, David Edelsohn wrote:
> > On Thu, Aug 25, 2022 at 1:22 AM Kewen.Lin  wrote:
> >>
> >> on 2022/8/25 11:37, HAO CHEN GUI wrote:
> >>> Hi,
> >>>
> >>> On 24/8/2022 1:24 PM, Kewen.Lin wrote:
>  Could you try to test with dg-options "-mdejagnu-cpu=power9 -mpowerpc64" 
>  all the time, but still
>  having that has_arch_ppc64 effective target on aix?
> 
>  I'd expect has_arch_ppc64 check to fail on aix 32bit, the error will not 
>  be a problem (turning
>  into an UNSUPPORTED then)?
> >>>
> >>> I tested it on AIX. "has_arch_ppc64" fails with dg-options 
> >>> "-mdejagnu-cpu=power9 -mpowerpc64" on
> >>> 32-bit AIX environment. It works as we expected.
> >>
> >> Nice, thanks for your time on testing.
> >>
> >>>
> >>> Also I found that AIX and Darwin are skipped for bfp test. So in 
> >>> testcase, it's no need to care
> >>> about them. Not sure if it's intention.
> >>>
> >>> In bfp.exp
> >>>
> >>> # Exit immediately if this isn't a PowerPC target or if the target is
> >>> # aix or Darwin.
> >>> if { (![istarget powerpc*-*-*] && ![istarget rs6000-*-*])
> >>>  || [istarget "powerpc*-*-aix*"]
> >>>  || [istarget "powerpc*-*-darwin*"]  } then {
> >>>   return
> >>> }
> >>
> >> I can't find a hint about why we wanted to disable bfp testing on aix; it
> >> looks like overkill to me.
> >>
> >> Could you help to further test if all test cases in this small bucket 
> >> available on aix?
> >>
> >> Maybe it can give us some evidences on why it's intentional or not.
> >>
> >> Hi David & Segher,
> >>
> >> Do you have some insights on this?
> >
> > AIX (and Darwin) are not Linux and not ELF.  There is no support for
> > BPF.  All of the tests fail, so they are skipped.
>
> Thanks so much for your info.
>
> Here are test results on P7 AIX7.1. I tested all scalar-extract-sig-* and
> scalar-insert-exp-* cases in the "testsuite/powerpc/bfp" folder. All
> compile-only cases pass except those that use __ieee128. The runnable cases
> fail as expected. p9vector is not supported on P7 servers.
>
> So the __ieee128 blocks Binary floating-point on AIX?

AIX does not support IEEE 128 bit  floating point, only the IBM
double-double format.  Also GCC for AIX does not recognize the
attributes and options for other formats, although there are some
patches from Mike to address that.

Thanks, David


RE: [PATCH] Add pattern to convert vector shift + bitwise and + multiply to vector compare in some cases.

2022-08-26 Thread Tamar Christina via Gcc-patches
> -Original Message-
> From: Gcc-patches  bounces+tamar.christina=arm@gcc.gnu.org> On Behalf Of Richard
> Biener via Gcc-patches
> Sent: Friday, August 26, 2022 10:08 AM
> To: mtsamis 
> Cc: GCC Patches ;
> jiangning@amperecomputing.com; Philipp Tomsich
> 
> Subject: Re: [PATCH] Add pattern to convert vector shift + bitwise and +
> multiply to vector compare in some cases.
> 
> On Sat, Aug 13, 2022 at 11:59 AM mtsamis  wrote:
> >
> > When using SWAR (SIMD in a register) techniques a comparison operation
> > within such a register can be made by using a combination of shifts,
> > bitwise and and multiplication. If code using this scheme is
> > vectorized then there is potential to replace all these operations
> > with a single vector comparison, by reinterpreting the vector types to
> match the width of the SWAR register.
> >
> > For example, for the test function packed_cmp_16_32, the original
> generated code is:
> >
> > ldr q0, [x0]
> > add w1, w1, 1
> > ushr v0.4s, v0.4s, 15
> > and v0.16b, v0.16b, v2.16b
> > shl v1.4s, v0.4s, 16
> > sub v0.4s, v1.4s, v0.4s
> > str q0, [x0], 16
> > cmp w2, w1
> > bhi .L20
> >
> > with this pattern the above can be optimized to:
> >
> > ldr q0, [x0]
> > add w1, w1, 1
> > cmlt v0.8h, v0.8h, #0
> > str q0, [x0], 16
> > cmp w2, w1
> > bhi .L20
> >
> > The effect is similar for x86-64.
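
To make the SWAR idiom concrete, here is a small sketch for the case the assembly above appears to correspond to: two 16-bit lanes packed in a 32-bit word, each compared against zero. The function names are invented for this note. The shift moves each lane's sign bit to the lane's bit 0, the mask keeps only those bits, and the multiplication by 0xFFFF spreads each kept bit across its lane, which is the same result a lane-wise "less than zero" compare produces.

```cpp
#include <cstdint>

// SWAR form: shift + bitwise and + multiply.
constexpr std::uint32_t
swar_lt_zero_16 (std::uint32_t x)
{
  return ((x >> 15) & 0x00010001u) * 0xFFFFu;
}

// Reference form: test each 16-bit lane's sign bit separately and
// produce an all-ones or all-zeros lane, like a vector compare would.
constexpr std::uint32_t
lanewise_lt_zero_16 (std::uint32_t x)
{
  return ((x & 0x00008000u) ? 0x0000FFFFu : 0u)
         | ((x & 0x80000000u) ? 0xFFFF0000u : 0u);
}
```

The shl/sub pair in the original assembly computes the same multiplication, since x * 0xFFFF equals (x << 16) - x.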
> >
> > gcc/ChangeLog:
> >
> > * match.pd: Simplify vector shift + bit_and + multiply in some 
> > cases.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/aarch64/swar_to_vec_cmp.c: New test.
> >
> > Signed-off-by: mtsamis 
> > ---
> >  gcc/match.pd  | 57 +++
> >  .../gcc.target/aarch64/swar_to_vec_cmp.c  | 72
> +++
> >  2 files changed, 129 insertions(+)
> >  create mode 100644 gcc/testsuite/gcc.target/aarch64/swar_to_vec_cmp.c
> >
> > diff --git a/gcc/match.pd b/gcc/match.pd index
> > 8bbc0dbd5cd..5c768a94846 100644
> > --- a/gcc/match.pd
> > +++ b/gcc/match.pd
> > @@ -301,6 +301,63 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> >  (view_convert (bit_and:itype (view_convert @0)
> >  (ne @1 { build_zero_cst (type);
> > })))
> >
> > +/* In SWAR (SIMD in a register) code a comparison of packed data can
> > +   be constructed with a particular combination of shift, bitwise and,
> > +   and multiplication by constants.  If that code is vectorized we can
> > +   convert this pattern into a more efficient vector comparison.  */
> > +(simplify  (mult (bit_and (rshift @0 @1) @2) @3)
> 
> You should restrict the pattern a bit more, below you use
> uniform_integer_cst_p and also require a vector type thus
> 
>   (simplify
>(mult (bit_and (rshift @0 VECTOR_CST@1) VECTOR_CST@2)
> VECTOR_CST@3)
> 
> 
> > + (with {
> > +   tree op_type = TREE_TYPE (@0);
> 
> that's the same as 'type' which is already available.
> 
> > +   tree rshift_cst = NULL_TREE;
> > +   tree bit_and_cst = NULL_TREE;
> > +   tree mult_cst = NULL_TREE;
> > +  }
> > +  /* Make sure we're working with vectors and uniform vector
> > + constants.  */  (if (VECTOR_TYPE_P (op_type)
> > +   && (rshift_cst = uniform_integer_cst_p (@1))
> > +   && (bit_and_cst = uniform_integer_cst_p (@2))
> > +   && (mult_cst = uniform_integer_cst_p (@3)))
> > +   /* Compute what constants would be needed for this to represent a
> packed
> > +  comparison based on the shift amount denoted by RSHIFT_CST.  */
> > +   (with {
> > + HOST_WIDE_INT vec_elem_bits = vector_element_bits (op_type);
> > + HOST_WIDE_INT vec_nelts = TYPE_VECTOR_SUBPARTS
> > + (op_type).to_constant ();
> 
> you need to check that this isn't a VLA vector operation.

Seems like this pattern should be applicable to VLA as well, no?
So could we not keep vec_nelts as a poly and just use exact_div
below in the division? The pattern is only valid if cmp_bits_i is a
multiple of vec_elem_bits anyway.  The build_vector_* should then
do the right thing.

Regards,
Tamar

> 
> > + HOST_WIDE_INT vec_bits = vec_elem_bits * vec_nelts;
> > +
> > + unsigned HOST_WIDE_INT cmp_bits_i, bit_and_i, mult_i;
> > + unsigned HOST_WIDE_INT target_mult_i, target_bit_and_i;
> > + cmp_bits_i = tree_to_uhwi (rshift_cst) + 1;
> 
> and that the rshift_cst and others actually fit an uhwi.
> 
> > + target_mult_i = (HOST_WIDE_INT_1U << cmp_bits_i) - 1;
> > +
> > + mult_i = tree_to_uhwi (mult_cst);
> > + bit_and_i = tree_to_uhwi (bit_and_cst);
> > + target_bit_and_i = 0;
> > +
> > + for (unsigned i = 0; i < vec_elem_bits / cmp_bits_i; i++)
> > +   target_bit_and_i = (target_bit_and_i << cmp_bits_i) | 1U;
> 
> it would be nice to have a comment on what this actually does ...
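
On the request for a comment: as far as I can tell from the quoted hunk, the loop builds the mask that has bit 0 set in every packed sub-lane of width cmp_bits_i, and target_mult_i is the all-ones value of that width. A standalone sketch of that computation follows; the struct and function names are hypothetical, the sketch requires C++14 or later, and it assumes shift + 1 is less than 64 and elem_bits is at most 64.

```cpp
#include <cstdint>

struct SwarConstants
{
  std::uint64_t mult;     // all-ones value of the compared width
  std::uint64_t bit_and;  // bit 0 of every packed sub-lane set
};

// SHIFT is the right-shift amount of the matched pattern and ELEM_BITS
// the width of one vector element holding the packed sub-lanes.
constexpr SwarConstants
swar_constants (unsigned shift, unsigned elem_bits)
{
  unsigned cmp_bits = shift + 1;  // width of each packed sub-lane
  std::uint64_t mult = (std::uint64_t (1) << cmp_bits) - 1;
  std::uint64_t bit_and = 0;
  // One iteration per sub-lane: move the mask up by one sub-lane and
  // set the lowest bit of the newly opened sub-lane.
  for (unsigned i = 0; i < elem_bits / cmp_bits; i++)
    bit_and = (bit_and << cmp_bits) | 1u;
  return SwarConstants{mult, bit_and};
}
```

For a shift of 15 in 32-bit elements this yields a multiplier of 0xFFFF and a mask of 0x00010001, which are the constants the pattern has to see before it can rewrite the sequence as a 16-bit compare.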
> 
> > +}
> > +(if ((exact_log2 (cmp_bits_i)) >= 0
> > +&& cmp_bit

PING^2: [PATCH v3] eliminate mutex in fast path of __register_frame

2022-08-26 Thread Thomas Neumann via Gcc-patches
Summary: __register_frame and the corresponding _Unwind_Find_FDE use a 
global mutex for frame registration and unwinding. This can lead to very 
poor performance on machines with high core counts. This patch organizes 
the frames in a b-tree with read-optimized synchronization instead, 
which allows for fully parallel unwinding.


See:
https://gcc.gnu.org/pipermail/gcc-patches/2022-June/597256.html

Best

Thomas


[PATCH 2/2] Refactor init_use_preds and find_control_equiv_block

2022-08-26 Thread Richard Biener via Gcc-patches
The following inlines find_control_equiv_block and is_loop_exit
into init_use_preds and refactors that for better readability and
similarity with the post-dominator walk in compute_control_dep_chain.

* gimple-predicate-analysis.cc (is_loop_exit,
find_control_equiv_block): Inline into single caller ...
(uninit_analysis::init_use_preds): ... here and refactor.
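
For readers without the GCC sources at hand, the refactored loop can be modelled on a toy CFG. This is a sketch under loud assumptions: the Cfg struct, the block numbering, the fixed array size and the helper names are all invented for this note, and -1 is used to mean "no immediate (post)dominator". The loop body mirrors the three checks in the new code below: stop at a loop exit, stop when the postdominator is not dominated by the current root, and stop when it no longer dominates the use block. The sketch requires C++14 or later.

```cpp
// Toy CFG description; all fields are indexed by block number.
struct Cfg
{
  int idom[8];    // immediate dominator, -1 for the entry block
  int ipdom[8];   // immediate postdominator, -1 for the exit block
  int npreds[8];  // number of predecessor edges
  int nsuccs[8];  // number of successor edges
};

// True if block A dominates block B (reflexively), by walking B's
// chain of immediate dominators.
constexpr bool
dominates (const int *idom, int a, int b)
{
  for (int n = b; n != -1; n = idom[n])
    if (n == a)
      return true;
  return false;
}

// Model of the CD_ROOT search; assumes DEF_BB dominates USE_BB.
constexpr int
find_cd_root (const Cfg &g, int def_bb, int use_bb)
{
  int cd_root = def_bb;
  for (;;)
    {
      int pdom = g.ipdom[cd_root];
      if (pdom == -1)  // assumption of this sketch: the exit block ends the walk
        break;
      // Stop at a loop exit which is also postdominating cd_root.
      if (g.npreds[pdom] == 1 && g.nsuccs[cd_root] != 1)
        break;
      if (!dominates (g.idom, cd_root, pdom)
          || !dominates (g.idom, pdom, use_bb))
        break;
      cd_root = pdom;
    }
  return cd_root;
}
```

On a diamond CFG (0 branches to 1 and 2, both join at 3, which falls through to 4), a use in block 4 moves the root all the way to block 4, while a use in block 1 keeps the root at block 0 because the join block does not dominate the use.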
---
 gcc/gimple-predicate-analysis.cc | 55 ++--
 1 file changed, 16 insertions(+), 39 deletions(-)

diff --git a/gcc/gimple-predicate-analysis.cc b/gcc/gimple-predicate-analysis.cc
index 934e9516e7b..e388bb37685 100644
--- a/gcc/gimple-predicate-analysis.cc
+++ b/gcc/gimple-predicate-analysis.cc
@@ -59,33 +59,6 @@
for the case corresponding to an edge.  */
 #define MAX_SWITCH_CASES 40
 
-/* Return true if, when BB1 is postdominating BB2, BB1 is a loop exit.  */
-
-static bool
-is_loop_exit (basic_block bb2, basic_block bb1)
-{
-  return single_pred_p (bb1) && !single_succ_p (bb2);
-}
-
-/* Find BB's closest postdominator that is its control equivalent (i.e.,
-   that's controlled by the same predicate).  */
-
-static inline basic_block
-find_control_equiv_block (basic_block bb)
-{
-  basic_block pdom = get_immediate_dominator (CDI_POST_DOMINATORS, bb);
-
-  /* Skip the postdominating bb that is also a loop exit.  */
-  if (is_loop_exit (bb, pdom))
-return NULL;
-
-  /* If the postdominator is dominated by BB, return it.  */
-  if (dominated_by_p (CDI_DOMINATORS, pdom, bb))
-return pdom;
-
-  return NULL;
-}
-
 /* Return true if X1 is the negation of X2.  */
 
 static inline bool
@@ -1991,25 +1964,29 @@ bool
 uninit_analysis::init_use_preds (predicate &use_preds, basic_block def_bb,
 basic_block use_bb)
 {
-  gcc_assert (use_preds.is_empty ());
+  gcc_assert (use_preds.is_empty ()
+ && dominated_by_p (CDI_DOMINATORS, use_bb, def_bb));
 
   /* Set CD_ROOT to the basic block closest to USE_BB that is the control
  equivalent of (is guarded by the same predicate as) DEF_BB that also
- dominates USE_BB.  */
+ dominates USE_BB.  This mimics the inner loop in
+ compute_control_dep_chain.  */
   basic_block cd_root = def_bb;
-  while (dominated_by_p (CDI_DOMINATORS, use_bb, cd_root))
+  do
 {
-  /* Find CD_ROOT's closest postdominator that's its control
-equivalent.  */
-  if (basic_block bb = find_control_equiv_block (cd_root))
-   if (dominated_by_p (CDI_DOMINATORS, use_bb, bb))
- {
-   cd_root = bb;
-   continue;
- }
+  basic_block pdom = get_immediate_dominator (CDI_POST_DOMINATORS, cd_root);
 
-  break;
+  /* Stop at a loop exit which is also postdominating cd_root.  */
+  if (single_pred_p (pdom) && !single_succ_p (cd_root))
+   break;
+
+  if (!dominated_by_p (CDI_DOMINATORS, pdom, cd_root)
+ || !dominated_by_p (CDI_DOMINATORS, use_bb, pdom))
+   break;
+
+  cd_root = pdom;
 }
+  while (1);
 
   /* Set DEP_CHAINS to the set of edges between CD_ROOT and USE_BB.
  Each DEP_CHAINS element is a series of edges whose conditions
-- 
2.35.3


[PATCH 1/2] Improve compute_control_dep_chain documentation

2022-08-26 Thread Richard Biener via Gcc-patches


In the quest to understand how compute_control_dep_chain works, I've
produced the following two changes, documenting PR106754 on the
way.

Bootstrap and regtest are running on x86_64-unknown-linux-gnu;
the changes should be no-ops but hopefully improve understanding
of the code.

--


The following refactors compute_control_dep_chain slightly by
inlining is_loop_exit and factoring out the check on the loop
invariant condition.  It also adds a comment describing how I
understand the code and its current problem.

* gimple-predicate-analysis.cc (compute_control_dep_chain):
Inline is_loop_exit and refactor, add comment about
loop exits.
---
 gcc/gimple-predicate-analysis.cc | 27 +++
 1 file changed, 23 insertions(+), 4 deletions(-)

diff --git a/gcc/gimple-predicate-analysis.cc b/gcc/gimple-predicate-analysis.cc
index 32542f93057..934e9516e7b 100644
--- a/gcc/gimple-predicate-analysis.cc
+++ b/gcc/gimple-predicate-analysis.cc
@@ -1110,6 +1110,10 @@ compute_control_dep_chain (basic_block dom_bb, const_basic_block dep_bb,
   vec &cur_cd_chain, unsigned *num_calls,
   unsigned in_region = 0, unsigned depth = 0)
 {
+  /* In our recursive calls this doesn't happen.  */
+  if (single_succ_p (dom_bb))
+return false;
+
   if (*num_calls > (unsigned)param_uninit_control_dep_attempts)
 {
   if (dump_file)
@@ -1167,7 +1171,21 @@ compute_control_dep_chain (basic_block dom_bb, const_basic_block dep_bb,
   basic_block cd_bb = e->dest;
   cur_cd_chain.safe_push (e);
   while (!dominated_by_p (CDI_POST_DOMINATORS, dom_bb, cd_bb)
-|| is_loop_exit (dom_bb, cd_bb))
+/* We want to stop when the CFG merges back from the
+   branch in dom_bb.  The post-dominance check alone
+   falls foul of the case of a loop exit test branch
+   where the path on the loop exit post-dominates
+   the branch block.
+   The following catches this but will not allow
+   exploring the post-dom path further.  For the
+   outermost recursion this means we will fail to
+   reach dep_bb while for others it means at least
+   dropping the loop exit predicate from the path
+   which is problematic as it increases the domain
+   spanned by the resulting predicate.
+   See gcc.dg/uninit-pred-11.c for the first case
+   and PR106754 for the second.  */
+|| single_pred_p (cd_bb))
{
  if (cd_bb == dep_bb)
{
@@ -1187,9 +1205,10 @@ compute_control_dep_chain (basic_block dom_bb, const_basic_block dep_bb,
break;
 
  /* Check if DEP_BB is indirectly control-dependent on DOM_BB.  */
- if (compute_control_dep_chain (cd_bb, dep_bb, cd_chains,
-num_chains, cur_cd_chain,
-num_calls, in_region, depth + 1))
+ if (!single_succ_p (cd_bb)
+ && compute_control_dep_chain (cd_bb, dep_bb, cd_chains,
+   num_chains, cur_cd_chain,
+   num_calls, in_region, depth + 1))
{
  found_cd_chain = true;
  break;
-- 
2.35.3



Re: [PATCH v2] ipa-visibility: Optimize TLS access [PR99619]

2022-08-26 Thread Martin Jambor
Hi,

sorry for ignoring this for so long, I had hoped that Honza would take over.

I think the patch would be good if it did not have

On Thu, Jul 07 2022, Alexander Monakov via Gcc-patches wrote:
> From: Artem Klimov 
>
> Fix PR99619, which asks to optimize TLS model based on visibility.
> The fix is implemented as an IPA optimization: this allows to take
> optimized visibility status into account (as well as avoid modifying
> all language frontends).
>
> 2022-04-17  Artem Klimov  
>
> gcc/ChangeLog:
>
>   * ipa-visibility.cc (function_and_variable_visibility): Promote
>   TLS access model after visibility optimizations.
>   * varasm.cc (have_optimized_refs): New helper.
>   (optimize_dyn_tls_for_decl_p): New helper. Use it ...
>   (decl_default_tls_model): ... here in place of 'optimize' check.
>
> gcc/testsuite/ChangeLog:
>
>   * gcc.dg/tls/vis-attr-gd.c: New test.
>   * gcc.dg/tls/vis-attr-hidden-gd.c: New test.
>   * gcc.dg/tls/vis-attr-hidden.c: New test.
>   * gcc.dg/tls/vis-flag-hidden-gd.c: New test.
>   * gcc.dg/tls/vis-flag-hidden.c: New test.
>   * gcc.dg/tls/vis-pragma-hidden-gd.c: New test.
>   * gcc.dg/tls/vis-pragma-hidden.c: New test.
>
> Co-Authored-By:  Alexander Monakov  
> Signed-off-by: Artem Klimov 
> ---
>
> v2: run the new loop in ipa-visibility only in the whole-program IPA pass;
> in decl_default_tls_model, check if any referring function is optimized
> when 'optimize == 0' (when running in LTO mode)
>
>
> Note for reviewers: I noticed there's a place which tries to avoid TLS
> promotion, but the comment seems wrong and I could not find a testcase.
> I'd suggest we remove it. The compiler can only promote general-dynamic
> to local-dynamic and initial-exec to local-exec. The comment refers to
> promoting x-dynamic to y-exec, but that cannot happen AFAICT:
> https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=8e1ba78f1b8eedd6c65c6f0e6d6d09a801de5d3d
>
>
>  gcc/ipa-visibility.cc | 19 +++
>  gcc/testsuite/gcc.dg/tls/vis-attr-gd.c| 12 +++
>  gcc/testsuite/gcc.dg/tls/vis-attr-hidden-gd.c | 13 
>  gcc/testsuite/gcc.dg/tls/vis-attr-hidden.c| 12 +++
>  gcc/testsuite/gcc.dg/tls/vis-flag-hidden-gd.c | 13 
>  gcc/testsuite/gcc.dg/tls/vis-flag-hidden.c| 12 +++
>  .../gcc.dg/tls/vis-pragma-hidden-gd.c | 17 ++
>  gcc/testsuite/gcc.dg/tls/vis-pragma-hidden.c  | 16 ++
>  gcc/varasm.cc | 32 ++-
>  9 files changed, 145 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.dg/tls/vis-attr-gd.c
>  create mode 100644 gcc/testsuite/gcc.dg/tls/vis-attr-hidden-gd.c
>  create mode 100644 gcc/testsuite/gcc.dg/tls/vis-attr-hidden.c
>  create mode 100644 gcc/testsuite/gcc.dg/tls/vis-flag-hidden-gd.c
>  create mode 100644 gcc/testsuite/gcc.dg/tls/vis-flag-hidden.c
>  create mode 100644 gcc/testsuite/gcc.dg/tls/vis-pragma-hidden-gd.c
>  create mode 100644 gcc/testsuite/gcc.dg/tls/vis-pragma-hidden.c
>
> diff --git a/gcc/ipa-visibility.cc b/gcc/ipa-visibility.cc
> index 8a27e7bcd..3ed2b7cf6 100644
> --- a/gcc/ipa-visibility.cc
> +++ b/gcc/ipa-visibility.cc
> @@ -873,6 +873,25 @@ function_and_variable_visibility (bool whole_program)
>   }
>  }
>  
> +  if (symtab->state >= IPA_SSA)
> +{
> +  FOR_EACH_VARIABLE (vnode)
> + {
> +   tree decl = vnode->decl;
> +
> +   /* Upgrade TLS access model based on optimized visibility status,
> +  unless it was specified explicitly or no references remain.  */
> +   if (DECL_THREAD_LOCAL_P (decl)
> +   && !lookup_attribute ("tls_model", DECL_ATTRIBUTES (decl))
> +   && vnode->ref_list.referring.length ())
> + {
> +   enum tls_model new_model = decl_default_tls_model (decl);
> +   gcc_checking_assert (new_model >= decl_tls_model (decl));
> +   set_decl_tls_model (decl, new_model);
> + }
> + }
> +}
> +
>if (dump_file)
>  {
>fprintf (dump_file, "\nMarking local functions:");
> diff --git a/gcc/varasm.cc b/gcc/varasm.cc
> index 4db8506b1..de149e82c 100644
> --- a/gcc/varasm.cc
> +++ b/gcc/varasm.cc
> @@ -6679,6 +6679,36 @@ init_varasm_once (void)
>  #endif
>  }
>  
> +/* Determine whether SYMBOL is used in any optimized function.  */
> +
> +static bool
> +have_optimized_refs (struct symtab_node *symbol)
> +{
> +  struct ipa_ref *ref;
> +
> +  for (int i = 0; symbol->iterate_referring (i, ref); i++)
> +{
> +  cgraph_node *cnode = dyn_cast  (ref->referring);
> +
> +  if (cnode && opt_for_fn (cnode->decl, optimize))
> + return true;
> +}
> +
> +  return false;
> +}
> +
> +/* Check if promoting general-dynamic TLS access model to local-dynamic is
> +   desirable for DECL.  */
> +
> +static bool
> +optimize_dyn_tls_for_decl_p (const_tree decl)
> +{
> +  if (optimize)
> +return true;

...this.  This is again an access to optimize whic

[PATCH] RISC-V: Add vector registers and classification

2022-08-26 Thread juzhe . zhong
From: zhongjuzhe 

This is a preparatory patch for RVV support:

1. Add vector registers.
2. Add classification of RVV instructions in attribute 'type'.

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_v_ext_vector_mode_p): New function.
(riscv_classify_address): Disallow PLUS/LO_SUM/CONST_INT address types
for RVV.
(riscv_address_insns): Add RVV modes condition.
(riscv_binary_cost): Ditto.
(riscv_rtx_costs): Adjust cost for RVV.
(riscv_secondary_memory_needed): Add RVV modes condition.
(riscv_hard_regno_nregs): Add RVV register allocation.
(riscv_hard_regno_mode_ok): Add RVV register allocation.
(riscv_class_max_nregs): Add RVV register allocation.
* config/riscv/riscv.h (DWARF_FRAME_REGNUM): Add VL/VTYPE in Dwarf.
(UNITS_PER_V_REG): New macro.
(FIRST_PSEUDO_REGISTER): Adjust first pseudo num for RVV.
(V_REG_FIRST): New macro.
(V_REG_LAST): Ditto.
(V_REG_NUM): Ditto.
(V_REG_P): Ditto.
(VL_REG_P): Ditto.
(VTYPE_REG_P): Ditto.
(RISCV_DWARF_VL): Ditto.
(RISCV_DWARF_VTYPE): Ditto.
(enum reg_class): Add RVV register types.
(REG_CLASS_CONTENTS): Add RVV register types.
* config/riscv/riscv.md: Add new type for vector instructions.

---
 gcc/config/riscv/riscv.cc | 100 +++--
 gcc/config/riscv/riscv.h  |  92 --
 gcc/config/riscv/riscv.md | 102 +-
 3 files changed, 273 insertions(+), 21 deletions(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 026c69ce40d..65d71544f47 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -279,7 +279,22 @@ const enum reg_class 
riscv_regno_to_class[FIRST_PSEUDO_REGISTER] = {
   FP_REGS, FP_REGS,FP_REGS,FP_REGS,
   FP_REGS, FP_REGS,FP_REGS,FP_REGS,
   FP_REGS, FP_REGS,FP_REGS,FP_REGS,
-  FRAME_REGS,  FRAME_REGS,
+  FRAME_REGS,  FRAME_REGS, VL_REGS,VTYPE_REGS,
+  NO_REGS, NO_REGS,NO_REGS,NO_REGS,
+  NO_REGS, NO_REGS,NO_REGS,NO_REGS,
+  NO_REGS, NO_REGS,NO_REGS,NO_REGS,
+  NO_REGS, NO_REGS,NO_REGS,NO_REGS,
+  NO_REGS, NO_REGS,NO_REGS,NO_REGS,
+  NO_REGS, NO_REGS,NO_REGS,NO_REGS,
+  NO_REGS, NO_REGS,NO_REGS,NO_REGS,
+  VM_REGS, VD_REGS,VD_REGS,VD_REGS,
+  VD_REGS, VD_REGS,VD_REGS,VD_REGS,
+  VD_REGS, VD_REGS,VD_REGS,VD_REGS,
+  VD_REGS, VD_REGS,VD_REGS,VD_REGS,
+  VD_REGS, VD_REGS,VD_REGS,VD_REGS,
+  VD_REGS, VD_REGS,VD_REGS,VD_REGS,
+  VD_REGS, VD_REGS,VD_REGS,VD_REGS,
+  VD_REGS, VD_REGS,VD_REGS,VD_REGS,
 };
 
 /* Costs to use when optimizing for rocket.  */
@@ -894,6 +909,14 @@ riscv_valid_lo_sum_p (enum riscv_symbol_type sym_type, 
machine_mode mode,
   return true;
 }
 
+/* Return true if mode is the RVV mode.  */
+
+static bool
+riscv_v_ext_vector_mode_p (machine_mode mode)
+{
+  return VECTOR_MODE_P (mode);
+}
+
 /* Return true if X is a valid address for machine mode MODE.  If it is,
fill in INFO appropriately.  STRICT_P is true if REG_OK_STRICT is in
effect.  */
@@ -912,6 +935,10 @@ riscv_classify_address (struct riscv_address_info *info, 
rtx x,
   return riscv_valid_base_register_p (info->reg, mode, strict_p);
 
 case PLUS:
+  /* RVV load/store disallow any offset.  */
+  if (riscv_v_ext_vector_mode_p (mode))
+   return false;
+
   info->type = ADDRESS_REG;
   info->reg = XEXP (x, 0);
   info->offset = XEXP (x, 1);
@@ -919,6 +946,10 @@ riscv_classify_address (struct riscv_address_info *info, 
rtx x,
  && riscv_valid_offset_p (info->offset, mode));
 
 case LO_SUM:
+  /* RVV load/store disallow LO_SUM.  */
+  if (riscv_v_ext_vector_mode_p (mode))
+   return false;
+
   info->type = ADDRESS_LO_SUM;
   info->reg = XEXP (x, 0);
   info->offset = XEXP (x, 1);
@@ -937,6 +968,10 @@ riscv_classify_address (struct riscv_address_info *info, 
rtx x,
  && riscv_valid_lo_sum_p (info->symbol_type, mode, info->offset));
 
 case CONST_INT:
+  /* RVV load/store disallow CONST_INT.  */
+  if (riscv_v_ext_vector_mode_p (mode))
+   return false;
+
   /* Small-integer addresses don't occur very often, but they
 are legitimate if x0 is a valid base register.  */
   info->type = ADDRESS_CONST_INT;
@@ -1022,7 +1057,7 @@ riscv_address_insns (rtx x, machine_mode mode, bool 
might_split_p)
 
   /* BLKmode is used for single unaligned loads and stores and should
  not count as a multiword mode. */
-  if (mode != BLKmode && might_split_p)
+  if (!riscv_v_ext_vector_mode_p (mode) && mod

[Patch] libgomp: Add reverse-offload splay tree

2022-08-26 Thread Tobias Burnus

For reverse-offload data handling, we need to support:
(a) device fn addr -> host fn address
(b) finding already mapped host -> device vars via their device address

For (a), the function addresses, we need some extra code (cf. previous
patches) as this information does not exist already. For (b), the variables,
we have two options:
(i) Do a reverse search on the existing data. That's done in
   oacc-mem.c's lookup_dev and is obviously O(N) as it has to check all nodes.
   With the assumption "This is not expected to be a common operation."
   it might still be okay.
(ii) Use a second splay tree for the reverse lookup.
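
To make the trade-off between (i) and (ii) concrete, here is a minimal
sketch in plain C. It is not libgomp code: the struct and function names
are made up for illustration, and a sorted array with binary search stands
in for the second splay tree, assuming the device ranges do not overlap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mapping record: device range [dev_start, dev_end) and the
   host address that corresponds to dev_start.  */
struct rev_map_entry
{
  uintptr_t dev_start;
  uintptr_t dev_end;
  uintptr_t host_start;
};

/* Option (i): walk every entry; O(N) per lookup.  */
static const struct rev_map_entry *
rev_lookup_linear (const struct rev_map_entry *map, size_t n,
		   uintptr_t dev_addr)
{
  for (size_t i = 0; i < n; i++)
    if (dev_addr >= map[i].dev_start && dev_addr < map[i].dev_end)
      return &map[i];
  return NULL;
}

/* Stand-in for option (ii): MAP is sorted by dev_start and the ranges do
   not overlap, so a binary search finds the entry in O(log N).  */
static const struct rev_map_entry *
rev_lookup_sorted (const struct rev_map_entry *map, size_t n,
		   uintptr_t dev_addr)
{
  assert (map != NULL || n == 0);
  size_t lo = 0, hi = n;
  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;
      if (dev_addr < map[mid].dev_start)
	hi = mid;
      else if (dev_addr >= map[mid].dev_end)
	lo = mid + 1;
      else
	return &map[mid];
    }
  return NULL;
}

/* Translate a device address to the host address, or 0 if unmapped.  */
static uintptr_t
dev_to_host (const struct rev_map_entry *map, size_t n, uintptr_t dev_addr)
{
  const struct rev_map_entry *e = rev_lookup_sorted (map, n, dev_addr);
  return e ? e->host_start + (dev_addr - e->dev_start) : 0;
}
```

Both lookups return the same entry; only the cost per lookup differs,
which is why a dedicated ordered structure is attractive if reverse
lookups turn out to be frequent.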

The attached patch prepares for this – but does not yet handle all
locations and is not yet active. The 'devicep->load_image_func' call
depends on whether the previous [1/3] patch (cf. below) has been applied
or not.

(The advantage of the reverse-offload mapping is that 'nowait' is not
permitted and neither is 'target {enter,exit,} data device(ancestor:1)'.
Thus, all 'omp target device(ancestor:1)' mapping done in target_rev
can be undone in the same function - and all others are preexisting.)

OK for mainline?

Tobias

PS: Still to be done (for nvptx, for gcn a bit more is needed):
- handle remaining var-mapping cases
- gomp_target_rev - mainly: handle all of map/firstprivate.
- turn on reverse-offload support (accept omp_requires)
+ add tons of testcases.

Pending reverse-offload patches (libgomp only with a bit of gcc/config/{gcn,nvptx}):

[Patch][1/3] libgomp: Prepare for reverse offload fn lookup
[Patch][2/3] GCN: libgomp+mkoffload.cc: Prepare for reverse offload fn lookup
[Patch][2/3] nvptx: libgomp+mkoffload.cc: Prepare for reverse offload fn lookup
[1/3] is at: https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600333.html

[Patch] libgomp/nvptx: Prepare for reverse-offload callback handling
https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600381.html


libgomp: Add reverse-offload splay tree

Adds a splay tree for reverse offloading; effectively, it is not yet
filled as reverse offload is not enabled for any device type. It also
does not yet track variables later mapped (and then unmapped). Those
will be added in follow-up commits.

libgomp/ChangeLog:

	* libgomp.h: Include splay-tree.h with splay_tree_prefix set to
	'reverse'.
	(struct target_mem_desc): Declare at old line; move
	definition down and add 'reverse_splay_tree_node rev_array' member.
	(struct reverse_splay_tree_key_s, reverse_splay_compare): New.
	(reverse_splay_tree_node, reverse_splay_tree, reverse_splay_tree_key):
	New typedef.
	(struct gomp_device_descr): Add 'reverse_splay_tree_s mem_map_rev'.
	* oacc-host.c (host_dispatch): NULL init mem_map_rev.
	* target.c: Include splay-tree.h with splay_tree_c set and
	splay_tree_prefix set to 'reverse'.
	(gomp_map_lookup_rev): New function (for now inside '#if 0').
	(gomp_unmap_tgt): Free array_rev.
	(gomp_load_image_to_device): Fill reverse-offload splay tree with
	funct and var address.
	(gomp_unload_image_from_device): Empty reverse-offload splay tree.
	(gomp_target_init): Init mem_map_rev.

 libgomp/libgomp.h   | 83 ++---
 libgomp/oacc-host.c |  1 +
 libgomp/target.c| 78 +++--
 3 files changed, 137 insertions(+), 25 deletions(-)

diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h
index c243c4d6cf4..10518a8348b 100644
--- a/libgomp/libgomp.h
+++ b/libgomp/libgomp.h
@@ -1038,29 +1038,7 @@ struct target_var_desc {
   uintptr_t length;
 };
 
-struct target_mem_desc {
-  /* Reference count.  */
-  uintptr_t refcount;
-  /* All the splay nodes allocated together.  */
-  splay_tree_node array;
-  /* Start of the target region.  */
-  uintptr_t tgt_start;
-  /* End of the targer region.  */
-  uintptr_t tgt_end;
-  /* Handle to free.  */
-  void *to_free;
-  /* Previous target_mem_desc.  */
-  struct target_mem_desc *prev;
-  /* Number of items in following list.  */
-  size_t list_count;
-
-  /* Corresponding target device descriptor.  */
-  struct gomp_device_descr *device_descr;
-
-  /* List of target items to remove (or decrease refcount)
- at the end of region.  */
-  struct target_var_desc list[];
-};
+struct target_mem_desc;
 
 /* Special value for refcount - mask to indicate existence of special
values. Right now we allocate 3 bits.  */
@@ -1154,6 +1132,64 @@ splay_compare (splay_tree_key x, splay_tree_key y)
 
 #include "splay-tree.h"
 
+/* Reverse offload splay-tree handling. */
+
+struct reverse_splay_tree_key_s {
+  /* Address of the device object.  */
+  uintptr_t dev_start;
+  /* Address immediately after the device object.  */
+  uintptr_t dev_end;
+
+  splay_tree_key k;
+};
+
+typedef struct re

Re: [PATCH 1/3] omp-simd-clone: Allow fixed-lane vectors

2022-08-26 Thread Jakub Jelinek via Gcc-patches
On Tue, Aug 09, 2022 at 02:23:48PM +0100, Andrew Stubbs wrote:
> 
> The vecsize_int/vecsize_float has an assumption that all arguments will use
> the same bitsize, and vary the number of lanes according to the element size,
> but this is inappropriate on targets where the number of lanes is fixed and
> the bitsize varies (i.e. amdgcn).
> 
> With this change the vecsize can be left zero and the vectorization factor
> will be the same for all types.
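
The vecsize semantics described above can be sketched as follows. This is
only an illustration, not the omp-simd-clone.cc code: the helper name
lanes_for and its plain-integer parameters are assumptions, standing in
for the poly-int values and exact_div used in the real sources.

```c
#include <assert.h>

/* Number of lanes for one argument or return type.
   VECSIZE_BITS: the target's vecsize_int/vecsize_float, or 0 if unset.
   SIMDLEN: the clone's simdlen.
   ELEM_BITS: bit size of the scalar element type.  */
static unsigned
lanes_for (unsigned vecsize_bits, unsigned simdlen, unsigned elem_bits)
{
  /* New behaviour: a zero vecsize means "fixed lane count", so every
     type gets simdlen lanes regardless of its element size.  */
  if (vecsize_bits == 0)
    return simdlen;

  /* Existing behaviour: fixed bit size, lane count varies by type.  */
  assert (elem_bits != 0 && vecsize_bits % elem_bits == 0);
  return vecsize_bits / elem_bits;
}
```

With a 256-bit vecsize, a 32-bit element gets 8 lanes and a 64-bit element
gets 4; with vecsize left at zero and simdlen 64, both get 64 lanes.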
> 
> gcc/ChangeLog:
> 
>   * doc/tm.texi: Regenerate.
>   * omp-simd-clone.cc (simd_clone_adjust_return_type): Allow zero
>   vecsize.
>   (simd_clone_adjust_argument_types): Likewise.
>   * target.def (compute_vecsize_and_simdlen): Document the new
>   vecsize_int and vecsize_float semantics.

LGTM, except for a formatting nit.

> @@ -618,8 +621,12 @@ simd_clone_adjust_argument_types (struct cgraph_node *node)
>   veclen = sc->vecsize_int;
> else
>   veclen = sc->vecsize_float;
> -   veclen = exact_div (veclen,
> -   GET_MODE_BITSIZE (SCALAR_TYPE_MODE (parm_type)));
> +   if (known_eq (veclen, 0))
> + veclen = sc->simdlen;
> +   else
> + veclen = exact_div (veclen,
> + GET_MODE_BITSIZE
> + (SCALAR_TYPE_MODE (parm_type)));

Macro name on one line and ( on another is too ugly, can you please use:
veclen
  = exact_div (veclen,
   GET_MODE_BITSIZE (SCALAR_TYPE_MODE (parm_type)));
or:
{
  scalar_mode m = SCALAR_TYPE_MODE (parm_type);
  veclen = exact_div (veclen, GET_MODE_BITSIZE (m));
}
?

Jakub



[PATCH] New testcase for uninit

2022-08-26 Thread Richard Biener via Gcc-patches
The following adds a testcase that illustrates a defect in
compute_control_dep_chain and its attempt to identify loop
exits as special to continue walking post-dominators but failing
to do so for following post-dominators.  On trunk there is now
simple_control_dep_chain saving the day, avoiding the false
positive but with GCC 12 we get a bogus diagnostic.

Pushed.

* gcc.dg/uninit-pred-11.c: New testcase.
---
 gcc/testsuite/gcc.dg/uninit-pred-11.c | 27 +++
 1 file changed, 27 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/uninit-pred-11.c

diff --git a/gcc/testsuite/gcc.dg/uninit-pred-11.c 
b/gcc/testsuite/gcc.dg/uninit-pred-11.c
new file mode 100644
index 000..734df379ef7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/uninit-pred-11.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-O -Wmaybe-uninitialized" } */
+
+extern unsigned bar (void);
+extern void quux (void);
+int z;
+unsigned foo (unsigned v, int y)
+{
+  unsigned u;
+  if (v != 1)
+u = bar ();
+
+  // Prevent the "dom" pass from changing the CFG layout based on the inference
+  // 'if (v != 1) is false then (v != 2) is true'.  (Now it would have to
+  // duplicate the loop in order to do so, which is deemed expensive.)
+  for (int i = 0; i < 10; i++)
+quux ();
+
+  // This variation from uninit-25.c causes compute_control_dep_chain
+  // to run into a defect but simple_control_dep_chain saves us here
+  if (y)
+z = 1;
+  if (v != 1)
+return u;   /* { dg-bogus "may be used uninitialized" } */
+
+  return 0;
+}
-- 
2.35.3


Re: [Patch] OpenMP: Support reverse offload (middle end part)

2022-08-26 Thread Jakub Jelinek via Gcc-patches
On Thu, Jul 21, 2022 at 02:33:32PM +0200, Tobias Burnus wrote:
> OpenMP: Support reverse offload (middle end part)
> 
> gcc/ChangeLog:
> 
>   * internal-fn.cc (expand_GOMP_TARGET_REV): New.
>   * internal-fn.def (GOMP_TARGET_REV): New.
>   * lto-cgraph.cc (lto_output_node, verify_node_partition): Mark
>   'omp target device_ancestor_host' as in_other_partition and don't
>   error if absent.
>   * omp-low.cc (create_omp_child_function): Mark as 'noclone'.
>   * omp-expand.cc (expand_omp_target): For reverse offload, remove
>   sorry, use device = GOMP_DEVICE_HOST_FALLBACK and create
>   empty-body nohost function.
>   * omp-offload.cc (execute_omp_device_lower): Handle
>   IFN_GOMP_TARGET_REV.
>   (pass_omp_target_link::execute): For ACCEL_COMPILER, don't
>   nullify fn argument for reverse offload
> 
> libgomp/ChangeLog:
> 
>   * libgomp.texi (OpenMP 5.0): Mark 'ancestor' as implemented but
>   refer to 'requires'.
>   * testsuite/libgomp.c-c++-common/reverse-offload-1-aux.c: New test.
>   * testsuite/libgomp.c-c++-common/reverse-offload-1.c: New test.
>   * testsuite/libgomp.fortran/reverse-offload-1-aux.f90: New test.
>   * testsuite/libgomp.fortran/reverse-offload-1.f90: New test.
> 
> gcc/testsuite/ChangeLog:
> 
>   * c-c++-common/gomp/reverse-offload-1.c: Remove dg-sorry.
>   * c-c++-common/gomp/target-device-ancestor-4.c: Likewise.
>   * gfortran.dg/gomp/target-device-ancestor-4.f90: Likewise.
>   * gfortran.dg/gomp/target-device-ancestor-5.f90: Likewise.
>   * c-c++-common/goacc/classify-kernels-parloops.c: Add 'noclone' to
>   scan-tree-dump-times.
>   * c-c++-common/goacc/classify-kernels-unparallelized-parloops.c:
>   Likewise.
>   * c-c++-common/goacc/classify-kernels-unparallelized.c: Likewise.
>   * c-c++-common/goacc/classify-kernels.c: Likewise.
>   * c-c++-common/goacc/classify-parallel.c: Likewise.
>   * c-c++-common/goacc/classify-serial.c: Likewise.
>   * c-c++-common/goacc/kernels-counter-vars-function-scope.c: Likewise.
>   * c-c++-common/goacc/kernels-loop-2.c: Likewise.
>   * c-c++-common/goacc/kernels-loop-3.c: Likewise.
>   * c-c++-common/goacc/kernels-loop-data-2.c: Likewise.
>   * c-c++-common/goacc/kernels-loop-data-enter-exit-2.c: Likewise.
>   * c-c++-common/goacc/kernels-loop-data-enter-exit.c: Likewise.
>   * c-c++-common/goacc/kernels-loop-data-update.c: Likewise.
>   * c-c++-common/goacc/kernels-loop-data.c: Likewise.
>   * c-c++-common/goacc/kernels-loop-g.c: Likewise.
>   * c-c++-common/goacc/kernels-loop-mod-not-zero.c: Likewise.
>   * c-c++-common/goacc/kernels-loop-n.c: Likewise.
>   * c-c++-common/goacc/kernels-loop-nest.c: Likewise.
>   * c-c++-common/goacc/kernels-loop.c: Likewise.
>   * c-c++-common/goacc/kernels-one-counter-var.c: Likewise.
>   * c-c++-common/goacc/kernels-parallel-loop-data-enter-exit.c: Likewise.
>   * gfortran.dg/goacc/classify-kernels-parloops.f95: Likewise.
>   * gfortran.dg/goacc/classify-kernels-unparallelized-parloops.f95:
>   Likewise.
>   * gfortran.dg/goacc/classify-kernels-unparallelized.f95: Likewise.
>   * gfortran.dg/goacc/classify-kernels.f95: Likewise.
>   * gfortran.dg/goacc/classify-parallel.f95: Likewise.
>   * gfortran.dg/goacc/classify-serial.f95: Likewise.
>   * gfortran.dg/goacc/kernels-loop-2.f95: Likewise.
>   * gfortran.dg/goacc/kernels-loop-data-2.f95: Likewise.
>   * gfortran.dg/goacc/kernels-loop-data-enter-exit-2.f95: Likewise.
>   * gfortran.dg/goacc/kernels-loop-data-enter-exit.f95: Likewise.
>   * gfortran.dg/goacc/kernels-loop-data-update.f95: Likewise.
>   * gfortran.dg/goacc/kernels-loop-data.f95: Likewise.
>   * gfortran.dg/goacc/kernels-loop-n.f95: Likewise.
>   * gfortran.dg/goacc/kernels-loop.f95: Likewise.
>   * gfortran.dg/goacc/kernels-parallel-loop-data-enter-exit.f95: Likewise.

Ok for trunk, just a comment regarding the FIXME below (can be handled
incrementally).

> +   case IFN_GOMP_TARGET_REV:
> + {
> +#ifndef ACCEL_COMPILER
> +   gimple_stmt_iterator gsi2 = gsi;
> +   gsi_next (&gsi2);
> +   gcc_assert (!gsi_end_p (gsi2));
> +   gcc_assert (gimple_call_builtin_p (gsi_stmt (gsi2),
> +  BUILT_IN_GOMP_TARGET));
> +   tree old_decl
> + = TREE_OPERAND (gimple_call_arg (gsi_stmt (gsi2), 1), 0);
> +   tree new_decl = gimple_call_arg (gsi_stmt (gsi), 0);
> +   gimple_call_set_arg (gsi_stmt (gsi2), 1, new_decl);
> +   update_stmt (gsi_stmt (gsi2));
> +   new_decl = TREE_OPERAND (new_decl, 0);
> +   unsigned i;
> +   unsigned num_funcs = vec_safe_length (offload_funcs);
> +   for (i = 0; i < num_funcs; i++)
> + {
> +   if ((*offload_funcs)[i] == old_decl)
> + {
> 

Re: [PATCH][_GLIBCXX_DEBUG] Add basic_string::starts_with/ends_with checks

2022-08-26 Thread Jonathan Wakely via Gcc-patches
On Sun, 14 Aug 2022 at 16:34, François Dumont via Libstdc++
 wrote:
>
> I think we can add those checks.
>
> Note that I wonder if it was needed as in basic_string_view I see usages
> of __attribute__((__nonnull__)). But running the test I saw no impact
> even after I try to apply this attribute to the starts_with/ends_with
> methods themselves.

That should cause warnings, and does when I try it.

As you say, the relevant string_view constructor already has that anyway:

  __attribute__((__nonnull__)) constexpr
  basic_string_view(const _CharT* __str) noexcept

And so does string_view::find. The problem is that those only help if
the compiler inlines the calls to those functions and so can propagate
the null value all the way down to a function with the attribute.
Adding the attribute to the relevant starts_with, ends_with and
contains functions makes the diagnostics more likely to be emitted
without optimization.
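
A small C analogue of this point, using the GCC/Clang nonnull attribute.
This is a sketch only: the function names are made up and it is not the
libstdc++ code. The attribute sits directly on the function the user
calls, so passing a literal null can be diagnosed at the call site without
relying on inlining; the second helper shows a debug-mode style runtime
check for builds where no static diagnostic fires.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Attribute on the user-facing function itself: a call such as
   starts_with (s, NULL) can be diagnosed by -Wnonnull directly,
   without the compiler having to inline anything first.  */
__attribute__ ((nonnull))
static int
starts_with (const char *s, const char *prefix)
{
  return strncmp (s, prefix, strlen (prefix)) == 0;
}

/* Debug-mode style wrapper: no attribute here, so the runtime check
   is not optimized away on the assumption that PREFIX is non-null.  */
static int
starts_with_checked (const char *s, const char *prefix)
{
  assert (s != NULL && prefix != NULL);
  return starts_with (s, prefix);
}
```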

>
> Also note that several checks like the ones I am adding here are XFAILs
> when using 'make check' because of a segfault rather than a proper
> debug check. Would you prefer to add dg-require-debug-mode to those?
>
>  libstdc++: [_GLIBCXX_DEBUG] Add basic_string::starts_with/ends_with
> checks
>
>  Add simple checks on C string parameters which should not be null.
>
>  Review null string checks to show:
>  _String != nullptr
>
>  rather than:
>  _String != 0

I don't really like the extra complexity in the macros, but this does
seem like a nice improvement for what users see.

We could use __null for C++98, which is a compiler keyword that
expands to a null pointer constant, but I'm not sure if that would be
clear to all users or not. Maybe 0 is better.



Re: [PATCH V4] rs6000: Optimize cmp on rotated 16bits constant

2022-08-26 Thread Jiufu Guo via Gcc-patches
Hi,

Segher Boessenkool  writes:

> Hi!
>
> On Thu, Aug 25, 2022 at 08:11:31PM +0800, Jiufu Guo wrote:
>> Segher Boessenkool  writes:
>> > You usually can split fine if you cannot create new pseudos, by reusing
>> > existing registers.
>> >
>> > FAIL will cause an ICE: the RTL instruction does match, but will fail
>> > when trying to generate machine code for it.
>> >
>> The previous patch is using "gen_reg_rtx (DImode)" to generate a pseudo for
>> the rotated result to prevent the original one being changed accidentally.
>> So, an 'assert (can_create_pseudo_p ())' would catch it after RA.
>
> It sounds like you want a define_split, not a define_insn_and_split.
> That is much more stomachable anyway.
>
Thanks for pointing out this!

As you mentioned, since it is only the 'combine' pass that can match the
patterns, it would be better to just use a define_split.  However, I tried
it this way, like:

(define_split
  [(set (pc)
(if_then_else (eqne (match_operand:DI 1 "gpc_reg_operand")
(match_operand:DI 2 "const_int_operand"))
  (label_ref (match_operand 0))
  (pc)))]
  "TARGET_POWERPC64 && num_insns_constant (operands[2], DImode) > 1
   && compare_rotate_immediate_p (UINTVAL (operands[2]))"
  [(pc)]

But this does not work.  With more debugging, it seems that
"combine_split_insns/split_insns" correctly returns a sequence of
three insns.  But after the return, at most two insns can be
handled.  As the code comment says:
 If we were combining three insns and the result is a simple SET
 with no ASM_OPERANDS that wasn't recognized, try to split it into two
 insns.

So that 'define_split' fails to keep the result.


In the patch, the 'define_insn_and_split' is handled as follows:
in the 'combine' pass, the newly defined insn "rotate_on_cmpdi" is combined
from three instructions;
then, in the 'split1' pass, it is split into three other insns.


> Anything that creates conditional branches together with compare insns
> belongs before RA, before sched1 even.
>
For this patch, it would run in 'split1' mostly.  The good thing is
'split1' is before sched1. :-)

>> To make this splitter work after RA, we may need to reserve one
>> register (clobber would be ok).  Such as below:
>> 
>>   [(set (pc)
>>  (if_then_else (eqne (match_operand:DI 1 "gpc_reg_operand" "r")
>>  (match_operand:DI 2 "const_int_operand" "n"))
>>(label_ref (match_operand 0 ""))
>>(pc)))
>>   (clobber (match_scratch:DI 3 "=r"))
>>   (clobber (match_scratch:CCUNS 4 "=y"))]
>
> Yes, that is one way to do it.  Another way is to reuse operand 1.  A
> clobber is probably better in this case though :-)
Yes, a clobber would be better -:)  For example:
If %3 is used later, it would not be safe to change:
"%3:DI!=0x8642"==>"%3:DI=%3DI<-15, %3:DI!=0x4321"

>
> If this is only so combine can match things, you certainly want just a
> define_split, and the compare+branch in one pattern is not as bad
> then.
As commented above, since I failed to use 'define_split',
'define_insn_and_split' is used in the patch. :(
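
For readers following along, the idea behind the rotated compare can be
sketched in C. This is an illustration only: the helper names are made up,
it checks only for an unsigned 16-bit result, and it makes no claim about
what compare_rotate_immediate_p in the patch actually accepts. The point
is that x == C holds exactly when rotl (x, n) == rotl (C, n), so a
constant that needs several insns to materialize can be replaced by a
rotate of x plus a compare against a 16-bit immediate.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate X left by N bits, 0 <= N < 64.  */
static uint64_t
rotl64 (uint64_t x, unsigned n)
{
  assert (n < 64);
  return (x << n) | (x >> ((64 - n) & 63));
}

/* If some left rotation of C fits in 16 unsigned bits, store the smallest
   such rotation count in *N and return 1; otherwise return 0.  */
static int
rotated_u16_p (uint64_t c, unsigned *n)
{
  for (unsigned i = 0; i < 64; i++)
    if (rotl64 (c, i) <= 0xffff)
      {
	*n = i;
	return 1;
      }
  return 0;
}

/* Compute x == c via the rotated form when one exists.  */
static int
eq_via_rotate (uint64_t x, uint64_t c)
{
  unsigned n;
  if (!rotated_u16_p (c, &n))
    return x == c;
  /* Rotation is a bijection, so equality is preserved.  */
  return rotl64 (x, n) == rotl64 (c, n);
}
```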


BR,
Jeff(Jiufu)

>
>
> Segher


Re: [PATCH 0/6] Optimise placement of SLP permutations

2022-08-26 Thread Richard Biener via Gcc-patches
On Thu, 25 Aug 2022, Richard Sandiford wrote:

> This series is a follow-up from the RFC that I posted a while
> back about optimising the placement of SLP permutations.
> The main comment is in the final patch.
> 
> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  If the series
> is OK, I'll test on powerpc64le-linux-gnu too before committing.

The series is OK.

Thanks for working on this!
Richard.


Re: [PATCH][DOCS] Mention removed ports in GCC 13.

2022-08-26 Thread Richard Biener via Gcc-patches
On Fri, Aug 26, 2022 at 8:33 AM Martin Liška  wrote:
>
> Ready for master?

OK

> Thanks,
> Martin
>
> ---
>  htdocs/gcc-13/changes.html | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/htdocs/gcc-13/changes.html b/htdocs/gcc-13/changes.html
> index c58ca1d4..e3f2ee44 100644
> --- a/htdocs/gcc-13/changes.html
> +++ b/htdocs/gcc-13/changes.html
> @@ -28,7 +28,9 @@ a work-in-progress.
>  
>  Caveats
>  
> -  ...
> +The support for the cr16-elf, 
> tilegx*-linux and tilepro*-linux
> +  configurations has been removed.
> +
>  
>
>
> --
> 2.37.2
>


Re: [PATCH] Add pattern to convert vector shift + bitwise and + multiply to vector compare in some cases.

2022-08-26 Thread Richard Biener via Gcc-patches
On Sat, Aug 13, 2022 at 11:59 AM mtsamis  wrote:
>
> When using SWAR (SIMD in a register) techniques a comparison operation within
> such a register can be made by using a combination of shifts, bitwise and and
> multiplication. If code using this scheme is vectorized then there is
> potential to replace all these operations with a single vector comparison, by
> reinterpreting the vector types to match the width of the SWAR register.
>
> For example, for the test function packed_cmp_16_32, the original generated
> code is:
>
> ldr q0, [x0]
> add w1, w1, 1
> ushr v0.4s, v0.4s, 15
> and v0.16b, v0.16b, v2.16b
> shl v1.4s, v0.4s, 16
> sub v0.4s, v1.4s, v0.4s
> str q0, [x0], 16
> cmp w2, w1
> bhi .L20
>
> with this pattern the above can be optimized to:
>
> ldr q0, [x0]
> add w1, w1, 1
> cmlt v0.8h, v0.8h, #0
> str q0, [x0], 16
> cmp w2, w1
> bhi .L20
>
> The effect is similar for x86-64.
>
> gcc/ChangeLog:
>
> * match.pd: Simplify vector shift + bit_and + multiply in some cases.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/aarch64/swar_to_vec_cmp.c: New test.
>
> Signed-off-by: mtsamis 
> ---
>  gcc/match.pd  | 57 +++
>  .../gcc.target/aarch64/swar_to_vec_cmp.c  | 72 +++
>  2 files changed, 129 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/swar_to_vec_cmp.c
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 8bbc0dbd5cd..5c768a94846 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -301,6 +301,63 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>  (view_convert (bit_and:itype (view_convert @0)
>  (ne @1 { build_zero_cst (type); })))
>
> +/* In SWAR (SIMD in a register) code a comparison of packed data can
> +   be constructed with a particular combination of shift, bitwise and,
> +   and multiplication by constants.  If that code is vectorized we can
> +   convert this pattern into a more efficient vector comparison.  */
> +(simplify
> + (mult (bit_and (rshift @0 @1) @2) @3)

You should restrict the pattern a bit more, below you use uniform_integer_cst_p
and also require a vector type thus

  (simplify
   (mult (bit_and (rshift @0 VECTOR_CST@1) VECTOR_CST@2) VECTOR_CST@3)


> + (with {
> +   tree op_type = TREE_TYPE (@0);

that's the same as 'type' which is already available.

> +   tree rshift_cst = NULL_TREE;
> +   tree bit_and_cst = NULL_TREE;
> +   tree mult_cst = NULL_TREE;
> +  }
> +  /* Make sure we're working with vectors and uniform vector constants.  */
> +  (if (VECTOR_TYPE_P (op_type)
> +   && (rshift_cst = uniform_integer_cst_p (@1))
> +   && (bit_and_cst = uniform_integer_cst_p (@2))
> +   && (mult_cst = uniform_integer_cst_p (@3)))
> +   /* Compute what constants would be needed for this to represent a packed
> +  comparison based on the shift amount denoted by RSHIFT_CST.  */
> +   (with {
> + HOST_WIDE_INT vec_elem_bits = vector_element_bits (op_type);
> + HOST_WIDE_INT vec_nelts = TYPE_VECTOR_SUBPARTS (op_type).to_constant ();

you need to check that this isn't a VLA vector operation.

> + HOST_WIDE_INT vec_bits = vec_elem_bits * vec_nelts;
> +
> + unsigned HOST_WIDE_INT cmp_bits_i, bit_and_i, mult_i;
> + unsigned HOST_WIDE_INT target_mult_i, target_bit_and_i;
> + cmp_bits_i = tree_to_uhwi (rshift_cst) + 1;

and that the rshift_cst and others actually fit an uhwi.

> + target_mult_i = (HOST_WIDE_INT_1U << cmp_bits_i) - 1;
> +
> + mult_i = tree_to_uhwi (mult_cst);
> + bit_and_i = tree_to_uhwi (bit_and_cst);
> + target_bit_and_i = 0;
> +
> + for (unsigned i = 0; i < vec_elem_bits / cmp_bits_i; i++)
> +   target_bit_and_i = (target_bit_and_i << cmp_bits_i) | 1U;

it would be nice to have a comment on what this actually does ...
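
One possible reading of the loop, offered as a sketch and not as the
patch's own documentation: it replicates a 1 in the low bit of every
cmp_bits_i-wide lane of an element.  With a shift of 15 on 32-bit
elements, cmp_bits_i is 16, the mask is 0x00010001 and the multiplier is
0xffff, so each 16-bit lane becomes 0xffff when its sign bit is set and 0
otherwise.  The function names below are hypothetical and exist only for
this illustration.

```c
#include <stdint.h>

/* Hypothetical helper: set the low bit of every CMP_BITS-wide lane of an
   ELEM_BITS-wide element, mirroring the loop that computes
   target_bit_and_i.  Assumes 0 < CMP_BITS < 64.  */
static uint64_t
lane_low_bits (unsigned elem_bits, unsigned cmp_bits)
{
  uint64_t mask = 0;
  for (unsigned i = 0; i < elem_bits / cmp_bits; i++)
    mask = (mask << cmp_bits) | 1u;
  return mask;
}

/* SWAR "less than zero" on two 16-bit lanes packed in a 32-bit word:
   shift each lane's sign bit down to the lane's low bit, keep only those
   bits, then multiply by 0xffff to smear each kept bit over its lane.  */
static uint32_t
swar_cmplt0_16x2 (uint32_t x)
{
  uint32_t mask = (uint32_t) lane_low_bits (32, 16);  /* 0x00010001 */
  uint32_t mult = (1u << 16) - 1;                     /* 0xffff */
  return ((x >> 15) & mask) * mult;
}
```

The multiplication cannot carry between lanes because at most one bit per
lane survives the mask, which is why a vector compare on 16-bit lanes
computes the same result.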

> +}
> +(if ((exact_log2 (cmp_bits_i)) >= 0
> +&& cmp_bits_i < HOST_BITS_PER_WIDE_INT
> +&& vec_elem_bits <= HOST_BITS_PER_WIDE_INT
> +&& tree_fits_uhwi_p (rshift_cst)
> +&& tree_fits_uhwi_p (mult_cst)
> +&& tree_fits_uhwi_p (bit_and_cst)
> +&& target_mult_i == mult_i
> +&& target_bit_and_i == bit_and_i)
> + /* Compute the vector shape for the comparison and check if the target is
> +   able to expand the comparison with that type.  */
> + (with {
> +   tree bool_type = build_nonstandard_boolean_type (cmp_bits_i);
> +   int vector_type_nelts = vec_bits / cmp_bits_i;
> +   tree vector_type = build_vector_type (bool_type, vector_type_nelts);

why do you build a bool vector type here and then ...

> +   tree zeros = build_zero_cst (vector_type);
> +   tree mask_type = truth_type_for (vector_type);

... its truth type?  Note both might not be actually supported by the target
and thus receive BLK

[Patch] libgomp/nvptx: Prepare for reverse-offload callback handling

2022-08-26 Thread Tobias Burnus

@Tom and Alexander: Better suggestions are welcome for the busy loop in
libgomp/plugin/plugin-nvptx.c regarding the variable placement and checking
its value.


PRE-REMARK

Since nvptx (and all other plugins) returns <= 0 for
GOMP_OFFLOAD_get_num_devices if GOMP_REQUIRES_REVERSE_OFFLOAD is
set, this patch is currently still a no-op.

The patch is almost stand alone, except that it either needs a
 void *rev_fn_table = NULL;
in GOMP_OFFLOAD_load_image or the following patch:
 [Patch][2/3] nvptx: libgomp+mkoffload.cc: Prepare for reverse offload fn lookup
 https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600348.html
(which in turn needs the '[1/3]' patch).

Not required to be compilable, but the patch is based on the ideas/code from
the reverse-offload ME patch; the latter adds calls to
 GOMP_target_ext (omp_initial_device,
which is for host fallback code processed by the normal target_ext and for
device code by the target_ext of this patch.
→ "[Patch] OpenMP: Support reverse offload (middle end part)"
 https://gcc.gnu.org/pipermail/gcc-patches/2022-July/598662.html

* * *

This patch adds initial offloading support for nvptx.
When the nvptx device's GOMP_target_ext is called, it creates a lock,
fills a struct with the argument pointers (addr, kinds, sizes) and its
device number, and then sets the function pointer address.

On the host side, the last address is checked - if fn_addr != NULL,
it passes all arguments on to the generic (target.c) gomp_target_rev
to do the actual offloading.

CUDA locks up when trying to copy data from the currently running
stream; hence, a new stream is created to do the memory copying.
Just having managed memory is not enough - it needs to be concurrently
accessible - otherwise, it will segfault on the host when migrated to
the device.
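
The handshake described above can be sketched on the host alone.  This is
not the code from the patch: struct rev_request, device_post and host_poll
are hypothetical names, and the lock, the managed memory and the CUDA
stream handling are all left out.  The only point illustrated is that the
function address is written last and that a nonzero value signals a
pending request.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mailbox shared between device and host.  */
struct rev_request
{
  uint64_t fn;            /* 0 means "no request pending".  */
  size_t mapnum;
  void **addrs;
  size_t *sizes;
  unsigned short *kinds;
  int dev_num;
};

/* Device side: fill in the arguments first and the function address
   last, since the host treats a nonzero address as "ready".  */
static void
device_post (struct rev_request *r, uint64_t fn, size_t mapnum,
             void **addrs, size_t *sizes, unsigned short *kinds,
             int dev_num)
{
  r->mapnum = mapnum;
  r->addrs = addrs;
  r->sizes = sizes;
  r->kinds = kinds;
  r->dev_num = dev_num;
  r->fn = fn;
}

/* Host side: return 1 and hand back the function address if a request
   was pending, clearing the slot afterwards; return 0 otherwise.  A
   real implementation would forward every field to the generic handler
   and would need proper memory ordering between the two sides.  */
static int
host_poll (struct rev_request *r, uint64_t *fn_out)
{
  if (r->fn == 0)
    return 0;
  *fn_out = r->fn;
  r->fn = 0;
  return 1;
}
```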

OK for mainline?

* * *

Future work for nvptx:
* Adjust 'sleep', possibly using different values with and without USM, and
 do shorter sleeps than usleep(1)?
* Set a flag indicating whether there is any offload function at all, to
 avoid running the more expensive check if there is 'requires reverse_offload'
 without actual reverse-offloading functions present.
 (Recall that the '2/3' patch, mentioned above, only has fn != NULL for
 reverse-offload functions.)
* Document in libgomp.texi that reverse offload may cause some performance
 overhead for all target regions, and that reverse offload is run serialized.

And obviously: submitting the missing bits to get reverse offload working,
but that's mostly not an nvptx topic.

Tobias


-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
libgomp/nvptx: Prepare for reverse-offload callback handling

This patch adds a stub 'gomp_target_rev' in the host's target.c, which will
later handle the reverse offload.
For nvptx, it adds support for forwarding the offload gomp_target_ext call
to the host by setting values in a struct on the device and querying it on
the host - invoking gomp_target_rev on the result.

include/ChangeLog:

	* cuda/cuda.h (enum CUdevice_attribute): Add
	CU_DEVICE_ATTRIBUTE_CONCURRENT_MANAGED_ACCESS.
	(enum CUmemAttach_flags): New stub with only member
	CU_MEM_ATTACH_GLOBAL.
	(cuMemAllocManaged): Add prototype.

libgomp/ChangeLog:

	* config/nvptx/icv-device.c (GOMP_DEVICE_NUM_VAR): Remove
	'static' for this variable.
	* config/nvptx/target.c (GOMP_REV_OFFLOAD_VAR): #define as
	variable-name string and use it to define the variable.
	(GOMP_DEVICE_NUM_VAR): Declare this extern global var.
	(struct rev_offload): Define.
	(GOMP_target_ext): Handle reverse offload.
	* libgomp-plugin.h (GOMP_PLUGIN_target_rev): New prototype.
	* libgomp-plugin.c (GOMP_PLUGIN_target_rev): New, call ...
	* target.c (gomp_target_rev): ... this new stub function.
	* libgomp.h (gomp_target_rev): Declare.
	* libgomp.map (GOMP_PLUGIN_1.4): New; add GOMP_PLUGIN_target_rev.
	* plugin/cuda-lib.def (cuMemAllocManaged): Add.
	* plugin/plugin-nvptx.c (GOMP_REV_OFFLOAD_VAR): #define var string.
	(struct rev_offload): New.
	(struct ptx_device): Add concurr_managed_access, rev_data
	and rev_data_dev.
	(nvptx_open_device): Set ptx_device's concurr_managed_access;
	'#if 0' unused async_engines.
	(GOMP_OFFLOAD_load_image): Allocate rev_data variable.
	(rev_off_dev_to_host_cpy, rev_off_host_to_dev_cpy): New.
	(GOMP_OFFLOAD_run): Handle reverse offloading.

 include/cuda/cuda.h   |   8 ++-
 libgomp/config/nvptx/icv-device.c |   2 +-
 libgomp/config/nvptx/target.c |  52 --
 libgomp/libgomp-plugin.c  |  12 
 libgomp/libgomp-plugin.h  |   7 ++
 libgomp/libgomp.h |   5 ++
 libgomp/libgomp.map   |   5 ++
 libgomp/plugin/cuda-lib.def   |   1 +
 libgomp/plugin/plugin-nvptx.c | 148 +-
 libgomp/target.c  

Re: [[GCC13][Patch][V3] 2/2] Use array_at_struct_end_p in __builtin_object_size [PR101836]

2022-08-26 Thread Richard Biener via Gcc-patches
On Wed, 17 Aug 2022, Qing Zhao wrote:

> Use array_at_struct_end_p to determine whether the trailing array
> of a structure is flexible array member in __builtin_object_size.

With the discussion about what array_at_struct_end_p actually computes,
is this now the correct utility for __builtin_object_size or should
it use a stricter variant?

> gcc/ChangeLog:
> 
>   PR tree-optimization/101836
>   * tree-object-size.cc (addr_object_size): Use array_at_struct_end_p
>   to determine a flexible array member reference.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR tree-optimization/101836
>   * gcc.dg/pr101836.c: New test.
>   * gcc.dg/pr101836_1.c: New test.
>   * gcc.dg/pr101836_2.c: New test.
>   * gcc.dg/pr101836_3.c: New test.
>   * gcc.dg/pr101836_4.c: New test.
>   * gcc.dg/pr101836_5.c: New test.
>   * gcc.dg/strict-flex-array-5.c: New test.
>   * gcc.dg/strict-flex-array-6.c: New test.
> ---
>  gcc/testsuite/gcc.dg/pr101836.c| 60 ++
>  gcc/testsuite/gcc.dg/pr101836_1.c  | 60 ++
>  gcc/testsuite/gcc.dg/pr101836_2.c  | 60 ++
>  gcc/testsuite/gcc.dg/pr101836_3.c  | 60 ++
>  gcc/testsuite/gcc.dg/pr101836_4.c  | 60 ++
>  gcc/testsuite/gcc.dg/pr101836_5.c  | 60 ++
>  gcc/testsuite/gcc.dg/strict-flex-array-5.c | 60 ++
>  gcc/testsuite/gcc.dg/strict-flex-array-6.c | 60 ++
>  gcc/tree-object-size.cc| 16 +++---
>  9 files changed, 487 insertions(+), 9 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/pr101836.c
>  create mode 100644 gcc/testsuite/gcc.dg/pr101836_1.c
>  create mode 100644 gcc/testsuite/gcc.dg/pr101836_2.c
>  create mode 100644 gcc/testsuite/gcc.dg/pr101836_3.c
>  create mode 100644 gcc/testsuite/gcc.dg/pr101836_4.c
>  create mode 100644 gcc/testsuite/gcc.dg/pr101836_5.c
>  create mode 100644 gcc/testsuite/gcc.dg/strict-flex-array-5.c
>  create mode 100644 gcc/testsuite/gcc.dg/strict-flex-array-6.c
> 
> diff --git a/gcc/testsuite/gcc.dg/pr101836.c b/gcc/testsuite/gcc.dg/pr101836.c
> new file mode 100644
> index ..efad02cfe899
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr101836.c
> @@ -0,0 +1,60 @@
> +/* -fstrict-flex-arrays is aliased with -fstrict-flex-arrays=3, which is the
> +   strictest, only [] is treated as flexible array.  */ 
> +/* PR tree-optimization/101836 */
> +/* { dg-do run } */
> +/* { dg-options "-O2 -fstrict-flex-arrays" } */
> +
> +#include <stdio.h>
> +
> +#define expect(p, _v) do { \
> +size_t v = _v; \
> +if (p == v) \
> +printf("ok:  %s == %zd\n", #p, p); \
> +else \
> + {  \
> +  printf("WAT: %s == %zd (expected %zd)\n", #p, p, v); \
> +   __builtin_abort (); \
> + } \
> +} while (0);
> +
> +struct trailing_array_1 {
> +int a;
> +int b;
> +int c[4];
> +};
> +
> +struct trailing_array_2 {
> +int a;
> +int b;
> +int c[1];
> +};
> +
> +struct trailing_array_3 {
> +int a;
> +int b;
> +int c[0];
> +};
> +struct trailing_array_4 {
> +int a;
> +int b;
> +int c[];
> +};
> +
> +void __attribute__((__noinline__)) stuff(
> +struct trailing_array_1 *normal,
> +struct trailing_array_2 *trailing_1,
> +struct trailing_array_3 *trailing_0,
> +struct trailing_array_4 *trailing_flex)
> +{
> +expect(__builtin_object_size(normal->c, 1), 16);
> +expect(__builtin_object_size(trailing_1->c, 1), 4);
> +expect(__builtin_object_size(trailing_0->c, 1), 0);
> +expect(__builtin_object_size(trailing_flex->c, 1), -1);
> +}
> +
> +int main(int argc, char *argv[])
> +{
> +stuff((void *)argv[0], (void *)argv[0], (void *)argv[0], (void *)argv[0]);
> +
> +return 0;
> +}
> diff --git a/gcc/testsuite/gcc.dg/pr101836_1.c b/gcc/testsuite/gcc.dg/pr101836_1.c
> new file mode 100644
> index ..e2931ce1012e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr101836_1.c
> @@ -0,0 +1,60 @@
> +/* -fstrict-flex-arrays=3 is the strictest, only [] is treated as
> +   flexible array.  */ 
> +/* PR tree-optimization/101836 */
> +/* { dg-do run } */
> +/* { dg-options "-O2 -fstrict-flex-arrays=3" } */
> +
> +#include <stdio.h>
> +
> +#define expect(p, _v) do { \
> +size_t v = _v; \
> +if (p == v) \
> +printf("ok:  %s == %zd\n", #p, p); \
> +else \
> + {  \
> +  printf("WAT: %s == %zd (expected %zd)\n", #p, p, v); \
> +   __builtin_abort (); \
> + } \
> +} while (0);
> +
> +struct trailing_array_1 {
> +int a;
> +int b;
> +int c[4];
> +};
> +
> +struct trailing_array_2 {
> +int a;
> +int b;
> +int c[1];
> +};
> +
> +struct trailing_array_3 {
> +int a;
> +int b;
> +int c[0];
> +};
> +struct trailing_array_4 {
> +int a;
> +int b;
> +int c[];
> +};
> +
> +void __attribute__((__noinline__)) stuff(
> +struct trailing_array_1 *normal,
>

Re: [[GCC13][Patch][V3] 1/2] Add a new option -fstrict-flex-array[=n] and new attribute strict_flex_array

2022-08-26 Thread Richard Biener via Gcc-patches
On Wed, 17 Aug 2022, Qing Zhao wrote:

> Add the following new option -fstrict-flex-array[=n] and a corresponding
> attribute strict_flex_array to GCC:
> 
> '-fstrict-flex-array'
>  Treat the trailing array of a structure as a flexible array member
>  in a stricter way.  The positive form is equivalent to
>  '-fstrict-flex-array=3', which is the strictest.  A trailing array
>  is treated as a flexible array member only when it is declared as a
>  flexible array member per C99 standard onwards.  The negative form
>  is equivalent to '-fstrict-flex-array=0', which is the least
>  strict.  All trailing arrays of structures are treated as flexible
>  array members.
> 
> '-fstrict-flex-array=LEVEL'
>  Treat the trailing array of a structure as a flexible array member
>  in a stricter way.  The value of LEVEL controls the level of
>  strictness.
> 
>  The possible values of LEVEL are the same as for the
>  'strict_flex_array' attribute (*note Variable Attributes::).
> 
>  You can control this behavior for a specific trailing array field
>  of a structure by using the variable attribute 'strict_flex_array'
>  attribute (*note Variable Attributes::).
> 
>  This option is only valid when flexible array member is supported in the
>  language. For ISO C before C99 and ISO C++, there is no language support for
>  the flexible array member at all, so this option will be invalid and a
>  warning will be issued.  When -std=gnu89 is specified or C++ with GNU
>  extension, only the zero-length array extension and one-size array are
>  supported; as a result, LEVEL=3 will be invalid and a warning will be issued.
> 
> 'strict_flex_array (LEVEL)'
>  The 'strict_flex_array' attribute should be attached to the
>  trailing array field of a structure.  It specifies the level of
>  strictness of treating the trailing array field of a structure as a
>  flexible array member.  LEVEL must be an integer between 0 and 3.
> 
>  LEVEL=0 is the least strict level, all trailing arrays of
>  structures are treated as flexible array members.  LEVEL=3 is the
>  strictest level, only when the trailing array is declared as a
>  flexible array member per C99 standard onwards ([]), it is treated
>  as a flexible array member.
> 
>  There are two more levels in between 0 and 3, which are provided to
>  support older codes that use GCC zero-length array extension ([0])
>  or one-size array as flexible array member ([1]): When LEVEL is 1,
>  the trailing array is treated as a flexible array member when it is
>  declared as either [], [0], or [1]; When LEVEL is 2, the trailing
>  array is treated as a flexible array member when it is declared as
>  either [], or [0].
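
The four levels quoted above can be restated as a small predicate.  This
is only a sketch of the documented behaviour, not code from the patch;
the enum and the function name are hypothetical.

```c
#include <stdbool.h>

/* How the trailing array of the structure was declared.  */
enum trailing_decl
{
  DECL_FLEX,   /* int c[];            */
  DECL_ZERO,   /* int c[0];           */
  DECL_ONE,    /* int c[1];           */
  DECL_OTHER   /* int c[N], N > 1     */
};

/* Hypothetical predicate: is a trailing array declared as DECL treated
   as a flexible array member at strictness LEVEL (0 to 3)?  */
static bool
treated_as_flex_array (enum trailing_decl decl, int level)
{
  switch (level)
    {
    case 0:  /* Least strict: every trailing array qualifies.  */
      return true;
    case 1:  /* [], [0] and [1] qualify.  */
      return decl == DECL_FLEX || decl == DECL_ZERO || decl == DECL_ONE;
    case 2:  /* [] and [0] qualify.  */
      return decl == DECL_FLEX || decl == DECL_ZERO;
    case 3:  /* Strictest: only [] qualifies.  */
      return decl == DECL_FLEX;
    default: /* Not a valid level.  */
      return false;
    }
}
```

For instance, a trailing int c[4] qualifies only at level 0, while int c[]
qualifies at every level.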
> 
>  This attribute can be used with or without '-fstrict-flex-array'.
>  When both the attribute and the option present at the same time,
>  the level of the strictness for the specific trailing array field
>  is determined by the attribute.
> 
>  This attribute is only valid when flexible array member is supported in the
>  language. For ISO C before C99 and ISO C++, there is no language support for
>  the flexible array member at all, so this attribute will be invalid and a
>  warning is issued.  When -std=gnu89 is specified or C++ with GNU extension,
>  only the zero-length array extension and one-size array are supported; as a
>  result, LEVEL=3 will be invalid and a warning is issued.
> 
> gcc/c-family/ChangeLog:
> 
>   * c-attribs.cc (handle_strict_flex_arrays_attribute): New function.
>   (c_common_attribute_table): New item for strict_flex_array.
>   * c-opts.cc (c_common_post_options): Handle the combination of
>   -fstrict-flex-arrays and -std specially.
>   * c.opt: (fstrict-flex-array): New option.
>   (fstrict-flex-array=): New option.
> 
> gcc/c/ChangeLog:
> 
>   * c-decl.cc (flexible_array_member_type_p): New function.
>   (one_element_array_type_p): Likewise.
>   (zero_length_array_type_p): Likewise.
>   (add_flexible_array_elts_to_size): Call new utility
>   routine flexible_array_member_type_p.
>   (is_flexible_array_member_p): New function.
>   (finish_struct): Set the new DECL_NOT_FLEXARRAY flag.
> 
> gcc/cp/ChangeLog:
> 
>   * module.cc (trees_out::core_bools): Stream out new bit
>   decl_not_flexarray.
>   (trees_in::core_bools): Stream in new bit decl_not_flexarray.
> 
> gcc/ChangeLog:
> 
>   * doc/extend.texi: Document strict_flex_array attribute.
>   * doc/invoke.texi: Document -fstrict-flex-array[=n] option.
>   * print-tree.cc (print_node): Print new bit decl_not_flexarray.
>   * tree-core.h (struct tree_decl_common): New bit field
>   decl_not_flexarray.
>   * tree-streamer-in.cc (unpack_ts_decl_common_value_fields): Stream
>   in new bit decl_not_flexarray.
>   * tree-streamer-out.cc (pack

Re: [PATCH] d: Fix #error You must define PREFERRED_DEBUGGING_TYPE if DWARF is not supported (PR105659)

2022-08-26 Thread Richard Biener via Gcc-patches
On Tue, Aug 16, 2022 at 11:17 PM Iain Buclaw via Gcc-patches
 wrote:
>
> Hi,
>
> Because targetdm contains hooks pertaining to both the target platform
> and cpu, it tries to pull in both platform and cpu headers via tm_d.h in
> the source file where TARGETDM_INITIALIZER is used.
>
> Since 12.0, this has caused problems when there is no platform (*-elf),
> resulting in default-d.cc failing to build due to triggering a
> PREFERRED_DEBUGGING_TYPE #error.
>
> This patch removes the CPU-specific hooks from targetdm, documenting
> them instead as target macros.  Also removing the generation of tm_d.h
> as its role is redundant.
>
> I also notice that Rust maintainers initially copied what I did in
> devel/rust/master, but ended up reverting back to using macros to get at
> target OS and CPU information as well, possibly because they ran into
> the same problems as reported in PR105659.
>
> I'm not sure whether calling these hooks via function-like macros is
> really desirable, I do recall early on during the review process of the
> D front-end that putting target-specific language features behind a
> targetdm hook was the preferred/encouraged way to expose these things.
>
> One alternative perhaps would be to break out CPU-specific hooks in
> targetdm into a separate targetdm_cpu hook vector.  This would mean
> there would be no need to include tm_p.h anywhere in D-specific target
> sources (only tm.h where needed), and all D-specific prototypes in
> $cpu_type-protos.h can be removed.  Though tm_d.h would still be
> redundant, so either way it gets the chop.
>
> OK? Thoughts?  I don't expect this to go in for 12.2, but backporting
> some time before 12.3 would be nice.

I was hoping Joseph would chime in here - I recollect debugging this kind
of thing and a thread about this a while back but unfortunately I do not
remember the details here (IIRC some things get included where they
better should not be).

This is all D specific changes so I'd say OK for trunk unless Joseph has
any comments.  Please wait a while before backporting so we can see
if it breaks any configuration.

Thanks,
Richard.

> Bootstrapped and regression tested on x86_64-linux-gnu, and checked that
> it indeed fixes the referenced PR by building an aarch64-rtems cross.
>
> Regards,
> Iain.
>
> ---
> PR d/105659
>
> gcc/ChangeLog:
>
> * Makefile.in (tm_d_file_list): Remove.
> (tm_d_include_list): Remove.
> (TM_D_H): Remove.
> (tm_d.h): Remove.
> (cs-tm_d.h): Remove.
> (generated_files): Remove TM_D_H.
> * config.gcc (tm_d_file): Remove.
> * config/darwin-d.cc: Include memmodel.h and tm_p.h instead of tm_d.h.
> * config/default-d.cc: Remove includes of memmodel.h and tm_d.h.
> * config/dragonfly-d.cc: Include tm_p.h instead of tm_d.h.
> * configure: Regenerate.
> * configure.ac (tm_d_file): Remove.
> (tm_d_file_list): Remove substitution.
> (tm_d_include_list): Remove substitution.
> * doc/tm.texi: Regenerate.
> * doc/tm.texi.in (TARGET_D_CPU_VERSIONS): Document hook as being a
> function-like macro.
> (TARGET_D_REGISTER_CPU_TARGET_INFO): Likewise.
>
> gcc/d/ChangeLog:
>
> * d-builtins.cc: Include memmodel.h and tm_p.h.
> (d_init_versions): Call TARGET_D_CPU_VERSIONS via macro.
> * d-target.cc (Target::_init): Call TARGET_D_REGISTER_CPU_TARGET_INFO
> via macro.
> * d-target.def (d_cpu_versions): Remove hook.
> (d_register_cpu_target_info): Remove hook.
> ---
>  gcc/Makefile.in   | 11 +--
>  gcc/config.gcc|  7 ---
>  gcc/config/darwin-d.cc|  3 ++-
>  gcc/config/default-d.cc   |  9 +++--
>  gcc/config/dragonfly-d.cc |  2 +-
>  gcc/configure | 32 
>  gcc/configure.ac  | 18 --
>  gcc/d/d-builtins.cc   |  6 +-
>  gcc/d/d-target.cc |  4 +++-
>  gcc/d/d-target.def| 22 --
>  gcc/doc/tm.texi   | 22 --
>  gcc/doc/tm.texi.in| 18 --
>  12 files changed, 55 insertions(+), 99 deletions(-)
>
> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> index 203f0a15187..12d9b5a3be4 100644
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -571,8 +571,6 @@ tm_include_list=@tm_include_list@
>  tm_defines=@tm_defines@
>  tm_p_file_list=@tm_p_file_list@
>  tm_p_include_list=@tm_p_include_list@
> -tm_d_file_list=@tm_d_file_list@
> -tm_d_include_list=@tm_d_include_list@
>  build_xm_file_list=@build_xm_file_list@
>  build_xm_include_list=@build_xm_include_list@
>  build_xm_defines=@build_xm_defines@
> @@ -865,7 +863,6 @@ BCONFIG_H = bconfig.h $(build_xm_file_list)
>  CONFIG_H  = config.h  $(host_xm_file_list)
>  TCONFIG_H = tconfig.h $(xm_file_list)
>  TM_P_H= tm_p.h$(tm_p_file_list)
> -TM_D_H= tm_d.h$(tm_d_file_list)
>  GTM_H = tm.h  $(tm_file_list) insn

Re: [PATCH] match.pd: Add abs with bitwise and pattern [PR106243]

2022-08-26 Thread Richard Biener via Gcc-patches
On Wed, Aug 10, 2022 at 7:11 PM Sam Feifer via Gcc-patches
 wrote:
>
> This patch adds a simplification to match.pd that was discussed on the
> thread for pr106243. It simplifies the pattern, abs(x) & 1, to x & 1.
>
> There are also tests for the simplification in this patch. I couldn't
> figure out how to get abs to work with vectors. If a test for that is
> necessary, could I get some guidance on using abs with vector types?
>
> Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

OK, and sorry for the delay.

Thanks,
Richard.
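
A quick way to convince oneself that the quoted simplification
(abs(x) & 1 to x & 1) is sound is to check the identity over a range of
values.  The sketch below assumes a two's complement int, and the helper
name is hypothetical; INT_MIN must stay out of the range because
abs (INT_MIN) overflows.

```c
#include <stdlib.h>

/* Return 1 if (abs (x) & 1) == (x & 1) holds for every x in [lo, hi],
   0 otherwise.  The caller must keep INT_MIN out of the range and keep
   hi below INT_MAX so that the loop counter does not overflow.  */
static int
abs_and_one_matches (int lo, int hi)
{
  for (int x = lo; x <= hi; x++)
    if ((abs (x) & 1) != (x & 1))
      return 0;
  return 1;
}
```

Negation leaves the lowest bit unchanged in two's complement, which is
the same reason the existing (negate @0) pattern next to the new one is
valid.  The identity does not extend to wider masks: abs (-3) & -1 is 3,
whereas -3 & -1 is -3.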

> PR tree-optimization/106243
>
> gcc/ChangeLog:
>
> * match.pd (abs(x) & 1): New simplification.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/pr106243-2.c: New test.
> * gcc.dg/pr106243-3.c: New test.
> ---
>  gcc/match.pd  |  5 +
>  gcc/testsuite/gcc.dg/pr106243-2.c | 31 +++
>  gcc/testsuite/gcc.dg/pr106243-3.c | 18 ++
>  3 files changed, 54 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.dg/pr106243-2.c
>  create mode 100644 gcc/testsuite/gcc.dg/pr106243-3.c
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index f82f94ad1fe..c04e70f34c1 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -8071,3 +8071,8 @@ and,
>  (simplify
>(bit_and (negate @0) integer_onep@1)
>(bit_and @0 @1))
> +
> +/* abs(x) & 1 -> x & 1.  */
> +(simplify
> +  (bit_and (abs @0) integer_onep@1)
> +  (bit_and @0 @1))
> diff --git a/gcc/testsuite/gcc.dg/pr106243-2.c b/gcc/testsuite/gcc.dg/pr106243-2.c
> new file mode 100644
> index 000..27e66f59160
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr106243-2.c
> @@ -0,0 +1,31 @@
> +/* PR tree-optimization/106243 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized" } */
> +
> +#include <stdlib.h>
> +
> +__attribute__((noipa)) int foo (int x) {
> +return abs(x) & 1;
> +}
> +
> +__attribute__((noipa)) int bar (int x) {
> +return (0 - abs(x)) & 1;
> +}
> +
> +/* Commutative property.  */
> +__attribute__((noipa)) int baz (int x) {
> +return 1 & abs(x);
> +}
> +
> +/* Forward propagation.  */
> +__attribute__((noipa)) int qux (int x) {
> +int y = abs(x);
> +return y & 1;
> +}
> +
> +/* Should not simplify.  */
> +__attribute__((noipa)) int thud (int x) {
> +return abs(x) & -1;
> +}
> +
> +/* { dg-final {scan-tree-dump-times " ABS_EXPR " 1 "optimized" } } */
> diff --git a/gcc/testsuite/gcc.dg/pr106243-3.c b/gcc/testsuite/gcc.dg/pr106243-3.c
> new file mode 100644
> index 000..68800868751
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr106243-3.c
> @@ -0,0 +1,18 @@
> +/* PR tree-optimization/106243 */
> +/* { dg-do run } */
> +/* { dg-options "-O2" } */
> +
> +#include "pr106243-2.c"
> +
> +int main () {
> +
> +if (foo(6) != 0
> +|| bar(-3) != 1
> +|| baz(32) != 0
> +|| qux(-128) != 0
> +|| foo (127) != 1) {
> +__builtin_abort();
> +}
> +
> +return 0;
> +}
>
> base-commit: be58bf98e98bb431ed26ca8be84586075fe8be82
> --
> 2.31.1
>


Re: [PATCH] vect: Tighten get_related_vectype_for_scalar_type

2022-08-26 Thread Richard Biener via Gcc-patches
On Thu, 25 Aug 2022, Richard Sandiford wrote:

> Builds of glibc with SVE enabled have been failing since V1DI was added
> to the aarch64 port.  The problem is that BB SLP starts the (hopeless)
> attempt to use variable-length modes to vectorise a single-element
> vector, and that now gets further than it did before.
> 
> Initially we tried getting a vector mode with 1 + 1X DI elements
> (i.e. 1 DI per 128-bit vector chunk).  We don't provide such a mode --
> it would be VNx1DI -- because it isn't a native SVE format.  We then
> try just 1 DI, which previously failed but now succeeds.
> 
> There are numerous ways we could fix this.  Perhaps the most obvious
> would be to skip variable-length modes for BB SLP.  However, I think
> that'd just be kicking the can down the road, since eventually we want
> to support BB SLP and VLA vectors using predication.
> 
> However, if we do use VLA vectors for BB SLP, the vector modes
> we use should actually be variable length.  We don't want to use
> variable-length vectors for some element types/group sizes and
> fixed-length vectors for others, since it would be difficult
> to handle the seams.
> 
> The same principle applies during loop vectorisation.  We can't
> use a mixture of variable-length and fixed-length vectors for
> the same loop because the relative unroll/vectorisation factors
> would not be constant (compile-time) multiples of each other.
> 
> This patch therefore makes get_related_vectype_for_scalar_type
> check that the provided number of units is interoperable with
> the provided prevailing mode.  The function is generally quite
> forgiving -- it does basic things like checking for scalarness
> itself rather than expecting callers to do them -- so the new
> check feels in keeping with that.
> 
> This seems to subsume the fix for PR96974.  I'm not sure it's
> worth reverting that code to an assert though, so the patch just
> drops the scan for the associated message.
> 
> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

OK.

Thanks,
Richard.
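
The interoperability rule from the quoted patch can be illustrated with a
toy model of variable-length sizes.  This is not GCC's poly_int
implementation: struct poly_size and both functions are hypothetical, and
the byte size 16 + 16X used for a full SVE vector in the usage note is an
assumption made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a size of c0 + c1 * X bytes, where X is a
   non-negative quantity known only at run time.  */
struct poly_size
{
  int64_t c0;
  int64_t c1;
};

/* True if A == K * B for a single compile-time integer K, whatever the
   value of X turns out to be.  */
static bool
const_multiple_p (struct poly_size a, struct poly_size b)
{
  int64_t k;

  if (b.c0 == 0 && b.c1 == 0)
    return false;
  if (b.c0 != 0)
    {
      if (a.c0 % b.c0 != 0)
        return false;
      k = a.c0 / b.c0;
    }
  else
    {
      if (a.c0 != 0 || a.c1 % b.c1 != 0)
        return false;
      k = a.c1 / b.c1;
    }
  /* Both coefficients must scale by the same K.  */
  return a.c0 == k * b.c0 && a.c1 == k * b.c1;
}

/* Mirror of the new check: two sizes can be mixed only if one is a
   constant multiple of the other.  */
static bool
interoperable_p (struct poly_size a, struct poly_size b)
{
  return const_multiple_p (a, b) || const_multiple_p (b, a);
}
```

Under these assumptions an 8-byte V1DI {8, 0} and a variable-length mode
of {16, 16} bytes are not interoperable, because no single constant
relates them for every X, while {32, 32} against {16, 16} and {16, 0}
against {8, 0} both pass.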

> Richard
> 
> 
> gcc/
>   * tree-vect-stmts.cc (get_related_vectype_for_scalar_type): Check
>   that the requested number of units is interoperable with the requested
>   prevailing mode.
> 
> gcc/testsuite/
>   * gcc.target/aarch64/sve/slp_15.c: New test.
>   * g++.target/aarch64/sve/pr96974.C: Remove scan test.
> ---
>  gcc/testsuite/g++.target/aarch64/sve/pr96974.C |  4 +---
>  gcc/testsuite/gcc.target/aarch64/sve/slp_15.c  | 17 +
>  gcc/tree-vect-stmts.cc | 10 ++
>  3 files changed, 28 insertions(+), 3 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/slp_15.c
> 
> diff --git a/gcc/testsuite/g++.target/aarch64/sve/pr96974.C b/gcc/testsuite/g++.target/aarch64/sve/pr96974.C
> index 54000f568ab..2f6ebd6ce3d 100644
> --- a/gcc/testsuite/g++.target/aarch64/sve/pr96974.C
> +++ b/gcc/testsuite/g++.target/aarch64/sve/pr96974.C
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-Ofast -march=armv8.2-a+sve -fdisable-tree-fre4 -fdump-tree-slp-details" } */
> +/* { dg-options "-Ofast -march=armv8.2-a+sve -fdisable-tree-fre4" } */
>  
>  float a;
>  int
> @@ -14,5 +14,3 @@ struct c {
>  }
>  int coeffs[10];
>  } f;
> -
> -/* { dg-final { scan-tree-dump "Not vectorized: Incompatible number of vector subparts between" "slp1" { target lp64 } } } */
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/slp_15.c b/gcc/testsuite/gcc.target/aarch64/sve/slp_15.c
> new file mode 100644
> index 000..23f6d567cc5
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/slp_15.c
> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-O3" } */
> +
> +struct foo
> +{
> +  void *handle;
> +  void *arg;
> +};
> +
> +void
> +dlinfo_doit (struct foo *args)
> +{
> +  __UINTPTR_TYPE__ **l = args->handle;
> +
> +  *(__UINTPTR_TYPE__ *) args->arg = 0;
> +  *(__UINTPTR_TYPE__ *) args->arg = **l;
> +}
> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> index c9dab217f05..7748c42c70f 100644
> --- a/gcc/tree-vect-stmts.cc
> +++ b/gcc/tree-vect-stmts.cc
> @@ -11486,6 +11486,16 @@ get_related_vectype_for_scalar_type (machine_mode prevailing_mode,
>  
>unsigned int nbytes = GET_MODE_SIZE (inner_mode);
>  
> +  /* Interoperability between modes requires one to be a constant multiple
> + of the other, so that the number of vectors required for each operation
> + is a compile-time constant.  */
> +  if (prevailing_mode != VOIDmode
> +  && !constant_multiple_p (nunits * nbytes,
> +GET_MODE_SIZE (prevailing_mode))
> +  && !constant_multiple_p (GET_MODE_SIZE (prevailing_mode),
> +nunits * nbytes))
> +return NULL_TREE;
> +
>/* For vector types of elements whose mode precision doesn't
>   match their types precision we use a element type of mode
>   precision.  The vectorization rou

[PING] Re: [PATCH, libgomp] Fix chunk_size<1 for dynamic schedule

2022-08-26 Thread Chung-Lin Tang

On 2022/8/4 9:31 PM, Koning, Paul wrote:




On Aug 4, 2022, at 9:17 AM, Chung-Lin Tang  wrote:

On 2022/6/28 10:06 PM, Jakub Jelinek wrote:

On Thu, Jun 23, 2022 at 11:47:59PM +0800, Chung-Lin Tang wrote:

with the way that chunk_size < 1 is handled for gomp_iter_dynamic_next:

(1) chunk_size <= -1: wraps into large unsigned value, seems to work though.
(2) chunk_size == 0:  infinite loop

The (2) behavior is obviously not desired. This patch fixes this by changing

Why?  It is a user error, undefined behavior, we shouldn't slow down valid
code for users who don't bother reading the standard.


This is loop init code, not per-iteration. The overhead really isn't that much.

The question should be, if GCC having infinite loop behavior is reasonable,
even if it is undefined in the spec.


I wouldn't think so.  The way I see "undefined code" is that you can't complain about 
"wrong code" produced by the compiler.  But for the compiler to malfunction on wrong 
input is an entirely different matter.  For one thing, it's hard to fix your code if the compiler 
fails.  How would you locate the offending source line?

paul


Ping?
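
The two cases discussed in this thread can be reproduced with a small model. The sketch below is not libgomp's code: work_share, take_chunk and count_dispatches are hypothetical names, and the dispatcher is assumed to hand out min(chunk, remaining) iterations per call with the chunk size stored as an unsigned value.

```cpp
#include <algorithm>

// Hypothetical, simplified work-share state for a dynamic schedule.
struct work_share { unsigned long next, end, chunk; };

// Hands out [*start, *stop) and returns false once nothing is left.
static bool take_chunk(work_share &ws, unsigned long *start, unsigned long *stop)
{
  if (ws.next == ws.end)
    return false;
  unsigned long left = ws.end - ws.next;
  unsigned long n = std::min(ws.chunk, left);
  *start = ws.next;
  *stop = ws.next + n;
  ws.next += n;   // With chunk == 0 this never advances.
  return true;
}

// Counts how many chunks are dispatched, giving up after `limit` calls
// so that the non-terminating case can be observed safely.
static unsigned long count_dispatches(long chunk_size, unsigned long iters,
                                      unsigned long limit)
{
  // A negative chunk_size wraps to a very large unsigned value here.
  work_share ws{0, iters, static_cast<unsigned long>(chunk_size)};
  unsigned long s, e, calls = 0;
  while (calls < limit && take_chunk(ws, &s, &e))
    ++calls;
  return calls;
}
```

In this model, chunk_size == -1 wraps and the first call takes every remaining iteration, while chunk_size == 0 keeps returning empty ranges until the call limit is reached, which corresponds to cases (1) and (2) in the original report.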


Re: [PATCH] libstdc++: Add test for std::con/disjunction's base class

2022-08-26 Thread Jonathan Wakely via Gcc-patches
On Fri, 26 Aug 2022, 02:06 Patrick Palka via Libstdc++, <
libstd...@gcc.gnu.org> wrote:

> Tested on x86_64-pc-linux-gnu, does this look OK for trunk?
>
> libstdc++-v3/ChangeLog:
>
> * testsuite/20_util/logical_traits/requirements/base_classes.cc:
> New test.
> ---
>  .../requirements/base_classes.cc  | 37 +++
>  1 file changed, 37 insertions(+)
>  create mode 100644
> libstdc++-v3/testsuite/20_util/logical_traits/requirements/base_classes.cc
>
> diff --git
> a/libstdc++-v3/testsuite/20_util/logical_traits/requirements/base_classes.cc
> b/libstdc++-v3/testsuite/20_util/logical_traits/requirements/base_classes.cc
> new file mode 100644
> index 000..6699037bb14
> --- /dev/null
> +++
> b/libstdc++-v3/testsuite/20_util/logical_traits/requirements/base_classes.cc
> @@ -0,0 +1,37 @@
> +// { dg-do compile { target c++17 } }
> +
> +#include 
>

 instead

OK with that change, thanks.


> +
> +using std::true_type;
> +using std::false_type;
> +
> +template struct T : std::true_type { };
> +template struct F : std::false_type { };
> +
> +// [meta.logical]/5: The specialization conjunction<B_1, ..., B_n> has a
> +// public and unambiguous base that is either:
> +//   - the first type B_i in the list true_type, B_1, ..., B_n for which
> +//   bool(B_i::value) is false, or
> +//   - if there is no such Bi, the last type in the list.
> +
> +static_assert(std::is_base_of_v>);
> +static_assert(std::is_base_of_v, std::conjunction>>);
> +static_assert(std::is_base_of_v, std::conjunction>>);
> +static_assert(std::is_base_of_v, std::conjunction, T<1>>>);
> +static_assert(std::is_base_of_v, std::conjunction, F<1>>>);
> +static_assert(std::is_base_of_v, std::conjunction, F<0>,
> F<1>>>);
> +static_assert(std::is_base_of_v, std::conjunction, F<0>, T<1>,
> F<1>>>);
> +
> +// [meta.logical]/10: The specialization disjunction<B_1, ..., B_n> has a
> +// public and unambiguous base that is either:
> +//   - the first type B_i in the list false_type, B_1, ..., B_n for which
> +//   bool(B_i::value) is true, or
> +//   - if there is no such Bi, the last type in the list.
> +
> +static_assert(std::is_base_of_v>);
> +static_assert(std::is_base_of_v, std::disjunction>>);
> +static_assert(std::is_base_of_v, std::disjunction>>);
> +static_assert(std::is_base_of_v, std::disjunction, T<1>>>);
> +static_assert(std::is_base_of_v, std::disjunction, F<1>>>);
> +static_assert(std::is_base_of_v, std::disjunction, F<0>,
> F<1>>>);
> +static_assert(std::is_base_of_v, std::disjunction, F<0>, T<1>,
> F<1>>>);
> --
> 2.37.2.382.g795ea8776b
>
>
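
The base-class rule quoted in the test's comments can be checked with a few compile-time assertions. This is a minimal sketch rather than the text of the patch: the helper templates T and F follow the naming in the quoted test, but the assertions are chosen here for illustration and assume a C++17 compiler.

```cpp
#include <type_traits>

// Distinct true and false types, so the selected base can be told apart.
template<int> struct T : std::true_type { };
template<int> struct F : std::false_type { };

// conjunction: the base is the first B_i whose value is false,
// otherwise the last type in the list true_type, B_1, ..., B_n.
static_assert(std::is_base_of_v<std::true_type, std::conjunction<>>);
static_assert(std::is_base_of_v<F<0>, std::conjunction<T<0>, F<0>, T<1>>>);
static_assert(std::is_base_of_v<T<1>, std::conjunction<T<0>, T<1>>>);

// disjunction: the base is the first B_i whose value is true,
// otherwise the last type in the list false_type, B_1, ..., B_n.
static_assert(std::is_base_of_v<std::false_type, std::disjunction<>>);
static_assert(std::is_base_of_v<T<0>, std::disjunction<F<0>, T<0>, T<1>>>);
static_assert(std::is_base_of_v<F<1>, std::disjunction<F<0>, F<1>>>);
```

Using numbered types matters here: with plain true_type and false_type arguments the test could not distinguish which list element was actually chosen as the base.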


[PATCH] libcpp: Implement P2362R3 - Remove non-encodable wide character literals and multicharacter [PR106647]

2022-08-26 Thread Jakub Jelinek via Gcc-patches
Hi!

My understanding of the paper is that we just want to promote the CPP_WCHAR
"character constant too long for its type" warning to an error, as it is
already an error for u8, u and U literals.
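
As a rough sketch of that decision, the severity selection for wide, char16_t and char32_t literals could be modelled as below. The names char_kind, severity and too_long_severity are hypothetical and are not libcpp's API; the is_cxx23 flag stands in for whatever option the preprocessor uses to detect C++23.

```cpp
// Hypothetical model of the diagnostic-level choice; not libcpp code.
enum class char_kind { wide, utf16, utf32 };
enum class severity { warning, error };

constexpr severity too_long_severity(bool is_cplusplus, bool is_cxx23,
                                     char_kind kind)
{
  // In C++, u'..' and U'..' literals that are too long are already errors;
  // the proposed change adds L'..' literals when compiling as C++23.
  const bool hard_error = is_cplusplus
      && (kind == char_kind::utf16
          || kind == char_kind::utf32
          || (kind == char_kind::wide && is_cxx23));
  return hard_error ? severity::error : severity::warning;
}

static_assert(too_long_severity(true, true, char_kind::wide) == severity::error);
static_assert(too_long_severity(true, false, char_kind::wide) == severity::warning);
static_assert(too_long_severity(false, false, char_kind::wide) == severity::warning);
static_assert(too_long_severity(true, false, char_kind::utf16) == severity::error);
```
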

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2022-08-26  Jakub Jelinek  

PR c++/106647
* charset.cc (wide_str_to_charconst): Implement P2362R3 - Remove
non-encodable wide character literals and multicharacter.  For
C++23 use CPP_DL_ERROR instead of CPP_DL_WARNING for
"character constant too long for its type" diagnostics on CPP_WCHAR
literals.

* g++.dg/cpp23/wchar-multi1.C: New test.
* g++.dg/cpp23/wchar-multi2.C: New test.

--- libcpp/charset.cc.jj	2022-08-25 11:54:38.849924475 +0200
+++ libcpp/charset.cc   2022-08-25 18:36:20.650415220 +0200
@@ -2170,7 +2170,11 @@ wide_str_to_charconst (cpp_reader *pfile
  character constant is guaranteed to overflow.  */
   if (str.len > nbwc * 2)
 cpp_error (pfile, (CPP_OPTION (pfile, cplusplus)
-  && (type == CPP_CHAR16 || type == CPP_CHAR32))
+  && (type == CPP_CHAR16
+  || type == CPP_CHAR32
+  /* In C++23 this is error even for L'ab'.  */
+  || (type == CPP_WCHAR
+  && CPP_OPTION (pfile, size_t_literals))))
  ? CPP_DL_ERROR : CPP_DL_WARNING,
   "character constant too long for its type");
 
--- gcc/testsuite/g++.dg/cpp23/wchar-multi1.C.jj	2022-08-25 18:08:01.973426155 +0200
+++ gcc/testsuite/g++.dg/cpp23/wchar-multi1.C   2022-08-25 18:51:30.476687112 +0200
@@ -0,0 +1,42 @@
+// P2362R3 - Remove non-encodable wide character literals and multicharacter
+// wide character literals.
+// { dg-do compile }
+
+char a = 'a';
+int b = 'ab';  // { dg-warning "multi-character character constant" }
+int c = '\u05D9';  // { dg-warning "multi-character character constant" }
+#if __SIZEOF_INT__ > 2
+int d = '\U0001F525';  // { dg-warning "multi-character character constant" "" { target int32 } }
+#endif
+int e = 'abcd'; // { dg-warning "multi-character character constant" }
+wchar_t f = L'f';
+wchar_t g = L'gh'; // { dg-error "character constant too long for its type" "" { target c++23 } }
+   // { dg-warning "character constant too long for its type" "" { target c++20_down } .-1 }
+wchar_t h = L'ijkl';   // { dg-error "character constant too long for its type" "" { target c++23 } }
+   // { dg-warning "character constant too long for its type" "" { target c++20_down } .-1 }
+wchar_t i = L'\U0001F525'; // { dg-error "character constant too long for its type" "" { target { c++23 && { ! 4byte_wchar_t } } } }
+   // { dg-warning "character constant too long for its type" "" { target { c++20_down && { ! 4byte_wchar_t } } } .-1 }
+#ifdef __cpp_char8_t
+typedef char8_t u8;
+#else
+typedef char u8;
+#endif
+#if __cpp_unicode_characters >= 201411 
+u8 j = u8'j';
+u8 k = u8'kl'; // { dg-error "character constant too long for its type" "" { target c++17 } }
+u8 l = u8'\U0001F525'; // { dg-error "character constant too long for its type" "" { target c++17 }  }
+#endif
+#if __cpp_unicode_characters >= 200704
+char16_t m = u'm';
+char16_t n = u'no'; // { dg-error "character constant too long for its type" "" { target c++11 } }
+char16_t o = u'\u05D9';
+char16_t p = u'\U0001F525'; // { dg-error "character constant too long for its type" "" { target c++11 } }
+char32_t q = U'm';
+char32_t r = U'no'; // { dg-error "character constant too long for its type" "" { target c++11 } }
+char32_t s = U'\u05D9';
+char32_t t = U'\U0001F525';
+#endif
+wchar_t u = L'\u0065\u0301';   // { dg-error "character constant too long for its type" "" { target c++23 } }
+   // { dg-warning "character constant too long for its type" "" { target c++20_down } .-1 }
+wchar_t v = L'é'; // { dg-error "character constant too long for 
its type" "" { target c++23 } }
+   // { dg-warning "character constant too long for its type" "" { target c++20_down } .-1 }
--- gcc/testsuite/g++.dg/cpp23/wchar-multi2.C.jj	2022-08-25 18:51:53.744386945 +0200
+++ gcc/testsuite/g++.dg/cpp23/wchar-multi2.C   2022-08-25 18:53:03.317489442 +0200
@@ -0,0 +1,43 @@
+// P2362R3 - Remove non-encodable wide character literals and multicharacter
+// wide character literals.
+// { dg-do compile }
+// { dg-options "-fshort-wchar" }
+
+char a = 'a';
+int b = 'ab';  // { dg-warning "multi-character character constant" }
+int c = '\u05D9';  // { dg-warning "multi-character character constant" }
+#if __SIZEOF_INT__ > 2
+int d = '\U0001F525';   

[PATCH, rs6000] Put dg-options ahead of target selector checks

2022-08-26 Thread HAO CHEN GUI via Gcc-patches
Hi,
  This patch changes the order of test directives in three test cases. Previously,
these three cases failed or were reported as unsupported on some platforms, because
their target selector checks depend on the compile options.

  Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.
Is this okay for trunk? Any recommendations? Thanks a lot.

Thanks
Gui Haochen

ChangeLog
2022-08-26  Haochen Gui  

rs6000: Change the sequence of test directives for some test cases.  Put
dg-options ahead of target selector checks as the compiling options affect the
result of these checks.

gcc/testsuite/
* gcc.target/powerpc/pr92398.p9+.c: Put dg-options ahead of target
selector check.
* gcc.target/powerpc/pr92398.p9-.c: Likewise.
* gcc.target/powerpc/pr93453-1.c: Likewise.


patch.diff
diff --git a/gcc/testsuite/gcc.target/powerpc/pr92398.p9+.c b/gcc/testsuite/gcc.target/powerpc/pr92398.p9+.c
index 72dd1d9a274..4e4fad620e8 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr92398.p9+.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr92398.p9+.c
@@ -1,6 +1,8 @@
-/* { dg-do compile { target { lp64 && has_arch_pwr9 } } } */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mdejagnu-cpu=power9 -mvsx" } */
+/* { dg-require-effective-target has_arch_ppc64 } */
+/* { dg-require-effective-target int128 } */
 /* { dg-require-effective-target powerpc_vsx_ok } */
-/* { dg-options "-O2 -mvsx" } */

 /* { dg-final { scan-assembler-times {\mmtvsrdd\M} 1 } } */
 /* { dg-final { scan-assembler-times {\mxxlnor\M} 1 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr92398.p9-.c b/gcc/testsuite/gcc.target/powerpc/pr92398.p9-.c
index bd7fa98af51..4e6a8c8cb8e 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr92398.p9-.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr92398.p9-.c
@@ -1,6 +1,8 @@
-/* { dg-do compile { target { lp64 && {! has_arch_pwr9} } } } */
-/* { dg-require-effective-target powerpc_vsx_ok } */
 /* { dg-options "-O2 -mvsx" } */
+/* { dg-do compile { target { ! has_arch_pwr9 } } } */
+/* { dg-require-effective-target int128 } */
+/* { dg-require-effective-target has_arch_ppc64 } */
+/* { dg-require-effective-target powerpc_vsx_ok } */

 /* { dg-final { scan-assembler-times {\mnot\M} 2 { xfail be } } } */
 /* { dg-final { scan-assembler-times {\mstd\M} 2 { xfail { { {! has_arch_pwr9} && has_arch_pwr8 } && be } } } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr93453-1.c b/gcc/testsuite/gcc.target/powerpc/pr93453-1.c
index b396458ba12..6f4d899c114 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr93453-1.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr93453-1.c
@@ -1,5 +1,6 @@
-/* { dg-do compile { target has_arch_ppc64 } } */
+/* { dg-do compile } */
 /* { dg-options "-mdejagnu-cpu=power6 -O2" } */
+/* { dg-require-effective-target has_arch_ppc64 } */

 unsigned long load_byte_reverse (unsigned long *in)
 {



Re: [PATCH] 32-bit PA-RISC with HP-UX: remove deprecated ports

2022-08-26 Thread Martin Liška
On 8/26/22 09:20, Andreas Schwab wrote:
> On Aug 26 2022, Martin Liška wrote:
> 
>> diff --git a/config.rpath b/config.rpath
> 
> This is part of gettext.
> 

Ah, thanks, but how should one know that :P ?

Cheers,
Martin


Re: [PATCH] 32-bit PA-RISC with HP-UX: remove deprecated ports

2022-08-26 Thread Andreas Schwab
On Aug 26 2022, Martin Liška wrote:

> diff --git a/config.rpath b/config.rpath

This is part of gettext.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."