[PATCH] c: Handle initializations of opaque types [PR106016]

2022-06-17 Thread Peter Bergner via Gcc-patches
The initial commit that added opaque types thought that there couldn't
be any valid initializations for variables of these types, but the test
case in the bug report shows that isn't true.  The solution is to handle
OPAQUE_TYPE initializations just like the other scalar types.

This passed bootstrap and regtesting with no regressions on powerpc64le-linux.
Ok for trunk?  This is an issue in GCC 12 and 11 too.  Ok for the release
branches after some burn-in on trunk?

Peter

gcc/
PR c/106016
* expr.cc (count_type_elements): Handle OPAQUE_TYPE.

gcc/testsuite/
PR c/106016
* gcc.target/powerpc/pr106016.c: New test.

diff --git a/gcc/expr.cc b/gcc/expr.cc
index 78c839ab425..1675198a146 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -6423,13 +6423,13 @@ count_type_elements (const_tree type, bool for_ctor_p)
 case OFFSET_TYPE:
 case REFERENCE_TYPE:
 case NULLPTR_TYPE:
+case OPAQUE_TYPE:
   return 1;
 
 case ERROR_MARK:
   return 0;
 
 case VOID_TYPE:
-case OPAQUE_TYPE:
 case METHOD_TYPE:
 case FUNCTION_TYPE:
 case LANG_TYPE:
diff --git a/gcc/testsuite/gcc.target/powerpc/pr106016.c 
b/gcc/testsuite/gcc.target/powerpc/pr106016.c
new file mode 100644
index 000..3db8345dcc6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr106016.c
@@ -0,0 +1,14 @@
+/* PR target/106016 */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=power10" } */
+
+/* Make sure we do not ICE on the following test case.  */
+
+extern void bar (__vector_quad *);
+
+void
+foo (__vector_quad *a, __vector_quad *b)
+{
+  __vector_quad arr[2] = {*a, *b};
+  bar (&arr[0]);
+}


[PATCH] libcpp: Support raw strings with newlines while processing directives [PR55971]

2022-06-17 Thread Lewis Hyatt via Gcc-patches
Hello-

The attached fixes PR preprocessor/55971:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55971

This is the issue that we don't currently allow raw string literals containing
embedded newlines to appear in #define.  With the patch, they can be used in
any preprocessing directive.

While I was in there, I also added support for the argument to _Pragma() to be
a raw string, which was not supported (with or without newlines) previously,
but I think is nice to have.

Bootstrap + regtest all languages on x86-64 Linux looks good, with no new
failures and the new testcases passing (totals before and after the patch):

             before    after
FAIL            103      103
PASS         542454   542574
UNSUPPORTED   15248    15248
UNTESTED        136      136
XFAIL          4166     4166
XPASS            17       17

Please let me know if it looks OK?
Thanks!

-Lewis
[PATCH] libcpp: Support raw strings with newlines while processing directives 
[PR55971]

It's not currently possible to use a C++11 raw string containing a newline as
part of the definition of a macro, or in any other preprocessing directive,
such as:

 #define X R"(two
lines)"

 #error R"(this error has
two lines)"

This patch adds support for that by relaxing the conditions under which
_cpp_get_fresh_line() refuses to get a new line. For the case of lexing a raw
string, it's OK to do so as long as there is another line within the current
buffer. The code in cpp_get_fresh_line() was refactored into a new function
get_fresh_line_impl(), so that the new logic is applied only when processing a
raw string and not any other times.

gcc -E needed a small tweak now that it's possible to get a token from macro
expansion which contains a newline: c-ppoutput.cc needs to detect that case and
avoid incrementing its internal line counter, otherwise it erroneously prints a
line change marker after printing the expansion of a macro with an embedded
newline.

I have added testcases for all preprocessing directives to make sure they are
OK with these kinds of raw strings. While doing that it became apparent that
we do not currently accept a raw string (with or without embedded newlines) as
the argument to _Pragma(). That was pretty straightforward to add, so I have
done that as well, since it seems potentially handy to avoid needing to escape
all the quotes inside the pragma, plus clang accepts this as well.
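
For illustration (this example is mine, not one of the new tests): with the
patch, the operand of _Pragma may itself be a raw string, so the inner quotes
need no escaping.  Compiled as C++11 or GNU C:

  _Pragma (R"(GCC diagnostic ignored "-Wunused-variable")")

  void f (void) { int unused_var; }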

PR preprocessor/55971

libcpp/ChangeLog:

* directives.cc (destringize_and_run): Support C++11 raw strings
as the argument to _Pragma().
* lex.cc (get_fresh_line_impl): New function refactoring the code
from...
(_cpp_get_fresh_line): ...here.
(lex_raw_string): Use the new version of get_fresh_line_impl() to
support raw strings containing new lines when processing a directive.

gcc/testsuite/ChangeLog:

* c-c++-common/raw-string-directive-1.c: New test.
* c-c++-common/raw-string-directive-2.c: New test.

gcc/c-family/ChangeLog:

* c-ppoutput.cc (token_streamer::stream): Don't call
account_for_newlines() if the tokens came from macro expansion.

diff --git a/gcc/c-family/c-ppoutput.cc b/gcc/c-family/c-ppoutput.cc
index 9de46a9655f..4f3576fa273 100644
--- a/gcc/c-family/c-ppoutput.cc
+++ b/gcc/c-family/c-ppoutput.cc
@@ -292,11 +292,13 @@ token_streamer::stream (cpp_reader *pfile, const 
cpp_token *token,
   print.printed = true;
 }
 
-  /* CPP_COMMENT tokens and raw-string literal tokens can have
- embedded new-line characters.  Rather than enumerating all the
- possible token types just check if token uses val.str union
- member.  */
-  if (cpp_token_val_index (token) == CPP_TOKEN_FLD_STR)
+  /* CPP_COMMENT tokens and raw-string literal tokens can have embedded
+ new-line characters.  Rather than enumerating all the possible token
+ types, just check if token uses val.str union member.  If the token came
+ from a macro expansion, then no adjustment should be made since the
+ new-line characters did not appear in the source.  */
+  if (cpp_token_val_index (token) == CPP_TOKEN_FLD_STR
+  && !from_macro_expansion_at (loc))
 account_for_newlines (token->val.str.text, token->val.str.len);
 }
 
diff --git a/gcc/testsuite/c-c++-common/raw-string-directive-1.c 
b/gcc/testsuite/c-c++-common/raw-string-directive-1.c
new file mode 100644
index 000..810f11256fa
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/raw-string-directive-1.c
@@ -0,0 +1,77 @@
+/* { dg-do compile } */
+/* { dg-options "-std=gnu99" { target c } } */
+/* { dg-options "-std=c++11" { target c++ } } */
+
+/* Test that multi-line raw strings are lexed OK for all preprocessing
+   directives where one could appear. Test raw-string-directive-2.c
+   checks that #define is also processed properly.  */
+
+/* Note that in cases where we cause GCC to produce a multi-line error
+   message, we construct the string so that the second line looks enough
+   like an error message for DejaGNU to process it as such, so that we
+   can use dg-warning or dg-error directives to check for it.  */
+

Re: [PATCH] libgo: Recognize off64_t / loff_t type definition of musl libc

2022-06-17 Thread Ian Lance Taylor via Gcc-patches
On Tue, May 31, 2022 at 12:41 PM Sören Tempel  wrote:
>
> PING.
>
> If there is anything else that needs to be addressed please let me know.

Thanks.  Committed as follows.  Sorry for the delay.

Ian
e584afe7976a40df42eed4df6ce6852abab74030
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index 0cda305c648..4b75dd37355 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-8db6b78110f84e22c409f334aeaefb80a8b39917
+a409e049737ec9a358a19233e017d957db3d6d2a
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/libgo/configure.ac b/libgo/configure.ac
index 7e2b98ba67c..bac58b07b41 100644
--- a/libgo/configure.ac
+++ b/libgo/configure.ac
@@ -579,7 +579,7 @@ AC_C_BIGENDIAN
 
 GCC_CHECK_UNWIND_GETIPINFO
 
-AC_CHECK_HEADERS(port.h sched.h semaphore.h sys/file.h sys/mman.h syscall.h 
sys/epoll.h sys/event.h sys/inotify.h sys/ptrace.h sys/syscall.h sys/sysctl.h 
sys/user.h sys/utsname.h sys/select.h sys/socket.h net/bpf.h net/if.h 
net/if_arp.h net/route.h netpacket/packet.h sys/prctl.h sys/mount.h sys/vfs.h 
sys/statfs.h sys/timex.h sys/sysinfo.h utime.h linux/ether.h linux/fs.h 
linux/ptrace.h linux/reboot.h netinet/in_syst.h netinet/ip.h 
netinet/ip_mroute.h netinet/if_ether.h lwp.h)
+AC_CHECK_HEADERS(fcntl.h port.h sched.h semaphore.h sys/file.h sys/mman.h 
syscall.h sys/epoll.h sys/event.h sys/inotify.h sys/ptrace.h sys/syscall.h 
sys/sysctl.h sys/user.h sys/utsname.h sys/select.h sys/socket.h net/bpf.h 
net/if.h net/if_arp.h net/route.h netpacket/packet.h sys/prctl.h sys/mount.h 
sys/vfs.h sys/statfs.h sys/timex.h sys/sysinfo.h utime.h linux/ether.h 
linux/fs.h linux/ptrace.h linux/reboot.h netinet/in_syst.h netinet/ip.h 
netinet/ip_mroute.h netinet/if_ether.h lwp.h)
 
 AC_CHECK_HEADERS([netinet/icmp6.h], [], [],
 [#include 
@@ -601,7 +601,11 @@ AC_STRUCT_DIRENT_D_TYPE
 
 AC_CHECK_FUNCS(accept4 dup3 epoll_create1 faccessat fallocate fchmodat 
fchownat futimesat getxattr inotify_add_watch inotify_init inotify_init1 
inotify_rm_watch listxattr mkdirat mknodat open64 openat pipe2 removexattr 
renameat setxattr sync_file_range splice syscall tee unlinkat unshare utimensat)
 AC_TYPE_OFF_T
-AC_CHECK_TYPES([loff_t])
+
+CFLAGS_hold="$CFLAGS"
+CFLAGS="$OSCFLAGS $CFLAGS"
+AC_CHECK_TYPES([loff_t], [], [], [[#include ]])
+CFLAGS="$CFLAGS_hold"
 
 LIBS_hold="$LIBS"
 LIBS="$LIBS -lm"
diff --git a/libgo/go/syscall/libcall_linux.go 
b/libgo/go/syscall/libcall_linux.go
index 7bec2fbaeb5..19ae4393cf1 100644
--- a/libgo/go/syscall/libcall_linux.go
+++ b/libgo/go/syscall/libcall_linux.go
@@ -210,20 +210,20 @@ func Gettid() (tid int) {
 //sys  Setxattr(path string, attr string, data []byte, flags int) (err error)
 //setxattr(path *byte, name *byte, value *byte, size Size_t, flags _C_int) 
_C_int
 
-//sys  splice(rfd int, roff *_loff_t, wfd int, woff *_loff_t, len int, flags 
int) (n int64, err error)
-//splice(rfd _C_int, roff *_loff_t, wfd _C_int, woff *_loff_t, len Size_t, 
flags _C_uint) Ssize_t
+//sys  splice(rfd int, roff *_libgo_loff_t_type, wfd int, woff 
*_libgo_loff_t_type, len int, flags int) (n int64, err error)
+//splice(rfd _C_int, roff *_libgo_loff_t_type, wfd _C_int, woff 
*_libgo_loff_t_type, len Size_t, flags _C_uint) Ssize_t
 
 func Splice(rfd int, roff *int64, wfd int, woff *int64, len int, flags int) (n 
int64, err error) {
-   var lroff _loff_t
-   var plroff *_loff_t
+   var lroff _libgo_loff_t_type
+   var plroff *_libgo_loff_t_type
if roff != nil {
-   lroff = _loff_t(*roff)
+   lroff = _libgo_loff_t_type(*roff)
   plroff = &lroff
}
-   var lwoff _loff_t
-   var plwoff *_loff_t
+   var lwoff _libgo_loff_t_type
+   var plwoff *_libgo_loff_t_type
if woff != nil {
-   lwoff = _loff_t(*woff)
+   lwoff = _libgo_loff_t_type(*woff)
   plwoff = &lwoff
}
n, err = splice(rfd, plroff, wfd, plwoff, len, flags)
diff --git a/libgo/mksysinfo.sh b/libgo/mksysinfo.sh
index 0c52ea5d71a..5aa309155c3 100755
--- a/libgo/mksysinfo.sh
+++ b/libgo/mksysinfo.sh
@@ -403,11 +403,7 @@ fi
 # Some basic types.
 echo 'type Size_t _size_t' >> ${OUT}
 echo "type Ssize_t _ssize_t" >> ${OUT}
-if grep '^const _HAVE_OFF64_T = ' gen-sysinfo.go > /dev/null 2>&1; then
-  echo "type Offset_t _off64_t" >> ${OUT}
-else
-  echo "type Offset_t _off_t" >> ${OUT}
-fi
+echo "type Offset_t _libgo_off_t_type" >> ${OUT}
 echo "type Mode_t _mode_t" >> ${OUT}
 echo "type Pid_t _pid_t" >> ${OUT}
 echo "type Uid_t _uid_t" >> ${OUT}
diff --git a/libgo/sysinfo.c b/libgo/sysinfo.c
index 8ce061e2f5f..a4259c02ded 100644
--- a/libgo/sysinfo.c
+++ b/libgo/sysinfo.c
@@ -357,6 +357,18 @@ enum {
 };
 #endif
 
+#if defined(HAVE_LOFF_T)
+// loff_t can be defined as a macro; for -fgo-dump-spec make sure we
+// see a typedef.
+typedef loff_t libgo_loff_t_type;
+#endif
+
+#if defined(HAVE_OFF64_T)
+typedef off64_t 

[PATCH RFA] ubsan: do return check with -fsanitize=unreachable

2022-06-17 Thread Jason Merrill via Gcc-patches
Related to PR104642, the current situation where we get less return checking
with just -fsanitize=unreachable than with no sanitize flags at all seems
undesirable; I propose that we do return checking when -fsanitize=unreachable
is enabled.

Looks like clang just traps on missing return if not -fsanitize=return, but
the approach in this patch seems more helpful to me if we're already
sanitizing other should-be-unreachable code.
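
For example (my sketch, along the lines of the new return-8c.C test below):

  bool b;
  int f () { if (b) return 42; }   // can flow off the end without a value
  int main () { f (); }

With the patch, compiling this with just -O -fsanitize=unreachable diagnoses
the missing return at run time, because -fsanitize=return is implied.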

I'm assuming that the difference in treatment of SANITIZE_UNREACHABLE and
SANITIZE_RETURN with regard to loop optimization is deliberate.

This assumes Jakub's -fsanitize-trap patch.

gcc/ChangeLog:

* doc/invoke.texi: Note that -fsanitize=unreachable implies
-fsanitize=return.
* opts.cc (finish_options): Make that so.

gcc/cp/ChangeLog:

* cp-gimplify.cc (cp_maybe_instrument_return): Remove
return vs. unreachable handling.

gcc/testsuite/ChangeLog:

* g++.dg/ubsan/return-8c.C: New test.
---
 gcc/doc/invoke.texi|  2 ++
 gcc/cp/cp-gimplify.cc  | 12 
 gcc/opts.cc| 10 ++
 gcc/testsuite/g++.dg/ubsan/return-8c.C | 15 +++
 4 files changed, 27 insertions(+), 12 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/ubsan/return-8c.C

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 50f57877477..e572158a1ba 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -15946,6 +15946,8 @@ built with this option turned on will issue an error 
message
 when the end of a non-void function is reached without actually
 returning a value.  This option works in C++ only.
 
+This check is also enabled by -fsanitize=unreachable.
+
 @item -fsanitize=signed-integer-overflow
 @opindex fsanitize=signed-integer-overflow
 This option enables signed integer overflow checking.  We check that
diff --git a/gcc/cp/cp-gimplify.cc b/gcc/cp/cp-gimplify.cc
index 6f84d157c98..5c2eb61842c 100644
--- a/gcc/cp/cp-gimplify.cc
+++ b/gcc/cp/cp-gimplify.cc
@@ -1806,18 +1806,6 @@ cp_maybe_instrument_return (tree fndecl)
   || !targetm.warn_func_return (fndecl))
 return;
 
-  if (!sanitize_flags_p (SANITIZE_RETURN, fndecl)
-  /* Don't add __builtin_unreachable () if not optimizing, it will not
-improve any optimizations in that case, just break UB code.
-Don't add it if -fsanitize=unreachable -fno-sanitize=return either,
-UBSan covers this with ubsan_instrument_return above where sufficient
-information is provided, while the __builtin_unreachable () below
-if return sanitization is disabled will just result in hard to
-understand runtime error without location.  */
-  && (!optimize
- || sanitize_flags_p (SANITIZE_UNREACHABLE, fndecl)))
-return;
-
   tree t = DECL_SAVED_TREE (fndecl);
   while (t)
 {
diff --git a/gcc/opts.cc b/gcc/opts.cc
index 062782ac700..a7b02b0f693 100644
--- a/gcc/opts.cc
+++ b/gcc/opts.cc
@@ -1254,6 +1254,16 @@ finish_options (struct gcc_options *opts, struct 
gcc_options *opts_set,
   if (opts->x_flag_sanitize & ~(SANITIZE_LEAK | SANITIZE_UNREACHABLE))
 opts->x_flag_aggressive_loop_optimizations = 0;
 
+  /* -fsanitize=unreachable implies -fsanitize=return, but without affecting
+  aggressive loop optimizations.  */
+  if ((opts->x_flag_sanitize & (SANITIZE_UNREACHABLE | SANITIZE_RETURN))
+  == SANITIZE_UNREACHABLE)
+{
+  opts->x_flag_sanitize |= SANITIZE_RETURN;
+  if (opts->x_flag_sanitize_trap & SANITIZE_UNREACHABLE)
+   opts->x_flag_sanitize_trap |= SANITIZE_RETURN;
+}
+
   /* Enable -fsanitize-address-use-after-scope if either address sanitizer is
  enabled.  */
   if (opts->x_flag_sanitize
diff --git a/gcc/testsuite/g++.dg/ubsan/return-8c.C 
b/gcc/testsuite/g++.dg/ubsan/return-8c.C
new file mode 100644
index 000..a67f086d452
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ubsan/return-8c.C
@@ -0,0 +1,15 @@
+// PR c++/104642
+
+// -fsanitize=unreachable should imply -fsanitize=return.
+
+// { dg-do run }
+// { dg-shouldfail { *-*-* } }
+// { dg-additional-options "-O -fsanitize=unreachable" }
+
+bool b;
+
+int f() {
+  if (b) return 42;
+} // { dg-warning "-Wreturn-type" }
+
+int main() { f(); }

base-commit: 0f96ac43fa0a5fdbfce317b274233852d5b46d23
prerequisite-patch-id: fa35013a253eae78fe744794172aeed26fe6f166
-- 
2.27.0



Re: [PATCH]middle-end Add optimized float addsub without needing VEC_PERM_EXPR.

2022-06-17 Thread Andrew Pinski via Gcc-patches
On Thu, Jun 16, 2022 at 3:59 AM Tamar Christina via Gcc-patches
 wrote:
>
> Hi All,
>
> For IEEE 754 floating point formats we can replace a sequence of alternative
> +/- with fneg of a wider type followed by an fadd.  This eliminated the need 
> for
> using a permutation.  This patch adds a math.pd rule to recognize and do this
> rewriting.

I don't think this is correct. You don't check the format of the
floating point to make sure this is valid (e.g. REAL_MODE_FORMAT's
signbit_rw/signbit_ro field).
Also would just be better if you do the xor in integer mode (using
signbit_rw field for the correct bit)?
And then making sure the target optimizes the xor to the neg
instruction when needed?
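
To make that concrete, here is a scalar sketch of the sign-bit xor
(illustrative only, not GCC-internal code); signbit_rw is what identifies the
bit to flip for a given mode:

  #include <stdint.h>
  #include <string.h>

  /* Flip the sign of an IEEE-754 binary64 value by xoring the bit that
     signbit_rw would report for DFmode (bit 63 on a typical target).  */
  static double
  flip_sign (double x)
  {
    uint64_t bits;
    memcpy (&bits, &x, sizeof bits);
    bits ^= (uint64_t) 1 << 63;
    memcpy (&x, &bits, sizeof x);
    return x;
  }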

Thanks,
Andrew Pinski



>
> For
>
> void f (float *restrict a, float *restrict b, float *res, int n)
> {
>for (int i = 0; i < (n & -4); i+=2)
> {
>   res[i+0] = a[i+0] + b[i+0];
>   res[i+1] = a[i+1] - b[i+1];
> }
> }
>
> we generate:
>
> .L3:
> ldr q1, [x1, x3]
> ldr q0, [x0, x3]
> fneg    v1.2d, v1.2d
> fadd    v0.4s, v0.4s, v1.4s
> str q0, [x2, x3]
> add x3, x3, 16
> cmp x3, x4
> bne .L3
>
> now instead of:
>
> .L3:
> ldr q1, [x0, x3]
> ldr q2, [x1, x3]
> fadd    v0.4s, v1.4s, v2.4s
> fsub    v1.4s, v1.4s, v2.4s
> tbl v0.16b, {v0.16b - v1.16b}, v3.16b
> str q0, [x2, x3]
> add x3, x3, 16
> cmp x3, x4
> bne .L3
>
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
>
> Thanks to George Steed for the idea.
>
> Ok for master?
>
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
> * match.pd: Add fneg/fadd rule.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/aarch64/simd/addsub_1.c: New test.
> * gcc.target/aarch64/sve/addsub_1.c: New test.
>
> --- inline copy of patch --
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 
> 51b0a1b562409af535e53828a10c30b8a3e1ae2e..af1c98d4a2831f38258d6fc1bbe811c8ee6c7c6e
>  100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -7612,6 +7612,49 @@ and,
>(simplify (reduc (op @0 VECTOR_CST@1))
>  (op (reduc:type @0) (reduc:type @1
>
> +/* Simplify vector floating point operations of alternating sub/add pairs
> +   into using an fneg of a wider element type followed by a normal add.
> +   under IEEE 754 the fneg of the wider type will negate every even entry
> +   and when doing an add we get a sub of the even and add of every odd
> +   elements.  */
> +(simplify
> + (vec_perm (plus:c @0 @1) (minus @0 @1) VECTOR_CST@2)
> + (if (!VECTOR_INTEGER_TYPE_P (type) && !BYTES_BIG_ENDIAN)
> +  (with
> +   {
> + /* Build a vector of integers from the tree mask.  */
> + vec_perm_builder builder;
> + if (!tree_to_vec_perm_builder (, @2))
> +   return NULL_TREE;
> +
> + /* Create a vec_perm_indices for the integer vector.  */
> + poly_uint64 nelts = TYPE_VECTOR_SUBPARTS (type);
> + vec_perm_indices sel (builder, 2, nelts);
> +   }
> +   (if (sel.series_p (0, 2, 0, 2))
> +(with
> + {
> +   machine_mode vec_mode = TYPE_MODE (type);
> +   auto elem_mode = GET_MODE_INNER (vec_mode);
> +   auto nunits = exact_div (GET_MODE_NUNITS (vec_mode), 2);
> +   tree stype;
> +   switch (elem_mode)
> +{
> +case E_HFmode:
> +  stype = float_type_node;
> +  break;
> +case E_SFmode:
> +  stype = double_type_node;
> +  break;
> +default:
> +  return NULL_TREE;
> +}
> +   tree ntype = build_vector_type (stype, nunits);
> +   if (!ntype)
> +return NULL_TREE;
> + }
> + (plus (view_convert:type (negate (view_convert:ntype @1))) @0))
> +
>  (simplify
>   (vec_perm @0 @1 VECTOR_CST@2)
>   (with
> diff --git a/gcc/testsuite/gcc.target/aarch64/simd/addsub_1.c 
> b/gcc/testsuite/gcc.target/aarch64/simd/addsub_1.c
> new file mode 100644
> index 
> ..1fb91a34c421bbd2894faa0dbbf1b47ad43310c4
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/simd/addsub_1.c
> @@ -0,0 +1,56 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target arm_v8_2a_fp16_neon_ok } */
> +/* { dg-options "-Ofast" } */
> +/* { dg-add-options arm_v8_2a_fp16_neon } */
> +/* { dg-final { check-function-bodies "**" "" "" { target { le } } } } */
> +
> +#pragma GCC target "+nosve"
> +
> +/*
> +** f1:
> +** ...
> +**   fneg    v[0-9]+.2d, v[0-9]+.2d
> +**   fadd    v[0-9]+.4s, v[0-9]+.4s, v[0-9]+.4s
> +** ...
> +*/
> +void f1 (float *restrict a, float *restrict b, float *res, int n)
> +{
> +   for (int i = 0; i < (n & -4); i+=2)
> +{
> +  res[i+0] = a[i+0] + b[i+0];
> +  res[i+1] = a[i+1] - b[i+1];
> +}
> +}
> +
> +/*
> +** d1:
> +** ...
> +**   fneg    v[0-9]+.4s, v[0-9]+.4s
> +**   fadd    v[0-9]+.8h, v[0-9]+.8h, v[0-9]+.8h
> +** ...
> +*/
> +void d1 (_Float16 *restrict a, _Float16 

Re: [PATCH] alpha: Introduce target specific store_data_bypass_p function [PR105209]

2022-06-17 Thread Jeff Law via Gcc-patches




On 6/17/2022 9:22 AM, Uros Bizjak via Gcc-patches wrote:

This patch introduces alpha-specific version of store_data_bypass_p that
ignores TRAP_IF that would result in assertion failure (and internal
compiler error) in the generic store_data_bypass_p function.

While at it, also remove ev4_ist_c reservation, store_data_bypass_p
can handle the patterns with multiple sets since some time ago.

2022-06-17  Uroš Bizjak  

gcc/ChangeLog:

 PR target/105209
 * config/alpha/alpha-protos.h (alpha_store_data_bypass_p): New.
 * config/alpha/alpha.cc (alpha_store_data_bypass_p): New function.
 (alpha_store_data_bypass_p_1): Ditto.
 * config/alpha/ev4.md: Use alpha_store_data_bypass_p instead
 of generic store_data_bypass_p.
 (ev4_ist_c): Remove insn reservation.

gcc/testsuite/ChangeLog:

 PR target/105209
 * gcc.target/alpha/pr105209.c: New test.

Tested with a cross-compiler.

Pushed to master
And FWIW it'll get bootstrapped using qemu user mode emulation over the 
weekend.


http://law-sandy.freeddns.org:8080/job/alpha-linux-gnu

jeff

ps.  I think last weekend's run got aborted due to an overheated server.




Re: Modula-2: merge followup (brief update on the progress of the new linking implementation)

2022-06-17 Thread Richard Biener via Gcc-patches



> Am 17.06.2022 um 19:09 schrieb Gaius Mulley via Gcc-patches 
> :
> 
> 
> New linking implementation is complete, gcc bootstraps and hello
> world links.  I'll git push the changes, then test/debug/polish and
> produce new patch sets

Great!

Thanks,
Richard 

> regards,
> Gaius


Re: [PATCH] ubsan: Add -fsanitize-trap= support

2022-06-17 Thread Jason Merrill via Gcc-patches

On 6/17/22 11:34, Jakub Jelinek via Gcc-patches wrote:

On Thu, Jun 16, 2022 at 09:32:02PM +0100, Jonathan Wakely wrote:

It looks like clang has addressed this deficiency now:

https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html#usage


Thanks, that is roughly what I'd implement anyway and apparently they have
it already since 2015, we've added the -fsanitize-undefined-trap-on-error
support back in 2014 and didn't change it since then.

As a small divergence from clang, I chose -fsanitize-undefined-trap-on-error
to be a (deprecated) alias for -fsanitize-trap aka -fsanitize-trap=all
rather than -fsanitize-trap=undefined which seems to be what clang does,
because for a deprecated option it is IMHO more important backwards
compatibility with what gcc did over the past 8 years rather than clang
compatibility.
Some sanitizers (e.g. asan, lsan, tsan) don't support traps,
-fsanitize-trap=address etc. will be rejected (if enabled at the end of
command line), -fno-sanitize-trap= can be specified even for them.
This is similar behavior to -fsanitize-recover=.
One complication is vptr sanitization, which can't easily trap,
as the whole slow path of the checking is inside of libubsan.
Previously, -fsanitize=vptr -fsanitize-undefined-trap-on-error
silently ignored vptr sanitization.
This patch similarly to what clang does will accept
-fsanitize-trap=all or -fsanitize-trap=undefined which enable
the vptr bit as trapping and again that causes silent disabling
of vptr sanitization, while -fsanitize-trap=vptr is rejected
(already during option processing).

So far quickly tested with make check-gcc check-g++ RUNTESTFLAGS=ubsan.exp,
ok for trunk if it passes full bootstrap/regtest?

2022-06-17  Jakub Jelinek  

gcc/
* common.opt (flag_sanitize_trap): New variable.
(fsanitize-trap=, fsanitize-trap): New options.
(fsanitize-undefined-trap-on-error): Change into deprecated alias
for -fsanitize-trap=all.
* opts.h (struct sanitizer_opts_s): Add can_trap member.
* opts.cc (finish_options): Complain about unsupported
-fsanitize-trap= options.
(sanitizer_opts): Add can_trap values to all entries.
(get_closest_sanitizer_option): Ignore -fsanitize-trap=
options which have can_trap false.
(parse_sanitizer_options): Add support for -fsanitize-trap=.
For -fsanitize-trap=all, enable
SANITIZE_UNDEFINED | SANITIZE_UNDEFINED_NONDEFAULT.  Disallow
-fsanitize-trap=vptr here.
(common_handle_option): Handle OPT_fsanitize_trap_ and
OPT_fsanitize_trap.
* sanopt.cc (maybe_optimize_ubsan_null_ifn): Check
flag_sanitize_trap & SANITIZE_{NULL,ALIGNMENT} instead of
flag_sanitize_undefined_trap_on_error.
* gcc.cc (sanitize_spec_function): Use
flag_sanitize & ~flag_sanitize_trap instead of flag_sanitize
and drop use of flag_sanitize_undefined_trap_on_error in
"undefined" handling.
* ubsan.cc (ubsan_instrument_unreachable): Use
flag_sanitize_trap & SANITIZE_??? instead of
flag_sanitize_undefined_trap_on_error.
(ubsan_expand_bounds_ifn, ubsan_expand_null_ifn,
ubsan_expand_objsize_ifn, ubsan_expand_ptr_ifn,
ubsan_build_overflow_builtin, instrument_bool_enum_load,
ubsan_instrument_float_cast, instrument_nonnull_arg,
instrument_nonnull_return, instrument_builtin): Likewise.
* doc/invoke.texi (-fsanitize-trap=, -fsanitize-trap): Document.
(-fsanitize-undefined-trap-on-error): Document as deprecated
alias of -fsanitize-trap.
gcc/c-family/
* c-ubsan.cc (ubsan_instrument_division, ubsan_instrument_shift):
Use flag_sanitize_trap & SANITIZE_??? instead of
flag_sanitize_undefined_trap_on_error.  If 2 sanitizers are involved
and flag_sanitize_trap differs for them, emit __builtin_trap only
for the comparison where trap is requested.
(ubsan_instrument_vla, ubsan_instrument_return): Use
flag_sanitize_trap & SANITIZE_??? instead of
flag_sanitize_undefined_trap_on_error.
gcc/cp/
* cp-ubsan.cc (cp_ubsan_instrument_vptr_p): Use
flag_sanitize_trap & SANITIZE_VPTR instead of
flag_sanitize_undefined_trap_on_error.
gcc/testsuite/
* c-c++-common/ubsan/nonnull-4.c: Use -fsanitize-trap=all
instead of -fsanitize-undefined-trap-on-error.
* c-c++-common/ubsan/div-by-zero-4.c: Use
-fsanitize-trap=signed-integer-overflow instead of
-fsanitize-undefined-trap-on-error.
* c-c++-common/ubsan/overflow-add-4.c: Use -fsanitize-trap=undefined
instead of -fsanitize-undefined-trap-on-error.
* c-c++-common/ubsan/pr56956.c: Likewise.
* c-c++-common/ubsan/pr68142.c: Likewise.
* c-c++-common/ubsan/pr80932.c: Use
-fno-sanitize-trap=all -fsanitize-trap=shift,undefined
instead of -fsanitize-undefined-trap-on-error.
* 

Re: [PATCH, rs6000] Use CC for BCD operations [PR100736]

2022-06-17 Thread Segher Boessenkool
Hi!

On Fri, Jun 17, 2022 at 04:19:37PM +0800, HAO CHEN GUI wrote:
> On 16/6/2022 下午 7:24, Segher Boessenkool wrote:
> > You shouldn't need anything like this, bcdinvalid will work just fine if
> > written as bcdadd_ov (with vector of 0 as the second op)?
> 
> The vector of 0 is not equal to BCD 0, I think. The BCD number contains
> preferred sign (PS) bit. So all zeros itself is an invalid encoding. We may
> use bcdsub_ov with duplicated op to implement bcdinvalid.

For the machine, you should use 0x0c or 0x0f, sure.  But in RTL we can
do whatever we want.

But bcdsub is easier indeed, and we don't need to construct a 0 first
then.


Segher


kernel sparse annotations vs. compiler attributes and debug_annotate_{type,decl} WAS: Re: [PATCH 0/9] Add debug_annotate attributes

2022-06-17 Thread Jose E. Marchesi via Gcc-patches


Hi Yonghong.

> On 6/15/22 1:57 PM, David Faust wrote:
>> 
>> On 6/14/22 22:53, Yonghong Song wrote:
>>>
>>>
>>> On 6/7/22 2:43 PM, David Faust wrote:
 Hello,

 This patch series adds support for:

 - Two new C-language-level attributes that allow to associate (to 
 "annotate" or
 to "tag") particular declarations and types with arbitrary strings. As
 explained below, this is intended to be used to, for example, 
 characterize
 certain pointer types.

 - The conveyance of that information in the DWARF output in the form of a 
 new
 DIE: DW_TAG_GNU_annotation.

 - The conveyance of that information in the BTF output in the form of two 
 new
 kinds of BTF objects: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.

 All of these facilities are being added to the eBPF ecosystem, and support 
 for
 them exists in some form in LLVM.

 Purpose
 ===

 1)  Addition of C-family language constructs (attributes) to specify 
 free-text
   tags on certain language elements, such as struct fields.

   The purpose of these annotations is to provide additional 
 information about
   types, variables, and function parameters of interest to the kernel. 
 A
   driving use case is to tag pointer types within the linux kernel and 
 eBPF
   programs with additional semantic information, such as '__user' or 
 '__rcu'.

   For example, consider the linux kernel function do_execve with the
   following declaration:

 static int do_execve(struct filename *filename,
const char __user *const __user *__argv,
const char __user *const __user *__envp);

   Here, __user could be defined with these annotations to record 
 semantic
   information about the pointer parameters (e.g., they are 
 user-provided) in
   DWARF and BTF information. Other kernel facilites such as the eBPF 
 verifier
   can read the tags and make use of the information.
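
   For instance (a sketch of mine, not text from the kernel or this series),
   __user could be spelled in terms of the new attribute roughly as:

     #define __user __attribute__((debug_annotate_type ("user")))

     /* copy_from_user_example is a made-up name used only for illustration.  */
     extern long copy_from_user_example (void *to, const void __user *from,
                                         unsigned long n);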

 2)  Conveying the tags in the generated DWARF debug info.

   The main motivation for emitting the tags in DWARF is that the Linux 
 kernel
   generates its BTF information via pahole, using DWARF as a source:

      +--------+  BTF                    BTF   +----------+
      | pahole |-------> vmlinux.btf --------->| verifier |
      +--------+                               +----------+
           ^                                        ^
           |                                        |
     DWARF |                                    BTF |
           |                                        |
        vmlinux                             +-------------+
        module1.ko                          | BPF program |
        module2.ko                          +-------------+
          ...

   This is because:

   a)  Unlike GCC, LLVM will only generate BTF for BPF programs.

   b)  GCC can generate BTF for whatever target with -gbtf, but there 
 is no
   support for linking/deduplicating BTF in the linker.

   In the scenario above, the verifier needs access to the pointer tags 
 of
   both the kernel types/declarations (conveyed in the DWARF and 
 translated
   to BTF by pahole) and those of the BPF program (available directly 
 in BTF).

   Another motivation for having the tag information in DWARF, 
 unrelated to
   BPF and BTF, is that the drgn project (another DWARF consumer) also 
 wants
   to benefit from these tags in order to differentiate between 
 different
   kinds of pointers in the kernel.

 3)  Conveying the tags in the generated BTF debug info.

   This is easy: the main purpose of having this info in BTF is for the
   compiled eBPF programs. The kernel verifier can then access the tags
   of pointers used by the eBPF programs.


 For more information about these tags and the motivation behind them, 
 please
 refer to the following linux kernel discussions:

 https://lore.kernel.org/bpf/20210914223004.244411-1-...@fb.com/
 https://lore.kernel.org/bpf/20211012164838.3345699-1-...@fb.com/
 https://lore.kernel.org/bpf/2022012604.1504583-1-...@fb.com/


 Implementation Overview
 ===

 To enable these annotations, two new C language attributes are added:
 __attribute__((debug_annotate_decl("foo"))) and
 __attribute__((debug_annotate_type("bar"))). Both attributes accept a 
 single
 arbitrary string constant argument, which will be recorded in the generated
 DWARF and/or BTF debug 

Re: [PATCH]middle-end Use subregs to expand COMPLEX_EXPR to set the lowpart.

2022-06-17 Thread Richard Sandiford via Gcc-patches
Tamar Christina  writes:
>> -Original Message-
>> From: Richard Sandiford 
>> Sent: Monday, June 13, 2022 9:41 AM
>> To: Tamar Christina 
>> Cc: gcc-patches@gcc.gnu.org; nd ; rguent...@suse.de
>> Subject: Re: [PATCH]middle-end Use subregs to expand COMPLEX_EXPR to
>> set the lowpart.
>> 
>> Tamar Christina  writes:
>> > Hi All,
>> >
>> > When lowering COMPLEX_EXPR we currently emit two VEC_EXTRACTs.
>> One
>> > for the lowpart and one for the highpart.
>> >
>> > The problem with this is that in RTL the lvalue of the RTX is the only
>> > thing tying the two instructions together.
>> >
>> > This means that e.g. combine is unable to try to combine the two
>> > instructions for setting the lowpart and highpart.
>> >
>> > For ISAs that have bit extract instructions we can eliminate one of
>> > the extracts if, and only if we're setting the entire complex number.
>> >
>> > This change changes the expand code when we're setting the entire
>> > complex number to generate a subreg for the lowpart instead of a
>> vec_extract.
>> >
>> > This allows us to optimize sequences such as:
>> >
>> > _Complex int f(int a, int b) {
>> > _Complex int t = a + b * 1i;
>> > return t;
>> > }
>> >
>> > from:
>> >
>> > f:
>> >bfi x2, x0, 0, 32
>> >bfi x2, x1, 32, 32
>> >mov x0, x2
>> >ret
>> >
>> > into:
>> >
>> > f:
>> >bfi x0, x1, 32, 32
>> >ret
>> >
>> > I have also confirmed the codegen for x86_64 did not change.
>> >
>> > Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu
>> > and no issues.
>> >
>> > Ok for master?
>> 
>> I'm not sure this is endian-safe.  For big-endian it's the imaginary part 
>> that can
>> be written as a subreg.  The real part might be the high part of a register.
>> 
>> Maybe a more general way to handle this would be to add (yet another)
>> parameter to store_bit_field that indicates that the current value of the
>> structure is undefined.  That would also be useful in at least one other 
>> caller
>> (from calls.cc).  write_complex_part could then have a similar parameter,
>> true for the first write and false for the second.
>
> Ohayou-gozaimasu!
>
> I've rewritten it using the approach you requested. I attempted to set the 
> flag
> In the correct places as well.

Thanks, looks good.  But rather than treat this as a new case, I think
we can instead generalise this store_bit_field_1 code:

  else if (constant_multiple_p (bitnum, regsize * BITS_PER_UNIT, &regnum)
   && multiple_p (bitsize, regsize * BITS_PER_UNIT)
   && known_ge (GET_MODE_BITSIZE (GET_MODE (op0)), bitsize))
{
  sub = simplify_gen_subreg (fieldmode, op0, GET_MODE (op0),
 regnum * regsize);
  if (sub)
{
  if (reverse)
value = flip_storage_order (fieldmode, value);
  emit_move_insn (sub, value);
  return true;
}
}

so that the multiple_p test is skipped if the structure is undefined.
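
Something like this, perhaps (just a sketch; "undefined_p" is a name I'm
making up for the new parameter):

  else if (constant_multiple_p (bitnum, regsize * BITS_PER_UNIT, &regnum)
	   && (undefined_p
	       || multiple_p (bitsize, regsize * BITS_PER_UNIT))
	   && known_ge (GET_MODE_BITSIZE (GET_MODE (op0)), bitsize))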

Richard

> Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu
> and no issues.
>
> Ok for master?
>
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
>   * expmed.cc (store_bit_field): Add parameter that indicates if value is
>   still undefined and if so emit a subreg move instead.
>   * expr.h (write_complex_part): Likewise.
>   * expmed.h (store_bit_field): Add new parameter.
>   * builtins.cc (expand_ifn_atomic_compare_exchange_into_call): Use new
>   parameter.
>   (expand_ifn_atomic_compare_exchange): Likewise.
>   * calls.cc (store_unaligned_arguments_into_pseudos): Likewise.
>   * emit-rtl.cc (validate_subreg): Likewise.
>   * expr.cc (emit_group_store): Likewise.
>   (copy_blkmode_from_reg): Likewise.
>   (copy_blkmode_to_reg): Likewise.
>   (clear_storage_hints): Likewise.
>   (write_complex_part):  Likewise.
>   (emit_move_complex_parts): Likewise.
>   (expand_assignment): Likewise.
>   (store_expr): Likewise.
>   (store_field): Likewise.
>   (expand_expr_real_2): Likewise.
>   * ifcvt.cc (noce_emit_move_insn): Likewise.
>   * internal-fn.cc (expand_arith_set_overflow): Likewise.
>   (expand_arith_overflow_result_store): Likewise.
>   (expand_addsub_overflow): Likewise.
>   (expand_neg_overflow): Likewise.
>   (expand_mul_overflow): Likewise.
>   (expand_arith_overflow): Likewise.
>
> gcc/testsuite/ChangeLog:
>
>   * g++.target/aarch64/complex-init.C: New test.
>
> --- inline copy of patch ---
>
> diff --git a/gcc/builtins.cc b/gcc/builtins.cc
> index 
> 4c6c29390531d8ae9765add598621727213b23ec..8c80e46d9c9c9c2a7e1ce0f8add86729fd542b16
>  100644
> --- a/gcc/builtins.cc
> +++ b/gcc/builtins.cc
> @@ -6014,8 +6014,8 @@ expand_ifn_atomic_compare_exchange_into_call (gcall 
> *call, machine_mode mode)
>if (GET_MODE (boolret) != mode)
>   boolret = convert_modes (mode, GET_MODE (boolret), 

Re: Modula-2: merge followup (brief update on the progress of the new linking implementation)

2022-06-17 Thread Gaius Mulley via Gcc-patches


New linking implementation is complete, gcc bootstraps and hello
world links.  I'll git push the changes, then test/debug/polish and
produce new patch sets

regards,
Gaius


[PATCH] c++, v2: Add support for __real__/__imag__ modifications in constant expressions [PR88174]

2022-06-17 Thread Jakub Jelinek via Gcc-patches
On Fri, Jun 10, 2022 at 09:57:06PM +0200, Jakub Jelinek via Gcc-patches wrote:
> On Fri, Jun 10, 2022 at 01:27:28PM -0400, Jason Merrill wrote:
> > Doesn't this assert mean that complex_expr will always be == valp?
> 
> No, even when handling the pushed *PART_EXPR, it will set
> valp = &TREE_OPERAND (*valp, index != integer_zero_node);
> So, valp will be either &TREE_OPERAND (*complex_expr, 0)
> or &TREE_OPERAND (*complex_expr, 1).
> As *valp = init; is what is usually then stored and we want to store there
> the scalar.
> 
> > I don't understand this block; shouldn't valp point to the real or imag part
> > of the complex number at this point?  How could complex_part be set without
> > us handling the complex case in the loop already?
> 
> Because for most references, the code will do:
>   vec_safe_push (ctors, *valp);
>   vec_safe_push (indexes, index);
> I chose not to do this for *PART_EXPR, because the COMPLEX_EXPR isn't a
> CONSTRUCTOR and code later on e.g. walks all the ctors and accesses
> CONSTRUCTOR_NO_CLEARING on them etc.  As the *PART_EXPR is asserted to
> be outermost only, complex_expr is a variant of that ctors push and
> complex_part of the indexes.
> The reason for the above if is just in case the evaluation of the rhs
> of the store would store to the complex and could e.g. make it a COMPLEX_CST
> again.
> 
> > I might have added the COMPLEX_EXPR to ctors instead of a separate variable,
> > but this is fine too.
> 
> See above.
> The COMPLEX_EXPR needs special handling (conversion into COMPLEX_CST if it
> is constant) anyway.

Here is a variant patch which pushes even the *PART_EXPR related entries
into ctors and indexes vectors, so it doesn't need to use extra variables
for the complex stuff.
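
As an illustration of what this enables (my example, not the exact contents
of the new test), with -std=c++14:

  constexpr _Complex double
  make_complex (double r, double i)
  {
    _Complex double c = 0;
    __real__ c = r;	// store through REALPART_EXPR in a constant expression
    __imag__ c = i;
    return c;
  }

  static_assert (__real__ make_complex (1.0, 2.0) == 1.0, "");
  static_assert (__imag__ make_complex (1.0, 2.0) == 2.0, "");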

2022-06-17  Jakub Jelinek  

PR c++/88174
* constexpr.cc (cxx_eval_store_expression): Handle REALPART_EXPR
and IMAGPART_EXPR.  Change ctors from releasing_vec to
auto_vec, adjust all uses.

* g++.dg/cpp1y/constexpr-complex1.C: New test.

--- gcc/cp/constexpr.cc.jj  2022-06-09 17:42:23.606243920 +0200
+++ gcc/cp/constexpr.cc 2022-06-17 18:59:54.809208997 +0200
@@ -5714,6 +5714,20 @@ cxx_eval_store_expression (const constex
  }
  break;
 
+   case REALPART_EXPR:
+ gcc_assert (probe == target);
+ vec_safe_push (refs, probe);
+ vec_safe_push (refs, TREE_TYPE (probe));
+ probe = TREE_OPERAND (probe, 0);
+ break;
+
+   case IMAGPART_EXPR:
+ gcc_assert (probe == target);
+ vec_safe_push (refs, probe);
+ vec_safe_push (refs, TREE_TYPE (probe));
+ probe = TREE_OPERAND (probe, 0);
+ break;
+
default:
  if (evaluated)
object = probe;
@@ -5752,7 +5766,8 @@ cxx_eval_store_expression (const constex
   type = TREE_TYPE (object);
   bool no_zero_init = true;
 
-  releasing_vec ctors, indexes;
+  auto_vec ctors;
+  releasing_vec indexes;
   auto_vec index_pos_hints;
   bool activated_union_member_p = false;
   bool empty_base = false;
@@ -5792,14 +5807,36 @@ cxx_eval_store_expression (const constex
  *valp = ary_ctor;
}
 
-  /* If the value of object is already zero-initialized, any new ctors for
-subobjects will also be zero-initialized.  */
-  no_zero_init = CONSTRUCTOR_NO_CLEARING (*valp);
-
   enum tree_code code = TREE_CODE (type);
   tree reftype = refs->pop();
   tree index = refs->pop();
 
+  if (code == COMPLEX_TYPE)
+   {
+ if (TREE_CODE (*valp) == COMPLEX_CST)
+   *valp = build2 (COMPLEX_EXPR, type, TREE_REALPART (*valp),
+   TREE_IMAGPART (*valp));
+ else if (TREE_CODE (*valp) == CONSTRUCTOR
+  && CONSTRUCTOR_NELTS (*valp) == 0
+  && CONSTRUCTOR_NO_CLEARING (*valp))
+   {
+ tree r = build_constructor (reftype, NULL);
+ CONSTRUCTOR_NO_CLEARING (r) = 1;
+ *valp = build2 (COMPLEX_EXPR, type, r, r);
+   }
+ gcc_assert (TREE_CODE (*valp) == COMPLEX_EXPR);
+ ctors.safe_push (valp);
+ vec_safe_push (indexes, index);
+ valp = &TREE_OPERAND (*valp, TREE_CODE (index) == IMAGPART_EXPR);
+ gcc_checking_assert (refs->is_empty ());
+ type = reftype;
+ break;
+   }
+
+  /* If the value of object is already zero-initialized, any new ctors for
+subobjects will also be zero-initialized.  */
+  no_zero_init = CONSTRUCTOR_NO_CLEARING (*valp);
+
   if (code == RECORD_TYPE && is_empty_field (index))
/* Don't build a sub-CONSTRUCTOR for an empty base or field, as they
   have no data and might have an offset lower than previously declared
@@ -5842,7 +5879,7 @@ cxx_eval_store_expression (const constex
  no_zero_init = true;
}
 
-  vec_safe_push (ctors, *valp);
+  ctors.safe_push (valp);
   vec_safe_push (indexes, index);
 
   constructor_elt *cep
@@ -5904,11 

[PATCH] ubsan: Add -fsanitize-trap= support

2022-06-17 Thread Jakub Jelinek via Gcc-patches
On Thu, Jun 16, 2022 at 09:32:02PM +0100, Jonathan Wakely wrote:
> It looks like clang has addressed this deficiency now:
> 
> https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html#usage

Thanks, that is roughly what I'd implement anyway and apparently they have
it already since 2015, we've added the -fsanitize-undefined-trap-on-error
support back in 2014 and didn't change it since then.

As a small divergence from clang, I chose -fsanitize-undefined-trap-on-error
to be a (deprecated) alias for -fsanitize-trap aka -fsanitize-trap=all
rather than -fsanitize-trap=undefined which seems to be what clang does,
because for a deprecated option it is IMHO more important backwards
compatibility with what gcc did over the past 8 years rather than clang
compatibility.
Some sanitizers (e.g. asan, lsan, tsan) don't support traps,
-fsanitize-trap=address etc. will be rejected (if enabled at the end of
command line), -fno-sanitize-trap= can be specified even for them.
This is similar behavior to -fsanitize-recover=.
One complication is vptr sanitization, which can't easily trap,
as the whole slow path of the checking is inside of libubsan.
Previously, -fsanitize=vptr -fsanitize-undefined-trap-on-error
silently ignored vptr sanitization.
This patch similarly to what clang does will accept
-fsanitize-trap=all or -fsanitize-trap=undefined which enable
the vptr bit as trapping and again that causes silent disabling
of vptr sanitization, while -fsanitize-trap=vptr is rejected
(already during option processing).
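
A hypothetical example of the new granularity (not part of the patch or its
testsuite); with an assumed command line of
-fsanitize=undefined -fsanitize-trap=shift, the shift checks trap while the
remaining UBSan checks still call into libubsan:

  int
  shl (int x, int s)
  {
    return x << s;	/* traps on undefined shifts */
  }

  int
  add (int x, int y)
  {
    return x + y;	/* signed overflow still reported via libubsan */
  }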

So far quickly tested with make check-gcc check-g++ RUNTESTFLAGS=ubsan.exp,
ok for trunk if it passes full bootstrap/regtest?

2022-06-17  Jakub Jelinek  

gcc/
* common.opt (flag_sanitize_trap): New variable.
(fsanitize-trap=, fsanitize-trap): New options.
(fsanitize-undefined-trap-on-error): Change into deprecated alias
for -fsanitize-trap=all.
* opts.h (struct sanitizer_opts_s): Add can_trap member.
* opts.cc (finish_options): Complain about unsupported
-fsanitize-trap= options.
(sanitizer_opts): Add can_trap values to all entries.
(get_closest_sanitizer_option): Ignore -fsanitize-trap=
options which have can_trap false.
(parse_sanitizer_options): Add support for -fsanitize-trap=.
For -fsanitize-trap=all, enable
SANITIZE_UNDEFINED | SANITIZE_UNDEFINED_NONDEFAULT.  Disallow
-fsanitize-trap=vptr here.
(common_handle_option): Handle OPT_fsanitize_trap_ and
OPT_fsanitize_trap.
* sanopt.cc (maybe_optimize_ubsan_null_ifn): Check
flag_sanitize_trap & SANITIZE_{NULL,ALIGNMENT} instead of
flag_sanitize_undefined_trap_on_error.
* gcc.cc (sanitize_spec_function): Use
flag_sanitize & ~flag_sanitize_trap instead of flag_sanitize
and drop use of flag_sanitize_undefined_trap_on_error in
"undefined" handling.
* ubsan.cc (ubsan_instrument_unreachable): Use
flag_sanitize_trap & SANITIZE_??? instead of
flag_sanitize_undefined_trap_on_error.
(ubsan_expand_bounds_ifn, ubsan_expand_null_ifn,
ubsan_expand_objsize_ifn, ubsan_expand_ptr_ifn,
ubsan_build_overflow_builtin, instrument_bool_enum_load,
ubsan_instrument_float_cast, instrument_nonnull_arg,
instrument_nonnull_return, instrument_builtin): Likewise.
* doc/invoke.texi (-fsanitize-trap=, -fsanitize-trap): Document.
(-fsanitize-undefined-trap-on-error): Document as deprecated
alias of -fsanitize-trap.
gcc/c-family/
* c-ubsan.cc (ubsan_instrument_division, ubsan_instrument_shift):
Use flag_sanitize_trap & SANITIZE_??? instead of
flag_sanitize_undefined_trap_on_error.  If 2 sanitizers are involved
and flag_sanitize_trap differs for them, emit __builtin_trap only
for the comparison where trap is requested.
(ubsan_instrument_vla, ubsan_instrument_return): Use
flag_sanitize_trap & SANITIZE_??? instead of
flag_sanitize_undefined_trap_on_error.
gcc/cp/
* cp-ubsan.cc (cp_ubsan_instrument_vptr_p): Use
flag_sanitize_trap & SANITIZE_VPTR instead of
flag_sanitize_undefined_trap_on_error.
gcc/testsuite/
* c-c++-common/ubsan/nonnull-4.c: Use -fsanitize-trap=all
instead of -fsanitize-undefined-trap-on-error.
* c-c++-common/ubsan/div-by-zero-4.c: Use
-fsanitize-trap=signed-integer-overflow instead of
-fsanitize-undefined-trap-on-error.
* c-c++-common/ubsan/overflow-add-4.c: Use -fsanitize-trap=undefined
instead of -fsanitize-undefined-trap-on-error.
* c-c++-common/ubsan/pr56956.c: Likewise.
* c-c++-common/ubsan/pr68142.c: Likewise.
* c-c++-common/ubsan/pr80932.c: Use
-fno-sanitize-trap=all -fsanitize-trap=shift,undefined
instead of -fsanitize-undefined-trap-on-error.
* c-c++-common/ubsan/align-8.c: Use -fsanitize-trap=alignment

Re: [PATCH] varasm: Fix up ICE in narrowing_initializer_constant_valid_p [PR105998]

2022-06-17 Thread Richard Biener via Gcc-patches



> Am 17.06.2022 um 11:20 schrieb Jakub Jelinek via Gcc-patches 
> :
> 
> On Fri, Jun 17, 2022 at 10:37:45AM +0200, Richard Biener wrote:
>>> --- gcc/varasm.cc.jj2022-06-06 12:18:12.792812888 +0200
>>> +++ gcc/varasm.cc2022-06-17 09:49:21.918029072 +0200
>>> @@ -4716,7 +4716,8 @@ narrowing_initializer_constant_valid_p (
>>>{
>>>  tree inner = TREE_OPERAND (op0, 0);
>>>  if (inner == error_mark_node
>>> -  || ! INTEGRAL_MODE_P (TYPE_MODE (TREE_TYPE (inner)))
>>> +  || VECTOR_TYPE_P (TREE_TYPE (inner))
>> 
>> Do we really want to allow all integer modes here regardless of a
>> composite type (record type for example)?  I’d say !INTEGRAL_TYPE_P would
>> match the rest better.  OTOH if we want to allow integer modes I fail to
>> see why to exclude vector types (but not complex, etc)
> 
> I've excluded VECTOR_TYPE_P because those are the only types for which
> TYPE_MODE can be different from the raw type mode (so, SCALAR_INT_MODE_P
> was true but SCALAR_INT_TYPE_MODE still ICEd).
> 
> Checking for INTEGRAL_TYPE_P seems reasonable to me though,
> and I'd say we also want to check the outer type too because nothing
> really checks it (at least for the first iteration, 2nd and further
> get it from checking of inner in the previous iteration).
> 
> So like this if it passes bootstrap/regtest?

Ok.

Richard 
> 2022-06-17  Jakub Jelinek  
> 
>PR middle-end/105998
>* varasm.cc (narrowing_initializer_constant_valid_p): Check
>SCALAR_INT_MODE_P instead of INTEGRAL_MODE_P, also break on
>! INTEGRAL_TYPE_P and do the same check also on op{0,1}'s type.
> 
>* c-c++-common/pr105998.c: New test.
> 
> --- gcc/varasm.cc.jj2022-06-17 11:07:57.883679019 +0200
> +++ gcc/varasm.cc2022-06-17 11:10:09.190932417 +0200
> @@ -4716,7 +4716,10 @@ narrowing_initializer_constant_valid_p (
> {
>   tree inner = TREE_OPERAND (op0, 0);
>   if (inner == error_mark_node
> -  || ! INTEGRAL_MODE_P (TYPE_MODE (TREE_TYPE (inner)))
> +  || ! INTEGRAL_TYPE_P (TREE_TYPE (op0))
> +  || ! SCALAR_INT_MODE_P (TYPE_MODE (TREE_TYPE (op0)))
> +  || ! INTEGRAL_TYPE_P (TREE_TYPE (inner))
> +  || ! SCALAR_INT_MODE_P (TYPE_MODE (TREE_TYPE (inner)))
>  || (GET_MODE_SIZE (SCALAR_INT_TYPE_MODE (TREE_TYPE (op0)))
>  > GET_MODE_SIZE (SCALAR_INT_TYPE_MODE (TREE_TYPE (inner)
>break;
> @@ -4728,7 +4731,10 @@ narrowing_initializer_constant_valid_p (
> {
>   tree inner = TREE_OPERAND (op1, 0);
>   if (inner == error_mark_node
> -  || ! INTEGRAL_MODE_P (TYPE_MODE (TREE_TYPE (inner)))
> +  || ! INTEGRAL_TYPE_P (TREE_TYPE (op1))
> +  || ! SCALAR_INT_MODE_P (TYPE_MODE (TREE_TYPE (op1)))
> +  || ! INTEGRAL_TYPE_P (TREE_TYPE (inner))
> +  || ! SCALAR_INT_MODE_P (TYPE_MODE (TREE_TYPE (inner)))
>  || (GET_MODE_SIZE (SCALAR_INT_TYPE_MODE (TREE_TYPE (op1)))
>  > GET_MODE_SIZE (SCALAR_INT_TYPE_MODE (TREE_TYPE (inner)
>break;
> --- gcc/testsuite/c-c++-common/pr105998.c.jj2022-06-17 11:09:11.196703834 
> +0200
> +++ gcc/testsuite/c-c++-common/pr105998.c2022-06-17 11:09:11.196703834 
> +0200
> @@ -0,0 +1,12 @@
> +/* PR middle-end/105998 */
> +
> +typedef int __attribute__((__vector_size__ (sizeof (long long)))) V;
> +
> +V v;
> +
> +long long
> +foo (void)
> +{
> +  long long l = (long long) ((0 | v) - ((V) { } == 0));
> +  return l;
> +}
> 
> 
>Jakub
> 


[PATCH] alpha: Introduce target specific store_data_bypass_p function [PR105209]

2022-06-17 Thread Uros Bizjak via Gcc-patches
This patch introduces alpha-specific version of store_data_bypass_p that
ignores TRAP_IF that would result in assertion failure (and internal
compiler error) in the generic store_data_bypass_p function.

While at it, also remove ev4_ist_c reservation, store_data_bypass_p
can handle the patterns with multiple sets since some time ago.

2022-06-17  Uroš Bizjak  

gcc/ChangeLog:

PR target/105209
* config/alpha/alpha-protos.h (alpha_store_data_bypass_p): New.
* config/alpha/alpha.cc (alpha_store_data_bypass_p): New function.
(alpha_store_data_bypass_p_1): Ditto.
* config/alpha/ev4.md: Use alpha_store_data_bypass_p instead
of generic store_data_bypass_p.
(ev4_ist_c): Remove insn reservation.

gcc/testsuite/ChangeLog:

PR target/105209
* gcc.target/alpha/pr105209.c: New test.

Tested with a cross-compiler.

Pushed to master.

Uros.
diff --git a/gcc/config/alpha/alpha-protos.h b/gcc/config/alpha/alpha-protos.h
index 0c832bf039c..adfdd774ef4 100644
--- a/gcc/config/alpha/alpha-protos.h
+++ b/gcc/config/alpha/alpha-protos.h
@@ -73,6 +73,8 @@ extern void alpha_end_function (FILE *, const char *, tree);
 
 extern bool alpha_find_lo_sum_using_gp (rtx);
 
+extern int alpha_store_data_bypass_p (rtx_insn *, rtx_insn *);
+
 #ifdef REAL_VALUE_TYPE
 extern int check_float_value (machine_mode, REAL_VALUE_TYPE *, int);
 #endif
diff --git a/gcc/config/alpha/alpha.cc b/gcc/config/alpha/alpha.cc
index 3db53374c9e..0a85e66fa89 100644
--- a/gcc/config/alpha/alpha.cc
+++ b/gcc/config/alpha/alpha.cc
@@ -7564,6 +7564,75 @@ alpha_does_function_need_gp (void)
   return 0;
 }
 
+/* Helper function for alpha_store_data_bypass_p, handle just a single SET
+   IN_SET.  */
+
+static bool
+alpha_store_data_bypass_p_1 (rtx_insn *out_insn, rtx in_set)
+{
+  if (!MEM_P (SET_DEST (in_set)))
+return false;
+
+  rtx out_set = single_set (out_insn);
+  if (out_set)
+return !reg_mentioned_p (SET_DEST (out_set), SET_DEST (in_set));
+
+  rtx out_pat = PATTERN (out_insn);
+  if (GET_CODE (out_pat) != PARALLEL)
+return false;
+
+  for (int i = 0; i < XVECLEN (out_pat, 0); i++)
+{
+  rtx out_exp = XVECEXP (out_pat, 0, i);
+
+  if (GET_CODE (out_exp) == CLOBBER || GET_CODE (out_exp) == USE
+ || GET_CODE (out_exp) == TRAP_IF)
+   continue;
+
+  gcc_assert (GET_CODE (out_exp) == SET);
+
+  if (reg_mentioned_p (SET_DEST (out_exp), SET_DEST (in_set)))
+   return false;
+}
+
+  return true;
+}
+
+/* True if the dependency between OUT_INSN and IN_INSN is on the store
+   data not the address operand(s) of the store.  IN_INSN and OUT_INSN
+   must be either a single_set or a PARALLEL with SETs inside.
+
+   This alpha-specific version of store_data_bypass_p ignores TRAP_IF
+   that would result in assertion failure (and internal compiler error)
+   in the generic store_data_bypass_p function.  */
+
+int
+alpha_store_data_bypass_p (rtx_insn *out_insn, rtx_insn *in_insn)
+{
+  rtx in_set = single_set (in_insn);
+  if (in_set)
+return alpha_store_data_bypass_p_1 (out_insn, in_set);
+
+  rtx in_pat = PATTERN (in_insn);
+  if (GET_CODE (in_pat) != PARALLEL)
+return false;
+
+  for (int i = 0; i < XVECLEN (in_pat, 0); i++)
+{
+  rtx in_exp = XVECEXP (in_pat, 0, i);
+
+  if (GET_CODE (in_exp) == CLOBBER || GET_CODE (in_exp) == USE
+ || GET_CODE (in_exp) == TRAP_IF)
+   continue;
+
+  gcc_assert (GET_CODE (in_exp) == SET);
+
+  if (!alpha_store_data_bypass_p_1 (out_insn, in_exp))
+   return false;
+}
+
+  return true;
+}
 
 /* Helper function to set RTX_FRAME_RELATED_P on instructions, including
sequences.  */
diff --git a/gcc/config/alpha/ev4.md b/gcc/config/alpha/ev4.md
index 01b9a727a18..c8ff4ed8f0d 100644
--- a/gcc/config/alpha/ev4.md
+++ b/gcc/config/alpha/ev4.md
@@ -44,14 +44,7 @@ (define_insn_reservation "ev4_ld" 1
 ; Stores can issue before the data (but not address) is ready.
 (define_insn_reservation "ev4_ist" 1
   (and (eq_attr "tune" "ev4")
-   (eq_attr "type" "ist"))
-  "ev4_ib1+ev4_abox")
-
-; ??? Separate from ev4_ist because store_data_bypass_p can't handle
-; the patterns with multiple sets, like store-conditional.
-(define_insn_reservation "ev4_ist_c" 1
-  (and (eq_attr "tune" "ev4")
-   (eq_attr "type" "st_c"))
+   (eq_attr "type" "ist,st_c"))
   "ev4_ib1+ev4_abox")
 
 (define_insn_reservation "ev4_fst" 1
@@ -110,7 +103,7 @@ (define_bypass 1 "ev4_icmp" "ev4_ibr")
 (define_bypass 0
   "ev4_iaddlog,ev4_shiftcm,ev4_icmp"
   "ev4_ist"
-  "store_data_bypass_p")
+  "alpha_store_data_bypass_p")
 
 ; Multiplies use a non-pipelined imul unit.  Also, "no [ebox] insn can
 ; be issued exactly three cycles before an integer multiply completes".
@@ -121,7 +114,7 @@ (define_insn_reservation "ev4_imulsi" 21
(eq_attr "opsize" "si")))
   "ev4_ib0+ev4_imul,ev4_imul*18,ev4_ebox")
 
-(define_bypass 20 "ev4_imulsi" "ev4_ist" "store_data_bypass_p")
+(define_bypass 20 "ev4_imulsi" "ev4_ist" 

[PATCH] i386: Fix assert in ix86_function_arg [PR105970]

2022-06-17 Thread Uros Bizjak via Gcc-patches
The mode of a pointer argument should equal ptr_mode, not Pmode; the two
differ on x32 with -maddress-mode=long, where Pmode is DImode but pointers
are 32 bits wide, which is what triggered the assert.

2022-06-17  Uroš Bizjak  

gcc/ChangeLog:

PR target/105970
* config/i386/i386.cc (ix86_function_arg): Assert that
the mode of a pointer argument is equal to ptr_mode, not Pmode.

gcc/testsuite/ChangeLog:

PR target/105970
* gcc.target/i386/pr105970.c: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Pushed to master.

Uros.
diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index 3d189e124e4..f158cc3aaea 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -3348,7 +3348,7 @@ ix86_function_arg (cumulative_args_t cum_v, const 
function_arg_info &arg)
   if (POINTER_TYPE_P (arg.type))
{
  /* This is the pointer argument.  */
- gcc_assert (TYPE_MODE (arg.type) == Pmode);
+ gcc_assert (TYPE_MODE (arg.type) == ptr_mode);
  /* It is at -WORD(AP) in the current frame in interrupt and
 exception handlers.  */
  reg = plus_constant (Pmode, arg_pointer_rtx, -UNITS_PER_WORD);
diff --git a/gcc/testsuite/gcc.target/i386/pr105970.c 
b/gcc/testsuite/gcc.target/i386/pr105970.c
new file mode 100644
index 000..326486faebf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr105970.c
@@ -0,0 +1,6 @@
+/* PR target/105970 */
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-require-effective-target maybe_x32 } */
+/* { dg-options "-mx32 -mgeneral-regs-only -maddress-mode=long" } */
+
+#include "../../gcc.dg/torture/pr68037-1.c"


[PATCH] i386: Fix VPMOV splitter [PR105993]

2022-06-17 Thread Uros Bizjak via Gcc-patches
REGNO should not be used with register_operand before reload because
subregs of registers or even subregs of memory match the predicate.
The build with RTL checking enabled does not tolerate REGNO with
non-reg operand.
The patch splits the splitter into two related splitters and uses
(match_dup ...) RTXes instead of REGNO comparisons.

2022-06-17  Uroš Bizjak  

gcc/ChangeLog:

PR target/105993
* config/i386/sse.md (vpmov splitter): Use (match_dup ...)
instead of REGNO comparisons in combine splitter.

gcc/testsuite/ChangeLog:

PR target/105993
* gcc.target/i386/pr105993.c: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Pushed to master.

Uros.
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 3e3d96fe087..64ac490d272 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -23875,21 +23875,23 @@ (define_split
(xor:V_128_256 (match_operand:V_128_256 1 "register_operand")
   (match_operand:V_128_256 2 "register_operand"))
(match_operand:V_128_256 3 "nonimmediate_operand"))
- (match_operand:V_128_256 4 "register_operand")))]
-  "TARGET_XOP
-   && (REGNO (operands[4]) == REGNO (operands[1])
-   || REGNO (operands[4]) == REGNO (operands[2]))"
+ (match_dup 1)))]
+  "TARGET_XOP"
   [(set (match_dup 0) (if_then_else:V_128_256 (match_dup 3)
- (match_dup 5)
- (match_dup 4)))]
-{
-  /* To handle the commutivity of XOR, operands[4] is either operands[1]
- or operands[2], we need operands[5] to be the other one.  */
-  if (REGNO (operands[4]) == REGNO (operands[1]))
-operands[5] = operands[2];
-  else
-operands[5] = operands[1];
-})
+ (match_dup 2)
+ (match_dup 1)))])
+(define_split
+  [(set (match_operand:V_128_256 0 "register_operand")
+   (xor:V_128_256
+ (and:V_128_256
+   (xor:V_128_256 (match_operand:V_128_256 1 "register_operand")
+  (match_operand:V_128_256 2 "register_operand"))
+   (match_operand:V_128_256 3 "nonimmediate_operand"))
+ (match_dup 2)))]
+  "TARGET_XOP"
+  [(set (match_dup 0) (if_then_else:V_128_256 (match_dup 3)
+ (match_dup 1)
+ (match_dup 2)))])
 
 ;; XOP horizontal add/subtract instructions
 (define_insn "xop_phaddbw"
diff --git a/gcc/testsuite/gcc.target/i386/pr105993.c 
b/gcc/testsuite/gcc.target/i386/pr105993.c
new file mode 100644
index 000..79e3414f67b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr105993.c
@@ -0,0 +1,18 @@
+/* PR target/105993 */
+/* { dg-do compile } */
+/* { dg-options "-O -mxop" } */
+
+typedef unsigned short __attribute__((__vector_size__ (16))) V;
+V x, y, z;
+
+char c;
+short s;
+
+V
+foo (void)
+{
+  V u = __builtin_shufflevector (z, y, 2, 1, 0, 8, 4, 1, 7, 2);
+  V v = ~(__builtin_bswap16 (s) & (u ^ c));
+
+  return v;
+}


Re: [PATCH v1 2/3] RISC-V: Split slli+sh[123]add.uw opportunities to avoid zext.w

2022-06-17 Thread Philipp Tomsich
Kito, thanks: you were a few minutes ahead of my fix there.

On Fri, 17 Jun 2022 at 16:00, Kito Cheng  wrote:

> Hi Andreas:
>
> Fixed via
> https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=d6b423882a05d7b4f40ae1e9d942c9c4c13761b7
> ,
> thanks!
>
> On Fri, Jun 17, 2022 at 4:34 PM Andreas Schwab 
> wrote:
> >
> > ../../gcc/config/riscv/bitmanip.md: In function 'rtx_insn*
> gen_split_44(rtx_ins\
> > n*, rtx_def**)':
> > ../../gcc/config/riscv/bitmanip.md:110:28: error: comparison of integer
> express\
> > ions of different signedness: 'int' and 'long unsigned int'
> [-Werror=sign-compa\
> > re]
> >   110 | if ((scale + bias) != UINTVAL (operands[2]))
> >
> > --
> > Andreas Schwab, sch...@linux-m68k.org
> > GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
> > "And now for something completely different."
>


Re: [PATCH] Add operators / and * for profile_{count,probability}.

2022-06-17 Thread Jan Hubicka via Gcc-patches
> PING^2
Sorry, I thought it was approved once we settled on the multiplication
datatype, but apparently I never sent the email.
Patch is OK.

Honza
> 
> On 5/24/22 13:35, Martin Liška wrote:
> > PING^1
> > 
> > On 5/5/22 20:15, Martin Liška wrote:
> >> On 5/5/22 15:49, Jan Hubicka wrote:
> >>> Hi,
>  The patch simplifies usage of the profile_{count,probability} types.
> 
>  Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> 
>  Ready to be installed?
> >>>
> >>> The reason I intentionally did not add * and / to the original API was
> >>> to detect situations where values that should be
> >>> profile_count/profile_probability are stored into integers, since
> >>> previous code used integers for everything.
> >>>
> >>> Having to add apply_scale made him/her (mostly me :) think whether the
> >>> value is really just a fixed scale or whether it should be converted
> >>> to a proper data type (count or probability).
> >>>
> >>> I guess now we completed the conversion so risk of this creeping in is
> >>> relatively low and the code indeed looks better.
> >>
> >> Yes, that's my impression as well that the profiling code is quite settled 
> >> down.
> >>
> >>> It will make it a bit
> >>> harder for me to backport jump threading profile updating fixes I plan
> >>> for 12.2 but it should not be hard.
> >>
> >> You'll manage ;)
> >>
>  diff --git a/gcc/cfgloopmanip.cc b/gcc/cfgloopmanip.cc
>  index b4357c03e86..a1ac1146445 100644
>  --- a/gcc/cfgloopmanip.cc
>  +++ b/gcc/cfgloopmanip.cc
>  @@ -563,8 +563,7 @@ scale_loop_profile (class loop *loop, 
>  profile_probability p,
>   
> /* Probability of exit must be 1/iterations.  */
> count_delta = e->count ();
>  -  e->probability = profile_probability::always ()
>  -.apply_scale (1, iteration_bound);
>  +  e->probability = profile_probability::always () / 
>  iteration_bound;
> >>> However this is kind of an example of the problem.
> >>> iteration_bound is gcov_type so we can get overflow here.
> >>
> >> typedef int64_t gcov_type;
> >>
> >> and apply_scale takes int64_t types as arguments. Similarly the newly 
> >> added operators,
> >> so how can that change anything?
> >>
> >>> I guess we want to downgrade iteration_bound since it is always either 0
> >>> or int.
>  diff --git a/gcc/tree-switch-conversion.cc 
>  b/gcc/tree-switch-conversion.cc
>  index e14b4e6c94a..cef26a9878e 100644
>  --- a/gcc/tree-switch-conversion.cc
>  +++ b/gcc/tree-switch-conversion.cc
>  @@ -1782,7 +1782,7 @@ switch_decision_tree::analyze_switch_statement ()
> tree high = CASE_HIGH (elt);
>   
> profile_probability p
>  -= case_edge->probability.apply_scale (1, (intptr_t) 
>  (case_edge->aux));
>  += case_edge->probability / ((intptr_t) (case_edge->aux));
> >>>
> >>> I think the switch ranges may be also in risk of overflow?
> >>>
> >>> We could make operators to accept gcov_type or int64_t.
> >>
> >> As explained, they do.
> >>
> >> Cheers,
> >> Martin
> >>
> >>>
> >>> Thanks,
> >>> Honza
> >>
> > 
> 


[PATCH] rs6000: Fix some error messages for invalid conversions

2022-06-17 Thread Segher Boessenkool
"* something" isn't a type.  "something *" is.

Tested and committed.


Segher


2022-06-17  Segher Boessenkool  

* config/rs6000/rs6000.cc (rs6000_invalid_conversion): Correct some
types.
---
 gcc/config/rs6000/rs6000.cc | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 59481d9ac708..3d1f895ebd52 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -28305,13 +28305,13 @@ rs6000_invalid_conversion (const_tree fromtype, 
const_tree totype)
  && tomode != VOIDmode)
{
  if (frommode == XOmode)
-   return N_("invalid conversion from type %<* __vector_quad%>");
+   return N_("invalid conversion from type %<__vector_quad *%>");
  if (tomode == XOmode)
-   return N_("invalid conversion to type %<* __vector_quad%>");
+   return N_("invalid conversion to type %<__vector_quad *%>");
  if (frommode == OOmode)
-   return N_("invalid conversion from type %<* __vector_pair%>");
+   return N_("invalid conversion from type %<__vector_pair *%>");
  if (tomode == OOmode)
-   return N_("invalid conversion to type %<* __vector_pair%>");
+   return N_("invalid conversion to type %<__vector_pair *%>");
}
 }
 
-- 
1.8.3.1



Re: [PATCH] c: Extend the -Wpadded message with actual padding size

2022-06-17 Thread Marek Polacek via Gcc-patches
On Thu, Jun 16, 2022 at 09:37:32PM +0200, Vit Kabele wrote:
> When the compiler warns about padding struct to alignment boundary, it
> now also informs the user about the size of the padding that needs to
> be added to get rid of the warning.
> 
> This removes the need of using pahole or similar tools, or manually
> determining the padding size.

Thanks for the patch, it looks reasonable, with the formatting fixed.
It would be nice to have a testcase, at least something like

struct S {
  __UINT64_TYPE__ i;
  char c;
};

The problem is what value to check for: on 32-bit arches the padding is
probably 3 bytes large and on 64-bit arches probably 7 bytes.  So I think
you could use __attribute__((aligned (8))) and then it's always 7.
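
For reference, a minimal sketch of such a test (assuming the warning
wording from the patch and the aligned(8) trick above, so the padding
is always 7 bytes) could look like:

/* { dg-do compile } */
/* { dg-options "-Wpadded" } */

struct S {
  __UINT64_TYPE__ i;
  char c;
} __attribute__((aligned (8))); /* { dg-warning "padding struct size to alignment boundary with 7 bytes" } */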

> Tested on x86_64-pc-linux-gnu.
> 
> gcc/ChangeLog:
> 
>   * stor-layout.cc (finalize_record_size): Improve warning message

Missing '.' at the end.

> 
> Signed-off-by: Vit Kabele 
> ---
>  gcc/stor-layout.cc | 9 -
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/gcc/stor-layout.cc b/gcc/stor-layout.cc
> index 765f22f68b9..57ddb001780 100644
> --- a/gcc/stor-layout.cc
> +++ b/gcc/stor-layout.cc
> @@ -1781,7 +1781,14 @@ finalize_record_size (record_layout_info rli)
>&& simple_cst_equal (unpadded_size, TYPE_SIZE (rli->t)) == 0
>&& input_location != BUILTINS_LOCATION
>&& !TYPE_ARTIFICIAL (rli->t))
> -warning (OPT_Wpadded, "padding struct size to alignment boundary");
> +  {
> +  tree padding_size
> + = size_binop (MINUS_EXPR,
> + TYPE_SIZE_UNIT (rli->t), unpadded_size_unit);
> +  warning (OPT_Wpadded,
> +"padding struct size to alignment boundary with %E bytes",
> +padding_size);
> +  }
>  
>if (warn_packed && TREE_CODE (rli->t) == RECORD_TYPE
>&& TYPE_PACKED (rli->t) && ! rli->packed_maybe_necessary
> -- 
> 2.30.2
> 

Marek



Re: [PATCH v1 2/3] RISC-V: Split slli+sh[123]add.uw opportunities to avoid zext.w

2022-06-17 Thread Kito Cheng via Gcc-patches
Hi Andreas:

Fixed via 
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=d6b423882a05d7b4f40ae1e9d942c9c4c13761b7,
thanks!

On Fri, Jun 17, 2022 at 4:34 PM Andreas Schwab  wrote:
>
> ../../gcc/config/riscv/bitmanip.md: In function 'rtx_insn* 
> gen_split_44(rtx_ins\
> n*, rtx_def**)':
> ../../gcc/config/riscv/bitmanip.md:110:28: error: comparison of integer 
> express\
> ions of different signedness: 'int' and 'long unsigned int' 
> [-Werror=sign-compa\
> re]
>   110 | if ((scale + bias) != UINTVAL (operands[2]))
>
> --
> Andreas Schwab, sch...@linux-m68k.org
> GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
> "And now for something completely different."


Re: [PATCH] xtensa: Defer storing integer constants into litpool until reload

2022-06-17 Thread Takayuki 'January June' Suwa via Gcc-patches
erratum:

-   extern unsigned int value;
+   extern unsigned short value;

On 2022/06/17 22:47, Takayuki 'January June' Suwa via Gcc-patches wrote:
> Storing integer constants into litpool in the early stage of compilation
> hinders some integer optimizations.  In fact, such integer constants are
> not subject to the constant folding process.
> 
> For example:
> 
> extern unsigned int value;
> extern void foo(void);
> void test(void) {
>   if (value == 30001)
> foo();
> }
> 
>   .literal_position
>   .literal .LC0, value
>   .literal .LC1, 30001
> test:
>   l32r    a3, .LC0
>   l32r    a2, .LC1
>   l16ui   a3, a3, 0
>   extui   a2, a2, 0, 16  // runtime zero-extension despite constant
>   bne a3, a2, .L1
>   j.l foo, a9
> .L1:
>   ret.n
> 
> This patch defers the placement of integer constants into litpool until
> the start of reload:
> 
>   .literal_position
>   .literal .LC0, value
>   .literal .LC1, 30001
> test:
>   l32r    a3, .LC0
>   l32r    a2, .LC1
>   l16ui   a3, a3, 0
>   bne a3, a2, .L1
>   j.l foo, a9
> .L1:
>   ret.n
> 
> gcc/ChangeLog:
> 
>   * config/xtensa/constraints.md (Y):
>   Change to include integer constants until reload begins.
>   * config/xtensa/predicates.md (move_operand): Ditto.
>   * config/xtensa/xtensa.cc (xtensa_emit_move_sequence):
>   Change to allow storing integer constants into litpool only after
>   reload begins.


[PATCH] xtensa: Defer storing integer constants into litpool until reload

2022-06-17 Thread Takayuki 'January June' Suwa via Gcc-patches
Storing integer constants into litpool in the early stage of compilation
hinders some integer optimizations.  In fact, such integer constants are
not subject to the constant folding process.

For example:

extern unsigned int value;
extern void foo(void);
void test(void) {
  if (value == 30001)
foo();
}

.literal_position
.literal .LC0, value
.literal .LC1, 30001
test:
l32r    a3, .LC0
l32r    a2, .LC1
l16ui   a3, a3, 0
extui   a2, a2, 0, 16  // runtime zero-extension despite constant
bne a3, a2, .L1
j.l foo, a9
.L1:
ret.n

This patch defers the placement of integer constants into litpool until
the start of reload:

.literal_position
.literal .LC0, value
.literal .LC1, 30001
test:
l32r    a3, .LC0
l32r    a2, .LC1
l16ui   a3, a3, 0
bne a3, a2, .L1
j.l foo, a9
.L1:
ret.n

gcc/ChangeLog:

* config/xtensa/constraints.md (Y):
Change to include integer constants until reload begins.
* config/xtensa/predicates.md (move_operand): Ditto.
* config/xtensa/xtensa.cc (xtensa_emit_move_sequence):
Change to allow storing integer constants into litpool only after
reload begins.
---
 gcc/config/xtensa/constraints.md | 6 --
 gcc/config/xtensa/predicates.md  | 5 +++--
 gcc/config/xtensa/xtensa.cc  | 3 ++-
 3 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/gcc/config/xtensa/constraints.md b/gcc/config/xtensa/constraints.md
index e7ac8dbfebf..0b7dcd1440e 100644
--- a/gcc/config/xtensa/constraints.md
+++ b/gcc/config/xtensa/constraints.md
@@ -113,8 +113,10 @@
 
 (define_constraint "Y"
  "A constant that can be used in relaxed MOVI instructions."
- (and (match_code "const_int,const_double,const,symbol_ref,label_ref")
-  (match_test "TARGET_AUTO_LITPOOLS")))
+ (ior (and (match_code "const_int,const_double,const,symbol_ref,label_ref")
+  (match_test "TARGET_AUTO_LITPOOLS"))
+  (and (match_code "const_int")
+  (match_test "can_create_pseudo_p ()"
 
 ;; Memory constraints.  Do not use define_memory_constraint here.  Doing so
 ;; causes reload to force some constants into the constant pool, but since
diff --git a/gcc/config/xtensa/predicates.md b/gcc/config/xtensa/predicates.md
index edd13ae41b9..0590c0f81a9 100644
--- a/gcc/config/xtensa/predicates.md
+++ b/gcc/config/xtensa/predicates.md
@@ -147,8 +147,9 @@
   (match_test "!constantpool_mem_p (op)
|| GET_MODE_SIZE (mode) % UNITS_PER_WORD == 0")))
  (ior (and (match_code "const_int")
-  (match_test "GET_MODE_CLASS (mode) == MODE_INT
-   && xtensa_simm12b (INTVAL (op))"))
+  (match_test "(GET_MODE_CLASS (mode) == MODE_INT
+&& xtensa_simm12b (INTVAL (op)))
+   || can_create_pseudo_p ()"))
  (and (match_code "const_int,const_double,const,symbol_ref,label_ref")
   (match_test "(TARGET_CONST16 || TARGET_AUTO_LITPOOLS)
&& CONSTANT_P (op)
diff --git a/gcc/config/xtensa/xtensa.cc b/gcc/config/xtensa/xtensa.cc
index d6f08b11648..c5d00acdf2c 100644
--- a/gcc/config/xtensa/xtensa.cc
+++ b/gcc/config/xtensa/xtensa.cc
@@ -1182,7 +1182,8 @@ xtensa_emit_move_sequence (rtx *operands, machine_mode 
mode)
  return 1;
}
 
-  if (! TARGET_AUTO_LITPOOLS && ! TARGET_CONST16)
+  if (! TARGET_AUTO_LITPOOLS && ! TARGET_CONST16
+ && ! (CONST_INT_P (src) && can_create_pseudo_p ()))
{
  src = force_const_mem (SImode, src);
  operands[1] = src;
-- 
2.20.1


Re: [PATCH] c++: Use fold_non_dependent_expr rather than maybe_constant_value in __builtin_shufflevector handling [PR106001]

2022-06-17 Thread Jason Merrill via Gcc-patches

On 6/17/22 03:28, Jakub Jelinek wrote:

Hi!

In this case the STATIC_CAST_EXPR expressions in the call aren't
type nor value dependent, but maybe_constant_value still ICEs on those
when processing_template_decl.  Calling fold_non_dependent_expr on it
instead fixes the ICE and folds them to INTEGER_CSTs.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?


OK.


2022-06-17  Jakub Jelinek  

PR c++/106001
* typeck.cc (build_x_shufflevector): Use fold_non_dependent_expr
instead of maybe_constant_value.

* g++.dg/ext/builtin-shufflevector-4.C: New test.

--- gcc/cp/typeck.cc.jj 2022-06-04 10:34:26.261505682 +0200
+++ gcc/cp/typeck.cc2022-06-16 19:38:04.397979247 +0200
@@ -6344,7 +6344,7 @@ build_x_shufflevector (location_t loc, v
auto_vec mask;
for (unsigned i = 2; i < args->length (); ++i)
  {
-  tree idx = maybe_constant_value ((*args)[i]);
+  tree idx = fold_non_dependent_expr ((*args)[i], complain);
mask.safe_push (idx);
  }
tree exp = c_build_shufflevector (loc, arg0, arg1, mask, complain & 
tf_error);
--- gcc/testsuite/g++.dg/ext/builtin-shufflevector-4.C.jj   2022-06-16 
19:43:13.103935249 +0200
+++ gcc/testsuite/g++.dg/ext/builtin-shufflevector-4.C  2022-06-16 
19:42:37.534401207 +0200
@@ -0,0 +1,18 @@
+// PR c++/106001
+// { dg-do compile }
+
+typedef int V __attribute__((vector_size (2 * sizeof (int))));
+
+template 
+void
+foo ()
+{
+  V v = {};
+  v = __builtin_shufflevector (v, v, static_cast(1), 
static_cast(0));
+}
+
+void
+bar ()
+{
+  foo <0> ();
+}

Jakub





[committed] arm: fix checking ICE in arm_print_operand [PR106004]

2022-06-17 Thread Richard Earnshaw via Gcc-patches

Sigh, another instance where I incorrectly used XUINT instead of
UINTVAL.

I've also made the code here a little more robust (although I think
this case can't in fact be reached) if the 32-bit clear mask includes
bit 31.  This case, if reached, would print out an out-of-range value
based on the size of the compiler's HOST_WIDE_INT type due to
sign-extension.  We avoid this by masking the value after inversion.
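
To illustrate with a hypothetical operand value (and a 64-bit
HOST_WIDE_INT): for x = 0x0000ffff, plain ~UINTVAL (x) yields
0xffffffffffff0000, and the width computed via exact_log2 goes out of
range; after masking with 0xffffffff the value is 0xffff0000 and the
intended "#16, #16" is printed.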

gcc/ChangeLog:
PR target/106004
* config/arm/arm.cc (arm_print_operand, case 'V'): Use UINTVAL.
Clear bits in the mask above bit 31.
---
 gcc/config/arm/arm.cc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc
index 2925907b436..33fb98d5cad 100644
--- a/gcc/config/arm/arm.cc
+++ b/gcc/config/arm/arm.cc
@@ -24199,7 +24199,8 @@ arm_print_operand (FILE *stream, rtx x, int code)
 	return;
 	  }
 
-	unsigned HOST_WIDE_INT val = ~XUINT (x, 0);
+	unsigned HOST_WIDE_INT val
+	  = ~UINTVAL (x) & HOST_WIDE_INT_UC (0xffffffff);
 	int lsb = exact_log2 (val & -val);
 	asm_fprintf (stream, "#%d, #%d", lsb,
 		 (exact_log2 (val + (val & -val)) - lsb));


Re: [PATCH] c: Extend the -Wpadded message with actual padding size

2022-06-17 Thread Eric Gallager via Gcc-patches
On Thu, Jun 16, 2022 at 3:37 PM Vit Kabele  wrote:
>
> When the compiler warns about padding struct to alignment boundary, it
> now also informs the user about the size of the padding that needs to
> be added to get rid of the warning.

Hi, thanks for taking the time to improve -Wpadded; I have been
wishing that GCC's implementation of -Wpadded would print this
information for a while now and thought there was a bug open for it,
but can't seem to find it now...

>
> This removes the need of using pahole or similar tools, or manually
> determining the padding size.
>
> Tested on x86_64-pc-linux-gnu.
>
> gcc/ChangeLog:
>
> * stor-layout.cc (finalize_record_size): Improve warning message
>
> Signed-off-by: Vit Kabele 
> ---
>  gcc/stor-layout.cc | 9 -
>  1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/stor-layout.cc b/gcc/stor-layout.cc
> index 765f22f68b9..57ddb001780 100644
> --- a/gcc/stor-layout.cc
> +++ b/gcc/stor-layout.cc
> @@ -1781,7 +1781,14 @@ finalize_record_size (record_layout_info rli)
>&& simple_cst_equal (unpadded_size, TYPE_SIZE (rli->t)) == 0
>&& input_location != BUILTINS_LOCATION
>&& !TYPE_ARTIFICIAL (rli->t))
> -warning (OPT_Wpadded, "padding struct size to alignment boundary");
> +  {
> +  tree padding_size
> +   = size_binop (MINUS_EXPR,
> +   TYPE_SIZE_UNIT (rli->t), unpadded_size_unit);
> +  warning (OPT_Wpadded,
> +  "padding struct size to alignment boundary with %E bytes",
> +  padding_size);
> +  }

Style nit: indentation seems off; check your tabs vs. spaces etc.

>
>if (warn_packed && TREE_CODE (rli->t) == RECORD_TYPE
>&& TYPE_PACKED (rli->t) && ! rli->packed_maybe_necessary
> --
> 2.30.2


Re: [committed] libstdc++: Support constexpr global std::string for size < 15 [PR105995]

2022-06-17 Thread Jonathan Wakely via Gcc-patches
On Thu, 16 Jun 2022 at 20:23, Jonathan Wakely via Libstdc++
 wrote:
>
> Tested x86_64-linux, pushed to trunk.

Somehow I messed up the test in the commit I pushed (but not the one I
tested ... weird).

Fixed at r13-1151-g0f96ac43fa0a5f by the attached patch.

-- >8 --

   libstdc++: Add missing #include  to new test

   Somehow I pushed a different version of this test to the one I actually
   tested.

   libstdc++-v3/ChangeLog:

   * testsuite/21_strings/basic_string/cons/char/105995.cc: Add
   missing #include.
commit 0f96ac43fa0a5fdbfce317b274233852d5b46d23
Author: Jonathan Wakely 
Date:   Fri Jun 17 13:29:05 2022

libstdc++: Add missing #include  to new test

Somehow I pushed a different version of this test to the one I actually
tested.

libstdc++-v3/ChangeLog:

* testsuite/21_strings/basic_string/cons/char/105995.cc: Add
missing #include.

diff --git a/libstdc++-v3/testsuite/21_strings/basic_string/cons/char/105995.cc 
b/libstdc++-v3/testsuite/21_strings/basic_string/cons/char/105995.cc
index aa8bcba3dca..4764ceff72a 100644
--- a/libstdc++-v3/testsuite/21_strings/basic_string/cons/char/105995.cc
+++ b/libstdc++-v3/testsuite/21_strings/basic_string/cons/char/105995.cc
@@ -2,6 +2,8 @@
 // { dg-do compile { target c++20 } }
 // { dg-require-effective-target cxx11_abi }
 
+#include <string>
+
 // PR libstdc++/105995
 // Not required by the standard, but supported for QoI.
 constexpr std::string pr105995_empty;


[PATCH 2/2] aarch64: Fix bit-field alignment in param passing [PR105549]

2022-06-17 Thread Christophe Lyon via Gcc-patches
While working on enabling DFP for AArch64, I noticed new failures in
gcc.dg/compat/struct-layout-1.exp (t028) which were not actually
caused by DFP types handling. These tests are generated during 'make
check' and enabling DFP made generation different (not sure if new
non-DFP tests are generated, or if existing ones are generated
differently, the tests in question are huge and difficult to compare).

Anyway, I reduced the problem to what I attach at the end of the new
gcc.target/aarch64/aapcs64/va_arg-17.c test and rewrote it in the same
scheme as other va_arg* AArch64 tests.  Richard Sandiford further
reduced this to a non-vararg function, added as a second testcase.

This is a tough case mixing bit-fields and alignment, where
aarch64_function_arg_alignment did not follow what its descriptive
comment says: we want to use the natural alignment of the bit-field
type only if the user didn't reduce the alignment for the bit-field
itself.

The fix would be very small, except that this introduces a new ABI
break, and we have to warn about that.  Since this actually fixes a
problem introduced in GCC 9.1, we keep the old computation to detect
when we now behave differently.

This patch adds two new tests (va_arg-17.c and
pr105549.c). va_arg-17.c contains the reduced offending testcase from
struct-layout-1.exp for reference.  We update some tests introduced by
the previous patch, where parameters with bit-fields and packed
attribute now emit a different warning.

We also take the opportunity to fix the comment above
aarch64_function_arg_alignment since the value of the abi_break
parameter was changed in a previous commit, no longer matching the
description.

2022-06-16  Christophe Lyon  

gcc/
PR target/105549
* config/aarch64/aarch64.cc (aarch64_function_arg_alignment):
Check DECL_PACKED for bitfield.
(aarch64_layout_arg): Warn when parameter passing ABI changes.
(aarch64_function_arg_boundary): Likewise.
(aarch64_gimplify_va_arg_expr): Likewise.

gcc/testsuite/
PR target/105549
* gcc.target/aarch64/bitfield-abi-warning-align16-O0.c: Update.
* gcc.target/aarch64/bitfield-abi-warning-align16-O2.c: Update.
* gcc.target/aarch64/aapcs64/va_arg-17.c: New test.
* gcc.target/aarch64/pr105549.c: New test.
---
 gcc/config/aarch64/aarch64.cc |  87 ---
 .../gcc.target/aarch64/aapcs64/va_arg-17.c| 105 ++
 .../aarch64/bitfield-abi-warning-align16-O0.c |  60 +-
 .../aarch64/bitfield-abi-warning-align16-O2.c |  60 +-
 gcc/testsuite/gcc.target/aarch64/pr105549.c   |  12 ++
 5 files changed, 249 insertions(+), 75 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/aapcs64/va_arg-17.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/pr105549.c

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 13984e3435b..e7a6288d7a7 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -7248,15 +7248,19 @@ aarch64_vfp_is_call_candidate (cumulative_args_t 
pcum_v, machine_mode mode,
 /* Given MODE and TYPE of a function argument, return the alignment in
bits.  The idea is to suppress any stronger alignment requested by
the user and opt for the natural alignment (specified in AAPCS64 \S
-   4.1).  ABI_BREAK is set to true if the alignment was incorrectly
-   calculated in versions of GCC prior to GCC-9.  This is a helper
+   4.1).  ABI_BREAK is set to the old alignment if the alignment was
+   incorrectly calculated in versions of GCC prior to GCC-9.
+   ABI_BREAK_PACKED is set to the old alignment if it was incorrectly
+   calculated in versions between GCC-9 and GCC-13.  This is a helper
function for local use only.  */
 
 static unsigned int
 aarch64_function_arg_alignment (machine_mode mode, const_tree type,
-   unsigned int *abi_break)
+   unsigned int *abi_break,
+   unsigned int *abi_break_packed)
 {
   *abi_break = 0;
+  *abi_break_packed = 0;
   if (!type)
 return GET_MODE_ALIGNMENT (mode);
 
@@ -7272,6 +7276,7 @@ aarch64_function_arg_alignment (machine_mode mode, 
const_tree type,
 return TYPE_ALIGN (TREE_TYPE (type));
 
   unsigned int alignment = 0;
+  unsigned int bitfield_alignment_with_packed = 0;
   unsigned int bitfield_alignment = 0;
   for (tree field = TYPE_FIELDS (type); field; field = DECL_CHAIN (field))
 if (TREE_CODE (field) == FIELD_DECL)
@@ -7290,12 +7295,32 @@ aarch64_function_arg_alignment (machine_mode mode, 
const_tree type,
   "s" contains only one Fundamental Data Type (the int field)
   but gains 8-byte alignment and size thanks to "e".  */
alignment = std::max (alignment, DECL_ALIGN (field));
+
if (DECL_BIT_FIELD_TYPE (field))
- bitfield_alignment
-   = std::max (bitfield_alignment,
-   TYPE_ALIGN 

[PATCH 1/2] aarch64: fix warning emission for ABI break since GCC 9.1

2022-06-17 Thread Christophe Lyon via Gcc-patches
While looking at PR 105549, which is about fixing the ABI break
introduced in GCC 9.1 in parameter alignment with bit-fields, I noticed
that the GCC 9.1 warning is not emitted in all the cases where it
should be.  This patch fixes that and the next patch in the series
fixes the GCC 9.1 break.

I split this into two patches since patch #2 introduces a new ABI
break starting with GCC 13.1.  This way, patch #1 can be back-ported to
release branches if needed to fix the GCC 9.1 warning issue.

The fix in aarch64_layout_arg highlights the bug fixed by patch #2:
GCC 9 should not have changed behavior for nregs==1, and this patch
makes it warn so as to be consistent.

The part of the fix in aarch64_function_arg_boundary (replacing & with
&&) looks like an oversight of a previous commit in this area which
changed 'abi_break' from a boolean to an integer.

2022-06-16  Christophe Lyon  

gcc/ChangeLog:

* config/aarch64/aarch64.cc (aarch64_layout_arg): Fix warning
emission.
(aarch64_function_arg_boundary): Fix typo.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/bitfield-abi-warning-align16-O0.c: New test.
* gcc.target/aarch64/bitfield-abi-warning-align16-O2.c: New test.
* gcc.target/aarch64/bitfield-abi-warning-align8-O0.c: New test.
* gcc.target/aarch64/bitfield-abi-warning-align8-O2.c: New test.
* gcc.target/aarch64/bitfield-abi-warning.h: New test.
---
 gcc/config/aarch64/aarch64.cc |  20 ++-
 .../aarch64/bitfield-abi-warning-align16-O0.c |  81 
 .../aarch64/bitfield-abi-warning-align16-O2.c |  86 
 .../aarch64/bitfield-abi-warning-align8-O0.c  |   7 +
 .../aarch64/bitfield-abi-warning-align8-O2.c  |  16 +++
 .../gcc.target/aarch64/bitfield-abi-warning.h | 125 ++
 6 files changed, 329 insertions(+), 6 deletions(-)
 create mode 100644 
gcc/testsuite/gcc.target/aarch64/bitfield-abi-warning-align16-O0.c
 create mode 100644 
gcc/testsuite/gcc.target/aarch64/bitfield-abi-warning-align16-O2.c
 create mode 100644 
gcc/testsuite/gcc.target/aarch64/bitfield-abi-warning-align8-O0.c
 create mode 100644 
gcc/testsuite/gcc.target/aarch64/bitfield-abi-warning-align8-O2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/bitfield-abi-warning.h

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index d049f9a9819..13984e3435b 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -7453,23 +7453,31 @@ aarch64_layout_arg (cumulative_args_t pcum_v, const 
function_arg_info )
   if (allocate_ncrn && (ncrn + nregs <= NUM_ARG_REGS))
 {
   gcc_assert (nregs == 0 || nregs == 1 || nregs == 2);
+  unsigned int alignment;
 
   /* C.8 if the argument has an alignment of 16 then the NGRN is
 rounded up to the next even number.  */
-  if (nregs == 2
- && ncrn % 2
+  alignment = aarch64_function_arg_alignment (mode, type, _break);
+  if (ncrn % 2
  /* The == 16 * BITS_PER_UNIT instead of >= 16 * BITS_PER_UNIT
 comparison is there because for > 16 * BITS_PER_UNIT
 alignment nregs should be > 2 and therefore it should be
 passed by reference rather than value.  */
- && (aarch64_function_arg_alignment (mode, type, _break)
+ && (alignment
  == 16 * BITS_PER_UNIT))
{
+ /* We want to emit a warning even if nregs == 1, because
+although we do not round ncrn up in this case, the callee
+has a different (broken) expectation.  */
  if (abi_break && warn_psabi && currently_expanding_gimple_stmt)
inform (input_location, "parameter passing for argument of type "
"%qT changed in GCC 9.1", type);
- ++ncrn;
- gcc_assert (ncrn + nregs <= NUM_ARG_REGS);
+
+ if (ncrn % 2)
+   {
+ ++ncrn;
+ gcc_assert (ncrn + nregs <= NUM_ARG_REGS);
+   }
}
 
   /* If an argument with an SVE mode needs to be shifted up to the
@@ -7648,7 +7656,7 @@ aarch64_function_arg_boundary (machine_mode mode, 
const_tree type)
   unsigned int alignment = aarch64_function_arg_alignment (mode, type,
   _break);
   alignment = MIN (MAX (alignment, PARM_BOUNDARY), STACK_BOUNDARY);
-  if (abi_break & warn_psabi)
+  if (abi_break && warn_psabi)
 {
   abi_break = MIN (MAX (abi_break, PARM_BOUNDARY), STACK_BOUNDARY);
   if (alignment != abi_break)
diff --git a/gcc/testsuite/gcc.target/aarch64/bitfield-abi-warning-align16-O0.c 
b/gcc/testsuite/gcc.target/aarch64/bitfield-abi-warning-align16-O0.c
new file mode 100644
index 000..0a1f54acb2d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/bitfield-abi-warning-align16-O0.c
@@ -0,0 +1,81 @@
+/* { dg-do compile } */
+/* { dg-options "-O0 -save-temps" } */
+
+#define ALIGN 16
+#define EXTRA
+
+#include "bitfield-abi-warning.h"
+
+/* 

[PATCH][pushed] docs: add missing table header

2022-06-17 Thread Martin Liška
libgomp/ChangeLog:

* libgomp.texi: Add table header for new features of
OpenMP 5.2.
---
 libgomp/libgomp.texi | 1 +
 1 file changed, 1 insertion(+)

diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
index a5e54456746..2c4622c1092 100644
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -353,6 +353,7 @@ The OpenMP 4.5 specification is fully supported.
 @unnumberedsubsec New features listed in Appendix B of the OpenMP specification
 
 @multitable @columnfractions .60 .10 .25
+@headitem Description @tab Status @tab Comments
 @item @code{omp_in_explicit_task} routine and @emph{implicit-task-var} ICV
   @tab N @tab
 @item @code{omp}/@code{ompx}/@code{omx} sentinels and @code{omp_}/@code{ompx_}
-- 
2.36.1



Re: [PATCH][AArch64] Implement ACLE Data Intrinsics

2022-06-17 Thread Richard Sandiford via Gcc-patches
"Andre Vieira (lists)"  writes:
> Hi,
>
> This patch adds support for the ACLE Data Intrinsics to the AArch64 port.
>
> Bootstrapped and regression tested on aarch64-none-linux.
>
> OK for trunk?

Sorry for the slow review.

>
> gcc/ChangeLog:
>
> 2022-06-10  Andre Vieira  
>
>      * config/aarch64/aarch64.md (rbit2): Rename this ...
>      (@aarch64_rbit): ... this and change it in...
>      (ffs2,ctz2): ... here.
>      (@aarch64_rev16): New.
>      * config/aarch64/aarch64-builtins.cc: (aarch64_builtins):
>      Define the following enum AARCH64_REV16, AARCH64_REV16L, 
> AARCH64_REV16LL,
>      AARCH64_RBIT, AARCH64_RBITL, AARCH64_RBITLL.
>      (aarch64_init_data_intrinsics): New.
>      (handle_arm_acle_h): Add call to aarch64_init_data_intrinsics.
>      (aarch64_expand_builtin_data_intrinsic): New.
>      (aarch64_general_expand_builtin): Add call to 
> aarch64_expand_builtin_data_intrinsic.
>      * config/aarch64/arm_acle.h (__clz, __clzl, __clzll, __cls, 
> __clsl, __clsll, __rbit,
>      __rbitl, __rbitll, __rev, __revl, __revll, __rev16, __rev16l, 
> __rev16ll, __ror, __rorl,
>      __rorll, __revsh): New.
>
> gcc/testsuite/ChangeLog:
>
> 2022-06-10  Andre Vieira  
>
>      * gcc.target/aarch64/acle/data-intrinsics.c: New test.
>
> diff --git a/gcc/config/aarch64/aarch64-builtins.cc 
> b/gcc/config/aarch64/aarch64-builtins.cc
> index 
> e0a741ac663188713e21f457affa57217d074783..91a687dee13a27c21f0c50de9ba777aa900d6096
>  100644
> --- a/gcc/config/aarch64/aarch64-builtins.cc
> +++ b/gcc/config/aarch64/aarch64-builtins.cc
> @@ -613,6 +613,12 @@ enum aarch64_builtins
>AARCH64_LS64_BUILTIN_ST64B,
>AARCH64_LS64_BUILTIN_ST64BV,
>AARCH64_LS64_BUILTIN_ST64BV0,
> +  AARCH64_REV16,
> +  AARCH64_REV16L,
> +  AARCH64_REV16LL,
> +  AARCH64_RBIT,
> +  AARCH64_RBITL,
> +  AARCH64_RBITLL,
>AARCH64_BUILTIN_MAX
>  };
>  
> @@ -1664,10 +1670,41 @@ aarch64_init_ls64_builtins (void)
>= aarch64_general_add_builtin (data[i].name, data[i].type, 
> data[i].code);
>  }
>  
> +static void
> +aarch64_init_data_intrinsics (void)
> +{
> +  tree uint32_fntype = build_function_type_list (uint32_type_node,
> +  uint32_type_node, NULL_TREE);
> +  tree long_fntype = build_function_type_list (long_unsigned_type_node,
> +long_unsigned_type_node,
> +NULL_TREE);

Very minor, but ulong_fntype might be clearer, since the other two
variable names are explicitly unsigned.

> +  tree uint64_fntype = build_function_type_list (uint64_type_node,
> +  uint64_type_node, NULL_TREE);
> +  aarch64_builtin_decls[AARCH64_REV16]
> += aarch64_general_add_builtin ("__builtin_aarch64_rev16", uint32_fntype,
> +AARCH64_REV16);
> +  aarch64_builtin_decls[AARCH64_REV16L]
> += aarch64_general_add_builtin ("__builtin_aarch64_rev16l", long_fntype,
> +AARCH64_REV16L);
> +  aarch64_builtin_decls[AARCH64_REV16LL]
> += aarch64_general_add_builtin ("__builtin_aarch64_rev16ll", 
> uint64_fntype,
> +AARCH64_REV16LL);
> +  aarch64_builtin_decls[AARCH64_RBIT]
> += aarch64_general_add_builtin ("__builtin_aarch64_rbit", uint32_fntype,
> +AARCH64_RBIT);
> +  aarch64_builtin_decls[AARCH64_RBITL]
> += aarch64_general_add_builtin ("__builtin_aarch64_rbitl", long_fntype,
> +AARCH64_RBITL);
> +  aarch64_builtin_decls[AARCH64_RBITLL]
> += aarch64_general_add_builtin ("__builtin_aarch64_rbitll", uint64_fntype,
> +AARCH64_RBITLL);
> +}
> +
>  /* Implement #pragma GCC aarch64 "arm_acle.h".  */
>  void
>  handle_arm_acle_h (void)
>  {
> +  aarch64_init_data_intrinsics ();
>if (TARGET_LS64)
>  aarch64_init_ls64_builtins ();
>  }
> @@ -2393,6 +2430,32 @@ aarch64_expand_builtin_memtag (int fcode, tree exp, 
> rtx target)
>emit_insn (pat);
>return target;
>  }
> +/* Function to expand an expression EXP which calls one of the ACLE Data
> +   Intrinsic builtins FCODE with the result going to TARGET.  */
> +static rtx
> +aarch64_expand_builtin_data_intrinsic (unsigned int fcode, tree exp, rtx 
> target)
> +{
> +  rtx op0 = expand_normal (CALL_EXPR_ARG (exp, 0));
> +  machine_mode mode = GET_MODE (op0);
> +  rtx pat;
> +  switch (fcode)
> +{
> +case AARCH64_REV16:
> +case AARCH64_REV16L:
> +case AARCH64_REV16LL:
> +  pat = gen_aarch64_rev16 (mode, target, op0);

Does this work when op0 is a constant or comes from memory?
Same for when target is a memory.  E.g. does:

void test_rev16 (uint32_t *ptr)
{
  *ptr = __rev16 (*ptr);
}

work?

It'd be more robust to use the expand_insn interface instead;
see aarch64_expand_builtin_ls64 for an example.
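
A rough sketch of that (illustrative only; it assumes the
@aarch64_rev16<mode> pattern added by the patch provides
code_for_aarch64_rev16, and takes the mode from the call's return type
rather than from op0 so that constants work too):

  machine_mode mode = TYPE_MODE (TREE_TYPE (exp));
  expand_operand ops[2];
  create_output_operand (&ops[0], target, mode);
  create_input_operand (&ops[1], op0, mode);
  expand_insn (code_for_aarch64_rev16 (mode), 2, ops);
  return ops[0].value;

That lets the expander legitimize memory and constant operands instead
of requiring register operands up front.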

> +  break;
> +case AARCH64_RBIT:
> 

[PATCH] vect: Respect slp decision when applying suggested uf [PR105940]

2022-06-17 Thread Kewen.Lin via Gcc-patches
Hi,

This follows Richi's suggestion in PR105940; it aims to avoid an
inconsistent slp decision between when the suggested unroll
factor is worked out and when the suggested unroll factor is
applied.

If slp was enabled when the suggested unroll factor was worked
out, then when applying the unroll factor we don't need to start
over with slp off if the analysis with slp on fails.  On the
other hand, if slp was disabled when the suggested unroll factor
was worked out, then when applying the unroll factor we can skip
the slp handling.

Function vect_is_simple_reduction saves reduction chains for
subsequent slp analyses, we have to disable this early otherwise
there is an ICE in vectorizable_reduction for below:

  if (REDUC_GROUP_FIRST_ELEMENT (stmt_info))
gcc_assert (slp_node
&& REDUC_GROUP_FIRST_ELEMENT (stmt_info)
   == stmt_info);

Bootstrapped and regtested on x86_64-redhat-linux,
powerpc64{,le}-linux-gnu and aarch64-linux-gnu.

Also tested with SPEC2017 build with some rs6000 hacking.

Is it ok for trunk?

BR,
Kewen
-

PR tree-optimization/105940

gcc/ChangeLog:

* tree-vect-loop.cc (vect_analyze_loop_2): Add new parameter
slp_done_for_suggested_uf and adjust with it accordingly.
(vect_analyze_loop_1): Add new variable slp_done_for_suggested_uf,
pass it down to vect_analyze_loop_2 for the initial analysis and
applying suggested unroll factor.
(vect_is_simple_reduction): Add parameter slp and adjust with it.
(vect_analyze_scalar_cycles_1): Add parameter slp and pass down.
(vect_analyze_scalar_cycles): Likewise.
---
 gcc/tree-vect-loop.cc | 101 --
 1 file changed, 67 insertions(+), 34 deletions(-)

diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index e05f8e87f7d..ccab68caf9a 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -157,7 +157,7 @@ along with GCC; see the file COPYING3.  If not see
 static void vect_estimate_min_profitable_iters (loop_vec_info, int *, int *,
unsigned *);
 static stmt_vec_info vect_is_simple_reduction (loop_vec_info, stmt_vec_info,
-  bool *, bool *);
+  bool *, bool *, bool);

 /* Subroutine of vect_determine_vf_for_stmt that handles only one
statement.  VECTYPE_MAYBE_SET_P is true if STMT_VINFO_VECTYPE
@@ -463,10 +463,12 @@ vect_inner_phi_in_double_reduction_p (loop_vec_info 
loop_vinfo, gphi *phi)
Examine the cross iteration def-use cycles of scalar variables
in LOOP.  LOOP_VINFO represents the loop that is now being
considered for vectorization (can be LOOP, or an outer-loop
-   enclosing LOOP).  */
+   enclosing LOOP).  SLP indicates there will be some subsequent
+   slp analyses or not.  */

 static void
-vect_analyze_scalar_cycles_1 (loop_vec_info loop_vinfo, class loop *loop)
+vect_analyze_scalar_cycles_1 (loop_vec_info loop_vinfo, class loop *loop,
+ bool slp)
 {
   basic_block bb = loop->header;
   tree init, step;
@@ -545,7 +547,7 @@ vect_analyze_scalar_cycles_1 (loop_vec_info loop_vinfo, 
class loop *loop)

   stmt_vec_info reduc_stmt_info
= vect_is_simple_reduction (loop_vinfo, stmt_vinfo, _reduc,
-   _chain);
+   _chain, slp);
   if (reduc_stmt_info)
 {
  STMT_VINFO_REDUC_DEF (stmt_vinfo) = reduc_stmt_info;
@@ -616,11 +618,11 @@ vect_analyze_scalar_cycles_1 (loop_vec_info loop_vinfo, 
class loop *loop)
  a[i] = i;  */

 static void
-vect_analyze_scalar_cycles (loop_vec_info loop_vinfo)
+vect_analyze_scalar_cycles (loop_vec_info loop_vinfo, bool slp)
 {
   class loop *loop = LOOP_VINFO_LOOP (loop_vinfo);

-  vect_analyze_scalar_cycles_1 (loop_vinfo, loop);
+  vect_analyze_scalar_cycles_1 (loop_vinfo, loop, slp);

   /* When vectorizing an outer-loop, the inner-loop is executed sequentially.
  Reductions in such inner-loop therefore have different properties than
@@ -632,7 +634,7 @@ vect_analyze_scalar_cycles (loop_vec_info loop_vinfo)
 current checks are too strict.  */

   if (loop->inner)
-vect_analyze_scalar_cycles_1 (loop_vinfo, loop->inner);
+vect_analyze_scalar_cycles_1 (loop_vinfo, loop->inner, slp);
 }

 /* Transfer group and reduction information from STMT_INFO to its
@@ -2223,12 +2225,18 @@ vect_determine_partial_vectors_and_peeling 
(loop_vec_info loop_vinfo,

 /* Function vect_analyze_loop_2.

-   Apply a set of analyses on LOOP, and create a loop_vec_info struct
-   for it.  The different analyses will record information in the
-   loop_vec_info struct.  */
+   Apply a set of analyses on LOOP specified by LOOP_VINFO, the different
+   analyses will record information in some members of LOOP_VINFO.  FATAL
+   indicates if some analysis meets 

[committed] arm: mve: Don't force trivial vector literals to the pool

2022-06-17 Thread Richard Earnshaw via Gcc-patches

A bug in the ordering of the operands in the mve_mov pattern
meant that all literal values were being pushed to the literal pool.
This patch fixes that and simplifies some of the logic slightly so
that we can use a simple switch statement.

For example:
void f (uint32_t *a)
{
  int i;
  for (i = 0; i < 100; i++)
a[i] += 1;
}

Now compiles to:
push {lr}
mov lr, #25
vmov.i32 q2, #0x1  @ v4si
...

instead of

push {lr}
mov lr, #25
vldr.64 d4, .L6
vldr.64 d5, .L6+8
...
.L7:
.align  3
.L6:
.word   1
.word   1
.word   1
.word   1

gcc/ChangeLog:
* config/arm/mve.md (*mve_mov): Re-order constraints
to avoid spilling trivial literals to the constant pool.

gcc/testsuite/ChangeLog:
* gcc.target/arm/acle/cde-mve-full-assembly.c: Adjust expected
output.
---
 gcc/config/arm/mve.md |  99 ++--
 .../arm/acle/cde-mve-full-assembly.c  | 549 --
 2 files changed, 311 insertions(+), 337 deletions(-)

diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index f16991c0a34..c4dec01baac 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -18,66 +18,73 @@
 ;; .
 
 (define_insn "*mve_mov"
-  [(set (match_operand:MVE_types 0 "nonimmediate_operand" "=w,w,r,w,w,r,w,Ux,w")
-	(match_operand:MVE_types 1 "general_operand" "w,r,w,Dn,UxUi,r,Dm,w,Ul"))]
+  [(set (match_operand:MVE_types 0 "nonimmediate_operand" "=w,w,r,w   , w,   r,Ux,w")
+	(match_operand:MVE_types 1 "general_operand"  " w,r,w,DnDm,UxUi,r,w, Ul"))]
   "TARGET_HAVE_MVE || TARGET_HAVE_MVE_FLOAT"
 {
-  if (which_alternative == 3 || which_alternative == 6)
+  switch (which_alternative)
 {
-  int width, is_valid;
-  static char templ[40];
+case 0:  /* [w,w].  */
+  return "vmov\t%q0, %q1";
 
-  is_valid = simd_immediate_valid_for_move (operands[1], mode,
-	[1], );
+case 1:  /* [w,r].  */
+  return "vmov\t%e0, %Q1, %R1  %@ \;vmov\t%f0, %J1, %K1";
+
+case 2:  /* [r,w].  */
+  return "vmov\t%Q0, %R0, %e1  %@ \;vmov\t%J0, %K0, %f1";
+
+case 3:  /* [w,DnDm].  */
+  {
+	int width, is_valid;
+
+	is_valid = simd_immediate_valid_for_move (operands[1], mode,
+		  [1], );
+
+	gcc_assert (is_valid);
+
+	if (width == 0)
+	  return "vmov.f32\t%q0, %1  %@ ";
+	else
+	  {
+	const int templ_size = 40;
+	static char templ[templ_size];
+	if (snprintf (templ, templ_size,
+			  "vmov.i%d\t%%q0, %%x1  %%@ ", width)
+		> templ_size)
+	  abort ();
+	return templ;
+	  }
+  }
+
+case 4:  /* [w,UxUi].  */
+  if (mode == V2DFmode || mode == V2DImode
+	  || mode == TImode)
+	return "vldrw.u32\t%q0, %E1";
+  else
+	return "vldr.\t%q0, %E1";
 
-  gcc_assert (is_valid != 0);
+case 5:  /* [r,r].  */
+  return output_move_quad (operands);
 
-  if (width == 0)
-	return "vmov.f32\t%q0, %1  @ ";
+case 6:  /* [Ux,w].  */
+  if (mode == V2DFmode || mode == V2DImode
+	  || mode == TImode)
+	return "vstrw.32\t%q1, %E0";
   else
-	sprintf (templ, "vmov.i%d\t%%q0, %%x1  @ ", width);
-  return templ;
-}
+	return "vstr.\t%q1, %E0";
 
-  if (which_alternative == 4 || which_alternative == 7)
-{
-  if (mode == V2DFmode || mode == V2DImode || mode == TImode)
-	{
-	  if (which_alternative == 7)
-	output_asm_insn ("vstrw.32\t%q1, %E0", operands);
-	  else
-	output_asm_insn ("vldrw.u32\t%q0, %E1",operands);
-	}
-  else
-	{
-	  if (which_alternative == 7)
-	output_asm_insn ("vstr.\t%q1, %E0", operands);
-	  else
-	output_asm_insn ("vldr.\t%q0, %E1", operands);
-	}
-  return "";
-}
-  switch (which_alternative)
-{
-case 0:
-  return "vmov\t%q0, %q1";
-case 1:
-  return "vmov\t%e0, %Q1, %R1  @ \;vmov\t%f0, %J1, %K1";
-case 2:
-  return "vmov\t%Q0, %R0, %e1  @ \;vmov\t%J0, %K0, %f1";
-case 5:
-  return output_move_quad (operands);
-case 8:
+case 7:  /* [w,Ul].  */
 	return output_move_neon (operands);
+
 default:
   gcc_unreachable ();
   return "";
 }
 }
-  [(set_attr "type" "mve_move,mve_move,mve_move,mve_move,mve_load,multiple,mve_move,mve_store,mve_load")
-   (set_attr "length" "4,8,8,4,8,8,4,4,4")
-   (set_attr "thumb2_pool_range" "*,*,*,*,1018,*,*,*,*")
-   (set_attr "neg_pool_range" "*,*,*,*,996,*,*,*,*")])
+  [(set_attr "type" "mve_move,mve_move,mve_move,mve_move,mve_load,multiple,mve_store,mve_load")
+   (set_attr "length" "4,8,8,4,4,8,4,8")
+   (set_attr "thumb2_pool_range" "*,*,*,*,1018,*,*,*")
+   (set_attr "neg_pool_range" "*,*,*,*,996,*,*,*")])
 
 (define_insn "*mve_vdup"
   [(set (match_operand:MVE_vecs 0 "s_register_operand" "=w")
diff --git a/gcc/testsuite/gcc.target/arm/acle/cde-mve-full-assembly.c b/gcc/testsuite/gcc.target/arm/acle/cde-mve-full-assembly.c
index 501cc84da10..d025c3391fb 100644
--- 

Re: [PATCH] varasm: Fix up ICE in narrowing_initializer_constant_valid_p [PR105998]

2022-06-17 Thread Jakub Jelinek via Gcc-patches
On Fri, Jun 17, 2022 at 10:37:45AM +0200, Richard Biener wrote:
> > --- gcc/varasm.cc.jj2022-06-06 12:18:12.792812888 +0200
> > +++ gcc/varasm.cc2022-06-17 09:49:21.918029072 +0200
> > @@ -4716,7 +4716,8 @@ narrowing_initializer_constant_valid_p (
> > {
> >   tree inner = TREE_OPERAND (op0, 0);
> >   if (inner == error_mark_node
> > -  || ! INTEGRAL_MODE_P (TYPE_MODE (TREE_TYPE (inner)))
> > +  || VECTOR_TYPE_P (TREE_TYPE (inner))
> 
> Do we really want to allow all integer modes here regardless of a
> composite type (record type for example)?  I’d say !INTEGRAL_TYPE_P would
> match the rest better.  OTOH if we want to allow integer modes I fail to
> see why to exclude vector types (but not complex, etc)

I've excluded VECTOR_TYPE_P because those are the only types for which
TYPE_MODE can be different from the raw type mode (so, SCALAR_INT_MODE_P
was true but SCALAR_INT_TYPE_MODE still ICEd).

Checking for INTEGRAL_TYPE_P seems reasonable to me though,
and I'd say we also want to check the outer type too because nothing
really checks it (at least for the first iteration, 2nd and further
get it from checking of inner in the previous iteration).

So like this if it passes bootstrap/regtest?

2022-06-17  Jakub Jelinek  

PR middle-end/105998
* varasm.cc (narrowing_initializer_constant_valid_p): Check
SCALAR_INT_MODE_P instead of INTEGRAL_MODE_P, also break on
! INTEGRAL_TYPE_P and do the same check also on op{0,1}'s type.

* c-c++-common/pr105998.c: New test.

--- gcc/varasm.cc.jj2022-06-17 11:07:57.883679019 +0200
+++ gcc/varasm.cc   2022-06-17 11:10:09.190932417 +0200
@@ -4716,7 +4716,10 @@ narrowing_initializer_constant_valid_p (
 {
   tree inner = TREE_OPERAND (op0, 0);
   if (inner == error_mark_node
- || ! INTEGRAL_MODE_P (TYPE_MODE (TREE_TYPE (inner)))
+ || ! INTEGRAL_TYPE_P (TREE_TYPE (op0))
+ || ! SCALAR_INT_MODE_P (TYPE_MODE (TREE_TYPE (op0)))
+ || ! INTEGRAL_TYPE_P (TREE_TYPE (inner))
+ || ! SCALAR_INT_MODE_P (TYPE_MODE (TREE_TYPE (inner)))
  || (GET_MODE_SIZE (SCALAR_INT_TYPE_MODE (TREE_TYPE (op0)))
  > GET_MODE_SIZE (SCALAR_INT_TYPE_MODE (TREE_TYPE (inner)
break;
@@ -4728,7 +4731,10 @@ narrowing_initializer_constant_valid_p (
 {
   tree inner = TREE_OPERAND (op1, 0);
   if (inner == error_mark_node
- || ! INTEGRAL_MODE_P (TYPE_MODE (TREE_TYPE (inner)))
+ || ! INTEGRAL_TYPE_P (TREE_TYPE (op1))
+ || ! SCALAR_INT_MODE_P (TYPE_MODE (TREE_TYPE (op1)))
+ || ! INTEGRAL_TYPE_P (TREE_TYPE (inner))
+ || ! SCALAR_INT_MODE_P (TYPE_MODE (TREE_TYPE (inner)))
  || (GET_MODE_SIZE (SCALAR_INT_TYPE_MODE (TREE_TYPE (op1)))
  > GET_MODE_SIZE (SCALAR_INT_TYPE_MODE (TREE_TYPE (inner)
break;
--- gcc/testsuite/c-c++-common/pr105998.c.jj2022-06-17 11:09:11.196703834 
+0200
+++ gcc/testsuite/c-c++-common/pr105998.c   2022-06-17 11:09:11.196703834 
+0200
@@ -0,0 +1,12 @@
+/* PR middle-end/105998 */
+
+typedef int __attribute__((__vector_size__ (sizeof (long long)))) V;
+
+V v;
+
+long long
+foo (void)
+{
+  long long l = (long long) ((0 | v) - ((V) { } == 0));
+  return l;
+}


Jakub



[PATCH v3] tree-optimization/94899: Remove "+ 0x80000000" in int comparisons

2022-06-17 Thread Arjun Shankar via Gcc-patches
Expressions of the form "X + CST < Y + CST" where:

* CST is an unsigned integer constant with only the MSB set, and
* X and Y's types have integer conversion ranks <= CST's

can be simplified to "(signed) X < (signed) Y".

This is because, assuming 32-bit signed numbers,
(unsigned) INT_MIN + 0x80000000 is 0, and
(unsigned) INT_MAX + 0x80000000 is UINT_MAX.

i.e. the result increases monotonically with the signed input.

This means:
((signed) X < (signed) Y) iff (X + 0x80000000 < Y + 0x80000000)
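
For illustration (this mirrors the f_u32_u32 case in the new test below),
with 32-bit int the new rule rewrites

  return x + 0x80000000u < y + 0x80000000u;

into the equivalent of

  return (int) x < (int) y;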

gcc/
* match.pd (X + C < Y + C -> (signed) X < (signed) Y, if C is
0x80000000): New simplification.
gcc/testsuite/
* gcc.dg/pr94899.c: New test.
---
 gcc/match.pd   | 13 +
 gcc/testsuite/gcc.dg/pr94899.c | 48 ++
 2 files changed, 61 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/pr94899.c
---
v2: https://gcc.gnu.org/pipermail/gcc-patches/2022-February/589709.html

Notes on v3, based on Richard and Jakub's review comments:

1. Canonicalized the match expression to avoid having to use ":c".
2. Redefined MAGIC in the test to avoid running afoul of 16-bit int
   machines.

Richard has approved this patch for inclusion in GCC 13:
https://gcc.gnu.org/pipermail/gcc-patches/2022-February/589852.html

diff --git a/gcc/match.pd b/gcc/match.pd
index 3e9572e4c9c..ef42611854a 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -2080,6 +2080,19 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0))
&& TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)))
(op @0 @1
+
+/* As a special case, X + C < Y + C is the same as (signed) X < (signed) Y
+   when C is an unsigned integer constant with only the MSB set, and X and
+   Y have types of equal or lower integer conversion rank than C's.  */
+(for op (lt le ge gt)
+ (simplify
+  (op (plus @1 INTEGER_CST@0) (plus @2 INTEGER_CST@0))
+  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
+   && TYPE_UNSIGNED (TREE_TYPE (@0))
+   && wi::only_sign_bit_p (wi::to_wide (@0)))
+   (with { tree stype = signed_type_for (TREE_TYPE (@0)); }
+(op (convert:stype @1) (convert:stype @2))
+
 /* For equality and subtraction, this is also true with wrapping overflow.  */
 (for op (eq ne minus)
  (simplify
diff --git a/gcc/testsuite/gcc.dg/pr94899.c b/gcc/testsuite/gcc.dg/pr94899.c
new file mode 100644
index 000..685201307ec
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr94899.c
@@ -0,0 +1,48 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+typedef __INT16_TYPE__ int16_t;
+typedef __INT32_TYPE__ int32_t;
+typedef __UINT16_TYPE__ uint16_t;
+typedef __UINT32_TYPE__ uint32_t;
+
+#define MAGIC (~ (uint32_t) 0 / 2 + 1)
+
+int
+f_i16_i16 (int16_t x, int16_t y)
+{
+  return x + MAGIC < y + MAGIC;
+}
+
+int
+f_i16_i32 (int16_t x, int32_t y)
+{
+  return x + MAGIC < y + MAGIC;
+}
+
+int
+f_i32_i32 (int32_t x, int32_t y)
+{
+  return x + MAGIC < y + MAGIC;
+}
+
+int
+f_u32_i32 (uint32_t x, int32_t y)
+{
+  return x + MAGIC < y + MAGIC;
+}
+
+int
+f_u32_u32 (uint32_t x, uint32_t y)
+{
+  return x + MAGIC < y + MAGIC;
+}
+
+int
+f_i32_i32_sub (int32_t x, int32_t y)
+{
+  return x - MAGIC < y - MAGIC;
+}
+
+/* The constants above should have been optimized away.  */
+/* { dg-final { scan-tree-dump-times "2147483648" 0 "optimized"} } */
-- 
2.35.3



Re: [PATCH][wwwdocs] gcc-13: add arm star-mc1 cpu

2022-06-17 Thread Chung-Ju Wu via Gcc-patches

On 2022/06/16 23:23 UTC+8, Gerald Pfeifer wrote:

On Thu, 16 Jun 2022, Chung-Ju Wu wrote:

Recently we added arm star-mc1 cpu support to upstream:
https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596379.html

It would be great if we can describe it on gcc-13 changes.html as well.
Attached is the patch for gcc-wwwdocs repository.


Looks good to me (from the wwwdocs side), thank you!

Gerald


Hi Gerald,

Thanks for the approval!
Committed as: fbc3a3692fa2bc85cd252d114f801da202f3ed35


Hi Kyrylo,

If any other adjustment to the description is needed, just let me know and I
will make further changes accordingly. Thanks.


Regards,
jasonwucj


Re: [PATCH] varasm: Fix up ICE in narrowing_initializer_constant_valid_p [PR105998]

2022-06-17 Thread Richard Biener via Gcc-patches



> Am 17.06.2022 um 10:11 schrieb Jakub Jelinek via Gcc-patches 
> :
> 
> Hi!
> 
> The following testcase ICEs because there is NON_LVALUE_EXPR (location
> wrapper) around a VAR_DECL and has TYPE_MODE V2SImode and
> SCALAR_INT_TYPE_MODE on that ICEs.  Or for -m32 -march=i386 TYPE_MODE
> is DImode, but SCALAR_INT_TYPE_MODE still uses the raw V2SImode and ICEs
> too.
> 
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok
> for trunk?
> 
> 2022-06-17  Jakub Jelinek  
> 
>PR middle-end/105998
>* varasm.cc (narrowing_initializer_constant_valid_p): Check
>SCALAR_INT_MODE_P instead of INTEGRAL_MODE_P, also break on
>VECTOR_TYPE_P.
> 
>* c-c++-common/pr105998.c: New test.
> 
> --- gcc/varasm.cc.jj2022-06-06 12:18:12.792812888 +0200
> +++ gcc/varasm.cc2022-06-17 09:49:21.918029072 +0200
> @@ -4716,7 +4716,8 @@ narrowing_initializer_constant_valid_p (
> {
>   tree inner = TREE_OPERAND (op0, 0);
>   if (inner == error_mark_node
> -  || ! INTEGRAL_MODE_P (TYPE_MODE (TREE_TYPE (inner)))
> +  || VECTOR_TYPE_P (TREE_TYPE (inner))

Do we really want to allow all integer modes here regardless of a composite 
type (record type for example)?  I’d say !INTEGRAL_TYPE_P would match the rest 
better.  OTOH if we want to allow integer modes I fail to see why to exclude 
vector types (but not complex, etc)

> +  || ! SCALAR_INT_MODE_P (TYPE_MODE (TREE_TYPE (inner)))
>  || (GET_MODE_SIZE (SCALAR_INT_TYPE_MODE (TREE_TYPE (op0)))
>  > GET_MODE_SIZE (SCALAR_INT_TYPE_MODE (TREE_TYPE (inner)
>break;
> @@ -4728,7 +4729,8 @@ narrowing_initializer_constant_valid_p (
> {
>   tree inner = TREE_OPERAND (op1, 0);
>   if (inner == error_mark_node
> -  || ! INTEGRAL_MODE_P (TYPE_MODE (TREE_TYPE (inner)))
> +  || VECTOR_TYPE_P (TREE_TYPE (inner))
> +  || ! SCALAR_INT_MODE_P (TYPE_MODE (TREE_TYPE (inner)))
>  || (GET_MODE_SIZE (SCALAR_INT_TYPE_MODE (TREE_TYPE (op1)))
>  > GET_MODE_SIZE (SCALAR_INT_TYPE_MODE (TREE_TYPE (inner)
>break;
> --- gcc/testsuite/c-c++-common/pr105998.c.jj2022-06-16 16:37:00.094926684 
> +0200
> +++ gcc/testsuite/c-c++-common/pr105998.c2022-06-16 16:36:30.155322215 
> +0200
> @@ -0,0 +1,12 @@
> +/* PR middle-end/105998 */
> +
> +typedef int __attribute__((__vector_size__ (sizeof (long long)))) V;
> +
> +V v;
> +
> +long long
> +foo (void)
> +{
> +  long long l = (long long) ((0 | v) - ((V) { } == 0));
> +  return l;
> +}
> 
>Jakub
> 


Re: [PATCH v1 2/3] RISC-V: Split slli+sh[123]add.uw opportunities to avoid zext.w

2022-06-17 Thread Andreas Schwab
../../gcc/config/riscv/bitmanip.md: In function 'rtx_insn* gen_split_44(rtx_ins\
n*, rtx_def**)':
../../gcc/config/riscv/bitmanip.md:110:28: error: comparison of integer express\
ions of different signedness: 'int' and 'long unsigned int' [-Werror=sign-compa\
re]
  110 | if ((scale + bias) != UINTVAL (operands[2]))

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


Re: [PATCH, rs6000] Use CC for BCD operations [PR100736]

2022-06-17 Thread HAO CHEN GUI via Gcc-patches
Hi,

On 16/6/2022 下午 7:24, Segher Boessenkool wrote:
> There is no normal way to get at bit 3 of a CR field.  We can use some
> unspec though, which is total cheating but it does work, and it is
> safe, albeit sometimes suboptimal.

Thanks so much for your advice. I will use an unspec for setting reg from
the BCD overflow bit.
> 
> You shouldn't need anything like this, bcdinvalid will work just fine if
> written as bcdadd_ov (with vector of 0 as the second op)?

The vector of 0 is not equal to BCD 0, I think.  A BCD number contains a
preferred sign (PS) bit, so an all-zeros vector is itself an invalid
encoding: for instance, +0 with the preferred plus sign has all-zero digits
followed by a 0xC sign nibble, while an all-zeros vector has a 0x0 sign
nibble, which is not a valid sign code.  We may use bcdsub_ov with a
duplicated operand to implement bcdinvalid.


[PATCH] varasm: Fix up ICE in narrowing_initializer_constant_valid_p [PR105998]

2022-06-17 Thread Jakub Jelinek via Gcc-patches
Hi!

The following testcase ICEs because there is NON_LVALUE_EXPR (location
wrapper) around a VAR_DECL and has TYPE_MODE V2SImode and
SCALAR_INT_TYPE_MODE on that ICEs.  Or for -m32 -march=i386 TYPE_MODE
is DImode, but SCALAR_INT_TYPE_MODE still uses the raw V2SImode and ICEs
too.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok
for trunk?

2022-06-17  Jakub Jelinek  

PR middle-end/105998
* varasm.cc (narrowing_initializer_constant_valid_p): Check
SCALAR_INT_MODE_P instead of INTEGRAL_MODE_P, also break on
VECTOR_TYPE_P.

* c-c++-common/pr105998.c: New test.

--- gcc/varasm.cc.jj2022-06-06 12:18:12.792812888 +0200
+++ gcc/varasm.cc   2022-06-17 09:49:21.918029072 +0200
@@ -4716,7 +4716,8 @@ narrowing_initializer_constant_valid_p (
 {
   tree inner = TREE_OPERAND (op0, 0);
   if (inner == error_mark_node
- || ! INTEGRAL_MODE_P (TYPE_MODE (TREE_TYPE (inner)))
+ || VECTOR_TYPE_P (TREE_TYPE (inner))
+ || ! SCALAR_INT_MODE_P (TYPE_MODE (TREE_TYPE (inner)))
  || (GET_MODE_SIZE (SCALAR_INT_TYPE_MODE (TREE_TYPE (op0)))
  > GET_MODE_SIZE (SCALAR_INT_TYPE_MODE (TREE_TYPE (inner)
break;
@@ -4728,7 +4729,8 @@ narrowing_initializer_constant_valid_p (
 {
   tree inner = TREE_OPERAND (op1, 0);
   if (inner == error_mark_node
- || ! INTEGRAL_MODE_P (TYPE_MODE (TREE_TYPE (inner)))
+ || VECTOR_TYPE_P (TREE_TYPE (inner))
+ || ! SCALAR_INT_MODE_P (TYPE_MODE (TREE_TYPE (inner)))
  || (GET_MODE_SIZE (SCALAR_INT_TYPE_MODE (TREE_TYPE (op1)))
  > GET_MODE_SIZE (SCALAR_INT_TYPE_MODE (TREE_TYPE (inner)
break;
--- gcc/testsuite/c-c++-common/pr105998.c.jj2022-06-16 16:37:00.094926684 
+0200
+++ gcc/testsuite/c-c++-common/pr105998.c   2022-06-16 16:36:30.155322215 
+0200
@@ -0,0 +1,12 @@
+/* PR middle-end/105998 */
+
+typedef int __attribute__((__vector_size__ (sizeof (long long)))) V;
+
+V v;
+
+long long
+foo (void)
+{
+  long long l = (long long) ((0 | v) - ((V) { } == 0));
+  return l;
+}

Jakub



[PATCH] c++: Use fold_non_dependent_expr rather than maybe_constant_value in __builtin_shufflevector handling [PR106001]

2022-06-17 Thread Jakub Jelinek via Gcc-patches
Hi!

In this case the STATIC_CAST_EXPR expressions in the call aren't
type nor value dependent, but maybe_constant_value still ICEs on those
when processing_template_decl.  Calling fold_non_dependent_expr on it
instead fixes the ICE and folds them to INTEGER_CSTs.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2022-06-17  Jakub Jelinek  

PR c++/106001
* typeck.cc (build_x_shufflevector): Use fold_non_dependent_expr
instead of maybe_constant_value.

* g++.dg/ext/builtin-shufflevector-4.C: New test.

--- gcc/cp/typeck.cc.jj 2022-06-04 10:34:26.261505682 +0200
+++ gcc/cp/typeck.cc2022-06-16 19:38:04.397979247 +0200
@@ -6344,7 +6344,7 @@ build_x_shufflevector (location_t loc, v
   auto_vec mask;
   for (unsigned i = 2; i < args->length (); ++i)
 {
-  tree idx = maybe_constant_value ((*args)[i]);
+  tree idx = fold_non_dependent_expr ((*args)[i], complain);
   mask.safe_push (idx);
 }
   tree exp = c_build_shufflevector (loc, arg0, arg1, mask, complain & 
tf_error);
--- gcc/testsuite/g++.dg/ext/builtin-shufflevector-4.C.jj   2022-06-16 
19:43:13.103935249 +0200
+++ gcc/testsuite/g++.dg/ext/builtin-shufflevector-4.C  2022-06-16 
19:42:37.534401207 +0200
@@ -0,0 +1,18 @@
+// PR c++/106001
+// { dg-do compile }
+
+typedef int V __attribute__((vector_size (2 * sizeof (int))));
+
+template 
+void
+foo ()
+{
+  V v = {};
+  v = __builtin_shufflevector (v, v, static_cast(1), 
static_cast(0));
+}
+
+void
+bar ()
+{
+  foo <0> ();
+}

Jakub