Add stub 'gcc/rust/ChangeLog' (was: Prepare 'contrib/gcc-changelog/git_commit.py' for GCC/Rust)

2022-12-09 Thread Thomas Schwinge
Hi!

On 2022-12-10T07:39:24+0100, I wrote:
> I've pushed "Prepare 'contrib/gcc-changelog/git_commit.py' for GCC/Rust"
> to master branch in commit 325529e21e81fbc3561d2568cb7e8a26296e5b2f, see
> attached.
>
> Please let me know if there is anything that I need to do to actually
> generate the empty 'gcc/rust/ChangeLog' file.

I've now been informed via non-public email that there is indeed a
manual step involved; pushed "Add stub 'gcc/rust/ChangeLog'" to master
branch in commit 24ff0b3e0c41e3997fb4c11736b8a412afbaadf3, see attached.

> (For avoidance of doubt: yes, only 'gcc/rust/' at this time.)


Regards
 Thomas


-
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634
Munich; Limited liability company; Managing Directors: Thomas
Heurung, Frank Thürauf; Registered office: Munich; Commercial register:
Munich, HRB 106955
From 24ff0b3e0c41e3997fb4c11736b8a412afbaadf3 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Sat, 10 Dec 2022 08:33:22 +0100
Subject: [PATCH] Add stub 'gcc/rust/ChangeLog'

---
 gcc/rust/ChangeLog | 6 ++
 1 file changed, 6 insertions(+)
 create mode 100644 gcc/rust/ChangeLog

diff --git a/gcc/rust/ChangeLog b/gcc/rust/ChangeLog
new file mode 100644
index 000000000000..3a4f03c28af8
--- /dev/null
+++ b/gcc/rust/ChangeLog
@@ -0,0 +1,6 @@
+
+Copyright (C) 2022 Free Software Foundation, Inc.
+
+Copying and distribution of this file, with or without modification,
+are permitted in any medium without royalty provided the copyright
+notice and this notice are preserved.
-- 
2.35.1



Re: [PATCH] Backport gcc-12: jobserver FIFO support

2022-12-09 Thread Xi Ruoyao via Gcc-patches
On Fri, 2022-12-09 at 11:07 +0100, Martin Liška wrote:
> Hi.
> 
> As make 4.4 has been released, it switches to FIFO by default. That causes
> trouble for the latest GCC release, version 12. Right now, we've been using
> the following 4 patches in the openSUSE gcc12 package:
> 
> 1270ccda70ca09f7d4fe76b5156dca8992bd77a6
> 53e3b2bf16a486c15c20991c6095f7be09012b55
> fed766af32ed6cd371016cc24e931131e19b4eb1
> 3f1c2f89f6b8b8d23a9072f8549b0a2c1de06b03
> 
> Would it be fine to backport them to the gcc-12 branch? Arsen asked me about
> this today, as Gentoo people want it as well.

I'd vote +1: I've applied them to a GCC 12.2 build and used it to
build many packages with -flto=auto.  GCC seems to communicate with
make 4.4 correctly with these patches.

> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> 
> Ready to be installed?
> Thanks,
> Martin

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


Prepare 'contrib/gcc-changelog/git_commit.py' for GCC/Rust (was: Rust front-end patches v4)

2022-12-09 Thread Thomas Schwinge
Hi!

On 2022-12-06T12:03:56+0100, Richard Biener via Gcc-patches wrote:
> On Tue, Dec 6, 2022 at 11:11 AM  wrote:
>> This patchset contains the fixed version of our most recent patchset. [...]
>
> Thanks a lot - this is OK to merge now

Hey, hey!  :-)


Still working on some final edits to make the Git commits comply with GCC
policies, but hopefully ready to push soon.


I've pushed "Prepare 'contrib/gcc-changelog/git_commit.py' for GCC/Rust"
to master branch in commit 325529e21e81fbc3561d2568cb7e8a26296e5b2f, see
attached.

Please let me know if there is anything that I need to do to actually
generate the empty 'gcc/rust/ChangeLog' file.

(For avoidance of doubt: yes, only 'gcc/rust/' at this time.)
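
For readers unfamiliar with why a directory has to be listed before it can carry its own ChangeLog, here is a minimal sketch of a "deepest known directory wins" lookup. It is only an illustration and is not the actual logic in git_commit.py; the function name, the reduced location set, and the example paths are hypothetical.

```python
# Hypothetical, reduced set of ChangeLog locations; only the idea that
# 'gcc/rust' sits next to entries such as 'gcc/po' and 'gcc/testsuite'
# is taken from the patch.
CHANGELOG_LOCATIONS = {"", "contrib", "gcc", "gcc/po", "gcc/rust", "gcc/testsuite"}


def changelog_dir_for(path, locations=CHANGELOG_LOCATIONS):
    """Return the deepest listed directory that contains `path`."""
    parts = path.split("/")[:-1]  # drop the file name
    while parts:
        candidate = "/".join(parts)
        if candidate in locations:
            return candidate
        parts.pop()  # try the parent directory next
    return ""  # fall back to the top-level ChangeLog


# Hypothetical file path below gcc/rust/.
print(changelog_dir_for("gcc/rust/some/dir/file.cc"))
```

In this toy model, a file under gcc/rust/ would be attributed to 'gcc' if 'gcc/rust' were missing from the set.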


Regards
 Thomas


From 325529e21e81fbc3561d2568cb7e8a26296e5b2f Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Sat, 10 Dec 2022 07:27:55 +0100
Subject: [PATCH] Prepare 'contrib/gcc-changelog/git_commit.py' for GCC/Rust

	contrib/
	* gcc-changelog/git_commit.py (default_changelog_locations): Add
	'gcc/rust'.
	(bug_components): Add 'rust'.
---
 contrib/gcc-changelog/git_commit.py | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/contrib/gcc-changelog/git_commit.py b/contrib/gcc-changelog/git_commit.py
index fb1d15fd86df..aae3416e082f 100755
--- a/contrib/gcc-changelog/git_commit.py
+++ b/contrib/gcc-changelog/git_commit.py
@@ -45,6 +45,7 @@ default_changelog_locations = {
 'gcc/objc',
 'gcc/objcp',
 'gcc/po',
+'gcc/rust',
 'gcc/testsuite',
 'gnattools',
 'gotools',
@@ -122,6 +123,7 @@ bug_components = {
 'preprocessor',
 'regression',
 'rtl-optimization',
+'rust',
 'sanitizer',
 'spam',
 'target',
-- 
2.35.1



Re: Add zstd support to libbacktrace

2022-12-09 Thread Ian Lance Taylor via Gcc-patches
On Wed, Dec 7, 2022 at 4:22 PM Ian Lance Taylor  wrote:
>
> This patch adds zstd support to libbacktrace, to support the new
> linker option --compress-debug-sections=zstd.

This patch rewrites and simplifies the main zstd decompression loop
using some ideas from the reference implementation.  This speeds it up
a bit, although it still runs at about 35% of the speed of the
reference implementation.  Bootstrapped and ran libbacktrace tests on
x86_64-pc-linux-gnu.  Committed to mainline.
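
As a rough illustration of the ZSTD_ENCODE_BASELINE_BITS / ZSTD_DECODE_* macros named in the ChangeLog, the sketch below packs a baseline and a bit count into one 32-bit value. It assumes the baseline occupies the low 24 bits and the bit count the high 8 bits, which is what the shift by 24 in the patch suggests; the Python function names and the sample values are made up for this example.

```python
def encode_baseline_bits(baseline, basebits):
    """Pack a baseline (low 24 bits) and a bit count (high 8 bits)."""
    assert 0 <= baseline < (1 << 24)
    assert 0 <= basebits < (1 << 8)
    return baseline | (basebits << 24)


def decode_baseline(packed):
    """Recover the baseline from the low 24 bits."""
    return packed & 0xFFFFFF


def decode_basebits(packed):
    """Recover the number of extra bits from the high 8 bits."""
    return packed >> 24


# Arbitrary sample values, not taken from the zstd tables.
packed = encode_baseline_bits(8195, 13)
print(decode_baseline(packed), decode_basebits(packed))
```

Keeping both fields in one word means a table entry can be fetched with a single load, which is presumably part of why the decode loop gets simpler.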

Ian

* elf.c (ZSTD_TABLE_*): Use elf_zstd_fse_baseline_entry.
(ZSTD_ENCODE_BASELINE_BITS): Define.
(ZSTD_DECODE_BASELINE, ZSTD_DECODE_BASEBITS): Define.
(elf_zstd_literal_length_base): New static const array.
(elf_zstd_match_length_base): Likewise.
(struct elf_zstd_fse_baseline_entry): Define.
(elf_zstd_make_literal_baseline_fse): New static function.
(elf_zstd_make_offset_baseline_fse): Likewise.
(elf_zstd_make_match_baseline_fse): Likewise.
(print_table, main): Use elf_zstd_fse_baseline_entry.
(elf_zstd_lit_table, elf_zstd_match_table): Likewise.
(elf_zstd_offset_table): Likewise.
(struct elf_zstd_seq_decode): Likewise.  Remove use_rle and rle
fields.
(elf_zstd_unpack_seq_decode): Use elf_zstd_fse_baseline_entry,
taking a conversion function.  Convert RLE to FSE.
(elf_zstd_literal_length_baseline): Remove.
(elf_zstd_literal_length_bits): Remove.
(elf_zstd_match_length_baseline): Remove.
(elf_zstd_match_length_bits): Remove.
(elf_zstd_decompress): Use elf_zstd_fse_baseline_entry.  Rewrite
and simplify main loop.
diff --git a/libbacktrace/elf.c b/libbacktrace/elf.c
index 15e6f284db6..ece02db27f1 100644
--- a/libbacktrace/elf.c
+++ b/libbacktrace/elf.c
@@ -2610,9 +2610,9 @@ elf_zlib_inflate_and_verify (const unsigned char *pin, size_t sin,
 }
 
 /* For working memory during zstd compression, we need
-   - a literal length FSE table: 512 32-bit values == 2048 bytes
-   - a match length FSE table: 512 32-bit values == 2048 bytes
-   - a offset FSE table: 256 32-bit values == 1024 bytes
+   - a literal length FSE table: 512 64-bit values == 4096 bytes
+   - a match length FSE table: 512 64-bit values == 4096 bytes
+   - a offset FSE table: 256 64-bit values == 2048 bytes
- a Huffman tree: 2048 uint16_t values == 4096 bytes
- scratch space, one of
  - to build an FSE table: 512 uint16_t values == 1024 bytes
@@ -2620,21 +2620,24 @@ elf_zlib_inflate_and_verify (const unsigned char *pin, size_t sin,
  - buffer for literal values == 2048 bytes
 */
 
-#define ZSTD_TABLE_SIZE\
-  (2 * 512 * sizeof (struct elf_zstd_fse_entry)\
-   + 256 * sizeof (struct elf_zstd_fse_entry)  \
-   + 2048 * sizeof (uint16_t)  \
+#define ZSTD_TABLE_SIZE\
+  (2 * 512 * sizeof (struct elf_zstd_fse_baseline_entry)   \
+   + 256 * sizeof (struct elf_zstd_fse_baseline_entry) \
+   + 2048 * sizeof (uint16_t)  \
+ 2048)
 
 #define ZSTD_TABLE_LITERAL_FSE_OFFSET (0)
 
-#define ZSTD_TABLE_MATCH_FSE_OFFSET (512 * sizeof (struct elf_zstd_fse_entry))
+#define ZSTD_TABLE_MATCH_FSE_OFFSET\
+  (512 * sizeof (struct elf_zstd_fse_baseline_entry))
 
-#define ZSTD_TABLE_OFFSET_FSE_OFFSET \
-  (ZSTD_TABLE_MATCH_FSE_OFFSET + 512 * sizeof (struct elf_zstd_fse_entry))
+#define ZSTD_TABLE_OFFSET_FSE_OFFSET   \
+  (ZSTD_TABLE_MATCH_FSE_OFFSET \
+   + 512 * sizeof (struct elf_zstd_fse_baseline_entry))
 
-#define ZSTD_TABLE_HUFFMAN_OFFSET \
-  (ZSTD_TABLE_OFFSET_FSE_OFFSET + 256 * sizeof (struct elf_zstd_fse_entry))
+#define ZSTD_TABLE_HUFFMAN_OFFSET  \
+  (ZSTD_TABLE_OFFSET_FSE_OFFSET \
+   + 256 * sizeof (struct elf_zstd_fse_baseline_entry))
 
 #define ZSTD_TABLE_WORK_OFFSET \
   (ZSTD_TABLE_HUFFMAN_OFFSET + 2048 * sizeof (uint16_t))
@@ -2645,8 +2648,11 @@ elf_zlib_inflate_and_verify (const unsigned char *pin, size_t sin,
 
 struct elf_zstd_fse_entry
 {
+  /* The value that this FSE entry represents.  */
   unsigned char symbol;
+  /* The number of bits to read to determine the next state.  */
   unsigned char bits;
+  /* Add the bits to this base to get the next state.  */
   uint16_t base;
 };
 
@@ -2925,6 +2931,270 @@ elf_zstd_build_fse (const int16_t *norm, int idx, uint16_t *next,
   return 1;
 }
 
+/* Encode the baseline and bits into a single 32-bit value.  */
+
+#define ZSTD_ENCODE_BASELINE_BITS(baseline, basebits)  \
+  ((uint32_t)(baseline) | ((uint32_t)(basebits) << 24))
+
+#define ZSTD_DECODE_BASELINE(baseline_basebits)\
+  ((uint32_t)(baseline_basebits) & 0xffffff)
+
+#define ZSTD_DECODE_BASEBITS(baseline_basebits)\
+  ((uint32_t)(baseline_basebits) >> 24)
+
+/* Given a literal length code, we need to read a number of bits and add that
+   to a baseline.  For states 0 to 15 the baseline is the state and the number
+   of bits is zero.  */
+
+#define 

[PATCH v4 1/19] modula2 front end: changes outside gcc/m2, libgm2 and gcc/testsuite.

2022-12-09 Thread Gaius Mulley via Gcc-patches


While writing the ChangeLog entries git gcc-verify spotted an oversight
with v3 of this patch set.  I had forgotten to post gm2.texi and also a
tiny patchlet in gcc/configure.ac (to detect Python).  HAVE_PYTHON is
used within gcc/m2/Make-lang.in to avoid generating the library section
included by gm2.texi should Python not be available.

OK to commit?  I've included gm2-lang.cc and lang.opt for reference.

regards,
Gaius




diff -ruw gcc-git-master/gcc/configure.ac gcc-git-devel-modula2/gcc/configure.ac
--- gcc-git-master/gcc/configure.ac 2022-12-07 20:16:24.571677189 +0000
+++ gcc-git-devel-modula2/gcc/configure.ac  2022-12-07 19:46:20.036302786 +0000
@@ -1263,6 +1263,10 @@
 # Bison?
 AC_CHECK_PROGS([BISON], bison, [$MISSING bison])

+# Python3?
+AM_PATH_PYTHON(,, [:])
+AM_CONDITIONAL([HAVE_PYTHON], [test "$PYTHON" != :])
+
 # Binutils are not build modules, unlike bison/flex/makeinfo.  So we
 # check for build == host before using them.

@@ -7651,4 +7655,3 @@
 ],
 [subdirs='$subdirs'])
 AC_OUTPUT
-
diff -ruw /dev/null gcc-git-devel-modula2/gcc/doc/gm2.texi
--- /dev/null   2022-08-24 16:22:16.88870 +0100
+++ gcc-git-devel-modula2/gcc/doc/gm2.texi  2022-12-10 00:04:30.263603238 +0000
@@ -0,0 +1,2944 @@
+\input texinfo
+@c -*-texinfo-*-
+@c Copyright (C) 2001-2022 Free Software Foundation, Inc.
+@c This is part of the GM2 manual.
+
+@c User level documentation for GNU Modula-2
+@c
+@c header
+
+@setfilename gm2.info
+@settitle The GNU Modula-2 Compiler
+
+@set version-python  3.5
+
+@include gcc-common.texi
+
+@c Copyright years for this manual.
+@set copyrights-gm2 1999-2022
+
+@copying
+@c man begin COPYRIGHT
+Copyright @copyright{} @value{copyrights-gm2} Free Software Foundation, Inc.
+
+Permission is granted to copy, distribute and/or modify this document
+under the terms of the GNU Free Documentation License, Version 1.3 or
+any later version published by the Free Software Foundation; with no
+Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.
+A copy of the license is included in the
+@c man end
+section entitled ``GNU Free Documentation License''.
+@ignore
+@c man begin COPYRIGHT
+man page gfdl(7).
+@c man end
+@end ignore
+@end copying
+
+@ifinfo
+@format
+@dircategory Software development
+@direntry
+* gm2: (gm2).   A GCC-based compiler for the Modula-2 language
+@end direntry
+@end format
+
+@insertcopying
+@end ifinfo
+
+@titlepage
+@title The GNU Modula-2 Compiler
+@versionsubtitle
+@author Gaius Mulley
+
+@page
+@vskip 0pt plus 1filll
+Published by the Free Software Foundation @*
+51 Franklin Street, Fifth Floor@*
+Boston, MA 02110-1301, USA@*
+@sp 1
+@insertcopying
+@end titlepage
+@contents
+@page
+
+@c `Top' Node and Master Menu
+
+@node Top, Overview, (dir), (dir)
+@top Introduction
+
+@menu
+* Overview:: What is GNU Modula-2.
+* Using::Using GNU Modula-2.
+* Licence::  Licence of GNU Modula-2
+* Copying::  GNU Public Licence V3.
+* Contributing:: Contributing to GNU Modula-2
+* Internals::GNU Modula-2 internals.
+* EBNF:: EBNF of GNU Modula-2
+* Libraries::PIM and ISO library definitions.
+* Indices::  Document and function indices.
+@end menu
+
+@node Overview, Using, Top, Top
+@chapter Overview of GNU Modula-2
+
+@menu
+* What is GNU Modula-2::  Brief description of GNU Modula-2.
+* Why use GNU Modula-2::  Advantages of GNU Modula-2.
+* News::  Latest news about GNU Modula-2.
+* Development::   How to get source code using git.
+* Obtaining:: Where to get the source code using git.
+* Features::  GNU Modula-2 Features
+@end menu
+
+@node What is GNU Modula-2, Why use GNU Modula-2, , Using
+@section What is GNU Modula-2
+
+GNU Modula-2 is a @uref{http://gcc.gnu.org/frontends.html, front end}
+for the GNU Compiler Collection (@uref{http://gcc.gnu.org/, GCC}).
+The GNU Modula-2 compiler is compliant with the PIM2, PIM3, PIM4 and
+ISO dialects.  Also implemented are a complete set of free ISO
+libraries and PIM libraries.
+
+@footnote{The four Modula-2 dialects supported are defined in the following
+references:
+
+PIM2: 'Programming in Modula-2', 2nd Edition, Springer Verlag, 1982,
+1983 by Niklaus Wirth (PIM2).
+
+PIM3: 'Programming in Modula-2', 3rd Corrected Edition, Springer Verlag,
+1985 (PIM3).
+
+PIM4: 'Programming in Modula-2', 4th Edition, Springer Verlag, 1988
+(@uref{http://freepages.modula2.org/report4/modula-2.html, PIM4}).
+
+ISO: the ISO Modula-2 language as defined in 'ISO/IEC Information
+technology - programming languages - part 1: Modula-2 Language,
+ISO/IEC 10514-1 (1996)'
+}
+
+@node Why use GNU Modula-2, Release map, What is GNU Modula-2, Using
+@section Why use GNU Modula-2
+
+There are a number of advantages of using GNU Modula-2 rather than
+translating an existing project into another language.
+
+The first advantage is of maintainability of the original sources
+and the ability to debug the 

Re: [PATCH] Fortran: ICE on recursive derived types with allocatable components [PR107872]

2022-12-09 Thread Paul Richard Thomas via Gcc-patches
Hi Harald,

Thanks for doing that. My attention is elsewhere gfortran-wise.

Good for mainline.

Paul


On Fri, 9 Dec 2022 at 21:27, Harald Anlauf via Fortran wrote:

> Dear all,
>
> I am submitting the attached simple - and obvious - patch on
> behalf of Paul.  It prevents a resource exhaustion due to an
> infinite loop, and has been regtested by multiple contributors, ;-)
> at least on x86_64-pc-linux-gnu.
>
> I intend to commit it to mainline within 24 hours, unless
> there are further comments.
>
> Thanks,
> Harald
>
>

-- 
"If you can't explain it simply, you don't understand it well enough" -
Albert Einstein


[PATCH] c++: extract_local_specs and unevaluated contexts [PR100295]

2022-12-09 Thread Patrick Palka via Gcc-patches
Here during partial instantiation of the constexpr if, extract_local_specs
walks the statement looking for local specializations within to save and
possibly capture.  However, we're thwarted by the fact that 'ts' first
appears inside an unevaluated context, and so the calls to
process_outer_var_ref for its local specializations are a no-op.  And
since we walk each tree exactly once, we end up not capturing them
despite 'ts' later occurring in an evaluated context.

This patch fixes this by making extract_local_specs walk evaluated
contexts first before walking unevaluated contexts.  We could probably
get away with not walking unevaluated contexts at all, but this approach
seems safer.
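
To make the ordering problem concrete, here is a toy model of the walk. It is not GCC's tree walker: the Node class, the 'unevaluated' flag, and the 'captured' set are invented for illustration, and "capturing" simply stands in for a process_outer_var_ref call that is not a no-op.

```python
class Node:
    """Toy tree node; `unevaluated` marks an operand such as that of sizeof."""

    def __init__(self, name=None, unevaluated=False, children=()):
        self.name = name
        self.unevaluated = unevaluated
        self.children = list(children)


def walk(node, skip_unevaluated, visited, captured, in_unevaluated=False):
    if skip_unevaluated and node.unevaluated:
        return  # first pass: do not descend into unevaluated operands
    in_unevaluated = in_unevaluated or node.unevaluated
    if node.name is not None and node.name not in visited:
        visited.add(node.name)  # each named entity is processed only once
        if not in_unevaluated:
            captured.add(node.name)  # processing here has an effect
    for child in node.children:
        walk(child, skip_unevaluated, visited, captured, in_unevaluated)


def extract(root, two_pass):
    visited, captured = set(), set()
    if two_pass:
        walk(root, True, visited, captured)   # evaluated contexts first
    walk(root, False, visited, captured)      # then everything
    return captured


# 'ts' appears first in an unevaluated operand, then in an evaluated one.
root = Node(children=[
    Node(unevaluated=True, children=[Node(name="ts")]),
    Node(children=[Node(name="ts")]),
])
print(extract(root, two_pass=False))
print(extract(root, two_pass=True))
```

With a single pass the toy walker sees 'ts' in the unevaluated operand first and never captures it; with the evaluated-first pass it does.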

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk/12?

PR c++/100295
PR c++/107579

gcc/cp/ChangeLog:

* pt.cc (el_data::skip_unevaluated_operands): New data member.
(extract_locals_r): If skip_unevaluated_operands is true,
don't walk into unevaluated contexts.
(extract_local_specs): Walk the pattern twice, first with
skip_unevaluated_operands true followed by it set to false.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/constexpr-if-lambda5.C: New test.
---
 gcc/cp/pt.cc  | 19 ++-
 .../g++.dg/cpp1z/constexpr-if-lambda5.C   | 15 +++
 2 files changed, 33 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp1z/constexpr-if-lambda5.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index d05a49b1c11..2b22bf14c53 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -13015,17 +13015,26 @@ public:
   /* List of local_specializations used within the pattern.  */
   tree extra;
   tsubst_flags_t complain;
+  /* True iff we don't want to walk into unevaluated contexts.  */
+  bool skip_unevaluated_operands = false;
 
   el_data (tsubst_flags_t c)
 : extra (NULL_TREE), complain (c) {}
 };
 static tree
-extract_locals_r (tree *tp, int */*walk_subtrees*/, void *data_)
+extract_locals_r (tree *tp, int *walk_subtrees, void *data_)
 {
  el_data &data = *reinterpret_cast<el_data*>(data_);
  tree *extra = &data.extra;
   tsubst_flags_t complain = data.complain;
 
+  if (data.skip_unevaluated_operands
+  && unevaluated_p (TREE_CODE (*tp)))
+{
+  *walk_subtrees = 0;
+  return NULL_TREE;
+}
+
   if (TYPE_P (*tp) && typedef_variant_p (*tp))
 /* Remember local typedefs (85214).  */
 tp = &TYPE_NAME (*tp);
@@ -13117,6 +13126,14 @@ static tree
 extract_local_specs (tree pattern, tsubst_flags_t complain)
 {
   el_data data (complain);
+  /* Walk the pattern twice, ignoring unevaluated operands the first time
+ around, so that if a local specialization appears in both an
+ evaluated and unevaluated context we prefer to process it in the
+ former context (since e.g. process_outer_var_ref is a no-op inside
+ an unevaluated context).  */
+  data.skip_unevaluated_operands = true;
+  cp_walk_tree (&pattern, extract_locals_r, &data, &data.visited);
+  data.skip_unevaluated_operands = false;
   cp_walk_tree (&pattern, extract_locals_r, &data, &data.visited);
   return data.extra;
 }
diff --git a/gcc/testsuite/g++.dg/cpp1z/constexpr-if-lambda5.C b/gcc/testsuite/g++.dg/cpp1z/constexpr-if-lambda5.C
new file mode 100644
index 00000000000..d2bf0221743
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/constexpr-if-lambda5.C
@@ -0,0 +1,15 @@
+// PR c++/100295
+// { dg-do compile { target c++17 } }
+
+template<class... Ts>
+void f(Ts... ts) {
+  auto lambda = [=](auto x) {
+if constexpr (sizeof((ts+x) + ...) != 0)
+  (..., ts);
+  };
+  lambda(0);
+}
+
+int main() {
+  f(0, 'a');
+}
-- 
2.39.0.rc2



[PATCH] Fortran: ICE on recursive derived types with allocatable components [PR107872]

2022-12-09 Thread Harald Anlauf via Gcc-patches
Dear all,

I am submitting the attached simple - and obvious - patch on
behalf of Paul.  It prevents a resource exhaustion due to an
infinite loop, and has been regtested by multiple contributors, ;-)
at least on x86_64-pc-linux-gnu.
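
For illustration only, the kind of loop being prevented can be modelled as follows. This is a simplified sketch and not the code in resolve.cc: the classes and the 'private' flag are hypothetical stand-ins, and only the skip condition (a pointer or allocatable component whose type is the enclosing type itself) mirrors the patch.

```python
class Component:
    """Toy component of a derived type (all fields are illustrative)."""

    def __init__(self, name, derived=None, pointer=False,
                 allocatable=False, private=False):
        self.name = name
        self.derived = derived          # DerivedType or None
        self.pointer = pointer
        self.allocatable = allocatable
        self.private = private


class DerivedType:
    def __init__(self, name):
        self.name = name
        self.components = []


def derived_inaccessible(sym):
    """Return True if `sym` has a private component, directly or nested."""
    for c in sym.components:
        # Skip self-referential pointer or allocatable components;
        # recursing into them would never terminate.
        if c.derived is sym and (c.pointer or c.allocatable):
            continue
        if c.private:
            return True
        if c.derived is not None and derived_inaccessible(c.derived):
            return True
    return False


# Mirrors the shape of the test case: type t with 'type(t), allocatable :: next'.
t = DerivedType("t")
t.components = [Component("data"), Component("next", derived=t, allocatable=True)]
print(derived_inaccessible(t))
```

If the skip only tested 'pointer', the recursive call on the 'next' component would re-enter the same type indefinitely in this model.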

I intend to commit it to mainline within 24 hours, unless
there are further comments.

Thanks,
Harald

From 01254aa2eb766c7584fd047568d7277d4d65d067 Mon Sep 17 00:00:00 2001
From: Paul Thomas 
Date: Fri, 9 Dec 2022 22:13:45 +0100
Subject: [PATCH] Fortran: ICE on recursive derived types with allocatable
 components [PR107872]

gcc/fortran/ChangeLog:

	PR fortran/107872
	* resolve.cc (derived_inaccessible): Skip over allocatable components
	to prevent an infinite loop.

gcc/testsuite/ChangeLog:

	PR fortran/107872
	* gfortran.dg/pr107872.f90: New test.
---
 gcc/fortran/resolve.cc |  3 +-
 gcc/testsuite/gfortran.dg/pr107872.f90 | 40 ++
 2 files changed, 42 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gfortran.dg/pr107872.f90

diff --git a/gcc/fortran/resolve.cc b/gcc/fortran/resolve.cc
index 75dc4b59105..158bf08ec26 100644
--- a/gcc/fortran/resolve.cc
+++ b/gcc/fortran/resolve.cc
@@ -7536,7 +7536,8 @@ derived_inaccessible (gfc_symbol *sym)
   for (c = sym->components; c; c = c->next)
 {
 	/* Prevent an infinite loop through this function.  */
-	if (c->ts.type == BT_DERIVED && c->attr.pointer
+	if (c->ts.type == BT_DERIVED
+	&& (c->attr.pointer || c->attr.allocatable)
 	&& sym == c->ts.u.derived)
 	  continue;

diff --git a/gcc/testsuite/gfortran.dg/pr107872.f90 b/gcc/testsuite/gfortran.dg/pr107872.f90
new file mode 100644
index 00000000000..09838479e92
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr107872.f90
@@ -0,0 +1,40 @@
+! { dg-do run }
+!
+! Test the fix for PR107872, where an ICE occurred in
+! resolve.cc(derived_inaccessible) because derived types with
+! recursive allocatable components were not catered for.
+!
+module mod1
+  type t
+ integer :: data
+ type(t), allocatable :: next
+   contains
+ procedure, private :: write_t
+ generic :: write(formatted) => write_t
+  end type
+contains
+  recursive subroutine write_t(this, unit, iotype, v_list, iostat, iomsg)
+class(t), intent(in) :: this
+integer, intent(in) :: unit
+character(*), intent(in) :: iotype
+integer, intent(in) :: v_list(:)
+integer, intent(out) :: iostat
+character(*), intent(inout) :: iomsg
+if (ALLOCATED(this%next)) &
+ write (unit, '(dt)') this%next
+write (unit, '(i2)') this%data
+  end subroutine
+end module
+
+  use mod1
+  type(t) :: a
+  character (8) :: buffer
+  a%data = 1
+  allocate (a%next)
+  a%next%data = 2
+  allocate (a%next%next)
+  a%next%next%data = 3
+  write (buffer, '(dt)')a
+  deallocate (a%next)
+  if (trim (buffer) .ne. ' 3 2 1') stop 1
+end
--
2.35.3



[Patch] Fortran: Replace simple '.' quotes by %<.%>

2022-12-09 Thread Tobias Burnus

Found when working on the just submitted/committed patch.

I intend to commit it to mainline as obvious tomorrow (or Sun or Mon),
unless there are comments.

Tobias
Fortran: Replace simple '.' quotes by %<.%>

Using %qs instead of '%s', or %<=%> instead of '=', looks nicer:
the output gets proper quotes and bold text if the terminal supports it;
otherwise, plain quotes are used.
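
The rewrite is mechanical enough to sketch. The helper below is a hypothetical illustration of the transformation applied to the format strings, not a tool used for this patch; it assumes the quoted pieces never contain apostrophes or '%' themselves.

```python
import re


def modernize_quotes(fmt):
    """Rewrite '%s' as %qs and other '...' literals as %<...%>."""
    fmt = fmt.replace("'%s'", "%qs")
    # Remaining single-quoted literals such as '(' or 'match'.
    return re.sub(r"'([^'%]+)'", r"%<\1%>", fmt)


print(modernize_quotes("selector '%s' does not accept any properties at %C"))
print(modernize_quotes("expected '(' at %C"))
```

A real conversion still needs a manual pass, since an apostrophe inside ordinary text would confuse the simple pattern used here.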

gcc/fortran/ChangeLog:

	* match.cc (gfc_match_member_sep): Use %<...%> in gfc_error.
	* openmp.cc (gfc_match_oacc_routine, gfc_match_omp_context_selector,
	gfc_match_omp_context_selector_specification,
	gfc_match_omp_declare_variant, resolve_omp_clauses): Likewise;
	use %qs instead of '%s'.
	* primary.cc (match_real_constant, gfc_match_varspec): Likewise.
	* resolve.cc (gfc_resolve_formal_arglist, resolve_operator,
	resolve_ordinary_assign): Likewise.

diff --git a/gcc/fortran/match.cc b/gcc/fortran/match.cc
index 7ba0f349993..89fb115c0f6 100644
--- a/gcc/fortran/match.cc
+++ b/gcc/fortran/match.cc
@@ -195,3 +195,3 @@ gfc_match_member_sep(gfc_symbol *sym)
   gfc_error ("Expected structure component or operator name "
- "after '.' at %C");
+		 "after %<.%> at %C");
   goto error;
diff --git a/gcc/fortran/openmp.cc b/gcc/fortran/openmp.cc
index 4b4e6ac6947..7edc78ad0cb 100644
--- a/gcc/fortran/openmp.cc
+++ b/gcc/fortran/openmp.cc
@@ -4061,3 +4061,3 @@ gfc_match_oacc_routine (void)
 	  gfc_error ("Syntax error in !$ACC ROUTINE ( NAME ) at %C, expecting"
-		 " ')' after NAME");
+		 " %<)%> after NAME");
 	  gfc_current_locus = old_loc;
@@ -5350,4 +5350,4 @@ gfc_match_omp_context_selector (gfc_omp_set_selector *oss)
 		{
-		  gfc_error ("selector '%s' not allowed for context selector "
-			 "set '%s' at %C",
+		  gfc_error ("selector %qs not allowed for context selector "
+			 "set %qs at %C",
 			 selector, oss->trait_set_selector_name);
@@ -5370,3 +5370,3 @@ gfc_match_omp_context_selector (gfc_omp_set_selector *oss)
 	{
-	  gfc_error ("selector '%s' does not accept any properties at %C",
+	  gfc_error ("selector %qs does not accept any properties at %C",
 			 selector);
@@ -5379,3 +5379,3 @@ gfc_match_omp_context_selector (gfc_omp_set_selector *oss)
 		{
-		  gfc_error ("expected '(' at %C");
+		  gfc_error ("expected %<(%> at %C");
 		  return MATCH_ERROR;
@@ -5401,3 +5401,3 @@ gfc_match_omp_context_selector (gfc_omp_set_selector *oss)
 		{
-		  gfc_error ("expected ')' at %C");
+		  gfc_error ("expected %<)%> at %C");
 		  return MATCH_ERROR;
@@ -5514,3 +5514,3 @@ gfc_match_omp_context_selector (gfc_omp_set_selector *oss)
 	{
-	  gfc_error ("expected ')' at %C");
+	  gfc_error ("expected %<)%> at %C");
 	  return MATCH_ERROR;
@@ -5524,3 +5524,3 @@ gfc_match_omp_context_selector (gfc_omp_set_selector *oss)
 	{
-	  gfc_error ("expected '(' at %C");
+	  gfc_error ("expected %<(%> at %C");
 	  return MATCH_ERROR;
@@ -5570,4 +5570,4 @@ gfc_match_omp_context_selector_specification (gfc_omp_declare_variant *odv)
 	{
-	  gfc_error ("expected 'construct', 'device', 'implementation' or "
-		 "'user' at %C");
+	  gfc_error ("expected %<construct%>, %<device%>, %<implementation%> "
+		 "or %<user%> at %C");
 	  return MATCH_ERROR;
@@ -5578,3 +5578,3 @@ gfc_match_omp_context_selector_specification (gfc_omp_declare_variant *odv)
 	{
-	  gfc_error ("expected '=' at %C");
+	  gfc_error ("expected %<=%> at %C");
 	  return MATCH_ERROR;
@@ -5585,3 +5585,3 @@ gfc_match_omp_context_selector_specification (gfc_omp_declare_variant *odv)
 	{
-	  gfc_error ("expected '{' at %C");
+	  gfc_error ("expected %<{%> at %C");
 	  return MATCH_ERROR;
@@ -5600,3 +5600,3 @@ gfc_match_omp_context_selector_specification (gfc_omp_declare_variant *odv)
 	{
-	  gfc_error ("expected '}' at %C");
+	  gfc_error ("expected %<}%> at %C");
 	  return MATCH_ERROR;
@@ -5622,3 +5622,3 @@ gfc_match_omp_declare_variant (void)
 {
-  gfc_error ("expected '(' at %C");
+  gfc_error ("expected %<(%> at %C");
   return MATCH_ERROR;
@@ -5670,3 +5670,3 @@ gfc_match_omp_declare_variant (void)
 {
-  gfc_error ("expected ')' at %C");
+  gfc_error ("expected %<)%> at %C");
   return MATCH_ERROR;
@@ -5680,3 +5680,3 @@ gfc_match_omp_declare_variant (void)
 	{
-	  gfc_error ("expected 'match' at %C");
+	  gfc_error ("expected %<match%> at %C");
 	  return MATCH_ERROR;
@@ -5689,3 +5689,3 @@ gfc_match_omp_declare_variant (void)
 	{
-	  gfc_error ("expected '(' at %C");
+	  gfc_error ("expected %<(%> at %C");
 	  return MATCH_ERROR;
@@ -5698,3 +5698,3 @@ gfc_match_omp_declare_variant (void)
 	{
-	  gfc_error ("expected ')' at %C");
+	  gfc_error ("expected %<)%> at %C");
 	  return MATCH_ERROR;
@@ -7380,3 +7380,3 @@ resolve_omp_clauses (gfc_code 

Re: [Patch] Fortran/OpenMP: align/allocator modifiers to the allocate clause

2022-12-09 Thread Jakub Jelinek via Gcc-patches
On Fri, Dec 09, 2022 at 09:14:55PM +0100, Tobias Burnus wrote:
> Implementing the 5.1 syntax inside the 'allocate' clause. That's a
> fallout of working on something else...
> 
> OK for mainline?
> 
> Tobias

> Fortran/OpenMP: align/allocator modifiers to the allocate clause
> 
> gcc/fortran/ChangeLog:
> 
>   * dump-parse-tree.cc (show_omp_namelist): Improve OMP_LIST_ALLOCATE
>   output.
>   * gfortran.h (struct gfc_omp_namelist): Add 'align' to 'u'.
>   (gfc_free_omp_namelist): Add bool arg.
>   * match.cc (gfc_free_omp_namelist): Likewise; free 'u.align'.
>   * openmp.cc (gfc_free_omp_clauses, gfc_match_omp_clause_reduction,
>   gfc_match_omp_flush): Update call.
>   (gfc_match_omp_clauses): Match 'align'/'allocator' modifiers in
>   'allocate' clause.
>   (resolve_omp_clauses): Resolve align.
>   * st.cc (gfc_free_statement): Update call.
>   * trans-openmp.cc (gfc_trans_omp_clauses): Handle 'align'.
> 
> libgomp/ChangeLog:
> 
>   * libgomp.texi (5.1 Impl. Status): Split allocate clause/directive
>   item about 'align'; mark clause as 'Y' and directive as 'N'.
>   * testsuite/libgomp.fortran/allocate-2.f90: New test.
>   * testsuite/libgomp.fortran/allocate-3.f90: New test.

LGTM, thanks.

Jakub



[Patch] Fortran/OpenMP: align/allocator modifiers to the allocate clause

2022-12-09 Thread Tobias Burnus

Implementing the 5.1 syntax inside the 'allocate' clause. That's a
fallout of working on something else...

OK for mainline?

Tobias
Fortran/OpenMP: align/allocator modifiers to the allocate clause

gcc/fortran/ChangeLog:

	* dump-parse-tree.cc (show_omp_namelist): Improve OMP_LIST_ALLOCATE
	output.
	* gfortran.h (struct gfc_omp_namelist): Add 'align' to 'u'.
	(gfc_free_omp_namelist): Add bool arg.
	* match.cc (gfc_free_omp_namelist): Likewise; free 'u.align'.
	* openmp.cc (gfc_free_omp_clauses, gfc_match_omp_clause_reduction,
	gfc_match_omp_flush): Update call.
	(gfc_match_omp_clauses): Match 'align'/'allocator' modifiers in
	'allocate' clause.
	(resolve_omp_clauses): Resolve align.
	* st.cc (gfc_free_statement): Update call.
	* trans-openmp.cc (gfc_trans_omp_clauses): Handle 'align'.

libgomp/ChangeLog:

	* libgomp.texi (5.1 Impl. Status): Split allocate clause/directive
	item about 'align'; mark clause as 'Y' and directive as 'N'.
	* testsuite/libgomp.fortran/allocate-2.f90: New test.
	* testsuite/libgomp.fortran/allocate-3.f90: New test.

 gcc/fortran/dump-parse-tree.cc   |  23 +
 gcc/fortran/gfortran.h   |   3 +-
 gcc/fortran/match.cc |   4 +-
 gcc/fortran/openmp.cc| 106 +++
 gcc/fortran/st.cc|   2 +-
 gcc/fortran/trans-openmp.cc  |   8 ++
 libgomp/libgomp.texi |   4 +-
 libgomp/testsuite/libgomp.fortran/allocate-2.f90 |  25 ++
 libgomp/testsuite/libgomp.fortran/allocate-3.f90 |  28 ++
 9 files changed, 163 insertions(+), 40 deletions(-)

diff --git a/gcc/fortran/dump-parse-tree.cc b/gcc/fortran/dump-parse-tree.cc
index 2f042ab5142..5ae72dc1cac 100644
--- a/gcc/fortran/dump-parse-tree.cc
+++ b/gcc/fortran/dump-parse-tree.cc
@@ -1357,6 +1357,29 @@ show_omp_namelist (int list_type, gfc_omp_namelist *n)
 	}
 	  ns_iter = n->u2.ns;
 	}
+  if (list_type == OMP_LIST_ALLOCATE)
+	{
+	  if (n->expr)
+	{
+	  fputs ("allocator(", dumpfile);
+	  show_expr (n->expr);
+	  fputc (')', dumpfile);
+	}
+	  if (n->expr && n->u.align)
+	fputc (',', dumpfile);
+	  if (n->u.align)
+	{
+	  fputs ("align(", dumpfile);
+	  show_expr (n->u.align);
+	  fputc (')', dumpfile);
+	}
+	  if (n->expr || n->u.align)
+	fputc (':', dumpfile);
+	  fputs (n->sym->name, dumpfile);
+	  if (n->next)
+	fputs (") ALLOCATE(", dumpfile);
+	  continue;
+	}
   if (list_type == OMP_LIST_REDUCTION)
 	switch (n->u.reduction_op)
 	  {
diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index b541a07e2c7..5f8a81ae4a1 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -1349,6 +1349,7 @@ typedef struct gfc_omp_namelist
   gfc_omp_reduction_op reduction_op;
   gfc_omp_depend_doacross_op depend_doacross_op;
   gfc_omp_map_op map_op;
+  gfc_expr *align;
   struct
 	{
 	  ENUM_BITFIELD (gfc_omp_linear_op) op:4;
@@ -3572,7 +3573,7 @@ void gfc_free_iterator (gfc_iterator *, int);
 void gfc_free_forall_iterator (gfc_forall_iterator *);
 void gfc_free_alloc_list (gfc_alloc *);
 void gfc_free_namelist (gfc_namelist *);
-void gfc_free_omp_namelist (gfc_omp_namelist *, bool);
+void gfc_free_omp_namelist (gfc_omp_namelist *, bool, bool);
 void gfc_free_equiv (gfc_equiv *);
 void gfc_free_equiv_until (gfc_equiv *, gfc_equiv *);
 void gfc_free_data (gfc_data *);
diff --git a/gcc/fortran/match.cc b/gcc/fortran/match.cc
index 8b8b6e79c8b..7ba0f349993 100644
--- a/gcc/fortran/match.cc
+++ b/gcc/fortran/match.cc
@@ -5524,13 +5524,15 @@ gfc_free_namelist (gfc_namelist *name)
 /* Free an OpenMP namelist structure.  */
 
 void
-gfc_free_omp_namelist (gfc_omp_namelist *name, bool free_ns)
+gfc_free_omp_namelist (gfc_omp_namelist *name, bool free_ns, bool free_align)
 {
   gfc_omp_namelist *n;
 
   for (; name; name = n)
 {
   gfc_free_expr (name->expr);
+  if (free_align)
+	gfc_free_expr (name->u.align);
   if (free_ns)
 	gfc_free_namespace (name->u2.ns);
   else if (name->u2.udr)
diff --git a/gcc/fortran/openmp.cc b/gcc/fortran/openmp.cc
index 862c649b0b6..4b4e6ac6947 100644
--- a/gcc/fortran/openmp.cc
+++ b/gcc/fortran/openmp.cc
@@ -187,7 +187,8 @@ gfc_free_omp_clauses (gfc_omp_clauses *c)
   gfc_free_expr (c->vector_length_expr);
   for (i = 0; i < OMP_LIST_NUM; i++)
 gfc_free_omp_namelist (c->lists[i],
-			   i == OMP_LIST_AFFINITY || i == OMP_LIST_DEPEND);
+			   i == OMP_LIST_AFFINITY || i == OMP_LIST_DEPEND,
+			   i == OMP_LIST_ALLOCATE);
   gfc_free_expr_list (c->wait_list);
   gfc_free_expr_list (c->tile_list);
   free (CONST_CAST (char *, c->critical_name));
@@ -542,7 

[PATCH v2] RISC-V: Produce better code with complex constants [PR95632] [PR106602]

2022-12-09 Thread Raphael Moreira Zinsly
Changes since v1:
- Fixed formatting issues.
- Added a name to the define_insn_and_split pattern.
- Set the target on the 'dg-do compile' in pr106602.c.
- Removed the rv32 restriction in pr95632.c.

-- >8 --

Due to RISC-V limitations on operations with big constants, combine
fails to match such operations and cannot produce optimal code because
it keeps splitting them.  By pretending we can do those operations, we
get more opportunities to simplify the surrounding instructions.

2022-12-06  Raphael Moreira Zinsly  
Jeff Law  

gcc/ChangeLog:
PR target/95632
PR target/106602
* config/riscv/riscv.md: New pattern to simulate complex
const_int loads.

gcc/testsuite/ChangeLog:
* gcc.target/riscv/pr95632.c: New test.
* gcc.target/riscv/pr106602.c: New test.
---
 gcc/config/riscv/riscv.md | 15 +++
 gcc/testsuite/gcc.target/riscv/pr106602.c | 14 ++
 gcc/testsuite/gcc.target/riscv/pr95632.c  | 15 +++
 3 files changed, 44 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/pr106602.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/pr95632.c

diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index df57e2b0b4a..b0daa4b19eb 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -1667,6 +1667,21 @@
  MAX_MACHINE_MODE, &operands[3], TRUE);
 })
 
+;; Pretend to have the ability to load complex const_int in order to get
+;; better code generation around them.
+(define_insn_and_split "*mvconst_internal"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+(match_operand:GPR 1 "splittable_const_int_operand" "i"))]
+  "cse_not_expected"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+{
+  riscv_move_integer (operands[0], operands[0], INTVAL (operands[1]),
+		      <MODE>mode, TRUE);
+  DONE;
+})
+
 ;; 64-bit integer moves
 
 (define_expand "movdi"
diff --git a/gcc/testsuite/gcc.target/riscv/pr106602.c b/gcc/testsuite/gcc.target/riscv/pr106602.c
new file mode 100644
index 000..825b1a143b5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/pr106602.c
@@ -0,0 +1,14 @@
+/* { dg-do compile { target { riscv64*-*-* } } } */
+/* { dg-options "-O2" } */
+
+unsigned long
+foo2 (unsigned long a)
+{
+  return (unsigned long)(unsigned int) a << 6;
+}
+
+/* { dg-final { scan-assembler-times "slli\t" 1 } } */
+/* { dg-final { scan-assembler-times "srli\t" 1 } } */
+/* { dg-final { scan-assembler-not "\tli\t" } } */
+/* { dg-final { scan-assembler-not "addi\t" } } */
+/* { dg-final { scan-assembler-not "and\t" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/pr95632.c b/gcc/testsuite/gcc.target/riscv/pr95632.c
new file mode 100644
index 000..b865c2f2e97
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/pr95632.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+unsigned short
+foo (unsigned short crc)
+{
+  crc ^= 0x4002;
+  crc >>= 1;
+  crc |= 0x8000;
+
+  return crc;
+}
+
+/* { dg-final { scan-assembler-times "srli\t" 1 } } */
+/* { dg-final { scan-assembler-not "slli\t" } } */
-- 
2.38.1



[PATCH] i386: correct division modeling in lujiazui.md

2022-12-09 Thread Alexander Monakov via Gcc-patches
Model the divider in Lujiazui processors as a separate automaton to
significantly reduce the overall model size. This should also result
in improved accuracy, as pipe 0 should be able to accept new
instructions while the divider is occupied.

It is unclear why integer divisions are modeled as if pipes 0-3 are all
occupied. I've opted to keep a single-cycle reservation of all four
pipes together, so GCC should continue trying to pack instructions
around a division accordingly.

Currently the top three symbols in insn-automata.o are:

106102 r lujiazui_core_check
106102 r lujiazui_core_transitions
196123 r lujiazui_core_min_issue_delay

This patch shrinks all lujiazui tables to:

3 r lujiazui_decoder_min_issue_delay
20 r lujiazui_decoder_transitions
32 r lujiazui_agu_min_issue_delay
126 r lujiazui_agu_transitions
304 r lujiazui_div_base
352 r lujiazui_div_check
352 r lujiazui_div_transitions
1152 r lujiazui_core_min_issue_delay
1592 r lujiazui_agu_translate
1592 r lujiazui_core_translate
1592 r lujiazui_decoder_translate
1592 r lujiazui_div_translate
3952 r lujiazui_div_min_issue_delay
9216 r lujiazui_core_transitions

This continues the work on reducing i386 insn-automata.o size started
with similar fixes for division and multiplication instructions in
znver.md [1][2]. I plan to submit corresponding fixes for
b[td]ver[123].md as well.

[1] https://inbox.sourceware.org/gcc-patches/23c795d6-403c-5927-e610-f0f1215f5...@ispras.ru/T/#m36e069d43d07d768d4842a779e26b4a0915cc543
[2] https://inbox.sourceware.org/gcc-patches/20221101162637.14238-1-amona...@ispras.ru/

gcc/ChangeLog:

PR target/87832
* config/i386/lujiazui.md (lujiazui_div): New automaton.
(lua_div): New unit.
(lua_idiv_qi): Correct unit in the reservation.
(lua_idiv_qi_load): Ditto.
(lua_idiv_hi): Ditto.
(lua_idiv_hi_load): Ditto.
(lua_idiv_si): Ditto.
(lua_idiv_si_load): Ditto.
(lua_idiv_di): Ditto.
(lua_idiv_di_load): Ditto.
(lua_fdiv_SF): Ditto.
(lua_fdiv_SF_load): Ditto.
(lua_fdiv_DF): Ditto.
(lua_fdiv_DF_load): Ditto.
(lua_fdiv_XF): Ditto.
(lua_fdiv_XF_load): Ditto.
(lua_ssediv_SF): Ditto.
(lua_ssediv_load_SF): Ditto.
(lua_ssediv_V4SF): Ditto.
(lua_ssediv_load_V4SF): Ditto.
(lua_ssediv_V8SF): Ditto.
(lua_ssediv_load_V8SF): Ditto.
(lua_ssediv_SD): Ditto.
(lua_ssediv_load_SD): Ditto.
(lua_ssediv_V2DF): Ditto.
(lua_ssediv_load_V2DF): Ditto.
(lua_ssediv_V4DF): Ditto.
(lua_ssediv_load_V4DF): Ditto.
(lua_sseicvt_si): Ditto.
---
 gcc/config/i386/lujiazui.md | 58 +++--
 1 file changed, 30 insertions(+), 28 deletions(-)

diff --git a/gcc/config/i386/lujiazui.md b/gcc/config/i386/lujiazui.md
index 9046c09f2..58a230c70 100644
--- a/gcc/config/i386/lujiazui.md
+++ b/gcc/config/i386/lujiazui.md
@@ -19,8 +19,8 @@
 
 ;; Scheduling for ZHAOXIN lujiazui processor.
 
-;; Modeling automatons for decoders, execution pipes and AGU pipes.
-(define_automaton "lujiazui_decoder,lujiazui_core,lujiazui_agu")
+;; Modeling automatons for decoders, execution pipes, AGU pipes, and divider.
+(define_automaton "lujiazui_decoder,lujiazui_core,lujiazui_agu,lujiazui_div")
 
 ;; The rules for the decoder are simple:
 ;;  - an instruction with 1 uop can be decoded by any of the three
@@ -55,6 +55,8 @@ (define_reservation "lua_decoder01" "lua_decoder0|lua_decoder1")
 (define_cpu_unit "lua_p0,lua_p1,lua_p2,lua_p3" "lujiazui_core")
 (define_cpu_unit "lua_p4,lua_p5" "lujiazui_agu")
 
+(define_cpu_unit "lua_div" "lujiazui_div")
+
 (define_reservation "lua_p03" "lua_p0|lua_p3")
 (define_reservation "lua_p12" "lua_p1|lua_p2")
 (define_reservation "lua_p1p2" "lua_p1+lua_p2")
@@ -229,56 +231,56 @@ (define_insn_reservation "lua_idiv_qi" 21
  (and (eq_attr "memory" "none")
   (and (eq_attr "mode" "QI")
(eq_attr "type" "idiv"
-"lua_decoder0,lua_p0p1p2p3*21")
+"lua_decoder0,lua_p0p1p2p3,lua_div*21")
 
 (define_insn_reservation "lua_idiv_qi_load" 25
 (and (eq_attr "cpu" "lujiazui")
  (and (eq_attr "memory" "load")
   (and (eq_attr "mode" "QI")
(eq_attr "type" "idiv"
-"lua_decoder0,lua_p45,lua_p0p1p2p3*21")
+"lua_decoder0,lua_p45,lua_p0p1p2p3,lua_div*21")
 
 (define_insn_reservation "lua_idiv_hi" 22
 (and (eq_attr "cpu" "lujiazui")
  (and (eq_attr "memory" "none")
   (and (eq_attr "mode" "HI")
(eq_attr "type" "idiv"
-"lua_decoder0,lua_p0p1p2p3*22")
+

[PATCH] initialize fde objects lazily

2022-12-09 Thread Thomas Neumann via Gcc-patches

When registering an unwind frame with __register_frame_info_bases
we currently initialize that fde object eagerly. This has the
advantage that it is immutable afterwards and we can safely
access it from multiple threads, but it has the disadvantage
that we pay the initialization cost even if the application
never throws an exception.

This commit changes the logic to initialize the objects lazily.
The objects themselves are inserted into the b-tree when
registering the frame, but the sorted fde_vector is
not constructed yet. Only the first time that an exception
tries to pass through the registered code is the object
initialized. We detect that with double-checked locking:
first a relaxed load of the sorted bit, then a re-check
under a mutex if the object was not initialized yet.

Note that the check must implicitly be safe concerning a concurrent
frame deregistration, as trying to deregister a frame that is
on the unwinding path of a concurrent exception is inherently racy.
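
A minimal C11 sketch of the double-checked pattern described above, not the
libgcc code.  The names struct lazy_object, lazy_init and ensure_initialized
are hypothetical, a pthread mutex stands in for the gthread mutex, and the
fast path uses an acquire load paired with a release store instead of the
relaxed load mentioned in the description:

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the unwinder's 'struct object'.  */
struct lazy_object
{
  atomic_bool sorted;   /* Set once the expensive setup has run.  */
  int init_count;       /* Counts how often the setup actually ran.  */
};

static pthread_mutex_t lazy_mutex = PTHREAD_MUTEX_INITIALIZER;

/* The expensive work; must only be called with lazy_mutex held.  */
static void
lazy_init (struct lazy_object *ob)
{
  /* The caller re-checked the flag under the mutex, so it is still clear.  */
  assert (!atomic_load_explicit (&ob->sorted, memory_order_relaxed));
  ob->init_count++;
  /* Publish the result; this release store pairs with the acquire load
     on the fast path below.  */
  atomic_store_explicit (&ob->sorted, true, memory_order_release);
}

/* Make sure OB is initialized, taking the mutex only on the slow path.  */
static void
ensure_initialized (struct lazy_object *ob)
{
  /* Fast path: a single atomic load of the flag, no locking.  */
  if (atomic_load_explicit (&ob->sorted, memory_order_acquire))
    return;

  /* Slow path: re-check under the mutex so only one thread initializes.  */
  pthread_mutex_lock (&lazy_mutex);
  if (!atomic_load_explicit (&ob->sorted, memory_order_relaxed))
    lazy_init (ob);
  pthread_mutex_unlock (&lazy_mutex);
}
```

Calling ensure_initialized repeatedly on the same object runs lazy_init at
most once; every later call returns from the fast path without touching the
mutex.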

libgcc/ChangeLog:
* unwind-dw2-fde.c: Initialize fde object lazily when
the first exception tries to pass through.
---
 libgcc/unwind-dw2-fde.c | 52 -
 1 file changed, 41 insertions(+), 11 deletions(-)

diff --git a/libgcc/unwind-dw2-fde.c b/libgcc/unwind-dw2-fde.c
index 3c0cc654ec0..6f69c20ff4b 100644
--- a/libgcc/unwind-dw2-fde.c
+++ b/libgcc/unwind-dw2-fde.c
@@ -63,8 +63,6 @@ release_registered_frames (void)
 
 static void

 get_pc_range (const struct object *ob, uintptr_type *range);
-static void
-init_object (struct object *ob);
 
 #else

 /* Without fast path frame deregistration must always succeed.  */
@@ -76,6 +74,7 @@ static const int in_shutdown = 0;
by decreasing value of pc_begin.  */
 static struct object *unseen_objects;
 static struct object *seen_objects;
+#endif
 
 #ifdef __GTHREAD_MUTEX_INIT

 static __gthread_mutex_t object_mutex = __GTHREAD_MUTEX_INIT;
@@ -103,7 +102,6 @@ init_object_mutex_once (void)
 static __gthread_mutex_t object_mutex;
 #endif
 #endif
-#endif
 
 /* Called from crtbegin.o to register the unwind info for an object.  */
 
@@ -126,10 +124,7 @@ __register_frame_info_bases (const void *begin, struct object *ob,

 #endif
 
 #ifdef ATOMIC_FDE_FAST_PATH

-  // Initialize eagerly to avoid locking later
-  init_object (ob);
-
-  // And register the frame
+  // Register the frame in the b-tree
   uintptr_type range[2];
   get_pc_range (ob, range);
   btree_insert (&registered_frames, range[0], range[1] - range[0], ob);
@@ -180,10 +175,7 @@ __register_frame_info_table_bases (void *begin, struct object *ob,
   ob->s.b.encoding = DW_EH_PE_omit;
 
 #ifdef ATOMIC_FDE_FAST_PATH

-  // Initialize eagerly to avoid locking later
-  init_object (ob);
-
-  // And register the frame
+  // Register the frame in the b-tree
   uintptr_type range[2];
   get_pc_range (ob, range);
   btree_insert (&registered_frames, range[0], range[1] - range[0], ob);
@@ -892,7 +884,15 @@ init_object (struct object* ob)
   accu.linear->orig_data = ob->u.single;
   ob->u.sort = accu.linear;
 
+#ifdef ATOMIC_FDE_FAST_PATH

+  // We must update the sorted bit with an atomic operation
+  struct object tmp;
+  tmp.s.b = ob->s.b;
+  tmp.s.b.sorted = 1;
+  __atomic_store (&(ob->s.b), &(tmp.s.b), __ATOMIC_SEQ_CST);
+#else
   ob->s.b.sorted = 1;
+#endif
 }
 
 #ifdef ATOMIC_FDE_FAST_PATH

@@ -1130,6 +1130,21 @@ search_object (struct object* ob, void *pc)
 }
 }
 
+#ifdef ATOMIC_FDE_FAST_PATH

+
+// Check if the object was already initialized
+static inline bool
+is_object_initialized (struct object *ob)
+{
+  // We have to use relaxed atomics for the read, which
+  // is a bit involved as we read from a bitfield
+  struct object tmp;
+  __atomic_load (&(ob->s.b), &(tmp.s.b), __ATOMIC_RELAXED);
+  return tmp.s.b.sorted;
+}
+
+#endif
+
 const fde *
 _Unwind_Find_FDE (void *pc, struct dwarf_eh_bases *bases)
 {
@@ -1141,6 +1156,21 @@ _Unwind_Find_FDE (void *pc, struct dwarf_eh_bases *bases)
   if (!ob)
 return NULL;
 
+  // Initialize the object lazily
+  if (!is_object_initialized (ob))
+    {
+      // Check again under mutex
+      init_object_mutex_once ();
+      __gthread_mutex_lock (&object_mutex);
+
+      if (!ob->s.b.sorted)
+	{
+	  init_object (ob);
+	}
+
+      __gthread_mutex_unlock (&object_mutex);
+    }
+
   f = search_object (ob, pc);
 #else
 
--

2.37.2



Patch ping

2022-12-09 Thread Jakub Jelinek via Gcc-patches
Hi!

I'd like to ping a few pending patches:

https://gcc.gnu.org/pipermail/gcc-patches/2022-November/606973.html
  - PR107465 - c-family: Fix up -Wsign-compare BIT_NOT_EXPR handling

https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607104.html
  - PR107465 - c-family: Incremental fix for -Wsign-compare BIT_NOT_EXPR handling

https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607145.html
  - PR107558 - c++: Don't clear TREE_READONLY for -fmerge-all-constants for non-aggregates

https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607534.html
  - PR107846 - c-family: Account for integral promotions of left shifts for -Wshift-overflow warning

https://gcc.gnu.org/pipermail/gcc-patches/2022-November/606382.html
  - PR107703 - libgcc, i386: Add __fix{,uns}bfti and __float{,un}tibf

Thanks

Jakub



Re: [PATCH] AArch64: Enable TARGET_CONST_ANCHOR

2022-12-09 Thread Richard Sandiford via Gcc-patches
Wilco Dijkstra  writes:
> Enable TARGET_CONST_ANCHOR to allow complex constants to be created via
> immediate add.  Use a 24-bit range as that enables a 3 or 4-instruction
> immediate to be replaced by 2 additions.  Fix the costing of immediate add
> to support 24-bit immediate and 12-bit shifted immediates.  The generated
> code for the testcase is now the same or better than LLVM.  It also results
> in a small codesize reduction on SPEC.
>
> Passes bootstrap and regress, OK for commit?
>
> gcc/
> * config/aarch64/aarch64.cc (aarch64_rtx_costs): Add correct costs for
> 24-bit immediate add and 12-bit high immediate add.

Very minor, but it might be worth saying "add/sub" rather than "add".
The reason picking the 24-bit range is right is that add & sub together
give us a 25-bit range.

> (TARGET_CONST_ANCHOR): Define.
>
> gcc/testsuite/
> * gcc.target/aarch64/movk_3.c: New test.
>
> ---
>
> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> index e97f3b32f7c7f43564d6a4207eae5a34b9e9bfe7..a73741800c963ee6605fd2cfa918f4399da4bfdf 100644
> --- a/gcc/config/aarch64/aarch64.cc
> +++ b/gcc/config/aarch64/aarch64.cc
> @@ -14257,6 +14257,16 @@ cost_plus:
> return true;
>   }
>
> +   if (aarch64_pluslong_immediate (op1, mode))
> + {
> +   /* 24-bit add in 2 instructions or 12-bit shifted add.  */
> +   if ((INTVAL (op1) & 0xfff) != 0)
> + *cost += COSTS_N_INSNS (1);
> +
> +   *cost += rtx_cost (op0, mode, PLUS, 0, speed);
> +   return true;
> + }
> +

I guess for consistency, we arguably should add extra_cost->alu.arith
for each instruction, like we do for the other cases.  But that seems
a bit daft when all arith costs are 0.  And it's hard to believe that
they would be nonzero for any new core (i.e. that a plain addition
would be more expensive than a typical "fast" instruction).  ADD
immediate is effectively the benchmark cost of COSTS_N_INSNS (1).

So I agree what you wrote is what we should use.  It might be good to
get rid of the existing uses of alu.arith (for integer ADD only), as a
separate follow-up patch.

It looks like there's an off-by-one error in:

(define_predicate "aarch64_pluslong_immediate"
  (and (match_code "const_int")
   (match_test "(INTVAL (op) < 0xffffff && INTVAL (op) > -0xffffff)")))

which might make the optimisation fail at the extremes.  I think it
should be:

(define_predicate "aarch64_pluslong_immediate"
  (and (match_code "const_int")
   (match_test "IN_RANGE (ival, -0xffffff, 0xffffff)")))

instead.
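
To illustrate the boundary behaviour in question, here is a small standalone
C sketch, not GCC code.  PLUSLONG_BOUND and both helper names are
hypothetical, and the bound assumes the 24-bit range from the patch
description.  The strict comparisons reject both end points, while an
inclusive check in the style of IN_RANGE accepts them:

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed bound: the largest 24-bit value, per the patch description.  */
#define PLUSLONG_BOUND 0xffffffL

/* Strict comparisons: both end points are rejected.  */
static bool
in_range_exclusive (long v)
{
  return v < PLUSLONG_BOUND && v > -PLUSLONG_BOUND;
}

/* Inclusive check, analogous to IN_RANGE (v, lo, hi).  */
static bool
in_range_inclusive (long v)
{
  return v >= -PLUSLONG_BOUND && v <= PLUSLONG_BOUND;
}

/* Self-check of the behaviour at and around the end points.  */
static void
check_extremes (void)
{
  /* The two predicates agree one step inside the range ...  */
  assert (in_range_exclusive (PLUSLONG_BOUND - 1));
  assert (in_range_inclusive (PLUSLONG_BOUND - 1));
  /* ... and differ exactly at the end points.  */
  assert (!in_range_exclusive (PLUSLONG_BOUND));
  assert (in_range_inclusive (PLUSLONG_BOUND));
  assert (!in_range_exclusive (-PLUSLONG_BOUND));
  assert (in_range_inclusive (-PLUSLONG_BOUND));
  /* One step outside is rejected by both.  */
  assert (!in_range_inclusive (PLUSLONG_BOUND + 1));
}
```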

OK with that change to the predicate, thanks.

Richard


> *cost += rtx_cost (op1, mode, PLUS, 1, speed);
>
> /* Look for ADD (extended register).  */
> @@ -28051,6 +28061,9 @@ aarch64_libgcc_floating_mode_supported_p
>  #undef TARGET_HAVE_SHADOW_CALL_STACK
>  #define TARGET_HAVE_SHADOW_CALL_STACK true
>
> +#undef TARGET_CONST_ANCHOR
> +#define TARGET_CONST_ANCHOR 0x1000000
> +
>  struct gcc_target targetm = TARGET_INITIALIZER;
>
>  #include "gt-aarch64.h"
> diff --git a/gcc/testsuite/gcc.target/aarch64/movk_3.c b/gcc/testsuite/gcc.target/aarch64/movk_3.c
> new file mode 100644
> index ..9e8c0c42671bef3f63028b4e51d0bd78c9903994
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/movk_3.c
> @@ -0,0 +1,56 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 --save-temps" } */
> +
> +
> +/* 2 MOV */
> +void f16 (long *p)
> +{
> +  p[0] = 0x1234;
> +  p[2] = 0x1235;
> +}
> +
> +/* MOV, MOVK and ADD */
> +void f32_1 (long *p)
> +{
> +  p[0] = 0x12345678;
> +  p[2] = 0x12345678 + 0xfff;
> +}
> +
> +/* 2 MOV, 2 MOVK */
> +void f32_2 (long *p)
> +{
> +  p[0] = 0x12345678;
> +  p[2] = 0x12345678 + 0x55;
> +}
> +
> +/* MOV, MOVK and ADD */
> +void f32_3 (long *p)
> +{
> +  p[0] = 0x12345678;
> +  p[2] = 0x12345678 + 0x999000;
> +}
> +
> +/* MOV, 2 MOVK and ADD */
> +void f48_1 (long *p)
> +{
> +  p[0] = 0x123456789abc;
> +  p[2] = 0x123456789abc + 0xfff;
> +}
> +
> +/* MOV, 2 MOVK and 2 ADD */
> +void f48_2 (long *p)
> +{
> +  p[0] = 0x123456789abc;
> +  p[2] = 0x123456789abc + 0x66;
> +}
> +
> +/* 2 MOV, 4 MOVK */
> +void f48_3 (long *p)
> +{
> +  p[0] = 0x123456789abc;
> +  p[2] = 0x123456789abc + 0x166;
> +}
> +
> +/* { dg-final { scan-assembler-times "mov\tx\[0-9\]+, \[0-9\]+" 10 } } */
> +/* { dg-final { scan-assembler-times "movk\tx\[0-9\]+, 0x\[0-9a-f\]+" 12 } } */
> +/* { dg-final { scan-assembler-times "add\tx\[0-9\]+, x\[0-9\]+, \[0-9\]+" 5 } } */


Re: [Patch] libgomp: Handle OpenMP's reverse offloads

2022-12-09 Thread Jakub Jelinek via Gcc-patches
On Tue, Dec 06, 2022 at 08:45:07AM +0100, Tobias Burnus wrote:
> 32bit vs. 64bit: libgomp itself is compiled with both -m32 and -m64; however,
> nvptx and gcn requires -m64 on the device side and assume that the device
> pointers are representable on the host (i.e. all are 64bit). The new code
> tries to be in principle compatible with uint32_t pointers and uses uint64_t
> to represent it consistently. – The code should be mostly fine, except that
> one called function requires an array of void* and size_t. Instead of handling
> that case, I added some code to permit optimizing away the function content
> without offloading - and a run-time assert if it should ever happen that this
> function gets called on a 32bit host from the target side.

I think we just shouldn't support libgomp plugins for 32-bit libgomp, only
host fallback.  If you want offloading, use 64-bit host...

> libgomp: Handle OpenMP's reverse offloads
> 
> This commit enabled reverse offload for nvptx such that gomp_target_rev
> actually gets called.  And it fills the latter function to do all of
> the following: finding the host function to the device func ptr and
> copying the arguments to the host, processing the mapping/firstprivate,
> calling the host function, copying back the data and freeing as needed.
> 
> The data handling is made easier by assuming that all host variables
> either existed before (and are in the mapping) or that those are
> devices variables not yet available on the host. Thus, the reverse
> mapping can do without refcounts etc. Note that the spec disallows
> inside a target region device-affecting constructs other than target
> plus ancestor device-modifier and it also limits the clauses permitted
> on this construct.
> 
> For the function addresses, an additional splay tree is used; for
> the lookup of mapped variables, the existing splay-tree is used.
> Unfortunately, its data structure requires a full walk of the tree;
> Additionally, the just mapped variables are recorded in a separate
> data structure an extra lookup. While the lookup is slow, assuming
> that only few variables get mapped in each reverse offload construct
> and that reverse offload is the exception and not performance critical,
> this seems to be acceptable.
> 
> libgomp/ChangeLog:
> 
>   * libgomp.h (struct target_mem_desc): Predeclare; move
>   below after 'reverse_splay_tree_node' and add rev_array
>   member.
>   (struct reverse_splay_tree_key_s, reverse_splay_compare): New.
>   (reverse_splay_tree_node, reverse_splay_tree,
>   reverse_splay_tree_key): New typedef.
>   (struct gomp_device_descr): Add mem_map_rev member.
>   * oacc-host.c (host_dispatch): NULL init .mem_map_rev.
>   * plugin/plugin-nvptx.c (GOMP_OFFLOAD_get_num_devices): Claim
>   support for GOMP_REQUIRES_REVERSE_OFFLOAD.
>   * splay-tree.h (splay_tree_callback_stop): New typedef; like
>   splay_tree_callback but returning int not void.
>   (splay_tree_foreach_lazy): Define; like splay_tree_foreach but
>   taking splay_tree_callback_stop as argument.
>   * splay-tree.c (splay_tree_foreach_internal_lazy,
>   splay_tree_foreach_lazy): New; but early exit if callback returns
>   nonzero.
>   * target.c: Instantiate splay_tree_c with splay_tree_prefix 'reverse'.
>   (gomp_map_lookup_rev): New.
>   (gomp_load_image_to_device): Handle reverse-offload function
>   lookup table.
>   (gomp_unload_image_from_device): Free devicep->mem_map_rev.
>   (struct gomp_splay_tree_rev_lookup_data, gomp_splay_tree_rev_lookup,
>   gomp_map_rev_lookup, struct cpy_data, gomp_map_cdata_lookup_int,
>   gomp_map_cdata_lookup): New auxiliary structs and functions for
>   gomp_target_rev.
>   (gomp_target_rev): Implement reverse offloading and its mapping.
>   (gomp_target_init): Init current_device.mem_map_rev.root.
>   * testsuite/libgomp.fortran/reverse-offload-2.f90: New test.
>   * testsuite/libgomp.fortran/reverse-offload-3.f90: New test.
>   * testsuite/libgomp.fortran/reverse-offload-4.f90: New test.
>   * testsuite/libgomp.fortran/reverse-offload-5.f90: New test.
>   * testsuite/libgomp.fortran/reverse-offload-5a.f90: New test without
>   mapping of on-device allocated variables.
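
For readers unfamiliar with the idea behind the splay_tree_foreach_lazy entry
quoted above (a full walk that exits early once the callback returns
nonzero), here is a rough C sketch.  It uses a plain binary tree with
hypothetical names (struct node, walk_lazy, lookup_cb) and is not libgomp's
splay-tree code:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal binary tree node; not libgomp's splay tree.  */
struct node
{
  int key;
  struct node *left, *right;
};

/* Callback type: return nonzero to stop the walk.  */
typedef int (*walk_callback_stop) (struct node *, void *);

/* In-order walk that returns 1 as soon as CB returns nonzero and 0 if
   the whole tree was visited without a stop request.  */
static int
walk_lazy (struct node *n, walk_callback_stop cb, void *data)
{
  if (n == NULL)
    return 0;
  if (walk_lazy (n->left, cb, data))
    return 1;
  if (cb (n, data))
    return 1;
  return walk_lazy (n->right, cb, data);
}

/* Lookup state handed to the callback.  */
struct lookup
{
  int wanted;           /* Key to search for.  */
  struct node *found;   /* Matching node, or NULL.  */
  int visited;          /* Number of nodes the callback has seen.  */
};

/* Record a match and request an early exit.  */
static int
lookup_cb (struct node *n, void *data)
{
  struct lookup *l = data;
  assert (l != NULL);
  l->visited++;
  if (n->key != l->wanted)
    return 0;
  l->found = n;
  return 1;
}
```

With a three-node tree holding keys 1, 2 and 3, looking up key 2 stops after
visiting two nodes, while a lookup of a missing key visits all three.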

> +  /* Likeverse for the reverse lookup device->host for reverse offload. */

Likewise

> +  reverse_splay_tree_node rev_array;

Do we need reverse_splay_tree* stuff in libgomp.h?
As splay_tree_node is just a pointer, perhaps just
struct reverse_splay_tree_node_s;
early and
  struct reverse_splay_tree_node_s *rev_array;
in libgomp.h and include the extra splay-tree.h only in target.c?
Unless one needs it anywhere else...

Otherwise LGTM.

Jakub



[PATCH] AArch64: Enable TARGET_CONST_ANCHOR

2022-12-09 Thread Wilco Dijkstra via Gcc-patches
Enable TARGET_CONST_ANCHOR to allow complex constants to be created via
immediate add.  Use a 24-bit range as that enables a 3 or 4-instruction
immediate to be replaced by 2 additions.  Fix the costing of immediate add
to support 24-bit immediate and 12-bit shifted immediates.  The generated
code for the testcase is now the same or better than LLVM.
It also results in a small codesize reduction on SPEC.
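
As a rough illustration of why a 24-bit offset costs at most two additions,
the following C sketch splits an offset into a plain 12-bit immediate and a
12-bit immediate shifted left by 12.  The function adds_needed is
hypothetical and only models the instruction count; it is not the costing
code from the patch:

```c
#include <assert.h>
#include <stdint.h>

/* Number of ADD/SUB-immediate instructions needed to apply DELTA to a
   register, or -1 if DELTA is outside the assumed +/-0xffffff range.  */
static int
adds_needed (int64_t delta)
{
  /* Work on the magnitude; a negative offset uses SUB instead of ADD.  */
  uint64_t mag = delta < 0 ? -(uint64_t) delta : (uint64_t) delta;
  if (mag > 0xffffff)
    return -1;

  uint64_t lo = mag & 0xfff;      /* Plain 12-bit immediate.  */
  uint64_t hi = mag & 0xfff000;   /* 12-bit immediate, shifted left by 12.  */
  assert (lo + hi == mag);

  /* One instruction per nonzero part.  */
  return (lo != 0) + (hi != 0);
}
```

For example, an offset of 0xfff or 0x999000 needs a single addition, while
an offset with both parts nonzero, such as 0x666666, needs two.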

Passes bootstrap and regress, OK for commit?

gcc/
* config/aarch64/aarch64.cc (aarch64_rtx_costs): Add correct costs for
24-bit immediate add and 12-bit high immediate add.
(TARGET_CONST_ANCHOR): Define.

gcc/testsuite/
* gcc.target/aarch64/movk_3.c: New test.

---

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index e97f3b32f7c7f43564d6a4207eae5a34b9e9bfe7..a73741800c963ee6605fd2cfa918f4399da4bfdf 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -14257,6 +14257,16 @@ cost_plus:
return true;
  }
 
+   if (aarch64_pluslong_immediate (op1, mode))
+ {
+   /* 24-bit add in 2 instructions or 12-bit shifted add.  */
+   if ((INTVAL (op1) & 0xfff) != 0)
+ *cost += COSTS_N_INSNS (1);
+
+   *cost += rtx_cost (op0, mode, PLUS, 0, speed);
+   return true;
+ }
+
*cost += rtx_cost (op1, mode, PLUS, 1, speed);
 
/* Look for ADD (extended register).  */
@@ -28051,6 +28061,9 @@ aarch64_libgcc_floating_mode_supported_p
 #undef TARGET_HAVE_SHADOW_CALL_STACK
 #define TARGET_HAVE_SHADOW_CALL_STACK true
 
+#undef TARGET_CONST_ANCHOR
+#define TARGET_CONST_ANCHOR 0x1000000
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 #include "gt-aarch64.h"
diff --git a/gcc/testsuite/gcc.target/aarch64/movk_3.c b/gcc/testsuite/gcc.target/aarch64/movk_3.c
new file mode 100644
index ..9e8c0c42671bef3f63028b4e51d0bd78c9903994
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/movk_3.c
@@ -0,0 +1,56 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 --save-temps" } */
+
+
+/* 2 MOV */
+void f16 (long *p)
+{
+  p[0] = 0x1234;
+  p[2] = 0x1235;
+}
+
+/* MOV, MOVK and ADD */
+void f32_1 (long *p)
+{
+  p[0] = 0x12345678;
+  p[2] = 0x12345678 + 0xfff;
+}
+
+/* 2 MOV, 2 MOVK */
+void f32_2 (long *p)
+{
+  p[0] = 0x12345678;
+  p[2] = 0x12345678 + 0x55;
+}
+
+/* MOV, MOVK and ADD */
+void f32_3 (long *p)
+{
+  p[0] = 0x12345678;
+  p[2] = 0x12345678 + 0x999000;
+}
+
+/* MOV, 2 MOVK and ADD */
+void f48_1 (long *p)
+{
+  p[0] = 0x123456789abc;
+  p[2] = 0x123456789abc + 0xfff;
+}
+
+/* MOV, 2 MOVK and 2 ADD */
+void f48_2 (long *p)
+{
+  p[0] = 0x123456789abc;
+  p[2] = 0x123456789abc + 0x66;
+}
+
+/* 2 MOV, 4 MOVK */
+void f48_3 (long *p)
+{
+  p[0] = 0x123456789abc;
+  p[2] = 0x123456789abc + 0x166;
+}
+
+/* { dg-final { scan-assembler-times "mov\tx\[0-9\]+, \[0-9\]+" 10 } } */
+/* { dg-final { scan-assembler-times "movk\tx\[0-9\]+, 0x\[0-9a-f\]+" 12 } } */
+/* { dg-final { scan-assembler-times "add\tx\[0-9\]+, x\[0-9\]+, \[0-9\]+" 5 } } */



[PATCH 10/15 V5] arm: Implement cortex-M return signing address codegen

2022-12-09 Thread Andrea Corallo via Gcc-patches
Hi Richard,

thanks for reviewing.

Richard Earnshaw  writes:

> On 07/11/2022 08:57, Andrea Corallo via Gcc-patches wrote:
>> Hi all,
>> please find attached the latest version of this patch incorporating
>> some more improvements.  Feel free to ignore V3.
>> Best Regards
>>Andrea
>> 
>
>> As part of previous upstream suggestions a test for varargs has been
>> added and '-mtpcs-frame' is deemed being incompatible with this return
>> signing address feature being introduced.
>
> I don't see any check for the tpcs-frame incompatibility?  What
> happens if a user does combine the options?

Check added.

> gcc/Changelog
>
> 2021-11-03  Andrea Corallo  
>
>   * config/arm/arm.h (arm_arch8m_main): Declare it.
>   * config/arm/arm.cc (arm_arch8m_main): Define it.
>   (arm_option_reconfigure_globals): Set arm_arch8m_main.
>   (arm_compute_frame_layout, arm_expand_prologue)
>   (thumb2_expand_return, arm_expand_epilogue)
>   (arm_conditional_register_usage): Update for pac codegen.
>   (arm_current_function_pac_enabled_p): New function.
>   * config/arm/arm.md (pac_ip_lr_sp, pacbti_ip_lr_sp, aut_ip_lr_sp):
>   Add new patterns.
>   * config/arm/unspecs.md (UNSPEC_PAC_IP_LR_SP)
>   (UNSPEC_PACBTI_IP_LR_SP, UNSPEC_AUT_IP_LR_SP): Add unspecs.
>
> You're missing an entry for aarch_bti_enabled () - yes I realize
> that's just a placeholder at present and will be fully defined in
> patch 12.

Fixed

> +static bool
> +aarch_bti_enabled ()
> +{
> +  return false;
> +}
> +
>
> No comment on this function (and in patch 12 it moves to a different
> location).  It would be best to have it in the right place at this
> point in time.
>
> +  clobber_ip = (IS_NESTED (func_type)
> +		&& (((TARGET_APCS_FRAME && frame_pointer_needed && TARGET_ARM)
> +		     || ((flag_stack_check == STATIC_BUILTIN_STACK_CHECK
> +			  || flag_stack_clash_protection)
> +			 && !df_regs_ever_live_p (LR_REGNUM)
> +			 && arm_r3_live_at_start_p ()))
> +		    || (arm_current_function_pac_enabled_p ())));
>
> Redundant parenthesis around arm_current_function_pac_enabled_p () call.

Fixed

> +   gcc_assert(arm_compute_static_chain_stack_bytes() == 4
> + || arm_current_function_pac_enabled_p ());
>
> I wonder if this assert is now really serving a useful purpose.  I'd
> consider removing it.

Removed

> @@ -27309,7 +27340,7 @@ thumb2_expand_return (bool simple_return)
>to assert it for now to ensure that future code changes do not silently
>change this behavior.  */
>gcc_assert (!IS_CMSE_ENTRY (arm_current_func_type ()));
> -  if (num_regs == 1)
> +  if (num_regs == 1 && !arm_current_function_pac_enabled_p ())
>  {
>rtx par = gen_rtx_PARALLEL (VOIDmode, rtvec_alloc (2));
>rtx reg = gen_rtx_REG (SImode, PC_REGNUM);
> @@ -27324,10 +27355,20 @@ thumb2_expand_return (bool simple_return)
>  }
>else
>  {
> -  saved_regs_mask &= ~ (1 << LR_REGNUM);
> -  saved_regs_mask |=   (1 << PC_REGNUM);
> -  arm_emit_multi_reg_pop (saved_regs_mask);
> -}
> +   if (arm_current_function_pac_enabled_p ())
> + {
> +   gcc_assert (!(saved_regs_mask & (1 << PC_REGNUM)));
> +   arm_emit_multi_reg_pop (saved_regs_mask);
> +   emit_insn (gen_aut_nop ());
> +   emit_jump_insn (simple_return_rtx);
> + }
> +   else
> + {
> +   saved_regs_mask &= ~ (1 << LR_REGNUM);
> +   saved_regs_mask |=   (1 << PC_REGNUM);
> +   arm_emit_multi_reg_pop (saved_regs_mask);
> + }
> + }
>  }
>else
>
> The logic for these blocks would, I think, be better expressed as
>
>if (pac_enabled)
>...
>else if (num_regs == 1)
>  ...  // existing code
>else
>  ...  // existing code

Done

> Also, I think (out of an abundance of caution) we really need a
> scheduling barrier placed before calls to gen_aut_nop() pattern is
> emitted, to ensure that the scheduler never tries to move this
> instruction away from the position we place it.  Use gen_blockage()
> for that (see TARGET_SCHED_PROLOG).  Alternatively, we could make the
> UNSPEC_PAC_NOP an unspec_volatile, which has the same effect (IIRC)
> without needing an additional insn - if you use this approach, then
> please make sure this is explained in a comment.
>
> +(define_insn "pacbti_nop"
> +  [(set (reg:SI IP_REGNUM)
> + (unspec:SI [(reg:SI SP_REGNUM) (reg:SI LR_REGNUM)]
> +UNSPEC_PACBTI_NOP))]
> +  "arm_arch8m_main"
> +  "pacbti\t%|ip, %|lr, %|sp"
> +  [(set_attr "conds" "unconditional")])
>
> The additional side-effect of this being a BTI landing pad means that
> we mustn't move any other instruction before it.  So I think this
> needs to be an unspec_volatile as well.

Done

> On the tests, they are OK as they 

[PATCH] gcov: Fix -fprofile-update=atomic

2022-12-09 Thread Sebastian Huber
The code coverage support uses counters to determine which edges in the control
flow graph were executed.  If a counter overflows, then the code coverage
information is invalid.  Therefore the counter type should be a 64-bit integer.
In multithreaded applications, it is important that the counter increments are
atomic.  This is not the case by default.  The user can enable atomic counter
increments through the -fprofile-update=atomic and
-fprofile-update=prefer-atomic options.

If the hardware supports 64-bit atomic operations, then everything is fine.  If
not and -fprofile-update=prefer-atomic was chosen by the user, then non-atomic
counter increments will be used.  However, if the hardware does not support the
required atomic operations and -fprofile-update=atomic was chosen by the user,
then a warning was issued and a forced fall-back to non-atomic operations
was done.  This is probably not what a user wants.  There is still hardware on
the market which does not have atomic operations and is used for multithreaded
applications.  A user who selects -fprofile-update=atomic wants consistent
code coverage data and not random data.

This patch removes the fall-back to non-atomic operations for
-fprofile-update=atomic.  If atomic operations in hardware are not available,
then a library call to libatomic is emitted.  To mitigate potential performance
issues an optimization for systems which only support 32-bit atomic operations
is provided.  Here, the edge counter increments are done like this:

  low = __atomic_add_fetch_4 (, 1, MEMMODEL_RELAXED);
  high_inc = low == 0 ? 1 : 0;
  __atomic_add_fetch_4 (, high_inc, MEMMODEL_RELAXED);
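
As a stand-alone illustration of the sequence above, here is a minimal
sketch in plain C.  The struct layout and the function names are
hypothetical and not part of the patch; the generic __atomic_add_fetch
builtin stands in for the _4 variants.

```c
#include <stdint.h>

/* Hypothetical view of one 64-bit edge counter as two 32-bit halves.  */
struct split_counter
{
  uint32_t low;
  uint32_t high;
};

/* Increment the counter with two relaxed 32-bit atomic additions.
   The high half is bumped only when the low half wraps around to 0.  */
void
split_counter_increment (struct split_counter *c)
{
  uint32_t low = __atomic_add_fetch (&c->low, 1, __ATOMIC_RELAXED);
  uint32_t high_inc = low == 0 ? 1 : 0;
  __atomic_add_fetch (&c->high, high_inc, __ATOMIC_RELAXED);
}

/* Combine both halves.  Only meaningful once all updaters have finished,
   since the two halves are not updated as a single atomic unit.  */
uint64_t
split_counter_value (const struct split_counter *c)
{
  return ((uint64_t) c->high << 32) | c->low;
}
```

Each increment is individually atomic, so no count is lost; a reader that
samples the halves while updates are in flight may still see a transient
value, which should be acceptable for counters dumped at program exit.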

gcc/ChangeLog:

* tree-profile.cc (split_atomic_increment): New.
(gimple_gen_edge_profiler): Split the atomic edge counter increment in
two 32-bit atomic operations if necessary.
(tree_profiling): Remove profile update warning and fall-back.  Set
split_atomic_increment if necessary.
---
 gcc/tree-profile.cc | 81 +
 1 file changed, 59 insertions(+), 22 deletions(-)

diff --git a/gcc/tree-profile.cc b/gcc/tree-profile.cc
index 2beb49241f2..1d326dde59a 100644
--- a/gcc/tree-profile.cc
+++ b/gcc/tree-profile.cc
@@ -73,6 +73,17 @@ static GTY(()) tree ic_tuple_var;
 static GTY(()) tree ic_tuple_counters_field;
 static GTY(()) tree ic_tuple_callee_field;
 
+/* If the user selected atomic profile counter updates
+   (-fprofile-update=atomic), then the counter updates will be done atomically.
+   Ideally, this is done through atomic operations in hardware.  If the
+   hardware supports only 32-bit atomic increments and gcov_type_node is a
+   64-bit integer type, then for the profile edge counters the increment is
+   performed through two separate 32-bit atomic increments.  This case is
+   indicated by the split_atomic_increment variable begin true.  If the
+   hardware does not support atomic operations at all, then a library call to
+   libatomic is emitted.  */
+static bool split_atomic_increment;
+
 /* Do initialization work for the edge profiler.  */
 
 /* Add code:
@@ -242,30 +253,59 @@ gimple_init_gcov_profiler (void)
 void
 gimple_gen_edge_profiler (int edgeno, edge e)
 {
-  tree one;
-
-  one = build_int_cst (gcov_type_node, 1);
+  const char *name = "PROF_edge_counter";
+  tree ref = tree_coverage_counter_ref (GCOV_COUNTER_ARCS, edgeno);
+  tree one = build_int_cst (gcov_type_node, 1);
 
   if (flag_profile_update == PROFILE_UPDATE_ATOMIC)
 {
-  /* __atomic_fetch_add (, 1, MEMMODEL_RELAXED); */
-  tree addr = tree_coverage_counter_addr (GCOV_COUNTER_ARCS, edgeno);
-  tree f = builtin_decl_explicit (TYPE_PRECISION (gcov_type_node) > 32
- ? BUILT_IN_ATOMIC_FETCH_ADD_8:
- BUILT_IN_ATOMIC_FETCH_ADD_4);
-  gcall *stmt = gimple_build_call (f, 3, addr, one,
-  build_int_cst (integer_type_node,
- MEMMODEL_RELAXED));
-  gsi_insert_on_edge (e, stmt);
+  tree addr = build_fold_addr_expr (ref);
+  tree relaxed = build_int_cst (integer_type_node, MEMMODEL_RELAXED);
+  if (!split_atomic_increment)
+   {
+ /* __atomic_fetch_add (, 1, MEMMODEL_RELAXED); */
+ tree f = builtin_decl_explicit (TYPE_PRECISION (gcov_type_node) > 32
+ ? BUILT_IN_ATOMIC_FETCH_ADD_8:
+ BUILT_IN_ATOMIC_FETCH_ADD_4);
+ gcall *stmt = gimple_build_call (f, 3, addr, one, relaxed);
+ gsi_insert_on_edge (e, stmt);
+   }
+  else
+   {
+ /* low = __atomic_add_fetch_4 (addr, 1, MEMMODEL_RELAXED);
+high_inc = low == 0 ? 1 : 0;
+__atomic_add_fetch_4 (addr_high, high_inc, MEMMODEL_RELAXED); */
+ tree zero32 = build_zero_cst (uint32_type_node);
+ tree one32 = build_one_cst 

[PATCH] Fix memory constraint on MVE v[ld/st][2/4] instructions [PR107714]

2022-12-09 Thread Stam Markianos-Wright via Gcc-patches

Hi all,

In the M-Class Arm-ARM:

https://developer.arm.com/documentation/ddi0553/bu/?lang=en

these MVE instructions only have a '!' writeback variant, and at:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107714

we found that the Um constraint would also allow through a
register offset writeback, resulting in an assembler error.

Here I have added a new constraint and predicate for these
instructions, which (uniquely, AFAICT), only support a `!` writeback
increment by the data size (inside the compiler this is a POST_INC).
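
The rule described above can be modelled outside the compiler as follows.
This is only a sketch: the enum and the function name are hypothetical
stand-ins for the RTL address codes and for the new predicate, and
"base_is_valid_reg" abstracts the base register check.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the RTL address forms.  */
enum addr_form
{
  ADDR_REG,         /* [rn]: plain register, no writeback.  */
  ADDR_POST_INC,    /* [rn]!: writeback by the size of the access.  */
  ADDR_POST_MODIFY  /* Writeback by a register offset: must be rejected.  */
};

/* Accept only the two address forms these MVE struct loads/stores
   can encode.  */
bool
mve_struct_addr_ok (enum addr_form form, bool base_is_valid_reg)
{
  switch (form)
    {
    case ADDR_REG:
    case ADDR_POST_INC:
      return base_is_valid_reg;
    default:
      return false;
    }
}
```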

No regressions in arm-none-eabi with MVE and MVE.FP.

Ok for trunk, and backport to GCC11 and GCC12 (testing pending)?

Thanks,
Stam

gcc/ChangeLog:
    PR target/107714
    * config/arm/arm-protos.h (mve_struct_mem_operand): New prototype.
    * config/arm/arm.cc (mve_struct_mem_operand): New function.
    * config/arm/constraints.md (Ug): New constraint.
    * config/arm/mve.md (mve_vst4q): Change constraint.
    (mve_vst2q): Likewise.
    (mve_vld4q): Likewise.
    (mve_vld2q): Likewise.
    * config/arm/predicates.md (mve_struct_operand): New predicate.

gcc/testsuite/ChangeLog:
    PR target/107714
    * gcc.target/arm/mve/intrinsics/vldst24q_reg_offset.c: New test.

diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 550272facd12e60a49bf8a3b20f811cc13765b3a..8ea38118b05769bd6fcb1d22d902a50979cfd953 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -122,6 +122,7 @@ extern int arm_coproc_mem_operand_wb (rtx, int);
 extern int neon_vector_mem_operand (rtx, int, bool);
 extern int mve_vector_mem_operand (machine_mode, rtx, bool);
 extern int neon_struct_mem_operand (rtx);
+extern int mve_struct_mem_operand (rtx);
 
 extern rtx *neon_vcmla_lane_prepare_operands (rtx *);
 
diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc
index b587561eebea921bdc68016922d37948e2870ce2..31f2a7b9d4688dde69d1435e24cf885e8544be71 100644
--- a/gcc/config/arm/arm.cc
+++ b/gcc/config/arm/arm.cc
@@ -13737,6 +13737,24 @@ neon_vector_mem_operand (rtx op, int type, bool strict)
   return FALSE;
 }
 
+/* Return TRUE if OP is a mem suitable for loading/storing an MVE struct
+   type.  */
+int
+mve_struct_mem_operand (rtx op)
+{
+  rtx ind = XEXP (op, 0);
+
+  /* Match: (mem (reg)).  */
+  if (REG_P (ind))
+return arm_address_register_rtx_p (ind, 0);
+
+  /* Allow only post-increment by the mode size.  */
+  if (GET_CODE (ind) == POST_INC)
+return arm_address_register_rtx_p (XEXP (ind, 0), 0);
+
+  return FALSE;
+}
+
 /* Return TRUE if OP is a mem suitable for loading/storing a Neon struct
type.  */
 int
diff --git a/gcc/config/arm/constraints.md b/gcc/config/arm/constraints.md
index e5a36d29c7135943b9bb5ea396f70e2e4beb1e4a..8908b7f5b15ce150685868e78e75280bf32053f1 100644
--- a/gcc/config/arm/constraints.md
+++ b/gcc/config/arm/constraints.md
@@ -474,6 +474,12 @@
  (and (match_code "mem")
   (match_test "TARGET_32BIT && arm_coproc_mem_operand (op, FALSE)")))
 
+(define_memory_constraint "Ug"
+ "@internal
+  In Thumb-2 state a valid MVE struct load/store address."
+ (and (match_code "mem")
+  (match_test "TARGET_HAVE_MVE && mve_struct_mem_operand (op)")))
+
 (define_memory_constraint "Uj"
  "@internal
   In ARM/Thumb-2 state a VFP load/store address that supports writeback
diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index b5e6da4b1335818a3e8815de59850e845a2d0400..847bc032afa2c3977c05725562a14940beb282d4 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -99,7 +99,7 @@
 ;; [vst4q])
 ;;
 (define_insn "mve_vst4q"
-  [(set (match_operand:XI 0 "neon_struct_operand" "=Um")
+  [(set (match_operand:XI 0 "mve_struct_operand" "=Ug")
 	(unspec:XI [(match_operand:XI 1 "s_register_operand" "w")
 		(unspec:MVE_VLD_ST [(const_int 0)] UNSPEC_VSTRUCTDUMMY)]
 	 VST4Q))
@@ -9959,7 +9959,7 @@
 ;; [vst2q])
 ;;
 (define_insn "mve_vst2q"
-  [(set (match_operand:OI 0 "neon_struct_operand" "=Um")
+  [(set (match_operand:OI 0 "mve_struct_operand" "=Ug")
 	(unspec:OI [(match_operand:OI 1 "s_register_operand" "w")
 		(unspec:MVE_VLD_ST [(const_int 0)] UNSPEC_VSTRUCTDUMMY)]
 	 VST2Q))
@@ -9988,7 +9988,7 @@
 ;;
 (define_insn "mve_vld2q"
   [(set (match_operand:OI 0 "s_register_operand" "=w")
-	(unspec:OI [(match_operand:OI 1 "neon_struct_operand" "Um")
+	(unspec:OI [(match_operand:OI 1 "mve_struct_operand" "Ug")
 		(unspec:MVE_VLD_ST [(const_int 0)] UNSPEC_VSTRUCTDUMMY)]
 	 VLD2Q))
   ]
@@ -10016,7 +10016,7 @@
 ;;
 (define_insn "mve_vld4q"
   [(set (match_operand:XI 0 "s_register_operand" "=w")
-	(unspec:XI [(match_operand:XI 1 "neon_struct_operand" "Um")
+	(unspec:XI [(match_operand:XI 1 "mve_struct_operand" "Ug")
 		(unspec:MVE_VLD_ST [(const_int 0)] UNSPEC_VSTRUCTDUMMY)]
 	 VLD4Q))
   ]
diff --git a/gcc/config/arm/predicates.md b/gcc/config/arm/predicates.md
index aab5a91ad4ddc6a7a02611d05442d6de63841a7c..67f2fdb4f8f607ceb50871e1bc17dbdb9b987c2c 100644
--- 

Re: Rust front-end patches v4

2022-12-09 Thread Martin Liška
On 12/6/22 11:13, arthur.co...@embecosm.com wrote:
> Similarly to the previous round of patches, this patchset does not contain any
> new features - only fixes for the reviews of the v3. New features will follow
> shortly once that first patchset is merged.
> 
> Once again, thank you to all the contributors who made this possible and
> especially to Philip Herron for his dedication to the project.

Hello.

Congratulations on the patch set approval!

I noticed some minor git issues when I tried applying the patches:

warning: quoted CRLF detected
.git/rebase-apply/patch:3850: trailing whitespace.
  /* TODO: spec syntax rules state that "MacroInvocationSemi" can be used as 
.git/rebase-apply/patch:3851: trailing whitespace.
   * ExternalItem, but text body isn't so clear. Adding MacroInvocationSemi 
warning: 2 lines add whitespace errors.
.git/rebase-apply/patch:3374: indent with spaces.
   \
.git/rebase-apply/patch:3427: indent with spaces.
   \
warning: 2 lines add whitespace errors.
.git/rebase-apply/patch:315: indent with spaces.
// rust precedences
.git/rebase-apply/patch:316: indent with spaces.
PREC_CLOSURE = -40, // used for closures
.git/rebase-apply/patch:317: indent with spaces.
PREC_JUMP = -30,// used for break, continue, return, and yield
.git/rebase-apply/patch:318: indent with spaces.
PREC_RANGE = -10,   // used for range (although weird comment in 
rustc about this)
.git/rebase-apply/patch:319: indent with spaces.
PREC_BINOP = FROM_ASSOC_OP,
warning: squelched 6 whitespace errors
warning: 11 lines add whitespace errors.
.git/rebase-apply/patch:21: trailing whitespace.
; 
.git/rebase-apply/patch:26: trailing whitespace.
; 
warning: 2 lines add whitespace errors.
.git/rebase-apply/patch:22: trailing whitespace.
* If you're unable to find an open issue addressing the problem, [open a new 
one](https://github.com/Rust-GCC/gccrs/issues/new). 
.git/rebase-apply/patch:23: trailing whitespace.
  Be sure to include a **title and clear description**, as much relevant 
information as possible, and a **code sample** 
.git/rebase-apply/patch:36: trailing whitespace.
These will be imported into a GitHub PR to follow the normal review process, 
.git/rebase-apply/patch:43: trailing whitespace.
* Do not open an issue on GitHub until you have collected positive feedback 
about the change. 
.git/rebase-apply/patch:61: trailing whitespace.
* Where possible please add test cases to `gcc/testsuite/rust/` for all PRs. 
warning: squelched 15 whitespace errors
warning: 20 lines add whitespace errors.

Can you please take a look at that?

Cheers,
Martin


Re: [PATCH 3/3]rs6000: NFC no need copy_rtx in rs6000_emit_set_long_const and rs6000_emit_set_const

2022-12-09 Thread Jiufu Guo via Gcc-patches
Jiufu Guo via Gcc-patches  writes:

> Hi Kewen,
>
> On 12/1/22 11:31 AM, Kewen.Lin wrote:
>> Hi Jeff,
>> 
>> on 2022/12/1 09:36, Jiufu Guo wrote:
>>> Hi,
>>>
>>> Function rs6000_emit_set_const/rs6000_emit_set_long_const are only invoked
>>> from two "define_split"s where the target operand is limited to
>>> gpc_reg_operand or int_reg_operand, then the operand must be REG_P.
>>> And in rs6000_emit_set_const/rs6000_emit_set_long_const, to create temp rtx,
>>> it is using code like "gen_reg_rtx({S|D}Imode)", it must also be REG_P.
>>> So, copy_rtx is not needed for temp and dest.
>>>
>>> This patch removes those "copy_rtx" for rs6000_emit_set_const and
>>> rs6000_emit_set_long_const.
>>>
>>> Bootstrap & regtest pass on ppc64{,le}.
>>> Is this ok for trunk? 
>> 
>> This patch is okay, thanks!  For the subject, IMHO it's better to use
>> something like:
>> "rs6000: Remove useless copy_rtx in rs6000_emit_set_{,long}_const".
>> I don't see NFC tag used much in GCC, though it's used a lot in llvm, but
>> anyway you can append (NFC)/[NFC] at the end if you like.  :)
>> 
>
> "rs6000: Remove useless copy_rtx in rs6000_emit_set_{,long}_const" is great!
>
> Thanks for your review and suggestions!

Thanks for the comments and review! Committed as r13-4583-g71b31d13757ae0.

BR,
Jeff (Jiufu)
>
>
> BR,
> Jeff (Jiufu)
>
>> BR,
>> Kewen


Re: [PATCH Rust front-end v4 46/46] gccrs: Add README, CONTRIBUTING and compiler logo

2022-12-09 Thread Martin Liška
On 12/6/22 11:14, arthur.co...@embecosm.com wrote:
> We still need to write out a documentation section, but these READMEs will
> help in the meantime.

Hello.

Just a quick comment: The Sphinx conversion didn't make it for all GCC manuals.
However, you can still use Sphinx for a newly created manual, similarly to what
the libgccjit or Ada manuals do.

Cheers,
Martin


[PATCH] Backport gcc-12: jobserver FIFO support

2022-12-09 Thread Martin Liška
Hi.

As make 4.4 has been released, it switches to FIFO by default. That causes
trouble for the latest GCC release, version 12. Right now, we've been using
the following 4 patches in the openSUSE gcc12 package:

1270ccda70ca09f7d4fe76b5156dca8992bd77a6
53e3b2bf16a486c15c20991c6095f7be09012b55
fed766af32ed6cd371016cc24e931131e19b4eb1
3f1c2f89f6b8b8d23a9072f8549b0a2c1de06b03

Would it be fine to backport them to the gcc-12 branch? Arsen asked me about
this today, as Gentoo people want it as well.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin


Re: [PATCH v3] LoongArch: Fixed a compilation failure with '%c' in inline assembly [PR107731].

2022-12-09 Thread Richard Sandiford via Gcc-patches
Lulu Cheng  writes:
> There is a description of '%c', '%n', '%a' and '%l' in section 17.5 of
> gccint.pdf.  So I understand that these descriptors are the ones that the
> common code requires every back end to support, right?
> But I don't see these descriptors described in gcc.pdf.  Now I want to add the
> descriptor information under LoongArch.  I had to add '%c' to the schema
> section.
> Is there a better solution to this?

It looks right to me FWIW.  I agree it seems odd that %c, %n, %a and %l
aren't mentioned in the user-facing documentation, given that md.texi
implies that all targets must support them.  Looks like the user
documentation just says "Typically these qualifiers are hardware 
dependent.", without hinting what the 4 atypical cases are.
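
For illustration, the behaviour that the quoted patch gives '%c' (print a
constant integer operand in plain decimal, otherwise report an unsupported
operand) can be modelled in a few lines of stand-alone C.  The types and
the function name here are hypothetical and are not GCC's.

```c
#include <stddef.h>
#include <stdio.h>

enum operand_kind { OPERAND_CONST_INT, OPERAND_REG };

struct operand
{
  enum operand_kind kind;
  long long value;  /* Only meaningful for OPERAND_CONST_INT.  */
};

/* Model of the 'c' modifier: write a constant integer in plain decimal.
   Returns the number of characters written, or -1 for an operand that
   is not a constant integer.  */
int
print_operand_c (char *buf, size_t size, const struct operand *op)
{
  if (op->kind != OPERAND_CONST_INT)
    return -1;
  return snprintf (buf, size, "%lld", op->value);
}
```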

Thanks,
Richard

>
>
> V2 -> v3:
> 1. Correct a clerical error.
> 2. Add documentation for LoongArch operand modifiers.
>
> ---
>
> Co-authored-by: Yang Yujie 
>
> gcc/ChangeLog:
>
>   * config/loongarch/loongarch.cc (loongarch_classify_address):
>   Add processing for CONST_INT.
>   (loongarch_print_operand_reloc): Operand modifier 'c' is supported.
>   (loongarch_print_operand): Add handling of '%c'.
>   * doc/extend.texi: Add documentation for LoongArch operand modifiers.
>
> gcc/testsuite/ChangeLog:
>
>   * gcc.target/loongarch/tst-asm-const.c: Moved to...
>   * gcc.target/loongarch/pr107731.c: ...here.
> ---
>  gcc/config/loongarch/loongarch.cc| 14 ++
>  gcc/doc/extend.texi  | 16 
>  .../loongarch/{tst-asm-const.c => pr107731.c}|  6 +++---
>  3 files changed, 33 insertions(+), 3 deletions(-)
>  rename gcc/testsuite/gcc.target/loongarch/{tst-asm-const.c => pr107731.c} 
> (78%)
>
> diff --git a/gcc/config/loongarch/loongarch.cc 
> b/gcc/config/loongarch/loongarch.cc
> index c6b03fcf2f9..cdf190b985e 100644
> --- a/gcc/config/loongarch/loongarch.cc
> +++ b/gcc/config/loongarch/loongarch.cc
> @@ -2075,6 +2075,11 @@ loongarch_classify_address (struct 
> loongarch_address_info *info, rtx x,
>return (loongarch_valid_base_register_p (info->reg, mode, strict_p)
> && loongarch_valid_lo_sum_p (info->symbol_type, mode,
>  info->offset));
> +case CONST_INT:
> +  /* Small-integer addresses don't occur very often, but they
> +  are legitimate if $r0 is a valid base register.  */
> +  info->type = ADDRESS_CONST_INT;
> +  return IMM12_OPERAND (INTVAL (x));
>  
>  default:
>return false;
> @@ -4933,6 +4938,7 @@ loongarch_print_operand_reloc (FILE *file, rtx op, bool 
> hi64_part,
>  
> 'A'   Print a _DB suffix if the memory model requires a release.
> 'b'   Print the address of a memory operand, without offset.
> +   'c'  Print an integer.
> 'C'   Print the integer branch condition for comparison OP.
> 'd'   Print CONST_INT OP in decimal.
> 'F'   Print the FPU branch condition for comparison OP.
> @@ -4979,6 +4985,14 @@ loongarch_print_operand (FILE *file, rtx op, int 
> letter)
> fputs ("_db", file);
>break;
>  
> +case 'c':
> +  if (CONST_INT_P (op))
> + fprintf (file, HOST_WIDE_INT_PRINT_DEC, INTVAL (op));
> +  else
> + output_operand_lossage ("unsupported operand for code '%c'", letter);
> +
> +  break;
> +
>  case 'C':
>loongarch_print_int_branch_condition (file, code, letter);
>break;
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index b1dd39e64b8..5a8b9489f3d 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -11374,6 +11374,22 @@ constant.  Used to select the specified bit position.
>  @item @code{x} @tab Equivialent to @code{X}, but only for pointers.
>  @end multitable
>  
> +@anchor{loongarchOperandmodifiers}
> +@subsubsection LoongArch Operand Modifiers
> +
> +The list below describes the supported modifiers and their effects for 
> LoongArch.
> +
> +@multitable @columnfractions .10 .90
> +@headitem Modifier @tab Description
> +@item @code{c} @tab Print a constant integer operand in decimal.
> +@item @code{d} @tab Same as @code{c}.
> +@item @code{i} @tab Print the character ''@code{i}'' if the operand is not a 
> register.
> +@item @code{m} @tab Same as @code{c}, but the printed value is @code{operand 
> - 1}.
> +@item @code{X} @tab Print a constant integer operand in hexadecimal.
> +@item @code{z} @tab Print the operand in its unmodified form, followed by a 
> comma.
> +@end multitable
> +
> +
>  @lowersections
>  @include md.texi
>  @raisesections
> diff --git a/gcc/testsuite/gcc.target/loongarch/tst-asm-const.c 
> b/gcc/testsuite/gcc.target/loongarch/pr107731.c
> similarity index 78%
> rename from gcc/testsuite/gcc.target/loongarch/tst-asm-const.c
> rename to gcc/testsuite/gcc.target/loongarch/pr107731.c
> index 2e04b99e301..80d84c48c6e 100644
> --- a/gcc/testsuite/gcc.target/loongarch/tst-asm-const.c
> +++ 

[PATCH v3] LoongArch: Fixed a compilation failure with '%c' in inline assembly [PR107731].

2022-12-09 Thread Lulu Cheng
There is a description of '%c', '%n', '%a' and '%l' in section 17.5 of
gccint.pdf.  So I understand that these descriptors are the ones that the
common code requires every back end to support, right?
But I don't see these descriptors described in gcc.pdf.  Now I want to add the
descriptor information under LoongArch.  I had to add '%c' to the schema
section.
Is there a better solution to this?


V2 -> v3:
1. Correct a clerical error.
2. Add documentation for LoongArch operand modifiers.

---

Co-authored-by: Yang Yujie 

gcc/ChangeLog:

* config/loongarch/loongarch.cc (loongarch_classify_address):
Add processing for CONST_INT.
(loongarch_print_operand_reloc): Operand modifier 'c' is supported.
(loongarch_print_operand): Add handling of '%c'.
* doc/extend.texi: Add documentation for LoongArch operand modifiers.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/tst-asm-const.c: Moved to...
* gcc.target/loongarch/pr107731.c: ...here.
---
 gcc/config/loongarch/loongarch.cc| 14 ++
 gcc/doc/extend.texi  | 16 
 .../loongarch/{tst-asm-const.c => pr107731.c}|  6 +++---
 3 files changed, 33 insertions(+), 3 deletions(-)
 rename gcc/testsuite/gcc.target/loongarch/{tst-asm-const.c => pr107731.c} (78%)

diff --git a/gcc/config/loongarch/loongarch.cc 
b/gcc/config/loongarch/loongarch.cc
index c6b03fcf2f9..cdf190b985e 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -2075,6 +2075,11 @@ loongarch_classify_address (struct 
loongarch_address_info *info, rtx x,
   return (loongarch_valid_base_register_p (info->reg, mode, strict_p)
  && loongarch_valid_lo_sum_p (info->symbol_type, mode,
   info->offset));
+case CONST_INT:
+  /* Small-integer addresses don't occur very often, but they
+are legitimate if $r0 is a valid base register.  */
+  info->type = ADDRESS_CONST_INT;
+  return IMM12_OPERAND (INTVAL (x));
 
 default:
   return false;
@@ -4933,6 +4938,7 @@ loongarch_print_operand_reloc (FILE *file, rtx op, bool 
hi64_part,
 
'A' Print a _DB suffix if the memory model requires a release.
'b' Print the address of a memory operand, without offset.
+   'c'  Print an integer.
'C' Print the integer branch condition for comparison OP.
'd' Print CONST_INT OP in decimal.
'F' Print the FPU branch condition for comparison OP.
@@ -4979,6 +4985,14 @@ loongarch_print_operand (FILE *file, rtx op, int letter)
fputs ("_db", file);
   break;
 
+case 'c':
+  if (CONST_INT_P (op))
+   fprintf (file, HOST_WIDE_INT_PRINT_DEC, INTVAL (op));
+  else
+   output_operand_lossage ("unsupported operand for code '%c'", letter);
+
+  break;
+
 case 'C':
   loongarch_print_int_branch_condition (file, code, letter);
   break;
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index b1dd39e64b8..5a8b9489f3d 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -11374,6 +11374,22 @@ constant.  Used to select the specified bit position.
 @item @code{x} @tab Equivialent to @code{X}, but only for pointers.
 @end multitable
 
+@anchor{loongarchOperandmodifiers}
+@subsubsection LoongArch Operand Modifiers
+
+The list below describes the supported modifiers and their effects for 
LoongArch.
+
+@multitable @columnfractions .10 .90
+@headitem Modifier @tab Description
+@item @code{c} @tab Print a constant integer operand in decimal.
+@item @code{d} @tab Same as @code{c}.
+@item @code{i} @tab Print the character ''@code{i}'' if the operand is not a 
register.
+@item @code{m} @tab Same as @code{c}, but the printed value is @code{operand - 
1}.
+@item @code{X} @tab Print a constant integer operand in hexadecimal.
+@item @code{z} @tab Print the operand in its unmodified form, followed by a 
comma.
+@end multitable
+
+
 @lowersections
 @include md.texi
 @raisesections
diff --git a/gcc/testsuite/gcc.target/loongarch/tst-asm-const.c 
b/gcc/testsuite/gcc.target/loongarch/pr107731.c
similarity index 78%
rename from gcc/testsuite/gcc.target/loongarch/tst-asm-const.c
rename to gcc/testsuite/gcc.target/loongarch/pr107731.c
index 2e04b99e301..80d84c48c6e 100644
--- a/gcc/testsuite/gcc.target/loongarch/tst-asm-const.c
+++ b/gcc/testsuite/gcc.target/loongarch/pr107731.c
@@ -1,13 +1,13 @@
-/* Test asm const. */
 /* { dg-do compile } */
 /* { dg-final { scan-assembler-times "foo:.*\\.long 1061109567.*\\.long 52" 1 
} } */
+
 int foo ()
 {
   __asm__ volatile (
   "foo:"
   "\n\t"
- ".long %a0\n\t"
- ".long %a1\n\t"
+ ".long %c0\n\t"
+ ".long %c1\n\t"
  :
  :"i"(0x3f3f3f3f), "i"(52)
  :
-- 
2.31.1



Re: [PATCH] i386: fix assert (__builtin_cpu_supports ("x86-64") >= 0)

2022-12-09 Thread Jakub Jelinek via Gcc-patches
On Fri, Dec 09, 2022 at 10:18:34AM +0100, Martin Liška wrote:
> I'm going to push the revision.
> 
> What exactly do you mean by random? I just know there are differences between
> x86 and ppc:
> 
> int __builtin_cpu_supports(const char *feature)
> This function returns a positive integer if the run-time CPU supports feature 
> and returns 0 otherwise.
> 
> This function returns a value of 1 if the run-time CPU supports the HWCAP 
> feature feature and returns 0 otherwise.

Because x86-64-v2 etc. isn't a HWCAP feature, but rather an architecture
(or set of canned HWCAP features).  So __builtin_cpu_is for those,
or especially for __builtin_cpu_is ("x86-64") would make more sense.
Though I see that many of the valid -march= values actually aren't supported
by __builtin_cpu_is either, whether it is pentium, nocona, lakemont etc.,
although it does support, say, some vendors (intel, amd).

Jakub



Re: [PATCH] i386: fix assert (__builtin_cpu_supports ("x86-64") >= 0)

2022-12-09 Thread Martin Liška
On 12/7/22 12:27, Jakub Jelinek wrote:
> On Fri, Nov 25, 2022 at 01:57:35PM +0100, Martin Liška wrote:
>> PR target/107551
>>
>> gcc/ChangeLog:
>>
>>  * config/i386/i386-builtins.cc (fold_builtin_cpu): Use same path
>>  as for PR103661.
>>  * doc/extend.texi: Fix "x86-64" use.
>>
>> gcc/testsuite/ChangeLog:
>>
>>  * gcc.target/i386/builtin_target.c: Add more checks.
>> +
>> +  field_val = (1U << feature);
> 
> Just
>   field_val = 1U << feature;
> ?
> 
>> +  final = build2 (BIT_AND_EXPR, unsigned_type_node, array_elt,
>> +  build_int_cstu (unsigned_type_node, field_val));
>> +  if (feature == (INT_TYPE_SIZE - 1))
> 
> Just
>   if (feature == INT_TYPE_SIZE - 1)
> ?

Sure, included the 2 aforementioned suggestions.

> 
>> +return build2 (NE_EXPR, integer_type_node, final,
>> +   build_int_cst (unsigned_type_node, 0));
>> +  else
>> +return build1 (NOP_EXPR, integer_type_node, final);
>>  }
>>gcc_unreachable ();
>>  }
> 
> Otherwise LGTM, though I must say the distinction for when
> __builtin_cpu_is and __builtin_cpu_supports works looks completely random.

I'm going to push the revision.

What exactly do you mean by random? I just know there are differences between
x86 and ppc:

int __builtin_cpu_supports(const char *feature)
This function returns a positive integer if the run-time CPU supports feature 
and returns 0 otherwise.

This function returns a value of 1 if the run-time CPU supports the HWCAP 
feature feature and returns 0 otherwise.
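
To make the top-bit special case in the hunk quoted above concrete, here
is a small stand-alone model of it.  The function name and the plain
"word" argument are hypothetical; the real code builds trees rather than
computing values.  As I read the hunk, a plain conversion of the masked
value would be negative for the highest bit, so that case is reduced to
0/1 instead.

```c
#include <limits.h>

#define INT_BITS ((int) (sizeof (int) * CHAR_BIT))

/* Return a non-negative, non-zero value if bit FEATURE is set in WORD
   and 0 otherwise.  */
int
cpu_feature_test (unsigned int word, int feature)
{
  unsigned int masked = word & (1U << feature);

  /* The top bit would turn negative when converted to int, so compare
     against zero instead of returning the raw bit.  */
  if (feature == INT_BITS - 1)
    return masked != 0;
  return (int) masked;
}
```

Callers that want a strict 0/1 answer on every target can still write
"!= 0" on the result.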

Martin

> 
>   Jakub
> 



Re: [PATCH] IPA: do not release body if still needed

2022-12-09 Thread Martin Liška
PING^1

On 12/1/22 10:59, Martin Liška wrote:
> Hi.
> 
> Noticed during building of libbackend.a with the LTO partial linking.
> 
> The function release_body is called even if clone_of is a clone
> of another function and thus shares its tree declaration. We should
> preserve it in that situation.
> 
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> 
> Ready to be installed?
> Thanks,
> Martin
> 
>   PR ipa/107944
> 
> gcc/ChangeLog:
> 
>   * cgraph.cc (cgraph_node::remove): Do not release body
>   if a node is clone of another node.
> ---
>  gcc/cgraph.cc | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gcc/cgraph.cc b/gcc/cgraph.cc
> index f15cb47c8b8..2e7d77ffd6c 100644
> --- a/gcc/cgraph.cc
> +++ b/gcc/cgraph.cc
> @@ -1893,7 +1893,7 @@ cgraph_node::remove (void)
>else if (clone_of)
>  {
>clone_of->clones = next_sibling_clone;
> -  if (!clone_of->analyzed && !clone_of->clones && !clones)
> +  if (!clone_of->analyzed && !clone_of->clones && !clones && 
> !clone_of->clone_of)
>   clone_of->release_body ();
>  }
>if (next_sibling_clone)
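
The condition changed by the hunk above can be modelled on a toy node
type as follows.  This is a simplified sketch: the struct and the
function name are hypothetical, and it assumes the origin's clone list
has already been updated to exclude the node being removed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the relevant cgraph_node fields.  */
struct node
{
  bool analyzed;
  struct node *clones;    /* First clone of this node, if any.  */
  struct node *clone_of;  /* Node this one was cloned from, if any.  */
};

/* Decide whether the body of REMOVED's origin may be released.  The last
   test is the one added by the patch: an origin that is itself a clone
   shares its declaration with another node and must keep its body.  */
bool
may_release_origin_body (const struct node *removed)
{
  const struct node *origin = removed->clone_of;
  return origin != NULL
	 && !origin->analyzed
	 && origin->clones == NULL
	 && removed->clones == NULL
	 && origin->clone_of == NULL;
}
```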



Re: [PATCH] ipa: silent -Wodr notes with -w

2022-12-09 Thread Martin Liška
PING^1

On 12/2/22 12:27, Martin Liška wrote:
> If -w is used, warn_odr properly sets *warned = false and
> so it should be preserved when calling warn_types_mismatch.
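
The difference between the old and the new test is easy to miss: the old
"warn && warned" only checks that the pointer is non-null, not the flag
it points to.  A minimal stand-alone sketch of the new condition follows;
the function name is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Emit the follow-up notes only if warnings are requested and either
   nobody tracks whether a warning was issued (WARNED is null) or a
   warning was actually issued (*WARNED is true).  */
bool
should_explain_mismatch (bool warn, const bool *warned)
{
  return warn && (warned == NULL || *warned);
}
```

With -w the flag stays false, so the notes are skipped even though the
pointer itself is valid.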
> 
> Noticed that during an LTO reduction where I used -w.
> 
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> 
> Ready to be installed?
> Thanks,
> Martin
> 
> gcc/ChangeLog:
> 
>   * ipa-devirt.cc (odr_types_equivalent_p): Respect *warned
>   value if set.
> ---
>  gcc/ipa-devirt.cc | 12 ++--
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/gcc/ipa-devirt.cc b/gcc/ipa-devirt.cc
> index 265d07bb354..bcdc50c5bd7 100644
> --- a/gcc/ipa-devirt.cc
> +++ b/gcc/ipa-devirt.cc
> @@ -1300,7 +1300,7 @@ odr_types_equivalent_p (tree t1, tree t2, bool warn, 
> bool *warned,
> warn_odr (t1, t2, NULL, NULL, warn, warned,
>   G_("it is defined as a pointer to different type "
>  "in another translation unit"));
> -   if (warn && warned)
> +   if (warn && (warned == NULL || *warned))
>   warn_types_mismatch (TREE_TYPE (t1), TREE_TYPE (t2),
>loc1, loc2);
> return false;
> @@ -1315,7 +1315,7 @@ odr_types_equivalent_p (tree t1, tree t2, bool warn, 
> bool *warned,
> warn_odr (t1, t2, NULL, NULL, warn, warned,
>   G_("a different type is defined "
>  "in another translation unit"));
> -   if (warn && warned)
> +   if (warn && (warned == NULL || *warned))
>   warn_types_mismatch (TREE_TYPE (t1), TREE_TYPE (t2), loc1, loc2);
> return false;
>   }
> @@ -1333,7 +1333,7 @@ odr_types_equivalent_p (tree t1, tree t2, bool warn, 
> bool *warned,
>   warn_odr (t1, t2, NULL, NULL, warn, warned,
> G_("a different type is defined in another "
>"translation unit"));
> - if (warn && warned)
> + if (warn && (warned == NULL || *warned))
> warn_types_mismatch (TREE_TYPE (t1), TREE_TYPE (t2), loc1, loc2);
> }
>   gcc_assert (TYPE_STRING_FLAG (t1) == TYPE_STRING_FLAG (t2));
> @@ -1375,7 +1375,7 @@ odr_types_equivalent_p (tree t1, tree t2, bool warn, 
> bool *warned,
> warn_odr (t1, t2, NULL, NULL, warn, warned,
>   G_("has different return value "
>  "in another translation unit"));
> -   if (warn && warned)
> +   if (warn && (warned == NULL || *warned))
>   warn_types_mismatch (TREE_TYPE (t1), TREE_TYPE (t2), loc1, loc2);
> return false;
>   }
> @@ -1398,7 +1398,7 @@ odr_types_equivalent_p (tree t1, tree t2, bool warn, 
> bool *warned,
> warn_odr (t1, t2, NULL, NULL, warn, warned,
>   G_("has different parameters in another "
>  "translation unit"));
> -   if (warn && warned)
> +   if (warn && (warned == NULL || *warned))
>   warn_types_mismatch (TREE_VALUE (parms1),
>TREE_VALUE (parms2), loc1, loc2);
> return false;
> @@ -1484,7 +1484,7 @@ odr_types_equivalent_p (tree t1, tree t2, bool warn, 
> bool *warned,
>   warn_odr (t1, t2, f1, f2, warn, warned,
> G_("a field of same name but different type "
>"is defined in another translation unit"));
> - if (warn && warned)
> + if (warn && (warned == NULL || *warned))
> warn_types_mismatch (TREE_TYPE (f1), TREE_TYPE (f2), 
> loc1, loc2);
>   return false;
> }