Re: [PATCH, rs6000] Change insn condition from TARGET_64BIT to TARGET_POWERPC64 for VSX scalar extract/insert instructions

2022-08-25 Thread Kewen.Lin via Gcc-patches
on 2022/8/26 10:42, HAO CHEN GUI wrote:
> Hi David,
> 
> On 25/8/2022 10:01 PM, David Edelsohn wrote:
>> On Thu, Aug 25, 2022 at 1:22 AM Kewen.Lin  wrote:
>>>
>>> on 2022/8/25 11:37, HAO CHEN GUI wrote:
>>>> Hi,
>>>>
>>>> On 24/8/2022 1:24 PM, Kewen.Lin wrote:
>>>>> Could you try to test with dg-options "-mdejagnu-cpu=power9 -mpowerpc64"
>>>>> all the time, but still having that has_arch_ppc64 effective target on aix?
>>>>>
>>>>> I'd expect has_arch_ppc64 check to fail on aix 32bit, the error will not
>>>>> be a problem (turning into an UNSUPPORTED then)?
>>>>
>>>> I tested it on AIX. "has_arch_ppc64" fails with dg-options
>>>> "-mdejagnu-cpu=power9 -mpowerpc64" on a 32-bit AIX environment. It works
>>>> as we expected.
>>>
>>> Nice, thanks for your time on testing.
>>>
>>>>
>>>> Also I found that AIX and Darwin are skipped for the bfp tests, so the
>>>> test case doesn't need to care about them. Not sure if that is intentional.
>>>>
>>>> In bfp.exp
>>>>
>>>> # Exit immediately if this isn't a PowerPC target or if the target is
>>>> # aix or Darwin.
>>>> if { (![istarget powerpc*-*-*] && ![istarget rs6000-*-*])
>>>>  || [istarget "powerpc*-*-aix*"]
>>>>  || [istarget "powerpc*-*-darwin*"]  } then {
>>>>   return
>>>> }
>>>
>>> I can't find a hint about why we wanted to disable bfp testing on aix, it
>>> looks like overkill to me.
>>>
>>> Could you help to further test whether all test cases in this small bucket
>>> are available on aix?
>>>
>>> Maybe it can give us some evidence on whether it's intentional or not.
>>>
>>> Hi David & Segher,
>>>
>>> Do you have some insights on this?
>>
>> AIX (and Darwin) are not Linux and not ELF.  There is no support for
>> BPF.  All of the tests fail, so they are skipped.
> 
> Thanks so much for your info.

+1!

But I guess the name BFP (short for IEEE binary floating point, IMHO) was
misread as BPF (eBPF)?

> 
> Here are test results on P7 AIX7.1. I tested all scalar-extract-sig-* and
> scalar-insert-exp-* cases in the "testsuite/powerpc/bfp" folder. All
> compile-only cases pass except those that use __ieee128. The runnable cases
> fail as expected: p9vector is not supported on P7 servers.

Thanks for testing again.

> 
> So the __ieee128 blocks Binary floating-point on AIX?

We can add one extra effective target ppc_ieee128_ok for them, aix is already 
excluded in the check.

BR,
Kewen


Re: [PATCH, rs6000] Change insn condition from TARGET_64BIT to TARGET_POWERPC64 for VSX scalar extract/insert instructions

2022-08-25 Thread HAO CHEN GUI via Gcc-patches
Hi David,

On 25/8/2022 10:01 PM, David Edelsohn wrote:
> On Thu, Aug 25, 2022 at 1:22 AM Kewen.Lin  wrote:
>>
>> on 2022/8/25 11:37, HAO CHEN GUI wrote:
>>> Hi,
>>>
>>> On 24/8/2022 1:24 PM, Kewen.Lin wrote:
>>>> Could you try to test with dg-options "-mdejagnu-cpu=power9 -mpowerpc64"
>>>> all the time, but still having that has_arch_ppc64 effective target on aix?
>>>>
>>>> I'd expect has_arch_ppc64 check to fail on aix 32bit, the error will not
>>>> be a problem (turning into an UNSUPPORTED then)?
>>>
>>> I tested it on AIX. "has_arch_ppc64" fails with dg-options 
>>> "-mdejagnu-cpu=power9 -mpowerpc64" on
>>> 32-bit AIX environment. It works as we expected.
>>
>> Nice, thanks for your time on testing.
>>
>>>
>>> Also I found that AIX and Darwin are skipped for the bfp tests, so the
>>> test case doesn't need to care about them. Not sure if that is intentional.
>>>
>>> In bfp.exp
>>>
>>> # Exit immediately if this isn't a PowerPC target or if the target is
>>> # aix or Darwin.
>>> if { (![istarget powerpc*-*-*] && ![istarget rs6000-*-*])
>>>  || [istarget "powerpc*-*-aix*"]
>>>  || [istarget "powerpc*-*-darwin*"]  } then {
>>>   return
>>> }
>>
>> I can't find a hint about why we wanted to disable bfp testing on aix, it
>> looks like overkill to me.
>>
>> Could you help to further test whether all test cases in this small bucket
>> are available on aix?
>>
>> Maybe it can give us some evidence on whether it's intentional or not.
>>
>> Hi David & Segher,
>>
>> Do you have some insights on this?
> 
> AIX (and Darwin) are not Linux and not ELF.  There is no support for
> BPF.  All of the tests fail, so they are skipped.

Thanks so much for your info.

Here are test results on P7 AIX7.1. I tested all scalar-extract-sig-* and
scalar-insert-exp-* cases in the "testsuite/powerpc/bfp" folder. All
compile-only cases pass except those that use __ieee128. The runnable cases
fail as expected: p9vector is not supported on P7 servers.

So the __ieee128 blocks Binary floating-point on AIX?

Thanks
Gui Haochen
> 
> Thanks, David


[PATCH] libstdc++: Add test for std::con/disjunction's base class

2022-08-25 Thread Patrick Palka via Gcc-patches
Tested on x86_64-pc-linux-gnu, does this look OK for trunk?

libstdc++-v3/ChangeLog:

* testsuite/20_util/logical_traits/requirements/base_classes.cc: New 
test.
---
 .../requirements/base_classes.cc  | 37 +++
 1 file changed, 37 insertions(+)
 create mode 100644 
libstdc++-v3/testsuite/20_util/logical_traits/requirements/base_classes.cc

diff --git 
a/libstdc++-v3/testsuite/20_util/logical_traits/requirements/base_classes.cc 
b/libstdc++-v3/testsuite/20_util/logical_traits/requirements/base_classes.cc
new file mode 100644
index 00000000000..6699037bb14
--- /dev/null
+++ b/libstdc++-v3/testsuite/20_util/logical_traits/requirements/base_classes.cc
@@ -0,0 +1,37 @@
+// { dg-do compile { target c++17 } }
+
+#include <type_traits>
+
+using std::true_type;
+using std::false_type;
+
+template<int> struct T : std::true_type { };
+template<int> struct F : std::false_type { };
+
+// [meta.logical]/5: The specialization conjunction<B_1, ..., B_n> has a
+// public and unambiguous base that is either:
+//   - the first type B_i in the list true_type, B_1, ..., B_n for which
+//   bool(B_i::value) is false, or
+//   - if there is no such Bi, the last type in the list.
+
+static_assert(std::is_base_of_v<true_type, std::conjunction<>>);
+static_assert(std::is_base_of_v<T<0>, std::conjunction<T<0>>>);
+static_assert(std::is_base_of_v<F<0>, std::conjunction<F<0>>>);
+static_assert(std::is_base_of_v<T<1>, std::conjunction<T<0>, T<1>>>);
+static_assert(std::is_base_of_v<F<0>, std::conjunction<F<0>, F<1>>>);
+static_assert(std::is_base_of_v<F<0>, std::conjunction<T<0>, F<0>, F<1>>>);
+static_assert(std::is_base_of_v<F<0>, std::conjunction<T<0>, F<0>, T<1>, F<1>>>);
+
+// [meta.logical]/10: The specialization disjunction<B_1, ..., B_n> has a
+// public and unambiguous base that is either:
+//   - the first type B_i in the list false_type, B_1, ..., B_n for which
+//   bool(B_i::value) is true, or
+//   - if there is no such Bi, the last type in the list.
+
+static_assert(std::is_base_of_v<false_type, std::disjunction<>>);
+static_assert(std::is_base_of_v<T<0>, std::disjunction<T<0>>>);
+static_assert(std::is_base_of_v<F<0>, std::disjunction<F<0>>>);
+static_assert(std::is_base_of_v<T<0>, std::disjunction<T<0>, T<1>>>);
+static_assert(std::is_base_of_v<F<1>, std::disjunction<F<0>, F<1>>>);
+static_assert(std::is_base_of_v<T<0>, std::disjunction<T<0>, F<0>, F<1>>>);
+static_assert(std::is_base_of_v<T<0>, std::disjunction<T<0>, F<0>, T<1>, F<1>>>);
-- 
2.37.2.382.g795ea8776b
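
The base-class rule quoted in the test's comments can also be observed through members other than `value`. The following is only a minimal sketch under C++17: the types `Yes`, `No` and `AlsoNo` and their `tag` members are hypothetical names made up for illustration and are not part of the patch.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical trait types, each carrying an extra member so the selected
// base class can be identified.
struct Yes    : std::true_type  { static constexpr int tag = 1; };
struct No     : std::false_type { static constexpr int tag = 2; };
struct AlsoNo : std::false_type { static constexpr int tag = 3; };

// conjunction derives from the first operand whose value is false.
using C = std::conjunction<Yes, No, AlsoNo>;
// disjunction derives from the first operand whose value is true.
using D = std::disjunction<No, Yes, AlsoNo>;

static_assert(std::is_base_of_v<No, C>);
static_assert(!std::is_base_of_v<AlsoNo, C>);
static_assert(std::is_base_of_v<Yes, D>);

// Because the base is the operand type itself, its extra members are inherited.
constexpr int conjunction_tag() { return C::tag; }
constexpr int disjunction_tag() { return D::tag; }
```

Here `conjunction_tag()` reads the member inherited from `No` and `disjunction_tag()` the one inherited from `Yes`.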



Re: [PATCH v4] c++: Implement -Wself-move warning [PR81159]

2022-08-25 Thread Jason Merrill via Gcc-patches

On 8/25/22 17:49, Marek Polacek wrote:
> On Thu, Aug 25, 2022 at 09:25:43AM -0400, Jason Merrill wrote:
>> On 8/24/22 17:30, Marek Polacek wrote:
>>> On Tue, Aug 23, 2022 at 05:27:00PM -0400, Jason Merrill wrote:
>>>> On 8/23/22 09:39, Marek Polacek wrote:
>>>>> +  tree arg = CALL_EXPR_ARG (fn, 0);
>>>>> +  extract_op (arg);
>>>>> +  if (TREE_CODE (arg) == ADDR_EXPR)
>>>>> +    arg = TREE_OPERAND (arg, 0);
>>>>> +  tree type = TREE_TYPE (lhs);
>>>>> +  lhs = maybe_undo_parenthesized_ref (lhs);
>>>>> +  STRIP_ANY_LOCATION_WRAPPER (lhs);
>>>>> +  const bool print_var_p = (DECL_P (lhs)
>>>>> +                            || REFERENCE_REF_P (lhs)
>>>>> +                            || TREE_CODE (lhs) == COMPONENT_REF);
>>>>
>>>> Why include REFERENCE_REF_P and COMPONENT_REF?  Reference refs should be
>>>> stripped before this test, member refs aren't variables.
>>>
>>> I'm checking REFERENCE_REF_P and COMPONENT_REF to say "moving a variable"
>>> in #1 and #3.  The REFERENCE_REF_P check means that we also say "variable"
>>> for #2.  Sure, "A variable is introduced by the declaration of a reference
>>> other than a non-static data member", but I'm not sure if users care about
>>> that here?
>>>
>>> If I strip REFERENCE_REFs before the check then the result will be the
>>> same.
>>
>> That's what I was suggesting, yes: Strip the REFERENCE_REF so DECL_P can see
>> the decl.
>
> Ok, I've added the REFERENCE_REF stripping.  But I've still left the
> COMPONENT_REF in.  Perhaps we could say "moving a member" to itself for
> COMPONENT_REFs.  Or just say "moving 'x' of type 'int' to itself" and
> avoid all of this.  :)

Sure, that would be simpler.  In any case, we shouldn't call member 
references variables.


Jason



Re: [PATCH v3] c: Implement C23 nullptr (N3042)

2022-08-25 Thread Marek Polacek via Gcc-patches
On Thu, Aug 25, 2022 at 09:12:07PM +0000, Joseph Myers wrote:
> On Thu, 25 Aug 2022, Marek Polacek via Gcc-patches wrote:
> 
> > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> 
> This version is OK.

Thanks a lot.  I've gone ahead and updated GCC 13 changes.html:

commit 54b8bcfe7c65290942c5fccd89edd658b97774af
Author: Marek Polacek 
Date:   Thu Aug 25 18:12:15 2022 -0400

gcc-13/changes: Update with recent C23 and C++23 proposals

diff --git a/htdocs/gcc-13/changes.html b/htdocs/gcc-13/changes.html
index 03a77276..c58ca1d4 100644
--- a/htdocs/gcc-13/changes.html
+++ b/htdocs/gcc-13/changes.html
@@ -73,6 +73,13 @@ a work-in-progress.
 
 C
 
+  Several C23 features have been implemented:
+
+  <a href="https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3042.htm">N3042</a>,
+ Introduce the nullptr constant
+  Support for empty initializer braces
+
+  
   New warnings:
 
   -Wenum-int-mismatch warns about mismatches between an
@@ -93,8 +100,20 @@ a work-in-progress.
  reference binding to temporary
  (<a href="https://gcc.gnu.org/PR104477">PR104477</a>)
   
+  <a href="https://wg21.link/p2327">P2327R1</a>, De-deprecating volatile
+ compound operations
+  
+  <a href="https://wg21.link/p2437">P2437R1</a>, Support for
+ #warning
+ (<a href="https://gcc.gnu.org/PR106646">PR106646</a>)
+  
+  <a href="https://wg21.link/p2290">P2290R3</a>, Delimited escape sequences
+ (<a href="https://gcc.gnu.org/PR106645">PR106645</a>)
+  
 
   
+  The -Wpessimizing-move and -Wredundant-move
+  warnings have been extended to warn in more contexts.
 
 
 



Re: [PATCH v4] c++: Implement -Wself-move warning [PR81159]

2022-08-25 Thread Marek Polacek via Gcc-patches
On Thu, Aug 25, 2022 at 09:25:43AM -0400, Jason Merrill wrote:
> On 8/24/22 17:30, Marek Polacek wrote:
> > On Tue, Aug 23, 2022 at 05:27:00PM -0400, Jason Merrill wrote:
> > > On 8/23/22 09:39, Marek Polacek wrote:
> > > > +  tree arg = CALL_EXPR_ARG (fn, 0);
> > > > +  extract_op (arg);
> > > > +  if (TREE_CODE (arg) == ADDR_EXPR)
> > > > +    arg = TREE_OPERAND (arg, 0);
> > > > +  tree type = TREE_TYPE (lhs);
> > > > +  lhs = maybe_undo_parenthesized_ref (lhs);
> > > > +  STRIP_ANY_LOCATION_WRAPPER (lhs);
> > > > +  const bool print_var_p = (DECL_P (lhs)
> > > > +                            || REFERENCE_REF_P (lhs)
> > > > +                            || TREE_CODE (lhs) == COMPONENT_REF);
> > > 
> > > Why include REFERENCE_REF_P and COMPONENT_REF?  Reference refs should be
> > > stripped before this test, member refs aren't variables.
> > 
> > I'm checking REFERENCE_REF_P and COMPONENT_REF to say "moving a variable"
> > in #1 and #3.  The REFERENCE_REF_P check means that we also say "variable"
> > for #2.  Sure, "A variable is introduced by the declaration of a reference
> > other than a non-static data member", but I'm not sure if users care about
> > that here?
> > 
> > If I strip REFERENCE_REFs before the check then the result will be the
> > same.
> 
> That's what I was suggesting, yes: Strip the REFERENCE_REF so DECL_P can see
> the decl.

Ok, I've added the REFERENCE_REF stripping.  But I've still left the
COMPONENT_REF in.  Perhaps we could say "moving a member" to itself for
COMPONENT_REFs.  Or just say "moving 'x' of type 'int' to itself" and
avoid all of this.  :)

> I don't see where COMPONENT_REF comes in?

For #1 in the test below the COMPONENT_REF was created in finish_id_expression
-> finish_non_static_data_member -> build_class_member_access_expr and passed
down to maybe_warn_self_move from here:

#0  maybe_warn_self_move (loc=2147483652, lhs=,
rhs=) at 
/home/mpolacek/src/gcc/gcc/cp/typeck.cc:8908
#1  0x00f3d03e in cp_build_modify_expr (loc=2147483652, 
lhs=,
modifycode=NOP_EXPR, rhs=, complain=3)
at /home/mpolacek/src/gcc/gcc/cp/typeck.cc:9161
#2  0x00f3e461 in build_x_modify_expr (loc=2147483652, 
lhs=,
modifycode=NOP_EXPR, rhs=, lookups=, 
complain=3)
at /home/mpolacek/src/gcc/gcc/cp/typeck.cc:9446
#3  0x00d92d4e in cp_parser_assignment_expression 
(parser=0x7fffea236850, pidk=0x0, cast_p=false,
decltype_p=false) at /home/mpolacek/src/gcc/gcc/cp/parser.cc:10461
 
> > Or I could keep only the DECL_P check, but then we'll say "moving
> > an expression" for #1 and #2, which seems strange.
> > 
> > struct S {
> >   int x;
> >   int &r;
> >   void foo () {
> >     x = std::move (x); // #1
> >     r = std::move (r); // #2
> >   };
> > };
> > 
> > void
> > foo (int &r)
> > {
> >   r = std::move (r); // #3
> > }

Marek



Re: [PATCH v3] c: Implement C23 nullptr (N3042)

2022-08-25 Thread Joseph Myers
On Thu, 25 Aug 2022, Marek Polacek via Gcc-patches wrote:

> Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

This version is OK.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: c: Support C2x empty initializer braces

2022-08-25 Thread Joseph Myers
On Thu, 25 Aug 2022, Jakub Jelinek via Gcc-patches wrote:

> > +  && vec_safe_is_empty (CONSTRUCTOR_ELTS (*from_p)))
> 
> Perhaps
>   && CONSTRUCTOR_NELTS (*from_p) == 0)
> instead?
> 
> > +{
> > +maybe_with_size_expr (from_p);
> 
> The indentation above is 8 spaces instead of tab.
> 
> > +   gcc_assert (TREE_CODE (*from_p) == WITH_SIZE_EXPR);
> > +   return gimplify_modify_expr_to_memset (expr_p,
> > +  TREE_OPERAND (*from_p, 1),
> > +  want_value, pre_p);
> > +}
> > +
> 
> Otherwise LGTM.

Thanks, fixed thus, retested and committed.

c: Support C2x empty initializer braces

ISO C2x standardizes empty initializer braces {}.  Implement this
feature accordingly.  The basic case was already supported and so just
needed diagnostic adjustments.  However, the standard feature also
includes two cases that were not previously supported: empty
initializer braces for scalars, and empty initializer braces for
VLAs.  Thus, add support for those features as well, updating existing
tests that expected them to be diagnosed.

There was already some gimplifier support for converting
variable-sized initializations with empty CONSTRUCTORs to memset.
However, it didn't apply here; code earlier in gimplify_modify_expr
ended up calling gimplify_init_constructor via
gimplify_modify_expr_rhs, which ended up handling the CONSTRUCTOR in a
way that generated an ICE later.  Add a check for this case earlier in
gimplify_modify_expr to avoid that issue.

Bootstrapped with no regressions for x86_64-pc-linux-gnu.

gcc/
* gimplify.cc (gimplify_modify_expr): Convert initialization from
a variable-size CONSTRUCTOR to memset before call to
gimplify_modify_expr_rhs.

gcc/c/
* c-decl.cc (start_decl): Do not diagnose initialization of
variable-sized objects here.
* c-parser.cc (c_parser_braced_init): Add argument DECL.  All
callers changed.
(c_parser_initializer): Diagnose initialization of variable-sized
objects other than with braced initializer.
(c_parser_braced_init): Use pedwarn_c11 for empty initializer
braces and update diagnostic text.  Diagnose initialization of
variable-sized objects with nonempty braces.
* c-typeck.cc (digest_init): Update diagnostic for initialization
of variable-sized objects.
(really_start_incremental_init, set_designator)
(process_init_element): Update comments.
(pop_init_level): Allow scalar empty initializers.

gcc/testsuite/
* gcc.dg/c11-empty-init-1.c, gcc.dg/c11-empty-init-2.c,
gcc.dg/c11-empty-init-3.c, gcc.dg/c2x-empty-init-1.c,
gcc.dg/c2x-empty-init-2.c, gcc.dg/c2x-empty-init-3.c,
gcc.dg/gnu2x-empty-init-1.c, gcc.dg/gnu2x-empty-init-2.c: New
tests.
* gcc.dg/torture/dfp-default-init-1.c: Also test empty
initializers.
* gcc.dg/init-bad-1.c, gcc.dg/noncompile/pr71583.c,
gcc.dg/pr61096-1.c, gcc.dg/vla-init-2.c, gcc.dg/vla-init-3.c,
gcc.target/i386/sse2-bfloat16-scalar-typecheck.c: Update expected
diagnostics.
* gcc.dg/ubsan/c-shift-1.c: Use nonempty initializers for VLA
initializations expected to be diagnosed.

diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index 9e590c66dae..1fe31e01132 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -5180,29 +5180,15 @@ start_decl (struct c_declarator *declarator, struct 
c_declspecs *declspecs,
  initialized = false;
else if (COMPLETE_TYPE_P (TREE_TYPE (decl)))
  {
-   /* A complete type is ok if size is fixed.  */
-
-   if (!poly_int_tree_p (TYPE_SIZE (TREE_TYPE (decl)))
-   || C_DECL_VARIABLE_SIZE (decl))
- {
-   error ("variable-sized object may not be initialized");
-   initialized = false;
- }
+   /* A complete type is ok if size is fixed.  If the size is
+  variable, an empty initializer is OK and nonempty
+  initializers will be diagnosed in the parser.  */
  }
else if (TREE_CODE (TREE_TYPE (decl)) != ARRAY_TYPE)
  {
error ("variable %qD has initializer but incomplete type", decl);
initialized = false;
  }
-   else if (C_DECL_VARIABLE_SIZE (decl))
- {
-   /* Although C99 is unclear about whether incomplete arrays
-  of VLAs themselves count as VLAs, it does not make
-  sense to permit them to be initialized given that
-  ordinary VLAs may not be initialized.  */
-   error ("variable-sized object may not be initialized");
-   initialized = false;
- }
   }
 
   if (initialized)
diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index 759f200a7eb..1e8d9dca749 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -1523,7 +1523,7 @@ static tree 

[pushed] c++: block copy elision in delegating ctor

2022-08-25 Thread Jason Merrill via Gcc-patches
CWG2403 deals with the issue that copy elision is not possible when the
initialized object is a potentially-overlapping subobject and the
initializer is a function that returns by value.  Jonathan pointed out that
this also affects delegating constructors, which might be used to construct
a base subobject.
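
For illustration only, here is a minimal sketch of a delegating constructor initialized from a function that returns by value. `Tracked`, `direct_t`, `make` and the `moves` counter are hypothetical names, not from the patch. Unlike the committed test, this type has a usable move constructor, so the code compiles whether or not the compiler elides the copy. Whether the move constructor runs depends on the compiler and its version.

```cpp
#include <cassert>

// Hypothetical type that counts how often its move constructor runs.
struct Tracked {
  static inline int moves = 0;
  int value;

  struct direct_t {};                       // tag selecting the non-delegating constructor
  Tracked(int v, direct_t) : value(v) {}
  Tracked(Tracked &&other) noexcept : value(other.value) { ++moves; }

  // Delegating constructor whose initializer is a prvalue returned by value.
  // If this constructor is used to build a base subobject, eliding the move
  // could let make() write into storage that overlaps the derived object.
  explicit Tracked(int v) : Tracked(make(v)) {}

  static Tracked make(int v) { return Tracked(v, direct_t{}); }
};
```

After constructing `Tracked t(42);`, `Tracked::moves` is 1 when the move is performed and 0 when it is elided; `t.value` is 42 either way.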

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog:

* call.cc (unsafe_return_slot_p): Return 2 for *this in a
constructor.

gcc/testsuite/ChangeLog:

* g++.dg/init/elide8.C: New test.
---
 gcc/cp/call.cc |  7 +++
 gcc/testsuite/g++.dg/init/elide8.C | 11 +++
 2 files changed, 18 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/init/elide8.C

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index 370137ebd6d..d107a2814dc 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -9022,6 +9022,13 @@ unsafe_return_slot_p (tree t)
   if (is_empty_base_ref (t))
 return 2;
 
+  /* A delegating constructor might be used to initialize a base.  */
+  if (current_function_decl
+  && DECL_CONSTRUCTOR_P (current_function_decl)
+  && (t == current_class_ref
+ || tree_strip_nop_conversions (t) == current_class_ptr))
+return 2;
+
   STRIP_NOPS (t);
   if (TREE_CODE (t) == ADDR_EXPR)
 t = TREE_OPERAND (t, 0);
diff --git a/gcc/testsuite/g++.dg/init/elide8.C 
b/gcc/testsuite/g++.dg/init/elide8.C
new file mode 100644
index 00000000000..31f899b6623
--- /dev/null
+++ b/gcc/testsuite/g++.dg/init/elide8.C
@@ -0,0 +1,11 @@
+// CWG 2403 case 3: we can't elide this copy because the delegating constructor
+// might be used to initialize a base.
+// { dg-do compile { target c++11 } }
+
+struct Noncopyable {
+  Noncopyable() = default;
+  Noncopyable(const Noncopyable &) = delete;
+  Noncopyable(int) : Noncopyable(make()) {} // { dg-error "deleted" }
+
+  static Noncopyable make();
+};

base-commit: 30e160475489867a09ed89532cae135b5849cf98
-- 
2.31.1



[pushed] dwarf2: use DW_ATE_UTF for C++ char8_t

2022-08-25 Thread Jason Merrill via Gcc-patches
While looking at the Rust changes to dwarf2out I noticed that this was
missing from the char8_t support.

gcc/ChangeLog:

* dwarf2out.cc (base_type_die): Also use DW_ATE_UTF for char8_t.

gcc/testsuite/ChangeLog:

* g++.dg/debug/dwarf2/utf-1.C: New test.
---
 gcc/dwarf2out.cc  | 1 +
 gcc/testsuite/g++.dg/debug/dwarf2/utf-1.C | 9 +
 2 files changed, 10 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/debug/dwarf2/utf-1.C

diff --git a/gcc/dwarf2out.cc b/gcc/dwarf2out.cc
index e3920c898f5..28a4ae08fe4 100644
--- a/gcc/dwarf2out.cc
+++ b/gcc/dwarf2out.cc
@@ -13210,6 +13210,7 @@ base_type_die (tree type, bool reverse)
{
  const char *name = IDENTIFIER_POINTER (DECL_NAME (TYPE_NAME (type)));
  if (strcmp (name, "char16_t") == 0
+ || strcmp (name, "char8_t") == 0
  || strcmp (name, "char32_t") == 0)
{
  encoding = DW_ATE_UTF;
diff --git a/gcc/testsuite/g++.dg/debug/dwarf2/utf-1.C 
b/gcc/testsuite/g++.dg/debug/dwarf2/utf-1.C
new file mode 100644
index 00000000000..43b354f1bb5
--- /dev/null
+++ b/gcc/testsuite/g++.dg/debug/dwarf2/utf-1.C
@@ -0,0 +1,9 @@
+// { dg-do compile { target c++20 } }
+// { dg-options { -gdwarf -dA } }
+
+// Test that all three use DW_ATE_UTF.
+// { dg-final { scan-assembler-times {DW_AT_encoding \(0x10\)} 3 } }
+
+char8_t c8;
+char16_t c16;
+char32_t c32;

base-commit: 585a21bab3ec688c2039bff2922cc372d8558283
-- 
2.31.1



[PATCH v3] c: Implement C23 nullptr (N3042)

2022-08-25 Thread Marek Polacek via Gcc-patches
On Thu, Aug 25, 2022 at 05:28:09PM +0000, Joseph Myers wrote:
> On Wed, 24 Aug 2022, Marek Polacek via Gcc-patches wrote:
> 
> > Ah, okay.  I had just copied what we do in C++ in null_ptr_cst_p and the
> > rest of the patch worked under that assumption.  I've added some tests
> > for this too.  Except I don't really understand the _Generic comment so
> > I only have tests for _Generic that were in the previous version.
> 
> The point is that e.g.
> 
> _Generic(0, int : nullptr)
> 
> is treated the same as nullptr (so is a null pointer constant), just as 
> (nullptr) is.

Ah, of course, I was thinking of nullptr_t...

Tests added.  Fortunately this works as expected and no functional
changes were needed.
 
> > Thanks, tests added to c2x-nullptr-1.c:test1.  I notice that 6.3.2.4 still
> > says "The type nullptr_t may be converted to bool or to a pointer type";
> > isn't it missing the ", void" here too?
> 
> In general none of the subclauses under 6.3.2 about individual kinds of 
> types tend to discuss the possibility of conversion to void.
> 
> > +/* Simple assignment.  */
> > +void
> > +test4 (void)
> > +{
> > +  /* -- the left operand has an atomic, qualified, or unqualified version 
> > of
> > + the nullptr_t type and the type of the right is nullptr_t;  */
> > +  nullptr_t n1;
> > +  n1 = nullptr;
> > +  const nullptr_t n2 = nullptr;
> > +  _Atomic nullptr_t n3 = nullptr;
> > +  volatile nullptr_t n4 = nullptr;
> 
> These qualified cases are all actually initialization, not assignment; I 
> think both assignment and initialization (and argument passing and return) 
> should be tested for the permitted cases for assignment.

OK, more tests added.
 
> > +/* Test nullptr_t from  
> Typo, "<stddef.h>".

Fixed.
 
> > +/* If a second or third operand of type nullptr_t is used that is not a 
> > null
> > +   pointer constant and the other operand is not a pointer or does not have
> > +   itself nullptr_t, a constraint is violated even if that other operand is
> > +   a null pointer constant such as 0.  */
> 
> The "that is not a null pointer constant" in that footnote is a bit odd, 
> since it's also a constraint violation (and should be tested as such) to 
> have a conditional expression between e.g. nullptr and 0.

Okay, I've added some more tests to c2x-nullptr-3.c.

Thanks!

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
This patch implements the C23 nullptr literal:
<https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3042.htm> (with
wording fixes from N3047), which is intended to replace the problematic
definition of NULL which might be either of integer type or void*.

Since C++ has had nullptr for over a decade now, it was relatively easy
to just move the built-in node definitions from the C++ FE to the C/C++
common code.  Also, our DWARF emitter already handles NULLPTR_TYPE by
emitting DW_TAG_unspecified_type.  However, I had to handle a lot of
contexts such as ?:, comparison, conversion, etc.

There are some minor differences, e.g. in C you can do

  bool b = nullptr;

but in C++ you have to use direct-initialization:

  bool b{nullptr};

And I think that

  nullptr_t n = 0;

is only valid in C++.
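
The C++ side of these differences can be sketched as follows. This is a minimal illustration, not code from the patch; the function name `nullptr_rules_hold` is made up, and parentheses are used for the direct-initialization.

```cpp
#include <cassert>
#include <cstddef>

// Returns true when the C++ rules described above behave as stated.
inline bool nullptr_rules_hold()
{
  bool b(nullptr);        // direct-initialization; `bool b = nullptr;` is ill-formed in C++
  std::nullptr_t n = 0;   // accepted in C++: the literal 0 is a null pointer constant
  int *p = n;             // nullptr_t converts to any pointer type
  return !b && p == nullptr;
}
```

In C23, by contrast, the copy-initialization form `bool b = nullptr;` is the one the message above describes as valid.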

Of course, C doesn't have to handle mangling, RTTI, substitution,
overloading, ...

This patch also defines nullptr_t in <stddef.h>.  However, it does not
define __STDC_VERSION_STDDEF_H__ yet, because we don't know yet what value
it should be defined to.

gcc/c-family/ChangeLog:

* c-common.cc (c_common_reswords): Enable nullptr in C2X.
(c_common_nodes_and_builtins): Create the built-in node for nullptr.
* c-common.h (enum c_tree_index): Add CTI_NULLPTR, CTI_NULLPTR_TYPE.
(struct c_common_resword): Resize the disable member.
(D_C2X): Add.
(nullptr_node): Define.
(nullptr_type_node): Define.
(NULLPTR_TYPE_P): Define.
* c-pretty-print.cc (c_pretty_printer::simple_type_specifier): Handle
NULLPTR_TYPE.
(c_pretty_printer::direct_abstract_declarator): Likewise.
(c_pretty_printer::constant): Likewise.

gcc/c/ChangeLog:

* c-convert.cc (c_convert) : Handle NULLPTR_TYPE.
Give a better diagnostic when converting to nullptr_t.
* c-decl.cc (c_init_decl_processing): Perform C-specific nullptr
initialization.
* c-parser.cc (c_parse_init): Maybe OR D_C2X into mask.
(c_parser_postfix_expression): Handle RID_NULLPTR.
* c-typeck.cc (null_pointer_constant_p): Return true when expr is
nullptr_node.
(build_unary_op) : Handle NULLPTR_TYPE.
(build_conditional_expr): Handle the case when the second/third operand
is NULLPTR_TYPE and third/second operand is POINTER_TYPE.
(convert_for_assignment): Handle converting an expression of type
nullptr_t to pointer/bool.
(build_binary_op) : Handle NULLPTR_TYPE.
: Handle comparing operands of type nullptr_t.

gcc/cp/ChangeLog:

* cp-tree.h (enum cp_tree_index): 

Re: [PATCH] Implement __builtin_issignaling

2022-08-25 Thread Jakub Jelinek via Gcc-patches
On Thu, Aug 25, 2022 at 03:23:12PM -0400, Michael Meissner wrote:
> On Mon, Aug 15, 2022 at 12:12:02PM +0200, Jakub Jelinek via Gcc-patches wrote:
> > Hi!
> > 
> > The following patch implements a new builtin, __builtin_issignaling,
> > which can be used to implement the ISO/IEC TS 18661-1  issignaling
> > macro.
> 
> I haven't looked in detail at the patch, but from the description I think it
> needs a way for machine dependent parts to optimize this for a given mode when
> various switches are used.

There is an optab which takes priority over the generic code.
So if you have something better than the generic expansion, feel free to add
the named pattern(s).
Currently it is used just on x86 for xf mode.

Jakub



Re: [PATCH] Implement __builtin_issignaling

2022-08-25 Thread Michael Meissner via Gcc-patches
On Mon, Aug 15, 2022 at 12:12:02PM +0200, Jakub Jelinek via Gcc-patches wrote:
> Hi!
> 
> The following patch implements a new builtin, __builtin_issignaling,
> which can be used to implement the ISO/IEC TS 18661-1  issignaling
> macro.

I haven't looked in detail at the patch, but from the description I think it
needs a way for machine dependent parts to optimize this for a given mode when
various switches are used.

For example on the PowerPC (and at least older AMD processors) it can often
times be expensive to move data from a floating point register to a GPR
register to do the masking operation.  I implemented special purpose
recognizers a few years ago to speed up the SFmode math library to recognize
when the glibc tests were being done.

If the machine has a faster way to implement it, it should be allowed to do
this.

On some of the PowerPC's, I might want to generate code like the following
(note, I might have some of the tests backwards, but the idea is to test if the
value is a NaN, and if it is a NaN, look at the signalling bit; otherwise
return 0):

if (__builtin_isnan (x)) {
long long mantissa = 
return ~((mantissa >> 52) & 1);

} else {
return 0;
}

That way it doesn't have to do a direct move to a GPR register to do the
extraction in general.

For SFmode/DFmode, you should be able to optimize it to something like this to
avoid any explicit jumps on a power10 (note, I may have some of the tests/shifts
wrong, but that is the general method you could use):

static int inline
is_signalling_nan (double value)
{
  int exponent = __builtin_vec_scalar_extract_exp (value);
  long long mantissa = __builtin_vec_scalar_extract_sig (value);
  return (mantissa >> 52) & (exponent == 0);
}
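
For reference, a portable bit-level version of the test sketched above might look like the following. It is only an illustration and not what the builtin expands to. It assumes `double` is IEEE 754 binary64 and that a set top mantissa bit marks a quiet NaN, which is the usual convention but not universal. The function names are made up.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Classify a raw binary64 bit pattern: a NaN has an all-ones exponent and a
// nonzero mantissa; it is signaling when the top mantissa bit is clear.
inline bool is_signaling_nan_bits(std::uint64_t bits)
{
  const std::uint64_t exponent = (bits >> 52) & 0x7ff;          // 11 exponent bits
  const std::uint64_t mantissa = bits & 0x000fffffffffffffULL;  // 52 mantissa bits
  const bool is_nan = exponent == 0x7ff && mantissa != 0;
  const bool quiet_bit = ((mantissa >> 51) & 1) != 0;
  return is_nan && !quiet_bit;
}

// Convenience wrapper that copies the object representation of a double.
inline bool is_signaling_nan(double value)
{
  static_assert(sizeof(double) == sizeof(std::uint64_t), "binary64 double expected");
  std::uint64_t bits;
  std::memcpy(&bits, &value, sizeof bits);
  return is_signaling_nan_bits(bits);
}
```

The bit-pattern overload is kept separate because passing a signaling NaN by value through floating-point registers can quiet it on some targets.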

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com


Re: c: Support C2x empty initializer braces

2022-08-25 Thread Jakub Jelinek via Gcc-patches
On Thu, Aug 25, 2022 at 05:33:02PM +0000, Joseph Myers wrote:
>   * gimplify.cc (gimplify_modify_expr): Convert initialization from
>   a variable-size CONSTRUCTOR to memset before call to
>   gimplify_modify_expr_rhs.

> --- a/gcc/gimplify.cc
> +++ b/gcc/gimplify.cc
> @@ -6031,6 +6031,21 @@ gimplify_modify_expr (tree *expr_p, gimple_seq *pre_p, 
> gimple_seq *post_p,
>return GS_ALL_DONE;
>  }
>  
> +  /* Convert initialization from an empty variable-size CONSTRUCTOR to
> + memset.  */
> +  if (TREE_TYPE (*from_p) != error_mark_node
> +  && TYPE_SIZE_UNIT (TREE_TYPE (*from_p))
> +  && !poly_int_tree_p (TYPE_SIZE_UNIT (TREE_TYPE (*from_p)))
> +  && TREE_CODE (*from_p) == CONSTRUCTOR
> +  && vec_safe_is_empty (CONSTRUCTOR_ELTS (*from_p)))

Perhaps
  && CONSTRUCTOR_NELTS (*from_p) == 0)
instead?

> +{
> +maybe_with_size_expr (from_p);

The indentation above is 8 spaces instead of tab.

> + gcc_assert (TREE_CODE (*from_p) == WITH_SIZE_EXPR);
> + return gimplify_modify_expr_to_memset (expr_p,
> +TREE_OPERAND (*from_p, 1),
> +want_value, pre_p);
> +}
> +

Otherwise LGTM.

Jakub



Re: [PATCH 0/3] picolibc: Add picolibc linking help

2022-08-25 Thread Keith Packard via Gcc-patches
Kito Cheng  writes:

> I am thinking that maybe we should add a
> -mlibc=[newlib|newlib-nano|picolibc|unknown] option to the bare-metal
> toolchain. One reason is to have a unified interface for selecting the libc
> implementation between GCC and clang/LLVM: spec files are GCC-specific,
> which causes very bad compatibility between GCC and clang/LLVM, and
> having an option to control that would be better since clang/LLVM doesn't
> have those configure-time options.

You can create a specs file for each library so that you select the
library with --specs=picolibc.specs or --specs=newlib-nano.specs. That
seems sufficient as you can mess with various compiler options and set
the header paths. Crosstool-ng does this and it enables delivering
newlib, newlib-nano and picolibc in the same toolchain.

-- 
-keith




c: Support C2x empty initializer braces

2022-08-25 Thread Joseph Myers
[Middle-end maintainers / global reviewers, note that there is a 
gimplify.cc change to be reviewed here.]

ISO C2x standardizes empty initializer braces {}.  Implement this
feature accordingly.  The basic case was already supported and so just
needed diagnostic adjustments.  However, the standard feature also
includes two cases that were not previously supported: empty
initializer braces for scalars, and empty initializer braces for
VLAs.  Thus, add support for those features as well, updating existing
tests that expected them to be diagnosed.

There was already some gimplifier support for converting
variable-sized initializations with empty CONSTRUCTORs to memset.
However, it didn't apply here; code earlier in gimplify_modify_expr
ended up calling gimplify_init_constructor via
gimplify_modify_expr_rhs, which ended up handling the CONSTRUCTOR in a
way that generated an ICE later.  Add a check for this case earlier in
gimplify_modify_expr to avoid that issue.

Bootstrapped with no regressions for x86_64-pc-linux-gnu.  OK to commit 
(the gimplify.cc changes)?

gcc/
* gimplify.cc (gimplify_modify_expr): Convert initialization from
a variable-size CONSTRUCTOR to memset before call to
gimplify_modify_expr_rhs.

gcc/c/
* c-decl.cc (start_decl): Do not diagnose initialization of
variable-sized objects here.
* c-parser.cc (c_parser_braced_init): Add argument DECL.  All
callers changed.
(c_parser_initializer): Diagnose initialization of variable-sized
objects other than with braced initializer.
(c_parser_braced_init): Use pedwarn_c11 for empty initializer
braces and update diagnostic text.  Diagnose initialization of
variable-sized objects with nonempty braces.
* c-typeck.cc (digest_init): Update diagnostic for initialization
of variable-sized objects.
(really_start_incremental_init, set_designator)
(process_init_element): Update comments.
(pop_init_level): Allow scalar empty initializers.

gcc/testsuite/
* gcc.dg/c11-empty-init-1.c, gcc.dg/c11-empty-init-2.c,
gcc.dg/c11-empty-init-3.c, gcc.dg/c2x-empty-init-1.c,
gcc.dg/c2x-empty-init-2.c, gcc.dg/c2x-empty-init-3.c,
gcc.dg/gnu2x-empty-init-1.c, gcc.dg/gnu2x-empty-init-2.c: New
tests.
* gcc.dg/torture/dfp-default-init-1.c: Also test empty
initializers.
* gcc.dg/init-bad-1.c, gcc.dg/noncompile/pr71583.c,
gcc.dg/pr61096-1.c, gcc.dg/vla-init-2.c, gcc.dg/vla-init-3.c,
gcc.target/i386/sse2-bfloat16-scalar-typecheck.c: Update expected
diagnostics.
* gcc.dg/ubsan/c-shift-1.c: Use nonempty initializers for VLA
initializations expected to be diagnosed.

diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index 9e590c66dae..1fe31e01132 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -5180,29 +5180,15 @@ start_decl (struct c_declarator *declarator, struct 
c_declspecs *declspecs,
  initialized = false;
else if (COMPLETE_TYPE_P (TREE_TYPE (decl)))
  {
-   /* A complete type is ok if size is fixed.  */
-
-   if (!poly_int_tree_p (TYPE_SIZE (TREE_TYPE (decl)))
-   || C_DECL_VARIABLE_SIZE (decl))
- {
-   error ("variable-sized object may not be initialized");
-   initialized = false;
- }
+   /* A complete type is ok if size is fixed.  If the size is
+  variable, an empty initializer is OK and nonempty
+  initializers will be diagnosed in the parser.  */
  }
else if (TREE_CODE (TREE_TYPE (decl)) != ARRAY_TYPE)
  {
error ("variable %qD has initializer but incomplete type", decl);
initialized = false;
  }
-   else if (C_DECL_VARIABLE_SIZE (decl))
- {
-   /* Although C99 is unclear about whether incomplete arrays
-  of VLAs themselves count as VLAs, it does not make
-  sense to permit them to be initialized given that
-  ordinary VLAs may not be initialized.  */
-   error ("variable-sized object may not be initialized");
-   initialized = false;
- }
   }
 
   if (initialized)
diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index 759f200a7eb..1e8d9dca749 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -1523,7 +1523,7 @@ static tree c_parser_simple_asm_expr (c_parser *);
 static tree c_parser_gnu_attributes (c_parser *);
 static struct c_expr c_parser_initializer (c_parser *, tree);
 static struct c_expr c_parser_braced_init (c_parser *, tree, bool,
-  struct obstack *);
+  struct obstack *, tree);
 static void c_parser_initelt (c_parser *, struct obstack *);
 static void c_parser_initval (c_parser *, struct c_expr *,
  struct obstack *);
@@ -5220,11 +5220,15 @@ 

[Patch][2/3] nvptx: libgomp+mkoffload.cc: Prepare for reverse offload fn lookup

2022-08-25 Thread Tobias Burnus

On 25.08.22 16:54, Tobias Burnus wrote:

The attached patch prepares for reverse-offload device->host
function-address lookup by requesting (if needed) the on-device address.


This patch adds the actual implementation for NVPTX.

Having  array[] = {fn1,fn2};  works with nvptx only since sm_35; hence,
if there is a reverse_offload and sm_30 is used, there will be a compile-time
error.

To avoid incompatibilities, I compile with the same PTX ISA .version and
sm_XX version as the (last) file that contains the reverse offload. While
it should not matter, some newer CUDA versions might not support, e.g.,
sm_35 or might reject a specific ISA version; thus, that seemed safer.

This is currently effectively a no-op because, with the [1/3] patch, NULL
is always passed and GOMP_OFFLOAD_get_num_devices returns <= 0 as soon as
'omp requires reverse_offload' has been specified.

OK for mainline?

Tobias


-
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634 
München; limited liability company; Managing Directors: Thomas 
Heurung, Frank Thürauf; Registered office: München; Commercial register: 
München, HRB 106955

nvptx: libgomp+mkoffload.cc: Prepare for reverse offload fn lookup

Add support to nvptx for reverse lookup of function name to prepare for
'omp target device(ancestor:1)'.

gcc/ChangeLog:

	* config/nvptx/mkoffload.cc (record_id): Strip quotations from
	function name.
	(process): For GOMP_REQUIRES_REVERSE_OFFLOAD, check that -march is
	at least sm_35, create '$offload_func_table' global array and init
	with reverse-offload function addresses.
	* config/nvptx/nvptx.cc (write_fn_proto_1, write_fn_proto): New
	force_public attribute to force .visible.
	(nvptx_declare_function_name): For "omp target
	device_ancestor_nohost" attribute, force .visible/TREE_PUBLIC.

libgomp/ChangeLog:

	* plugin/plugin-nvptx.c (GOMP_OFFLOAD_load_image): Read offload
	function address table '$offload_func_table' if rev_fn_table
	is not NULL.
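
The quote stripping mentioned for record_id in the ChangeLog above can be sketched in plain C roughly as follows. This is only an approximation: `copy_unquoted_name` is a hypothetical name, it uses malloc instead of GCC's XNEWVEC, and the extra check for a missing trailing quote is an assumption that is not part of the patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the name extraction in record_id: P1 points
   at text of the form "name" followed by a newline.  Returns a malloc'ed
   copy of name without the surrounding quotes, or NULL if the line is
   malformed.  */
char *
copy_unquoted_name (const char *p1)
{
  assert (p1[0] == '"');
  p1++;                                   /* skip the leading quote */
  const char *end = strchr (p1, '\n');
  if (!end || end == p1 || end[-1] != '"')
    return NULL;                          /* assumption: reject bad input */
  size_t len = (size_t) (end - p1) - 1;   /* drop the trailing quote */
  char *name = malloc (len + 1);
  if (!name)
    return NULL;
  memcpy (name, p1, len);
  name[len] = '\0';
  return name;
}
```

The caller owns the returned string and releases it with free.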

 gcc/config/nvptx/mkoffload.cc | 104 --
 gcc/config/nvptx/nvptx.cc |  20 +---
 libgomp/plugin/plugin-nvptx.c |  19 +++-
 3 files changed, 131 insertions(+), 12 deletions(-)

diff --git a/gcc/config/nvptx/mkoffload.cc b/gcc/config/nvptx/mkoffload.cc
index 3eea0a8f138..c496766b1cc 100644
--- a/gcc/config/nvptx/mkoffload.cc
+++ b/gcc/config/nvptx/mkoffload.cc
@@ -108,12 +108,14 @@ xputenv (const char *string)
 static void
 record_id (const char *p1, id_map ***where)
 {
+  gcc_assert (p1[0] == '"');
+  p1++;
   const char *end = strchr (p1, '\n');
   if (!end)
 fatal_error (input_location, "malformed ptx file");
 
   id_map *v = XNEW (id_map);
-  size_t len = end - p1;
+  size_t len = end - p1 - 1 ; /* remove tailing '"'  */
   v->ptx_name = XNEWVEC (char, len + 1);
   memcpy (v->ptx_name, p1, len);
   v->ptx_name[len] = '\0';
@@ -242,6 +244,10 @@ process (FILE *in, FILE *out, uint32_t omp_requires)
   id_map const *id;
   unsigned obj_count = 0;
   unsigned ix;
+  const char *sm_ver = NULL, *version = NULL;
+  const char *sm_ver2 = NULL, *version2 = NULL;
+  size_t file_cnt = 0;
+  size_t *file_idx = XALLOCAVEC (size_t, len);
 
   fprintf (out, "#include \n\n");
 
@@ -250,6 +256,8 @@ process (FILE *in, FILE *out, uint32_t omp_requires)
   for (size_t i = 0; i != len;)
 {
   char c;
+  bool output_fn_ptr = false;
+  file_idx[file_cnt++] = i;
 
   fprintf (out, "static const char ptx_code_%u[] =\n\t\"", obj_count++);
   while ((c = input[i++]))
@@ -261,6 +269,16 @@ process (FILE *in, FILE *out, uint32_t omp_requires)
 	case '\n':
 	  fprintf (out, "\\n\"\n\t\"");
 	  /* Look for mappings on subsequent lines.  */
+	  if (UNLIKELY (startswith (input + i, ".target sm_")))
+		{
+		  sm_ver = input + i + strlen (".target sm_");
+		  continue;
+		}
+	  if (UNLIKELY (startswith (input + i, ".version ")))
+		{
+		  version = input + i + strlen (".version ");
+		  continue;
+		}
 	  while (startswith (input + i, "//:"))
 		{
 		  i += 3;
@@ -268,7 +286,10 @@ process (FILE *in, FILE *out, uint32_t omp_requires)
 		  if (startswith (input + i, "VAR_MAP "))
 		record_id (input + i + 8, _tail);
 		  else if (startswith (input + i, "FUNC_MAP "))
-		record_id (input + i + 9, _tail);
+		{
+		  output_fn_ptr = true;
+		  record_id (input + i + 9, _tail);
+		}
 		  else
 		abort ();
 		  /* Skip to next line. */
@@ -286,6 +307,81 @@ process (FILE *in, FILE *out, uint32_t omp_requires)
 	  putc (c, out);
 	}
   fprintf (out, "\";\n\n");
+  if (output_fn_ptr
+	  && (omp_requires & GOMP_REQUIRES_REVERSE_OFFLOAD) != 0)
+	{
+	  if (sm_ver && sm_ver[0] == '3' && sm_ver[1] == '0'
+	  && sm_ver[2] == '\n')
+	fatal_error (input_location,
+			 "% requires at least "
+			 "% for %<-misa=%>");
+	  sm_ver2 = sm_ver;
+	  version2 = version;
+	}
+}
+
+  /* Create function-pointer array, required for reverse
+ offload function-pointer 

Re: [PATCH v2] c: Implement C23 nullptr (N3042)

2022-08-25 Thread Joseph Myers
On Wed, 24 Aug 2022, Marek Polacek via Gcc-patches wrote:

> Ah, okay.  I had just copied what we do in C++ in null_ptr_cst_p and the
> rest of the patch worked under that assumption.  I've added some tests
> for this too.  Except I don't really understand the _Generic comment so
> I only have tests for _Generic that were in the previous version.

The point is that e.g.

_Generic(0, int : nullptr)

is treated the same as nullptr (so is a null pointer constant), just as 
(nullptr) is.

> Thanks, tests added to c2x-nullptr-1.c:test1.  I notice that 6.3.2.4 still
> says "The type nullptr_t may be converted to bool or to a pointer type";
> isn't it missing the ", void" here too?

In general none of the subclauses under 6.3.2 about individual kinds of 
types tend to discuss the possibility of conversion to void.

> +/* Simple assignment.  */
> +void
> +test4 (void)
> +{
> +  /* -- the left operand has an atomic, qualified, or unqualified version of
> + the nullptr_t type and the type of the right is nullptr_t;  */
> +  nullptr_t n1;
> +  n1 = nullptr;
> +  const nullptr_t n2 = nullptr;
> +  _Atomic nullptr_t n3 = nullptr;
> +  volatile nullptr_t n4 = nullptr;

These qualified cases are all actually initialization, not assignment; I 
think both assignment and initialization (and argument passing and return) 
should be tested for the permitted cases for assignment.

> +/* Test nullptr_t from ".

> +/* If a second or third operand of type nullptr_t is used that is not a null
> +   pointer constant and the other operand is not a pointer or does not have
> +   itself nullptr_t, a constraint is violated even if that other operand is
> +   a null pointer constant such as 0.  */

The "that is not a null pointer constant" in that footnote is a bit odd, 
since it's also a constraint violation (and should be tested as such) to 
have a conditional expression between e.g. nullptr and 0.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] cr16: remove obsoleted port

2022-08-25 Thread Joseph Myers
On Thu, 25 Aug 2022, Martin Liška wrote:

>   * config.sub: Remove cr16.

config.sub and config.guess are imported verbatim from config.git; any 
changes should go upstream first.  Removal from config.sub would only be 
appropriate if *no* GNU package wants to support cr16 any more.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH v3] Many pages: Document fixed-width types with ISO C naming

2022-08-25 Thread Linus Torvalds
On Thu, Aug 25, 2022 at 7:38 AM Joseph Myers  wrote:
>
> I've not yet implemented it for glibc or for GCC format checking, but C23
> adds 'wN' format length modifiers so you will be able to e.g. use "%w64d"
> with printf to print an int64_t and won't need those PRI macros any more.

Yeah, that's going to help user space.

We don't typically have huge issues with it (any more) in the kernel
exactly because we refused to do the syntactically horrendous PRIxyz
thing.

So in the kernel, we still do have some format string issues, but they
tend to be about "different architectures and configurations do
different things for this type", and those different things are sadly
not necessarily about a fixed width.

IOW, we used to have horrors like "sector_t can be 32-bit or 64-bit
depending on config options" (because small machines didn't want the
overhead of having to pass 64-bit things around - from back when
32-bit was a primary target).

We got rid of *that* thing a few years ago because it just wasn't
worth supporting any more, but some similar issues remain.

So we still have a number of cases of "if you really need to print
this out, you need to use '%llu' and cast the value to 'unsigned long
long'".
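
A minimal sketch of that cast idiom in ordinary user-space C, purely for illustration: `toy_sector_t`, `TOY_SMALL_SECTORS` and `format_sector` are hypothetical names, not kernel interfaces.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical config-dependent type: its width varies between builds,
   so no single length modifier matches it everywhere.  */
#ifdef TOY_SMALL_SECTORS
typedef unsigned int toy_sector_t;
#else
typedef unsigned long toy_sector_t;
#endif

/* Format SECTOR into BUF.  The cast makes the argument match "%llu"
   whatever toy_sector_t happens to be.  Returns snprintf's result.  */
int
format_sector (char *buf, size_t size, toy_sector_t sector)
{
  int n = snprintf (buf, size, "sector %llu", (unsigned long long) sector);
  assert (n >= 0);
  return n;
}
```

The format string stays the same in every configuration; only the cast absorbs the difference in the underlying type.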

But it's happily not as common as it used to be.

 Linus


Re: [PATCH] cr16: remove obsoleted port

2022-08-25 Thread Jeff Law via Gcc-patches




On 8/25/2022 2:38 AM, Martin Liška wrote:

Remove the port that has been marked obsolete in GCC 12 change notes.

Ready to be installed?
Thanks,
Martin

ChangeLog:

* config.sub: Remove cr16.

contrib/ChangeLog:

* config-list.mk: Remove cr16.

gcc/ChangeLog:

* config.gcc: Remove cr16 related stuff.
* doc/extend.texi: Likewise.
* doc/install.texi: Likewise.
* doc/invoke.texi: Likewise.
* doc/md.texi: Likewise.
* function-tests.cc (test_expansion_to_rtl): Likewise.
* common/config/cr16/cr16-common.cc: Removed.
* config/cr16/constraints.md: Removed.
* config/cr16/cr16-protos.h: Removed.
* config/cr16/cr16.cc: Removed.
* config/cr16/cr16.h: Removed.
* config/cr16/cr16.md: Removed.
* config/cr16/cr16.opt: Removed.
* config/cr16/predicates.md: Removed.
* config/cr16/t-cr16: Removed.

libgcc/ChangeLog:

* config.host: Remove cr16 related stuff.
* config/cr16/crti.S: Removed.
* config/cr16/crtlibid.S: Removed.
* config/cr16/crtn.S: Removed.
* config/cr16/divmodhi3.c: Removed.
* config/cr16/lib1funcs.S: Removed.
* config/cr16/t-cr16: Removed.
* config/cr16/t-crtlibid: Removed.
* config/cr16/unwind-cr16.c: Removed.
* config/cr16/unwind-dw2.h: Removed.

LGTM.  If any additional cr16 bits are found, feel free to remove them too.

jeff



[PATCH 1/2] libstdc++: Implement ranges::zip_transform_view from P2321R2

2022-08-25 Thread Patrick Palka via Gcc-patches
Tested on x86_64-pc-linux-gnu, does this look OK for trunk?

libstdc++-v3/ChangeLog:

* include/std/ranges (zip_view::_Iterator): Befriend
zip_transform_view.
(__detail::__range_iter_cat): Define.
(zip_transform_view): Define.
(zip_transform_view::_Iterator): Define.
(zip_transform_view::_Sentinel): Define.
(views::__detail::__can_zip_transform_view): Define.
(views::_ZipTransform): Define.
(views::zip_transform): Define.
* testsuite/std/ranges/zip_transform/1.cc: New test.
---
 libstdc++-v3/include/std/ranges   | 343 ++
 .../testsuite/std/ranges/zip_transform/1.cc   | 108 ++
 2 files changed, 451 insertions(+)
 create mode 100644 libstdc++-v3/testsuite/std/ranges/zip_transform/1.cc

diff --git a/libstdc++-v3/include/std/ranges b/libstdc++-v3/include/std/ranges
index fb815c48f99..d748cb73346 100644
--- a/libstdc++-v3/include/std/ranges
+++ b/libstdc++-v3/include/std/ranges
@@ -4502,6 +4502,12 @@ namespace views::__adaptor
return input_iterator_tag{};
 }
 
+template
+  requires (view<_Ws> && ...) && (sizeof...(_Ws) > 0) && is_object_v<_Fp>
+   && regular_invocable<_Fp&, range_reference_t<_Ws>...>
+   && std::__detail::__can_reference...>>
+  friend class zip_transform_view;
+
   public:
 // iterator_category defined in __zip_view_iter_cat
 using iterator_concept = decltype(_S_iter_concept());
@@ -4781,6 +4787,343 @@ namespace views::__adaptor
 
 inline constexpr _Zip zip;
   }
+
+  namespace __detail
+  {
+template
+  using __range_iter_cat
+   = typename iterator_traits>>::iterator_category;
+  }
+
+  template
+requires (view<_Vs> && ...) && (sizeof...(_Vs) > 0) && is_object_v<_Fp>
+  && regular_invocable<_Fp&, range_reference_t<_Vs>...>
+  && std::__detail::__can_reference...>>
+  class zip_transform_view : public view_interface>
+  {
+[[no_unique_address]] __detail::__box<_Fp> _M_fun;
+zip_view<_Vs...> _M_zip;
+
+using _InnerView = zip_view<_Vs...>;
+
+template
+  using __ziperator = iterator_t<__detail::__maybe_const_t<_Const, 
_InnerView>>;
+
+template
+  using __zentinel = sentinel_t<__detail::__maybe_const_t<_Const, 
_InnerView>>;
+
+template
+  using _Base = __detail::__maybe_const_t<_Const, _InnerView>;
+
+template
+  struct __iter_cat
+  { };
+
+template
+  requires forward_range<_Base<_Const>>
+  struct __iter_cat<_Const>
+  {
+  private:
+   static auto
+   _S_iter_cat()
+   {
+ using __detail::__maybe_const_t;
+ using __detail::__range_iter_cat;
+ using _Res = invoke_result_t<__maybe_const_t<_Const, _Fp>&,
+  
range_reference_t<__maybe_const_t<_Const, _Vs>>...>;
+ if constexpr (!is_lvalue_reference_v<_Res>)
+   return input_iterator_tag{};
+ else if constexpr ((derived_from<__range_iter_cat<_Vs, _Const>,
+  random_access_iterator_tag> && ...))
+   return random_access_iterator_tag{};
+ else if constexpr ((derived_from<__range_iter_cat<_Vs, _Const>,
+  bidirectional_iterator_tag> && ...))
+   return bidirectional_iterator_tag{};
+ else if constexpr ((derived_from<__range_iter_cat<_Vs, _Const>,
+  forward_iterator_tag> && ...))
+   return forward_iterator_tag{};
+ else
+   return input_iterator_tag{};
+   }
+  public:
+   using iterator_category = decltype(_S_iter_cat());
+  };
+
+template class _Iterator;
+template class _Sentinel;
+
+  public:
+zip_transform_view() = default;
+
+constexpr explicit
+zip_transform_view(_Fp __fun, _Vs... __views)
+  : _M_fun(std::move(__fun)), _M_zip(std::move(__views)...)
+{ }
+
+constexpr auto
+begin()
+{ return _Iterator(*this, _M_zip.begin()); }
+
+constexpr auto
+begin() const
+  requires range
+   && regular_invocable...>
+{ return _Iterator(*this, _M_zip.begin()); }
+
+constexpr auto
+end()
+{
+  if constexpr (common_range<_InnerView>)
+   return _Iterator(*this, _M_zip.end());
+  else
+   return _Sentinel(_M_zip.end());
+}
+
+constexpr auto
+end() const
+  requires range
+   && regular_invocable...>
+{
+  if constexpr (common_range)
+   return _Iterator(*this, _M_zip.end());
+  else
+   return _Sentinel(_M_zip.end());
+}
+
+constexpr auto
+size() requires sized_range<_InnerView>
+{ return _M_zip.size(); }
+
+constexpr auto
+size() const requires sized_range
+{ return _M_zip.size(); }
+  };
+
+  template
+zip_transform_view(_Fp, Rs&&...) -> zip_transform_view<_Fp, 
views::all_t...>;
+
+  template
+requires (view<_Vs> && ...) && (sizeof...(_Vs) > 0) && is_object_v<_Fp>
+ 

[PATCH 2/2] libstdc++: Implement LWG 3692/3702 changes to zip_/zip_transform_view

2022-08-25 Thread Patrick Palka via Gcc-patches
Tested on x86_64-pc-linux-gnu, does this look OK for trunk?

libstdc++-v3/ChangeLog:

* include/std/ranges (zip_view::_Iterator::operator<): Remove.
(zip_view::_Iterator::operator>): Remove.
(zip_view::_Iterator::operator<=): Remove.
(zip_view::_Iterator::operator>=): Remove.
(zip_view::_Iterator::operator<=>): Remove three_way_comparable
constraint.
(zip_transform_view::_Iterator): Ditto.
---
 libstdc++-v3/include/std/ranges | 43 +
 1 file changed, 1 insertion(+), 42 deletions(-)

diff --git a/libstdc++-v3/include/std/ranges b/libstdc++-v3/include/std/ranges
index d748cb73346..bf0ff29c38c 100644
--- a/libstdc++-v3/include/std/ranges
+++ b/libstdc++-v3/include/std/ranges
@@ -4616,30 +4616,9 @@ namespace views::__adaptor
}(make_index_sequence{});
 }
 
-friend constexpr bool
-operator<(const _Iterator& __x, const _Iterator& __y)
-  requires __detail::__all_random_access<_Const, _Vs...>
-{ return __x._M_current < __y._M_current; }
-
-friend constexpr bool
-operator>(const _Iterator& __x, const _Iterator& __y)
-  requires __detail::__all_random_access<_Const, _Vs...>
-{ return __y < __x; }
-
-friend constexpr bool
-operator<=(const _Iterator& __x, const _Iterator& __y)
-  requires __detail::__all_random_access<_Const, _Vs...>
-{ return !(__y < __x); }
-
-friend constexpr bool
-operator>=(const _Iterator& __x, const _Iterator& __y)
-  requires __detail::__all_random_access<_Const, _Vs...>
-{ return !(__x < __y); }
-
 friend constexpr auto
 operator<=>(const _Iterator& __x, const _Iterator& __y)
   requires __detail::__all_random_access<_Const, _Vs...>
-   && (three_way_comparable>> && ...)
 { return __x._M_current <=> __y._M_current; }
 
 friend constexpr _Iterator
@@ -5008,29 +4987,9 @@ namespace views::__adaptor
   requires equality_comparable<__ziperator<_Const>>
 { return __x._M_inner == __y._M_inner; }
 
-friend constexpr bool
-operator<(const _Iterator& __x, const _Iterator& __y)
-  requires random_access_range<_Base<_Const>>
-{ return __x._M_inner < __y._M_inner; }
-
-friend constexpr bool
-operator>(const _Iterator& __x, const _Iterator& __y)
-  requires random_access_range<_Base<_Const>>
-{ return __x._M_inner > __y._M_inner; }
-
-friend constexpr bool
-operator<=(const _Iterator& __x, const _Iterator& __y)
-  requires random_access_range<_Base<_Const>>
-{ return __x._M_inner <= __y._M_inner; }
-
-friend constexpr bool
-operator>=(const _Iterator& __x, const _Iterator& __y)
-  requires random_access_range<_Base<_Const>>
-{ return __x._M_inner >= __y._M_inner; }
-
 friend constexpr auto
 operator<=>(const _Iterator& __x, const _Iterator& __y)
-  requires random_access_range<_Base<_Const>> && 
three_way_comparable<__ziperator<_Const>>
+  requires random_access_range<_Base<_Const>>
 { return __x._M_inner <=> __y._M_inner; }
 
 friend constexpr _Iterator
-- 
2.37.2.382.g795ea8776b



[Patch][2/3] GCN: libgomp+mkoffload.cc: Prepare for reverse offload fn lookup

2022-08-25 Thread Tobias Burnus

On 25.08.22 16:54, Tobias Burnus wrote:

The attached patch prepares for reverse-offload device->host
function-address lookup by requesting (if needed) the on-device address.


This patch adds the actual implementation for GCN. A variant would be
to only generate .offload_func_table inside mkoffload when
OMP_REQUIRES_REVERSE_OFFLOAD has been requested.

This is currently effectively a no-op because, with the [1/3] patch, NULL
is always passed and GOMP_OFFLOAD_get_num_devices returns <= 0 as soon as
'omp requires reverse_offload' has been specified.

OK for mainline?

Tobias


GCN: libgomp+mkoffload.cc: Prepare for reverse offload fn lookup

Add support to GCN for reverse lookup of function name to prepare for
'omp target device(ancestor:1)'.

gcc/ChangeLog:

	* config/gcn/mkoffload.cc (process_asm): Create .offload_func_table,
	similar to pre-existing .offload_var_table.

libgomp/ChangeLog:

	* plugin/plugin-gcn.c (GOMP_OFFLOAD_load_image): Read
	.offload_func_table to populate rev_fn_table when requested.

 gcc/config/gcn/mkoffload.cc | 11 ++-
 libgomp/plugin/plugin-gcn.c | 26 +-
 2 files changed, 35 insertions(+), 2 deletions(-)
diff --git a/gcc/config/gcn/mkoffload.cc b/gcc/config/gcn/mkoffload.cc
index 4206448703a..24d327355e3 100644
--- a/gcc/config/gcn/mkoffload.cc
+++ b/gcc/config/gcn/mkoffload.cc
@@ -537,63 +537,72 @@ process_asm (FILE *in, FILE *out, FILE *cfile)
 	case IN_VARS:
 	  {
 	char *varname;
 	unsigned varsize;
 	if (sscanf (buf, " .8byte %ms\n", &varname))
 	  {
 		fputs (buf, out);
 		fgets (buf, sizeof (buf), in);
 		if (!sscanf (buf, " .8byte %u\n", &varsize))
 		  abort ();
 		var_count++;
 	  }
 	break;
 	  }
 	case IN_FUNCS:
 	  {
 	char *funcname;
 	if (sscanf (buf, "\t.8byte\t%ms\n", &funcname))
 	  {
+		fputs (buf, out);
 		obstack_ptr_grow (_os, funcname);
 		fn_count++;
 		continue;
 	  }
 	break;
 	  }
 	}
 
   char dummy;
   if (sscanf (buf, " .section .gnu.offload_vars%c", &dummy) > 0)
 	{
 	  state = IN_VARS;
 
 	  /* Add a global symbol to allow plugin-gcn.c to locate the table
 	 at runtime.  It can't use the "offload_var_table.N" emitted by
 	 the compiler because a) they're not global, and b) there's one
 	 for each input file combined into the binary.  */
 	  fputs (buf, out);
 	  fputs ("\t.global .offload_var_table\n"
 		 "\t.type .offload_var_table, @object\n"
 		 ".offload_var_table:\n",
 		 out);
 	}
   else if (sscanf (buf, " .section .gnu.offload_funcs%c", &dummy) > 0)
-	state = IN_FUNCS;
+	{
+	  state = IN_FUNCS;
+	  /* Likewise for .gnu.offload_vars; used for reverse offload. */
+	  fputs (buf, out);
+	  fputs ("\t.global .offload_func_table\n"
+		 "\t.type .offload_func_table, @object\n"
+		 ".offload_func_table:\n",
+		 out);
+	}
   else if (sscanf (buf, " .amdgpu_metadata%c", &dummy) > 0)
 	{
 	  state = IN_METADATA;
 	  regcount.kernel_name = NULL;
 	  regcount.sgpr_count = regcount.vgpr_count = -1;
 	}
   else if (sscanf (buf, " .section %c", &dummy) > 0
 	   || sscanf (buf, " .text%c", &dummy) > 0
 	   || sscanf (buf, " .bss%c", &dummy) > 0
 	   || sscanf (buf, " .data%c", &dummy) > 0
 	   || sscanf (buf, " .ident %c", &dummy) > 0)
 	state = IN_CODE;
   else if (sscanf (buf, " .end_amdgpu_metadata%c", &dummy) > 0)
 	{
 	  state = IN_CODE;
 	  gcc_assert (regcount.kernel_name != NULL
 		  && regcount.sgpr_count >= 0
 		  && regcount.vgpr_count >= 0);
 	  obstack_grow (_os, , sizeof (regcount));
diff --git a/libgomp/plugin/plugin-gcn.c b/libgomp/plugin/plugin-gcn.c
index 363e2950649..00603981866 100644
--- a/libgomp/plugin/plugin-gcn.c
+++ b/libgomp/plugin/plugin-gcn.c
@@ -3353,7 +3353,7 @@ GOMP_OFFLOAD_init_device (int n)
 int
 GOMP_OFFLOAD_load_image (int ord, unsigned version, const void *target_data,
 			 struct addr_pair **target_table,
-			 uint64_t **rev_fn_table __attribute__((unused)))
+			 uint64_t **rev_fn_table)
 {
   if (GOMP_VERSION_DEV (version) != GOMP_VERSION_GCN)
 {
@@ -3520,6 +3520,30 @@ GOMP_OFFLOAD_load_image (int ord, unsigned version, const void *target_data,
   if (module->fini_array_func)
 kernel_count--;
 
+  if (rev_fn_table != NULL && kernel_count == 0)
+*rev_fn_table = NULL;
+  else if (rev_fn_table != NULL)
+{
+  hsa_status_t status;
+  hsa_executable_symbol_t var_symbol;
+  status = hsa_fns.hsa_executable_get_symbol_fn (agent->executable, NULL,
+		 ".offload_func_table",
+		 agent->id, 0, _symbol);
+  if (status != HSA_STATUS_SUCCESS)
+	hsa_fatal ("Could not find symbol for variable in the code object",
+		   status);
+  uint64_t fn_table_addr;
+  status = hsa_fns.hsa_executable_symbol_get_info_fn
+	(var_symbol, HSA_EXECUTABLE_SYMBOL_INFO_VARIABLE_ADDRESS,
+	 

RE: [PATCH v3] Many pages: Document fixed-width types with ISO C naming

2022-08-25 Thread Joseph Myers
On Thu, 25 Aug 2022, David Laight wrote:

> From: Joseph Myers
> > Sent: 25 August 2022 15:39
> > 
> > On Thu, 25 Aug 2022, Linus Torvalds wrote:
> > 
> > > That's a small detail that yes, we've tried to avoid the absolute
> > > humongous mess that the C standard library has with their horrendous
> > > 'PRId*' mess, but honestly, it's just a tiny detail.
> > 
> > I've not yet implemented it for glibc or for GCC format checking, but C23
> > adds 'wN' format length modifiers so you will be able to e.g. use "%w64d"
> > with printf to print an int64_t and won't need those PRI macros any more.
> 
> Is that meant to work regardless of whether the type is
> int, long or long long provided the size is correct?
> 
> Or does it require the compiler know which type inttypes.h
> uses for uint32_t and uint64_t?

The type passed needs to be that used for the relevant stdint.h typedef, 
not another of the same size.  (For format checking, that means the 
compiler needs to know what the types used in stdint.h are.)

It's now required that if int32_t exists, int_least32_t must have the same 
type, so int_least32_t can also be used with that format (and there are 
'wfN' formats for int_fastN_t / uint_fastN_t as well).
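
For comparison, a small sketch of what this looks like today with the <inttypes.h> macros; the function names are hypothetical. Once a C library implements the C23 modifiers, "%w64d" could presumably replace the PRId64 spelling, and "%w32d" would also be usable for int_least32_t as described above; those forms appear only in comments because current libraries may not accept them.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Format an int64_t using the existing macro.
   C23 spelling (assumes library support): "%w64d".  */
int
format_i64 (char *buf, size_t size, int64_t value)
{
  int n = snprintf (buf, size, "%" PRId64, value);
  assert (n >= 0);
  return n;
}

/* int_least32_t has its own macro today.
   C23 spelling (assumes library support): "%w32d".  */
int
format_least32 (char *buf, size_t size, int_least32_t value)
{
  int n = snprintf (buf, size, "%" PRIdLEAST32, value);
  assert (n >= 0);
  return n;
}
```

In both cases the argument has exactly the typedef'd type the format expects, which is the matching requirement described above.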

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH, rs6000] Change insn condition from TARGET_64BIT to TARGET_POWERPC64 for VSX scalar extract/insert instructions

2022-08-25 Thread Segher Boessenkool
On Wed, Aug 24, 2022 at 01:11:39PM +0800, HAO CHEN GUI wrote:
> On 23/8/2022 10:26 PM, Segher Boessenkool wrote:
> > On Fri, Aug 19, 2022 at 10:35:54AM +0800, HAO CHEN GUI wrote:
> >> --- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-0.c
> >> +++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-0.c
> >> @@ -1,7 +1,8 @@
> >>  /* { dg-do compile { target { powerpc*-*-* } } } */
> >> -/* { dg-require-effective-target lp64 } */
> >> -/* { dg-require-effective-target powerpc_p9vector_ok } */
> >>  /* { dg-options "-mdejagnu-cpu=power9" } */
> >> +/* { dg-additional-options "-mpowerpc64" { target { powerpc*-*-linux* && 
> >> ilp32 } } } */
> > 
> > You can add this always.  It is default on 64-bit systems, but it is
> > simpler to just always add it:
> > /* { dg-additional-options "-mpowerpc64" } */
> > 
> > Or are there subtargets that will error on this?
> Yes, AIX fails if TARGET_POWERPC64 is set and TARGET_64BIT is not set.
> So I add "-mpowerpc64" for Linux 32-bit environment.

Aha.  But you can add it for all linux:

/* { dg-additional-options "-mpowerpc64" { target powerpc*-*-linux* } } */

(or *-*-linux* even, everything in gcc.target/powerpc is known to be
powerpc*-*-* already).

Not that it matters at all here, as the other thread shows :-), but for
the future: run testcases wherever possible (and reasonable), and in the
same vein, try not to specialise options when you do not have to.  Doing
this makes testing much less work and makes it easier to have reasonable
coverage.


Segher


[committed] libstdc++: Some minor cleanups

2022-08-25 Thread Patrick Palka via Gcc-patches
libstdc++-v3/ChangeLog:

* include/std/ranges (lazy_split_view::_OuterIter::_M_current):
Remove redundant comment.
(lazy_split_view::_M_current): Likewise.
(common_view::common_view): Remove commented out view-converting
constructor as per LWG3405.
(elements_view::_Iterator::_Iterator): Uglify 'current' and 'i'.
---
 libstdc++-v3/include/std/ranges | 20 
 1 file changed, 4 insertions(+), 16 deletions(-)

diff --git a/libstdc++-v3/include/std/ranges b/libstdc++-v3/include/std/ranges
index c600bad6283..fb815c48f99 100644
--- a/libstdc++-v3/include/std/ranges
+++ b/libstdc++-v3/include/std/ranges
@@ -3116,7 +3116,6 @@ namespace views::__adaptor
 
  _Parent* _M_parent = nullptr;
 
- // XXX: _M_current is present only if "V models forward_range"
  [[no_unique_address]]
__detail::__maybe_present_t,
iterator_t<_Base>> _M_current;
@@ -3370,7 +3369,6 @@ namespace views::__adaptor
 
   _Vp _M_base = _Vp();
   _Pattern _M_pattern = _Pattern();
-  // XXX: _M_current is "present only if !forward_range"
   [[no_unique_address]]
__detail::__maybe_present_t,
  __detail::__non_propagating_cache>> _M_current;
@@ -3725,16 +3723,6 @@ namespace views::__adaptor
: _M_base(std::move(__r))
   { }
 
-  /* XXX: LWG 3280 didn't remove this constructor, but I think it should?
-  template
-   requires (!common_range<_Range>)
- && constructible_from<_Vp, views::all_t<_Range>>
-   constexpr explicit
-   common_view(_Range&& __r)
- : _M_base(views::all(std::forward<_Range>(__r)))
-   { }
-  */
-
   constexpr _Vp
   base() const& requires copy_constructible<_Vp>
   { return _M_base; }
@@ -4115,14 +4103,14 @@ namespace views::__adaptor
  _Iterator() requires default_initializable> = 
default;
 
  constexpr explicit
- _Iterator(iterator_t<_Base> current)
-   : _M_current(std::move(current))
+ _Iterator(iterator_t<_Base> __current)
+   : _M_current(std::move(__current))
  { }
 
  constexpr
- _Iterator(_Iterator i)
+ _Iterator(_Iterator __i)
requires _Const && convertible_to, 
iterator_t<_Base>>
-   : _M_current(std::move(i._M_current))
+   : _M_current(std::move(__i._M_current))
  { }
 
  constexpr const iterator_t<_Base>&
-- 
2.37.2.382.g795ea8776b



RE: [PATCH v3] Many pages: Document fixed-width types with ISO C naming

2022-08-25 Thread David Laight via Gcc-patches
From: Joseph Myers
> Sent: 25 August 2022 15:39
> 
> On Thu, 25 Aug 2022, Linus Torvalds wrote:
> 
> > That's a small detail that yes, we've tried to avoid the absolute
> > humongous mess that the C standard library has with their horrendous
> > 'PRId*' mess, but honestly, it's just a tiny detail.
> 
> I've not yet implemented it for glibc or for GCC format checking, but C23
> adds 'wN' format length modifiers so you will be able to e.g. use "%w64d"
> with printf to print an int64_t and won't need those PRI macros any more.

Is that meant to work regardless of whether the type is
int, long or long long provided the size is correct?

Or does it require the compiler know which type inttypes.h
uses for uint32_t and uint64_t?

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



[PATCH] Improve compute_control_dep_chain path finding

2022-08-25 Thread Richard Biener via Gcc-patches
This improves the compute_control_dep_chain path finding by first
marking the dominating region we search and then making sure not to
walk outside of it when enumerating all paths from the dominating
block to the interesting PHI edge source.  I have limited the DFS
walk done for the marking in similar ways as we limit the walking
in compute_control_dep_chain, more careful limiting might be
necessary though - the --param uninit-control-dep-attempts param
I re-use has a rather high default of 1000 which we might be able
to reduce with this patch as well (I think we'll usually hit some of the
other limits before ever reaching this).
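
As a rough illustration of the marking step (this is not the code from
the patch), the sketch below does a budget-limited walk over predecessor
edges on a toy CFG.  The names Graph, mark_region and limit are made up
for this example; the real helper works on basic_block/edge and a BB flag.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy CFG (assumption for this sketch): preds[b] lists the predecessors of b.
using Graph = std::vector<std::vector<int>>;

// Mark every block reachable backwards from FROM without walking above DOM.
// At most LIMIT blocks are marked; if more would be needed the walk stops
// and false is returned, leaving the region only partially marked.
bool mark_region(const Graph &preds, int from, int dom, std::size_t limit,
                 std::vector<bool> &in_region) {
  assert(in_region.size() == preds.size());
  std::vector<int> worklist{from};
  std::size_t marked = 0;
  while (!worklist.empty()) {
    int b = worklist.back();
    worklist.pop_back();
    if (in_region[b])
      continue;                 // already visited
    if (marked == limit)
      return false;             // budget exhausted, region incomplete
    in_region[b] = true;
    ++marked;
    if (b == dom)
      continue;                 // do not queue predecessors of the dominator
    for (int p : preds[b])
      if (!in_region[p])
        worklist.push_back(p);
  }
  return true;
}
```

A later path enumeration can then skip any block whose in_region entry
is false, which is the effect the in_region flag has in the patch.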

Bootstrap & regtest running on x86_64-unknown-linux-gnu.

The true benefit will only be there when we can combine the
multiple compute_control_dep_chain calls made from
uninit_analysis::init_from_phi_def, but I've not yet managed to
make that work.  I guess I have to fully understand
compute_control_dep_chain first ...

* gimple-predicate-analysis.cc (dfs_mark_dominating_region):
New helper.
(compute_control_dep_chain): Adjust to honor marked region
if provided.
(uninit_analysis::init_from_phi_def): Pre-mark the dominating
region to improve compute_control_dep_chain walking.
* vec.h (vec::allocated): Add forwarder.
---
 gcc/gimple-predicate-analysis.cc | 81 ++--
 gcc/vec.h|  3 ++
 2 files changed, 81 insertions(+), 3 deletions(-)

diff --git a/gcc/gimple-predicate-analysis.cc b/gcc/gimple-predicate-analysis.cc
index 0d973a9e25a..e395c1b7052 100644
--- a/gcc/gimple-predicate-analysis.cc
+++ b/gcc/gimple-predicate-analysis.cc
@@ -1078,6 +1078,54 @@ simple_control_dep_chain (vec& chain, basic_block 
from, edge to)
   simple_control_dep_chain (chain, from, to->src);
 }
 
+/* Perform a DFS walk on predecessor edges to mark the region denoted
+   by the EXIT edge and DOM which dominates EXIT->src, including DOM.
+   Blocks in the region are marked with FLAG and added to BBS.  BBS is
+   filled up to its capacity only after which the walk is terminated
+   and false is returned.  If the whole region was marked, true is returned.  
*/
+
+static bool
+dfs_mark_dominating_region (edge exit, basic_block dom, int flag,
+   vec )
+{
+  if (exit->src == dom || exit->src->flags & flag)
+return true;
+  if (!bbs.space (1))
+return false;
+  bbs.quick_push (exit->src);
+  exit->src->flags |= flag;
+  auto_vec stack (bbs.allocated () - bbs.length () + 1);
+  stack.quick_push (ei_start (exit->src->preds));
+  while (!stack.is_empty ())
+{
+  /* Look at the edge on the top of the stack.  */
+  edge_iterator ei = stack.last ();
+  basic_block src = ei_edge (ei)->src;
+
+  /* Check if the edge source has been visited yet.  */
+  if (!(src->flags & flag))
+   {
+ /* Mark the source if there's still space.  If not, return early.  */
+ if (!bbs.space (1))
+   return false;
+ src->flags |= flag;
+ bbs.quick_push (src);
+
+ /* Queue its predecessors if we didn't reach DOM.  */
+ if (src != dom && EDGE_COUNT (src->preds) > 0)
+   stack.quick_push (ei_start (src->preds));
+   }
+  else
+   {
+ if (!ei_one_before_end_p (ei))
+   ei_next ( ());
+ else
+   stack.pop ();
+   }
+}
+  return true;
+}
+
 /* Recursively compute the control dependence chains (paths of edges)
from the dependent basic block, DEP_BB, up to the dominating basic
block, DOM_BB (the head node of a chain should be dominated by it),
@@ -1093,7 +1141,7 @@ static bool
 compute_control_dep_chain (basic_block dom_bb, const_basic_block dep_bb,
   vec cd_chains[], unsigned *num_chains,
   vec _cd_chain, unsigned *num_calls,
-  unsigned depth = 0)
+  unsigned in_region = 0, unsigned depth = 0)
 {
   if (*num_calls > (unsigned)param_uninit_control_dep_attempts)
 {
@@ -1167,10 +1215,14 @@ compute_control_dep_chain (basic_block dom_bb, 
const_basic_block dep_bb,
  break;
}
 
+ /* If the dominating region has been marked avoid walking outside.  */
+ if (in_region != 0 && !(cd_bb->flags & in_region))
+   break;
+
  /* Check if DEP_BB is indirectly control-dependent on DOM_BB.  */
  if (compute_control_dep_chain (cd_bb, dep_bb, cd_chains,
 num_chains, cur_cd_chain,
-num_calls, depth + 1))
+num_calls, in_region, depth + 1))
{
  found_cd_chain = true;
  break;
@@ -2238,6 +2290,25 @@ uninit_analysis::init_from_phi_def (gphi *phi)
   if (nedges == 0)
 return false;
 
+  auto_bb_flag in_region (cfun);
+  auto_vec region (MIN (n_basic_blocks_for_fn (cfun),
+ 

[PATCH] Improve uninit_analysis::collect_phi_def_edges

2022-08-25 Thread Richard Biener via Gcc-patches
This avoids expanding an edge to those of a PHI def if it is not
may-undefined, reducing the number of compute_control_dep_chain calls.

Bootstrap and regtest in progress on x86_64-unknown-linux-gnu.

* gimple-predicate-analysis.cc
(uninit_analysis::collect_phi_def_edges): Only expand a
PHI def edge when it is possibly undefined.
---
 gcc/gimple-predicate-analysis.cc | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/gcc/gimple-predicate-analysis.cc b/gcc/gimple-predicate-analysis.cc
index 079e06009fd..0d973a9e25a 100644
--- a/gcc/gimple-predicate-analysis.cc
+++ b/gcc/gimple-predicate-analysis.cc
@@ -605,11 +605,7 @@ uninit_analysis::collect_phi_def_edges (gphi *phi, 
basic_block cd_root,
{
  gimple *def = SSA_NAME_DEF_STMT (opnd);
 
- if (gimple_code (def) == GIMPLE_PHI
- && dominated_by_p (CDI_DOMINATORS, gimple_bb (def), cd_root))
-   collect_phi_def_edges (as_a (def), cd_root, edges,
-  visited);
- else if (!m_eval (opnd))
+ if (!m_eval (opnd))
{
  if (dump_file && (dump_flags & TDF_DETAILS))
{
@@ -622,6 +618,10 @@ uninit_analysis::collect_phi_def_edges (gphi *phi, 
basic_block cd_root,
}
  edges->safe_push (opnd_edge);
}
+ else if (gimple_code (def) == GIMPLE_PHI
+  && dominated_by_p (CDI_DOMINATORS, gimple_bb (def), cd_root))
+   collect_phi_def_edges (as_a (def), cd_root, edges,
+  visited);
}
   else
{
-- 
2.35.3


[Patch][1/3] libgomp: Prepare for reverse offload fn lookup

2022-08-25 Thread Tobias Burnus

Technically, this patch stands alone, but conceptually it is based on the
submitted but not yet reviewed patch:
"[Patch] OpenMP: Support reverse offload (middle end part)"
https://gcc.gnu.org/pipermail/gcc-patches/2022-July/598662.html

With that patch, for reverse offloads ('omp target device(ancestor:1)'),
calls like the following are added:
 GOMP_target_ext (-2 /* initial device */, omp_fn.1
where 'omp_fn.1' on non-host devices is a stub function that is only
required for looking up the host function pointer via the offload_funcs table.

The attached patch prepares for reverse-offload device->host
function-address lookup by requesting (if needed) the on-device address.

OK for mainline?

Tobias


libgomp: Prepare for reverse offload fn lookup

Prepare for reverse-offloading function-pointer lookup by passing
a rev_fn_table argument to GOMP_OFFLOAD_load_image.

The argument will be NULL unless GOMP_REQUIRES_REVERSE_OFFLOAD is
requested and devices not supporting it are filtered out.
(Up to and including this commit, no non-host device claims such
support and the caller currently always passes NULL.)

libgomp/ChangeLog:

	* libgomp-plugin.h (GOMP_OFFLOAD_load_image): Add
	'uint64_t **rev_fn_table' argument.
	* oacc-host.c (host_load_image): Likewise.
	* plugin/plugin-gcn.c (GOMP_OFFLOAD_load_image): Likewise;
	currently unused.
	* plugin/plugin-nvptx.c (GOMP_OFFLOAD_load_image): Likewise.
	* target.c (gomp_load_image_to_device): Update call but pass
	NULL for now.

liboffloadmic/ChangeLog:

	* plugin/libgomp-plugin-intelmic.cpp (GOMP_OFFLOAD_load_image):
	Add (unused) uint64_t **rev_fn_table argument.

 libgomp/libgomp-plugin.h | 2 +-
 libgomp/oacc-host.c  | 3 ++-
 libgomp/plugin/plugin-gcn.c  | 7 +--
 libgomp/plugin/plugin-nvptx.c| 7 +--
 libgomp/target.c | 2 +-
 liboffloadmic/plugin/libgomp-plugin-intelmic.cpp | 3 ++-
 6 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/libgomp/libgomp-plugin.h b/libgomp/libgomp-plugin.h
index ab3ed638475..57309303e66 100644
--- a/libgomp/libgomp-plugin.h
+++ b/libgomp/libgomp-plugin.h
@@ -130,7 +130,7 @@ extern bool GOMP_OFFLOAD_init_device (int);
 extern bool GOMP_OFFLOAD_fini_device (int);
 extern unsigned GOMP_OFFLOAD_version (void);
 extern int GOMP_OFFLOAD_load_image (int, unsigned, const void *,
-struct addr_pair **);
+struct addr_pair **, uint64_t **);
 extern bool GOMP_OFFLOAD_unload_image (int, unsigned, const void *);
 extern void *GOMP_OFFLOAD_alloc (int, size_t);
 extern bool GOMP_OFFLOAD_free (int, void *);
diff --git a/libgomp/oacc-host.c b/libgomp/oacc-host.c
index eb11b9cf16a..4e3971ae1a9 100644
--- a/libgomp/oacc-host.c
+++ b/libgomp/oacc-host.c
@@ -81,7 +81,8 @@ static int
 host_load_image (int n __attribute__ ((unused)),
 		 unsigned v __attribute__ ((unused)),
 		 const void *t __attribute__ ((unused)),
-		 struct addr_pair **r __attribute__ ((unused)))
+		 struct addr_pair **r __attribute__ ((unused)),
+		 uint64_t **f __attribute__ ((unused)))
 {
   return 0;
 }
diff --git a/libgomp/plugin/plugin-gcn.c b/libgomp/plugin/plugin-gcn.c
index ea327bf2ca0..363e2950649 100644
--- a/libgomp/plugin/plugin-gcn.c
+++ b/libgomp/plugin/plugin-gcn.c
@@ -3346,11 +3346,14 @@ GOMP_OFFLOAD_init_device (int n)
 
 /* Load GCN object-code module described by struct gcn_image_desc in
TARGET_DATA and return references to kernel descriptors in TARGET_TABLE.
-   If there are any constructors then run them.  */
+   If there are any constructors then run them.  If not NULL, REV_FN_TABLE will
+   contain the on-device addresses of the functions for reverse offload.  To be
+   freed by the caller.  */
 
 int
 GOMP_OFFLOAD_load_image (int ord, unsigned version, const void *target_data,
-			 struct addr_pair **target_table)
+			 struct addr_pair **target_table,
+			 uint64_t **rev_fn_table __attribute__((unused)))
 {
   if (GOMP_VERSION_DEV (version) != GOMP_VERSION_GCN)
 {
diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index bc63e274cdf..d130665ed19 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -1266,11 +1266,14 @@ nvptx_set_clocktick (CUmodule module, struct ptx_device *dev)
 }
 
 /* Load the (partial) program described by TARGET_DATA to device
-   number ORD.  Allocate and return TARGET_TABLE.  */
+   number ORD.  Allocate and return TARGET_TABLE.  If not NULL, REV_FN_TABLE
+   will contain the on-device addresses of the functions for reverse offload.
+   To be freed by the caller.  */
 
 int
 GOMP_OFFLOAD_load_image (int ord, unsigned version, const void *target_data,
-			 struct addr_pair 

Re: [PATCH v3] Many pages: Document fixed-width types with ISO C naming

2022-08-25 Thread Joseph Myers
On Thu, 25 Aug 2022, Linus Torvalds wrote:

> That's a small detail that yes, we've tried to avoid the absolute
> humongous mess that the C standard library has with their horrendous
> 'PRId*' mess, but honestly, it's just a tiny detail.

I've not yet implemented it for glibc or for GCC format checking, but C23 
adds 'wN' format length modifiers so you will be able to e.g. use "%w64d" 
with printf to print an int64_t and won't need those PRI macros any more.
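
For comparison, a minimal sketch of the PRI-macro style that is needed
today.  format_i64 is a hypothetical helper written for this example;
as described above, a C23 library would accept "%w64d" in place of
"%" PRId64, which the sketch deliberately does not rely on.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Format a 64-bit integer portably.  PRId64 expands to the right length
// modifier for whichever type the platform uses for int64_t.
std::string format_i64(std::int64_t v) {
  char buf[32];  // enough for a sign, 19 digits and the terminator
  int n = std::snprintf(buf, sizeof buf, "%" PRId64, v);
  assert(n > 0 && n < static_cast<int>(sizeof buf));
  return buf;
}
```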

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH, rs6000] Change insn condition from TARGET_64BIT to TARGET_POWERPC64 for VSX scalar extract/insert instructions

2022-08-25 Thread David Edelsohn via Gcc-patches
On Thu, Aug 25, 2022 at 1:22 AM Kewen.Lin  wrote:
>
> on 2022/8/25 11:37, HAO CHEN GUI wrote:
> > Hi,
> >
> > On 24/8/2022 1:24 PM, Kewen.Lin wrote:
> >> Could you try to test with dg-options "-mdejagnu-cpu=power9 -mpowerpc64" 
> >> all the time, but still
> >> having that has_arch_ppc64 effective target on aix?
> >>
> >> I'd expect has_arch_ppc64 check to fail on aix 32bit, the error will not 
> >> be a problem (turning
> >> into an UNSUPPORTED then)?
> >
> > I tested it on AIX. "has_arch_ppc64" fails with dg-options 
> > "-mdejagnu-cpu=power9 -mpowerpc64" on
> > 32-bit AIX environment. It works as we expected.
>
> Nice, thanks for your time on testing.
>
> >
> > Also I found that AIX and Darwin are skipped for the bfp tests, so the
> > testcase does not need to care
> > about them. Not sure if that is intentional.
> >
> > In bfp.exp
> >
> > # Exit immediately if this isn't a PowerPC target or if the target is
> > # aix or Darwin.
> > if { (![istarget powerpc*-*-*] && ![istarget rs6000-*-*])
> >  || [istarget "powerpc*-*-aix*"]
> >  || [istarget "powerpc*-*-darwin*"]  } then {
> >   return
> > }
>
> I can't find a hint about why we wanted to disable bfp testing on aix, it
> looks like overkill to me.
>
> Could you help to further test whether all test cases in this small bucket
> are available on aix?
>
> Maybe it can give us some evidence on whether it's intentional or not.
>
> Hi David & Segher,
>
> Do you have some insights on this?

AIX (and Darwin) are not Linux and not ELF.  There is no support for
BPF.  All of the tests fail, so they are skipped.

Thanks, David


Re: Rust frontend patches v2

2022-08-25 Thread Philip Herron
On Thu, 25 Aug 2022 at 13:50, Frank Ch. Eigler  wrote:
>
> Hi -
>
>
> > 12K 0004-gccrs-Add-link-cases-testsuite.patch
> > 356K 0005-gccrs-Add-general-compilation-test-cases.patch
> > 132K 0006-gccrs-Add-execution-test-cases.patch
> > 4.0K 0007-gccrs-Add-gcc-check-target-check-rust.patch
> > 656K 0008-gccrs-Add-the-Rust-front-end-AST-data-structures.patch
> > 112K 0009-gccrs-Add-Lexer-for-Rust-front-end.patch
> > 504K 0010-gccrs-Add-Parser-for-Rust-front-end.patch
> > 200K 0011-gccrs-Add-expansion-pass-for-the-Rust-front-end.patch
> > 204K 0012-gccrs-Add-name-resolution-pass-to-the-Rust-front-end.patch
> > 476K 0013-gccrs-Add-second-intermedite-representation-called-H.patch
> > [...]
>
> Just curious whether a human reviewer is expected to read through this
> much content?  If not, and if this structure is only for machine /
> bisecting purposes, maybe they're not worth also emailing.
>
> - FChE

That's a good point. I prefer reading this stuff on the git branch, at either of:

https://gcc.gnu.org/git/?p=gcc.git;a=shortlog;h=refs/heads/devel/rust/master
https://github.com/Rust-GCC/gccrs

The first patch here is a nice isolated GCC change; the rest of the
patches are either pure front-end code or set up the rust target
for the test suite.

The main patches I think people will be interested in are:

[PATCH Rust front-end v2 01/37] Use DW_ATE_UTF for the Rust 'char'
[PATCH Rust front-end v2 07/37] gccrs: Add gcc-check-target
[PATCH Rust front-end v2 31/37] gccrs: Add GCC Rust front-end
[PATCH Rust front-end v2 32/37] gccrs: Add config-lang.in
[PATCH Rust front-end v2 33/37] gccrs: add lang-spec.h
[PATCH Rust front-end v2 34/37] gccrs: add lang.opt
[PATCH Rust front-end v2 35/37] gccrs: add compiler driver
[PATCH Rust front-end v2 36/37] gccrs: compiler proper interface

Eventually, when we clean it up, our port of the constexpr.cc from the
C++ front-end might be of interest to other front-ends.

--Phil


Re: [PATCH] c++, v2: Implement C++23 P2071R2 - Named universal character escapes [PR106648]

2022-08-25 Thread Jason Merrill via Gcc-patches

On 8/25/22 04:49, Jakub Jelinek wrote:

On Wed, Aug 24, 2022 at 04:22:17PM -0400, Jason Merrill wrote:

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?


Does the copyright 2005-2022 mean that this code is partly derived from some
other?


Yes, I was lazy and started by copying over makeucnid.cc which also
parses UnicodeData.txt.


Makes sense, please mention that in the ChangeLog.  OK with that change.


In the end, according to diff -upd -U1 make{ucnid,uname2c}.cc, there are
~180 lines in common (out of ~530 lines of makeucnid.cc), of which
~80 lines are in the two copyrights; most of the rest are just empty lines or
lines with { or } alone, beyond that
  #include 
  #include 
  #include 
  #include 
  #include 
  
  
  #define NUM_CODE_POINTS 0x11

  #define MAX_CODE_POINT 0x10
and
  /* Read UnicodeData.txt and fill in the 'decomp' table to be the
 decompositions of characters for which both the character
 decomposed and all the code points in the decomposition are valid
 for some supported language version, and the 'all_decomp' table to
 be the decompositions of all characters without those
 constraints.  */
  
  static void

  {
if (!f)
for (;;)
  {
char line[256];
char *l;
  
if (!fgets (line, sizeof (line), f))

 break;
codepoint = strtoul (line, , 16);
if (l == line || *l != ';')
if (codepoint > MAX_CODE_POINT)
  
do {

} while (*l != ';');
 {
 }
  }
if (ferror (f))
fclose (f);
  }
are the common lines close to each other (and whole
write_copyright function).  Dunno if with that I could use
just 2022 copyright or not.


+ /* We don't know what the next letter will be.
+It could be ISALNUM, then we are supposed
+to omit it, or it could be a space and then
+we should not omit it and need to compare it.
+Fortunately the only 3 names with hyphen
+followed by non-letter are
+U+0F0A TIBETAN MARK BKA- SHOG YIG MGO
+U+0FD0 TIBETAN MARK BKA- SHOG GI MGO RGYAN
+U+0FD0 TIBETAN MARK BSKA- SHOG GI MGO RGYAN
+and makeuname2c.cc verifies this.
+Furthermore, prefixes of NR2 generated
+ranges all end with a hyphen, but the generated
+part is then followed by alpha-numeric.
+So, let's just assume that - at the end of
+key is always followed by alphanumeric and
+so should be omitted.  */


Let's mention that makeuname2c.cc verifies this property.


I had "and makeuname2c.cc verifies this." there already a few lines before,
but I agree it is better to move that to the end.


+ for (j = start; j < end; j++)
+   {
+ /* Actually strlen, but we know strlen () <= 3.  */


Is this comment saying that you're using a loop instead of calling strlen
because you know the result will be small?  That seems an odd choice.


Yes, but perhaps it is a micro-optimization, and maybe the Korean characters
will be used so rarely that it isn't worth it.
Our optimizers certainly aren't able to figure out that when
strlen is called on an array element with size 4, calling the library
function isn't the best idea.  The string lengths are 0 in 3%, 1 in 44%,
2 in 47% and 3 in 6% of cases.
At least on x86_64 when I just use this_len = strlen (hangul_syllables[j]);
it calls the library routine.
Changed to this_len = strlen (hangul_syllables[j]);


+ /* Try to do a loose name lookup according to
+Unicode loose matching rule UAX44-LM2.


Maybe factor the loose lookup into a separate function?


Good idea.


+ bidi::kind kind;
+ if (buffer->cur[-1] == 'N')
+   kind = get_bidi_named (pfile, buffer->cur, );
+ else
+   kind = get_bidi_ucn (pfile, buffer->cur,
+buffer->cur[-1] == 'U', );


Hmm, I'm surprised that we're doing bidi checking before replacing escape
characters with elements of the translation character set.  So now we need
to check it three different ways.


It is unfortunate, but I'm afraid it is intentional.
Because after replacing the escape characters we lose the distinction
between characters written as UTF-8 in the source and the escape sequences.
The former need to be treated differently as they are more dangerous than
the latter, bidi written as UTF-8 can mislead what the source contains
already in (some) text editors or whatever way user looks at the source
code, while when written as UCNs (\u, \u{}, \U, \N{}) it can be 

Re: [PATCH v4] c++: Implement -Wself-move warning [PR81159]

2022-08-25 Thread Jason Merrill via Gcc-patches

On 8/24/22 17:30, Marek Polacek wrote:

On Tue, Aug 23, 2022 at 05:27:00PM -0400, Jason Merrill wrote:

On 8/23/22 09:39, Marek Polacek wrote:

+  tree arg = CALL_EXPR_ARG (fn, 0);
+  extract_op (arg);
+  if (TREE_CODE (arg) == ADDR_EXPR)
+arg = TREE_OPERAND (arg, 0);
+  tree type = TREE_TYPE (lhs);
+  lhs = maybe_undo_parenthesized_ref (lhs);
+  STRIP_ANY_LOCATION_WRAPPER (lhs);
+  const bool print_var_p = (DECL_P (lhs)
+   || REFERENCE_REF_P (lhs)
+   || TREE_CODE (lhs) == COMPONENT_REF);


Why include REFERENCE_REF_P and COMPONENT_REF?  Reference refs should be
stripped before this test, member refs aren't variables.


I'm checking REFERENCE_REF_P and COMPONENT_REF to say "moving a variable"
in #1 and #3.  The REFERENCE_REF_P check means that we also say "variable"
for #2.  Sure, "A variable is introduced by the declaration of a reference
other than a non-static data member", but I'm not sure if users care about
that here?

If I strip REFERENCE_REFs before the check then the result will be the
same.


That's what I was suggesting, yes: Strip the REFERENCE_REF so DECL_P can 
see the decl.


I don't see where COMPONENT_REF comes in?


Or I could keep only the DECL_P check, but then we'll say "moving
an expression" for #1 and #2, which seems strange.

struct S {
   int x;
   int &r;
   void foo () {
     x = std::move (x); // #1
     r = std::move (r); // #2
   };
};

void
foo (int &r)
{
   r = std::move (r); // #3
}

Marek





[PATCH 6/6] Extend SLP permutation optimisations

2022-08-25 Thread Richard Sandiford via Gcc-patches
Currently SLP tries to force permute operations "down" the graph
from loads in the hope of reducing the total number of permutations
needed or (in the best case) removing the need for the permutations
entirely.  This patch tries to extend it as follows:

- Allow loads to take a different permutation from the one they
  started with, rather than choosing between "original permutation"
  and "no permutation".

- Allow changes in both directions, if the target supports the
  reverse permutation.

- Treat the placement of permutations as a two-way dataflow problem:
  after propagating information from leaves to roots (as now), propagate
  information back up the graph.

- Take execution frequency into account when optimising for speed,
  so that (for example) permutations inside loops have a higher
  cost than permutations outside loops.

- Try to reduce the total number of permutations when optimising for
  size, even if that increases the number of permutations on a given
  execution path.

See the big block comment above vect_optimize_slp_pass for
a detailed description.

The original motivation for doing this was to add a framework that would
allow other layout differences in future.  The two main ones are:

- Make it easier to represent predicated operations, including
  predicated operations with gaps.  E.g.:

 a[0] += 1;
 a[1] += 1;
 a[3] += 1;

  could be a single load/add/store for SVE.  We could handle this
  by representing a layout such as { 0, 1, _, 2 } or { 0, 1, _, 3 }
  (depending on what's being counted).  We might need to move
  elements between lanes at various points, like with permutes.

  (This would first mean adding support for stores with gaps.)

- Make it easier to switch between an even/odd and unpermuted layout
  when switching between wide and narrow elements.  E.g. if a widening
  operation produces an even vector and an odd vector, we should try
  to keep operations on the wide elements in that order rather than
  force them to be permuted back "in order".

To give some examples of what the patch does:

int f1(int *__restrict a, int *__restrict b, int *__restrict c,
   int *__restrict d)
{
  a[0] = (b[1] << c[3]) - d[1];
  a[1] = (b[0] << c[2]) - d[0];
  a[2] = (b[3] << c[1]) - d[3];
  a[3] = (b[2] << c[0]) - d[2];
}

continues to produce the same code as before when optimising for
speed: b, c and d are permuted at load time.  But when optimising
for size we instead permute c into the same order as b+d and then
permute the result of the arithmetic into the same order as a:

        ldr     q1, [x2]
        ldr     q0, [x1]
        ext     v1.16b, v1.16b, v1.16b, #8     // <--
        sshl    v0.4s, v0.4s, v1.4s
        ldr     q1, [x3]
        sub     v0.4s, v0.4s, v1.4s
        rev64   v0.4s, v0.4s                   // <--
        str     q0, [x0]
        ret
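
To double-check the layout reasoning for f1, here is a small scalar
model (a sketch only, not compiler output).  vec4, swap_halves,
swap_pairs and f1_layout are names invented for this example;
swap_halves stands in for the ext ... #8 and swap_pairs for the rev64.

```cpp
#include <array>
#include <cassert>

using vec4 = std::array<int, 4>;

// Scalar reference: the four statements of f1, written as one result vector.
vec4 f1_scalar(const vec4 &b, const vec4 &c, const vec4 &d) {
  return {(b[1] << c[3]) - d[1], (b[0] << c[2]) - d[0],
          (b[3] << c[1]) - d[3], (b[2] << c[0]) - d[2]};
}

// Swap the two 64-bit halves of a 4 x 32-bit vector (models ext #8).
vec4 swap_halves(const vec4 &v) { return {v[2], v[3], v[0], v[1]}; }

// Swap the 32-bit lanes within each 64-bit half (models rev64 on .4s).
vec4 swap_pairs(const vec4 &v) { return {v[1], v[0], v[3], v[2]}; }

// Size-optimised schedule: b and d stay in memory order, c is permuted
// into their layout, and the result is permuted once into a's layout.
vec4 f1_layout(const vec4 &b, const vec4 &c, const vec4 &d) {
  vec4 cp = swap_halves(c);
  vec4 r;
  for (int i = 0; i < 4; ++i)
    r[i] = (b[i] << cp[i]) - d[i];
  return swap_pairs(r);
}
```

Both functions give the same four values for any inputs with valid
shift counts, using two permutes instead of the three needed when
b, c and d are each permuted at load time.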

The following function:

int f2(int *__restrict a, int *__restrict b, int *__restrict c,
   int *__restrict d)
{
  a[0] = (b[3] << c[3]) - d[3];
  a[1] = (b[2] << c[2]) - d[2];
  a[2] = (b[1] << c[1]) - d[1];
  a[3] = (b[0] << c[0]) - d[0];
}

continues to push the reverse down to just before the store,
like the previous code did.

In:

int f3(int *__restrict a, int *__restrict b, int *__restrict c,
   int *__restrict d)
{
  for (int i = 0; i < 100; ++i)
{
  a[0] = (a[0] + c[3]);
  a[1] = (a[1] + c[2]);
  a[2] = (a[2] + c[1]);
  a[3] = (a[3] + c[0]);
  c += 4;
}
}

the loads of a are hoisted and the stores of a are sunk, so that
only the load from c happens in the loop.  When optimising for
speed, we prefer to have the loop operate on the reversed layout,
changing on entry and exit from the loop:

        mov     x3, x0
        adrp    x0, .LC0
        add     x1, x2, 1600
        ldr     q2, [x0, #:lo12:.LC0]
        ldr     q0, [x3]
        mov     v1.16b, v0.16b
        tbl     v0.16b, {v0.16b - v1.16b}, v2.16b    // <
        .p2align 3,,7
.L6:
        ldr     q1, [x2], 16
        add     v0.4s, v0.4s, v1.4s
        cmp     x2, x1
        bne     .L6
        mov     v1.16b, v0.16b
        adrp    x0, .LC0
        ldr     q2, [x0, #:lo12:.LC0]
        tbl     v0.16b, {v0.16b - v1.16b}, v2.16b    // <
        str     q0, [x3]
        ret

Similarly, for the very artificial testcase:

int f4(int *__restrict a, int *__restrict b, int *__restrict c,
   int *__restrict d)
{
  int a0 = a[0];
  int a1 = a[1];
  int a2 = a[2];
  int a3 = a[3];
  for (int i = 0; i < 100; ++i)
{
  a0 ^= c[0];
  a1 ^= c[1];
  a2 ^= c[2];
  a3 ^= c[3];
  c += 4;
  for (int j = 0; j < 100; ++j)
{
  a0 += d[1];
  a1 += d[0];
  a2 += d[3];
  a3 += d[2];
  d += 4;
}
  b[0] = a0;
  b[1] = a1;
  b[2] = a2;
  b[3] = a3;
  b += 4;
}
  a[0] = a0;
  a[1] = a1;
  a[2] = a2;
  a[3] = a3;
}

the a vector in the inner loop maintains the order { 1, 0, 3, 2 },
even though it's 

[PATCH 5/6] Add base hash traits for vectors

2022-08-25 Thread Richard Sandiford via Gcc-patches
This patch adds a class that provides basic hash/equal functions
for vectors, based on corresponding traits for the element type.
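
The idea can be sketched outside of GCC as follows.  This is an
illustration with std::vector, not the class added by the patch;
int_traits and vector_hash_sketch are hypothetical names, and the
hash mixing is a simple stand-in for inchash.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical element traits: hash and equality for a single int.
struct int_traits {
  using value_type = int;
  static std::size_t hash(int x) { return std::hash<int>{}(x); }
  static bool equal(int a, int b) { return a == b; }
};

// Hash and equality for a whole vector, built from the element traits.
template <typename Traits>
struct vector_hash_sketch {
  using value_type = std::vector<typename Traits::value_type>;

  static std::size_t hash(const value_type &v) {
    std::size_t h = v.size();            // include the length, as the patch does
    for (const auto &e : v)
      h = h * 31 + Traits::hash(e);      // simple mixing, for illustration only
    return h;
  }

  static bool equal(const value_type &a, const value_type &b) {
    if (a.size() != b.size())
      return false;
    for (std::size_t i = 0; i < a.size(); ++i)
      if (!Traits::equal(a[i], b[i]))
        return false;
    return true;
  }
};
```

Nothing here says how empty or deleted slots are represented; that is
left to whatever derives from the base, as in the patch.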

gcc/
* hash-traits.h (vec_hash_base): New class.
(vec_free_hash_base): Likewise.
---
 gcc/hash-traits.h | 55 +++
 1 file changed, 55 insertions(+)

diff --git a/gcc/hash-traits.h b/gcc/hash-traits.h
index 55b81eb0f9e..f5d12706324 100644
--- a/gcc/hash-traits.h
+++ b/gcc/hash-traits.h
@@ -408,6 +408,61 @@ pair_hash ::is_empty (const value_type )
   return T1::is_empty (x.first);
 }
 
+/* Base traits for vectors, providing just the hash and comparison
+   functionality.  Type gives the corresponding traits for the element
+   type.  */
+
+template 
+struct vec_hash_base
+{
+  typedef vec value_type;
+  typedef vec compare_type;
+
+  static inline hashval_t hash (value_type);
+  static inline bool equal (value_type, compare_type);
+};
+
+template 
+inline hashval_t
+vec_hash_base ::hash (value_type x)
+{
+  inchash::hash hstate;
+  hstate.add_int (x.length ());
+  for (auto  : x)
+hstate.merge_hash (Type::hash (value));
+  return hstate.end ();
+}
+
+template 
+inline bool
+vec_hash_base ::equal (value_type x, compare_type y)
+{
+  if (x.length () != y.length ())
+return false;
+  for (unsigned int i = 0; i < x.length (); ++i)
+if (!Type::equal (x[i], y[i]))
+  return false;
+  return true;
+}
+
+/* Traits for vectors whose contents should be freed normally.  */
+
+template 
+struct vec_free_hash_base : vec_hash_base 
+{
+  static void remove (typename vec_hash_base ::value_type &);
+};
+
+template 
+void
+vec_free_hash_base 
+::remove (typename vec_hash_base ::value_type )
+{
+  for (auto  : x)
+Type::remove (x);
+  x.release ();
+}
+
 template  struct default_hash_traits : T {};
 
 template 
-- 
2.25.1



[PATCH 2/6] Split code out of vect_transform_slp_perm_load

2022-08-25 Thread Richard Sandiford via Gcc-patches
Similarly to the previous vectorizable_slp_permutation patch,
this one splits out the main part of vect_transform_slp_perm_load
so that a later patch can test a permutation without constructing
a node for it.

Also fixes a lingering use of STMT_VINFO_VECTYPE.

gcc/
* tree-vect-slp.cc (vect_transform_slp_perm_load_1): Split out from...
(vect_transform_slp_perm_load): ...here.  Use SLP_TREE_VECTYPE instead
of STMT_VINFO_VECTYPE.
---
 gcc/tree-vect-slp.cc | 54 ++--
 1 file changed, 37 insertions(+), 17 deletions(-)

diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index 13c242e5012..64b3379b530 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -6631,23 +6631,23 @@ vect_get_slp_defs (vec_info *,
 }
 }
 
-/* Generate vector permute statements from a list of loads in DR_CHAIN.
-   If ANALYZE_ONLY is TRUE, only check that it is possible to create valid
-   permute statements for the SLP node NODE.  Store the number of vector
-   permute instructions in *N_PERMS and the number of vector load
-   instructions in *N_LOADS.  If DCE_CHAIN is true, remove all definitions
-   that were not needed.  */
+/* A subroutine of vect_transform_slp_perm_load with two extra arguments:
+   - PERM gives the permutation that the caller wants to use for NODE,
+ which might be different from SLP_LOAD_PERMUTATION.
+   - DUMP_P controls whether the function dumps information.  */
 
-bool
-vect_transform_slp_perm_load (vec_info *vinfo,
- slp_tree node, const vec _chain,
- gimple_stmt_iterator *gsi, poly_uint64 vf,
- bool analyze_only, unsigned *n_perms,
- unsigned int *n_loads, bool dce_chain)
+static bool
+vect_transform_slp_perm_load_1 (vec_info *vinfo, slp_tree node,
+   load_permutation_t ,
+   const vec _chain,
+   gimple_stmt_iterator *gsi, poly_uint64 vf,
+   bool analyze_only, bool dump_p,
+   unsigned *n_perms, unsigned int *n_loads,
+   bool dce_chain)
 {
   stmt_vec_info stmt_info = SLP_TREE_SCALAR_STMTS (node)[0];
   int vec_index = 0;
-  tree vectype = STMT_VINFO_VECTYPE (stmt_info);
+  tree vectype = SLP_TREE_VECTYPE (node);
   unsigned int group_size = SLP_TREE_SCALAR_STMTS (node).length ();
   unsigned int mask_element;
   machine_mode mode;
@@ -6732,8 +6732,7 @@ vect_transform_slp_perm_load (vec_info *vinfo,
 {
   unsigned int iter_num = j / group_size;
   unsigned int stmt_num = j % group_size;
-  unsigned int i = (iter_num * DR_GROUP_SIZE (stmt_info)
-   + SLP_TREE_LOAD_PERMUTATION (node)[stmt_num]);
+  unsigned int i = (iter_num * DR_GROUP_SIZE (stmt_info) + perm[stmt_num]);
   bitmap_set_bit (used_in_lanes, i);
   if (repeating_p)
{
@@ -6759,7 +6758,7 @@ vect_transform_slp_perm_load (vec_info *vinfo,
}
  else
{
- if (dump_enabled_p ())
+ if (dump_p)
dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
 "permutation requires at "
 "least three vectors %G",
@@ -6780,7 +6779,7 @@ vect_transform_slp_perm_load (vec_info *vinfo,
  indices.new_vector (mask, second_vec_index == -1 ? 1 : 2, nunits);
  if (!can_vec_perm_const_p (mode, mode, indices))
{
- if (dump_enabled_p ())
+ if (dump_p)
{
  dump_printf_loc (MSG_MISSED_OPTIMIZATION,
   vect_location,
@@ -6896,6 +6895,27 @@ vect_transform_slp_perm_load (vec_info *vinfo,
   return true;
 }
 
+/* Generate vector permute statements from a list of loads in DR_CHAIN.
+   If ANALYZE_ONLY is TRUE, only check that it is possible to create valid
+   permute statements for the SLP node NODE.  Store the number of vector
+   permute instructions in *N_PERMS and the number of vector load
+   instructions in *N_LOADS.  If DCE_CHAIN is true, remove all definitions
+   that were not needed.  */
+
+bool
+vect_transform_slp_perm_load (vec_info *vinfo,
+ slp_tree node, const vec _chain,
+ gimple_stmt_iterator *gsi, poly_uint64 vf,
+ bool analyze_only, unsigned *n_perms,
+ unsigned int *n_loads, bool dce_chain)
+{
+  return vect_transform_slp_perm_load_1 (vinfo, node,
+SLP_TREE_LOAD_PERMUTATION (node),
+dr_chain, gsi, vf, analyze_only,
+dump_enabled_p (), n_perms, n_loads,
+dce_chain);
+}
+
 /* Produce the next vector result for SLP permutation NODE by 

[PATCH 4/6] Rearrange unbounded_hashmap_traits

2022-08-25 Thread Richard Sandiford via Gcc-patches
int_hash combines two kinds of operation:

(1) hashing and equality of integers
(2) using spare integer encodings to represent empty and deleted slots

(1) is really independent of (2), and could be useful in cases where
no spare integer encodings are available.  This patch adds a base class
(int_hash_base) for (1) and makes int_hash inherit from it.

If we follow a similar style for future hashes, we can make
unbounded_hashmap_traits take the "base" hash for the key
as a template parameter, rather than requiring every type of
key to have a separate derivative of unbounded_hashmap_traits.
A later patch applies this to vector keys.

No functional change intended.
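
To make the split concrete, here is a minimal standalone sketch of the layering described above.  It is not the GCC code: every name ending in _sketch is hypothetical, std::uint32_t stands in for hashval_t, and the handling of empty and deleted slots through the value type is left out.

```cpp
#include <cassert>
#include <cstdint>

// Part (1): hashing and equality only; no integer encodings are reserved.
template <typename Type>
struct int_hash_base_sketch
{
  typedef Type value_type;
  typedef Type compare_type;

  static std::uint32_t hash (value_type x)
  { return static_cast<std::uint32_t> (x); }
  static bool equal (value_type x, compare_type y) { return x == y; }
};

// Part (2): layer the empty/deleted encodings on top of part (1).
template <typename Type, Type Empty, Type Deleted = Empty>
struct int_hash_sketch : int_hash_base_sketch<Type>
{
  static void mark_empty (Type &x) { x = Empty; }
  static void mark_deleted (Type &x)
  { assert (Empty != Deleted); x = Deleted; }
  static bool is_empty (Type x) { return x == Empty; }
  static bool is_deleted (Type x)
  { return Empty != Deleted && x == Deleted; }
};

// Map traits that take the "base" hash for the key as a template
// parameter and simply forward hash and equality to it.
template <typename Key, typename Value>
struct unbounded_traits_sketch
{
  typedef typename Key::value_type key_type;

  static std::uint32_t hash (const key_type &k) { return Key::hash (k); }
  static bool equal_keys (const key_type &a,
                          const typename Key::compare_type &b)
  { return Key::equal (a, b); }
};

// The integer case then needs no class of its own, only an alias.
template <typename Key, typename Value>
using unbounded_int_traits_sketch
  = unbounded_traits_sketch<int_hash_base_sketch<Key>, Value>;
```

With this shape, a new kind of key only has to supply a base hash with value_type, compare_type, hash and equal; the map traits themselves stay untouched.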

gcc/
* hash-traits.h (int_hash_base): New struct, split out from...
(int_hash): ...this class, which now inherits from int_hash_base.
* hash-map-traits.h (unbounded_hashmap_traits): Take a template
parameter for the key that provides hash and equality functions.
(unbounded_int_hashmap_traits): Turn into a type alias of
unbounded_hashmap_traits.
---
 gcc/hash-map-traits.h | 74 +++
 gcc/hash-traits.h | 42 ++--
 2 files changed, 65 insertions(+), 51 deletions(-)

diff --git a/gcc/hash-map-traits.h b/gcc/hash-map-traits.h
index fad0c7d52c5..d729d358070 100644
--- a/gcc/hash-map-traits.h
+++ b/gcc/hash-map-traits.h
@@ -105,14 +105,19 @@ struct simple_cache_map_traits: public 
simple_hashmap_traits
   static const bool maybe_mx = false;
 };
 
-/* Implement traits for a hash_map with values of type Value for cases
-   in which the key cannot represent empty and deleted slots.  Instead
-   record empty and deleted entries in Value.  Derived classes must
-   implement the hash and equal_keys functions.  */
+/* Implement traits for a hash_map with keys of type Key and values of
+   type Value for cases in which the key cannot represent empty and
+   deleted slots.  Instead record empty and deleted entries in Value.  */
 
-template <typename Value>
+template <typename Key, typename Value>
 struct unbounded_hashmap_traits
 {
+  typedef typename Key::value_type key_type;
+
+  static hashval_t hash (const typename Key::value_type &);
+  static bool equal_keys (const typename Key::value_type &,
+ const typename Key::compare_type &);
+
   template <typename T> static inline void remove (T &);
   static const bool empty_zero_p = default_hash_traits <Value>::empty_zero_p;
   template <typename T> static inline bool is_empty (const T &);
@@ -121,42 +126,59 @@ struct unbounded_hashmap_traits
   template <typename T> static inline void mark_deleted (T &);
 };
 
-template <typename Value>
+template <typename Key, typename Value>
+inline hashval_t
+unbounded_hashmap_traits <Key, Value>
+::hash (const typename Key::value_type &key)
+{
+  return Key::hash (key);
+}
+
+template <typename Key, typename Value>
+inline bool
+unbounded_hashmap_traits <Key, Value>
+::equal_keys (const typename Key::value_type &x,
+ const typename Key::compare_type &y)
+{
+  return Key::equal (x, y);
+}
+
+template <typename Key, typename Value>
 template <typename T>
 inline void
-unbounded_hashmap_traits <Value>::remove (T &entry)
+unbounded_hashmap_traits <Key, Value>::remove (T &entry)
 {
   default_hash_traits <Value>::remove (entry.m_value);
 }
 
-template <typename Value>
+template <typename Key, typename Value>
 template <typename T>
 inline bool
-unbounded_hashmap_traits <Value>::is_empty (const T &entry)
+unbounded_hashmap_traits <Key, Value>::is_empty (const T &entry)
 {
   return default_hash_traits <Value>::is_empty (entry.m_value);
 }
 
-template <typename Value>
+template <typename Key, typename Value>
 template <typename T>
 inline bool
-unbounded_hashmap_traits <Value>::is_deleted (const T &entry)
+unbounded_hashmap_traits <Key, Value>::is_deleted (const T &entry)
 {
   return default_hash_traits <Value>::is_deleted (entry.m_value);
 }
 
-template <typename Value>
+template <typename Key, typename Value>
 template <typename T>
 inline void
-unbounded_hashmap_traits <Value>::mark_empty (T &entry)
+unbounded_hashmap_traits <Key, Value>::mark_empty (T &entry)
 {
   default_hash_traits <Value>::mark_empty (entry.m_value);
 }
 
-template <typename Value>
+template <typename Key, typename Value>
 template <typename T>
 inline void
-unbounded_hashmap_traits <Value>::mark_deleted (T &entry)
+unbounded_hashmap_traits <Key, Value>::mark_deleted (T &entry)
 {
   default_hash_traits <Value>::mark_deleted (entry.m_value);
 }
@@ -166,25 +188,7 @@ unbounded_hashmap_traits ::mark_deleted (T )
slots.  */
 
 template <typename Key, typename Value>
-struct unbounded_int_hashmap_traits : unbounded_hashmap_traits <Value>
-{
-  typedef Key key_type;
-  static inline hashval_t hash (Key);
-  static inline bool equal_keys (Key, Key);
-};
-
-template <typename Key, typename Value>
-inline hashval_t
-unbounded_int_hashmap_traits <Key, Value>::hash (Key k)
-{
-  return k;
-}
-
-template <typename Key, typename Value>
-inline bool
-unbounded_int_hashmap_traits <Key, Value>::equal_keys (Key k1, Key k2)
-{
-  return k1 == k2;
-}
+using unbounded_int_hashmap_traits
+  = unbounded_hashmap_traits <int_hash_base <Key>, Value>;
 
 #endif // HASH_MAP_TRAITS_H
diff --git a/gcc/hash-traits.h b/gcc/hash-traits.h
index bef0bd42d04..55b81eb0f9e 100644
--- a/gcc/hash-traits.h
+++ b/gcc/hash-traits.h
@@ -85,41 +85,51 @@ typed_noop_remove ::remove (Type &)
 {
 }
 
+/* Base traits for integer type Type, providing just the hash and
+   comparison functionality.  */
 
-/* Hasher for integer type Type in which Empty is a spare value that can be
-   used to mark empty slots.  If Deleted != Empty then Deleted is another
-   spare value that can be used for deleted 

[PATCH 3/6] Make graphds_scc pass the node order back to callers

2022-08-25 Thread Richard Sandiford via Gcc-patches
As a side-effect, graphds_scc constructs a vector in which all
nodes in an SCC are listed consecutively.  This can be useful
information, so this patch adds an optional pass-back parameter
for it.  The interface is similar to the one for graphds_dfs.
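
As an illustration of the ordering being passed back, here is a standalone sketch using Kosaraju's algorithm.  It is not the graphds implementation: graph_sketch, scc_sketch and the helper names are hypothetical, and the recursion is kept naive for brevity.  The point is only that the second DFS naturally visits one SCC at a time, so recording the visit order gives the grouping for free.

```cpp
#include <cassert>
#include <vector>

struct graph_sketch
{
  int n;
  std::vector<std::vector<int>> succ, pred;

  explicit graph_sketch (int nodes) : n (nodes), succ (nodes), pred (nodes) {}
  void add_edge (int from, int to)
  {
    succ[from].push_back (to);
    pred[to].push_back (from);
  }
};

// First pass: record the postorder of a forward DFS.
static void
record_postorder (const graph_sketch &g, int v, std::vector<char> &seen,
                  std::vector<int> &postorder)
{
  seen[v] = 1;
  for (int w : g.succ[v])
    if (!seen[w])
      record_postorder (g, w, seen, postorder);
  postorder.push_back (v);
}

// Second pass: walk predecessor edges; everything reached that has no
// component yet belongs to the same SCC as V.
static void
collect_component (const graph_sketch &g, int v, int comp,
                   std::vector<int> &component,
                   std::vector<int> *scc_grouping)
{
  component[v] = comp;
  if (scc_grouping)
    scc_grouping->push_back (v);
  for (int w : g.pred[v])
    if (component[w] < 0)
      collect_component (g, w, comp, component, scc_grouping);
}

// Return the number of SCCs and store each node's SCC number in COMPONENT.
// If SCC_GROUPING is nonnull, append the nodes to it SCC by SCC, with
// predecessor SCCs before their successors.
int
scc_sketch (const graph_sketch &g, std::vector<int> &component,
            std::vector<int> *scc_grouping = nullptr)
{
  std::vector<char> seen (g.n, 0);
  std::vector<int> postorder;
  for (int v = 0; v < g.n; ++v)
    if (!seen[v])
      record_postorder (g, v, seen, postorder);

  component.assign (g.n, -1);
  int ncomp = 0;
  for (int i = g.n - 1; i >= 0; --i)
    if (component[postorder[i]] < 0)
      collect_component (g, postorder[i], ncomp++, component, scc_grouping);
  return ncomp;
}
```

Callers that do not care about the order can leave the last argument at its default, which mirrors the optional nature of the new parameter.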

gcc/
* graphds.cc (graphds_scc): Add a pass-back parameter for the
final node order.
---
 gcc/graphds.cc | 13 ++---
 gcc/graphds.h  |  3 ++-
 2 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/gcc/graphds.cc b/gcc/graphds.cc
index 91a2ca5c225..2a108fd475f 100644
--- a/gcc/graphds.cc
+++ b/gcc/graphds.cc
@@ -281,7 +281,14 @@ graphds_dfs (struct graph *g, int *qs, int nq, vec 
*qt,
numbers assigned by the previous pass.  If SUBGRAPH is not NULL, it
specifies the subgraph of G whose strongly connected components we want
to determine.  If SKIP_EDGE_P is not NULL, it points to a callback function.
-   Edge E will be skipped if callback function returns true.
+   Edge E will be skipped if callback function returns true.  If SCC_GROUPING
+   is not null, the nodes will be added to it in the following order:
+
+   - If SCC A is a direct or indirect predecessor of SCC B in the SCC dag,
+ A's nodes come before B's nodes.
+
+   - All of an SCC's nodes are listed consecutively, although the order
+ of the nodes within an SCC is not really meaningful.
 
After running this function, v->component is the number of the strongly
connected component for each vertex of G.  Returns the number of the
@@ -289,7 +296,7 @@ graphds_dfs (struct graph *g, int *qs, int nq, vec *qt,
 
 int
 graphds_scc (struct graph *g, bitmap subgraph,
-skip_edge_callback skip_edge_p)
+skip_edge_callback skip_edge_p, vec<int> *scc_grouping)
 {
   int *queue = XNEWVEC (int, g->n_vertices);
   vec<int> postorder = vNULL;
@@ -317,7 +324,7 @@ graphds_scc (struct graph *g, bitmap subgraph,
 
   for (i = 0; i < nq; i++)
 queue[i] = postorder[nq - i - 1];
-  comp = graphds_dfs (g, queue, nq, NULL, true, subgraph, skip_edge_p);
+  comp = graphds_dfs (g, queue, nq, scc_grouping, true, subgraph, skip_edge_p);
 
   free (queue);
   postorder.release ();
diff --git a/gcc/graphds.h b/gcc/graphds.h
index c54d8767fa7..e0e4d802cbb 100644
--- a/gcc/graphds.h
+++ b/gcc/graphds.h
@@ -58,7 +58,8 @@ void identify_vertices (struct graph *, int, int);
 typedef bool (*skip_edge_callback) (struct graph_edge *);
 int graphds_dfs (struct graph *, int *, int,
 vec<int> *, bool, bitmap, skip_edge_callback = NULL);
-int graphds_scc (struct graph *, bitmap, skip_edge_callback = NULL);
+int graphds_scc (struct graph *, bitmap, skip_edge_callback = NULL,
+vec<int> * = NULL);
 void graphds_domtree (struct graph *, int, int *, int *, int *);
 typedef void (*graphds_edge_callback) (struct graph *,
   struct graph_edge *, void *);
-- 
2.25.1



[PATCH 1/6] Split code out of vectorizable_slp_permutation

2022-08-25 Thread Richard Sandiford via Gcc-patches
A later patch needs to test whether the target supports a
lane_permutation_t without having to construct a full SLP
node to test that.  This patch splits out most of the work
of vectorizable_slp_permutation into a subroutine, so that
properties of the permutation can be passed explicitly without
disturbing the main interface.

The new subroutine still uses an slp_tree argument to get things
like the number of lanes and the vector type.  That's a bit clunky,
but it seemed like the least worst option.

gcc/
* tree-vect-slp.cc (vectorizable_slp_permutation_1): Split out from...
(vectorizable_slp_permutation): ...here.
---
 gcc/tree-vect-slp.cc | 98 +---
 1 file changed, 66 insertions(+), 32 deletions(-)

diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index dab5daddcc5..13c242e5012 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -6976,20 +6976,22 @@ vect_add_slp_permutation (vec_info *vinfo, 
gimple_stmt_iterator *gsi,
   SLP_TREE_VEC_STMTS (node).quick_push (perm_stmt);
 }
 
-/* Vectorize the SLP permutations in NODE as specified
-   in SLP_TREE_LANE_PERMUTATION which is a vector of pairs of SLP
-   child number and lane number.
-   Interleaving of two two-lane two-child SLP subtrees (not supported):
- [ { 0, 0 }, { 1, 0 }, { 0, 1 }, { 1, 1 } ]
-   A blend of two four-lane two-child SLP subtrees:
- [ { 0, 0 }, { 1, 1 }, { 0, 2 }, { 1, 3 } ]
-   Highpart of a four-lane one-child SLP subtree (not supported):
- [ { 0, 2 }, { 0, 3 } ]
-   Where currently only a subset is supported by code generating below.  */
+/* Subroutine of vectorizable_slp_permutation.  Check whether the target
+   can perform permutation PERM on the (1 or 2) input nodes in CHILDREN.
+   If GSI is nonnull, emit the permutation there.
 
-static bool
-vectorizable_slp_permutation (vec_info *vinfo, gimple_stmt_iterator *gsi,
- slp_tree node, stmt_vector_for_cost *cost_vec)
+   When GSI is null, the only purpose of NODE is to give properties
+   of the result, such as the vector type and number of SLP lanes.
+   The node does not need to be a VEC_PERM_EXPR.
+
+   If the target supports the operation, return the number of individual
+   VEC_PERM_EXPRs needed, otherwise return -1.  Print information to the
+   dump file if DUMP_P is true.  */
+
+static int
+vectorizable_slp_permutation_1 (vec_info *vinfo, gimple_stmt_iterator *gsi,
+   slp_tree node, lane_permutation_t &perm,
+   vec<slp_tree> &children, bool dump_p)
 {
   tree vectype = SLP_TREE_VECTYPE (node);
 
@@ -7001,7 +7003,7 @@ vectorizable_slp_permutation (vec_info *vinfo, 
gimple_stmt_iterator *gsi,
   poly_uint64 nunits = TYPE_VECTOR_SUBPARTS (vectype);
   bool repeating_p = multiple_p (nunits, SLP_TREE_LANES (node));
   tree op_vectype = NULL_TREE;
-  FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
+  FOR_EACH_VEC_ELT (children, i, child)
 if (SLP_TREE_VECTYPE (child))
   {
op_vectype = SLP_TREE_VECTYPE (child);
@@ -7009,25 +7011,24 @@ vectorizable_slp_permutation (vec_info *vinfo, 
gimple_stmt_iterator *gsi,
   }
   if (!op_vectype)
 op_vectype = vectype;
-  FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
+  FOR_EACH_VEC_ELT (children, i, child)
 {
   if ((SLP_TREE_DEF_TYPE (child) != vect_internal_def
   && !vect_maybe_update_slp_op_vectype (child, op_vectype))
  || !types_compatible_p (SLP_TREE_VECTYPE (child), op_vectype)
  || !types_compatible_p (TREE_TYPE (vectype), TREE_TYPE (op_vectype)))
{
- if (dump_enabled_p ())
+ if (dump_p)
dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
 "Unsupported vector types in lane permutation\n");
- return false;
+ return -1;
}
   if (SLP_TREE_LANES (child) != SLP_TREE_LANES (node))
repeating_p = false;
 }
 
-  vec >  = SLP_TREE_LANE_PERMUTATION (node);
   gcc_assert (perm.length () == SLP_TREE_LANES (node));
-  if (dump_enabled_p ())
+  if (dump_p)
 {
   dump_printf_loc (MSG_NOTE, vect_location,
   "vectorizing permutation");
@@ -7076,11 +7077,11 @@ vectorizable_slp_permutation (vec_info *vinfo, 
gimple_stmt_iterator *gsi,
   /* Calculate every element of every permute mask vector explicitly,
 instead of relying on the pattern described above.  */
   if (!nunits.is_constant ())
-   return false;
+   return -1;
   nelts_per_pattern = ncopies = 1;
   if (loop_vec_info linfo = dyn_cast <loop_vec_info> (vinfo))
if (!LOOP_VINFO_VECT_FACTOR (linfo).is_constant ())
- return false;
+ return -1;
   noutputs_per_mask = 1;
 }
   unsigned olanes = ncopies * SLP_TREE_LANES (node);
@@ -7093,13 +7094,13 @@ vectorizable_slp_permutation (vec_info *vinfo, 
gimple_stmt_iterator *gsi,
   auto_vec, unsigned> > vperm;
   auto_vec active_lane;
   vperm.create 

[PATCH 0/6] Optimise placement of SLP permutations

2022-08-25 Thread Richard Sandiford via Gcc-patches
This series is a follow-up from the RFC that I posted a while
back about optimising the placement of SLP permutations.
The main comment is in the final patch.

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  If the series
is OK, I'll test on powerpc64le-linux-gnu too before committing.

Richard


[PATCH] vect: Tighten get_related_vectype_for_scalar_type

2022-08-25 Thread Richard Sandiford via Gcc-patches
Builds of glibc with SVE enabled have been failing since V1DI was added
to the aarch64 port.  The problem is that BB SLP starts the (hopeless)
attempt to use variable-length modes to vectorise a single-element
vector, and that now gets further than it did before.

Initially we tried getting a vector mode with 1 + 1X DI elements
(i.e. 1 DI per 128-bit vector chunk).  We don't provide such a mode --
it would be VNx1DI -- because it isn't a native SVE format.  We then
try just 1 DI, which previously failed but now succeeds.

There are numerous ways we could fix this.  Perhaps the most obvious
would be to skip variable-length modes for BB SLP.  However, I think
that'd just be kicking the can down the road, since eventually we want
to support BB SLP and VLA vectors using predication.

However, if we do use VLA vectors for BB SLP, the vector modes
we use should actually be variable length.  We don't want to use
variable-length vectors for some element types/group sizes and
fixed-length vectors for others, since it would be difficult
to handle the seams.

The same principle applies during loop vectorisation.  We can't
use a mixture of variable-length and fixed-length vectors for
the same loop because the relative unroll/vectorisation factors
would not be constant (compile-time) multiples of each other.

This patch therefore makes get_related_vectype_for_scalar_type
check that the provided number of units is interoperable with
the provided prevailing mode.  The function is generally quite
forgiving -- it does basic things like checking for scalarness
itself rather than expecting callers to do them -- so the new
check feels in keeping with that.
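
To show what the interoperability test amounts to, here is a standalone sketch that models a size as c0 + c1 * X bytes, with X only known at run time.  The names poly_size_sketch, constant_multiple_sketch and interoperable_sketch are hypothetical stand-ins, not the poly_int API, and only two coefficients are handled.

```cpp
#include <cassert>
#include <cstdint>

// A size of the form c0 + c1 * X bytes, where X >= 0 is a run-time value.
struct poly_size_sketch
{
  std::uint64_t c0;
  std::uint64_t c1;
};

// Return true if A == K * B for some compile-time constant K and every X.
bool
constant_multiple_sketch (poly_size_sketch a, poly_size_sketch b)
{
  if (b.c0 != 0)
    {
      if (a.c0 % b.c0 != 0)
        return false;
      std::uint64_t k = a.c0 / b.c0;
      return a.c1 == k * b.c1;
    }
  if (b.c1 != 0)
    return a.c0 == 0 && a.c1 % b.c1 == 0;
  // B is zero, so only zero is a multiple of it.
  return a.c0 == 0 && a.c1 == 0;
}

// Two sizes interoperate if either is a constant multiple of the other,
// so that the number of vectors per operation is a compile-time constant.
bool
interoperable_sketch (poly_size_sketch requested, poly_size_sketch prevailing)
{
  return (constant_multiple_sketch (requested, prevailing)
          || constant_multiple_sketch (prevailing, requested));
}
```

Under this model a single 8-byte DI ({8, 0}) does not interoperate with a 16 + 16X byte SVE vector ({16, 16}), which is the case the patch rejects, whereas it does interoperate with a fixed 16-byte vector ({16, 0}).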

This seems to subsume the fix for PR96974.  I'm not sure it's
worth reverting that code to an assert though, so the patch just
drops the scan for the associated message.

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Richard


gcc/
* tree-vect-stmts.cc (get_related_vectype_for_scalar_type): Check
that the requested number of units is interoperable with the requested
prevailing mode.

gcc/testsuite/
* gcc.target/aarch64/sve/slp_15.c: New test.
* g++.target/aarch64/sve/pr96974.C: Remove scan test.
---
 gcc/testsuite/g++.target/aarch64/sve/pr96974.C |  4 +---
 gcc/testsuite/gcc.target/aarch64/sve/slp_15.c  | 17 +
 gcc/tree-vect-stmts.cc | 10 ++
 3 files changed, 28 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/slp_15.c

diff --git a/gcc/testsuite/g++.target/aarch64/sve/pr96974.C 
b/gcc/testsuite/g++.target/aarch64/sve/pr96974.C
index 54000f568ab..2f6ebd6ce3d 100644
--- a/gcc/testsuite/g++.target/aarch64/sve/pr96974.C
+++ b/gcc/testsuite/g++.target/aarch64/sve/pr96974.C
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-Ofast -march=armv8.2-a+sve -fdisable-tree-fre4 
-fdump-tree-slp-details" } */
+/* { dg-options "-Ofast -march=armv8.2-a+sve -fdisable-tree-fre4" } */
 
 float a;
 int
@@ -14,5 +14,3 @@ struct c {
 }
 int coeffs[10];
 } f;
-
-/* { dg-final { scan-tree-dump "Not vectorized: Incompatible number of vector 
subparts between" "slp1" { target lp64 } } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/slp_15.c 
b/gcc/testsuite/gcc.target/aarch64/sve/slp_15.c
new file mode 100644
index 000..23f6d567cc5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/slp_15.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O3" } */
+
+struct foo
+{
+  void *handle;
+  void *arg;
+};
+
+void
+dlinfo_doit (struct foo *args)
+{
+  __UINTPTR_TYPE__ **l = args->handle;
+
+  *(__UINTPTR_TYPE__ *) args->arg = 0;
+  *(__UINTPTR_TYPE__ *) args->arg = **l;
+}
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index c9dab217f05..7748c42c70f 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -11486,6 +11486,16 @@ get_related_vectype_for_scalar_type (machine_mode 
prevailing_mode,
 
   unsigned int nbytes = GET_MODE_SIZE (inner_mode);
 
+  /* Interoperability between modes requires one to be a constant multiple
+ of the other, so that the number of vectors required for each operation
+ is a compile-time constant.  */
+  if (prevailing_mode != VOIDmode
+  && !constant_multiple_p (nunits * nbytes,
+  GET_MODE_SIZE (prevailing_mode))
+  && !constant_multiple_p (GET_MODE_SIZE (prevailing_mode),
+  nunits * nbytes))
+return NULL_TREE;
+
   /* For vector types of elements whose mode precision doesn't
  match their types precision we use a element type of mode
  precision.  The vectorization routines will have to make sure
-- 
2.25.1



Re: [PATCH] cr16: remove obsoleted port

2022-08-25 Thread Jan-Benedict Glaw
On Thu, 2022-08-25 14:50:12 +0200, Martin Liška  wrote:
> On 8/25/22 14:48, Jan-Benedict Glaw wrote:
> > On Thu, 2022-08-25 10:38:30 +0200, Martin Liška  wrote:
> >> Remove the port that has been marked obsolete in GCC 12 change notes.
> > 
> >> contrib/ChangeLog:
> >>
> >>* config-list.mk: Remove cr16.
> > 
> > Thanks for not forgetting this part!  I hope the scripts will properly
> > pick up the cr16 removal. We'll find out! :)
> 
> Which scripts do you mean?

My laminar job scripts, they directly use
`make -f contrib/config-list.mk show` output.

> May I install the patch?

LGTM, but I'm a write-after-approval committer, not a reviewer or
global maintainer.

MfG, JBG

-- 


signature.asc
Description: PGP signature


Re: Rust frontend patches v2

2022-08-25 Thread Frank Ch. Eigler via Gcc-patches
Hi -


> 12K  0004-gccrs-Add-link-cases-testsuite.patch
> 356K 0005-gccrs-Add-general-compilation-test-cases.patch
> 132K 0006-gccrs-Add-execution-test-cases.patch
> 4.0K 0007-gccrs-Add-gcc-check-target-check-rust.patch
> 656K 0008-gccrs-Add-the-Rust-front-end-AST-data-structures.patch
> 112K 0009-gccrs-Add-Lexer-for-Rust-front-end.patch
> 504K 0010-gccrs-Add-Parser-for-Rust-front-end.patch
> 200K 0011-gccrs-Add-expansion-pass-for-the-Rust-front-end.patch
> 204K 0012-gccrs-Add-name-resolution-pass-to-the-Rust-front-end.patch
> 476K 0013-gccrs-Add-second-intermedite-representation-called-H.patch
> [...]

Just curious whether a human reviewer is expected to read through this
much content?  If not, and if this structure is only for machine /
bisecting purposes, maybe they're not worth also emailing.

- FChE


Re: [PATCH] cr16: remove obsoleted port

2022-08-25 Thread Martin Liška
On 8/25/22 14:48, Jan-Benedict Glaw wrote:
> On Thu, 2022-08-25 10:38:30 +0200, Martin Liška  wrote:
>> Remove the port that has been marked obsolete in GCC 12 change notes.
> 
>> contrib/ChangeLog:
>>
>>  * config-list.mk: Remove cr16.
> 
> Thanks for not forgetting this part!  I hope the scripts will properly
> pick up the cr16 removal. We'll find out! :)

Which scripts do you mean?

May I install the patch?

Thanks,
Martin

> 
> MfG, JBG
> 



Re: [PATCH] cr16: remove obsoleted port

2022-08-25 Thread Jan-Benedict Glaw
On Thu, 2022-08-25 10:38:30 +0200, Martin Liška  wrote:
> Remove the port that has been marked obsolete in GCC 12 change notes.

> contrib/ChangeLog:
> 
>   * config-list.mk: Remove cr16.

Thanks for not forgetting this part!  I hope the scripts will properly
pick up the cr16 removal. We'll find out! :)

MfG, JBG

-- 


signature.asc
Description: PGP signature


Re: [PATCH V4] rs6000: Optimize cmp on rotated 16bits constant

2022-08-25 Thread Segher Boessenkool
Hi!

On Thu, Aug 25, 2022 at 08:11:31PM +0800, Jiufu Guo wrote:
> Segher Boessenkool  writes:
> > You usually can split fine if you cannot create new pseudos, by reusing
> > existing registers.
> >
> > FAIL will cause an ICE: the RTL instruction does match, but will fail
> > when trying to generate machine code for it.
> >
> The previous patch uses "gen_reg_rtx (DImode)" to generate a pseudo for
> the rotated result, to prevent the original one from being changed
> accidentally.  So an 'assert (can_create_pseudo_p ())' would catch it
> after RA.

It sounds like you want a define_split, not a define_insn_and_split.
That is much more stomachable anyway.

Anything that creates conditional branches together with compare insns
belongs before RA, before sched1 even.

> To make this splitter work after RA, we may need to reserve one
> register (a clobber would be OK), such as below:
> 
>   [(set (pc)
>   (if_then_else (eqne (match_operand:DI 1 "gpc_reg_operand" "r")
>   (match_operand:DI 2 "const_int_operand" "n"))
> (label_ref (match_operand 0 ""))
> (pc)))
>   (clobber (match_scratch:DI 3 "=r"))
>   (clobber (match_scratch:CCUNS 4 "=y"))]

Yes, that is one way to do it.  Another way is to reuse operand 1.  A
clobber is probably better in this case though :-)

If this is only so combine can match things, you certainly want just a
define_split, and the compare+branch in one pattern is not as bad then.


Segher


Re: [PATCH] regenerate configure files and config.h.in files

2022-08-25 Thread Martin Liška
On 8/25/22 14:21, Richard Sandiford wrote:
> Do you know what's going on with these ifndef guards?  How were they
> there before?

They have been there since

commit 5213506e247206177045f832a0911db0118a8641
Author: Ralf Wildenhues 
Date:   Sat Aug 22 13:22:20 2009 +

Regenerate tree using Autoconf 2.64 and Automake 1.11.

So it's very likely dependent on Autoconf version.

Cheers,
Martin


Re: [PATCH] regenerate configure files and config.h.in files

2022-08-25 Thread Richard Sandiford via Gcc-patches
Martin Liška  writes:
> Hi.
>
> I wrote a script that runs autoconf in all folders that have configure.ac
> file and same for autoheader (where AC_CONFIG_HEADERS is present) and
> this is the output.
>
> The script can be seen here:
> https://github.com/marxin/script-misc/blob/master/gcc-autoconf-all.py
>
> I'm going to add the script to my daily Buildbot tester.
>
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>
> Ready to be installed?
> Thanks,
> Martin
>
> fixincludes/ChangeLog:
>
>   * config.h.in: Regenerate.
>   * configure: Regenerate.
>
> libada/ChangeLog:
>
>   * configure: Regenerate.
>
> libiberty/ChangeLog:
>
>   * configure: Regenerate.
>
> libobjc/ChangeLog:
>
>   * configure: Regenerate.
>
> liboffloadmic/ChangeLog:
>
>   * configure: Regenerate.
>   * plugin/configure: Regenerate.
>
> libquadmath/ChangeLog:
>
>   * configure: Regenerate.
>
> libssp/ChangeLog:
>
>   * configure: Regenerate.
>
> libvtv/ChangeLog:
>
>   * configure: Regenerate.
>
> zlib/ChangeLog:
>
>   * configure: Regenerate.
> ---
>  fixincludes/config.h.in| 204 -
>  fixincludes/configure  |   2 +-
>  libada/configure   |   3 +
>  libiberty/configure|   3 +
>  libobjc/configure  |   6 +-
>  liboffloadmic/configure|  46 +++-
>  liboffloadmic/plugin/configure |  46 +++-
>  libquadmath/configure  |   6 +-
>  libssp/configure   |   6 +-
>  libvtv/configure   |  18 +--
>  zlib/configure |   6 +-
>  11 files changed, 61 insertions(+), 285 deletions(-)
>
> diff --git a/fixincludes/config.h.in b/fixincludes/config.h.in
> index 3f6cf1e574e..69a67f5f116 100644
> --- a/fixincludes/config.h.in
> +++ b/fixincludes/config.h.in
> @@ -1,397 +1,211 @@
>  /* config.h.in.  Generated from configure.ac by autoheader.  */
>  
>  /* Defined to the executable file extension on the host system */
> -#ifndef USED_FOR_TARGET
>  #undef EXE_EXT
> -#endif
> -
>  
>  /* Define to 1 if you have the `clearerr_unlocked' function. */
> -#ifndef USED_FOR_TARGET
>  #undef HAVE_CLEARERR_UNLOCKED
> -#endif
> -

Do you know what's going on with these ifndef guards?  How were they
there before?

LGTM otherwise.

Thanks,
Richard

>  /* Define to 1 if you have the declaration of `abort', and to 0 if you don't.
> */
> -#ifndef USED_FOR_TARGET
>  #undef HAVE_DECL_ABORT
> -#endif
> -
>  
>  /* Define to 1 if you have the declaration of `asprintf', and to 0 if you
> don't. */
> -#ifndef USED_FOR_TARGET
>  #undef HAVE_DECL_ASPRINTF
> -#endif
> -
>  
>  /* Define to 1 if you have the declaration of `basename(char *)', and to 0 if
> you don't. */
> -#ifndef USED_FOR_TARGET
>  #undef HAVE_DECL_BASENAME
> -#endif
> -
>  
>  /* Define to 1 if you have the declaration of `clearerr_unlocked', and to 0 
> if
> you don't. */
> -#ifndef USED_FOR_TARGET
>  #undef HAVE_DECL_CLEARERR_UNLOCKED
> -#endif
> -
>  
>  /* Define to 1 if you have the declaration of `errno', and to 0 if you don't.
> */
> -#ifndef USED_FOR_TARGET
>  #undef HAVE_DECL_ERRNO
> -#endif
> -
>  
>  /* Define to 1 if you have the declaration of `feof_unlocked', and to 0 if 
> you
> don't. */
> -#ifndef USED_FOR_TARGET
>  #undef HAVE_DECL_FEOF_UNLOCKED
> -#endif
> -
>  
>  /* Define to 1 if you have the declaration of `ferror_unlocked', and to 0 if
> you don't. */
> -#ifndef USED_FOR_TARGET
>  #undef HAVE_DECL_FERROR_UNLOCKED
> -#endif
> -
>  
>  /* Define to 1 if you have the declaration of `fflush_unlocked', and to 0 if
> you don't. */
> -#ifndef USED_FOR_TARGET
>  #undef HAVE_DECL_FFLUSH_UNLOCKED
> -#endif
> -
>  
>  /* Define to 1 if you have the declaration of `fgetc_unlocked', and to 0 if
> you don't. */
> -#ifndef USED_FOR_TARGET
>  #undef HAVE_DECL_FGETC_UNLOCKED
> -#endif
> -
>  
>  /* Define to 1 if you have the declaration of `fgets_unlocked', and to 0 if
> you don't. */
> -#ifndef USED_FOR_TARGET
>  #undef HAVE_DECL_FGETS_UNLOCKED
> -#endif
> -
>  
>  /* Define to 1 if you have the declaration of `fileno_unlocked', and to 0 if
> you don't. */
> -#ifndef USED_FOR_TARGET
>  #undef HAVE_DECL_FILENO_UNLOCKED
> -#endif
> -
>  
>  /* Define to 1 if you have the declaration of `fprintf_unlocked', and to 0 if
> you don't. */
> -#ifndef USED_FOR_TARGET
>  #undef HAVE_DECL_FPRINTF_UNLOCKED
> -#endif
> -
>  
>  /* Define to 1 if you have the declaration of `fputc_unlocked', and to 0 if
> you don't. */
> -#ifndef USED_FOR_TARGET
>  #undef HAVE_DECL_FPUTC_UNLOCKED
> -#endif
> -
>  
>  /* Define to 1 if you have the declaration of `fputs_unlocked', and to 0 if
> you don't. */
> -#ifndef USED_FOR_TARGET
>  #undef HAVE_DECL_FPUTS_UNLOCKED
> -#endif
> -
>  
>  /* Define to 1 if you have the declaration of `fread_unlocked', and to 0 if
> you don't. */
> -#ifndef USED_FOR_TARGET
>  #undef HAVE_DECL_FREAD_UNLOCKED
> -#endif
> -
>  
>  /* Define to 1 if 

Re: [PATCH V4] rs6000: Optimize cmp on rotated 16bits constant

2022-08-25 Thread Jiufu Guo via Gcc-patches
Hi,

Segher Boessenkool  writes:

> On Wed, Aug 24, 2022 at 03:48:49PM +0800, Jiufu Guo wrote:
>> Segher Boessenkool  writes:
>> >> +  "TARGET_POWERPC64 && !reload_completed && can_create_pseudo_p ()
>> >
>> > reload_completed in splitters is almost always wrong.  It isn't any
>> > better if it is in the insn condition of a define_insn_and_split :-)
>> >
>> Thanks, 'can_create_pseudo_p' would be ok for this patch.
>> Or just FAIL, if !can_create_pseudo_p()?
>
> You usually can split fine if you cannot create new pseudos, by reusing
> existing registers.
>
> FAIL will cause an ICE: the RTL instruction does match, but will fail
> when trying to generate machine code for it.
>
The previous patch uses "gen_reg_rtx (DImode)" to generate a pseudo for
the rotated result, to prevent the original one from being changed
accidentally.  So an 'assert (can_create_pseudo_p ())' would catch it
after RA.

To make this splitter work after RA, we may need to reserve one
register (a clobber would be OK), such as below:

  [(set (pc)
(if_then_else (eqne (match_operand:DI 1 "gpc_reg_operand" "r")
(match_operand:DI 2 "const_int_operand" "n"))
  (label_ref (match_operand 0 ""))
  (pc)))
  (clobber (match_scratch:DI 3 "=r"))
  (clobber (match_scratch:CCUNS 4 "=y"))]
  "TARGET_POWERPC64 && num_insns_constant (operands[2], DImode) > 1
   && compare_rotate_immediate_p (UINTVAL (operands[2]))"
   "#"
   "&& 1"
  [(pc)]

>> >> +   && num_insns_constant (operands[2], DImode) > 1
>> >> +   && (rotate_from_leading_zeros_const (~UINTVAL (operands[2]), 49) > 0
>> >> +   || rotate_from_leading_zeros_const (UINTVAL (operands[2]), 48) > 0)"
>> > There must be a better way to describe this.
>> Will update this. I'm thinking of replacing this with a meaningful function,
>> maybe 'compare_rotate_immediate_p'.
>
> Thanks!
>
>> > Why is this doing a conditional branch at all?  Unpredictable
>> > conditional branches are extremely costly.
>> This optimization needs to check whether the comparison code is ne/eq or
>> not.  To get the comparison code, we need to check the parent insn of
>> the 'cmp' insn.  This is why conditional branch patterns are used here.
>> 
>> This patch should not change the information (about prediction) of the
>> branch insn. I'm thinking of updating the patch to keep the 'note info
>> REG_BR_PROB' for the branch instruction.
>
> Ah, good.  Explain a bit about that?  In a code comment or in the commit
> message, whichever works best here.
>
Thanks! I will add a comment for this.

BR,
Jeff(Jiufu)

> Thanks!
>
>
> Segher


Re: [PATCH v2 2/2 resend] LoongArch: add model attribute

2022-08-25 Thread Lulu Cheng



On 2022/8/25 at 7:41 PM, Xi Ruoyao wrote:
> On Thu, 2022-08-25 at 17:12 +0800, Xi Ruoyao via Gcc-patches wrote:
>> On Thu, 2022-08-25 at 16:53 +0800, Lulu Cheng wrote:
>>> I think this should add a sentence:
>>>
>>> "Currently, the identifier name can only be one of small or
>>> extreme."
>>
>> I'll add this line and spend some time trying to see how to reliably
>> reproduce the ICE mentioned in 1/2 before committing.
>
> Pushed r13-2199, with this added and "RTL flag checking enabled"
> mentioned in 1/2.

OK!

Thanks!:-)



Re: [PATCH v2 2/2 resend] LoongArch: add model attribute

2022-08-25 Thread Xi Ruoyao via Gcc-patches
On Thu, 2022-08-25 at 17:12 +0800, Xi Ruoyao via Gcc-patches wrote:
> On Thu, 2022-08-25 at 16:53 +0800, Lulu Cheng wrote:
> > I think this should add a sentence:
> > 
> > 
> > "Currently, the identifier name can only be one of small or
> > extreme."
> 
> I'll add this line and spend some time trying to see how to reliably
> reproduce the ICE mentioned in 1/2 before committing.

Pushed r13-2199, with this added and "RTL flag checking enabled"
mentioned in 1/2.


-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


Re: Rust frontend patches v2

2022-08-25 Thread Mark Wielaard
Hi Martin,

On Thu, 2022-08-25 at 11:52 +0200, Martin Liška wrote:
> What about limit increase, how much space do we have on sourceware
> infrastructure?

Feel free to increase the limits; there are a couple of hundred GB free
on sourceware and we can add more.

The public-inbox instance at inbox.sourceware.org was set up to
work around some of these mailman limits and make it easier to handle
patch emails. It was only finalized last week, but testing shows it
seems ready for wider use. Let me write an announcement for gcc.

Cheers,

Mark


Re: [PATCH 1/2] LoongArch: Avoid RTL flag check failure in loongarch_classify_symbol

2022-08-25 Thread Xi Ruoyao via Gcc-patches
On Thu, 2022-08-25 at 16:29 +0800, Lulu Cheng wrote:
> 
> On 2022/8/24 at 10:03 PM, Xi Ruoyao wrote:
>  
> > SYMBOL_REF_TLS_MODEL invokes SYMBOL_REF_FLAGS, and SYMBOL_REF_FLAGS
> > invokes RTL_FLAG_CHECK1 and aborts when RTL code is not SYMBOL_REF.
> > 
> > r13-1833 removed "gcc_assert (SYMBOL_REF_P (x))" before invoking
> > "SYMBOL_REF_TLS_MODEL (x)", indicating that it's now possible that
> > "x" is not a SYMBOL_REF.  So we need to check if "x" is SYMBOL_REF
> > first.
> > 
> > This fixes a test failure happening with r13-2173:
> > 
> > pr106096.C:26:1: internal compiler error: RTL flag check:
> > SYMBOL_REF_FLAGS used with unexpected rtx code 'const' in
> > loongarch_classify_symbol
> 
> I think there is no problem with the code modification, but I have not
> detected this ICE.

If ENABLE_RTL_CHECKING is disabled (for example, with
--enable-checking=release, which is also the default for release
tarballs), the check will be skipped.

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University
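
The behaviour described in this message (a checked flag accessor that
aborts on the wrong rtx code, and a release build that skips the check)
can be pictured with a small sketch. This is not GCC code: every name
below is a hypothetical stand-in, the template parameter stands in for
ENABLE_RTL_CHECKING, and the sketch throws an exception where GCC would
report an internal compiler error.

```cpp
#include <stdexcept>

// Hypothetical stand-ins for GCC's rtx codes and symbol types.
enum sketch_code { SK_SYMBOL_REF, SK_CONST };
enum sketch_symbol_type { SK_PCREL, SK_TLS };

struct sketch_rtx
{
  sketch_code code;
  unsigned flags;   // nonzero stands in for "has a TLS model"
};

// Flag accessor.  With Checking == true it refuses any code other than
// SK_SYMBOL_REF (the sketch throws at this point).  With
// Checking == false the test is skipped, as in a release build.
template<bool Checking>
unsigned
sketch_symbol_ref_flags (const sketch_rtx &x)
{
  if (Checking && x.code != SK_SYMBOL_REF)
    throw std::logic_error ("flag accessor used with unexpected rtx code");
  return x.flags;
}

// Guarded classification: test the code before touching the flags, so
// the accessor is never reached for a non-SYMBOL_REF rtx.
template<bool Checking>
sketch_symbol_type
sketch_classify (const sketch_rtx &x)
{
  if (x.code != SK_SYMBOL_REF)
    return SK_PCREL;
  return sketch_symbol_ref_flags<Checking> (x) ? SK_TLS : SK_PCREL;
}
```

With the guard in place, sketch_classify<true> can be called on an
SK_CONST value without tripping the check, which is the shape of the fix
discussed above.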


Re: Rust frontend patches v2

2022-08-25 Thread Philip Herron
Hi Martin

4.0K  0001-Use-DW_ATE_UTF-for-the-Rust-char-type.patch
8.0K  0002-gccrs-Add-nessecary-hooks-for-a-Rust-front-end-tests.patch
8.0K  0003-gccrs-Add-Debug-info-testsuite.patch
12K   0004-gccrs-Add-link-cases-testsuite.patch
356K  0005-gccrs-Add-general-compilation-test-cases.patch
132K  0006-gccrs-Add-execution-test-cases.patch
4.0K  0007-gccrs-Add-gcc-check-target-check-rust.patch
656K  0008-gccrs-Add-the-Rust-front-end-AST-data-structures.patch
112K  0009-gccrs-Add-Lexer-for-Rust-front-end.patch
504K  0010-gccrs-Add-Parser-for-Rust-front-end.patch
200K  0011-gccrs-Add-expansion-pass-for-the-Rust-front-end.patch
204K  0012-gccrs-Add-name-resolution-pass-to-the-Rust-front-end.patch
476K  0013-gccrs-Add-second-intermedite-representation-called-H.patch
212K  0014-gccrs-Add-AST-to-HIR-lowering-pass.patch
4.0K  0015-gccrs-Add-wrapper-for-make_unique.patch
4.0K  0016-gccrs-Add-port-of-FNV-hash-used-during-legacy-symbol.patch
4.0K  0017-gccrs-Add-Rust-ABI-enum-helpers.patch
4.0K  0018-gccrs-Add-Base62-implementation.patch
12K   0019-gccrs-Add-implementation-of-Optional.patch
28K   0020-gccrs-Add-attributes-checker.patch
60K   0021-gccrs-Add-helpers-mappings-canonical-path-and-lang-i.patch
628K  0022-gccrs-Add-type-resolution-and-trait-solving-pass.patch
32K   0023-gccrs-Add-unsafe-checks-for-Rust.patch
28K   0024-gccrs-Add-const-checker.patch
72K   0025-gccrs-Add-privacy-checks.patch
24K   0026-gccrs-Add-dead-code-scan-on-HIR.patch
8.0K  0027-gccrs-Add-unused-variable-scan.patch
76K   0028-gccrs-Add-metadata-ouptput-pass.patch
460K  0029-gccrs-HIR-to-GCC-GENERIC-lowering.patch
36K   0030-gccrs-These-are-wrappers-ported-from-reusing-gccgo.patch
16K   0031-gccrs-Add-GCC-Rust-front-end-Make-lang.in.patch
4.0K  0032-gccrs-Add-config-lang.in.patch
4.0K  0033-gccrs-add-lang-spec.h.patch
8.0K  0034-gccrs-add-lang.opt.patch
8.0K  0035-gccrs-add-compiler-driver.patch
64K   0036-gccrs-compiler-proper-interface-kicks-off-the-pipeli.patch
104K  0037-gccrs-Add-README-CONTRIBUTING-and-compiler-logo.patch

I hope this helps. I can do another pass at splitting up these
patches if that will help.

--Phil

On Thu, 25 Aug 2022 at 10:52, Martin Liška  wrote:
>
> On 8/25/22 11:46, Philip Herron wrote:
> > The missing patches on
> > https://gcc.gnu.org/pipermail/gcc-patches/2022-August/thread.html#600200
> > are 8, 10, 13, 29, and 22.
>
> Hmm, I think our limit is pretty low, sorry for that.
>
> Can you please paste output of du -hs 00*?
>
> What about limit increase, how much space do we have on sourceware 
> infrastructure?
>
> Thanks,
> Martin
>


Patch ping (was Re: [PATCH] libstdc++: Clear padding bits in atomic compare_exchange)

2022-08-25 Thread Jakub Jelinek via Gcc-patches
On Tue, Jan 18, 2022 at 09:48:19PM +, Jonathan Wakely via Gcc-patches wrote:
> On Tue, 2 Nov 2021 at 01:26, Thomas Rodgers  wrote:
> 
> > This should address Jonathan's feedback and adds support for atomic_ref
> >
> 
> 
> >This change implements P0528 which requires that padding bits not
> >participate in atomic compare exchange operations. All arguments to the
> >generic template are 'sanitized' by the __builtin_clearpadding intrisic
> 
> The name of the intrinsic and the word "instrinsic" have typos.

I'd like to ping this patch.
To make some progress, I've tried to incorporate some of Jonathan's
review comments below, but it is incomplete.

ChangeLog + wording above it fixed.

> >
> >   explicit
> >   __atomic_ref(_Tp& __t) : _M_ptr(std::__addressof(__t))
> >-  { __glibcxx_assert(((uintptr_t)_M_ptr % required_alignment) == 0); }
> >+  {
> >+ __glibcxx_assert(((uintptr_t)_M_ptr % required_alignment) == 0);
> >+#if __cplusplus > 201402L && __has_builtin(__builtin_clear_padding)
> >+ __builtin_clear_padding(_M_ptr);
> >+#endif
> >+  }
> 
> Is this safe to do?
> 
> What if multiple threads all create a std::atomic_ref round the same object
> at once, they'll all try to clear padding, and so race, won't they?
> I don't think we can clear padding on atomic_ref construction, only on
> store and RMW operations.

Didn't touch the above.
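
For readers following the padding discussion, here is a minimal sketch
of why padding bits matter to a bytewise compare-exchange. It is not the
libstdc++ implementation: the struct and helper names are hypothetical,
and the fallback branch (used when __builtin_clear_padding is not
available) only works because the layout of this one struct is known.

```cpp
#include <cstddef>
#include <cstring>

#ifndef __has_builtin
# define __has_builtin(x) 0
#endif

// Hypothetical example type: on common ABIs there are padding bytes
// between 'c' and 'i'.
struct padded
{
  char c;
  int i;
};

// Zero the padding of *p.  Uses the GCC builtin when it exists;
// otherwise zeroes the gap between the two members by hand.
inline void
sketch_clear_padding (padded *p)
{
#if __has_builtin(__builtin_clear_padding)
  __builtin_clear_padding (p);
#else
  unsigned char *bytes = reinterpret_cast<unsigned char *> (p);
  const std::size_t gap_begin = offsetof (padded, c) + sizeof (char);
  const std::size_t gap_end = offsetof (padded, i);
  std::memset (bytes + gap_begin, 0, gap_end - gap_begin);
#endif
}

// Bytewise comparison, the kind of comparison a compare-exchange on the
// whole object representation performs.
inline bool
same_bytes (const padded &a, const padded &b)
{
  return std::memcmp (&a, &b, sizeof (padded)) == 0;
}

// Build two objects with equal members but different junk in the
// padding, clear the padding of both, and report whether the object
// representations now match.
inline bool
equal_after_clearing ()
{
  padded a, b;
  std::memset (&a, 0xAA, sizeof a);
  std::memset (&b, 0x55, sizeof b);
  a.c = b.c = 'x';
  a.i = b.i = 42;
  sketch_clear_padding (&a);
  sketch_clear_padding (&b);
  return same_bytes (a, b);
}
```

Without the two sketch_clear_padding calls the comparison could fail
even though both members are equal, which is the situation P0528 is
meant to rule out.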
> 
> 
> >--- a/libstdc++-v3/include/std/atomic
> >+++ b/libstdc++-v3/include/std/atomic

The patch against this file doesn't apply at all.

> >--- /dev/null
> >+++
> b/libstdc++-v3/testsuite/29_atomics/atomic_ref/compare_exchange_padding.cc
> >@@ -0,0 +1,43 @@
> >+// { dg-options "-std=gnu++2a" }
> >+// { dg-do run { target c++2a } }
> 
> This new test is using "2a" not "20".

Fixed thus, but the other testcase wasn't in the patch at all.

Here it is:

libstdc++: Clear padding bits in atomic compare_exchange

This change implements P0528 which requires that padding bits not
participate in atomic compare exchange operations. All arguments to the
generic template are 'sanitized' by the __builtin_clear_padding intrinsic
before they are used in comparisons. This requires that any stores
also sanitize the incoming value.

Signed-off-by: Thomas Rodgers 

libstdc++-v3/ChangeLog:

* include/std/atomic (atomic::atomic(_Tp)): Clear padding for
__cplusplus > 201703L.
(atomic::store()): Clear padding.
(atomic::exchange()): Likewise.
(atomic::compare_exchange_weak()): Likewise.
(atomic::compare_exchange_strong()): Likewise.
* include/bits/atomic_base.h (__atomic_impl::__clear_padding()):
New function.
(__atomic_impl::__maybe_has_padding()): Likewise.
(__atomic_impl::__compare_exchange()): Likewise.
(__atomic_impl::compare_exchange_weak()): Delegate to
__compare_exchange().
(__atomic_impl::compare_exchange_strong()): Likewise.
* testsuite/29_atomics/atomic/compare_exchange_padding.cc: New
test.
* testsuite/29_atomics/atomic_ref/compare_exchange_padding.cc:
Likewise.

--- a/libstdc++-v3/include/bits/atomic_base.h.jj2022-05-16 
09:46:02.361059682 +0200
+++ b/libstdc++-v3/include/bits/atomic_base.h   2022-08-25 12:06:13.758883172 
+0200
@@ -954,6 +954,87 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   /// @endcond
 
+  // Implementation details of atomic padding handling
+  namespace __atomic_impl
+  {
+template<typename _Tp>
+  _GLIBCXX_ALWAYS_INLINE _Tp*
+  __clear_padding(_Tp& __val) noexcept
+  {
+   auto* __ptr = std::__addressof(__val);
+#if __has_builtin(__builtin_clear_padding)
+   __builtin_clear_padding(std::__addressof(__val));
+#endif
+   return __ptr;
+  }
+
+template<typename _Tp>
+  constexpr bool
+  __maybe_has_padding()
+  {
+#if ! __has_builtin(__builtin_clear_padding)
+   return false;
+#elif __has_builtin(__has_unique_object_representations)
+   return !__has_unique_object_representations(_Tp)
+ && !is_floating_point<_Tp>::value;
+#else
+   return true;
+#endif
+  }
+
+template<typename _Tp>
+  _GLIBCXX_ALWAYS_INLINE bool
+  __compare_exchange(_Tp& __val, _Tp& __e, _Tp& __i, bool __weak,
+memory_order __s, memory_order __f) noexcept
+  {
+   __glibcxx_assert(__is_valid_cmpexch_failure_order(__f));
+
+   if _GLIBCXX17_CONSTEXPR (__atomic_impl::__maybe_has_padding<_Tp>())
+ {
+   alignas(_Tp) unsigned char __buf[sizeof(_Tp)];
+   _Tp* __exp = ::new((void*)__buf) _Tp(__e);
+   __exp = __atomic_impl::__clear_padding(*__exp);
+   auto* __des = __atomic_impl::__clear_padding(__i);
+   if (__atomic_compare_exchange(std::__addressof(__val), __exp, 
__des, __weak,
+ int(__s), int(__f)))
+ return true;
+   __builtin_memcpy(std::addressof(__e), __exp, sizeof(_Tp));
+   return false;
+ }
+   else
+ return 

Re: Rust frontend patches v2

2022-08-25 Thread Martin Liška
On 8/25/22 11:46, Philip Herron wrote:
> The missing patches on
> https://gcc.gnu.org/pipermail/gcc-patches/2022-August/thread.html#600200
> are 8, 10, 13, 29, and 22.

Hmm, I think our limit is pretty low, sorry for that.

Can you please paste output of du -hs 00*?

What about limit increase, how much space do we have on sourceware 
infrastructure?

Thanks,
Martin



Re: Rust frontend patches v2

2022-08-25 Thread Philip Herron
Hi everyone

I noticed a few patches still didn't make it through here with the
400kb limit. In the meantime, Mark pointed out to me that they can all
be viewed over here:

https://inbox.sourceware.org/gcc-patches/20220824115956.737931-9-philip.her...@embecosm.com/T/#rbff0bb3df2780fecd9ee3d2baa864d9140d24b54

The missing patches on
https://gcc.gnu.org/pipermail/gcc-patches/2022-August/thread.html#600200
are 8, 10, 13, 29, and 22.

Thanks, let me know if there is anything more I can do to help.

--Phil

On Wed, 24 Aug 2022 at 13:01,  wrote:
>
> This is the 2nd patch set for gccrs. Since v1 we have dropped the changes
> for target hooks, which are not necessary for us right now. This now
> focuses directly on the front-end; the only patch that affects GCC now is a
> tweak to debug info. Note we are close to merging our port of the C++
> constexpr code into our front-end but this patch set does not include this
> yet.
>
> Thanks to Open Source Security, inc and Embecosm for sponsoring this work.
> Special thanks to all of those who have contributed thus far.
>
> See our branch over on 
> https://gcc.gnu.org/git/?p=gcc.git;a=shortlog;h=refs/heads/devel/rust/master
>
> We are currently testing on every commit the following systems:
>
> - Debian i386 - all tests passing
> - Debian testing-x86_64 - all tests passing
> - Fedora arm64 - all tests passing
> - Fedora X86_64 - all tests passing
> - OpenSUSE Leap X86_64 - all tests passing
> - OpenSUSE tw X86_64 - all tests passing
> - Rawhide X86_64 - all tests passing
> - macos x86_64 - all tests passing
> - Debian ppc64 - some tests failing
> - Fedora ppc64le - some tests failing
> - Fedora s390x - some tests failing
>
> The patch set is as follows:
>
> [PATCH Rust front-end v2 01/37] Use DW_ATE_UTF for the Rust 'char'
> [PATCH Rust front-end v2 02/37] gccrs: Add nessecary hooks for a Rust
> [PATCH Rust front-end v2 03/37] gccrs: Add Debug info testsuite
> [PATCH Rust front-end v2 04/37] gccrs: Add link cases testsuite
> [PATCH Rust front-end v2 05/37] gccrs: Add general compilation test
> [PATCH Rust front-end v2 06/37] gccrs: Add execution test cases
> [PATCH Rust front-end v2 07/37] gccrs: Add gcc-check-target
> [PATCH Rust front-end v2 08/37] gccrs: Add the Rust front-end AST
> [PATCH Rust front-end v2 09/37] gccrs: Add Lexer for Rust front-end
> [PATCH Rust front-end v2 10/37] gccrs: Add Parser for Rust front-end
> [PATCH Rust front-end v2 11/37] gccrs: Add expansion pass for the
> [PATCH Rust front-end v2 12/37] gccrs: Add name resolution pass to
> [PATCH Rust front-end v2 13/37] gccrs: Add second intermedite
> [PATCH Rust front-end v2 14/37] gccrs: Add AST to HIR lowering pass
> [PATCH Rust front-end v2 15/37] gccrs: Add wrapper for make_unique
> [PATCH Rust front-end v2 16/37] gccrs: Add port of FNV hash used
> [PATCH Rust front-end v2 17/37] gccrs: Add Rust ABI enum helpers
> [PATCH Rust front-end v2 18/37] gccrs: Add Base62 implementation
> [PATCH Rust front-end v2 19/37] gccrs: Add implementation of Optional
> [PATCH Rust front-end v2 20/37] gccrs: Add attributes checker
> [PATCH Rust front-end v2 21/37] gccrs: Add helpers mappings canonical
> [PATCH Rust front-end v2 22/37] gccrs: Add type resolution and trait
> [PATCH Rust front-end v2 23/37] gccrs: Add unsafe checks for Rust
> [PATCH Rust front-end v2 24/37] gccrs: Add const checker
> [PATCH Rust front-end v2 25/37] gccrs: Add privacy checks
> [PATCH Rust front-end v2 26/37] gccrs: Add dead code scan on HIR
> [PATCH Rust front-end v2 27/37] gccrs: Add unused variable scan
> [PATCH Rust front-end v2 28/37] gccrs: Add metadata ouptput pass
> [PATCH Rust front-end v2 29/37] gccrs: HIR to GCC GENERIC lowering
> [PATCH Rust front-end v2 30/37] gccrs: These are wrappers ported from
> [PATCH Rust front-end v2 31/37] gccrs: Add GCC Rust front-end
> [PATCH Rust front-end v2 32/37] gccrs: Add config-lang.in
> [PATCH Rust front-end v2 33/37] gccrs: add lang-spec.h
> [PATCH Rust front-end v2 34/37] gccrs: add lang.opt
> [PATCH Rust front-end v2 35/37] gccrs: add compiler driver
> [PATCH Rust front-end v2 36/37] gccrs: compiler proper interface
> [PATCH Rust front-end v2 37/37] gccrs: Add README, CONTRIBUTING and
>
> --
> Gcc-rust mailing list
> gcc-r...@gcc.gnu.org
> https://gcc.gnu.org/mailman/listinfo/gcc-rust


[PATCH] regenerate configure files and config.h.in files

2022-08-25 Thread Martin Liška
Hi.

I wrote a script that runs autoconf in every folder that has a
configure.ac file, and likewise autoheader (where AC_CONFIG_HEADERS is
present); this is the output.

The script can be seen here:
https://github.com/marxin/script-misc/blob/master/gcc-autoconf-all.py

I'm going to add the script to my daily Buildbot tester.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

fixincludes/ChangeLog:

* config.h.in: Regenerate.
* configure: Regenerate.

libada/ChangeLog:

* configure: Regenerate.

libiberty/ChangeLog:

* configure: Regenerate.

libobjc/ChangeLog:

* configure: Regenerate.

liboffloadmic/ChangeLog:

* configure: Regenerate.
* plugin/configure: Regenerate.

libquadmath/ChangeLog:

* configure: Regenerate.

libssp/ChangeLog:

* configure: Regenerate.

libvtv/ChangeLog:

* configure: Regenerate.

zlib/ChangeLog:

* configure: Regenerate.
---
 fixincludes/config.h.in| 204 -
 fixincludes/configure  |   2 +-
 libada/configure   |   3 +
 libiberty/configure|   3 +
 libobjc/configure  |   6 +-
 liboffloadmic/configure|  46 +++-
 liboffloadmic/plugin/configure |  46 +++-
 libquadmath/configure  |   6 +-
 libssp/configure   |   6 +-
 libvtv/configure   |  18 +--
 zlib/configure |   6 +-
 11 files changed, 61 insertions(+), 285 deletions(-)

diff --git a/fixincludes/config.h.in b/fixincludes/config.h.in
index 3f6cf1e574e..69a67f5f116 100644
--- a/fixincludes/config.h.in
+++ b/fixincludes/config.h.in
@@ -1,397 +1,211 @@
 /* config.h.in.  Generated from configure.ac by autoheader.  */
 
 /* Defined to the executable file extension on the host system */
-#ifndef USED_FOR_TARGET
 #undef EXE_EXT
-#endif
-
 
 /* Define to 1 if you have the `clearerr_unlocked' function. */
-#ifndef USED_FOR_TARGET
 #undef HAVE_CLEARERR_UNLOCKED
-#endif
-
 
 /* Define to 1 if you have the declaration of `abort', and to 0 if you don't.
*/
-#ifndef USED_FOR_TARGET
 #undef HAVE_DECL_ABORT
-#endif
-
 
 /* Define to 1 if you have the declaration of `asprintf', and to 0 if you
don't. */
-#ifndef USED_FOR_TARGET
 #undef HAVE_DECL_ASPRINTF
-#endif
-
 
 /* Define to 1 if you have the declaration of `basename(char *)', and to 0 if
you don't. */
-#ifndef USED_FOR_TARGET
 #undef HAVE_DECL_BASENAME
-#endif
-
 
 /* Define to 1 if you have the declaration of `clearerr_unlocked', and to 0 if
you don't. */
-#ifndef USED_FOR_TARGET
 #undef HAVE_DECL_CLEARERR_UNLOCKED
-#endif
-
 
 /* Define to 1 if you have the declaration of `errno', and to 0 if you don't.
*/
-#ifndef USED_FOR_TARGET
 #undef HAVE_DECL_ERRNO
-#endif
-
 
 /* Define to 1 if you have the declaration of `feof_unlocked', and to 0 if you
don't. */
-#ifndef USED_FOR_TARGET
 #undef HAVE_DECL_FEOF_UNLOCKED
-#endif
-
 
 /* Define to 1 if you have the declaration of `ferror_unlocked', and to 0 if
you don't. */
-#ifndef USED_FOR_TARGET
 #undef HAVE_DECL_FERROR_UNLOCKED
-#endif
-
 
 /* Define to 1 if you have the declaration of `fflush_unlocked', and to 0 if
you don't. */
-#ifndef USED_FOR_TARGET
 #undef HAVE_DECL_FFLUSH_UNLOCKED
-#endif
-
 
 /* Define to 1 if you have the declaration of `fgetc_unlocked', and to 0 if
you don't. */
-#ifndef USED_FOR_TARGET
 #undef HAVE_DECL_FGETC_UNLOCKED
-#endif
-
 
 /* Define to 1 if you have the declaration of `fgets_unlocked', and to 0 if
you don't. */
-#ifndef USED_FOR_TARGET
 #undef HAVE_DECL_FGETS_UNLOCKED
-#endif
-
 
 /* Define to 1 if you have the declaration of `fileno_unlocked', and to 0 if
you don't. */
-#ifndef USED_FOR_TARGET
 #undef HAVE_DECL_FILENO_UNLOCKED
-#endif
-
 
 /* Define to 1 if you have the declaration of `fprintf_unlocked', and to 0 if
you don't. */
-#ifndef USED_FOR_TARGET
 #undef HAVE_DECL_FPRINTF_UNLOCKED
-#endif
-
 
 /* Define to 1 if you have the declaration of `fputc_unlocked', and to 0 if
you don't. */
-#ifndef USED_FOR_TARGET
 #undef HAVE_DECL_FPUTC_UNLOCKED
-#endif
-
 
 /* Define to 1 if you have the declaration of `fputs_unlocked', and to 0 if
you don't. */
-#ifndef USED_FOR_TARGET
 #undef HAVE_DECL_FPUTS_UNLOCKED
-#endif
-
 
 /* Define to 1 if you have the declaration of `fread_unlocked', and to 0 if
you don't. */
-#ifndef USED_FOR_TARGET
 #undef HAVE_DECL_FREAD_UNLOCKED
-#endif
-
 
 /* Define to 1 if you have the declaration of `fwrite_unlocked', and to 0 if
you don't. */
-#ifndef USED_FOR_TARGET
 #undef HAVE_DECL_FWRITE_UNLOCKED
-#endif
-
 
 /* Define to 1 if you have the declaration of `getchar_unlocked', and to 0 if
you don't. */
-#ifndef USED_FOR_TARGET
 #undef HAVE_DECL_GETCHAR_UNLOCKED
-#endif
-
 
 /* Define to 1 if you have the declaration of `getc_unlocked', and to 0 if you
don't. */
-#ifndef USED_FOR_TARGET
 #undef HAVE_DECL_GETC_UNLOCKED
-#endif
-
 
 

[PATCH] middle-end: Add MULT_EXPR recognition for cond scalar reduction

2022-08-25 Thread Kong, Lingling via Gcc-patches
Hi,

The conditional mult reduction cannot be recognized with current GCC. The 
following loop cannot be vectorized.
Now add MULT_EXPR recognition for conditional scalar reduction.

float summa(int n, float *arg1, float *arg2)
{  
int i; 
float res1 = 1.0;
for(i = 0; i < n; i++) {
  if(arg2[i]) 
res1 *= arg1[i];
}  
return res1;   
}
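
One way to picture what recognizing this reduction makes possible is to
write the branch-free form by hand. The sketch below is only an
illustration under that assumption, not the transformation as GCC
implements it, and the function names are hypothetical. Skipped elements
contribute the multiplicative identity 1.0f, so every iteration performs
the same multiply and the loop body no longer contains a branch.

```cpp
#include <cstddef>

// The loop from the description above: the multiply is guarded by a
// branch on arg2[i].
inline float
cond_product_branchy (std::size_t n, const float *arg1, const float *arg2)
{
  float res = 1.0f;
  for (std::size_t i = 0; i < n; i++)
    if (arg2[i])
      res *= arg1[i];
  return res;
}

// Hand-written branch-free equivalent: when the condition is false the
// multiplier is 1.0f, which leaves the running product unchanged.
inline float
cond_product_selected (std::size_t n, const float *arg1, const float *arg2)
{
  float res = 1.0f;
  for (std::size_t i = 0; i < n; i++)
    res *= arg2[i] ? arg1[i] : 1.0f;
  return res;
}
```

Multiplying by 1.0f is exact, so both functions return the same value
for the same inputs.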

gcc/ChangeLog:

* tree-if-conv.cc (is_cond_scalar_reduction): Add MULT_EXPR
recognition.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/gen-vect-34.c: New test.
* gcc.dg/vect/vect-ifcvt-18.c: New test.
---
 gcc/testsuite/gcc.dg/tree-ssa/gen-vect-34.c | 16 +
 gcc/testsuite/gcc.dg/vect/vect-ifcvt-18.c   | 38 +
 gcc/tree-if-conv.cc |  1 +
 3 files changed, 55 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/gen-vect-34.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/vect-ifcvt-18.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-34.c 
b/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-34.c
new file mode 100644
index 000..8d2d36401fe
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-34.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-Ofast -fdump-tree-vect-details" } */
+/* { dg-additional-options "-mavx2" { target { x86_64-*-* i?86-*-* } } 
+} */
+
+float summul(int n, float *arg1, float *arg2)
+{  
+int i; 
+float res1 = 1.0;
+for(i = 0; i < n; i++) {
+  if(arg2[i]) 
+res1 *= arg1[i];
+}  
+return res1;   
+}
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { 
+target { ! { avr-*-* pru-*-* } } } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-ifcvt-18.c 
b/gcc/testsuite/gcc.dg/vect/vect-ifcvt-18.c
new file mode 100644
index 000..c1d3c27d819
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-ifcvt-18.c
@@ -0,0 +1,38 @@
+/* { dg-require-effective-target vect_condition } */
+/* { dg-require-effective-target vect_float } */
+/* { dg-additional-options "-Ofast -mavx" { target avx_runtime } } */
+
+
+int A0[4] = {36,39,42,45};
+int B0[4] = {42,42,0,42};
+float A1[8] = {36,39,42,45,43,32,21,12}; float B1[8] = 
+{42,42,0,42,42,42,0,42}; double A2[16] = 
+{36,39,42,45,43,32,21,12,23,34,45,56,42,78,89,11};
+double B2[16] = {42,42,0,42,42,42,42,42,42,42,42,42,0,42,42,42};
+
+int main ()
+{
+  int i, j;
+  int res0 = 1;
+  float res1 = 1.0;
+  double res2 = 1.0;
+
+  for (i = 0; i < 4; i++)
+if (B0[i])
+  res0 *= A0[i];
+
+  for (i = 0; i < 8; i++)
+if (B1[i])
+  res1 *= A1[i];
+  
+  for (i = 0; i < 16; i++)
+if (B2[i])
+  res2 *= A2[i];
+  /* check results:  */
+  if (res0 != 63180 || res1 != 1043228160.00
+  ||res2 != 3296728515318523101184.00)
+  __builtin_abort ();
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump "vectorized 3 loops" "vect" { target 
+i?86-*-* x86_64-*-* } } } */
diff --git a/gcc/tree-if-conv.cc b/gcc/tree-if-conv.cc index 
1c8e1a45234..bac29fb5574 100644
--- a/gcc/tree-if-conv.cc
+++ b/gcc/tree-if-conv.cc
@@ -1739,6 +1739,7 @@ is_cond_scalar_reduction (gimple *phi, gimple **reduc, 
tree arg_0, tree arg_1,
 
   if (reduction_op != PLUS_EXPR
   && reduction_op != MINUS_EXPR
+  && reduction_op != MULT_EXPR
   && reduction_op != BIT_IOR_EXPR
   && reduction_op != BIT_XOR_EXPR
   && reduction_op != BIT_AND_EXPR)
--
2.18.2



Re: [PATCH v2 2/2 resend] LoongArch: add model attribute

2022-08-25 Thread Xi Ruoyao via Gcc-patches
On Thu, 2022-08-25 at 16:53 +0800, Lulu Cheng wrote:
> I think this should add a sentence:
> 
> 
> "Currently, the identifier name can only be one of small or extreme."

I'll add this line and spend some time trying to see how to reliably
reproduce the ICE mentioned in 1/2 before committing.

> Others I think are ok.
-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


Re: [PATCH] Teach vectorizer to deal with bitfield accesses (was: [RFC] Teach vectorizer to deal with bitfield reads)

2022-08-25 Thread Andre Vieira (lists) via Gcc-patches


On 17/08/2022 13:49, Richard Biener wrote:

Yes, of course.  What you need to do is subtract DECL_FIELD_BIT_OFFSET
of the representative from DECL_FIELD_BIT_OFFSET of the original bitfield
access - that's the offset within the representative (by construction
both fields share DECL_FIELD_OFFSET).

Doh! That makes sense...
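
The offset arithmetic described above can be sketched on plain integers.
This is an illustration only, not vectorizer code: the function name is
hypothetical, the two bit positions stand in for the
DECL_FIELD_BIT_OFFSET values of the bitfield and of its representative,
and bit 0 is assumed to be the least significant bit of the loaded word.

```cpp
#include <cstdint>

// Extract a BITSIZE-bit unsigned field from WORD, the already-loaded
// representative.  FIELD_BITPOS and REP_BITPOS are measured from the
// same base, so their difference is the field's offset within WORD.
inline std::uint64_t
sketch_extract_bits (std::uint64_t word, unsigned field_bitpos,
                     unsigned rep_bitpos, unsigned bitsize)
{
  const unsigned shift = field_bitpos - rep_bitpos;
  const std::uint64_t mask
    = bitsize >= 64 ? ~std::uint64_t (0)
                    : (std::uint64_t (1) << bitsize) - 1;
  return (word >> shift) & mask;
}
```

For example, a 3-bit field at bit position 34 inside a representative
that starts at bit position 32 sits at offset 2 within the loaded word.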

So instead I change bitpos such that:
align_of_representative = TYPE_ALIGN (TREE_TYPE (representative));
bitpos -= bitpos.to_constant () / align_of_representative *
align_of_representative;

?  Not sure why alignment comes into play here?
Yeah just forget about this... it was my ill attempt at basically doing 
what you described above.

Not sure what you are saying but "yes", all shifting and masking should
happen in the type of the representative.

+  tree bitpos_tree = build_int_cst (bitsizetype, bitpos);

for your convenience there's bitsize_int (bitpos) you can use.

I don't think you are using the correct bitpos though, you fail to
adjust it for the BIT_FIELD_REF/BIT_INSERT_EXPR.
Not sure I understand what you mean? I do adjust it, I've changed it now 
so it should hopefully be clearer.


+build_int_cst (bitsizetype, TYPE_PRECISION
(bf_type)),

the size of the bitfield reference is DECL_SIZE of the original
FIELD_DECL - it might be bigger than the precision of its type.
You probably want to double-check it's equal to the precision
(because of the insert but also because of all the masking) and
refuse to lower if not.
I added a check for this but out of curiosity, how can the DECL_SIZE of 
a bitfield FIELD_DECL be different than its type's precision?


+/* Return TRUE if there are bitfields to lower in this LOOP.  Fill
TO_LOWER
+   with data structures representing these bitfields.  */
+
+static bool
+bitfields_to_lower_p (class loop *loop,
+ vec  _to_lower,
+ vec  _to_lower)
+{
+  basic_block *bbs = get_loop_body (loop);
+  gimple_stmt_iterator gsi;

as said I'd prefer to do this walk as part of the other walks we
already do - if and if only because get_loop_body () is a DFS
walk over the loop body (you should at least share that).
I'm now sharing the use of ifc_bbs. The reason why I'd rather not share 
the walk over them is because it becomes quite complex to split out the 
decision to not lower if's because there are none, for which we will 
still want to lower bitfields, versus not lowering if's when they are 
there but aren't lowerable at which point we will forego lowering 
bitfields since we will not vectorize this loop anyway.


+  value = fold_build1 (NOP_EXPR, load_type, value);

fold_convert (load_type, value)

+  if (!CONSTANT_CLASS_P (value))
+   {
+ pattern_stmt
+   = gimple_build_assign (vect_recog_temp_ssa_var (load_type,
NULL),
+  value);
+ value = gimple_get_lhs (pattern_stmt);

there's in principle

  gimple_seq stmts = NULL;
  value = gimple_convert (&stmts, load_type, value);
  if (!gimple_seq_empty_p (stmts))
{
  pattern_stmt = gimple_seq_first_stmt (stmts);
  append_pattern_def_seq (vinfo, stmt_info, pattern_stmt);
}

though a append_pattern_def_seq helper to add a convenience sequence
would be nice to have here.

Ended up using the existing 'vect_convert_input', seems to do nicely here.

You probably want to double-check your lowering code by
bootstrapping / testing with -ftree-loop-if-convert.
Done, this led me to find a new failure mode, where the type of the 
first operand of BIT_FIELD_REF was a FP type (TF mode), which then led 
to failures when constructing the masking and shifting. I ended up 
adding a nop-conversion to an INTEGER type of the same width first if 
necessary. Also did a follow-up bootstrap with the addition of 
`-ftree-vectorize` and `-fno-vect-cost-model` to further test the 
codegen. All seems to be working on aarch64-linux-gnu.

diff --git a/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-1.c 
b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-1.c
new file mode 100644
index 
..01cf34fb44484ca926ca5de99eef76dd99b69e92
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-1.c
@@ -0,0 +1,40 @@
+/* { dg-require-effective-target vect_int } */
+
+#include 
+#include "tree-vect.h"
+
+extern void abort(void);
+
+struct s { int i : 31; };
+
+#define ELT0 {0}
+#define ELT1 {1}
+#define ELT2 {2}
+#define ELT3 {3}
+#define N 32
+#define RES 48
+struct s A[N]
+  = { ELT0, ELT1, ELT2, ELT3, ELT0, ELT1, ELT2, ELT3,
+  ELT0, ELT1, ELT2, ELT3, ELT0, ELT1, ELT2, ELT3,
+  ELT0, ELT1, ELT2, ELT3, ELT0, ELT1, ELT2, ELT3,
+  ELT0, ELT1, ELT2, ELT3, ELT0, ELT1, ELT2, ELT3};
+
+int __attribute__ ((noipa))
+f(struct s *ptr, unsigned n) {
+int res = 0;
+for (int i = 0; i < n; ++i)
+  res += ptr[i].i;
+return res;
+}
+
+int main (void)
+{
+  check_vect ();
+
+  if (f(&A[0], N) != RES)
+abort ();
+
+  return 0;
+}
+
+/* { 

Re: [PATCH v2 2/2 resend] LoongArch: add model attribute

2022-08-25 Thread Lulu Cheng



On 2022/8/24 at 10:12 PM, Xi Ruoyao wrote:

On Wed, 2022-08-24 at 22:08 +0800, Xi Ruoyao wrote:

v1 -> v2:

  * Avoid introducing SYMBOL_PCREL32, use SYMBOL_PCREL for 32-bit PC
    relative.
  * Rebase onto a bug fix ([1/2] in the series) to avoid merge conflict.
  * Fix missed ChangeLog entries.

Resend because my mail client has done some stupid thing to the patch :(

-- >8 --

A linker script and/or a section attribute may locate some object
specially, so we need to handle the code model for such objects
differently than the -mcmodel setting. This happens when the Linux
kernel loads a module with per-CPU variables.

Add an attribute to override the code model for a specific variable.

gcc/ChangeLog:

* config/loongarch/loongarch-protos.h (loongarch_symbol_type):
Add SYMBOL_PCREL64 and change the description for SYMBOL_PCREL.
* config/loongarch/loongarch.cc (loongarch_attribute_table):
New attribute table.
(TARGET_ATTRIBUTE_TABLE): Define the target hook.
(loongarch_handle_model_attribute): New static function.
(loongarch_classify_symbol): Take TARGET_CMODEL_EXTREME and the
model attribute of SYMBOL_REF_DECL into account returning
SYMBOL_PCREL or SYMBOL_PCREL64.
(loongarch_use_anchors_for_symbol_p): New static function.
(TARGET_USE_ANCHORS_FOR_SYMBOL_P): Define the target hook.
(loongarch_symbol_extreme_p): New static function.
(loongarch_symbolic_constant_p): Handle SYMBOL_PCREL64.
(loongarch_symbol_insns): Likewise.
(loongarch_split_symbol_type): Likewise.
(loongarch_split_symbol): Check SYMBOL_PCREL64 instead of
TARGET_CMODEL_EXTREME for PC-relative addressing.
(loongarch_print_operand_reloc): Likewise.
* doc/extend.texi (Variable Attributes): Document new
LoongArch specific attribute.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/attr-model-test.c: New test.
* gcc.target/loongarch/attr-model-1.c: New test.
* gcc.target/loongarch/attr-model-2.c: New test.
* gcc.target/loongarch/attr-model-diag.c: New test.
---
  gcc/config/loongarch/loongarch-protos.h   |   8 +-
  gcc/config/loongarch/loongarch.cc | 190 --
  gcc/doc/extend.texi   |  16 ++
  .../gcc.target/loongarch/attr-model-1.c   |   6 +
  .../gcc.target/loongarch/attr-model-2.c   |   6 +
  .../gcc.target/loongarch/attr-model-diag.c|   7 +
  .../gcc.target/loongarch/attr-model-test.c|  25 +++
  7 files changed, 237 insertions(+), 21 deletions(-)
  create mode 100644 gcc/testsuite/gcc.target/loongarch/attr-model-1.c
  create mode 100644 gcc/testsuite/gcc.target/loongarch/attr-model-2.c
  create mode 100644 gcc/testsuite/gcc.target/loongarch/attr-model-diag.c
  create mode 100644 gcc/testsuite/gcc.target/loongarch/attr-model-test.c

diff --git a/gcc/config/loongarch/loongarch-protos.h 
b/gcc/config/loongarch/loongarch-protos.h
index cadaad7519c..77b2217247d 100644
--- a/gcc/config/loongarch/loongarch-protos.h
+++ b/gcc/config/loongarch/loongarch-protos.h
@@ -28,7 +28,12 @@ along with GCC; see the file COPYING3.  If not see
 The symbol's value will be loaded directly from the GOT.
  
 SYMBOL_PCREL

-   The symbol's value will be loaded directly from data section.
+   The symbol's value will be loaded directly from data section within
+   +/- 2GiB range.
+
+   SYMBOL_PCREL64
+   The symbol's value will be loaded directly from data section within
+   +/- 8EiB range.
  
 SYMBOL_TLS

 A thread-local symbol.
@@ -42,6 +47,7 @@ along with GCC; see the file COPYING3.  If not see
  enum loongarch_symbol_type {
SYMBOL_GOT_DISP,
SYMBOL_PCREL,
+  SYMBOL_PCREL64,
SYMBOL_TLS,
SYMBOL_TLS_IE,
SYMBOL_TLS_LE,
diff --git a/gcc/config/loongarch/loongarch.cc 
b/gcc/config/loongarch/loongarch.cc
index 41d9cca6d31..d9061cdeee3 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -1633,8 +1633,11 @@ loongarch_rtx_constant_in_small_data_p (machine_mode 
mode)
  static enum loongarch_symbol_type
  loongarch_classify_symbol (const_rtx x)
  {
+  enum loongarch_symbol_type pcrel =
+TARGET_CMODEL_EXTREME ? SYMBOL_PCREL64 : SYMBOL_PCREL;
+
if (!SYMBOL_REF_P (x))
-return SYMBOL_PCREL;
+return pcrel;
  
if (SYMBOL_REF_TLS_MODEL (x))

  return SYMBOL_TLS;
@@ -1642,7 +1645,28 @@ loongarch_classify_symbol (const_rtx x)
if (!loongarch_symbol_binds_local_p (x))
  return SYMBOL_GOT_DISP;
  
-  return SYMBOL_PCREL;

+  tree t = SYMBOL_REF_DECL (x);
+  if (!t)
+return pcrel;
+
+  t = lookup_attribute ("model", DECL_ATTRIBUTES (t));
+  if (!t)
+return pcrel;
+
+  t = TREE_VALUE (TREE_VALUE (t));
+
+  /* loongarch_handle_model_attribute should reject other values.  */
+  gcc_assert (TREE_CODE (t) == STRING_CST);
+
+  const char *model = TREE_STRING_POINTER (t);
+  if (strcmp (model, "normal") == 

[PATCH] c++, v2: Implement C++23 P2071R2 - Named universal character escapes [PR106648]

2022-08-25 Thread Jakub Jelinek via Gcc-patches
On Wed, Aug 24, 2022 at 04:22:17PM -0400, Jason Merrill wrote:
> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> Does the copyright 2005-2022 mean that this code is partly derived from some
> other?

Yes, I was lazy and started by copying over makeucnid.cc which also
parses UnicodeData.txt.
In the end, according to diff -upd -U1 make{ucnid,uname2c}.cc, there are
~180 lines in common (out of ~530 lines of makeucnid.cc), out of which is
~80 lines in the two copyrights, most of the rest are just empty lines or
lines with { or } alone, beyond that
 #include 
 #include 
 #include 
 #include 
 #include 
 
 
 #define NUM_CODE_POINTS 0x110000
 #define MAX_CODE_POINT 0x10ffff
and
 /* Read UnicodeData.txt and fill in the 'decomp' table to be the
decompositions of characters for which both the character
decomposed and all the code points in the decomposition are valid
for some supported language version, and the 'all_decomp' table to
be the decompositions of all characters without those
constraints.  */
 
 static void
 {
   if (!f)
   for (;;)
 {
   char line[256];
   char *l;
 
   if (!fgets (line, sizeof (line), f))
break;
   codepoint = strtoul (line, &l, 16);
   if (l == line || *l != ';')
   if (codepoint > MAX_CODE_POINT)
 
   do {
   } while (*l != ';');
{
}
 }
   if (ferror (f))
   fclose (f);
 }
are the common lines close to each other (and whole
write_copyright function).  Dunno if with that I could use
just 2022 copyright or not.

> > + /* We don't know what the next letter will be.
> > +It could be ISALNUM, then we are supposed
> > +to omit it, or it could be a space and then
> > +we should not omit it and need to compare it.
> > +Fortunately the only 3 names with hyphen
> > +followed by non-letter are
> > +U+0F0A TIBETAN MARK BKA- SHOG YIG MGO
> > +U+0FD0 TIBETAN MARK BKA- SHOG GI MGO RGYAN
> > +U+0FD0 TIBETAN MARK BSKA- SHOG GI MGO RGYAN
> > +and makeuname2c.cc verifies this.
> > +Furthermore, prefixes of NR2 generated
> > +ranges all end with a hyphen, but the generated
> > +part is then followed by alpha-numeric.
> > +So, let's just assume that - at the end of
> > +key is always followed by alphanumeric and
> > +so should be omitted.  */
> 
> Let's mention that makeuname2c.cc verifies this property.

I had "and makeuname2c.cc verifies this." there already a few lines before,
but I agree it is better to move that to the end.

> > + for (j = start; j < end; j++)
> > +   {
> > + /* Actually strlen, but we know strlen () <= 3.  */
> 
> Is this comment saying that you're using a loop instead of calling strlen
> because you know the result will be small?  That seems an odd choice.

Yes, but perhaps it is a micro-optimization, and maybe the Korean characters
will not be used often enough for it to be worth it.
Our optimizers certainly aren't able to figure out that when
strlen is called on an array element with size 4 that calling library
function isn't the best idea.  The string lengths are 0 in 3%, 1 in 44%,
2 in 47% and 3 in 6% of cases.
At least on x86_64 when I just use this_len = strlen (hangul_syllables[j]);
it calls the library routine.
Changed to this_len = strlen (hangul_syllables[j]);
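
To illustrate the trade-off discussed above, here is a minimal sketch, assuming a table of 4-byte elements like the one described. The names demo_syllables, demo_short_len and demo_strlen_len are hypothetical and are not taken from the patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a table of short strings; each element holds
   at most 3 characters plus the terminating NUL.  */
static const char demo_syllables[][4] = { "", "G", "GG", "NGS" };

/* Open-coded length for a 4-byte element: at most three comparisons
   and no library call.  */
static size_t
demo_short_len (const char s[4])
{
  size_t len = 0;
  while (len < 3 && s[len] != '\0')
    len++;
  return len;
}

/* The simpler alternative: just call strlen and accept the library call.  */
static size_t
demo_strlen_len (const char s[4])
{
  return strlen (s);
}
```

Both helpers return the same value for every element; the only difference is whether a library routine gets called for a string that is known to be at most 3 characters long.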

> > + /* Try to do a loose name lookup according to
> > +Unicode loose matching rule UAX44-LM2.
> 
> Maybe factor the loose lookup into a separate function?

Good idea.

> > + bidi::kind kind;
> > + if (buffer->cur[-1] == 'N')
> > +   kind = get_bidi_named (pfile, buffer->cur, );
> > + else
> > +   kind = get_bidi_ucn (pfile, buffer->cur,
> > +buffer->cur[-1] == 'U', );
> 
> Hmm, I'm surprised that we're doing bidi checking before replacing escape
> characters with elements of the translation character set.  So now we need
> to check it three different ways.

It is unfortunate, but I'm afraid it is intentional.
Because after replacing the escape characters we lose the distinction
between characters written as UTF-8 in the source and the escape sequences.
The former need to be treated differently as they are more dangerous than
the latter, bidi written as UTF-8 can mislead what the source contains
already in (some) text editors or whatever way user looks at the source
code, while when written as UCNs (\u, \u{}, \U, \N{}) it can be dangerous
only when the program emits it at runtime unpaired.

Here is incremental diff and full patch (with the huge uname2c.h generated
header 

[PATCH] tree-optimization/106737 - remove intermediate SSA verification in autopar

2022-08-25 Thread Richard Biener via Gcc-patches
The following removes intermediate SSA verification in autopar which
isn't expected to succeed after previous changes delaying (virtual)
SSA update to the end of the pass.

Pushed as obvious.

PR tree-optimization/106737
* tree-parloops.cc (transform_to_exit_first_loop_alt): Do not
verify SSA form.

* gcc.dg/autopar/pr106737.c: New testcase.
---
 gcc/testsuite/gcc.dg/autopar/pr106737.c | 18 ++
 gcc/tree-parloops.cc                    |  2 --
 2 files changed, 18 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/autopar/pr106737.c

diff --git a/gcc/testsuite/gcc.dg/autopar/pr106737.c b/gcc/testsuite/gcc.dg/autopar/pr106737.c
new file mode 100644
index 000..3ec8f5dc9e8
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/autopar/pr106737.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O -floop-parallelize-all -ftree-parallelize-loops=2 -fno-tree-dce" } */
+
+void
+foo (int x)
+{
+  int a[2];
+  int b, c = 0;
+
+  for (b = 0; b < 2; ++b)
+a[b] = 0;
+  for (b = 0; b < 2; ++b)
+a[b] = 0;
+
+  while (c < 1)
+while (x < 1)
+  ++x;
+}
diff --git a/gcc/tree-parloops.cc b/gcc/tree-parloops.cc
index b070527ee6e..76483fcaaa2 100644
--- a/gcc/tree-parloops.cc
+++ b/gcc/tree-parloops.cc
@@ -2484,8 +2484,6 @@ transform_to_exit_first_loop_alt (class loop *loop,
   /* Recalculate dominance info.  */
   free_dominance_info (CDI_DOMINATORS);
   calculate_dominance_info (CDI_DOMINATORS);
-
-  checking_verify_ssa (true, true);
 }
 
 /* Tries to moves the exit condition of LOOP to the beginning of its header
-- 
2.35.3


[PATCH] m32c-rtems: remove obsoleted port

2022-08-25 Thread Martin Liška
Remove the port that has been marked obsolete in GCC 12 change notes.

Ready to be installed?
Thanks,
Martin

contrib/ChangeLog:

* config-list.mk: Remove the port.

gcc/ChangeLog:

* config.gcc: Remove the port.
* config/m32c/rtems.h: Removed.

libgcc/ChangeLog:

* config.host: Remove the port.
---
 contrib/config-list.mk  |  2 +-
 gcc/config.gcc          |  6 --
 gcc/config/m32c/rtems.h | 38 --
 libgcc/config.host      |  2 +-
 4 files changed, 2 insertions(+), 46 deletions(-)
 delete mode 100644 gcc/config/m32c/rtems.h

diff --git a/contrib/config-list.mk b/contrib/config-list.mk
index c4db8dc849a..015f3a83bc6 100644
--- a/contrib/config-list.mk
+++ b/contrib/config-list.mk
@@ -59,7 +59,7 @@ LIST = aarch64-elf aarch64-linux-gnu aarch64-rtems \
   ia64-freebsd6 ia64-linux ia64-hpux ia64-hp-vms iq2000-elf lm32-elf \
   lm32-rtems lm32-uclinux \
   loongarch64-linux-gnuf64 loongarch64-linux-gnuf32 loongarch64-linux-gnusf \
-  m32c-rtems m32c-elf m32r-elf m32rle-elf \
+  m32c-elf m32r-elf m32rle-elf \
   m68k-elf m68k-netbsdelf \
   m68k-uclinux m68k-linux m68k-rtems \
   mcore-elf microblaze-linux microblaze-elf \
diff --git a/gcc/config.gcc b/gcc/config.gcc
index 02f58970db0..8e8e1d0513f 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -251,7 +251,6 @@ case ${target} in
  cr16-*-*  \
  | hppa[12]*-*-hpux10* \
  | hppa[12]*-*-hpux11* \
- | m32c-*-rtems*   \
  )
 if test "x$enable_obsolete" != xyes; then
   echo "*** Configuration ${target} is obsolete." >&2
@@ -3555,11 +3554,6 @@ am33_2.0-*-linux*)
gas=yes gnu_ld=yes
use_collect2=no
;;
-m32c-*-rtems*)
-   tm_file="elfos.h ${tm_file} m32c/rtems.h rtems.h newlib-stdint.h"
-   c_target_objs="m32c-pragma.o"
-   cxx_target_objs="m32c-pragma.o"
-   ;;
 m32c-*-elf*)
tm_file="elfos.h newlib-stdint.h ${tm_file}"
c_target_objs="m32c-pragma.o"
diff --git a/gcc/config/m32c/rtems.h b/gcc/config/m32c/rtems.h
deleted file mode 100644
index 553f77cfd4d..000
--- a/gcc/config/m32c/rtems.h
+++ /dev/null
@@ -1,38 +0,0 @@
-/* Definitions for rtems targeting a M32C using ELF.
-   Copyright (C) 2008-2022 Free Software Foundation, Inc.
-   Contributed by Joel Sherrill (j...@oarcorp.com).
-
-   This file is part of GCC.
-
-   GCC is free software; you can redistribute it and/or modify it
-   under the terms of the GNU General Public License as published
-   by the Free Software Foundation; either version 3, or (at your
-   option) any later version.
-
-   GCC is distributed in the hope that it will be useful, but WITHOUT
-   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
-   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
-   License for more details.
-
-   Under Section 7 of GPL version 3, you are granted additional
-   permissions described in the GCC Runtime Library Exception, version
-   3.1, as published by the Free Software Foundation.
-
-   You should have received a copy of the GNU General Public License and
-   a copy of the GCC Runtime Library Exception along with this program;
-   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
-   .  */
-
-/* Target OS builtins.  */
-#undef TARGET_OS_CPP_BUILTINS
-#define TARGET_OS_CPP_BUILTINS()   \
-  do   \
-{  \
-   builtin_define ("__rtems__");   \
-   builtin_define ("__USE_INIT_FINI__");   \
-   builtin_assert ("system=rtems");\
-}  \
-  while (0)
-
-/* Use the default */
-#undef LINK_GCC_C_SEQUENCE_SPEC
diff --git a/libgcc/config.host b/libgcc/config.host
index b2a0a8e1aa9..974de349653 100644
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -1516,7 +1516,7 @@ am33_2.0-*-linux*)
extra_parts="crtbegin.o crtend.o crtbeginS.o crtendS.o"
tmake_file="$tmake_file t-fdpbit"
;;
-m32c-*-elf*|m32c-*-rtems*)
+m32c-*-elf*)
tmake_file="$tmake_file m32c/t-m32c"
;;
 nvptx-*)
-- 
2.37.2



Re: [PATCH 1/2] LoongArch: Avoid RTL flag check failure in loongarch_classify_symbol

2022-08-25 Thread Lulu Cheng



On 2022/8/24 at 10:03 PM, Xi Ruoyao wrote:

SYMBOL_REF_TLS_MODEL invokes SYMBOL_REF_FLAGS, and SYMBOL_REF_FLAGS
invokes RTL_FLAG_CHECK1 and aborts when RTL code is not SYMBOL_REF.

r13-1833 removed "gcc_assert (SYMBOL_REF_P (x))" before invoking
"SYMBOL_REF_TLS_MODEL (x)", indicating that it's now possible that "x"
is not a SYMBOL_REF.  So we need to check if "x" is SYMBOL_REF first.

This fixes a test failure happening with r13-2173:

 pr106096.C:26:1: internal compiler error: RTL flag check:
 SYMBOL_REF_FLAGS used with unexpected rtx code 'const' in
 loongarch_classify_symbol



I think there is no problem with the code modification, but I have not 
detected this ICE.
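
The failure mode described in the quoted commit message can be sketched with a small self-contained mock. Everything here (demo_rtx, demo_tls_model, demo_classify) is hypothetical and only mimics the shape of the problem; it is not the LoongArch backend code, and returning the PC-relative class for a non-symbol node is an assumption made for this sketch.

```c
#include <stdlib.h>

enum demo_code { DEMO_SYMBOL_REF, DEMO_CONST };
enum demo_symbol_type { DEMO_SYMBOL_PCREL, DEMO_SYMBOL_TLS };

struct demo_rtx
{
  enum demo_code code;
  int tls_model;  /* Only meaningful when code == DEMO_SYMBOL_REF.  */
};

/* Checked accessor: like a flag-checking macro, it aborts when it is
   used on the wrong kind of node.  */
static int
demo_tls_model (const struct demo_rtx *x)
{
  if (x->code != DEMO_SYMBOL_REF)
    abort ();
  return x->tls_model;
}

/* Test the code first, so that the checked accessor is only reached for
   symbol references.  */
static enum demo_symbol_type
demo_classify (const struct demo_rtx *x)
{
  if (x->code == DEMO_SYMBOL_REF && demo_tls_model (x) != 0)
    return DEMO_SYMBOL_TLS;
  return DEMO_SYMBOL_PCREL;
}
```

Without the code test in demo_classify, passing a DEMO_CONST node would hit the abort in the accessor, which is the mock equivalent of the "RTL flag check" ICE.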




Re: [PATCH] internal-fn, tree-cfg, v2: Fix .TRAP handling and another __builtin_trap vops issue [PR106099]

2022-08-25 Thread Richard Biener via Gcc-patches
On Thu, 25 Aug 2022, Jakub Jelinek wrote:

> On Wed, Aug 24, 2022 at 10:20:45AM +, Richard Biener wrote:
> > So
> > 
> >combined_fn cfn = builtin_cfn_unreachable ();
> >gimple_build (, false, GSI_NEW_STMT, cfn, void_type_node);
> > 
> > ?
> 
> So what about just using existing call that creates the GIMPLE call,
> whether it is builtin call or internal call?
> I didn't use that initially because the code then wants to add a cgraph
> edge for the fndecl, but in fact it is quite easy to gimple_call_fndecl
> afterwards:

That works for me.

OK,

Thanks,
Richard.

> 2022-08-25  Jakub Jelinek  
> 
>   PR tree-optimization/106099
>   * internal-fn.def (TRAP): Add ECF_LOOPING_CONST_OR_PURE flag.
>   * tree-cfg.cc (execute_fixup_cfg): Add IFN_TRAP instead of
>   __builtin_trap to avoid the need of vops.
> 
>   * gcc.dg/pr106099.c: New test.
> 
> --- gcc/internal-fn.def.jj  2022-07-28 12:43:12.876295553 +0200
> +++ gcc/internal-fn.def   2022-08-23 14:21:49.559364691 +0200
> @@ -458,7 +458,8 @@ DEF_INTERNAL_FN (SPACESHIP, ECF_CONST |
>  
>  /* __builtin_trap created from/for __builtin_unreachable.  */
>  DEF_INTERNAL_FN (TRAP, ECF_CONST | ECF_LEAF | ECF_NORETURN
> -| ECF_NOTHROW | ECF_COLD, NULL)
> +| ECF_NOTHROW | ECF_COLD | ECF_LOOPING_CONST_OR_PURE,
> +  NULL)
>  
>  #undef DEF_INTERNAL_INT_FN
>  #undef DEF_INTERNAL_FLT_FN
> --- gcc/tree-cfg.cc.jj  2022-07-26 10:32:23.998267698 +0200
> +++ gcc/tree-cfg.cc   2022-08-25 09:42:19.013039869 +0200
> @@ -9878,16 +9878,16 @@ execute_fixup_cfg (void)
>   {
> if (stmt && is_gimple_call (stmt))
>   gimple_call_set_ctrl_altering (stmt, false);
> -   tree fndecl = builtin_decl_unreachable ();
> -   stmt = gimple_build_call (fndecl, 0);
> +   stmt = gimple_build_builtin_unreachable (UNKNOWN_LOCATION);
> gimple_stmt_iterator gsi = gsi_last_bb (bb);
> gsi_insert_after (&gsi, stmt, GSI_NEW_STMT);
> if (!cfun->after_inlining)
> - {
> -   gcall *call_stmt = dyn_cast <gcall *> (stmt);
> -   node->create_edge (cgraph_node::get_create (fndecl),
> -  call_stmt, bb->count);
> - }
> + if (tree fndecl = gimple_call_fndecl (stmt))
> +   {
> + gcall *call_stmt = dyn_cast <gcall *> (stmt);
> + node->create_edge (cgraph_node::get_create (fndecl),
> +call_stmt, bb->count);
> +   }
>   }
>   }
>  }
> --- gcc/testsuite/gcc.dg/pr106099.c.jj  2022-08-23 14:30:51.992057144 +0200
> +++ gcc/testsuite/gcc.dg/pr106099.c   2022-08-23 14:29:04.271508337 +0200
> @@ -0,0 +1,10 @@
> +/* PR tree-optimization/106099 */
> +/* { dg-do compile } */
> +/* { dg-options "-O -fharden-compares -fno-tree-forwprop -fno-tree-ch -fno-tree-dominator-opts -fno-tree-ccp -funreachable-traps --param=scev-max-expr-size=1" } */
> +
> +void
> +foo (void)
> +{
> +  for (unsigned i = 0; i == 0; i++)
> +__builtin_printf ("%d", i);
> +}
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)


Re: [PATCH v3] Many pages: Document fixed-width types with ISO C naming

2022-08-25 Thread Xi Ruoyao via Gcc-patches
On Thu, 2022-08-25 at 09:48 +0200, Alejandro Colomar wrote:
> Hi Xi,
> 
> On 8/25/22 09:28, Xi Ruoyao wrote:
> > On Thu, 2022-08-25 at 09:20 +0200, Alejandro Colomar via Gcc-patches
> > wrote:
> > > I don't know for sure, and I never pretended to say otherwise.  But what
> > > IMHO the kernel could do is to make the types compatible, by typedefing
> > > to the same fundamental types (i.e., long or long long) that user-space
> > > types do.
> > 
> > In user-space things are already inconsistent as we have multiple libc
> > implementations.  Telling every libc implementation to sync their
> > typedef w/o a WG14 decision will only cause "aggressive discussion" (far
> > more aggressive than this thread, I'd say).
> > 
> > If int64_t etc. were defined as builtin types since epoch, things would
> > be a lot easier.  But we can't change history.
> 
> This would be great.  I mean, the fundamental types should be u8, u16,
> ... and int, long, ... typedefs for these, and not the other way around, 
> if the language was designed today.
> 
> Maybe GCC could consider something like that.

GCC already has __UINT8_TYPE__ etc., but again, telling all libc
implementations to use "typedef __UINT8_TYPE__ uint8_t" etc. will have
no effect except annoying their maintainers.
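
For reference, a minimal sketch of what building fixed-width names directly on those predefined macros looks like, with no libc header involved. It assumes a compiler that predefines __UINT8_TYPE__ and friends (GCC and Clang do); the demo_* names are made up for illustration.

```c
/* Fixed-width names built directly on the compiler's predefined macros.  */
typedef __UINT8_TYPE__  demo_u8;
typedef __UINT32_TYPE__ demo_u32;
typedef __UINT64_TYPE__ demo_u64;

/* The widths are guaranteed by the compiler, so they can be checked at
   compile time.  */
_Static_assert (sizeof (demo_u8) == 1, "demo_u8 must be 1 byte");
_Static_assert (sizeof (demo_u32) == 4, "demo_u32 must be 4 bytes");
_Static_assert (sizeof (demo_u64) == 8, "demo_u64 must be 8 bytes");

/* Conversion to an unsigned 8-bit type reduces the value modulo 256.  */
static demo_u8
demo_u8_wrap (unsigned int value)
{
  return (demo_u8) value;
}
```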

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


Re: [PATCH v3] Many pages: Document fixed-width types with ISO C naming

2022-08-25 Thread Alejandro Colomar via Gcc-patches

On 8/25/22 09:44, Alejandro Colomar wrote:

Hi Greg,

On 8/25/22 07:57, Greg Kroah-Hartman wrote:

On Thu, Aug 25, 2022 at 01:36:10AM +0200, Alejandro Colomar wrote:

But from your side what do we have?  Just direct NAKs without much
explanation.  The only one who gave some explanation was Greg, and he
vaguely pointed to Linus's comments about it in the past, with no precise
pointer to it.  I investigated a lot before v2, and could not find anything
strong enough to recommend using kernel types in user space, so I pushed v2,
and the discussion was kept.


So despite me saying that "this is not ok", and many other maintainers
saying "this is not ok", you applied a patch with our objections on it?
That is very odd and a bit rude.

I would like that if you still oppose to the patch, at least were able to
provide some facts to this discussion.


The fact is that the kernel can not use the namespace that userspace has
with ISO C names.  It's that simple as the ISO standard does NOT
describe the variable types for an ABI that can cross the user/kernel
boundary.


I understand that.  But user-space programs are allowed to use the 
standard types when calling a syscall that really uses kernel types.


IMHO, it should be irrelevant for the user how the kernel decides to 
call a 64-bit unsigned integer, right?


Or do you mean that some of the pages I modified


... are intended mostly for kernel-space programmers?





Work with the ISO C standard if you wish to document such type usage,
and get it approved and then we would be willing to consider such a
change.  But until then, we have to stick to our variable name types,
just like all other operating systems have to (we are not alone here.)

Please revert your change.


Thanks for asking nicely.

Since there's ongoing discussion, and I don't want to make it look like 
ignoring it, I've reverted the patch for now.  If I apply it again, I 
hope that it will be with some more consensus, as I've always tried to 
do.  Sorry if I was a bit irascible yesterday.  Shit happens.


TL;DR:  Patch reverted; asking nicely works. =)



greg k-h


Cheers,

Alex



--
Alejandro Colomar





Re: [PATCH v3] Many pages: Document fixed-width types with ISO C naming

2022-08-25 Thread Alejandro Colomar via Gcc-patches

Hi Linus,

(Oops, I mistyped your name in my previous reply; I'm on a roll for funny 
typos this week it seems)


On 8/25/22 09:42, Linus Torvalds wrote:

On Thu, Aug 25, 2022 at 12:20 AM Alejandro Colomar
 wrote:


This patch is not about kernel, but about the section 2 and 3 manual
pages, which are directed towards user-space readers most of the time.


They are about the types to the kernel interfaces. Those types that
the kernel defines and exposes.

And the kernel type in question looks like this:

 struct { /* anonymous struct used by BPF_PROG_LOAD command */
 __u32   prog_type;  /* one of enum bpf_prog_type */
 __u32   insn_cnt;
 __aligned_u64   insns;
 __aligned_u64   license;

because the kernel UAPI header wants to

  (a) work whether or not <stdint.h> was included


These days, (a) is more of a theoretical thing, since programs avoiding 
C99 <stdint.h> will have a hard time.




  (b) doesn't want to include <stdint.h> so as to not pollute the namespace

  (c) actually wants to use our special types

I quoted a few more fields for that (c) reason: we've had a long
history of getting the user space API wrong due to strange alignment
issues, where 32-bit and 64-bit targets had different alignment for
types. So then we ended up with various compat structures to translate
from one to the other because they had all the same fields, but
different padding.

This happened a few times with the DRM people, and they finally got
tired of it, and started using that "__aligned_u64" type, which is
just the same old __u64, but explicitly aligned to its natural
alignment.

So these are the members that the interface actually uses.

If you document those members, wouldn't it be good to have that
documentation be actually accurate?

Honestly, I don't think it makes a *huge* amount of difference, but
documentation that doesn't actually match the source of the
documentation will just confuse somebody in the end. Somebody will go
"that's not right", and maybe even change the structure definitions to
match the documentation.

Which would be wrong.

Now, you don't have to take the kernel uapi headers. We *try* to make
those usable as-is, but hey, there has been problems in the past, and
it's not necessarily wrong to take the kernel header and then munge it
to be "appropriate" for user space.

So I guess you can just make your own structures with the names and
syntax you want, and say "these are *my* header files, and *my*
documentation for them".

That's fine. But I'm not surprised if the kernel maintainer then goes
"no, that's not the interface I agreed to" if only because it's a pain
to verify that you got it all right. Maybe it was all trivial and
automated and there can be no errors, but it's still a "why is there a
different copy of this complicated interface".

If you really want to describe things to people, wouldn't it be nicer
to just say "there's a 32-bit unsigned 'prog_type' member" and leave
it at that?

Do you really want to enforce your opinion of what is prettier on the
maintainer that wrote that thing, and then argue with him when he
doesn't agree?


You convinced me.  The man-pages will document the types exactly as they 
are in kernel.  It's just simpler.


As the patch was recently reverted after Greg asked me to do, I'll keep 
it that way.  I guess this closes the man-pages discussion.


I'd still like to see the kernel types be API-compatible with the
user-space ones, for which there's a patch around, and making the
<stdint.h> types builtin could also be nice.  Let's see if that
works out.


Cheers,

Alex

--
Alejandro Colomar





Re: [PATCH v3] Many pages: Document fixed-width types with ISO C naming

2022-08-25 Thread Alejandro Colomar via Gcc-patches

Hi Xi,

On 8/25/22 09:28, Xi Ruoyao wrote:

On Thu, 2022-08-25 at 09:20 +0200, Alejandro Colomar via Gcc-patches
wrote:

I don't know for sure, and I never pretended to say otherwise.  But what
IMHO the kernel could do is to make the types compatible, by typedefing
to the same fundamental types (i.e., long or long long) that user-space
types do.


In user-space things are already inconsistent as we have multiple libc
implementations.  Telling every libc implementation to sync their
typedef w/o a WG14 decision will only cause "aggressive discussion" (far
more aggressive than this thread, I'd say).

If int64_t etc. were defined as builtin types since epoch, things would
be a lot easier.  But we can't change history.


This would be great.  I mean, the fundamental types should be u8, u16, 
... and int, long, ... typedefs for these, and not the other way around, 
if the language was designed today.


Maybe GCC could consider something like that.

Cheers,

Alex

--
Alejandro Colomar





[PATCH] internal-fn, tree-cfg, v2: Fix .TRAP handling and another __builtin_trap vops issue [PR106099]

2022-08-25 Thread Jakub Jelinek via Gcc-patches
On Wed, Aug 24, 2022 at 10:20:45AM +, Richard Biener wrote:
> So
> 
>combined_fn cfn = builtin_cfn_unreachable ();
>gimple_build (, false, GSI_NEW_STMT, cfn, void_type_node);
> 
> ?

So what about just using existing call that creates the GIMPLE call,
whether it is builtin call or internal call?
I didn't use that initially because the code then wants to add a cgraph
edge for the fndecl, but in fact it is quite easy to gimple_call_fndecl
afterwards:

2022-08-25  Jakub Jelinek  

PR tree-optimization/106099
* internal-fn.def (TRAP): Add ECF_LOOPING_CONST_OR_PURE flag.
* tree-cfg.cc (execute_fixup_cfg): Add IFN_TRAP instead of
__builtin_trap to avoid the need of vops.

* gcc.dg/pr106099.c: New test.

--- gcc/internal-fn.def.jj  2022-07-28 12:43:12.876295553 +0200
+++ gcc/internal-fn.def 2022-08-23 14:21:49.559364691 +0200
@@ -458,7 +458,8 @@ DEF_INTERNAL_FN (SPACESHIP, ECF_CONST |
 
 /* __builtin_trap created from/for __builtin_unreachable.  */
 DEF_INTERNAL_FN (TRAP, ECF_CONST | ECF_LEAF | ECF_NORETURN
-  | ECF_NOTHROW | ECF_COLD, NULL)
+  | ECF_NOTHROW | ECF_COLD | ECF_LOOPING_CONST_OR_PURE,
+NULL)
 
 #undef DEF_INTERNAL_INT_FN
 #undef DEF_INTERNAL_FLT_FN
--- gcc/tree-cfg.cc.jj  2022-07-26 10:32:23.998267698 +0200
+++ gcc/tree-cfg.cc 2022-08-25 09:42:19.013039869 +0200
@@ -9878,16 +9878,16 @@ execute_fixup_cfg (void)
{
  if (stmt && is_gimple_call (stmt))
gimple_call_set_ctrl_altering (stmt, false);
- tree fndecl = builtin_decl_unreachable ();
- stmt = gimple_build_call (fndecl, 0);
+ stmt = gimple_build_builtin_unreachable (UNKNOWN_LOCATION);
  gimple_stmt_iterator gsi = gsi_last_bb (bb);
  gsi_insert_after (&gsi, stmt, GSI_NEW_STMT);
  if (!cfun->after_inlining)
-   {
- gcall *call_stmt = dyn_cast <gcall *> (stmt);
- node->create_edge (cgraph_node::get_create (fndecl),
-call_stmt, bb->count);
-   }
+   if (tree fndecl = gimple_call_fndecl (stmt))
+ {
+   gcall *call_stmt = dyn_cast <gcall *> (stmt);
+   node->create_edge (cgraph_node::get_create (fndecl),
+  call_stmt, bb->count);
+ }
}
}
 }
--- gcc/testsuite/gcc.dg/pr106099.c.jj  2022-08-23 14:30:51.992057144 +0200
+++ gcc/testsuite/gcc.dg/pr106099.c 2022-08-23 14:29:04.271508337 +0200
@@ -0,0 +1,10 @@
+/* PR tree-optimization/106099 */
+/* { dg-do compile } */
+/* { dg-options "-O -fharden-compares -fno-tree-forwprop -fno-tree-ch -fno-tree-dominator-opts -fno-tree-ccp -funreachable-traps --param=scev-max-expr-size=1" } */
+
+void
+foo (void)
+{
+  for (unsigned i = 0; i == 0; i++)
+__builtin_printf ("%d", i);
+}

Jakub



Re: [PATCH v3] Many pages: Document fixed-width types with ISO C naming

2022-08-25 Thread Linus Torvalds
On Thu, Aug 25, 2022 at 12:20 AM Alejandro Colomar
 wrote:
>
> This patch is not about kernel, but about the section 2 and 3 manual
> pages, which are directed towards user-space readers most of the time.

They are about the types to the kernel interfaces. Those types that
the kernel defines and exposes.

And the kernel type in question looks like this:

struct { /* anonymous struct used by BPF_PROG_LOAD command */
__u32   prog_type;  /* one of enum bpf_prog_type */
__u32   insn_cnt;
__aligned_u64   insns;
__aligned_u64   license;

because the kernel UAPI header wants to

 (a) work whether or not <stdint.h> was included

 (b) doesn't want to include <stdint.h> so as to not pollute the namespace

 (c) actually wants to use our special types

I quoted a few more fields for that (c) reason: we've had a long
history of getting the user space API wrong due to strange alignment
issues, where 32-bit and 64-bit targets had different alignment for
types. So then we ended up with various compat structures to translate
from one to the other because they had all the same fields, but
different padding.

This happened a few times with the DRM people, and they finally got
tired of it, and started using that "__aligned_u64" type, which is
just the same old __u64, but explicitly aligned to its natural
alignment.
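
The padding problem described above can be sketched in user space as follows. This is a hedged illustration, not the kernel's definition: demo_aligned_u64 is a hypothetical mirror of the "__aligned_u64" idea, written with the GCC/Clang aligned attribute, and the two structs are made up for the comparison.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the idea: the same 64-bit integer, but with its
   alignment forced to 8 on every target.  */
typedef uint64_t demo_aligned_u64 __attribute__ ((aligned (8)));

/* With a plain uint64_t, 'insns' may sit at offset 4 on targets where
   64-bit integers are only 4-byte aligned (e.g. i386) and at offset 8
   elsewhere, so 32-bit and 64-bit builds disagree on the layout.  */
struct demo_plain
{
  uint32_t prog_type;
  uint64_t insns;
};

/* With the aligned typedef the padding is the same everywhere:
   'insns' is always at offset 8 and the struct is always 16 bytes.  */
struct demo_fixed
{
  uint32_t prog_type;
  demo_aligned_u64 insns;
};

static size_t
demo_plain_offset (void)
{
  return offsetof (struct demo_plain, insns);
}

static size_t
demo_fixed_offset (void)
{
  return offsetof (struct demo_fixed, insns);
}
```

Because the layout of demo_fixed does not depend on the target's default alignment for 64-bit integers, no compat structure is needed to translate between a 32-bit caller and a 64-bit callee.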

So these are the members that the interface actually uses.

If you document those members, wouldn't it be good to have that
documentation be actually accurate?

Honestly, I don't think it makes a *huge* amount of difference, but
documentation that doesn't actually match the source of the
documentation will just confuse somebody in the end. Somebody will go
"that's not right", and maybe even change the structure definitions to
match the documentation.

Which would be wrong.

Now, you don't have to take the kernel uapi headers. We *try* to make
those usable as-is, but hey, there has been problems in the past, and
it's not necessarily wrong to take the kernel header and then munge it
to be "appropriate" for user space.

So I guess you can just make your own structures with the names and
syntax you want, and say "these are *my* header files, and *my*
documentation for them".

That's fine. But I'm not surprised if the kernel maintainer then goes
"no, that's not the interface I agreed to" if only because it's a pain
to verify that you got it all right. Maybe it was all trivial and
automated and there can be no errors, but it's still a "why is there a
different copy of this complicated interface".

If you really want to describe things to people, wouldn't it be nicer
to just say "there's a 32-bit unsigned 'prog_type' member" and leave
it at that?

Do you really want to enforce your opinion of what is prettier on the
maintainer that wrote that thing, and then argue with him when he
doesn't agree?

  Linus


Re: [PATCH v3] Many pages: Document fixed-width types with ISO C naming

2022-08-25 Thread Alejandro Colomar via Gcc-patches

Hi Greg,

On 8/25/22 07:57, Greg Kroah-Hartman wrote:

On Thu, Aug 25, 2022 at 01:36:10AM +0200, Alejandro Colomar wrote:

But from your side what do we have?  Just direct NAKs without much
explanation.  The only one who gave some explanation was Greg, and he
vaguely pointed to Linus's comments about it in the past, with no precise
pointer to it.  I investigated a lot before v2, and could not find anything
strong enough to recommend using kernel types in user space, so I pushed v2,
and the discussion was kept.


So despite me saying that "this is not ok", and many other maintainers
saying "this is not ok", you applied a patch with our objections on it?
That is very odd and a bit rude.


I would like that if you still oppose to the patch, at least were able to
provide some facts to this discussion.


The fact is that the kernel can not use the namespace that userspace has
with ISO C names.  It's that simple as the ISO standard does NOT
describe the variable types for an ABI that can cross the user/kernel
boundary.


I understand that.  But user-space programs are allowed to use the 
standard types when calling a syscall that really uses kernel types.


IMHO, it should be irrelevant for the user how the kernel decides to 
call a 64-bit unsigned integer, right?


Or do you mean that some of the pages I modified



Work with the ISO C standard if you wish to document such type usage,
and get it approved and then we would be willing to consider such a
change.  But until then, we have to stick to our variable name types,
just like all other operating systems have to (we are not alone here.)

Please revert your change.


Thanks for asking nicely.

Since there's ongoing discussion, and I don't want to make it look like 
ignoring it, I've reverted the patch for now.  If I apply it again, I 
hope that it will be with some more consensus, as I've always tried to 
do.  Sorry if I was a bit irascible yesterday.  Shit happens.


TL;DR:  Patch reverted; asking nicely works. =)



greg k-h


Cheers,

Alex

--
Alejandro Colomar





Re: [PATCH v3] Many pages: Document fixed-width types with ISO C naming

2022-08-25 Thread Linus Torvalds
On Wed, Aug 24, 2022 at 11:41 PM Florian Weimer  wrote:
>
> The justifications brought forward are just regurgitating previous
> misinformation.  If you do that, it's hard to take you seriously.

Pot, meet kettle.

> There is actually a good reason for using __u64: it's always based on
> long long, so the format strings are no longer architecture-specific,
[..]

That's a small detail that yes, we've tried to avoid the absolute
humongous mess that the C standard library has with their horrendous
'PRId*' mess, but honestly, it's just a tiny detail.

The real issue is that we want to be able to control our own types,
and our own names, and in the process have sometimes been able to
standardize on types that makes it easier to just not have to deal
with "oh, somebody picked 'int' on this architecture, and 'long' on
this other, and they are both 32-bit types".

We still have to deal with that for '[s]size_t', but that's such a
standard legacy type that thankfully we have the whole '%zu/%zd' thing
for that.

And yes, sometimes we screw up even *though* we were the ones that
picked the types, and we've had pointless differences where '__u64'
could be 'unsigned long' on a 64-bit architecture, and 'unsigned long
long' on a 32-bit one, and then we were able to fix our own little
broken type system exactly because it was *OUR* little type system.

So you are correct that then in the specific case of '__u64' we have
been able to simply just standardize on 'unsigned long long' and make
printf strings simpler.
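
The format-string point can be illustrated with a small sketch. It assumes a hosted C environment; demo_u64 is a hypothetical stand-in for a type that is always 'unsigned long long', and the helper names are made up.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical kernel-style type: always 'unsigned long long', so "%llu"
   is correct on every architecture.  */
typedef unsigned long long demo_u64;

/* uint64_t may be 'unsigned long' or 'unsigned long long' depending on
   the platform, so portable code has to go through the PRIu64 macro.  */
static int
demo_format_std (char *buf, size_t size, uint64_t value)
{
  return snprintf (buf, size, "%" PRIu64, value);
}

/* With a fixed underlying type, one plain format string is enough.  */
static int
demo_format_fixed (char *buf, size_t size, demo_u64 value)
{
  return snprintf (buf, size, "%llu", value);
}
```

Both helpers produce the same text; the difference is only in how much the caller has to know about the underlying type to write a correct format string.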

But you are wrong to think that that is somehow a special thing.

It's not.

It's very much all the same thing: we have types *we* control, and
thanks to that we can do them the way *we* need them done, and can fix
them when we made a silly mistake.

In other words, it's the whole *point* of not ever using 'stdint.h' at
all for those things.

(It's also about avoiding the kinds of unholy things that happen in
system header files. Have you ever *looked* at them? Christ. The
amount of absolute crap you get from including <stdint.h> in user
space is scary)

> You cannot avoid using certain ISO C names with current GCC or Clang,
> however hard you try.

You are now the one who is regurgitating complete mis-information.

You do it so prettily, and with such weasel-wording, that I know you
must be knowingly threading that fine line between "actively
misleading" but trying to avoid "outright lying"..

You say "certain ISO C names" to try to make it sound as if this was
at all relevant to things like "uint32_t" and friends.

But deep down, you know you're basically lying by omission.

Because it's true that we have to know and care about things like
'size_t', which comes up for all the basic string.h functions.

So yes, we have a very small set of types that we make sure matches
the compiler notion of said types, and we carefully use things like

typedef __kernel_ulong_t __kernel_size_t;

and then we have our own 'stdarg.h' which uses

typedef __builtin_va_list va_list;

that is explicitly the one that the compiler exposes with those double
underscores exactly because even the compiler can't expose the
"standard" name due to namespace issues.
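
As a rough sketch of what "baking your own" looks like with GCC or Clang (the demo_* names are invented here, and this is not the kernel's actual header), the double-underscore builtins are enough to walk a variable argument list without including any system header:

```c
/* No <stdarg.h>: only the compiler's reserved-namespace builtins. */
typedef __builtin_va_list demo_va_list;
#define demo_va_start(ap, last) __builtin_va_start(ap, last)
#define demo_va_arg(ap, type)   __builtin_va_arg(ap, type)
#define demo_va_end(ap)         __builtin_va_end(ap)

/* Sum 'count' int arguments using the home-grown va_list. */
static int demo_sum(int count, ...)
{
    demo_va_list ap;
    int total = 0;

    demo_va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += demo_va_arg(ap, int);
    demo_va_end(ap);

    return total;
}
```

demo_sum(3, 1, 2, 3) evaluates to 6, and nothing in the translation unit ever defined the plain name 'va_list'.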

And no, NONE OF THOSE ARE USABLE IN THE UAPI HEADERS!

And equally importantly, none of those have *anything* to do with the
'uint32_t' kind of names.

The fact that yes, we care about what the compiler thinks "size_t" is
(because we do want the compiler to do memset() for us) has absolutely
*NOTHING* to do with uint32_t and .

And I'm pretty sure you knew that, but you tried to make it sound like
they were somehow all in the same boat.

And yes, some drivers tend to actually use 'uint32_t' in the kernel,
and we allow it, but they cannot be used by user interfaces. So a uapi
file really *really* shouldn't ever use them.

And no, we don't use "-ffreestanding" and friends - we actually have
occasionally wanted and tried to do so just to make the boundary lines
clearer, but then that will make gcc no longer do sane things for
'memcpy()' and friends, so it's kind of a balancing act.
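
A small sketch of the 'memcpy()' case (demo_load_uint is a made-up helper). With builtins enabled, GCC and Clang will typically lower the fixed-size copy into a plain load; with "-ffreestanding" or "-fno-builtin" it may remain a real out-of-line call:

```c
#include <string.h>

/* Read an unsigned int from a possibly unaligned byte buffer.
 * The compiler may only replace this memcpy() with a single load
 * if it is allowed to treat memcpy() as a known builtin. */
static unsigned int demo_load_uint(const unsigned char *p)
{
    unsigned int v;

    memcpy(&v, p, sizeof(v));
    return v;
}
```

The result is the same either way; only the generated code differs, which can be checked by comparing the assembly with and without "-fno-builtin".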

> , ,  are compiler-provided headers that
> are designed to be safe to use for bare-metal contexts (like in
> kernels).  Avoiding them is not necessary per se.

We explicitly avoid them all.

We historically used stdarg.h and stddef.h (but never stdint.h -
there's absolutely _zero_ upside), but it was always a slight pain.

So we simply bake our own, exactly because it's simply less painful
than having to deal with possible system-provided ones.

People do odd compiler things with host compilers, bad or odd
installations of cross-build environments, it's just not worth the
pain to deal with the "system header files" when they just don't
provide any real value.

Linus


Re: [PATCH v3] Many pages: Document fixed-width types with ISO C naming

2022-08-25 Thread Xi Ruoyao via Gcc-patches
On Thu, 2022-08-25 at 09:20 +0200, Alejandro Colomar via Gcc-patches
wrote:
> I don't know for sure, and I never pretended to say otherwise.  But what 
> IMHO the kernel could do is to make the types compatible, by typedefing 
> to the same fundamental types (i.e., long or long long) that user-space 
> types do.

In user-space things are already inconsistent as we have multiple libc
implementations.  Telling every libc implementation to sync their
typedef w/o a WG14 decision will only cause "aggressive discussion" (far
more aggressive than this thread, I'd say).

If int64_t etc. were defined as builtin types since epoch, things would
be a lot easier.  But we can't change history.

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


Re: [PATCH v3] Many pages: Document fixed-width types with ISO C naming

2022-08-25 Thread Alejandro Colomar via Gcc-patches

Hi Linus,

On 8/25/22 02:52, Linus Torvalds wrote:
> On Wed, Aug 24, 2022 at 4:36 PM Alejandro Colomar
>  wrote:
>>
>> I'm trying to be nice, and ask for review to make sure I'm not making
>> some big mistake by accident, and I get disrespect?  No thanks.
>
> You've been told multiple times that the kernel doesn't use the
> "standard" names, and *cannot* use them for namespace reasons, and you
> ignore all the feedback, and then you claim you are asking for review?


This patch is not about kernel, but about the section 2 and 3 manual 
pages, which are directed towards user-space readers most of the time. 
Admittedly, some syscalls are only callable from within the kernel 
itself, but that's very rare.


[...]

> The fact is, kernel UAPI header files MUST NOT use the so-called standard names.

I don't know for sure, and I never pretended to say otherwise.  But what 
IMHO the kernel could do is to make the types compatible, by typedefing 
to the same fundamental types (i.e., long or long long) that user-space 
types do.




> We cannot provide said names, because they are only provided by the
> standard header files.
>
> And since kernel header files cannot provide them, then kernel UAPI
> header files cannot _use_ them.
>
> End result: any kernel UAPI header file will continue to use __u32 etc
> naming that doesn't have any namespace pollution issues.
>
> Nothing else is even remotely acceptable.
>
> Stop trying to make this something other than it is.
>
> And if you cannot accept these simple technical reasons, why do you
> expect respect?
>
> Why are you so special that you think you can change the rules for
> kernel uapi files over the *repeated* objections from maintainers who
> know better?


No sorry, if someone understood this patch as changing anything in UAPI, 
it is not.


Cheers,

Alex

--
Alejandro Colomar





Re: [PATCH 0/3] picolibc: Add picolibc linking help

2022-08-25 Thread Kito Cheng via Gcc-patches
I am thinking that maybe we should add a
-mlibc=[newlib|newlib-nano|picolibc|unknown] option to the bare-metal
toolchain. One reason is to have a unified interface for selecting the
libc implementation across GCC and clang/LLVM: spec files are
GCC-specific, which causes very bad compatibility between GCC and
clang/LLVM. Having an option to control this would be better, since
clang/LLVM doesn't have those configure-time options.

The Linux toolchain uses -mbionic, -muclibc, -mglibc and -mmusl;
maybe we could also use a unified -mlibc= option for that?



On Thu, Aug 25, 2022 at 3:32 AM Keith Packard via Gcc-patches
 wrote:
>
> Andrew Pinski  writes:
>
> (removing gcc@ as not appropriate for patch discussions)
>
> Thanks for reviewing my patches; I appreciate the time you have taken to
> think about this.
>
> > Why do you need to change the specs to support picolibc? Why not have
> > the library supply the specs file instead, like what is done for
> > newlib and libgloss?
>
> Several architectures do include support for newlib's libgloss in their
> gcc configuration today (i386, m32r, microblaze, nds32, pru, riscv and
> sh), so I wondered if it made sense to add support for picolibc's
> target-specific libraries as well.
>
> Picolibc does deliver a spec file fragment which implements this
> functionality, but that requires the addition of --specs=picolibc.specs
> to the gcc command line instead of being built-in to gcc itself. When
> creating an integrated toolchain using picolibc, it seems a bit odd to
> require an option for the toolchain to work.
>
> > What OS libraries are not included in libc? I am trying to figure out
> > why this needs to be special cased here.
>
> As a general-purpose embedded C library, picolibc doesn't include any
> OS-specific code. Instead, it defines a limited subset of POSIX
> interfaces which are to be supplied by the target platform.
>
> Picolibc itself supplies sample implementations of these ABIs that can
> run on top of bare metal systems with semihosting support which are used
> while testing picolibc itself.
>
> This is similar to newlib's libgloss: the C library is built atop
> another library which needs to follow it in the linker command line for
> symbol resolution to work correctly. Making this work requires a change
> in the *lib spec file fragment.
>
> Adjusting the *lib fragment can either be done in an externally provided
> specs file, or built-in to gcc. Both of these mechanisms are present in
> the gcc ecosystem today, leading me to wonder whether the gcc community
> would be interested in having an integrated option available.
>
> --
> -keith


Re: [PATCH v3] Many pages: Document fixed-width types with ISO C naming

2022-08-25 Thread Florian Weimer via Gcc-patches
* Greg Kroah-Hartman:

> On Thu, Aug 25, 2022 at 01:36:10AM +0200, Alejandro Colomar wrote:
>> But from your side what do we have?  Just direct NAKs without much
>> explanation.  The only one who gave some explanation was Greg, and he
>> vaguely pointed to Linus's comments about it in the past, with no precise
>> pointer to it.  I investigated a lot before v2, and could not find anything
>> strong enough to recommend using kernel types in user space, so I pushed v2,
>> and the discussion was kept.
>
> So despite me saying that "this is not ok", and many other maintainers
> saying "this is not ok", you applied a patch with our objections on it?
> That is very odd and a bit rude.

The justifications brought forward are just regurgitating previous
misinformation.  If you do that, it's hard to take you seriously.

There is actually a good reason for using __u64: it's always based on
long long, so the format strings are no longer architecture-specific,
and those ugly macro hacks are not needed to achieve portability.  But
that's really the only reason I'm aware of.  Admittedly, it's a pretty
good reason.
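
A minimal sketch of the difference, for a hosted environment. Here fixed_u64 is a hypothetical stand-in for '__u64', and demo_formats_agree is an illustrative helper, not an existing API:

```c
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for __u64: always based on long long. */
typedef unsigned long long fixed_u64;

/* Format the same value both ways and compare the results.
 * uint64_t needs the PRIu64 macro because its underlying type varies;
 * fixed_u64 can use a literal "%llu" on every architecture. */
static int demo_formats_agree(unsigned long long v)
{
    char iso[32], fixed[32];

    snprintf(iso, sizeof(iso), "v=%" PRIu64, (uint64_t)v);
    snprintf(fixed, sizeof(fixed), "v=%llu", (fixed_u64)v);

    return strcmp(iso, fixed) == 0;
}
```

The output is identical; the only thing that changes is whether the format string has to be spliced together from a macro.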

>> I would like that if you still oppose to the patch, at least were able to
>> provide some facts to this discussion.
>
> The fact is that the kernel can not use the namespace that userspace has
> with ISO C names.  It's that simple as the ISO standard does NOT
> describe the variable types for an ABI that can cross the user/kernel
> boundary.

You cannot avoid using certain ISO C names with current GCC or Clang,
however hard you try.  But currently, the kernel does not try at all,
not really: it is not using -ffreestanding and -fno-builtin, at least
not consistently.  This means that if the compiler sees a known function
(with the right name and a compatible prototype), it will optimize based
on that.  What kind of headers you use does not matter.

, ,  are compiler-provided headers that
are designed to be safe to use for bare-metal contexts (like in
kernels).  Avoiding them is not necessary per se.  However, 
is not particularly useful if you want to use your own printf-style
functions with the usual format specifiers (see above for __u64).  But
on its own, it's perfectly safe to use.  You have problems with
 *because* you use well-known, standard facilities in kernel
space (the printf format specifiers), not because you avoid them.  So
exactly the opposite of what you say.

> But until then, we have to stick to our variable name types,
> just like all other operating systems have to (we are not alone here.)

FreeBSD uses  and the  formatting macros in kernel
space.  I don't think that's unusual at all for current kernels.  It's
particularly safe for FreeBSD because they use a monorepo and toolchain
variance among developers is greatly reduced.  Linux would need to
provide its own  equivalent for the formatting macros
(as it's not a compiler header; FreeBSD has ).

At this point and with the current ABIs we have for Linux, it makes
equal (maybe more) sense to avoid the  types altogether and
use Linux-specific typedefs which have architecture-independent format
strings.

Thanks,
Florian



Re: [PATCH v3] rs6000: Rework ELFv2 support for -fpatchable-function-entry* [PR99888]

2022-08-25 Thread Kewen.Lin via Gcc-patches
on 2022/8/24 22:01, Segher Boessenkool wrote:
> On Wed, Aug 24, 2022 at 03:30:51PM +0800, Kewen.Lin wrote:
>> on 2022/8/23 22:33, Segher Boessenkool wrote:
>> I wondered whether we could consider [1] and update the documentation
>> along the lines of "For PowerPC with the ELFv2 ABI, there will be M nops
>> before the global entry point, and N-M after the local entry point".
> 
> But that does not agree with the documentation.  The N nops have to be
> consecutive.  If you want to support adding separate nop regions before
> the LEP and GEP entry points, that is fine, but it will need a separate
> command line option.
> 

OK, previously I thought we could describe the GEP and LEP (and the area
between them) as one special function entry (area) in the documentation, but
admittedly that's too tricky.
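
For reference, a minimal sketch of the per-function form of the option, using the patchable_function_entry attribute that GCC and Clang provide. The function name and the values N = 3, M = 1 are arbitrary, and where exactly the nops land on ELFv2 is what this thread is still deciding:

```c
/* Request N = 3 nops in total, M = 1 of them placed before the
 * function entry label and the remaining N - M = 2 after it.
 * Equivalent to compiling with -fpatchable-function-entry=3,1. */
__attribute__((patchable_function_entry(3, 1)))
int demo_patchable(int x)
{
    return x + 1;
}
```

The nops do not change behaviour; they only reserve space that can be patched at run time, so the placement is visible in the generated assembly rather than in the program's output.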

Adding a separate command line option now just for one potential future use
case doesn't seem like a good idea.  Following the previous proposal, I just
posted v4 at:

https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600277.html

BR,
Kewen