Re: [PATCH] rs6000: fmr gets used instead of faster xxlor [PR93571]

2023-02-24 Thread Ajit Agarwal via Gcc-patches
Hello Segher:

On 24/02/23 8:41 pm, Segher Boessenkool wrote:
> Hi!
> 
> For future patches: please don't send patches as replies to existing
> threads.  Just start a new thread for a new patch (series).  You can
> mark it as [PATCH v2] in the subject, if you want.
> 
> On Fri, Feb 24, 2023 at 01:41:49PM +0530, Ajit Agarwal wrote:
>> Here is the patch that uses xxlor instead of fmr where possible.
>> Performance results shows that fmr is better in power9 and 
>> power10 architectures whereas xxlor is better in power7 and
>> power 8 architectures.
> 
> And fmr is the only option before p7.
> 
>>  rs6000: Use xxlor instead of fmr where possible
>>
>>  This patch replaces fmr with xxlor instruction for power7
>>  and power8 architectures whereas for power9 and power10
>>  replaces xxlor with fmr instruction.
> 
> Saying "this patch" in a commit message reads strangely.  Just "Replace
> fmr with" etc.?
> 

I will correct this.

> The second part is just wrong, you cannot replace xxlor by fmr in
> general.
> 
>>  Perf measurement results:
>>
>>  Power9 fmr:  201,847,661 cycles.
>>  Power9 xxlor: 201,877,78 cycles.
>>  Power8 fmr: 201,057,795 cycles.
>> Power8 xxlor: 201,004,671 cycles.
> 
> What is this measuring?  100M insns back-to-back, each dependent on the
> previous one?
> 
Yes.

> What are the results on p7 and p10?
> 
> These numbers show there is no difference on p8 either.  Did you paste
> the wrong numbers maybe?
>

I will measure it again and update with a new patch.
 
>>  * config/rs6000/rs6000.md (*movdf_hardfloat64): Use xxlor
>>  for power7 and power8 and fmr for power9 and power10.
> 
> Please don't break lines early.  Changelogs lines can be 80 columns
> wide, just like source code lines.
> 
>> --- a/gcc/config/rs6000/rs6000.md
>> +++ b/gcc/config/rs6000/rs6000.md
>> @@ -354,7 +354,7 @@ (define_attr "cpu"
>>(const (symbol_ref "(enum attr_cpu) rs6000_tune")))
>>  
>>  ;; The ISA we implement.
>> -(define_attr "isa" "any,p5,p6,p7,p7v,p8v,p9,p9v,p9kf,p9tf,p10"
>> +(define_attr "isa" "any,p5,p6,p7,p7v,p8v,p9,p9v,p9kf,p9tf,p7p8,p10"
> 
> p78v, and sort it after p8v please.
> 
>> + (and (eq_attr "isa" "p7p8")
>> +  (match_test "TARGET_VSX && !TARGET_P9_VECTOR"))
>> + (const_int 1)
> 
> Okay.
> 
>>  (define_insn "*mov_hardfloat64"
>>[(set (match_operand:FMOVE64 0 "nonimmediate_operand"
>> -   "=m,   d,  d,  ,   wY,
>> - ,Z,  ,  ,  !r,
>> - YZ,  r,  !r, *c*l,   !r,
>> -*h,   r,  ,   wa")
>> +   "=m,   d,  ,  ,   wY,
>> +, Z,  wa, ,  !r,
>> +YZ,   r,  !r, *c*l,   !r,
>> +*h,   r,  ,   d,  wn,
>> +wa")
>>  (match_operand:FMOVE64 1 "input_operand"
> 
> (You posted this mail as wrapping.  That means the patch cannot be
> applied non-manually, and that replies to your mail will be mangled.
> Just get a Real mail client, and configure it correctly :-) )
>

I am using Thunderbird as mail client and the settings are all correct.
I have set the mailnews.wrapLength 0.

 
>> -"d,   m,  d,  wY, ,
>> - Z,   ,   ,  ,  ,
>> +"d,   m,  ,  wY, ,
>> + Z,   ,   wa, ,  ,
>>   r,   YZ, r,  r,  *h,
>> - 0,   ,   r,  eP"))]
>> + 0,   ,   r,  d,  wn,
>> + eP"))]
> 
> No.  It is impossible to figure out what you changed here by just
> reading it.
> 
> There is no requirement there should be exactly five alternatives per
> line, and/or that there should be the same number everywhere.
> 
> If the indentation was incorrect, and you want to fix that, do that in a
> separate *earlier* patch in the series, please.
> 

I will Keep indentation as same.
>>"TARGET_POWERPC64 && TARGET_HARD_FLOAT
>> && (gpc_reg_operand (operands[0], mode)
>> || gpc_reg_operand (operands[1], mode))"
>>"@
>> stfd%U0%X0 %1,%0
>> lfd%U1%X1 %0,%1
>> -   fmr %0,%1
>> +   xxlor %x0,%x1,%x1
>> lxsd %0,%1
>> stxsd %1,%0
>> lxsdx %x0,%y1
>> stxsdx %x1,%y0
>> -   xxlor %x0,%x1,%x1
>> +   fmr %0,%1
>> xxlxor %x0,%x0,%x0
>> li %0,0
>> std%U0%X0 %1,%0
>> @@ -8467,23 +8474,28 @@ (define_insn "*mov_hardfloat64"
>> nop
>> mfvsrd %0,%x1
>> mtvsrd %x0,%1
>> +   fmr %0,%1
>> +   fmr %0,%1
>> #"
>>[(set_attr "type"
>> -"fpstore, fpload, fpsimple,   fpload, fpstore,
>> +"fpstore, fpload, veclogical, fpload, fpstore,
>>   fpload,  fpstore,veclogical, veclogical, integer,
>>   store,   load,   *,  mtjmpr, mfjmpr,
>> -   

Re: [PATCH] RISC-V: Disable attribute generation by default

2023-02-24 Thread Palmer Dabbelt

On Fri, 24 Feb 2023 05:09:30 PST (-0800), gcc-patches@gcc.gnu.org wrote:

It did help people to identify what extension used in the binary, so I
would prefer keep that enable by default.


IMO it actually hurts more than helps, as it's not really encoding what 
extensions are in the binary (or necessary to run the binary) but 
instead just encodes what was in -march (with some noise added due to 
the merging bugs and ISA string changes).  Having the attributes just 
ends up tricking users into thinking the information is accurate when 
it's not.



and lld is begin fix those merge issue, so the situation should be improved
soon.


If toolchains are just going to ignore then attributes then it's a 
pretty good sign they're not useful.



Palmer Dabbelt  於 2023年2月24日 週五 10:29 寫道:


We generate a handful of attributes by default, but they don't really
encode any useful information.  We've broadly stopped ascribing any
meaning to them in binutils; but they trip up LLVM, older toolchains,
and users.  So let's just turn them off by default.  The old binaries
will still be floating around, but at least this way we'll stop tripping
over new incompatibilities.

If we get to a point where there's some attributes that are defined that
we can use then we can sort out how to turn those on without turning on
the old ones, but unless I'm missing something the current set of
attributes are too broken to be useful for anything.

gcc/ChangeLog:

* config.gcc (--with-riscv-attribute): Default to off.
---
I know it's pretty late, but I'd like to target this for GCC-13.  The
Zmmul stuff has resulted in another round of build breakages that we're
going to have to chase down, and while we could update everything to
turn off the attributes it seems easier to just set the default.
---
 gcc/config.gcc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index c070e6ecd2e..52639cf26d6 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -4596,7 +4596,7 @@ case "${target}" in
tm_defines="${tm_defines} TARGET_RISCV_ATTRIBUTE=0"
;;
""|default)
-   tm_defines="${tm_defines} TARGET_RISCV_ATTRIBUTE=1"
+   tm_defines="${tm_defines} TARGET_RISCV_ATTRIBUTE=0"
;;
*)
echo
"--with-riscv-attribute=${with_riscv_attribute} is not supported.  The
argument must begin with yes, no or default." 1>&2
--
2.39.1




Re: Support for WEAK attribute, part 2

2023-02-24 Thread Harald Anlauf via Gcc-patches

Hi Rimvydas,

Am 24.02.23 um 06:16 schrieb Rimvydas Jasinskas via Gcc-patches:

On Thu, Feb 23, 2023 at 10:53 PM Harald Anlauf  wrote:

the patch is mostly fine, but there is a minor style issue:

+  if (sym->attr.ext_attr & (1 << EXT_ATTR_WEAK))
+   gfc_error ("Symbol %qs at %L has the WEAK attribute but is a %s",
+  sym->name, >declared_at, sym->attr.dummy
+  ? "dummy argument" : "local variable");
+

It is my understanding that this is not translation-friendly.
Please use separate error texts for either case instead.

Interesting, I was under the impression this was fixed with OO-inlines
around the *.c rename.


if this is the case, I must have missed it.

> In any case, adjusted in v2 to use:

+  if (sym->attr.ext_attr & (1 << EXT_ATTR_WEAK))
+{
+  if (sym->attr.dummy)
+gfc_error ("Symbol %qs at %L has the WEAK attribute but is a "
+   "dummy argument", sym->name, >declared_at);
+  else
+gfc_error ("Symbol %qs at %L has the WEAK attribute but is a "
+   "local variable", sym->name, >declared_at);
+}


This is ok.


These testcases are dg-compile and do not go through the "-O0 -O1 -O2
-O3 -Os" options like dg-run.  Combining the testcases does not reduce
gfortran.sum a lot:


I wasn't thinking of gfortran.sum, it's about the total overhead of
the testsuite (dejagnu etc.).  But thanks for combining the tests!


Finally, please do not forget to CC patches to gcc-patches@
so that others can see them.

Out of curiosity, what is the purpose of CC patches to gcc-patches
too?  Attachments are even available in web mailing list too, like in:
https://gcc.gnu.org/pipermail/fortran/2023-February/058953.html


Well, patches should always go the gcc-patches@, see e.g.

https://gcc.gnu.org/gitwrite.html

On the other hand, many *Fortran* reviewers will ignore patches
there and look at them only when they are sent to fortran@.

Thanks for your patch, pushed as r13-6338-gbcbeebc498126c .

Harald


Regards,
Rimvydas





Re: [Patch] Fortran: Skip bound conv in gfc_conv_gfc_desc_to_cfi_desc with intent(out) ptr [PR108621]

2023-02-24 Thread Harald Anlauf via Gcc-patches

Hi Tobias,

Am 24.02.23 um 12:31 schrieb Tobias Burnus:

(B) The attached patch:

With 'intent(out)' there is no reason to do the conversions. While for
nullified
pointers the bounds-conversion loop is skipped, it may still be executed
for undefined
pointers. (Which is usually harmless.) In either case, not generating
this code makes
sense.

OK for mainline?


LGTM.

I was pondering whether one should keep the testcase closer to the
one in the PR, but the essence of the bug and the fix is well
represented in the reduced version, and also the tree dump tells
the whole story anyway.


Regarding GCC 12:  I am not really sure as it is no real regression.
Besides bogus
warnings, there might be an issue for undefined pointers and
-fsanitize=undefined, namely
if 'ubound - lbound' evaluated on random numbers overflows (such as for
ubound = huge(..)
and lbound = -huge(..)). But that looks like a rather special case. -
Thoughts?


I'd rather consider the case of undefined pointers as of practical
importance.  It's up to you or others to decide whether it should
be backported.  I would not oppose.

Thanks for the patch!

Harald


Tobias
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201,
80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer:
Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München;
Registergericht München, HRB 106955




[pushed] fortran: Plug leak of associated_dummy memory. [PR108923]

2023-02-24 Thread Mikael Morin

Hello,

I have just pushed a for PR108923 (a memory leak).
It fixes the small reproducer that I pasted in bugzilla, and I have run 
it through the fortran regression testsuite.

More details in the patch.From 545a7d5da5fcc338e29c5241b574ac99d03f4454 Mon Sep 17 00:00:00 2001
From: Mikael Morin 
Date: Fri, 24 Feb 2023 22:11:17 +0100
Subject: [PATCH] fortran: Plug leak of associated_dummy memory. [PR108923]

This fixes a memory leak by accompanying the release of
gfc_actual_arglist elements' memory with a release of the
associated_dummy field memory (if allocated).
Actual argument copy is adjusted as well so that each copy can free
its field independently.

	PR fortran/108923

gcc/fortran/ChangeLog:

	* expr.cc (gfc_free_actual_arglist): Free associated_dummy
	memory.
	(gfc_copy_actual_arglist): Make a copy of the associated_dummy
	field if it is set in the original element.
---
 gcc/fortran/expr.cc | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/gcc/fortran/expr.cc b/gcc/fortran/expr.cc
index c295721b9d6..4662328bf31 100644
--- a/gcc/fortran/expr.cc
+++ b/gcc/fortran/expr.cc
@@ -545,6 +545,7 @@ gfc_free_actual_arglist (gfc_actual_arglist *a1)
   a2 = a1->next;
   if (a1->expr)
 	gfc_free_expr (a1->expr);
+  free (a1->associated_dummy);
   free (a1);
   a1 = a2;
 }
@@ -565,6 +566,12 @@ gfc_copy_actual_arglist (gfc_actual_arglist *p)
   new_arg = gfc_get_actual_arglist ();
   *new_arg = *p;
 
+  if (p->associated_dummy != NULL)
+	{
+	  new_arg->associated_dummy = gfc_get_dummy_arg ();
+	  *new_arg->associated_dummy = *p->associated_dummy;
+	}
+
   new_arg->expr = gfc_copy_expr (p->expr);
   new_arg->next = NULL;
 
-- 
2.39.1



[PATCH, committed] Fortran: frontend passes do_subscript leaks gmp memory [PR108924]

2023-02-24 Thread Harald Anlauf via Gcc-patches
Dear all,

as reported by Richard - although without a testcase - we leak
gmp memory in do_subscript().  The attached patch was derived
by inspection of the code pointed at by valgrind and regtested
on x86_64-pc-linux-gnu.

Committed as obvious as

commit r13-6336-g45f406c4f62e516b58dcda20b5a7aa43ff0aa0f3
Author: Harald Anlauf 
Date: Fri Feb 24 19:56:32 2023 +0100

Thanks,
Harald

From 45f406c4f62e516b58dcda20b5a7aa43ff0aa0f3 Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Fri, 24 Feb 2023 19:56:32 +0100
Subject: [PATCH] Fortran: frontend passes do_subscript leaks gmp memory
 [PR108924]

gcc/fortran/ChangeLog:

	PR fortran/108924
	* frontend-passes.cc (do_subscript): Clear used gmp variable.
---
 gcc/fortran/frontend-passes.cc | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/gcc/fortran/frontend-passes.cc b/gcc/fortran/frontend-passes.cc
index 02fcb41dbc4..90428982023 100644
--- a/gcc/fortran/frontend-passes.cc
+++ b/gcc/fortran/frontend-passes.cc
@@ -2883,7 +2883,10 @@ do_subscript (gfc_expr **e)
 		have_do_end = false;

 	  if (!have_do_start && !have_do_end)
-		return 0;
+		{
+		  mpz_clear (do_step);
+		  return 0;
+		}

 	  /* No warning inside a zero-trip loop.  */
 	  if (have_do_start && have_do_end)
--
2.35.3



[PATCH 2/2] testsuite: Fix gcc.dg/analyzer/allocation-size-multiline-3.c

2023-02-24 Thread Hans-Peter Nilsson via Gcc-patches
I'll install this as obvious provided that the prerequisite
multiline.exp patch is approved.
-- >8 --
For 32-bit newlib targets (such as cris-elf and pru-elf),
that int32_t is "long int".  See other regexps in the
testsuite matching "aka (long )?int" (with single-quotes
where needed) where the pattern in
allocation-size-multiline-3.c matches plain "int".  Uses the
special syntax recently introduced for multi-line patterns.

testsuite:
* gcc.dg/analyzer/allocation-size-multiline-3.c: Handle
int32_t being "long int".
---
 gcc/testsuite/gcc.dg/analyzer/allocation-size-multiline-3.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/analyzer/allocation-size-multiline-3.c 
b/gcc/testsuite/gcc.dg/analyzer/allocation-size-multiline-3.c
index e27364a8e839..b3de582368fc 100644
--- a/gcc/testsuite/gcc.dg/analyzer/allocation-size-multiline-3.c
+++ b/gcc/testsuite/gcc.dg/analyzer/allocation-size-multiline-3.c
@@ -21,7 +21,7 @@ void test_constant_99 (void)
 |  ^~
 |  |
 |  (1) allocated 99 bytes here
-|  (2) assigned to 'int32_t *' {aka 'int *'} here; 'sizeof 
(int32_t {aka int})' is '4'
+|  (2) assigned to 'int32_t *' {aka '{re:long :re?}int *'} 
here; 'sizeof (int32_t {aka {re:long :re?}int})' is '4'
 |
{ dg-end-multiline-output "" } */
 
@@ -39,6 +39,6 @@ void test_symbolic (int n)
 |  ^~
 |  |
 |  (1) allocated 'n * 2' bytes here
-|  (2) assigned to 'int32_t *' {aka 'int *'} here; 'sizeof 
(int32_t {aka int})' is '4'
+|  (2) assigned to 'int32_t *' {aka '{re:long :re?}int *'} 
here; 'sizeof (int32_t {aka {re:long :re?}int})' is '4'
 |
{ dg-end-multiline-output "" } */
-- 
2.30.2



[PATCH 1/2] testsuite: Provide means to regexp in multiline patterns

2023-02-24 Thread Hans-Peter Nilsson via Gcc-patches
Ok to commit?
-- >8 --
Those multi-line-patterns are literal.  Sometimes a regexp
needs to be matched.  This is a start: just three elements
are supported: "(" ")" and the compound ")?" (and on second
thought, it can be argued that "(...)" alone is not useful).
Note that Tcl "string map" is documented to have the desired
effect: a once-over but no re-recognitions of previously
replaced mapped elements.  Also, drop a doubled "containing".

testsuite:
* lib/multiline.exp (_build_multiline_regex): Map
"{re:" to "(", ":re}" to ")" and ":re?}" to ")?".
---
 gcc/testsuite/lib/multiline.exp | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/lib/multiline.exp b/gcc/testsuite/lib/multiline.exp
index 5eccf2bbebc1..f746bc3a618e 100644
--- a/gcc/testsuite/lib/multiline.exp
+++ b/gcc/testsuite/lib/multiline.exp
@@ -297,7 +297,7 @@ proc _get_lines { filename first_line last_line } {
 
 # Convert $multiline from a list of strings to a multiline regex
 # We need to support matching arbitrary followup text on each line,
-# to deal with comments containing containing DejaGnu directives.
+# to deal with comments containing DejaGnu directives.
 
 proc _build_multiline_regex { multiline index } {
 verbose "_build_multiline_regex: $multiline $index" 4
@@ -307,7 +307,10 @@ proc _build_multiline_regex { multiline index } {
verbose "  line: $line" 4
 
# We need to escape "^" and other regexp metacharacters.
-   set line [string map {"^" "\\^"
+   set line [string map {"\{re:" "("
+ ":re?\}" ")?"
+ ":re\}" ")"
+ "^" "\\^"
  "(" "\\("
  ")" "\\)"
  "[" "\\["
-- 
2.30.2



Re: [PATCH] testsuite: Don't include multiline regexps in the the pass/fail log

2023-02-24 Thread David Malcolm via Gcc-patches
On Fri, 2023-02-24 at 18:54 +0100, Hans-Peter Nilsson wrote:
> Ok to commit?

Looks good to me [1]

Thanks
Dave

[1] though FWIW although I wrote this code, my DejaGnu skills are weak
and I'm not a testsuite maintainer :-/

> 
> -- >8 --
> I see overlong lines in the output when a test fails, for
> example for a bug exposed for cris-elf and pru-elf in
> gcc.dg/analyzer/allocation-size-multiline-3.c:
> 
> Running /x/gcc/testsuite/gcc.dg/analyzer/analyzer.exp ...
> FAIL: gcc.dg/analyzer/allocation-size-multiline-3.c expected
> multiline pattern lines 16-25 not found: "\s*int32_t \*ptr = alloca
> \(99\);[^\n\r]*\n  \^~\n  'test_constant_99':
> events 1-2[^\n\r]*\n    \|[^\n\r]*\n    \|   int32_t \*ptr = alloca
> \(99\);[^\n\r]*\n    \|  \^~\n   
> \|  \|[^\n\r]*\n    \|  \(1\)
> allocated 99 bytes here[^\n\r]*\n    \|  \(2\)
> assigned to 'int32_t \*' \{aka 'int \*'\} here; 'sizeof \(int32_t
> \{aka int\}\)' is '4'[^\n\r]*\n    \|[^\n\r]*\n"
> FAIL: gcc.dg/analyzer/allocation-size-multiline-3.c expected
> multiline pattern lines 34-43 not found: "   int32_t \*ptr = alloca
> \(n \* 2\);[^\n\r]*\n  \^~\n  'test_symbolic':
> events 1-2[^\n\r]*\n    \|[^\n\r]*\n    \|   int32_t \*ptr = alloca
> \(n \* 2\);[^\n\r]*\n    \|  \^~\n   
> \|  \|[^\n\r]*\n    \|  \(1\)
> allocated 'n \* 2' bytes here[^\n\r]*\n    \|  \(2\)
> assigned to 'int32_t \*' \{aka 'int \*'\} here; 'sizeof \(int32_t
> \{aka int\}\)' is '4'[^\n\r]*\n    \|[^\n\r]*\n"
> FAIL: gcc.dg/analyzer/allocation-size-multiline-3.c (test for excess
> errors)
> 
> That multiline-regexp-quoted-on-a-single-line is redundant
> when also outputting "lines 16-25" and "lines 34-43".  It's
> also so noisy that it can be mistaken for a testsuite error.
> If there's a need to inspect it, it can be seen at
> verbose-level 4, i.e. persons interested in seeing it
> without editing sources can just add "-v -v -v -v".
> 
> Let's "prune" the regexp from regular output, instead producing:
> Running /x/gcc/testsuite/gcc.dg/analyzer/analyzer.exp ...
> FAIL: gcc.dg/analyzer/allocation-size-multiline-3.c expected
> multiline pattern lines 16-25 not found
> FAIL: gcc.dg/analyzer/allocation-size-multiline-3.c expected
> multiline pattern lines 34-43 not found
> FAIL: gcc.dg/analyzer/allocation-size-multiline-3.c (test for excess
> errors)
> 
> * lib/multiline.exp (handle-multiline-outputs): Don't include
> the
> quoted multiline regexp in the pass/fail output.
> ---
>  gcc/testsuite/lib/multiline.exp | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/testsuite/lib/multiline.exp
> b/gcc/testsuite/lib/multiline.exp
> index 84ba9216642e..5eccf2bbebc1 100644
> --- a/gcc/testsuite/lib/multiline.exp
> +++ b/gcc/testsuite/lib/multiline.exp
> @@ -169,9 +169,9 @@ proc handle-multiline-outputs { text } {
> # Use "regsub" to attempt to prune the pattern from $text
> if {[regsub -line $rexp $text "" text]} {
>     # The multiline pattern was pruned.
> -   ${maybe_x}pass "$title was found: \"$escaped_regex\""
> +   ${maybe_x}pass "$title was found"
> } else {
> -   ${maybe_x}fail "$title not found: \"$escaped_regex\""
> +   ${maybe_x}fail "$title not found"
> }
>  
> set index [expr $index + 1]



Re: [PATCH 6/8] libstdc++: Fix formatting

2023-02-24 Thread Matthias Kretz via Gcc-patches
On Friday, 24 February 2023 18:14:53 CET Jonathan Wakely wrote:
> Looks like there are a few remaining spaces that could be removed
> where you've joined lines, e.g.

Fixed and pushed.

> OK for trunk anyway (and the branches if you want).

I'll likely backport after I backported all other patches to trunk that came 
before this one.

-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 stdₓ::simd
──


[r13-6278 Regression] FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] [^\\n]* = foo\\.simdclone" 2 on Linux/x86_64

2023-02-24 Thread haochen.jiang via Gcc-patches
On Linux/x86_64,

3da77f217c8b2089ecba3eb201e727c3fcdcd19d is the first bad commit
commit 3da77f217c8b2089ecba3eb201e727c3fcdcd19d
Author: Andrew Stubbs 
Date:   Thu Jul 28 16:07:22 2022 +0100

vect: inbranch SIMD clones

caused

FAIL: gcc.dg/vect/vect-simd-clone-16.c scan-tree-dump-times vect "[\\n\\r] 
[^\\n]* = foo\\.simdclone" 2
FAIL: gcc.dg/vect/vect-simd-clone-16e.c scan-tree-dump-times vect "[\\n\\r] 
[^\\n]* = foo\\.simdclone" 3
FAIL: gcc.dg/vect/vect-simd-clone-16f.c scan-tree-dump-times vect "[\\n\\r] 
[^\\n]* = foo\\.simdclone" 2
FAIL: gcc.dg/vect/vect-simd-clone-17.c scan-tree-dump-times vect "[\\n\\r] 
[^\\n]* = foo\\.simdclone" 2
FAIL: gcc.dg/vect/vect-simd-clone-17e.c scan-tree-dump-times vect "[\\n\\r] 
[^\\n]* = foo\\.simdclone" 3
FAIL: gcc.dg/vect/vect-simd-clone-17f.c scan-tree-dump-times vect "[\\n\\r] 
[^\\n]* = foo\\.simdclone" 2
FAIL: gcc.dg/vect/vect-simd-clone-18.c scan-tree-dump-times vect "[\\n\\r] 
[^\\n]* = foo\\.simdclone" 2
FAIL: gcc.dg/vect/vect-simd-clone-18e.c scan-tree-dump-times vect "[\\n\\r] 
[^\\n]* = foo\\.simdclone" 3
FAIL: gcc.dg/vect/vect-simd-clone-18f.c scan-tree-dump-times vect "[\\n\\r] 
[^\\n]* = foo\\.simdclone" 2

with GCC configured with

../../gcc/configure 
--prefix=/export/users/haochenj/src/gcc-bisect/master/master/r13-6278/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-16.c 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-16.c 
--target_board='unix{-m64\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-16e.c 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-16e.c 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-16f.c 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-16f.c 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-16f.c 
--target_board='unix{-m64\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-17.c 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-17.c 
--target_board='unix{-m64\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-17e.c 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-17e.c 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-17f.c 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-17f.c 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-17f.c 
--target_board='unix{-m64\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-18.c 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-18.c 
--target_board='unix{-m64\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-18e.c 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-18e.c 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-18f.c 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-18f.c 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-18f.c 
--target_board='unix{-m64\ -march=cascadelake}'"

(Please do not reply to this email, for question about this report, contact me 
at haochen dot jiang at intel.com)


[V4][PATCH 2/2] Update documentation to clarify a GCC extension

2023-02-24 Thread Qing Zhao via Gcc-patches
on a structure with a C99 flexible array member being nested in
another structure.

"GCC extension accepts a structure containing an ISO C99 "flexible array
member", or a union containing such a structure (possibly recursively)
to be a member of a structure.

 There are two situations:

   * The structure with a C99 flexible array member is the last field of
 another structure, for example:

  struct flex  { int length; char data[]; };
  union union_flex { int others; struct flex f; };

  struct out_flex_struct { int m; struct flex flex_data; };
  struct out_flex_union { int n; union union_flex flex_data; };

 In the above, both 'out_flex_struct.flex_data.data[]' and
 'out_flex_union.flex_data.f.data[]' are considered as flexible
 arrays too.

   * The structure with a C99 flexible array member is the middle field
 of another structure, for example:

  struct flex  { int length; char data[]; };

  struct mid_flex { int m; struct flex flex_data; int n; };

 In the above, 'mid_flex.flex_data.data[]' is allowed to be extended
 flexibly to the padding.  E.g, up to 4 elements.

 However, relying on space in struct padding is a bad programming
 practice, compilers do not handle such extension consistently, Any
 code relying on this behavior should be modified to ensure that
 flexible array members only end up at the ends of structures.

 Please use warning option '-Wgnu-variable-sized-type-not-at-end' to
 identify all such cases in the source code and modify them.  This
 extension will be deprecated from gcc in the next release.
"

gcc/c-family/ChangeLog:

* c.opt: New option -Wgnu-variable-sized-type-not-at-end.

gcc/c/ChangeLog:

* c-decl.cc (finish_struct): Issue warnings for new option.

gcc/ChangeLog:

* doc/extend.texi: Document GCC extension on a structure containing
a flexible array member to be a member of another structure.

gcc/testsuite/ChangeLog:

* gcc.dg/variable-sized-type-flex-array.c: New test.
---
 gcc/c-family/c.opt|  5 ++
 gcc/c/c-decl.cc   |  7 +++
 gcc/doc/extend.texi   | 48 ++-
 .../gcc.dg/variable-sized-type-flex-array.c   | 31 
 4 files changed, 90 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/variable-sized-type-flex-array.c

diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index cddeece..660ac07f3d4 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -737,6 +737,11 @@ Wformat-truncation=
 C ObjC C++ LTO ObjC++ Joined RejectNegative UInteger Var(warn_format_trunc) 
Warning LangEnabledBy(C ObjC C++ LTO ObjC++,Wformat=, warn_format >= 1, 0) 
IntegerRange(0, 2)
 Warn about calls to snprintf and similar functions that truncate output.
 
+Wgnu-variable-sized-type-not-at-end
+C C++ Var(warn_variable_sized_type_not_at_end) Warning
+Warn about structures or unions with C99 flexible array members are not
+at the end of a structure.
+
 Wif-not-aligned
 C ObjC C++ ObjC++ Var(warn_if_not_aligned) Init(1) Warning
 Warn when the field in a struct is not aligned.
diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index f589a2f5192..c5b54f07965 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -9296,6 +9296,13 @@ finish_struct (location_t loc, tree t, tree fieldlist, 
tree attributes,
   && is_last_field)
TYPE_INCLUDE_FLEXARRAY (t) = true;
 
+  if (warn_variable_sized_type_not_at_end
+ && !is_last_field
+ && TYPE_INCLUDE_FLEXARRAY (TREE_TYPE (x)))
+   warning_at (DECL_SOURCE_LOCATION (x),
+   OPT_Wgnu_variable_sized_type_not_at_end,
+   "variable sized type not at the end of a struct");
+
   if (DECL_NAME (x)
  || RECORD_OR_UNION_TYPE_P (TREE_TYPE (x)))
saw_named_field = true;
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index c1122916255..e278148c332 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -1748,7 +1748,53 @@ Flexible array members may only appear as the last 
member of a
 A structure containing a flexible array member, or a union containing
 such a structure (possibly recursively), may not be a member of a
 structure or an element of an array.  (However, these uses are
-permitted by GCC as extensions.)
+permitted by GCC as extensions, see details below.)
+@end itemize
+
+GCC extension accepts a structure containing an ISO C99 @dfn{flexible array
+member}, or a union containing such a structure (possibly recursively)
+to be a member of a structure.
+
+There are two situations:
+
+@itemize @bullet
+@item
+The structure with a C99 flexible array member is the last field of another
+structure, for example:
+
+@smallexample
+struct flex  @{ int length; char data[]; @};
+union union_flex @{ int others; struct flex f; @};
+
+struct out_flex_struct @{ int m; struct flex flex_data; @};
+struct 

[v4][PATCH 1/2] Handle component_ref to a structre/union field including C99 FAM [PR101832]

2023-02-24 Thread Qing Zhao via Gcc-patches
GCC extension accepts the case when a struct with a C99 flexible array member
is embedded into another struct or union (possibly recursively).
__builtin_object_size should treat such struct as flexible size.

gcc/c/ChangeLog:

PR tree-optimization/101832
* c-decl.cc (finish_struct): Set TYPE_INCLUDE_FLEXARRAY for
struct/union type.

gcc/cp/ChangeLog:

PR tree-optimization/101832
* module.cc (trees_out::core_bools): Stream out new bit
type_include_flexarray.
(trees_in::core_bools): Stream in new bit type_include_flexarray.

gcc/ChangeLog:

PR tree-optimization/101832
* print-tree.cc (print_node): Print new bit type_include_flexarray.
* tree-core.h (struct tree_type_common): New bit
type_include_flexarray.
* tree-object-size.cc (addr_object_size): Handle structure/union type
when it has flexible size.
* tree-streamer-in.cc (unpack_ts_type_common_value_fields): Stream
in new bit type_include_flexarray.
* tree-streamer-out.cc (pack_ts_type_common_value_fields): Stream
out new bit type_include_flexarray.
* tree.h (TYPE_INCLUDE_FLEXARRAY): New macro
TYPE_INCLUDE_FLEXARRAY.

gcc/testsuite/ChangeLog:

PR tree-optimization/101832
* gcc.dg/builtin-object-size-pr101832.c: New test.
---
 gcc/c/c-decl.cc   |  12 ++
 gcc/cp/module.cc  |   2 +
 gcc/print-tree.cc |   5 +
 .../gcc.dg/builtin-object-size-pr101832.c | 134 ++
 gcc/tree-core.h   |   4 +-
 gcc/tree-object-size.cc   |  79 +++
 gcc/tree-streamer-in.cc   |   1 +
 gcc/tree-streamer-out.cc  |   1 +
 gcc/tree.h|   6 +
 9 files changed, 215 insertions(+), 29 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/builtin-object-size-pr101832.c

diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index 08078eadeb8..f589a2f5192 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -9284,6 +9284,18 @@ finish_struct (location_t loc, tree t, tree fieldlist, 
tree attributes,
   /* Set DECL_NOT_FLEXARRAY flag for FIELD_DECL x.  */
   DECL_NOT_FLEXARRAY (x) = !is_flexible_array_member_p (is_last_field, x);
 
+  /* Set TYPE_INCLUDE_FLEXARRAY for the context of x, t
+   * when x is an array.  */
+  if (TREE_CODE (TREE_TYPE (x)) == ARRAY_TYPE)
+   TYPE_INCLUDE_FLEXARRAY (t) = flexible_array_member_type_p (TREE_TYPE 
(x)) ;
+  /* Recursively set TYPE_INCLUDE_FLEXARRAY for the context of x, t
+when x is the last field.  */
+  else if ((TREE_CODE (TREE_TYPE (x)) == RECORD_TYPE
+   || TREE_CODE (TREE_TYPE (x)) == UNION_TYPE)
+  && TYPE_INCLUDE_FLEXARRAY (TREE_TYPE (x))
+  && is_last_field)
+   TYPE_INCLUDE_FLEXARRAY (t) = true;
+
   if (DECL_NAME (x)
  || RECORD_OR_UNION_TYPE_P (TREE_TYPE (x)))
saw_named_field = true;
diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
index ac2fe66b080..c750361b704 100644
--- a/gcc/cp/module.cc
+++ b/gcc/cp/module.cc
@@ -5371,6 +5371,7 @@ trees_out::core_bools (tree t)
   WB (t->type_common.lang_flag_5);
   WB (t->type_common.lang_flag_6);
   WB (t->type_common.typeless_storage);
+  WB (t->type_common.type_include_flexarray);
 }
 
   if (CODE_CONTAINS_STRUCT (code, TS_DECL_COMMON))
@@ -5551,6 +5552,7 @@ trees_in::core_bools (tree t)
   RB (t->type_common.lang_flag_5);
   RB (t->type_common.lang_flag_6);
   RB (t->type_common.typeless_storage);
+  RB (t->type_common.type_include_flexarray);
 }
 
   if (CODE_CONTAINS_STRUCT (code, TS_DECL_COMMON))
diff --git a/gcc/print-tree.cc b/gcc/print-tree.cc
index 1f3afcbbc86..efacdb7686f 100644
--- a/gcc/print-tree.cc
+++ b/gcc/print-tree.cc
@@ -631,6 +631,11 @@ print_node (FILE *file, const char *prefix, tree node, int 
indent,
  && TYPE_CXX_ODR_P (node))
fputs (" cxx-odr-p", file);
 
+  if ((code == RECORD_TYPE
+  || code == UNION_TYPE)
+ && TYPE_INCLUDE_FLEXARRAY (node))
+   fputs (" include-flexarray", file);
+
   /* The transparent-union flag is used for different things in
 different nodes.  */
   if ((code == UNION_TYPE || code == RECORD_TYPE)
diff --git a/gcc/testsuite/gcc.dg/builtin-object-size-pr101832.c 
b/gcc/testsuite/gcc.dg/builtin-object-size-pr101832.c
new file mode 100644
index 000..60078e11634
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/builtin-object-size-pr101832.c
@@ -0,0 +1,134 @@
+/* PR 101832: 
+   GCC extension accepts the case when a struct with a C99 flexible array
+   member is embedded into another struct (possibly recursively).
+   __builtin_object_size will treat such struct as flexible size.
+   However, when a structure with non-C99 flexible array member, i.e, trailing
+   [0], [1], or [4], 

[V4][PATCH 0/2] Handle component_ref to a structure/union field including FAM for builtin_object_size

2023-02-24 Thread Qing Zhao via Gcc-patches
Hi, Joseph and Richard,

Could you please review this patch and let me know whether it???s ready
 for committing into GCC13?

The fix to Bug PR101832 is an important patch for kernel security
purpose. it's better to be put into GCC13.

=

These are the 4th version of the patches for PR101832, to fix
builtin_object_size to correctly handle component_ref to a
structure/union field that includes a flexible array member.

also includes a documentation update for the GCC extension on embedding
a structure/union with flexible array member into another structure.
which includes a fix to PR77650.

compared to the 3rd version of the patch, the major changes are:

1. update the documentation part per Joseph's comments.

compared to the 2nd version of the patch, the major changes are:

1. only include C99 flexible array member to this extension, trailing [0], [1]
  and [4] are all excluded.
2. for the new bit type_include_flexarray in tree_type_common, print it
  and also stream in/out it. 
3. update testing cases.
4. more clarification on the documentation. warnings for deprecating the 
  case when the structure with C99 FAM is embedded in the middle of
  another structure. 
5. add a new warning option -Wgnu-variable-sized-type-not-at-end for
  identifing all such cases.

bootstrapped and regression tested on aarch64 and x86.

Okay for commit?

thanks.

Qing

Qing Zhao (2):
  Handle component_ref to a structre/union field including C99 FAM
[PR101832]
  Update documentation to clarify a GCC extension

 gcc/c-family/c.opt|   5 +
 gcc/c/c-decl.cc   |  19 +++
 gcc/cp/module.cc  |   2 +
 gcc/doc/extend.texi   |  48 ++-
 gcc/print-tree.cc |   5 +
 .../gcc.dg/builtin-object-size-pr101832.c | 134 ++
 .../gcc.dg/variable-sized-type-flex-array.c   |  31 
 gcc/tree-core.h   |   4 +-
 gcc/tree-object-size.cc   |  79 +++
 gcc/tree-streamer-in.cc   |   1 +
 gcc/tree-streamer-out.cc  |   1 +
 gcc/tree.h|   6 +
 12 files changed, 305 insertions(+), 30 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/builtin-object-size-pr101832.c
 create mode 100644 gcc/testsuite/gcc.dg/variable-sized-type-flex-array.c

-- 
2.31.1



[PATCH] testsuite: Don't include multiline regexps in the the pass/fail log

2023-02-24 Thread Hans-Peter Nilsson via Gcc-patches
Ok to commit?

-- >8 --
I see overlong lines in the output when a test fails, for
example for a bug exposed for cris-elf and pru-elf in
gcc.dg/analyzer/allocation-size-multiline-3.c:

Running /x/gcc/testsuite/gcc.dg/analyzer/analyzer.exp ...
FAIL: gcc.dg/analyzer/allocation-size-multiline-3.c expected multiline pattern 
lines 16-25 not found: "\s*int32_t \*ptr = alloca \(99\);[^\n\r]*\n 
 \^~\n  'test_constant_99': events 1-2[^\n\r]*\n\|[^\n\r]*\n\|  
 int32_t \*ptr = alloca \(99\);[^\n\r]*\n\|  \^~\n
\|  \|[^\n\r]*\n\|  \(1\) allocated 99 
bytes here[^\n\r]*\n\|  \(2\) assigned to 'int32_t \*' 
\{aka 'int \*'\} here; 'sizeof \(int32_t \{aka int\}\)' is '4'[^\n\r]*\n
\|[^\n\r]*\n"
FAIL: gcc.dg/analyzer/allocation-size-multiline-3.c expected multiline pattern 
lines 34-43 not found: "   int32_t \*ptr = alloca \(n \* 2\);[^\n\r]*\n 
 \^~\n  'test_symbolic': events 1-2[^\n\r]*\n\|[^\n\r]*\n\| 
  int32_t \*ptr = alloca \(n \* 2\);[^\n\r]*\n\|  \^~\n 
   \|  \|[^\n\r]*\n\|  \(1\) allocated 'n 
\* 2' bytes here[^\n\r]*\n\|  \(2\) assigned to 'int32_t 
\*' \{aka 'int \*'\} here; 'sizeof \(int32_t \{aka int\}\)' is '4'[^\n\r]*\n
\|[^\n\r]*\n"
FAIL: gcc.dg/analyzer/allocation-size-multiline-3.c (test for excess errors)

That multiline-regexp-quoted-on-a-single-line is redundant
when also outputting "lines 16-25" and "lines 34-43".  It's
also so noisy that it can be mistaken for a testsuite error.
If there's a need to inspect it, it can be seen at
verbose-level 4, i.e. persons interested in seeing it
without editing sources can just add "-v -v -v -v".

Let's "prune" the regexp from regular output, instead producing:
Running /x/gcc/testsuite/gcc.dg/analyzer/analyzer.exp ...
FAIL: gcc.dg/analyzer/allocation-size-multiline-3.c expected multiline pattern 
lines 16-25 not found
FAIL: gcc.dg/analyzer/allocation-size-multiline-3.c expected multiline pattern 
lines 34-43 not found
FAIL: gcc.dg/analyzer/allocation-size-multiline-3.c (test for excess errors)

* lib/multiline.exp (handle-multiline-outputs): Don't include the
quoted multiline regexp in the pass/fail output.
---
 gcc/testsuite/lib/multiline.exp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/lib/multiline.exp b/gcc/testsuite/lib/multiline.exp
index 84ba9216642e..5eccf2bbebc1 100644
--- a/gcc/testsuite/lib/multiline.exp
+++ b/gcc/testsuite/lib/multiline.exp
@@ -169,9 +169,9 @@ proc handle-multiline-outputs { text } {
# Use "regsub" to attempt to prune the pattern from $text
if {[regsub -line $rexp $text "" text]} {
# The multiline pattern was pruned.
-   ${maybe_x}pass "$title was found: \"$escaped_regex\""
+   ${maybe_x}pass "$title was found"
} else {
-   ${maybe_x}fail "$title not found: \"$escaped_regex\""
+   ${maybe_x}fail "$title not found"
}
 
set index [expr $index + 1]
-- 
2.30.2



Re: [PATCH 6/8] libstdc++: Fix formatting

2023-02-24 Thread Jonathan Wakely via Gcc-patches
On Thu, 23 Feb 2023 at 08:54, Matthias Kretz via Libstdc++
 wrote:
>
>
>
> Whitespace changes only.

Looks like there are a few remaining spaces that could be removed
where you've joined lines, e.g.

+{ return static_cast<_Up*>( __builtin_assume_aligned(__ptr,
_S_alignment<_Tp, _Up>)); }

and

+  { return __vector_type_t<_Tp, _Np>{
static_cast<_Tp>(__gen(_SizeConstant<_I>()))...}; }

OK for trunk anyway (and the branches if you want).


>
> Signed-off-by: Matthias Kretz 
>
> libstdc++-v3/ChangeLog:
>
> * include/experimental/bits/simd.h: Line breaks and indenting
> fixed to follow the libstdc++ standard.
> * include/experimental/bits/simd_builtin.h: Likewise.
> * include/experimental/bits/simd_fixed_size.h: Likewise.
> * include/experimental/bits/simd_neon.h: Likewise.
> * include/experimental/bits/simd_ppc.h: Likewise.
> * include/experimental/bits/simd_scalar.h: Likewise.
> * include/experimental/bits/simd_x86.h: Likewise.
> ---
>  libstdc++-v3/include/experimental/bits/simd.h | 473 ++--
>  .../include/experimental/bits/simd_builtin.h  | 692 +-
>  .../experimental/bits/simd_fixed_size.h   | 228 +++---
>  .../include/experimental/bits/simd_neon.h |  24 +-
>  .../include/experimental/bits/simd_ppc.h  |   3 +-
>  .../include/experimental/bits/simd_scalar.h   | 362 +
>  .../include/experimental/bits/simd_x86.h  |  90 ++-
>  7 files changed, 942 insertions(+), 930 deletions(-)
>
>
> --
> ──
>  Dr. Matthias Kretz   https://mattkretz.github.io
>  GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
>  stdₓ::simd
> ──


Re: [PATCH 5/8] libstdc++: Always-inline most of non-cmath fixed_size implementation

2023-02-24 Thread Jonathan Wakely via Gcc-patches
On Thu, 23 Feb 2023 at 08:54, Matthias Kretz via Libstdc++
 wrote:
>
>
>
> For simd, the inlining behavior should be similar to builtin types. (No
> operator on buitin types is ever translated into a function call.)
> Therefore, always_inline is the right choice (i.e. inline on -O0 as
> well).

OK for trunk (and OK for backport later if no problems show up from
the extra inlining).


>
> Signed-off-by: Matthias Kretz 
>
> libstdc++-v3/ChangeLog:
>
> PR libstdc++/108030
> * include/experimental/bits/simd_fixed_size.h
> (_SimdImplFixedSize::_S_broadcast): Replace inline with
> _GLIBCXX_SIMD_INTRINSIC.
> (_SimdImplFixedSize::_S_generate): Likewise.
> (_SimdImplFixedSize::_S_load): Likewise.
> (_SimdImplFixedSize::_S_masked_load): Likewise.
> (_SimdImplFixedSize::_S_store): Likewise.
> (_SimdImplFixedSize::_S_masked_store): Likewise.
> (_SimdImplFixedSize::_S_min): Likewise.
> (_SimdImplFixedSize::_S_max): Likewise.
> (_SimdImplFixedSize::_S_complement): Likewise.
> (_SimdImplFixedSize::_S_unary_minus): Likewise.
> (_SimdImplFixedSize::_S_plus): Likewise.
> (_SimdImplFixedSize::_S_minus): Likewise.
> (_SimdImplFixedSize::_S_multiplies): Likewise.
> (_SimdImplFixedSize::_S_divides): Likewise.
> (_SimdImplFixedSize::_S_modulus): Likewise.
> (_SimdImplFixedSize::_S_bit_and): Likewise.
> (_SimdImplFixedSize::_S_bit_or): Likewise.
> (_SimdImplFixedSize::_S_bit_xor): Likewise.
> (_SimdImplFixedSize::_S_bit_shift_left): Likewise.
> (_SimdImplFixedSize::_S_bit_shift_right): Likewise.
> (_SimdImplFixedSize::_S_remquo): Add inline keyword (to be
> explicit about not always-inline, yet).
> (_SimdImplFixedSize::_S_isinf): Likewise.
> (_SimdImplFixedSize::_S_isfinite): Likewise.
> (_SimdImplFixedSize::_S_isnan): Likewise.
> (_SimdImplFixedSize::_S_isnormal): Likewise.
> (_SimdImplFixedSize::_S_signbit): Likewise.
> ---
>  .../experimental/bits/simd_fixed_size.h   | 60 +--
>  1 file changed, 30 insertions(+), 30 deletions(-)
>
>
> --
> ──
>  Dr. Matthias Kretz   https://mattkretz.github.io
>  GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
>  stdₓ::simd
> ──



Re: [PATCH 3/8] libstdc++: More efficient masked inc-/decrement implementation

2023-02-24 Thread Jonathan Wakely via Gcc-patches
On Thu, 23 Feb 2023 at 08:55, Matthias Kretz via Libstdc++
 wrote:
>


OK for trunk (and maybe backport later if you want to).


>
> Signed-off-by: Matthias Kretz 
>
> libstdc++-v3/ChangeLog:
>
> PR libstdc++/108856
> * include/experimental/bits/simd_builtin.h
> (_SimdImplBuiltin::_S_masked_unary): More efficient
> implementation of masked inc-/decrement for integers and floats
> without AVX2.
> * include/experimental/bits/simd_x86.h
> (_SimdImplX86::_S_masked_unary): New. Use AVX512 masked subtract
> builtins for masked inc-/decrement.
> ---
>  .../include/experimental/bits/simd_builtin.h  | 27 +++-
>  .../include/experimental/bits/simd_x86.h  | 68 +++
>  2 files changed, 93 insertions(+), 2 deletions(-)
>
>
> --
> ──
>  Dr. Matthias Kretz   https://mattkretz.github.io
>  GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
>  stdₓ::simd
> ──


[PATCH] testsuite: Add -fno-ivopts to gcc.dg/Wuse-after-free-2.c, PR108828

2023-02-24 Thread Hans-Peter Nilsson via Gcc-patches
Ok to commit?

I suggest that when committed I'll also set the bugzilla
entry in SUSPENDED mode, as opposed to RESOLVED.  I mean,
the issue isn't really solved; that'd be a patch improving
pointer tracking across ivopts.

-- >8 --
For cris-elf before this patch, ever since it was added,
this test gets:

Running /x/gcc/testsuite/gcc.dg/dg.exp ...
FAIL: gcc.dg/Wuse-after-free-2.c  (test for warnings, line 115)
FAIL: gcc.dg/Wuse-after-free-2.c  (test for warnings, line 116)

and comparing tree dumps with a native x86_64-pc-linux-gnu
run shows a suspicious difference in the "180t.ivopts" dump.
Indeed -fno-ivopts makes the warning appear for cris-elf
too.  It was suggested to simply add -fno-ivopts to the
test-flags, like before -fno-tree-loop-distribute-patterns
was added; thus.

PR tree-optimization/108828
* gcc.dg/Wuse-after-free-2.c: Add -fno-ivopts.
---
 gcc/testsuite/gcc.dg/Wuse-after-free-2.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/Wuse-after-free-2.c 
b/gcc/testsuite/gcc.dg/Wuse-after-free-2.c
index 68ec75845cec..ebc051690db5 100644
--- a/gcc/testsuite/gcc.dg/Wuse-after-free-2.c
+++ b/gcc/testsuite/gcc.dg/Wuse-after-free-2.c
@@ -1,6 +1,6 @@
 /* PR middle-end/104232 - spurious -Wuse-after-free after conditional free
{ dg-do compile }
-   { dg-options "-O2 -Wall -fno-tree-loop-distribute-patterns" }  */
+   { dg-options "-O2 -Wall -fno-tree-loop-distribute-patterns -fno-ivopts" }  
*/
 
 void free (void*);
 
@@ -108,7 +108,8 @@ int warn_cond_loop (char *p)
   char *q = p;
 
   /*  -fno-tree-loop-distribute-patterns ensures this does not get converted
-  into rawmemchr (making q and p unrelated).  */
+  into rawmemchr (making q and p unrelated).  Also, -fno-ivopts is required
+  for some targets, to not lose track of the pointer.  */
   while (*q)
 ++q;
 
-- 
2.30.2



Re: [PATCH v3 09/11] riscv: thead: Add support for the XTheadMemPair ISA extension

2023-02-24 Thread Kito Cheng via Gcc-patches
Could you move those thead_* and th_* functions into thead.cc

> +static bool
> +thead_mempair_operand_p (rtx mem, machine_mode mode)
> +{
> +  if (!MEM_SIZE_KNOWN_P (mem))
> +return false;
> +
> +  /* Only DI or SI mempair instructions exist.  */

add gcc_assert (mode == SImode || mode == DImode); here


Re: [PATCH] asan: adjust module name for global variables

2023-02-24 Thread Martin Liška
On 2/24/23 10:07, Jakub Jelinek wrote:
> On Fri, Feb 24, 2023 at 10:00:01AM +0100, Martin Liška wrote:
>> As mentioned in the PR, when we use LTO, we wrongly use ltrans output
>> file name as a module name of a global variable. That leads to a
>> non-reproducible output.
>>
>> After the suggested change, we emit context name of normal global
>> variables. And for artificial variables (like .Lubsan_data3), we use
>> aux_base_name (e.g. "./a.ltrans0.ltrans").
>>
>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>>
>> Ready to be installed?
>> Thanks,
>> Martin
>>
>>  PR asan/108834
>>
>> gcc/ChangeLog:
>>
>>  * asan.cc (asan_add_global): Use proper TU name for normal
>>global variables (and aux_base_name for the artificial one).
>>
>> gcc/testsuite/ChangeLog:
>>
>>  * c-c++-common/asan/global-overflow-1.c: Test line and column
>>  info for a global variable.
>> ---
>>  gcc/asan.cc | 7 ++-
>>  gcc/testsuite/c-c++-common/asan/global-overflow-1.c | 2 +-
>>  2 files changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/gcc/asan.cc b/gcc/asan.cc
>> index f56d084bc7a..245abb14388 100644
>> --- a/gcc/asan.cc
>> +++ b/gcc/asan.cc
>> @@ -3287,7 +3287,12 @@ asan_add_global (tree decl, tree type, 
>> vec *v)
>>  pp_string (_pp, "");
>>str_cst = asan_pp_string (_pp);
>>  
>> -  pp_string (_name_pp, main_input_filename);
>> +  const_tree tu = get_ultimate_context ((const_tree)decl);
>> +  if (tu != NULL_TREE)
>> +pp_string (_name_pp, IDENTIFIER_POINTER (DECL_NAME (tu)));
>> +  else
>> +pp_string (_name_pp, aux_base_name);
> 
> I think for !in_lto_p we don't need to bother with get_ultimate_context
> and should just use main_input_filename as before.

All right, pushed with that change.

Thanks,
Martin

> 
> Otherwise LGTM.
> 
>   Jakub
> 



[PATCH] RISC-V: Add testcase for VSETVL PASS

2023-02-24 Thread juzhe . zhong
From: Ju-Zhe Zhong 

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/scalar_move-1.c: New test.
* gcc.target/riscv/rvv/base/scalar_move-2.c: New test.
* gcc.target/riscv/rvv/base/scalar_move-3.c: New test.
* gcc.target/riscv/rvv/base/scalar_move-4.c: New test.
* gcc.target/riscv/rvv/base/scalar_move-5.c: New test.
* gcc.target/riscv/rvv/base/scalar_move-6.c: New test.
* gcc.target/riscv/rvv/base/scalar_move-7.c: New test.
* gcc.target/riscv/rvv/base/scalar_move-8.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-100.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-101.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-78.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-79.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-80.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-81.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-82.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-83.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-84.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-85.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-86.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-87.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-88.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-89.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-90.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-91.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-92.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-93.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-94.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-95.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-96.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-97.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-98.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-99.c: New test.

---
 .../gcc.target/riscv/rvv/base/scalar_move-1.c |  75 +++
 .../gcc.target/riscv/rvv/base/scalar_move-2.c |  62 ++
 .../gcc.target/riscv/rvv/base/scalar_move-3.c |  58 +
 .../gcc.target/riscv/rvv/base/scalar_move-4.c |  54 +
 .../gcc.target/riscv/rvv/base/scalar_move-5.c | 176 +++
 .../gcc.target/riscv/rvv/base/scalar_move-6.c | 209 ++
 .../gcc.target/riscv/rvv/base/scalar_move-7.c | 176 +++
 .../gcc.target/riscv/rvv/base/scalar_move-8.c | 201 +
 .../riscv/rvv/vsetvl/avl_single-100.c |  25 +++
 .../riscv/rvv/vsetvl/avl_single-101.c |  29 +++
 .../riscv/rvv/vsetvl/avl_single-78.c  |  24 ++
 .../riscv/rvv/vsetvl/avl_single-79.c  |  22 ++
 .../riscv/rvv/vsetvl/avl_single-80.c  |  22 ++
 .../riscv/rvv/vsetvl/avl_single-81.c  |  23 ++
 .../riscv/rvv/vsetvl/avl_single-82.c  |  30 +++
 .../riscv/rvv/vsetvl/avl_single-83.c  |  31 +++
 .../riscv/rvv/vsetvl/avl_single-84.c  |  23 ++
 .../riscv/rvv/vsetvl/avl_single-85.c  |  22 ++
 .../riscv/rvv/vsetvl/avl_single-86.c  |  29 +++
 .../riscv/rvv/vsetvl/avl_single-87.c  |  30 +++
 .../riscv/rvv/vsetvl/avl_single-88.c  |  29 +++
 .../riscv/rvv/vsetvl/avl_single-89.c  |  31 +++
 .../riscv/rvv/vsetvl/avl_single-90.c  |  30 +++
 .../riscv/rvv/vsetvl/avl_single-91.c  |  33 +++
 .../riscv/rvv/vsetvl/avl_single-92.c  |  26 +++
 .../riscv/rvv/vsetvl/avl_single-93.c  |  21 ++
 .../riscv/rvv/vsetvl/avl_single-94.c  |  20 ++
 .../riscv/rvv/vsetvl/avl_single-95.c  |  20 ++
 .../riscv/rvv/vsetvl/avl_single-96.c  |  21 ++
 .../riscv/rvv/vsetvl/avl_single-97.c  |  22 ++
 .../riscv/rvv/vsetvl/avl_single-98.c  |  25 +++
 .../riscv/rvv/vsetvl/avl_single-99.c  |  23 ++
 32 files changed, 1622 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/scalar_move-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/scalar_move-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/scalar_move-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/scalar_move-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/scalar_move-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/scalar_move-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/scalar_move-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/scalar_move-8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-100.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-101.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-78.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-79.c
 create mode 100644 

[PATCH] RISC-V: Add scalar move support and fix VSETVL bugs

2023-02-24 Thread juzhe . zhong
From: Ju-Zhe Zhong 

gcc/ChangeLog:

* config/riscv/constraints.md (Wb1): New constraint.
* config/riscv/predicates.md 
(vector_least_significant_set_mask_operand): New predicate.
(vector_broadcast_mask_operand): Ditto.
* config/riscv/riscv-protos.h (enum vlmul_type): Adjust.
(gen_scalar_move_mask): New function.
* config/riscv/riscv-v.cc (gen_scalar_move_mask): Ditto.
* config/riscv/riscv-vector-builtins-bases.cc (class vmv): New class.
(class vmv_s): Ditto.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def (vmv_x): Ditto.
(vmv_s): Ditto.
(vfmv_f): Ditto.
(vfmv_s): Ditto.
* config/riscv/riscv-vector-builtins-shapes.cc (struct 
scalar_move_def): Ditto.
(SHAPE): Ditto.
* config/riscv/riscv-vector-builtins-shapes.h: Ditto.
* config/riscv/riscv-vector-builtins.cc (function_expander::mask_mode): 
Ditto.
(function_expander::use_exact_insn): New function.
(function_expander::use_contiguous_load_insn): New function.
(function_expander::use_contiguous_store_insn): New function.
(function_expander::use_ternop_insn): New function.
(function_expander::use_widen_ternop_insn): New function.
(function_expander::use_scalar_move_insn): New function.
* config/riscv/riscv-vector-builtins.def (s): New operand suffix.
* config/riscv/riscv-vector-builtins.h 
(function_expander::add_scalar_move_mask_operand): New class.
* config/riscv/riscv-vsetvl.cc (ignore_vlmul_insn_p): New function.
(scalar_move_insn_p): Ditto.
(has_vsetvl_killed_avl_p): Ditto.
(anticipatable_occurrence_p): Ditto.
(insert_vsetvl): Ditto.
(get_vl_vtype_info): Ditto.
(calculate_sew): Ditto.
(calculate_vlmul): Ditto.
(incompatible_avl_p): Ditto.
(different_sew_p): Ditto.
(different_lmul_p): Ditto.
(different_ratio_p): Ditto.
(different_tail_policy_p): Ditto.
(different_mask_policy_p): Ditto.
(possible_zero_avl_p): Ditto.
(first_ratio_invalid_for_second_sew_p): Ditto.
(first_ratio_invalid_for_second_lmul_p): Ditto.
(second_ratio_invalid_for_first_sew_p): Ditto.
(second_ratio_invalid_for_first_lmul_p): Ditto.
(second_sew_less_than_first_sew_p): Ditto.
(first_sew_less_than_second_sew_p): Ditto.
(compare_lmul): Ditto.
(second_lmul_less_than_first_lmul_p): Ditto.
(first_lmul_less_than_second_lmul_p): Ditto.
(first_ratio_less_than_second_ratio_p): Ditto.
(second_ratio_less_than_first_ratio_p): Ditto.
(DEF_INCOMPATIBLE_COND): Ditto.
(greatest_sew): Ditto.
(first_sew): Ditto.
(second_sew): Ditto.
(first_vlmul): Ditto.
(second_vlmul): Ditto.
(first_ratio): Ditto.
(second_ratio): Ditto.
(vlmul_for_first_sew_second_ratio): Ditto.
(ratio_for_second_sew_first_vlmul): Ditto.
(DEF_SEW_LMUL_FUSE_RULE): Ditto.
(always_unavailable): Ditto.
(avl_unavailable_p): Ditto.
(sew_unavailable_p): Ditto.
(lmul_unavailable_p): Ditto.
(ge_sew_unavailable_p): Ditto.
(ge_sew_lmul_unavailable_p): Ditto.
(ge_sew_ratio_unavailable_p): Ditto.
(DEF_UNAVAILABLE_COND): Ditto.
(same_sew_lmul_demand_p): Ditto.
(propagate_avl_across_demands_p): Ditto.
(reg_available_p): Ditto.
(avl_info::has_non_zero_avl): Ditto.
(vl_vtype_info::has_non_zero_avl): Ditto.
(vector_insn_info::operator>=): Refactor.
(vector_insn_info::parse_insn): Adjust for scalar move.
(vector_insn_info::demand_vl_vtype): Remove.
(vector_insn_info::compatible_p): New function.
(vector_insn_info::compatible_avl_p): Ditto.
(vector_insn_info::compatible_vtype_p): Ditto.
(vector_insn_info::available_p): Ditto.
(vector_insn_info::merge): Ditto.
(vector_insn_info::fuse_avl): Ditto.
(vector_insn_info::fuse_sew_lmul): Ditto.
(vector_insn_info::fuse_tail_policy): Ditto.
(vector_insn_info::fuse_mask_policy): Ditto.
(vector_insn_info::dump): Ditto.
(vector_infos_manager::release): Ditto.
(pass_vsetvl::compute_local_backward_infos): Adjust for scalar move 
support.
(pass_vsetvl::get_backward_fusion_type): Adjust for scalar move support.
(pass_vsetvl::hard_empty_block_p): Ditto.
(pass_vsetvl::backward_demand_fusion): Ditto.
(pass_vsetvl::forward_demand_fusion): Ditto.
(pass_vsetvl::refine_vsetvls): Ditto.
(pass_vsetvl::cleanup_vsetvls): Ditto.
(pass_vsetvl::commit_vsetvls): Ditto.
(pass_vsetvl::propagate_avl): Ditto.
* config/riscv/riscv-vsetvl.h (enum demand_status): New class.
(struct 

Re: [PATCH] rs6000: fmr gets used instead of faster xxlor [PR93571]

2023-02-24 Thread Segher Boessenkool
Hi!

For future patches: please don't send patches as replies to existing
threads.  Just start a new thread for a new patch (series).  You can
mark it as [PATCH v2] in the subject, if you want.

On Fri, Feb 24, 2023 at 01:41:49PM +0530, Ajit Agarwal wrote:
> Here is the patch that uses xxlor instead of fmr where possible.
> Performance results shows that fmr is better in power9 and 
> power10 architectures whereas xxlor is better in power7 and
> power 8 architectures.

And fmr is the only option before p7.

>   rs6000: Use xxlor instead of fmr where possible
> 
>   This patch replaces fmr with xxlor instruction for power7
>   and power8 architectures whereas for power9 and power10
>   replaces xxlor with fmr instruction.

Saying "this patch" in a commit message reads strangely.  Just "Replace
fmr with" etc.?

The second part is just wrong, you cannot replace xxlor by fmr in
general.

>   Perf measurement results:
> 
>   Power9 fmr:  201,847,661 cycles.
>   Power9 xxlor: 201,877,78 cycles.
>   Power8 fmr: 201,057,795 cycles.
> Power8 xxlor: 201,004,671 cycles.

What is this measuring?  100M insns back-to-back, each dependent on the
previous one?

What are the results on p7 and p10?

These numbers show there is no difference on p8 either.  Did you paste
the wrong numbers maybe?

>   * config/rs6000/rs6000.md (*movdf_hardfloat64): Use xxlor
>   for power7 and power8 and fmr for power9 and power10.

Please don't break lines early.  Changelogs lines can be 80 columns
wide, just like source code lines.

> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -354,7 +354,7 @@ (define_attr "cpu"
>(const (symbol_ref "(enum attr_cpu) rs6000_tune")))
>  
>  ;; The ISA we implement.
> -(define_attr "isa" "any,p5,p6,p7,p7v,p8v,p9,p9v,p9kf,p9tf,p10"
> +(define_attr "isa" "any,p5,p6,p7,p7v,p8v,p9,p9v,p9kf,p9tf,p7p8,p10"

p78v, and sort it after p8v please.

> + (and (eq_attr "isa" "p7p8")
> +   (match_test "TARGET_VSX && !TARGET_P9_VECTOR"))
> + (const_int 1)

Okay.

>  (define_insn "*mov_hardfloat64"
>[(set (match_operand:FMOVE64 0 "nonimmediate_operand"
> -   "=m,   d,  d,  ,   wY,
> - ,Z,  ,  ,  !r,
> - YZ,  r,  !r, *c*l,   !r,
> -*h,   r,  ,   wa")
> +   "=m,   d,  ,  ,   wY,
> +, Z,  wa, ,  !r,
> +YZ,   r,  !r, *c*l,   !r,
> +*h,   r,  ,   d,  wn,
> +wa")
>   (match_operand:FMOVE64 1 "input_operand"

(You posted this mail as wrapping.  That means the patch cannot be
applied non-manually, and that replies to your mail will be mangled.
Just get a Real mail client, and configure it correctly :-) )

> -"d,   m,  d,  wY, ,
> - Z,   ,   ,  ,  ,
> +"d,   m,  ,  wY, ,
> + Z,   ,   wa, ,  ,
>   r,   YZ, r,  r,  *h,
> - 0,   ,   r,  eP"))]
> + 0,   ,   r,  d,  wn,
> + eP"))]

No.  It is impossible to figure out what you changed here by just
reading it.

There is no requirement there should be exactly five alternatives per
line, and/or that there should be the same number everywhere.

If the indentation was incorrect, and you want to fix that, do that in a
separate *earlier* patch in the series, please.

>"TARGET_POWERPC64 && TARGET_HARD_FLOAT
> && (gpc_reg_operand (operands[0], mode)
> || gpc_reg_operand (operands[1], mode))"
>"@
> stfd%U0%X0 %1,%0
> lfd%U1%X1 %0,%1
> -   fmr %0,%1
> +   xxlor %x0,%x1,%x1
> lxsd %0,%1
> stxsd %1,%0
> lxsdx %x0,%y1
> stxsdx %x1,%y0
> -   xxlor %x0,%x1,%x1
> +   fmr %0,%1
> xxlxor %x0,%x0,%x0
> li %0,0
> std%U0%X0 %1,%0
> @@ -8467,23 +8474,28 @@ (define_insn "*mov_hardfloat64"
> nop
> mfvsrd %0,%x1
> mtvsrd %x0,%1
> +   fmr %0,%1
> +   fmr %0,%1
> #"
>[(set_attr "type"
> -"fpstore, fpload, fpsimple,   fpload, fpstore,
> +"fpstore, fpload, veclogical, fpload, fpstore,
>   fpload,  fpstore,veclogical, veclogical, integer,
>   store,   load,   *,  mtjmpr, mfjmpr,
> - *,   mfvsr,  mtvsr,  vecperm")
> + *,   mfvsr,  mtvsr,  fpsimple,   fpsimple,
> + vecperm")
> (set_attr "size" "64")
> (set_attr "isa"
> -"*,   *,  *,  p9v,p9v,
> - p7v, p7v,*,  *,  *,
> - *,   *,  *,  *,  *,
> - *,   p8v,p8v,   

Re: C++ modules and AAPCS/ARM EABI clash on inline key methods

2023-02-24 Thread Alexandre Oliva via Gcc-patches
On Feb 24, 2023, Richard Earnshaw  wrote:

> Given the logic of this macro, the text should be
> "!TARGET_CXX_METHOD_MAY_BE_INLINE".

I was thinking just "related to that macro", but yeah, negating it makes
sense.

> OK with that change.

Thanks, here's what I'm checking in.


[PR105224] C++ modules and AAPCS/ARM EABI clash on inline key methods

From: Alexandre Oliva 

g++.dg/modules/virt-2_a.C fails on arm-eabi and many other arm targets
that use the AAPCS variant.  ARM is the only target that overrides
TARGET_CXX_KEY_METHOD_MAY_BE_INLINE.  It's not clear to me which way
the clash between AAPCS and C++ Modules design should be resolved, but
currently it favors AAPCS and thus the test fails, so skip it on
arm_eabi.


for  gcc/testsuite/ChangeLog

PR c++/105224
* g++.dg/modules/virt-2_a.C: Skip on arm_eabi.
---
 gcc/testsuite/g++.dg/modules/virt-2_a.C |3 +++
 1 file changed, 3 insertions(+)

diff --git a/gcc/testsuite/g++.dg/modules/virt-2_a.C 
b/gcc/testsuite/g++.dg/modules/virt-2_a.C
index 580552be5a0d8..b5050445c3f15 100644
--- a/gcc/testsuite/g++.dg/modules/virt-2_a.C
+++ b/gcc/testsuite/g++.dg/modules/virt-2_a.C
@@ -1,3 +1,6 @@
+// AAPCS overrides TARGET_CXX_KEY_METHOD_MAY_BE_INLINE,
+// in a way that invalidates this test.
+// { dg-skip-if "!TARGET_CXX_KEY_METHOD_MAY_BE_INLINE" { arm_eabi } } 
 // { dg-module-do run }
 // { dg-additional-options -fmodules-ts }
 export module foo;


-- 
Alexandre Oliva, happy hackerhttps://FSFLA.org/blogs/lxo/
   Free Software Activist   GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about 


[committed 2/5] libstdc++: Fix conversion to/from net::ip::address_v4::bytes_type

2023-02-24 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux. Pushed to trunk.

-- >8 --

I messed up the endianness of the address_v4::bytes_type array, which
should always be in network byte order. We can just use bit_cast to
convert the _M_addr member to/from bytes_type.

libstdc++-v3/ChangeLog:

* include/experimental/internet (address_4(const bytes_type&)):
Use __builtin_bit_cast if available, otherwise convert to
network byte order.
(address_v4::to_bytes()): Likewise, but convert from network
byte order.
* testsuite/experimental/net/internet/address/v4/cons.cc: Fix
incorrect tests. Check for constexpr too.
* testsuite/experimental/net/internet/address/v4/creation.cc:
Likewise.
* testsuite/experimental/net/internet/address/v4/members.cc:
Check that bytes_type is a standard-layout type.
---
 libstdc++-v3/include/experimental/internet| 20 ---
 .../net/internet/address/v4/cons.cc   | 33 ---
 .../net/internet/address/v4/creation.cc   | 24 +++---
 .../net/internet/address/v4/members.cc|  3 ++
 4 files changed, 60 insertions(+), 20 deletions(-)

diff --git a/libstdc++-v3/include/experimental/internet 
b/libstdc++-v3/include/experimental/internet
index 08bd0db4bb2..3fd200251fa 100644
--- a/libstdc++-v3/include/experimental/internet
+++ b/libstdc++-v3/include/experimental/internet
@@ -198,7 +198,12 @@ namespace ip
 
 constexpr
 address_v4(const bytes_type& __b)
-: _M_addr((__b[0] << 24) | (__b[1] << 16) | (__b[2] << 8) | __b[3])
+#if __has_builtin(__builtin_bit_cast)
+: _M_addr(__builtin_bit_cast(uint_type, __b))
+#else
+: _M_addr(_S_hton_32((__b[0] << 24) | (__b[1] << 16)
+  | (__b[2] << 8) | __b[3]))
+#endif
 { }
 
 explicit constexpr
@@ -227,12 +232,17 @@ namespace ip
 constexpr bytes_type
 to_bytes() const noexcept
 {
+#if __has_builtin(__builtin_bit_cast)
+  return __builtin_bit_cast(bytes_type, _M_addr);
+#else
+  auto __host = to_uint();
   return bytes_type{
- (_M_addr >> 24) & 0xFF,
- (_M_addr >> 16) & 0xFF,
- (_M_addr >> 8) & 0xFF,
- _M_addr & 0xFF
+   (__host >> 24) & 0xFF,
+   (__host >> 16) & 0xFF,
+   (__host >> 8) & 0xFF,
+   __host & 0xFF
   };
+#endif
 }
 
 constexpr uint_type
diff --git 
a/libstdc++-v3/testsuite/experimental/net/internet/address/v4/cons.cc 
b/libstdc++-v3/testsuite/experimental/net/internet/address/v4/cons.cc
index 65f23642de4..af9fef2215e 100644
--- a/libstdc++-v3/testsuite/experimental/net/internet/address/v4/cons.cc
+++ b/libstdc++-v3/testsuite/experimental/net/internet/address/v4/cons.cc
@@ -24,41 +24,45 @@
 
 using std::experimental::net::ip::address_v4;
 
-void
+#if __cplusplus < 202002L
+// Naughty, but operator== for std::array is not constexpr until C++20.
+constexpr bool
+operator==(const address_v4::bytes_type& lhs, const address_v4::bytes_type& 
rhs)
+{
+  return lhs[0] == rhs[0] && lhs[1] == rhs[1]
+  && lhs[2] == rhs[2] && lhs[3] == rhs[3];
+}
+#endif
+
+constexpr void
 test01()
 {
-  bool test __attribute__((unused)) = false;
-
   address_v4 a0;
   VERIFY( a0.to_uint() == 0 );
   VERIFY( a0.to_bytes() == address_v4::bytes_type{} );
 }
 
-void
+constexpr void
 test02()
 {
-  bool test __attribute__((unused)) = false;
-
   address_v4 a0{ address_v4::bytes_type{} };
   VERIFY( a0.to_uint() == 0 );
   VERIFY( a0.to_bytes() == address_v4::bytes_type{} );
 
   address_v4::bytes_type b1{ 1, 2, 3, 4 };
   address_v4 a1{ b1 };
-  VERIFY( a1.to_uint() == ntohl((1 << 24) | (2 << 16) | (3 << 8) | 4) );
+  VERIFY( a1.to_uint() == ((1 << 24) | (2 << 16) | (3 << 8) | 4) );
   VERIFY( a1.to_bytes() == b1 );
 }
 
-void
+constexpr void
 test03()
 {
-  bool test __attribute__((unused)) = false;
-
   address_v4 a0{ 0u };
   VERIFY( a0.to_uint() == 0 );
   VERIFY( a0.to_bytes() == address_v4::bytes_type{} );
 
-  address_v4::uint_type u1 = ntohl((5 << 24) | (6 << 16) | (7 << 8) | 8);
+  address_v4::uint_type u1 = (5 << 24) | (6 << 16) | (7 << 8) | 8;
   address_v4 a1{ u1 };
   VERIFY( a1.to_uint() == u1 );
   VERIFY( a1.to_bytes() == address_v4::bytes_type( 5, 6, 7, 8 ) );
@@ -70,4 +74,11 @@ main()
   test01();
   test02();
   test03();
+
+  constexpr bool c = []{
+test01();
+test02();
+test03();
+return true;
+  };
 }
diff --git 
a/libstdc++-v3/testsuite/experimental/net/internet/address/v4/creation.cc 
b/libstdc++-v3/testsuite/experimental/net/internet/address/v4/creation.cc
index 441c832bf54..84aebbb7adc 100644
--- a/libstdc++-v3/testsuite/experimental/net/internet/address/v4/creation.cc
+++ b/libstdc++-v3/testsuite/experimental/net/internet/address/v4/creation.cc
@@ -25,7 +25,17 @@
 namespace net = std::experimental::net;
 using net::ip::address_v4;
 
-void
+#if __cplusplus < 202002L
+// Naughty, but operator== for std::array is not constexpr until C++20.
+constexpr bool
+operator==(const address_v4::bytes_type& lhs, 

[committed 3/5] libstdc++: Fix members of net::ip::network_v4

2023-02-24 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux. Pushed to trunk.

-- >8 --

libstdc++-v3/ChangeLog:

* include/experimental/internet (network_v4::netmask()): Avoid
undefined shift.
(network_v4::broadcast()): Optimize and fix for targets with
uint_least32_t wider than 32 bits.
(network_v4::to_string(const Allocator&)): Fix for custom
allocators and optimize using to_chars.
(operator==(const network_v4&, const network_v4&)): Add missing
constexpr.
(operator==(const network_v6&, const network_v6&)): Likewise.
* testsuite/experimental/net/internet/network/v4/cons.cc: New test.
* testsuite/experimental/net/internet/network/v4/members.cc: New test.
---
 libstdc++-v3/include/experimental/internet|  41 ++--
 .../net/internet/network/v4/cons.cc   | 129 
 .../net/internet/network/v4/members.cc| 186 ++
 3 files changed, 343 insertions(+), 13 deletions(-)
 create mode 100644 
libstdc++-v3/testsuite/experimental/net/internet/network/v4/cons.cc
 create mode 100644 
libstdc++-v3/testsuite/experimental/net/internet/network/v4/members.cc

diff --git a/libstdc++-v3/include/experimental/internet 
b/libstdc++-v3/include/experimental/internet
index 3fd200251fa..5336b8a8ce3 100644
--- a/libstdc++-v3/include/experimental/internet
+++ b/libstdc++-v3/include/experimental/internet
@@ -1219,10 +1219,10 @@ namespace ip
 
   /// @}
 
-  bool
+  constexpr bool
   operator==(const network_v4& __a, const network_v4& __b) noexcept;
 
-  bool
+  constexpr bool
   operator==(const network_v6& __a, const network_v6& __b) noexcept;
 
 
@@ -1263,10 +1263,10 @@ namespace ip
 constexpr address_v4
 netmask() const noexcept
 {
-  address_v4::uint_type __val = address_v4::broadcast().to_uint();
-  __val >>= (32 - _M_prefix_len);
-  __val <<= (32 - _M_prefix_len);
-  return address_v4{__val};
+  address_v4 __m;
+  if (_M_prefix_len)
+   __m = address_v4(0xu << (32 - _M_prefix_len));
+  return __m;
 }
 
 constexpr address_v4
@@ -1275,7 +1275,7 @@ namespace ip
 
 constexpr address_v4
 broadcast() const noexcept
-{ return address_v4{_M_addr.to_uint() | ~netmask().to_uint()}; }
+{ return address_v4{_M_addr.to_uint() | (0xu >> _M_prefix_len)}; }
 
 address_v4_range
 hosts() const noexcept
@@ -1306,8 +1306,23 @@ namespace ip
   __string_with<_Allocator>
   to_string(const _Allocator& __a = _Allocator()) const
   {
-   return address().to_string(__a) + '/'
- + std::to_string(prefix_length());
+   auto __str = address().to_string(__a);
+   const unsigned __addrlen = __str.length();
+   const unsigned __preflen = prefix_length() >= 10 ? 2 : 1;
+   auto __write = [=](char* __p, size_t __n) {
+ __p[__addrlen] = '/';
+ std::__detail::__to_chars_10_impl(__p + __addrlen + 1, __preflen,
+   (unsigned char)prefix_length());
+ return __n;
+   };
+   const unsigned __len = __addrlen + 1 + __preflen;
+#if __cpp_lib_string_resize_and_overwrite
+   __str.resize_and_overwrite(__len, __write);
+#else
+   __str.resize(__len);
+   __write(&__str.front(), __len);
+#endif
+   return __str;
   }
 
   private:
@@ -1379,14 +1394,14 @@ namespace ip
* @{
*/
 
-  inline bool
+  constexpr bool
   operator==(const network_v4& __a, const network_v4& __b) noexcept
   {
 return __a.address() == __b.address()
   && __a.prefix_length() == __b.prefix_length();
   }
 
-  inline bool
+  constexpr bool
   operator!=(const network_v4& __a, const network_v4& __b) noexcept
   { return !(__a == __b); }
 
@@ -1396,14 +1411,14 @@ namespace ip
* @{
*/
 
-  inline bool
+  constexpr bool
   operator==(const network_v6& __a, const network_v6& __b) noexcept
   {
 return __a.address() == __b.address()
   && __a.prefix_length() == __b.prefix_length();
   }
 
-  inline bool
+  constexpr bool
   operator!=(const network_v6& __a, const network_v6& __b) noexcept
   { return !(__a == __b); }
 
diff --git 
a/libstdc++-v3/testsuite/experimental/net/internet/network/v4/cons.cc 
b/libstdc++-v3/testsuite/experimental/net/internet/network/v4/cons.cc
new file mode 100644
index 000..7784b6f6f58
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/net/internet/network/v4/cons.cc
@@ -0,0 +1,129 @@
+// { dg-do run { target c++14 } }
+// { dg-require-effective-target net_ts_ip }
+// { dg-add-options net_ts }
+
+#include 
+#include 
+#include 
+
+using std::experimental::net::ip::network_v4;
+using std::experimental::net::ip::address_v4;
+
+constexpr void
+test01()
+{
+  network_v4 n0;
+  VERIFY( n0.address().is_unspecified() );
+  VERIFY( n0.prefix_length() == 0 );
+}
+
+constexpr void
+test02()
+{
+  address_v4 a0;
+  network_v4 n0{ a0, 0 };
+  VERIFY( n0.address() == a0 );
+  VERIFY( n0.prefix_length() == 0 );
+
+  address_v4 a1{ 

[committed 4/5] libstdc++: Make net::ip::basic_endpoint comparisons constexpr

2023-02-24 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux. Pushed to trunk.

-- >8 --

libstdc++-v3/ChangeLog:

* include/experimental/internet (basic_endpoint): Add missing
constexpr to comparison operators.
* testsuite/experimental/net/internet/endpoint/cons.cc: New test.
---
 libstdc++-v3/include/experimental/internet| 12 ++--
 .../net/internet/endpoint/cons.cc | 66 +++
 2 files changed, 72 insertions(+), 6 deletions(-)
 create mode 100644 
libstdc++-v3/testsuite/experimental/net/internet/endpoint/cons.cc

diff --git a/libstdc++-v3/include/experimental/internet 
b/libstdc++-v3/include/experimental/internet
index 5336b8a8ce3..cae07f466da 100644
--- a/libstdc++-v3/include/experimental/internet
+++ b/libstdc++-v3/include/experimental/internet
@@ -1626,19 +1626,19 @@ namespace ip
*/
 
   template
-inline bool
+constexpr bool
 operator==(const basic_endpoint<_InternetProtocol>& __a,
   const basic_endpoint<_InternetProtocol>& __b)
 { return __a.address() == __b.address() && __a.port() == __b.port(); }
 
   template
-inline bool
+constexpr bool
 operator!=(const basic_endpoint<_InternetProtocol>& __a,
   const basic_endpoint<_InternetProtocol>& __b)
 { return !(__a == __b); }
 
   template
-inline bool
+constexpr bool
 operator< (const basic_endpoint<_InternetProtocol>& __a,
   const basic_endpoint<_InternetProtocol>& __b)
 {
@@ -1647,19 +1647,19 @@ namespace ip
 }
 
   template
-inline bool
+constexpr bool
 operator> (const basic_endpoint<_InternetProtocol>& __a,
   const basic_endpoint<_InternetProtocol>& __b)
 { return __b < __a; }
 
   template
-inline bool
+constexpr bool
 operator<=(const basic_endpoint<_InternetProtocol>& __a,
   const basic_endpoint<_InternetProtocol>& __b)
 { return !(__b < __a); }
 
   template
-inline bool
+constexpr bool
 operator>=(const basic_endpoint<_InternetProtocol>& __a,
   const basic_endpoint<_InternetProtocol>& __b)
 { return !(__a < __b); }
diff --git a/libstdc++-v3/testsuite/experimental/net/internet/endpoint/cons.cc 
b/libstdc++-v3/testsuite/experimental/net/internet/endpoint/cons.cc
new file mode 100644
index 000..1b5c92c0b58
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/net/internet/endpoint/cons.cc
@@ -0,0 +1,66 @@
+// { dg-do run { target c++14 } }
+// { dg-require-effective-target net_ts_ip }
+// { dg-add-options net_ts }
+
+#include 
+#include 
+
+using namespace std::experimental::net;
+
+constexpr void
+test_default()
+{
+  ip::tcp::endpoint t1;
+  VERIFY( t1.protocol() == ip::tcp::v4() );
+  VERIFY( t1.address() == ip::address() );
+  VERIFY( t1.port() == 0 );
+
+  ip::udp::endpoint t2;
+  VERIFY( t2.protocol() == ip::udp::v4() );
+  VERIFY( t2.address() == ip::address() );
+  VERIFY( t2.port() == 0 );
+}
+
+constexpr void
+test_proto()
+{
+  ip::tcp::endpoint t1(ip::tcp::v4(), 22);
+  VERIFY( t1.protocol() == ip::tcp::v4() );
+  VERIFY( t1.address() == ip::address_v4() );
+  VERIFY( t1.port() == 22 );
+
+  ip::tcp::endpoint t2(ip::tcp::v6(), 80);
+  VERIFY( t2.protocol() == ip::tcp::v6() );
+  VERIFY( t2.address() == ip::address_v6() );
+  VERIFY( t2.port() == 80 );
+}
+
+constexpr void
+test_addr()
+{
+  ip::address_v4 a1(ip::address_v4::bytes_type(1, 2, 3, 4));
+  ip::tcp::endpoint t1(a1, 22);
+  VERIFY( t1.protocol() == ip::tcp::v4() );
+  VERIFY( t1.address() == a1 );
+  VERIFY( t1.port() == 22 );
+
+  ip::address_v6 a2(ip::address_v6::bytes_type(21,22,23,24,25,26,27,28,29));
+  ip::tcp::endpoint t2(a2, 80);
+  VERIFY( t2.protocol() == ip::tcp::v6() );
+  VERIFY( t2.address() == a2 );
+  VERIFY( t2.port() == 80 );
+}
+
+int main()
+{
+  test_default();
+  test_proto();
+  test_addr();
+
+  constexpr bool c = [] {
+test_default();
+test_proto();
+test_addr();
+return true;
+  };
+}
-- 
2.39.2



[committed 5/5] libstdc++: Constrain net::executor constructors

2023-02-24 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux. Pushed to trunk.

-- >8 --

The TS says the arguments to these constructors shall meet the Executor
requirements, so it's undefined if they don't. Constraining on a subset
of those requirements won't affect valid cases, but prevents the
majority of invalid cases from trying to instantiate the constructor.

This prevents the non-explicit executor(Executor) constructor being a
candidate anywhere that a net::executor could be constructed e.g.
comparing ip::tcp::v4() == ip::udp::v4() would try to convert both
operands to executor using that constructor, then compare then using
operator==(const executor&, const executor&).

libstdc++-v3/ChangeLog:

* include/experimental/executor (executor): Constrain template
constructors.
---
 libstdc++-v3/include/experimental/executor | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/include/experimental/executor 
b/libstdc++-v3/include/experimental/executor
index cd75d99ddb3..1dae8925916 100644
--- a/libstdc++-v3/include/experimental/executor
+++ b/libstdc++-v3/include/experimental/executor
@@ -1012,6 +1012,9 @@ inline namespace v1
 
   class executor
   {
+template
+  using _Context_t = decltype(std::declval<_Executor&>().context());
+
   public:
 // construct / copy / destroy:
 
@@ -1021,12 +1024,14 @@ inline namespace v1
 executor(const executor&) noexcept = default;
 executor(executor&&) noexcept = default;
 
-template
+template>>>
   executor(_Executor __e)
   : _M_target(make_shared<_Tgt1<_Executor>>(std::move(__e)))
   { }
 
-template
+template>>>
   executor(allocator_arg_t, const _ProtoAlloc& __a, _Executor __e)
   : _M_target(allocate_shared<_Tgt2<_Executor, _ProtoAlloc>>(__a,
std::move(__e), __a))
-- 
2.39.2



[committed 1/5] libstdc++: Optimize net::ip::address_v4::to_string()

2023-02-24 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux. Pushed to trunk.

-- >8 --

This is an order of magnitude faster than calling inet_ntop (and not
only because we now avoid allocating a string that is one byte larger
than the SSO buffer).

libstdc++-v3/ChangeLog:

* include/experimental/internet (address_v4::to_string):
Optimize.
* testsuite/experimental/net/internet/address/v4/members.cc:
Check more addresses.
---
 libstdc++-v3/include/experimental/internet| 28 +--
 .../net/internet/address/v4/members.cc| 11 
 2 files changed, 31 insertions(+), 8 deletions(-)

diff --git a/libstdc++-v3/include/experimental/internet 
b/libstdc++-v3/include/experimental/internet
index 707370d5611..08bd0db4bb2 100644
--- a/libstdc++-v3/include/experimental/internet
+++ b/libstdc++-v3/include/experimental/internet
@@ -44,6 +44,7 @@
 #include 
 #include 
 #include 
+#include 
 #ifdef _GLIBCXX_HAVE_UNISTD_H
 # include 
 #endif
@@ -241,17 +242,28 @@ namespace ip
   __string_with<_Allocator>
   to_string(const _Allocator& __a = _Allocator()) const
   {
-#ifdef _GLIBCXX_HAVE_ARPA_INET_H
+   auto __write = [__addr = to_uint()](char* __p, size_t __n) {
+ auto __to_chars = [](char* __p, uint8_t __v) {
+   unsigned __n = __v >= 100u ? 3 : __v >= 10u ? 2 : 1;
+   std::__detail::__to_chars_10_impl(__p, __n, __v);
+   return __p + __n;
+ };
+ const auto __begin = __p;
+ __p = __to_chars(__p, uint8_t(__addr >> 24));
+ for (int __i = 2; __i >= 0; __i--) {
+   *__p++ = '.';
+   __p = __to_chars(__p, uint8_t(__addr >> (__i * 8)));
+ }
+ return __p - __begin;
+   };
__string_with<_Allocator> __str(__a);
-   __str.resize(INET_ADDRSTRLEN);
-   if (inet_ntop(AF_INET, &_M_addr, &__str.front(), __str.size()))
- __str.erase(__str.find('\0'));
-   else
- __str.resize(0);
-   return __str;
+#if __cpp_lib_string_resize_and_overwrite
+   __str.resize_and_overwrite(15, __write);
 #else
-   std::__throw_system_error((int)__unsupported_err());
+   __str.resize(15);
+   __str.resize(__write(&__str.front(), 15));
 #endif
+   return __str;
   }
 
 // static members:
diff --git 
a/libstdc++-v3/testsuite/experimental/net/internet/address/v4/members.cc 
b/libstdc++-v3/testsuite/experimental/net/internet/address/v4/members.cc
index df19b11804d..c40a8103664 100644
--- a/libstdc++-v3/testsuite/experimental/net/internet/address/v4/members.cc
+++ b/libstdc++-v3/testsuite/experimental/net/internet/address/v4/members.cc
@@ -22,6 +22,7 @@
 #include 
 #include 
 #include 
+#include 
 
 using std::experimental::net::ip::address_v4;
 
@@ -100,6 +101,16 @@ test04()
   VERIFY( address_v4::any().to_string() == "0.0.0.0" );
   VERIFY( address_v4::loopback().to_string() == "127.0.0.1" );
   VERIFY( address_v4::broadcast().to_string() == "255.255.255.255" );
+  using b = address_v4::bytes_type;
+  VERIFY( address_v4(b(1, 23, 45, 67)).to_string() == "1.23.45.67" );
+  VERIFY( address_v4(b(12, 34, 56, 78)).to_string() == "12.34.56.78" );
+  VERIFY( address_v4(b(123, 4, 5, 6)).to_string() == "123.4.5.6" );
+  VERIFY( address_v4(b(123, 234, 124, 235)).to_string() == "123.234.124.235" );
+
+  __gnu_test::uneq_allocator alloc(123);
+  auto str = address_v4(b(12, 34, 56, 78)).to_string(alloc);
+  VERIFY(str.get_allocator().get_personality() == alloc.get_personality());
+  VERIFY( str == "12.34.56.78" );
 }
 
 void
-- 
2.39.2



[committed] libstdc++: Suppress warnings about use of deprecated std::aligned_storage

2023-02-24 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux. Pushed to trunk.

-- >8 --

libstdc++-v3/ChangeLog:

* include/ext/aligned_buffer.h (__aligned_buffer): Add
diagnostic pragmas.
---
 libstdc++-v3/include/ext/aligned_buffer.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/libstdc++-v3/include/ext/aligned_buffer.h 
b/libstdc++-v3/include/ext/aligned_buffer.h
index 765d14684f2..dc62dfc8741 100644
--- a/libstdc++-v3/include/ext/aligned_buffer.h
+++ b/libstdc++-v3/include/ext/aligned_buffer.h
@@ -81,6 +81,8 @@ namespace __gnu_cxx
   template
 using __aligned_buffer = __aligned_membuf<_Tp>;
 #else
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
   // Similar to __aligned_membuf but aligned for complete objects, not members.
   // This type is used in , , 
   // and , but ideally they would use __aligned_membuf
@@ -118,6 +120,7 @@ namespace __gnu_cxx
   _M_ptr() const noexcept
   { return static_cast(_M_addr()); }
 };
+#pragma GCC diagnostic pop
 #endif
 
 } // namespace
-- 
2.39.2



[committed] libstdc++: Reorder dg-options before dg-do

2023-02-24 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux. Pushed to trunk.

-- >8 --

The options need to be set first, so that -std=gnu++20 is used when
checking the c++20 effective target.

libstdc++-v3/ChangeLog:

* testsuite/std/format/arguments/lwg3810.cc: Move dg-options
before dg-do.
---
 libstdc++-v3/testsuite/std/format/arguments/lwg3810.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libstdc++-v3/testsuite/std/format/arguments/lwg3810.cc 
b/libstdc++-v3/testsuite/std/format/arguments/lwg3810.cc
index 60587a93d2c..9ccb654de1b 100644
--- a/libstdc++-v3/testsuite/std/format/arguments/lwg3810.cc
+++ b/libstdc++-v3/testsuite/std/format/arguments/lwg3810.cc
@@ -1,5 +1,5 @@
-// { dg-do compile { target c++20 } }
 // { dg-options "-std=gnu++20" }
+// { dg-do compile { target c++20 } }
 
 // LWG 3810. CTAD for std::basic_format_args
 
-- 
2.39.2



Re: [PATCH 2/2] Avoid default-initializing auto_vec storage, fix vec

2023-02-24 Thread Richard Biener via Gcc-patches
On Fri, 24 Feb 2023, Jakub Jelinek wrote:

> On Fri, Feb 24, 2023 at 02:47:39PM +0100, Richard Biener wrote:
> > * vec.h (vec::m_vecdata): Remove.
> > (vec::m_vecpfx): Align as T to avoid
> > changing alignment of vec and simplifying
> > address.
> > (vec::address): Compute as this + 1.
> > (vec::embedded_size): Use sizeof the
> > vector instead of the offset of the m_vecdata member.
> > (auto_vec::m_data): Turn storage into
> > uninitialized unsigned char.
> > (auto_vec::auto_vec): Allow allocation of one
> > stack member.  Initialize m_vec in a special way to
> > avoid later stringop overflow diagnostics.
> > * vec.cc (test_auto_alias): New.
> > (vec_cc_tests): Call it.
> > @@ -1559,8 +1560,14 @@ class auto_vec : public vec
> >  public:
> >auto_vec ()
> >{
> > -m_auto.embedded_init (MAX (N, 2), 0, 1);
> > -this->m_vec = _auto;
> > +m_auto.embedded_init (N, 0, 1);
> > +/* ???  Instead of initializing m_vec from _auto directly use an
> > +   expression that avoids refering to a specific member of 'this'
> > +   to derail the -Wstringop-overflow diagnostic code, avoiding
> > +   the impression that data accesses are supposed to be to the
> > +   m_auto memmber storage.  */
> 
> s/memmber/member/
> 
> > +size_t off = (char *) _auto - (char *) this;
> > +this->m_vec = (vec *) ((char *) this + off);
> >}
> >  
> >auto_vec (size_t s CXX_MEM_STAT_INFO)
> > @@ -1571,7 +1578,7 @@ public:
> > return;
> >}
> >  
> > -m_auto.embedded_init (MAX (N, 2), 0, 1);
> > +m_auto.embedded_init (N, 0, 1);
> >  this->m_vec = _auto;
> 
> Don't we need the above 2 lines here as well (perhaps with a shorter comment
> just referencing the earlier comment)?

I've noticed that as well and put it there now, it wasn't necessary
to get bootstrap working.

> Otherwise LGTM, thanks.

Thanks,
Richard.


Re: [PATCH 2/2] Avoid default-initializing auto_vec storage, fix vec

2023-02-24 Thread Jakub Jelinek via Gcc-patches
On Fri, Feb 24, 2023 at 02:47:39PM +0100, Richard Biener wrote:
>   * vec.h (vec::m_vecdata): Remove.
>   (vec::m_vecpfx): Align as T to avoid
>   changing alignment of vec and simplifying
>   address.
>   (vec::address): Compute as this + 1.
>   (vec::embedded_size): Use sizeof the
>   vector instead of the offset of the m_vecdata member.
>   (auto_vec::m_data): Turn storage into
>   uninitialized unsigned char.
>   (auto_vec::auto_vec): Allow allocation of one
>   stack member.  Initialize m_vec in a special way to
>   avoid later stringop overflow diagnostics.
>   * vec.cc (test_auto_alias): New.
>   (vec_cc_tests): Call it.
> @@ -1559,8 +1560,14 @@ class auto_vec : public vec
>  public:
>auto_vec ()
>{
> -m_auto.embedded_init (MAX (N, 2), 0, 1);
> -this->m_vec = _auto;
> +m_auto.embedded_init (N, 0, 1);
> +/* ???  Instead of initializing m_vec from _auto directly use an
> +   expression that avoids refering to a specific member of 'this'
> +   to derail the -Wstringop-overflow diagnostic code, avoiding
> +   the impression that data accesses are supposed to be to the
> +   m_auto memmber storage.  */

s/memmber/member/

> +size_t off = (char *) _auto - (char *) this;
> +this->m_vec = (vec *) ((char *) this + off);
>}
>  
>auto_vec (size_t s CXX_MEM_STAT_INFO)
> @@ -1571,7 +1578,7 @@ public:
>   return;
>}
>  
> -m_auto.embedded_init (MAX (N, 2), 0, 1);
> +m_auto.embedded_init (N, 0, 1);
>  this->m_vec = _auto;

Don't we need the above 2 lines here as well (perhaps with a shorter comment
just referencing the earlier comment)?

Otherwise LGTM, thanks.

Jakub



Re: [PATCH 1/2] Change vec<,,vl_embed>::m_vecdata refrences into address ()

2023-02-24 Thread Jakub Jelinek via Gcc-patches
On Fri, Feb 24, 2023 at 02:46:21PM +0100, Richard Biener wrote:
> As preparation to remove m_vecdata in the vl_embed vector this
> changes references to it into calls to address ().
> 
> As I was here it also fixes ::contains to avoid repeated bounds
> checking and the same issue in ::lower_bound which also suffers
> from unnecessary copying around values.
> 
>   * vec.h: Change m_vecdata references to address ().
>   * vec.h (vec::lower_bound): Adjust to
>   take a const reference to the object, use address to
>   access data.
>   (vec::contains): Use address to access data.
>   (vec::operator[]): Use address instead of
>   m_vecdata to access data.
>   (vec::iterate): Likewise.
>   (vec::copy): Likewise.
>   (vec::quick_push): Likewise.
>   (vec::pop): Likewise.
>   (vec::quick_insert): Likewise.
>   (vec::ordered_remove): Likewise.
>   (vec::unordered_remove): Likewise.
>   (vec::block_remove): Likewise.
>   (vec::address): Likewise.

Ok, thanks.

Jakub



Re: [v3][PATCH 2/2] Update documentation to clarify a GCC extension (PR77650)

2023-02-24 Thread Qing Zhao via Gcc-patches



> On Feb 23, 2023, at 7:56 PM, Joseph Myers  wrote:
> 
> On Thu, 23 Feb 2023, Qing Zhao via Gcc-patches wrote:
> 
>> But the following:
>> 
>> struct flex1  { int length1; char data1[]; };
>> struct flex2  { int length2; char data2[]; };
>> union union_flex { struct flex1 f1; struct flex2 f2; };  /* this is C 
>> standard.  */
>> 
>> struct out_flex { int n; union union_flex flex_data1;};  /* this is GNU 
>> extension.  */
>> 
>> Should add this item into the documentation?
> 
> "union that contains a structure with a flexible array member" is just 
> like "structure with a flexible array member".  I suppose the 
> documentation should try to make that clear, without repeating it too much 
> for every separate case.

Okay, thanks.

Qing
> 
> -- 
> Joseph S. Myers
> jos...@codesourcery.com



[PATCH 2/2] Avoid default-initializing auto_vec storage, fix vec

2023-02-24 Thread Richard Biener via Gcc-patches
The following avoids default-initializing auto_vec storage for
non-POD T since that's not what the allocated storage fallback
will do and it's also not expected for existing cases like

  auto_vec, 64> elts;

which exist to optimize the allocation.

It also fixes the array accesses done by vec to not
use its own m_vecdata member but instead access the container
provided storage via pointer arithmetic.

I've built the series with GCC 4.8 and clang 13 up to the stage1
target libs, a bootstrap and regtest on x86_64-unknown-linux-gnu
with GCC 12 was successful with the diagnostic pragma, I'm
currently re-bootstrapping and testing with a GCC 7 host compiler.

OK if that succeeds?

Thanks,
Richard.

* vec.h (vec::m_vecdata): Remove.
(vec::m_vecpfx): Align as T to avoid
changing alignment of vec and simplifying
address.
(vec::address): Compute as this + 1.
(vec::embedded_size): Use sizeof the
vector instead of the offset of the m_vecdata member.
(auto_vec::m_data): Turn storage into
uninitialized unsigned char.
(auto_vec::auto_vec): Allow allocation of one
stack member.  Initialize m_vec in a special way to
avoid later stringop overflow diagnostics.
* vec.cc (test_auto_alias): New.
(vec_cc_tests): Call it.
---
 gcc/vec.cc | 17 +
 gcc/vec.h  | 27 +--
 2 files changed, 34 insertions(+), 10 deletions(-)

diff --git a/gcc/vec.cc b/gcc/vec.cc
index 511e6dff50d..2128fb1 100644
--- a/gcc/vec.cc
+++ b/gcc/vec.cc
@@ -568,6 +568,22 @@ test_auto_delete_vec ()
   ASSERT_EQ (dtor_count, 2);
 }
 
+/* Verify accesses to m_vecdata are done indirectly.  */
+
+static void
+test_auto_alias ()
+{
+  volatile int i = 1;
+  auto_vec v;
+  v.quick_grow (2);
+  v[0] = 1;
+  v[1] = 2;
+  int val;
+  for (int ix = i; v.iterate (ix, ); ix++)
+ASSERT_EQ (val, 2);
+  ASSERT_EQ (val, 0);
+}
+
 /* Run all of the selftests within this file.  */
 
 void
@@ -587,6 +603,7 @@ vec_cc_tests ()
   test_qsort ();
   test_reverse ();
   test_auto_delete_vec ();
+  test_auto_alias ();
 }
 
 } // namespace selftest
diff --git a/gcc/vec.h b/gcc/vec.h
index 2b36f065234..3b03bfe076a 100644
--- a/gcc/vec.h
+++ b/gcc/vec.h
@@ -586,8 +586,9 @@ public:
   unsigned allocated (void) const { return m_vecpfx.m_alloc; }
   unsigned length (void) const { return m_vecpfx.m_num; }
   bool is_empty (void) const { return m_vecpfx.m_num == 0; }
-  T *address (void) { return m_vecdata; }
-  const T *address (void) const { return m_vecdata; }
+  T *address (void) { return reinterpret_cast  (this + 1); }
+  const T *address (void) const
+{ return reinterpret_cast  (this + 1); }
   T *begin () { return address (); }
   const T *begin () const { return address (); }
   T *end () { return address () + length (); }
@@ -629,10 +630,10 @@ public:
   friend struct va_gc_atomic;
   friend struct va_heap;
 
-  /* FIXME - These fields should be private, but we need to cater to
+  /* FIXME - This field should be private, but we need to cater to
 compilers that have stricter notions of PODness for types.  */
-  vec_prefix m_vecpfx;
-  T m_vecdata[1];
+  /* Align m_vecpfx to simplify address ().  */
+  alignas (T) alignas (vec_prefix) vec_prefix m_vecpfx;
 };
 
 
@@ -1315,7 +1316,7 @@ vec::embedded_size (unsigned alloc)
vec, vec_embedded>::type vec_stdlayout;
   static_assert (sizeof (vec_stdlayout) == sizeof (vec), "");
   static_assert (alignof (vec_stdlayout) == alignof (vec), "");
-  return offsetof (vec_stdlayout, m_vecdata) + alloc * sizeof (T);
+  return sizeof (vec_stdlayout) + alloc * sizeof (T);
 }
 
 
@@ -1559,8 +1560,14 @@ class auto_vec : public vec
 public:
   auto_vec ()
   {
-m_auto.embedded_init (MAX (N, 2), 0, 1);
-this->m_vec = _auto;
+m_auto.embedded_init (N, 0, 1);
+/* ???  Instead of initializing m_vec from _auto directly use an
+   expression that avoids refering to a specific member of 'this'
+   to derail the -Wstringop-overflow diagnostic code, avoiding
+   the impression that data accesses are supposed to be to the
+   m_auto memmber storage.  */
+size_t off = (char *) _auto - (char *) this;
+this->m_vec = (vec *) ((char *) this + off);
   }
 
   auto_vec (size_t s CXX_MEM_STAT_INFO)
@@ -1571,7 +1578,7 @@ public:
return;
   }
 
-m_auto.embedded_init (MAX (N, 2), 0, 1);
+m_auto.embedded_init (N, 0, 1);
 this->m_vec = _auto;
   }
 
@@ -1590,7 +1597,7 @@ public:
 
 private:
   vec m_auto;
-  T m_data[MAX (N - 1, 1)];
+  unsigned char m_data[sizeof (T) * N];
 };
 
 /* auto_vec is a sub class of vec whose storage is released when it is
-- 
2.35.3


[PATCH 1/2] Change vec<, , vl_embed>::m_vecdata refrences into address ()

2023-02-24 Thread Richard Biener via Gcc-patches
As preparation to remove m_vecdata in the vl_embed vector this
changes references to it into calls to address ().

As I was here it also fixes ::contains to avoid repeated bounds
checking and the same issue in ::lower_bound which also suffers
from unnecessary copying around values.

* vec.h: Change m_vecdata references to address ().
* vec.h (vec::lower_bound): Adjust to
take a const reference to the object, use address to
access data.
(vec::contains): Use address to access data.
(vec::operator[]): Use address instead of
m_vecdata to access data.
(vec::iterate): Likewise.
(vec::copy): Likewise.
(vec::quick_push): Likewise.
(vec::pop): Likewise.
(vec::quick_insert): Likewise.
(vec::ordered_remove): Likewise.
(vec::unordered_remove): Likewise.
(vec::block_remove): Likewise.
(vec::address): Likewise.
---
 gcc/vec.h | 44 +---
 1 file changed, 25 insertions(+), 19 deletions(-)

diff --git a/gcc/vec.h b/gcc/vec.h
index a536b68732d..2b36f065234 100644
--- a/gcc/vec.h
+++ b/gcc/vec.h
@@ -611,10 +611,10 @@ public:
   void qsort (int (*) (const void *, const void *));
   void sort (int (*) (const void *, const void *, void *), void *);
   void stablesort (int (*) (const void *, const void *, void *), void *);
-  T *bsearch (const void *key, int (*compar)(const void *, const void *));
+  T *bsearch (const void *key, int (*compar) (const void *, const void *));
   T *bsearch (const void *key,
  int (*compar)(const void *, const void *, void *), void *);
-  unsigned lower_bound (T, bool (*)(const T &, const T &)) const;
+  unsigned lower_bound (const T &, bool (*) (const T &, const T &)) const;
   bool contains (const T ) const;
   static size_t embedded_size (unsigned);
   void embedded_init (unsigned, unsigned = 0, unsigned = 0);
@@ -879,7 +879,7 @@ inline const T &
 vec::operator[] (unsigned ix) const
 {
   gcc_checking_assert (ix < m_vecpfx.m_num);
-  return m_vecdata[ix];
+  return address ()[ix];
 }
 
 template
@@ -887,7 +887,7 @@ inline T &
 vec::operator[] (unsigned ix)
 {
   gcc_checking_assert (ix < m_vecpfx.m_num);
-  return m_vecdata[ix];
+  return address ()[ix];
 }
 
 
@@ -929,7 +929,7 @@ vec::iterate (unsigned ix, T *ptr) const
 {
   if (ix < m_vecpfx.m_num)
 {
-  *ptr = m_vecdata[ix];
+  *ptr = address ()[ix];
   return true;
 }
   else
@@ -955,7 +955,7 @@ vec::iterate (unsigned ix, T **ptr) const
 {
   if (ix < m_vecpfx.m_num)
 {
-  *ptr = CONST_CAST (T *, _vecdata[ix]);
+  *ptr = CONST_CAST (T *,  ()[ix]);
   return true;
 }
   else
@@ -978,7 +978,7 @@ vec::copy (ALONE_MEM_STAT_DECL) const
 {
   vec_alloc (new_vec, len PASS_MEM_STAT);
   new_vec->embedded_init (len, len);
-  vec_copy_construct (new_vec->address (), m_vecdata, len);
+  vec_copy_construct (new_vec->address (), address (), len);
 }
   return new_vec;
 }
@@ -1018,7 +1018,7 @@ inline T *
 vec::quick_push (const T )
 {
   gcc_checking_assert (space (1));
-  T *slot = _vecdata[m_vecpfx.m_num++];
+  T *slot =  ()[m_vecpfx.m_num++];
   *slot = obj;
   return slot;
 }
@@ -1031,7 +1031,7 @@ inline T &
 vec::pop (void)
 {
   gcc_checking_assert (length () > 0);
-  return m_vecdata[--m_vecpfx.m_num];
+  return address ()[--m_vecpfx.m_num];
 }
 
 
@@ -1056,7 +1056,7 @@ vec::quick_insert (unsigned ix, const T 
)
 {
   gcc_checking_assert (length () < allocated ());
   gcc_checking_assert (ix <= length ());
-  T *slot = _vecdata[ix];
+  T *slot =  ()[ix];
   memmove (slot + 1, slot, (m_vecpfx.m_num++ - ix) * sizeof (T));
   *slot = obj;
 }
@@ -1071,7 +1071,7 @@ inline void
 vec::ordered_remove (unsigned ix)
 {
   gcc_checking_assert (ix < length ());
-  T *slot = _vecdata[ix];
+  T *slot =  ()[ix];
   memmove (slot, slot + 1, (--m_vecpfx.m_num - ix) * sizeof (T));
 }
 
@@ -1118,7 +1118,8 @@ inline void
 vec::unordered_remove (unsigned ix)
 {
   gcc_checking_assert (ix < length ());
-  m_vecdata[ix] = m_vecdata[--m_vecpfx.m_num];
+  T *p = address ();
+  p[ix] = p[--m_vecpfx.m_num];
 }
 
 
@@ -1130,7 +1131,7 @@ inline void
 vec::block_remove (unsigned ix, unsigned len)
 {
   gcc_checking_assert (ix + len <= length ());
-  T *slot = _vecdata[ix];
+  T *slot =  ()[ix];
   m_vecpfx.m_num -= len;
   memmove (slot, slot + len, (m_vecpfx.m_num - ix) * sizeof (T));
 }
@@ -1248,9 +1249,13 @@ inline bool
 vec::contains (const T ) const
 {
   unsigned int len = length ();
+  const T *p = address ();
   for (unsigned int i = 0; i < len; i++)
-if ((*this)[i] == search)
-  return true;
+{
+  const T *slot = [i];
+  if (*slot == search)
+   return true;
+}
 
   return false;
 }
@@ -1262,7 +1267,8 @@ vec::contains (const T ) const
 
 template
 unsigned
-vec::lower_bound (T obj, bool (*lessthan)(const T &, const T 
&))
+vec::lower_bound (const T ,
+ bool 

Re: [PATCH] cgraphclones: Don't share DECL_ARGUMENTS between thunk and its artificial thunk [PR108854]

2023-02-24 Thread Jan Hubicka via Gcc-patches
> Hi!
> 
> The following testcase ICEs on x86_64-linux with -m32.  The problem is
> we create an artificial thunk and because of -fPIC, ia32 and thunk
> destination which doesn't bind locally can't use a mi thunk.
> The ICE is because during expansion to RTL we see SSA_NAME for a PARM_DECL,
> but the PARM_DECL doesn't have DECL_CONTEXT of the current function.
> This is because duplicate_thunk_for_node creates a new DECL_ARGUMENTS chain
> only if some arguments need modification.
> 
> The following patch fixes it by copying the DECL_ARGUMENTS list even if
> the arguments can stay as is, to update DECL_CONTEXT on them.  While for
> mi thunks it doesn't really matter because we don't use those arguments
> in any way, for other thunks it is important.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2023-02-23  Jakub Jelinek  
> 
>   PR middle-end/108854
>   * cgraphclones.cc (duplicate_thunk_for_node): If no parameter
>   changes are needed, copy at least DECL_ARGUMENTS PARM_DECL
>   nodes and adjust their DECL_CONTEXT.
> 
>   * g++.dg/opt/pr108854.C: New test.
> 
> --- gcc/cgraphclones.cc.jj2023-02-22 20:50:27.417519830 +0100
> +++ gcc/cgraphclones.cc   2023-02-23 17:12:59.875133883 +0100
> @@ -218,7 +218,17 @@ duplicate_thunk_for_node (cgraph_node *t
>body_adj.modify_formal_parameters ();
>  }
>else
> -new_decl = copy_node (thunk->decl);
> +{
> +  new_decl = copy_node (thunk->decl);
> +  for (tree *arg = _ARGUMENTS (new_decl);
> +*arg; arg = _CHAIN (*arg))
> + {
> +   tree next = DECL_CHAIN (*arg);
> +   *arg = copy_node (*arg);
> +   DECL_CONTEXT (*arg) = new_decl;
> +   DECL_CHAIN (*arg) = next;

This makes sense to me. I wonder if we don't want to update abstract
origin too like we do in tree-inline?
Maybe it is unecessary since we don't do debug info for thunks

Jan


Re: [PATCH 2/2] Avoid default-initializing auto_vec storage, fix vec

2023-02-24 Thread Jakub Jelinek via Gcc-patches
On Fri, Feb 24, 2023 at 12:15:04PM +, Richard Biener wrote:
> > > Anyway, I wonder if you get the -Werror=stringop-overflow= errors during
> > > bootstrap that I got with my version or not.
> 
> Yes, I get this as well, not sure how to suppress it.  I guess there's
> no standard way to get at the address after some object without going
> through uintptr obfuscation - and obviously we do not want to have
> that (and if we optimize it away that doesn't help the diagnostic ...)

As I wrote on IRC, incremental:
--- gcc/vec.h   2023-02-24 11:54:49.859564268 +0100
+++ gcc/vec.h   2023-02-24 14:13:38.199163428 +0100
@@ -1553,7 +1553,8 @@
   auto_vec ()
   {
 m_auto.embedded_init (MAX (N, 2), 0, 1);
-this->m_vec = _auto;
+size_t off = (char *) _auto - (char *) this;
+this->m_vec = (vec *) ((char *) this + off);
   }
 
   auto_vec (size_t s CXX_MEM_STAT_INFO)
@@ -1565,7 +1566,8 @@
   }
 
 m_auto.embedded_init (MAX (N, 2), 0, 1);
-this->m_vec = _auto;
+size_t off = (char *) _auto - (char *) this;
+this->m_vec = (vec *) ((char *) this + off);
   }
 
   ~auto_vec ()
fixes the -Werror=stringop-overflow= errors for me during stage3, but
haven't done full bootstrap.
It is very ugly, but works.

Jakub



Re: [PATCH] RISC-V: Disable attribute generation by default

2023-02-24 Thread Kito Cheng via Gcc-patches
It did help people to identify what extension used in the binary, so I
would prefer keep that enable by default.

and lld is begin fix those merge issue, so the situation should be improved
soon.


Palmer Dabbelt  於 2023年2月24日 週五 10:29 寫道:

> We generate a handful of attributes by default, but they don't really
> encode any useful information.  We've broadly stopped ascribing any
> meaning to them in binutils; but they trip up LLVM, older toolchains,
> and users.  So let's just turn them off by default.  The old binaries
> will still be floating around, but at least this way we'll stop tripping
> over new incompatibilities.
>
> If we get to a point where there's some attributes that are defined that
> we can use then we can sort out how to turn those on without turning on
> the old ones, but unless I'm missing something the current set of
> attributes are too broken to be useful for anything.
>
> gcc/ChangeLog:
>
> * config.gcc (--with-riscv-attribute): Default to off.
> ---
> I know it's pretty late, but I'd like to target this for GCC-13.  The
> Zmmul stuff has resulted in another round of build breakages that we're
> going to have to chase down, and while we could update everything to
> turn off the attributes it seems easier to just set the default.
> ---
>  gcc/config.gcc | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/config.gcc b/gcc/config.gcc
> index c070e6ecd2e..52639cf26d6 100644
> --- a/gcc/config.gcc
> +++ b/gcc/config.gcc
> @@ -4596,7 +4596,7 @@ case "${target}" in
> tm_defines="${tm_defines} TARGET_RISCV_ATTRIBUTE=0"
> ;;
> ""|default)
> -   tm_defines="${tm_defines} TARGET_RISCV_ATTRIBUTE=1"
> +   tm_defines="${tm_defines} TARGET_RISCV_ATTRIBUTE=0"
> ;;
> *)
> echo
> "--with-riscv-attribute=${with_riscv_attribute} is not supported.  The
> argument must begin with yes, no or default." 1>&2
> --
> 2.39.1
>
>


Re: [PATCH 2/2] Avoid default-initializing auto_vec storage, fix vec

2023-02-24 Thread Jakub Jelinek via Gcc-patches
On Fri, Feb 24, 2023 at 12:15:04PM +, Richard Biener wrote:
> > > Also, I think it needs to be MAX (N, 2) instead of N, because auto_vec
> > > ctors use MAX (N, 2).  We could also change all those to MAX (N, 1)
> > > now, but it can't be N because m_data[sizeof (T) * 0] is invalid in
> > > standard C.
> 
> I've removed the MAX (N, 2) now, I think that N == 0 cannot happen
> because we have a specialization covering that.  So we know N is
> at least 1.

I think you're right.

> > > Anyway, I wonder if you get the -Werror=stringop-overflow= errors during
> > > bootstrap that I got with my version or not.
> 
> Yes, I get this as well, not sure how to suppress it.  I guess there's
> no standard way to get at the address after some object without going
> through uintptr obfuscation - and obviously we do not want to have
> that (and if we optimize it away that doesn't help the diagnostic ...)

I think we need to look at the exact IL on which it warns and see what
our options are.

Jakub



Re: [PATCH 1/2] Change vec<, , vl_embed>::m_vecdata refrences into address ()

2023-02-24 Thread Richard Biener via Gcc-patches
On Fri, 24 Feb 2023, Jakub Jelinek wrote:

> On Fri, Feb 24, 2023 at 12:32:45PM +0100, Richard Biener via Gcc-patches 
> wrote:
> > --- a/gcc/vec.h
> > +++ b/gcc/vec.h
> > @@ -614,7 +614,7 @@ public:
> >T *bsearch (const void *key, int (*compar)(const void *, const void *));
> >T *bsearch (const void *key,
> >   int (*compar)(const void *, const void *, void *), void *);
> > -  unsigned lower_bound (T, bool (*)(const T &, const T &)) const;
> > +  unsigned lower_bound (const T &, bool (*)(const T &, const T &)) const;
> 
> Missing space after (*) while you're there.
> 
> > @@ -929,7 +929,7 @@ vec::iterate (unsigned ix, T *ptr) const
> >  {
> >if (ix < m_vecpfx.m_num)
> >  {
> > -  *ptr = m_vecdata[ix];
> > +  *ptr = address()[ix];
> 
> Missing space before ().
> 
> > @@ -1118,7 +1118,7 @@ inline void
> >  vec::unordered_remove (unsigned ix)
> >  {
> >gcc_checking_assert (ix < length ());
> > -  m_vecdata[ix] = m_vecdata[--m_vecpfx.m_num];
> > +  address ()[ix] = address ()[--m_vecpfx.m_num];
> >  }
> 
> As address () is used twice here, can't we stick it into a temporary
> and use twice then?
> 
> > @@ -1249,8 +1249,11 @@ vec::contains (const T ) const
> >  {
> >unsigned int len = length ();
> >for (unsigned int i = 0; i < len; i++)
> > -if ((*this)[i] == search)
> > -  return true;
> > +{
> > +  const T *slot =  ()[i];
> > +  if (*slot == search)
> > +   return true;
> 
> Similarly, can't we do address () once before the loop into a temporary?
> 
> >  template
> >  unsigned
> > -vec::lower_bound (T obj, bool (*lessthan)(const T &, const 
> > T &))
> > +vec::lower_bound (const T ,
> > + bool (*lessthan)(const T &, const T &))
> 
> ) ( while you're at it.
> 
> Otherwise LGTM.

All fixed.

Richard.


Re: [PATCH 2/2] Avoid default-initializing auto_vec storage, fix vec

2023-02-24 Thread Richard Biener via Gcc-patches
On Fri, 24 Feb 2023, Jonathan Wakely wrote:

> On Fri, 24 Feb 2023 at 11:52, Jakub Jelinek  wrote:
> >
> > On Fri, Feb 24, 2023 at 12:44:44PM +0100, Richard Biener wrote:
> > > --- a/gcc/vec.h
> > > +++ b/gcc/vec.h
> > > @@ -586,8 +586,8 @@ public:
> > >unsigned allocated (void) const { return m_vecpfx.m_alloc; }
> > >unsigned length (void) const { return m_vecpfx.m_num; }
> > >bool is_empty (void) const { return m_vecpfx.m_num == 0; }
> > > -  T *address (void) { return m_vecdata; }
> > > -  const T *address (void) const { return m_vecdata; }
> > > +  T *address (void) { return reinterpret_cast  (this + 1); }
> > > +  const T *address (void) const { return reinterpret_cast  
> > > (this + 1); }
> >
> > This is now too long.

Fixed.

> > >T *begin () { return address (); }
> > >const T *begin () const { return address (); }
> > >T *end () { return address () + length (); }
> > > @@ -631,8 +631,7 @@ public:
> > >
> > >/* FIXME - These fields should be private, but we need to cater to
> > >compilers that have stricter notions of PODness for types.  */
> > > -  vec_prefix m_vecpfx;
> > > -  T m_vecdata[1];
> > > +  alignas (T) vec_prefix m_vecpfx;
> >
> > The comment needs adjustment and down't we need
> > alignas (T) alignas (vec_prefix) ?
> 
> Yes. If alignas(T) is less than the natural alignment then this will
> be an error. We want it to be the larger of  the two alignments, so we
> need to specify both.

OK, changed to specify both and adjusted the comment, also noting why
we do this - it simplifies address (), otherwise we'd have to round up
to an aligned address.

> >
> > > @@ -1588,7 +1587,7 @@ public:
> > >
> > >  private:
> > >vec m_auto;
> > > -  T m_data[MAX (N - 1, 1)];
> > > +  alignas(T) unsigned char m_data[sizeof (T) * N];
> > >  };
> >
> > I still believe you don't need alignas(T) here (and space before (T) ).

I was worried that with auto_vec<__int128> we get tail-padding in m_auto
re-used, but since this isn't inheritance we're probably safe.  So 
removed give that m_auto is aligned to T.

> > Also, I think it needs to be MAX (N, 2) instead of N, because auto_vec
> > ctors use MAX (N, 2).  We could also change all those to MAX (N, 1)
> > now, but it can't be N because m_data[sizeof (T) * 0] is invalid in
> > standard C.

I've removed the MAX (N, 2) now, I think that N == 0 cannot happen
because we have a specialization covering that.  So we know N is
at least 1.

> > Anyway, I wonder if you get the -Werror=stringop-overflow= errors during
> > bootstrap that I got with my version or not.

Yes, I get this as well, not sure how to suppress it.  I guess there's
no standard way to get at the address after some object without going
through uintptr obfuscation - and obviously we do not want to have
that (and if we optimize it away that doesn't help the diagnostic ...)

Richard.


[committed] i386: Update i386-builtin.def file comment description of BDESC{,_FIRST}

2023-02-24 Thread Jakub Jelinek via Gcc-patches
Hi!

I've noticed the description of these wasn't updated when the mask2
argument has been added in 2019.

Committed to trunk as obvious.

2023-02-24  Jakub Jelinek  

* config/i386/i386-builtin.def: Update description of BDESC
and BDESC_FIRST in file comment to include mask2.

--- gcc/config/i386/i386-builtin.def.jj 2023-02-24 10:12:37.027390923 +0100
+++ gcc/config/i386/i386-builtin.def2023-02-24 13:07:29.453512100 +0100
@@ -23,9 +23,9 @@
.  */
 
 /* Before including this file, some macros must be defined:
-   BDESC (mask, icode, name, code, comparison, flag)
+   BDESC (mask, mask2, icode, name, code, comparison, flag)
  -- definition of each builtin
-   BDESC_FIRST (kind, KIND, mask, icode, name, code, comparison, flag)
+   BDESC_FIRST (kind, KIND, mask, mask2, icode, name, code, comparison, flag)
  -- like BDESC, but used for the first builtin in each category;
bdesc_##kind will be used in the name of the array and
IX86_BUILTIN__BDESC_##KIND##_FIRST will be the low boundary

Jakub



Re: [PATCH 2/2] Avoid default-initializing auto_vec storage, fix vec

2023-02-24 Thread Jakub Jelinek via Gcc-patches
On Fri, Feb 24, 2023 at 11:54:54AM +, Jonathan Wakely wrote:
> > The comment needs adjustment and don't we need
> > alignas (T) alignas (vec_prefix) ?
> 
> Yes. If alignas(T) is less than the natural alignment then this will
> be an error. We want it to be the larger of  the two alignments, so we
> need to specify both.

Seems g++ doesn't diagnose this but clang++ does:
struct S { int a; };
alignas (char) alignas (S) S s;
alignas (char) S t;
$ g++ -S -o /tmp/1.s /tmp/1.C -pedantic-errors
$ clang++ -S -o /tmp/1.s /tmp/1.C -pedantic-errors
/tmp/1.C:3:1: error: requested alignment is less than minimum alignment of 4 
for type 'S'
alignas (char) S t;
^
1 error generated.

Jakub



Re: [PATCH 2/2] Avoid default-initializing auto_vec storage, fix vec

2023-02-24 Thread Jonathan Wakely via Gcc-patches
On Fri, 24 Feb 2023 at 11:52, Jakub Jelinek  wrote:
>
> On Fri, Feb 24, 2023 at 12:44:44PM +0100, Richard Biener wrote:
> > --- a/gcc/vec.h
> > +++ b/gcc/vec.h
> > @@ -586,8 +586,8 @@ public:
> >unsigned allocated (void) const { return m_vecpfx.m_alloc; }
> >unsigned length (void) const { return m_vecpfx.m_num; }
> >bool is_empty (void) const { return m_vecpfx.m_num == 0; }
> > -  T *address (void) { return m_vecdata; }
> > -  const T *address (void) const { return m_vecdata; }
> > +  T *address (void) { return reinterpret_cast  (this + 1); }
> > +  const T *address (void) const { return reinterpret_cast  
> > (this + 1); }
>
> This is now too long.
>
> >T *begin () { return address (); }
> >const T *begin () const { return address (); }
> >T *end () { return address () + length (); }
> > @@ -631,8 +631,7 @@ public:
> >
> >/* FIXME - These fields should be private, but we need to cater to
> >compilers that have stricter notions of PODness for types.  */
> > -  vec_prefix m_vecpfx;
> > -  T m_vecdata[1];
> > +  alignas (T) vec_prefix m_vecpfx;
>
> The comment needs adjustment and down't we need
> alignas (T) alignas (vec_prefix) ?

Yes. If alignas(T) is less than the natural alignment then this will
be an error. We want it to be the larger of  the two alignments, so we
need to specify both.

>
> > @@ -1588,7 +1587,7 @@ public:
> >
> >  private:
> >vec m_auto;
> > -  T m_data[MAX (N - 1, 1)];
> > +  alignas(T) unsigned char m_data[sizeof (T) * N];
> >  };
>
> I still believe you don't need alignas(T) here (and space before (T) ).
> Also, I think it needs to be MAX (N, 2) instead of N, because auto_vec
> ctors use MAX (N, 2).  We could also change all those to MAX (N, 1)
> now, but it can't be N because m_data[sizeof (T) * 0] is invalid in
> standard C.
>
> Anyway, I wonder if you get the -Werror=stringop-overflow= errors during
> bootstrap that I got with my version or not.
>
> Jakub
>



Re: [PATCH 2/2] Avoid default-initializing auto_vec storage, fix vec

2023-02-24 Thread Jakub Jelinek via Gcc-patches
On Fri, Feb 24, 2023 at 12:44:44PM +0100, Richard Biener wrote:
> --- a/gcc/vec.h
> +++ b/gcc/vec.h
> @@ -586,8 +586,8 @@ public:
>unsigned allocated (void) const { return m_vecpfx.m_alloc; }
>unsigned length (void) const { return m_vecpfx.m_num; }
>bool is_empty (void) const { return m_vecpfx.m_num == 0; }
> -  T *address (void) { return m_vecdata; }
> -  const T *address (void) const { return m_vecdata; }
> +  T *address (void) { return reinterpret_cast  (this + 1); }
> +  const T *address (void) const { return reinterpret_cast  (this 
> + 1); }

This is now too long.

>T *begin () { return address (); }
>const T *begin () const { return address (); }
>T *end () { return address () + length (); }
> @@ -631,8 +631,7 @@ public:
>  
>/* FIXME - These fields should be private, but we need to cater to
>compilers that have stricter notions of PODness for types.  */
> -  vec_prefix m_vecpfx;
> -  T m_vecdata[1];
> +  alignas (T) vec_prefix m_vecpfx;

The comment needs adjustment and down't we need
alignas (T) alignas (vec_prefix) ?

> @@ -1588,7 +1587,7 @@ public:
>  
>  private:
>vec m_auto;
> -  T m_data[MAX (N - 1, 1)];
> +  alignas(T) unsigned char m_data[sizeof (T) * N];
>  };

I still believe you don't need alignas(T) here (and space before (T) ).
Also, I think it needs to be MAX (N, 2) instead of N, because auto_vec
ctors use MAX (N, 2).  We could also change all those to MAX (N, 1)
now, but it can't be N because m_data[sizeof (T) * 0] is invalid in
standard C.

Anyway, I wonder if you get the -Werror=stringop-overflow= errors during
bootstrap that I got with my version or not.

Jakub



Re: [PATCH 1/2] Change vec<, , vl_embed>::m_vecdata refrences into address ()

2023-02-24 Thread Jakub Jelinek via Gcc-patches
On Fri, Feb 24, 2023 at 12:32:45PM +0100, Richard Biener via Gcc-patches wrote:
> --- a/gcc/vec.h
> +++ b/gcc/vec.h
> @@ -614,7 +614,7 @@ public:
>T *bsearch (const void *key, int (*compar)(const void *, const void *));
>T *bsearch (const void *key,
> int (*compar)(const void *, const void *, void *), void *);
> -  unsigned lower_bound (T, bool (*)(const T &, const T &)) const;
> +  unsigned lower_bound (const T &, bool (*)(const T &, const T &)) const;

Missing space after (*) while you're there.

> @@ -929,7 +929,7 @@ vec::iterate (unsigned ix, T *ptr) const
>  {
>if (ix < m_vecpfx.m_num)
>  {
> -  *ptr = m_vecdata[ix];
> +  *ptr = address()[ix];

Missing space before ().

> @@ -1118,7 +1118,7 @@ inline void
>  vec::unordered_remove (unsigned ix)
>  {
>gcc_checking_assert (ix < length ());
> -  m_vecdata[ix] = m_vecdata[--m_vecpfx.m_num];
> +  address ()[ix] = address ()[--m_vecpfx.m_num];
>  }

As address () is used twice here, can't we stick it into a temporary
and use twice then?

> @@ -1249,8 +1249,11 @@ vec::contains (const T ) const
>  {
>unsigned int len = length ();
>for (unsigned int i = 0; i < len; i++)
> -if ((*this)[i] == search)
> -  return true;
> +{
> +  const T *slot =  ()[i];
> +  if (*slot == search)
> + return true;

Similarly, can't we do address () once before the loop into a temporary?

>  template
>  unsigned
> -vec::lower_bound (T obj, bool (*lessthan)(const T &, const T 
> &))
> +vec::lower_bound (const T ,
> +   bool (*lessthan)(const T &, const T &))

) ( while you're at it.

Otherwise LGTM.

Jakub



[PATCH 2/2] Avoid default-initializing auto_vec storage, fix vec

2023-02-24 Thread Richard Biener via Gcc-patches
The following avoids default-initializing auto_vec storage for
non-POD T since that's not what the allocated storage fallback
will do and it's also not expected for existing cases like

  auto_vec, 64> elts;

which exist to optimize the allocation.

It also fixes the array accesses done by vec to not
use its own m_vecdata member but instead access the container
provided storage via pointer arithmetic.

This seems to work but it also somehow breaks genrecog which now
goes OOM with this change.  I'm going to see if the testsuite
shows anything, but maybe it's obvious from a second eye what
I did wrong ...

Comments welcome of course.

Thanks,
Richard.

* vec.h (vec::m_vecdata): Remove.
(vec::m_vecpfx): Align as T to avoid
changing alignment of vec and simplifying
address.
(vec::address): Compute as this + 1.
(vec::embedded_size): Use sizeof the
vector instead of the offset of the m_vecdata member.
(auto_vec): Turn m_data storage into
uninitialized unsigned char aligned as T.
* vec.cc (test_auto_alias): New.
(vec_cc_tests): Call it.
---
 gcc/vec.cc | 17 +
 gcc/vec.h  | 11 +--
 2 files changed, 22 insertions(+), 6 deletions(-)

diff --git a/gcc/vec.cc b/gcc/vec.cc
index 511e6dff50d..2128fb1 100644
--- a/gcc/vec.cc
+++ b/gcc/vec.cc
@@ -568,6 +568,22 @@ test_auto_delete_vec ()
   ASSERT_EQ (dtor_count, 2);
 }
 
+/* Verify accesses to m_vecdata are done indirectly.  */
+
+static void
+test_auto_alias ()
+{
+  volatile int i = 1;
+  auto_vec v;
+  v.quick_grow (2);
+  v[0] = 1;
+  v[1] = 2;
+  int val;
+  for (int ix = i; v.iterate (ix, ); ix++)
+ASSERT_EQ (val, 2);
+  ASSERT_EQ (val, 0);
+}
+
 /* Run all of the selftests within this file.  */
 
 void
@@ -587,6 +603,7 @@ vec_cc_tests ()
   test_qsort ();
   test_reverse ();
   test_auto_delete_vec ();
+  test_auto_alias ();
 }
 
 } // namespace selftest
diff --git a/gcc/vec.h b/gcc/vec.h
index 5a2ee9c0294..b680efebe7a 100644
--- a/gcc/vec.h
+++ b/gcc/vec.h
@@ -586,8 +586,8 @@ public:
   unsigned allocated (void) const { return m_vecpfx.m_alloc; }
   unsigned length (void) const { return m_vecpfx.m_num; }
   bool is_empty (void) const { return m_vecpfx.m_num == 0; }
-  T *address (void) { return m_vecdata; }
-  const T *address (void) const { return m_vecdata; }
+  T *address (void) { return reinterpret_cast  (this + 1); }
+  const T *address (void) const { return reinterpret_cast  (this + 
1); }
   T *begin () { return address (); }
   const T *begin () const { return address (); }
   T *end () { return address () + length (); }
@@ -631,8 +631,7 @@ public:
 
   /* FIXME - These fields should be private, but we need to cater to
 compilers that have stricter notions of PODness for types.  */
-  vec_prefix m_vecpfx;
-  T m_vecdata[1];
+  alignas (T) vec_prefix m_vecpfx;
 };
 
 
@@ -1313,7 +1312,7 @@ vec::embedded_size (unsigned alloc)
vec, vec_embedded>::type vec_stdlayout;
   static_assert (sizeof (vec_stdlayout) == sizeof (vec), "");
   static_assert (alignof (vec_stdlayout) == alignof (vec), "");
-  return offsetof (vec_stdlayout, m_vecdata) + alloc * sizeof (T);
+  return sizeof (vec_stdlayout) + alloc * sizeof (T);
 }
 
 
@@ -1588,7 +1587,7 @@ public:
 
 private:
   vec m_auto;
-  T m_data[MAX (N - 1, 1)];
+  alignas(T) unsigned char m_data[sizeof (T) * N];
 };
 
 /* auto_vec is a sub class of vec whose storage is released when it is
-- 
2.35.3


[PATCH][committed] aarch64: Update FLAGS field documentation comment in aarch64-cores.def

2023-02-24 Thread Kyrylo Tkachov via Gcc-patches
Hi all,

With the cleanup of the arch features in GCC 13 the comment on the FLAGS field 
in aarch64-cores.def
is now outdated. It's now a comma-separated list rather than a bitwise or.
Spotted while reviewing an aarch64-cores.def patch.
Update the comment.

Pushing to trunk.
Thanks,
Kyrill

gcc/ChangeLog:

* config/aarch64/aarch64-cores.def (FLAGS): Update comment.


aarch64-comment.patch
Description: aarch64-comment.patch


[PATCH 1/2] Change vec<, , vl_embed>::m_vecdata refrences into address ()

2023-02-24 Thread Richard Biener via Gcc-patches
As preparation to remove m_vecdata in the vl_embed vector this
changes references to it into calls to address ().

As I was here it also fixes ::contains to avoid repeated bounds
checking and the same issue in ::lower_bound which also suffers
from unnecessary copying around values.

* vec.h: Change m_vecdata references to address ().
* vec.h (vec::lower_bound): Adjust to
take a const reference to the object, use address to
access data.
(vec::contains): Use address to access data.
(vec::operator[]): Use address instead of
m_vecdata to access data.
(vec::iterate): Likewise.
(vec::copy): Likewise.
(vec::quick_push): Likewise.
(vec::pop): Likewise.
(vec::quick_insert): Likewise.
(vec::ordered_remove): Likewise.
(vec::unordered_remove): Likewise.
(vec::block_remove): Likewise.
(vec::address): Likewise.
---
 gcc/vec.h | 40 ++--
 1 file changed, 22 insertions(+), 18 deletions(-)

diff --git a/gcc/vec.h b/gcc/vec.h
index a536b68732d..5a2ee9c0294 100644
--- a/gcc/vec.h
+++ b/gcc/vec.h
@@ -614,7 +614,7 @@ public:
   T *bsearch (const void *key, int (*compar)(const void *, const void *));
   T *bsearch (const void *key,
  int (*compar)(const void *, const void *, void *), void *);
-  unsigned lower_bound (T, bool (*)(const T &, const T &)) const;
+  unsigned lower_bound (const T &, bool (*)(const T &, const T &)) const;
   bool contains (const T ) const;
   static size_t embedded_size (unsigned);
   void embedded_init (unsigned, unsigned = 0, unsigned = 0);
@@ -879,7 +879,7 @@ inline const T &
 vec::operator[] (unsigned ix) const
 {
   gcc_checking_assert (ix < m_vecpfx.m_num);
-  return m_vecdata[ix];
+  return address ()[ix];
 }
 
 template
@@ -887,7 +887,7 @@ inline T &
 vec::operator[] (unsigned ix)
 {
   gcc_checking_assert (ix < m_vecpfx.m_num);
-  return m_vecdata[ix];
+  return address ()[ix];
 }
 
 
@@ -929,7 +929,7 @@ vec::iterate (unsigned ix, T *ptr) const
 {
   if (ix < m_vecpfx.m_num)
 {
-  *ptr = m_vecdata[ix];
+  *ptr = address()[ix];
   return true;
 }
   else
@@ -955,7 +955,7 @@ vec::iterate (unsigned ix, T **ptr) const
 {
   if (ix < m_vecpfx.m_num)
 {
-  *ptr = CONST_CAST (T *, _vecdata[ix]);
+  *ptr = CONST_CAST (T *,  ()[ix]);
   return true;
 }
   else
@@ -978,7 +978,7 @@ vec::copy (ALONE_MEM_STAT_DECL) const
 {
   vec_alloc (new_vec, len PASS_MEM_STAT);
   new_vec->embedded_init (len, len);
-  vec_copy_construct (new_vec->address (), m_vecdata, len);
+  vec_copy_construct (new_vec->address (), address (), len);
 }
   return new_vec;
 }
@@ -1018,7 +1018,7 @@ inline T *
 vec::quick_push (const T )
 {
   gcc_checking_assert (space (1));
-  T *slot = _vecdata[m_vecpfx.m_num++];
+  T *slot =  ()[m_vecpfx.m_num++];
   *slot = obj;
   return slot;
 }
@@ -1031,7 +1031,7 @@ inline T &
 vec::pop (void)
 {
   gcc_checking_assert (length () > 0);
-  return m_vecdata[--m_vecpfx.m_num];
+  return address ()[--m_vecpfx.m_num];
 }
 
 
@@ -1056,7 +1056,7 @@ vec::quick_insert (unsigned ix, const T 
)
 {
   gcc_checking_assert (length () < allocated ());
   gcc_checking_assert (ix <= length ());
-  T *slot = _vecdata[ix];
+  T *slot =  ()[ix];
   memmove (slot + 1, slot, (m_vecpfx.m_num++ - ix) * sizeof (T));
   *slot = obj;
 }
@@ -1071,7 +1071,7 @@ inline void
 vec::ordered_remove (unsigned ix)
 {
   gcc_checking_assert (ix < length ());
-  T *slot = _vecdata[ix];
+  T *slot =  ()[ix];
   memmove (slot, slot + 1, (--m_vecpfx.m_num - ix) * sizeof (T));
 }
 
@@ -1118,7 +1118,7 @@ inline void
 vec::unordered_remove (unsigned ix)
 {
   gcc_checking_assert (ix < length ());
-  m_vecdata[ix] = m_vecdata[--m_vecpfx.m_num];
+  address ()[ix] = address ()[--m_vecpfx.m_num];
 }
 
 
@@ -1130,7 +1130,7 @@ inline void
 vec::block_remove (unsigned ix, unsigned len)
 {
   gcc_checking_assert (ix + len <= length ());
-  T *slot = _vecdata[ix];
+  T *slot =  ()[ix];
   m_vecpfx.m_num -= len;
   memmove (slot, slot + len, (m_vecpfx.m_num - ix) * sizeof (T));
 }
@@ -1249,8 +1249,11 @@ vec::contains (const T ) const
 {
   unsigned int len = length ();
   for (unsigned int i = 0; i < len; i++)
-if ((*this)[i] == search)
-  return true;
+{
+  const T *slot =  ()[i];
+  if (*slot == search)
+   return true;
+}
 
   return false;
 }
@@ -1262,7 +1265,8 @@ vec::contains (const T ) const
 
 template
 unsigned
-vec::lower_bound (T obj, bool (*lessthan)(const T &, const T 
&))
+vec::lower_bound (const T ,
+ bool (*lessthan)(const T &, const T &))
   const
 {
   unsigned int len = length ();
@@ -1273,7 +1277,7 @@ vec::lower_bound (T obj, bool 
(*lessthan)(const T &, const T &))
   half = len / 2;
   middle = first;
   middle += half;
-  T middle_elem = (*this)[middle];
+  const T _elem = address ()[middle];
   if 

[Patch] Fortran: Skip bound conv in gfc_conv_gfc_desc_to_cfi_desc with intent(out) ptr [PR108621]

2023-02-24 Thread Tobias Burnus

[The following is about Fortran pointers as actual argument to a CFI taking 
procedure.]

The issue has been marked as 12/13 regression but the issue is just a 
diagnostic one.

To disentangle:

(A) Bogus warning
[Now tracked as middle-end https://gcc.gnu.org/PR108906 ]
Assume:
   nullify(p)
   call bind_c_proc(p)

For some reasons, the compiler does not propagate the NULL and thus assumes that
if (addr != NULL) could be true. This leads to a may-be-uninit warning with 
'-Wall -O0'.
(And no warning with -Og/-Os/-O1.)

We could silence it on the gfortran side, I think, by tweaking some tree 
properties,
but I am not sure we want to to it.

(The same kind of code is in GCC 11 but as it is hidden in libgfortran; hence, 
there are
no warnings when compiling user code. Since GCC 12, the compiler does the 
conversion
in place when generating the code.)



(B) The attached patch:

With 'intent(out)' there is no reason to do the conversions. While for nullified
pointers the bounds-conversion loop is skipped, it may still be executed for 
undefined
pointers. (Which is usually harmless.) In either case, not generating this code 
makes
sense.

OK for mainline?

Regarding GCC 12:  I am not really sure as it is no real regression. Besides 
bogus
warnings, there might be an issue for undefined pointers and 
-fsanitize=undefined, namely
if 'ubound - lbound' evaluated on random numbers overflows (such as for ubound 
= huge(..)
and lbound = -huge(..)). But that looks like a rather special case. - Thoughts?

Tobias
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
Fortran: Skip bound conv in gfc_conv_gfc_desc_to_cfi_desc with intent(out) ptr [PR108621]

When the dummy argument of the bind(C) proc is 'pointer, intent(out)', the conversion
of the GFC to the CFI bounds can be skipped: it is not needed and avoids issues with
noninit memory.


Note that the 'cfi->base_addr = gfc->addr' assignment is kept as the C code of a user
might assume that a nullified pointer arrives as NULL (or even a specific value).
For instance, gfortran.dg/c-interop/section-{1,2}.f90 assumes the value NULL.

Note 2: The PR is about a may-be-uninitialized warning with intent(out). In the PR's
testcase, the pointer was nullified and should not have produced that warning.
That is a diagnostic issue, now tracked as PR middle-end/108906 as the issue in principle
still exists (e.g. with 'intent(inout)'). [But no longer for intent(out).]

Note 3: With undefined pointers and no 'intent', accessing uninit memory is unavoidable
on the caller side as the compiler cannot know what the C function does (but this usage
determines whether the pointer is permitted be undefined or whether the bounds must be
gfc-to-cfi converted).


gcc/fortran/ChangeLog:

	PR fortran/108621
	* trans-expr.cc (gfc_conv_gfc_desc_to_cfi_desc):

gcc/testsuite/ChangeLog:

	PR fortran/108621
	* gfortran.dg/c-interop/fc-descriptor-pr108621.f90: New test.

 gcc/fortran/trans-expr.cc  |  6 ++
 .../c-interop/fc-descriptor-pr108621.f90   | 65 ++
 2 files changed, 71 insertions(+)

diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc
index e85b53fae85..045c8b00b90 100644
--- a/gcc/fortran/trans-expr.cc
+++ b/gcc/fortran/trans-expr.cc
@@ -5673,6 +5673,9 @@ gfc_conv_gfc_desc_to_cfi_desc (gfc_se *parmse, gfc_expr *e, gfc_symbol *fsym)
   gfc_add_modify (, tmp,
 		  build_int_cst (TREE_TYPE (tmp), attr));
 
+  /* The cfi-base_addr assignment could be skipped for 'pointer, intent(out)'.
+ That is very sensible for undefined pointers, but the C code might assume
+ that the pointer retains the value, in particular, if it was NULL.  */
   if (e->rank == 0)
 {
   tmp = gfc_get_cfi_desc_base_addr (cfi);
@@ -5695,6 +5698,9 @@ gfc_conv_gfc_desc_to_cfi_desc (gfc_se *parmse, gfc_expr *e, gfc_symbol *fsym)
   gfc_add_modify (, tmp2, fold_convert (TREE_TYPE (tmp2), tmp));
 }
 
+  if (fsym->attr.pointer && fsym->attr.intent == INTENT_OUT)
+goto done;
+
   /* When allocatable + intent out, free the cfi descriptor.  */
   if (fsym->attr.allocatable && fsym->attr.intent == INTENT_OUT)
 {
diff --git a/gcc/testsuite/gfortran.dg/c-interop/fc-descriptor-pr108621.f90 b/gcc/testsuite/gfortran.dg/c-interop/fc-descriptor-pr108621.f90
new file mode 100644
index 000..9c9062bd62d
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/c-interop/fc-descriptor-pr108621.f90
@@ -0,0 +1,65 @@
+! { dg-do compile }
+! { dg-additional-options "-fdump-tree-original" }
+!
+! PR fortran/108621
+!
+! If the bind(C) procedure's dummy argument is a POINTER with INTENT(OUT),
+! avoid converting the array bounds for the CFI descriptor before the call.
+!
+! Rational: Fewer code and, esp. for undefined pointers, there might be a
+! 

Re: [PATCH v3 00/11] RISC-V: Add XThead* extension support

2023-02-24 Thread Christoph Müllner
On Fri, Feb 24, 2023 at 9:09 AM Kito Cheng  wrote:
>
> Hi Christoph:
>
> OK for trunk for the 1~8, feel free to commit 1~8 after you address
> those minor comments, and could you also prepare release notes for
> those extensions?

I addressed the comment regarding XTheadBs.
But I have not done anything regarding XTheadB* and Zb*.

Release notes patch can be found here:
  https://gcc.gnu.org/pipermail/gcc-patches/2023-February/612763.html

> And 9~11 needs to take a few more rounds of review and test.

I've seen the comments regarding patch 10 and 11.
We will try to clean this up asap.

In the patch for XTheadMemPair there was this nasty typo in one of the tests,
is there anything else that is needed?
I believe that patch should be in a better shape than the last two patches
and it is much less invasive.
Further similar code can be found in other backends.

Thanks,
Christoph

>
>
>
>
> On Fri, Feb 24, 2023 at 1:52 PM Christoph Muellner
>  wrote:
> >
> > From: Christoph Müllner 
> >
> > This series introduces support for the T-Head specific RISC-V ISA extensions
> > which are available e.g. on the T-Head XuanTie C906.
> >
> > The ISA spec can be found here:
> >   https://github.com/T-head-Semi/thead-extension-spec
> >
> > This series adds support for the following XThead* extensions:
> > * XTheadBa
> > * XTheadBb
> > * XTheadBs
> > * XTheadCmo
> > * XTheadCondMov
> > * XTheadFMemIdx
> > * XTheadFmv
> > * XTheadInt
> > * XTheadMac
> > * XTheadMemIdx
> > * XTheadMemPair
> > * XTheadSync
> >
> > All extensions are properly integrated and the included tests
> > demonstrate the improvements of the generated code.
> >
> > The series also introduces support for "-mcpu=thead-c906", which also
> > enables all available XThead* ISA extensions of the T-Head C906.
> >
> > All patches have been tested and don't introduce regressions for RV32 or 
> > RV64.
> > The patches have also been tested with SPEC CPU2017 on QEMU and real HW
> > (D1 board).
> >
> > Support patches for these extensions for Binutils, QEMU, and LLVM have
> > already been merged in the corresponding upstream projects.
> >
> > Changes in v3:
> > - Bugfix in XTheadBa
> > - Rewrite of XTheadMemPair
> > - Inclusion of XTheadMemIdx and XTheadFMemIdx
> >
> > Christoph Müllner (9):
> >   riscv: Add basic XThead* vendor extension support
> >   riscv: riscv-cores.def: Add T-Head XuanTie C906
> >   riscv: thead: Add support for the XTheadBa ISA extension
> >   riscv: thead: Add support for the XTheadBs ISA extension
> >   riscv: thead: Add support for the XTheadBb ISA extension
> >   riscv: thead: Add support for the XTheadCondMov ISA extensions
> >   riscv: thead: Add support for the XTheadMac ISA extension
> >   riscv: thead: Add support for the XTheadFmv ISA extension
> >   riscv: thead: Add support for the XTheadMemPair ISA extension
> >
> > moiz.hussain (2):
> >   riscv: thead: Add support for the XTheadMemIdx ISA extension
> >   riscv: thead: Add support for the XTheadFMemIdx ISA extension
> >
> >  gcc/common/config/riscv/riscv-common.cc   |   26 +
> >  gcc/config/riscv/bitmanip.md  |   52 +-
> >  gcc/config/riscv/constraints.md   |   43 +
> >  gcc/config/riscv/iterators.md |4 +
> >  gcc/config/riscv/peephole.md  |   56 +
> >  gcc/config/riscv/riscv-cores.def  |4 +
> >  gcc/config/riscv/riscv-opts.h |   29 +
> >  gcc/config/riscv/riscv-protos.h   |   28 +-
> >  gcc/config/riscv/riscv.cc | 1090 +++--
> >  gcc/config/riscv/riscv.h  |8 +-
> >  gcc/config/riscv/riscv.md |  169 ++-
> >  gcc/config/riscv/riscv.opt|3 +
> >  gcc/config/riscv/thead.md |  351 ++
> >  .../gcc.target/riscv/mcpu-thead-c906.c|   28 +
> >  .../gcc.target/riscv/xtheadba-addsl.c |   55 +
> >  gcc/testsuite/gcc.target/riscv/xtheadba.c |   14 +
> >  gcc/testsuite/gcc.target/riscv/xtheadbb-ext.c |   20 +
> >  .../gcc.target/riscv/xtheadbb-extu-2.c|   22 +
> >  .../gcc.target/riscv/xtheadbb-extu.c  |   22 +
> >  gcc/testsuite/gcc.target/riscv/xtheadbb-ff1.c |   18 +
> >  gcc/testsuite/gcc.target/riscv/xtheadbb-rev.c |   45 +
> >  .../gcc.target/riscv/xtheadbb-srri.c  |   21 +
> >  gcc/testsuite/gcc.target/riscv/xtheadbb.c |   14 +
> >  gcc/testsuite/gcc.target/riscv/xtheadbs-tst.c |   13 +
> >  gcc/testsuite/gcc.target/riscv/xtheadbs.c |   14 +
> >  gcc/testsuite/gcc.target/riscv/xtheadcmo.c|   14 +
> >  .../riscv/xtheadcondmov-mveqz-imm-eqz.c   |   38 +
> >  .../riscv/xtheadcondmov-mveqz-imm-not.c   |   38 +
> >  .../riscv/xtheadcondmov-mveqz-reg-eqz.c   |   38 +
> >  .../riscv/xtheadcondmov-mveqz-reg-not.c   |   38 +
> >  .../riscv/xtheadcondmov-mvnez-imm-cond.c  |   38 +
> >  .../riscv/xtheadcondmov-mvnez-imm-nez.c   |   38 +
> >  .../riscv/xtheadcondmov-mvnez-reg-cond.c  |   

[wwwdocs] gcc-13: riscv: Document the T-Head CPU support

2023-02-24 Thread Christoph Muellner
From: Christoph Müllner 

This patch documents the new T-Head CPU support for RISC-V.

Signed-off-by: Christoph Müllner 
---
 htdocs/gcc-13/changes.html | 24 +++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/htdocs/gcc-13/changes.html b/htdocs/gcc-13/changes.html
index a803f501..ce5ba35c 100644
--- a/htdocs/gcc-13/changes.html
+++ b/htdocs/gcc-13/changes.html
@@ -490,7 +490,29 @@ a work-in-progress.
 
 RISC-V
 
-New ISA extension support for zawrs.
+  New ISA extension support for Zawrs.
+  Support for the following vendor extensions has been added:
+
+  XTheadBa
+  XTheadBb
+  XTheadBs
+  XTheadCmo
+  XTheadCondMov
+  XTheadFMemIdx
+  XTheadFmv
+  XTheadInt
+  XTheadMac
+  XTheadMemIdx
+  XTheadMemPair
+  XTheadSync
+
+  
+  The following new CPUs are supported through the -mcpu
+  option (GCC identifiers in parentheses).
+
+  T-Head's XuanTie C906 (thead-c906).
+
+  
 
 
 
-- 
2.39.2



Re: [wwwdocs, patch] OpenMP update for gcc-13/changes.html + projects/gomp/

2023-02-24 Thread Benson Muite via Gcc-patches
On 2/24/23 10:32, Benson Muite via Gcc-patches wrote:
> On 2/24/23 04:02, Gerald Pfeifer wrote:
>> On Thu, 23 Feb 2023, Tobias Burnus wrote:
>>> PS: I also removed a stray , but admittedly only after the
>>> commit. I found it by manually running those through the w3 validator
>>> site. However, I did not see an automatic email, either it takes longer
>>> or does it no longer run? It did in the past!
>>
>> You are right, and this is a sore / sad point: validator.w3.org that we
>> used in the past now only supports interactive sessions. And they even
>> broke support for the Referer header, so I also had to remove the checking 
>> link I had embedded in all of our pages.
>>
>> These days I invoke the validator (via a version of the original script) 
>> when I see a commit. Which indeed leads to many orders of magnitude longer 
>> delays.
>>
>> Sadly I don't have a better alternative. :-(
>>
> Could one of the following be used or used to generate a better workflow:
> https://html-validate.org/usage/cli.html - written in Javascript, but
> has a command line interface
> https://github.com/validator/validator - packaged, but a little
> cumbersome, may need a wrapper
> https://github.com/w3c-validators/w3c_validators - Wrapper written in
> Ruby, with a nice interface to validate a local file
> 
html-tidy could work well. Written in C. A typical session from Git
Sources following [1]:

git pull
cd gcc
./configure
mkdir HTML
makeinfo --html --no-split -Idoc -Idoc/include -o HTML doc/gcc.texi
tidy -f HTML/errs.txt -imu HTML/gcc.html

Typical current reported errors are
line 23080 column 22 - Warning: nested emphasis 
line 40445 column 11 - Warning: nested emphasis 
line 3541 column 1 - Warning:  anchor "index-g_002b_002b" already
defined
line 54489 column 1 - Warning:  lacks "summary" attribute


[1]
https://unix.stackexchange.com/questions/493013/how-to-build-the-gcc-html-documentation-from-source-into-a-single-page


Re: [PATCH] Avoid default-initializing auto_vec storage

2023-02-24 Thread Jakub Jelinek via Gcc-patches
On Fri, Feb 24, 2023 at 11:59:53AM +0100, Jakub Jelinek via Gcc-patches wrote:
> > This needs to be alignas(T) unsigned char m_data[sizeof(T) * N];
> 
>   unsigned char m_data[MAX (N, 2) * sizeof (T)];
> 
> if we want to preserve current behavior I think.
> 
> I've screwed up, when I was about to change this line, I've realized
> I want to look at the embedded_size stuff first and then forgot
> to update this.  With the above line it builds, though I bet one will need
> to compare the generated code etc. and test with GCC 4.8.
> 
> Though I guess we now have also an option to change auto_vec with N 0/1
> to have 1 embedded element instead of 2, so that would also mean changing
> all of MAX (N, 2) to MAX (N, 1) if we want.

It builds then in non-optimized build, but when I try to build it in stage3,
the useless middle-end warnings kick in:

In member function ‘T* vec::quick_push(const T&) [with T = 
parameter; A = va_heap]’,
inlined from ‘T* vec::quick_push(const T&) [with T = parameter]’ at 
../../gcc/vec.h:1957:28,
inlined from ‘void populate_pattern_routine(create_pattern_info*, 
merge_state_info*, state*, const vec&)’ at 
../../gcc/genrecog.cc:3008:29:
../../gcc/vec.h:1021:3: error: writing 1 byte into a region of size 0 
[-Werror=stringop-overflow=]
 1021 |   *slot = obj;
  |   ^
../../gcc/vec.h: In function ‘void 
populate_pattern_routine(create_pattern_info*, merge_state_info*, state*, const 
vec&)’:
../../gcc/vec.h:1585:29: note: at offset 12 into destination object 
‘auto_vec::m_auto’ of size 8
 1585 |   vec m_auto;
  | ^~
In member function ‘T* vec::quick_push(const T&) [with T = 
parameter; A = va_heap]’,
inlined from ‘T* vec::quick_push(const T&) [with T = parameter]’ at 
../../gcc/vec.h:1957:28,
inlined from ‘decision* init_pattern_use(create_pattern_info*, 
merge_state_info*, const vec&)’ at 
../../gcc/genrecog.cc:2886:26:
../../gcc/vec.h:1021:3: error: writing 1 byte into a region of size 0 
[-Werror=stringop-overflow=]
 1021 |   *slot = obj;
  |   ^
../../gcc/vec.h: In function ‘decision* init_pattern_use(create_pattern_info*, 
merge_state_info*, const vec&)’:
../../gcc/vec.h:1585:29: note: at offset 12 into destination object 
‘auto_vec::m_auto’ of size 8
 1585 |   vec m_auto;
  | ^~


Jakub



Re: [PATCH] Avoid default-initializing auto_vec storage

2023-02-24 Thread Jakub Jelinek via Gcc-patches
On Fri, Feb 24, 2023 at 10:30:00AM +, Jonathan Wakely wrote:
> On Fri, 24 Feb 2023 at 10:24, Jakub Jelinek  wrote:
> >
> > On Fri, Feb 24, 2023 at 11:02:07AM +0100, Jakub Jelinek via Gcc-patches 
> > wrote:
> > > Maybe this would work, vl_relative even could be vl_embed.
> > > Because vl_embed I believe is used in two spots, part of
> > > auto_vec where it is followed by m_data and on heap or GGC
> > > allocated memory where vec<..., vl_embed> is followed by
> > > further storage for the vector.
> >
> > So roughtly something like below?  Except I get weird crashes with it
> > in the gen* tools.  And we'd need to adjust the gdb python hooks
> > which also use m_vecdata.
> >
> > --- gcc/vec.h.jj2023-01-02 09:32:32.177143804 +0100
> > +++ gcc/vec.h   2023-02-24 11:19:37.900157177 +0100
> > @@ -586,8 +586,8 @@ public:
> >unsigned allocated (void) const { return m_vecpfx.m_alloc; }
> >unsigned length (void) const { return m_vecpfx.m_num; }
> >bool is_empty (void) const { return m_vecpfx.m_num == 0; }
> > -  T *address (void) { return m_vecdata; }
> > -  const T *address (void) const { return m_vecdata; }
> > +  T *address (void) { return (T *) (this + 1); }
> > +  const T *address (void) const { return (const T *) (this + 1); }
> >T *begin () { return address (); }
> >const T *begin () const { return address (); }
> >T *end () { return address () + length (); }
> > @@ -629,10 +629,9 @@ public:
> >friend struct va_gc_atomic;
> >friend struct va_heap;
> >
> > -  /* FIXME - These fields should be private, but we need to cater to
> > +  /* FIXME - This field should be private, but we need to cater to
> >  compilers that have stricter notions of PODness for types.  */
> > -  vec_prefix m_vecpfx;
> > -  T m_vecdata[1];
> > +  alignas (T) alignas (vec_prefix) vec_prefix m_vecpfx;
> >  };
> >
> >
> > @@ -879,7 +878,7 @@ inline const T &
> >  vec::operator[] (unsigned ix) const
> >  {
> >gcc_checking_assert (ix < m_vecpfx.m_num);
> > -  return m_vecdata[ix];
> > +  return address ()[ix];
> >  }
> >
> >  template
> > @@ -887,7 +886,7 @@ inline T &
> >  vec::operator[] (unsigned ix)
> >  {
> >gcc_checking_assert (ix < m_vecpfx.m_num);
> > -  return m_vecdata[ix];
> > +  return address ()[ix];
> >  }
> >
> >
> > @@ -929,7 +928,7 @@ vec::iterate (unsigned i
> >  {
> >if (ix < m_vecpfx.m_num)
> >  {
> > -  *ptr = m_vecdata[ix];
> > +  *ptr = address ()[ix];
> >return true;
> >  }
> >else
> > @@ -955,7 +954,7 @@ vec::iterate (unsigned i
> >  {
> >if (ix < m_vecpfx.m_num)
> >  {
> > -  *ptr = CONST_CAST (T *, _vecdata[ix]);
> > +  *ptr = CONST_CAST (T *,  ()[ix]);
> >return true;
> >  }
> >else
> > @@ -978,7 +977,7 @@ vec::copy (ALONE_MEM_STA
> >  {
> >vec_alloc (new_vec, len PASS_MEM_STAT);
> >new_vec->embedded_init (len, len);
> > -  vec_copy_construct (new_vec->address (), m_vecdata, len);
> > +  vec_copy_construct (new_vec->address (), address (), len);
> >  }
> >return new_vec;
> >  }
> > @@ -1018,7 +1017,7 @@ inline T *
> >  vec::quick_push (const T )
> >  {
> >gcc_checking_assert (space (1));
> > -  T *slot = _vecdata[m_vecpfx.m_num++];
> > +  T *slot =  ()[m_vecpfx.m_num++];
> >*slot = obj;
> >return slot;
> >  }
> > @@ -1031,7 +1030,7 @@ inline T &
> >  vec::pop (void)
> >  {
> >gcc_checking_assert (length () > 0);
> > -  return m_vecdata[--m_vecpfx.m_num];
> > +  return address ()[--m_vecpfx.m_num];
> >  }
> >
> >
> > @@ -1056,7 +1055,7 @@ vec::quick_insert (unsig
> >  {
> >gcc_checking_assert (length () < allocated ());
> >gcc_checking_assert (ix <= length ());
> > -  T *slot = _vecdata[ix];
> > +  T *slot =  ()[ix];
> >memmove (slot + 1, slot, (m_vecpfx.m_num++ - ix) * sizeof (T));
> >*slot = obj;
> >  }
> > @@ -1071,7 +1070,7 @@ inline void
> >  vec::ordered_remove (unsigned ix)
> >  {
> >gcc_checking_assert (ix < length ());
> > -  T *slot = _vecdata[ix];
> > +  T *slot =  ()[ix];
> >memmove (slot, slot + 1, (--m_vecpfx.m_num - ix) * sizeof (T));
> >  }
> >
> > @@ -1118,7 +1117,7 @@ inline void
> >  vec::unordered_remove (unsigned ix)
> >  {
> >gcc_checking_assert (ix < length ());
> > -  m_vecdata[ix] = m_vecdata[--m_vecpfx.m_num];
> > +  address ()[ix] = address ()[--m_vecpfx.m_num];
> >  }
> >
> >
> > @@ -1130,7 +1129,7 @@ inline void
> >  vec::block_remove (unsigned ix, unsigned len)
> >  {
> >gcc_checking_assert (ix + len <= length ());
> > -  T *slot = _vecdata[ix];
> > +  T *slot =  ()[ix];
> >m_vecpfx.m_num -= len;
> >memmove (slot, slot + len, (m_vecpfx.m_num - ix) * sizeof (T));
> >  }
> > @@ -1309,7 +1308,7 @@ vec::embedded_size (unsi
> > vec, vec_embedded>::type vec_stdlayout;
> >static_assert (sizeof (vec_stdlayout) == sizeof (vec), "");
> >static_assert (alignof (vec_stdlayout) == alignof (vec), "");
> > -  return offsetof (vec_stdlayout, 

Re: [PATCH v3 03/11] riscv: thead: Add support for the XTheadBa ISA extension

2023-02-24 Thread Christoph Müllner
On Fri, Feb 24, 2023 at 11:05 AM Christoph Müllner
 wrote:
>
> On Fri, Feb 24, 2023 at 10:54 AM Kito Cheng  wrote:
> >
> > My impression is that md patterns will use first-match patterns? so
> > the zba will get higher priority than xtheadba if both patterns are
> > matched?
>
> Yes, I was just about to write this.
>
> /opt/riscv-thead/bin/riscv64-unknown-linux-gnu-gcc -O2
> -march=rv64gc_zba_xtheadba -mtune=thead-c906 -S
> ./gcc/testsuite/gcc.target/riscv/xtheadba-addsl.c
>
> The resulting xtheadba-addsl.s file has:
> .attribute arch, 
> "rv64i2p0_m2p0_a2p0_f2p0_d2p0_c2p0_zba1p0_xtheadba1p0"
> [...]
> sh1add  a0,a1,a0
>
> So the standard extension will be preferred over the custom extension.

I tested now with all of them (RV32 and RV64):

/opt/riscv-thead/bin/riscv64-unknown-linux-gnu-gcc -O2
-march=rv64gc_zba_xtheadba -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadba-addsl.c
/opt/riscv-thead/bin/riscv64-unknown-linux-gnu-gcc -O2
-march=rv64gc_zbb_xtheadbb -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadbb-ext.c
/opt/riscv-thead/bin/riscv64-unknown-linux-gnu-gcc -O2
-march=rv64gc_zbb_xtheadbb -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadbb-extu.c
/opt/riscv-thead/bin/riscv64-unknown-linux-gnu-gcc -O2
-march=rv64gc_zbb_xtheadbb -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadbb-extu-2.c
/opt/riscv-thead/bin/riscv64-unknown-linux-gnu-gcc -O2
-march=rv64gc_zbb_xtheadbb -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadbb-ff1.c
/opt/riscv-thead/bin/riscv64-unknown-linux-gnu-gcc -O2
-march=rv64gc_zbb_xtheadbb -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadbb-rev.c
/opt/riscv-thead/bin/riscv64-unknown-linux-gnu-gcc -O2
-march=rv64gc_zbb_xtheadbb -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadbb-srri.c
/opt/riscv-thead/bin/riscv64-unknown-linux-gnu-gcc -O2
-march=rv64gc_zbb_xtheadbs -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadbs-tst.c

/opt/riscv-thead32/bin/riscv32-unknown-linux-gnu-gcc -O2
-march=rv32gc_zba_xtheadba -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadba-addsl.c
/opt/riscv-thead32/bin/riscv32-unknown-linux-gnu-gcc -O2
-march=rv32gc_zbb_xtheadbb -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadbb-ext.c
/opt/riscv-thead32/bin/riscv32-unknown-linux-gnu-gcc -O2
-march=rv32gc_zbb_xtheadbb -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadbb-extu.c
/opt/riscv-thead32/bin/riscv32-unknown-linux-gnu-gcc -O2
-march=rv32gc_zbb_xtheadbb -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadbb-extu-2.c
/opt/riscv-thead32/bin/riscv32-unknown-linux-gnu-gcc -O2
-march=rv32gc_zbb_xtheadbb -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadbb-ff1.c
/opt/riscv-thead32/bin/riscv32-unknown-linux-gnu-gcc -O2
-march=rv32gc_zbb_xtheadbb -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadbb-rev.c
/opt/riscv-thead32/bin/riscv32-unknown-linux-gnu-gcc -O2
-march=rv32gc_zbb_xtheadbb -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadbb-srri.c
/opt/riscv-thead32/bin/riscv32-unknown-linux-gnu-gcc -O2
-march=rv32gc_zbb_xtheadbs -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadbs-tst.c

All behave ok (also when dropping the xtheadb* from the -march).

Is it ok to leave this as is?

Thanks,
Christoph

>
>
> >
> > On Fri, Feb 24, 2023 at 2:52 PM Andrew Pinski via Gcc-patches
> >  wrote:
> > >
> > > On Thu, Feb 23, 2023 at 9:55 PM Christoph Muellner
> > >  wrote:
> > > >
> > > > From: Christoph Müllner 
> > > >
> > > > This patch adds support for the XTheadBa ISA extension.
> > > > The new INSN pattern is defined in a new file to separate
> > > > this vendor extension from the standard extensions.
> > >
> > > How does this interact with doing -march=rv32gc_xtheadba_zba ?
> > > Seems like it might be better handle that case correctly. I suspect
> > > these all XThreadB* extensions have a similar problem too.
> > >
> > > Thanks,
> > > Andrew Pinski
> > >
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > > * config/riscv/riscv.md: Include thead.md
> > > > * config/riscv/thead.md: New file.
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > >
> > > > * gcc.target/riscv/xtheadba-addsl.c: New test.
> > > >
> > > > Changes in v3:
> > > > - Fix operand order for th.addsl.
> > > >
> > > > Signed-off-by: Christoph Müllner 
> > > > ---
> > > >  gcc/config/riscv/riscv.md |  1 +
> > > >  gcc/config/riscv/thead.md | 31 +++
> > > >  .../gcc.target/riscv/xtheadba-addsl.c | 55 +++
> > > >  3 files changed, 87 insertions(+)
> > > >  create mode 100644 gcc/config/riscv/thead.md
> > > >  create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadba-addsl.c
> > > >
> > > > diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> > > > index 05924e9bbf1..d6c2265e9d4 100644
> > > > --- a/gcc/config/riscv/riscv.md
> > > > +++ b/gcc/config/riscv/riscv.md
> > > > @@ 

Re: C++ modules and AAPCS/ARM EABI clash on inline key methods

2023-02-24 Thread Iain Sandoe



> On 24 Feb 2023, at 10:23, Richard Earnshaw via Gcc-patches 
>  wrote:
> 
> 
> 
> On 23/02/2023 21:20, Alexandre Oliva wrote:
>> On Feb 23, 2023, Alexandre Oliva  wrote:
>>> On Feb 23, 2023, Richard Earnshaw  wrote:
 On 22/02/2023 19:57, Alexandre Oliva wrote:
> On Feb 21, 2023, Richard Earnshaw  wrote:
> 
>> Rather than scanning for the triplet, a better test would be
> 
>> { xfail { arm_eabi } }
> 
> Indeed, thanks.  Here's the updated patch, retested.  Ok to install?
 Based on Nathan's comments, we should just skip the test on arm_eabi,
 it's simply not applicable.
>>> Like this, I suppose.  Retested on x86_64-linux-gnu (trunk) and
>>> arm-wrs-vxworks7 (gcc-12).  Ok to install?
>> Erhm, actually, that version still ran the assembler scans and failed.
>> This one skips the testset entirely.
> 
> Yeah, I tried something like that and it didn't appear to work. Perhaps it's 
> a bug in the way dg-do-module is implemented.

I think if you suppress the dg-do run line (with the target selector) then it 
will just do the default (which is to compile only?)

Skip seems like the correct thing to do here ..
Iain

> 
>> [PR105224] C++ modules and AAPCS/ARM EABI clash on inline key methods
>> From: Alexandre Oliva 
>> g++.dg/modules/virt-2_a.C fails on arm-eabi and many other arm targets
>> that use the AAPCS variant.  ARM is the only target that overrides
>> TARGET_CXX_KEY_METHOD_MAY_BE_INLINE.  It's not clear to me which way
>> the clash between AAPCS and C++ Modules design should be resolved, but
>> currently it favors AAPCS and thus the test fails, so skip it on
>> arm_eabi.
>> for  gcc/testsuite/ChangeLog
>>  PR c++/105224
>>  * g++.dg/modules/virt-2_a.C: Skip on arm_eabi.
>> ---
>>  gcc/testsuite/g++.dg/modules/virt-2_a.C |3 +++
>>  1 file changed, 3 insertions(+)
>> diff --git a/gcc/testsuite/g++.dg/modules/virt-2_a.C 
>> b/gcc/testsuite/g++.dg/modules/virt-2_a.C
>> index 580552be5a0d8..ede711c3e83be 100644
>> --- a/gcc/testsuite/g++.dg/modules/virt-2_a.C
>> +++ b/gcc/testsuite/g++.dg/modules/virt-2_a.C
>> @@ -1,3 +1,6 @@
>> +// AAPCS overrides TARGET_CXX_KEY_METHOD_MAY_BE_INLINE,
>> +// in a way that invalidates this test.
>> +// { dg-skip-if "TARGET_CXX_KEY_METHOD_MAY_BE_INLINE" { arm_eabi } }
> 
> Given the logic of this macro, the text should be 
> "!TARGET_CXX_METHOD_MAY_BE_INLINE".
> 
> OK with that change.
> 
> R.
> 
>>  // { dg-module-do run }
>>  // { dg-additional-options -fmodules-ts }
>>  export module foo;



Re: [PATCH] Avoid default-initializing auto_vec storage

2023-02-24 Thread Jonathan Wakely via Gcc-patches
On Fri, 24 Feb 2023 at 10:24, Jakub Jelinek  wrote:
>
> On Fri, Feb 24, 2023 at 11:02:07AM +0100, Jakub Jelinek via Gcc-patches wrote:
> > Maybe this would work, vl_relative even could be vl_embed.
> > Because vl_embed I believe is used in two spots, part of
> > auto_vec where it is followed by m_data and on heap or GGC
> > allocated memory where vec<..., vl_embed> is followed by
> > further storage for the vector.
>
> So roughtly something like below?  Except I get weird crashes with it
> in the gen* tools.  And we'd need to adjust the gdb python hooks
> which also use m_vecdata.
>
> --- gcc/vec.h.jj2023-01-02 09:32:32.177143804 +0100
> +++ gcc/vec.h   2023-02-24 11:19:37.900157177 +0100
> @@ -586,8 +586,8 @@ public:
>unsigned allocated (void) const { return m_vecpfx.m_alloc; }
>unsigned length (void) const { return m_vecpfx.m_num; }
>bool is_empty (void) const { return m_vecpfx.m_num == 0; }
> -  T *address (void) { return m_vecdata; }
> -  const T *address (void) const { return m_vecdata; }
> +  T *address (void) { return (T *) (this + 1); }
> +  const T *address (void) const { return (const T *) (this + 1); }
>T *begin () { return address (); }
>const T *begin () const { return address (); }
>T *end () { return address () + length (); }
> @@ -629,10 +629,9 @@ public:
>friend struct va_gc_atomic;
>friend struct va_heap;
>
> -  /* FIXME - These fields should be private, but we need to cater to
> +  /* FIXME - This field should be private, but we need to cater to
>  compilers that have stricter notions of PODness for types.  */
> -  vec_prefix m_vecpfx;
> -  T m_vecdata[1];
> +  alignas (T) alignas (vec_prefix) vec_prefix m_vecpfx;
>  };
>
>
> @@ -879,7 +878,7 @@ inline const T &
>  vec::operator[] (unsigned ix) const
>  {
>gcc_checking_assert (ix < m_vecpfx.m_num);
> -  return m_vecdata[ix];
> +  return address ()[ix];
>  }
>
>  template
> @@ -887,7 +886,7 @@ inline T &
>  vec::operator[] (unsigned ix)
>  {
>gcc_checking_assert (ix < m_vecpfx.m_num);
> -  return m_vecdata[ix];
> +  return address ()[ix];
>  }
>
>
> @@ -929,7 +928,7 @@ vec::iterate (unsigned i
>  {
>if (ix < m_vecpfx.m_num)
>  {
> -  *ptr = m_vecdata[ix];
> +  *ptr = address ()[ix];
>return true;
>  }
>else
> @@ -955,7 +954,7 @@ vec::iterate (unsigned i
>  {
>if (ix < m_vecpfx.m_num)
>  {
> -  *ptr = CONST_CAST (T *, _vecdata[ix]);
> +  *ptr = CONST_CAST (T *,  ()[ix]);
>return true;
>  }
>else
> @@ -978,7 +977,7 @@ vec::copy (ALONE_MEM_STA
>  {
>vec_alloc (new_vec, len PASS_MEM_STAT);
>new_vec->embedded_init (len, len);
> -  vec_copy_construct (new_vec->address (), m_vecdata, len);
> +  vec_copy_construct (new_vec->address (), address (), len);
>  }
>return new_vec;
>  }
> @@ -1018,7 +1017,7 @@ inline T *
>  vec::quick_push (const T )
>  {
>gcc_checking_assert (space (1));
> -  T *slot = _vecdata[m_vecpfx.m_num++];
> +  T *slot =  ()[m_vecpfx.m_num++];
>*slot = obj;
>return slot;
>  }
> @@ -1031,7 +1030,7 @@ inline T &
>  vec::pop (void)
>  {
>gcc_checking_assert (length () > 0);
> -  return m_vecdata[--m_vecpfx.m_num];
> +  return address ()[--m_vecpfx.m_num];
>  }
>
>
> @@ -1056,7 +1055,7 @@ vec::quick_insert (unsig
>  {
>gcc_checking_assert (length () < allocated ());
>gcc_checking_assert (ix <= length ());
> -  T *slot = _vecdata[ix];
> +  T *slot =  ()[ix];
>memmove (slot + 1, slot, (m_vecpfx.m_num++ - ix) * sizeof (T));
>*slot = obj;
>  }
> @@ -1071,7 +1070,7 @@ inline void
>  vec::ordered_remove (unsigned ix)
>  {
>gcc_checking_assert (ix < length ());
> -  T *slot = _vecdata[ix];
> +  T *slot =  ()[ix];
>memmove (slot, slot + 1, (--m_vecpfx.m_num - ix) * sizeof (T));
>  }
>
> @@ -1118,7 +1117,7 @@ inline void
>  vec::unordered_remove (unsigned ix)
>  {
>gcc_checking_assert (ix < length ());
> -  m_vecdata[ix] = m_vecdata[--m_vecpfx.m_num];
> +  address ()[ix] = address ()[--m_vecpfx.m_num];
>  }
>
>
> @@ -1130,7 +1129,7 @@ inline void
>  vec::block_remove (unsigned ix, unsigned len)
>  {
>gcc_checking_assert (ix + len <= length ());
> -  T *slot = _vecdata[ix];
> +  T *slot =  ()[ix];
>m_vecpfx.m_num -= len;
>memmove (slot, slot + len, (m_vecpfx.m_num - ix) * sizeof (T));
>  }
> @@ -1309,7 +1308,7 @@ vec::embedded_size (unsi
> vec, vec_embedded>::type vec_stdlayout;
>static_assert (sizeof (vec_stdlayout) == sizeof (vec), "");
>static_assert (alignof (vec_stdlayout) == alignof (vec), "");
> -  return offsetof (vec_stdlayout, m_vecdata) + alloc * sizeof (T);
> +  return sizeof (vec_stdlayout) + alloc * sizeof (T);
>  }
>
>
> @@ -1476,10 +1475,10 @@ public:
>{ return m_vec ? m_vec->length () : 0; }
>
>T *address (void)
> -  { return m_vec ? m_vec->m_vecdata : NULL; }
> +  { return m_vec ? m_vec->address () : NULL; }
>
>const T *address (void) const
> -  { return 

Re: [PATCH] Avoid default-initializing auto_vec storage

2023-02-24 Thread Jakub Jelinek via Gcc-patches
On Fri, Feb 24, 2023 at 11:02:07AM +0100, Jakub Jelinek via Gcc-patches wrote:
> Maybe this would work, vl_relative even could be vl_embed.
> Because vl_embed I believe is used in two spots, part of
> auto_vec where it is followed by m_data and on heap or GGC
> allocated memory where vec<..., vl_embed> is followed by
> further storage for the vector.

So roughtly something like below?  Except I get weird crashes with it
in the gen* tools.  And we'd need to adjust the gdb python hooks
which also use m_vecdata.

--- gcc/vec.h.jj2023-01-02 09:32:32.177143804 +0100
+++ gcc/vec.h   2023-02-24 11:19:37.900157177 +0100
@@ -586,8 +586,8 @@ public:
   unsigned allocated (void) const { return m_vecpfx.m_alloc; }
   unsigned length (void) const { return m_vecpfx.m_num; }
   bool is_empty (void) const { return m_vecpfx.m_num == 0; }
-  T *address (void) { return m_vecdata; }
-  const T *address (void) const { return m_vecdata; }
+  T *address (void) { return (T *) (this + 1); }
+  const T *address (void) const { return (const T *) (this + 1); }
   T *begin () { return address (); }
   const T *begin () const { return address (); }
   T *end () { return address () + length (); }
@@ -629,10 +629,9 @@ public:
   friend struct va_gc_atomic;
   friend struct va_heap;
 
-  /* FIXME - These fields should be private, but we need to cater to
+  /* FIXME - This field should be private, but we need to cater to
 compilers that have stricter notions of PODness for types.  */
-  vec_prefix m_vecpfx;
-  T m_vecdata[1];
+  alignas (T) alignas (vec_prefix) vec_prefix m_vecpfx;
 };
 
 
@@ -879,7 +878,7 @@ inline const T &
 vec::operator[] (unsigned ix) const
 {
   gcc_checking_assert (ix < m_vecpfx.m_num);
-  return m_vecdata[ix];
+  return address ()[ix];
 }
 
 template
@@ -887,7 +886,7 @@ inline T &
 vec::operator[] (unsigned ix)
 {
   gcc_checking_assert (ix < m_vecpfx.m_num);
-  return m_vecdata[ix];
+  return address ()[ix];
 }
 
 
@@ -929,7 +928,7 @@ vec::iterate (unsigned i
 {
   if (ix < m_vecpfx.m_num)
 {
-  *ptr = m_vecdata[ix];
+  *ptr = address ()[ix];
   return true;
 }
   else
@@ -955,7 +954,7 @@ vec::iterate (unsigned i
 {
   if (ix < m_vecpfx.m_num)
 {
-  *ptr = CONST_CAST (T *, _vecdata[ix]);
+  *ptr = CONST_CAST (T *,  ()[ix]);
   return true;
 }
   else
@@ -978,7 +977,7 @@ vec::copy (ALONE_MEM_STA
 {
   vec_alloc (new_vec, len PASS_MEM_STAT);
   new_vec->embedded_init (len, len);
-  vec_copy_construct (new_vec->address (), m_vecdata, len);
+  vec_copy_construct (new_vec->address (), address (), len);
 }
   return new_vec;
 }
@@ -1018,7 +1017,7 @@ inline T *
 vec::quick_push (const T )
 {
   gcc_checking_assert (space (1));
-  T *slot = _vecdata[m_vecpfx.m_num++];
+  T *slot =  ()[m_vecpfx.m_num++];
   *slot = obj;
   return slot;
 }
@@ -1031,7 +1030,7 @@ inline T &
 vec::pop (void)
 {
   gcc_checking_assert (length () > 0);
-  return m_vecdata[--m_vecpfx.m_num];
+  return address ()[--m_vecpfx.m_num];
 }
 
 
@@ -1056,7 +1055,7 @@ vec::quick_insert (unsig
 {
   gcc_checking_assert (length () < allocated ());
   gcc_checking_assert (ix <= length ());
-  T *slot = _vecdata[ix];
+  T *slot =  ()[ix];
   memmove (slot + 1, slot, (m_vecpfx.m_num++ - ix) * sizeof (T));
   *slot = obj;
 }
@@ -1071,7 +1070,7 @@ inline void
 vec::ordered_remove (unsigned ix)
 {
   gcc_checking_assert (ix < length ());
-  T *slot = _vecdata[ix];
+  T *slot =  ()[ix];
   memmove (slot, slot + 1, (--m_vecpfx.m_num - ix) * sizeof (T));
 }
 
@@ -1118,7 +1117,7 @@ inline void
 vec::unordered_remove (unsigned ix)
 {
   gcc_checking_assert (ix < length ());
-  m_vecdata[ix] = m_vecdata[--m_vecpfx.m_num];
+  address ()[ix] = address ()[--m_vecpfx.m_num];
 }
 
 
@@ -1130,7 +1129,7 @@ inline void
 vec::block_remove (unsigned ix, unsigned len)
 {
   gcc_checking_assert (ix + len <= length ());
-  T *slot = _vecdata[ix];
+  T *slot =  ()[ix];
   m_vecpfx.m_num -= len;
   memmove (slot, slot + len, (m_vecpfx.m_num - ix) * sizeof (T));
 }
@@ -1309,7 +1308,7 @@ vec::embedded_size (unsi
vec, vec_embedded>::type vec_stdlayout;
   static_assert (sizeof (vec_stdlayout) == sizeof (vec), "");
   static_assert (alignof (vec_stdlayout) == alignof (vec), "");
-  return offsetof (vec_stdlayout, m_vecdata) + alloc * sizeof (T);
+  return sizeof (vec_stdlayout) + alloc * sizeof (T);
 }
 
 
@@ -1476,10 +1475,10 @@ public:
   { return m_vec ? m_vec->length () : 0; }
 
   T *address (void)
-  { return m_vec ? m_vec->m_vecdata : NULL; }
+  { return m_vec ? m_vec->address () : NULL; }
 
   const T *address (void) const
-  { return m_vec ? m_vec->m_vecdata : NULL; }
+  { return m_vec ? m_vec->address () : NULL; }
 
   T *begin () { return address (); }
   const T *begin () const { return address (); }
@@ -1584,7 +1583,7 @@ public:
 
 private:
   vec m_auto;
-  T m_data[MAX (N - 1, 1)];
+  unsigned char m_data[MAX (N - 1, 1)];
 };
 
 /* auto_vec is a sub 

Re: C++ modules and AAPCS/ARM EABI clash on inline key methods

2023-02-24 Thread Richard Earnshaw via Gcc-patches




On 23/02/2023 21:20, Alexandre Oliva wrote:

On Feb 23, 2023, Alexandre Oliva  wrote:


On Feb 23, 2023, Richard Earnshaw  wrote:

On 22/02/2023 19:57, Alexandre Oliva wrote:

On Feb 21, 2023, Richard Earnshaw  wrote:


Rather than scanning for the triplet, a better test would be



{ xfail { arm_eabi } }


Indeed, thanks.  Here's the updated patch, retested.  Ok to install?



Based on Nathan's comments, we should just skip the test on arm_eabi,
it's simply not applicable.



Like this, I suppose.  Retested on x86_64-linux-gnu (trunk) and
arm-wrs-vxworks7 (gcc-12).  Ok to install?


Erhm, actually, that version still ran the assembler scans and failed.
This one skips the testset entirely.


Yeah, I tried something like that and it didn't appear to work. Perhaps 
it's a bug in the way dg-do-module is implemented.





[PR105224] C++ modules and AAPCS/ARM EABI clash on inline key methods

From: Alexandre Oliva 

g++.dg/modules/virt-2_a.C fails on arm-eabi and many other arm targets
that use the AAPCS variant.  ARM is the only target that overrides
TARGET_CXX_KEY_METHOD_MAY_BE_INLINE.  It's not clear to me which way
the clash between AAPCS and C++ Modules design should be resolved, but
currently it favors AAPCS and thus the test fails, so skip it on
arm_eabi.


for  gcc/testsuite/ChangeLog

PR c++/105224
* g++.dg/modules/virt-2_a.C: Skip on arm_eabi.
---
  gcc/testsuite/g++.dg/modules/virt-2_a.C |3 +++
  1 file changed, 3 insertions(+)

diff --git a/gcc/testsuite/g++.dg/modules/virt-2_a.C 
b/gcc/testsuite/g++.dg/modules/virt-2_a.C
index 580552be5a0d8..ede711c3e83be 100644
--- a/gcc/testsuite/g++.dg/modules/virt-2_a.C
+++ b/gcc/testsuite/g++.dg/modules/virt-2_a.C
@@ -1,3 +1,6 @@
+// AAPCS overrides TARGET_CXX_KEY_METHOD_MAY_BE_INLINE,
+// in a way that invalidates this test.
+// { dg-skip-if "TARGET_CXX_KEY_METHOD_MAY_BE_INLINE" { arm_eabi } }


Given the logic of this macro, the text should be 
"!TARGET_CXX_METHOD_MAY_BE_INLINE".


OK with that change.

R.


  // { dg-module-do run }
  // { dg-additional-options -fmodules-ts }
  export module foo;




Re: [PATCH v3 04/11] riscv: thead: Add support for the XTheadBs ISA extension

2023-02-24 Thread Christoph Müllner
On Fri, Feb 24, 2023 at 8:37 AM Kito Cheng  wrote:
>
> > diff --git a/gcc/config/riscv/thead.md b/gcc/config/riscv/thead.md
> > index 158e9124c3a..2c684885850 100644
> > --- a/gcc/config/riscv/thead.md
> > +++ b/gcc/config/riscv/thead.md
> > @@ -29,3 +29,14 @@ (define_insn "*th_addsl"
> >"th.addsl\t%0,%3,%1,%2"
> >[(set_attr "type" "bitmanip")
> > (set_attr "mode" "")])
> > +
> > +;; XTheadBs
> > +
> > +(define_insn "*th_tst"
> > +  [(set (match_operand:X 0 "register_operand" "=r")
> > +   (zero_extract:X (match_operand:X 1 "register_operand" "r")
> > +   (const_int 1)
> > +   (match_operand 2 "immediate_operand" "i")))]
> > +  "TARGET_XTHEADBS"
>
> Add range check like *bexti pattern?
>
> TARGET_XTHEADBS && UINTVAL (operands[2]) < GET_MODE_BITSIZE (mode)

Ok.

Thanks,
Christoph

>
> > +  "th.tst\t%0,%1,%2"
> > +  [(set_attr "type" "bitmanip")])


Re: [PATCH v3 03/11] riscv: thead: Add support for the XTheadBa ISA extension

2023-02-24 Thread Christoph Müllner
On Fri, Feb 24, 2023 at 10:54 AM Kito Cheng  wrote:
>
> My impression is that md patterns will use first-match patterns? so
> the zba will get higher priority than xtheadba if both patterns are
> matched?

Yes, I was just about to write this.

/opt/riscv-thead/bin/riscv64-unknown-linux-gnu-gcc -O2
-march=rv64gc_zba_xtheadba -mtune=thead-c906 -S
./gcc/testsuite/gcc.target/riscv/xtheadba-addsl.c

The resulting xtheadba-addsl.s file has:
.attribute arch, "rv64i2p0_m2p0_a2p0_f2p0_d2p0_c2p0_zba1p0_xtheadba1p0"
[...]
sh1add  a0,a1,a0

So the standard extension will be preferred over the custom extension.


>
> On Fri, Feb 24, 2023 at 2:52 PM Andrew Pinski via Gcc-patches
>  wrote:
> >
> > On Thu, Feb 23, 2023 at 9:55 PM Christoph Muellner
> >  wrote:
> > >
> > > From: Christoph Müllner 
> > >
> > > This patch adds support for the XTheadBa ISA extension.
> > > The new INSN pattern is defined in a new file to separate
> > > this vendor extension from the standard extensions.
> >
> > How does this interact with doing -march=rv32gc_xtheadba_zba ?
> > Seems like it might be better handle that case correctly. I suspect
> > these all XThreadB* extensions have a similar problem too.
> >
> > Thanks,
> > Andrew Pinski
> >
> > >
> > > gcc/ChangeLog:
> > >
> > > * config/riscv/riscv.md: Include thead.md
> > > * config/riscv/thead.md: New file.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > > * gcc.target/riscv/xtheadba-addsl.c: New test.
> > >
> > > Changes in v3:
> > > - Fix operand order for th.addsl.
> > >
> > > Signed-off-by: Christoph Müllner 
> > > ---
> > >  gcc/config/riscv/riscv.md |  1 +
> > >  gcc/config/riscv/thead.md | 31 +++
> > >  .../gcc.target/riscv/xtheadba-addsl.c | 55 +++
> > >  3 files changed, 87 insertions(+)
> > >  create mode 100644 gcc/config/riscv/thead.md
> > >  create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadba-addsl.c
> > >
> > > diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> > > index 05924e9bbf1..d6c2265e9d4 100644
> > > --- a/gcc/config/riscv/riscv.md
> > > +++ b/gcc/config/riscv/riscv.md
> > > @@ -3093,4 +3093,5 @@ (define_insn "riscv_prefetchi_"
> > >  (include "pic.md")
> > >  (include "generic.md")
> > >  (include "sifive-7.md")
> > > +(include "thead.md")
> > >  (include "vector.md")
> > > diff --git a/gcc/config/riscv/thead.md b/gcc/config/riscv/thead.md
> > > new file mode 100644
> > > index 000..158e9124c3a
> > > --- /dev/null
> > > +++ b/gcc/config/riscv/thead.md
> > > @@ -0,0 +1,31 @@
> > > +;; Machine description for T-Head vendor extensions
> > > +;; Copyright (C) 2021-2022 Free Software Foundation, Inc.
> > > +
> > > +;; This file is part of GCC.
> > > +
> > > +;; GCC is free software; you can redistribute it and/or modify
> > > +;; it under the terms of the GNU General Public License as published by
> > > +;; the Free Software Foundation; either version 3, or (at your option)
> > > +;; any later version.
> > > +
> > > +;; GCC is distributed in the hope that it will be useful,
> > > +;; but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > +;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > > +;; GNU General Public License for more details.
> > > +
> > > +;; You should have received a copy of the GNU General Public License
> > > +;; along with GCC; see the file COPYING3.  If not see
> > > +;; .
> > > +
> > > +;; XTheadBa
> > > +
> > > +(define_insn "*th_addsl"
> > > +  [(set (match_operand:X 0 "register_operand" "=r")
> > > +   (plus:X (ashift:X (match_operand:X 1 "register_operand" "r")
> > > + (match_operand:QI 2 "immediate_operand" "I"))
> > > +   (match_operand:X 3 "register_operand" "r")))]
> > > +  "TARGET_XTHEADBA
> > > +   && (INTVAL (operands[2]) >= 0) && (INTVAL (operands[2]) <= 3)"
> > > +  "th.addsl\t%0,%3,%1,%2"
> > > +  [(set_attr "type" "bitmanip")
> > > +   (set_attr "mode" "")])
> > > diff --git a/gcc/testsuite/gcc.target/riscv/xtheadba-addsl.c 
> > > b/gcc/testsuite/gcc.target/riscv/xtheadba-addsl.c
> > > new file mode 100644
> > > index 000..5004735a246
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.target/riscv/xtheadba-addsl.c
> > > @@ -0,0 +1,55 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-march=rv32gc_xtheadba" { target { rv32 } } } */
> > > +/* { dg-options "-march=rv64gc_xtheadba" { target { rv64 } } } */
> > > +/* { dg-skip-if "" { *-*-* } { "-O0" } } */
> > > +
> > > +long
> > > +test_1 (long a, long b)
> > > +{
> > > +  /* th.addsl aX, aX, 1  */
> > > +  return a + (b << 1);
> > > +}
> > > +
> > > +int
> > > +foos (short *x, int n)
> > > +{
> > > +  /* th.addsl aX, aX, 1  */
> > > +  return x[n];
> > > +}
> > > +
> > > +long
> > > +test_2 (long a, long b)
> > > +{
> > > +  /* th.addsl aX, aX, 2  */
> > > +  return a + (b << 2);
> > > +}
> > > +
> > > +int
> > > +fooi (int *x, 

Re: [PATCH] Avoid default-initializing auto_vec storage

2023-02-24 Thread Jakub Jelinek via Gcc-patches
On Fri, Feb 24, 2023 at 09:55:13AM +, Jonathan Wakely wrote:
> > You would still be accessing past the end of the
> > vec::m_vecdata array which is UB.
> 
> My thinking is something like:
> 
> // New tag type
> struct vl_relative { };
> 
> // This must only be used as a member subobject of another type
> // which provides the trailing storage.
> template
> struct vec
> {
>   T *address (void) { return (T*)(m_vecpfx+1); }
>   const T *address (void) const { return (T*)(m_vecpfx+1); }
> 
>   alignas(T) alignas(vec_prefix) vec_prefix m_vecpfx;
> };
> 
> template
> class auto_vec : public vec
> {
>   // ...
> private:
>   vec m_head;
>   T m_data[N];
> 
> static_assert(...);
> };

Maybe this would work, vl_relative even could be vl_embed.
Because vl_embed I believe is used in two spots, part of
auto_vec where it is followed by m_data and on heap or GGC
allocated memory where vec<..., vl_embed> is followed by
further storage for the vector.

Jakub



Re: [PATCH] Avoid default-initializing auto_vec storage

2023-02-24 Thread Jonathan Wakely via Gcc-patches
On Fri, 24 Feb 2023 at 09:50, Jonathan Wakely  wrote:
>
> On Fri, 24 Feb 2023 at 09:49, Jakub Jelinek wrote:
> >
> > Assuming a compiler handles the T m_vecdata[1]; as flexible array member
> > like (which we need because standard C++ doesn't have flexible array members
> > nor [0] arrays), I wonder if we instead of the m_auto followed by m_data
> > trick couldn't make auto_vec have
> > alignas(vec) unsigned char buf m_data[sizeof (vec) + (N 
> > - 1) * sizeof (T)];
> > and do a placement new of vec into that m_data during auto_vec
> > construction.  Isn't it then similar to how are flexible array members
> > normally used in C, where one uses malloc or alloca to allocate storage
> > for them and the storage can be larger than the structure itself and
> > flexible array member then can use storage after it?
>
> You would still be accessing past the end of the
> vec::m_vecdata array which is UB.

My thinking is something like:

// New tag type
struct vl_relative { };

// This must only be used as a member subobject of another type
// which provides the trailing storage.
template
struct vec
{
  T *address (void) { return (T*)(m_vecpfx+1); }
  const T *address (void) const { return (T*)(m_vecpfx+1); }

  alignas(T) alignas(vec_prefix) vec_prefix m_vecpfx;
};

template
class auto_vec : public vec
{
  // ...
private:
  vec m_head;
  T m_data[N];

static_assert(...);
};



Re: [PATCH] Avoid default-initializing auto_vec storage

2023-02-24 Thread Jakub Jelinek via Gcc-patches
On Fri, Feb 24, 2023 at 09:50:33AM +, Jonathan Wakely wrote:
> On Fri, 24 Feb 2023 at 09:49, Jakub Jelinek wrote:
> >
> > Assuming a compiler handles the T m_vecdata[1]; as flexible array member
> > like (which we need because standard C++ doesn't have flexible array members
> > nor [0] arrays), I wonder if we instead of the m_auto followed by m_data
> > trick couldn't make auto_vec have
> > alignas(vec) unsigned char buf m_data[sizeof (vec) + (N 
> > - 1) * sizeof (T)];
> > and do a placement new of vec into that m_data during auto_vec
> > construction.  Isn't it then similar to how are flexible array members
> > normally used in C, where one uses malloc or alloca to allocate storage
> > for them and the storage can be larger than the structure itself and
> > flexible array member then can use storage after it?
> 
> You would still be accessing past the end of the
> vec::m_vecdata array which is UB.

Pedantically sure, but because C++ doesn't have flexible array members,
people in the wild use the flexible array member like arrays for that
purpose.
If there was T m_vecdata[];, would it still be UB (with the flexible
array member extensions)?
We could use T m_vecdata[]; if the host compiler supports them and
T m_vecdata[1]; otherwise in the hope that the compiler handles it
similarly.  After all, I think lots of other real-world programs do the
same.

Jakub



Re: [PATCH v3 03/11] riscv: thead: Add support for the XTheadBa ISA extension

2023-02-24 Thread Kito Cheng via Gcc-patches
My impression is that md patterns will use first-match patterns? so
the zba will get higher priority than xtheadba if both patterns are
matched?

On Fri, Feb 24, 2023 at 2:52 PM Andrew Pinski via Gcc-patches
 wrote:
>
> On Thu, Feb 23, 2023 at 9:55 PM Christoph Muellner
>  wrote:
> >
> > From: Christoph Müllner 
> >
> > This patch adds support for the XTheadBa ISA extension.
> > The new INSN pattern is defined in a new file to separate
> > this vendor extension from the standard extensions.
>
> How does this interact with doing -march=rv32gc_xtheadba_zba ?
> Seems like it might be better handle that case correctly. I suspect
> these all XThreadB* extensions have a similar problem too.
>
> Thanks,
> Andrew Pinski
>
> >
> > gcc/ChangeLog:
> >
> > * config/riscv/riscv.md: Include thead.md
> > * config/riscv/thead.md: New file.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/riscv/xtheadba-addsl.c: New test.
> >
> > Changes in v3:
> > - Fix operand order for th.addsl.
> >
> > Signed-off-by: Christoph Müllner 
> > ---
> >  gcc/config/riscv/riscv.md |  1 +
> >  gcc/config/riscv/thead.md | 31 +++
> >  .../gcc.target/riscv/xtheadba-addsl.c | 55 +++
> >  3 files changed, 87 insertions(+)
> >  create mode 100644 gcc/config/riscv/thead.md
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadba-addsl.c
> >
> > diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> > index 05924e9bbf1..d6c2265e9d4 100644
> > --- a/gcc/config/riscv/riscv.md
> > +++ b/gcc/config/riscv/riscv.md
> > @@ -3093,4 +3093,5 @@ (define_insn "riscv_prefetchi_"
> >  (include "pic.md")
> >  (include "generic.md")
> >  (include "sifive-7.md")
> > +(include "thead.md")
> >  (include "vector.md")
> > diff --git a/gcc/config/riscv/thead.md b/gcc/config/riscv/thead.md
> > new file mode 100644
> > index 000..158e9124c3a
> > --- /dev/null
> > +++ b/gcc/config/riscv/thead.md
> > @@ -0,0 +1,31 @@
> > +;; Machine description for T-Head vendor extensions
> > +;; Copyright (C) 2021-2022 Free Software Foundation, Inc.
> > +
> > +;; This file is part of GCC.
> > +
> > +;; GCC is free software; you can redistribute it and/or modify
> > +;; it under the terms of the GNU General Public License as published by
> > +;; the Free Software Foundation; either version 3, or (at your option)
> > +;; any later version.
> > +
> > +;; GCC is distributed in the hope that it will be useful,
> > +;; but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > +;; GNU General Public License for more details.
> > +
> > +;; You should have received a copy of the GNU General Public License
> > +;; along with GCC; see the file COPYING3.  If not see
> > +;; .
> > +
> > +;; XTheadBa
> > +
> > +(define_insn "*th_addsl"
> > +  [(set (match_operand:X 0 "register_operand" "=r")
> > +   (plus:X (ashift:X (match_operand:X 1 "register_operand" "r")
> > + (match_operand:QI 2 "immediate_operand" "I"))
> > +   (match_operand:X 3 "register_operand" "r")))]
> > +  "TARGET_XTHEADBA
> > +   && (INTVAL (operands[2]) >= 0) && (INTVAL (operands[2]) <= 3)"
> > +  "th.addsl\t%0,%3,%1,%2"
> > +  [(set_attr "type" "bitmanip")
> > +   (set_attr "mode" "")])
> > diff --git a/gcc/testsuite/gcc.target/riscv/xtheadba-addsl.c 
> > b/gcc/testsuite/gcc.target/riscv/xtheadba-addsl.c
> > new file mode 100644
> > index 000..5004735a246
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/riscv/xtheadba-addsl.c
> > @@ -0,0 +1,55 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-march=rv32gc_xtheadba" { target { rv32 } } } */
> > +/* { dg-options "-march=rv64gc_xtheadba" { target { rv64 } } } */
> > +/* { dg-skip-if "" { *-*-* } { "-O0" } } */
> > +
> > +long
> > +test_1 (long a, long b)
> > +{
> > +  /* th.addsl aX, aX, 1  */
> > +  return a + (b << 1);
> > +}
> > +
> > +int
> > +foos (short *x, int n)
> > +{
> > +  /* th.addsl aX, aX, 1  */
> > +  return x[n];
> > +}
> > +
> > +long
> > +test_2 (long a, long b)
> > +{
> > +  /* th.addsl aX, aX, 2  */
> > +  return a + (b << 2);
> > +}
> > +
> > +int
> > +fooi (int *x, int n)
> > +{
> > +  /* th.addsl aX, aX, 2  */
> > +  return x[n];
> > +}
> > +
> > +long
> > +test_3 (long a, long b)
> > +{
> > +  /* th.addsl aX, aX, 3  */
> > +  return a + (b << 3);
> > +}
> > +
> > +long
> > +fool (long *x, int n)
> > +{
> > +  /* th.addsl aX, aX, 2 (rv32)  */
> > +  /* th.addsl aX, aX, 3 (rv64)  */
> > +  return x[n];
> > +}
> > +
> > +/* { dg-final { scan-assembler-times "th.addsl\[ 
> > \t\]*a\[0-9\]+,a\[0-9\]+,a\[0-9\]+,1" 2 } } */
> > +
> > +/* { dg-final { scan-assembler-times "th.addsl\[ 
> > \t\]*a\[0-9\]+,a\[0-9\]+,a\[0-9\]+,2" 3 { target { rv32 } } } } */
> > +/* { dg-final { scan-assembler-times "th.addsl\[ 
> > \t\]*a\[0-9\]+,a\[0-9\]+,a\[0-9\]+,2" 2 { target { rv64 } } } } */
> > +

Re: [PATCH] Avoid default-initializing auto_vec storage

2023-02-24 Thread Jonathan Wakely via Gcc-patches
On Fri, 24 Feb 2023 at 09:49, Jakub Jelinek wrote:
>
> Assuming a compiler handles the T m_vecdata[1]; as flexible array member
> like (which we need because standard C++ doesn't have flexible array members
> nor [0] arrays), I wonder if we instead of the m_auto followed by m_data
> trick couldn't make auto_vec have
> alignas(vec) unsigned char buf m_data[sizeof (vec) + (N - 
> 1) * sizeof (T)];
> and do a placement new of vec into that m_data during auto_vec
> construction.  Isn't it then similar to how are flexible array members
> normally used in C, where one uses malloc or alloca to allocate storage
> for them and the storage can be larger than the structure itself and
> flexible array member then can use storage after it?

You would still be accessing past the end of the
vec::m_vecdata array which is UB.



Re: [PATCH] Avoid default-initializing auto_vec storage

2023-02-24 Thread Jakub Jelinek via Gcc-patches
On Fri, Feb 24, 2023 at 09:34:46AM +, Richard Biener wrote:
> > Looking at vec::operator[] which just does
> > 
> > template
> > inline const T &
> > vec::operator[] (unsigned ix) const
> > {
> >   gcc_checking_assert (ix < m_vecpfx.m_num);
> >   return m_vecdata[ix];
> > } 
> > 
> > the whole thing looks fragile at best - we basically have
> > 
> > struct auto_vec
> > {
> >   struct vec
> >   {
> > ...
> > T m_vecdata[1];
> >   } m_auto;
> >   T m_data[N-1];
> > };

Assuming a compiler handles the T m_vecdata[1]; as flexible array member
like (which we need because standard C++ doesn't have flexible array members
nor [0] arrays), I wonder if we instead of the m_auto followed by m_data
trick couldn't make auto_vec have
alignas(vec) unsigned char buf m_data[sizeof (vec) + (N - 
1) * sizeof (T)];
and do a placement new of vec into that m_data during auto_vec
construction.  Isn't it then similar to how are flexible array members
normally used in C, where one uses malloc or alloca to allocate storage
for them and the storage can be larger than the structure itself and
flexible array member then can use storage after it?

Though, of course, we'd need to test it with various compilers,
GCC 4.8 till now, various versions of clang, ICC, ...

Jakub



Re: [PATCH v3 11/11] riscv: thead: Add support for the XTheadFMemIdx ISA extension

2023-02-24 Thread Kito Cheng via Gcc-patches
> > +(define_memory_constraint "Qmx"
> > +  "@internal
> > +   An address valid for GPR."
> > +  (and (match_code "mem")
> > +   (match_test "!riscv_legitimize_address_index_p (
> > +   XEXP (op, 0), GET_MODE (op), false)")))
>
> Check TARGET_XTHEADFMEMIDX, and I don't quite understand why it

I changed my mind, don't check TARGET_XTHEADFMEMIDX here,
check ext in the pattern instead.


Re: [PATCH v3 10/11] riscv: thead: Add support for the XTheadMemIdx ISA extension

2023-02-24 Thread Kito Cheng via Gcc-patches
> diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
> index cf0cd669be4..5cd3f7673f0 100644
> --- a/gcc/config/riscv/riscv-opts.h
> +++ b/gcc/config/riscv/riscv-opts.h
> @@ -215,4 +215,7 @@ enum stack_protector_guard {
>  #define TARGET_XTHEADMEMPAIR ((riscv_xthead_subext & MASK_XTHEADMEMPAIR) != 
> 0)
>  #define TARGET_XTHEADSYNC((riscv_xthead_subext & MASK_XTHEADSYNC) != 0)
>
> +#define HAVE_POST_MODIFY_DISP TARGET_XTHEADMEMIDX
> +#define HAVE_PRE_MODIFY_DISP  TARGET_XTHEADMEMIDX
> +
>  #endif /* ! GCC_RISCV_OPTS_H */
> diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
> index 1b7ba02726d..019a0e08285 100644
> --- a/gcc/config/riscv/riscv-protos.h
> +++ b/gcc/config/riscv/riscv-protos.h
> @@ -65,6 +65,24 @@ extern void riscv_expand_int_scc (rtx, enum rtx_code, rtx, 
> rtx);
>  extern void riscv_expand_float_scc (rtx, enum rtx_code, rtx, rtx);
>  extern void riscv_expand_conditional_branch (rtx, enum rtx_code, rtx, rtx);
>  #endif
> +
> +extern bool
> +riscv_classify_address_index (struct riscv_address_info *info, rtx x,
> + machine_mode mode, bool strict_p);
> +extern bool
> +riscv_classify_address_modify (struct riscv_address_info *info, rtx x,
> +  machine_mode mode, bool strict_p);
> +
> +extern const char *
> +riscv_output_move_index (rtx x, machine_mode mode, bool ldr);
> +extern const char *
> +riscv_output_move_modify (rtx x, machine_mode mode, bool ldi);
> +
> +extern bool
> +riscv_legitimize_address_index_p (rtx x, machine_mode mode, bool uindex);
> +extern bool
> +riscv_legitimize_address_modify_p (rtx x, machine_mode mode, bool post);
> +
>  extern bool riscv_expand_conditional_move (rtx, rtx, rtx, rtx);
>  extern rtx riscv_legitimize_call_address (rtx);
>  extern void riscv_set_return_address (rtx, rtx);
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index 33854393bd2..2980dbd69f9 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -83,6 +83,19 @@ along with GCC; see the file COPYING3.  If not see
>
>  /* Classifies an address.
>
> +   ADDRESS_REG_REG
> +   A base register indexed by (optionally scaled) register.
> +
> +   ADDRESS_REG_UREG
> +   A base register indexed by (optionally scaled) zero-extended register.
> +
> +   ADDRESS_REG_WB
> +   A base register indexed by immediate offset with writeback.
> +
> +   ADDRESS_REG
> +   A natural register + offset address.  The register satisfies
> +   riscv_valid_base_register_p and the offset is a const_arith_operand.
> +
> ADDRESS_REG
> A natural register + offset address.  The register satisfies
> riscv_valid_base_register_p and the offset is a const_arith_operand.
> @@ -97,6 +110,9 @@ along with GCC; see the file COPYING3.  If not see
> ADDRESS_SYMBOLIC:
> A constant symbolic address.  */
>  enum riscv_address_type {
> +  ADDRESS_REG_REG,
> +  ADDRESS_REG_UREG,
> +  ADDRESS_REG_WB,
>ADDRESS_REG,
>ADDRESS_LO_SUM,
>ADDRESS_CONST_INT,
> @@ -201,6 +217,7 @@ struct riscv_address_info {
>rtx reg;
>rtx offset;
>enum riscv_symbol_type symbol_type;
> +  int shift;
>  };
>
>  /* One stage in a constant building sequence.  These sequences have
> @@ -1025,12 +1042,31 @@ riscv_classify_address (struct riscv_address_info 
> *info, rtx x,
>if (riscv_v_ext_vector_mode_p (mode))
> return false;
>
> +  if (riscv_valid_base_register_p (XEXP (x, 0), mode, strict_p)
> + && riscv_classify_address_index (info, XEXP (x, 1), mode, strict_p))
> +   {
> + info->reg = XEXP (x, 0);
> + return true;
> +   }
> +  else if (riscv_valid_base_register_p (XEXP (x, 1), mode, strict_p)
> +   && riscv_classify_address_index (info, XEXP (x, 0),
> +mode, strict_p))
> +   {
> + info->reg = XEXP (x, 1);
> + return true;
> +   }
> +
>info->type = ADDRESS_REG;
>info->reg = XEXP (x, 0);
>info->offset = XEXP (x, 1);
>return (riscv_valid_base_register_p (info->reg, mode, strict_p)
>   && riscv_valid_offset_p (info->offset, mode));
>
> +case POST_MODIFY:
> +case PRE_MODIFY:
> +
> +  return riscv_classify_address_modify (info, x, mode, strict_p);
> +
>  case LO_SUM:
>/* RVV load/store disallow LO_SUM.  */
>if (riscv_v_ext_vector_mode_p (mode))
> @@ -1269,6 +1305,263 @@ riscv_emit_move (rtx dest, rtx src)
>   : emit_move_insn_1 (dest, src));
>  }
>
> +/* Return true if address offset is a valid index.  If it is, fill in INFO
> +   appropriately.  STRICT_P is true if REG_OK_STRICT is in effect.  */
> +
> +bool
> +riscv_classify_address_index (struct riscv_address_info *info, rtx x,
> +  machine_mode mode, bool strict_p)

indent

> +{
> +  enum riscv_address_type type = ADDRESS_REG_REG;;
> +  rtx index;
> +  int shift = 0;
> +
> +  

Re: [PATCH] cgraphclones: Don't share DECL_ARGUMENTS between thunk and its artificial thunk [PR108854]

2023-02-24 Thread Richard Biener via Gcc-patches
On Fri, 24 Feb 2023, Jakub Jelinek wrote:

> Hi!
> 
> The following testcase ICEs on x86_64-linux with -m32.  The problem is
> we create an artificial thunk and because of -fPIC, ia32 and thunk
> destination which doesn't bind locally can't use a mi thunk.
> The ICE is because during expansion to RTL we see SSA_NAME for a PARM_DECL,
> but the PARM_DECL doesn't have DECL_CONTEXT of the current function.
> This is because duplicate_thunk_for_node creates a new DECL_ARGUMENTS chain
> only if some arguments need modification.
> 
> The following patch fixes it by copying the DECL_ARGUMENTS list even if
> the arguments can stay as is, to update DECL_CONTEXT on them.  While for
> mi thunks it doesn't really matter because we don't use those arguments
> in any way, for other thunks it is important.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

> 2023-02-23  Jakub Jelinek  
> 
>   PR middle-end/108854
>   * cgraphclones.cc (duplicate_thunk_for_node): If no parameter
>   changes are needed, copy at least DECL_ARGUMENTS PARM_DECL
>   nodes and adjust their DECL_CONTEXT.
> 
>   * g++.dg/opt/pr108854.C: New test.
> 
> --- gcc/cgraphclones.cc.jj2023-02-22 20:50:27.417519830 +0100
> +++ gcc/cgraphclones.cc   2023-02-23 17:12:59.875133883 +0100
> @@ -218,7 +218,17 @@ duplicate_thunk_for_node (cgraph_node *t
>body_adj.modify_formal_parameters ();
>  }
>else
> -new_decl = copy_node (thunk->decl);
> +{
> +  new_decl = copy_node (thunk->decl);
> +  for (tree *arg = _ARGUMENTS (new_decl);
> +*arg; arg = _CHAIN (*arg))
> + {
> +   tree next = DECL_CHAIN (*arg);
> +   *arg = copy_node (*arg);
> +   DECL_CONTEXT (*arg) = new_decl;
> +   DECL_CHAIN (*arg) = next;
> + }
> +}
>  
>gcc_checking_assert (!DECL_STRUCT_FUNCTION (new_decl));
>gcc_checking_assert (!DECL_INITIAL (new_decl));
> --- gcc/testsuite/g++.dg/opt/pr108854.C.jj2023-02-23 17:11:19.275583506 
> +0100
> +++ gcc/testsuite/g++.dg/opt/pr108854.C   2023-02-23 17:11:02.723822009 
> +0100
> @@ -0,0 +1,37 @@
> +// PR middle-end/108854
> +// { dg-do compile { target c++11 } }
> +// { dg-options "-O3" }
> +// { dg-additional-options "-fPIC" { target fpic } }
> +
> +struct A { A (int); ~A (); };
> +struct B { B (int, bool); ~B (); };
> +template 
> +struct C { void m1 (T); void m2 (T &&); };
> +class D;
> +struct E { virtual void m3 (); };
> +template 
> +struct F { virtual bool m4 (D &); };
> +struct D { virtual D m5 () { return D (); } };
> +void foo (void *, void *);
> +struct G {
> +  int a;
> +  C  b;
> +  void m4 (D ) { B l (a, true); r.m5 (); b.m1 (); b.m2 (); }
> +};
> +struct H : E, F  {
> +  template 
> +  H (int, T);
> +  bool m4 (D ) { A l (a); b.m4 (r); if (c) return true; } // { dg-warning 
> "control reaches end of non-void function" }
> +  int a;
> +  bool c;
> +  G b;
> +};
> +inline void bar (F  ) { D s, t; p.m4 (t); foo (, ); }
> +enum I { I1, I2 };
> +template 
> +struct J;
> +template 
> +void baz () { int g = 0, h = 0; T i (g, h); bar (i); }
> +template 
> +void qux () { baz > (); }
> +void corge () { qux  (); qux  (); }
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)


[PATCH] cgraphclones: Don't share DECL_ARGUMENTS between thunk and its artificial thunk [PR108854]

2023-02-24 Thread Jakub Jelinek via Gcc-patches
Hi!

The following testcase ICEs on x86_64-linux with -m32.  The problem is
we create an artificial thunk and because of -fPIC, ia32 and thunk
destination which doesn't bind locally can't use a mi thunk.
The ICE is because during expansion to RTL we see SSA_NAME for a PARM_DECL,
but the PARM_DECL doesn't have DECL_CONTEXT of the current function.
This is because duplicate_thunk_for_node creates a new DECL_ARGUMENTS chain
only if some arguments need modification.

The following patch fixes it by copying the DECL_ARGUMENTS list even if
the arguments can stay as is, to update DECL_CONTEXT on them.  While for
mi thunks it doesn't really matter because we don't use those arguments
in any way, for other thunks it is important.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2023-02-23  Jakub Jelinek  

PR middle-end/108854
* cgraphclones.cc (duplicate_thunk_for_node): If no parameter
changes are needed, copy at least DECL_ARGUMENTS PARM_DECL
nodes and adjust their DECL_CONTEXT.

* g++.dg/opt/pr108854.C: New test.

--- gcc/cgraphclones.cc.jj  2023-02-22 20:50:27.417519830 +0100
+++ gcc/cgraphclones.cc 2023-02-23 17:12:59.875133883 +0100
@@ -218,7 +218,17 @@ duplicate_thunk_for_node (cgraph_node *t
   body_adj.modify_formal_parameters ();
 }
   else
-new_decl = copy_node (thunk->decl);
+{
+  new_decl = copy_node (thunk->decl);
+  for (tree *arg = _ARGUMENTS (new_decl);
+  *arg; arg = _CHAIN (*arg))
+   {
+ tree next = DECL_CHAIN (*arg);
+ *arg = copy_node (*arg);
+ DECL_CONTEXT (*arg) = new_decl;
+ DECL_CHAIN (*arg) = next;
+   }
+}
 
   gcc_checking_assert (!DECL_STRUCT_FUNCTION (new_decl));
   gcc_checking_assert (!DECL_INITIAL (new_decl));
--- gcc/testsuite/g++.dg/opt/pr108854.C.jj  2023-02-23 17:11:19.275583506 
+0100
+++ gcc/testsuite/g++.dg/opt/pr108854.C 2023-02-23 17:11:02.723822009 +0100
@@ -0,0 +1,37 @@
+// PR middle-end/108854
+// { dg-do compile { target c++11 } }
+// { dg-options "-O3" }
+// { dg-additional-options "-fPIC" { target fpic } }
+
+struct A { A (int); ~A (); };
+struct B { B (int, bool); ~B (); };
+template 
+struct C { void m1 (T); void m2 (T &&); };
+class D;
+struct E { virtual void m3 (); };
+template 
+struct F { virtual bool m4 (D &); };
+struct D { virtual D m5 () { return D (); } };
+void foo (void *, void *);
+struct G {
+  int a;
+  C  b;
+  void m4 (D ) { B l (a, true); r.m5 (); b.m1 (); b.m2 (); }
+};
+struct H : E, F  {
+  template 
+  H (int, T);
+  bool m4 (D ) { A l (a); b.m4 (r); if (c) return true; } // { dg-warning 
"control reaches end of non-void function" }
+  int a;
+  bool c;
+  G b;
+};
+inline void bar (F  ) { D s, t; p.m4 (t); foo (, ); }
+enum I { I1, I2 };
+template 
+struct J;
+template 
+void baz () { int g = 0, h = 0; T i (g, h); bar (i); }
+template 
+void qux () { baz > (); }
+void corge () { qux  (); qux  (); }

Jakub



Re: [PATCH] Avoid default-initializing auto_vec storage

2023-02-24 Thread Richard Biener via Gcc-patches
On Fri, 24 Feb 2023, Richard Biener wrote:

> On Thu, 23 Feb 2023, Jakub Jelinek wrote:
> 
> > On Thu, Feb 23, 2023 at 03:02:01PM +, Richard Biener wrote:
> > > > >   * vec.h (auto_vec): Turn m_data storage into
> > > > >   uninitialized unsigned char.
> > > > 
> > > > Given that we actually never reference the m_data array anywhere,
> > > > it is just to reserve space, I think even the alignas(T) there is
> > > > useless.  The point is that m_auto has as data members:
> > > >   vec_prefix m_vecpfx;
> > > >   T m_vecdata[1];
> > > > and we rely on it (admittedly -fstrict-flex-arrays{,=2,=3} or
> > > > -fsanitize=bound-sstrict incompatible) being treated as
> > > > flexible array member flowing into the m_data storage after it.
> > > 
> > > Doesn't the array otherwise eventually overlap with tail padding
> > > in m_auto?  Or does an array of T never produce tail padding?
> > 
> > The array can certainly overlap with tail padding in m_auto if any.
> > But whether m_data is aligned to alignof (T) or not doesn't change anything
> > on it.
> > m_vecpfx is struct { unsigned m_alloc : 31, m_using_auto_storage : 1, 
> > m_num; },
> > so I think there is on most arches tail padding if T has smaller alignment
> > than int, so typically char/short or structs with the same size/alignments.
> > If that happens, alignof (auto_vec_x.m_auto) will be alignof (int),
> > there can be 2 or 3 padding bytes, but because sizeof (auto_vec_x.m_auto)
> > is 3 * sizeof (int), m_data will have offset always aligned to alignof (T).
> > If alignof (T) >= alignof (int), then there won't be any tail padding
> > at the end of m_auto, there could be padding between m_vecpfx and
> > m_vecdata, sizeof (auto_vec_x.m_auto) will be a multiple of sizeof (T) and
> > so m_data will be again already properly aligned.
> > 
> > So, I think your patch is fine without alignas(T), the rest is just that
> > there is more work to do incrementally, even for the case you want to
> > deal with (the point 1) in particular).
> 
> Looking at vec::operator[] which just does
> 
> template
> inline const T &
> vec::operator[] (unsigned ix) const
> {
>   gcc_checking_assert (ix < m_vecpfx.m_num);
>   return m_vecdata[ix];
> } 
> 
> the whole thing looks fragile at best - we basically have
> 
> struct auto_vec
> {
>   struct vec
>   {
> ...
> T m_vecdata[1];
>   } m_auto;
>   T m_data[N-1];
> };
> 
> and access m_auto.m_vecdata[] as if it extends to m_data.  That's
> not something supported by the middle-end - not by design at least.
> auto_vec *p; p->m_auto.m_vecdata[i] would never alias
> p->m_data[j], in practice we might not see this though.  Also
> get_ref_base_and_extent will compute a maxsize/size of sizeof(T)
> for any m_auto.m_vecdata[i] access, but I think we nowhere
> actually replace 'i' by zero based on this knowledge, but we'd
> perform CSE with earlier m_auto.m_vecdata[0] stores, so that
> might be something one could provoke.  Doing a self-test like
> 
> static __attribute__((noipa)) void
> test_auto_alias (int i)
> { 
>   auto_vec v;
>   v.quick_grow (2);
>   v[0] = 1;
>   v[1] = 2;
>   int val = v[i];
>   ASSERT_EQ (val, 2);
> } 
> 
> shows
> 
>   _27 = &_25->m_vecdata[0];
>   *_27 = 1;
> ...
>   _7 = &_12->m_vecdata[i.235_3];
>   val_13 = *_7;
> 
> which is safe in middle-end rules though.  So what "saves" us
> here is that we always return a reference and never a value.
> There's the ::iterate member function which fails to do this,
> the ::quick_push function does
> 
>   T *slot = _vecdata[m_vecpfx.m_num++];
>   *slot = obj;
> 
> with
> 
> static __attribute__((noipa)) void
> test_auto_alias (int i)
> { 
>   auto_vec v;
>   v.quick_grow (2);
>   v[0] = 1;
>   v[1] = 2;
>   int val;
>   for (int ix = i; v.iterate (ix, ); ix++)
> ;
>   ASSERT_EQ (val, 2);
> } 
> 
> I get that optimzied to a FAIL.  I have a "fix" for this.
> unordered_remove has a similar issue accesing the last element.

Turns out forwprop "breaks" this still, so the fix doesn't work.
That means we have a hole here in the middle-end.  We can
avoid this by obfuscating things even more, like to

  const T *first = m_vecdata;
  const T *slot = first + ix;
  *ptr = *slot;

which at least for variable 'ix' avoids forwprop from triggering.

But this also means that the existing [] accessor isn't really safe,
we're just lucky that we turn constant accesses to
MEM[(int &) + off] = val; and that we now have PR108355
which made the get_ref_base_and_extent info used less often in VN.

I'm testing the patch now without the new selftest, it should be
good to avoid these issues for constant indexes.  I can also split
the patch up.

But in the end I think we have to fix auto_vec in a better way.

Richard.

> There are a few functions using the [] access member which is
> at least sub-optimal due to repeated bounds checking but also safe.
> 
> I suppose if auto_vec would be a union of vec and
> a storage member with the vl_embed active that would work, but then
> 

Re: [PATCH v3 09/11] riscv: thead: Add support for the XTheadMemPair ISA extension

2023-02-24 Thread Christoph Müllner
On Fri, Feb 24, 2023 at 10:01 AM Kito Cheng  wrote:
>
> Got one fail:
>
> FAIL: gcc.target/riscv/xtheadmempair-1.c   -O2   scan-assembler-times
> th.luwd\t 4
>
> It should scan lwud rather than luwd?

Yes, this should be th.lwud.
Must have been introduced after testing.

I also ran the whole patchset again with RV32 and RV64.
This should be the only issue of this kind in the series.
Sorry for that!


[committed] i386: Fix up builtins used in avx512bf16vlintrin.h [PR108881]

2023-02-24 Thread Jakub Jelinek via Gcc-patches
Hi!

The builtins used in avx512bf16vlintrin.h implementation need both
avx512bf16 and avx512vl ISAs, which the header ensures for them, but
the builtins weren't actually requiring avx512vl, so when used by hand
with just -mavx512bf16 -mno-avx512vl it resulted in ICEs.

Fixed by adding OPTION_MASK_ISA_AVX512VL to their BDESC.

Bootstrapped/regtested on x86_64-linux and i686-linux, preapproved by
Hongtao in the PR, committed to trunk.

2023-02-23  Jakub Jelinek  

PR target/108881
* config/i386/i386-builtin.def (__builtin_ia32_cvtne2ps2bf16_v16bf,
__builtin_ia32_cvtne2ps2bf16_v16bf_mask,
__builtin_ia32_cvtne2ps2bf16_v16bf_maskz,
__builtin_ia32_cvtne2ps2bf16_v8bf,
__builtin_ia32_cvtne2ps2bf16_v8bf_mask,
__builtin_ia32_cvtne2ps2bf16_v8bf_maskz,
__builtin_ia32_cvtneps2bf16_v8sf_mask,
__builtin_ia32_cvtneps2bf16_v8sf_maskz,
__builtin_ia32_cvtneps2bf16_v4sf_mask,
__builtin_ia32_cvtneps2bf16_v4sf_maskz,
__builtin_ia32_dpbf16ps_v8sf, __builtin_ia32_dpbf16ps_v8sf_mask,
__builtin_ia32_dpbf16ps_v8sf_maskz, __builtin_ia32_dpbf16ps_v4sf,
__builtin_ia32_dpbf16ps_v4sf_mask,
__builtin_ia32_dpbf16ps_v4sf_maskz): Require also
OPTION_MASK_ISA_AVX512VL.

* gcc.target/i386/avx512bf16-pr108881.c: New test.

--- gcc/config/i386/i386-builtin.def.jj 2023-01-16 11:52:15.955735951 +0100
+++ gcc/config/i386/i386-builtin.def2023-02-23 18:20:37.139676726 +0100
@@ -2814,30 +2814,30 @@ BDESC (0, OPTION_MASK_ISA2_VAES, CODE_FO
 BDESC (0, OPTION_MASK_ISA2_AVX512BF16, CODE_FOR_avx512f_cvtne2ps2bf16_v32bf, 
"__builtin_ia32_cvtne2ps2bf16_v32bf", IX86_BUILTIN_CVTNE2PS2BF16_V32BF, 
UNKNOWN, (int) V32BF_FTYPE_V16SF_V16SF)
 BDESC (0, OPTION_MASK_ISA2_AVX512BF16, 
CODE_FOR_avx512f_cvtne2ps2bf16_v32bf_mask, 
"__builtin_ia32_cvtne2ps2bf16_v32bf_mask", 
IX86_BUILTIN_CVTNE2PS2BF16_V32BF_MASK, UNKNOWN, (int) 
V32BF_FTYPE_V16SF_V16SF_V32BF_USI)
 BDESC (0, OPTION_MASK_ISA2_AVX512BF16, 
CODE_FOR_avx512f_cvtne2ps2bf16_v32bf_maskz, 
"__builtin_ia32_cvtne2ps2bf16_v32bf_maskz", 
IX86_BUILTIN_CVTNE2PS2BF16_V32BF_MASKZ, UNKNOWN, (int) 
V32BF_FTYPE_V16SF_V16SF_USI)
-BDESC (0, OPTION_MASK_ISA2_AVX512BF16, CODE_FOR_avx512f_cvtne2ps2bf16_v16bf, 
"__builtin_ia32_cvtne2ps2bf16_v16bf", IX86_BUILTIN_CVTNE2PS2BF16_V16BF, 
UNKNOWN, (int) V16BF_FTYPE_V8SF_V8SF)
-BDESC (0, OPTION_MASK_ISA2_AVX512BF16, 
CODE_FOR_avx512f_cvtne2ps2bf16_v16bf_mask, 
"__builtin_ia32_cvtne2ps2bf16_v16bf_mask", 
IX86_BUILTIN_CVTNE2PS2BF16_V16BF_MASK, UNKNOWN, (int) 
V16BF_FTYPE_V8SF_V8SF_V16BF_UHI)
-BDESC (0, OPTION_MASK_ISA2_AVX512BF16, 
CODE_FOR_avx512f_cvtne2ps2bf16_v16bf_maskz, 
"__builtin_ia32_cvtne2ps2bf16_v16bf_maskz", 
IX86_BUILTIN_CVTNE2PS2BF16_V16BF_MASKZ, UNKNOWN, (int) 
V16BF_FTYPE_V8SF_V8SF_UHI)
-BDESC (0, OPTION_MASK_ISA2_AVX512BF16, CODE_FOR_avx512f_cvtne2ps2bf16_v8bf, 
"__builtin_ia32_cvtne2ps2bf16_v8bf", IX86_BUILTIN_CVTNE2PS2BF16_V8BF, UNKNOWN, 
(int) V8BF_FTYPE_V4SF_V4SF)
-BDESC (0, OPTION_MASK_ISA2_AVX512BF16, 
CODE_FOR_avx512f_cvtne2ps2bf16_v8bf_mask, 
"__builtin_ia32_cvtne2ps2bf16_v8bf_mask", IX86_BUILTIN_CVTNE2PS2BF16_V8BF_MASK, 
UNKNOWN, (int) V8BF_FTYPE_V4SF_V4SF_V8BF_UQI)
-BDESC (0, OPTION_MASK_ISA2_AVX512BF16, 
CODE_FOR_avx512f_cvtne2ps2bf16_v8bf_maskz, 
"__builtin_ia32_cvtne2ps2bf16_v8bf_maskz", 
IX86_BUILTIN_CVTNE2PS2BF16_V8BF_MASKZ, UNKNOWN, (int) V8BF_FTYPE_V4SF_V4SF_UQI)
+BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512BF16, 
CODE_FOR_avx512f_cvtne2ps2bf16_v16bf, "__builtin_ia32_cvtne2ps2bf16_v16bf", 
IX86_BUILTIN_CVTNE2PS2BF16_V16BF, UNKNOWN, (int) V16BF_FTYPE_V8SF_V8SF)
+BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512BF16, 
CODE_FOR_avx512f_cvtne2ps2bf16_v16bf_mask, 
"__builtin_ia32_cvtne2ps2bf16_v16bf_mask", 
IX86_BUILTIN_CVTNE2PS2BF16_V16BF_MASK, UNKNOWN, (int) 
V16BF_FTYPE_V8SF_V8SF_V16BF_UHI)
+BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512BF16, 
CODE_FOR_avx512f_cvtne2ps2bf16_v16bf_maskz, 
"__builtin_ia32_cvtne2ps2bf16_v16bf_maskz", 
IX86_BUILTIN_CVTNE2PS2BF16_V16BF_MASKZ, UNKNOWN, (int) 
V16BF_FTYPE_V8SF_V8SF_UHI)
+BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512BF16, 
CODE_FOR_avx512f_cvtne2ps2bf16_v8bf, "__builtin_ia32_cvtne2ps2bf16_v8bf", 
IX86_BUILTIN_CVTNE2PS2BF16_V8BF, UNKNOWN, (int) V8BF_FTYPE_V4SF_V4SF)
+BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512BF16, 
CODE_FOR_avx512f_cvtne2ps2bf16_v8bf_mask, 
"__builtin_ia32_cvtne2ps2bf16_v8bf_mask", IX86_BUILTIN_CVTNE2PS2BF16_V8BF_MASK, 
UNKNOWN, (int) V8BF_FTYPE_V4SF_V4SF_V8BF_UQI)
+BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512BF16, 
CODE_FOR_avx512f_cvtne2ps2bf16_v8bf_maskz, 
"__builtin_ia32_cvtne2ps2bf16_v8bf_maskz", 
IX86_BUILTIN_CVTNE2PS2BF16_V8BF_MASKZ, UNKNOWN, (int) V8BF_FTYPE_V4SF_V4SF_UQI)
 BDESC (0, OPTION_MASK_ISA2_AVX512BF16, CODE_FOR_avx512f_cvtneps2bf16_v16sf, 
"__builtin_ia32_cvtneps2bf16_v16sf", IX86_BUILTIN_CVTNEPS2BF16_V16SF, UNKNOWN, 
(int) V16BF_FTYPE_V16SF)
 BDESC (0, 

Re: Rust: In 'type_for_mode' langhook also consider all 'int_n' modes/types (was: Modula-2 / Rust: Many targets failing)

2023-02-24 Thread Jan-Benedict Glaw
Hi Thomas / Arthur!

On Wed, 2023-02-22 15:30:37 +0100, Arthur Cohen  
wrote:
[..]
> > >   --target=msp430-elfbare
> > > ~
> > >
> > > /var/lib/laminar/run/gcc-msp430-elfbare/24/toolchain-build/./gcc/xgcc 
> > > -B/var/lib/laminar/run/gcc-msp430-elfbare/24/toolchain-build/./gcc/  
> > > -xrust -frust-incomplete-and-experimental-compiler-do-not-use -nostdinc 
> > > /dev/null -S -o /dev/null -fself-test=../../gcc/gcc/testsuite/selftests
> > >: internal compiler error: Segmentation fault
> > >0xf2efbf crash_signal
> > >  ../../gcc/gcc/toplev.cc:314
> > >0x120c8c7 build_function_type(tree_node*, tree_node*, bool)
> > >  ../../gcc/gcc/tree.cc:7360
> > >0x120cc20 build_function_type_list(tree_node*, ...)
> > >  ../../gcc/gcc/tree.cc:7442
> > >0x120d16b build_common_builtin_nodes()
> > >  ../../gcc/gcc/tree.cc:9883
> > >0x8449b4 grs_langhook_init
> > >  ../../gcc/gcc/rust/rust-lang.cc:132
> > >0x8427b2 lang_dependent_init
> > >  ../../gcc/gcc/toplev.cc:1815
> > >0x8427b2 do_compile
> > >  ../../gcc/gcc/toplev.cc:2110
> > >Please submit a full bug report, with preprocessed source (by 
> > > using -freport-bug).
> > >Please include the complete backtrace with any bug report.
> > >See  for instructions.
> > >make[1]: *** [../../gcc/gcc/rust/Make-lang.in:275: 
> > > s-selftest-rust] Error 1

Confirmed successful build #37 for my msp320-elfbare build at
http://toolchain.lug-owl.de/laminar/jobs/gcc-msp430-elfbare

Thanks,
  Jan-Benedict

-- 


signature.asc
Description: PGP signature


Re: [PATCH] asan: adjust module name for global variables

2023-02-24 Thread Jakub Jelinek via Gcc-patches
On Fri, Feb 24, 2023 at 10:00:01AM +0100, Martin Liška wrote:
> As mentioned in the PR, when we use LTO, we wrongly use ltrans output
> file name as a module name of a global variable. That leads to a
> non-reproducible output.
> 
> After the suggested change, we emit context name of normal global
> variables. And for artificial variables (like .Lubsan_data3), we use
> aux_base_name (e.g. "./a.ltrans0.ltrans").
> 
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> 
> Ready to be installed?
> Thanks,
> Martin
> 
>   PR asan/108834
> 
> gcc/ChangeLog:
> 
>   * asan.cc (asan_add_global): Use proper TU name for normal
> global variables (and aux_base_name for the artificial one).
> 
> gcc/testsuite/ChangeLog:
> 
>   * c-c++-common/asan/global-overflow-1.c: Test line and column
>   info for a global variable.
> ---
>  gcc/asan.cc | 7 ++-
>  gcc/testsuite/c-c++-common/asan/global-overflow-1.c | 2 +-
>  2 files changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/asan.cc b/gcc/asan.cc
> index f56d084bc7a..245abb14388 100644
> --- a/gcc/asan.cc
> +++ b/gcc/asan.cc
> @@ -3287,7 +3287,12 @@ asan_add_global (tree decl, tree type, 
> vec *v)
>  pp_string (_pp, "");
>str_cst = asan_pp_string (_pp);
>  
> -  pp_string (_name_pp, main_input_filename);
> +  const_tree tu = get_ultimate_context ((const_tree)decl);
> +  if (tu != NULL_TREE)
> +pp_string (_name_pp, IDENTIFIER_POINTER (DECL_NAME (tu)));
> +  else
> +pp_string (_name_pp, aux_base_name);

I think for !in_lto_p we don't need to bother with get_ultimate_context
and should just use main_input_filename as before.

Otherwise LGTM.

Jakub



Re: [PATCH] Avoid default-initializing auto_vec storage

2023-02-24 Thread Richard Biener via Gcc-patches
On Thu, 23 Feb 2023, Jakub Jelinek wrote:

> On Thu, Feb 23, 2023 at 03:02:01PM +, Richard Biener wrote:
> > > > * vec.h (auto_vec): Turn m_data storage into
> > > > uninitialized unsigned char.
> > > 
> > > Given that we actually never reference the m_data array anywhere,
> > > it is just to reserve space, I think even the alignas(T) there is
> > > useless.  The point is that m_auto has as data members:
> > >   vec_prefix m_vecpfx;
> > >   T m_vecdata[1];
> > > and we rely on it (admittedly -fstrict-flex-arrays{,=2,=3} or
> > > -fsanitize=bound-sstrict incompatible) being treated as
> > > flexible array member flowing into the m_data storage after it.
> > 
> > Doesn't the array otherwise eventually overlap with tail padding
> > in m_auto?  Or does an array of T never produce tail padding?
> 
> The array can certainly overlap with tail padding in m_auto if any.
> But whether m_data is aligned to alignof (T) or not doesn't change anything
> on it.
> m_vecpfx is struct { unsigned m_alloc : 31, m_using_auto_storage : 1, m_num; 
> },
> so I think there is on most arches tail padding if T has smaller alignment
> than int, so typically char/short or structs with the same size/alignments.
> If that happens, alignof (auto_vec_x.m_auto) will be alignof (int),
> there can be 2 or 3 padding bytes, but because sizeof (auto_vec_x.m_auto)
> is 3 * sizeof (int), m_data will have offset always aligned to alignof (T).
> If alignof (T) >= alignof (int), then there won't be any tail padding
> at the end of m_auto, there could be padding between m_vecpfx and
> m_vecdata, sizeof (auto_vec_x.m_auto) will be a multiple of sizeof (T) and
> so m_data will be again already properly aligned.
> 
> So, I think your patch is fine without alignas(T), the rest is just that
> there is more work to do incrementally, even for the case you want to
> deal with (the point 1) in particular).

Looking at vec::operator[] which just does

template
inline const T &
vec::operator[] (unsigned ix) const
{
  gcc_checking_assert (ix < m_vecpfx.m_num);
  return m_vecdata[ix];
} 

the whole thing looks fragile at best - we basically have

struct auto_vec
{
  struct vec
  {
...
T m_vecdata[1];
  } m_auto;
  T m_data[N-1];
};

and access m_auto.m_vecdata[] as if it extends to m_data.  That's
not something supported by the middle-end - not by design at least.
auto_vec *p; p->m_auto.m_vecdata[i] would never alias
p->m_data[j], in practice we might not see this though.  Also
get_ref_base_and_extent will compute a maxsize/size of sizeof(T)
for any m_auto.m_vecdata[i] access, but I think we nowhere
actually replace 'i' by zero based on this knowledge, but we'd
perform CSE with earlier m_auto.m_vecdata[0] stores, so that
might be something one could provoke.  Doing a self-test like

static __attribute__((noipa)) void
test_auto_alias (int i)
{ 
  auto_vec v;
  v.quick_grow (2);
  v[0] = 1;
  v[1] = 2;
  int val = v[i];
  ASSERT_EQ (val, 2);
} 

shows

  _27 = &_25->m_vecdata[0];
  *_27 = 1;
...
  _7 = &_12->m_vecdata[i.235_3];
  val_13 = *_7;

which is safe in middle-end rules though.  So what "saves" us
here is that we always return a reference and never a value.
There's the ::iterate member function which fails to do this,
the ::quick_push function does

  T *slot = _vecdata[m_vecpfx.m_num++];
  *slot = obj;

with

static __attribute__((noipa)) void
test_auto_alias (int i)
{ 
  auto_vec v;
  v.quick_grow (2);
  v[0] = 1;
  v[1] = 2;
  int val;
  for (int ix = i; v.iterate (ix, ); ix++)
;
  ASSERT_EQ (val, 2);
} 

I get that optimzied to a FAIL.  I have a "fix" for this.
unordered_remove has a similar issue accesing the last element.
There are a few functions using the [] access member which is
at least sub-optimal due to repeated bounds checking but also safe.

I suppose if auto_vec would be a union of vec and
a storage member with the vl_embed active that would work, but then
that's likely not something C++11 supports.

So I think to support auto_vec we'd need to make the m_vecdata[]
member in vec of templated size (defaulted to 1)
and get rid of the m_data member in auto_vec instead.  Or have
another C++ way of increasing the size of auto_vec without
actually adding any member?

The vec data accesses then would need to go through
a wrapper obtaining a correctly typed pointer to m_vecdata[]
since we'd like to have that as unsigned char[] to avoid the
initialization.

> > Yes, I'm not proposing to fix non-POD support.  I want to make
> > as-if-POD stuff like std::pair to work like it was intended.
> > 
> > > Oh, and perhaps we should start marking such spots in GCC with
> > > strict_flex_array attribute to make it clear where we rely on the
> > > non-strict behavior.
> > 
> > I think we never access the array directly as array, do we?
> 
> Sure, the attribute should go to m_vecdata array, not to m_data.
> And to op array in gimple_statement_with_ops, operands array in
> operands, ops array in tree_omp_clause, val in tree_int_cst,
> 

Re: [PATCH v3 09/11] riscv: thead: Add support for the XTheadMemPair ISA extension

2023-02-24 Thread Kito Cheng via Gcc-patches
Got one fail:

FAIL: gcc.target/riscv/xtheadmempair-1.c   -O2   scan-assembler-times
th.luwd\t 4

It should scan lwud rather than luwd?


[PATCH] asan: adjust module name for global variables

2023-02-24 Thread Martin Liška
As mentioned in the PR, when we use LTO, we wrongly use ltrans output
file name as a module name of a global variable. That leads to a
non-reproducible output.

After the suggested change, we emit context name of normal global
variables. And for artificial variables (like .Lubsan_data3), we use
aux_base_name (e.g. "./a.ltrans0.ltrans").

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

PR asan/108834

gcc/ChangeLog:

* asan.cc (asan_add_global): Use proper TU name for normal
  global variables (and aux_base_name for the artificial one).

gcc/testsuite/ChangeLog:

* c-c++-common/asan/global-overflow-1.c: Test line and column
info for a global variable.
---
 gcc/asan.cc | 7 ++-
 gcc/testsuite/c-c++-common/asan/global-overflow-1.c | 2 +-
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/gcc/asan.cc b/gcc/asan.cc
index f56d084bc7a..245abb14388 100644
--- a/gcc/asan.cc
+++ b/gcc/asan.cc
@@ -3287,7 +3287,12 @@ asan_add_global (tree decl, tree type, 
vec *v)
 pp_string (_pp, "");
   str_cst = asan_pp_string (_pp);
 
-  pp_string (_name_pp, main_input_filename);
+  const_tree tu = get_ultimate_context ((const_tree)decl);
+  if (tu != NULL_TREE)
+pp_string (_name_pp, IDENTIFIER_POINTER (DECL_NAME (tu)));
+  else
+pp_string (_name_pp, aux_base_name);
+
   module_name_cst = asan_pp_string (_name_pp);
 
   if (asan_needs_local_alias (decl))
diff --git a/gcc/testsuite/c-c++-common/asan/global-overflow-1.c 
b/gcc/testsuite/c-c++-common/asan/global-overflow-1.c
index b97801da2b7..7e167cee67a 100644
--- a/gcc/testsuite/c-c++-common/asan/global-overflow-1.c
+++ b/gcc/testsuite/c-c++-common/asan/global-overflow-1.c
@@ -26,4 +26,4 @@ int main() {
 /* { dg-output "READ of size 1 at 0x\[0-9a-f\]+ thread T0.*(\n|\r\n|\r)" } */
 /* { dg-output "#0 0x\[0-9a-f\]+ +(in _*main 
(\[^\n\r]*global-overflow-1.c:20|\[^\n\r]*:0|\[^\n\r]*\\+0x\[0-9a-z\]*)|\[(\])\[^\n\r]*(\n|\r\n|\r).*"
 } */
 /* { dg-output "0x\[0-9a-f\]+ is located 0 bytes after global variable" } */
-/* { dg-output ".*YYY\[^\n\r]* of size 10\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output ".*YYY\[^\n\r]*asan/global-overflow-1.c:15:15'\[^\n\r]*of size 
10\[^\n\r]*(\n|\r\n|\r)" } */
-- 
2.39.2



[PATCH] use subreg for movsf_from_si and remove UNSPEC_SF_FROM_SI

2023-02-24 Thread Jiufu Guo via Gcc-patches
Hi,

In patch https://gcc.gnu.org/pipermail/gcc-patches/2023-February/612168.html,
we improved the bictcast from lowpart/highpart of DI to SF by using mtvsrws
or mtvsrd.

As investigating this functionality, we may improve the related code by using
bitcast subreg from SI to SF, and avoid generating UNSPEC_SF_FROM_SI.

We can also improve the cases like "subreg:SI(reg:SF)=reg:SI" which is cast
SI to SF (e.g. pr48335-1.c).

This patch also reduce clobber usage, only adding clobber for p8 where 
additional
register is required.

This patch pass bootstrap and regtest for ppc64(p7,p8 and p9) and 
ppc64le(p10,p9).

Is this patch ok for trunk (or maybe stage1)? Thanks for comments and 
sugguestions!


BR,
Jeff (Jiufu)

gcc/ChangeLog:

* config/rs6000/predicates.md: Rename TARGET_NO_SF_SUBREG to
BITCAST_SI_SF_IN_REGS, and rename TARGET_ALLOW_SF_SUBREG to
BITCAST_SI_SF_IN_MEM.
* config/rs6000/rs6000.cc (valid_sf_si_move): Likewise.
(is_lfs_stfs_insn): Split to is_stfs_insn and is_lfs_insn.
(is_stfs_insn): Split from is_lfs_stfs_insn.
(is_lfs_insn): Split from is_lfs_stfs_insn.
(prefixed_load_p): Call is_lfs_insn.
(prefixed_store_p): Call is_stfs_insn.
* config/rs6000/rs6000.h (TARGET_NO_SF_SUBREG): Rename to ...
(BITCAST_SI_SF_IN_REGS): ... this.
(TARGET_ALLOW_SF_SUBREG): Rename to ...
(BITCAST_SI_SF_IN_MEM): ... this.
* config/rs6000/rs6000.md (movsf_from_si_p8): New define_insn.

---
 gcc/config/rs6000/predicates.md | 16 +++---
 gcc/config/rs6000/rs6000.cc | 36 
 gcc/config/rs6000/rs6000.h  |  4 +-
 gcc/config/rs6000/rs6000.md | 98 +
 4 files changed, 97 insertions(+), 57 deletions(-)

diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index e57c9d99c6b..4a7d5893126 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -47,7 +47,7 @@ (define_predicate "sf_subreg_operand"
   rtx inner_reg = SUBREG_REG (op);
   machine_mode inner_mode = GET_MODE (inner_reg);
 
-  if (TARGET_ALLOW_SF_SUBREG || !REG_P (inner_reg))
+  if (BITCAST_SI_SF_IN_MEM || !REG_P (inner_reg))
 return 0;
 
   if ((mode == SFmode && GET_MODE_CLASS (inner_mode) == MODE_INT)
@@ -67,7 +67,7 @@ (define_predicate "altivec_register_operand"
 {
   if (SUBREG_P (op))
 {
-  if (TARGET_NO_SF_SUBREG && sf_subreg_operand (op, mode))
+  if (BITCAST_SI_SF_IN_REGS && sf_subreg_operand (op, mode))
return 0;
 
   op = SUBREG_REG (op);
@@ -88,7 +88,7 @@ (define_predicate "vsx_register_operand"
 {
   if (SUBREG_P (op))
 {
-  if (TARGET_NO_SF_SUBREG && sf_subreg_operand (op, mode))
+  if (BITCAST_SI_SF_IN_REGS && sf_subreg_operand (op, mode))
return 0;
 
   op = SUBREG_REG (op);
@@ -126,7 +126,7 @@ (define_predicate "vfloat_operand"
 {
   if (SUBREG_P (op))
 {
-  if (TARGET_NO_SF_SUBREG && sf_subreg_operand (op, mode))
+  if (BITCAST_SI_SF_IN_REGS && sf_subreg_operand (op, mode))
return 0;
 
   op = SUBREG_REG (op);
@@ -148,7 +148,7 @@ (define_predicate "vint_operand"
 {
   if (SUBREG_P (op))
 {
-  if (TARGET_NO_SF_SUBREG && sf_subreg_operand (op, mode))
+  if (BITCAST_SI_SF_IN_REGS && sf_subreg_operand (op, mode))
return 0;
 
   op = SUBREG_REG (op);
@@ -170,7 +170,7 @@ (define_predicate "vlogical_operand"
 {
   if (SUBREG_P (op))
 {
-  if (TARGET_NO_SF_SUBREG && sf_subreg_operand (op, mode))
+  if (BITCAST_SI_SF_IN_REGS && sf_subreg_operand (op, mode))
return 0;
 
   op = SUBREG_REG (op);
@@ -346,7 +346,7 @@ (define_predicate "gpc_reg_operand"
 {
   if (SUBREG_P (op))
 {
-  if (TARGET_NO_SF_SUBREG && sf_subreg_operand (op, mode))
+  if (BITCAST_SI_SF_IN_REGS && sf_subreg_operand (op, mode))
return 0;
 
   op = SUBREG_REG (op);
@@ -375,7 +375,7 @@ (define_predicate "int_reg_operand"
 {
   if (SUBREG_P (op))
 {
-  if (TARGET_NO_SF_SUBREG && sf_subreg_operand (op, mode))
+  if (BITCAST_SI_SF_IN_REGS && sf_subreg_operand (op, mode))
return 0;
 
   op = SUBREG_REG (op);
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 16ca3a31757..b8a9f01cbfa 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -10565,7 +10565,7 @@ rs6000_emit_le_vsx_move (rtx dest, rtx source, 
machine_mode mode)
 bool
 valid_sf_si_move (rtx dest, rtx src, machine_mode mode)
 {
-  if (TARGET_ALLOW_SF_SUBREG)
+  if (BITCAST_SI_SF_IN_MEM)
 return true;
 
   if (mode != SFmode && GET_MODE_CLASS (mode) != MODE_INT)
@@ -26425,13 +26425,10 @@ pcrel_opt_valid_mem_p (rtx reg, machine_mode mode, 
rtx mem)
- stfs:
 - SET is from UNSPEC_SI_FROM_SF to MEM:SI
 - CLOBBER is a V4SF
-   - lfs:
-- SET is from UNSPEC_SF_FROM_SI to REG:SF
-- CLOBBER is a DI
  */
 
 static bool
-is_lfs_stfs_insn (rtx_insn *insn)
+is_stfs_insn (rtx_insn *insn)
 {
   rtx 

[PATCH (pushed)] libsanitizer: cherry-pick commit 8f5962b1ccb5fcd4d4544121d43efb860ac3cc6d from upstream

2023-02-24 Thread Martin Liška
ASAN: keep support for Global::location

We as GCC still emit __asan_global_source_location for global variables
and we would like to use it in the future. On other hand, we don't
support llvm-symbolizer and the default libbacktraace symbolizer
does not support location info.
---
 libsanitizer/asan/asan_globals.cpp  | 9 +
 libsanitizer/asan/asan_interface_internal.h | 7 ---
 2 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/libsanitizer/asan/asan_globals.cpp 
b/libsanitizer/asan/asan_globals.cpp
index 8f3491f0199..01a243927ca 100644
--- a/libsanitizer/asan/asan_globals.cpp
+++ b/libsanitizer/asan/asan_globals.cpp
@@ -92,6 +92,10 @@ static void ReportGlobal(const Global , const char 
*prefix) {
   if (info.line != 0) {
 Report("  location: name=%s, %d\n", info.file, 
static_cast(info.line));
   }
+  else if (g.gcc_location != 0) {
+// Fallback to Global::gcc_location
+Report("  location: name=%s, %d\n", g.gcc_location->filename, 
g.gcc_location->line_no);
+  }
 }
 
 static u32 FindRegistrationSite(const Global *g) {
@@ -283,6 +287,11 @@ void PrintGlobalLocation(InternalScopedString *str, const 
__asan_global ) {
 
   if (info.line != 0) {
 str->append("%s:%d", info.file, static_cast(info.line));
+  } else if (g.gcc_location != 0) {
+// Fallback to Global::gcc_location
+str->append("%s", g.gcc_location->filename ? g.gcc_location->filename : 
g.module_name);
+if (g.gcc_location->line_no) str->append(":%d", g.gcc_location->line_no);
+if (g.gcc_location->column_no) str->append(":%d", 
g.gcc_location->column_no);
   } else {
 str->append("%s", g.module_name);
   }
diff --git a/libsanitizer/asan/asan_interface_internal.h 
b/libsanitizer/asan/asan_interface_internal.h
index 987f855c0f9..a9982637802 100644
--- a/libsanitizer/asan/asan_interface_internal.h
+++ b/libsanitizer/asan/asan_interface_internal.h
@@ -53,9 +53,10 @@ extern "C" {
 const char *module_name; // Module name as a C string. This pointer is a
  // unique identifier of a module.
 uptr has_dynamic_init;   // Non-zero if the global has dynamic initializer.
-uptr windows_padding;// TODO: Figure out how to remove this padding
- // that's simply here to make the MSVC incremental
- // linker happy...
+__asan_global_source_location *gcc_location;  // Source location of a 
global,
+  // used by GCC compiler. 
LLVM uses
+  // llvm-symbolizer that 
relies
+  // on DWARF debugging info.
 uptr odr_indicator;  // The address of the ODR indicator symbol.
   };
 
-- 
2.39.2



Re: [PATCH] rs6000: fmr gets used instead of faster xxlor [PR93571]

2023-02-24 Thread Ajit Agarwal via Gcc-patches
Hello All:

Here is the patch that uses xxlor instead of fmr where possible.
Performance results shows that fmr is better in power9 and 
power10 architectures whereas xxlor is better in power7 and
power 8 architectures.

Bootstrapped and regtested powepc64-linux-gnu.

Thanks & Regards
Ajit

rs6000: Use xxlor instead of fmr where possible

This patch replaces fmr with xxlor instruction for power7
and power8 architectures whereas for power9 and power10
replaces xxlor with fmr instruction.

Perf measurement results:

Power9 fmr:  201,847,661 cycles.
Power9 xxlor: 201,877,78 cycles.
Power8 fmr: 201,057,795 cycles.
Power8 xxlor: 201,004,671 cycles.

2023-02-24  Ajit Kumar Agarwal  

gcc/ChangeLog:

* config/rs6000/rs6000.md (*movdf_hardfloat64): Use xxlor
for power7 and power8 and fmr for power9 and power10.
---
 gcc/config/rs6000/rs6000.md | 46 +++--
 1 file changed, 29 insertions(+), 17 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 81bffb04ceb..1253b8622a7 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -354,7 +354,7 @@ (define_attr "cpu"
   (const (symbol_ref "(enum attr_cpu) rs6000_tune")))
 
 ;; The ISA we implement.
-(define_attr "isa" "any,p5,p6,p7,p7v,p8v,p9,p9v,p9kf,p9tf,p10"
+(define_attr "isa" "any,p5,p6,p7,p7v,p8v,p9,p9v,p9kf,p9tf,p7p8,p10"
   (const_string "any"))
 
 ;; Is this alternative enabled for the current CPU/ISA/etc.?
@@ -402,6 +402,11 @@ (define_attr "enabled" ""
  (and (eq_attr "isa" "p10")
  (match_test "TARGET_POWER10"))
  (const_int 1)
+  
+ (and (eq_attr "isa" "p7p8")
+ (match_test "TARGET_VSX && !TARGET_P9_VECTOR"))
+ (const_int 1)
+
 ] (const_int 0)))
 
 ;; If this instruction is microcoded on the CELL processor
@@ -8436,27 +8441,29 @@ (define_insn "*mov_softfloat32"
 
 (define_insn "*mov_hardfloat64"
   [(set (match_operand:FMOVE64 0 "nonimmediate_operand"
-   "=m,   d,  d,  ,   wY,
- ,Z,  ,  ,  !r,
- YZ,  r,  !r, *c*l,   !r,
-*h,   r,  ,   wa")
+   "=m,   d,  ,  ,   wY,
+, Z,  wa, ,  !r,
+YZ,   r,  !r, *c*l,   !r,
+*h,   r,  ,   d,  wn,
+wa")
(match_operand:FMOVE64 1 "input_operand"
-"d,   m,  d,  wY, ,
- Z,   ,   ,  ,  ,
+"d,   m,  ,  wY, ,
+ Z,   ,   wa, ,  ,
  r,   YZ, r,  r,  *h,
- 0,   ,   r,  eP"))]
+ 0,   ,   r,  d,  wn,
+ eP"))]
   "TARGET_POWERPC64 && TARGET_HARD_FLOAT
&& (gpc_reg_operand (operands[0], mode)
|| gpc_reg_operand (operands[1], mode))"
   "@
stfd%U0%X0 %1,%0
lfd%U1%X1 %0,%1
-   fmr %0,%1
+   xxlor %x0,%x1,%x1
lxsd %0,%1
stxsd %1,%0
lxsdx %x0,%y1
stxsdx %x1,%y0
-   xxlor %x0,%x1,%x1
+   fmr %0,%1
xxlxor %x0,%x0,%x0
li %0,0
std%U0%X0 %1,%0
@@ -8467,23 +8474,28 @@ (define_insn "*mov_hardfloat64"
nop
mfvsrd %0,%x1
mtvsrd %x0,%1
+   fmr %0,%1
+   fmr %0,%1
#"
   [(set_attr "type"
-"fpstore, fpload, fpsimple,   fpload, fpstore,
+"fpstore, fpload, veclogical, fpload, fpstore,
  fpload,  fpstore,veclogical, veclogical, integer,
  store,   load,   *,  mtjmpr, mfjmpr,
- *,   mfvsr,  mtvsr,  vecperm")
+ *,   mfvsr,  mtvsr,  fpsimple,   fpsimple,
+ vecperm")
(set_attr "size" "64")
(set_attr "isa"
-"*,   *,  *,  p9v,p9v,
- p7v, p7v,*,  *,  *,
- *,   *,  *,  *,  *,
- *,   p8v,p8v,p10")
+"*,   *,  p7p8,p9v,p9v,
+ p7v, p7v,*,   *,  *,
+ *,   *,  *,   *,  *,
+ *,   p8v,p8v, *,  *,
+ p10")
(set_attr "prefixed"
 "*,   *,  *,  *,  *,
  *,   *,  *,  *,  *,
  *,   *,  *,  *,  *,
- *,   *,  *,  *")])
+ *,   *,  *,  *,  *,
+ *")])
 
 ;;   STD  LD   MR  MT MF G-const
 ;;   H-const  F-const  

Re: [PATCH v3 00/11] RISC-V: Add XThead* extension support

2023-02-24 Thread Kito Cheng via Gcc-patches
Hi Christoph:

OK for trunk for the 1~8, feel free to commit 1~8 after you address
those minor comments, and could you also prepare release notes for
those extensions?

And 9~11 needs to take a few more rounds of review and test.




On Fri, Feb 24, 2023 at 1:52 PM Christoph Muellner
 wrote:
>
> From: Christoph Müllner 
>
> This series introduces support for the T-Head specific RISC-V ISA extensions
> which are available e.g. on the T-Head XuanTie C906.
>
> The ISA spec can be found here:
>   https://github.com/T-head-Semi/thead-extension-spec
>
> This series adds support for the following XThead* extensions:
> * XTheadBa
> * XTheadBb
> * XTheadBs
> * XTheadCmo
> * XTheadCondMov
> * XTheadFMemIdx
> * XTheadFmv
> * XTheadInt
> * XTheadMac
> * XTheadMemIdx
> * XTheadMemPair
> * XTheadSync
>
> All extensions are properly integrated and the included tests
> demonstrate the improvements of the generated code.
>
> The series also introduces support for "-mcpu=thead-c906", which also
> enables all available XThead* ISA extensions of the T-Head C906.
>
> All patches have been tested and don't introduce regressions for RV32 or RV64.
> The patches have also been tested with SPEC CPU2017 on QEMU and real HW
> (D1 board).
>
> Support patches for these extensions for Binutils, QEMU, and LLVM have
> already been merged in the corresponding upstream projects.
>
> Changes in v3:
> - Bugfix in XTheadBa
> - Rewrite of XTheadMemPair
> - Inclusion of XTheadMemIdx and XTheadFMemIdx
>
> Christoph Müllner (9):
>   riscv: Add basic XThead* vendor extension support
>   riscv: riscv-cores.def: Add T-Head XuanTie C906
>   riscv: thead: Add support for the XTheadBa ISA extension
>   riscv: thead: Add support for the XTheadBs ISA extension
>   riscv: thead: Add support for the XTheadBb ISA extension
>   riscv: thead: Add support for the XTheadCondMov ISA extensions
>   riscv: thead: Add support for the XTheadMac ISA extension
>   riscv: thead: Add support for the XTheadFmv ISA extension
>   riscv: thead: Add support for the XTheadMemPair ISA extension
>
> moiz.hussain (2):
>   riscv: thead: Add support for the XTheadMemIdx ISA extension
>   riscv: thead: Add support for the XTheadFMemIdx ISA extension
>
>  gcc/common/config/riscv/riscv-common.cc   |   26 +
>  gcc/config/riscv/bitmanip.md  |   52 +-
>  gcc/config/riscv/constraints.md   |   43 +
>  gcc/config/riscv/iterators.md |4 +
>  gcc/config/riscv/peephole.md  |   56 +
>  gcc/config/riscv/riscv-cores.def  |4 +
>  gcc/config/riscv/riscv-opts.h |   29 +
>  gcc/config/riscv/riscv-protos.h   |   28 +-
>  gcc/config/riscv/riscv.cc | 1090 +++--
>  gcc/config/riscv/riscv.h  |8 +-
>  gcc/config/riscv/riscv.md |  169 ++-
>  gcc/config/riscv/riscv.opt|3 +
>  gcc/config/riscv/thead.md |  351 ++
>  .../gcc.target/riscv/mcpu-thead-c906.c|   28 +
>  .../gcc.target/riscv/xtheadba-addsl.c |   55 +
>  gcc/testsuite/gcc.target/riscv/xtheadba.c |   14 +
>  gcc/testsuite/gcc.target/riscv/xtheadbb-ext.c |   20 +
>  .../gcc.target/riscv/xtheadbb-extu-2.c|   22 +
>  .../gcc.target/riscv/xtheadbb-extu.c  |   22 +
>  gcc/testsuite/gcc.target/riscv/xtheadbb-ff1.c |   18 +
>  gcc/testsuite/gcc.target/riscv/xtheadbb-rev.c |   45 +
>  .../gcc.target/riscv/xtheadbb-srri.c  |   21 +
>  gcc/testsuite/gcc.target/riscv/xtheadbb.c |   14 +
>  gcc/testsuite/gcc.target/riscv/xtheadbs-tst.c |   13 +
>  gcc/testsuite/gcc.target/riscv/xtheadbs.c |   14 +
>  gcc/testsuite/gcc.target/riscv/xtheadcmo.c|   14 +
>  .../riscv/xtheadcondmov-mveqz-imm-eqz.c   |   38 +
>  .../riscv/xtheadcondmov-mveqz-imm-not.c   |   38 +
>  .../riscv/xtheadcondmov-mveqz-reg-eqz.c   |   38 +
>  .../riscv/xtheadcondmov-mveqz-reg-not.c   |   38 +
>  .../riscv/xtheadcondmov-mvnez-imm-cond.c  |   38 +
>  .../riscv/xtheadcondmov-mvnez-imm-nez.c   |   38 +
>  .../riscv/xtheadcondmov-mvnez-reg-cond.c  |   38 +
>  .../riscv/xtheadcondmov-mvnez-reg-nez.c   |   38 +
>  .../gcc.target/riscv/xtheadcondmov.c  |   14 +
>  .../riscv/xtheadfmemidx-fldr-fstr.c   |   58 +
>  .../gcc.target/riscv/xtheadfmemidx.c  |   14 +
>  .../gcc.target/riscv/xtheadfmv-fmv.c  |   24 +
>  gcc/testsuite/gcc.target/riscv/xtheadfmv.c|   14 +
>  gcc/testsuite/gcc.target/riscv/xtheadint.c|   14 +
>  .../gcc.target/riscv/xtheadmac-mula-muls.c|   43 +
>  gcc/testsuite/gcc.target/riscv/xtheadmac.c|   14 +
>  .../gcc.target/riscv/xtheadmemidx-ldi-sdi.c   |   72 ++
>  .../riscv/xtheadmemidx-ldr-str-32.c   |   23 +
>  .../riscv/xtheadmemidx-ldr-str-64.c   |   53 +
>  .../gcc.target/riscv/xtheadmemidx-macros.h|  110 ++
>  gcc/testsuite/gcc.target/riscv/xtheadmemidx.c |   14 +
>