Re: [PATCH] x86: Replace ne:CCC/ne:CCO with UNSPEC_CC_NE in neg patterns

2022-10-27 Thread Hongtao Liu via Gcc-patches
On Thu, Oct 27, 2022 at 2:59 AM H.J. Lu via Gcc-patches
 wrote:
>
> In i386.md, neg patterns which set MODE_CC register like
>
> (set (reg:CCC FLAGS_REG)
>  (ne:CCC (match_operand:SWI48 1 "general_reg_operand") (const_int 0)))
>
> can lead to errors when operand 1 is a constant value.  If FLAGS_REG in
>
> (set (reg:CCC FLAGS_REG)
>  (ne:CCC (const_int 2) (const_int 0)))
>
> is set to 1, RTX simplifiers may simplify
>
> (set (reg:SI 93)
>  (neg:SI (ltu:SI (reg:CCC 17 flags) (const_int 0 [0]))))
>
> as
>
> (set (reg:SI 93)
>  (neg:SI (ltu:SI (const_int 1) (const_int 0 [0]))))
>
> which leads to incorrect results since LTU on MODE_CC register isn't the
> same as "unsigned less than" in x86 backend.  To prevent RTL optimizers
> from setting MODE_CC register to a constant, use UNSPEC_CC_NE to replace
> ne:CCC/ne:CCO when setting FLAGS_REG in neg patterns.
>
> gcc/
>
> PR target/107172
> * config/i386/i386.md (UNSPEC_CC_NE): New.
> Replace ne:CCC/ne:CCO with UNSPEC_CC_NE in neg patterns.
>
> gcc/testsuite/
>
> PR target/107172
> * gcc.target/i386/pr107172.c: New test.
> ---
>  gcc/config/i386/i386.md  | 45 +---
>  gcc/testsuite/gcc.target/i386/pr107172.c | 26 ++
>  2 files changed, 51 insertions(+), 20 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr107172.c
>
> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> index baf1f1f8fa2..aaa678e7314 100644
> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -113,6 +113,7 @@ (define_c_enum "unspec" [
>UNSPEC_PEEPSIB
>UNSPEC_INSN_FALSE_DEP
>UNSPEC_SBB
> +  UNSPEC_CC_NE
>
>;; For SSE/MMX support:
>UNSPEC_FIX_NOTRUNC
> @@ -11470,7 +11471,7 @@ (define_insn_and_split "*neg2_doubleword"
>"&& reload_completed"
>[(parallel
>  [(set (reg:CCC FLAGS_REG)
> - (ne:CCC (match_dup 1) (const_int 0)))
> + (unspec:CCC [(match_dup 1) (const_int 0)] UNSPEC_CC_NE))
>   (set (match_dup 0) (neg:DWIH (match_dup 1)))])
> (parallel
>  [(set (match_dup 2)
> @@ -11499,7 +11500,8 @@ (define_peephole2
> (match_operand:SWI48 1 "nonimmediate_gr_operand"))
> (parallel
>  [(set (reg:CCC FLAGS_REG)
> - (ne:CCC (match_operand:SWI48 2 "general_reg_operand") (const_int 
> 0)))
> + (unspec:CCC [(match_operand:SWI48 2 "general_reg_operand")
> +  (const_int 0)] UNSPEC_CC_NE))
>   (set (match_dup 2) (neg:SWI48 (match_dup 2)))])
> (parallel
>  [(set (match_dup 0)
> @@ -11517,7 +11519,7 @@ (define_peephole2
> && !reg_mentioned_p (operands[2], operands[1])"
>[(parallel
>  [(set (reg:CCC FLAGS_REG)
> - (ne:CCC (match_dup 2) (const_int 0)))
> + (unspec:CCC [(match_dup 2) (const_int 0)] UNSPEC_CC_NE))
>   (set (match_dup 2) (neg:SWI48 (match_dup 2)))])
> (parallel
>  [(set (match_dup 0)
> @@ -11543,7 +11545,8 @@ (define_peephole2
>   (clobber (reg:CC FLAGS_REG))])
> (parallel
>  [(set (reg:CCC FLAGS_REG)
> - (ne:CCC (match_operand:SWI48 1 "general_reg_operand") (const_int 
> 0)))
> + (unspec:CCC [(match_operand:SWI48 1 "general_reg_operand")
> +  (const_int 0)] UNSPEC_CC_NE))
>   (set (match_dup 1) (neg:SWI48 (match_dup 1)))])
> (parallel
>  [(set (match_dup 0)
> @@ -11559,7 +11562,7 @@ (define_peephole2
>"REGNO (operands[0]) != REGNO (operands[1])"
>[(parallel
>  [(set (reg:CCC FLAGS_REG)
> - (ne:CCC (match_dup 1) (const_int 0)))
> + (unspec:CCC [(match_dup 1) (const_int 0)] UNSPEC_CC_NE))
>   (set (match_dup 1) (neg:SWI48 (match_dup 1)))])
> (parallel
>  [(set (match_dup 0)
> @@ -11635,9 +11638,9 @@ (define_insn "*negsi_2_zext"
>
>  (define_insn "*neg_ccc_1"
>[(set (reg:CCC FLAGS_REG)
> -   (ne:CCC
> - (match_operand:SWI 1 "nonimmediate_operand" "0")
> - (const_int 0)))
> +   (unspec:CCC
> + [(match_operand:SWI 1 "nonimmediate_operand" "0")
> +  (const_int 0)] UNSPEC_CC_NE))
> (set (match_operand:SWI 0 "nonimmediate_operand" "=m")
> (neg:SWI (match_dup 1)))]
>""
> @@ -11647,9 +11650,9 @@ (define_insn "*neg_ccc_1"
>
>  (define_insn "*neg_ccc_2"
>[(set (reg:CCC FLAGS_REG)
> -   (ne:CCC
> - (match_operand:SWI 1 "nonimmediate_operand" "0")
> - (const_int 0)))
> +   (unspec:CCC
> + [(match_operand:SWI 1 "nonimmediate_operand" "0")
> +  (const_int 0)] UNSPEC_CC_NE))
> (clobber (match_scratch:SWI 0 "="))]
>""
>"neg{}\t%0"
> @@ -11659,8 +11662,8 @@ (define_insn "*neg_ccc_2"
>  (define_expand "x86_neg_ccc"
>[(parallel
>  [(set (reg:CCC FLAGS_REG)
> - (ne:CCC (match_operand:SWI48 1 "register_operand")
> - (const_int 0)))
> + (unspec:CCC [(match_operand:SWI48 1 "register_operand")
> +  (const_int 0)] UNSPEC_CC_NE))
>   (set 

Re: [PATCH] riscv/RTEMS: Add RISCV_GCOV_TYPE_SIZE

2022-10-27 Thread Sebastian Huber

On 28/10/2022 01:05, Palmer Dabbelt wrote:

On Thu, 27 Oct 2022 15:56:17 PDT (-0700), gcc-patches@gcc.gnu.org wrote:


On 10/26/22 01:49, Sebastian Huber wrote:
The RV32A extension does not support 64-bit atomic operations.  For RTEMS, use
a 32-bit gcov type for RV32.

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_gcov_type_size): New.
(TARGET_GCOV_TYPE_SIZE): Likewise.
* config/riscv/rtems.h (RISCV_GCOV_TYPE_SIZE): New.


Why make this specific to rtems?  ISTM the logic behind this change
would apply independently of the os.


Reducing the gcov type to 32-bit has the drawback that the program 
runtime before the counters overflow is reduced. I am not sure if this 
is generally acceptable.




Looks like rv32gc is just broken here:

$ cat test.c
int func(int x) { return x + 1; }
$ gcc -march=rv32gc -O3 -fprofile-update=atomic -fprofile-arcs test.c -S 
-o-

func(int):
    lui a4,%hi(__gcov0.func(int))
    lw  a5,%lo(__gcov0.func(int))(a4)
    lw  a2,%lo(__gcov0.func(int)+4)(a4)
    addi    a0,a0,1
    addi    a3,a5,1
    sltu    a5,a3,a5
    add a5,a5,a2
    sw  a3,%lo(__gcov0.func(int))(a4)
    sw  a5,%lo(__gcov0.func(int)+4)(a4)
    ret
_sub_I_00100_0:
    lui a0,%hi(.LANCHOR0)
    addi    a0,a0,%lo(.LANCHOR0)
    tail    __gcov_init
_sub_D_00100_1:
    tail    __gcov_exit
__gcov0.func(int):
    .zero   8

Those are not atomic...


Well, you get at least a warning:

test.c:1:1: warning: target does not support atomic profile update, 
single mode is selected


With the patch you get:

riscv-rtems6-gcc -march=rv32gc -O3 -fprofile-update=atomic 
-fprofile-arcs test.c -S -o-

func:
    lui a5,%hi(__gcov0.func)
    li  a4,1
    addi    a5,a5,%lo(__gcov0.func)
    amoadd.w zero,a4,0(a5)
    addi    a0,a0,1
    ret
    .size   func, .-func

The Armv7-A doesn't have an issue with 64-bit atomics:

arm-rtems6-gcc -march=armv7-a -O3 -fprofile-update=atomic -fprofile-arcs 
test.c -S -o-

func:
    @ args = 0, pretend = 0, frame = 0
    @ frame_needed = 0, uses_anonymous_args = 0
    @ link register save eliminated.
    movw    r3, #:lower16:.LANCHOR0
    movt    r3, #:upper16:.LANCHOR0
    push    {r4, r5, r6, r7}
    mov     r4, #1
    mov     r5, #0
.L2:
    ldrexd  r6, r7, [r3]
    adds    r6, r6, r4
    adc     r7, r7, r5
    strexd  r1, r6, r7, [r3]
    cmp     r1, #0
    bne     .L2
    add     r0, r0, #1
    pop     {r4, r5, r6, r7}
    bx      lr

Maybe RV32 should also support LL/SC instructions with two 32-bit registers.

Another option would be to split the atomic increment into two parts as 
suggested by Jakub Jelinek:


https://patchwork.ozlabs.org/project/gcc/patch/19c4a81d-6ecd-8c6e-b641-e257c1959...@suse.cz/#1447334

Another option would be to use library calls if hardware atomics are not 
available.


--
embedded brains GmbH
Herr Sebastian HUBER
Dornierstr. 4
82178 Puchheim
Germany
email: sebastian.hu...@embedded-brains.de
phone: +49-89-18 94 741 - 16
fax:   +49-89-18 94 741 - 08

Registergericht: Amtsgericht München
Registernummer: HRB 157899
Vertretungsberechtigte Geschäftsführer: Peter Rasmussen, Thomas Dörfler
Unsere Datenschutzerklärung finden Sie hier:
https://embedded-brains.de/datenschutzerklaerung/


Re: [PATCH v2] RISC-V: Libitm add RISC-V support.

2022-10-27 Thread Xi Ruoyao via Gcc-patches
On Thu, 2022-10-27 at 17:44 -0700, Palmer Dabbelt wrote:
> though I don't have an opinion on whether libitm should be taking ports 
> to new targets, I'd never even heard of it before.

I asked this question to myself when I reviewed LoongArch libitm port. 
But I remember one maintainer of Deepin (a distro) has complained that
some packages were depending on libitm (and/or libvtv).

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


Re: [PATCH] [x86] Enable V4BFmode and V2BFmode.

2022-10-27 Thread Hongtao Liu via Gcc-patches
I'm going to check in this patch.

On Wed, Oct 26, 2022 at 10:30 AM liuhongt  wrote:
>
> Enable V4BFmode and V2BFmode with the same ABI as V4HFmode and
> V2HFmode. No real operation is supported for them except for movement.
> This should solve PR target/107261.
>
> Also, I noticed a redundant entry in VALID_AVX512FP16_REG_MODE and
> removed it.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?
>
> gcc/ChangeLog:
>
> PR target/107261
> * config/i386/i386-modes.def (VECTOR_MODE): Support V2BFmode.
> * config/i386/i386.cc (classify_argument): Handle V4BFmode and
> V2BFmode.
> (ix86_convert_const_vector_to_integer): Ditto.
> * config/i386/i386.h (VALID_AVX512FP16_REG_MODE): Remove
> V2BFmode.
> (VALID_SSE2_REG_MODE): Add V4BFmode and V2BFmode.
> (VALID_MMX_REG_MODE): Add V4BFmode.
> * config/i386/i386.md (mode): Add V4BF and V2BF.
> (MODE_SIZE): Ditto.
> * config/i386/mmx.md (MMXMODE): Add V4BF.
> (V_32): Add V2BF.
> (V_16_32_64): Add V4BF and V2BF.
> (mmxinsnmode): Add V4BF and V2BF.
> (*mov_internal): Handle V4BFmode and V2BFmode.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/pr107261.c: New test.
> ---
>  gcc/config/i386/i386-modes.def   |  1 +
>  gcc/config/i386/i386.cc  |  6 
>  gcc/config/i386/i386.h   |  9 +++---
>  gcc/config/i386/i386.md  |  5 ++--
>  gcc/config/i386/mmx.md   | 26 +---
>  gcc/testsuite/gcc.target/i386/pr107261.c | 38 
>  6 files changed, 68 insertions(+), 17 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr107261.c
>
> diff --git a/gcc/config/i386/i386-modes.def b/gcc/config/i386/i386-modes.def
> index b49daaef253..dbc3165c5fc 100644
> --- a/gcc/config/i386/i386-modes.def
> +++ b/gcc/config/i386/i386-modes.def
> @@ -93,6 +93,7 @@ VECTOR_MODES (FLOAT, 64); /*  V32HF V16SF V8DF V4TF */
>  VECTOR_MODES (FLOAT, 128);/* V64HF V32SF V16DF V8TF */
>  VECTOR_MODES (FLOAT, 256);/* V128HF V64SF V32DF V16TF */
>  VECTOR_MODE (FLOAT, HF, 2);   /*  V2HF */
> +VECTOR_MODE (FLOAT, BF, 2);   /*  V2BF */
>  VECTOR_MODE (FLOAT, HF, 6);   /*  V6HF */
>  VECTOR_MODE (INT, TI, 1); /*   V1TI */
>  VECTOR_MODE (INT, DI, 1); /*   V1DI */
> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> index aeea26ef4be..1aca7d55a09 100644
> --- a/gcc/config/i386/i386.cc
> +++ b/gcc/config/i386/i386.cc
> @@ -2507,7 +2507,9 @@ classify_argument (machine_mode mode, const_tree type,
>  case E_V2SImode:
>  case E_V4HImode:
>  case E_V4HFmode:
> +case E_V4BFmode:
>  case E_V2HFmode:
> +case E_V2BFmode:
>  case E_V8QImode:
>classes[0] = X86_64_SSE_CLASS;
>return 1;
> @@ -2991,6 +2993,7 @@ pass_in_reg:
>  case E_V8QImode:
>  case E_V4HImode:
>  case E_V4HFmode:
> +case E_V4BFmode:
>  case E_V2SImode:
>  case E_V2SFmode:
>  case E_V1TImode:
> @@ -3240,6 +3243,7 @@ pass_in_reg:
>  case E_V8QImode:
>  case E_V4HImode:
>  case E_V4HFmode:
> +case E_V4BFmode:
>  case E_V2SImode:
>  case E_V2SFmode:
>  case E_V1TImode:
> @@ -15810,7 +15814,9 @@ ix86_convert_const_vector_to_integer (rtx op, 
> machine_mode mode)
> }
>break;
>  case E_V2HFmode:
> +case E_V2BFmode:
>  case E_V4HFmode:
> +case E_V4BFmode:
>  case E_V2SFmode:
>for (int i = 0; i < nunits; ++i)
> {
> diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
> index fd7c9df47e5..16d9c606077 100644
> --- a/gcc/config/i386/i386.h
> +++ b/gcc/config/i386/i386.h
> @@ -1033,13 +1033,12 @@ extern const char *host_detect_local_cpu (int argc, 
> const char **argv);
> || (MODE) == V8BFmode || (MODE) == TImode)
>
>  #define VALID_AVX512FP16_REG_MODE(MODE)  
>   \
> -  ((MODE) == V8HFmode || (MODE) == V16HFmode || (MODE) == V32HFmode\
> -   || (MODE) == V2HFmode)
> +  ((MODE) == V8HFmode || (MODE) == V16HFmode || (MODE) == V32HFmode)
>
>  #define VALID_SSE2_REG_MODE(MODE)  \
>((MODE) == V16QImode || (MODE) == V8HImode || (MODE) == V2DFmode \
> || (MODE) == V8HFmode || (MODE) == V4HFmode || (MODE) == V2HFmode   \
> -   || (MODE) == V8BFmode \
> +   || (MODE) == V8BFmode || (MODE) == V4BFmode || (MODE) == V2BFmode   \
> || (MODE) == V4QImode || (MODE) == V2HImode || (MODE) == V1SImode   \
> || (MODE) == V2DImode || (MODE) == V2QImode || (MODE) == DFmode \
> || (MODE) == HFmode || (MODE) == BFmode)
> @@ -1057,7 +1056,7 @@ extern const char *host_detect_local_cpu (int argc, 
> const char **argv);
>((MODE) == V1DImode || (MODE) == DImode  \
> || (MODE) == V2SImode || (MODE) == 

Re: [PATCH v2] RISC-V: Libitm add RISC-V support.

2022-10-27 Thread Palmer Dabbelt
On Thu, 27 Oct 2022 16:05:19 PDT (-0700), gcc-patches@gcc.gnu.org wrote:
>
> On 10/27/22 06:49, Xiongchuan Tan via Gcc-patches wrote:
>> libitm/ChangeLog:
>>
>>  * configure.tgt: Add riscv support.
>>  * config/riscv/asm.h: New file.
>>  * config/riscv/sjlj.S: New file.
>>  * config/riscv/target.h: New file.
>> ---
>> v2: Change HW_CACHELINE_SIZE to 64 (in accordance with the RVA profiles, see
>> https://github.com/riscv/riscv-profiles/blob/main/profiles.adoc)
>>
>>   libitm/config/riscv/asm.h|  52 +
>>   libitm/config/riscv/sjlj.S   | 144 +++
>>   libitm/config/riscv/target.h |  50 
>>   libitm/configure.tgt |   2 +
>>   4 files changed, 248 insertions(+)
>>   create mode 100644 libitm/config/riscv/asm.h
>>   create mode 100644 libitm/config/riscv/sjlj.S
>>   create mode 100644 libitm/config/riscv/target.h
>
> Not objecting or even reviewing.  But hasn't transactional memory
> largely fallen out of favor these days?  Intel has pulled it and I think
> IBM did as well.  Should we be investing in extending libitm at all?

I think we didn't get the memo: 
https://github.com/riscv/riscv-isa-manual/pull/906

The code looks fine to me, so

Reviewed-by: Palmer Dabbelt 
Acked-by: Palmer Dabbelt 

though I don't have an opinion on whether libitm should be taking ports 
to new targets, I'd never even heard of it before.

Some minor comments:

> +_ITM_beginTransaction:
> +   cfi_startproc
> +mv a1, sp
> +   addi sp, sp, -(14*SZ_GPR+12*SZ_FPR)
> +   cfi_adjust_cfa_offset(14*SZ_GPR+12*SZ_FPR)

Many of the ABIs require 16-byte stack alignment.

Also: it doesn't hurt anything to use the extra stack, but we only 
strictly need space for the FPRs if we're going to bother saving them.

> +/* ??? The size of one line in hardware caches (in bytes). */
> +#define HW_CACHELINE_SIZE 64

Maybe we should have a placeholder libc/vdso routine for the cache line 
size?  The specs are sort of just a suggestion for that sort of thing.

> +static inline void
> +cpu_relax (void)
> +{
> +__asm__ volatile ("" : : : "memory");
> +}

We have Zihintpause now, but that's a pretty minor optimization.


[committed] c: C2x enums with fixed underlying type [PR61469]

2022-10-27 Thread Joseph Myers
C2x adds support for enums with a fixed underlying type specified
("enum e : long long;" and similar).  Implement this in the C front
end.  The same representation is used for these types as in C++, with
two macros moved from cp-tree.h to c-common.h.

Such enums can have bool as the underlying type, and various C
front-end code checking for boolean types is adjusted to use a new
C_BOOLEAN_TYPE_P to handle such enums the same way as bool.  (Note
that for C++ we have bug 96496 that enums with underlying type bool
don't work correctly there.)

There are various issues with the wording for such enums in the
current C2x working draft (including but not limited to wording in the
accepted paper that failed to make it into the working draft), which I
intend to raise in NB comments.  I think what I've implemented and
added tests for matches the intent.

Bootstrapped with no regressions for x86_64-pc-linux-gnu.

PR c/61469

gcc/c-family/
* c-common.h (ENUM_UNDERLYING_TYPE, ENUM_FIXED_UNDERLYING_TYPE_P):
New.  Moved from cp/cp-tree.h.
* c-warn.cc (warnings_for_convert_and_check): Do not consider
conversions to enum with underlying type bool to overflow.

gcc/c/
* c-convert.cc (c_convert): Handle enums with underlying boolean
type like bool.
* c-decl.cc (shadow_tag_warned): Allow shadowing declarations for
enums with enum type specifier, but give errors for storage class
specifiers, qualifiers or alignment specifiers in non-definition
declarations of such enums.
(grokdeclarator): Give error for non-definition use of type
specifier with an enum type specifier.
(parser_xref_tag): Add argument has_enum_type_specifier.  Pass it
to lookup_tag and use it to set ENUM_FIXED_UNDERLYING_TYPE_P.
(xref_tag): Update call to parser_xref_tag.
(start_enum): Add argument fixed_underlying_type.  Complete enum
type with a fixed underlying type given in the definition.  Give
error for defining without a fixed underlying type in the
definition if one was given in a prior declaration.  Do not mark
enums with fixed underlying type as packed for -fshort-enums.
Store the enum type in the_enum.
(finish_enum): Do not adjust types of values or check their range
for an enum with a fixed underlying type.  Set underlying type of
enum and variants.
(build_enumerator): Check enumeration constants for enum with
fixed underlying type against that type and convert to that type.
Increment in the underlying integer type, with handling for bool.
(c_simulate_enum_decl): Update call to start_enum.
(declspecs_add_type): Set specs->enum_type_specifier_ref_p.
* c-objc-common.cc (c_get_alias_set): Use ENUM_UNDERLYING_TYPE
rather than recomputing an underlying type based on size.
* c-parser.cc (c_parser_declspecs)
(c_parser_struct_or_union_specifier, c_parser_typeof_specifier):
Set has_enum_type_specifier for type specifiers.
(c_parser_enum_specifier): Handle enum type specifiers.
(c_parser_struct_or_union_specifier): Update call to
parser_xref_tag.
(c_parser_omp_atomic): Check for boolean increment or decrement
using C_BOOLEAN_TYPE_P.
* c-tree.h (C_BOOLEAN_TYPE_P): New.
(struct c_typespec): Add has_enum_type_specifier.
(struct c_declspecs): Add enum_type_specifier_ref_p.
(struct c_enum_contents): Add enum_type.
(start_enum, parser_xref_tag): Update prototypes.
* c-typeck.cc (composite_type): Allow for enumerated types
compatible with bool.
(common_type, comptypes_internal, perform_integral_promotions):
Use ENUM_UNDERLYING_TYPE.
(parser_build_binary_op, build_unary_op, convert_for_assignment)
(c_finish_return, c_start_switch, build_binary_op): Check for
boolean types using C_BOOLEAN_TYPE_P.

gcc/cp/
* cp-tree.h (ENUM_FIXED_UNDERLYING_TYPE_P, ENUM_UNDERLYING_TYPE):
Remove.  Moved to c-common.h.

gcc/testsuite/
* gcc.dg/c11-enum-4.c, gcc.dg/c11-enum-5.c, gcc.dg/c11-enum-6.c,
gcc.dg/c2x-enum-6.c, gcc.dg/c2x-enum-7.c, gcc.dg/c2x-enum-8.c,
gcc.dg/gnu2x-enum-1.c: New tests.

diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 62ab4ba437b..f9d0d2945a5 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1004,6 +1004,30 @@ extern void c_parse_final_cleanups (void);
 /* True iff TYPE is cv decltype(nullptr).  */
 #define NULLPTR_TYPE_P(TYPE) (TREE_CODE (TYPE) == NULLPTR_TYPE)
 
+/* Returns the underlying type of the given enumeration type. The
+   underlying type is determined in different ways, depending on the
+   properties of the enum:
+
+ - In C++0x or C2x, the underlying type can be explicitly specified, e.g.,
+
+ enum E1 : char { ... } // underlying type is char
+
+ 

Re: [PATCH] libgo: use _off_t for mmap offset argument

2022-10-27 Thread Ian Lance Taylor via Gcc-patches
On Sat, Oct 22, 2022 at 6:45 AM Sören Tempel  wrote:
>
> PING.
>
> soe...@soeren-tempel.net wrote:
> > From: Sören Tempel 
> >
> > On glibc-based systems, off_t is a 32-bit type on 32-bit systems and a
> > 64-bit type on 64-bit systems by default. However, on systems using musl
> > libc off_t is unconditionally a 64-bit type. As such, it is insufficient
> > to use a uintptr type for the mmap offset parameter.
> >
> > Presently, the (incorrect) mmap declaration causes a libgo run-time
> > failure on 32-bit musl systems (fatal error: runtime: cannot allocate
> > memory). This commit fixes this run-time error.
> >
> > Signed-off-by: Sören Tempel 
> > ---
> > This implements what has been proposed by Ian in a GitHub comment
> > https://github.com/golang/go/issues/51280#issuecomment-1046322011
> >
> > I don't have access to a 32-bit glibc system to test this on but
> > this does seem to work fine on 32-bit and 64-bit musl systems.

Thanks.  Committed as follows using _libgo_off_t_type to avoid the
confusion between off_t and off64_t.

Sorry for the delay.

Ian
11a5fc0c76aedb100b5d7ecc7dd4bed33d850bb8
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index 5b95b38a541..7e531c3f90b 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-6c188108858e3ae8c8ea8e4cc55427d8cf01bbc8
+5e658f4659c551330ea68f5667e4f951b218f32d
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/libgo/go/runtime/mem_gccgo.go b/libgo/go/runtime/mem_gccgo.go
index fa3389d857e..1e84f4f5c56 100644
--- a/libgo/go/runtime/mem_gccgo.go
+++ b/libgo/go/runtime/mem_gccgo.go
@@ -15,7 +15,7 @@ import (
 //go:linkname sysFree
 
 //extern mmap
-func sysMmap(addr unsafe.Pointer, n uintptr, prot, flags, fd int32, off 
uintptr) unsafe.Pointer
+func sysMmap(addr unsafe.Pointer, n uintptr, prot, flags, fd int32, off 
_libgo_off_t_type) unsafe.Pointer
 
 //extern munmap
 func munmap(addr unsafe.Pointer, length uintptr) int32
@@ -38,7 +38,7 @@ func init() {
 }
 
 func mmap(addr unsafe.Pointer, n uintptr, prot, flags, fd int32, off uintptr) 
(unsafe.Pointer, int) {
-   p := sysMmap(addr, n, prot, flags, fd, off)
+   p := sysMmap(addr, n, prot, flags, fd, _libgo_off_t_type(off))
if uintptr(p) == _MAP_FAILED {
return nil, errno()
}
@@ -47,6 +47,7 @@ func mmap(addr unsafe.Pointer, n uintptr, prot, flags, fd 
int32, off uintptr) (u
 
 // Don't split the stack as this method may be invoked without a valid G, which
 // prevents us from allocating more stack.
+//
 //go:nosplit
 func sysAlloc(n uintptr, sysStat *sysMemStat) unsafe.Pointer {
p, err := mmap(nil, n, _PROT_READ|_PROT_WRITE, _MAP_ANON|_MAP_PRIVATE, 
mmapFD, 0)
@@ -165,6 +166,7 @@ func sysHugePage(v unsafe.Pointer, n uintptr) {
 
 // Don't split the stack as this function may be invoked without a valid G,
 // which prevents us from allocating more stack.
+//
 //go:nosplit
 func sysFree(v unsafe.Pointer, n uintptr, sysStat *sysMemStat) {
sysStat.add(-int64(n))


[PATCH v2 1/3] libcpp: reject codepoints above 0x10FFFF

2022-10-27 Thread Ben Boeckel via Gcc-patches
Unicode does not support such values because they are unrepresentable in
UTF-16.

Signed-off-by: Ben Boeckel 
---
 libcpp/ChangeLog  | 6 ++
 libcpp/charset.cc | 4 ++--
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/libcpp/ChangeLog b/libcpp/ChangeLog
index 18d5bcceaf0..4d707277531 100644
--- a/libcpp/ChangeLog
+++ b/libcpp/ChangeLog
@@ -1,3 +1,9 @@
+2022-10-27  Ben Boeckel  
+
+   * include/charset.cc: Reject encodings of codepoints above 0x10FFFF.
+   UTF-16 does not support such codepoints and therefore all Unicode
+   rejects such values.
+
 2022-10-19  Lewis Hyatt  
 
* include/cpplib.h (struct cpp_string): Use new "string_length" GTY.
diff --git a/libcpp/charset.cc b/libcpp/charset.cc
index 12a398e7527..e9da6674b5f 100644
--- a/libcpp/charset.cc
+++ b/libcpp/charset.cc
@@ -216,7 +216,7 @@ one_utf8_to_cppchar (const uchar **inbufp, size_t 
*inbytesleftp,
   if (c <= 0x3FFFFFF && nbytes > 5) return EILSEQ;
 
   /* Make sure the character is valid.  */
-  if (c > 0x7FFFFFFF || (c >= 0xD800 && c <= 0xDFFF)) return EILSEQ;
+  if (c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF)) return EILSEQ;
 
   *cp = c;
   *inbufp = inbuf;
@@ -320,7 +320,7 @@ one_utf32_to_utf8 (iconv_t bigend, const uchar **inbufp, 
size_t *inbytesleftp,
   s += inbuf[bigend ? 2 : 1] << 8;
   s += inbuf[bigend ? 3 : 0];
 
-  if (s >= 0x7FFFFFFF || (s >= 0xD800 && s <= 0xDFFF))
+  if (s > 0x10FFFF || (s >= 0xD800 && s <= 0xDFFF))
 return EILSEQ;
 
   rval = one_cppchar_to_utf8 (s, outbufp, outbytesleftp);
-- 
2.37.3



[PATCH v2 3/3] p1689r5: initial support

2022-10-27 Thread Ben Boeckel via Gcc-patches
This patch implements support for [P1689R5][] to communicate the C++20
module dependencies to build systems so that they may build `.gcm` files
in the proper order.

Support is communicated through the following three new flags:

- `-fdeps-format=` specifies the format for the output. Currently named
  `p1689r5`.

- `-fdeps-file=` specifies the path to the file to write the format to.

- `-fdep-output=` specifies the `.o` that will be written for the TU
  that is scanned. This is required so that the build system can
  correlate the dependency output with the actual compilation that will
  occur.

CMake supports this format as of 17 Jun 2022 (to be part of 3.25.0)
using an experimental feature selection (to allow for future usage
evolution without committing to how it works today). While it remains
experimental, docs may be found in CMake's documentation for
experimental features.

Future work may include using this format for Fortran module
dependencies as well, however this is still pending work.

[P1689R5]: https://isocpp.org/files/papers/P1689R5.html
[cmake-experimental]: 
https://gitlab.kitware.com/cmake/cmake/-/blob/master/Help/dev/experimental.rst

TODO:

- header-unit information fields

Header units (including the standard library headers) are 100%
unsupported right now because the `-E` mechanism wants to import their
BMIs. A new mode (i.e., something more workable than existing `-E`
behavior) that mocks up header units as if they were imported purely
from their path and content would be required.

- non-utf8 paths

The current standard says that paths that are not unambiguously
represented using UTF-8 are not supported (because these cases are rare
and the extra complication is not worth it at this time). Future
versions of the format might have ways of encoding non-UTF-8 paths. For
now, this patch just doesn't support non-UTF-8 paths (ignoring the
"unambiguously representable in UTF-8" case).

- figure out why junk gets placed at the end of the file

Sometimes it seems like the file gets a lot of `NUL` bytes appended to
it. It happens rarely and seems to be the result of some
`ftruncate`-style call which results in extra padding in the contents.
Noting it here as an observation at least.

Signed-off-by: Ben Boeckel 

---
 gcc/ChangeLog |   5 +
 gcc/c-family/ChangeLog|   6 +
 gcc/c-family/c-opts.cc|  40 +++-
 gcc/c-family/c.opt|  12 +
 gcc/cp/ChangeLog  |   5 +
 gcc/cp/module.cc  |   3 +-
 gcc/doc/invoke.texi   |  15 ++
 gcc/testsuite/ChangeLog   |   7 +
 gcc/testsuite/g++.dg/modules/depflags-f-MD.C  |   2 +
 gcc/testsuite/g++.dg/modules/depflags-f.C |   1 +
 gcc/testsuite/g++.dg/modules/depflags-fi.C|   3 +
 gcc/testsuite/g++.dg/modules/depflags-fj-MD.C |   3 +
 gcc/testsuite/g++.dg/modules/depflags-fj.C|   4 +
 .../g++.dg/modules/depflags-fjo-MD.C  |   4 +
 gcc/testsuite/g++.dg/modules/depflags-fjo.C   |   5 +
 gcc/testsuite/g++.dg/modules/depflags-fo-MD.C |   3 +
 gcc/testsuite/g++.dg/modules/depflags-fo.C|   4 +
 gcc/testsuite/g++.dg/modules/depflags-j-MD.C  |   2 +
 gcc/testsuite/g++.dg/modules/depflags-j.C |   3 +
 gcc/testsuite/g++.dg/modules/depflags-jo-MD.C |   3 +
 gcc/testsuite/g++.dg/modules/depflags-jo.C|   4 +
 gcc/testsuite/g++.dg/modules/depflags-o-MD.C  |   2 +
 gcc/testsuite/g++.dg/modules/depflags-o.C |   3 +
 gcc/testsuite/g++.dg/modules/modules.exp  |  11 +
 gcc/testsuite/g++.dg/modules/p1689-1.C|  18 ++
 gcc/testsuite/g++.dg/modules/p1689-1.exp.json |  27 +++
 gcc/testsuite/g++.dg/modules/p1689-2.C|  16 ++
 gcc/testsuite/g++.dg/modules/p1689-2.exp.json |  16 ++
 gcc/testsuite/g++.dg/modules/p1689-3.C|  14 ++
 gcc/testsuite/g++.dg/modules/p1689-3.exp.json |  16 ++
 gcc/testsuite/g++.dg/modules/p1689-4.C|  14 ++
 gcc/testsuite/g++.dg/modules/p1689-4.exp.json |  14 ++
 gcc/testsuite/g++.dg/modules/p1689-5.C|  14 ++
 gcc/testsuite/g++.dg/modules/p1689-5.exp.json |  14 ++
 gcc/testsuite/g++.dg/modules/test-p1689.py| 222 ++
 gcc/testsuite/lib/modules.exp |  71 ++
 libcpp/ChangeLog  |  11 +
 libcpp/include/cpplib.h   |  12 +-
 libcpp/include/mkdeps.h   |  17 +-
 libcpp/init.cc|  13 +-
 libcpp/mkdeps.cc  | 149 +++-
 41 files changed, 789 insertions(+), 19 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/modules/depflags-f-MD.C
 create mode 100644 gcc/testsuite/g++.dg/modules/depflags-f.C
 create mode 100644 gcc/testsuite/g++.dg/modules/depflags-fi.C
 create mode 100644 gcc/testsuite/g++.dg/modules/depflags-fj-MD.C
 create mode 100644 gcc/testsuite/g++.dg/modules/depflags-fj.C
 create mode 100644 

[PATCH v2 2/3] libcpp: add a function to determine UTF-8 validity of a C string

2022-10-27 Thread Ben Boeckel via Gcc-patches
This simplifies the interface for other UTF-8 validity detections when a
simple "yes" or "no" answer is sufficient.

Signed-off-by: Ben Boeckel 
---
 libcpp/ChangeLog  |  6 ++
 libcpp/charset.cc | 18 ++
 libcpp/internal.h |  2 ++
 3 files changed, 26 insertions(+)

diff --git a/libcpp/ChangeLog b/libcpp/ChangeLog
index 4d707277531..4e2c7900ae2 100644
--- a/libcpp/ChangeLog
+++ b/libcpp/ChangeLog
@@ -1,3 +1,9 @@
+2022-10-27  Ben Boeckel  
+
+   * include/charset.cc: Add `_cpp_valid_utf8_str` which determines
+   whether a C string is valid UTF-8 or not.
+   * include/internal.h: Add prototype for `_cpp_valid_utf8_str`.
+
 2022-10-27  Ben Boeckel  
 
* include/charset.cc: Reject encodings of codepoints above 0x10FFFF.
diff --git a/libcpp/charset.cc b/libcpp/charset.cc
index e9da6674b5f..0524ab6beba 100644
--- a/libcpp/charset.cc
+++ b/libcpp/charset.cc
@@ -1864,6 +1864,24 @@ _cpp_valid_utf8 (cpp_reader *pfile,
   return true;
 }
 
+extern bool
+_cpp_valid_utf8_str (const char *name)
+{
+  const uchar* in = (const uchar*)name;
+  size_t len = strlen(name);
+  cppchar_t cp;
+
+  while (*in)
+{
+  if (one_utf8_to_cppchar(&in, &len, &cp))
+   {
+ return false;
+   }
+}
+
+  return true;
+}
+
 /* Subroutine of convert_hex and convert_oct.  N is the representation
in the execution character set of a numeric escape; write it into the
string buffer TBUF and update the end-of-string pointer therein.  WIDE
diff --git a/libcpp/internal.h b/libcpp/internal.h
index badfd1b40da..4f2dd4a2f5c 100644
--- a/libcpp/internal.h
+++ b/libcpp/internal.h
@@ -834,6 +834,8 @@ extern bool _cpp_valid_utf8 (cpp_reader *pfile,
 struct normalize_state *nst,
 cppchar_t *cp);
 
+extern bool _cpp_valid_utf8_str (const char *str);
+
 extern void _cpp_destroy_iconv (cpp_reader *);
 extern unsigned char *_cpp_convert_input (cpp_reader *, const char *,
  unsigned char *, size_t, size_t,
-- 
2.37.3



[PATCH v2 0/1] RFC: P1689R5 support

2022-10-27 Thread Ben Boeckel via Gcc-patches
Hi,

This patch adds initial support for ISO C++'s [P1689R5][], a format for
describing C++ module requirements and provisions based on the source
code. This is required because compiling C++ with modules is not
embarrassingly parallel: compilations need to be ordered to ensure that
`import some_module;` can be satisfied in time by making sure that the TU
with `export import some_module;` is compiled first.

[P1689R5]: https://isocpp.org/files/papers/P1689R5.html

I'd like feedback on the approach taken here with respect to the
user-visible flags. I'll also note that header units are not supported
at this time because the current `-E` behavior with respect to `import
;` is to search for an appropriate `.gcm` file which is not
something such a "scan" can support. A new mode will likely need to be
created (e.g., replacing `-E` with `-fc++-module-scanning` or something)
where headers are looked up "normally" and processed only as much as
scanning requires.

For the record, Clang has patches with similar flags and behavior by
Chuanqi Xu here:

https://reviews.llvm.org/D134269

with the same flags.

Thanks,

--Ben

---
v1 -> v2:

- removal of the `deps_write(extra)` parameter to option-checking where
  needed
- default parameter of `cpp_finish(fdeps_stream = NULL)`
- unification of libcpp UTF-8 validity functions from v1
- test cases for flag parsing states (depflags-*) and p1689 output
  (p1689-*)
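
For readers unfamiliar with what a UTF-8 validity check on a C string
involves, here is a minimal standalone sketch. It is not the libcpp code from
this series: `valid_utf8_str` is a hypothetical name, and the decision to
also reject overlong encodings and surrogates is an assumption of this
example.

```cpp
#include <cassert>
#include <cstdint>

// Return true if the NUL-terminated string STR is well-formed UTF-8.
bool valid_utf8_str(const char* str) {
  const unsigned char* p = reinterpret_cast<const unsigned char*>(str);
  // Smallest code point that legitimately needs 1, 2, 3 or 4 bytes.
  static const std::uint32_t min_cp[] = {0x0, 0x80, 0x800, 0x10000};

  while (*p) {
    std::uint32_t cp;
    int extra;  // number of continuation bytes expected
    if (*p < 0x80)                { cp = *p;        extra = 0; }
    else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; }
    else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; }
    else if ((*p & 0xF8) == 0xF0) { cp = *p & 0x07; extra = 3; }
    else return false;  // stray continuation byte or invalid lead byte
    ++p;

    for (int i = 0; i < extra; ++i, ++p) {
      if ((*p & 0xC0) != 0x80)
        return false;  // not a continuation byte (also catches an early NUL)
      cp = (cp << 6) | (*p & 0x3F);
    }

    if (cp < min_cp[extra]) return false;            // overlong encoding
    if (cp > 0x10FFFF) return false;                 // beyond Unicode range
    if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // surrogate
  }
  return true;
}
```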

Ben Boeckel (3):
  libcpp: reject codepoints above 0x10
  libcpp: add a function to determine UTF-8 validity of a C string
  p1689r5: initial support

 gcc/ChangeLog |   5 +
 gcc/c-family/ChangeLog|   6 +
 gcc/c-family/c-opts.cc|  40 +++-
 gcc/c-family/c.opt|  12 +
 gcc/cp/ChangeLog  |   5 +
 gcc/cp/module.cc  |   3 +-
 gcc/doc/invoke.texi   |  15 ++
 gcc/testsuite/ChangeLog   |   7 +
 gcc/testsuite/g++.dg/modules/depflags-f-MD.C  |   2 +
 gcc/testsuite/g++.dg/modules/depflags-f.C |   1 +
 gcc/testsuite/g++.dg/modules/depflags-fi.C|   3 +
 gcc/testsuite/g++.dg/modules/depflags-fj-MD.C |   3 +
 gcc/testsuite/g++.dg/modules/depflags-fj.C|   4 +
 .../g++.dg/modules/depflags-fjo-MD.C  |   4 +
 gcc/testsuite/g++.dg/modules/depflags-fjo.C   |   5 +
 gcc/testsuite/g++.dg/modules/depflags-fo-MD.C |   3 +
 gcc/testsuite/g++.dg/modules/depflags-fo.C|   4 +
 gcc/testsuite/g++.dg/modules/depflags-j-MD.C  |   2 +
 gcc/testsuite/g++.dg/modules/depflags-j.C |   3 +
 gcc/testsuite/g++.dg/modules/depflags-jo-MD.C |   3 +
 gcc/testsuite/g++.dg/modules/depflags-jo.C|   4 +
 gcc/testsuite/g++.dg/modules/depflags-o-MD.C  |   2 +
 gcc/testsuite/g++.dg/modules/depflags-o.C |   3 +
 gcc/testsuite/g++.dg/modules/modules.exp  |  11 +
 gcc/testsuite/g++.dg/modules/p1689-1.C|  18 ++
 gcc/testsuite/g++.dg/modules/p1689-1.exp.json |  27 +++
 gcc/testsuite/g++.dg/modules/p1689-2.C|  16 ++
 gcc/testsuite/g++.dg/modules/p1689-2.exp.json |  16 ++
 gcc/testsuite/g++.dg/modules/p1689-3.C|  14 ++
 gcc/testsuite/g++.dg/modules/p1689-3.exp.json |  16 ++
 gcc/testsuite/g++.dg/modules/p1689-4.C|  14 ++
 gcc/testsuite/g++.dg/modules/p1689-4.exp.json |  14 ++
 gcc/testsuite/g++.dg/modules/p1689-5.C|  14 ++
 gcc/testsuite/g++.dg/modules/p1689-5.exp.json |  14 ++
 gcc/testsuite/g++.dg/modules/test-p1689.py| 222 ++
 gcc/testsuite/lib/modules.exp |  71 ++
 libcpp/ChangeLog  |  23 ++
 libcpp/charset.cc |  22 +-
 libcpp/include/cpplib.h   |  12 +-
 libcpp/include/mkdeps.h   |  17 +-
 libcpp/init.cc|  13 +-
 libcpp/internal.h |   2 +
 libcpp/mkdeps.cc  | 149 +++-
 43 files changed, 823 insertions(+), 21 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/modules/depflags-f-MD.C
 create mode 100644 gcc/testsuite/g++.dg/modules/depflags-f.C
 create mode 100644 gcc/testsuite/g++.dg/modules/depflags-fi.C
 create mode 100644 gcc/testsuite/g++.dg/modules/depflags-fj-MD.C
 create mode 100644 gcc/testsuite/g++.dg/modules/depflags-fj.C
 create mode 100644 gcc/testsuite/g++.dg/modules/depflags-fjo-MD.C
 create mode 100644 gcc/testsuite/g++.dg/modules/depflags-fjo.C
 create mode 100644 gcc/testsuite/g++.dg/modules/depflags-fo-MD.C
 create mode 100644 gcc/testsuite/g++.dg/modules/depflags-fo.C
 create mode 100644 gcc/testsuite/g++.dg/modules/depflags-j-MD.C
 create mode 100644 gcc/testsuite/g++.dg/modules/depflags-j.C
 create mode 100644 gcc/testsuite/g++.dg/modules/depflags-jo-MD.C
 create mode 100644 gcc/testsuite/g++.dg/modules/depflags-jo.C
 create mode 100644 gcc/testsuite/g++.dg/modules/depflags-o-MD.C
 create mode 100644 

Re: Document 'distclean-stage[N]'

2022-10-27 Thread Jeff Law via Gcc-patches



On 10/26/22 06:10, Thomas Schwinge wrote:

Hi!

OK to push the attached patch to "Document 'distclean-stage[N]'"?


OK

jeff




Re: [PATCH v2] RISC-V: Libitm add RISC-V support.

2022-10-27 Thread Jeff Law via Gcc-patches



On 10/27/22 06:49, Xiongchuan Tan via Gcc-patches wrote:

libitm/ChangeLog:

 * configure.tgt: Add riscv support.
 * config/riscv/asm.h: New file.
 * config/riscv/sjlj.S: New file.
 * config/riscv/target.h: New file.
---
v2: Change HW_CACHELINE_SIZE to 64 (in accordance with the RVA profiles, see
https://github.com/riscv/riscv-profiles/blob/main/profiles.adoc)

  libitm/config/riscv/asm.h|  52 +
  libitm/config/riscv/sjlj.S   | 144 +++
  libitm/config/riscv/target.h |  50 
  libitm/configure.tgt |   2 +
  4 files changed, 248 insertions(+)
  create mode 100644 libitm/config/riscv/asm.h
  create mode 100644 libitm/config/riscv/sjlj.S
  create mode 100644 libitm/config/riscv/target.h


Not objecting or even reviewing.  But hasn't transactional memory
largely fallen out of favor these days?  Intel has pulled it and I think
IBM did as well.  Should we be investing in extending libitm at all?



jeff




Re: [PATCH] riscv/RTEMS: Add RISCV_GCOV_TYPE_SIZE

2022-10-27 Thread Palmer Dabbelt

On Thu, 27 Oct 2022 15:56:17 PDT (-0700), gcc-patches@gcc.gnu.org wrote:


On 10/26/22 01:49, Sebastian Huber wrote:

The RV32A extension does not support 64-bit atomic operations.  For RTEMS, use
a 32-bit gcov type for RV32.

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_gcov_type_size): New.
(TARGET_GCOV_TYPE_SIZE): Likewise.
* config/riscv/rtems.h (RISCV_GCOV_TYPE_SIZE): New.


Why make this specific to rtems?  ISTM the logic behind this change
would apply independently of the os.


Looks like rv32gc is just broken here:

$ cat test.c
int func(int x) { return x + 1; }
$ gcc -march=rv32gc -O3 -fprofile-update=atomic -fprofile-arcs test.c -S -o-
func(int):
        lui     a4,%hi(__gcov0.func(int))
        lw      a5,%lo(__gcov0.func(int))(a4)
        lw      a2,%lo(__gcov0.func(int)+4)(a4)
        addi    a0,a0,1
        addi    a3,a5,1
        sltu    a5,a3,a5
        add     a5,a5,a2
        sw      a3,%lo(__gcov0.func(int))(a4)
        sw      a5,%lo(__gcov0.func(int)+4)(a4)
        ret
_sub_I_00100_0:
        lui     a0,%hi(.LANCHOR0)
        addi    a0,a0,%lo(.LANCHOR0)
        tail    __gcov_init
_sub_D_00100_1:
        tail    __gcov_exit
__gcov0.func(int):
        .zero   8

Those are not atomic...

On rv64 we got some amoadds, which are sane.
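
To spell out why that sequence is not atomic, here is a small C++ model of
what the rv32 code above does to the 8-byte counter. `SplitCounter`,
`increment` and `value` are hypothetical names for this sketch; the comments
map each step to the instruction it models.

```cpp
#include <cassert>
#include <cstdint>

// A 64-bit counter stored as two 32-bit words, as the rv32 code treats it.
struct SplitCounter {
  std::uint32_t lo;
  std::uint32_t hi;
};

// Models the lw/lw/addi/sltu/add/sw/sw sequence: two loads, a carry
// computation and two separate stores.  Nothing makes the pair of stores
// indivisible, so another thread can observe or overwrite a half-updated
// counter between them.
void increment(SplitCounter& c) {
  std::uint32_t old_lo = c.lo;            // lw
  std::uint32_t old_hi = c.hi;            // lw
  std::uint32_t new_lo = old_lo + 1;      // addi
  std::uint32_t carry = new_lo < old_lo;  // sltu: 1 if the low word wrapped
  c.lo = new_lo;                          // sw
  c.hi = old_hi + carry;                  // add, sw
}

// Reassemble the two halves into the logical 64-bit value.
std::uint64_t value(const SplitCounter& c) {
  return (static_cast<std::uint64_t>(c.hi) << 32) | c.lo;
}
```

The arithmetic is correct in a single thread; the problem is only that the
update is spread over several independent memory operations.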


Re: [PATCH 1/2] ivopts: Revert computation of address cost complexity.

2022-10-27 Thread Jeff Law via Gcc-patches



On 10/21/22 07:52, Dimitrije Milosevic wrote:

From: Dimitrije Milošević 

This patch reverts the computation of address cost complexity
to the legacy one. After f9f69dd, complexity is calculated
using the valid_mem_ref_p target hook. Architectures like
Mips only allow BASE + OFFSET addressing modes, which in turn
prevents the calculation of complexity for other addressing
modes, resulting in non-optimal candidate selection.

gcc/ChangeLog:

* tree-ssa-address.cc (multiplier_allowed_in_address_p): Change
to non-static.
* tree-ssa-address.h (multiplier_allowed_in_address_p): Declare.
* tree-ssa-loop-ivopts.cc (compute_symbol_and_var_present): Reintroduce.
(compute_min_and_max_offset): Likewise.
(get_address_cost): Revert
complexity calculation.


The part I don't understand is, if you only have BASE+OFF, why does
preventing the calculation of more complex addressing modes matter?  I.e.,
what's the point of computing the cost of something like base + off +
scaled index when the target can't utilize it?



jeff




Re: [PATCH] Convert flag_finite_math_only uses in frange to HONOR_*.

2022-10-27 Thread Jeff Law via Gcc-patches



On 10/25/22 14:59, Aldy Hernandez via Gcc-patches wrote:

[As Richi, and probably Jakub, have mentioned in the past...]

As mentioned earlier, we should be using HONOR_* on types rather than
flag_finite_math_only.

Will commit pending tests.

gcc/ChangeLog:

* value-range.cc (frange::set): Use HONOR_*.
(frange::verify_range): Same.
* value-range.h (frange_val_min): Same.
(frange_val_max): Same.


I haven't verified it's this patch, but our friend the vax regression is 
back:



cc1: internal compiler error: in fail, at selftest.cc:47
0x1686807 selftest::fail(selftest::location const&, char const*)
    ../../../gcc/gcc/selftest.cc:47
0x10578d2 range_tests_floats
    ../../../gcc/gcc/value-range.cc:4038
0x10658fd range_tests_floats_various
    ../../../gcc/gcc/value-range.cc:4056
0x10658fd selftest::range_tests()
    ../../../gcc/gcc/value-range.cc:4069

http://law-sandy.freeddns.org:8080/job/vax-unknown-linux/1458/console


Jeff




Re: [PATCH] riscv/RTEMS: Add RISCV_GCOV_TYPE_SIZE

2022-10-27 Thread Jeff Law via Gcc-patches



On 10/26/22 01:49, Sebastian Huber wrote:

The RV32A extension does not support 64-bit atomic operations.  For RTEMS, use
a 32-bit gcov type for RV32.

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_gcov_type_size): New.
(TARGET_GCOV_TYPE_SIZE): Likewise.
* config/riscv/rtems.h (RISCV_GCOV_TYPE_SIZE): New.


Why make this specific to rtems?  ISTM the logic behind this change 
would apply independently of the os.



jeff




Re: [PATCH] docs: document sanitizers can trigger warnings

2022-10-27 Thread Jeff Law via Gcc-patches



On 10/26/22 05:09, Martin Liška wrote:

PR sanitizer/107298

gcc/ChangeLog:

* doc/invoke.texi: Document sanitizers can trigger warnings.


OK

jeff




Re: [PATCH] RISC-V: Change constexpr back to CONSTEXPR

2022-10-27 Thread Jeff Law via Gcc-patches



On 10/27/22 08:41, juzhe.zh...@rivai.ai wrote:

From: Ju-Zhe Zhong 

According to 
https://github.com/gcc-mirror/gcc/commit/f95d3d5de72a1c43e8d529bad3ef59afc3214705.
Since GCC 4.8.6 doesn't support constexpr, we should change it back to 
CONSTEXPR.
gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc: Change constexpr back to 
CONSTEXPR.
* config/riscv/riscv-vector-builtins-shapes.cc (SHAPE): Ditto.
* config/riscv/riscv-vector-builtins.cc (struct 
registered_function_hasher): Ditto.
* config/riscv/riscv-vector-builtins.h (struct rvv_arg_type_info): 
Ditto.


OK.   Please install.


Maybe we can move past gcc-4.8 as a bootstrapping requirement one day ;-)
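
For context, the usual shape of such a portability macro can be sketched as
follows. This is an illustration only: `SKETCH_CONSTEXPR`, `shape` and
`default_shape` are hypothetical names, and the exact condition GCC uses for
its own `CONSTEXPR` may differ from the feature-test check assumed here.

```cpp
#include <cassert>

// Hypothetical macro: expands to constexpr only when the host compiler
// advertises constexpr support, and to nothing otherwise.
#if defined(__cpp_constexpr) && __cpp_constexpr >= 200704
# define SKETCH_CONSTEXPR constexpr
#else
# define SKETCH_CONSTEXPR
#endif

struct shape {
  int id;
  SKETCH_CONSTEXPR shape(int i) : id(i) {}
};

// With constexpr support this is a compile-time constant; with an older
// host compiler it is an ordinary const object initialized at startup.
static SKETCH_CONSTEXPR const shape default_shape(7);
```

Code written this way builds with both old and new host compilers, at the
cost of losing the compile-time guarantee on the old ones.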


Jeff




Re: [PATCH] [PR tree-optimization/107394] Canonicalize global franges as they are read back.

2022-10-27 Thread Jeff Law via Gcc-patches



On 10/25/22 15:01, Aldy Hernandez via Gcc-patches wrote:

[Richi/Jakub/FP experts, does this sound like the right solution, or am I
missing some subtle IPA/inlining issue?]

The problem here is that we're inlining a global range with NANs into
a function that has been tagged with __attribute__((optimize
("-ffinite-math-only"))).  As the global range is copied from
SSA_NAME_RANGE_INFO, its NAN bits are copied, which then cause
frange::verify_range() to fail a sanity check making sure no NANs
creep in when !HONOR_NANS.

I think what we should do is nuke the NAN bits as we're restoring the
global range.  For that matter, if we use the frange constructor,
everything except that NAN sign will be done automatically, including
dropping INFs to the min/max representable range when appropriate.

PR tree-optimization/107394

gcc/ChangeLog:

* value-range-storage.cc (frange_storage_slot::get_frange): Use
frange constructor.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/pr107394.c: New test.


The other approach would be to disable inlining in this case due to an
unsafe attribute mismatch, but we're not currently doing much sanity 
checking in this space and it might be a huge can of worms.  I'm 
inclined to ACK, but give Jakub and Richi until Monday to chime in first.
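
As a toy model of the canonicalization being discussed, the sketch below
drops NAN flags and clamps infinite bounds when a stored range is read back
into a context that does not honor them. `ToyRange` and `canonicalize` are
hypothetical and are not GCC's `frange` API; the two boolean parameters
stand in for the HONOR_* queries.

```cpp
#include <cassert>
#include <cfloat>
#include <cmath>

// Toy stand-in for a floating-point range; not GCC's frange.
struct ToyRange {
  double lo, hi;
  bool pos_nan, neg_nan;  // whether +NAN / -NAN are possible values
};

// Rebuild a stored range for a context that may not honor NANs or INFs.
ToyRange canonicalize(ToyRange r, bool honor_nans, bool honor_infs) {
  if (!honor_nans)
    r.pos_nan = r.neg_nan = false;  // NANs cannot appear in this context
  if (!honor_infs) {
    // Drop infinite bounds to the largest representable finite values.
    if (std::isinf(r.lo)) r.lo = r.lo < 0 ? -DBL_MAX : DBL_MAX;
    if (std::isinf(r.hi)) r.hi = r.hi < 0 ? -DBL_MAX : DBL_MAX;
  }
  return r;
}
```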



jeff



Re: [PATCH] RISC-V: Add Zawrs ISA extension support

2022-10-27 Thread Palmer Dabbelt

On Thu, 27 Oct 2022 11:23:17 PDT (-0700), christoph.muell...@vrull.eu wrote:

On Thu, Oct 27, 2022 at 8:11 PM Christoph Muellner <
christoph.muell...@vrull.eu> wrote:


From: Christoph Muellner 

This patch adds support for the Zawrs ISA extension.
The patch depends on the corresponding Binutils patch
to be usable (see [1])

The specification can be found here:
https://github.com/riscv/riscv-zawrs/blob/main/zawrs.adoc

Note, that the Zawrs extension is not frozen or ratified yet.
Therefore this patch is an RFC and not intended to get merged.



Sorry, forgot to update this part:
The Zawrs extension is frozen but not ratified.
Let me know if I should send a v2 for this change of the commit msg.


IMO it's fine to just fix it up at commit time.  This LGTM, we just need 
the NEWS entry too.  I also don't see any build/test results.


Thanks!


Binutils support has been merged recently:

https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=eb668e50036e979fb0a74821df4eee0307b44e66




[1] https://sourceware.org/pipermail/binutils/2022-April/120559.html

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc: Add zawrs extension.
* config/riscv/riscv-opts.h (MASK_ZAWRS): New.
(TARGET_ZAWRS): New.
* config/riscv/riscv.opt: New.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/zawrs.c: New test.

Signed-off-by: Christoph Muellner 
---
 gcc/common/config/riscv/riscv-common.cc |  4 
 gcc/config/riscv/riscv-opts.h   |  3 +++
 gcc/config/riscv/riscv.opt  |  3 +++
 gcc/testsuite/gcc.target/riscv/zawrs.c  | 13 +
 4 files changed, 23 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/zawrs.c

diff --git a/gcc/common/config/riscv/riscv-common.cc
b/gcc/common/config/riscv/riscv-common.cc
index d6404a01205..4b7f777c103 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -163,6 +163,8 @@ static const struct riscv_ext_version
riscv_ext_version_table[] =
   {"zifencei", ISA_SPEC_CLASS_20191213, 2, 0},
   {"zifencei", ISA_SPEC_CLASS_20190608, 2, 0},

+  {"zawrs", ISA_SPEC_CLASS_NONE, 1, 0},
+
   {"zba", ISA_SPEC_CLASS_NONE, 1, 0},
   {"zbb", ISA_SPEC_CLASS_NONE, 1, 0},
   {"zbc", ISA_SPEC_CLASS_NONE, 1, 0},
@@ -1180,6 +1182,8 @@ static const riscv_ext_flag_table_t
riscv_ext_flag_table[] =
   {"zicsr",_options::x_riscv_zi_subext, MASK_ZICSR},
   {"zifencei", _options::x_riscv_zi_subext, MASK_ZIFENCEI},

+  {"zawrs", _options::x_riscv_za_subext, MASK_ZAWRS},
+
   {"zba",_options::x_riscv_zb_subext, MASK_ZBA},
   {"zbb",_options::x_riscv_zb_subext, MASK_ZBB},
   {"zbc",_options::x_riscv_zb_subext, MASK_ZBC},
diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
index 1dfe8c89209..25fd85b09b1 100644
--- a/gcc/config/riscv/riscv-opts.h
+++ b/gcc/config/riscv/riscv-opts.h
@@ -73,6 +73,9 @@ enum stack_protector_guard {
 #define TARGET_ZICSR((riscv_zi_subext & MASK_ZICSR) != 0)
 #define TARGET_ZIFENCEI ((riscv_zi_subext & MASK_ZIFENCEI) != 0)

+#define MASK_ZAWRS   (1 << 0)
+#define TARGET_ZAWRS ((riscv_za_subext & MASK_ZAWRS) != 0)
+
 #define MASK_ZBA  (1 << 0)
 #define MASK_ZBB  (1 << 1)
 #define MASK_ZBC  (1 << 2)
diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt
index 426ea95cd14..7c3ca48d1cc 100644
--- a/gcc/config/riscv/riscv.opt
+++ b/gcc/config/riscv/riscv.opt
@@ -203,6 +203,9 @@ long riscv_stack_protector_guard_offset = 0
 TargetVariable
 int riscv_zi_subext

+TargetVariable
+int riscv_za_subext
+
 TargetVariable
 int riscv_zb_subext

diff --git a/gcc/testsuite/gcc.target/riscv/zawrs.c
b/gcc/testsuite/gcc.target/riscv/zawrs.c
new file mode 100644
index 000..0b7e2662343
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zawrs.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zawrs" { target { rv64 } } } */
+/* { dg-options "-march=rv32gc_zawrs" { target { rv32 } } } */
+
+#ifndef __riscv_zawrs
+#error Feature macro not defined
+#endif
+
+int
+foo (int a)
+{
+  return a;
+}
--
2.37.3




Re: [PATCH v3] Re: OpenMP: Generate SIMD clones for functions with "declare target"

2022-10-27 Thread Sandra Loosemore

On 10/27/22 04:09, Thomas Schwinge wrote:

Hi!

On 2022-10-26T20:27:19-0600, Sandra Loosemore  wrote:

One of my test cases examines the .s output to make sure that the clones
are emitted as local symbols and not global.  I have not been able to
find the symbol linkage information in any of the dump files


Hmm, also some of '-fdump-ipa-all-details' doesn't help here?


Maybe I'm not looking at the right dump file, but all I see is names of 
functions in the dumps and nothing about symbol linkage/visibility, even 
with -details.



And/or, where you implement the logic to "make sure that the clones
are emitted as local symbols and not global", do emit some "tag" in the
dump file, and then scan for that?

Random examples that I just remembered:

'gcc/omp-offload.cc:execute_oacc_loop_designation' handling of
'OMP_CLAUSE_NOHOST', and how that's scanned (host-side) in test cases
such as 'libgomp/testsuite/libgomp.oacc-c-c++-common/routine-nohost-1.c',
'libgomp/testsuite/libgomp.oacc-fortran/routine-nohost-1.f90'.

'gcc/config/nvptx/nvptx.cc:nvptx_find_sese' doing
'fprintf (dump_file, "SESE regions:"); [...]', and that's scanned in:

 libgomp/testsuite/libgomp.oacc-c-c++-common/nvptx-sese-1.c-/* Match 
{N->N(.N)+} */
 libgomp/testsuite/libgomp.oacc-c-c++-common/nvptx-sese-1.c:/* { dg-final { scan-offload-rtl-dump 
"SESE regions:.* \[0-9\]+{\[0-9\]+->\[0-9\]+(\\.\[0-9\]+)+}" "mach" } } */

(You'd be doing this at the 'scan-offload-tree-dump[...]' level, I
suppose.)


I guess customizing the dump output from the simdclone pass with the 
information I need is the easiest solution.  I'm still concerned about 
getting adequate routine test coverage, though, when it's so specialized 
to a particular offload target.


Thanks for the help!  :-)

-Sandra


Re: [PATCH] libstdc++: Implement ranges::cartesian_product_view from P2374R4

2022-10-27 Thread Patrick Palka via Gcc-patches
On Thu, 27 Oct 2022, Patrick Palka wrote:

> This also implements the proposed resolutions of the tentatively ready
> LWG issues 3760 and 3761.
> 
> I'm not sure how/if we should implement the recommended practice of:
> 
>   difference_type should be the smallest signed-integer-like type that
>   is sufficiently wide to store the product of the maximum sizes of all
>   underlying ranges if such a type exists
> 
> because for e.g.
> 
>   extern std::vector x, y;
>   auto v = views::cartesian_product(x, y);
> 
> IIUC it'd mean difference_type should be __int128 or so, which seems
> quite wasteful: in practice the size of the cartesian product probably
> won't exceed the precision of say ptrdiff_t, and it's probably also not
> worth adding logic for using less precision than that either.  So this
> patch chooses defines difference_type as
> 
>   common_type_t<ptrdiff_t, range_difference_t<_First>,
>                 range_difference_t<_Vs>...>
> 
> which should mean it's at least as large as the difference_type of each
> underlying range, and at least as large as ptrdiff_t.  If overflow
> occurs due to this choice of difference_type, this patch has debug mode
> checks to catch this.
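
The trade-off described above can be sketched in isolation. The alias picks
a type at least as wide as `ptrdiff_t` and as every underlying difference
type, and the helper reports overflow instead of wrapping.
`product_difference_t` and `checked_mul` are hypothetical names for this
example; `__builtin_mul_overflow` is a GCC/Clang builtin, and the patch's
actual debug-mode checks may be written differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// A type at least as wide as ptrdiff_t and as each listed difference type.
template<typename... Diffs>
using product_difference_t = std::common_type_t<std::ptrdiff_t, Diffs...>;

// Narrower inputs never shrink the result below ptrdiff_t.
static_assert(std::is_same<product_difference_t<int, short>,
                           std::ptrdiff_t>::value,
              "ptrdiff_t is the floor");

// Multiply two sizes; return false (leaving OUT unspecified) on overflow.
template<typename T>
bool checked_mul(T a, T b, T& out) {
  return !__builtin_mul_overflow(a, b, &out);
}
```

A wider type such as `__int128` would avoid the overflow case entirely, which
is the cost/benefit question the recommended practice raises.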
> 
> Tested on x86_64-pc-linux-gnu, does this look OK for trunk?
> 
> libstdc++-v3/ChangeLog:
> 
>   * include/std/ranges (__maybe_const_t): New alias for
>   __detail::__maybe_const_t.
>   (__detail::__cartesian_product_is_random_access): Define.
>   (__detail::__cartesian_product_common_arg): Define.
>   (__detail::__cartesian_product_is_bidirectional): Define.
>   (__detail::__cartesian_product_is_common): Define.
>   (__detail::__cartesian_product_is_sized): Define.
>   (__detail::__cartesian_is_sized_sentinel): Define.
>   (__detail::__cartesian_common_arg_end): Define.
>   (cartesian_product_view): Define.
>   (cartesian_product_view::_Iterator): Define.
>   (views::__detail::__can_cartesian_product_view): Define.
>   (views::_Cartesian_product, views::cartesian_product): Define.
>   * testsuite/std/ranges/cartesian_product/1.cc: New test.
> ---
>  libstdc++-v3/include/std/ranges   | 500 ++
>  .../std/ranges/cartesian_product/1.cc | 162 ++
>  2 files changed, 662 insertions(+)
>  create mode 100644 libstdc++-v3/testsuite/std/ranges/cartesian_product/1.cc
> 
> diff --git a/libstdc++-v3/include/std/ranges b/libstdc++-v3/include/std/ranges
> index a55e9e7f760..771da97ed6d 100644
> --- a/libstdc++-v3/include/std/ranges
> +++ b/libstdc++-v3/include/std/ranges
> @@ -829,6 +829,9 @@ namespace __detail
>  
>  } // namespace __detail
>  
> +// Shorthand for __detail::__maybe_const_t.
> +using __detail::__maybe_const_t;
> +
>  namespace views::__adaptor
>  {
>// True if the range adaptor _Adaptor can be applied with _Args.
> @@ -7973,6 +7976,503 @@ namespace views::__adaptor
>  
>  inline constexpr _Stride stride;
>}
> +
> +  namespace __detail
> +  {
> +template
> +  concept __cartesian_product_is_random_access
> + = (random_access_range<__maybe_const_t<_Const, _First>>
> +&& ...
> +&& (random_access_range<__maybe_const_t<_Const, _Vs>>
> +&& sized_range<__maybe_const_t<_Const, _Vs>>));
> +
> +template
> +  concept __cartesian_product_common_arg = common_range<_Range>
> + || (sized_range<_Range> && random_access_range<_Range>);
> +
> +template
> +  concept __cartesian_product_is_bidirectional
> + = (bidirectional_range<__maybe_const_t<_Const, _First>>
> +&& ...
> +&& (bidirectional_range<__maybe_const_t<_Const, _Vs>>
> +&& __cartesian_product_common_arg<__maybe_const_t<_Const, 
> _Vs>>));
> +
> +template
> +  concept __cartesian_product_is_common = 
> __cartesian_product_common_arg<_First>;
> +
> +template
> +  concept __cartesian_product_is_sized = (sized_range<_Vs> && ...);
> +
> +template class FirstSent, typename 
> _First, typename... _Vs>
> +  concept __cartesian_is_sized_sentinel
> +  = (sized_sentinel_for>,
> + iterator_t<__maybe_const_t<_Const, _First>>>
> +  && ...
> +  && (sized_range<__maybe_const_t<_Const, _Vs>>
> +  && sized_sentinel_for>,
> +iterator_t<__maybe_const_t<_Const, _Vs>>>));
> +
> +template<__cartesian_product_common_arg _Range>
> +  constexpr auto
> +  __cartesian_common_arg_end(_Range& __r)
> +  {
> + if constexpr (common_range<_Range>)
> +   return ranges::end(__r);
> + else
> +   return ranges::begin(__r) + ranges::distance(__r);
> +  }
> +  } // namespace __detail
> +
> +  template
> +requires (view<_First> && ... && view<_Vs>)
> +  class cartesian_product_view : public 
> view_interface>
> +  {
> +tuple<_First, _Vs...> _M_bases;
> +
> +template class _Iterator;
> +
> +static auto
> +_S_difference_type()
> +{
> +  // TODO: Implement the recommended practice of using the smallest
> +  // 

Re: [PATCH] RISC-V: Add Zawrs ISA extension support

2022-10-27 Thread Christoph Müllner
On Thu, Oct 27, 2022 at 8:11 PM Christoph Muellner <
christoph.muell...@vrull.eu> wrote:

> From: Christoph Muellner 
>
> This patch adds support for the Zawrs ISA extension.
> The patch depends on the corresponding Binutils patch
> to be usable (see [1])
>
> The specification can be found here:
> https://github.com/riscv/riscv-zawrs/blob/main/zawrs.adoc
>
> Note, that the Zawrs extension is not frozen or ratified yet.
> Therefore this patch is an RFC and not intended to get merged.
>

Sorry, forgot to update this part:
The Zawrs extension is frozen but not ratified.
Let me know if I should send a v2 for this change of the commit msg.

Binutils support has been merged recently:

https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=eb668e50036e979fb0a74821df4eee0307b44e66


>
> [1] https://sourceware.org/pipermail/binutils/2022-April/120559.html
>
> gcc/ChangeLog:
>
> * common/config/riscv/riscv-common.cc: Add zawrs extension.
> * config/riscv/riscv-opts.h (MASK_ZAWRS): New.
> (TARGET_ZAWRS): New.
> * config/riscv/riscv.opt: New.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/zawrs.c: New test.
>
> Signed-off-by: Christoph Muellner 
> ---
>  gcc/common/config/riscv/riscv-common.cc |  4 
>  gcc/config/riscv/riscv-opts.h   |  3 +++
>  gcc/config/riscv/riscv.opt  |  3 +++
>  gcc/testsuite/gcc.target/riscv/zawrs.c  | 13 +
>  4 files changed, 23 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/zawrs.c
>
> diff --git a/gcc/common/config/riscv/riscv-common.cc
> b/gcc/common/config/riscv/riscv-common.cc
> index d6404a01205..4b7f777c103 100644
> --- a/gcc/common/config/riscv/riscv-common.cc
> +++ b/gcc/common/config/riscv/riscv-common.cc
> @@ -163,6 +163,8 @@ static const struct riscv_ext_version
> riscv_ext_version_table[] =
>{"zifencei", ISA_SPEC_CLASS_20191213, 2, 0},
>{"zifencei", ISA_SPEC_CLASS_20190608, 2, 0},
>
> +  {"zawrs", ISA_SPEC_CLASS_NONE, 1, 0},
> +
>{"zba", ISA_SPEC_CLASS_NONE, 1, 0},
>{"zbb", ISA_SPEC_CLASS_NONE, 1, 0},
>{"zbc", ISA_SPEC_CLASS_NONE, 1, 0},
> @@ -1180,6 +1182,8 @@ static const riscv_ext_flag_table_t
> riscv_ext_flag_table[] =
>{"zicsr",_options::x_riscv_zi_subext, MASK_ZICSR},
>{"zifencei", _options::x_riscv_zi_subext, MASK_ZIFENCEI},
>
> +  {"zawrs", _options::x_riscv_za_subext, MASK_ZAWRS},
> +
>{"zba",_options::x_riscv_zb_subext, MASK_ZBA},
>{"zbb",_options::x_riscv_zb_subext, MASK_ZBB},
>{"zbc",_options::x_riscv_zb_subext, MASK_ZBC},
> diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
> index 1dfe8c89209..25fd85b09b1 100644
> --- a/gcc/config/riscv/riscv-opts.h
> +++ b/gcc/config/riscv/riscv-opts.h
> @@ -73,6 +73,9 @@ enum stack_protector_guard {
>  #define TARGET_ZICSR((riscv_zi_subext & MASK_ZICSR) != 0)
>  #define TARGET_ZIFENCEI ((riscv_zi_subext & MASK_ZIFENCEI) != 0)
>
> +#define MASK_ZAWRS   (1 << 0)
> +#define TARGET_ZAWRS ((riscv_za_subext & MASK_ZAWRS) != 0)
> +
>  #define MASK_ZBA  (1 << 0)
>  #define MASK_ZBB  (1 << 1)
>  #define MASK_ZBC  (1 << 2)
> diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt
> index 426ea95cd14..7c3ca48d1cc 100644
> --- a/gcc/config/riscv/riscv.opt
> +++ b/gcc/config/riscv/riscv.opt
> @@ -203,6 +203,9 @@ long riscv_stack_protector_guard_offset = 0
>  TargetVariable
>  int riscv_zi_subext
>
> +TargetVariable
> +int riscv_za_subext
> +
>  TargetVariable
>  int riscv_zb_subext
>
> diff --git a/gcc/testsuite/gcc.target/riscv/zawrs.c
> b/gcc/testsuite/gcc.target/riscv/zawrs.c
> new file mode 100644
> index 000..0b7e2662343
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/zawrs.c
> @@ -0,0 +1,13 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gc_zawrs" { target { rv64 } } } */
> +/* { dg-options "-march=rv32gc_zawrs" { target { rv32 } } } */
> +
> +#ifndef __riscv_zawrs
> +#error Feature macro not defined
> +#endif
> +
> +int
> +foo (int a)
> +{
> +  return a;
> +}
> --
> 2.37.3
>
>


Re: [PATCH] lto-dump: modernize a bit

2022-10-27 Thread Richard Biener via Gcc-patches



> Am 27.10.2022 um 10:43 schrieb Martin Liška :
> 
> Hi.
> 
> Ready to be installed?

Ok

Richard 

> Thanks,
> Martin
> 
> gcc/lto/ChangeLog:
> 
>* lto-dump.cc (dump_list): Remove trailing return.
>(dump_symbol): Likewise.
>(dump_body): Filter name based on mangled name.
>(dump_tool_help): Use GIMPLE wording.
>(lto_main): Update wording.
> ---
> gcc/lto/lto-dump.cc | 19 +++
> 1 file changed, 7 insertions(+), 12 deletions(-)
> 
> diff --git a/gcc/lto/lto-dump.cc b/gcc/lto/lto-dump.cc
> index cb9782722a9..5c4dbf5d297 100644
> --- a/gcc/lto/lto-dump.cc
> +++ b/gcc/lto/lto-dump.cc
> @@ -227,7 +227,6 @@ void dump_list (void)
> {
>   dump_list_functions ();
>   dump_list_variables ();
> -  return;
> }
>  /* Dump specific variables and functions used in IL.  */
> @@ -243,7 +242,6 @@ void dump_symbol ()
>  printf ("\n");
>}
> }
> -  return;
> }
>  /* Dump specific gimple body of specified function.  */
> @@ -259,19 +257,17 @@ void dump_body ()
> return;
>   }
>   cgraph_node *cnode;
> -  FOR_EACH_FUNCTION (cnode)
> -if (cnode->definition
> -&& !cnode->alias
> -&& !strcmp (cnode->name (), flag_dump_body))
> +  FOR_EACH_DEFINED_FUNCTION (cnode)
> +if (!cnode->alias
> +&& !strcmp (cnode->asm_name (), flag_dump_body))
>   {
> -printf ("Gimple Body of Function: %s\n", cnode->name ());
> +printf ("GIMPLE body of function: %s\n\n", cnode->asm_name ());
>cnode->get_untransformed_body ();
>debug_function (cnode->decl, flags);
>flag = 1;
>   }
>   if (!flag)
> error_at (input_location, "Function not found.");
> -  return;
> }
>  /* List of command line options for dumping.  */
> @@ -292,13 +288,12 @@ void dump_tool_help ()
> "  -callgraphDump the callgraph in graphviz format.\n"
> "  -type-stats   Dump statistics of tree types.\n"
> "  -tree-stats   Dump statistics of trees.\n"
> -"  -gimple-stats Dump statistics of gimple statements.\n"
> -"  -dump-body=   Dump the specific gimple body.\n"
> +"  -gimple-stats Dump statistics of GIMPLE statements.\n"
> +"  -dump-body=   Dump the specific GIMPLE body.\n"
> "  -dump-level=  Deciding the optimization level of body.\n"
> "  -help Display the dump tool help.\n";
>fputs (msg, stdout);
> -  return;
> }
>  unsigned int
> @@ -365,7 +360,7 @@ lto_main (void)
>"%<--enable-gather-detailed-mem-stats%>.");
>   else
>{
> -  printf ("Tree Statistics\n");
> +  printf ("Tree statistics\n");
>  dump_tree_statistics ();
>}
> }
> -- 
> 2.38.0
> 


Re: [PATCH] Use simple_dce_from_worklist with match_simplify_replacement.

2022-10-27 Thread Richard Biener via Gcc-patches



> Am 27.10.2022 um 17:11 schrieb apinski--- via Gcc-patches 
> :
> 
> From: Andrew Pinski 
> 
> This is a simple patch to do some DCE after a successful
> match and simplify replacement in PHI-OPT. match and simplify
> likes to generate some extra statements which should be cleaned
> up.
> 
> OK? Bootstrapped and tested on x86_64-linux with no regressions.

Ok.

Richard 

> Thanks,
> Andrew Pinski
> 
> gcc/ChangeLog:
> 
>* tree-ssa-phiopt.cc: Include tree-ssa-dce.h
>(replace_phi_edge_with_variable):
>New argument, dce_ssa_names. Call simple_dce_from_worklist.
>(match_simplify_replacement): If we inserted a sequence,
>mark the lhs of the new sequence to be possible dce.
>Always move the statement and mark the lhs (if it is a name)
>as possible to remove.
> ---
> gcc/tree-ssa-phiopt.cc | 35 ++-
> 1 file changed, 26 insertions(+), 9 deletions(-)
> 
> diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
> index 925bd7d..996700b 100644
> --- a/gcc/tree-ssa-phiopt.cc
> +++ b/gcc/tree-ssa-phiopt.cc
> @@ -53,6 +53,7 @@ along with GCC; see the file COPYING3.  If not see
> #include "gimple-match.h"
> #include "dbgcnt.h"
> #include "tree-ssa-propagate.h"
> +#include "tree-ssa-dce.h"
> 
> static unsigned int tree_ssa_phiopt_worker (bool, bool, bool);
> static bool two_value_replacement (basic_block, basic_block, edge, gphi *,
> @@ -74,7 +75,6 @@ static bool cond_store_replacement (basic_block, 
> basic_block, edge, edge,
>hash_set *);
> static bool cond_if_else_store_replacement (basic_block, basic_block, 
> basic_block);
> static hash_set * get_non_trapping ();
> -static void replace_phi_edge_with_variable (basic_block, edge, gphi *, tree);
> static void hoist_adjacent_loads (basic_block, basic_block,
>  basic_block, basic_block);
> static bool gate_hoist_loads (void);
> @@ -402,7 +402,8 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool 
> do_hoist_loads, bool early_p)
> 
> static void
> replace_phi_edge_with_variable (basic_block cond_block,
> -edge e, gphi *phi, tree new_tree)
> +edge e, gphi *phi, tree new_tree,
> +bitmap dce_ssa_names = auto_bitmap())
> {
>   basic_block bb = gimple_bb (phi);
>   gimple_stmt_iterator gsi;
> @@ -477,6 +478,8 @@ replace_phi_edge_with_variable (basic_block cond_block,
>gimple_cond_make_true (cond);
> }
> 
> +  simple_dce_from_worklist (dce_ssa_names);
> +
>   statistics_counter_event (cfun, "Replace PHI with variable", 1);
> 
>   if (dump_file && (dump_flags & TDF_DETAILS))
> @@ -986,6 +989,7 @@ match_simplify_replacement (basic_block cond_bb, 
> basic_block middle_bb,
>   gimple_seq seq = NULL;
>   tree result;
>   gimple *stmt_to_move = NULL;
> +  auto_bitmap inserted_exprs;
> 
>   /* Special case A ? B : B as this will always simplify to B. */
>   if (operand_equal_for_phi_arg_p (arg0, arg1))
> @@ -1060,14 +1064,22 @@ match_simplify_replacement (basic_block cond_bb, 
> basic_block middle_bb,
>   gsi = gsi_last_bb (cond_bb);
>   /* Insert the sequence generated from gimple_simplify_phiopt.  */
>   if (seq)
> +{
> +  // Mark the lhs of the new statements maybe for dce
> +  gimple_stmt_iterator gsi1 = gsi_start (seq);
> +  for (; !gsi_end_p (gsi1); gsi_next ())
> +{
> +  gimple *stmt = gsi_stmt (gsi1);
> +  tree name = gimple_get_lhs (stmt);
> +  if (name && TREE_CODE (name) == SSA_NAME)
> +bitmap_set_bit (inserted_exprs, SSA_NAME_VERSION (name));
> +}
> gsi_insert_seq_before (, seq, GSI_CONTINUE_LINKING);
> +  }
> 
> -  /* If there was a statement to move and the result of the statement
> - is going to be used, move it to right before the original
> - conditional.  */
> -  if (stmt_to_move
> -  && (gimple_assign_lhs (stmt_to_move) == result
> -  || !has_single_use (gimple_assign_lhs (stmt_to_move
> +  /* If there was a statement to move, move it to right before
> + the original conditional.  */
> +  if (stmt_to_move)
> {
>   if (dump_file && (dump_flags & TDF_DETAILS))
>{
> @@ -1075,12 +1087,17 @@ match_simplify_replacement (basic_block cond_bb, 
> basic_block middle_bb,
>  print_gimple_stmt (dump_file, stmt_to_move, 0,
>   TDF_VOPS|TDF_MEMSYMS);
>}
> +
> +  tree name = gimple_get_lhs (stmt_to_move);
> +  // Mark the name to be renamed if there is one.
> +  if (name && TREE_CODE (name) == SSA_NAME)
> +bitmap_set_bit (inserted_exprs, SSA_NAME_VERSION (name));
>   gimple_stmt_iterator gsi1 = gsi_for_stmt (stmt_to_move);
>   gsi_move_before (, );
>   reset_flow_sensitive_info (gimple_assign_lhs (stmt_to_move));
> }
> 
> -  replace_phi_edge_with_variable (cond_bb, e1, phi, result);
> +  replace_phi_edge_with_variable (cond_bb, e1, phi, result, inserted_exprs);
> 
>   /* Add Statistic here even though replace_phi_edge_with_variable already
>  does it as we want to be able to count when 

[PATCH] RISC-V: Add Zawrs ISA extension support

2022-10-27 Thread Christoph Muellner
From: Christoph Muellner 

This patch adds support for the Zawrs ISA extension.
The patch depends on the corresponding Binutils patch
to be usable (see [1]).

The specification can be found here:
https://github.com/riscv/riscv-zawrs/blob/main/zawrs.adoc

Note that the Zawrs extension is not frozen or ratified yet.
Therefore this patch is an RFC and is not intended to be merged.

[1] https://sourceware.org/pipermail/binutils/2022-April/120559.html

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc: Add zawrs extension.
* config/riscv/riscv-opts.h (MASK_ZAWRS): New.
(TARGET_ZAWRS): New.
* config/riscv/riscv.opt: New.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/zawrs.c: New test.

Signed-off-by: Christoph Muellner 
---
 gcc/common/config/riscv/riscv-common.cc |  4 
 gcc/config/riscv/riscv-opts.h   |  3 +++
 gcc/config/riscv/riscv.opt  |  3 +++
 gcc/testsuite/gcc.target/riscv/zawrs.c  | 13 +
 4 files changed, 23 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/zawrs.c

diff --git a/gcc/common/config/riscv/riscv-common.cc 
b/gcc/common/config/riscv/riscv-common.cc
index d6404a01205..4b7f777c103 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -163,6 +163,8 @@ static const struct riscv_ext_version 
riscv_ext_version_table[] =
   {"zifencei", ISA_SPEC_CLASS_20191213, 2, 0},
   {"zifencei", ISA_SPEC_CLASS_20190608, 2, 0},
 
+  {"zawrs", ISA_SPEC_CLASS_NONE, 1, 0},
+
   {"zba", ISA_SPEC_CLASS_NONE, 1, 0},
   {"zbb", ISA_SPEC_CLASS_NONE, 1, 0},
   {"zbc", ISA_SPEC_CLASS_NONE, 1, 0},
@@ -1180,6 +1182,8 @@ static const riscv_ext_flag_table_t riscv_ext_flag_table[] =
   {"zicsr", &gcc_options::x_riscv_zi_subext, MASK_ZICSR},
   {"zifencei", &gcc_options::x_riscv_zi_subext, MASK_ZIFENCEI},
 
+  {"zawrs", &gcc_options::x_riscv_za_subext, MASK_ZAWRS},
+
   {"zba", &gcc_options::x_riscv_zb_subext, MASK_ZBA},
   {"zbb", &gcc_options::x_riscv_zb_subext, MASK_ZBB},
   {"zbc", &gcc_options::x_riscv_zb_subext, MASK_ZBC},
diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
index 1dfe8c89209..25fd85b09b1 100644
--- a/gcc/config/riscv/riscv-opts.h
+++ b/gcc/config/riscv/riscv-opts.h
@@ -73,6 +73,9 @@ enum stack_protector_guard {
 #define TARGET_ZICSR((riscv_zi_subext & MASK_ZICSR) != 0)
 #define TARGET_ZIFENCEI ((riscv_zi_subext & MASK_ZIFENCEI) != 0)
 
+#define MASK_ZAWRS   (1 << 0)
+#define TARGET_ZAWRS ((riscv_za_subext & MASK_ZAWRS) != 0)
+
 #define MASK_ZBA  (1 << 0)
 #define MASK_ZBB  (1 << 1)
 #define MASK_ZBC  (1 << 2)
diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt
index 426ea95cd14..7c3ca48d1cc 100644
--- a/gcc/config/riscv/riscv.opt
+++ b/gcc/config/riscv/riscv.opt
@@ -203,6 +203,9 @@ long riscv_stack_protector_guard_offset = 0
 TargetVariable
 int riscv_zi_subext
 
+TargetVariable
+int riscv_za_subext
+
 TargetVariable
 int riscv_zb_subext
 
diff --git a/gcc/testsuite/gcc.target/riscv/zawrs.c 
b/gcc/testsuite/gcc.target/riscv/zawrs.c
new file mode 100644
index 000..0b7e2662343
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zawrs.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zawrs" { target { rv64 } } } */
+/* { dg-options "-march=rv32gc_zawrs" { target { rv32 } } } */
+
+#ifndef __riscv_zawrs
+#error Feature macro not defined
+#endif
+
+int
+foo (int a)
+{
+  return a;
+}
-- 
2.37.3



[PATCH] c++: libcpp: Support raw strings with newlines in directives [PR55971]

2022-10-27 Thread Lewis Hyatt via Gcc-patches
Hello-

May I please ask for a review of this patch from June? I realize it's a
10-year-old PR that doesn't seem to be bothering people much, but I still feel
like it's an unfortunate gap in C++11 support that is not hard to fix.

Original submission is here:
https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596820.html

I have attached a simplified version here: all the _Pragma-related changes
have been removed, and I will handle them in a later patch instead. I also
removed the changes to c-ppoutput.cc that I realized were not needed after
all. Bootstrap+regtest of all languages on x86-64 Linux still looks good.
Thanks!

-Lewis

-- >8 --

It's not currently possible to use a C++11 raw string containing a newline as
part of the definition of a macro, or in any other preprocessing directive,
such as:

 #define X R"(two
lines)"

 #error R"(this error has
two lines)"

Add support for that by relaxing the conditions under which
_cpp_get_fresh_line() refuses to get a new line. For the case of lexing a raw
string, it's OK to do so as long as there is another line within the current
buffer. The code in _cpp_get_fresh_line() was refactored into a new function
get_fresh_line_impl(), so that the new logic is applied only when processing a
raw string and not at any other time.
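
As a rough illustration of the relaxed condition described above, here is a
toy model in C++. It is not the libcpp code: toy_buffer, lex_context,
toy_may_get_fresh_line and toy_lex_raw_string are hypothetical names, and the
normal (non-directive) case is simplified to the same "is there another line
in this buffer" check.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a preprocessor buffer: the lines that have not
// been consumed yet.
struct toy_buffer {
  std::vector<std::string> lines;
  std::size_t next_line = 0;
  bool has_next_line() const { return next_line < lines.size(); }
};

enum class lex_context { normal, directive, raw_string_in_directive };

// A directive normally ends at the end of its line, so no fresh line is
// handed out.  While lexing a raw string inside a directive, a fresh line is
// allowed as long as the current buffer still has one.
bool toy_may_get_fresh_line(const toy_buffer &buf, lex_context ctx) {
  switch (ctx) {
    case lex_context::directive:
      return false;
    case lex_context::raw_string_in_directive:
    case lex_context::normal:  // simplified; the real logic does more here
      return buf.has_next_line();
  }
  return false;
}

// Collect the body of a raw string that may span several lines.  TEXT is the
// rest of the line after the opening delimiter.  Returns false if the
// terminator is never found.
bool toy_lex_raw_string(toy_buffer &buf, lex_context ctx, std::string text,
                        const std::string &terminator, std::string &out) {
  assert(!terminator.empty());
  out.clear();
  for (;;) {
    std::size_t pos = text.find(terminator);
    if (pos != std::string::npos) {
      out += text.substr(0, pos);
      return true;
    }
    out += text;
    if (!toy_may_get_fresh_line(buf, ctx))
      return false;
    out += '\n';
    text = buf.lines[buf.next_line++];
  }
}
```

With the second line of the #define example above in the buffer, the
raw_string_in_directive context yields the two-line body, while the plain
directive context stops at the end of the first line.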

libcpp/ChangeLog:

PR preprocessor/55971
* lex.cc (get_fresh_line_impl): New function refactoring the code
from...
(_cpp_get_fresh_line): ...here.
(lex_raw_string): Use the new version of get_fresh_line_impl() to
support raw strings containing new lines when processing a directive.

gcc/testsuite/ChangeLog:

PR preprocessor/55971
* c-c++-common/raw-string-directive-1.c: New test.
* c-c++-common/raw-string-directive-2.c: New test.

gcc/c-family/ChangeLog:

PR preprocessor/55971
* c-ppoutput.cc (account_for_newlines): Update comment.
---
 gcc/c-family/c-ppoutput.cc| 10 ++-
 .../c-c++-common/raw-string-directive-1.c | 74 +++
 .../c-c++-common/raw-string-directive-2.c | 33 +
 libcpp/lex.cc | 41 +++---
 4 files changed, 148 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/raw-string-directive-1.c
 create mode 100644 gcc/testsuite/c-c++-common/raw-string-directive-2.c

diff --git a/gcc/c-family/c-ppoutput.cc b/gcc/c-family/c-ppoutput.cc
index a99d9e9c5ca..6e054358e9e 100644
--- a/gcc/c-family/c-ppoutput.cc
+++ b/gcc/c-family/c-ppoutput.cc
@@ -433,7 +433,15 @@ scan_translation_unit_directives_only (cpp_reader *pfile)
 lang_hooks.preprocess_token (pfile, NULL, streamer.filter);
 }
 
-/* Adjust print.src_line for newlines embedded in output.  */
+/* Adjust print.src_line for newlines embedded in output.  For example, if a 
raw
+   string literal contains newlines, then we need to increment our notion of 
the
+   current line to keep in sync and avoid outputting a line marker
+   unnecessarily.  If a raw string literal containing newlines is the result of
+   macro expansion, then we have the opposite problem, where the token takes up
+   more lines in the output than it did in the input, and hence a line marker 
is
+   needed to restore the correct state for subsequent lines.  In this case,
+   incrementing print.src_line still does the job, because it will cause us to
+   emit the line marker the next time a token is streamed.  */
 static void
 account_for_newlines (const unsigned char *str, size_t len)
 {
diff --git a/gcc/testsuite/c-c++-common/raw-string-directive-1.c 
b/gcc/testsuite/c-c++-common/raw-string-directive-1.c
new file mode 100644
index 000..d6525e107bc
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/raw-string-directive-1.c
@@ -0,0 +1,74 @@
+/* { dg-do compile } */
+/* { dg-options "-std=gnu99" { target c } } */
+/* { dg-options "-std=c++11" { target c++ } } */
+
+/* Test that multi-line raw strings are lexed OK for all preprocessing
+   directives where one could appear. Test raw-string-directive-2.c
+   checks that #define is also processed properly.  */
+
+/* Note that in cases where we cause GCC to produce a multi-line error
+   message, we construct the string so that the second line looks enough
+   like an error message for DejaGNU to process it as such, so that we
+   can use dg-warning or dg-error directives to check for it.  */
+
+#warning R"delim(line1 /* { dg-warning "line1" } */
+file:15:1: warning: line2)delim" /* { dg-warning "line2" } */
+
+#error R"delim(line3 /* { dg-error "line3" } */
+file:18:1: error: line4)delim" /* { dg-error "line4" } */
+
+#define X1 R"(line 5
+line 6
+line 7
+line 8
+/*
+//
+line 9)" R"delim(
+line10)delim"
+
+#define X2(a) X1 #a R"(line 11
+/*
+line12
+)"
+
+#if R"(line 13 /* { dg-error "line13" } */
+file:35:1: error: line14)" /* { dg-error "line14\\)\"\" is not valid" } */
+#endif R"(line 15 /* { dg-warning "extra tokens at end of #endif" } */
+\
+line16)" ""
+
+#ifdef 

[PATCH] libstdc++: Implement ranges::cartesian_product_view from P2374R4

2022-10-27 Thread Patrick Palka via Gcc-patches
This also implements the proposed resolutions of the tentatively ready
LWG issues 3760 and 3761.

I'm not sure how/if we should implement the recommended practice of:

  difference_type should be the smallest signed-integer-like type that
  is sufficiently wide to store the product of the maximum sizes of all
  underlying ranges if such a type exists

because for e.g.

  extern std::vector x, y;
  auto v = views::cartesian_product(x, y);

IIUC it'd mean difference_type should be __int128 or so, which seems
quite wasteful: in practice the size of the cartesian product probably
won't exceed the precision of say ptrdiff_t, and it's probably also not
worth adding logic for using less precision than that either.  So this
patch defines difference_type as

  common_type_t<ptrdiff_t, range_difference_t<_First>,
		range_difference_t<_Vs>...>

which should mean it's at least as large as the difference_type of each
underlying range, and at least as large as ptrdiff_t.  If overflow
occurs due to this choice of difference_type, this patch has debug mode
checks to catch this.
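
A minimal sketch of that choice, for illustration only.  It is not the
libstdc++ code: toy_difference_t is a hypothetical stand-in for
range_difference_t that just reads the container's difference_type, and
toy_checked_mul is an assumed shape for an overflow check, not the debug mode
check the patch actually adds.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <list>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for std::ranges::range_difference_t.
template <typename R>
using toy_difference_t = typename R::difference_type;

// Common type of ptrdiff_t and every underlying difference type.
template <typename First, typename... Vs>
using toy_product_difference_t =
    std::common_type_t<std::ptrdiff_t, toy_difference_t<First>,
                       toy_difference_t<Vs>...>;

static_assert(
    std::is_same<toy_product_difference_t<std::vector<int>, std::list<char>>,
                 std::ptrdiff_t>::value,
    "two standard containers give ptrdiff_t");

// Multiply two non-negative sizes; report failure instead of overflowing.
template <typename D>
bool toy_checked_mul(D a, D b, D &out) {
  assert(a >= 0 && b >= 0);
  if (a != 0 && b > std::numeric_limits<D>::max() / a)
    return false;
  out = a * b;
  return true;
}
```

The size of a product view is the product of the underlying sizes, so a check
of this kind is where an overflow caused by the narrower difference_type
would show up.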

Tested on x86_64-pc-linux-gnu, does this look OK for trunk?

libstdc++-v3/ChangeLog:

* include/std/ranges (__maybe_const_t): New alias for
__detail::__maybe_const_t.
(__detail::__cartesian_product_is_random_access): Define.
(__detail::__cartesian_product_common_arg): Define.
(__detail::__cartesian_product_is_bidirectional): Define.
(__detail::__cartesian_product_is_common): Define.
(__detail::__cartesian_product_is_sized): Define.
(__detail::__cartesian_is_sized_sentinel): Define.
(__detail::__cartesian_common_arg_end): Define.
(cartesian_product_view): Define.
(cartesian_product_view::_Iterator): Define.
(views::__detail::__can_cartesian_product_view): Define.
(views::_Cartesian_product, views::cartesian_product): Define.
* testsuite/std/ranges/cartesian_product/1.cc: New test.
---
 libstdc++-v3/include/std/ranges   | 500 ++
 .../std/ranges/cartesian_product/1.cc | 162 ++
 2 files changed, 662 insertions(+)
 create mode 100644 libstdc++-v3/testsuite/std/ranges/cartesian_product/1.cc

diff --git a/libstdc++-v3/include/std/ranges b/libstdc++-v3/include/std/ranges
index a55e9e7f760..771da97ed6d 100644
--- a/libstdc++-v3/include/std/ranges
+++ b/libstdc++-v3/include/std/ranges
@@ -829,6 +829,9 @@ namespace __detail
 
 } // namespace __detail
 
+// Shorthand for __detail::__maybe_const_t.
+using __detail::__maybe_const_t;
+
 namespace views::__adaptor
 {
   // True if the range adaptor _Adaptor can be applied with _Args.
@@ -7973,6 +7976,503 @@ namespace views::__adaptor
 
 inline constexpr _Stride stride;
   }
+
+  namespace __detail
+  {
+template
+  concept __cartesian_product_is_random_access
+   = (random_access_range<__maybe_const_t<_Const, _First>>
+  && ...
+  && (random_access_range<__maybe_const_t<_Const, _Vs>>
+  && sized_range<__maybe_const_t<_Const, _Vs>>));
+
+template
+  concept __cartesian_product_common_arg = common_range<_Range>
+   || (sized_range<_Range> && random_access_range<_Range>);
+
+template
+  concept __cartesian_product_is_bidirectional
+   = (bidirectional_range<__maybe_const_t<_Const, _First>>
+  && ...
+  && (bidirectional_range<__maybe_const_t<_Const, _Vs>>
+  && __cartesian_product_common_arg<__maybe_const_t<_Const, 
_Vs>>));
+
+template
+  concept __cartesian_product_is_common = 
__cartesian_product_common_arg<_First>;
+
+template
+  concept __cartesian_product_is_sized = (sized_range<_Vs> && ...);
+
+template class FirstSent, typename _First, 
typename... _Vs>
+  concept __cartesian_is_sized_sentinel
+  = (sized_sentinel_for>,
+   iterator_t<__maybe_const_t<_Const, _First>>>
+&& ...
+&& (sized_range<__maybe_const_t<_Const, _Vs>>
+&& sized_sentinel_for>,
+  iterator_t<__maybe_const_t<_Const, _Vs>>>));
+
+template<__cartesian_product_common_arg _Range>
+  constexpr auto
+  __cartesian_common_arg_end(_Range& __r)
+  {
+   if constexpr (common_range<_Range>)
+ return ranges::end(__r);
+   else
+ return ranges::begin(__r) + ranges::distance(__r);
+  }
+  } // namespace __detail
+
+  template
+requires (view<_First> && ... && view<_Vs>)
+  class cartesian_product_view : public 
view_interface>
+  {
+tuple<_First, _Vs...> _M_bases;
+
+template class _Iterator;
+
+static auto
+_S_difference_type()
+{
+  // TODO: Implement the recommended practice of using the smallest
+  // sufficiently wide type according to the maximum sizes of the
+  // underlying ranges?
+  return common_type_t,
+  range_difference_t<_Vs>...>{};
+}
+
+  public:
+cartesian_product_view() = default;
+
+constexpr 

c++: Templated lambda mangling

2022-10-27 Thread Nathan Sidwell via Gcc-patches

(Explicitly) Templated lambdas have a different signature to
implicitly templated lambdas -- '[]<typename T> (T) {}' is not the
same as '[](auto) {}'.  This should be reflected in the mangling.  The
ABI captures this as
https://github.com/itanium-cxx-abi/cxx-abi/issues/31, and clang has
implemented such additions.

It's relatively straightforward to write out the non-synthetic
template parms, and note if we need to issue an ABI warning.
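
To make the encoding concrete, here is a toy model of the template-head
production the patch emits: "Tp" before a pack, then "Ty" for a type
parameter, "Tn" plus the mangled type for a non-type parameter, and "Tt" ...
"E" around a nested head for a template template parameter.  The names
parm_kind, toy_template_parm and toy_template_head are hypothetical; this is
a sketch of the encoding, not the mangle.cc implementation, and it takes
already-mangled type strings as input.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class parm_kind { type, non_type, template_template };

// Hypothetical description of one explicit template parameter.
struct toy_template_parm {
  parm_kind kind;
  bool is_pack;
  std::string mangled_type;              // used by non_type only
  std::vector<toy_template_parm> inner;  // used by template_template only
};

// Emit the template head for a list of explicit (non-synthetic) parameters.
std::string toy_template_head(const std::vector<toy_template_parm> &parms) {
  std::string out;
  for (const toy_template_parm &p : parms) {
    if (p.is_pack)
      out += "Tp";
    switch (p.kind) {
      case parm_kind::type:
        out += "Ty";
        break;
      case parm_kind::non_type:
        assert(!p.mangled_type.empty());
        out += "Tn";
        out += p.mangled_type;
        break;
      case parm_kind::template_template:
        out += "Tt";
        out += toy_template_head(p.inner);
        out += "E";
        break;
    }
  }
  return out;
}
```

Under this model '[]<typename T> (T) {}' contributes "Ty" after "Ul", while
'[](auto) {}' contributes nothing because its only template parameter is
synthetic.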

I did find a couple of bugs on the way -- one is a failure to parse the pack
expansion in:

  inline auto l_var2 = [] (int (&...)[I]) {}

the other was clang miscounting substitutions,
https://github.com/llvm/llvm-project/issues/58631. It's always a good idea to
have multiple implementations :)

The remaining change to do is lambda sequence numbering, as that is affected by
lambda signature.


nathan

--
Nathan Sidwell
From bf6e972b65c56c615682f712f785d0f0541ac77b Mon Sep 17 00:00:00 2001
From: Nathan Sidwell 
Date: Mon, 24 Oct 2022 17:39:55 -0400
Subject: [PATCH] c++: Templated lambda mangling

(Explicitly) Templated lambdas have a different signature to
implicitly templated lambdas -- '[]<typename T> (T) {}' is not the
same as '[](auto) {}'.  This should be reflected in the mangling.  The
ABI captures this as
https://github.com/itanium-cxx-abi/cxx-abi/issues/31, and clang has
implemented such additions.

It's relatively straightforward to write out the non-synthetic
template parms, and note if we need to issue an ABI warning.

	gcc/cp/
	* mangle.cc (write_closure_template_head): New.
	(write_closure_type_name): Call it.
	gcc/testsuite/
	* g++.dg/abi/lambda-ctx1-18.C: Adjust.
	* g++.dg/abi/lambda-ctx1-18vs17.C: Adjust.
	* g++.dg/abi/lambda-tpl1-17.C: New.
	* g++.dg/abi/lambda-tpl1-18.C: New.
	* g++.dg/abi/lambda-tpl1-18vs17.C: New.
	* g++.dg/abi/lambda-tpl1.h: New.
---
 gcc/cp/mangle.cc  | 68 +++
 gcc/testsuite/g++.dg/abi/lambda-ctx1-18.C |  4 +-
 gcc/testsuite/g++.dg/abi/lambda-ctx1-18vs17.C |  4 +-
 gcc/testsuite/g++.dg/abi/lambda-tpl1-17.C | 20 ++
 gcc/testsuite/g++.dg/abi/lambda-tpl1-18.C | 25 +++
 gcc/testsuite/g++.dg/abi/lambda-tpl1-18vs17.C | 16 +
 gcc/testsuite/g++.dg/abi/lambda-tpl1.h| 59 
 7 files changed, 192 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/abi/lambda-tpl1-17.C
 create mode 100644 gcc/testsuite/g++.dg/abi/lambda-tpl1-18.C
 create mode 100644 gcc/testsuite/g++.dg/abi/lambda-tpl1-18vs17.C
 create mode 100644 gcc/testsuite/g++.dg/abi/lambda-tpl1.h

diff --git a/gcc/cp/mangle.cc b/gcc/cp/mangle.cc
index 1215463089b..e39621876ef 100644
--- a/gcc/cp/mangle.cc
+++ b/gcc/cp/mangle.cc
@@ -1727,6 +1727,66 @@ write_unnamed_type_name (const tree type)
   write_compact_number (discriminator);
 }
 
+// A template head, for templated lambdas.
+// <template-head> ::=   Tp* Ty
+//                       Tp* Tn <type>
+//                       Tp* Tt <template-head> E
+// New in ABI=18. Returns true iff we emitted anything -- used for ABI
+// version warning.
+
+static bool
+write_closure_template_head (tree tmpl)
+{
+  bool any = false;
+
+  // We only need one level of template parms
+  tree inner = INNERMOST_TEMPLATE_PARMS (DECL_TEMPLATE_PARMS (tmpl));
+
+  for (int ix = 0, len = TREE_VEC_LENGTH (inner); ix != len; ix++)
+{
+  tree parm = TREE_VEC_ELT (inner, ix);
+  if (parm == error_mark_node)
+	continue;
+  parm = TREE_VALUE (parm);
+
+  if (DECL_VIRTUAL_P (parm))
+	// A synthetic parm, we're done.
+	break;
+
+  any = true;
+  if (abi_version_at_least (18))
+	{
+	  if (TREE_CODE (parm) == PARM_DECL
+	  ? TEMPLATE_PARM_PARAMETER_PACK (DECL_INITIAL (parm))
+	  : TEMPLATE_TYPE_PARAMETER_PACK (TREE_TYPE (parm)))
+	write_string ("Tp");
+
+	  switch (TREE_CODE (parm))
+	{
+	default:
+	  gcc_unreachable ();
+
+	case TYPE_DECL:
+	  write_string ("Ty");
+	  break;
+
+	case PARM_DECL:
+	  write_string ("Tn");
+	  write_type (TREE_TYPE (parm));
+	  break;
+
+	case TEMPLATE_DECL:
+	  write_string ("Tt");
+	  write_closure_template_head (parm);
+	  write_string ("E");
+	  break;
+	}
+	}
+}
+
+  return any;
+}
+
 /* <closure-type-name> ::= Ul <lambda-sig> E [ <nonnegative number> ] _
    <lambda-sig> ::= <parameter type>+  # Parameter types or "v" if the lambda has no parameters */
 
@@ -1740,6 +1800,14 @@ write_closure_type_name (const tree type)
   MANGLE_TRACE_TREE ("closure-type-name", type);
 
   write_string ("Ul");
+
+  if (auto ti = maybe_template_info (fn))
+if (write_closure_template_head (TI_TEMPLATE (ti)))
+  // If there were any explicit template parms, we may need to
+  // issue a mangling diagnostic.
+  if (abi_warn_or_compat_version_crosses (18))
+	G.need_abi_warning = true;
+
   write_method_parms (parms, /*method_p=*/1, fn);
   write_char ('E');
   write_compact_number (LAMBDA_EXPR_DISCRIMINATOR (lambda));
diff --git a/gcc/testsuite/g++.dg/abi/lambda-ctx1-18.C b/gcc/testsuite/g++.dg/abi/lambda-ctx1-18.C
index c1c9e274d7f..3dd68a4bed2 100644
--- 

[PATCH] c++: -Wdangling-reference and system headers

2022-10-27 Thread Marek Polacek via Gcc-patches
I got this testcase:

  auto f() -> std::optional<std::string>;
  for (char c : f().value()) { }

which has a dangling reference: std::optional<std::string>::value returns
a reference to the contained value, but here the object that contains it is
the f() temporary.
We warn, which is great, but only with -Wsystem-headers, because
the function comes from a system header and warning_enabled_at, used
in do_warn_dangling_reference, checks diagnostic_report_warnings_p,
which in this case returned false, so we didn't warn.
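
For readers who want to see the pattern outside the testsuite, here is a
small sketch in the C++17 mode the new test uses.  The body of f() and the
helper count_chars_safely() are made up for illustration; the testcase only
declares f().  The dangling form is kept in a comment so that nothing here
actually reads a destroyed object.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical definition; the testcase only declares f().
std::optional<std::string> f() { return std::string("abc"); }

// Dangling form from the testcase.  value() returns a reference into the
// temporary optional, and that temporary is gone before the loop body runs:
//
//   for (char c : f().value()) { }

// Safe form: give the optional a name so it outlives the loop.
std::size_t count_chars_safely() {
  auto opt = f();
  assert(opt.has_value());
  std::size_t n = 0;
  for (char c : opt.value()) {
    (void) c;
    ++n;
  }
  return n;
}
```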

Fixed as below.  I could also override dc_warn_system_headers so that
the warning is enabled in system headers always.  With that, I found one
issue in libstdc++:

libstdc++-v3/include/bits/fs_path.h:1265:15: warning: possibly dangling 
reference to a temporary [-Wdangling-reference]
 1265 | auto& __last = *--end();
  |   ^~

which looks like a true positive as well.
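
The fix relies on overriding a flag in the diagnostic context only for the
duration of the check.  A minimal sketch of that idea follows; scoped_override,
toy_diag_context, toy_should_warn and toy_check are hypothetical names used
for illustration and are not GCC's make_temp_override or its diagnostic
machinery.

```cpp
#include <cassert>

// Set TARGET to VALUE now and restore the old value on scope exit.
template <typename T>
class scoped_override {
 public:
  scoped_override(T &target, T value) : target_(target), saved_(target) {
    target_ = value;
  }
  ~scoped_override() { target_ = saved_; }
  scoped_override(const scoped_override &) = delete;
  scoped_override &operator=(const scoped_override &) = delete;

 private:
  T &target_;
  T saved_;
};

// Hypothetical stand-in for the diagnostic context.
struct toy_diag_context {
  bool warn_system_headers = false;
};

// Toy version of the "is this warning enabled here" query: warnings located
// in system headers are dropped unless the flag is set.
bool toy_should_warn(const toy_diag_context &dc, bool call_in_system_header) {
  return !call_in_system_header || dc.warn_system_headers;
}

// Warn if the declaration is user code, even when the call that produced the
// reference is located in a system header.
bool toy_check(toy_diag_context &dc, bool decl_in_system_header,
               bool call_in_system_header) {
  const bool saved = dc.warn_system_headers;
  bool result;
  {
    scoped_override<bool> guard(
        dc.warn_system_headers,
        !decl_in_system_header || dc.warn_system_headers);
    result = toy_should_warn(dc, call_in_system_header);
  }
  assert(dc.warn_system_headers == saved);  // restored on scope exit
  return result;
}
```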

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

gcc/cp/ChangeLog:

* call.cc (maybe_warn_dangling_reference): Enable the warning in
system headers if the decl isn't in a system header.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wdangling-reference4.C: New test.
---
 gcc/cp/call.cc   |  7 +++
 gcc/testsuite/g++.dg/warn/Wdangling-reference4.C | 14 ++
 2 files changed, 21 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/warn/Wdangling-reference4.C

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index 951b9fd2a88..c7c7a122045 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -13539,6 +13539,13 @@ maybe_warn_dangling_reference (const_tree decl, tree 
init)
 return;
   if (!TYPE_REF_P (TREE_TYPE (decl)))
 return;
+  /* Don't suppress the diagnostic just because the call comes from
+ a system header.  If the DECL is not in a system header, or if
+ -Wsystem-headers was provided, warn.  */
+  auto wsh
+= make_temp_override (global_dc->dc_warn_system_headers,
+ (!in_system_header_at (DECL_SOURCE_LOCATION (decl))
+  || global_dc->dc_warn_system_headers));
   if (tree call = do_warn_dangling_reference (init))
 {
   auto_diagnostic_group d;
diff --git a/gcc/testsuite/g++.dg/warn/Wdangling-reference4.C 
b/gcc/testsuite/g++.dg/warn/Wdangling-reference4.C
new file mode 100644
index 000..aee7a29019b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wdangling-reference4.C
@@ -0,0 +1,14 @@
+// { dg-do compile { target c++17 } }
+// { dg-options "-Wdangling-reference" }
+// Check that we warn here even without -Wsystem-headers.
+
+#include <optional>
+#include <string>
+
+auto f() -> std::optional<std::string>;
+
+void
+g ()
+{
+  for (char c : f().value()) { (void) c; } // { dg-warning "dangling reference" }
+}

base-commit: f95d3d5de72a1c43e8d529bad3ef59afc3214705
-- 
2.37.3



[PATCH] Use simple_dce_from_worklist with match_simplify_replacement.

2022-10-27 Thread apinski--- via Gcc-patches
From: Andrew Pinski 

This is a simple patch to do some DCE after a successful
match-and-simplify replacement in PHI-OPT. Match-and-simplify
tends to generate some extra statements, which should be cleaned
up.
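
To illustrate the kind of cleanup meant here, a toy worklist DCE over a
made-up SSA representation is sketched below.  ToyStmt, ToyFunction and
toy_dce_from_worklist are hypothetical and only model the idea (remove a
definition that has no uses, then reconsider its operands); this is not
GCC's simple_dce_from_worklist.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <utility>
#include <vector>

// Toy SSA statement: defines version `lhs` from the versions in `operands`.
struct ToyStmt {
  int lhs;
  std::vector<int> operands;
};

struct ToyFunction {
  std::map<int, ToyStmt> defs;   // SSA version -> defining statement
  std::map<int, int> use_count;  // SSA version -> number of remaining uses

  void add(int lhs, std::vector<int> ops) {
    for (int op : ops)
      ++use_count[op];
    defs[lhs] = ToyStmt{lhs, std::move(ops)};
  }
};

// Starting from the names in WORKLIST, remove every definition whose result
// is unused, and queue the operands that become unused as a result.
// Returns the number of statements removed.
int toy_dce_from_worklist(ToyFunction &fn, std::set<int> worklist) {
  int removed = 0;
  while (!worklist.empty()) {
    int name = *worklist.begin();
    worklist.erase(worklist.begin());
    auto it = fn.defs.find(name);
    if (it == fn.defs.end() || fn.use_count[name] != 0)
      continue;
    for (int op : it->second.operands) {
      assert(fn.use_count[op] > 0);
      if (--fn.use_count[op] == 0)
        worklist.insert(op);
    }
    fn.defs.erase(it);
    ++removed;
  }
  return removed;
}
```

Seeding the worklist with only the names the pass itself inserted or moved
keeps the cleanup cheap, which matches the intent of passing a bitmap of
candidate names rather than scanning the whole function.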

OK? Bootstrapped and tested on x86_64-linux with no regressions.

Thanks,
Andrew Pinski

gcc/ChangeLog:

* tree-ssa-phiopt.cc: Include tree-ssa-dce.h
(replace_phi_edge_with_variable):
New argument, dce_ssa_names. Call simple_dce_from_worklist.
(match_simplify_replacement): If we inserted a sequence,
mark the lhs of the new sequence to be possible dce.
Always move the statement and mark the lhs (if it is a name)
as possible to remove.
---
 gcc/tree-ssa-phiopt.cc | 35 ++-
 1 file changed, 26 insertions(+), 9 deletions(-)

diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
index 925bd7d..996700b 100644
--- a/gcc/tree-ssa-phiopt.cc
+++ b/gcc/tree-ssa-phiopt.cc
@@ -53,6 +53,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimple-match.h"
 #include "dbgcnt.h"
 #include "tree-ssa-propagate.h"
+#include "tree-ssa-dce.h"
 
 static unsigned int tree_ssa_phiopt_worker (bool, bool, bool);
 static bool two_value_replacement (basic_block, basic_block, edge, gphi *,
@@ -74,7 +75,6 @@ static bool cond_store_replacement (basic_block, basic_block, 
edge, edge,
hash_set *);
 static bool cond_if_else_store_replacement (basic_block, basic_block, 
basic_block);
 static hash_set * get_non_trapping ();
-static void replace_phi_edge_with_variable (basic_block, edge, gphi *, tree);
 static void hoist_adjacent_loads (basic_block, basic_block,
  basic_block, basic_block);
 static bool gate_hoist_loads (void);
@@ -402,7 +402,8 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool 
do_hoist_loads, bool early_p)
 
 static void
 replace_phi_edge_with_variable (basic_block cond_block,
-   edge e, gphi *phi, tree new_tree)
+   edge e, gphi *phi, tree new_tree,
+   bitmap dce_ssa_names = auto_bitmap())
 {
   basic_block bb = gimple_bb (phi);
   gimple_stmt_iterator gsi;
@@ -477,6 +478,8 @@ replace_phi_edge_with_variable (basic_block cond_block,
gimple_cond_make_true (cond);
 }
 
+  simple_dce_from_worklist (dce_ssa_names);
+
   statistics_counter_event (cfun, "Replace PHI with variable", 1);
 
   if (dump_file && (dump_flags & TDF_DETAILS))
@@ -986,6 +989,7 @@ match_simplify_replacement (basic_block cond_bb, 
basic_block middle_bb,
   gimple_seq seq = NULL;
   tree result;
   gimple *stmt_to_move = NULL;
+  auto_bitmap inserted_exprs;
 
   /* Special case A ? B : B as this will always simplify to B. */
   if (operand_equal_for_phi_arg_p (arg0, arg1))
@@ -1060,14 +1064,22 @@ match_simplify_replacement (basic_block cond_bb, 
basic_block middle_bb,
   gsi = gsi_last_bb (cond_bb);
   /* Insert the sequence generated from gimple_simplify_phiopt.  */
   if (seq)
+{
+  // Mark the lhs of the new statements maybe for dce
+  gimple_stmt_iterator gsi1 = gsi_start (seq);
+  for (; !gsi_end_p (gsi1); gsi_next (&gsi1))
+   {
+ gimple *stmt = gsi_stmt (gsi1);
+ tree name = gimple_get_lhs (stmt);
+ if (name && TREE_CODE (name) == SSA_NAME)
+   bitmap_set_bit (inserted_exprs, SSA_NAME_VERSION (name));
+   }
 gsi_insert_seq_before (&gsi, seq, GSI_CONTINUE_LINKING);
+  }
 
-  /* If there was a statement to move and the result of the statement
- is going to be used, move it to right before the original
- conditional.  */
-  if (stmt_to_move
-  && (gimple_assign_lhs (stmt_to_move) == result
- || !has_single_use (gimple_assign_lhs (stmt_to_move
+  /* If there was a statement to move, move it to right before
+ the original conditional.  */
+  if (stmt_to_move)
 {
   if (dump_file && (dump_flags & TDF_DETAILS))
{
@@ -1075,12 +1087,17 @@ match_simplify_replacement (basic_block cond_bb, 
basic_block middle_bb,
  print_gimple_stmt (dump_file, stmt_to_move, 0,
   TDF_VOPS|TDF_MEMSYMS);
}
+
+  tree name = gimple_get_lhs (stmt_to_move);
+  // Mark the name to be renamed if there is one.
+  if (name && TREE_CODE (name) == SSA_NAME)
+   bitmap_set_bit (inserted_exprs, SSA_NAME_VERSION (name));
   gimple_stmt_iterator gsi1 = gsi_for_stmt (stmt_to_move);
   gsi_move_before (&gsi1, &gsi);
   reset_flow_sensitive_info (gimple_assign_lhs (stmt_to_move));
 }
 
-  replace_phi_edge_with_variable (cond_bb, e1, phi, result);
+  replace_phi_edge_with_variable (cond_bb, e1, phi, result, inserted_exprs);
 
   /* Add Statistic here even though replace_phi_edge_with_variable already
  does it as we want to be able to count when match-simplify happens vs
-- 
1.8.3.1



[PATCH] RISC-V: Change constexpr back to CONSTEXPR

2022-10-27 Thread juzhe . zhong
From: Ju-Zhe Zhong 

According to
https://github.com/gcc-mirror/gcc/commit/f95d3d5de72a1c43e8d529bad3ef59afc3214705,
GCC 4.8.6 doesn't support constexpr, so we should change it back to
CONSTEXPR.
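
For context, a macro of this kind usually expands to constexpr only when the
compiler supports it and to nothing otherwise.  The sketch below shows that
pattern with a made-up macro name (SKETCH_CONSTEXPR) and made-up types; it is
an assumption about the general technique, not a copy of GCC's CONSTEXPR
definition or of the RVV builtin tables.

```cpp
#include <cassert>

// Expand to constexpr only if the compiler advertises support for it.
#if defined(__cpp_constexpr) && __cpp_constexpr >= 200704L
#define SKETCH_CONSTEXPR constexpr
#else
#define SKETCH_CONSTEXPR
#endif

// Hypothetical analogue of an argument-info record with a constructor that
// is constexpr where possible.
struct toy_arg_info {
  SKETCH_CONSTEXPR toy_arg_info(int base_in) : base(base_in) {}
  int base;
};

// A static table terminated by an end marker (-1), built the same way with
// or without constexpr support.
static SKETCH_CONSTEXPR const toy_arg_info toy_args[]
  = {toy_arg_info(1), toy_arg_info(2), toy_arg_info(-1)};

// Count the entries before the end marker.
int toy_count_args() {
  int n = 0;
  while (toy_args[n].base != -1)
    ++n;
  assert(n < 3);
  return n;
}
```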
gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc: Change constexpr back to 
CONSTEXPR.
* config/riscv/riscv-vector-builtins-shapes.cc (SHAPE): Ditto.
* config/riscv/riscv-vector-builtins.cc (struct 
registered_function_hasher): Ditto.
* config/riscv/riscv-vector-builtins.h (struct rvv_arg_type_info): 
Ditto.

---
 gcc/config/riscv/riscv-vector-builtins-bases.cc  |  4 ++--
 gcc/config/riscv/riscv-vector-builtins-shapes.cc |  2 +-
 gcc/config/riscv/riscv-vector-builtins.cc| 14 +++---
 gcc/config/riscv/riscv-vector-builtins.h |  2 +-
 4 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc 
b/gcc/config/riscv/riscv-vector-builtins-bases.cc
index 713a7566e29..231b63a610d 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
@@ -84,8 +84,8 @@ public:
   }
 };
 
-static constexpr const vsetvl vsetvl_obj;
-static constexpr const vsetvl vsetvlmax_obj;
+static CONSTEXPR const vsetvl vsetvl_obj;
+static CONSTEXPR const vsetvl vsetvlmax_obj;
 namespace bases {
 const function_base *const vsetvl = &vsetvl_obj;
 const function_base *const vsetvlmax = &vsetvlmax_obj;
diff --git a/gcc/config/riscv/riscv-vector-builtins-shapes.cc 
b/gcc/config/riscv/riscv-vector-builtins-shapes.cc
index 14c59690c06..24fc1c02341 100644
--- a/gcc/config/riscv/riscv-vector-builtins-shapes.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-shapes.cc
@@ -71,7 +71,7 @@ build_all (function_builder , const function_group_info 
)
 /* Declare the function shape NAME, pointing it to an instance
of class _def.  */
 #define SHAPE(DEF, VAR) \
-  static constexpr const DEF##_def VAR##_obj; \
+  static CONSTEXPR const DEF##_def VAR##_obj; \
   namespace shapes { const function_shape *const VAR = &VAR##_obj; }
 
 /* Base class for for build.  */
diff --git a/gcc/config/riscv/riscv-vector-builtins.cc 
b/gcc/config/riscv/riscv-vector-builtins.cc
index 06a4a85087d..43150aa47a4 100644
--- a/gcc/config/riscv/riscv-vector-builtins.cc
+++ b/gcc/config/riscv/riscv-vector-builtins.cc
@@ -88,7 +88,7 @@ struct registered_function_hasher : 
nofree_ptr_hash
 };
 
 /* Static information about each RVV type.  */
-static constexpr const vector_type_info vector_types[] = {
+static CONSTEXPR const vector_type_info vector_types[] = {
 #define DEF_RVV_TYPE(NAME, NCHARS, ABI_NAME, ARGS...)  
\
   {#NAME, #ABI_NAME, "u" #NCHARS #ABI_NAME},
 #include "riscv-vector-builtins.def"
@@ -123,23 +123,23 @@ static const rvv_type_info i_ops[] = {
 #include "riscv-vector-builtins-types.def"
   {NUM_VECTOR_TYPES, 0}};
 
-static constexpr const rvv_arg_type_info rvv_arg_type_info_end
+static CONSTEXPR const rvv_arg_type_info rvv_arg_type_info_end
   = rvv_arg_type_info (NUM_BASE_TYPES);
 
 /* A list of args for size_t func (void) function.  */
-static constexpr const rvv_arg_type_info void_args[]
+static CONSTEXPR const rvv_arg_type_info void_args[]
   = {rvv_arg_type_info (RVV_BASE_void), rvv_arg_type_info_end};
 
 /* A list of args for size_t func (size_t) function.  */
-static constexpr const rvv_arg_type_info size_args[]
+static CONSTEXPR const rvv_arg_type_info size_args[]
   = {rvv_arg_type_info (RVV_BASE_size), rvv_arg_type_info_end};
 
 /* A list of none preds that will be registered for intrinsic functions.  */
-static constexpr const predication_type_index none_preds[]
+static CONSTEXPR const predication_type_index none_preds[]
   = {PRED_TYPE_none, NUM_PRED_TYPES};
 
 /* A static operand information for size_t func (void) function registration. 
*/
-static constexpr const rvv_op_info i_none_size_void_ops
+static CONSTEXPR const rvv_op_info i_none_size_void_ops
   = {i_ops,/* Types */
  OP_TYPE_none, /* Suffix */
  rvv_arg_type_info (RVV_BASE_size), /* Return type */
@@ -147,7 +147,7 @@ static constexpr const rvv_op_info i_none_size_void_ops
 
 /* A static operand information for size_t func (size_t) function registration.
  */
-static constexpr const rvv_op_info i_none_size_size_ops
+static CONSTEXPR const rvv_op_info i_none_size_size_ops
   = {i_ops,/* Types */
  OP_TYPE_none, /* Suffix */
  rvv_arg_type_info (RVV_BASE_size), /* Return type */
diff --git a/gcc/config/riscv/riscv-vector-builtins.h 
b/gcc/config/riscv/riscv-vector-builtins.h
index e5636e23a7c..425da12326c 100644
--- a/gcc/config/riscv/riscv-vector-builtins.h
+++ b/gcc/config/riscv/riscv-vector-builtins.h
@@ -171,7 +171,7 @@ struct rvv_builtin_suffixes
 /* RVV Builtin argument information.  */
 struct rvv_arg_type_info
 {
-  constexpr rvv_arg_type_info (rvv_base_type base_type_in)
+  CONSTEXPR rvv_arg_type_info 

Re: [PATCH zero-call-used-regs] Add leafy mode for zero-call-used-regs

2022-10-27 Thread Qing Zhao via Gcc-patches


> On Oct 26, 2022, at 5:29 PM, Alexandre Oliva  wrote:
> 
> On Oct 25, 2022, Qing Zhao  wrote:
> 
>>> 'all' for leaf functions is likely wasteful.  If no other functions are
>>> called, one can determine exactly which registers might carry
>>> information out and thus need zeroing, and 'used' is thus likely enough,
>>> depending on the purpose of register scrubbing.  (In some scenarios, it
>>> might make sense to want scrubbing of all registers, even unused ones
>>> that carry incoming values)
> 
>> Under what kinds of situations, we should clear the un-used registers
>> for leaf functions?
> 
> The one (admittedly contrived) scenario that comes to mind is calling an
> out-of-line, empty leaf function with 'all' for explicit register
> scrubbing at desired points in the program.  I.e., this empty leaf
> function would be in charge of scrubbing the caller's registers.  It
> could even be tail-called.
> 
> I'm sure there are other scenarios in which keeping at least the
> possibility of 'all' is useful.
Okay.

> 
>> Now I am wondering whether we should make “leafy” mode by default then?
> 
> I'm not sure what you mean by default.  I think "skip" is the right
> default for general use, where register scrubbing is not explicitly
> requested.  When it is, perhaps -fzero-call-used-regs without
> '=' could be 'leafy' indeed or, even better, the extended form
> thereof that is in search of a name and so far unimplemented.

I guess I was not clear about “by default” in my previous email.
My point was:

If there is no need to clear the unused registers for leaf functions, we can 
change:

+  if ((zero_regs_type & LEAFY_MODE) && leaf_function_p ())
+only_used = true;
+

to

+  if ( leaf_function_p ())
+only_used = true;
+

i.e., instead of introducing a new mode “LEAFY_MODE” and a new user sub-option, 
for leaf functions we would clear only the used registers, even for “ALL”.

However, since there is a need to clear the unused registers for leaf functions, 
it looks like this new sub-option does have to be provided to users.

Is this clear this time?
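
To make the two alternatives concrete, here is a minimal standalone sketch.
It is not the GCC implementation: the flag values, the function names and the
is_leaf parameter (standing in for leaf_function_p ()) are all made up for
illustration.

```cpp
#include <cassert>

// Hypothetical flag bits; the real values in GCC are not shown in this thread.
enum zero_regs_flags : unsigned
{
  ENABLED = 1u << 0,
  ONLY_USED = 1u << 1,
  LEAFY_MODE = 1u << 2
};

// Variant 1: a separate sub-option.  Leaf functions fall back to
// "used registers only" solely when the user asked for the leafy mode.
static bool
only_used_with_leafy_option (unsigned zero_regs_type, bool is_leaf)
{
  bool only_used = (zero_regs_type & ONLY_USED) != 0;
  if ((zero_regs_type & LEAFY_MODE) && is_leaf)
    only_used = true;
  return only_used;
}

// Variant 2: no new sub-option.  Every leaf function is treated as
// "used registers only", so "all" can no longer scrub unused registers
// in a leaf function.
static bool
only_used_unconditional (unsigned zero_regs_type, bool is_leaf)
{
  bool only_used = (zero_regs_type & ONLY_USED) != 0;
  if (is_leaf)
    only_used = true;
  return only_used;
}
```

With variant 1, a leaf function compiled with plain "all" still scrubs every
register, which is what the explicit-scrubbing scenario above relies on.
Variant 2 takes that possibility away.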

> 
>> Another thing is, do you have any information on how much this new mode can 
>> save the 
>> code size and run-time compared to mode “all”?
> 
> I'm afraid I have not performed any measurements.

The major purpose of this new mode is to improve run time and code size, so I 
think some measurements would be very helpful.  Just a suggestion.


Another suggestion: if this new mode is added to GCC, the documentation should 
explain in more detail what the LEAFY mode is, what its purpose is, and how to 
use it.
> 
>>> I have not (yet?) implemented this variant; I haven't even found a name
>>> I'm happy with for it.  (seal?  plug?  cork?  another leak antonym?)
> 
>> For this improvement, I am still thinking no need to add a new mode,
>> just add it by default?
> 
> Even if it is default, it may still need a name to appear before
> e.g. '-gpr'.  'default-gpr' might do, but I'm not happy with it either.

“Default” here also means there is no need to introduce a user sub-option; 
just add an optimization to the compiler. -:)

Qing
> 
> 
> -- 
> Alexandre Oliva, happy hacker   https://FSFLA.org/blogs/lxo/
>   Free Software Activist   GNU Toolchain Engineer
> Disinformation flourishes because many people care deeply about injustice
> but very few check the facts.  Ask me about 



[pushed] aarch64: Reinstate some uses of CONSTEXPR

2022-10-27 Thread Richard Sandiford via Gcc-patches
In 9482a5e4eac8d696129ec2854b331e1bb5dbab42 I'd replaced uses
of CONSTEXPR with direct uses of constexpr.  However, it turns
out that we still have CONSTEXPR for a reason: GCC 4.8 doesn't
implement constexpr properly, and for example rejects things like:

  extern const int x;
  constexpr int x = 1;

This patch partially reverts the previous one.  To make things
more complicated, there are still some things that need to be
constexpr rather than CONSTEXPR, since they are used to initialise
scalar constants.  The patch therefore doesn't change anything
in aarch64-feature-deps.h.

Tested on aarch64-linux-gnu (including with GCC 4.8 as the build
compiler) & pushed.
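
For readers unfamiliar with the idiom, a small standalone sketch of the two
cases follows.  SKETCH_CONSTEXPR is a hypothetical stand-in, and the
feature-test condition is an assumption about how such a macro could be
defined, not a quote of GCC's definition.  cost_info, default_cost and lanes
are likewise invented names.

```cpp
#include <cassert>

// Hypothetical stand-in for CONSTEXPR: expands to constexpr only when the
// compiler advertises usable support, and to nothing otherwise.
#if defined (__cpp_constexpr) && __cpp_constexpr >= 200704
#define SKETCH_CONSTEXPR constexpr
#else
#define SKETCH_CONSTEXPR
#endif

struct cost_info
{
  SKETCH_CONSTEXPR cost_info (unsigned int base, unsigned int extra)
    : m_base (base), m_extra (extra) {}

  unsigned int m_base;
  unsigned int m_extra;
};

// Declared first and defined later: the same shape as the
// "extern const int x; constexpr int x = 1;" case.  With the macro, an
// older compiler just sees an ordinary const definition.
extern const cost_info default_cost;
SKETCH_CONSTEXPR const cost_info default_cost (4, 2);

// This one has to stay a real constexpr, because its result is needed in a
// constant expression (here a static_assert).
constexpr unsigned int
lanes (unsigned int bits)
{
  return bits / 32;
}

static_assert (lanes (128) == 4, "lanes must be usable at compile time");
```

The second half is the reason a blanket search-and-replace does not work:
where the value feeds a scalar constant, dropping constexpr would change
whether the code compiles at all.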

Richard


gcc/
* config/aarch64/aarch64-protos.h: Replace constexpr with
CONSTEXPR.
* config/aarch64/aarch64-sve-builtins-base.cc: Likewise.
* config/aarch64/aarch64-sve-builtins-functions.h: Likewise.
* config/aarch64/aarch64-sve-builtins-shapes.cc: Likewise.
* config/aarch64/aarch64-sve-builtins-sve2.cc: Likewise.
* config/aarch64/aarch64-sve-builtins.cc: Likewise.
* config/aarch64/aarch64.cc: Likewise.
* config/aarch64/driver-aarch64.cc: Likewise
---
 gcc/config/aarch64/aarch64-protos.h   |  6 +-
 .../aarch64/aarch64-sve-builtins-base.cc  | 56 +--
 .../aarch64/aarch64-sve-builtins-functions.h  | 28 +-
 .../aarch64/aarch64-sve-builtins-shapes.cc|  8 +--
 .../aarch64/aarch64-sve-builtins-sve2.cc  | 12 ++--
 gcc/config/aarch64/aarch64-sve-builtins.cc|  8 +--
 gcc/config/aarch64/aarch64.cc |  2 +-
 gcc/config/aarch64/driver-aarch64.cc  |  4 +-
 8 files changed, 62 insertions(+), 62 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index 1a71f022841..238820581c5 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -254,7 +254,7 @@ typedef struct simd_vec_cost advsimd_vec_cost;
 /* SVE-specific extensions to the information provided by simd_vec_cost.  */
 struct sve_vec_cost : simd_vec_cost
 {
-  constexpr sve_vec_cost (const simd_vec_cost ,
+  CONSTEXPR sve_vec_cost (const simd_vec_cost ,
  unsigned int clast_cost,
  unsigned int fadda_f16_cost,
  unsigned int fadda_f32_cost,
@@ -354,7 +354,7 @@ using aarch64_scalar_vec_issue_info = aarch64_base_vec_issue_info;
Advanced SIMD and SVE.  */
 struct aarch64_simd_vec_issue_info : aarch64_base_vec_issue_info
 {
-  constexpr aarch64_simd_vec_issue_info (aarch64_base_vec_issue_info base,
+  CONSTEXPR aarch64_simd_vec_issue_info (aarch64_base_vec_issue_info base,
 unsigned int ld2_st2_general_ops,
 unsigned int ld3_st3_general_ops,
 unsigned int ld4_st4_general_ops)
@@ -382,7 +382,7 @@ using aarch64_advsimd_vec_issue_info = aarch64_simd_vec_issue_info;
is a concept of "predicate operations".  */
 struct aarch64_sve_vec_issue_info : aarch64_simd_vec_issue_info
 {
-  constexpr aarch64_sve_vec_issue_info
+  CONSTEXPR aarch64_sve_vec_issue_info
 (aarch64_simd_vec_issue_info base,
  unsigned int pred_ops_per_cycle,
  unsigned int while_pred_ops,
diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
index 23b4d42822a..6347407555f 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
@@ -177,7 +177,7 @@ public:
 class svac_impl : public function_base
 {
 public:
-  constexpr svac_impl (int unspec) : m_unspec (unspec) {}
+  CONSTEXPR svac_impl (int unspec) : m_unspec (unspec) {}
 
   rtx
   expand (function_expander ) const override
@@ -209,7 +209,7 @@ public:
 class svadr_bhwd_impl : public function_base
 {
 public:
-  constexpr svadr_bhwd_impl (unsigned int shift) : m_shift (shift) {}
+  CONSTEXPR svadr_bhwd_impl (unsigned int shift) : m_shift (shift) {}
 
   rtx
   expand (function_expander ) const override
@@ -259,7 +259,7 @@ public:
 class svbrk_binary_impl : public function_base
 {
 public:
-  constexpr svbrk_binary_impl (int unspec) : m_unspec (unspec) {}
+  CONSTEXPR svbrk_binary_impl (int unspec) : m_unspec (unspec) {}
 
   rtx
   expand (function_expander ) const override
@@ -275,7 +275,7 @@ public:
 class svbrk_unary_impl : public function_base
 {
 public:
-  constexpr svbrk_unary_impl (int unspec) : m_unspec (unspec) {}
+  CONSTEXPR svbrk_unary_impl (int unspec) : m_unspec (unspec) {}
 
   rtx
   expand (function_expander ) const override
@@ -309,7 +309,7 @@ public:
 class svclast_impl : public quiet
 {
 public:
-  constexpr svclast_impl (int unspec) : m_unspec (unspec) {}
+  CONSTEXPR svclast_impl (int unspec) : m_unspec (unspec) {}
 
   rtx
   expand (function_expander ) const override
@@ -381,7 +381,7 @@ public:
 class svcmp_impl : public 

Re: [OG12 commit] vect: WORKAROUND vectorizer bug

2022-10-27 Thread Andrew Stubbs

On 24/10/2022 19:06, Richard Biener wrote:




On 24.10.2022 at 18:51, Andrew Stubbs wrote:

I've committed this to the OG12 branch to remove some test failures. We 
probably ought to have something on mainline also, but a proper fix would be 
better.

Without this, the libgomp.oacc-c-c++-common/private-variables.c testcase fails to compile 
due to an ICE.  The OpenACC worker broadcasting code is creating SLP optimizable loads 
and stores in amdgcn address-space-4. Previously this was "ok" as SLP didn't 
work with less than 64-lane vectors, but the newly implemented smaller vectors are 
working as intended and optimizing this.

Unfortunately the vectorizer is losing the address-space data from the intermediate 
types, and it all falls apart during expand when it tries to convert a 32-bit address 
into a 64-bit address and that's not something that works. At first sight it looks like 
we could possibly make that work with POINTERS_EXTEND_UNSIGNED, but that only changes the 
error message. Fundamentally we need to make sure that various instances of 
"vectype" have the correct address space, but my attempts to do so showed that 
that's a larger task than I have time for right now.


Istr there were issues like this in the past that I fixed, so any testcase that 
exposes this with just a gcn cc1 would be nice to have.


I've been unable to reproduce this issue on the mainline compiler. The 
SLP vectorizer says the accesses are not consecutive, although I don't 
know why they would be different.


A simple testcase works fine on OG12 as well. It's something weird to do 
with the OpenACC worker broadcasting code that I can't reproduce manually.


Thank you for the offer. I'll let you know if I get a testcase.

Andrew


[PATCH v2] RISC-V: Libitm add RISC-V support.

2022-10-27 Thread Xiongchuan Tan via Gcc-patches
libitm/ChangeLog:

* configure.tgt: Add riscv support.
* config/riscv/asm.h: New file.
* config/riscv/sjlj.S: New file.
* config/riscv/target.h: New file.
---
v2: Change HW_CACHELINE_SIZE to 64 (in accordance with the RVA profiles, see
https://github.com/riscv/riscv-profiles/blob/main/profiles.adoc)

 libitm/config/riscv/asm.h|  52 +
 libitm/config/riscv/sjlj.S   | 144 +++
 libitm/config/riscv/target.h |  50 
 libitm/configure.tgt |   2 +
 4 files changed, 248 insertions(+)
 create mode 100644 libitm/config/riscv/asm.h
 create mode 100644 libitm/config/riscv/sjlj.S
 create mode 100644 libitm/config/riscv/target.h

diff --git a/libitm/config/riscv/asm.h b/libitm/config/riscv/asm.h
new file mode 100644
index 000..6ba5e2c
--- /dev/null
+++ b/libitm/config/riscv/asm.h
@@ -0,0 +1,52 @@
+/* Copyright (C) 2022 Free Software Foundation, Inc.
+   Contributed by Xiongchuan Tan .
+
+   This file is part of the GNU Transactional Memory Library (libitm).
+
+   Libitm is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   Libitm is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+   more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   .  */
+
+#ifndef _RV_ASM_H
+#define _RV_ASM_H
+
+#if __riscv_xlen == 64
+#  define GPR_L ld
+#  define GPR_S sd
+#  define SZ_GPR 8
+#elif __riscv_xlen == 32
+#  define GPR_L lw
+#  define GPR_S sw
+#  define SZ_GPR 4
+#else
+#  error Unsupported XLEN (must be 64-bit or 32-bit).
+#endif
+
+#if defined(__riscv_flen) && __riscv_flen == 64
+#  define FPR_L fld
+#  define FPR_S fsd
+#  define SZ_FPR 8
+#elif defined(__riscv_flen) && __riscv_flen == 32
+#  define FPR_L flw
+#  define FPR_S fsw
+#  define SZ_FPR 4
+#else
+#  define SZ_FPR 0
+#endif
+
+#endif  /* _RV_ASM_H */
diff --git a/libitm/config/riscv/sjlj.S b/libitm/config/riscv/sjlj.S
new file mode 100644
index 000..6f25cb5
--- /dev/null
+++ b/libitm/config/riscv/sjlj.S
@@ -0,0 +1,144 @@
+/* Copyright (C) 2022 Free Software Foundation, Inc.
+   Contributed by Xiongchuan Tan .
+
+   This file is part of the GNU Transactional Memory Library (libitm).
+
+   Libitm is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   Libitm is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+   more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   .  */
+
+#include "asmcfi.h"
+#include "asm.h"
+
+   .text
+   .align  2
+   .global _ITM_beginTransaction
+   .type   _ITM_beginTransaction, @function
+
+_ITM_beginTransaction:
+   cfi_startproc
+mv a1, sp
+   addi sp, sp, -(14*SZ_GPR+12*SZ_FPR)
+   cfi_adjust_cfa_offset(14*SZ_GPR+12*SZ_FPR)
+
+   /* Return Address */
+   GPR_S ra, 0*SZ_GPR(sp)
+   cfi_rel_offset(ra, 0*SZ_GPR)
+
+   /* Caller's sp */
+   GPR_S a1, 1*SZ_GPR(sp)
+
+   /* Caller's s0/fp */
+   GPR_S fp, 2*SZ_GPR(sp)
+   cfi_rel_offset(fp, 2*SZ_GPR)
+
+   /* Callee-saved registers */
+   GPR_S s1, 3*SZ_GPR(sp)
+   GPR_S s2, 4*SZ_GPR(sp)
+   GPR_S s3, 5*SZ_GPR(sp)
+   GPR_S s4, 6*SZ_GPR(sp)
+   GPR_S s5, 7*SZ_GPR(sp)
+   GPR_S s6, 8*SZ_GPR(sp)
+   GPR_S s7, 9*SZ_GPR(sp)
+   GPR_S s8, 10*SZ_GPR(sp)
+   GPR_S s9, 11*SZ_GPR(sp)
+   GPR_S s10, 12*SZ_GPR(sp)
+   GPR_S s11, 13*SZ_GPR(sp)
+
+#if defined(__riscv_flen)
+   /* Callee-saved floating-point registers */
+   FPR_S fs0, 0*SZ_FPR+14*SZ_GPR(sp)
+   FPR_S fs1, 

Re: [PATCH] [x86] Fix incorrect digit constraint

2022-10-27 Thread Uros Bizjak via Gcc-patches
On Thu, Oct 27, 2022 at 12:55 PM liuhongt  wrote:
>
> Matching constraints are used in these circumstances. More precisely,
> the two operands that match must include one input-only operand and
> one output-only operand. Moreover, the digit must be a smaller number
> than the number of the operand that uses it in the constraint.
>
> In pr107057, the 2 operands in the pattern are both input operands.

Ouch...

> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?

You have a couple of other patterns where operand 1 is matched to
produce vmovddup insn. These are *avx512f_unpcklpd512 and
avx_unpcklpd256. You can also remove expander in both
cases.

Uros.

>
> gcc/ChangeLog:
>
> PR target/107057
> * config/i386/sse.md (*vec_interleave_highv2df): Remove
> constraint 1.
> (*vec_interleave_lowv2df): Ditto.
> (vec_concatv2df): Ditto.
> * config/i386/i386.cc (ix86_vec_interleave_v2df_operator_ok):
> Disallow MEM_P (op1) && MEM_P (op2).
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/pr107057.c: New test.
> ---
>  gcc/config/i386/i386.cc  |  2 +-
>  gcc/config/i386/sse.md   | 68 +++-
>  gcc/testsuite/gcc.target/i386/pr107057.c | 19 +++
>  3 files changed, 50 insertions(+), 39 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr107057.c
>
> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> index aeea26ef4be..e3b7bea0d68 100644
> --- a/gcc/config/i386/i386.cc
> +++ b/gcc/config/i386/i386.cc
> @@ -15652,7 +15652,7 @@ ix86_vec_interleave_v2df_operator_ok (rtx operands[3], bool high)
>if (MEM_P (operands[0]))
>  return rtx_equal_p (operands[0], operands[1 + high]);
>if (MEM_P (operands[1]) && MEM_P (operands[2]))
> -return TARGET_SSE3 && rtx_equal_p (operands[1], operands[2]);
> +return false;
>return true;
>  }
>
> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> index f4b5506703f..e6fefe39ca2 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -12170,29 +12170,28 @@ (define_expand "vec_interleave_highv2df"
>  })
>
>  (define_insn "*vec_interleave_highv2df"
> -  [(set (match_operand:V2DF 0 "nonimmediate_operand" "=x,v,v,x,v,m")
> +  [(set (match_operand:V2DF 0 "nonimmediate_operand" "=x,v,x,v,m")
> (vec_select:V2DF
>   (vec_concat:V4DF
> -   (match_operand:V2DF 1 "nonimmediate_operand" " 0,v,o,o,o,v")
> -   (match_operand:V2DF 2 "nonimmediate_operand" " x,v,1,0,v,0"))
> +   (match_operand:V2DF 1 "nonimmediate_operand" " 0,v,o,o,v")
> +   (match_operand:V2DF 2 "nonimmediate_operand" " x,v,0,v,0"))
>   (parallel [(const_int 1)
>  (const_int 3)])))]
>"TARGET_SSE2 && ix86_vec_interleave_v2df_operator_ok (operands, 1)"
>"@
> unpckhpd\t{%2, %0|%0, %2}
> vunpckhpd\t{%2, %1, %0|%0, %1, %2}
> -   %vmovddup\t{%H1, %0|%0, %H1}
> movlpd\t{%H1, %0|%0, %H1}
> vmovlpd\t{%H1, %2, %0|%0, %2, %H1}
> %vmovhpd\t{%1, %0|%q0, %1}"
> -  [(set_attr "isa" "noavx,avx,sse3,noavx,avx,*")
> -   (set_attr "type" "sselog,sselog,sselog,ssemov,ssemov,ssemov")
> +  [(set_attr "isa" "noavx,avx,noavx,avx,*")
> +   (set_attr "type" "sselog,sselog,ssemov,ssemov,ssemov")
> (set (attr "prefix_data16")
> - (if_then_else (eq_attr "alternative" "3,5")
> + (if_then_else (eq_attr "alternative" "2,4")
>(const_string "1")
>(const_string "*")))
> -   (set_attr "prefix" "orig,maybe_evex,maybe_vex,orig,maybe_evex,maybe_vex")
> -   (set_attr "mode" "V2DF,V2DF,DF,V1DF,V1DF,V1DF")])
> +   (set_attr "prefix" "orig,maybe_evex,orig,maybe_evex,maybe_vex")
> +   (set_attr "mode" "V2DF,V2DF,V1DF,V1DF,V1DF")])
>
>  (define_expand "avx512f_movddup512"
>[(set (match_operand:V8DF 0 "register_operand")
> @@ -12332,29 +12331,28 @@ (define_expand "vec_interleave_lowv2df"
>  })
>
>  (define_insn "*vec_interleave_lowv2df"
> -  [(set (match_operand:V2DF 0 "nonimmediate_operand" "=x,v,v,x,v,o")
> +  [(set (match_operand:V2DF 0 "nonimmediate_operand" "=x,v,x,v,o")
> (vec_select:V2DF
>   (vec_concat:V4DF
> -   (match_operand:V2DF 1 "nonimmediate_operand" " 0,v,m,0,v,0")
> -   (match_operand:V2DF 2 "nonimmediate_operand" " x,v,1,m,m,v"))
> +   (match_operand:V2DF 1 "nonimmediate_operand" " 0,v,0,v,0")
> +   (match_operand:V2DF 2 "nonimmediate_operand" " x,v,m,m,v"))
>   (parallel [(const_int 0)
>  (const_int 2)])))]
>"TARGET_SSE2 && ix86_vec_interleave_v2df_operator_ok (operands, 0)"
>"@
> unpcklpd\t{%2, %0|%0, %2}
> vunpcklpd\t{%2, %1, %0|%0, %1, %2}
> -   %vmovddup\t{%1, %0|%0, %q1}
> movhpd\t{%2, %0|%0, %q2}
> vmovhpd\t{%2, %1, %0|%0, %1, %q2}
> %vmovlpd\t{%2, %H0|%H0, %2}"
> -  [(set_attr "isa" "noavx,avx,sse3,noavx,avx,*")
> -   (set_attr "type" 

Re: [PATCH] RISC-V: Libitm add RISC-V support.

2022-10-27 Thread Andrew Waterman
I'm surprised by the hard-coded 128-byte cache line size.  If we need
to hard-code a value, it should be 64 (in accordance with the RVA
profiles, see https://github.com/riscv/riscv-profiles/blob/main/profiles.adoc),
but ideally this would be queried dynamically.
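
As a rough illustration of "queried dynamically", something along these lines
could work on glibc-based systems.  This is only a sketch and not part of the
patch: query_cacheline_size and fallback_cacheline_size are made-up names,
_SC_LEVEL1_DCACHE_LINESIZE is a glibc extension that may be missing or may
report nothing useful, and code that needs the size as a compile-time constant
(e.g. for alignment) could not use a run-time value directly.

```cpp
#include <cassert>
#include <cstddef>
#include <unistd.h>

// Fallback used when the system cannot report a line size; 64 follows the
// RVA profiles mentioned above.
static const std::size_t fallback_cacheline_size = 64;

// Ask the C library for the L1 data cache line size and fall back to the
// constant above when the query is unavailable or reports nothing useful.
static std::size_t
query_cacheline_size ()
{
#ifdef _SC_LEVEL1_DCACHE_LINESIZE
  long n = sysconf (_SC_LEVEL1_DCACHE_LINESIZE);
  if (n > 0)
    return static_cast<std::size_t> (n);
#endif
  return fallback_cacheline_size;
}
```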


On Thu, Oct 27, 2022 at 3:51 AM Xiongchuan Tan via Gcc-patches
 wrote:
>
> libitm/ChangeLog:
>
> * configure.tgt: Add riscv support.
> * config/riscv/asm.h: New file.
> * config/riscv/sjlj.S: New file.
> * config/riscv/target.h: New file.
> ---
>  libitm/config/riscv/asm.h|  52 +
>  libitm/config/riscv/sjlj.S   | 144 +++
>  libitm/config/riscv/target.h |  50 
>  libitm/configure.tgt |   2 +
>  4 files changed, 248 insertions(+)
>  create mode 100644 libitm/config/riscv/asm.h
>  create mode 100644 libitm/config/riscv/sjlj.S
>  create mode 100644 libitm/config/riscv/target.h
>
> diff --git a/libitm/config/riscv/asm.h b/libitm/config/riscv/asm.h
> new file mode 100644
> index 000..6ba5e2c
> --- /dev/null
> +++ b/libitm/config/riscv/asm.h
> @@ -0,0 +1,52 @@
> +/* Copyright (C) 2022 Free Software Foundation, Inc.
> +   Contributed by Xiongchuan Tan .
> +
> +   This file is part of the GNU Transactional Memory Library (libitm).
> +
> +   Libitm is free software; you can redistribute it and/or modify it
> +   under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; either version 3 of the License, or
> +   (at your option) any later version.
> +
> +   Libitm is distributed in the hope that it will be useful, but WITHOUT ANY
> +   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
> +   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> +   more details.
> +
> +   Under Section 7 of GPL version 3, you are granted additional
> +   permissions described in the GCC Runtime Library Exception, version
> +   3.1, as published by the Free Software Foundation.
> +
> +   You should have received a copy of the GNU General Public License and
> +   a copy of the GCC Runtime Library Exception along with this program;
> +   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
> +   .  */
> +
> +#ifndef _RV_ASM_H
> +#define _RV_ASM_H
> +
> +#if __riscv_xlen == 64
> +#  define GPR_L ld
> +#  define GPR_S sd
> +#  define SZ_GPR 8
> +#elif __riscv_xlen == 32
> +#  define GPR_L lw
> +#  define GPR_S sw
> +#  define SZ_GPR 4
> +#else
> +#  error Unsupported XLEN (must be 64-bit or 32-bit).
> +#endif
> +
> +#if defined(__riscv_flen) && __riscv_flen == 64
> +#  define FPR_L fld
> +#  define FPR_S fsd
> +#  define SZ_FPR 8
> +#elif defined(__riscv_flen) && __riscv_flen == 32
> +#  define FPR_L flw
> +#  define FPR_S fsw
> +#  define SZ_FPR 4
> +#else
> +#  define SZ_FPR 0
> +#endif
> +
> +#endif  /* _RV_ASM_H */
> diff --git a/libitm/config/riscv/sjlj.S b/libitm/config/riscv/sjlj.S
> new file mode 100644
> index 000..6f25cb5
> --- /dev/null
> +++ b/libitm/config/riscv/sjlj.S
> @@ -0,0 +1,144 @@
> +/* Copyright (C) 2022 Free Software Foundation, Inc.
> +   Contributed by Xiongchuan Tan .
> +
> +   This file is part of the GNU Transactional Memory Library (libitm).
> +
> +   Libitm is free software; you can redistribute it and/or modify it
> +   under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; either version 3 of the License, or
> +   (at your option) any later version.
> +
> +   Libitm is distributed in the hope that it will be useful, but WITHOUT ANY
> +   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
> +   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> +   more details.
> +
> +   Under Section 7 of GPL version 3, you are granted additional
> +   permissions described in the GCC Runtime Library Exception, version
> +   3.1, as published by the Free Software Foundation.
> +
> +   You should have received a copy of the GNU General Public License and
> +   a copy of the GCC Runtime Library Exception along with this program;
> +   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
> +   .  */
> +
> +#include "asmcfi.h"
> +#include "asm.h"
> +
> +   .text
> +   .align  2
> +   .global _ITM_beginTransaction
> +   .type   _ITM_beginTransaction, @function
> +
> +_ITM_beginTransaction:
> +   cfi_startproc
> +mv a1, sp
> +   addi sp, sp, -(14*SZ_GPR+12*SZ_FPR)
> +   cfi_adjust_cfa_offset(14*SZ_GPR+12*SZ_FPR)
> +
> +   /* Return Address */
> +   GPR_S ra, 0*SZ_GPR(sp)
> +   cfi_rel_offset(ra, 0*SZ_GPR)
> +
> +   /* Caller's sp */
> +   GPR_S a1, 1*SZ_GPR(sp)
> +
> +   /* Caller's s0/fp */
> +   GPR_S fp, 2*SZ_GPR(sp)
> +   cfi_rel_offset(fp, 2*SZ_GPR)
> +
> +   /* Callee-saved registers */
> +   GPR_S s1, 3*SZ_GPR(sp)
> +   

[PATCH] [x86] Fix incorrect digit constraint

2022-10-27 Thread liuhongt via Gcc-patches
Matching constraints are used in these circumstances. More precisely,
the two operands that match must include one input-only operand and
one output-only operand. Moreover, the digit must be a smaller number
than the number of the operand that uses it in the constraint.

In pr107057, the 2 operands in the pattern are both input operands.
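
The same rule can be seen from the user side with GNU extended asm, which
shares the constraint syntax.  The sketch below is only an analogy (it is
inline asm, not a machine description pattern), pass_through is a made-up
name, and it needs a compiler that accepts GNU extended asm.

```cpp
#include <cassert>

// Copy VALUE through a register using a matching constraint.
static int
pass_through (int value)
{
  int result;
  // "=r" makes operand 0 an output-only register operand.  "0" makes
  // operand 1 an input that must live in the same place as operand 0, and
  // the digit 0 is smaller than the number of the operand that uses it.
  // The template is empty, so the shared register is left untouched.
  __asm__ ("" : "=r" (result) : "0" (value));
  return result;
}
```

Here one side of the match is an output and the other an input, which is the
arrangement the constraint documentation asks for.  Tying input operand 2 to
input operand 1, as the old alternatives did, has no such pairing.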

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for trunk?

gcc/ChangeLog:

PR target/107057
* config/i386/sse.md (*vec_interleave_highv2df): Remove
constraint 1.
(*vec_interleave_lowv2df): Ditto.
(vec_concatv2df): Ditto.
* config/i386/i386.cc (ix86_vec_interleave_v2df_operator_ok):
Disallow MEM_P (op1) && MEM_P (op2).

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr107057.c: New test.
---
 gcc/config/i386/i386.cc  |  2 +-
 gcc/config/i386/sse.md   | 68 +++-
 gcc/testsuite/gcc.target/i386/pr107057.c | 19 +++
 3 files changed, 50 insertions(+), 39 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr107057.c

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index aeea26ef4be..e3b7bea0d68 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -15652,7 +15652,7 @@ ix86_vec_interleave_v2df_operator_ok (rtx operands[3], bool high)
   if (MEM_P (operands[0]))
 return rtx_equal_p (operands[0], operands[1 + high]);
   if (MEM_P (operands[1]) && MEM_P (operands[2]))
-return TARGET_SSE3 && rtx_equal_p (operands[1], operands[2]);
+return false;
   return true;
 }
 
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index f4b5506703f..e6fefe39ca2 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -12170,29 +12170,28 @@ (define_expand "vec_interleave_highv2df"
 })
 
 (define_insn "*vec_interleave_highv2df"
-  [(set (match_operand:V2DF 0 "nonimmediate_operand" "=x,v,v,x,v,m")
+  [(set (match_operand:V2DF 0 "nonimmediate_operand" "=x,v,x,v,m")
(vec_select:V2DF
  (vec_concat:V4DF
-   (match_operand:V2DF 1 "nonimmediate_operand" " 0,v,o,o,o,v")
-   (match_operand:V2DF 2 "nonimmediate_operand" " x,v,1,0,v,0"))
+   (match_operand:V2DF 1 "nonimmediate_operand" " 0,v,o,o,v")
+   (match_operand:V2DF 2 "nonimmediate_operand" " x,v,0,v,0"))
  (parallel [(const_int 1)
 (const_int 3)])))]
   "TARGET_SSE2 && ix86_vec_interleave_v2df_operator_ok (operands, 1)"
   "@
unpckhpd\t{%2, %0|%0, %2}
vunpckhpd\t{%2, %1, %0|%0, %1, %2}
-   %vmovddup\t{%H1, %0|%0, %H1}
movlpd\t{%H1, %0|%0, %H1}
vmovlpd\t{%H1, %2, %0|%0, %2, %H1}
%vmovhpd\t{%1, %0|%q0, %1}"
-  [(set_attr "isa" "noavx,avx,sse3,noavx,avx,*")
-   (set_attr "type" "sselog,sselog,sselog,ssemov,ssemov,ssemov")
+  [(set_attr "isa" "noavx,avx,noavx,avx,*")
+   (set_attr "type" "sselog,sselog,ssemov,ssemov,ssemov")
(set (attr "prefix_data16")
- (if_then_else (eq_attr "alternative" "3,5")
+ (if_then_else (eq_attr "alternative" "2,4")
   (const_string "1")
   (const_string "*")))
-   (set_attr "prefix" "orig,maybe_evex,maybe_vex,orig,maybe_evex,maybe_vex")
-   (set_attr "mode" "V2DF,V2DF,DF,V1DF,V1DF,V1DF")])
+   (set_attr "prefix" "orig,maybe_evex,orig,maybe_evex,maybe_vex")
+   (set_attr "mode" "V2DF,V2DF,V1DF,V1DF,V1DF")])
 
 (define_expand "avx512f_movddup512"
   [(set (match_operand:V8DF 0 "register_operand")
@@ -12332,29 +12331,28 @@ (define_expand "vec_interleave_lowv2df"
 })
 
 (define_insn "*vec_interleave_lowv2df"
-  [(set (match_operand:V2DF 0 "nonimmediate_operand" "=x,v,v,x,v,o")
+  [(set (match_operand:V2DF 0 "nonimmediate_operand" "=x,v,x,v,o")
(vec_select:V2DF
  (vec_concat:V4DF
-   (match_operand:V2DF 1 "nonimmediate_operand" " 0,v,m,0,v,0")
-   (match_operand:V2DF 2 "nonimmediate_operand" " x,v,1,m,m,v"))
+   (match_operand:V2DF 1 "nonimmediate_operand" " 0,v,0,v,0")
+   (match_operand:V2DF 2 "nonimmediate_operand" " x,v,m,m,v"))
  (parallel [(const_int 0)
 (const_int 2)])))]
   "TARGET_SSE2 && ix86_vec_interleave_v2df_operator_ok (operands, 0)"
   "@
unpcklpd\t{%2, %0|%0, %2}
vunpcklpd\t{%2, %1, %0|%0, %1, %2}
-   %vmovddup\t{%1, %0|%0, %q1}
movhpd\t{%2, %0|%0, %q2}
vmovhpd\t{%2, %1, %0|%0, %1, %q2}
%vmovlpd\t{%2, %H0|%H0, %2}"
-  [(set_attr "isa" "noavx,avx,sse3,noavx,avx,*")
-   (set_attr "type" "sselog,sselog,sselog,ssemov,ssemov,ssemov")
+  [(set_attr "isa" "noavx,avx,noavx,avx,*")
+   (set_attr "type" "sselog,sselog,ssemov,ssemov,ssemov")
(set (attr "prefix_data16")
- (if_then_else (eq_attr "alternative" "3,5")
+ (if_then_else (eq_attr "alternative" "2,4")
   (const_string "1")
   (const_string "*")))
-   (set_attr "prefix" "orig,maybe_evex,maybe_vex,orig,maybe_evex,maybe_vex")
-   (set_attr "mode" "V2DF,V2DF,DF,V1DF,V1DF,V1DF")])

[PATCH] RISC-V: Libitm add RISC-V support.

2022-10-27 Thread Xiongchuan Tan via Gcc-patches
libitm/ChangeLog:

* configure.tgt: Add riscv support.
* config/riscv/asm.h: New file.
* config/riscv/sjlj.S: New file.
* config/riscv/target.h: New file.
---
 libitm/config/riscv/asm.h|  52 +
 libitm/config/riscv/sjlj.S   | 144 +++
 libitm/config/riscv/target.h |  50 
 libitm/configure.tgt |   2 +
 4 files changed, 248 insertions(+)
 create mode 100644 libitm/config/riscv/asm.h
 create mode 100644 libitm/config/riscv/sjlj.S
 create mode 100644 libitm/config/riscv/target.h

diff --git a/libitm/config/riscv/asm.h b/libitm/config/riscv/asm.h
new file mode 100644
index 000..6ba5e2c
--- /dev/null
+++ b/libitm/config/riscv/asm.h
@@ -0,0 +1,52 @@
+/* Copyright (C) 2022 Free Software Foundation, Inc.
+   Contributed by Xiongchuan Tan .
+
+   This file is part of the GNU Transactional Memory Library (libitm).
+
+   Libitm is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   Libitm is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+   more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   .  */
+
+#ifndef _RV_ASM_H
+#define _RV_ASM_H
+
+#if __riscv_xlen == 64
+#  define GPR_L ld
+#  define GPR_S sd
+#  define SZ_GPR 8
+#elif __riscv_xlen == 32
+#  define GPR_L lw
+#  define GPR_S sw
+#  define SZ_GPR 4
+#else
+#  error Unsupported XLEN (must be 64-bit or 32-bit).
+#endif
+
+#if defined(__riscv_flen) && __riscv_flen == 64
+#  define FPR_L fld
+#  define FPR_S fsd
+#  define SZ_FPR 8
+#elif defined(__riscv_flen) && __riscv_flen == 32
+#  define FPR_L flw
+#  define FPR_S fsw
+#  define SZ_FPR 4
+#else
+#  define SZ_FPR 0
+#endif
+
+#endif  /* _RV_ASM_H */
diff --git a/libitm/config/riscv/sjlj.S b/libitm/config/riscv/sjlj.S
new file mode 100644
index 000..6f25cb5
--- /dev/null
+++ b/libitm/config/riscv/sjlj.S
@@ -0,0 +1,144 @@
+/* Copyright (C) 2022 Free Software Foundation, Inc.
+   Contributed by Xiongchuan Tan .
+
+   This file is part of the GNU Transactional Memory Library (libitm).
+
+   Libitm is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   Libitm is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+   more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   .  */
+
+#include "asmcfi.h"
+#include "asm.h"
+
+   .text
+   .align  2
+   .global _ITM_beginTransaction
+   .type   _ITM_beginTransaction, @function
+
+_ITM_beginTransaction:
+   cfi_startproc
+mv a1, sp
+   addi sp, sp, -(14*SZ_GPR+12*SZ_FPR)
+   cfi_adjust_cfa_offset(14*SZ_GPR+12*SZ_FPR)
+
+   /* Return Address */
+   GPR_S ra, 0*SZ_GPR(sp)
+   cfi_rel_offset(ra, 0*SZ_GPR)
+
+   /* Caller's sp */
+   GPR_S a1, 1*SZ_GPR(sp)
+
+   /* Caller's s0/fp */
+   GPR_S fp, 2*SZ_GPR(sp)
+   cfi_rel_offset(fp, 2*SZ_GPR)
+
+   /* Callee-saved registers */
+   GPR_S s1, 3*SZ_GPR(sp)
+   GPR_S s2, 4*SZ_GPR(sp)
+   GPR_S s3, 5*SZ_GPR(sp)
+   GPR_S s4, 6*SZ_GPR(sp)
+   GPR_S s5, 7*SZ_GPR(sp)
+   GPR_S s6, 8*SZ_GPR(sp)
+   GPR_S s7, 9*SZ_GPR(sp)
+   GPR_S s8, 10*SZ_GPR(sp)
+   GPR_S s9, 11*SZ_GPR(sp)
+   GPR_S s10, 12*SZ_GPR(sp)
+   GPR_S s11, 13*SZ_GPR(sp)
+
+#if defined(__riscv_flen)
+   /* Callee-saved floating-point registers */
+   FPR_S fs0, 0*SZ_FPR+14*SZ_GPR(sp)
+   FPR_S fs1, 1*SZ_FPR+14*SZ_GPR(sp)
+   FPR_S fs2, 2*SZ_FPR+14*SZ_GPR(sp)
+   FPR_S fs3, 3*SZ_FPR+14*SZ_GPR(sp)
+   FPR_S fs4, 4*SZ_FPR+14*SZ_GPR(sp)
+ 

Re: Ping [PATCH] Add condition coverage profiling

2022-10-27 Thread Martin Liška

On 10/25/22 08:33, Jørgen Kvalsvik wrote:

Gentle ping. I have tuned the summary output slightly (decisions covered ->
condition outcomes covered) already.


Sorry for a small delay, I'm working on it.

One general issue I noticed is that you use an invalid coding style, with 4 spaces
for each indentation level. Also see the indentation of '{' and '}':

diff --git a/gcc/tree-profile.cc b/gcc/tree-profile.cc
index 0b537d64d97..b661ed92045 100644
--- a/gcc/tree-profile.cc
+++ b/gcc/tree-profile.cc
@@ -553,11 +553,12 @@ masking_vectors (conds_ctx& ctx, array_slice blocks,
 body[0] = body.pop ();
 
 for (const basic_block b : body)

-{
+  {
for (edge e1 : b->preds)
-   for (edge e2 : b->preds)
-   {
-   const basic_block top = e1->src;
+ for (edge e2 : b->preds)
+   {
+ const basic_block top = e1->src;
+...
const basic_block bot = e2->src;
const unsigned cond = e1->flags & e2->flags & (EDGE_CONDITION);
 
Link: https://gcc.gnu.org/codingconventions.html


What editor do you use?

Cheers,
Martin


Re: [PATCH] c++: Fix ICE on g++.dg/modules/adl-3_c.C [PR107379]

2022-10-27 Thread Nathan Sidwell via Gcc-patches

On 10/27/22 04:17, Jakub Jelinek wrote:

Hi!

As mentioned in the PR, apparently my r13-2887 P1467R9 changes
regressed these tests on powerpc64le-linux with IEEE quad by default.

I believe my changes just uncovered a latent bug.
The problem is that push_namespace calls find_namespace_slot,
which does:
   tree *slot = DECL_NAMESPACE_BINDINGS (ns)
 ->find_slot_with_hash (name, name ? IDENTIFIER_HASH_VALUE (name) : 0,
create_p ? INSERT : NO_INSERT);
In the  ns case, slot is non-NULL
above with a binding_vector in it.
Then pushdecl is called and this does:
  slot = find_namespace_slot (ns, name, ns == current_namespace);
where ns == current_namespace (ns is :: and name is details) is true.
So this again calls
  tree *slot = DECL_NAMESPACE_BINDINGS (ns)
    ->find_slot_with_hash (name, name ? IDENTIFIER_HASH_VALUE (name) : 0,
                           create_p ? INSERT : NO_INSERT);
but this time with create_p and so INSERT.
At this point we reach
  if (insert == INSERT && m_size * 3 <= m_n_elements * 4)
expand ();
and when we are unlucky and the occupancy of the hash table just reached 3/4,
expand () is called and the hash table is reallocated.  But when that happens,
it means the slot pointer in the pushdecl caller (push_namespace) points to
freed memory and so any accesses to it in make_namespace_finish will be UB.


that's unfortunate, oh well.


The following patch fixes it by calling find_namespace_slot again even if it
was non-NULL, just doesn't assert it is *slot == ns in that case (because
it often is not).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?


ok. thanks

nathan
--
Nathan Sidwell



Re: [PATCH v3] Re: OpenMP: Generate SIMD clones for functions with "declare target"

2022-10-27 Thread Thomas Schwinge
Hi!

On 2022-10-26T20:27:19-0600, Sandra Loosemore  wrote:
> On 10/20/22 08:07, Jakub Jelinek wrote:
>> Thus, IMHO it is exactly the pass_omp_simd_clone pass where you want to
>> implement this auto-simdization discovery, guarded with
>> #ifdef ACCEL_COMPILER and the new option (which means it will be done
>> only for gcn and not on the host right now).
>
> I'm running into a practical difficulty with making this controlled by a
> static #ifdef: namely, testing.
>
> One of my test cases examines the .s output to make sure that the clones
> are emitted as local symbols and not global.  I have not been able to
> find the symbol linkage information in any of the dump files

Hmm, doesn't any of the '-fdump-ipa-all-details' output help here?

> and I have
> also not been able to figure out how to get a .s file from the offload
> compiler even outside of the DejaGnu test harness.  (It's possible I am
> just an extreme dummy about the latter problem, but so far none of my
> colleagues here has been able to give me a recipe either.)

Right, currently only 'scan-offload-tree-dump[...]',
'scan-offload-rtl-dump[...]' are implemented; I assume
'scan-offload-assembler[...]' could be added without too much effort.

> On top of that, I worry that this should be tested more broadly than for
> the one target we're presently focusing on (AMD GCN), and we'll get much
> more regular test coverage if it's also enabled for x86_64 target which
> has the necessary compute_vecsize_and_simdlen target hook.
>
> I remember Carlos O'Donnell used to have a favorite mantra, "design for
> test".

Heh, I don't remember him ever saying that to me -- but maybe that's
because this is what I do anyway.  ;-P

> So, maybe generalize the new -fopenmp-target-simd-clone option
> to take a parameter to force clones to be generated on the OpenMP host
> for test purposes?  The "declare target" directive already has a clause
>
> device_type(host|nohost|any)
>
> that defaults to "any"; maybe we could use that syntax like
> -fopenmp-target-simd-clone=any
> and use the intersection of the two sets to determine what to
> auto-generate clones for?

Seems reasonable to me (but I'm missing a lot of context here).
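
A minimal sketch of the "intersection of the two sets" idea, under loud
assumptions: the names DeviceType, intersect, clone_for_host and
clone_for_offload are hypothetical and do not exist in GCC, and the sketch
assumes that "any" simply means both "host" and "nohost".

```cpp
#include <cassert>

// Hypothetical bitmask encoding of device_type(host|nohost|any).
// "any" is assumed to be the union of "host" and "nohost".
enum DeviceType : unsigned {
  kNone = 0u,
  kHost = 1u << 0,
  kNohost = 1u << 1,
  kAny = kHost | kNohost
};

// Intersection of the set selected by the (hypothetical) option value
// and the set selected by the directive's device_type clause.
constexpr DeviceType intersect(DeviceType option, DeviceType clause) {
  return static_cast<DeviceType>(option & clause);
}

// Would clones be auto-generated when compiling for the host?
constexpr bool clone_for_host(DeviceType option, DeviceType clause) {
  return (intersect(option, clause) & kHost) != 0u;
}

// Would clones be auto-generated when compiling for an offload target?
constexpr bool clone_for_offload(DeviceType option, DeviceType clause) {
  return (intersect(option, clause) & kNohost) != 0u;
}
```

With this encoding, an option value of "any" combined with the default
clause value "any" would enable clones on both sides, which is what makes
host-side testing possible, while "nohost" on either side would keep the
host out.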


There anyway is a goal (far out) to get rid of compilation-time
'#ifdef ACCEL_COMPILER' etc., and instead make such code dependent on a
command-line flag (or some other state), so that it's possible to use the
same compiler for target (host) as well as offload target compilation.
(For example, to simulate offloading compilation with standard
x86_64-pc-linux-gnu GCC.)


And/or, where you implement the logic to "make sure that the clones
are emitted as local symbols and not global", do emit some "tag" in the
dump file, and then scan for that?

Random examples that I just remembered:

'gcc/omp-offload.cc:execute_oacc_loop_designation' handling of
'OMP_CLAUSE_NOHOST', and how that's scanned (host-side) in test cases
such as 'libgomp/testsuite/libgomp.oacc-c-c++-common/routine-nohost-1.c',
'libgomp/testsuite/libgomp.oacc-fortran/routine-nohost-1.f90'.

'gcc/config/nvptx/nvptx.cc:nvptx_find_sese' doing
'fprintf (dump_file, "SESE regions:"); [...]', and that's scanned in:

libgomp/testsuite/libgomp.oacc-c-c++-common/nvptx-sese-1.c-/* Match {N->N(.N)+} */
libgomp/testsuite/libgomp.oacc-c-c++-common/nvptx-sese-1.c:/* { dg-final { scan-offload-rtl-dump "SESE regions:.* \[0-9\]+{\[0-9\]+->\[0-9\]+(\\.\[0-9\]+)+}" "mach" } } */

(You'd be doing this at the 'scan-offload-tree-dump[...]' level, I
suppose.)


Grüße
 Thomas


Re: [PATCH] [x86_64] Zhaoxin lujiazui enablement

2022-10-27 Thread Martin Liška

On 10/27/22 11:09, Mayshao-oc wrote:
>>> Hi Martin:
>>> Thanks for your patch, I comment the questions below.
>>
>> Hi.
>>
>> :)
>>
>>>> Hello.
>>>>
>>>> I noticed this patch set which is kind of related to
>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107364.
>>>>
>>>> And I have a couple of questions:
>>>>
>>>> 1) I noticed you drop AVX and F16C features for the newly added
>>>> "lujiazui". Why do you need it?
>>>> I would expect these features would be properly detected by cpuid?
>>>
>>> Yes, these features could be detected by cpuid, and in respect of
>>> functionality, these features are ok, but in respect of performance, these
>>> features need further improvement, so we decide to drop it now, and add
>>> these features back when performance meet our expectation.
>>
>> I see. So theoretically you can increase costs of the corresponding insns
>> and that could be dropped now?
>> But I'm not a costing expert.

Hi.

One note: please try to send plain-text emails to GCC's mailing lists and not
HTML version. Thanks!

> I am new to GCC and have a lot to learn. About LTO and PGO, I have read some
> of the material you and hubicka shared, and it helps me a lot. Since this is
> a performance issue, using the cost model is a good idea, and disabling AVX
> entirely seems like overkill. But the cost model requires setting appropriate
> cost values; it is challenging to choose the numbers and even more
> challenging to justify them. Our current approach has a pitfall with AVX
> intrinsic functions (e.g. _mm256_loadu_pd); users can pass -mavx explicitly
> to overcome this.

Sure, makes sense.

Martin

>>>> 2) If you really need it, can you please test for me the attached patch?
>>>> It should come up with a new function.
>>>
>>> I have tested the patch, It's ok.
>>
>> Good, I'm going to install it.
>>
>>>> 3) Have question about:
>>>>
>>>> else if (vendor == signature_CENTAUR_ebx && family < 0x07)
>>>>   cpu_model->__cpu_vendor = VENDOR_CENTAUR;
>>>> else if (vendor == signature_SHANGHAI_ebx
>>>>          || vendor == signature_CENTAUR_ebx)
>>>>
>>>> Are there any signature_CENTAUR_ebx models with family == 0x7 ?
>>>> Similarly, are there any signature_SHANGHAI_ebx models with family < 0x7 ?
>>>
>>> Yes, both cases exist in our products.
>>
>> Good. Then we miss a CPU features detection for
>> (vendor == signature_CENTAUR_ebx && family < 0x07)
>> aka https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107364. But it's not worth
>> it as it's a legacy hardware, right?
>
> Yes, for legacy hardware, we need to keep it work correctly, but in respect of
> performance, we don't spend a lot of time to tune.
>
>> Cheers,
>> Martin
>>
>>>> Thanks,
>>>> Martin
>>>
>>> BR
>>> Mayshao





Re: [PATCH] [x86_64] Zhaoxin lujiazui enablement

2022-10-27 Thread Mayshao-oc




>>
>> Hi Martin:
>> Thanks for your patch,  I comment the questions below.

>Hi.

>:)

>>
>>> Hello.
>>
>>> I noticed this patch set which is kind of related to 
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107364.
>>
>>> And I have a couple of questions:
>>
>>>1) I noticed you drop AVX and F16C features for the newly added "lujiazui". 
>>>Why do you need it?
>>>  I would expect these features would be properly detected by cpuid?
>>
>> Yes, these features could be detected by cpuid, and in respect of 
>> functionality, these features are ok, but in respect of performance, these 
>> features need further improvement, so we decide to drop it now, and add 
>> these features back when performance meet our expectation.

> I see. So theoretically you can increase costs of the corresponding insns and 
> that could be dropped now?
> But I'm not a costing expert.

I am new to GCC and have a lot to learn. About LTO and PGO, I have read some
of the material you and hubicka shared, and it helps me a lot. Since this is
a performance issue, using the cost model is a good idea, and disabling AVX
entirely seems like overkill. But the cost model requires setting appropriate
cost values; it is challenging to choose the numbers and even more
challenging to justify them. Our current approach has a pitfall with AVX
intrinsic functions (e.g. _mm256_loadu_pd); users can pass -mavx explicitly
to overcome this.

>>
>>> 2) If you really need it, can you please test for me the attached patch? It 
>>> should come up
>>>  with a new function.
>>
>> I have tested the patch, It's ok.

> Good, I'm going to install it.

>>
>>> 3) Have question about:
>>
>>> else if (vendor == signature_CENTAUR_ebx && family < 0x07)
>>>cpu_model->__cpu_vendor = VENDOR_CENTAUR;
>>> else if (vendor == signature_SHANGHAI_ebx
>>>   || vendor == signature_CENTAUR_ebx)
>>
>>> Are there any signature_CENTAUR_ebx models with family == 0x7 ?
>>> Similarly, are there any signature_SHANGHAI_ebx models with family < 0x7 ?
>>
>> Yes, both cases exist in our products.

> Good. Then we miss a CPU features detection for (vendor == 
> signature_CENTAUR_ebx && family < 0x07)
> aka https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107364. But it's not worth 
> it as it's a legacy hardware,
> right?

Yes, for legacy hardware, we need to keep it work correctly, but in respect of 
performance, we don't spend a lot of time to tune.

> Cheers,
> Martin

>>
>>> Thanks,
>> Martin
>>
>> BR
>> Mayshao



[PATCH] lto-dump: modernize a bit

2022-10-27 Thread Martin Liška

Hi.

Ready to be installed?
Thanks,
Martin

gcc/lto/ChangeLog:

* lto-dump.cc (dump_list): Remove trailing return.
(dump_symbol): Likewise.
(dump_body): Filter name based on mangled name.
(dump_tool_help): Use GIMPLE wording.
(lto_main): Update wording.
---
 gcc/lto/lto-dump.cc | 19 +++
 1 file changed, 7 insertions(+), 12 deletions(-)

diff --git a/gcc/lto/lto-dump.cc b/gcc/lto/lto-dump.cc
index cb9782722a9..5c4dbf5d297 100644
--- a/gcc/lto/lto-dump.cc
+++ b/gcc/lto/lto-dump.cc
@@ -227,7 +227,6 @@ void dump_list (void)
 {
   dump_list_functions ();
   dump_list_variables ();
-  return;
 }
 
 /* Dump specific variables and functions used in IL.  */

@@ -243,7 +242,6 @@ void dump_symbol ()
  printf ("\n");
}
 }
-  return;
 }
 
 /* Dump specific gimple body of specified function.  */

@@ -259,19 +257,17 @@ void dump_body ()
 return;
   }
   cgraph_node *cnode;
-  FOR_EACH_FUNCTION (cnode)
-if (cnode->definition
-   && !cnode->alias
-   && !strcmp (cnode->name (), flag_dump_body))
+  FOR_EACH_DEFINED_FUNCTION (cnode)
+if (!cnode->alias
+   && !strcmp (cnode->asm_name (), flag_dump_body))
   {
-   printf ("Gimple Body of Function: %s\n", cnode->name ());
+   printf ("GIMPLE body of function: %s\n\n", cnode->asm_name ());
cnode->get_untransformed_body ();
debug_function (cnode->decl, flags);
flag = 1;
   }
   if (!flag)
 error_at (input_location, "Function not found.");
-  return;
 }
 
 /* List of command line options for dumping.  */

@@ -292,13 +288,12 @@ void dump_tool_help ()
 "  -callgraphDump the callgraph in graphviz format.\n"
 "  -type-stats   Dump statistics of tree types.\n"
 "  -tree-stats   Dump statistics of trees.\n"
-"  -gimple-stats Dump statistics of gimple statements.\n"
-"  -dump-body=   Dump the specific gimple body.\n"
+"  -gimple-stats Dump statistics of GIMPLE statements.\n"
+"  -dump-body=   Dump the specific GIMPLE body.\n"
 "  -dump-level=  Deciding the optimization level of body.\n"
 "  -help Display the dump tool help.\n";
 
   fputs (msg, stdout);

-  return;
 }
 
 unsigned int

@@ -365,7 +360,7 @@ lto_main (void)
"%<--enable-gather-detailed-mem-stats%>.");
   else
{
- printf ("Tree Statistics\n");
+ printf ("Tree statistics\n");
  dump_tree_statistics ();
}
 }
--
2.38.0



[PATCH (pushed)] lto: do not load LTO stream for aliases [PR107418]

2022-10-27 Thread Martin Liška

It's similar to the condition we use in lto-dump.

Pushed as obvious.

Martin

PR lto/107418

gcc/lto/ChangeLog:

* lto-dump.cc (lto_main): Do not load LTO stream for aliases.
---
 gcc/lto/lto-dump.cc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/lto/lto-dump.cc b/gcc/lto/lto-dump.cc
index f3d852df51f..cb9782722a9 100644
--- a/gcc/lto/lto-dump.cc
+++ b/gcc/lto/lto-dump.cc
@@ -347,7 +347,8 @@ lto_main (void)
   /* Dump gimple statement statistics.  */
   cgraph_node *node;
   FOR_EACH_DEFINED_FUNCTION (node)
-   node->get_untransformed_body ();
+   if (!node->alias)
+ node->get_untransformed_body ();
   if (!GATHER_STATISTICS)
warning_at (input_location, 0,
"Not configured with "
--
2.38.0



[PATCH] c++: Fix ICE on g++.dg/modules/adl-3_c.C [PR107379]

2022-10-27 Thread Jakub Jelinek via Gcc-patches
Hi!

As mentioned in the PR, apparently my r13-2887 P1467R9 changes
regressed these tests on powerpc64le-linux with IEEE quad by default.

I believe my changes just uncovered a latent bug.
The problem is that push_namespace calls find_namespace_slot,
which does:
  tree *slot = DECL_NAMESPACE_BINDINGS (ns)
->find_slot_with_hash (name, name ? IDENTIFIER_HASH_VALUE (name) : 0,
   create_p ? INSERT : NO_INSERT);
In the  ns case, slot is non-NULL
above with a binding_vector in it.
Then pushdecl is called and this does:
  slot = find_namespace_slot (ns, name, ns == current_namespace);
where ns == current_namespace (ns is :: and name is details) is true.
So this again calls
  tree *slot = DECL_NAMESPACE_BINDINGS (ns)
    ->find_slot_with_hash (name, name ? IDENTIFIER_HASH_VALUE (name) : 0,
                           create_p ? INSERT : NO_INSERT);
but this time with create_p and so INSERT.
At this point we reach
  if (insert == INSERT && m_size * 3 <= m_n_elements * 4)
expand ();
and when we are unlucky and the occupancy of the hash table just reached 3/4,
expand () is called and the hash table is reallocated.  But when that happens,
it means the slot pointer in the pushdecl caller (push_namespace) points to
freed memory and so any accesses to it in make_namespace_finish will be UB.

The following patch fixes it by calling find_namespace_slot again even if it
was non-NULL, just doesn't assert it is *slot == ns in that case (because
it often is not).
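
The failure mode can be sketched outside of GCC with a toy open-addressing
table.  This is only an illustration under stated assumptions: ToyTable,
probe, expand and relookup_after_expand are hypothetical names, the table is
not GCC's hash_table, and only the "expand once occupancy reaches 3/4 on an
INSERT lookup" condition is taken from the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Toy open-addressing string set (hypothetical, not GCC's hash_table).
// An empty string marks a free slot, so keys must be non-empty.
class ToyTable {
public:
  enum Insert { NO_INSERT, INSERT };

  ToyTable() : slots_(4), used_(0) {}

  // Returns a pointer to KEY's slot, or nullptr if absent with NO_INSERT.
  // With INSERT the storage may be reallocated first, which invalidates
  // every slot pointer handed out earlier.
  std::string *find_slot(const std::string &key, Insert insert) {
    if (insert == INSERT && slots_.size() * 3 <= used_ * 4)
      expand();
    std::size_t i = probe(key);
    if (slots_[i].empty()) {
      if (insert == NO_INSERT)
        return nullptr;
      slots_[i] = key;
      ++used_;
    }
    return &slots_[i];
  }

  std::size_t capacity() const { return slots_.size(); }

private:
  // Linear probing: index of KEY's slot or of the first free slot.
  std::size_t probe(const std::string &key) const {
    std::size_t i = std::hash<std::string>{}(key) % slots_.size();
    while (!slots_[i].empty() && slots_[i] != key)
      i = (i + 1) % slots_.size();
    return i;
  }

  // Double the storage and rehash; the old storage is freed on return.
  void expand() {
    std::vector<std::string> old;
    old.swap(slots_);
    slots_.assign(old.size() * 2, std::string());
    for (const std::string &k : old)
      if (!k.empty())
        slots_[probe(k)] = k;
  }

  std::vector<std::string> slots_;
  std::size_t used_;
};

// Mirrors the fixed control flow: after a call that may insert, look the
// slot up again instead of reusing the pointer obtained before.
bool relookup_after_expand() {
  ToyTable t;
  t.find_slot("a", ToyTable::INSERT);
  t.find_slot("b", ToyTable::INSERT);
  std::string *slot = t.find_slot("c", ToyTable::INSERT);  // occupancy is now 3/4
  std::size_t before = t.capacity();
  t.find_slot("details", ToyTable::INSERT);  // triggers expand()
  bool grew = t.capacity() > before;
  // The old value of 'slot' points into freed storage and must not be used.
  slot = t.find_slot("c", ToyTable::NO_INSERT);
  return grew && slot != nullptr && *slot == "c";
}
```

The point of the sketch is only the last three lines of
relookup_after_expand: the pointer obtained before the inserting call is
discarded and fetched again, which is the shape of the fix described above.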

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2022-10-27  Jakub Jelinek  

PR c++/107379
* name-lookup.cc (push_namespace): Call find_namespace_slot again
after pushdecl as the hash table might be expanded during pushdecl.

--- gcc/cp/name-lookup.cc.jj   2022-10-12 17:51:00.912944731 +0200
+++ gcc/cp/name-lookup.cc   2022-10-26 12:06:38.177590655 +0200
@@ -8596,6 +8596,13 @@ push_namespace (tree name, bool make_inl
  /* This should find the slot created by pushdecl.  */
  gcc_checking_assert (slot && *slot == ns);
}
+ else
+   {
+ /* pushdecl could have expanded the hash table, so
+slot might be invalid.  */
+ slot = find_namespace_slot (current_namespace, name);
+ gcc_checking_assert (slot);
+   }
  make_namespace_finish (ns, slot);
 
  /* Add the anon using-directive here, we don't do it in

Jakub



Re: [PATCH Rust front-end v3 40/46] gccrs: Add GCC Rust front-end Make-lang.in

2022-10-27 Thread Arthur Cohen

(...snip...)


+RUST_SELFTEST_FLAGS = -xrs $(SELFTEST_FLAGS)


I've noticed that this patch contains a typo which prevents self-tests 
from running properly. This should be `-xrust`, not `-xrs`. I assume 
there will be some other review comments, so that will be fixed in a v4 
of the patches.


Sorry about the annoyance.

(...snip...)

Kindly,

--
Arthur Cohen 

Toolchain Engineer

Embecosm GmbH

Geschäftsführer: Jeremy Bennett
Niederlassung: Nürnberg
Handelsregister: HR-B 36368
www.embecosm.de

Fürther Str. 27
90429 Nürnberg


Tel.: 091 - 128 707 040
Fax: 091 - 128 707 077




Re: [PATCH Rust front-end v3 35/46] gccrs: Add metadata ouptput pass

2022-10-27 Thread Arthur Cohen

On 10/26/22 23:04, David Malcolm wrote:

On Wed, 2022-10-26 at 10:18 +0200, arthur.co...@embecosm.com wrote:

From: Philip Herron 

Extern crates statements to tell the front-end to look for another
library.  The mechanism here is heavily inspired from gccgo, so when we
compile a library for example we invoke:



[...snip...]


+  rust_error_at (Location (),
+    "expected metadata-output path to have base file name of: "
+    "%<%s%> got %<%s%>",
+    expected_file_name.c_str (), path_base_name);


I can't comment on the patch in depth, but does rust_error_at call into
GCC's regular diagnostics?

If so, "%qs" is a more idiomatic way to express printing a string
argument in quotes (and bold), rather than "%<%s%>", though IIRC they
do the same thing (unless I'm missing something?).


I also believe that they do the same thing. We have some %<%s%> 
left-over from previous, more complex format strings, so good catch and 
thank you for noticing. I'll fix them up.



This shows up in a few places in this patch, and might affect other
patches in the kit - though it's a minor nitpick, of course.

Dave





[PATCH] libstdc++: std::to_chars std::{,b}float16_t support

2022-10-27 Thread Jakub Jelinek via Gcc-patches
Hi!

The following patch on top of
https://gcc.gnu.org/pipermail/libstdc++/2022-October/054849.html
adds std::{,b}float16_t support for std::to_chars.
When precision is specified (or for std::bfloat16_t for hex mode even if not),
I believe we can just use the std::to_chars float (when float is mode
compatible with std::float32_t) overloads, both formats are proper subsets
of std::float32_t.
Unfortunately when precision is not specified and we are supposed to emit
shortest string, the std::{,b}float16_t strings are usually much shorter.
E.g. 0x1.e7p-14f16 shortest fixed representation is
0.0001161 and shortest scientific representation is
1.161e-04 while 0x1.e7p-14f32 (same number promoted to std::float32_t)
0.00011610985 and
1.1610985e-04.
Similarly for 0x1.38p-112bf16,
0.000000000000000000000000000000000235
2.35e-34 vs. 0x1.38p-112f32
0.00000000000000000000000000000000023472271
2.3472271e-34
For std::float16_t there are differences even in the shortest hex, say:
0.01p-14 vs. 1p-22
but only for denormal std::float16_t values (where all std::float16_t
denormals converted to std::float32_t are normal), __FLT16_MIN__ and
everything larger in absolute value than that is the same.  Unless
that is a bug and we should try to discover shorter representations
even for denormals...
std::bfloat16_t has the same exponent range as std::float32_t, so all
std::bfloat16_t denormals are also std::float32_t denormals and thus
the shortest hex representations are the same.
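
The same effect can be observed one level up with types that already have
std::to_chars overloads.  The sketch below is an analogy, not part of the
patch: it uses float and double instead of std::float16_t and
std::float32_t, the helper name shortest is hypothetical, and it assumes a
standard library with floating-point std::to_chars (for libstdc++ that
means GCC 11 or later).

```cpp
#include <cassert>
#include <charconv>
#include <string>
#include <system_error>

// Shortest round-trip decimal string for VALUE in its own type, using the
// std::to_chars overload without format or precision arguments.
template <typename T>
std::string shortest(T value) {
  char buf[64];
  std::to_chars_result r = std::to_chars(buf, buf + sizeof buf, value);
  assert(r.ec == std::errc());
  return std::string(buf, r.ptr);
}

// Shortest string of the same value after promotion to double.  The
// promoted value is identical, but more digits are needed to tell it apart
// from neighbouring doubles, so the string gets longer.
inline std::string shortest_promoted(float value) {
  return shortest(static_cast<double>(value));
}
```

For example, shortest(0.1f) yields "0.1", while shortest_promoted(0.1f)
yields "0.10000000149011612".  This is why simply forwarding the 16-bit
formats to the float overload is not enough when no precision is given.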

As documented, ryu can handle arbitrary IEEE like floating point formats
(probably not wider than IEEE quad) using the generic_128 handling, but
ryu is hidden in libstdc++.so.  As only few architectures support
std::float16_t right now and some of them have special ISA requirements
for those (e.g. on i?86 one needs -msse2) and std::bfloat16_t is right
now supported only on x86 (again with -msse2), perhaps with aarch64/arm
coming next if ARM is interested, but I think it is possible that more
will be added later, instead of exporting APIs from the library to handle
directly the std::{,b}float16_t overloads this patch instead exports
functions which take a float which is a superset of those and expects
the inline overloads to promote the 16-bit formats to 32-bit, then inside
of the library it ensures they are printed right.
The new functions carry the [[gnu::cold]] attribute, because I think most
users will primarily use these formats as storage formats, perform arithmetic
in excess precision and also print as std::float32_t.  With that, the
added support doesn't seem to be too large, on x86_64:
readelf -Ws libstdc++.so.6.0.31 | grep float16_t
   912: 000ae824   950 FUNC    GLOBAL DEFAULT   13 _ZSt21__to_chars_bfloat16_tPcS_fSt12chars_format@@GLIBCXX_3.4.31
  5767: 000ae4a1   899 FUNC    GLOBAL DEFAULT   13 _ZSt20__to_chars_float16_tPcS_fSt12chars_format@@GLIBCXX_3.4.31
   842: 0016d430   106 FUNC    LOCAL  DEFAULT   13 _ZN12_GLOBAL__N_113get_ieee_reprINS_23floating_type_float16_tEEENS_6ieee_tIT_EES3_
   865: 00170980  1613 FUNC    LOCAL  DEFAULT   13 _ZSt23__floating_to_chars_hexIN12_GLOBAL__N_123floating_type_float16_tEESt15to_chars_resultPcS3_T_St8optionalIiE.constprop.0.isra.0
  7205: 000ae824   950 FUNC    GLOBAL DEFAULT   13 _ZSt21__to_chars_bfloat16_tPcS_fSt12chars_format
  7985: 000ae4a1   899 FUNC    GLOBAL DEFAULT   13 _ZSt20__to_chars_float16_tPcS_fSt12chars_format
so 3568 code bytes together or so.

Tested with the attached test, which doesn't prove the shortest
representation; it just prints std::{,b}float16_t and std::float32_t
shortest strings side by side, then tries to verify each can be
emitted into the exact sized range and can't be into a range
one smaller than that, and tries to read what is printed
back using the from_chars float32_t overload (so there could be
double rounding, but apparently there is none for the shortest strings).
The only differences printed are for NaNs, where sNaNs are canonicalized
to canonical qNaNs and as to_chars doesn't print NaN mantissa, even qNaNs
other than the canonical one are read back just as the canonical NaN.

Also attaching what Patrick wrote to generate the pow10_adjustment_tab,
for std::float16_t only 1.0, 10.0, 100.0, 1000.0 and 10000.0 are powers
of 10 in the range because __FLT16_MAX__ is 65504.0, and all of the above
are exactly representable in std::float16_t, so we want to use 0 in
pow10_adjustment_tab.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2022-10-27  Jakub Jelinek  

* include/std/charconv (__to_chars_float16_t, __to_chars_bfloat16_t):
Declare.
(to_chars): Add _Float16 and __gnu_cxx::__bfloat16_t overloads.
* config/abi/pre/gnu.ver (GLIBCXX_3.4.31): Export
_ZSt20__to_chars_float16_tPcS_fSt12chars_format and
_ZSt21__to_chars_bfloat16_tPcS_fSt12chars_format.
* src/c++17/floating_to_chars.cc (floating_type_float16_t,
floating_type_bfloat16_t): New types.

Re: [PATCH] IRA: Make sure array is big enough

2022-10-27 Thread Torbjorn SVENSSON via Gcc-patches




On 2022-10-26 22:26, Vladimir Makarov wrote:


On 2022-10-25 06:01, Torbjörn SVENSSON wrote:

In commit 081c96621da, the call to resize_reg_info() was moved before
the call to remove_scratches() and the latter one can increase the
number of regs and that would cause an out of bounds usage on the
reg_renumber global array.
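
The ordering problem can be sketched with a toy model.  Everything here is
an assumption for illustration: ToyRegState and its members are
hypothetical and only borrow the names resize_reg_info and max_reg_num from
the description above; this is not the real IRA data structure.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the allocator state; not GCC's real data.
struct ToyRegState {
  int max_reg_num = 8;        // number of registers known so far
  std::vector<int> reg_info;  // side array, one entry per register

  // Grow the side array so it has one entry per register.
  void resize_reg_info() {
    reg_info.resize(static_cast<std::size_t>(max_reg_num), -1);
  }

  // Stand-in for a step that creates new registers.
  void create_regs(int n) { max_reg_num += n; }

  // True when every register number is a valid index into reg_info.
  bool covers_all_regs() const {
    return reg_info.size() >= static_cast<std::size_t>(max_reg_num);
  }
};

// Resizing only before new registers are created leaves the array short.
inline ToyRegState resize_before_only() {
  ToyRegState s;
  s.resize_reg_info();
  s.create_regs(3);
  return s;
}

// Resizing again after the register count grew restores the invariant.
inline ToyRegState resize_after_too() {
  ToyRegState s;
  s.resize_reg_info();
  s.create_regs(3);
  s.resize_reg_info();
  return s;
}
```

In the toy model, resize_before_only() ends with an array of 8 entries for
11 registers, so indexing by the newest register numbers would be out of
bounds; resize_after_too() ends with 11 entries.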

Without this patch, the following testcase randomly fails with:
during RTL pass: ira
In file included from /src/gcc/testsuite/gcc.dg/compat//struct-by-value-5b_y.c:13:
/src/gcc/testsuite/gcc.dg/compat//struct-by-value-5b_y.c: In function 'checkgSf13':
/src/gcc/testsuite/gcc.dg/compat//fp-struct-test-by-value-y.h:28:1: internal compiler error: Segmentation fault
/src/gcc/testsuite/gcc.dg/compat//struct-by-value-5b_y.c:22:1: note: in expansion of macro 'TEST'


gcc/ChangeLog:

* ira.c: Resize array after reg number increased.


The patch is ok to commit it into gcc-11,12 branches and master.


Thank you for the review!
Pushed to gcc-11, gcc-12 and master.



Thank you for fixing this.


Co-Authored-By: Yvan ROUX 
Signed-off-by: Torbjörn SVENSSON 
---
  gcc/ira.cc | 1 +
  1 file changed, 1 insertion(+)

diff --git a/gcc/ira.cc b/gcc/ira.cc
index 42c9cead9f8..d28a67b2546 100644
--- a/gcc/ira.cc
+++ b/gcc/ira.cc
@@ -5718,6 +5718,7 @@ ira (FILE *f)
  regstat_free_ri ();
  regstat_init_n_sets_and_refs ();
  regstat_compute_ri ();
+    resize_reg_info ();
    };
    int max_regno_before_rm = max_reg_num ();




Re: [PATCH] testsuite: Adjust vect-bitfield-read-* with vect_shift and vect_long_long [PR107240]

2022-10-27 Thread Richard Biener via Gcc-patches



> On 27.10.2022, at 09:10, Kewen.Lin wrote:
> 
> Hi,
> 
> The test cases vect-bitfield-read-* require vector shift
> target support, so they need an explicit vect_shift effective
> target requirement.  Besides, the vectype for the struct
> in test cases vect-bitfield-read-{2,4} is a vector of long long,
> so they need to check effective target vect_long_long.
> This patch helps to fix the remaining vect-bitfield-* test
> failures on powerpc.
> 
> Tested on powerpc64-linux-gnu P7 and P8, as well as
> powerpc64le-linux-gnu P9 and P10.
> 
> Is it ok for trunk?

Ok

Thanks, Richard 

> BR,
> Kewen
> -
>PR testsuite/107240
> 
> gcc/testsuite/ChangeLog:
> 
>* gcc.dg/vect/vect-bitfield-read-1.c: Add effective target checking
>vect_shift.
>* gcc.dg/vect/vect-bitfield-read-3.c: Likewise.
>* gcc.dg/vect/vect-bitfield-read-5.c: Likewise.
>* gcc.dg/vect/vect-bitfield-read-6.c: Likewise.
>* gcc.dg/vect/vect-bitfield-read-7.c: Likewise.
>* gcc.dg/vect/vect-bitfield-read-2.c: Add effective target checking
>vect_shift and replace vect_int with vect_long_long.
>* gcc.dg/vect/vect-bitfield-read-4.c: Likewise.
> ---
> gcc/testsuite/gcc.dg/vect/vect-bitfield-read-1.c | 1 +
> gcc/testsuite/gcc.dg/vect/vect-bitfield-read-2.c | 3 ++-
> gcc/testsuite/gcc.dg/vect/vect-bitfield-read-3.c | 1 +
> gcc/testsuite/gcc.dg/vect/vect-bitfield-read-4.c | 3 ++-
> gcc/testsuite/gcc.dg/vect/vect-bitfield-read-5.c | 1 +
> gcc/testsuite/gcc.dg/vect/vect-bitfield-read-6.c | 1 +
> gcc/testsuite/gcc.dg/vect/vect-bitfield-read-7.c | 1 +
> 7 files changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-1.c b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-1.c
> index 01cf34fb444..42e50d9f0c8 100644
> --- a/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-1.c
> +++ b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-1.c
> @@ -1,4 +1,5 @@
> /* { dg-require-effective-target vect_int } */
> +/* { dg-require-effective-target vect_shift } */
> 
> #include 
> #include "tree-vect.h"
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-2.c b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-2.c
> index 1a4a1579c14..a9aeefcd72c 100644
> --- a/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-2.c
> +++ b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-2.c
> @@ -1,4 +1,5 @@
> -/* { dg-require-effective-target vect_int } */
> +/* { dg-require-effective-target vect_shift } */
> +/* { dg-require-effective-target vect_long_long } */
> 
> #include 
> #include "tree-vect.h"
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-3.c b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-3.c
> index 849f4a017e1..c7d0fd26bad 100644
> --- a/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-3.c
> +++ b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-3.c
> @@ -1,4 +1,5 @@
> /* { dg-require-effective-target vect_int } */
> +/* { dg-require-effective-target vect_shift } */
> 
> #include 
> #include "tree-vect.h"
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-4.c b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-4.c
> index 5bc9c412e96..6a3ed8c0c6f 100644
> --- a/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-4.c
> +++ b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-4.c
> @@ -1,4 +1,5 @@
> -/* { dg-require-effective-target vect_int } */
> +/* { dg-require-effective-target vect_shift } */
> +/* { dg-require-effective-target vect_long_long } */
> 
> #include 
> #include "tree-vect.h"
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-5.c b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-5.c
> index 1dc24d3eded..b2889df8a0a 100644
> --- a/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-5.c
> +++ b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-5.c
> @@ -1,4 +1,5 @@
> /* { dg-require-effective-target vect_int } */
> +/* { dg-require-effective-target vect_shift } */
> 
> #include 
> #include "tree-vect.h"
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-6.c b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-6.c
> index 7d24c299758..2445f531be2 100644
> --- a/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-6.c
> +++ b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-6.c
> @@ -1,4 +1,5 @@
> /* { dg-require-effective-target vect_int } */
> +/* { dg-require-effective-target vect_shift } */
> 
> #include 
> #include "tree-vect.h"
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-7.c b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-7.c
> index 3b505db2bd3..4b1ec8a6dab 100644
> --- a/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-7.c
> +++ b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-7.c
> @@ -1,4 +1,5 @@
> /* { dg-require-effective-target vect_int } */
> +/* { dg-require-effective-target vect_shift } */
> 
> #include 
> #include "tree-vect.h"
> --
> 2.27.0


[PATCH] testsuite: Adjust vect-bitfield-read-* with vect_shift and vect_long_long [PR107240]

2022-10-27 Thread Kewen.Lin via Gcc-patches
Hi,

The test cases vect-bitfield-read-* require vector shift
target support, so each of them needs an explicit vect_shift
effective target check.  In addition, the vectype for the struct
in test cases vect-bitfield-read-{2,4} is a vector of long long,
so those two also need to check the effective target
vect_long_long.  This patch helps to fix the remaining
vect-bitfield-* test failures on powerpc.
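
To illustrate why these tests depend on vector shifts: a bit-field read
is commonly lowered to a load of the containing word followed by a shift
and a mask.  The C sketch below shows the scalar form of that sequence.
The name extract_field and its parameters are hypothetical and only for
illustration; this is not code from the tests or from the vectorizer,
and the exact lowering GCC uses may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: read a WIDTH-bit unsigned field that starts at
   bit POS of a 64-bit container.  The shift is the operation that needs
   vect_shift once the loop is vectorized; the 64-bit container is why a
   long long vector type comes into play.  */
static uint64_t
extract_field (uint64_t container, unsigned pos, unsigned width)
{
  assert (pos < 64);  /* Shifting by 64 or more is undefined in C.  */

  /* Build a mask of WIDTH low bits without shifting by 64.  */
  uint64_t mask = width >= 64 ? ~UINT64_C (0)
			      : (UINT64_C (1) << width) - 1;

  /* Move the field down to bit 0, then drop the bits above it.  */
  return (container >> pos) & mask;
}
```

For example, extract_field (0xABCD, 4, 8) shifts the value right by four
bits and keeps the low eight, giving 0xBC.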

Tested on powerpc64-linux-gnu P7 and P8, as well as
powerpc64le-linux-gnu P9 and P10.

Is it ok for trunk?

BR,
Kewen
-
PR testsuite/107240

gcc/testsuite/ChangeLog:

* gcc.dg/vect/vect-bitfield-read-1.c: Add effective target checking
vect_shift.
* gcc.dg/vect/vect-bitfield-read-3.c: Likewise.
* gcc.dg/vect/vect-bitfield-read-5.c: Likewise.
* gcc.dg/vect/vect-bitfield-read-6.c: Likewise.
* gcc.dg/vect/vect-bitfield-read-7.c: Likewise.
* gcc.dg/vect/vect-bitfield-read-2.c: Add effective target checking
vect_shift and replace vect_int with vect_long_long.
* gcc.dg/vect/vect-bitfield-read-4.c: Likewise.
---
 gcc/testsuite/gcc.dg/vect/vect-bitfield-read-1.c | 1 +
 gcc/testsuite/gcc.dg/vect/vect-bitfield-read-2.c | 3 ++-
 gcc/testsuite/gcc.dg/vect/vect-bitfield-read-3.c | 1 +
 gcc/testsuite/gcc.dg/vect/vect-bitfield-read-4.c | 3 ++-
 gcc/testsuite/gcc.dg/vect/vect-bitfield-read-5.c | 1 +
 gcc/testsuite/gcc.dg/vect/vect-bitfield-read-6.c | 1 +
 gcc/testsuite/gcc.dg/vect/vect-bitfield-read-7.c | 1 +
 7 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-1.c b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-1.c
index 01cf34fb444..42e50d9f0c8 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-1.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-1.c
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target vect_shift } */

 #include 
 #include "tree-vect.h"
diff --git a/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-2.c b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-2.c
index 1a4a1579c14..a9aeefcd72c 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-2.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-2.c
@@ -1,4 +1,5 @@
-/* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target vect_shift } */
+/* { dg-require-effective-target vect_long_long } */

 #include 
 #include "tree-vect.h"
diff --git a/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-3.c b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-3.c
index 849f4a017e1..c7d0fd26bad 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-3.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-3.c
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target vect_shift } */

 #include 
 #include "tree-vect.h"
diff --git a/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-4.c b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-4.c
index 5bc9c412e96..6a3ed8c0c6f 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-4.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-4.c
@@ -1,4 +1,5 @@
-/* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target vect_shift } */
+/* { dg-require-effective-target vect_long_long } */

 #include 
 #include "tree-vect.h"
diff --git a/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-5.c b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-5.c
index 1dc24d3eded..b2889df8a0a 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-5.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-5.c
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target vect_shift } */

 #include 
 #include "tree-vect.h"
diff --git a/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-6.c b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-6.c
index 7d24c299758..2445f531be2 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-6.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-6.c
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target vect_shift } */

 #include 
 #include "tree-vect.h"
diff --git a/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-7.c b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-7.c
index 3b505db2bd3..4b1ec8a6dab 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-7.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-7.c
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target vect_shift } */

 #include 
 #include "tree-vect.h"
--
2.27.0


Re: [PATCH] x86: Replace ne:CCC/ne:CCO with UNSPEC_CC_NE in neg patterns

2022-10-27 Thread Uros Bizjak via Gcc-patches
On Wed, Oct 26, 2022 at 8:59 PM H.J. Lu  wrote:
>
> In i386.md, neg patterns which set MODE_CC register like
>
> (set (reg:CCC FLAGS_REG)
>  (ne:CCC (match_operand:SWI48 1 "general_reg_operand") (const_int 0)))
>
> can lead to errors when operand 1 is a constant value.  If FLAGS_REG in
>
> (set (reg:CCC FLAGS_REG)
>  (ne:CCC (const_int 2) (const_int 0)))
>
> is set to 1, RTX simplifiers may simplify
>
> (set (reg:SI 93)
>  (neg:SI (ltu:SI (reg:CCC 17 flags) (const_int 0 [0]))))
>
> as
>
> (set (reg:SI 93)
>  (neg:SI (ltu:SI (const_int 1) (const_int 0 [0]))))
>
> which leads to incorrect results since LTU on MODE_CC register isn't the
> same as "unsigned less than" in x86 backend.  To prevent RTL optimizers
> from setting MODE_CC register to a constant, use UNSPEC_CC_NE to replace
> ne:CCC/ne:CCO when setting FLAGS_REG in neg patterns.
>
> gcc/
>
> PR target/107172
> * config/i386/i386.md (UNSPEC_CC_NE): New.
> Replace ne:CCC/ne:CCO with UNSPEC_CC_NE in neg patterns.
>
> gcc/testsuite/
>
> PR target/107172
> * gcc.target/i386/pr107172.c: New test.
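
For readers following the quoted explanation, here is a minimal C model
of the mismatch it describes.  It assumes that x86 NEG sets the carry
flag exactly when its operand is nonzero, and that the backend reads LTU
on the CCC flags register as "carry set".  All function names are
hypothetical and exist only for this sketch; none of this is code from
i386.md or from the RTL simplifiers.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the carry flag after x86 NEG: set iff the operand is
   nonzero.  */
static int
neg_carry (uint32_t x)
{
  return x != 0;
}

/* Backend meaning of (neg:SI (ltu:SI (reg:CCC flags) (const_int 0))):
   minus the carry flag, i.e. -1 when CF is set and 0 otherwise.  */
static int32_t
neg_of_carry (int cf)
{
  assert (cf == 0 || cf == 1);
  return -(int32_t) cf;
}

/* Ordinary unsigned "less than", which is how a generic fold reads LTU
   once both operands are constants.  */
static int
plain_ltu (uint32_t a, uint32_t b)
{
  return a < b;
}

/* Result after the flags register has been replaced by a constant:
   "value <u 0" is never true, so this always yields 0.  */
static int32_t
neg_of_plain_ltu (uint32_t flags_value)
{
  return -(int32_t) plain_ltu (flags_value, 0);
}
```

With operand 2, neg_of_carry (neg_carry (2)) gives -1, while
neg_of_plain_ltu (1) gives 0, which is the wrong result the patch
description refers to.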

Looking at PR107172, comments #44 and #45, this patch is a trivial
substitution for an invalid RTX.

So, OK.

Thanks,
Uros.

> ---
>  gcc/config/i386/i386.md  | 45 +---
>  gcc/testsuite/gcc.target/i386/pr107172.c | 26 ++
>  2 files changed, 51 insertions(+), 20 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr107172.c
>
> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> index baf1f1f8fa2..aaa678e7314 100644
> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -113,6 +113,7 @@ (define_c_enum "unspec" [
>UNSPEC_PEEPSIB
>UNSPEC_INSN_FALSE_DEP
>UNSPEC_SBB
> +  UNSPEC_CC_NE
>
>;; For SSE/MMX support:
>UNSPEC_FIX_NOTRUNC
> @@ -11470,7 +11471,7 @@ (define_insn_and_split "*neg2_doubleword"
>"&& reload_completed"
>[(parallel
>  [(set (reg:CCC FLAGS_REG)
> - (ne:CCC (match_dup 1) (const_int 0)))
> + (unspec:CCC [(match_dup 1) (const_int 0)] UNSPEC_CC_NE))
>   (set (match_dup 0) (neg:DWIH (match_dup 1)))])
> (parallel
>  [(set (match_dup 2)
> @@ -11499,7 +11500,8 @@ (define_peephole2
> (match_operand:SWI48 1 "nonimmediate_gr_operand"))
> (parallel
>  [(set (reg:CCC FLAGS_REG)
> - (ne:CCC (match_operand:SWI48 2 "general_reg_operand") (const_int 0)))
> + (unspec:CCC [(match_operand:SWI48 2 "general_reg_operand")
> +  (const_int 0)] UNSPEC_CC_NE))
>   (set (match_dup 2) (neg:SWI48 (match_dup 2)))])
> (parallel
>  [(set (match_dup 0)
> @@ -11517,7 +11519,7 @@ (define_peephole2
> && !reg_mentioned_p (operands[2], operands[1])"
>[(parallel
>  [(set (reg:CCC FLAGS_REG)
> - (ne:CCC (match_dup 2) (const_int 0)))
> + (unspec:CCC [(match_dup 2) (const_int 0)] UNSPEC_CC_NE))
>   (set (match_dup 2) (neg:SWI48 (match_dup 2)))])
> (parallel
>  [(set (match_dup 0)
> @@ -11543,7 +11545,8 @@ (define_peephole2
>   (clobber (reg:CC FLAGS_REG))])
> (parallel
>  [(set (reg:CCC FLAGS_REG)
> - (ne:CCC (match_operand:SWI48 1 "general_reg_operand") (const_int 0)))
> + (unspec:CCC [(match_operand:SWI48 1 "general_reg_operand")
> +  (const_int 0)] UNSPEC_CC_NE))
>   (set (match_dup 1) (neg:SWI48 (match_dup 1)))])
> (parallel
>  [(set (match_dup 0)
> @@ -11559,7 +11562,7 @@ (define_peephole2
>"REGNO (operands[0]) != REGNO (operands[1])"
>[(parallel
>  [(set (reg:CCC FLAGS_REG)
> - (ne:CCC (match_dup 1) (const_int 0)))
> + (unspec:CCC [(match_dup 1) (const_int 0)] UNSPEC_CC_NE))
>   (set (match_dup 1) (neg:SWI48 (match_dup 1)))])
> (parallel
>  [(set (match_dup 0)
> @@ -11635,9 +11638,9 @@ (define_insn "*negsi_2_zext"
>
>  (define_insn "*neg_ccc_1"
>[(set (reg:CCC FLAGS_REG)
> -   (ne:CCC
> - (match_operand:SWI 1 "nonimmediate_operand" "0")
> - (const_int 0)))
> +   (unspec:CCC
> + [(match_operand:SWI 1 "nonimmediate_operand" "0")
> +  (const_int 0)] UNSPEC_CC_NE))
> (set (match_operand:SWI 0 "nonimmediate_operand" "=m")
> (neg:SWI (match_dup 1)))]
>""
> @@ -11647,9 +11650,9 @@ (define_insn "*neg_ccc_1"
>
>  (define_insn "*neg_ccc_2"
>[(set (reg:CCC FLAGS_REG)
> -   (ne:CCC
> - (match_operand:SWI 1 "nonimmediate_operand" "0")
> - (const_int 0)))
> +   (unspec:CCC
> + [(match_operand:SWI 1 "nonimmediate_operand" "0")
> +  (const_int 0)] UNSPEC_CC_NE))
> (clobber (match_scratch:SWI 0 "="))]
>""
>"neg{}\t%0"
> @@ -11659,8 +11662,8 @@ (define_insn "*neg_ccc_2"
>  (define_expand "x86_neg_ccc"
>[(parallel
>  [(set (reg:CCC FLAGS_REG)
> - (ne:CCC (match_operand:SWI48 1 "register_operand")
> - (const_int 0)))
> + (unspec:CCC