[Bug target/108322] Using __restrict parameter with -ftree-vectorize (default with -O2) results in massive code bloat

2023-01-09 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108322

Richard Biener  changed:

   What|Removed |Added

 Blocks||53947
 Ever confirmed|0   |1
   Last reconfirmed||2023-01-10
 Status|UNCONFIRMED |NEW
 CC||rguenth at gcc dot gnu.org

--- Comment #4 from Richard Biener  ---
The vectorizer vectorizes this with a strided store, costing

*pSrc_16 1 times unaligned_load (misalign -1) costs 12 in body
_1 16 times scalar_store costs 192 in body
_1 16 times vec_to_scalar costs 64 in body
t.c:8:44: note:  operating only on full vectors.
t.c:8:44: note:  Cost model analysis: 
  Vector inside of loop cost: 268
  Vector prologue cost: 0
  Vector epilogue cost: 0
  Scalar iteration cost: 24
  Scalar outside cost: 0
  Vector outside cost: 0
  prologue iterations: 0
  epilogue iterations: 0
  Calculated minimum iters for profitability: 0

now later forwprop figures it can replace the element extracts from the
vector load with scalar loads which then results in effective unrolling
of the loop by a factor of 16.
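
As a rough cross-check of the dump above (a standalone sketch, not vectorizer code; the helper names are made up for illustration), the three per-statement body costs add up to the reported inside-of-loop cost, and one vector iteration stands in for 16 scalar iterations:

```c
#include <assert.h>

/* Scalar iterations covered by one vector iteration (16 byte lanes).  */
enum { VF = 16 };

/* Sum of the three per-statement body costs copied from the dump.  */
static int vector_inside_cost (void)
{
  int unaligned_load = 12;   /* 1 time unaligned_load   */
  int scalar_stores  = 192;  /* 16 times scalar_store   */
  int vec_to_scalar  = 64;   /* 16 times vec_to_scalar  */
  return unaligned_load + scalar_stores + vec_to_scalar;
}

/* Cost of the VF scalar iterations that one vector iteration replaces.  */
static int scalar_cost_for_vf (int scalar_iteration_cost)
{
  assert (scalar_iteration_cost >= 0);
  return scalar_iteration_cost * VF;
}
```

With the dump's scalar iteration cost of 24 this compares 268 against 384, which fits the reported profitability threshold of 0 iterations. The real computation has more terms (prologue, epilogue, outside costs); they are all zero in this dump.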

The vectorizer misses the fact that w/o SSE 4.1 it cannot do efficient
lane extracts.  With SSE 4.1 and disabling the forwprop you'd get

.L3:
movdqu  (%rsi), %xmm0
addq    $16, %rsi
addq    $32, %rax
pextrb  $0, %xmm0, -32(%rax)
pextrb  $1, %xmm0, -30(%rax)
pextrb  $2, %xmm0, -28(%rax)
pextrb  $3, %xmm0, -26(%rax)
pextrb  $4, %xmm0, -24(%rax)
pextrb  $5, %xmm0, -22(%rax)
pextrb  $6, %xmm0, -20(%rax)
pextrb  $7, %xmm0, -18(%rax)
pextrb  $8, %xmm0, -16(%rax)
pextrb  $9, %xmm0, -14(%rax)
pextrb  $10, %xmm0, -12(%rax)
pextrb  $11, %xmm0, -10(%rax)
pextrb  $12, %xmm0, -8(%rax)
pextrb  $13, %xmm0, -6(%rax)
pextrb  $14, %xmm0, -4(%rax)
pextrb  $15, %xmm0, -2(%rax)
cmpq    %rdx, %rsi
jne .L3

which is what the vectorizer thinks is going to be generated.  But with
just SSE2 we are spilling to memory for the lane extract.

For the case at hand loading two vectors from the destination and then
punpck{h,l}bw and storing them again might be the most efficient thing
to do here.
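
One way to read that suggestion, as a minimal sketch only: the testcase is not quoted in this comment, so the `dst[2 * i] = src[i]` access pattern is inferred from the pextrb offsets above, and the function names are hypothetical. The SSE2 path loads two destination vectors, keeps their odd bytes, merges in the source bytes via punpcklbw/punpckhbw against zero, and stores both vectors back.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Scalar reference: store each source byte to every second
   destination byte.  */
static void store_stride2_scalar (unsigned char *dst,
                                  const unsigned char *src, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[2 * i] = src[i];
}

/* Sketch: 16 source bytes per step using SSE2 only.  N must be a
   multiple of 16 and DST must provide 2 * N bytes.  */
static void store_stride2_sse2 (unsigned char *dst,
                                const unsigned char *src, size_t n)
{
  assert (n % 16 == 0);
#if defined(__SSE2__)
  const __m128i zero = _mm_setzero_si128 ();
  /* 0xFF00 in each 16-bit lane keeps the odd destination bytes.  */
  const __m128i keep_odd = _mm_set1_epi16 ((short) 0xFF00);
  for (size_t i = 0; i < n; i += 16)
    {
      __m128i s  = _mm_loadu_si128 ((const __m128i *) (src + i));
      __m128i d0 = _mm_loadu_si128 ((const __m128i *) (dst + 2 * i));
      __m128i d1 = _mm_loadu_si128 ((const __m128i *) (dst + 2 * i + 16));
      /* punpcklbw / punpckhbw with zero: source bytes land in the even
         byte positions, odd positions become zero.  */
      __m128i lo = _mm_unpacklo_epi8 (s, zero);
      __m128i hi = _mm_unpackhi_epi8 (s, zero);
      d0 = _mm_or_si128 (_mm_and_si128 (d0, keep_odd), lo);
      d1 = _mm_or_si128 (_mm_and_si128 (d1, keep_odd), hi);
      _mm_storeu_si128 ((__m128i *) (dst + 2 * i), d0);
      _mm_storeu_si128 ((__m128i *) (dst + 2 * i + 16), d1);
    }
#else
  store_stride2_scalar (dst, src, n);   /* no SSE2: plain fallback */
#endif
}
```

Note that this variant reads and rewrites the odd destination bytes, including the final byte at index 2 * N - 1 that the scalar loop never touches, so it is only a model of the data movement and not a statement about what the vectorizer may legally emit.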

On the cost model side 'vec_to_scalar' is ambiguous, the x86 backend
tries to compensate with

  /* If we do elementwise loads into a vector then we are bound by
     latency and execution resources for the many scalar loads
     (AGU and load ports).  Try to account for this by scaling the
     construction cost by the number of elements involved.  */
  if ((kind == vec_construct || kind == vec_to_scalar)
      && stmt_info
      && (STMT_VINFO_TYPE (stmt_info) == load_vec_info_type
          || STMT_VINFO_TYPE (stmt_info) == store_vec_info_type)
      && STMT_VINFO_MEMORY_ACCESS_TYPE (stmt_info) == VMAT_ELEMENTWISE
      && TREE_CODE (DR_STEP (STMT_VINFO_DATA_REF (stmt_info))) != INTEGER_CST)
    {
      stmt_cost = ix86_builtin_vectorization_cost (kind, vectype, misalign);
      stmt_cost *= (TYPE_VECTOR_SUBPARTS (vectype) + 1);
    }

but that doesn't trigger here because the step is constant two.
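
To get a feel for what that scaling would do if it did apply here, a back-of-the-envelope sketch follows. `body_cost` is a hypothetical helper, the per-statement cost of 4 is simply 64 / 16 from the dump, and multiplying the scaled per-statement cost by the statement count is an assumption about how the backend accumulates it.

```c
#include <assert.h>

/* Hypothetical helper: body cost of COUNT statements with per-statement
   cost STMT_COST, optionally scaled by (NUNITS + 1) in the way the
   quoted x86 snippet scales elementwise accesses.  */
static int body_cost (int count, int stmt_cost, int nunits, int scale)
{
  assert (count > 0 && nunits > 0);
  if (scale)
    stmt_cost *= nunits + 1;
  return count * stmt_cost;
}
```

Under those assumptions the 16 vec_to_scalar operations would cost 1088 instead of 64, giving a body cost of 12 + 192 + 1088 = 1292, well above the 16 * 24 = 384 of the scalar iterations it replaces.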

RTL expansion will eventually use the vec_extract optab and that succeeds
even for SSE2 by spilling, so it isn't useful to query support:

void
ix86_expand_vector_extract (bool mmx_ok, rtx target, rtx vec, int elt)
{
...
  if (use_vec_extr)
    {
...
    }
  else
    {
      rtx mem = assign_stack_temp (mode, GET_MODE_SIZE (mode));

      emit_move_insn (mem, vec);

      tmp = adjust_address (mem, inner_mode, elt*GET_MODE_SIZE (inner_mode));
      emit_move_insn (target, tmp);
    }
}

the fallback is eventually done by RTL expansion anyway.

Note that fixing that and then querying vec_extract support would be an
option (the vectorizer doesn't do that - it relies on expand's fallback here -
but it could then do better costing and also generate a single spill slot
rather than one for each extract).


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations

[Bug target/108348] ICE in gen_movoo, at config/rs6000/mma.md:292

2023-01-09 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108348

--- Comment #2 from Kewen Lin  ---
This is a 32 bit specific issue, the function rs6000_pass_by_reference has:

  /* Allow -maltivec -mabi=no-altivec without warning.  Altivec vector
     modes only exist for GCC vector types if -maltivec.  */
  if (TARGET_32BIT && !TARGET_ALTIVEC_ABI && ALTIVEC_VECTOR_MODE (arg.mode))
    {
      if (TARGET_DEBUG_ARG)
        fprintf (stderr, "function_arg_pass_by_reference: AltiVec\n");
      return 1;
    }

It assumes that altivec is on whenever we see those altivec vector modes, but
that doesn't hold with the option combination in this case. So
rs6000_pass_by_reference returns true, and the generic code that follows tries
to make a copy of the object and pass its address to the function being called;
it invokes store_expr to store to the copy, which then triggers the ICE.

And targetm.calls.function_arg is too late to raise the errors.

Re: [PATCH][pushed] contrib: add 'contrib' to default dirs in update-copyright.py

2023-01-09 Thread Martin Liška
> However, I noticed when I run ./contrib/update-copyright.py --this-year
> I get much more modifications out of contrib folder:

@Jakub?

Martin



Re: More znver4 x86-tune flags

2023-01-09 Thread Hongtao Liu via Gcc-patches
On Tue, Jan 10, 2023 at 12:32 PM Jan Hubicka via Gcc-patches
 wrote:
>
>
> Hi,
> this patch adds more tunes for zen4:
>  - new tunes for avx512 scatter instructions.
>    In micro benchmarks these seem to be a consistent loss compared to
>    open-coded code.
>  - disable use of gather for zen4
>    While these are a win for micro benchmarks (based on TSVC), enabling gather
>    is a loss for parest. So for now it seems safe to keep it off.
>  - disable pass to avoid FMA chains for znver4 since fmadd was optimized and
>    does not seem to cause regressions.
>
> Bootstrapped/regtested x86_64.
> Honza
>
> * i386.cc (ix86_vectorize_builtin_scatter): Guard scatter by 
> TARGET_USE_SCATTER.
> * i386.h (TARGET_USE_SCATTER_2PARTS, TARGET_USE_SCATTER_4PARTS, 
> TARGET_USE_SCATTER): New macros.
> * x86-tune.def (TARGET_USE_SCATTER_2PARTS, TARGET_USE_SCATTER_4PARTS, 
> TARGET_USE_SCATTER): New tunes.
> (X86_TUNE_AVOID_256FMA_CHAINS, X86_TUNE_AVOID_512FMA_CHAINS): Disable 
> for znver4.
> (X86_TUNE_USE_GATHER): Disable for zen4.
> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> index de978d19063..9fb69f6c174 100644
> --- a/gcc/config/i386/i386.cc
> +++ b/gcc/config/i386/i386.cc
> @@ -19051,6 +19051,13 @@ ix86_vectorize_builtin_scatter (const_tree vectype,
>if (!TARGET_AVX512F)
>  return NULL_TREE;
>
> +  if (known_eq (TYPE_VECTOR_SUBPARTS (vectype), 2u)
> +  ? !TARGET_USE_SCATTER_2PARTS
> +  : (known_eq (TYPE_VECTOR_SUBPARTS (vectype), 4u)
> +? !TARGET_USE_SCATTER_4PARTS
> +: !TARGET_USE_SCATTER))
> +return NULL_TREE;
> +
>if ((TREE_CODE (index_type) != INTEGER_TYPE
> && !POINTER_TYPE_P (index_type))
>|| (TYPE_MODE (index_type) != SImode
> diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
> index e6a603ed31a..cd7ed19e29c 100644
> --- a/gcc/config/i386/i386.h
> +++ b/gcc/config/i386/i386.h
> @@ -397,10 +397,16 @@ extern unsigned char ix86_tune_features[X86_TUNE_LAST];
> ix86_tune_features[X86_TUNE_AVOID_4BYTE_PREFIXES]
>  #define TARGET_USE_GATHER_2PARTS \
> ix86_tune_features[X86_TUNE_USE_GATHER_2PARTS]
> +#define TARGET_USE_SCATTER_2PARTS \
> +   ix86_tune_features[X86_TUNE_USE_SCATTER_2PARTS]
>  #define TARGET_USE_GATHER_4PARTS \
> ix86_tune_features[X86_TUNE_USE_GATHER_4PARTS]
> +#define TARGET_USE_SCATTER_4PARTS \
> +   ix86_tune_features[X86_TUNE_USE_SCATTER_4PARTS]
>  #define TARGET_USE_GATHER \
> ix86_tune_features[X86_TUNE_USE_GATHER]
> +#define TARGET_USE_SCATTER \
> +   ix86_tune_features[X86_TUNE_USE_SCATTER]
>  #define TARGET_FUSE_CMP_AND_BRANCH_32 \
> ix86_tune_features[X86_TUNE_FUSE_CMP_AND_BRANCH_32]
>  #define TARGET_FUSE_CMP_AND_BRANCH_64 \
> diff --git a/gcc/config/i386/x86-tune.def b/gcc/config/i386/x86-tune.def
> index fae3b650434..7e9c7244fc0 100644
> --- a/gcc/config/i386/x86-tune.def
> +++ b/gcc/config/i386/x86-tune.def
> @@ -483,28 +483,43 @@ DEF_TUNE (X86_TUNE_AVOID_4BYTE_PREFIXES, 
> "avoid_4byte_prefixes",
>  DEF_TUNE (X86_TUNE_USE_GATHER_2PARTS, "use_gather_2parts",
>   ~(m_ZNVER1 | m_ZNVER2 | m_ZNVER3 | m_ZNVER4 | m_ALDERLAKE | 
> m_CORE_ATOM | m_GENERIC))
>
> +/* X86_TUNE_USE_SCATTER_2PARTS: Use scater instructions for vectors with 2
> +   elements.  */
> +DEF_TUNE (X86_TUNE_USE_SCATTER_2PARTS, "use_scatter_2parts",
> + ~(m_ZNVER4 | m_GENERIC))
> +
>  /* X86_TUNE_USE_GATHER_4PARTS: Use gather instructions for vectors with 4
> elements.  */
>  DEF_TUNE (X86_TUNE_USE_GATHER_4PARTS, "use_gather_4parts",
>   ~(m_ZNVER1 | m_ZNVER2 | m_ZNVER3 | m_ZNVER4 | m_ALDERLAKE | 
> m_CORE_ATOM | m_GENERIC))
>
> +/* X86_TUNE_USE_SCATTER_4PARTS: Use scater instructions for vectors with 4
> +   elements.  */
> +DEF_TUNE (X86_TUNE_USE_SCATTER_4PARTS, "use_scatter_4parts",
> + ~(m_ZNVER4 | m_GENERIC))
> +
>  /* X86_TUNE_USE_GATHER: Use gather instructions for vectors with 8 or more
> elements.  */
>  DEF_TUNE (X86_TUNE_USE_GATHER, "use_gather",
> - ~(m_ZNVER1 | m_ZNVER2 | m_ALDERLAKE | m_CORE_ATOM | m_GENERIC))
> + ~(m_ZNVER1 | m_ZNVER2 | m_ZNVER4 | m_ALDERLAKE | m_CORE_ATOM | 
> m_GENERIC))
> +
> +/* X86_TUNE_USE_SCATTER: Use scater instructions for vectors with 8 or more
> +   elements.  */
> +DEF_TUNE (X86_TUNE_USE_SCATTER, "use_scatter",
> + ~(m_ZNVER4 | m_GENERIC))
>
>  /* X86_TUNE_AVOID_128FMA_CHAINS: Avoid creating loops with tight 128bit or
> smaller FMA chain.  */
> -DEF_TUNE (X86_TUNE_AVOID_128FMA_CHAINS, "avoid_fma_chains", m_ZNVER)
> +DEF_TUNE (X86_TUNE_AVOID_128FMA_CHAINS, "avoid_fma_chains", m_ZNVER1 | 
> m_ZNVER2 | m_ZNVER3)
According to the comments, it's *256bit or smaller*, so shouldn't
avoid_fma_chains be implied by avoid_fma256_chains?
>
>  /* X86_TUNE_AVOID_256FMA_CHAINS: Avoid creating loops with tight 256bit or
> smaller FMA chain.  */
> -DEF_TUNE (X86_TUNE_AVOID_256FMA_CHAINS, "avoid_fma256_chains", m_ZNVER2 | 
> m_ZNVER3 | m_ZNVER4
> +DEF_TUNE 

Re: B^HDEAD code generation (AMD64)

2023-01-09 Thread Gabriel Ravier via Gcc

On 1/10/23 01:34, Stefan Kanthak wrote:
> "Thomas Koenig"  wrote:
>
> > On 09.01.23 12:35, Stefan Kanthak wrote:
> >> 20 superfluous instructions of the total 102 instructions!
> >
> > The proper place for bug reports is https://gcc.gnu.org/bugzilla/ .
>
> OUCH: there's NO proper place for bugs at all!
>
> > Feel free to submit these cases there.
>
> I feel free to do whatever I like to do where I do it, for example:
>
> --- bug.cpp ---
> int main() {
>  __uint128_t long long bug = 0;
> }
> --- EOF ---
>
> See 
>
> regards
> Stefan


If you're trying to speedrun actually getting banned from this mailing 
list, then sure, I guess you can "do whatever I like to do where I do 
it", but you might find that more difficult after somebody decides to do 
something about it.




[Bug c++/108321] [13 regression] g++.dg/contracts/contracts-tmpl-spec2.C fails after r13-4160-g2efb237ffc68ec

2023-01-09 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108321

Richard Biener  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #2 from Richard Biener  ---
so fixed

[Bug c++/108321] [13 regression] g++.dg/contracts/contracts-tmpl-spec2.C fails after r13-4160-g2efb237ffc68ec

2023-01-09 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108321

Richard Biener  changed:

   What|Removed |Added

   Keywords||testsuite-fail
   Target Milestone|--- |13.0

[Bug target/108316] [13 Regression] ICE in maybe_gen_insn via expand_SCATTER_STORE when vectorizing for SVE since r13-2737-g4a773bf2f08656

2023-01-09 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108316

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P1

[Bug tree-optimization/108314] [13 Regression] Segfault in gimple-match-head.cc:do_valueize when vectorizing for SVE since r13-707-g68e0063397ba82

2023-01-09 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108314

Richard Biener  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |rguenth at gcc dot gnu.org

--- Comment #2 from Richard Biener  ---
I will have a look.

[Bug c++/101687] Scoped enumerators of a member enumeration shall not be referred by a class member access expression

2023-01-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101687

Andrew Pinski  changed:

   What|Removed |Added

 Status|NEW |SUSPENDED

--- Comment #4 from Andrew Pinski  ---
Suspending as the defect report is still considered open but is in the process
of drafting:
https://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#2557

[Bug c++/101687] Scoped enumerators of a member enumeration shall not be referred by a class member access expression

2023-01-09 Thread xmh970252187 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101687

--- Comment #3 from jim x  ---
I think CWG2557 is clear on this aspect:
https://cplusplus.github.io/CWG/issues/2557.html

[Bug target/108339] [11/10 only] riscv64-linux-gnu: fails to link libgcc_s.so on the GCC 10 branch

2023-01-09 Thread doko at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108339

--- Comment #3 from Matthias Klose  ---
thanks for the pointer. The GCC 11 branch already has the backport.

More znver4 x86-tune flags

2023-01-09 Thread Jan Hubicka via Gcc-patches


Hi,
this patch adds more tunes for zen4:
 - new tunes for avx512 scatter instructions.
   In micro benchmarks these seem to be a consistent loss compared to
   open-coded code.
 - disable use of gather for zen4
   While these are a win for micro benchmarks (based on TSVC), enabling gather
   is a loss for parest. So for now it seems safe to keep it off.
 - disable pass to avoid FMA chains for znver4 since fmadd was optimized and
   does not seem to cause regressions.

Bootstrapped/regtested x86_64.
Honza
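
For readers skimming the i386.cc hunk in the patch below, its nested conditional can be read as the following standalone sketch. The struct and function names are hypothetical stand-ins, not GCC's; only the selection logic is taken from the hunk.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the three tune flags added by the patch.  */
struct scatter_tunes
{
  bool use_scatter_2parts;   /* vectors with 2 elements          */
  bool use_scatter_4parts;   /* vectors with 4 elements          */
  bool use_scatter;          /* every other element count        */
};

/* Pick the flag that matches the element count and report whether a
   scatter builtin may be used; the hunk returns NULL_TREE when this
   would be false.  */
static bool scatter_allowed (unsigned nunits, const struct scatter_tunes *t)
{
  assert (t != 0);
  if (nunits == 2)
    return t->use_scatter_2parts;
  if (nunits == 4)
    return t->use_scatter_4parts;
  return t->use_scatter;
}
```

With all three flags off, as the x86-tune.def hunk sets them for znver4 and generic tuning, every element count is rejected.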

* i386.cc (ix86_vectorize_builtin_scatter): Guard scatter by 
TARGET_USE_SCATTER.
* i386.h (TARGET_USE_SCATTER_2PARTS, TARGET_USE_SCATTER_4PARTS, 
TARGET_USE_SCATTER): New macros.
* x86-tune.def (TARGET_USE_SCATTER_2PARTS, TARGET_USE_SCATTER_4PARTS, 
TARGET_USE_SCATTER): New tunes.
(X86_TUNE_AVOID_256FMA_CHAINS, X86_TUNE_AVOID_512FMA_CHAINS): Disable 
for znver4.
(X86_TUNE_USE_GATHER): Disable for zen4.
diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index de978d19063..9fb69f6c174 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -19051,6 +19051,13 @@ ix86_vectorize_builtin_scatter (const_tree vectype,
   if (!TARGET_AVX512F)
 return NULL_TREE;
 
+  if (known_eq (TYPE_VECTOR_SUBPARTS (vectype), 2u)
+  ? !TARGET_USE_SCATTER_2PARTS
+  : (known_eq (TYPE_VECTOR_SUBPARTS (vectype), 4u)
+? !TARGET_USE_SCATTER_4PARTS
+: !TARGET_USE_SCATTER))
+return NULL_TREE;
+
   if ((TREE_CODE (index_type) != INTEGER_TYPE
&& !POINTER_TYPE_P (index_type))
   || (TYPE_MODE (index_type) != SImode
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index e6a603ed31a..cd7ed19e29c 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -397,10 +397,16 @@ extern unsigned char ix86_tune_features[X86_TUNE_LAST];
ix86_tune_features[X86_TUNE_AVOID_4BYTE_PREFIXES]
 #define TARGET_USE_GATHER_2PARTS \
ix86_tune_features[X86_TUNE_USE_GATHER_2PARTS]
+#define TARGET_USE_SCATTER_2PARTS \
+   ix86_tune_features[X86_TUNE_USE_SCATTER_2PARTS]
 #define TARGET_USE_GATHER_4PARTS \
ix86_tune_features[X86_TUNE_USE_GATHER_4PARTS]
+#define TARGET_USE_SCATTER_4PARTS \
+   ix86_tune_features[X86_TUNE_USE_SCATTER_4PARTS]
 #define TARGET_USE_GATHER \
ix86_tune_features[X86_TUNE_USE_GATHER]
+#define TARGET_USE_SCATTER \
+   ix86_tune_features[X86_TUNE_USE_SCATTER]
 #define TARGET_FUSE_CMP_AND_BRANCH_32 \
ix86_tune_features[X86_TUNE_FUSE_CMP_AND_BRANCH_32]
 #define TARGET_FUSE_CMP_AND_BRANCH_64 \
diff --git a/gcc/config/i386/x86-tune.def b/gcc/config/i386/x86-tune.def
index fae3b650434..7e9c7244fc0 100644
--- a/gcc/config/i386/x86-tune.def
+++ b/gcc/config/i386/x86-tune.def
@@ -483,28 +483,43 @@ DEF_TUNE (X86_TUNE_AVOID_4BYTE_PREFIXES, 
"avoid_4byte_prefixes",
 DEF_TUNE (X86_TUNE_USE_GATHER_2PARTS, "use_gather_2parts",
  ~(m_ZNVER1 | m_ZNVER2 | m_ZNVER3 | m_ZNVER4 | m_ALDERLAKE | 
m_CORE_ATOM | m_GENERIC))
 
+/* X86_TUNE_USE_SCATTER_2PARTS: Use scater instructions for vectors with 2
+   elements.  */
+DEF_TUNE (X86_TUNE_USE_SCATTER_2PARTS, "use_scatter_2parts",
+ ~(m_ZNVER4 | m_GENERIC))
+
 /* X86_TUNE_USE_GATHER_4PARTS: Use gather instructions for vectors with 4
elements.  */
 DEF_TUNE (X86_TUNE_USE_GATHER_4PARTS, "use_gather_4parts",
  ~(m_ZNVER1 | m_ZNVER2 | m_ZNVER3 | m_ZNVER4 | m_ALDERLAKE | 
m_CORE_ATOM | m_GENERIC))
 
+/* X86_TUNE_USE_SCATTER_4PARTS: Use scater instructions for vectors with 4
+   elements.  */
+DEF_TUNE (X86_TUNE_USE_SCATTER_4PARTS, "use_scatter_4parts",
+ ~(m_ZNVER4 | m_GENERIC))
+
 /* X86_TUNE_USE_GATHER: Use gather instructions for vectors with 8 or more
elements.  */
 DEF_TUNE (X86_TUNE_USE_GATHER, "use_gather",
- ~(m_ZNVER1 | m_ZNVER2 | m_ALDERLAKE | m_CORE_ATOM | m_GENERIC))
+ ~(m_ZNVER1 | m_ZNVER2 | m_ZNVER4 | m_ALDERLAKE | m_CORE_ATOM | 
m_GENERIC))
+
+/* X86_TUNE_USE_SCATTER: Use scater instructions for vectors with 8 or more
+   elements.  */
+DEF_TUNE (X86_TUNE_USE_SCATTER, "use_scatter",
+ ~(m_ZNVER4 | m_GENERIC))
 
 /* X86_TUNE_AVOID_128FMA_CHAINS: Avoid creating loops with tight 128bit or
smaller FMA chain.  */
-DEF_TUNE (X86_TUNE_AVOID_128FMA_CHAINS, "avoid_fma_chains", m_ZNVER)
+DEF_TUNE (X86_TUNE_AVOID_128FMA_CHAINS, "avoid_fma_chains", m_ZNVER1 | 
m_ZNVER2 | m_ZNVER3)
 
 /* X86_TUNE_AVOID_256FMA_CHAINS: Avoid creating loops with tight 256bit or
smaller FMA chain.  */
-DEF_TUNE (X86_TUNE_AVOID_256FMA_CHAINS, "avoid_fma256_chains", m_ZNVER2 | 
m_ZNVER3 | m_ZNVER4
+DEF_TUNE (X86_TUNE_AVOID_256FMA_CHAINS, "avoid_fma256_chains", m_ZNVER2 | 
m_ZNVER3
  | m_ALDERLAKE | m_SAPPHIRERAPIDS | m_CORE_ATOM)
 
 /* X86_TUNE_AVOID_512FMA_CHAINS: Avoid creating loops with tight 512bit or
smaller FMA chain.  */
-DEF_TUNE (X86_TUNE_AVOID_512FMA_CHAINS, "avoid_fma512_chains", m_ZNVER4)
+DEF_TUNE (X86_TUNE_AVOID_512FMA_CHAINS, "avoid_fma512_chains", m_NONE)
 
 /* 

Re: Missing dependencies in m2/ ?

2023-01-09 Thread Gaius Mulley via Gcc-patches
Jeff Law  writes:

> I'm still seeing it as of about 2 hours ago:
>
> http://law-sandy.freeddns.org:8080/job/avr-elf/2125/console
>
> A good run (yesterday):
>
> http://law-sandy.freeddns.org:8080/job/avr-elf/2124/console
>
>
Hi Jeff,

many thanks for the URLs above - useful.  I'll attempt to reproduce the
gcc compile.

> However, I did find that my scripts were enabling all languages --
> sorry I stated otherwise and blamed it on the M2 front-end.

No problem at all - it allowed me to find I was using the wrong
version of autoconf :-).

> The only issue we need to resolve is the dependency problems.

Yes indeed, I think I've found some missing dependencies which I'll push
to git when the bootstrap completes.  In the meantime here is the patch:

regards,
Gaius

--- o< --- o< --- o< --- o< --- o< --- o< --- o<
diff --git a/gcc/m2/Make-lang.in b/gcc/m2/Make-lang.in
index 08d0f3b963f..5c173f22540 100644
--- a/gcc/m2/Make-lang.in
+++ b/gcc/m2/Make-lang.in
@@ -1360,7 +1360,7 @@ m2/boot-bin/mc$(exeext): $(BUILD-MC-BOOT-O) 
$(BUILD-MC-INTERFACE-O) \
  $(BUILD-MC-INTERFACE-O) m2/mc-boot/main.o \
  mcflex.o m2/gm2-libs-boot/RTcodummy.o -lm
 
-m2/mc-boot/$(SRC_PREFIX)%.o: m2/mc-boot/$(SRC_PREFIX)%.c
+m2/mc-boot/$(SRC_PREFIX)%.o: m2/mc-boot/$(SRC_PREFIX)%.c 
m2/gm2-libs/gm2-libs-host.h
$(CXX) -g -c -I. -I$(srcdir)/m2/mc-boot-ch -I$(srcdir)/m2/mc-boot 
-I$(srcdir)/../include -I$(srcdir) $(INCLUDES) $< -o $@
 
 m2/mc-boot-ch/$(SRC_PREFIX)%.o: m2/mc-boot-ch/$(SRC_PREFIX)%.c 
m2/gm2-libs/gm2-libs-host.h
@@ -1373,7 +1373,7 @@ m2/mc-boot/main.o: $(M2LINK) $(srcdir)/m2/init/mcinit
unset CC ; $(M2LINK) -s --langc++ --exit --name m2/mc-boot/main.c 
$(srcdir)/m2/init/mcinit
$(CXX) -g -c -I. -I$(srcdir)/../include -I$(srcdir) $(INCLUDES) 
m2/mc-boot/main.c -o $@
 
-mcflex.o: mcflex.c
+mcflex.o: mcflex.c m2/gm2-libs/gm2-libs-host.h
$(CC) -I$(srcdir)/m2/mc -g -c $< -o $@   # remember that mcReserved.h 
is copied into m2/mc
 
 mcflex.c: $(srcdir)/m2/mc/mc.flex


[PATCH] xtensa: Make instruction cost estimation for size more accurate

2023-01-09 Thread Takayuki 'January June' Suwa via Gcc-patches
Until now, we applied COSTS_N_INSNS() (multiplying by 4) after dividing
the instruction length by 3, so we couldn't express differences of less
than 3 bytes in insn cost for size (e.g. 11 bytes and 12 bytes cost the
same).

This patch fixes that.

;; 2 bytes
addi.n  a2, a2, -1      ; cost 3

;; 3 bytes
addmi   a2, a2, 1024    ; cost 4

;; 4 bytes
movi.n  a3, 80          ; cost 5
bnez.n  a2, a3, .L4

;; 5 bytes
srli    a2, a3, 1       ; cost 7
add.n   a2, a2, a2

;; 6 bytes
ssai    8               ; cost 8
src     a4, a2, a3

;; 3 + 4 bytes
l32r    a2, .L5         ; cost 9

;; 11 bytes             ; cost 15
;; 12 bytes             ; cost 16
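
The costs listed above follow from the new size formula in the diff below. A small standalone check, where `size_cost` is a hypothetical name and COSTS_N_INSNS is redefined locally as the multiply-by-4 described above:

```c
#include <assert.h>

/* Local stand-in for GCC's macro: multiply by 4.  */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Size cost as in the patched code path: scale the byte length first,
   then divide by 3, so fractional instruction lengths stay visible.  */
static int size_cost (int len_bytes)
{
  assert (len_bytes >= 0);
  return (COSTS_N_INSNS (len_bytes) + 1) / 3;
}
```

For example, lengths of 11 and 12 bytes give 15 and 16, and the 3 + 4 byte L32R case gives 9, matching the table.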

gcc/ChangeLog:

* config/xtensa/xtensa.cc (xtensa_insn_cost):
Let insn cost for size be obtained by applying COSTS_N_INSNS()
to instruction length and then dividing by 3.
---
 gcc/config/xtensa/xtensa.cc | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/gcc/config/xtensa/xtensa.cc b/gcc/config/xtensa/xtensa.cc
index a1f184950ae..6cf6b35399a 100644
--- a/gcc/config/xtensa/xtensa.cc
+++ b/gcc/config/xtensa/xtensa.cc
@@ -4519,13 +4519,15 @@ xtensa_insn_cost (rtx_insn *insn, bool speed)
 {
   if (!(recog_memoized (insn) < 0))
 {
-  int len = get_attr_length (insn), n = (len + 2) / 3;
+  int len = get_attr_length (insn);
 
   if (len == 0)
return COSTS_N_INSNS (0);
 
   if (speed)  /* For speed cost.  */
{
+ int n = (len + 2) / 3;
+
  /* "L32R" may be particular slow (implementation-dependent).  */
  if (xtensa_is_insn_L32R_p (insn))
return COSTS_N_INSNS (1 + xtensa_extra_l32r_costs);
@@ -4572,10 +4574,11 @@ xtensa_insn_cost (rtx_insn *insn, bool speed)
{
  /* "L32R" itself plus constant in litpool.  */
  if (xtensa_is_insn_L32R_p (insn))
-   return COSTS_N_INSNS (2) + 1;
+   len = 3 + 4;
 
- /* Consider ".n" short instructions.  */
- return COSTS_N_INSNS (n) - (n * 3 - len);
+ /* Consider fractional instruction length (for example, ".n"
+short instructions or "L32R" litpool constants.  */
+ return (COSTS_N_INSNS (len) + 1) / 3;
}
}
 }
-- 
2.30.2


Re: B^HDEAD code generation (AMD64)

2023-01-09 Thread Andrew Pinski via Gcc
On Mon, Jan 9, 2023 at 4:42 PM Stefan Kanthak  wrote:
>
> "Thomas Koenig"  wrote:
>
> > On 09.01.23 12:35, Stefan Kanthak wrote:
> >> 20 superfluous instructions of the total 102 instructions!
> >
> > The proper place for bug reports is https://gcc.gnu.org/bugzilla/ .
>
> OUCH: there's NO proper place for bugs at all!

HUH? soon people will ignore emails that are demeaning/ableist like
what you have been recently (saying things like braindead, etc.). And
yes bugzilla is where GCC tracks bug reports.

>
> > Feel free to submit these cases there.
>
> I feel free to do whatever I like to do where I do it, for example:
>
> --- bug.cpp ---
> int main() {
> __uint128_t long long bug = 0;
> }
> --- EOF ---

With -pedantic-errors we get:

: In function 'int main()':
:2:22: error: 'long long' specified with '__int128 unsigned' [-Wpedantic]
2 | __uint128_t long long bug = 0;
  |  ^~~~

And also run into https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108099 .

This is a known extension but maybe it is not documented ...
Anyways read that bug report.


>
> See 
>
> regards
> Stefan


[Bug target/108348] ICE in gen_movoo, at config/rs6000/mma.md:292

2023-01-09 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108348

Kewen Lin  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
 Ever confirmed|0   |1
   Assignee|unassigned at gcc dot gnu.org  |linkw at gcc dot gnu.org
   Last reconfirmed||2023-01-10
   Target Milestone|--- |13.0
 CC||linkw at gcc dot gnu.org

--- Comment #1 from Kewen Lin  ---
Thanks for reporting, confirmed!

Since ppc64le is with 64 bit configuration, it rejects -m32. But this can be
reproduced on ppc64 with option:

-m32 -mcpu=power10 -mno-altivec

Investigating.

Re: [RFC/PATCH] Remove the workaround for _Float128 precision [PR107299]

2023-01-09 Thread Michael Meissner via Gcc-patches
On Fri, Jan 06, 2023 at 07:41:07PM -0500, Michael Meissner wrote:
> On Wed, Dec 21, 2022 at 09:40:24PM +, Joseph Myers wrote:
> > On Wed, 21 Dec 2022, Segher Boessenkool wrote:
> > 
> > > > --- a/gcc/tree.cc
> > > > +++ b/gcc/tree.cc
> > > > @@ -9442,15 +9442,6 @@ build_common_tree_nodes (bool signed_char)
> > > >if (!targetm.floatn_mode (n, extended).exists ())
> > > > continue;
> > > >int precision = GET_MODE_PRECISION (mode);
> > > > -  /* Work around the rs6000 KFmode having precision 113 not
> > > > -128.  */
> > > 
> > > It has precision 126 now fwiw.
> > > 
> > > Joseph: what do you think about this patch?  Is the workaround it
> > > removes still useful in any way, do we need to do that some other way if
> > > we remove this?
> > 
> > I think it's best for the TYPE_PRECISION, for any type with the binary128 
> > format, to be 128 (not 126).
> > 
> > It's necessary that _Float128, _Float64x and long double all have the same 
> > TYPE_PRECISION when they have the same (binary128) format, or at least 
> > that TYPE_PRECISION for _Float128 >= that for long double >= that for 
> > _Float64x, so that the rules in c_common_type apply properly.
> > 
> > How the TYPE_PRECISION compares to that of __ibm128, or of long double 
> > when that's double-double, is less important.
> 
> I spent a few days on working on this.  I have patches to make the 3 128-bit
> types to all have TYPE_PRECISION of 128.  To do this, I added a new mode macro
> (FRACTIONAL_FLOAT_MODE_NO_WIDEN) that takes the same arguments as
> FRACTIONAL_FLOAT_MODE.

...

I had the patches to change the precision to 128, and I just ran them.  C and
C++ do not seem to be bothered by changing the precision to 128 (once I got it
to build, etc.).  But Fortran on the other hand does actually use the precision
to differentiate between IBM extended double and IEEE 128-bit.  In particular,
the following 3 tests fail when long double is IBM extended double:

gfortran.dg/PR100914.f90
gfortran.dg/c-interop/typecodes-array-float128.f90
gfortran.dg/c-interop/typecodes-scalar-float128.f90

I tried adding code to use the old precisions for Fortran, but not for C/C++,
but it didn't seem to work.

So while it might be possible to use a single 128 for the precision, it needs
more work and attention, particularly on the Fortran side.

I'm not sure it is worth it to try and change things.

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com


[Bug target/108240] [13 Regression] ICE in emit_library_call_value_1 at gcc/calls.cc:4181 since r13-4894-gacc727cf02a144

2023-01-09 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108240

--- Comment #6 from Kewen Lin  ---
(In reply to Kewen Lin from comment #5)
> (In reply to Segher Boessenkool from comment #4)
> > (In reply to Kewen Lin from comment #3)
> > > With the culprit commit r13-4894, we always implicitly enable powerpc64
> > > for both explicit and implicit 64 bit, it's the same as before for the
> > > explicit 64 bit case, but for the implicit 64 bit case, there is no
> > > chance for the used cpu to unset powerpc64 (like this case). To keep it
> > > consistent with the previous, the fix can be to only enable powerpc64
> > > implicitly for explicit 64 bit, while let it be for implicit 64 bit.
> > 
> > No?  If the user says to use a CPU without 64-bit instructions, while the
> > user also says we require 64-bit insns (via -m64), we should just error.
> 
> But both the previous behavior (before r13-4894) and the current behavior
> (starting from r13-4894) honour the given explicit -m64, it would always
> enable -mpowerpc64 at the same time without any errors/warnings.
> 

It's implied that when the user explicitly specifies -m64, the handling
neglects the impact of the CPU. I'm not sure if it's intentional, but the
reason probably is that the underlying CPU is actually 64 bit in most cases,
so -m64 is made to win and the compilation can go forward.

If we change the behavior to error for both explicit and implicit 64 bit, some
compilations which worked in the past can start to fail (though it's arguable
that it's expected). Note that for implicit 64 bit and no powerpc64, we get
errors on Linux but just warnings on darwin/aix (maybe more fallout would come
out on them). So considering the current release phase, I'm inclined to just
make it consistent with the previous behavior, and try to adjust the behavior
(as in Segher's proposal) in the next release.

[Bug target/108272] [13 Regression] ICE in gen_movxo, at config/rs6000/mma.md:339

2023-01-09 Thread asolokha at gmx dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108272

--- Comment #5 from Arseny Solokha  ---
(In reply to Kewen Lin from comment #4)
> Yes, please file a new one. Thanks again.

I've filed PR108348 for that.

[Bug target/108348] New: ICE in gen_movoo, at config/rs6000/mma.md:292

2023-01-09 Thread asolokha at gmx dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108348

Bug ID: 108348
   Summary: ICE in gen_movoo, at config/rs6000/mma.md:292
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Keywords: ice-on-invalid-code
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: asolokha at gmx dot com
  Target Milestone: ---
Target: powerpc-*-linux-gnu

gcc 13.0.0 20230108 snapshot (g:e3a4bd0bbdccdde0cff85f93064b01a44fb10d2a) ICEs
when compiling gcc/testsuite/gcc.target/powerpc/pr96506-1.c w/ -m32 for a
target w/o MMA support:

% powerpc-e300c3-linux-gnu-gcc-13 -m32 -c gcc/testsuite/gcc.target/powerpc/pr96506-1.c
during RTL pass: expand
gcc/testsuite/gcc.target/powerpc/pr96506-1.c: In function 'foo0':
gcc/testsuite/gcc.target/powerpc/pr96506-1.c:20:3: internal compiler error: in gen_movoo, at config/rs6000/mma.md:292
   20 |   bar0 (v); /* { dg-error "invalid use of MMA operand of type .__vector_pair. as a function parameter" } */
      |   ^~~~
0x78e467 gen_movoo(rtx_def*, rtx_def*)
        /var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-13.0.0_p20230108/work/gcc-13-20230108/gcc/config/rs6000/mma.md:292
0xa8cfa7 rtx_insn* insn_gen_fn::operator()(rtx_def*, rtx_def*) const
        /var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-13.0.0_p20230108/work/gcc-13-20230108/gcc/recog.h:407
0xa8cfa7 emit_move_insn_1(rtx_def*, rtx_def*)
        /var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-13.0.0_p20230108/work/gcc-13-20230108/gcc/expr.cc:4172
0xa8d3af emit_move_insn(rtx_def*, rtx_def*)
        /var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-13.0.0_p20230108/work/gcc-13-20230108/gcc/expr.cc:4342
0xa9573e store_expr(tree_node*, rtx_def*, int, bool, bool)
        /var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-13.0.0_p20230108/work/gcc-13-20230108/gcc/expr.cc:6522
0x9461fb initialize_argument_information
        /var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-13.0.0_p20230108/work/gcc-13-20230108/gcc/calls.cc:1463
0x9461fb expand_call(tree_node*, rtx_def*, int)
        /var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-13.0.0_p20230108/work/gcc-13-20230108/gcc/calls.cc:2969
0xa895af expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool)
        /var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-13.0.0_p20230108/work/gcc-13-20230108/gcc/expr.cc:11875
0x95da58 expand_expr
        /var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-13.0.0_p20230108/work/gcc-13-20230108/gcc/expr.h:310
0x95da58 expand_call_stmt
        /var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-13.0.0_p20230108/work/gcc-13-20230108/gcc/cfgexpand.cc:2831
0x95da58 expand_gimple_stmt_1
        /var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-13.0.0_p20230108/work/gcc-13-20230108/gcc/cfgexpand.cc:3880
0x95da58 expand_gimple_stmt
        /var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-13.0.0_p20230108/work/gcc-13-20230108/gcc/cfgexpand.cc:4044
0x96322e expand_gimple_basic_block
        /var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-13.0.0_p20230108/work/gcc-13-20230108/gcc/cfgexpand.cc:6096
0x964d57 execute
        /var/tmp/portage/cross-powerpc-e300c3-linux-gnu/gcc-13.0.0_p20230108/work/gcc-13-20230108/gcc/cfgexpand.cc:6822

[Bug target/108272] [13 Regression] ICE in gen_movxo, at config/rs6000/mma.md:339

2023-01-09 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108272

--- Comment #4 from Kewen Lin  ---
(In reply to Arseny Solokha from comment #3)
> (In reply to Kewen Lin from comment #2)
> > Created attachment 54192 [details]
> > untested patch
> > 
> > Hi @Arseny, I hope this patch can help to clear all the ICEs about
> > unexpected uses of MMA opaque types in inline asm, that is to filter those
> > noises duplicated to this bug.
> 
> Indeed, I haven't seen such ICEs w/ the patch applied so far. Still get an
> ICE in gen_movoo, at config/rs6000/mma.md:292 when compiling
> gcc/testsuite/gcc.target/powerpc/pr96506-1.c w/ -m32, though. Do you want me
> to file another PR for that one?

Thanks @Arseny!  Yes, please file a new one. Thanks again.

[Bug modula2/108142] Many empty directories created in the build directory

2023-01-09 Thread gaius at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108142

--- Comment #7 from Gaius Mulley  ---
Updated patch posted to list.

Re: Missing dependencies in m2/ ?

2023-01-09 Thread Jeff Law via Gcc-patches




On 1/8/23 21:18, Gaius Mulley wrote:

Jeff Law via Gcc-patches  writes:


I've been getting sporadic errors like this since the introduction of
the modula-2 front-end:


In file included from ../../..//gcc/gcc/m2/mc-boot/GSFIO.c:29:
../../..//gcc/gcc/system.h:556:20: error: conflicting declaration of C function 'const char* strsignal(int)'
   556 | extern const char *strsignal (int);
   |^
In file included from /usr/include/c++/12/cstring:42,
  from ../../..//gcc/gcc/system.h:241:
/usr/include/string.h:478:14: note: previous declaration 'char* strsignal(int)'
   478 | extern char *strsignal (int __sig) __THROW;
   |  ^
In file included from ../../..//gcc/gcc/system.h:707:
../../..//gcc/gcc/../include/libiberty.h:112:14: error: ambiguating new declaration of 'char* basename(const char*)'
   112 | extern char *basename (const char *) ATTRIBUTE_RETURNS_NONNULL ATTRIBUTE_NONNULL(1);
   |  ^~~~
/usr/include/string.h:524:26: note: old declaration 'const char* basename(const char*)'
   524 | extern "C++" const char *basename (const char *__filename)
   |  ^~~~
make[1]: *** [../../..//gcc/gcc/m2/Make-lang.in:1364: m2/mc-boot/GSFIO.o] Error 1



They seem to come and go without rhyme or reason.  For example build
#1885 on lm32-elf failed, while #1884 passed.

Aside from the fact that I configure with --enable-languages=c,c++
and yet modula-2 stuff still gets built (can that be fixed?) it seems
like we're missing dependencies to ensure that the generated config.h
file is made before building the modula-2 stuff.

In a good build you'll see something like this:

config.status: creating auto-host.h
[ ... ]
Build GSFIO.o:
g++ -g -c -I. -I../../..//gcc/gcc/m2/mc-boot-ch
-I../../..//gcc/gcc/m2/mc-boot -I../../..//gcc/gcc/../include
-I../../..//gcc/gcc -I. -Im2/mc-boot -I../../..//gcc/gcc
  -I../../..//gcc/gcc/m2/mc-boot -I../../..//gcc/gcc/../include
-I../../..//gcc/gcc/../libcpp/include -I../../..//gcc/gcc/../libcody
  -I../../..//gcc/gcc/../libdecnumber
-I../../..//gcc/gcc/../libdecnumber/dpd -I../libdecnumber
  -I../../..//gcc/gcc/../libbacktrace
../../..//gcc/gcc/m2/mc-boot/GSFIO.c -o m2/mc-boot/GSFIO.o

Which naturally works just fine.

In a bad build, auto-host.h is _not_ created before trying to build GSFIO.o.

Can you please take care of this?  It's rather annoying to have builds
failing in the continuous testing system like this, particularly when
modula-2 isn't even enabled.

Jeff


Hi Jeff,

many apologies for the breakage - I've now added the Makefile
dependencies.  I've also regenerated the m2 configure scripts.

I'm still seeing it as of about 2 hours ago:

http://law-sandy.freeddns.org:8080/job/avr-elf/2125/console

A good run (yesterday):

http://law-sandy.freeddns.org:8080/job/avr-elf/2124/console


However, I did find that my scripts were enabling all languages -- sorry 
I stated otherwise and blamed it on the M2 front-end.  The only issue we 
need to resolve is the dependency problem.


jeff


[PATCH, Modula2] PR-108142 Many empty directories created in the build directory

2023-01-09 Thread Gaius Mulley via Gcc-patches


PR-108142 Modula-2 configure generates many subdirectories in the top
build directory.  This patch dynamically creates subdirectories under
gcc/m2 if and when required.

Bootstrapped on x86_64 gnu/linux, ok for master?

regards,
Gaius


gcc/m2/ChangeLog:

* Make-lang.in (GM2_1): Change -B path to m2/stage1.
($(objdir)/m2/images/gnu.eps): Check and create dest dir
if necessary.
(gm2-libs.texi-check): Check and create dir m2/gm2-libs-pim,
m2/gm2-libs-iso and m2/gm2-libs if necessary.
($(objdir)/m2/gm2-compiler-boot): Remove.
($(objdir)/m2/gm2-libs-boot): Remove.
($(objdir)/m2/gm2-libs-libiberty): Remove.
($(objdir)/m2/gm2-libiberty): Remove.
($(objdir)/m2/gm2-gcc): Remove.
($(objdir)/m2/gm2-compiler): Remove.
($(objdir)/m2/gm2-libs): Remove.
($(objdir)/m2/gm2-libs-iso): Remove.
($(objdir)/m2/gm2-libs-min): Remove.
($(objdir)/m2/gm2-compiler-paranoid): Remove.
($(objdir)/m2/gm2-libs-paranoid): Remove.
($(objdir)/m2/gm2-compiler-verify): Remove.
($(objdir)/m2/boot-bin): Remove.
($(objdir)/m2/gm2-libs-pim): Remove.
($(objdir)/m2/gm2-libs-coroutines): Remove.
(stage1/m2): Remove.
(stage2/m2): Remove.
(stage3/m2): Remove.
(m2.stageprofile): New rule.
(m2.stagefeedback): New rule.
(cc1gm2$(exeext)): Change dependent name.
(m2/stage2/cc1gm2$(exeext)): Change dependent name.
Check and create dest dir.
(m2/stage1/cc1gm2$(exeext)): Check and create dest dir
if necessary.
(m2/gm2-gcc/%.o): Ditto.
(m2/gm2-gcc/rtegraph.o): Ditto.
(m2/gm2-gcc/$(SRC_PREFIX)%.h): Ditto.
(m2/gm2-gcc/$(SRC_PREFIX)%.h): Ditto.
(m2/mc-boot): Ditto.
(m2/mc-boot-ch): Ditto.
(m2/gm2-libs-boot): Ditto.
(m2/gm2-compiler-boot): Ditto.
(m2/gm2-compiler): Ditto.
(m2/gm2-libiberty): Ditto.
(m2/gm2-compiler): Ditto.
(m2/gm2-libs-iso): Ditto.
(m2/gm2-libs): Ditto.
(m2/gm2-libs-min): Ditto.
(m2/gm2-libs-coroutines): Ditto.
(m2/boot-bin): Ditto.
(m2/pge-boot): Ditto.
(m2/pge-boot): Ditto.
* Make-maintainer.in (m2/gm2-ppg-boot): Check and create
dest dir if necessary.
(m2): Ditto.
(m2/gm2-ppg-boot): Ditto.
(m2/gm2-pg-boot): Ditto.
(m2/gm2-auto): Ditto.
(m2/gm2-pg-boot): Ditto.
(m2/gm2-pge-boot): Ditto.
($(objdir)/plugin): Ditto.
($(objdir)/m2/mc-boot-ch): Ditto.
($(objdir)/m2/mc-boot-gen): Ditto.
(m2/boot-bin): Ditto.
(m2/mc): Ditto.
(m2/mc-obj): Ditto.
($(objdir)/m2/gm2-ppg-boot): Ditto.
($(objdir)/m2/gm2-pg-boot): Ditto.
($(objdir)/m2/gm2-pge-boot): Ditto.
(m2/mc-boot-gen): Ditto.
(m2/m2obj3): Ditto.
(m2/gm2-libs-paranoid): Ditto.
(m2/gm2-compiler-paranoid): Ditto.
(m2/gm2-libs-paranoid): Ditto.
(m2/gm2-compiler-paranoid): Ditto.
(m2/gm2-libs-paranoid): Ditto.
(m2/gm2-compiler-paranoid): Ditto.
* config-lang.in (m2/gm2-compiler-boot): Remove mkdir.
(m2/gm2-libs-boot): Ditto.
(m2/gm2-ici-boot): Ditto.
(m2/gm2-libiberty): Ditto.
(m2/gm2-gcc): Ditto.
(m2/gm2-compiler): Ditto.
(m2/gm2-libs): Ditto.
(m2/gm2-libs-iso): Ditto.
(m2/gm2-compiler-paranoid): Ditto.
(m2/gm2-libs-paranoid): Ditto.
(m2/gm2-compiler-verify): Ditto.
(m2/boot-bin): Ditto.
(m2/gm2-libs-pim): Ditto.
(m2/gm2-libs-coroutines): Ditto.
(m2/gm2-libs-min): Ditto.
(m2/pge-boot): Ditto.
(plugin): Ditto.
(stage1/m2): Ditto.
(stage2/m2): Ditto.
(stage3/m2): Ditto.
(stage4/m2): Ditto.
(m2/gm2-auto): Ditto.
(m2/gm2-pg-boot): Ditto.
(m2/gm2-pge-boot): Ditto.
(m2/gm2-ppg-boot): Ditto.
(m2/mc-boot): Ditto.
(m2/mc-boot-ch): Ditto.
(m2/mc-boot-gen): Ditto.

-- o< -- o< -- o< -- o< -- o< -- o< -- o<
diff --git a/gcc/m2/Make-lang.in b/gcc/m2/Make-lang.in
index 08d0f3b963f..a3751109481 100644
--- a/gcc/m2/Make-lang.in
+++ b/gcc/m2/Make-lang.in
@@ -27,7 +27,7 @@ GM2_CROSS_NAME = `echo gm2|sed 
'$(program_transform_cross_name)'`

 M2_MAINTAINER = no

-GM2_1 = ./gm2 -B./stage1/m2 -g -fm2-g
+GM2_1 = ./gm2 -B./m2/stage1 -g -fm2-g

 GM2_FOR_TARGET = $(STAGE_CC_WRAPPER) ./gm2 -B./ -B$(build_tooldir)/bin/ 
-L$(objdir)/../ld $(TFLAGS)

@@ -71,7 +71,6 @@ m2.srcextra: m2/SYSTEM-pim.texi m2/SYSTEM-iso.texi 
m2/gm2-libs.texi m2/gm2-ebnf.
-cp -p m2/SYSTEM-iso.texi $(srcdir)/m2
-cp -p m2/gm2-libs.texi $(srcdir)/m2
-cp -p m2/gm2-ebnf.texi $(srcdir)/m2
-   find . -name '*.texi' -print
 else
 m2.srcextra:
 endif
@@ -167,7 +166,7 @@ doc/m2.info: $(TEXISRC)
else true; fi

 

[Bug target/108240] [13 Regression] ICE in emit_library_call_value_1 at gcc/calls.cc:4181 since r13-4894-gacc727cf02a144

2023-01-09 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108240

--- Comment #5 from Kewen Lin  ---
(In reply to Segher Boessenkool from comment #4)
> (In reply to Kewen Lin from comment #3)
> > With the culprit commit r13-4894, we always implicitly enable powerpc64 for
> > both explicit and implicit 64 bit, it's the same as before for the explicit
> > 64 bit case, but for the implicit 64 bit case, there is no chance for the
> > used cpu to unset powerpc64 (like this case). To keep it consistent with the
> > previous, the fix can be to only enable powerpc64 implicitly for explicit 64
> > bit, while let it be for implicit 64 bit.
> 
> No?  If the user says to use a CPU without 64-bit instructions, while the
> user also says we require 64-bit insns (via -m64), we should just error.

But both the previous behavior (before r13-4894) and the current behavior
(starting from r13-4894) honour the given explicit -m64: it always enables
-mpowerpc64 at the same time, without any errors/warnings.

> Not hide the problem (and cause many more problems!)
> 

The behavior change is for the case without any explicit -m64 but where
TARGET_DEFAULT has 64 bit set (implicit -m64).  And yes, unlike the
previous behavior, the current behavior hides the error/warning and forces
-mpowerpc64, so I posted a patch at:

https://gcc.gnu.org/pipermail/gcc-patches/2023-January/609492.html

It would allow powerpc64 to be unset if the user says to use a CPU without
64-bit instructions and 64 bit is only implicit.
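
To make the cases in this comment concrete, here is a toy model in C of when
powerpc64 ends up enabled.  It is only a sketch of the behavior as described
above, not code from the rs6000 back end: the enum and both function names
are hypothetical, and the real option handling involves more state than two
inputs.

```c
#include <stdbool.h>

/* Toy model only: these names are hypothetical and do not exist in GCC.
   M64_EXPLICIT means the user passed -m64; M64_IMPLICIT means 64 bit comes
   only from TARGET_DEFAULT.  */
enum m64_source { M64_EXPLICIT, M64_IMPLICIT };

/* Behavior starting from r13-4894, as described in the comment: powerpc64
   is always enabled for 64 bit, so the selected CPU has no chance to
   unset it.  */
bool
powerpc64_enabled_r13_4894 (enum m64_source src, bool cpu_has_64bit_insns)
{
  (void) src;
  (void) cpu_has_64bit_insns;
  return true;
}

/* Behavior the posted patch is described as aiming for: explicit -m64
   still enables powerpc64, while for implicit 64 bit a CPU without 64-bit
   instructions may leave it unset.  */
bool
powerpc64_enabled_patched (enum m64_source src, bool cpu_has_64bit_insns)
{
  if (src == M64_EXPLICIT)
    return true;
  return cpu_has_64bit_insns;
}
```

In this model the two functions differ only for implicit 64 bit combined with
a CPU that lacks 64-bit instructions, which is the case the behavior change
is about.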

Re: [PATCH v3 17/19] modula2 front end: dejagnu expect library scripts

2023-01-09 Thread Jason Merrill via Gcc-patches

On 12/6/22 09:47, Gaius Mulley via Gcc-patches wrote:

Here are the dejagnu expect library scripts for the gm2
testsuite.


A couple of weeks ago I noticed on a testrun that the modula tests 
didn't seem to be timing out properly, so I made this change.  It looks 
like they didn't run at all in the bootstrap/test I did just now, so I 
don't know if this change is actually helpful, but here it is if you 
think it makes sense:


From 6c9007800b8793c68921ee3d24f3a5000b44a100 Mon Sep 17 00:00:00 2001
From: Jason Merrill 
Date: Wed, 21 Dec 2022 17:01:50 -0500
Subject: [PATCH] testsuite: use same timeout for gm2 as other front-ends
To: gcc-patches@gcc.gnu.org

I noticed Modula tests running forever in a regression test run, and then
that its .exp wasn't using timeout.exp like the other front-ends.

gcc/testsuite/ChangeLog:

	* lib/gm2.exp: Use timeout.exp.
---
 gcc/testsuite/lib/gm2.exp | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/gcc/testsuite/lib/gm2.exp b/gcc/testsuite/lib/gm2.exp
index 9eba195291a..1fa62d8e6ea 100644
--- a/gcc/testsuite/lib/gm2.exp
+++ b/gcc/testsuite/lib/gm2.exp
@@ -22,7 +22,7 @@ load_lib libgloss.exp
 load_lib prune.exp
 load_lib gcc-defs.exp
 load_lib target-libpath.exp
-
+load_lib timeout.exp
 
 #
 # GCC_UNDER_TEST is the compiler under test.
@@ -183,9 +183,7 @@ proc gm2_target_compile_default { source dest type options } {
 if [info exists TOOL_OPTIONS] {
 	lappend options "additional_flags=$TOOL_OPTIONS"
 }
-if [target_info exists gcc,timeout] {
-	lappend options "timeout=[target_info gcc,timeout]"
-}
+lappend options "timeout=[timeout_value]"
 lappend options "compiler=$GCC_UNDER_TEST"
 # puts stderr "options = $options\n"
 # puts stderr "* target_compile: $source $dest $type $options\n"
-- 
2.31.1



Re: B^HDEAD code generation (AMD64)

2023-01-09 Thread Stefan Kanthak
"Thomas Koenig"  wrote:

> On 09.01.23 12:35, Stefan Kanthak wrote:
>> 20 superfluous instructions of the total 102 instructions!
> 
> The proper place for bug reports is https://gcc.gnu.org/bugzilla/ .

OUCH: there's NO proper place for bugs at all!

> Feel free to submit these cases there.

I feel free to do whatever I like to do where I do it, for example:

--- bug.cpp ---
int main() {
__uint128_t long long bug = 0;
}
--- EOF ---

See 

regards
Stefan


[Bug tree-optimization/106878] [11/12 Regression] ICE: verify_gimple failed at -O2 with pointers and bitwise calculation

2023-01-09 Thread vvinayag at arm dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106878

vvinayag at arm dot com changed:

   What|Removed |Added

 CC||vvinayag at arm dot com

--- Comment #13 from vvinayag at arm dot com ---
Will this fix be backported to GCC 12 and GCC 11 ?

[Bug modula2/108261] modula-2 module registration process seems to fail with shared libraries.

2023-01-09 Thread iains at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108261

--- Comment #10 from Iain Sandoe  ---
Initial questions (still digesting the remainder).



When a module has the same name but a different interface, are the symbols
distinct (i.e. mangled differently)?

If not:

 - then I can see how it works with static archives - because the static linker
picks the first one presented.
 - but multiple shared libraries in the same process, with the same symbol in
them would seem to be challenging (I'm not sure how the various dynamic loaders
would behave - i.e. the load order might not be sufficient?).

If they are:

 - we could still build a monolithic library; it is up to the FE (presumably in
conjunction with the include paths) to ensure that it references the symbol
that is relevant to the interface style (iso/pim) chosen.
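
The two cases above can be illustrated with a small simulation in C.  This
is only a sketch of the "first definition presented wins" rule mentioned for
static archives; it does not model any dynamic loader, and the struct, the
function and the symbol names used with it are hypothetical (they are not
the names gm2 actually emits).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one definition seen by the static linker.  */
struct definition
{
  const char *symbol;  /* linker-visible name */
  const char *origin;  /* library variant that provides it, e.g. "pim" */
};

/* Return the origin of the first definition of SYMBOL in link order,
   or NULL if no definition matches.  Later duplicates are never
   consulted, which mirrors picking the first one presented.  */
const char *
resolve_first (const struct definition *defs, size_t n, const char *symbol)
{
  for (size_t i = 0; i < n; i++)
    if (strcmp (defs[i].symbol, symbol) == 0)
      return defs[i].origin;
  return NULL;
}
```

With two entries sharing one symbol name, the result depends purely on their
order in the array.  With distinct names per interface style, each lookup
finds exactly one entry regardless of order, which is the situation in which
a monolithic library would be workable.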

[PATCH] RISC-V: Add the rest testcases of AVL=REG support

2023-01-09 Thread juzhe . zhong
From: Ju-Zhe Zhong 

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/vsetvl/avl_single-1.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-10.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-11.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-12.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-13.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-14.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-15.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-16.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-17.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-18.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-19.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-7.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-70.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-71.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-72.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-8.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-9.c: New test.

---
 .../riscv/rvv/vsetvl/avl_single-1.c   | 17 ++
 .../riscv/rvv/vsetvl/avl_single-10.c  | 21 +++
 .../riscv/rvv/vsetvl/avl_single-11.c  | 21 +++
 .../riscv/rvv/vsetvl/avl_single-12.c  | 19 +++
 .../riscv/rvv/vsetvl/avl_single-13.c  | 28 ++
 .../riscv/rvv/vsetvl/avl_single-14.c  | 27 +
 .../riscv/rvv/vsetvl/avl_single-15.c  | 27 +
 .../riscv/rvv/vsetvl/avl_single-16.c  | 32 +++
 .../riscv/rvv/vsetvl/avl_single-17.c  | 29 ++
 .../riscv/rvv/vsetvl/avl_single-18.c  | 29 ++
 .../riscv/rvv/vsetvl/avl_single-19.c  | 40 +
 .../riscv/rvv/vsetvl/avl_single-7.c   | 17 ++
 .../riscv/rvv/vsetvl/avl_single-70.c  | 41 ++
 .../riscv/rvv/vsetvl/avl_single-71.c  | 54 ++
 .../riscv/rvv/vsetvl/avl_single-72.c  | 46 +++
 .../riscv/rvv/vsetvl/avl_single-8.c   | 18 ++
 .../riscv/rvv/vsetvl/avl_single-9.c   | 56 +++
 17 files changed, 522 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-10.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-11.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-12.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-13.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-14.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-15.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-16.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-17.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-18.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-19.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-70.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-71.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-72.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-9.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-1.c 
b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-1.c
new file mode 100644
index 00000000000..84225dbe7d2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-1.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32 -fno-schedule-insns 
-fno-schedule-insns2" } */
+
+#include "riscv_vector.h"
+
+void f (void * restrict in, void * restrict out, int n, int vl)
+{
+  for (int i = 0; i < n; i++)
+{
+  vint8mf8_t v = __riscv_vle8_v_i8mf8 (in + i, vl);
+  __riscv_vse8_v_i8mf8 (out + i, v, vl);
+}
+}
+
+/* { dg-final { scan-assembler-times 
{\.L[0-9]+\:\s+vle8\.v\s+v[0-9]+,\s*0\s*\([a-x0-9]+\)} 1 { target { no-opts 
"-O0" no-opts "-Os" no-opts "-g" no-opts "-funroll-loops" } } } } */
+/* { dg-final { scan-assembler-times 
{vsetvli\s+zero,\s*[a-x0-9]+,\s*e8,\s*mf8,\s*t[au],\s*m[au]} 1 { target { 
no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */
+/* { dg-final { scan-assembler-times {vsetvli} 1 { target { no-opts "-O0" 
no-opts "-g" no-opts "-funroll-loops" } } } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-10.c 
b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-10.c
new file mode 100644
index 00000000000..f64d1c3680f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-10.c
@@ -0,0 

[PATCH] RISC-V: Add testcases for AVL=REG support

2023-01-09 Thread juzhe . zhong
From: Ju-Zhe Zhong 

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/vsetvl/avl_single-2.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-20.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-21.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-22.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-23.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-24.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-25.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-26.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-27.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-28.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-29.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-3.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-30.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-31.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-32.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-33.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-34.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-35.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-36.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-37.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-38.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-39.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-4.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-40.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-41.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-42.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-43.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-44.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-45.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-46.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-47.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-48.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-49.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-5.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-50.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-51.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-52.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-53.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-54.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-55.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-56.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-57.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-58.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-59.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-6.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-60.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-61.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-62.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-63.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-64.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-65.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-66.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-67.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-68.c: New test.
* gcc.target/riscv/rvv/vsetvl/avl_single-69.c: New test.

---
 .../riscv/rvv/vsetvl/avl_single-2.c   | 18 ++
 .../riscv/rvv/vsetvl/avl_single-20.c  | 40 +
 .../riscv/rvv/vsetvl/avl_single-21.c  | 32 +++
 .../riscv/rvv/vsetvl/avl_single-22.c  | 42 ++
 .../riscv/rvv/vsetvl/avl_single-23.c  | 34 +++
 .../riscv/rvv/vsetvl/avl_single-24.c  | 36 
 .../riscv/rvv/vsetvl/avl_single-25.c  | 38 +
 .../riscv/rvv/vsetvl/avl_single-26.c  | 35 
 .../riscv/rvv/vsetvl/avl_single-27.c  | 36 
 .../riscv/rvv/vsetvl/avl_single-28.c  | 30 ++
 .../riscv/rvv/vsetvl/avl_single-29.c  | 31 ++
 .../riscv/rvv/vsetvl/avl_single-3.c   | 19 +++
 .../riscv/rvv/vsetvl/avl_single-30.c  | 29 ++
 .../riscv/rvv/vsetvl/avl_single-31.c  | 27 +
 .../riscv/rvv/vsetvl/avl_single-32.c  | 27 +
 .../riscv/rvv/vsetvl/avl_single-33.c  | 29 ++
 .../riscv/rvv/vsetvl/avl_single-34.c  | 28 +
 .../riscv/rvv/vsetvl/avl_single-35.c  | 27 +
 .../riscv/rvv/vsetvl/avl_single-36.c  | 25 
 .../riscv/rvv/vsetvl/avl_single-37.c  | 29 ++
 .../riscv/rvv/vsetvl/avl_single-38.c  | 57 +++
 .../riscv/rvv/vsetvl/avl_single-39.c   

[PATCH] RISC-V: Adjust testcases for AVL=REG support

2023-01-09 Thread juzhe . zhong
From: Ju-Zhe Zhong 

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/vsetvl/imm_bb_prop-3.c: Adjust testcase.
* gcc.target/riscv/rvv/vsetvl/imm_bb_prop-4.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/imm_conflict-4.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/imm_conflict-5.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/imm_loop_invariant-17.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-27.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-28.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-45.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-25.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-26.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-27.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-28.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-3.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_conflict-7.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-12.c: Ditto.

---
 .../gcc.target/riscv/rvv/vsetvl/imm_bb_prop-3.c|  2 +-
 .../gcc.target/riscv/rvv/vsetvl/imm_bb_prop-4.c|  2 +-
 .../gcc.target/riscv/rvv/vsetvl/imm_conflict-4.c   | 12 ++--
 .../gcc.target/riscv/rvv/vsetvl/imm_conflict-5.c   | 12 ++--
 .../riscv/rvv/vsetvl/imm_loop_invariant-17.c   |  3 +--
 .../riscv/rvv/vsetvl/vlmax_back_prop-27.c  |  4 ++--
 .../riscv/rvv/vsetvl/vlmax_back_prop-28.c  |  4 ++--
 .../riscv/rvv/vsetvl/vlmax_back_prop-45.c  |  2 +-
 .../gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-25.c | 14 +++---
 .../gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-26.c | 12 ++--
 .../gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-27.c | 12 ++--
 .../gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-28.c |  2 +-
 .../gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-3.c  |  2 +-
 .../gcc.target/riscv/rvv/vsetvl/vlmax_conflict-7.c |  1 -
 .../riscv/rvv/vsetvl/vlmax_switch_vtype-12.c   |  2 +-
 15 files changed, 42 insertions(+), 44 deletions(-)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/imm_bb_prop-3.c 
b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/imm_bb_prop-3.c
index 3da7b8722c2..20a1cd27c43 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/imm_bb_prop-3.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/imm_bb_prop-3.c
@@ -19,4 +19,4 @@ void f(void *base, void *out, void *mask_in, size_t vl, 
size_t m) {
   }
 }
 
-/* { dg-final { scan-assembler-times 
{vsetivli\s+zero,\s*4,\s*e8,\s*mf8,\s*tu,\s*mu} 1 { target { no-opts "-O0" 
no-opts "-O1" no-opts "-Os" no-opts "-g" no-opts "-funroll-loops" } } } } */
+/* { dg-final { scan-assembler-times 
{vsetivli\s+zero,\s*4,\s*e16,\s*mf4,\s*tu,\s*mu} 1 { target { no-opts "-O0" 
no-opts "-O1" no-opts "-Os" no-opts "-g" no-opts "-funroll-loops" } } } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/imm_bb_prop-4.c 
b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/imm_bb_prop-4.c
index 2a9616eb7ea..58aecb0a219 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/imm_bb_prop-4.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/imm_bb_prop-4.c
@@ -21,5 +21,5 @@ void f(void *base, void *out, void *mask_in, size_t vl, 
size_t m, size_t n) {
   }
 }
 
-/* { dg-final { scan-assembler-times 
{vsetivli\s+zero,\s*4,\s*e8,\s*mf8,\s*tu,\s*mu} 1 { target { no-opts "-O0" 
no-opts "-O1" no-opts "-Os" no-opts "-g" no-opts "-funroll-loops" } } } } */
+/* { dg-final { scan-assembler-times 
{vsetivli\s+zero,\s*4,\s*e16,\s*mf4,\s*tu,\s*mu} 1 { target { no-opts "-O0" 
no-opts "-O1" no-opts "-Os" no-opts "-g" no-opts "-funroll-loops" } } } } */
 
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/imm_conflict-4.c 
b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/imm_conflict-4.c
index f24e129b4dc..fdfcb07a63d 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/imm_conflict-4.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/imm_conflict-4.c
@@ -30,9 +30,9 @@ void f (void * restrict in, void * restrict out, int n, int 
cond)
   }
 }
 
-/* { dg-final { scan-assembler-times 
{vsetivli\s+zero,\s*5,\s*e8,\s*mf8,\s*tu,\s*m[au]} 1 { target { no-opts "-O0" 
no-opts "-g" no-opts "-funroll-loops" } } } } */
-/* { dg-final { scan-assembler-times 
{vsetivli\s+zero,\s*19,\s*e32,\s*m1,\s*t[au],\s*m[au]} 1 { target { no-opts 
"-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */
-/* { dg-final { scan-assembler-times 
{vsetivli\s+zero,\s*8,\s*e32,\s*m1,\s*t[au],\s*m[au]} 1 { target { no-opts 
"-O0" no-opts "-O1" no-opts "-g" no-opts "-funroll-loops" } } } } */
-/* { dg-final { scan-assembler-times 
{vsetvli\s+[a-x0-9]+,\s*zero,\s*e8,\s*mf8,\s*t[au],\s*m[au]} 1 { target { 
no-opts "-O0"  no-opts "-funroll-loops" no-opts "-g" } } } } */
-/* { dg-final { scan-assembler-times {vsetivli} 3 { target { no-opts "-O0" 
no-opts "-O1" no-opts "-funroll-loops" no-opts "-g" } } } } */
-/* { dg-final { scan-assembler-times {vsetvli} 1 { target { no-opts "-O0"  
no-opts "-funroll-loops" no-opts "-g" } } } } */
+/* { 

[PATCH] RISC-V: Fix bugs of supporting AVL=REG (single-real-def) in VSETVL PASS

2023-01-09 Thread juzhe . zhong
From: Ju-Zhe Zhong 

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (same_bb_and_before_p): Remove it.
(real_insn_and_same_bb_p): New function.
(same_bb_and_after_or_equal_p): Remove it.
(before_p): New function.
(reg_killed_by_bb_p): Ditto.
(has_vsetvl_killed_avl_p): Ditto.
(get_vl): Move location so that we can call it.
(anticipatable_occurrence_p): Fix issue of AVL=REG support.
(available_occurrence_p): Ditto.
(dominate_probability_p): Remove it.
(can_backward_propagate_p): Remove it.
(get_all_nonphi_defs): New function.
(get_all_predecessors): Ditto.
(any_insn_in_bb_p): Ditto.
(insert_vsetvl): Adjust AVL REG.
(source_equal_p): New function.
(extract_single_source): Ditto.
(avl_info::single_source_equal_p): Ditto.
(avl_info::operator==): Adjust for AVL=REG.
(vl_vtype_info::same_avl_p): Ditto.
(vector_insn_info::set_demand_info): Remove it.
(vector_insn_info::compatible_p): Adjust for AVL=REG.
(vector_insn_info::compatible_avl_p): New function.
(vector_insn_info::merge): Adjust AVL=REG.
(vector_insn_info::dump): Ditto.
(pass_vsetvl::merge_successors): Remove it.
(enum fusion_type): New enum.
(pass_vsetvl::get_backward_fusion_type): New function.
(pass_vsetvl::backward_demand_fusion): Adjust for AVL=REG.
(pass_vsetvl::forward_demand_fusion): Ditto.
(pass_vsetvl::demand_fusion): Ditto.
(pass_vsetvl::prune_expressions): Ditto.
(pass_vsetvl::compute_local_properties): Ditto.
(pass_vsetvl::cleanup_vsetvls): Ditto.
(pass_vsetvl::commit_vsetvls): Ditto.
(pass_vsetvl::init): Ditto.
* config/riscv/riscv-vsetvl.h (enum fusion_type): New enum.
(enum merge_type): New enum.

---
 gcc/config/riscv/riscv-vsetvl.cc | 928 +--
 gcc/config/riscv/riscv-vsetvl.h  |  68 ++-
 2 files changed, 710 insertions(+), 286 deletions(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 7aa2852b456..0245124e28f 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -178,34 +178,97 @@ vsetvl_insn_p (rtx_insn *rinsn)
 || INSN_CODE (rinsn) == CODE_FOR_vsetvlsi);
 }
 
-/* Return true if INSN1 comes befeore INSN2 in the same block.  */
 static bool
-same_bb_and_before_p (const insn_info *insn1, const insn_info *insn2)
+real_insn_and_same_bb_p (const insn_info *insn, const bb_info *bb)
 {
-  return ((insn1->bb ()->index () == insn2->bb ()->index ())
-&& (*insn1 < *insn2));
+  return insn != nullptr && insn->is_real () && insn->bb () == bb;
 }
 
-/* Return true if INSN1 comes after or equal INSN2 in the same block.  */
 static bool
-same_bb_and_after_or_equal_p (const insn_info *insn1, const insn_info *insn2)
+before_p (const insn_info *insn1, const insn_info *insn2)
 {
-  return ((insn1->bb ()->index () == insn2->bb ()->index ())
-&& (*insn1 >= *insn2));
+  return insn1->compare_with (insn2) == -1;
+}
+
+static bool
+reg_killed_by_bb_p (const bb_info *bb, rtx x)
+{
+  if (!x || vlmax_avl_p (x))
+return false;
+  for (const insn_info *insn : bb->real_nondebug_insns ())
+if (find_access (insn->defs (), REGNO (x)))
+  return true;
+  return false;
+}
+
+static bool
+has_vsetvl_killed_avl_p (const bb_info *bb, const vector_insn_info &info)
+{
+  if (info.dirty_with_killed_avl_p ())
+{
+  rtx avl = info.get_avl ();
+  for (const insn_info *insn : bb->reverse_real_nondebug_insns ())
+   {
+ def_info *def = find_access (insn->defs (), REGNO (avl));
+ if (def)
+   {
+ set_info *set = safe_dyn_cast<set_info *> (def);
+ if (!set)
+   return false;
+
+ rtx new_avl = gen_rtx_REG (GET_MODE (avl), REGNO (avl));
+ gcc_assert (new_avl != avl);
+ if (!info.compatible_avl_p (avl_info (new_avl, set)))
+   return false;
+
+ return true;
+   }
+   }
+}
+  return false;
+}
+
+/* Helper function to get VL operand.  */
+static rtx
+get_vl (rtx_insn *rinsn)
+{
+  if (has_vl_op (rinsn))
+{
+  extract_insn_cached (rinsn);
+  return recog_data.operand[get_attr_vl_op_idx (rinsn)];
+}
+  return SET_DEST (XVECEXP (PATTERN (rinsn), 0, 0));
 }
 
 /* An "anticipatable occurrence" is one that is the first occurrence in the
basic block, the operands are not modified in the basic block prior
to the occurrence and the output is not used between the start of
-   the block and the occurrence.  */
+   the block and the occurrence.
+
+   For VSETVL instruction, we have these following formats:
+ 1. vsetvl zero, rs1.
+ 2. vsetvl zero, imm.
+ 3. vsetvl rd, rs1.
+
+   So base on these circumstances, a DEM is considered as a local anticipatable
+   occurrence should satisfy these 

[PATCH] RISC-V: Call DCE to remove redundant instructions created by the PASS

2023-01-09 Thread juzhe . zhong
From: Ju-Zhe Zhong 

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (pass_vsetvl::done): Add DCE.
* config/riscv/t-riscv: Add DCE.

---
 gcc/config/riscv/riscv-vsetvl.cc | 2 ++
 gcc/config/riscv/t-riscv | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 7d8c3a32aaa..7aa2852b456 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -87,6 +87,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "predict.h"
 #include "profile-count.h"
 #include "riscv-vsetvl.h"
+#include "dce.h"
 
 using namespace rtl_ssa;
 using namespace riscv_vector;
@@ -2627,6 +2628,7 @@ pass_vsetvl::done (void)
cleanup_cfg (0);
   delete crtl->ssa;
   crtl->ssa = nullptr;
+  run_fast_dce ();
 }
   m_vector_manager->release ();
   delete m_vector_manager;
diff --git a/gcc/config/riscv/t-riscv b/gcc/config/riscv/t-riscv
index d30e0235356..c95f4aff358 100644
--- a/gcc/config/riscv/t-riscv
+++ b/gcc/config/riscv/t-riscv
@@ -54,7 +54,7 @@ riscv-c.o: $(srcdir)/config/riscv/riscv-c.cc $(CONFIG_H) 
$(SYSTEM_H) \
 riscv-vsetvl.o: $(srcdir)/config/riscv/riscv-vsetvl.cc \
   $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) $(REGS_H) \
   $(TARGET_H) tree-pass.h df.h rtl-ssa.h cfgcleanup.h insn-config.h \
-  insn-attr.h insn-opinit.h tm-constrs.h cfgrtl.h cfganal.h lcm.h \
+  insn-attr.h insn-opinit.h tm-constrs.h cfgrtl.h cfganal.h lcm.h dce.h \
   predict.h profile-count.h $(srcdir)/config/riscv/riscv-vsetvl.h
$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
$(srcdir)/config/riscv/riscv-vsetvl.cc
-- 
2.36.1



[PATCH] RISC-V: Add probability model of each block to prevent endless loop of Phase 3

2023-01-09 Thread juzhe . zhong
From: Ju-Zhe Zhong 

Note that the pass simply uses a probability threshold of >= 50% to
decide on backward fusion, which can create an endless loop in Phase 3.

Add a probability for each block to fix this bug.
gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc 
(vector_infos_manager::vector_infos_manager): Add probability.
(vector_infos_manager::dump): Ditto.
(pass_vsetvl::compute_probabilities): Ditto.
* config/riscv/riscv-vsetvl.h (struct vector_block_info): Ditto.

---
 gcc/config/riscv/riscv-vsetvl.cc | 39 
 gcc/config/riscv/riscv-vsetvl.h  |  3 +++
 2 files changed, 42 insertions(+)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 0f12d4ddb23..7d8c3a32aaa 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -1465,6 +1465,7 @@ vector_infos_manager::vector_infos_manager ()
  vector_block_infos[bb->index ()].reaching_out = vector_insn_info ();
  for (insn_info *insn : bb->real_insns ())
vector_insn_infos[insn->uid ()].parse_insn (insn);
+ vector_block_infos[bb->index ()].probability = profile_probability ();
}
 }
 }
@@ -1642,6 +1643,8 @@ vector_infos_manager::dump (FILE *file) const
}
   fprintf (file, "=");
   vector_block_infos[cfg_bb->index].reaching_out.dump (file);
+  fprintf (file, "=");
+  vector_block_infos[cfg_bb->index].probability.dump (file);
   fprintf (file, "\n\n");
 }
 
@@ -1764,6 +1767,7 @@ private:
 
   void init (void);
   void done (void);
+  void compute_probabilities (void);
 
 public:
   pass_vsetvl (gcc::context *ctxt) : rtl_opt_pass (pass_data_vsetvl, ctxt) {}
@@ -2629,6 +2633,41 @@ pass_vsetvl::done (void)
   m_vector_manager = nullptr;
 }
 
+/* Compute probability for each block.  */
+void
+pass_vsetvl::compute_probabilities (void)
+{
+  /* Don't compute it in -O0 since we don't need it.  */
+  if (!optimize)
+return;
+  edge e;
+  edge_iterator ei;
+
+  for (const bb_info *bb : crtl->ssa->bbs ())
+{
+  basic_block cfg_bb = bb->cfg_bb ();
+  auto &curr_prob
+   = m_vector_manager->vector_block_infos[cfg_bb->index].probability;
+  if (ENTRY_BLOCK_PTR_FOR_FN (cfun) == cfg_bb)
+   curr_prob = profile_probability::always ();
+  gcc_assert (curr_prob.initialized_p ());
+  FOR_EACH_EDGE (e, ei, cfg_bb->succs)
+   {
+ auto &new_prob
+   = m_vector_manager->vector_block_infos[e->dest->index].probability;
+ if (!new_prob.initialized_p ())
+   new_prob = curr_prob * e->probability;
+ else if (new_prob == profile_probability::always ())
+   continue;
+ else
+   new_prob += curr_prob * e->probability;
+   }
+}
+  auto &exit_block
+    = m_vector_manager->vector_block_infos[EXIT_BLOCK_PTR_FOR_FN (cfun)->index];
+  exit_block.probability = profile_probability::always ();
+}
+
 /* Lazy vsetvl insertion for optimize > 0. */
 void
 pass_vsetvl::lazy_vsetvl (void)
diff --git a/gcc/config/riscv/riscv-vsetvl.h b/gcc/config/riscv/riscv-vsetvl.h
index 563ad3084ed..fb3ebb9db79 100644
--- a/gcc/config/riscv/riscv-vsetvl.h
+++ b/gcc/config/riscv/riscv-vsetvl.h
@@ -291,6 +291,9 @@ struct vector_block_info
   /* The reaching_out vector insn_info of the block.  */
   vector_insn_info reaching_out;
 
+  /* The static execute probability of the demand info.  */
+  profile_probability probability;
+
   vector_block_info () = default;
 };
 
-- 
2.36.1
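
The propagation idea behind compute_probabilities can be illustrated outside of GCC. The following is only a minimal sketch under stated assumptions: ToyEdge and compute_block_probabilities are hypothetical names, plain doubles stand in for profile_probability, and the toy CFG is assumed to be acyclic with blocks numbered after all of their predecessors. It is not the code from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a CFG edge; not GCC's `edge` type.
struct ToyEdge
{
  std::size_t dest;   // index of the successor block
  double probability; // chance of leaving the source block via this edge
};

// succs[i] holds the outgoing edges of block i.  Assumptions of this sketch:
// block 0 is the entry block, the graph is acyclic, and blocks are numbered
// so that each block comes after all of its predecessors.
std::vector<double>
compute_block_probabilities (const std::vector<std::vector<ToyEdge>> &succs)
{
  assert (!succs.empty ());
  std::vector<double> prob (succs.size (), 0.0);
  prob[0] = 1.0; // the entry block is always executed
  for (std::size_t bb = 0; bb < succs.size (); ++bb)
    for (const ToyEdge &e : succs[bb])
      {
        assert (e.dest < succs.size ());
        // Each edge contributes the probability of reaching its source
        // block times the probability of taking the edge.
        prob[e.dest] += prob[bb] * e.probability;
      }
  return prob;
}
```

On a diamond-shaped CFG (entry branching 75%/25% into two blocks that rejoin), the join block ends up with probability 1.0 again, which is the kind of per-block figure a fusion heuristic could compare against a threshold.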



[PATCH] RISC-V: Remove dirty_pat since it is redundant

2023-01-09 Thread juzhe . zhong
From: Ju-Zhe Zhong 

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (vector_insn_info::operator==): Remove 
dirty_pat.
(vector_insn_info::merge): Ditto.
(vector_insn_info::dump): Ditto.
(pass_vsetvl::merge_successors): Ditto.
(pass_vsetvl::backward_demand_fusion): Ditto.
(pass_vsetvl::forward_demand_fusion): Ditto.
(pass_vsetvl::commit_vsetvls): Ditto.
* config/riscv/riscv-vsetvl.h: Ditto.

---
 gcc/config/riscv/riscv-vsetvl.cc | 28 
 gcc/config/riscv/riscv-vsetvl.h  | 11 +--
 2 files changed, 13 insertions(+), 26 deletions(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 3c920779992..0f12d4ddb23 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -1083,10 +1083,10 @@ vector_insn_info::operator== (const vector_insn_info &other) const
 if (m_demands[i] != other.demand_p ((enum demand_type) i))
   return false;
 
-  if (m_insn != other.get_insn ())
-return false;
-  if (m_dirty_pat != other.get_dirty_pat ())
-return false;
+  if (vector_config_insn_p (m_insn->rtl ())
+  || vector_config_insn_p (other.get_insn ()->rtl ()))
+if (m_insn != other.get_insn ())
+  return false;
 
   if (!same_avl_p (other))
 return false;
@@ -1317,8 +1317,6 @@ vector_insn_info::merge (const vector_insn_info &merge_info,
new_info.set_insn (merge_info.get_insn ());
 }
 
-  new_info.set_dirty_pat (merge_info.get_dirty_pat ());
-
   if (!demand_p (DEMAND_AVL) && !merge_info.demand_p (DEMAND_AVL))
 new_info.undemand (DEMAND_AVL);
   if (!demand_p (DEMAND_SEW) && !merge_info.demand_p (DEMAND_SEW))
@@ -1431,11 +1429,6 @@ vector_insn_info::dump (FILE *file) const
  fprintf (file, "The real INSN=");
  print_rtl_single (file, get_insn ()->rtl ());
}
-  if (get_dirty_pat ())
-   {
- fprintf (file, "Dirty RTL Pattern=");
- print_rtl_single (file, get_dirty_pat ());
-   }
 }
 }
 
@@ -1967,7 +1960,6 @@ pass_vsetvl::merge_successors (const basic_block father,
 
   new_info.set_dirty ();
   rtx new_pat = gen_vsetvl_pat (new_info.get_insn ()->rtl (), new_info);
-  new_info.set_dirty_pat (new_pat);
 
   father_info.local_dem = new_info;
   father_info.reaching_out = new_info;
@@ -2051,7 +2043,6 @@ pass_vsetvl::backward_demand_fusion (void)
 
  block_info.reaching_out = prop;
  block_info.reaching_out.set_dirty ();
- block_info.reaching_out.set_dirty_pat (new_pat);
  block_info.local_dem = block_info.reaching_out;
  changed_p = true;
}
@@ -2080,7 +2071,6 @@ pass_vsetvl::backward_demand_fusion (void)
  rtx new_pat
= gen_vsetvl_pat (new_info.get_insn ()->rtl (), new_info);
  new_info.set_dirty ();
- new_info.set_dirty_pat (new_pat);
  block_info.local_dem = new_info;
  block_info.reaching_out = new_info;
  changed_p = true;
@@ -2178,7 +2168,6 @@ pass_vsetvl::forward_demand_fusion (void)
= gen_vsetvl_pat (prop.get_insn ()->rtl (), prop);
  local_dem = prop;
  local_dem.set_dirty ();
- local_dem.set_dirty_pat (dirty_pat);
  reaching_out = local_dem;
}
  else
@@ -2507,10 +2496,17 @@ pass_vsetvl::commit_vsetvls (void)
   if (!reaching_out.dirty_p ())
continue;
 
-  rtx new_pat = reaching_out.get_dirty_pat ();
+
+  rtx new_pat;
   if (can_refine_vsetvl_p (cfg_bb, reaching_out.get_ratio ()))
new_pat
  = gen_vsetvl_pat (VSETVL_VTYPE_CHANGE_ONLY, reaching_out, NULL_RTX);
+  else if (vlmax_avl_p (reaching_out.get_avl ()))
+   new_pat = gen_vsetvl_pat (VSETVL_NORMAL, reaching_out,
+ get_vl (reaching_out.get_insn ()->rtl ()));
+  else
+   new_pat
+ = gen_vsetvl_pat (VSETVL_DISCARD_RESULT, reaching_out, NULL_RTX);
 
   start_sequence ();
   emit_insn (new_pat);
diff --git a/gcc/config/riscv/riscv-vsetvl.h b/gcc/config/riscv/riscv-vsetvl.h
index dc16c55b918..563ad3084ed 100644
--- a/gcc/config/riscv/riscv-vsetvl.h
+++ b/gcc/config/riscv/riscv-vsetvl.h
@@ -220,13 +220,6 @@ private:
  (with AVL included) before vmv.x.s, but vmv.x.s is not the INSN holding 
the
  definition of AVL.  */
   rtl_ssa::insn_info *m_insn;
-  /* Save instruction pattern for Dirty block.
- Since empty block may be polluted as a dirty block during dem backward
- propagation (phase 3) which is intending to cheat LCM there is a VSETVL
- instruction here to gain better LCM optimization. Such instruction is not
- emit yet, we save this here and then emit it in the 4th phase if it is
- necessary.  */
-  rtx m_dirty_pat;
 
   /* Parse the instruction to get VL/VTYPE information and demanding

[Bug c++/108347] Incorrect error: ambiguous template instantiation

2023-01-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108347

--- Comment #3 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #2)
> Created attachment 54223 [details]
> testcase that removes the C++17isms and convert it over to C++11

I did this because I wanted to see whether older versions of GCC rejected
this code, and they do, in a similar fashion. (MSVC still ICEs.)

[PATCH] RISC-V: Rename insn into rinsn for rtx_insn *

2023-01-09 Thread juzhe . zhong
From: Ju-Zhe Zhong 

The pass is implemented on top of the RTL_SSA framework. By RTL_SSA
convention, an insn_info * is named insn and an rtx_insn * is named
rinsn. I follow this rule in this pass but missed this function, so
rename its parameter to keep the code consistent with the RTL_SSA
framework.

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (add_label_notes): Rename insn to rinsn.

---
 gcc/config/riscv/riscv-vsetvl.cc | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index a972c49561a..3c920779992 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -682,7 +682,7 @@ insert_vsetvl (enum emit_type emit_type, rtx_insn *rinsn,
necessary REG_LABEL_OPERAND and REG_LABEL_TARGET notes.  */
 
 static void
-add_label_notes (rtx x, rtx_insn *insn)
+add_label_notes (rtx x, rtx_insn *rinsn)
 {
   enum rtx_code code = GET_CODE (x);
   int i, j;
@@ -699,8 +699,8 @@ add_label_notes (rtx x, rtx_insn *insn)
   /* There's no reason for current users to emit jump-insns with
 such a LABEL_REF, so we don't have to handle REG_LABEL_TARGET
 notes.  */
-  gcc_assert (!JUMP_P (insn));
-  add_reg_note (insn, REG_LABEL_OPERAND, label_ref_label (x));
+  gcc_assert (!JUMP_P (rinsn));
+  add_reg_note (rinsn, REG_LABEL_OPERAND, label_ref_label (x));
 
   if (LABEL_P (label_ref_label (x)))
LABEL_NUSES (label_ref_label (x))++;
@@ -711,10 +711,10 @@ add_label_notes (rtx x, rtx_insn *insn)
   for (i = GET_RTX_LENGTH (code) - 1, fmt = GET_RTX_FORMAT (code); i >= 0; i--)
 {
   if (fmt[i] == 'e')
-   add_label_notes (XEXP (x, i), insn);
+   add_label_notes (XEXP (x, i), rinsn);
   else if (fmt[i] == 'E')
for (j = XVECLEN (x, i) - 1; j >= 0; j--)
- add_label_notes (XVECEXP (x, i, j), insn);
+ add_label_notes (XVECEXP (x, i, j), rinsn);
 }
 }
 
-- 
2.36.1



[Bug c++/108347] Incorrect error: ambiguous template instantiation

2023-01-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108347

--- Comment #2 from Andrew Pinski  ---
Created attachment 54223
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54223&action=edit
testcase that removes the C++17isms and convert it over to C++11

[Bug c++/108347] Incorrect error: ambiguous template instantiation

2023-01-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108347

--- Comment #1 from Andrew Pinski  ---
Well, MSVC has an internal compiler error with this code.

[PATCH] RISC-V: Avoid redundant flow in backward fusion

2023-01-09 Thread juzhe . zhong
From: Ju-Zhe Zhong 

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (pass_vsetvl::backward_demand_fusion): 
Add pre-check to avoid redundant flow.

---
 gcc/config/riscv/riscv-vsetvl.cc | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 0ca1c51d121..a972c49561a 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -2020,6 +2020,8 @@ pass_vsetvl::backward_demand_fusion (void)
 
   if (!backward_propagate_worthwhile_p (cfg_bb, curr_block_info))
continue;
+
+  if (cfg_bb == EXIT_BLOCK_PTR_FOR_FN (cfun))
continue;
 
   edge e;
-- 
2.36.1



[PATCH] RISC-V: Refine codes in backward fusion

2023-01-09 Thread juzhe . zhong
From: Ju-Zhe Zhong 

This NFC patch is preparing for the following patches.
gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (pass_vsetvl::backward_demand_fusion): 
Refine codes.

---
 gcc/config/riscv/riscv-vsetvl.cc | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 18c6f437db6..0ca1c51d121 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -2010,15 +2010,16 @@ pass_vsetvl::backward_demand_fusion (void)
   for (const bb_info *bb : crtl->ssa->reverse_bbs ())
 {
   basic_block cfg_bb = bb->cfg_bb ();
-  const auto &prop
-   = m_vector_manager->vector_block_infos[cfg_bb->index].local_dem;
+  const auto &curr_block_info
+   = m_vector_manager->vector_block_infos[cfg_bb->index];
+  const auto &prop = curr_block_info.local_dem;
 
   /* If there is nothing to propagate, just skip it.  */
   if (!prop.valid_or_dirty_p ())
continue;
 
-  if (!backward_propagate_worthwhile_p (
-   cfg_bb, m_vector_manager->vector_block_infos[cfg_bb->index]))
+  if (!backward_propagate_worthwhile_p (cfg_bb, curr_block_info))
+   continue;
continue;
 
   edge e;
-- 
2.36.1



[PATCH] bpf: correct bpf_print_operand for floats [PR108293]

2023-01-09 Thread David Faust via Gcc-patches
The existing logic in bpf_print_operand was only correct for integral
CONST_DOUBLEs, and emitted garbage for floating point modes. Fix it so
floating point mode operands are correctly handled.

Tested on bpf-unknown-none, no known regressions.
OK to check-in?

Thanks.


PR target/108293

gcc/

* config/bpf/bpf.cc (bpf_print_operand): Correct handling for
floating point modes.

gcc/testsuite/

* gcc.target/bpf/double-1.c: New test.
* gcc.target/bpf/double-2.c: New test.
* gcc.target/bpf/float-1.c: New test.
---
 gcc/config/bpf/bpf.cc   | 21 ++---
 gcc/testsuite/gcc.target/bpf/double-1.c | 12 
 gcc/testsuite/gcc.target/bpf/double-2.c | 12 
 gcc/testsuite/gcc.target/bpf/float-1.c  | 12 
 4 files changed, 50 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/bpf/double-1.c
 create mode 100644 gcc/testsuite/gcc.target/bpf/double-2.c
 create mode 100644 gcc/testsuite/gcc.target/bpf/float-1.c

diff --git a/gcc/config/bpf/bpf.cc b/gcc/config/bpf/bpf.cc
index 2aeaeaf309b..9dde3944e9c 100644
--- a/gcc/config/bpf/bpf.cc
+++ b/gcc/config/bpf/bpf.cc
@@ -880,13 +880,20 @@ bpf_print_operand (FILE *file, rtx op, int code 
ATTRIBUTE_UNUSED)
   output_address (GET_MODE (op), XEXP (op, 0));
   break;
 case CONST_DOUBLE:
-  if (CONST_DOUBLE_HIGH (op))
-   fprintf (file, HOST_WIDE_INT_PRINT_DOUBLE_HEX,
-CONST_DOUBLE_HIGH (op), CONST_DOUBLE_LOW (op));
-  else if (CONST_DOUBLE_LOW (op) < 0)
-   fprintf (file, HOST_WIDE_INT_PRINT_HEX, CONST_DOUBLE_LOW (op));
-  else
-   fprintf (file, HOST_WIDE_INT_PRINT_DEC, CONST_DOUBLE_LOW (op));
+  long vals[2];
+  real_to_target (vals, CONST_DOUBLE_REAL_VALUE (op), GET_MODE (op));
+  vals[0] &= 0xffffffff;
+  vals[1] &= 0xffffffff;
+  if (GET_MODE (op) == SFmode)
+   fprintf (file, "0x%08lx", vals[0]);
+  else if (GET_MODE (op) == DFmode)
+   {
+ /* Note: real_to_target puts vals in target word order.  */
+ if (WORDS_BIG_ENDIAN)
+   fprintf (file, "0x%08lx%08lx", vals[0], vals[1]);
+ else
+   fprintf (file, "0x%08lx%08lx", vals[1], vals[0]);
+   }
   break;
 default:
   output_addr_const (file, op);
diff --git a/gcc/testsuite/gcc.target/bpf/double-1.c 
b/gcc/testsuite/gcc.target/bpf/double-1.c
new file mode 100644
index 00000000000..200f1bd18f8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/bpf/double-1.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-mlittle-endian" } */
+
+double f;
+double a() { f = 1.0; return 1.0; }
+double b() { f = 2.0; return 2.0; }
+double c() { f = 2.0; return 3.0; }
+double d() { f = 3.0; return 3.0; }
+
+/* { dg-final { scan-assembler-times "lddw\t%r.,0x3ff0000000000000" 2 } } */
+/* { dg-final { scan-assembler-times "lddw\t%r.,0x4000000000000000" 3 } } */
+/* { dg-final { scan-assembler-times "lddw\t%r.,0x4008000000000000" 3 } } */
diff --git a/gcc/testsuite/gcc.target/bpf/double-2.c 
b/gcc/testsuite/gcc.target/bpf/double-2.c
new file mode 100644
index 00000000000..d04ddd0c575
--- /dev/null
+++ b/gcc/testsuite/gcc.target/bpf/double-2.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-mbig-endian" } */
+
+double f;
+double a() { f = 1.0; return 1.0; }
+double b() { f = 2.0; return 2.0; }
+double c() { f = 2.0; return 3.0; }
+double d() { f = 3.0; return 3.0; }
+
+/* { dg-final { scan-assembler-times "lddw\t%r.,0x3ff0000000000000" 2 } } */
+/* { dg-final { scan-assembler-times "lddw\t%r.,0x4000000000000000" 3 } } */
+/* { dg-final { scan-assembler-times "lddw\t%r.,0x4008000000000000" 3 } } */
diff --git a/gcc/testsuite/gcc.target/bpf/float-1.c 
b/gcc/testsuite/gcc.target/bpf/float-1.c
new file mode 100644
index 00000000000..05ed7bb651d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/bpf/float-1.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-mlittle-endian" } */
+
+float f;
+float a() { f = 1.0; return 1.0; }
+float b() { f = 2.0; return 2.0; }
+float c() { f = 2.0; return 3.0; }
+float d() { f = 3.0; return 3.0; }
+
+/* { dg-final { scan-assembler-times "lddw\t%r.,0x3f800000" 2 } } */
+/* { dg-final { scan-assembler-times "lddw\t%r.,0x40000000" 3 } } */
+/* { dg-final { scan-assembler-times "lddw\t%r.,0x40400000" 3 } } */
-- 
2.39.0
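
For readers unfamiliar with the hex constants involved, here is a small host-side sketch of the formatting idea: take the IEEE 754 bit pattern of a value, split a double into two 32-bit words, and print the most significant word first regardless of the order in which the words were stored. The function names float_to_hex and double_to_hex are hypothetical, the code assumes the host uses IEEE 754 float and double, and it does not use GCC's real_to_target; it only illustrates the word-order handling.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <string>

// Format the binary32 bit pattern of VALUE as "0x" plus 8 hex digits.
std::string
float_to_hex (float value)
{
  static_assert (sizeof (float) == sizeof (std::uint32_t), "binary32 expected");
  std::uint32_t bits;
  std::memcpy (&bits, &value, sizeof bits);
  char buf[11]; // "0x" + 8 digits + NUL
  const int n = std::snprintf (buf, sizeof buf, "0x%08" PRIx32, bits);
  assert (n == 10);
  (void) n;
  return buf;
}

// Format the binary64 bit pattern of VALUE as "0x" plus 16 hex digits.
std::string
double_to_hex (double value, bool words_big_endian)
{
  static_assert (sizeof (double) == sizeof (std::uint64_t), "binary64 expected");
  std::uint64_t bits;
  std::memcpy (&bits, &value, sizeof bits);
  const std::uint32_t high = static_cast<std::uint32_t> (bits >> 32);
  const std::uint32_t low = static_cast<std::uint32_t> (bits & 0xffffffffu);

  // Mimic a two-word buffer that was filled in target word order.
  std::uint32_t vals[2];
  vals[0] = words_big_endian ? high : low;
  vals[1] = words_big_endian ? low : high;

  // Undo the word order so the most significant word is printed first.
  const std::uint32_t first = words_big_endian ? vals[0] : vals[1];
  const std::uint32_t second = words_big_endian ? vals[1] : vals[0];

  char buf[19]; // "0x" + 16 digits + NUL
  const int n = std::snprintf (buf, sizeof buf, "0x%08" PRIx32 "%08" PRIx32,
                               first, second);
  assert (n == 18);
  (void) n;
  return buf;
}
```

With this sketch, the printed constant for a given double is the same for both word orders, which is the property the little-endian and big-endian double tests check with identical patterns.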



[Bug c++/105838] [10/11/12/13 Regression] g++ 12.1.0 runs out of memory or time when building const std::vector of std::strings

2023-01-09 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105838

--- Comment #21 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:01ea66a6c56e53163d9430f4d87615d570848aa8

commit r13-5075-g01ea66a6c56e53163d9430f4d87615d570848aa8
Author: Jakub Jelinek 
Date:   Mon Jan 9 23:41:20 2023 +0100

c++: Only do maybe_init_list_as_range optimization if
!processing_template_decl [PR108047]

The last testcase in this patch ICEs, because
maybe_init_list_as_range -> maybe_init_list_as_array
calls maybe_constant_init in:
  /* Don't do this if the conversion would be constant.  */
  first = maybe_constant_init (first);
  if (TREE_CONSTANT (first))
return NULL_TREE;
but maybe_constant_init shouldn't be called when processing_template_decl.
While we could replace that call with fold_non_dependent_init, my limited
understanding is that this is an optimization and even if we don't optimize
it when processing_template_decl, build_user_type_conversion_1 will be
called again during instantiation with !processing_template_decl if it is
ever instantiated and we can do the optimization only then.

2023-01-09  Jakub Jelinek  

PR c++/105838
PR c++/108047
PR c++/108266
* call.cc (maybe_init_list_as_range): Always return NULL_TREE if
processing_template_decl.

* g++.dg/tree-ssa/initlist-opt2.C: New test.
* g++.dg/tree-ssa/initlist-opt3.C: New test.

[Bug c++/108266] [13 Regression] ICE during cxx_eval_constant_expression on IMPLICIT_CONV_EXPR since r13-4564

2023-01-09 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108266

--- Comment #2 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:01ea66a6c56e53163d9430f4d87615d570848aa8

commit r13-5075-g01ea66a6c56e53163d9430f4d87615d570848aa8
Author: Jakub Jelinek 
Date:   Mon Jan 9 23:41:20 2023 +0100

c++: Only do maybe_init_list_as_range optimization if
!processing_template_decl [PR108047]

The last testcase in this patch ICEs, because
maybe_init_list_as_range -> maybe_init_list_as_array
calls maybe_constant_init in:
  /* Don't do this if the conversion would be constant.  */
  first = maybe_constant_init (first);
  if (TREE_CONSTANT (first))
return NULL_TREE;
but maybe_constant_init shouldn't be called when processing_template_decl.
While we could replace that call with fold_non_dependent_init, my limited
understanding is that this is an optimization and even if we don't optimize
it when processing_template_decl, build_user_type_conversion_1 will be
called again during instantiation with !processing_template_decl if it is
ever instantiated and we can do the optimization only then.

2023-01-09  Jakub Jelinek  

PR c++/105838
PR c++/108047
PR c++/108266
* call.cc (maybe_init_list_as_range): Always return NULL_TREE if
processing_template_decl.

* g++.dg/tree-ssa/initlist-opt2.C: New test.
* g++.dg/tree-ssa/initlist-opt3.C: New test.

[Bug c++/108047] [13 Regression] ICE: unexpected expression of kind implicit_conv_expr since r13-4564-gd081807d8d70e3e8

2023-01-09 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108047

--- Comment #11 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:01ea66a6c56e53163d9430f4d87615d570848aa8

commit r13-5075-g01ea66a6c56e53163d9430f4d87615d570848aa8
Author: Jakub Jelinek 
Date:   Mon Jan 9 23:41:20 2023 +0100

c++: Only do maybe_init_list_as_range optimization if
!processing_template_decl [PR108047]

The last testcase in this patch ICEs, because
maybe_init_list_as_range -> maybe_init_list_as_array
calls maybe_constant_init in:
  /* Don't do this if the conversion would be constant.  */
  first = maybe_constant_init (first);
  if (TREE_CONSTANT (first))
return NULL_TREE;
but maybe_constant_init shouldn't be called when processing_template_decl.
While we could replace that call with fold_non_dependent_init, my limited
understanding is that this is an optimization and even if we don't optimize
it when processing_template_decl, build_user_type_conversion_1 will be
called again during instantiation with !processing_template_decl if it is
ever instantiated and we can do the optimization only then.

2023-01-09  Jakub Jelinek  

PR c++/105838
PR c++/108047
PR c++/108266
* call.cc (maybe_init_list_as_range): Always return NULL_TREE if
processing_template_decl.

* g++.dg/tree-ssa/initlist-opt2.C: New test.
* g++.dg/tree-ssa/initlist-opt3.C: New test.

Re: [PATCH] Modula-2: fix documentation layout

2023-01-09 Thread Gaius Mulley via Gcc-patches
Eric Botcazou  writes:

> Hi,
>
> the Modula-2 documentation is rejected by older versions of Makeinfo because 
> the web of @node markers is fairly broken, apparently some subsections were 
> moved around, most notably between the Overview and Using sections, and the 
> @node markers were not (properly) adjusted.
>
> This patch allows me to build it with these older versions, as well as with 
> modern versions.  OK for mainline?
>
>
> 2023-01-09  Eric Botcazou  
>
>   * doc/gm2.texi (Overview): Fix @node markers.
>   (Using): Likewise.  Remove subsections that were moved to
>   Overview from the menu and move others around.

Hi Eric,

yes indeed and thanks for the patch!

regards,
Gaius


[PATCH] RISC-V: Avoid redundant flow in forward fusion

2023-01-09 Thread juzhe . zhong
From: Ju-Zhe Zhong 

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (pass_vsetvl::forward_demand_fusion): 
Add pre-check for redundant flow.

---
 gcc/config/riscv/riscv-vsetvl.cc | 8 
 1 file changed, 8 insertions(+)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 7800c2ee509..18c6f437db6 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -2140,6 +2140,9 @@ pass_vsetvl::forward_demand_fusion (void)
   if (!prop.valid_or_dirty_p ())
continue;
 
+  if (cfg_bb == ENTRY_BLOCK_PTR_FOR_FN (cfun))
+   continue;
+
   edge e;
   edge_iterator ei;
   /* Forward propagate to each successor.  */
@@ -2153,6 +2156,11 @@ pass_vsetvl::forward_demand_fusion (void)
  /* It's quite obvious, we don't need to propagate itself.  */
  if (e->dest->index == cfg_bb->index)
continue;
+ /* We don't propagate through critical edges.  */
+ if (e->flags & EDGE_COMPLEX)
+   continue;
+ if (e->dest->index == EXIT_BLOCK_PTR_FOR_FN (cfun)->index)
+   continue;
 
  /* If there is nothing to propagate, just skip it.  */
  if (!local_dem.valid_or_dirty_p ())
-- 
2.36.1



Re: [PATCH] c++: Only do maybe_init_list_as_range optimization if !processing_template_decl [PR108047]

2023-01-09 Thread Jason Merrill via Gcc-patches

On 1/9/23 05:19, Jakub Jelinek wrote:

Hi!

The last testcase in this patch ICEs, because
maybe_init_list_as_range -> maybe_init_list_as_array
calls maybe_constant_init in:
   /* Don't do this if the conversion would be constant.  */
   first = maybe_constant_init (first);
   if (TREE_CONSTANT (first))
 return NULL_TREE;
but maybe_constant_init shouldn't be called when processing_template_decl.
While we could replace that call with fold_non_dependent_init, my limited
understanding is that this is an optimization and even if we don't optimize
it when processing_template_decl, build_user_type_conversion_1 will be
called again during instantiation with !processing_template_decl if it is
ever instantiated and we can do the optimization only then.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?


OK.


Or do you want fold_non_dependent_init instead?

2023-01-09  Jakub Jelinek  

PR c++/105838
PR c++/108047
PR c++/108266
* call.cc (maybe_init_list_as_range): Always return NULL_TREE if
processing_template_decl.

* g++.dg/tree-ssa/initlist-opt2.C: New test.
* g++.dg/tree-ssa/initlist-opt3.C: New test.

--- gcc/cp/call.cc.jj   2022-12-15 09:24:44.265935297 +0100
+++ gcc/cp/call.cc  2023-01-06 11:24:44.837270905 +0100
@@ -4285,7 +4285,8 @@ maybe_init_list_as_array (tree elttype,
  static tree
  maybe_init_list_as_range (tree fn, tree expr)
  {
-  if (BRACE_ENCLOSED_INITIALIZER_P (expr)
+  if (!processing_template_decl
+  && BRACE_ENCLOSED_INITIALIZER_P (expr)
&& is_list_ctor (fn)
&& decl_in_std_namespace_p (fn))
  {
--- gcc/testsuite/g++.dg/tree-ssa/initlist-opt2.C.jj2023-01-06 
11:53:13.160432870 +0100
+++ gcc/testsuite/g++.dg/tree-ssa/initlist-opt2.C   2023-01-06 
11:53:44.561976302 +0100
@@ -0,0 +1,31 @@
+// PR c++/105838
+// { dg-additional-options -fdump-tree-gimple }
+// { dg-do compile { target c++11 } }
+
+// Test that we do range-initialization from const char *.
+// { dg-final { scan-tree-dump {_M_range_initialize} 
"gimple" } }
+
+#include 
+#include 
+
+void g (const void *);
+
+template 
+void f (const char *p)
+{
+  std::vector lst = {
+  "aahing", "aaliis", "aarrgh", "abacas", "abacus", "abakas", "abamps", "abands", "abased", 
"abaser", "abases", "abasia",
+  "abated", "abater", "abates", "abatis", "abator", "abattu", "abayas", "abbacy", "abbess", 
"abbeys", "abbots", "abcees",
+  "abdabs", "abduce", "abduct", "abears", "abeigh", "abeles", "abelia", "abends", "abhors", 
"abided", "abider", "abides",
+  "abject", "abjure", "ablate", "ablaut", "ablaze", "ablest", "ablets", "abling", "ablins", 
"abloom", "ablush", "abmhos",
+  "aboard", "aboded", "abodes", "abohms", "abolla", "abomas", "aboral", "abords", "aborne", 
"aborts", "abound", "abouts",
+  "aboves", "abrade", "abraid", "abrash", "abrays", "abrazo", "abrege", "abrins", "abroad", 
"abrupt", "abseil", "absent",
+  };
+
+  g();
+}
+
+void h (const char *p)
+{
+  f<0> (p);
+}
--- gcc/testsuite/g++.dg/tree-ssa/initlist-opt3.C.jj2023-01-06 
11:56:36.981469370 +0100
+++ gcc/testsuite/g++.dg/tree-ssa/initlist-opt3.C   2023-01-06 
11:56:09.984861898 +0100
@@ -0,0 +1,21 @@
+// PR c++/108266
+// { dg-do compile { target c++11 } }
+
+#include 
+#include 
+
+struct S { S (const char *); };
+void bar (std::vector);
+
+template 
+void
+foo ()
+{
+  bar ({"", ""});
+}
+
+void
+baz ()
+{
+  foo<0> ();
+}

Jakub





[Bug c++/108347] New: Incorrect error: ambiguous template instantiation

2023-01-09 Thread CoachHagins at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108347

Bug ID: 108347
   Summary: Incorrect error: ambiguous template instantiation
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: CoachHagins at gmail dot com
  Target Milestone: ---

Created attachment 54222
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54222&action=edit
C++ source - no include directives

The attached code is a single .cpp file, as it contains no #include directives.

I have also posted on compiler explorer here: https://godbolt.org/z/jjEPKdx8Y

The code compiles fine with all versions of clang from 7 up through the latest
branch on CE.

The code FAILS to compile with gcc, all versions from 7 up through the latest
branch on CE.

There are THREE #defines in the included code, which provide three different
"fixes" to make the code compile.

The command

g++ -std=c++17 -DFIX0=0 -DFIX1=0 -DFIX2=0

will fail to compile with an ambiguous template instantiation error.  However,
all of the following will allow the code to compile.  Hopefully, the various
"fixes" will help identify the underlying issue(s).

g++ -std=c++17 -DFIX0=0 -DFIX1=0 -DFIX2=1
g++ -std=c++17 -DFIX0=0 -DFIX1=1 -DFIX2=0
g++ -std=c++17 -DFIX0=0 -DFIX1=1 -DFIX2=1
g++ -std=c++17 -DFIX0=1 -DFIX1=0 -DFIX2=0
g++ -std=c++17 -DFIX0=1 -DFIX1=0 -DFIX2=1
g++ -std=c++17 -DFIX0=1 -DFIX1=1 -DFIX2=0
g++ -std=c++17 -DFIX0=1 -DFIX1=1 -DFIX2=1

[Bug modula2/108261] modula-2 module registration process seems to fail with shared libraries.

2023-01-09 Thread gaius at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108261

--- Comment #9 from Gaius Mulley  ---

I wonder if:

0.  change link array to contain elements of
{ char *name, (*fn) module_init, (*fn) module_fini }.

1.  add new option for gm2 -flibname=foo when creating libraries.
libname is buried inside the object.
So we can build:
gm2 -o iso/Storage.o  ../iso/Storage.mod  -flibname=iso
build libiso from iso/*

gm2 -o pim/Storage.o  ../pim/Storage.mod  -flibname=pim
build libpim from pim/*

2.  change the per module ctor to pass the libname in the call to
M2RTS_RegisterModule (modname, libname: ADDRESS;
  init, fini: ArgCVEnvP;
  dependencies: PROC) ;

3.  change M2RTS.def/mod:
   ConstructModules (applicationmodule, flibs: ADDRESS;
 argc: INTEGER; argv, envp: ADDRESS) ;

4.  driver passes -flibs=pim,iso,etc to cc1gm2
cc1gm2 embeds this string into main when constructing the
scaffold and calls M2RTS_ConstructModules with the string
-flibs=... as the extra parameter.

5.  M2RTS_ConstructModules uses (in order):

   (a)  contents of link array (if it exists) as a dictionary
lookup based on name -> init function.
   (b)  if the module name does not exist in the link array then
it chooses the module which has registered itself and
whose libname is earliest on the flibs path.

the driver will have to link the libraries in flibs order to
choose the dialect specific module versions.


6.  I think we have to retain the m2pimlib.a m2isolib.a as there are
a few modules with the same name but different interface
(Storage).  However as you mentioned we could split them:
-lm2pim -lm2iso -lm2common.  In part this is how they were
designed (but never split up).  There are a number of rtName
(runtime) modules:  rtCo, rtEntity etc and these could be expanded
as required to provide a better layered approach.  The rt modules
were never expected to be presented to users.

I'm encouraged that -fscaffold-static fixes the shared library usage.
-fscaffold-static is in effect a similar solution to (0) above.
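
The lookup order in step 5 could be sketched in C roughly as follows.  This
is only an illustration of the proposal, not code from M2RTS: the struct
layouts and the names flibs_rank, find_in_link_array and choose_registered
are hypothetical, and the init/fini signature is simplified to void (void).

```c
#include <stddef.h>
#include <string.h>

typedef void (*module_fn) (void);   /* simplified, hypothetical signature */

/* Hypothetical link array element, as in step 0.  */
struct link_entry
{
  const char *name;
  module_fn init;
  module_fn fini;
};

/* Hypothetical record kept for a module that registered itself (step 2).  */
struct registered_module
{
  const char *modname;
  const char *libname;
  module_fn init;
  module_fn fini;
};

/* Return the position of LIBNAME in the comma separated FLIBS string,
   or -1 if it is not listed.  */
int
flibs_rank (const char *flibs, const char *libname)
{
  size_t len = strlen (libname);
  int rank = 0;
  const char *p = flibs;

  while (*p != '\0')
    {
      const char *end = strchr (p, ',');
      size_t n = end ? (size_t) (end - p) : strlen (p);

      if (n == len && strncmp (p, libname, n) == 0)
        return rank;
      rank++;
      p += n;
      if (*p == ',')
        p++;
    }
  return -1;
}

/* Step 5 (a): dictionary lookup in the link array by module name.  */
const struct link_entry *
find_in_link_array (const char *modname,
                    const struct link_entry *link, size_t nlink)
{
  for (size_t i = 0; i < nlink; i++)
    if (strcmp (link[i].name, modname) == 0)
      return &link[i];
  return NULL;
}

/* Step 5 (b): among the registered modules called MODNAME, pick the one
   whose libname is earliest on the flibs path.  */
const struct registered_module *
choose_registered (const char *modname,
                   const struct registered_module *reg, size_t nreg,
                   const char *flibs)
{
  const struct registered_module *best = NULL;
  int best_rank = -1;

  for (size_t i = 0; i < nreg; i++)
    {
      if (strcmp (reg[i].modname, modname) != 0)
        continue;
      int r = flibs_rank (flibs, reg[i].libname);
      if (r >= 0 && (best_rank < 0 || r < best_rank))
        {
          best = &reg[i];
          best_rank = r;
        }
    }
  return best;
}
```

A caller would try find_in_link_array first and fall back to
choose_registered only when the module is not in the link array, so with
-flibs=pim,iso a Storage module registered by both libraries resolves to
the pim version.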

[PATCH] RISC-V: Cleanup the codes of bitmap create and free [NFC]

2023-01-09 Thread juzhe . zhong
From: Ju-Zhe Zhong 

This is an NFC patch that moves the bitmap create and free code into wrapper
functions so that it can be reused. I will reuse them in the following patches.

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc 
(vector_infos_manager::create_bitmap_vectors): New function.
(vector_infos_manager::free_bitmap_vectors): Ditto.
(pass_vsetvl::pre_vsetvl): Adjust codes.
* config/riscv/riscv-vsetvl.h: New function declaration.

---
 gcc/config/riscv/riscv-vsetvl.cc | 95 +++-
 gcc/config/riscv/riscv-vsetvl.h  |  2 +
 2 files changed, 59 insertions(+), 38 deletions(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index d42cfa91d63..7800c2ee509 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -1569,18 +1569,62 @@ vector_infos_manager::release (void)
 vector_exprs.release ();
 
   if (optimize > 0)
-{
-  /* Finished. Free up all the things we've allocated.  */
-  free_edge_list (vector_edge_list);
-  sbitmap_vector_free (vector_del);
-  sbitmap_vector_free (vector_insert);
-  sbitmap_vector_free (vector_kill);
-  sbitmap_vector_free (vector_antic);
-  sbitmap_vector_free (vector_transp);
-  sbitmap_vector_free (vector_comp);
-  sbitmap_vector_free (vector_avin);
-  sbitmap_vector_free (vector_avout);
-}
+free_bitmap_vectors ();
+}
+
+void
+vector_infos_manager::create_bitmap_vectors (void)
+{
+  /* Create the bitmap vectors.  */
+  vector_antic = sbitmap_vector_alloc (last_basic_block_for_fn (cfun),
+  vector_exprs.length ());
+  vector_transp = sbitmap_vector_alloc (last_basic_block_for_fn (cfun),
+   vector_exprs.length ());
+  vector_comp = sbitmap_vector_alloc (last_basic_block_for_fn (cfun),
+ vector_exprs.length ());
+  vector_avin = sbitmap_vector_alloc (last_basic_block_for_fn (cfun),
+ vector_exprs.length ());
+  vector_avout = sbitmap_vector_alloc (last_basic_block_for_fn (cfun),
+  vector_exprs.length ());
+  vector_kill = sbitmap_vector_alloc (last_basic_block_for_fn (cfun),
+ vector_exprs.length ());
+
+  bitmap_vector_ones (vector_transp, last_basic_block_for_fn (cfun));
+  bitmap_vector_clear (vector_antic, last_basic_block_for_fn (cfun));
+  bitmap_vector_clear (vector_comp, last_basic_block_for_fn (cfun));
+}
+
+void
+vector_infos_manager::free_bitmap_vectors (void)
+{
+  /* Finished. Free up all the things we've allocated.  */
+  free_edge_list (vector_edge_list);
+  if (vector_del)
+sbitmap_vector_free (vector_del);
+  if (vector_insert)
+sbitmap_vector_free (vector_insert);
+  if (vector_kill)
+sbitmap_vector_free (vector_kill);
+  if (vector_antic)
+sbitmap_vector_free (vector_antic);
+  if (vector_transp)
+sbitmap_vector_free (vector_transp);
+  if (vector_comp)
+sbitmap_vector_free (vector_comp);
+  if (vector_avin)
+sbitmap_vector_free (vector_avin);
+  if (vector_avout)
+sbitmap_vector_free (vector_avout);
+
+  vector_edge_list = nullptr;
+  vector_kill = nullptr;
+  vector_del = nullptr;
+  vector_insert = nullptr;
+  vector_antic = nullptr;
+  vector_transp = nullptr;
+  vector_comp = nullptr;
+  vector_avin = nullptr;
+  vector_avout = nullptr;
 }
 
 void
@@ -2480,32 +2524,7 @@ pass_vsetvl::pre_vsetvl (void)
   /* Compute entity list.  */
   prune_expressions ();
 
-  /* Create the bitmap vectors.  */
-  m_vector_manager->vector_antic
-= sbitmap_vector_alloc (last_basic_block_for_fn (cfun),
-   m_vector_manager->vector_exprs.length ());
-  m_vector_manager->vector_transp
-= sbitmap_vector_alloc (last_basic_block_for_fn (cfun),
-   m_vector_manager->vector_exprs.length ());
-  m_vector_manager->vector_comp
-= sbitmap_vector_alloc (last_basic_block_for_fn (cfun),
-   m_vector_manager->vector_exprs.length ());
-  m_vector_manager->vector_avin
-= sbitmap_vector_alloc (last_basic_block_for_fn (cfun),
-   m_vector_manager->vector_exprs.length ());
-  m_vector_manager->vector_avout
-= sbitmap_vector_alloc (last_basic_block_for_fn (cfun),
-   m_vector_manager->vector_exprs.length ());
-  m_vector_manager->vector_kill
-= sbitmap_vector_alloc (last_basic_block_for_fn (cfun),
-   m_vector_manager->vector_exprs.length ());
-
-  bitmap_vector_ones (m_vector_manager->vector_transp,
- last_basic_block_for_fn (cfun));
-  bitmap_vector_clear (m_vector_manager->vector_antic,
-  last_basic_block_for_fn (cfun));
-  bitmap_vector_clear (m_vector_manager->vector_comp,
-  last_basic_block_for_fn (cfun));
+  m_vector_manager->create_bitmap_vectors ();
   

[PATCH] Modula-2: fix documentation layout

2023-01-09 Thread Eric Botcazou via Gcc-patches
Hi,

the Modula-2 documentation is rejected by older versions of Makeinfo because 
the web of @node markers is fairly broken, apparently some subsections were 
moved around, most notably between the Overview and Using sections, and the 
@node markers were not (properly) adjusted.

This patch allows me to build it with these older versions, as well as with 
modern versions.  OK for mainline?


2023-01-09  Eric Botcazou  

* doc/gm2.texi (Overview): Fix @node markers.
(Using): Likewise.  Remove subsections that were moved to
Overview from the menu and move others around.

-- 
Eric Botcazou

diff --git a/gcc/doc/gm2.texi b/gcc/doc/gm2.texi
index 18cb798c6cd..35e0f5ef622 100644
--- a/gcc/doc/gm2.texi
+++ b/gcc/doc/gm2.texi
@@ -89,7 +89,7 @@ Boston, MA 02110-1301, USA@*
 * Features::  GNU Modula-2 Features
 @end menu
 
-@node What is GNU Modula-2, Why use GNU Modula-2, , Using
+@node What is GNU Modula-2, Why use GNU Modula-2, , Overview
 @section What is GNU Modula-2
 
 GNU Modula-2 is a @uref{http://gcc.gnu.org/frontends.html, front end}
@@ -115,7 +115,7 @@ technology - programming languages - part 1: Modula-2 Language,
 ISO/IEC 10514-1 (1996)'
 }
 
-@node Why use GNU Modula-2, Release map, What is GNU Modula-2, Using
+@node Why use GNU Modula-2, Development, What is GNU Modula-2, Overview
 @section Why use GNU Modula-2
 
 There are a number of advantages of using GNU Modula-2 rather than
@@ -149,25 +149,13 @@ directory for a sub directory @code{foo} containing the library
 contents.  The library module search path is altered accordingly
 for compile and link.
 
-@node Release map, Development, Why use GNU Modula-2, Using
-@section Release map
-
-GNU Modula-2 is now part of GCC and therefore will adopt the GCC
-release schedule.  It is intended that GNU Modula-2 implement more of
-the GCC builtins (vararg access) and GCC features.
-
-There is an intention to implement the ISO generics and the M2R10
-dialect of Modula-2.  It will also implement all language changes.  If
-you wish to see something different please email
-@email{gm2@@nongnu.org} with your ideas.
-
-@node Development, Features, Release map, Using
+@node Development, Features, Why use GNU Modula-2, Overview
 @section How to get source code using git
 
 GNU Modula-2 is now in the @url{https://gcc.gnu.org/git.html, GCC git
 tree}.
 
-@node Features, Documentation, Development, Using
+@node Features, , Development, Overview
 @section GNU Modula-2 Features
 
 @itemize @bullet
@@ -230,99 +218,7 @@ such as the AVR and the ARM).
 
 @end itemize
 
-@node Documentation, Regression tests, Features, Using
-@section Documentation
-
-The GNU Modula-2 documentation is available on line
-@url{https://www.nongnu.org/gm2/homepage.html,at the gm2 homepage}
-or in the pdf, info, html file format.
-
-@node Regression tests, Limitations, Documentation, Using
-@section Regression tests for gm2 in the repository
-
-The regression testsuite can be run from the gcc build directory:
-
-@example
-$ cd build-gcc
-$ make check -j 24
-@end example
-
-which runs the complete testsuite for all compilers using 24 parallel
-invocations of the compiler.  Individual language testsuites can be
-run by specifying the language, for example the Modula-2 testsuite can
-be run using:
-
-@example
-$ cd build-gcc
-$ make check-m2 -j 24
-@end example
-
-Finally the results of the testsuite can be emailed to the
-@url{https://gcc.gnu.org/lists.html, gcc-testresults} list using the
-@file{test_summary} script found in the gcc source tree:
-
-@example
-$ @samp{directory to the sources}/contrib/test_summary
-@end example
-
-@node Limitations, Objectives, Regression tests, Using
-@section Limitations
-
-Logitech compatibility library is incomplete.  The principle modules
-for this platform exist however for a comprehensive list of completed
-modules please check the documentation
-@url{gm2.html}.
-
-@node Objectives, FAQ, , Using
-@section Objectives
-
-@itemize @bullet
-
-@item
-The intention of GNU Modula-2 is to provide a production Modula-2
-front end to GCC.
-
-@item
-It should support all Niklaus Wirth PIM Dialects [234] and also ISO
-Modula-2 including a re-implementation of all the ISO modules.
-
-@item
-There should be an easy interface to C.
-
-@item
-Exploit the features of GCC.
-
-@item
-Listen to the requests of the users.
-@end itemize
-
-@node FAQ, Community, Objectives, Using
-@section FAQ
-
-@subsection Why use the C++ exception mechanism in GCC, rather than a bespoke Modula-2 mechanism?
-
-The C++ mechanism is tried and tested, it also provides GNU Modula-2
-with the ability to link with C++ modules and via swig it can raise
-Python exceptions.
-
-@node Community, Other languages, FAQ, Using
-@section Community
-
-You can subscribe to the GNU Modula-2 mailing by sending an
-email to:
-@email{gm2-subscribe@@nongnu.org}
-or by
-@url{http://lists.nongnu.org/mailman/listinfo/gm2}.
-The mailing list contents can be viewed

[Bug libstdc++/77691] [10/11/12/13 regression] experimental/memory_resource/resource_adaptor.cc FAILs

2023-01-09 Thread dave.anglin at bell dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77691

--- Comment #56 from dave.anglin at bell dot net ---
On 2023-01-09 6:20 a.m., redi at gcc dot gnu.org wrote:
> Maybe like this.
Actually, the change I sent was for
libstdc++-v3/testsuite/experimental/memory_resource/new_delete_resource.cc.

It still fails.  No objection to the approach.

[committed] c: Check for modifiable static compound literals in inline definitions

2023-01-09 Thread Joseph Myers
The C rule against modifiable objects with static storage duration in
inline definitions should apply to compound literals (using the C2x
feature of storage-class specifiers for compound literals) just as to
variables.  Add a call to record_inline_static for compound literals
to make sure this case is detected.

Bootstrapped with no regressions for x86_64-pc-linux-gnu.

gcc/c/
* c-decl.cc (build_compound_literal): Call record_inline_static.

gcc/testsuite/
* gcc.dg/c2x-complit-8.c: New test.

diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index e47ca6718b3..d76ffb3380d 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -6260,6 +6260,13 @@ build_compound_literal (location_t loc, tree type, tree init, bool non_const,
   DECL_USER_ALIGN (decl) = 1;
 }
   store_init_value (loc, decl, init, NULL_TREE);
+  if (current_scope != file_scope
+  && TREE_STATIC (decl)
+  && !TREE_READONLY (decl)
+  && DECL_DECLARED_INLINE_P (current_function_decl)
+  && DECL_EXTERNAL (current_function_decl))
+record_inline_static (input_location, current_function_decl,
+ decl, csi_modifiable);
 
   if (TREE_CODE (type) == ARRAY_TYPE && !COMPLETE_TYPE_P (type))
 {
diff --git a/gcc/testsuite/gcc.dg/c2x-complit-8.c b/gcc/testsuite/gcc.dg/c2x-complit-8.c
new file mode 100644
index 000..fb614ab7802
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c2x-complit-8.c
@@ -0,0 +1,70 @@
+/* Test C2x storage class specifiers in compound literals: inline function
+   constraints.  */
+/* { dg-do compile } */
+/* { dg-options "-std=c2x -pedantic-errors" } */
+
+inline void
+f1 ()
+{
+  (static int) { 123 }; /* { dg-error "static but declared in inline function 'f1' which is not static" } */
+  (static thread_local int) { 456 } ; /* { dg-error "static but declared in inline function 'f1' which is not static" } */
+  (int) { 789 };
+  (register int) { 1234 };
+}
+
+inline void
+f1e ()
+{
+  (static int) { 123 };
+  (static thread_local int) { 456 } ;
+}
+
+static inline void
+f1s ()
+{
+  (static int) { 123 };
+  (static thread_local int) { 456 } ;
+}
+
+inline void
+f2 ()
+{
+  (static const int) { 123 };
+  (static thread_local const int) { 456 };
+}
+
+inline void
+f2e ()
+{
+  (static const int) { 123 };
+  (static thread_local const int) { 456 };
+}
+
+static inline void
+f2s ()
+{
+  (static const int) { 123 };
+  (static thread_local const int) { 456 };
+}
+
+inline void
+f3 ()
+{
+  (static constexpr int) { 123 };
+}
+
+inline void
+f3e ()
+{
+  (static constexpr int) { 123 };
+}
+
+static inline void
+f3s ()
+{
+  (static constexpr int) { 123 };
+}
+
+extern void f1e ();
+extern void f2e ();
+extern void f3e ();

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: Adding a new thread model to GCC

2023-01-09 Thread Eric Botcazou via Gcc-patches
> fixed now.
> bootstrapped successfully!

Thanks for fixing it.  Another way out is to hide the Win32 API by defining  
__GTHREAD_HIDE_WIN32API like libstdc++ does in its header files.

-- 
Eric Botcazou




[Bug analyzer/108252] false positive: leak detection

2023-01-09 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108252

--- Comment #2 from David Malcolm  ---
Created attachment 54221
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54221&action=edit
Reduced reproducer

Reproduces with trunk, with -fanalyzer:
  https://godbolt.org/z/x15xdYa57

[Bug tree-optimization/108199] Bitfields, unions and SRA and storage_order_attribute

2023-01-09 Thread ebotcazou at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108199

--- Comment #11 from Eric Botcazou  ---
> Here is a testcase for the trunk on x86_64-linux-gnu:

Thanks.  The problem is indeed the BIT_FIELD_REF of a scalar, which is an
undocumented extension of GENERIC:

/* Reference to a group of bits within an object.  Similar to COMPONENT_REF
   except the position is given explicitly rather than via a FIELD_DECL.
   Operand 0 is the structure or union expression;
   operand 1 is a tree giving the constant number of bits being referenced;
   operand 2 is a tree giving the constant position of the first referenced bit.
   The result type width has to match the number of bits referenced.
   If the result type is integral, its signedness specifies how it is extended
   to its mode width.  */
DEFTREECODE (BIT_FIELD_REF, "bit_field_ref", tcc_reference, 3)

How are the bits numbered in there, IOW is bit 0 always the LSB or not?

[Bug analyzer/108252] false positive: leak detection

2023-01-09 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108252

David Malcolm  changed:

   What|Removed |Added

   Last reconfirmed||2023-01-09
 Ever confirmed|0   |1
 Status|UNCONFIRMED |ASSIGNED

--- Comment #1 from David Malcolm  ---
Thanks for filing this bug; confirmed.  I'm working on minimizing the
reproducer.

[Bug c++/108342] std::complex: ignoring packed attribute because of unpacked non-POD field

2023-01-09 Thread ruilvo at ua dot pt via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108342

--- Comment #12 from Rui Oliveira  ---
(In reply to Jakub Jelinek from comment #11)

> No, if you have the packed ph_fcomplex_t not aligned at alignof (float), you need
> to copy it to a properly aligned variable before trying to reinterpret_cast it.

Some `if constexpr` comparing of the remainder between
alignof(std::complex) and (alignof(bb_frame_t) +
offsetof(bb_iq_samples)) could perhaps make one avoid that. But that's just a
side idea to think of.

Main point is, the code is de-serializing a serial stream. I do not expect to
find the "magic" word at the right aligment for `bb_frame_t` anyway, so
generous memcpy'ing to properly aligned variables will be required anyway.
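
A minimal sketch of that copy-before-use pattern in C, assuming GCC's
__packed__ attribute and a byte stream that already holds the sample in
host float format.  The names fcomplex_t and load_sample are hypothetical
and only serve the illustration; ph_fcomplex_t mirrors the placeholder
struct from comment #10.

```c
#include <string.h>

/* Wire layout: two floats with no alignment requirement.  */
typedef struct __attribute__((__packed__))
{
  float re, im;
} ph_fcomplex_t;

/* Naturally aligned counterpart that is safe to compute with.  */
typedef struct
{
  float re, im;
} fcomplex_t;

_Static_assert (sizeof (ph_fcomplex_t) == sizeof (fcomplex_t),
                "wire and in-memory sample sizes must match");

/* Copy one sample out of STREAM at an arbitrary byte OFFSET.  memcpy has
   no alignment requirement on its source, so no misaligned float access
   is performed.  */
fcomplex_t
load_sample (const unsigned char *stream, size_t offset)
{
  fcomplex_t out;
  memcpy (&out, stream + offset, sizeof out);
  return out;
}
```

The result lives in a properly aligned object, so any later reinterpretation
is done on aligned storage rather than on the raw stream.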

[Bug c++/108342] std::complex: ignoring packed attribute because of unpacked non-POD field

2023-01-09 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108342

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #11 from Jakub Jelinek  ---
(In reply to Rui Oliveira from comment #10)
> So my options are to create like a placeholder, say 
> 
> ```c
> typedef struct __attribute__((__packed__)) // Packed isn't really necessary here I think?
> {
> float re, im;
> } ph_fcomplex_t
> 
> ```
> 
> To silence the warning and get packing to work, and trust
> [complex.numbers.general] for a reinterpret_cast into std::complex I
> guess.

No, if you have the packed ph_fcomplex_t not aligned at alignof (float), you need
to copy it to a properly aligned variable before trying to reinterpret_cast it.

[Bug c++/108342] std::complex: ignoring packed attribute because of unpacked non-POD field

2023-01-09 Thread ruilvo at ua dot pt via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108342

--- Comment #10 from Rui Oliveira  ---
So my options are to create like a placeholder, say 

```c
typedef struct __attribute__((__packed__)) // Packed isn't really necessary here I think?
{
float re, im;
} ph_fcomplex_t

```

To silence the warning and get packing to work, and trust
[complex.numbers.general] for a reinterpret_cast into std::complex I
guess.

[Bug modula2/108182] gm2 driver mishandles target and multilib options

2023-01-09 Thread iains at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108182

Iain Sandoe  changed:

   What|Removed |Added

  Attachment #54184|0   |1
is obsolete||
  Attachment #54208|0   |1
is obsolete||
  Attachment #54214|0   |1
is obsolete||

--- Comment #11 from Iain Sandoe  ---
Created attachment 54220
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54220&action=edit
patch version 3.1


This is patch v3 + some specific changes. [hence it is 3.1 :) ]

The main issues with v3 (and v4 on PR108261) are:

 - link items are positional (you need to ensure that the runtime libraries
appear after the user's objects).
 - adding {f*} to the cc1gm2 line causes f options to be duplicated; this could
potentially alter the behaviour of the command line when final values of
opposite switches are used (which is the 'usual' mechanism).
 - V3 was still adding the '-L' options for the various libraries which are not
needed (v4 fixes this, but not the other issues)

 - Supporting the target's ability to handle -Bstatic/dynamic in specs is going
to be hard.

 so ... 


1. we use the specs now to insert the include paths; this works very nicely.

2. we use the existing sequencing in the language driver to ensure that the link
positional arguments are in the right places (and to handle the -Bstatic/dynamic
stuff)

3. We remove the {f*} from the cc1gm2 spec [note it is possible that other
similar  entries will cause duplication of their content .. I did not check
this yet]

This means that we can drop the linker-related extra specs and code (and
actually simplify things a bit in the lang-specific driver).

4. We skip options that we will re-insert to avoid duplication there too.

-

NOTE: with specs, it is usually necessary to ensure that they begin and/or end
with whitespace because they can be arbitrarily concatenated.

-

This does not fix PR108261 (neither does v4, FWIW) on Darwin.

[Bug tree-optimization/108199] Bitfields, unions and SRA and storage_order_attribute

2023-01-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108199

Andrew Pinski  changed:

   What|Removed |Added

 Status|WAITING |NEW

--- Comment #10 from Andrew Pinski  ---
Here is a testcase for the trunk on x86_64-linux-gnu:
```
#define BITFIELD_ENDIAN "big-endian"

#define SRC_ENDIAN "big-endian"
#define DST_ENDIAN "big-endian"

typedef unsigned long long u64;

union DST {
  unsigned long val;

  struct {
u64 x : 1;
u64 y : 1;
u64 r: 62;
  } __attribute__((scalar_storage_order("big-endian")));
} __attribute__((scalar_storage_order("big-endian")));


struct SRC {
  u64 a;
} __attribute__((scalar_storage_order("big-endian")));

[[gnu::noipa]]
void foo (){__builtin_abort();}
[[gnu::noinline]]
int bar(struct SRC *src)
{
  union DST dst;

  dst.val = src->a;

  if (dst.y) {
foo();
  }
  return 0;
}
int main(void)
{
struct SRC t = {-1ull & (~(0x01ull<<62))};
bar(&t);
return 0;
}
```
It does not cause an abort at -O0 but does at -O2.
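
For reference, the expected value of dst.y can be worked out by hand.  The
sketch below assumes that big-endian bit-field allocation starts at the most
significant bit, so that x is bit 63 and y is bit 62 of the 64-bit value;
that assumption is consistent with the -O0 behaviour reported above.  The
helper name expected_y is hypothetical.

```c
#include <stdint.h>

/* Assumption: with big-endian scalar storage order the first bit-field
   (x) occupies bit 63 and the second (y) occupies bit 62.  */
unsigned
expected_y (uint64_t val)
{
  return (unsigned) ((val >> 62) & 1u);
}
```

The testcase stores -1ull & ~(0x01ull << 62), i.e. all ones except bit 62,
so under this assumption dst.y should read as 0 and foo should not be called.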

[Bug middle-end/69482] Writing through pointers to volatile not always preserved

2023-01-09 Thread dboles.src at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69482

Daniel Boles  changed:

   What|Removed |Added

 CC||dboles.src at gmail dot com

--- Comment #12 from Daniel Boles  ---
Is this the same cause as https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71793 ?

Thanks!

[Bug tree-optimization/108199] Bitfields, unions and SRA and storage_order_attribute

2023-01-09 Thread ebotcazou at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108199

Eric Botcazou  changed:

   What|Removed |Added

 Status|NEW |WAITING

--- Comment #9 from Eric Botcazou  ---
Please mention the exact compiler version and the compilation switches.

[Bug target/107453] [13 Regression] New stdarg tests in r13-3549-g4fe34cdcc80ac2 fail

2023-01-09 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107453

--- Comment #8 from Thomas Schwinge  ---
(In reply to Jakub Jelinek from comment #7)
> No testing on nvptx

Thanks, confirming fixed for nvptx target, too.

[Bug target/41989] Code optimized for AMD Geode is slower than generic

2023-01-09 Thread pokox38850 at tohup dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=41989

Samantha Keen  changed:

   What|Removed |Added

 CC||pokox38850 at tohup dot com

--- Comment #28 from Samantha Keen  ---
Did this issue get resolved in the newer version of AMD Geode?

[Bug analyzer/108251] false positive: null dereference

2023-01-09 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108251

--- Comment #6 from David Malcolm  ---
The analyzer sees the error-handling case in objt_conn, and considers the
execution path where it bails out early due to "t" being NULL i.e.
smp->sess->origin is NULL, and thus conn being initialized to NULL.

As it turns out, ssl_sock_get_ssl_object is defined (in src/ssl_sock.c) as:

SSL *ssl_sock_get_ssl_object(struct connection *conn)
{
struct ssl_sock_ctx *ctx = conn_get_ssl_sock_ctx(conn);

return ctx ? ctx->ssl : NULL;
}

and conn_get_ssl_sock_ctx is defined (in include/haproxy/connection.h) as:

/* retrieves the ssl_sock_ctx for this connection otherwise NULL */
static inline struct ssl_sock_ctx *conn_get_ssl_sock_ctx(struct connection *conn)
{
if (!conn || !conn->xprt || !conn->xprt->get_ssl_sock_ctx)
return NULL;
return conn->xprt->get_ssl_sock_ctx(conn);
}

Hence it's a false positive: if conn is NULL, then ssl_sock_get_ssl_object will
return NULL.

The TU "sees" the definition of conn_get_ssl_sock_ctx, but only the declaration
of ssl_sock_get_ssl_object, not the latter's definition.

Without a definition of ssl_sock_get_ssl_object, -fanalyzer can't "know" of the
interaction between "ssl" and "conn" (-fanalyzer doesn't work well with LTO
yet).

Hence it erroneously considers the case that ssl_sock_get_ssl_object could
return non-NULL on a NULL "conn", and sees the deref of conn, which it reports.
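
The invariant that makes the warning a false positive can be reduced to a
few lines.  This is a simplified stand-in, not the haproxy code: the struct
layout, the flag value 0x1 and the names get_ssl_object and has_early_data
are hypothetical.

```c
#include <stddef.h>

struct connection
{
  unsigned flags;
  void *ssl;
};

/* Stand-in for ssl_sock_get_ssl_object: a NULL conn always yields NULL.  */
void *
get_ssl_object (struct connection *conn)
{
  return conn ? conn->ssl : NULL;
}

/* Stand-in for the caller: conn is only dereferenced after ssl has been
   checked, and ssl can only be non-NULL when conn is non-NULL.  */
int
has_early_data (struct connection *conn)
{
  void *ssl = get_ssl_object (conn);

  if (!ssl)
    return 0;                       /* every NULL conn exits here */
  return (conn->flags & 0x1u) != 0; /* reached only with conn != NULL */
}
```

In the real code the body of the callee lives in another translation unit,
so the link between a non-NULL "ssl" and a non-NULL "conn" is not visible
at the point of the dereference.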

[PATCH] PR rtl-optimization/106421: ICE in bypass_block from non-local goto.

2023-01-09 Thread Roger Sayle

This patch fixes PR rtl-optimization/106421, an ICE-on-valid (but
undefined) regression.  The fix, as proposed by Richard Biener, is to
defend against BLOCK_FOR_INSN returning NULL in cprop's bypass_block.

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32},
with no new failures.  Ok for mainline?


2023-01-09  Roger Sayle  

gcc/ChangeLog
PR rtl-optimization/106421
* cprop.cc (bypass_block): Check that DEST is local to this
function (non-NULL) before calling find_edge.

gcc/testsuite/ChangeLog
PR rtl-optimization/106421
* gcc.dg/pr106421.c: New test case.


Thanks in advance,
Roger
--

diff --git a/gcc/cprop.cc b/gcc/cprop.cc
index 5b203ec..6ec0bda 100644
--- a/gcc/cprop.cc
+++ b/gcc/cprop.cc
@@ -1622,9 +1622,12 @@ bypass_block (basic_block bb, rtx_insn *setcc, rtx_insn *jump)
{
  dest = BLOCK_FOR_INSN (XEXP (new_rtx, 0));
  /* Don't bypass edges containing instructions.  */
- edest = find_edge (bb, dest);
- if (edest && edest->insns.r)
-   dest = NULL;
+ if (dest)
+   {
+ edest = find_edge (bb, dest);
+ if (edest && edest->insns.r)
+   dest = NULL;
+   }
}
  else
dest = NULL;
diff --git a/gcc/testsuite/gcc.dg/pr106421.c b/gcc/testsuite/gcc.dg/pr106421.c
new file mode 100644
index 000..73e522a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr106421.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+int main(int argc, char **argv)
+{
+   __label__ loop, end;
+   void jmp(int c) { goto *(c ? & : &); }
+loop:
+   jmp(argc < 0);
+end:
+   return 0;
+}
+


[Bug target/108346] gather/scatter loops optimized too often for znver4 (and other zens)

2023-01-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108346

Andrew Pinski  changed:

   What|Removed |Added

  Component|middle-end  |target
 Blocks||53947
 Target||x86_64-linux-gnu
   Keywords||missed-optimization

--- Comment #1 from Andrew Pinski  ---
This is a cost issue, either not having a decent way of expressing the cost or
the backend cost model needs to be improved (or both).


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations

[Bug middle-end/108346] New: gather/scatter loops optimized too often for znver4 (and other zens)

2023-01-09 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108346

Bug ID: 108346
   Summary: gather/scatter loops optimized too often for znver4
(and other zens)
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: hubicka at gcc dot gnu.org
  Target Milestone: ---

The following two benchmarks test gather/scatter codegen:
s4113.c:

#include 
#include 

//typedef float real_t;
#define iterations 100
#define LEN_1D 32000
#define LEN_2D 256
real_t a[LEN_1D],b[LEN_1D],c[LEN_1D],d[LEN_1D],e[LEN_1D];
real_t aa[LEN_2D][LEN_2D];
real_t bb[LEN_2D][LEN_2D];
real_t cc[LEN_2D][LEN_2D];
real_t qq;
int
main(void)
{
//reductions
//if to max reduction

real_t x;
int * __restrict__ ip = (int *) malloc(LEN_1D*sizeof(real_t));

for (int i = 0; i < LEN_1D; i = i+5){
(ip)[i]   = (i+4);
(ip)[i+1] = (i+2);
(ip)[i+2] = (i);
(ip)[i+3] = (i+3);
(ip)[i+4] = (i+1);
}
for (int nl = 0; nl < 2*iterations; nl++) {
for (int i = 1; i < LEN_1D; i += 2) {
a[ip[i]] = b[ip[i]] + c[i];
}
asm("":::"memory");
}

return x;
}


s4115.c:
#include 
#include 

#define iterations 100
#define LEN_1D 32000
#define LEN_2D 256
real_t a[LEN_1D],b[LEN_1D],c[LEN_1D],d[LEN_1D],e[LEN_1D];
real_t aa[LEN_2D][LEN_2D];
real_t bb[LEN_2D][LEN_2D];
real_t cc[LEN_2D][LEN_2D];
real_t qq;
int
main(void)
{
//reductions
//if to max reduction

real_t x;
int * __restrict__ ip = (int *) malloc(LEN_1D*sizeof(real_t));

for (int i = 0; i < LEN_1D; i = i+5){
(ip)[i]   = (i+4);
(ip)[i+1] = (i+2);
(ip)[i+2] = (i);
(ip)[i+3] = (i+3);
(ip)[i+4] = (i+1);
}
for (int nl = 0; nl < 2*iterations; nl++) {
for (int i = 1; i < LEN_1D; i += 2) {
x += a[i] * b[ip[i]];
}
asm("":::"memory");
}

return x;
}

On znver4 I get the following times with vectorization disabled/enabled and
gather use disabled/enabled:

                                        runtime
type    optimization     operation    scalar  nogather  gather  parts  instruction
char    avx256_optimal   load+store    14.23  N/A       N/A
char    avx256_optimal   load          14.25  N/A       N/A
char    ^avx256_optimal  load+store    14.02  N/A       N/A
char    ^avx256_optimal  load          14.25  N/A       N/A
short   avx256_optimal   load+store   *14.23  N/A       N/A
short   avx256_optimal   load         *14.23  N/A       N/A
short   ^avx256_optimal  load+store    15.22  N/A       N/A
short   ^avx256_optimal  load          14.23  N/A       N/A
int     avx256_optimal   load+store   *16.51   27.66     25.96   8     vpgatherdd ymm, vpscatterdd ymm
int     avx256_optimal   load          14.13   13.17    *12.71   8     vpgatherdd ymm
int     ^avx256_optimal  load+store   *16.57   33.25     26.06  16     vpgatherdd zmm, vpscatterdd zmm
int     ^avx256_optimal  load          14.14   16.81    *13.63  16     vpgatherdd zmm
long    avx256_optimal   load+store   *20.59   20.66     32.03   4     vpgatherdq zmm, vpscatterdq zmm
long    avx256_optimal   load          15.36  *15.36     15.82   4     vpgatherdq zmm
long    ^avx256_optimal  load+store    22.42  *20.96     30.54   8     vpgatherdq zmm, vpscatterdq zmm
long    ^avx256_optimal  load         *15.87   16.40     18.68   8     vpgatherdq zmm
float   avx256_optimal   load+store    16.88   27.78     26.08   8     vgatherdps ymm, vscatterdps ymm
float   avx256_optimal   load          26.01  *13.19     13.30   8     vgatherdps ymm
float   ^avx256_optimal  load+store   *16.89   33.22     26.19  16     vgatherdps zmm, vscatterdps zmm
float   ^avx256_optimal  load          26.01   16.61    *13.85  16     vgatherdps zmm
double  avx256_optimal   load+store    21.94  *20.81     31.43   4     vgatherdpd ymm, vscatterdpd ymm
double  avx256_optimal   load          26.01   26.01    *15.20   4     vgatherdpd ymm
double  ^avx256_optimal  load+store    21.44  *21.65     30.73   8     vgatherdpd zmm, vscatterdpd zmm
double  ^avx256_optimal  load          26.01   26.01    *18.24   8     vgatherdpd zmm


We incorrectly vectorize the int load+store loop, causing a 60% regression.
Vectorizing the avx512 long load loop also seems to be a slight loss, but that
is less important.  I will post a patch to disable scatter instructions since
they do not seem to be a win.

[Bug analyzer/108251] false positive: null dereference

2023-01-09 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108251

--- Comment #5 from David Malcolm  ---
As per comment #4 (optimization disabled), but adding: -fanalyzer-verbosity=3
makes things clearer:

../../src/null-deref-pr108251-smp_fetch_ssl_fc_has_early.c: In function
‘smp_fetch_ssl_fc_has_early’:
../../src/null-deref-pr108251-smp_fetch_ssl_fc_has_early.c:92:27: warning:
dereference of NULL ‘conn’ [CWE-476] [-Wanalyzer-null-dereference]
   92 |  smp->data.u.sint = ((conn->flags & CO_FL_EARLY_DATA) &&
  |   ^~~
  ‘smp_fetch_ssl_fc_has_early’: events 1-2
|
|   78 | smp_fetch_ssl_fc_has_early(const struct arg *args, struct sample
*smp, const char *kw, void *private)
|  | ^~
|  | |
|  | (1) entry to ‘smp_fetch_ssl_fc_has_early’
|..
|   83 |  conn = objt_conn(smp->sess->origin);
|  | 
|  | |
|  | (2) calling ‘objt_conn’ from ‘smp_fetch_ssl_fc_has_early’
|
+--> ‘objt_conn’: events 3-6
   |
   |   61 | static inline struct connection *objt_conn(enum obj_type
*t)
   |  |  ^
   |  |  |
   |  |  (3) entry to ‘objt_conn’
   |   62 | {
   |   63 |  if (!t || *t != OBJ_TYPE_CONN)
   |  | ~ 
   |  | |
   |  | (4) following ‘true’ branch (when ‘t’ is NULL)...
   |   64 |return ((void *)0);
   |  |   ~   
   |  |   |
   |  |   (5) ...to here
   |  |   (6) ‘0’ is NULL
   |
<--+
|
  ‘smp_fetch_ssl_fc_has_early’: events 7-10
|
|   83 |  conn = objt_conn(smp->sess->origin);
|  | ^~~~
|  | |
|  | (7) return of NULL to ‘smp_fetch_ssl_fc_has_early’ from
‘objt_conn’
|   84 |  ssl = ssl_sock_get_ssl_object(conn);
|   85 |  if (!ssl)
|  | ~
|  | |
|  | (8) following ‘false’ branch (when ‘ssl’ is non-NULL)...
|..
|   88 |  smp->flags = 0;
|  |  ~~
|  | |
|  | (9) ...to here
|..
|   92 |  smp->data.u.sint = ((conn->flags & CO_FL_EARLY_DATA) &&
|  |   ~~~
|  |   |
|  |   (10) dereference of NULL ‘conn’
|

[Bug analyzer/108251] false positive: null dereference

2023-01-09 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108251

--- Comment #4 from David Malcolm  ---
Without optimization, trunk with just -Wno-address-of-packed-member (and
-fanalyzer), I get:

../../src/null-deref-pr108251-smp_fetch_ssl_fc_has_early.c: In function
‘smp_fetch_ssl_fc_has_early’:
../../src/null-deref-pr108251-smp_fetch_ssl_fc_has_early.c:92:27: warning:
dereference of NULL ‘conn’ [CWE-476] [-Wanalyzer-null-dereference]
   92 |  smp->data.u.sint = ((conn->flags & CO_FL_EARLY_DATA) &&
  |   ^~~
  ‘smp_fetch_ssl_fc_has_early’: events 1-2
|
|   78 | smp_fetch_ssl_fc_has_early(const struct arg *args, struct sample
*smp, const char *kw, void *private)
|  | ^~
|  | |
|  | (1) entry to ‘smp_fetch_ssl_fc_has_early’
|..
|   83 |  conn = objt_conn(smp->sess->origin);
|  | 
|  | |
|  | (2) calling ‘objt_conn’ from ‘smp_fetch_ssl_fc_has_early’
|
+--> ‘objt_conn’: events 3-4
   |
   |   61 | static inline struct connection *objt_conn(enum obj_type
*t)
   |  |  ^
   |  |  |
   |  |  (3) entry to ‘objt_conn’
   |..
   |   64 |return ((void *)0);
   |  |   ~   
   |  |   |
   |  |   (4) ‘0’ is NULL
   |
<--+
|
  ‘smp_fetch_ssl_fc_has_early’: events 5-8
|
|   83 |  conn = objt_conn(smp->sess->origin);
|  | ^~~~
|  | |
|  | (5) return of NULL to ‘smp_fetch_ssl_fc_has_early’ from
‘objt_conn’
|   84 |  ssl = ssl_sock_get_ssl_object(conn);
|   85 |  if (!ssl)
|  | ~
|  | |
|  | (6) following ‘false’ branch (when ‘ssl’ is non-NULL)...
|..
|   88 |  smp->flags = 0;
|  |  ~~
|  | |
|  | (7) ...to here
|..
|   92 |  smp->data.u.sint = ((conn->flags & CO_FL_EARLY_DATA) &&
|  |   ~~~
|  |   |
|  |   (8) dereference of NULL ‘conn’
|

[Bug analyzer/108251] false positive: null dereference

2023-01-09 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108251

--- Comment #3 from David Malcolm  ---
Adding  -fanalyzer-verbosity=3 to comment #2, I get:

../../src/null-deref-pr108251-smp_fetch_ssl_fc_has_early.c: In function
‘smp_fetch_ssl_fc_has_early’:
../../src/null-deref-pr108251-smp_fetch_ssl_fc_has_early.c:92:27: warning:
dereference of NULL ‘0’ [CWE-476] [-Wanalyzer-null-dereference]
   92 |  smp->data.u.sint = ((conn->flags & CO_FL_EARLY_DATA) &&
  |   ^~~
  ‘smp_fetch_ssl_fc_has_early’: events 1-2
|
|   78 | smp_fetch_ssl_fc_has_early(const struct arg *args, struct sample
*smp, const char *kw, void *private)
|  | ^~
|  | |
|  | (1) entry to ‘smp_fetch_ssl_fc_has_early’
|..
|   83 |  conn = objt_conn(smp->sess->origin);
|  | ~
|  | |
|  | (2) inlined call to ‘objt_conn’ from
‘smp_fetch_ssl_fc_has_early’
|
+--> ‘objt_conn’: event 3
   |
   |   63 |  if (!t || *t != OBJ_TYPE_CONN)
   |  | ^
   |  | |
   |  | (3) following ‘true’ branch...
   |
<--+
|
  ‘smp_fetch_ssl_fc_has_early’: event 4
|
|cc1:
| (4): ...to here
|
  ‘smp_fetch_ssl_fc_has_early’: events 5-7
|
|   85 |  if (!ssl)
|  | ^
|  | |
|  | (5) following ‘false’ branch (when ‘ssl’ is non-NULL)...
|..
|   88 |  smp->flags = 0;
|  |  ~~
|  | |
|  | (6) ...to here
|..
|   92 |  smp->data.u.sint = ((conn->flags & CO_FL_EARLY_DATA) &&
|  |   ~~~
|  |   |
|  |   (7) dereference of NULL ‘’
|

which is slightly clearer; arguably we shouldn't have pruned events 2-4 from
this at the lower verbosity level, since it's really hard to figure out what
the analyzer is "thinking" in comment #2.
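
A hypothetical reduction of the code shape in the trace above may help when reading the events.  Only objt_conn's NULL check is taken from the diagnostic; the struct layout, get_ssl_object and has_early are stand-ins invented for this sketch, and the assumption that the SSL lookup returns NULL for a NULL connection is mine.  The attached reproducer remains the authoritative testcase.

```cpp
#include <cassert>

enum obj_type { OBJ_TYPE_NONE, OBJ_TYPE_CONN };

struct connection {
  enum obj_type type;  // first member, so &conn->type also addresses the object
  unsigned flags;
  void *ssl;           // hypothetical stand-in for the SSL handle
};

// Same check as in the trace: NULL in, or wrong type, gives NULL out.
static inline connection *objt_conn(enum obj_type *t) {
  if (!t || *t != OBJ_TYPE_CONN)
    return nullptr;
  return reinterpret_cast<connection *>(t);
}

// Assumption: a NULL connection never yields a non-NULL SSL object.
static void *get_ssl_object(connection *conn) {
  return conn ? conn->ssl : nullptr;
}

// Stand-in for smp_fetch_ssl_fc_has_early: conn is only dereferenced after
// ssl was found to be non-NULL, which (under the assumption above) implies
// that conn is non-NULL as well.
static int has_early(enum obj_type *origin) {
  connection *conn = objt_conn(origin);
  void *ssl = get_ssl_object(conn);
  if (!ssl)
    return 0;
  return (conn->flags & 0x1u) != 0;
}
```

If the real code has this shape, the path "conn is NULL and ssl is non-NULL" reported in the events is infeasible, which would make the warning a false positive.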

[Bug analyzer/108251] false positive: null dereference

2023-01-09 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108251

--- Comment #2 from David Malcolm  ---
With trunk and -Wno-address-of-packed-member -O2, I get:

../../src/null-deref-pr108251-smp_fetch_ssl_fc_has_early.c: In function
‘smp_fetch_ssl_fc_has_early’:
../../src/null-deref-pr108251-smp_fetch_ssl_fc_has_early.c:92:27: warning:
dereference of NULL ‘0’ [CWE-476] [-Wanalyzer-null-dereference]
   92 |  smp->data.u.sint = ((conn->flags & CO_FL_EARLY_DATA) &&
  |   ^~~
  ‘smp_fetch_ssl_fc_has_early’: events 1-3
|
|   85 |  if (!ssl)
|  | ^
|  | |
|  | (1) following ‘false’ branch (when ‘ssl’ is non-NULL)...
|..
|   88 |  smp->flags = 0;
|  |  ~~
|  | |
|  | (2) ...to here
|..
|   92 |  smp->data.u.sint = ((conn->flags & CO_FL_EARLY_DATA) &&
|  |   ~~~
|  |   |
|  |   (3) dereference of NULL ‘’
|

[Bug analyzer/108251] false positive: null dereference

2023-01-09 Thread dmalcolm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108251

--- Comment #1 from David Malcolm  ---
Created attachment 54219
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54219&action=edit
Simplified reproducer for smp_fetch_ssl_fc_has_early

Thanks for filing this bug.  I see the warnings, and have reduced the
smp_fetch_ssl_fc_has_early example to the attached.

Re: [PATCH] c++: Define built-in for std::tuple_element [PR100157]

2023-01-09 Thread Patrick Palka via Gcc-patches
On Mon, 9 Jan 2023, Patrick Palka wrote:

> On Wed, 5 Oct 2022, Patrick Palka wrote:
> 
> > On Thu, 7 Jul 2022, Jonathan Wakely via Gcc-patches wrote:
> > 
> > > This adds a new built-in to replace the recursive class template
> > > instantiations done by traits such as std::tuple_element and
> > > std::variant_alternative. The purpose is to select the Nth type from a
> > > list of types, e.g. __builtin_type_pack_element(1, char, int, float) is
> > > int.
> > > 
> > > For a pathological example tuple_element_t<1000, tuple<2000 types...>>
> > > the compilation time is reduced by more than 90% and the memory  used by
> > > the compiler is reduced by 97%. In realistic examples the gains will be
> > > much smaller, but still relevant.
> > > 
> > > Clang has a similar built-in, __type_pack_element, but that's a
> > > "magic template" built-in using <> syntax, which GCC doesn't support. So
> > > this provides an equivalent feature, but as a built-in function using
> > > parens instead of <>. I don't really like the name "type pack element"
> > > (it gives you an element from a pack of types) but the semi-consistency
> > > with Clang seems like a reasonable argument in favour of keeping the
> > > name. I'd be open to alternative names though, e.g. __builtin_nth_type
> > > or __builtin_type_at_index.
> > 
> > Rather than giving the trait a different name from __type_pack_element,
> > I wonder if we could just special case cp_parser_trait to expect <>
> > instead of parens for this trait?
> > 
> > Btw the frontend recently got a generic TRAIT_TYPE tree code, which gets
> > rid of much of the boilerplate of adding a new type-yielding built-in
> > trait, see e.g. cp-trait.def.
> 
> Here's a tested patch based on Jonathan's original patch that implements
> the built-in in terms of TRAIT_TYPE, names it __type_pack_element
> instead of __builtin_type_pack_element, and treats invocations of it
> like a template-id instead of a call (to match Clang).
> 
> -- >8 --
> 
> Subject: [PATCH] c++: Define built-in for std::tuple_element [PR100157]
> 
> This adds a new built-in to replace the recursive class template
> instantiations done by traits such as std::tuple_element and
> std::variant_alternative.  The purpose is to select the Nth type from a
> list of types, e.g. __type_pack_element<1, char, int, float> is int.
> We implement it as a special kind of TRAIT_TYPE.
> 
> For a pathological example tuple_element_t<1000, tuple<2000 types...>>
> the compilation time is reduced by more than 90% and the memory  used by
> the compiler is reduced by 97%.  In realistic examples the gains will be
> much smaller, but still relevant.
> 
> Unlike the other built-in traits, __type_pack_element uses template-id
> syntax instead of call syntax and is SFINAE-enabled, matching Clang's
> implementation.  And like the other built-in traits, it's not mangleable
> so we can't use it directly in function signatures.
> 
> Some caveats:
> 
>   * Clang's version of the built-in seems to act like a "magic template"
> that can e.g. be used as a template template argument.  For simplicity
> we implement it in a more ad-hoc way.
>   * Our parsing of the <>'s in __type_pack_element<...> is currently
> rudimentary and doesn't try to disambiguate a trailing >> vs > >
> as cp_parser_enclosed_template_argument_list does.

Hmm, this latter caveat turns out to be inconvenient (for code such as
type_pack_element3.C) and admits an easy workaround inspired by what
cp_parser_enclosed_template_argument_list does.

v2: Consider the >> in __type_pack_element<0, int, char>> to be two >'s.
Handle non-type TRAIT_TYPE_TYPE1 in strip_typedefs (for sake of
CPTK_TYPE_PACK_ELEMENT).
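
For context, the recursive instantiation pattern that the built-in is meant to replace can be sketched as follows.  The name nth_type is hypothetical and this is not the libstdc++ implementation; it only illustrates why selecting element N costs on the order of N class template instantiations.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Primary template: drop the first type and recurse with N - 1.
template <std::size_t N, typename T, typename... Rest>
struct nth_type : nth_type<N - 1, Rest...> {};

// Base case: index 0 selects the first type in the list.
template <typename T, typename... Rest>
struct nth_type<0, T, Rest...> {
  using type = T;
};

// Same example as in the commit message: element 1 of (char, int, float).
static_assert(std::is_same<nth_type<1, char, int, float>::type, int>::value,
              "element 1 of <char, int, float> is int");
```

With the built-in, the same query is spelled __type_pack_element<1, char, int, float> and needs no intermediate instantiations.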

-- >8 --

Subject: [PATCH] c++: Define built-in for std::tuple_element [PR100157]

This adds a new built-in to replace the recursive class template
instantiations done by traits such as std::tuple_element and
std::variant_alternative.  The purpose is to select the Nth type from a
list of types, e.g. __type_pack_element<1, char, int, float> is int.
We implement it as a special kind of TRAIT_TYPE.

For a pathological example tuple_element_t<1000, tuple<2000 types...>>
the compilation time is reduced by more than 90% and the memory  used by
the compiler is reduced by 97%.  In realistic examples the gains will be
much smaller, but still relevant.

Unlike the other built-in traits, __type_pack_element uses template-id
syntax instead of call syntax and is SFINAE-enabled, matching Clang's
implementation.  And like the other built-in traits, it's not mangleable
so we can't use it directly in function signatures.

N.B. Clang seems to implement __type_pack_element as a first-class
template that can e.g. be used as a template template argument.  For
simplicity we implement it in a more ad-hoc way.

Co-authored-by: Jonathan Wakely 

PR c++/100157

gcc/cp/ChangeLog:

* cp-trait.def (TYPE_PACK_ELEMENT): Define.
* cp-tree.h (finish_trait_type): Add complain 

[Bug tree-optimization/108341] argument to `__builtin_ctz` should be assumed non-zero when CTZ_DEFINED_VALUE_AT_ZERO says it is undefined

2023-01-09 Thread aldyh at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108341

Aldy Hernandez  changed:

   What|Removed |Added

   Severity|normal  |enhancement

--- Comment #6 from Aldy Hernandez  ---
Huh.  Didn't know you could do that.  Thanks.

FWIW, the function is actually:

gimple_infer_range::gimple_infer_range (gimple *s)

[Bug tree-optimization/108341] argument to `__builtin_ctz` should be assumed non-zero when CTZ_DEFINED_VALUE_AT_ZERO says it is undefined

2023-01-09 Thread amacleod at redhat dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108341

--- Comment #5 from Andrew Macleod  ---
(In reply to Aldy Hernandez from comment #2)
> (In reply to Martin Liška from comment #1)
> > May be an opportunity for Ranger?
> 
> Hmmm... I don't think so:
> 
>  :
> value.0_1 = (unsigned int) value_4(D);
> _2 = __builtin_ctz (value.0_1);
> r = _2;
> _3 = value_4(D) != 0;
> _7 = (int) _3;
> return _7;
> 
> We could add an op1_range operator to class cfn_clz to return nonzero for
> op1, but that would only work if we knew _2 to be anything...and have no
> info on _2.

Seems more like a candidate for gimple_infer::gimple_infer (gimple *s).

The side effect to register would be to check whether 's' is a builtin_ctz
call and, if so, call add_nonzero (operand1), provided whatever other
conditions make that true are met.

That should register a non-zero inferred range on value.0_1 after the
assignment of _2.


=== BB 2 
Partial equiv (value.0_1 pe32 value_4(D))
 :
value.0_1 = (unsigned int) value_4(D);
_2 = __builtin_ctz (value.0_1);
r = _2;
_3 = value_4(D) != 0;
_7 = (int) _3;
return _7;

I see ranger also registers a 32 bit equivalence between value.0_1 and value_4,
so in theory we would then be able to determine that value_4 is also non-zero
for the comparison.
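
A source-level sketch consistent with the GIMPLE above, reconstructed by hand (the original testcase is not quoted in this message, so the function name and the global 'r' are assumptions):

```cpp
#include <cassert>

int r;  // receives the count, like 'r' in the dump

// Calling __builtin_ctz with 0 is undefined, so after the call the compiler
// could assume value != 0 and fold the comparison to 1.
int ctz_then_test(int value) {
  r = __builtin_ctz(static_cast<unsigned int>(value));
  return value != 0;
}
```

With the inferred non-zero range registered on value.0_1, and the 32-bit equivalence carrying it over to value_4, the return statement could in principle be simplified to "return 1".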

[Bug tree-optimization/108281] float value range estimation missing (vs. integer)

2023-01-09 Thread aldyh at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108281

--- Comment #5 from Aldy Hernandez  ---
(In reply to Jakub Jelinek from comment #4)
> (In reply to Aldy Hernandez from comment #3)
> > (In reply to Richard Biener from comment #2)
> > > GCC 13 got float range tracking but the description isn't clear as what
> > > transform you are looking after?  It seems you are looking for ranges
> > > of standard math functions - I think those are not yet implemented?
> > 
> > Correct.  We don't track libm functions.
> 
> Yet.  I hope we do that for GCC 14.

Yeah. That'd be very nice. It's on our radar for the next release: at least
provide any missing framework and a few sample ones.

[Bug target/108272] [13 Regression] ICE in gen_movxo, at config/rs6000/mma.md:339

2023-01-09 Thread asolokha at gmx dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108272

--- Comment #3 from Arseny Solokha  ---
(In reply to Kewen Lin from comment #2)
> Created attachment 54192 [details]
> untested patch
> 
> Hi @Arseny, I hope this patch can help to clear all the ICEs about
> unexpected uses of MMA opaque types in inline asm, that is to filter those
> noises duplicated to this bug.

Indeed, I haven't seen such ICEs w/ the patch applied so far. Still get an ICE
in gen_movoo, at config/rs6000/mma.md:292 when compiling
gcc/testsuite/gcc.target/powerpc/pr96506-1.c w/ -m32, though. Do you want me to
file another PR for that one?

Re: Bypass assembler when generating LTO object files

2023-01-09 Thread Martin Jambor
Hello,

On Sun, Dec 18 2022, Mohamed Atef wrote:
> Hello,
>    I am interested in working on this project in my free time. Is
> understanding this https://gcc.gnu.org/wiki/LinkTimeOptimization
> a good starting point?

That section of the Wiki is very old.  You may find bits there that are
still valid and relevant, but I would be actually a bit careful with
that content.

If you're looking for high-level overview of LTO, unfortunately I can
only recommend videos:

  - Honza's "Building openSUSE with GCC's link time optimization"
https://events.opensuse.org/conferences/oSC18/program/proposals/1846#2

  - my "Interprocedural optimizations in GCC"
https://www.youtube.com/watch?v=oQ71ZbOuSW4
(the first 12 minutes or so, the rest is then about optimizations)

For the task specifically, the patch from 2014
https://gcc.gnu.org/legacy-ml/gcc/2014-09/msg00340.html is still a good
starting point, even if not a very clear one.  The crux of the matter is
to enhance libiberty/simple-object*.[ch] to be able to create elf from
scratch (as opposed to modifying an existing one).  So look there too.

If you have any questions, feel free to ask.

Good luck,

Martin


[Bug middle-end/108278] [13 Regression] runtime error with -O1 -Wall

2023-01-09 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108278

--- Comment #17 from Jakub Jelinek  ---
We know the commit introduced UB on the compiler side, but what I don't
understand is why it triggered on the testcases you've provided.  It surely
introduced UB when compiling libgcc/unwind-dw2.c or during predefining macros
with -fbuilding-libgcc, but I think the problematic code with uninitialized
array of poly_ints could happen at different times.

Re: [PATCH] Add support for x86_64-*-gnu-* targets to build x86_64 gnumach/hurd

2023-01-09 Thread Flávio Cruz via Gcc-patches
Friendly ping

On Mon, Dec 26, 2022 at 12:34 PM Flavio Cruz  wrote:

> Tested by building a toolchain and compiling gnumach for x86_64 [1].
> This is the basic version without unwind support which I think is only
> required to
> implement exceptions.
>
> [1] https://github.com/flavioc/cross-hurd/blob/master/bootstrap-kernel.sh.
>
> ---
>  gcc/config.gcc  |  5 -
>  gcc/config/i386/gnu64.h | 40 +
>  libgcc/config.host  |  8 ++-
>  libgcc/config/i386/gnu-unwind.h | 10 +
>  4 files changed, 61 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/config/i386/gnu64.h
>
> diff --git a/gcc/config.gcc b/gcc/config.gcc
> index 95190233820..0e2b15768bf 100644
> --- a/gcc/config.gcc
> +++ b/gcc/config.gcc
> @@ -1955,7 +1955,7 @@ i[34567]86-*-linux* | i[34567]86-*-kfreebsd*-gnu |
> i[34567]86-*-gnu* | i[34567]8
> ;;
> esac
> ;;
> -x86_64-*-linux* | x86_64-*-kfreebsd*-gnu)
> +x86_64-*-linux* | x86_64-*-kfreebsd*-gnu | x86_64-*-gnu*)
> tm_file="${tm_file} i386/unix.h i386/att.h elfos.h gnu-user.h
> glibc-stdint.h \
>  i386/x86-64.h i386/gnu-user-common.h i386/gnu-user64.h"
> case ${target} in
> @@ -1966,6 +1966,9 @@ x86_64-*-linux* | x86_64-*-kfreebsd*-gnu)
> x86_64-*-kfreebsd*-gnu)
> tm_file="${tm_file} kfreebsd-gnu.h i386/kfreebsd-gnu64.h"
> ;;
> +   x86_64-*-gnu*)
> +   tm_file="${tm_file} gnu.h i386/gnu64.h"
> +   ;;
> esac
> tmake_file="${tmake_file} i386/t-linux64"
> x86_multilibs="${with_multilib_list}"
> diff --git a/gcc/config/i386/gnu64.h b/gcc/config/i386/gnu64.h
> new file mode 100644
> index 000..a1ecfaa1cdb
> --- /dev/null
> +++ b/gcc/config/i386/gnu64.h
> @@ -0,0 +1,40 @@
> +/* Configuration for an x86_64 running GNU with ELF as the target
> machine.  */
> +
> +/*
> +Copyright (C) 2022 Free Software Foundation, Inc.
> +
> +This file is part of GCC.
> +
> +GCC is free software: you can redistribute it and/or modify
> +it under the terms of the GNU General Public License as published by
> +the Free Software Foundation, either version 3 of the License, or
> +(at your option) any later version.
> +
> +GCC is distributed in the hope that it will be useful,
> +but WITHOUT ANY WARRANTY; without even the implied warranty of
> +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +GNU General Public License for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with GCC.  If not, see .
> +*/
> +
> +#define GNU_USER_LINK_EMULATION32 "elf_i386"
> +#define GNU_USER_LINK_EMULATION64 "elf_x86_64"
> +#define GNU_USER_LINK_EMULATIONX32 "elf32_x86_64"
> +
> +#undef GNU_USER_DYNAMIC_LINKER
> +#define GNU_USER_DYNAMIC_LINKER32 "/lib/ld.so.1"
> +#define GNU_USER_DYNAMIC_LINKER64 "/lib/ld-x86-64.so.1"
> +#define GNU_USER_DYNAMIC_LINKERX32 "/lib/ld-x32.so.1"
> +
> +#undef STARTFILE_SPEC
> +#if defined HAVE_LD_PIE
> +#define STARTFILE_SPEC \
> +  "%{!shared:
> %{pg|p|profile:%{static:gcrt0.o%s;:gcrt1.o%s};pie:Scrt1.o%s;static:crt0.o%s;:crt1.o%s}}
> \
> +   crti.o%s
> %{static:crtbeginT.o%s;shared|pie:crtbeginS.o%s;:crtbegin.o%s}"
> +#else
> +#define STARTFILE_SPEC \
> +  "%{!shared:
> %{pg|p|profile:%{static:gcrt0.o%s;:gcrt1.o%s};static:crt0.o%s;:crt1.o%s}} \
> +   crti.o%s
> %{static:crtbeginT.o%s;shared|pie:crtbeginS.o%s;:crtbegin.o%s}"
> +#endif
> diff --git a/libgcc/config.host b/libgcc/config.host
> index eb23abe89f5..75fd1b778fe 100644
> --- a/libgcc/config.host
> +++ b/libgcc/config.host
> @@ -751,6 +751,12 @@ x86_64-*-kfreebsd*-gnu)
> tmake_file="${tmake_file} i386/t-crtpc t-crtfm i386/t-crtstuff
> t-dfprules"
> tm_file="${tm_file} i386/elf-lib.h"
> ;;
> +x86_64-*-gnu*)
> +   extra_parts="$extra_parts crtprec32.o crtprec64.o crtprec80.o
> crtfastmath.o"
> +   tmake_file="${tmake_file} i386/t-crtpc t-crtfm i386/t-crtstuff
> t-dfprules"
> +   tm_file="${tm_file} i386/elf-lib.h"
> +   md_unwind_header=i386/gnu-unwind.h
> +   ;;
>  i[34567]86-pc-msdosdjgpp*)
> ;;
>  i[34567]86-*-lynxos*)
> @@ -1523,7 +1529,7 @@ esac
>  case ${host} in
>  i[34567]86-*-linux* | x86_64-*-linux* | \
>i[34567]86-*-kfreebsd*-gnu | x86_64-*-kfreebsd*-gnu | \
> -  i[34567]86-*-gnu*)
> +  i[34567]86-*-gnu* | x86_64-*-gnu*)
> tmake_file="${tmake_file} t-tls i386/t-linux i386/t-msabi
> t-slibgcc-libgcc"
> if test "$libgcc_cv_cfi" = "yes"; then
> tmake_file="${tmake_file} t-stack i386/t-stack-i386"
> diff --git a/libgcc/config/i386/gnu-unwind.h
> b/libgcc/config/i386/gnu-unwind.h
> index 25eb690e370..2cbfc40ea7e 100644
> --- a/libgcc/config/i386/gnu-unwind.h
> +++ b/libgcc/config/i386/gnu-unwind.h
> @@ -30,6 +30,14 @@ see the files COPYING3 and COPYING.RUNTIME
> respectively.  If not, see
>
>  #include 
>
> +#ifdef __x86_64__
> +
> +/*

[Bug tree-optimization/108281] float value range estimation missing (vs. integer)

2023-01-09 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108281

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #4 from Jakub Jelinek  ---
(In reply to Aldy Hernandez from comment #3)
> (In reply to Richard Biener from comment #2)
> > GCC 13 got float range tracking but the description isn't clear as what
> > transform you are looking after?  It seems you are looking for ranges
> > of standard math functions - I think those are not yet implemented?
> 
> Correct.  We don't track libm functions.

Yet.  I hope we do that for GCC 14.

[Bug tree-optimization/108281] float value range estimation missing (vs. integer)

2023-01-09 Thread aldyh at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108281

--- Comment #3 from Aldy Hernandez  ---
(In reply to Richard Biener from comment #2)
> GCC 13 got float range tracking but the description isn't clear as what
> transform you are looking after?  It seems you are looking for ranges
> of standard math functions - I think those are not yet implemented?

Correct.  We don't track libm functions.

[Bug c++/108285] [13 Regression] error: conversion from ‘long double’ to ‘double’ may change value [-Werror=float-conversion] since r13-3291-g16ec267063c8ce60

2023-01-09 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108285

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P1

[Bug tree-optimization/108281] float value range estimation missing (vs. integer)

2023-01-09 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108281

Richard Biener  changed:

   What|Removed |Added

 CC||aldyh at gcc dot gnu.org
Version|12.2.0  |13.0

--- Comment #2 from Richard Biener  ---
GCC 13 got float range tracking but the description isn't clear as what
transform you are looking after?  It seems you are looking for ranges
of standard math functions - I think those are not yet implemented?

[Bug middle-end/108278] [13 Regression] runtime error with -O1 -Wall

2023-01-09 Thread dcb314 at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108278

--- Comment #16 from David Binderman  ---
(In reply to Richard Biener from comment #15)
> So this bug is fixed?

Jakub and I seem to think so. Good enough ?

[Bug middle-end/108278] [13 Regression] runtime error with -O1 -Wall

2023-01-09 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108278

Richard Biener  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |WAITING
   Last reconfirmed||2023-01-09

--- Comment #15 from Richard Biener  ---
So this bug is fixed?

Re: [PATCH 9/15] arm: Set again stack pointer as CFA reg when popping if necessary

2023-01-09 Thread Richard Earnshaw via Gcc-patches




On 09/01/2023 16:48, Richard Earnshaw via Gcc-patches wrote:



On 09/01/2023 14:58, Andrea Corallo via Gcc-patches wrote:

Andrea Corallo via Gcc-patches  writes:


Richard Earnshaw  writes:


On 27/09/2022 16:24, Kyrylo Tkachov via Gcc-patches wrote:



-Original Message-
From: Andrea Corallo 
Sent: Tuesday, September 27, 2022 11:06 AM
To: Kyrylo Tkachov 
Cc: Andrea Corallo via Gcc-patches ; Richard
Earnshaw ; nd 
Subject: Re: [PATCH 9/15] arm: Set again stack pointer as CFA reg when
popping if necessary

Kyrylo Tkachov  writes:


Hi Andrea,


-Original Message-
From: Gcc-patches  On Behalf Of Andrea
Corallo via Gcc-patches
Sent: Friday, August 12, 2022 4:34 PM
To: Andrea Corallo via Gcc-patches 
Cc: Richard Earnshaw ; nd 
Subject: [PATCH 9/15] arm: Set again stack pointer as CFA reg when

popping

if necessary

Hi all,

this patch enables 'arm_emit_multi_reg_pop' to set again the stack
pointer as CFA reg when popping if this is necessary.



  From what I can tell from similar functions this is correct, but could you
elaborate on why this change is needed for my understanding please?

Thanks,
Kyrill


Hi Kyrill,

sure, if the frame pointer was set, then it is the current CFA register.
If we request an adjustment of the current CFA register offset while
indicating it as SP (when it is actually FP), that is indeed not correct, and
the incoherence will be detected by an assertion in the dwarf emission
machinery.

Thanks,  the patch is ok
Kyrill



Best Regards

    Andrea


Hmm, wait.  Why would a multi-reg pop be updating the stack pointer?


Hi Richard,

not sure I understand, isn't any pop updating SP by definition?



Back on this,

compiling:

===
int i;

void foo (int);

int bar()
{
   foo (i);
   return 0;
}
===

With -march=armv8.1-m.main+fp -mbranch-protection=pac-ret+leaf -mthumb -O0 -g


Produces the following asm for bar.

bar:
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 1, uses_anonymous_args = 0
pac    ip, lr, sp
push    {r3, r7, ip, lr}
add    r7, sp, #0
ldr    r3, .L3
ldr    r3, [r3]
mov    r0, r3
bl    foo
movs    r3, #0
mov    r0, r3
pop    {r3, r7, ip, lr}
aut    ip, lr, sp
bx    lr

The offending instruction causing the ICE (without this patch) when
emitting dwarf is "pop {r3, r7, ip, lr}".

The current CFA reg when emitting the multipop is R7 (the frame
pointer).  If it is not the multipop that has the duty to restore SP as the
current CFA here, which other instruction should do it?



Digging a bit deeper, I'm now even more confused.  arm_expand_epilogue
contains (paraphrasing the code):


  if frame_pointer_needed
    {
      if arm
        {}
      else
        {
          if adjust
            r7 += adjust
          mov sp, r7    // Reset CFA to SP
        }
    }

so there should always be a move of r7 into SP, even if this is strictly 
redundant.  I don't understand why this doesn't happen for your 
testcase.  Can you dig a bit deeper?  I wonder if we've (probably 
incorrectly) assumed that this function doesn't need an epilogue but can 
use a simple return?  I don't think we should do that when 
authentication is needed: a simple return should really be one instruction.




So I strongly suspect the real problem here is that use_return_insn () 
in arm.cc needs to be updated to return false when using pointer 
authentication.  The specification for this function says that a return 
can be done in one instruction; and clearly when we need authentication 
more than one is needed.


R.


Best Regards

   Andrea


R.


[Bug target/108339] [11/10 only] riscv64-linux-gnu: fails to link libgcc_s.so on the GCC 10 branch

2023-01-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108339

Andrew Pinski  changed:

   What|Removed |Added

  Known to work|11.3.1, 12.2.1  |12.1.0
Summary|[10 only]   |[11/10 only]
   |riscv64-linux-gnu: fails to |riscv64-linux-gnu: fails to
   |link libgcc_s.so on the GCC |link libgcc_s.so on the GCC
   |10 branch   |10 branch

--- Comment #2 from Andrew Pinski  ---
Which was not backported to GCC 11 branch either.

[Bug target/108339] [10 only] riscv64-linux-gnu: fails to link libgcc_s.so on the GCC 10 branch

2023-01-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108339

--- Comment #1 from Andrew Pinski  ---
r12-5799-g45116f342057b7

Patch ping

2023-01-09 Thread Jakub Jelinek via Gcc-patches
Hi!

I'd like to ping a few pending patches:

https://gcc.gnu.org/pipermail/gcc-patches/2022-November/606973.html
  - PR107465 - c-family: Fix up -Wsign-compare BIT_NOT_EXPR handling

https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607104.html
  - PR107465 - c-family: Incremental fix for -Wsign-compare BIT_NOT_EXPR 
handling

https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607145.html
  - PR107558 - c++: Don't clear TREE_READONLY for -fmerge-all-constants for 
non-aggregates

https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607534.html
  - PR107846 - c-family: Account for integral promotions of left shifts for 
-Wshift-overflow warning

https://gcc.gnu.org/pipermail/gcc-patches/2022-November/606382.html
  - PR107703 - libgcc, i386: Add __fix{,uns}bfti and __float{,un}tibf

https://gcc.gnu.org/pipermail/gcc-patches/2022-December/608932.html
  - PR108079 - c, c++, cgraphunit: Prevent duplicated -Wunused-value warnings

Thanks

Jakub



[Bug c/108345] New: Mismatch __attribute__((aligned(x))) between declaration and definition does not raise error/warning

2023-01-09 Thread dumoulin.thibaut at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108345

Bug ID: 108345
   Summary: Mismatch __attribute__((aligned(x))) between
declaration and definition does not raise
error/warning
   Product: gcc
   Version: 10.3.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: dumoulin.thibaut at gmail dot com
  Target Milestone: ---

If `__attribute__((aligned(x)))` is present in function declaration but NOT
present in function implementation, GCC does not fire any warning/error.

Example:

```
#include <stdint.h>

typedef uint32_t _type_unaligned __attribute__((aligned(1)));

/* Prototype with attribute aligned */
void function_a(_type_unaligned* a);

int main(void) {
  struct test_t {
    uint8_t a;
    uint32_t b;
    uint32_t c;
  } __attribute__((__packed__)) test = {.a = 0, .b = 1, .c = 2};

  function_a(&(test.b));

  return 0;
}

/* Definition WITHOUT attribute aligned */
void function_a(uint32_t* a) {
  uint32_t _a = *(a + 0);
  uint32_t _b = *(a + 1);
  *(a + 3) = _a + _b;
}
```

`arm-none-eabi-gcc --specs=nosys.specs -mcpu=cortex-m4 -Wall -Wextra -O3
-mthumb -mlittle-endian`

```
$ arm-none-eabi-gcc --version
arm-none-eabi-gcc (15:10.3-2021.07-4) 10.3.1 20210621 (release)
```

Assembly generated for function_a:
```
811c <function_a>:
    811c:   e9d0 3200   ldrd    r3, r2, [r0]    --> illegal ARM V7 instruction if unaligned!
    8120:   4413        add     r3, r2
    8122:   60c3        str     r3, [r0, #12]
    8124:   4770        bx      lr
    8126:   bf00        nop

The header and the implementation of the function disagree about the
`__attribute__`, which is misleading. Here, GCC does NOT generate code that
handles an unaligned pointer in the `function_a` implementation.
Correct code would be:
```
/* Definition with attribute aligned */
void function_a(_type_unaligned* a) {
  uint32_t _a = *(a + 0);
  uint32_t _b = *(a + 1);
  *(a + 3) = _a + _b;
}
```

Usually, when a function declaration and its definition mismatch, GCC emits a
warning or an error.
I think this is a bug: GCC should refuse to compile this code, or at least
raise a warning.
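
A small compile-time sketch of what the two spellings of the parameter type mean.  It is written in C++ so that `static_assert` and `alignof` are available, it hoists `test_t` to namespace scope, and it renames the typedef to `type_unaligned` (my choice, to avoid a reserved-looking identifier); none of this is part of the original testcase.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Same typedef as in the report: a uint32_t that only requires 1-byte alignment.
typedef uint32_t type_unaligned __attribute__((aligned(1)));

// Same layout as the report's test_t.
struct __attribute__((__packed__)) test_t {
  uint8_t a;
  uint32_t b;
  uint32_t c;
};

// The prototype promises callers that any address is acceptable ...
static_assert(alignof(type_unaligned) == 1,
              "the typedef lowers the required alignment to 1");

// ... and the call site relies on that: 'b' sits at an odd offset.
static_assert(offsetof(test_t, b) == 1, "b is misaligned for a plain uint32_t");
```

A definition that takes plain `uint32_t *` lets the compiler assume the natural alignment of `uint32_t`, which is what permits the `ldrd` shown above.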

[Bug target/108339] [10 only] riscv64-linux-gnu: fails to link libgcc_s.so on the GCC 10 branch

2023-01-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108339

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |10.5

Re: [PATCH 9/15] arm: Set again stack pointer as CFA reg when popping if necessary

2023-01-09 Thread Richard Earnshaw via Gcc-patches




On 09/01/2023 14:58, Andrea Corallo via Gcc-patches wrote:

Andrea Corallo via Gcc-patches  writes:


Richard Earnshaw  writes:


On 27/09/2022 16:24, Kyrylo Tkachov via Gcc-patches wrote:



-----Original Message-----
From: Andrea Corallo 
Sent: Tuesday, September 27, 2022 11:06 AM
To: Kyrylo Tkachov 
Cc: Andrea Corallo via Gcc-patches ; Richard
Earnshaw ; nd 
Subject: Re: [PATCH 9/15] arm: Set again stack pointer as CFA reg when
popping if necessary

Kyrylo Tkachov  writes:


Hi Andrea,


-----Original Message-----
From: Gcc-patches  On Behalf Of Andrea
Corallo via Gcc-patches
Sent: Friday, August 12, 2022 4:34 PM
To: Andrea Corallo via Gcc-patches 
Cc: Richard Earnshaw ; nd 
Subject: [PATCH 9/15] arm: Set again stack pointer as CFA reg when popping
if necessary

Hi all,

this patch enables 'arm_emit_multi_reg_pop' to set again the stack
pointer as CFA reg when popping if this is necessary.



From what I can tell from similar functions this is correct, but could you
elaborate on why this change is needed for my understanding please?

Thanks,
Kyrill


Hi Kyrill,

sure, if the frame pointer was set, then it is the current CFA register.
If we request to adjust the current CFA register offset indicating it
being SP (while it's actually FP), that is indeed not correct and the
incoherence will be detected by an assertion in the dwarf emission
machinery.

Thanks,  the patch is ok
Kyrill



Best Regards

Andrea


Hmm, wait.  Why would a multi-reg pop be updating the stack pointer?


Hi Richard,

not sure I understand, isn't any pop updating SP by definition?



Back on this,

compiling:

===
int i;

void foo (int);

int bar()
{
   foo (i);
   return 0;
}
===

With -march=armv8.1-m.main+fp -mbranch-protection=pac-ret+leaf -mthumb -O0 -g

Produces the following asm for bar.

bar:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 1, uses_anonymous_args = 0
        pac     ip, lr, sp
        push    {r3, r7, ip, lr}
        add     r7, sp, #0
        ldr     r3, .L3
        ldr     r3, [r3]
        mov     r0, r3
        bl      foo
        movs    r3, #0
        mov     r0, r3
        pop     {r3, r7, ip, lr}
        aut     ip, lr, sp
        bx      lr

The offending instruction causing the ICE (without this patch) when
emitting dwarf is "pop {r3, r7, ip, lr}".

The current CFA reg when emitting the multipop is R7 (the frame
pointer).  If it is not the multipop that has the duty to restore SP as
the current CFA here, which other instruction should do it?



Digging a bit deeper, I'm now even more confused.  arm_expand_epilogue
contains (paraphrasing the code):


 if frame_pointer_needed
   {
     if arm
       {}
     else
       {
         if adjust
           r7 += adjust
         mov sp, r7 // Reset CFA to SP
       }
   }

so there should always be a move of r7 into SP, even if this is strictly 
redundant.  I don't understand why this doesn't happen for your 
testcase.  Can you dig a bit deeper?  I wonder if we've (probably 
incorrectly) assumed that this function doesn't need an epilogue but can 
use a simple return?  I don't think we should do that when 
authentication is needed: a simple return should really be one instruction.
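
The bookkeeping being discussed can be illustrated with a toy model. This is
only a sketch and not GCC's DWARF machinery: `struct cfa_state`, `cfa_set_reg`
and `cfa_adjust_offset` are hypothetical names, and the mismatch is reported
through a return value where the real code would hit an assertion. It assumes
the CFA is tracked as a (register, offset) pair and that an offset adjustment
is only valid when it names the register that currently defines the CFA.

```c
#include <stdbool.h>

/* Only the two registers relevant to the example; numbers follow ARM naming. */
enum reg { REG_R7 = 7, REG_SP = 13 };

/* Toy CFA rule: CFA = reg + offset. */
struct cfa_state {
    enum reg reg;
    int offset;
};

/* Models "mov sp, r7 // Reset CFA to SP": change which register defines
 * the CFA while keeping the offset. */
static void cfa_set_reg(struct cfa_state *s, enum reg r)
{
    s->reg = r;
}

/* Models an offset adjustment such as the one attached to a multi-reg pop.
 * Returns false, leaving the state untouched, if the adjustment names a
 * register other than the current CFA register. */
static bool cfa_adjust_offset(struct cfa_state *s, enum reg claimed, int delta)
{
    if (s->reg != claimed)
        return false;
    s->offset += delta;
    return true;
}
```

In this model, the state after the prologue of bar is {R7, 16} (four registers
pushed). An SP-relative adjustment of -16 for the pop is rejected unless
`cfa_set_reg(&s, REG_SP)` has been applied first, which corresponds to either
the `mov sp, r7` in the epilogue or the pop itself re-establishing SP as the
CFA register.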



Best Regards

   Andrea


R.


[Bug middle-end/107991] [10/11/12/13 Regression] Extra mov instructions with ternary on x86

2023-01-09 Thread roger at nextmovesoftware dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107991

Roger Sayle  changed:

   What|Removed |Added

 CC||roger at nextmovesoftware dot com

--- Comment #6 from Roger Sayle  ---
An x86 peephole2 to workaround the problem was proposed here:
https://gcc.gnu.org/pipermail/gcc-patches/2023-January/609578.html
but improved register allocation (if possible) would be a better solution:
https://gcc.gnu.org/pipermail/gcc-patches/2023-January/609588.html
