Re: RFC: Introduce -fhardened to enable security-related flags

2023-09-18 Thread Hans-Peter Nilsson via Gcc-patches
> From: Sam James 
> Date: Mon, 18 Sep 2023 08:21:45 +0100

> Hans-Peter Nilsson  writes:
> 
> >> From: Sam James 
> >> Date: Sun, 17 Sep 2023 05:00:37 +0100
> >
> >> Hans-Peter Nilsson via Gcc-patches  writes:
> >> > The situation was described as "we noticed that some test
> >> > suites takes 35% percent longer time to finish.  After
> >> > further investigation it was noticed that running systemctl
> >> > unmask x takes around 5s more time on [version including
> >> > patch vs. before that patch]" (timing out some tests).
> >> > Reverting that patch fixed the drop in performance.
> >> 
> >> Did some bug ever get filed for this to see if we can do a bit
> >> better here?
> >
> > Not that I know of; neither for systemd nor gcc.
> >
> >> Some slowdown doesn't mean it's of the expected magnitude.
> >
> > Can you please rephrase that?
> 
> Sorry, what I meant was: yes, I'd expect a bit of a slowdown
> with this option that's unavoidable, but what you're describing
> sounds far beyond the normal case.

Thanks.  My take is that it's something I more or less
expect when indiscriminately using that option.  IOW, for
code consciously using that option, a step is necessary
where the code is vetted and adjusted, for example to remove
large structures and arrays allocated on the stack (just a
guess).

> (to the extent that it might be
> we're either missing an optimisation in GCC for the relevant bits,
> or the systemd parts being exercised here are pathological.)

Sorry, I don't have anything more in terms of local
observations from that investigation; no perf and
flamegraphs.  Again as above, the issue should be observable
as a noticeable slowdown for "systemctl unmask x" for a
systemd compiled with/without 1a4e392760.

brgds, H-P


Re: RFC: Introduce -fhardened to enable security-related flags

2023-09-17 Thread Hans-Peter Nilsson via Gcc-patches
> From: Sam James 
> Date: Sun, 17 Sep 2023 05:00:37 +0100

> Hans-Peter Nilsson via Gcc-patches  writes:
> 
> >> Date: Tue, 29 Aug 2023 15:42:27 -0400
> >> From: Marek Polacek via Gcc-patches 
> >
> >> Surely, there must be no ABI impact, the option cannot cause
> >> severe performance issues,
> >
> >> Currently, -fhardened enables:
> > ...
> >>   -ftrivial-auto-var-init=zero
> >
> >> Thoughts?
> >
> > Regarding -ftrivial-auto-var-init=zero, I was consulted when
> > colleagues investigating a performance regression
> > pint-pointed it as *causing severe performance issues*;
> > cf. https://github.com/systemd/systemd.git commit 1a4e392760
> > (TL;DR: adds "-ftrivial-auto-var-init=zero" to the systemd
> > build).
> >
> > The situation was described as "we noticed that some test
> > suites takes 35% percent longer time to finish.  After
> > further investigation it was noticed that running systemctl
> > unmask x takes around 5s more time on [version including
> > patch vs. before that patch]" (timing out some tests).
> > Reverting that patch fixed the drop in performance.
> 
> Did some bug ever get filed for this to see if we can do a bit
> better here?

Not that I know of; neither for systemd nor gcc.

> Some slowdown doesn't mean it's of the expected magnitude.

Can you please rephrase that?

> > Just a data point, but I believe also exactly your intended
> > use.  IMO including -ftrivial-auto-var-init is worth extra
> > consideration.
> >
> > Alternatively, strike the while "cannot cause severe
> > performance issues".
> >
> > brgds, H-P
> 


Re: RFC: Introduce -fhardened to enable security-related flags

2023-09-16 Thread Hans-Peter Nilsson via Gcc-patches
> Date: Tue, 29 Aug 2023 15:42:27 -0400
> From: Marek Polacek via Gcc-patches 

> Surely, there must be no ABI impact, the option cannot cause
> severe performance issues,

> Currently, -fhardened enables:
...
>   -ftrivial-auto-var-init=zero

> Thoughts?

Regarding -ftrivial-auto-var-init=zero, I was consulted when
colleagues investigating a performance regression
pint-pointed it as *causing severe performance issues*;
cf. https://github.com/systemd/systemd.git commit 1a4e392760
(TL;DR: adds "-ftrivial-auto-var-init=zero" to the systemd
build).

The situation was described as "we noticed that some test
suites takes 35% percent longer time to finish.  After
further investigation it was noticed that running systemctl
unmask x takes around 5s more time on [version including
patch vs. before that patch]" (timing out some tests).
Reverting that patch fixed the drop in performance.

Just a data point, but I believe also exactly your intended
use.  IMO including -ftrivial-auto-var-init is worth extra
consideration.

Alternatively, strike the while "cannot cause severe
performance issues".

brgds, H-P


Re: [RFC] libstdc++: Make --enable-libstdcxx-backtrace=auto default to yes

2023-09-06 Thread Hans-Peter Nilsson via Gcc-patches
> Date: Thu, 7 Sep 2023 00:11:04 +0100
> From: Jonathan Wakely via Gcc-patches 

> On Thu, 7 Sept 2023 at 00:10, Jonathan Wakely  wrote:
> > I don't think there's a bug. $is_hosted is true for
> > --enable-hosted-libstdcxx which is on by default.
> 
> And IIRC __STDC_HOSTED__ is defined unless you compile with -ffreestanding.

Also documented and thus fine by any (reasonable)
definition :) no need for anyone (me) to guess.

brgds, H-P


Re: [RFC] libstdc++: Make --enable-libstdcxx-backtrace=auto default to yes

2023-09-06 Thread Hans-Peter Nilsson via Gcc-patches
> From: Jonathan Wakely 
> Date: Wed, 6 Sep 2023 23:30:08 +0100

> On Mon, 4 Sept 2023 at 17:49, Jonathan Wakely  wrote:
> >
> > On Mon, 4 Sept 2023 at 17:47, Hans-Peter Nilsson via Libstdc++
> >  wrote:
> > >
> > > > Date: Fri, 1 Sep 2023 12:16:40 +0100
> > > > Reply-To: Jonathan Wakely 
> > > >
> > > > On Wed, 23 Aug 2023 at 17:03, Jonathan Wakely via Libstdc++
> > > >  wrote:
> > > > >
> > > > > Any objections to this? It's a C++23 feture, so should be enabled by
> > > > > default.
> > > >
> > > > I've pushed this to trunk, so let's see what breaks!
> > > >
> > > >
> > > > >
> > > > > -- >8 --
> > > > >
> > > > > This causes libstdc++_libbacktrace.a to be built by default. This 
> > > > > might
> > > > > fail on some targets, in which case we can make the 'auto' choice 
> > > > > expand
> > > > > to either 'yes' or 'no' depending on the target.
> > > > >
> > > > > libstdc++-v3/ChangeLog:
> > > > >
> > > > > * acinclude.m4 (GLIBCXX_ENABLE_BACKTRACE): Default to yes.
> > > > > * configure: Regenerate.
> > >
> > > Incidentally, should check_effective_target_stacktrace in
> > > libstdc++.exp also be adjusted to match; removing the
> > > _GLIBCXX_HOSTED condition?
> >
> > No, it should still depend on is_hosted. The acinclude.m4 macro should
> > check that.
> 
> Done in r14-3761-g6de5f5a4fe85bd

Aha, that's what you meant.  I thought you meant that just
check_effective_target_stacktrace should be gated on
$is_hosted.

Yeah, it makes sense not to have backtrace enabled by
default for ! $is_hosted.  On the other hand, bare-iron
targets like cris-elf apparently *are* hosted, according to
"#if __STDC_HOSTED__".  I guess I'll have to dig into what
the definition of "hosted" is, because I don't agree by the
layman obvious definition.  Maybe there's a bug to fix.
There sure are many yaks to shave these days.

brgds, H-P


Re: [RFC] libstdc++: Make --enable-libstdcxx-backtrace=auto default to yes

2023-09-04 Thread Hans-Peter Nilsson via Gcc-patches
> Date: Fri, 1 Sep 2023 12:16:40 +0100
> Reply-To: Jonathan Wakely 
> 
> On Wed, 23 Aug 2023 at 17:03, Jonathan Wakely via Libstdc++
>  wrote:
> >
> > Any objections to this? It's a C++23 feture, so should be enabled by
> > default.
> 
> I've pushed this to trunk, so let's see what breaks!
> 
> 
> >
> > -- >8 --
> >
> > This causes libstdc++_libbacktrace.a to be built by default. This might
> > fail on some targets, in which case we can make the 'auto' choice expand
> > to either 'yes' or 'no' depending on the target.
> >
> > libstdc++-v3/ChangeLog:
> >
> > * acinclude.m4 (GLIBCXX_ENABLE_BACKTRACE): Default to yes.
> > * configure: Regenerate.

Incidentally, should check_effective_target_stacktrace in
libstdc++.exp also be adjusted to match; removing the
_GLIBCXX_HOSTED condition?

brgds, H-P


Re: [RFC] libstdc++: Make --enable-libstdcxx-backtrace=auto default to yes

2023-09-04 Thread Hans-Peter Nilsson via Gcc-patches
I was about to enter a PR for the regression, but as you're
already aware, I'll wait 24 hours to see if this magically
goes away. :]

> On Fri, 1 Sept 2023 at 12:16, Jonathan Wakely  wrote:
> >
> > On Wed, 23 Aug 2023 at 17:03, Jonathan Wakely via Libstdc++
> >  wrote:
> > >
> > > Any objections to this? It's a C++23 feture, so should be enabled by
> > > default.
> >
> > I've pushed this to trunk, so let's see what breaks!
> 
> This modules header broke on aarch64, of course:
> FAIL: g++.dg/modules/xtreme-header_b.C -std=c++2b (test for excess errors)

And others, according to testresults@ including
powerpc64le-unknown-linux-gnu, x86_64-pc-linux-gnu,
s390x-ibm-linux-gnu, m68k-unknown-linux-gnu,
pru-unknown-elf, and...cris-elf (notably, both "64-bit" and
"32-bit" configurations).

Not sure how much information you have, but for cris-elf,
g++.log shows:

FAIL: g++.dg/modules/xtreme-header_b.C -std=c++2b (test for excess errors)
Excess errors:
/obj/libstdc++-v3/include/stacktrace:202:24: error: mangling of 'constexpr 
std::stacktrace_entry::_M_get_info(std::string*, std::string*, int*) 
constoperator 
void (*)(void*, std::stacktrace_entry::uintptr_t, const char*, 
std::stacktrace_entry::uintptr_t, std::stacktrace_entry::uintptr_t)() const' as 
'_ZZNKSt16stacktrace_entry11_M_get_infoEPNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES6_PiENKUlPvmPKcmmE_cvPFvS8_mSA_mmEEv'
 conflicts with a previous mangle
/obj/libstdc++-v3/include/stacktrace:202:24: error: mangling of 'static 
constexpr void std::stacktrace_entry::_M_get_info(std::string*, std::string*, 
int*) const_FUN(void*, 
std::stacktrace_entry::uintptr_t, const char*, 
std::stacktrace_entry::uintptr_t, std::stacktrace_entry::uintptr_t)' as 
'_ZZNKSt16stacktrace_entry11_M_get_infoEPNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES6_PiENUlPvmPKcmmE_4_FUNES8_mSA_mm'
 conflicts with a previous mangle
/obj/libstdc++-v3/include/stacktrace:202:24: error: mangling of 
'std::stacktrace_entry::_M_get_info(std::string*, std::string*, int*) 
const::' as 
'_ZZNKSt16stacktrace_entry11_M_get_infoEPNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES6_PiENKUlPvmPKcmmE_clES8_mSA_mm'
 conflicts with a previous mangle

So, I *guess* it's some kind of pre-existing mangling foulup
with C++20 in the backtrace-support that just happens to be
ticked off by the module testsuite.  But you probably
already knew that.

brgds, H-P


Re: [PATCH] analyzer: implement reference count checking for CPython plugin [PR107646]

2023-08-31 Thread Hans-Peter Nilsson via Gcc-patches
(Looks like this was committed as r14-3580-g597b9ec69bca8a)

> Cc: g...@gcc.gnu.org, gcc-patches@gcc.gnu.org, Eric Feng 
> From: Eric Feng via Gcc 

> gcc/testsuite/ChangeLog:
>   PR analyzer/107646
>   * gcc.dg/plugin/analyzer_cpython_plugin.c: Implements reference count
>   * checking for PyObjects.
>   * gcc.dg/plugin/cpython-plugin-test-2.c: Moved to...
>   * gcc.dg/plugin/cpython-plugin-test-PyList_Append.c: ...here (and
>   * added more tests).
>   * gcc.dg/plugin/cpython-plugin-test-1.c: Moved to...
>   * gcc.dg/plugin/cpython-plugin-test-no-plugin.c: ...here (and added
>   * more tests).
>   * gcc.dg/plugin/plugin.exp: New tests.
>   * gcc.dg/plugin/cpython-plugin-test-PyList_New.c: New test.
>   * gcc.dg/plugin/cpython-plugin-test-PyLong_FromLong.c: New test.
>   * gcc.dg/plugin/cpython-plugin-test-refcnt-checking.c: New test.

It seems this was more or less a rewrite, but that said,
it's generally preferable to always *add* tests, never *modify* them.

>  .../gcc.dg/plugin/analyzer_cpython_plugin.c   | 376 +-

^^^ Ouch!  Was it not within reason to keep that test as it
was, and just add another test?

Anyway, the test after rewrite fails, and for some targets
like cris-elf and apparently m68k-linux, yields an error.
I see a PR was already opened.

Also, mostly for future reference, several files in the
patch miss a final newline, as seen by a "\ No newline at
end of file"-marker.

I think I found the problem; a mismatch between default C++
language standard between host-gcc and target-gcc.

(It's actually *not* as simple as "auto var = typeofvar()"
not being recognized in C++11 --or else there'd be an error
for the hash_set declaration too, which I just changed for
consistency-- but it's close enough for me.)

With this, retesting plugin.exp for cris-elf works.

Ok to commit?

-- >8 --
From: Hans-Peter Nilsson 
Date: Fri, 1 Sep 2023 04:36:03 +0200
Subject: [PATCH] testsuite: Fix analyzer_cpython_plugin.c declarations, PR 
testsuite/111264

Also, add missing newline at end of file.

PR testsuite/111264
* gcc.dg/plugin/analyzer_cpython_plugin.c: Make declarations
C++11-compatible.
---
 gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c 
b/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
index 7af520436549..bf1982e79c37 100644
--- a/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
+++ b/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
@@ -477,8 +477,8 @@ pyobj_refcnt_checker (const region_model *model,
   if (!ctxt)
 return;
 
-  auto region_to_refcnt = hash_map ();
-  auto seen_regions = hash_set ();
+  hash_map region_to_refcnt;
+  hash_set seen_regions;
 
   count_pyobj_references (model, region_to_refcnt, retval, seen_regions);
   check_refcnts (model, old_model, retval, ctxt, region_to_refcnt);
@@ -561,7 +561,7 @@ public:
 if (!ctxt)
   return;
 region_model *model = cd.get_model ();
-auto region_to_refcnt = hash_map ();
+hash_map region_to_refcnt;
 count_all_references(model, region_to_refcnt);
 dump_refcnt_info(region_to_refcnt, model, ctxt);
   }
@@ -1330,4 +1330,4 @@ plugin_init (struct plugin_name_args *plugin_info,
   sorry_no_analyzer ();
 #endif
   return 0;
-}
\ No newline at end of file
+}
-- 
2.30.2

brgds, H-P


Re: [PATCH] libstdc++: Use GLIBCXX_CHECK_LINKER_FEATURES for cross-builds (PR111238)

2023-08-31 Thread Hans-Peter Nilsson via Gcc-patches
> From: Hans-Peter Nilsson 
> Date: Thu, 31 Aug 2023 19:05:19 +0200

> > Date: Thu, 31 Aug 2023 17:25:45 +0200
> > From: Christophe Lyon via Gcc-patches 

> > However, this would hide the fact that libstdc++ somehow forces the
> > user to use -Wl,-gc-sections to avoid undefined references to chdir,
> > mkdir, chmod, pathconf, ... so maybe it's better to keep the status
> > quo and not apply this patch?

I agree with the sentiment, but maybe --gc-sections should
instead be passed by default for arm-eabi when linking, with
way to opt-out; as for cris-elf per below.

> Datapoint: no failures for cris-elf in the listed tests -
> but it instead passes --gc-sections if -O2 or -O3 is seen
> for linking; see cris/cris.h.  It's been like that forever,
> modulo a patch in 2002 not passing it if "-r" is seen.
> 
> Incidentally, I've been sort-of investigating a recent-ish
> commit to newlib (8/8) that added a stub for getpriority,
> which was apparently added due to testsuite breakage for
> libstdc++ and arm-eabi, but that instead broke testsuite
> results for *other* targets, as warning at link-time.  Film
> at 11.
> 
> > 2023-08-31  Christophe Lyon  
> > 
> > libstdc++-v3/ChangeLog:
> > 
> > PR libstdc++/111238
> > * configure: Regenerate.
> > * configure.ac: Call GLIBCXX_CHECK_LINKER_FEATURES in cross,
> > non-Canadian builds.
> 
> On this actual patch, I can't say yay or nay though (but
> leaning towards yay), but I'll test for cris-elf.  Would you
> mind holding off committing for a day or two?

No regressions for cris-elf with this patch.  Still, on one
thought I'm also not wild about libstdc++ this way
overriding the target, and on the other hand, I'll likely to
suggest something similar (adding options) to "improve"
GCC_TRY_COMPILE_OR_LINK (more targets actually linking).

brgds, H-P


Re: [PATCH] libstdc++: Use GLIBCXX_CHECK_LINKER_FEATURES for cross-builds (PR111238)

2023-08-31 Thread Hans-Peter Nilsson via Gcc-patches
> Date: Thu, 31 Aug 2023 17:25:45 +0200
> From: Christophe Lyon via Gcc-patches 

> As discussed in PR104167 (comments #8 and below), and PR111238, using
> -Wl,-gc-sections in the libstdc++ testsuite for arm-eabi
> (cross-toolchain) avoids link failures for a few tests:
> 
> 27_io/filesystem/path/108636.cc
> std/time/clock/gps/1.cc
> std/time/clock/gps/io.cc
> std/time/clock/tai/1.cc
> std/time/clock/tai/io.cc
> std/time/clock/utc/1.cc
> std/time/clock/utc/io.cc
> std/time/clock/utc/leap_second_info.cc
> std/time/exceptions.cc
> std/time/format.cc
> std/time/time_zone/get_info_local.cc
> std/time/time_zone/get_info_sys.cc
> std/time/tzdb/1.cc
> std/time/tzdb/leap_seconds.cc
> std/time/tzdb_list/1.cc
> std/time/zoned_time/1.cc
> std/time/zoned_time/custom.cc
> std/time/zoned_time/io.cc
> std/time/zoned_traits.cc
> 
> This patch achieves this by calling GLIBCXX_CHECK_LINKER_FEATURES in
> cross-build cases, like we already do for native builds. We keep not
> doing so in Canadian-cross builds.
> 
> However, this would hide the fact that libstdc++ somehow forces the
> user to use -Wl,-gc-sections to avoid undefined references to chdir,
> mkdir, chmod, pathconf, ... so maybe it's better to keep the status
> quo and not apply this patch?

Datapoint: no failures for cris-elf in the listed tests -
but it instead passes --gc-sections if -O2 or -O3 is seen
for linking; see cris/cris.h.  It's been like that forever,
modulo a patch in 2002 not passing it if "-r" is seen.

Incidentally, I've been sort-of investigating a recent-ish
commit to newlib (8/8) that added a stub for getpriority,
which was apparently added due to testsuite breakage for
libstdc++ and arm-eabi, but that instead broke testsuite
results for *other* targets, as warning at link-time.  Film
at 11.

> 2023-08-31  Christophe Lyon  
> 
> libstdc++-v3/ChangeLog:
> 
> PR libstdc++/111238
> * configure: Regenerate.
> * configure.ac: Call GLIBCXX_CHECK_LINKER_FEATURES in cross,
> non-Canadian builds.

On this actual patch, I can't say yay or nay though (but
leaning towards yay), but I'll test for cris-elf.  Would you
mind holding off committing for a day or two?

brgds, H-P


Re: Fix profile update in tree-ssa-reassoc

2023-08-24 Thread Hans-Peter Nilsson via Gcc-patches
> Date: Wed, 23 Aug 2023 11:10:02 +0200
> From: Jan Hubicka via Gcc-patches 

> Hi,
> this patch adds missing profile update to maybe_optimize_range_tests.

[...]

> Jakub, it seems that the code is originally yours.  Any idea why those are 
> not turned to
> constant true or false conditionals?
> 
> Bootstrapped/regtested x86_64-linux, does it seem to make sense?
> 
> gcc/ChangeLog:
> 
>   PR tree-optimization/110628
>   * tree-ssa-reassoc.cc (maybe_optimize_range_tests): Add profile update.

Hi.  Feeling somewhat guilty for not noticing that you had
posted a patch before me xfailing it, I went ahead and
tested this patch for cris-elf against
r14-3431-g7e05cd632fab, but unfortunately it regresses a few
tests, and it appears it's not just testcase (dumps) that
need tweaking.  Four test-cases regress (counting multiple
runs as just one):

Running /x/gcc/gcc/testsuite/gcc.c-torture/execute/execute.exp ...
FAIL: gcc.c-torture/execute/pr95731.c   -O1  execution test
FAIL: gcc.c-torture/execute/pr95731.c   -O2  execution test
FAIL: gcc.c-torture/execute/pr95731.c   -O3 -fomit-frame-pointer -funroll-loops 
-fpeel-loops -ftracer -finline-functions  execution test
FAIL: gcc.c-torture/execute/pr95731.c   -O3 -g  execution test
FAIL: gcc.c-torture/execute/pr95731.c   -Os  execution test
FAIL: gcc.c-torture/execute/pr95731.c   -O2 -flto -fno-use-linker-plugin 
-flto-partition=none  execution test
FAIL: gcc.c-torture/execute/pr95731.c   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects  execution test
...

Running /x/gcc/gcc/testsuite/gcc.dg/dg.exp ...
FAIL: gcc.dg/pr46309-2.c scan-tree-dump-times reassoc2 "Optimizing range tests 
[^\r\n]*_[0-9]* -.0, 0. and -.128, 128.[\n\r]* into" 1
...

Running /x/gcc/gcc/testsuite/gcc.dg/torture/dg-torture.exp ...
FAIL: gcc.dg/torture/pr63464.c   -Os  execution test
...

Running /x/gcc/gcc/testsuite/gcc.dg/tree-ssa/tree-ssa.exp ...
...
FAIL: gcc.dg/tree-ssa/pr95731.c scan-tree-dump-times optimized " >= 0| < 0" 6

brgds, H-P


[committed] testsuite: Xfail gcc.dg/tree-ssa/update-threading.c for CRIS, PR110628

2023-08-23 Thread Hans-Peter Nilsson via Gcc-patches
Oops, looks like the PR title annotation didn't work and I
forgot the classic changelog annotation.

Anyway, after fixing a testsuite inconsistency, this test
fails for *some* architectures and shows up as a regression;
see the PR.

-- >8 --

* gcc.dg/tree-ssa/update-threading.c: Xfail for cris-*-*.
---
 gcc/testsuite/gcc.dg/tree-ssa/update-threading.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/update-threading.c 
b/gcc/testsuite/gcc.dg/tree-ssa/update-threading.c
index 1435e9ba2e02..9500099cddff 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/update-threading.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/update-threading.c
@@ -20,4 +20,4 @@ main (int argc, char **argv)
 foo (((A) { ((!(i >> 4) ? 8 : 64 + (i >> 4)) << 8) + (i << 4) } ).a);
   exit (0);
 }
-/* { dg-final { scan-tree-dump-times "Invalid sum" 0 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "Invalid sum" 0 "optimized" { xfail 
cris-*-* } } } Xfail: PR110628 */
-- 
2.30.2



Re: [committed] libstdc++: Reuse double overload of __convert_to_v if possible

2023-08-17 Thread Hans-Peter Nilsson via Gcc-patches
> Date: Thu, 17 Aug 2023 21:32:29 +0100
> From: Jonathan Wakely via Gcc-patches 

> Tested x86_64-linux. Pushed to trunk.

Does the below typo imply that for x86_64-linux,
"__DBL_MANT_DIG__ == __LDBL_MANT_DIG__" is false and the
code is actually untested?

> libstdc++-v3/ChangeLog:
> 
>   * config/locale/generic/c_locale.cc (__convert_to_v): Reuse
>   double overload for long double if possible.

Breakage for cris-elf:

libtool: compile:  /auto/cris-elf/gccobj/./gcc/xgcc -shared-libgcc 
-B/auto/cris-elf/gccobj/./gcc -nostdinc++ 
-L/auto/cris-elf/gccobj/cris-elf/libstdc++-v3/src 
-L/auto/cris-elf/gccobj/cris-elf/libstdc++-v3/src/.libs 
-L/auto/cris-elf/gccobj/cris-elf/libstdc++-v3/libsupc++/.libs -nostdinc 
-B/auto/cris-elf/gccobj/cris-elf/newlib/ -isystem 
/auto/cris-elf/gccobj/cris-elf/newlib/targ-include -isystem 
/auto/gcc/newlib/libc/include -B/auto/cris-elf/gccobj/cris-elf/libgloss/cris 
-L/auto/cris-elf/gccobj/cris-elf/libgloss/libnosys -L/auto/gcc/libgloss/cris 
-B/auto/cris-elf/pre/cris-elf/bin/ -B/auto/cris-elf/pre/cris-elf/lib/ -isystem 
/auto/cris-elf/pre/cris-elf/include -isystem 
/auto/cris-elf/pre/cris-elf/sys-include -I/auto/gcc/libstdc++-v3/../libgcc 
-I/auto/cris-elf/gccobj/cris-elf/libstdc++-v3/include/cris-elf 
-I/auto/cris-elf/gccobj/cris-elf/libstdc++-v3/include 
-I/auto/gcc/libstdc++-v3/libsupc++ -std=gnu++98 -fno-implicit-templates -Wall 
-Wextra -Wwrite-strings -Wcast-qual -Wabi=2 -fdiagnos
 tics-show-location=once -ffunction-sections -fdata-sections 
-frandom-seed=c++locale.lo -g -O2 -fimplicit-templates -c c++locale.cc -o 
c++locale.o
c++locale.cc: In function 'void std::__convert_to_v(const char*, _Tp&, 
ios_base::iostate&, int* const&) [with _Tp = long double; ios_base::iostate = 
ios_base::iostate; __c_locale = int*]':
c++locale.cc:192:49: error: expected primary-expression before ')' token
  192 |   __convert_to_v(__s, __d, __err, __c_locale);
  | ^
make[5]: *** [Makefile:881: c++locale.lo] Error 1
make[5]: Leaving directory 
'/auto/cris-elf/gccobj/cris-elf/libstdc++-v3/src/c++98'

(Formally, a commit in the range ee40bdbfb07c..aad83d61d2e9
but this one seems pretty clear.)

In the context:
 __convert_to_v(const char* __s, long double& __v,
   ios_base::iostate& __err, const __c_locale&) throw()

So, __c_locale" appears to be the type(def) and you're
missing a parameter name. :)

brgds, H-P


[PATCH v2] CRIS: Don't include tree.h in cris-protos.h, PR bootstrap/111021

2023-08-14 Thread Hans-Peter Nilsson via Gcc-patches
Re-testing as previously mentioned, reposted freshly for reference.
-- >8 --
While there's another patch that fixes the immediate error
in the PR by other means, the include of tree.h here is
something I prefer to avoid.

PR bootstrap/111021
* config/cris/cris-protos.h: Revert recent change.
* config/cris/cris.cc (cris_legitimate_address_p): Remove
code_helper unused parameter.
(cris_legitimate_address_p_hook): New wrapper function.
(TARGET_LEGITIMATE_ADDRESS_P): Change to
cris_legitimate_address_p_hook.
---
 gcc/config/cris/cris-protos.h |  5 +
 gcc/config/cris/cris.cc   | 13 +++--
 2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/gcc/config/cris/cris-protos.h b/gcc/config/cris/cris-protos.h
index 58555943986c..666e04f9eeec 100644
--- a/gcc/config/cris/cris-protos.h
+++ b/gcc/config/cris/cris-protos.h
@@ -20,8 +20,6 @@ along with GCC; see the file COPYING3.  If not see
 
 /* Prototypes for the CRIS port.  */
 
-#include "tree.h" /* For ERROR_MARK.  */
-
 extern bool cris_simple_epilogue (void);
 #ifdef RTX_CODE
 extern const char *cris_op_str (rtx);
@@ -36,8 +34,7 @@ extern bool cris_base_or_autoincr_p (const_rtx, bool);
 extern bool cris_bdap_index_p (const_rtx, bool);
 extern void cris_reduce_compare (rtx *, rtx *, rtx *);
 extern bool cris_biap_index_p (const_rtx, bool);
-extern bool cris_legitimate_address_p (machine_mode, rtx, bool,
-  code_helper = ERROR_MARK);
+extern bool cris_legitimate_address_p (machine_mode, rtx, bool);
 extern bool cris_store_multiple_op_p (rtx);
 extern bool cris_movem_load_rest_p (rtx);
 extern void cris_asm_output_symbol_ref (FILE *, rtx);
diff --git a/gcc/config/cris/cris.cc b/gcc/config/cris/cris.cc
index 853c07920f07..4eaaf2184b63 100644
--- a/gcc/config/cris/cris.cc
+++ b/gcc/config/cris/cris.cc
@@ -168,6 +168,8 @@ static unsigned int cris_hard_regno_nregs (unsigned int, 
machine_mode);
 static bool cris_hard_regno_mode_ok (unsigned int, machine_mode);
 static HOST_WIDE_INT cris_static_rtx_alignment (machine_mode);
 static HOST_WIDE_INT cris_constant_alignment (const_tree, HOST_WIDE_INT);
+static bool cris_legitimate_address_p_hook (machine_mode, rtx, bool,
+   code_helper);
 
 /* This is the parsed result of the "-max-stack-stackframe=" option.  If
it (still) is zero, then there was no such option given.  */
@@ -217,7 +219,7 @@ int cris_cpu_version = CRIS_DEFAULT_CPU_VERSION;
 #define TARGET_INIT_LIBFUNCS cris_init_libfuncs
 
 #undef TARGET_LEGITIMATE_ADDRESS_P
-#define TARGET_LEGITIMATE_ADDRESS_P cris_legitimate_address_p
+#define TARGET_LEGITIMATE_ADDRESS_P cris_legitimate_address_p_hook
 
 #undef TARGET_PREFERRED_RELOAD_CLASS
 #define TARGET_PREFERRED_RELOAD_CLASS cris_preferred_reload_class
@@ -1536,8 +1538,15 @@ cris_biap_index_p (const_rtx x, bool strict)
 
 /* Worker function for TARGET_LEGITIMATE_ADDRESS_P.  */
 
+static bool
+cris_legitimate_address_p_hook (machine_mode mode, rtx x, bool strict,
+   code_helper)
+{
+  return cris_legitimate_address_p (mode, x, strict);
+}
+
 bool
-cris_legitimate_address_p (machine_mode mode, rtx x, bool strict, code_helper)
+cris_legitimate_address_p (machine_mode mode, rtx x, bool strict)
 {
   const_rtx x1, x2;
 
-- 
2.30.2



Re: [PATCH] CRIS: Don't include tree.h in cris-protos.h, PR bootstrap/111021

2023-08-14 Thread Hans-Peter Nilsson via Gcc-patches
> From: Hans-Peter Nilsson 
> Date: Tue, 15 Aug 2023 06:57:04 +0200

Whoops, of course there was a typo due to
insufficient-last-minute-renaming syndrome. :)

> -#define TARGET_LEGITIMATE_ADDRESS_P cris_legitimate_address_p
> +#define TARGET_LEGITIMATE_ADDRESS_P cris_target_legitimate_address_p

*cris_legitimate_address_p_hook

brgds, H-P


[PATCH] CRIS: Don't include tree.h in cris-protos.h, PR bootstrap/111021

2023-08-14 Thread Hans-Peter Nilsson via Gcc-patches
I'll commit this in a few hours pending testing.  It seems
trivial enough to be posted before testing is finished
though, now that it has passed the previous
point-of-breakage.  JFTR, I'm testing against the version
with the "first" breaking commit: r14-3092, not r14-3093 the
one with recog.h.

-- >8 --
While there's another patch that fixes the immediate error
in the PR by other means, the include of tree.h here is
something I prefer to avoid.

PR bootstrap/111021
* config/cris/cris-protos.h: Revert recent change.
* config/cris/cris.cc (cris_legitimate_address_p): Remove
code_helper unused parameter.
(cris_legitimate_address_p_hook): New wrapper function.
(TARGET_LEGITIMATE_ADDRESS_P): Change to
cris_legitimate_address_p_hook.
---
 gcc/config/cris/cris-protos.h |  5 +
 gcc/config/cris/cris.cc   | 13 +++--
 2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/gcc/config/cris/cris-protos.h b/gcc/config/cris/cris-protos.h
index 58555943986c..666e04f9eeec 100644
--- a/gcc/config/cris/cris-protos.h
+++ b/gcc/config/cris/cris-protos.h
@@ -20,8 +20,6 @@ along with GCC; see the file COPYING3.  If not see
 
 /* Prototypes for the CRIS port.  */
 
-#include "tree.h" /* For ERROR_MARK.  */
-
 extern bool cris_simple_epilogue (void);
 #ifdef RTX_CODE
 extern const char *cris_op_str (rtx);
@@ -36,8 +34,7 @@ extern bool cris_base_or_autoincr_p (const_rtx, bool);
 extern bool cris_bdap_index_p (const_rtx, bool);
 extern void cris_reduce_compare (rtx *, rtx *, rtx *);
 extern bool cris_biap_index_p (const_rtx, bool);
-extern bool cris_legitimate_address_p (machine_mode, rtx, bool,
-  code_helper = ERROR_MARK);
+extern bool cris_legitimate_address_p (machine_mode, rtx, bool);
 extern bool cris_store_multiple_op_p (rtx);
 extern bool cris_movem_load_rest_p (rtx);
 extern void cris_asm_output_symbol_ref (FILE *, rtx);
diff --git a/gcc/config/cris/cris.cc b/gcc/config/cris/cris.cc
index 853c07920f07..4eaaf2184b63 100644
--- a/gcc/config/cris/cris.cc
+++ b/gcc/config/cris/cris.cc
@@ -168,6 +168,8 @@ static unsigned int cris_hard_regno_nregs (unsigned int, 
machine_mode);
 static bool cris_hard_regno_mode_ok (unsigned int, machine_mode);
 static HOST_WIDE_INT cris_static_rtx_alignment (machine_mode);
 static HOST_WIDE_INT cris_constant_alignment (const_tree, HOST_WIDE_INT);
+static bool cris_legitimate_address_p_hook (machine_mode, rtx, bool,
+   code_helper);
 
 /* This is the parsed result of the "-max-stack-stackframe=" option.  If
it (still) is zero, then there was no such option given.  */
@@ -217,7 +219,7 @@ int cris_cpu_version = CRIS_DEFAULT_CPU_VERSION;
 #define TARGET_INIT_LIBFUNCS cris_init_libfuncs
 
 #undef TARGET_LEGITIMATE_ADDRESS_P
-#define TARGET_LEGITIMATE_ADDRESS_P cris_legitimate_address_p
+#define TARGET_LEGITIMATE_ADDRESS_P cris_target_legitimate_address_p
 
 #undef TARGET_PREFERRED_RELOAD_CLASS
 #define TARGET_PREFERRED_RELOAD_CLASS cris_preferred_reload_class
@@ -1536,8 +1538,15 @@ cris_biap_index_p (const_rtx x, bool strict)
 
 /* Worker function for TARGET_LEGITIMATE_ADDRESS_P.  */
 
+static bool
+cris_legitimate_address_p_hook (machine_mode mode, rtx x, bool strict,
+   code_helper)
+{
+  return cris_legitimate_address_p (mode, x, strict);
+}
+
 bool
-cris_legitimate_address_p (machine_mode mode, rtx x, bool strict, code_helper)
+cris_legitimate_address_p (machine_mode mode, rtx x, bool strict)
 {
   const_rtx x1, x2;
 
-- 
2.30.2



Re: [PATCH 2/3] ivopts: Call valid_mem_ref_p with code_helper [PR110248]

2023-08-14 Thread Hans-Peter Nilsson via Gcc-patches
> Date: Mon, 14 Aug 2023 16:47:40 +0800
> From: "Kewen.Lin via Gcc-patches" 

> on 2023/8/14 15:53, Jan-Benedict Glaw wrote:
> > echo timestamp > s-constrs-h
> > /var/lib/laminar/run/gcc-local/82/local-toolchain-install/bin/g++ 
> > -std=c++11 -c   -g -O2   -DIN_GCC-fno-exceptions -fno-rtti 
> > -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings 
> > -Wcast-qual -Wmissing-format-attribute -Wconditionally-supported 
> > -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros 
> > -Wno-overlength-strings -Werror -fno-common  -DHAVE_CONFIG_H  
> > -DGENERATOR_FILE -I. -Ibuild -I../../gcc/gcc -I../../gcc/gcc/build 
> > -I../../gcc/gcc/../include  -I../../gcc/gcc/../libcpp/include  \
> >  -o build/gencondmd.o build/gencondmd.cc
> > In file included from ../../gcc/gcc/tree.h:23,
> >  from ../../gcc/gcc/recog.h:24,
> >  from build/gencondmd.cc:40:
> > ../../gcc/gcc/tree-core.h:145:10: fatal error: all-tree.def: No such file 
> > or directory
> >   145 | #include "all-tree.def"
> 
> 
> Thanks for reporting and sorry for the breakage.  This failure only gets 
> exposed if
> all-tree.def isn't generated before compiling these gen*.cc including recog.h 
> during the
> build.  It explains why I didn't catch this failure before.  I will check the 
> existing
> practice and post a patch soon.

I entered PR bootstrap/111021 for a similar breakage.  Looks
like I won't be able to work around it for CRIS (as alluded
in the PR) as recog.h was hacked too; not just
${TARGET}-protos.h. :(

Please consider defaulting to NULL or something like that
instead of introducing a tree.h-et-al dependency.

brgds, H-P


[committed] CRIS: Replace unspec CRIS_UNSPEC_SWAP_BITS with rtx bitreverse

2023-07-03 Thread Hans-Peter Nilsson via Gcc-patches
This is just expected to be a change in representation.
No code is expected to change; no new tests are added.

* config/cris/cris.md (CRIS_UNSPEC_SWAP_BITS): Remove.
("cris_swap_bits", "ctzsi2"): Use bitreverse instead.
---
 gcc/config/cris/cris.md | 9 ++---
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/gcc/config/cris/cris.md b/gcc/config/cris/cris.md
index 7504b63dabf3..deb2f0c6b7c7 100644
--- a/gcc/config/cris/cris.md
+++ b/gcc/config/cris/cris.md
@@ -50,9 +50,6 @@ (define_c_enum ""
   [
;; Stack frame deallocation barrier.
CRIS_UNSPEC_FRAME_DEALLOC
-
-   ;; Swap all 32 bits of the operand; 31 <=> 0, 30 <=> 1...
-   CRIS_UNSPEC_SWAP_BITS
   ])
 
 ;; Register numbers.
@@ -2177,8 +2174,7 @@ (define_insn 
"bswapsi2"
 
 (define_insn "cris_swap_bits"
   [(set (match_operand:SI 0 "register_operand" "=r")
-   (unspec:SI [(match_operand:SI 1 "register_operand" "0")]
-  CRIS_UNSPEC_SWAP_BITS))
+   (bitreverse:SI (match_operand:SI 1 "register_operand" "0")))
(clobber (reg:CC CRIS_CC0_REGNUM))]
   "TARGET_HAS_SWAP"
   "swapwbr %0"
@@ -2193,8 +2189,7 @@ (define_expand "ctzsi2"
  (match_operand:SI 1 "register_operand"))
  (clobber (reg:CC CRIS_CC0_REGNUM))])
(parallel
-[(set (match_dup 2)
- (unspec:SI [(match_dup 2)] CRIS_UNSPEC_SWAP_BITS))
+[(set (match_dup 2) (bitreverse:SI (match_dup 2)))
  (clobber (reg:CC CRIS_CC0_REGNUM))])
(parallel
 [(set (match_operand:SI 0 "register_operand")
-- 
2.30.2



[committed] dwarf2out.cc (mem_loc_descriptor): Handle BITREVERSE

2023-07-03 Thread Hans-Peter Nilsson via Gcc-patches
Committed as obvious after regtest for cris-elf together
with the "next" patch, that replaces unspec
CRIS_UNSPEC_SWAP_BITS with bitreverse (which hit the ICE).

-- >8 --
This seems to have just been overlooked when introducing
BITREVERSE.  Note that the function name mem_loc_descriptor
is a misnomer; it'd better be called rtx_loc_descriptor or
any_loc_descriptor, because "anything" RTX can end up here.
To wit, when introducing new RTL that ends up as code or for
other reasons appear in debug expressions, don't forget to
update this function.  This was observed by building
libstdc+++ for cris-elf with a patch replacing the
CRIS_UNSPEC_SWAP_BITS by bitreverse, as hitting the
internal-error-generating default case.

Looking at the BSWAP, POPCOUNT and ROTATE cases, BITREVERSE
can probably be fully expressed as DWARF code if need be,
but let's start with not throwing an internal error.

gcc:
* dwarf2out.cc (mem_loc_descriptor): Handle BITREVERSE.
---
 gcc/dwarf2out.cc | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/dwarf2out.cc b/gcc/dwarf2out.cc
index 9112fc0c64b5..e973644102c0 100644
--- a/gcc/dwarf2out.cc
+++ b/gcc/dwarf2out.cc
@@ -16940,6 +16940,7 @@ mem_loc_descriptor (rtx rtl, machine_mode mode,
 case CLOBBER:
 case SMUL_HIGHPART:
 case UMUL_HIGHPART:
+case BITREVERSE:
   break;
 
 case CONST_STRING:
-- 
2.30.2



PR108672 re-fixed after [PATCH] libstdc++: Synchronize PSTL with upstream

2023-06-29 Thread Hans-Peter Nilsson via Gcc-patches
> Date: Mon, 26 Jun 2023 11:57:49 -0700
> From: Thomas Rodgers via Gcc-patches 

> On Wed, May 17, 2023 at 12:32 PM Jonathan Wakely  wrote:
> > All the actual code changes look good.

Unfortunately, this overwrote the fix for PR108672.  I take
it there's a step missing from the synchronization process;
a check that no local commits are overwritten?  Sounds like
something that can be fully scripted (not volunteering) or
already available (like, "list all commits affecting
contents touched by/between two named commits").

I did *not* check whether any other local commits were also
overwritten.  Also, not sure about whether better try to get
this upstreamed: __INT32_TYPE__ seems gcc-specific.

Anyway, r13-5702-g72058eea9d407e was "re-committed" per
below as obvious after regtesting cris-elf.

brgds, H-P

-- >8 --
Subject: libstdc++: Re-apply PR108672 fix (avoid use of naked int32_t in 
unseq_backend_simd.h)

The fix was overwritten by r14-2109-g3162ca09dbdc2e "libstdc++:
Synchronize PSTL with upstream".

libstdc++-v3:

PR libstdc++/108672
* include/pstl/unseq_backend_simd.h (__simd_or): Re-apply using
__INT32_TYPE__ instead of int32_t.
---
 libstdc++-v3/include/pstl/unseq_backend_simd.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libstdc++-v3/include/pstl/unseq_backend_simd.h 
b/libstdc++-v3/include/pstl/unseq_backend_simd.h
index 69784bcdbe66..f3c38fbbbc2a 100644
--- a/libstdc++-v3/include/pstl/unseq_backend_simd.h
+++ b/libstdc++-v3/include/pstl/unseq_backend_simd.h
@@ -74,7 +74,7 @@ __simd_or(_Index __first, _DifferenceType __n, _Pred __pred) 
noexcept
 const _Index __last = __first + __n;
 while (__last != __first)
 {
-int32_t __flag = 1;
+__INT32_TYPE__ __flag = 1;
 _PSTL_PRAGMA_SIMD_REDUCTION(& : __flag)
 for (_DifferenceType __i = 0; __i < __block_size; ++__i)
 if (__pred(*(__first + __i)))
-- 
2.30.2




[committed] testsuite: check_effective_target_lra: CRIS is LRA

2023-06-28 Thread Hans-Peter Nilsson via Gcc-patches
Left-over from r14-383-gfaf8bea79b6256.

* lib/target-supports.exp (check_effective_target_lra): Remove
cris-*-* from expression for exceptions to LRA.
---
 gcc/testsuite/lib/target-supports.exp | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index 184fafb020f8..bad97d8c26b9 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -12278,7 +12278,7 @@ proc check_effective_target_o_flag_in_section { } {
 # return 1 if LRA is supported.
 
 proc check_effective_target_lra { } {
-if { [istarget hppa*-*-*] || [istarget cris-*-*] || [istarget avr-*-*] } {
+if { [istarget hppa*-*-*] || [istarget avr-*-*] } {
return 0
 }
 return 1
-- 
2.30.2



[committed] CRIS: Don't apply PATTERN to insn before validation (PR 110144)

2023-06-28 Thread Hans-Peter Nilsson via Gcc-patches
Oops.  The validation was there, but PATTERN was applied
before that.  Noticeable only with rtl-checking (for example
as in the report: "--enable-checking=yes,rtl") as this
statement was only a (one of many) straggling olde-C
declare-and-initialize-at-beginning-of-block thing.

PR target/110144
* config/cris/cris.cc (cris_postdbr_cmpelim): Don't apply PATTERN
to insn before validating it.
---
 gcc/config/cris/cris.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/cris/cris.cc b/gcc/config/cris/cris.cc
index 7fca2af085a7..f04f501326e7 100644
--- a/gcc/config/cris/cris.cc
+++ b/gcc/config/cris/cris.cc
@@ -375,7 +375,6 @@ cris_postdbr_cmpelim ()
   for (insn = get_insns (); insn; insn = next)
 {
   rtx_insn *outer_insn = insn;
-  rtx pat = PATTERN (insn);
 
   next = NEXT_INSN (outer_insn);
 
@@ -389,6 +388,7 @@ cris_postdbr_cmpelim ()
 
   if (!NONDEBUG_INSN_P (insn))
continue;
+  rtx pat = PATTERN (insn);
 
   /* Consider filled delay slots; there might be a comparison there.
 It's only the second insn in a sequence that is interesting.  */
-- 
2.30.2



[PATCH] (Re: Splitting up 27_io/basic_istream/ignore/wchar_t/94749.cc (takes too long))

2023-06-09 Thread Hans-Peter Nilsson via Gcc-patches
Thank you for your consideration.  (Or is that phrase only used negatively?)

> From: Jonathan Wakely 
> Date: Fri, 9 Jun 2023 21:40:15 +0100

> test01, test02, test03 and test04 should run almost instantly. On my system
> they take about 5 microseconds each. So I don't think splitting those up
> will help.

Right.

> I thought it would help to avoid re-allocating the buffer and zeroing it
> again. If we reuse the same buffer, then we just have to loop until we
> overflow the 32-bit counter. That would make the whole test run much
> faster, which would reduce the total time for a testsuite run. Splitting
> the file up into smaller files would not decrease the total time, only
> decrease the time for that single test so it doesn't time out.
> 
> I've attached a patch that does that. I makes very little difference for
> me, probably because allocating zero-filled pages isn't actually expensive
> on linux. Maybe it will make a differene for your simulator though?

Nope, just some five seconds down (from about 10min 21s).

> You could also try reducing the size of the buffer:
> +#ifdef SIMULATOR_TEST
> +  static const streamsize bufsz = 16 << limits::digits10;
> +#else
>   static const streamsize bufsz = 2048 << limits::digits10;
> +#endif

Was that supposed to be with or without the patch?  Anyway;
both: 606s.  Only smaller bufsz: 614s.  (All numbers subject
to usual system jitter.)

> test06 is the really slow part, that takes 10+ seconds for me. But that
> entire function should already be skipped for simulators.

Yep, we may have been here before...  I certainly get a
deja-vu feeling here, but visiting old email conversations
of ours, it seems I easily conflate several similar ones.
I see that here, test06 was always #ifndef SIMULATOR_TEST.

> We can probably skip test05 for simulators too, none of the code it tests
> is platform-specific, so as long as it's being tested on x86 we don't
> really need to test it on cris-elf too.

Thanks.  Let's do that, then.  The similar s/wchar_t/char/
test clocks in at "only" 3m30s, but I suggest treating it
the same, if nothing else than for symmetry.

Ok as below?

-- >8 --
Subject: [PATCH] testsuite: Cut down 27_io/basic_istream/.../94749.cc for
 simulators

The test wchar_t/94749.cc can take about 10 minutes on some
simulator/host combinations with char/94749.cc at a third of
that time.  The cause is test05 which is quite heavy and
includes wrapping a 32-bit counter.  Run it only for native
setups.

* testsuite/27_io/basic_istream/ignore/wchar_t/94749.cc (main)
[! SIMULATOR_TEST]: Also exclude running test05.
* testsuite/27_io/basic_istream/ignore/char/94749.cc: Ditto.
---
 libstdc++-v3/testsuite/27_io/basic_istream/ignore/char/94749.cc | 2 +-
 .../testsuite/27_io/basic_istream/ignore/wchar_t/94749.cc   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/testsuite/27_io/basic_istream/ignore/char/94749.cc 
b/libstdc++-v3/testsuite/27_io/basic_istream/ignore/char/94749.cc
index 6416863983b7..9160995c05ec 100644
--- a/libstdc++-v3/testsuite/27_io/basic_istream/ignore/char/94749.cc
+++ b/libstdc++-v3/testsuite/27_io/basic_istream/ignore/char/94749.cc
@@ -221,8 +221,8 @@ main()
   test02();
   test03();
   test04();
-  test05();
 #ifndef SIMULATOR_TEST
+  test05();
   test06();
 #endif
 }
diff --git a/libstdc++-v3/testsuite/27_io/basic_istream/ignore/wchar_t/94749.cc 
b/libstdc++-v3/testsuite/27_io/basic_istream/ignore/wchar_t/94749.cc
index 65e0a326c109..a5b9eb71a389 100644
--- a/libstdc++-v3/testsuite/27_io/basic_istream/ignore/wchar_t/94749.cc
+++ b/libstdc++-v3/testsuite/27_io/basic_istream/ignore/wchar_t/94749.cc
@@ -221,8 +221,8 @@ main()
   test02();
   test03();
   test04();
-  test05();
 #ifndef SIMULATOR_TEST
+  test05();
   test06();
 #endif
 }
-- 
2.30.2



Re: Splitting up 27_io/basic_istream/ignore/wchar_t/94749.cc (takes too long)

2023-06-09 Thread Hans-Peter Nilsson via Gcc-patches
> From: Mike Stump 
> Date: Fri, 9 Jun 2023 10:18:45 -0700

> On Jun 9, 2023, at 9:20 AM, Hans-Peter Nilsson via Gcc-patches 
>  wrote:
> > 
> > The test 27_io/basic_istream/ignore/wchar_t/94749.cc takes
> > about 10 minutes to run for cris-elf in the "gdb simulator"
> 
> I'd let the libstdc++ people comment on specific things.
> I'll comment on general things.  We could let line count
> (or word count or character count) scale the timeout in
> part, we could record times in a db and put an expected
> run time into test cases or in an along side db. We could
> have factors for slow systems, slow simulators. A 5 GHz
> x86_64 will likely be faster that a 40 year old pdp11. We
> can have these scale factors trigger off OS, cpu
> statically, and/or we can do a quick bogomips calculation
> and let that scale it and record that scaling factor in
> the build tree.

Wild plans, but with some points.

Beware that uniform testing IMO weighs in much heavier than
uniform test-time.  Like, arm-eabi, rv32-elf and cris-elf,
having common main factors, should test the same code and
the same number of iterations, preferably regardless of
simulator quality.  (FWIW, I consider the cris-elf gdb
simulator is *fast* - before 2021 or when built
--disable-sim-hardware.)

> A wealth of possibilities.

And complexity!

> Solutions that require maintenance or test case
> modification are annoying.

Yeah, but that maintenance annoyance unfortunately includes
initial setup.  You propose quite a major shift there.  It
sounds good, but sorry, but I must settle for just editing
the test-case some way.

brgds, H-P


Splitting up 27_io/basic_istream/ignore/wchar_t/94749.cc (takes too long)

2023-06-09 Thread Hans-Peter Nilsson via Gcc-patches
Hi!

The test 27_io/basic_istream/ignore/wchar_t/94749.cc takes
about 10 minutes to run for cris-elf in the "gdb simulator"
here on my arguably way-past-retirement machine (and it
looks like it gained a minute with LRA).  I've seen it
timing out every now and then on busy days with load >
`nproc`.  Usually it happens some time after I've forgot
about why. :)

It has had some performance surgery before (pruning for
simulators, doubling timeout for ilp32).  I'd probably just
try cutting along the function boundaries and keep those
parts separate that have >1 min execution time.

Anyway, your thoughts on the matter would be appreciated.

brgds, H-P


Re: [pushed] c++: allow NRV and non-NRV returns [PR58487]

2023-06-08 Thread Hans-Peter Nilsson via Gcc-patches
> Date: Wed,  7 Jun 2023 18:06:15 -0400
> From: Jason Merrill via Gcc-patches 

> Tested x86_64-pc-linux-gnu, applying to trunk.
> 
> -- 8< --
> 
> Now that we support NRV from an inner block, we can also support non-NRV
> returns from other blocks, since once the NRV is out of scope a later return
> expression can't possibly alias it.
> 
> This fixes 58487 and half-fixes 53637: now one of the returns is elided, but
> not the other.
> 
> Fixing the remaining xfails in these testcases will require a very different
> approach, probably involving a full tree/block walk from finalize_nrv, and
> check_return_expr only adding to a list of potential return variables.
> 
>   PR c++/58487
>   PR c++/53637
> 
> gcc/cp/ChangeLog:
> 
>   * cp-tree.h (INIT_EXPR_NRV_P): New.
>   * semantics.cc (finalize_nrv_r): Check it.
>   * name-lookup.h (decl_in_scope_p): Declare.
>   * name-lookup.cc (decl_in_scope_p): New.
>   * typeck.cc (check_return_expr): Allow non-NRV
>   returns if the NRV is no longer in scope.
> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.dg/opt/nrv26.C: New test.
>   * g++.dg/opt/nrv26a.C: New test.
>   * g++.dg/opt/nrv27.C: New test.

This somehow caused 21 regressions for cris-elf in the c++
and libstdc++ testsuites.  I opened PR110185 to hold the
preprocessed g++.dg/cpp2a/spaceship-p1186.C.

brgds, H-P


Re: [PATCH] libstdc++: Use AS_IF in configure.ac

2023-06-07 Thread Hans-Peter Nilsson via Gcc-patches
> Date: Tue, 6 Jun 2023 16:30:12 +0100
> From: Jonathan Wakely via Gcc-patches 

> On Thu, 1 Jun 2023 at 16:59, Jonathan Wakely via Libstdc++ <
> libstd...@gcc.gnu.org> wrote:
> 
> > Tested x86_64-linux. I'd appreciate a second set of eyeballs on this
> > before I push it.
> >
> 
> Pushed to trunk now.

...as r14-1581-g97a5e8a2a48d16, after which (apparently)
*all* linking libstdc++ tests for cris-elf (a "newlib
target") get (for example):

FAIL: 17_intro/freestanding.cc (test for excess errors)
Excess errors:
/x/cris-elf/pre/cris-elf/bin/ld: cannot find -liconv: No such file or directory

(deduced from libstdc++.log and the commits in the range
ce2188e4320c..585c660f041c where 4144 regressions in
libstdc++ were introduced for cris-elf)

>From the generated configure and a brief RTFM for AS_IF, it
looks almost like AS_IF was "miscompiled" and behaving
literally AS_IF (!) in that the condition TEST1 (here
[$GLIBCXX_IS_NATIVE] seems to be emitted *after* the
RUN-IF-TRUE1 clause (the next 31 lines).  Not obvious what
went wrong.  I even tried regenerating configure.  HTH.

brgds, H-P


[committed] bootstrap rtl-checking: Fix XVEC vs XVECEXP in postreload.cc

2023-06-05 Thread Hans-Peter Nilsson via Gcc-patches
Oops.  Sorry.  Committed as obvious.  A bootstrap
--enable-checking=yes,extra,rtl (same as the reporter, but
not the default) with the patch completed, where a bootstrap
without it failed.

-- >8 --
PR bootstrap/110120
* postreload.cc (reload_cse_move2add, move2add_use_add2_insn): Use
XVECEXP, not XEXP, to access first item of a PARALLEL.
---
 gcc/postreload.cc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/postreload.cc b/gcc/postreload.cc
index b479d4b921ba..20e138b4fa8b 100644
--- a/gcc/postreload.cc
+++ b/gcc/postreload.cc
@@ -1801,7 +1801,7 @@ move2add_use_add2_insn (scalar_int_mode mode, rtx reg, 
rtx sym, rtx off,
 naked SET, or else its single_set is the first element
 in a PARALLEL.  */
  rtx *setloc = GET_CODE (PATTERN (insn)) == PARALLEL
-   ?  (PATTERN (insn), 0) :  (insn);
+   ?  (PATTERN (insn), 0, 0) :  (insn);
  if (*setloc == set && costs_lt_p (, , speed))
{
  changed = validate_change (insn, setloc, new_set, 0);
@@ -2027,7 +2027,7 @@ reload_cse_move2add (rtx_insn *first)
  costs_add_n_insns (, 1);
 
  rtx *setloc = GET_CODE (PATTERN (next)) == PARALLEL
-   ?  (PATTERN (next), 0) :  (next);
+   ?  (PATTERN (next), 0, 0) :  (next);
  if (*setloc == set
  && costs_lt_p (, , speed)
  && have_add2_insn (reg, new_src))
-- 
2.30.2



Re: Build-break in libstdc++-v3 at r14-1442-ge1240bda3e0bb1 for non-float128 targets

2023-05-31 Thread Hans-Peter Nilsson via Gcc-patches
> From: Jonathan Wakely 
> Date: Wed, 31 May 2023 21:06:16 +0100
> On Wed, 31 May 2023 at 16:32, Jonathan Wakely  wrote:
> > On Wed, 31 May 2023 at 16:29, Hans-Peter Nilsson via Libstdc++ <
> > libstd...@gcc.gnu.org> wrote:
> >
> >> Since I don't see a quick fix at r14-1444-g3f4853a5f00fab, I
> >> thought I'd better notify the author (I have written authors
> >> if there was more than one ;-) of suspect commits in the
> >> range r14-1425-g80ee7d02e8db48..e1240bda3e0b for the
> >> build-break at r14-1442-ge1240bda3e0bb1 for cris-elf, where
> >> I get:
> >>
> >> /x/gcc/libstdc++-v3/src/c++17/floating_from_chars.cc:1330:47: error:
> >> '_Float128' is not supported on this target
> >>  1330 | from_chars(const char* first, const char* last, _Float128& value,
> >>   |   ^
> >>
> >
> > Sorry, I'll fix or revert it today.
> >
> 
> It should be fixed at  r14-1451-ga239a35075ffd8

JFTR: confirmed at r14-1451-ga239a35075ffd8, thanks!

brgds, H-P


Build-break in libstdc++-v3 at r14-1442-ge1240bda3e0bb1 for non-float128 targets

2023-05-31 Thread Hans-Peter Nilsson via Gcc-patches
Since I don't see a quick fix at r14-1444-g3f4853a5f00fab, I
thought I'd better notify the author (I have written authors
if there was more than one ;-) of suspect commits in the
range r14-1425-g80ee7d02e8db48..e1240bda3e0b for the
build-break at r14-1442-ge1240bda3e0bb1 for cris-elf, where
I get:

/x/gcc/libstdc++-v3/src/c++17/floating_from_chars.cc:1330:47: error: 
'_Float128' is not supported on this target
 1330 | from_chars(const char* first, const char* last, _Float128& value,
  |   ^
/x/gcc/libstdc++-v3/src/c++17/floating_from_chars.cc:1330:49: error: expected 
identifier before '_Float128'
 1330 | from_chars(const char* first, const char* last, _Float128& value,
  | ^
/x/gcc/libstdc++-v3/src/c++17/floating_from_chars.cc:1330:49: error: 
'_Float128' is not supported on this target
/x/gcc/libstdc++-v3/src/c++17/floating_from_chars.cc:1330:49: error: expected 
',' or '...' before '_Float128'
/x/gcc/libstdc++-v3/src/c++17/floating_from_chars.cc: In function 
'std::from_chars_result std::from_chars(const char*, const char*, int)':
/x/gcc/libstdc++-v3/src/c++17/floating_from_chars.cc:1340:53: error: 'fmt' was 
not declared in this scope; did you mean 'fma'?
 1340 |   auto res = std::from_chars(first, last, ldbl_val, fmt);
  | ^~~
  | fma
/x/gcc/libstdc++-v3/src/c++17/floating_from_chars.cc:1342:5: error: 'value' was 
not declared in this scope
 1342 | value = ldbl_val;
  | ^
make[5]: *** [Makefile:587: floating_from_chars.lo] Error 1

brgds, H-P


[PATCH] reload_cse_move2add: Handle trivial single_set:s

2023-05-31 Thread Hans-Peter Nilsson via Gcc-patches
Tested cris-elf, bootstrapped & checked native
x86_64-pc-linux-gnu for good measure.  Ok to commit?

If it wasn't for there already being an auto_inc_dec pass,
this looks like a good place to put it, considering the
framework data.  (BTW, current auto-inc-dec generation is so
poor that you can replace half of what auto_inc_dec does
with a few peephole2s.)

brgds, H-P

-- >8 --
The reload_cse_move2add part of "postreload" handled only
insns whose PATTERN was a SET.  That excludes insns that
e.g. clobber a flags register, which it does only for
"simplicity".  The patch extends the "simplicity" to most
single_set insns.  For a subset of those insns there's still
an assumption; that the single_set of a PARALLEL insn is the
first element in the PARALLEL.  If the assumption fails,
it's no biggie; the optimization just isn't performed.
Don't let the name deceive you, this optimization doesn't
hit often, but as often (or as rarely) for LRA as for reload
at least on e.g. cris-elf where the biggest effect was seen
in reducing repeated addresses in copies from fixed-address
arrays, like in gcc.c-torture/compile/pr78694.c.

* postreload.cc (move2add_use_add2_insn): Handle
trivial single_sets.  Rename variable PAT to SET.
(move2add_use_add3_insn, reload_cse_move2add): Similar.
---
 gcc/postreload.cc | 67 +++
 1 file changed, 38 insertions(+), 29 deletions(-)

diff --git a/gcc/postreload.cc b/gcc/postreload.cc
index fb392651e1b6..996206f589d3 100644
--- a/gcc/postreload.cc
+++ b/gcc/postreload.cc
@@ -1744,8 +1744,8 @@ static bool
 move2add_use_add2_insn (scalar_int_mode mode, rtx reg, rtx sym, rtx off,
rtx_insn *insn)
 {
-  rtx pat = PATTERN (insn);
-  rtx src = SET_SRC (pat);
+  rtx set = single_set (insn);
+  rtx src = SET_SRC (set);
   int regno = REGNO (reg);
   rtx new_src = gen_int_mode (UINTVAL (off) - reg_offset[regno], mode);
   bool speed = optimize_bb_for_speed_p (BLOCK_FOR_INSN (insn));
@@ -1764,21 +1764,21 @@ move2add_use_add2_insn (scalar_int_mode mode, rtx reg, 
rtx sym, rtx off,
 (reg)), would be discarded.  Maybe we should
 try a truncMN pattern?  */
   if (INTVAL (off) == reg_offset [regno])
-   changed = validate_change (insn, _SRC (pat), reg, 0);
+   changed = validate_change (insn, _SRC (set), reg, 0);
 }
   else
 {
   struct full_rtx_costs oldcst, newcst;
   rtx tem = gen_rtx_PLUS (mode, reg, new_src);
 
-  get_full_set_rtx_cost (pat, );
-  SET_SRC (pat) = tem;
-  get_full_set_rtx_cost (pat, );
-  SET_SRC (pat) = src;
+  get_full_set_rtx_cost (set, );
+  SET_SRC (set) = tem;
+  get_full_set_rtx_cost (set, );
+  SET_SRC (set) = src;
 
   if (costs_lt_p (, , speed)
  && have_add2_insn (reg, new_src))
-   changed = validate_change (insn, _SRC (pat), tem, 0);   
+   changed = validate_change (insn, _SRC (set), tem, 0);
   else if (sym == NULL_RTX && mode != BImode)
{
  scalar_int_mode narrow_mode;
@@ -1796,10 +1796,15 @@ move2add_use_add2_insn (scalar_int_mode mode, rtx reg, 
rtx sym, rtx off,
narrow_reg),
   narrow_src);
  get_full_set_rtx_cost (new_set, );
- if (costs_lt_p (, , speed))
+
+ /* We perform this replacement only if NEXT is either a
+naked SET, or else its single_set is the first element
+in a PARALLEL.  */
+ rtx *setloc = GET_CODE (PATTERN (insn)) == PARALLEL
+   ?  (PATTERN (insn), 0) :  (insn);
+ if (*setloc == set && costs_lt_p (, , speed))
{
- changed = validate_change (insn,  (insn),
-new_set, 0);
+ changed = validate_change (insn, setloc, new_set, 0);
  if (changed)
break;
}
@@ -1825,8 +1830,8 @@ static bool
 move2add_use_add3_insn (scalar_int_mode mode, rtx reg, rtx sym, rtx off,
rtx_insn *insn)
 {
-  rtx pat = PATTERN (insn);
-  rtx src = SET_SRC (pat);
+  rtx set = single_set (insn);
+  rtx src = SET_SRC (set);
   int regno = REGNO (reg);
   int min_regno = 0;
   bool speed = optimize_bb_for_speed_p (BLOCK_FOR_INSN (insn));
@@ -1836,10 +1841,10 @@ move2add_use_add3_insn (scalar_int_mode mode, rtx reg, 
rtx sym, rtx off,
   rtx plus_expr;
 
   init_costs_to_max ();
-  get_full_set_rtx_cost (pat, );
+  get_full_set_rtx_cost (set, );
 
   plus_expr = gen_rtx_PLUS (GET_MODE (reg), reg, const0_rtx);
-  SET_SRC (pat) = plus_expr;
+  SET_SRC (set) = plus_expr;
 
   for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
 if (move2add_valid_value_p (i, mode)
@@ -1864,7 +1869,7 @@ move2add_use_add3_insn (scalar_int_mode mode, rtx reg, 
rtx sym, rtx off,
else
 

Re: [committed] Convert xstormy16 to LRA

2023-05-12 Thread Hans-Peter Nilsson via Gcc-patches
> From: Hans-Peter Nilsson 
> Date: Sat, 13 May 2023 02:56:39 +0200
> 
> > From: "Roger Sayle" 
> > Date: Fri, 12 May 2023 15:04:03 +0100
> 
> > Hi H-P,
> > This patch should now already be on trunk:
> > https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=d8a6945c6ea22efa4d5e42fe1922d2
> > b27953c8cd
> > Many thanks to Jeff for the review/approval.
> > There have been no reported adverse effects so far.
> > Please let me/us know if this has helped CRIS.
> 
> TL;DR:
> It sure helped for the big offender; arith-rand-ll with LRA
> is now 1.3% faster!  Not everything is rosy though.

Coremark regressed slightly for LRA, cris-elf, -O2
-march=v10; with/without the patch: 100.0014% by speed,
100.037% by size.

brgds, H-P



Re: [committed] Convert xstormy16 to LRA

2023-05-12 Thread Hans-Peter Nilsson via Gcc-patches
> From: "Roger Sayle" 
> Date: Fri, 12 May 2023 15:04:03 +0100

> Hi H-P,
> This patch should now already be on trunk:
> https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=d8a6945c6ea22efa4d5e42fe1922d2
> b27953c8cd
> Many thanks to Jeff for the review/approval.
> There have been no reported adverse effects so far.
> Please let me/us know if this has helped CRIS.

TL;DR:
It sure helped for the big offender; arith-rand-ll with LRA
is now 1.3% faster!  Not everything is rosy though.

It's not a complete win for CRIS with LRA, as other of my
mini-regression tests (like, a trivial loop with 0<=i<=10
fprintf (f, "%s: %d\n", "Hello, world", i); with newlib),
where it got 0.04% larger.  On average though, 0.5% smaller.
Actually, it's more accurate to state it in terms of the
code that changed: newlib's _vfprintf_r improved slightly
performance-wise on some paths (1 cycle per call), but
regressed on others (7 cycles per call).  It got 0.2%
bigger.  Libgcc's __moddi3 regressed slightly (1 cycle per
11 calls) but stayed the same size.  The main function in
arith-rand-ll (where I guess all small functions got
inlined) improved 3% by size, performance as above.

Those clobbers must have helped reload, because with reload
there was a 0.15% performance regression for arith-rand-ll
(and 0.2% by size because main got 1.6% larger) with this
patch.  But size was smaller or equal for all other tests
using reload with this patch, if only by 0.04% on average.
 Or in other numbers: _vfprintf_r got 4 cycles faster per
call on some paths and 2 on others.  It got 0.05% bigger.  A
similar function, __vfiprintf_r, shrank by 0.5%.  Go figure.
For all but arith-rand-ll, this patch was a wash for reload.
 (Yes: the case with reload is now artificial; for that, I
was using master before Jeff's commit and added my own
commits since then, to keep track of my fight against LRA
regressions; the current score is that code is still 0.34%
slower and 0.17% larger with this applied just for LRA.)

Can we ask Vladimir for a statement; perhaps LRA could just
have handled those clobbers better (referring to the commit)?

brgds, H-P


Re: [committed] Convert xstormy16 to LRA

2023-05-12 Thread Hans-Peter Nilsson via Gcc-patches
> From: Hans-Peter Nilsson 
> Date: Fri, 12 May 2023 15:53:49 +0200

> Anyway, Roger mentioned that the clobbers emitted by the
> lower-subreg passes were apparently damaging, so I'll try
> this out "for fun", on the assumption that they're actually
> unnecessary.  I don't think actually removing them has been
> attempted?

> --- lower-subreg.cc.orig  2023-04-29 02:53:39.0 +0200
> +++ lower-subreg.cc   2023-05-12 15:35:25.574668930 +0200

Bah, then I noticed r14-554-gd8a6945c6ea22e, committed
several days ago.  I should have checked up-to-date sources...
Thanks Roger!

Now off to measure the impact.  Maybe up to par with reload
now? :)

brgds, H-P


Re: [committed] Convert xstormy16 to LRA

2023-05-12 Thread Hans-Peter Nilsson via Gcc-patches
> From: Hans-Peter Nilsson 
> Date: Thu, 11 May 2023 17:05:40 +0200

> Next, I'll turn around completely, and try defaulting to
> -fsplit-wide-types-early, which sounds more promising. :)
> I don't like throwing defaults around randomly, but trying
> out a promising idea this way is easy.

Absolutely nothing changed (not counting now running
"subreg2" and generating a dump-file), compared to the
default.  Besides coremark and local micro-benchmarks I
inspected running arith-rand-ll.c with -O2 and briefly
stepped through the passes with gdb: the costs guiding the
splits are fine, properly enabling the splits, but not all
DImode registers are naturally "splittable"; looks like the
ones used in non-decomposable operations remain.  It seems
all splittable opportunities are dealt with by the first
pass ("subreg1").  I guess this pass has the most impact for
targets that have few or no DImode operations at all.

But why is the option called -fsplit-wide-types-early when
what it does is enabling a "subreg2" pass, there being
"subreg1" and "subreg3" enabled with -fsplit-wide-types?  It
should rather be called -fsplit-wide-types-second! :)

Looking at its placement in passes.def makes me wonder what
magic properties targets have that benefit from it.

Anyway, Roger mentioned that the clobbers emitted by the
lower-subreg passes were apparently damaging, so I'll try
this out "for fun", on the assumption that they're actually
unnecessary.  I don't think actually removing them has been
attempted?

The patch below seems to substantially lower register
pressure for arith-rand-ll for CRIS, but I've only inspected
the assembly source (not even compared the result to the
reload version).  Quoting it for reference only, and if it
"works" (passes regtest for cris-elf and x86-64-linux) I
think I'll resubmit as a proper patch:

--- lower-subreg.cc.orig2023-04-29 02:53:39.0 +0200
+++ lower-subreg.cc 2023-05-12 15:35:25.574668930 +0200
@@ -1086,9 +1086,6 @@ resolve_simple_move (rtx set, rtx_insn *
 {
   unsigned int i;
 
-  if (REG_P (dest) && !HARD_REGISTER_NUM_P (REGNO (dest)))
-   emit_clobber (dest);
-
   for (i = 0; i < words; ++i)
{
  rtx t = simplify_gen_subreg_concatn (word_mode, dest,

brgds, H-P


Re: [committed] Convert xstormy16 to LRA

2023-05-11 Thread Hans-Peter Nilsson via Gcc-patches
> Date: Thu, 11 May 2023 12:15:20 -0600
> From: Jeff Law 

> On 5/11/23 10:55, Paul Koning wrote:
> > 
> > 
> >> On May 11, 2023, at 11:05 AM, Hans-Peter Nilsson via Gcc-patches 
> >>  wrote:
> >>
> >> ...
> >> Yes, very interesting.  Thank you for sharing this.  I've
> >> seen regressions with LRA for CRIS too, for
> >> "double-register-sized" types, which for CRIS, a 32-bit
> >> target, translates to 64-bit types (DFmode and DImode), and
> >> where LRA does a much worse job than reload; spills a lot
> >> more often to stack, even after trying every
> >> register-allocation-related hook I found (and also an LRA
> >> patch which helped only by a fraction, but regressed results
> >> on x86_64-linux, so let's quickly forget it again).
> > 
> > That observation makes me a bit worried.  While CRIS may not be a priority 
> > platform, that description makes it sound like a case that would be 
> > significant in any 32 bit platform, which would include priority ones like 
> > i386 and ARM.
> If I understood things correctly, it seems to impact more when the 
> target exposes double-word patterns but doesn't actually have 
> instructions for those operations.  That's an implementation pattern 
> we've largely been moving away from over the last decade or so.

That description doesn't really match CRIS though.  The "ax"
prefix used in DImode patterns links the next instruction to
include the carry.  Thus better than an "open-coded"
version.  In comparison, CRIS doesn't define separable
patterns (anddi3, iordi3 etc.)

But, there's a movdi expander and splitter - with a long
reload-related comment.  Perhaps I can do away with that
even though having some arithmetic and compares in DImode.
Thanks for the hint.

brgds, H-P


Re: [committed] Convert xstormy16 to LRA

2023-05-11 Thread Hans-Peter Nilsson via Gcc-patches
> From: "Roger Sayle" 
> Date: Tue, 2 May 2023 00:37:14 +0100

> Jeff Law wrote:
> > This patch converts the xstormy16 patch to LRA.  It introduces a code 
> > quality regression in the shiftsi testcase, but it also fixes numerous 
> > aborts/errors.  IMHO it's a good tradeoff.
> 
> I've investigated the shiftsi regression on xstormy16 and the underlying
> cause
> appears to be an interaction between lower-subreg's "subreg3" pass and the
> new LRA.  Previously, reload was not phased by the "clobbers" that are 
> introduced by the decompose_multiword_subregs function, but they appear
> to interfere with LRA's register assignments.
> 
> combine's make_extra_copies introduces a new pseudo-to-pseudo move,
> but when subreg3 inserts a naked clobber between the original and the
> new move, LRA is recombine theses pseudos back to the same allocno.
> 
> The shiftsi.cc regression on xstormy16 is fixed by adding
> -fno-split-wide-types.
> In fact, if all the regression tests pass, I'd suggest that
> flag_split_wide-types = false
> should be the default on xstormy16 now that we've moved to LRA.  And if this
> works for xstormy16, it might be useful to other targets for the LRA
> transition;
> it's a difference in behaviour between reload and LRA that could potentially
> affect multiple targets.
> 
> For reference, xstormy16 has a post-reload define_insn_and_split for movsi
> (i.e. a multi-word move).  If this insn was split during split1 (i.e. before
> subreg3)
> there wouldn't be a problem (no clobber), but alas the target's
> xstormy16_split_move
> function has several asserts insisting this only get called when
> reload_completed.
> 
> I hope this is useful.
> Cheers,
> Roger

Yes, very interesting.  Thank you for sharing this.  I've
seen regressions with LRA for CRIS too, for
"double-register-sized" types, which for CRIS, a 32-bit
target, translates to 64-bit types (DFmode and DImode), and
where LRA does a much worse job than reload; spills a lot
more often to stack, even after trying every
register-allocation-related hook I found (and also an LRA
patch which helped only by a fraction, but regressed results
on x86_64-linux, so let's quickly forget it again).

No fix or nicely stated bug entry yet, but at least a
different observation:

Coremark for cris-elf built with -O2 -march=v10, when going
from reload to LRA is slightly faster but a bit bigger (for
example before/after Jeffs r14-383-gfaf8bea79b6256, 5090593
to 5090567 cycles and 48887 to 48901 bytes), a relative
observation which has not changed much since February when I
started working on an LRA transition for CRIS.

But, the case for code with heavy use of "double-register-
sized" types is much worse; up to several percent slower.
My favorite sharable example is
gcc/testsuite/gcc.c-torture/execute/arith-rand-ll.c
(with a few unimportant local tweaks not suitable for
upstreaming but which I'm happy to share with anyone asking)
which around that commit goes from 1295021 to 1317531 cycles
(101.74%) and one percent larger; 4008 to 4048 bytes.

Your suggestion to default to -fno-split-wide-types seemed
too good to be true, and though worth a try, unfortunately
it was.  I'm seeing *horrible* regressions for
double-register codes with the patch below on top of LRA.
Coremark numbers suffer too (different baseline here than
above; closer to today's sources) from 5078989 to 5081968
cycles and from 48537 to 50145 bytes.

But, arith-rand-ll suffers much more: from 1317530 to
2182080 cycles (yes, 165.62%) and from 4044 to 4174 bytes.
(With reload, it's bad too, but "only" regressing 143.67% by
speed.)

Next, I'll turn around completely, and try defaulting to
-fsplit-wide-types-early, which sounds more promising. :)
I don't like throwing defaults around randomly, but trying
out a promising idea this way is easy.

So because of the numbers above, this will never be
committed, just passed for reference.  I believe this is the
correct way to default to -fno-split-wide-types:

-- >8 --
[PATCH] CRIS: Default to -fno-split-wide-types

* common/config/cris/cris-common.cc (cris_option_optimization_table):
New.  Default to -fno-split-wide-types.
(TARGET_OPTION_OPTIMIZATION_TABLE): Define.
---
 gcc/common/config/cris/cris-common.cc | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/gcc/common/config/cris/cris-common.cc 
b/gcc/common/config/cris/cris-common.cc
index b08d6014102d..cf00c1414651 100644
--- a/gcc/common/config/cris/cris-common.cc
+++ b/gcc/common/config/cris/cris-common.cc
@@ -26,6 +26,14 @@ along with GCC; see the file COPYING3.  If not see
 #include "opts.h"
 #include "flags.h"
 
+/* Implement TARGET_OPTION_OPTIMIZATION_TABLE.  */
+
+static const struct default_options cris_option_optimization_table[] =
+  {
+{ OPT_LEVELS_1_PLUS, OPT_fsplit_wide_types, NULL, 0 },
+{ OPT_LEVELS_NONE, 0, NULL, 0 }
+  };
+
 /* TARGET_HANDLE_OPTION worker.  We just store the values into local
variables here.  Checks for correct 

[committed] CRIS: Fix ccmode typo in cris_postdbr_cmpelim

2023-05-09 Thread Hans-Peter Nilsson via Gcc-patches
Typo spotted while doing CCmode improvements, as a missed
optimization.  It's almost visible from the patch context;
there's not much done in terms of "mode-adjustment" when
replacing (reg:CC CRIS_CC0_REGNUM) with a copy!
This bug affects functions in the newlib printf-formatting
functions (nothing else in libgcc or newlib libc), with the
performance impact on coremark scores being less than 1e-6
(3/5078992 cycles, 6/48543 bytes).

* config/cris/cris.cc (cris_postdbr_cmpelim): Correct mode
of modeadjusted_dccr.
---
 gcc/config/cris/cris.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/cris/cris.cc b/gcc/config/cris/cris.cc
index 05dead9c0778..5b4cc4e204eb 100644
--- a/gcc/config/cris/cris.cc
+++ b/gcc/config/cris/cris.cc
@@ -432,7 +432,7 @@ cris_postdbr_cmpelim ()
  machine_mode ccmode = GET_MODE (src);
  rtx modeadjusted_dccr
= (ccmode == CCmode ? dccr
-  : gen_rtx_REG (CCmode, CRIS_CC0_REGNUM));
+  : gen_rtx_REG (ccmode, CRIS_CC0_REGNUM));
  rtx compare
/* We don't need to copy_rtx pat: we're going to
   delete that insn. */
-- 
2.30.2



[committed] CRIS: peephole2 an add into two addq or subq

2023-05-05 Thread Hans-Peter Nilsson via Gcc-patches
Unfortunately, doesn't cause a performance improvement for coremark,
but happens a few times in newlib, just enough to affect coremark
0.01% by size (or 4 bytes, and three cycles (__fwalk_sglue and
__vfiprintf_r each two bytes).

gcc:
* config/cris/cris.md (splitop): Add PLUS.
* config/cris/cris.cc (cris_split_constant): Also handle
PLUS when a split into two insns may be useful.

gcc/testsuite:
* gcc.target/cris/peep2-addsplit1.c: New test.
---
 gcc/config/cris/cris.cc   | 25 +++-
 gcc/config/cris/cris.md   |  6 +-
 .../gcc.target/cris/peep2-addsplit1.c | 59 +++
 3 files changed, 88 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/cris/peep2-addsplit1.c

diff --git a/gcc/config/cris/cris.cc b/gcc/config/cris/cris.cc
index 331f5908a538..561ca1b3fa92 100644
--- a/gcc/config/cris/cris.cc
+++ b/gcc/config/cris/cris.cc
@@ -2642,7 +2642,30 @@ cris_split_constant (HOST_WIDE_INT wval, enum rtx_code 
code,
   int32_t ival = (int32_t) wval;
   uint32_t uval = (uint32_t) wval;
 
-  if (code != AND || IN_RANGE(ival, -32, 31)
+  /* Can we do with two addq or two subq, improving chances of filling a
+ delay-slot?  At worst, we break even, both performance and
+ size-wise.  */
+  if (code == PLUS
+  && (IN_RANGE (ival, -63 * 2, -63 - 1)
+ || IN_RANGE (ival, 63 + 1, 63 * 2)))
+{
+  if (generate)
+   {
+ int sign = ival < 0 ? -1 : 1;
+ int aval = abs (ival);
+
+ if (mode != SImode)
+   {
+ dest = gen_rtx_REG (SImode, REGNO (dest));
+ op = gen_rtx_REG (SImode, REGNO (op));
+   }
+ emit_insn (gen_addsi3 (dest, op, GEN_INT (63 * sign)));
+ emit_insn (gen_addsi3 (dest, op, GEN_INT ((aval - 63) * sign)));
+   }
+  return 2;
+}
+
+  if (code != AND || IN_RANGE (ival, -32, 31)
   /* Implemented using movu.[bw] elsewhere.  */
   || ival == 255 || ival == 65535
   /* Implemented using clear.[bw] elsewhere.  */
diff --git a/gcc/config/cris/cris.md b/gcc/config/cris/cris.md
index 53fc2f2de4af..243d47748b78 100644
--- a/gcc/config/cris/cris.md
+++ b/gcc/config/cris/cris.md
@@ -209,7 +209,7 @@ (define_code_iterator plusminusumin [plus minus umin])
 (define_code_iterator plusumin [plus umin])
 
 ;; For opsplit1.
-(define_code_iterator splitop [and])
+(define_code_iterator splitop [and plus])
 
 ;; The addsubbo and nd code-attributes form a hack.  We need to output
 ;; "addu.b", "subu.b" but "bound.b" (no "u"-suffix) which means we'd
@@ -2984,6 +2984,10 @@ (define_peephole2 ; movandsplit1
 
 ;; Large (read: non-quick) numbers can sometimes be AND:ed by other means.
 ;; Testcase: gcc.target/cris/peep2-andsplit1.c
+;; 
+;; Another case is add N,rx with -126..-64,64..126: it has the same
+;; size and execution time as two addq or subq, but addq and subq can fill
+;; a delay-slot.
 (define_peephole2 ; opsplit1
   [(parallel
 [(set (match_operand 0 "register_operand")
diff --git a/gcc/testsuite/gcc.target/cris/peep2-addsplit1.c 
b/gcc/testsuite/gcc.target/cris/peep2-addsplit1.c
new file mode 100644
index ..7dff1d8c77c7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/cris/peep2-addsplit1.c
@@ -0,0 +1,52 @@
+/* Check that "opsplit1" with PLUS does its job.  */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-leading-underscore" } */
+/* { dg-final { check-function-bodies "**" "" "" } } */
+
+int addsi (int x)
+{
+  return x + 64;
+}
+
+char addqi (char x)
+{
+  return x + 126;
+}
+
+short addhi (short x)
+{
+  return x - 64;
+}
+
+unsigned short addhi2 (short x)
+{
+  return x - 126;
+}
+
+/*
+** addsi:
+** addq 63,.r10
+** ret
+** addq 1,.r10
+*/
+
+/*
+** addqi:
+** addq 63,.r10
+** ret
+** addq 63,.r10
+*/
+
+/*
+** addhi:
+** subq 63,.r10
+** ret
+** subq 1,.r10
+*/
+
+/*
+** addhi2:
+** subq 63,.r10
+** ret
+** subq 63,.r10
+*/
-- 
2.30.2



[committed] CRIS: peephole2 a move of constant followed by and of same register

2023-05-05 Thread Hans-Peter Nilsson via Gcc-patches
While moves of constants into registers are separately
optimizable, a combination of a move with a subsequent "and"
is slightly preferable even if the move can be generated
with the same number (and timing) of insns, as moves of
"just" registers are eliminated now and then in different
passes, loosely speaking.  This movandsplit1 pattern feeds
into the opsplit1/AND peephole2, with matching occurrences
observed in the floating point functions in libgcc.  Also, a
test-case to fit.  Coremark improvements are unimpressive:
less than 0.0003% speed, 0.1% size.

But that was pre-LRA; after the switch to LRA this peephole2
doesn't match anymore (for any of coremark, local tests,
libgcc and newlib libc) and the test-case passes with and
without the patch.  Still, there's no apparent reason why
LRA prefers "move R1,R2" "and I,R2" to "move I,R1" "and
R1,R2", or why that wouldn't "randomly" change (also seen
with other operations than "and").  Thus committed.

gcc:
* config/cris/cris.md (movandsplit1): New define_peephole2.

gcc/testsuite:
* gcc.target/cris/peep2-movandsplit1.c: New test.
---
 gcc/config/cris/cris.md   | 38 +++
 .../gcc.target/cris/peep2-movandsplit1.c  | 17 +
 2 files changed, 55 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/cris/peep2-movandsplit1.c

diff --git a/gcc/config/cris/cris.md b/gcc/config/cris/cris.md
index d5aadf752e86..53fc2f2de4af 100644
--- a/gcc/config/cris/cris.md
+++ b/gcc/config/cris/cris.md
@@ -2944,6 +2944,44 @@ (define_peephole2 ; andqu
   operands[4] = GEN_INT (trunc_int_for_mode (INTVAL (operands[1]), QImode));
 })
 
+;; Somewhat similar to andqu, but a different range and expansion,
+;; intended to feed the output into opsplit1 with AND:
+;;  move.d 0x7,$r10
+;;  and.d $r11,$r10
+;; into:
+;;  move.d $r11,$r10
+;;  and.d 0x7,$r10
+;; which opsplit1/AND will change into:
+;;  move.d $r11,$r10 (unaffected by opsplit1/AND; shown only for context)
+;;  lslq 13,$r10
+;;  lsrq 13,$r10
+;; thereby winning in space, but in time only if the 0x7 happened to
+;; be unaligned in the code.
+(define_peephole2 ; movandsplit1
+  [(parallel
+[(set (match_operand 0 "register_operand")
+ (match_operand 1 "const_int_operand"))
+ (clobber (reg:CC CRIS_CC0_REGNUM))])
+   (parallel
+[(set (match_operand 2 "register_operand")
+ (and (match_operand 3 "register_operand")
+  (match_operand 4 "register_operand")))
+ (clobber (reg:CC CRIS_CC0_REGNUM))])]
+  "REGNO (operands[0]) == REGNO (operands[2])
+   && REGNO (operands[0]) == REGNO (operands[3])
+   && cris_splittable_constant_p (INTVAL (operands[1]), AND,
+ GET_MODE (operands[2]),
+ optimize_function_for_speed_p (cfun))"
+  [(parallel
+[(set (match_dup 2) (match_dup 4))
+ (clobber (reg:CC CRIS_CC0_REGNUM))])
+   (parallel
+[(set (match_dup 2) (match_dup 5))
+ (clobber (reg:CC CRIS_CC0_REGNUM))])]
+{
+  operands[5] = gen_rtx_AND (GET_MODE (operands[2]), operands[2], operands[1]);
+})
+
 ;; Large (read: non-quick) numbers can sometimes be AND:ed by other means.
 ;; Testcase: gcc.target/cris/peep2-andsplit1.c
 (define_peephole2 ; opsplit1
diff --git a/gcc/testsuite/gcc.target/cris/peep2-movandsplit1.c 
b/gcc/testsuite/gcc.target/cris/peep2-movandsplit1.c
new file mode 100644
index ..e4a860d966e2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/cris/peep2-movandsplit1.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-final { scan-assembler-times "lsrq " 2 } } */
+/* { dg-final { scan-assembler-times "lslq " 2 } } */
+/* { dg-final { scan-assembler-times "move.d \\\$r11,\\\$r10" 2 } } */
+/* { dg-final { scan-assembler-times "\tmov" 2 } } */
+/* { dg-final { scan-assembler-not "\tand" } } */
+/* { dg-options "-O2" } */
+
+unsigned int xmovandr (unsigned int y, unsigned int x)
+{
+  return x & 0x7;
+}
+
+unsigned int xmovandl (unsigned int y, unsigned int x)
+{
+  return x & 0xfffe;
+}
-- 
2.30.2



[committed] CRIS: peephole2 a lsrq into a lslq+lsrq pair

2023-05-05 Thread Hans-Peter Nilsson via Gcc-patches
Observed after opsplit1 with AND in libgcc floating-point
functions, like the first spottings of opsplit1/AND
opportunities.  Two patterns are nominally needed, as the
peephole2 optimizer continues from the *first replacement*
insn, not from a minimum context for general matching; one
that includes it as the last match.

But, the "free-standing" opportunity (three shifts) didn't
match by itself in a gcc build of libraries plus running the
test-suite, and thus deemed uninteresting and left out.
(As expected; if it had matched, that'd have indicated a
previously missed optimization or other problem elsewhere.)
Only the one that includes the previous define_peephole2
that may generate the sequence (i.e. opsplit1/AND), matches
easily.

Coremark results aren't impressive though: 0.003%
improvement in speed and slightly less than 0.1% in size.

A testcase is added to match and another one to cover a case
of movulsr checking that it's used; it's preferable to
lsrandsplit when both would match.

gcc:
* config/cris/cris.md (lsrandsplit1): New define_peephole2.

gcc/testsuite:
* gcc.target/cris/peep2-lsrandsplit1.c,
gcc.target/cris/peep2-movulsr2.c: New tests.
---
 gcc/config/cris/cris.md   | 53 +++
 .../gcc.target/cris/peep2-lsrandsplit1.c  | 19 +++
 .../gcc.target/cris/peep2-movulsr2.c  | 19 +++
 3 files changed, 91 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/cris/peep2-lsrandsplit1.c
 create mode 100644 gcc/testsuite/gcc.target/cris/peep2-movulsr2.c

diff --git a/gcc/config/cris/cris.md b/gcc/config/cris/cris.md
index e72943b942e5..d5aadf752e86 100644
--- a/gcc/config/cris/cris.md
+++ b/gcc/config/cris/cris.md
@@ -2690,6 +2690,59 @@ (define_peephole2 ; movulsr
 = INTVAL (operands[2]) <= 0xff ? GEN_INT (0xff) :  GEN_INT (0x);
 })
 
+;; Avoid, after opsplit1 with AND (below), sequences of:
+;;  lsrq N,R
+;;  lslq M,R
+;;  lsrq M,R
+;; (N < M), where we can fold the first lsrq into the lslq-lsrq, like:
+;;  lslq M-N,R
+;;  lsrq M,R
+;; We have to match this before opsplit1 below and before other peephole2s of
+;; lesser value, since peephole2 matching resumes at the first generated insn,
+;; and thus wouldn't match a pattern of the three shifts after opsplit1/AND.
+;; Note that this lsrandsplit1 is in turn of lesser value than movulsr, since
+;; that one doesn't require the same operand for source and destination, but
+;; they happen to be the same hard-register at peephole2 time even if
+;; naturally separated like in peep2-movulsr2.c, thus this placement.  (Source
+;; and destination will be re-separated and the move optimized out in
+;; cprop_hardreg at time of this writing.)
+;; Testcase: gcc.target/cris/peep2-lsrandsplit1.c
+(define_peephole2 ; lsrandsplit1
+  [(parallel
+[(set (match_operand:SI 0 "register_operand")
+ (lshiftrt:SI
+  (match_operand:SI 1 "register_operand")
+  (match_operand:SI 2 "const_int_operand")))
+ (clobber (reg:CC CRIS_CC0_REGNUM))])
+   (parallel
+[(set (match_operand 3 "register_operand")
+ (and
+  (match_operand 4 "register_operand")
+  (match_operand 5 "const_int_operand")))
+ (clobber (reg:CC CRIS_CC0_REGNUM))])]
+  "REGNO (operands[0]) == REGNO (operands[1])
+   && REGNO (operands[0]) == REGNO (operands[3])
+   && REGNO (operands[0]) == REGNO (operands[4])
+   && (INTVAL (operands[2])
+   < (clz_hwi (INTVAL (operands[5])) - (HOST_BITS_PER_WIDE_INT - 32)))
+   && cris_splittable_constant_p (INTVAL (operands[5]), AND, SImode,
+ optimize_function_for_speed_p (cfun)) == 2"
+  ;; We're guaranteed by the above hw_clz test (certainly non-zero) and the
+  ;; test for a two-insn return-value from cris_splittable_constant_p, that
+  ;; the cris_splittable_constant_p AND-replacement would be lslq-lsrq.
+  [(parallel
+[(set (match_dup 0) (ashift:SI (match_dup 0) (match_dup 9)))
+ (clobber (reg:CC CRIS_CC0_REGNUM))])
+   (parallel
+[(set (match_dup 0) (lshiftrt:SI (match_dup 0) (match_dup 10)))
+ (clobber (reg:CC CRIS_CC0_REGNUM))])]
+{
+  HOST_WIDE_INT shiftval
+= clz_hwi (INTVAL (operands[5])) - (HOST_BITS_PER_WIDE_INT - 32);
+  operands[9] = GEN_INT (shiftval - INTVAL (operands[2]));
+  operands[10] = GEN_INT (shiftval);
+})
+
 ;; Testcase for the following four peepholes: gcc.target/cris/peep2-xsrand.c
 
 (define_peephole2 ; asrandb
diff --git a/gcc/testsuite/gcc.target/cris/peep2-lsrandsplit1.c 
b/gcc/testsuite/gcc.target/cris/peep2-lsrandsplit1.c
new file mode 100644
index ..0da645358771
--- /dev/null
+++ b/gcc/testsuite/gcc.target/cris/peep2-lsrandsplit1.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-final { scan-assembler-not " and" } } */
+/* { dg-final { scan-assembler-times "lsrq " 2 } } */
+/* { dg-final { scan-assembler-times "lslq " 2 } } */
+/* { dg-options "-O2" } */
+
+/* Test the "lsrlsllsr1" peephole2 trivially.  */
+
+unsigned int

[committed] CRIS: peephole2 an "and" with a contiguous "one-sided" sequences of 1s

2023-05-03 Thread Hans-Peter Nilsson via Gcc-patches
This kind of transformation seems pretty generic and might be a
candidate for adding to the middle-end, perhaps as part of combine.

I noticed these happened more often for LRA, which is the reason I
went on this track of low-hanging-fruit-microoptimizations that are
such an itch when noticing them, inspecting generated code for libgcc.
Unfortunately, this one improves coremark only by a few cycles at the
beginning or end (<0.0005%) for cris-elf -march=v10.  The size of the
coremark code is down by 0.4% (0.22% pre-lra).

Using an iterator from the start because other binary operations will
be added and their define_peephole2's would look exactly the same for
the .md part.

Some existing and-peephole2-related tests suffered, because many of
them were using patterns with only contiguous 1:s in them: adjusted.
Also, spotted and fixed, by adding a space, some
scan-assembler-strings that were prone to spurious identifier or file
name matches.

gcc:
* config/cris/cris.cc (cris_split_constant): New function.
* config/cris/cris.md (splitop): New iterator.
(opsplit1): New define_peephole2.
* config/cris/cris-protos.h (cris_split_constant): Declare.
(cris_splittable_constant_p): New macro.

gcc/testsuite:
* gcc.target/cris/peep2-andsplit1.c: New test.
* gcc.target/cris/peep2-andu1.c, gcc.target/cris/peep2-andu2.c,
gcc.target/cris/peep2-xsrand.c, gcc.target/cris/peep2-xsrand2.c:
Adjust values to avoid interference with "opsplit1" with AND.  Add
whitespace to match-strings that may be confused with identifiers
or file names.
---
 gcc/config/cris/cris-protos.h |  6 ++
 gcc/config/cris/cris.cc   | 78 +++
 gcc/config/cris/cris.md   | 26 +++
 .../gcc.target/cris/peep2-andsplit1.c | 25 ++
 gcc/testsuite/gcc.target/cris/peep2-andu1.c   |  4 +-
 gcc/testsuite/gcc.target/cris/peep2-andu2.c   |  6 +-
 gcc/testsuite/gcc.target/cris/peep2-xsrand.c  |  6 +-
 gcc/testsuite/gcc.target/cris/peep2-xsrand2.c |  6 +-
 8 files changed, 146 insertions(+), 11 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/cris/peep2-andsplit1.c

diff --git a/gcc/config/cris/cris-protos.h b/gcc/config/cris/cris-protos.h
index de9eacbae2aa..666e04f9eeec 100644
--- a/gcc/config/cris/cris-protos.h
+++ b/gcc/config/cris/cris-protos.h
@@ -44,6 +44,12 @@ extern rtx cris_emit_movem_store (rtx, rtx, int, bool);
 extern rtx_insn *cris_emit_insn (rtx x);
 extern void cris_order_for_addsi3 (rtx *, int);
 extern void cris_emit_trap_for_misalignment (rtx);
+extern int cris_split_constant (HOST_WIDE_INT, enum rtx_code,
+   machine_mode, bool,
+   bool generate = false,
+   rtx dest = NULL_RTX,
+   rtx op = NULL_RTX);
+#define cris_splittable_constant_p cris_split_constant
 #endif /* RTX_CODE */
 extern void cris_asm_output_label_ref (FILE *, char *);
 extern void cris_asm_output_ident (const char *);
diff --git a/gcc/config/cris/cris.cc b/gcc/config/cris/cris.cc
index 05dead9c0778..331f5908a538 100644
--- a/gcc/config/cris/cris.cc
+++ b/gcc/config/cris/cris.cc
@@ -2626,6 +2626,84 @@ cris_split_movdx (rtx *operands)
   return val;
 }
 
+/* Try to split the constant WVAL into a number of separate insns of less cost
+   for the rtx operation CODE and the metric SPEED than using val as-is.
+   Generate those insns if GENERATE.  DEST holds the destination, and OP holds
+   the other operand for binary operations; NULL when CODE is SET.  Return the
+   number of insns for the operation or 0 if the constant can't be usefully
+   split (because it's already minimal or is not within range for the known
+   methods).  Parts stolen from arm.cc.  */
+
+int
+cris_split_constant (HOST_WIDE_INT wval, enum rtx_code code,
+machine_mode mode, bool speed ATTRIBUTE_UNUSED,
+bool generate, rtx dest, rtx op)
+{
+  int32_t ival = (int32_t) wval;
+  uint32_t uval = (uint32_t) wval;
+
+  if (code != AND || IN_RANGE(ival, -32, 31)
+  /* Implemented using movu.[bw] elsewhere.  */
+  || ival == 255 || ival == 65535
+  /* Implemented using clear.[bw] elsewhere.  */
+  || uval == 0xff00 || uval == 0x)
+return 0;
+
+  int i;
+
+  int msb_zeros = 0;
+  int lsb_zeros = 0;
+
+  /* Count number of leading zeros.  */
+  for (i = 31; i >= 0; i--)
+{
+  if ((uval & (1 << i)) == 0)
+   msb_zeros++;
+  else
+   break;
+}
+
+  /* Count number of trailing zero's.  */
+  for (i = 0; i <= 31; i++)
+{
+  if ((uval & (1 << i)) == 0)
+   lsb_zeros++;
+  else
+   break;
+}
+
+  /* Is there a lowest or highest part that is zero (but not both)
+ and the non-zero part is just ones?  */
+  if (exact_log2 ((uval >> lsb_zeros) + 1) > 0
+  && (lsb_zeros != 0) != (msb_zeros != 0))
+{
+  /* If so, 

[committed] CRIS-LRA: Define TARGET_SPILL_CLASS

2023-05-03 Thread Hans-Peter Nilsson via Gcc-patches
This has no effect on arith-rand-ll (which suffers badly from LRA) and
marginal effects (0.01% improvement) on coremark, but the size of
coremark shrinks by 0.2%.  An earlier version was tested with a tree
around 2023-03 which showed (marginally) that ALL_REGS is preferable
to GENERAL_REGS.

* config/cris/cris.cc (TARGET_SPILL_CLASS): Define
to ALL_REGS.
---
 gcc/config/cris/cris.cc | 12 
 1 file changed, 12 insertions(+)

diff --git a/gcc/config/cris/cris.cc b/gcc/config/cris/cris.cc
index 641e7ea25fb1..373be0db 100644
--- a/gcc/config/cris/cris.cc
+++ b/gcc/config/cris/cris.cc
@@ -134,6 +134,7 @@ static void cris_init_libfuncs (void);
 static unsigned int cris_postdbr_cmpelim (void);
 
 static reg_class_t cris_preferred_reload_class (rtx, reg_class_t);
+static reg_class_t cris_spill_class (reg_class_t, machine_mode);
 
 static int cris_register_move_cost (machine_mode, reg_class_t, reg_class_t);
 static int cris_memory_move_cost (machine_mode, reg_class_t, bool);
@@ -224,6 +225,9 @@ int cris_cpu_version = CRIS_DEFAULT_CPU_VERSION;
 #undef TARGET_PREFERRED_RELOAD_CLASS
 #define TARGET_PREFERRED_RELOAD_CLASS cris_preferred_reload_class
 
+#undef TARGET_SPILL_CLASS
+#define TARGET_SPILL_CLASS cris_spill_class
+
 /* We don't define TARGET_FIXED_CONDITION_CODE_REGS, as at the time of
this writing, it has an effect only on pre-reload CSE and when
scheduling (and for "macro fusion" at that).  Neither applies for
@@ -1684,6 +1688,14 @@ cris_preferred_reload_class (rtx x, reg_class_t rclass)
   return rclass;
 }
 
+/* Worker function for TARGET_SPILL_CLASS.  */
+
+static reg_class_t
+cris_spill_class (reg_class_t /* orig_class */, machine_mode)
+{
+  return ALL_REGS;
+}
+
 /* Worker function for TARGET_REGISTER_MOVE_COST.  */
 
 static int
-- 
2.30.2



2nd Ping: Re: [PATCH v3] doc: Document order of define_peephole2 scanning

2023-05-03 Thread Hans-Peter Nilsson via Gcc-patches
Ping again.

> From: Hans-Peter Nilsson 
> Date: Thu, 27 Apr 2023 01:55:24 +0200
> 
> > From: Hans-Peter Nilsson 
> > Date: Wed, 19 Apr 2023 18:59:14 +0200
> [...]
> 
> > So again: Approvers: pdf output reviewed.  Ok to commit?
> > -- >8 --
> > I was a bit surprised when my newly-added define_peephole2 didn't
> > match, but it was because it was expected to partially match the
> > generated output of a previous define_peephole2, which matched and
> > modified the last insn of a sequence to be matched.  I had assumed
> > that the algorithm backed-up the size of the match-buffer, thereby
> > exposing newly created opportunities *with sufficient context* to all
> > define_peephole2's.  While things can change in that direction, let's
> > start with documenting the current state.
> > 
> > * doc/md.texi (define_peephole2): Document order of scanning.
> > ---
> >  gcc/doc/md.texi | 9 +
> >  1 file changed, 9 insertions(+)
> > 
> > diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> > index 07bf8bdebffb..300d104d58ab 100644
> > --- a/gcc/doc/md.texi
> > +++ b/gcc/doc/md.texi
> > @@ -9362,6 +9362,15 @@ If the preparation falls through (invokes neither 
> > @code{DONE} nor
> >  @code{FAIL}), then the @code{define_peephole2} uses the replacement
> >  template.
> >  
> > +Insns are scanned in forward order from beginning to end for each basic
> > +block.  Matches are attempted in order of @code{define_peephole2}
> > +appearance in the @file{md} file.  After a successful replacement,
> > +scanning for further opportunities for @code{define_peephole2}, resumes
> > +with the first generated replacement insn as the first insn to be
> > +matched against all @code{define_peephole2}.  For the example above,
> > +after its successful replacement, the first insn that can be matched by
> > +a @code{define_peephole2} is @code{(set (match_dup 4) (match_dup 1))}.
> > +
> >  @end ifset
> >  @ifset INTERNALS
> >  @node Insn Attributes
> > -- 
> > 2.30.2
> > 
> 


[committed] CRIS-LRA: Fix uses of reload_in_progress

2023-05-03 Thread Hans-Peter Nilsson via Gcc-patches
On previous occasions when a general LRA transition has been
discussed, IIRC, the argument was used, that everything is ready for
targets and their maintainers to make the transition.  As I pointed
out then (though more than a year ago last time, people forget) that's
still not true: LRA documentation needs improvement to know what to
do, and having to sneak peaks at other targets is not a valid level of
documentation.  Here's an example and its fall-out: what to do with
lra_in_progress vs. reload_in_progress.  I "guess" lra_in_progress is
supposed to be used like reload_in_progress.  The consequences of
missing it is quite mild though: LRA takes most of its data from state
collected during IRA, and lra_in_progress has much less impact there,
than reload_in_progress has for reload.

Anyway, many supposedly-transformed targets (without TARGET_LRA_P)
still mention reload_in_progress, like aarch64, arm, csky, epiphany,
iq2000, m32r, microblaze, mmix (I'll deal with this one), moxie,
nds32, stormy16, v850 (with reservations for grep-errors; the list may
be erroneous or incomplete).

While this may have been of no measureable consequence to CRIS, it
*might* be a build-breaker for other targets, transformed or
about-to-be transformed, thus heads-up to eager converters. ;-)

(JFTR: in comparison, reload_completed is used for both LRA and
reload; there's no "lra_completed".)

Committed.
-- >8 --
This shows no difference neither in arith-rand-ll nor coremark
numbers.  Comparing libgcc and newlib libc before/after, the only
difference can be seen in a few functions where it's mostly neutral
(newlib's _svfprintf_r et al) and one function (__gdtoa), which
improves ever so slightly (four bytes less; one load less, but one
instruction reading from memory instead of a register).

* config/cris/cris.cc (cris_side_effect_mode_ok): Use
lra_in_progress, not reload_in_progress.
* config/cris/cris.md ("movdi", "*addi_reload"): Ditto.
* config/cris/constraints.md ("Q"): Ditto.
---
 gcc/config/cris/constraints.md | 18 +-
 gcc/config/cris/cris.cc| 18 +-
 gcc/config/cris/cris.md|  4 ++--
 3 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/gcc/config/cris/constraints.md b/gcc/config/cris/constraints.md
index fa9aa19d13e3..3df8cc27d8d4 100644
--- a/gcc/config/cris/constraints.md
+++ b/gcc/config/cris/constraints.md
@@ -96,7 +96,7 @@ (define_constraint "G"
 (define_memory_constraint "Q"
   "@internal"
   (and (match_code "mem")
-   (match_test "cris_base_p (XEXP (op, 0), reload_in_progress
+   (match_test "cris_base_p (XEXP (op, 0), lra_in_progress
   || reload_completed)")))
 
 ;; Extra constraints.
@@ -107,7 +107,7 @@ (define_memory_constraint "T"
;; Double indirect: [[reg]] or [[reg+]]?
(ior (and (match_code "mem" "0")
 (match_test "cris_base_or_autoincr_p (XEXP (XEXP (op, 0), 0),
-  reload_in_progress
+  lra_in_progress
   || reload_completed)"))
;; Just an explicit indirect reference: [const]?
(match_test "CONSTANT_P (XEXP (op, 0))")
@@ -115,29 +115,29 @@ (define_memory_constraint "T"
(and (match_code "plus" "0")
  ;; A BDAP constant: [reg+(8|16|32)bit offset]?
 (ior (and (match_test "cris_base_p (XEXP (XEXP (op, 0), 0),
-reload_in_progress
+lra_in_progress
 || reload_completed)")
   (match_test "CONSTANT_P (XEXP (XEXP (op, 0), 1))"))
  ;; A BDAP register: [reg+[reg(+)].S]?
  (and (match_test "cris_base_p (XEXP (XEXP (op, 0), 0),
-reload_in_progress
+lra_in_progress
 || reload_completed)")
   (match_test "cris_bdap_index_p (XEXP (XEXP (op, 0), 
1),
-  reload_in_progress
+  lra_in_progress
   || 
reload_completed)"))
  ;; Same, but with swapped arguments (no canonical
  ;; ordering between e.g. REG and MEM as of LAST_UPDATED
  ;; "Thu May 12 03:59:11 UTC 2005").
  (and (match_test "cris_base_p (XEXP (XEXP (op, 0), 1),
-reload_in_progress
+lra_in_progress

Re: [committed] Enable LRA on several ports

2023-05-01 Thread Hans-Peter Nilsson via Gcc-patches
> Date: Mon, 1 May 2023 07:21:59 -0600
> From: Jeff Law 

> Spurred by Segher's RFC, I went ahead and tested several ports with LRA 
> enabled.  Not surprisingly, many failed, but a few built their full set 
> of libraries successful and of those a few even ran their testsuites 
> with no regressions.  In fact, enabling LRA fixes a small number of 
> failures on the iq2000 port.
> 
> This patch converts the ports which built their libraries and have test 
> results that are as good as or better than without LRA.

Yeah, I did fix test-suite errors for CRIS with LRA earlier
(see commit logs to the CRIS port this year).

>There may be 
> minor code quality regressions or there may be minor code quality 
> improvements -- I'm leaving that for the port maintainers to own going 
> forward.

Right; I noticed performance regressions, and didn't want to
commit anything that knowingly degraded performance.  I did
follow the traces but fell into the rabbit-hole of
rtx_costs.  That's the main reason I didn't push a "-mlra"
option or remove the TARGET_LRA_P for CRIS.  (My story and I
stick to it.)

But thanks I guess, it saves me a commit, but (to all!)
please sync check_effective_target_lra for targets you
"convert".  ...oops, it's just CRIS and hppa there (wot, not
converted?)

brgds, H-P


[PATCH] testsuite: Handle empty assembly lines in check-function-bodies

2023-04-28 Thread Hans-Peter Nilsson via Gcc-patches
Ok to commit?
-- >8 --
I tried to make use of check-function-bodies for cris-elf and was a
bit surprised to see it failing.  There's a deliberate empty line
after the filled delay slot of the return-function which was
mishandled.  I thought "aha" and tried to add an empty line
(containing just a "**" prefix) to the match, but that didn't help.
While it was added as input from the function's assembly output
to-be-matched like any other line, it couldn't be matched: I had to
use "...", which works but is...distracting.

Some digging shows that an empty assembly line can't be deliberately
matched because all matcher lines (lines starting with the prefix,
the ubiquitous "**") are canonicalized by trimming leading
whitespace (the "string trim" in check-function-bodies) and instead
adding a leading TAB character, thus empty lines end up containing
just a TAB.  For usability it's better to treat empty lines as fluff
than to uglifying the test-case and the code to properly match them.
Double-checking, no test-case tries to match an line containing just
TAB (by providing an a line containing just "**\s*", i.e. zero or
more whitespace characters).

* lib/scanasm.exp (parse_function_bodies): Set fluff to include
empty lines (besides optionally leading whitespace).
---
 gcc/testsuite/lib/scanasm.exp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/lib/scanasm.exp b/gcc/testsuite/lib/scanasm.exp
index fb53544d40c7..be2b83a5dd48 100644
--- a/gcc/testsuite/lib/scanasm.exp
+++ b/gcc/testsuite/lib/scanasm.exp
@@ -791,7 +791,7 @@ proc parse_function_bodies { filename result } {
 set terminator {^\s*\.size}
 
 # Regexp for lines that aren't interesting.
-set fluff {^\s*(?:\.|//|@)}
+set fluff {^\s*(?:\.|//|@|$)}
 
 set fd [open $filename r]
 set in_function 0
-- 
2.30.2



Re: [committed] libgcc CRIS: Define TARGET_HAS_NO_HW_DIVIDE

2023-04-26 Thread Hans-Peter Nilsson via Gcc-patches
> From: Paul Koning 
> Date: Wed, 26 Apr 2023 21:02:31 -0400

> > On Apr 26, 2023, at 8:05 PM, Hans-Peter Nilsson  wrote:
> > 
> > Not many targets define this besides msp430, pdp1, xtensa,
> > and arm compared to those that appear to unconditionally
> > have a hardware division instruction (also, pdp11 and
> > msp430 seem confused and should be empty instead of "1"  ...
> 
> How so, "confused"?  The documentation says it should be
> defined, it doesn't say that it should be defined as
> empty.  What goes wrong if it's defined as 1 rather than
> empty?

Only future edits, expecting action to follow as if it was a
non-zero expression like many of the target macros.

> The documentation is also somewhat misleading, because it
> says to define it if the hardware has no divide
> instruction.  The more accurate statement is that it
> should be defined if the hardware has no 64 / 32 bit
> divide hardware support.  pdp11.h points this out in a
> comment, because most pdp11s do have divide instructions
> but those are for 32 / 16 bits.

That might be true, and I've heard patches are welcome.

brgds, H-P


[committed] libgcc CRIS: Define TARGET_HAS_NO_HW_DIVIDE

2023-04-26 Thread Hans-Peter Nilsson via Gcc-patches
Not many targets define this besides msp430, pdp1, xtensa,
and arm compared to those that appear to unconditionally
have a hardware division instruction (also, pdp11 and
msp430 seem confused and should be empty instead of "1" and
"(!  TARGET_HWMULT)" - and having hardware multiplication
doesn't have a bearing anyway even if that worked; see
numbers below for an example).

Heads-up maintainers of ports without hardware division
(including conditionally, for multilibbed configurations)!

-- >8 --
With this, execution time for e.g. __moddi3 go from 59 to 40 cycles in
the "fast" case or from 290 to 200 cycles in the "slow" case (when the
!TARGET_HAS_NO_HW_DIVIDE variant calls division and modulus functions
for 32-bit SImode), as exposed by gcc.c-torture/execute/arith-rand-ll.c
compiled for -march=v10.

Unfortunately, it just puts a performance improvement "dent" of 0.07%
in a arith-rand-ll.c-based performance test - where all loops are also
reduced to 1/10.

The size of every affected libgcc function is reduced to less than
half and they are all now leaf functions.

* config/cris/t-cris (HOST_LIBGCC2_CFLAGS): Add
-DTARGET_HAS_NO_HW_DIVIDE.
---
 libgcc/config/cris/t-cris | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/libgcc/config/cris/t-cris b/libgcc/config/cris/t-cris
index b582974a42ee..e0020294be96 100644
--- a/libgcc/config/cris/t-cris
+++ b/libgcc/config/cris/t-cris
@@ -8,3 +8,6 @@ $(LIB2ADD): $(srcdir)/config/cris/arit.c
echo "#define L$$name" > tmp-$@ \
&& echo '#include "$<"' >> tmp-$@ \
&& mv -f tmp-$@ $@
+
+# Use an appropriate implementation when implementing DImode division.
+HOST_LIBGCC2_CFLAGS += -DTARGET_HAS_NO_HW_DIVIDE
-- 
2.30.2



Ping: Re: [PATCH v3] doc: Document order of define_peephole2 scanning

2023-04-26 Thread Hans-Peter Nilsson via Gcc-patches
> From: Hans-Peter Nilsson 
> Date: Wed, 19 Apr 2023 18:59:14 +0200
[...]

> So again: Approvers: pdf output reviewed.  Ok to commit?
> -- >8 --
> I was a bit surprised when my newly-added define_peephole2 didn't
> match, but it was because it was expected to partially match the
> generated output of a previous define_peephole2, which matched and
> modified the last insn of a sequence to be matched.  I had assumed
> that the algorithm backed-up the size of the match-buffer, thereby
> exposing newly created opportunities *with sufficient context* to all
> define_peephole2's.  While things can change in that direction, let's
> start with documenting the current state.
> 
>   * doc/md.texi (define_peephole2): Document order of scanning.
> ---
>  gcc/doc/md.texi | 9 +
>  1 file changed, 9 insertions(+)
> 
> diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> index 07bf8bdebffb..300d104d58ab 100644
> --- a/gcc/doc/md.texi
> +++ b/gcc/doc/md.texi
> @@ -9362,6 +9362,15 @@ If the preparation falls through (invokes neither 
> @code{DONE} nor
>  @code{FAIL}), then the @code{define_peephole2} uses the replacement
>  template.
>  
> +Insns are scanned in forward order from beginning to end for each basic
> +block.  Matches are attempted in order of @code{define_peephole2}
> +appearance in the @file{md} file.  After a successful replacement,
> +scanning for further opportunities for @code{define_peephole2}, resumes
> +with the first generated replacement insn as the first insn to be
> +matched against all @code{define_peephole2}.  For the example above,
> +after its successful replacement, the first insn that can be matched by
> +a @code{define_peephole2} is @code{(set (match_dup 4) (match_dup 1))}.
> +
>  @end ifset
>  @ifset INTERNALS
>  @node Insn Attributes
> -- 
> 2.30.2
> 


Re: [PATCH v3] doc: Document order of define_peephole2 scanning

2023-04-19 Thread Hans-Peter Nilsson via Gcc-patches
> Date: Wed, 19 Apr 2023 22:16:36 +0200
> From: Bernhard Reutner-Fischer 

> On 19 April 2023 21:21:18 CEST, Bernhard Reutner-Fischer 
>  wrote:
> >Hi H-P!
> >
> >This begs the question iff now (i fear it's not), upon
> >successful match(es), the whole peepholes get re-run
> >again per BB (ATM?), exposing more opportunities?

(Not sure IIUC the question, but:) no; as mentioned the
peephole2 machinery doesn't back up to include the context
of longer sequences when having done replacement for a
shorter match.  It resumes matching at the beginning of the
shorter, matched and replaced, sequence.

> >If not, would we want to retry, at least for
> >-fexpensive-optimisations (sp?) or would all hell break
> >loose?

I don't think hell would break loose, but maybe slowdown
and/or buggy define_peephole2s would be weeded out.  Or
something else entirely unexpected. :)

> >Please also see below.
> >
> >On 19 April 2023 18:59:14 CEST, Hans-Peter Nilsson via Gcc-patches 
> > wrote:
> >>Anyway, the missing-context problem I ran into remains: if
> >>you have an insn sequence {foo bar} and a define_peephole2
> >>matching and replacing {bar} into {baz}, the resulting {foo
> >>baz} *will not be matched* against a define_peephole2
> >>looking for {foo baz}.  But, I'm not trying to document this
> >>caveat specifically, though at least it'll now be implied by
> >>the documentation.
> >>
> >>This could be fixed by always backing up MAX_INSNS_PER_PEEP2
> >>- 1 insns after a successful replacement.  I'm somewhat
> >>worries that this would also mean lots of futile re-match
> >>attempts.  Thoughts?
> >
> >Good point. Probably. Do you have stats?

None whatsoever.  I'm going to test with the "original"
define_peephole2 for CRIS that was suffering, but with an
"extra" define_peephole2 that also does the modification for
the one that caused it do be missed and see if the original
free one still fires off.  (I see I cut down my foo bar baz
examples too much to make sense to explain that.)

> >If there is even a slight benefit, then some certainly
> >would be willing to take that penalty for sure. I.e. iff
> >it helps -Os or locality then such expensive
> >optimisations certainly provide benefit for at least a
> >few if not some, IMO.

Right.  Not sure I'll be doing it though, having other
things, gcc-related and others, on my plate.  If so, first
I'd do a crude attempt at getting statistics for x86_64, by
after a successful replacement, restarting at the beginning
of a BB and checking whether any define_peephole2 fires off
before reaching the first replaced insn.

brgds, H-P


Re: [PATCH v3] doc: Document order of define_peephole2 scanning

2023-04-19 Thread Hans-Peter Nilsson via Gcc-patches
> From: Hans-Peter Nilsson 
> Date: Wed, 19 Apr 2023 06:06:27 +0200
> 
> Patch retracted, at least temporarily.  My "understanding"
> may be clouded by looking at an actual bug.  Sigh.

Mea culpa.  I was looking at the result of one
define_peephole2 and thinking it was due to another, and
also tricked by incorrect code comments (patch posted, will
commit).

TL;DR: Matching indeed does resume with attempting to match
the *first* define_peephole2 replacement insn.  But the
match-and-replacement order is largely undocumented.

Anyway, the missing-context problem I ran into remains: if
you have an insn sequence {foo bar} and a define_peephole2
matching and replacing {bar} into {baz}, the resulting {foo
baz} *will not be matched* against a define_peephole2
looking for {foo baz}.  But, I'm not trying to document this
caveat specifically, though at least it'll now be implied by
the documentation.

This could be fixed by always backing up MAX_INSNS_PER_PEEP2
- 1 insns after a successful replacement.  I'm somewhat
worries that this would also mean lots of futile re-match
attempts.  Thoughts?

(I could also just restart at the BB start, but I see all
this support for backing-up live info by single insns being
used.  Taking notes about a partial match for the first insn
of a failed attempt, as the maximum need to back-up to,
doesn't look like it'd fly, judging from the nonspecific
looking (set dest src) patterns being the first in i386
define_peephole2's match sequences.)

So again: Approvers: pdf output reviewed.  Ok to commit?
-- >8 --
I was a bit surprised when my newly-added define_peephole2 didn't
match, but it was because it was expected to partially match the
generated output of a previous define_peephole2, which matched and
modified the last insn of a sequence to be matched.  I had assumed
that the algorithm backed-up the size of the match-buffer, thereby
exposing newly created opportunities *with sufficient context* to all
define_peephole2's.  While things can change in that direction, let's
start with documenting the current state.

* doc/md.texi (define_peephole2): Document order of scanning.
---
 gcc/doc/md.texi | 9 +
 1 file changed, 9 insertions(+)

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 07bf8bdebffb..300d104d58ab 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -9362,6 +9362,15 @@ If the preparation falls through (invokes neither 
@code{DONE} nor
 @code{FAIL}), then the @code{define_peephole2} uses the replacement
 template.
 
+Insns are scanned in forward order from beginning to end for each basic
+block.  Matches are attempted in order of @code{define_peephole2}
+appearance in the @file{md} file.  After a successful replacement,
+scanning for further opportunities for @code{define_peephole2}, resumes
+with the first generated replacement insn as the first insn to be
+matched against all @code{define_peephole2}.  For the example above,
+after its successful replacement, the first insn that can be matched by
+a @code{define_peephole2} is @code{(set (match_dup 4) (match_dup 1))}.
+
 @end ifset
 @ifset INTERNALS
 @node Insn Attributes
-- 
2.30.2



[PATCH] recog.cc: Correct comments referring to parameter match_len

2023-04-19 Thread Hans-Peter Nilsson via Gcc-patches
I'll commit this as obvious, so it doesn't trick anyone else
anymore.
-- >8 --
* recog.cc (peep2_attempt, peep2_update_life): Correct
head-comment description of parameter match_len.
---
 gcc/recog.cc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/recog.cc b/gcc/recog.cc
index 3ddeab59d924..fd09145d45e5 100644
--- a/gcc/recog.cc
+++ b/gcc/recog.cc
@@ -3850,7 +3850,7 @@ copy_frame_info_to_split_insn (rtx_insn *old_insn, 
rtx_insn *new_insn)
   maybe_copy_prologue_epilogue_insn (old_insn, new_insn);
 }
 
-/* While scanning basic block BB, we found a match of length MATCH_LEN,
+/* While scanning basic block BB, we found a match of length MATCH_LEN + 1,
starting at INSN.  Perform the replacement, removing the old insns and
replacing them with ATTEMPT.  Returns the last insn emitted, or NULL
if the replacement is rejected.  */
@@ -4036,7 +4036,7 @@ peep2_attempt (basic_block bb, rtx_insn *insn, int 
match_len, rtx_insn *attempt)
 /* After performing a replacement in basic block BB, fix up the life
information in our buffer.  LAST is the last of the insns that we
emitted as a replacement.  PREV is the insn before the start of
-   the replacement.  MATCH_LEN is the number of instructions that were
+   the replacement.  MATCH_LEN + 1 is the number of instructions that were
matched, and which now need to be replaced in the buffer.  */
 
 static void
-- 
2.30.2



Re: [PATCH v2] doc: Document order of define_peephole2 scanning

2023-04-18 Thread Hans-Peter Nilsson via Gcc-patches
> From: Hans-Peter Nilsson 
> Date: Wed, 19 Apr 2023 05:15:27 +0200

> Approvers: pdf output reviewed.  Ok to commit?

Patch retracted, at least temporarily.  My "understanding"
may be clouded by looking at an actual bug.  Sigh.

brgds, H-P


[PATCH v2] doc: Document order of define_peephole2 scanning

2023-04-18 Thread Hans-Peter Nilsson via Gcc-patches
> From: Hans-Peter Nilsson 
> Date: Tue, 18 Apr 2023 20:44:12 +0200
> 
> > From: Paul Koning 
> 
> > Date: Tue, 18 Apr 2023 14:32:07 -0400
> > 
> > I'm not sure about the meaning of part of this.
> > "...resumes at the last generated insn."  Does that mean:

[...]

> (Neither...)

[...]

> Sorry, your confusement confuses me.  I just don't see how
> to confuse last with first or matched with generated. :)

It's 4:30am and things appear much clearer, in particular
wrt. confusion.  Hopefully the version below is clearer.
Here's also the example from 35 lines up in md.texi:

(define_peephole2
  [(match_scratch:SI 4 "r")
   (set (match_operand:SI 0 "" "") (match_operand:SI 1 "" ""))
   (set (match_operand:SI 2 "" "") (match_dup 1))
   (match_dup 4)
   (set (match_operand:SI 3 "" "") (match_dup 1))]
  "/* @r{determine 1 does not overlap 0 and 2} */"
  [(set (match_dup 4) (match_dup 1))
   (set (match_dup 0) (match_dup 4))
   (set (match_dup 2) (match_dup 4))
   (set (match_dup 3) (match_dup 4))]
  "")

Approvers: pdf output reviewed.  Ok to commit?

All: thoughts on making define_peephole2 work "as expected";
"backtracing" so the replacement buffer ends with the first
generated replacement insn?  Might be simpler to restart at
the beginning of the BB, but I'm scared of overly long BB's.
Does anyone have statistics on the sizes of BB's in terms of
number of insns?

-- >8 --
I was a bit surprised when my define_peephole2 didn't match, but
it was because it was expected to partially match the generated
output of a previous define_peephole2.  I had assumed that the
algorithm backed-up the size of the match-buffer, thereby
exposing newly created opportunities with context to all
define_peephole2's.  While things can change in that direction,
let's start with documenting the current state.

* doc/md.texi (define_peephole2): Document order of scanning.
---
 gcc/doc/md.texi | 8 
 1 file changed, 8 insertions(+)

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 07bf8bdebffb..2ce043e6edc2 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -9362,6 +9362,14 @@ If the preparation falls through (invokes neither 
@code{DONE} nor
 @code{FAIL}), then the @code{define_peephole2} uses the replacement
 template.
 
+Insns are scanned in forward order from beginning to end for each basic
+block.  Matches are attempted in order of appearance in the @file{md}
+file.  After a successful replacement, scanning for further
+opportunities for @code{define_peephole2}, resumes with the last
+generated replacement insn as the first insn to be matched.  For the
+example above, the first insn that can be matched by another
+@code{define_peephole2}, is @code{(set (match_dup 3) (match_dup 4))}.
+
 @end ifset
 @ifset INTERNALS
 @node Insn Attributes
-- 
2.30.2



Re: [PATCH] doc: Document order of define_peephole2 scanning

2023-04-18 Thread Hans-Peter Nilsson via Gcc-patches
> From: Paul Koning 

> Date: Tue, 18 Apr 2023 14:32:07 -0400
> 
> I'm not sure about the meaning of part of this.
> "...resumes at the last generated insn."  Does that mean:

(Neither...)
 
> 1. If a match is found at some insn, the replacement
> defined by the matching define_peephole2 is performed, and
> then the scan resumes at the first of the insns produced
> by the replacement.

This was what I expected.  If it had been this, I wouldn't
have suggested the doc update.  But it isn't: no, it's the
*last produced one*.  If you look at the referenced example
(unfortunately outside of the diff context) it should all be
clear.

> or
> 
> 2. If a match is found at some insn, the replacement
> defined by the matching define_peephole2 is performed, and
> then the scan resumes at the insn immediately following
> the ones just matched.

No, from the last of the replacement insns.

> "Last generated" seems to fit option 1,

Sorry, your confusement confuses me.  I just don't see how
to confuse last with first or matched with generated. :)

> but I'm not sure
> if that's what you meant.  Maybe you could add some words
> to say more explicitly which it is.

I'm referring to an example on the same pdf page.

But perhaps s/resumes at the last generated insn/resumes at
the last insn in the replacement sequence/ would help?

brgds, H-P


> 
>   paul
> 
> > On Apr 18, 2023, at 1:55 PM, Hans-Peter Nilsson via Gcc-patches 
> >  wrote:
> > 
> > Generated pdf inspected.  Ok to commit?
> > 
> > Thoughts on fixing the IMHO wart to also expose all
> > replacements to all define_peephole2?  Looks feasible
> > (famous last words), but then again I haven't checked the
> > history yet.
> > 
> > -- >8 --
> > I was a bit surprised when my define_peephole2 didn't match,
> > but it was because it was expected to partially match the
> > generated output of a previous define_peephole2.  I had
> > assumed that the algorithm exposed newly created opportunities
> > to all define_peephole2's.  While things can change in that
> > direction, let's start with documenting the current state.
> > 
> > * doc/md.texi (define_peephole2): Document order of scanning.
> > ---
> > gcc/doc/md.texi | 8 
> > 1 file changed, 8 insertions(+)
> > 
> > diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> > index 07bf8bdebffb..0f9e32d2c648 100644
> > --- a/gcc/doc/md.texi
> > +++ b/gcc/doc/md.texi
> > @@ -9362,6 +9362,14 @@ If the preparation falls through (invokes neither 
> > @code{DONE} nor
> > @code{FAIL}), then the @code{define_peephole2} uses the replacement
> > template.
> > 
> > +Insns are scanned in forward order from beginning to end for each basic
> > +block, but the basic blocks are scanned in reverse order of appearance
> > +in a function.  After a successful replacement, scanning for further
> > +opportunities for @code{define_peephole2} matches, resumes at the last
> > +generated insn.  I.e. for the example above, the first insn that can be
> > +matched by another @code{define_peephole2}, is @code{(set (match_dup 3)
> > +(match_dup 4))}.
> > +
> > @end ifset
> > @ifset INTERNALS
> > @node Insn Attributes
> > -- 
> > 2.30.2
> > 
> 


[PATCH] doc: Document order of define_peephole2 scanning

2023-04-18 Thread Hans-Peter Nilsson via Gcc-patches
Generated pdf inspected.  Ok to commit?

Thoughts on fixing the IMHO wart to also expose all
replacements to all define_peephole2?  Looks feasible
(famous last words), but then again I haven't checked the
history yet.

-- >8 --
I was a bit surprised when my define_peephole2 didn't match,
but it was because it was expected to partially match the
generated output of a previous define_peephole2.  I had
assumed that the algorithm exposed newly created opportunities
to all define_peephole2's.  While things can change in that
direction, let's start with documenting the current state.

* doc/md.texi (define_peephole2): Document order of scanning.
---
 gcc/doc/md.texi | 8 
 1 file changed, 8 insertions(+)

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 07bf8bdebffb..0f9e32d2c648 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -9362,6 +9362,14 @@ If the preparation falls through (invokes neither 
@code{DONE} nor
 @code{FAIL}), then the @code{define_peephole2} uses the replacement
 template.
 
+Insns are scanned in forward order from beginning to end for each basic
+block, but the basic blocks are scanned in reverse order of appearance
+in a function.  After a successful replacement, scanning for further
+opportunities for @code{define_peephole2} matches, resumes at the last
+generated insn.  I.e. for the example above, the first insn that can be
+matched by another @code{define_peephole2}, is @code{(set (match_dup 3)
+(match_dup 4))}.
+
 @end ifset
 @ifset INTERNALS
 @node Insn Attributes
-- 
2.30.2



Re: [PATCH] reload: Handle generating reloads that also clobbers flags

2023-04-18 Thread Hans-Peter Nilsson via Gcc-patches
> Date: Tue, 18 Apr 2023 07:43:41 -0600
> From: Jeff Law 

> On 2/15/23 08:34, Hans-Peter Nilsson via Gcc-patches wrote:
> > Regtested cris-elf with its LEGITIMIZE_RELOAD_ADDRESS
> > disabled, where it regresses gcc.target/cris/rld-legit1.c;
> > as expected, because that test guards proper function of its
> > LEGITIMIZE_RELOAD_ADDRESS i.e., that there's no sign of
> > decomposed address elements.
> > 
> > LRA also causes a similar decomposition (and worse, in even
> > smaller bits), but it can create valid insns as-is.
> > Unfortunately, it doesn't have something equivalent to
> > LEGITIMIZE_RELOAD_ADDRESS so it generates worse code for
> > cases where that hook helped reload.
> > 
> > I fear reload-related patches these days are treated like a
> > redheaded stepchild and even worse as this one is intended
> > for stage 1.  Either way, I need to create a reference to
> > it, and it's properly tested and has been a help when
> > working towards LRA, thus might help other targets: ok to
> > install for the next stage 1?
> > 
> > -- >8 --
> > When LEGITIMIZE_RELOAD_ADDRESS for cris-elf is disabled,
> > this code is now required for reload to generate valid insns
> > from some reload-decomposed addresses, for example the
> > (plus:SI
> >   (sign_extend:SI (mem:HI (reg/v/f:SI 32 [ a ]) [1 *a_6(D)+0 S2 A8]))
> >   (reg/v/f:SI 33 [ y ]))
> > generated in gcc.target/cris/rld-legit1.c (a valid address
> > but with two registers needing reload).  Now after decc0:ing,
> > most SET insns for former cc0 targets need to be a parallel
> > with a clobber of the flags register.  Such targets
> > typically have TARGET_FLAGS_REGNUM set to a valid register.
> > 
> > * reload1.cc (emit_insn_if_valid_for_reload_1): Rename from
> > emit_insn_if_valid_for_reload.
> > (emit_insn_if_valid_for_reload): Call new helper, and if a SET fails
> > to be recognized, also try emitting a parallel that clobbers
> > TARGET_FLAGS_REGNUM, as applicable.
> BUt isn't it the case that we're not supposed to be exposing the flags 
> register until after reload?   And if that's the case, then why would 
> this be necessary?  Clearly I must be missing something.

That "supposed to" is only *one* possible implementation.
The one in CRIS - and I believe the preferred one; one I
should advocate more - is to *always* expose clobbering of
the flags.  (I managed to do the CRIS decc0ification
transformation without loss of performance.  There were much
fewer issues with code taking PATTERN (insn) and failing on
it being PARALLEL than I had expected, much thanks to use of
rtx_single_set.)

Think about it: why should the semantics of a valid insn
change after a "random" pass?  That's almost as crazy as the
implied semantics of cc0.

brgds, H-P


[committed] doc: md.texi (Including Patterns): Fix page break

2023-04-04 Thread Hans-Peter Nilsson via Gcc-patches
Committed as obvious.  See also the previous discussion
regarding my define_split doc patch.
-- >8 --
The line-break in the example looked odd, even more so with
a page-break in the middle of it, due to recently added text
in preceding pages.

* doc/md.texi (Including Patterns): Fix page break.
---
 gcc/doc/md.texi | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 970af5e34710..07bf8bdebffb 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -8963,8 +8963,7 @@ It looks like:
 
 @smallexample
 
-(include
-  @var{pathname})
+(include @var{pathname})
 @end smallexample
 
 For example:
-- 
2.30.2



Re: Regression with "recomputation and PR 109154"

2023-03-31 Thread Hans-Peter Nilsson via Gcc-patches
> Date: Fri, 31 Mar 2023 15:48:22 -0400
> From: Andrew MacLeod via Gcc-patches 
> Reply-To: Andrew MacLeod 

> commit 55bf4f0d443e5adbacfcdbbebf4b2e0c74d1dcc8
> Author: Andrew MacLeod 
> Date:   Fri Mar 31 15:42:43 2023 -0400
> 
> Adjust testcases to not produce errors..
> 
> tree-optimization/109363
> gcc/testsuite/
> * g++.dg/warn/Wstringop-overflow-4.C: Always cehck bogus message.
> * gcc.dg/tree-ssa/pr23109.c: Disable better recomputations.


(Needs to be spelled "PR tree-optimization/109363" for the
bugzilla-marker hook to react.  I'll mark manually though.)

Thanks!  I was about to push the --param adjustment as
obvious, seeing consensus equivalent of approval in this
mail thread.

brgds, H-P


Regression with "recomputation and PR 109154"

2023-03-31 Thread Hans-Peter Nilsson via Gcc-patches
> Attached. I also removed the bogus warning in Walloc-13.c that no longer 
> happens

> Add recursive GORI recompuations with a depth limit.
> 
> PR tree-optimization/109154
> gcc/
> * gimple-range-gori.cc (gori_compute::may_recompute_p): Add depth 
> limit.
> * gimple-range-gori.h (may_recompute_p): Add depth param.
> * params.opt (ranger-recompute-depth): New param.
> 
> gcc/testsuite/
> * gcc.dg/Walloca-13.c: Remove bogus warning that is now fixed.

This patch, commit r13-6945-g429a7a88438cc8, caused a
test-suite regression; FAIL for 'gcc.dg/tree-ssa/pr23109.c
scan-tree-dump-not recip "reciptmp"' for cris-elf that I
logged at PR109363.

Perhaps it's somewhat of a similar nature as Walloca-13.c
but then again it's not an bogus warning that's gone.

Is it expected and I should just blankly xfail it or is it
worth more attention?  I'm a bit surprised that it hasn't
shown up for other targets though.

brgds, H-P


[committed] CRIS: Make rtx-cost 0 for many CONST_INT "quick" operands

2023-03-29 Thread Hans-Peter Nilsson via Gcc-patches
Stepping through a gdb session inspecting costs that cause
gcc.dg/tree-ssa/slsr-13.c to fail, exposed that before this
patch, cris_rtx_costs told that a shift of 1 of a register
costs 5, while adding two registers costs 4.

Making the cost of a quick-immediate constant equal to using
a register (default 0) reflects actual performance and
size-cost better.  It also happens to make
gcc.dg/tree-ssa/slsr-13.c pass with what looks like better
code being generated, and improves coremark performance by
0.4%.

But, blindly doing this for *all* valid operands that fit
the "quick-immediate" addressing mode, trips interaction
with other factors*, with the end result mixed at best.  So,
do this only for MINUS and logical operations for the time
being, and only for modes that fit in one register.

*) Examples of "other factors":

- A bad default implementation of insn_cost or actually,
pattern_cost, that looks only at the set_src_cost and
furthermore sees such a cost of 0 as invalid.  (Compare to
the more sane set_rtx_cost.)  This naturally tripped up
combine and ifcvt, causing all sorts of changes, good and
bad.

- Having the same cost, to compare a register with 0 as with
-31..31, means a compare insn of an eliminable form no
longer looks preferable.

* config/cris/cris.cc (cris_rtx_costs) [CONST_INT]: Return 0
for many quick operands, for register-sized modes.
---
 gcc/config/cris/cris.cc | 23 ++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/gcc/config/cris/cris.cc b/gcc/config/cris/cris.cc
index 641e7ea25fb1..05dead9c0778 100644
--- a/gcc/config/cris/cris.cc
+++ b/gcc/config/cris/cris.cc
@@ -1884,7 +1884,28 @@ cris_rtx_costs (rtx x, machine_mode mode, int 
outer_code, int opno,
if (val == 0)
  *total = 0;
else if (val < 32 && val >= -32)
- *total = 1;
+ switch (outer_code)
+   {
+ /* For modes that fit in one register we tell they cost
+the same as with register operands.  DImode operations
+needs careful consideration for more basic reasons:
+shifting by a non-word-size amount needs more
+operations than an addition by a register pair.
+Deliberately excluding SET, PLUS and comparisons and
+also not including the full -64..63 range for (PLUS
+and) MINUS.  */
+   case MINUS: case ASHIFT: case LSHIFTRT:
+   case ASHIFTRT: case AND: case IOR:
+ if (GET_MODE_SIZE(mode) <= UNITS_PER_WORD)
+   {
+ *total = 0;
+ break;
+   }
+ /* FALL THROUGH.  */
+   default:
+ *total = 1;
+ break;
+   }
/* Eight or 16 bits are a word and cycle more expensive.  */
else if (val <= 32767 && val >= -32768)
  *total = 2;
-- 
2.30.2



Ping #2: [PATCH v2] doc: md.texi (Insn Splitting): Tweak wording for readability.

2023-03-28 Thread Hans-Peter Nilsson via Gcc-patches
> From: Hans-Peter Nilsson 
> Date: Tue, 14 Mar 2023 17:04:43 +0100

Ping #2 on contents (formatting is approved):

> -- >8 --
> I needed to check what was allowed in a define_split, but
> had problems understanding what was meant by "Splitting of
> jump instruction into sequence that over by another jump
> instruction".
> 
>   * doc/md.texi (Insn Splitting): Tweak wording for readability.
> 
> Co-Authored-By: Sandra Loosemore 
> ---
>  gcc/doc/md.texi | 30 +++---
>  1 file changed, 15 insertions(+), 15 deletions(-)
> 
> diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> index 8e3113599fdc..134b227b9a93 100644
> --- a/gcc/doc/md.texi
> +++ b/gcc/doc/md.texi
> @@ -8756,21 +8756,21 @@ insns that don't.  Instead, write two separate 
> @code{define_split}
>  definitions, one for the insns that are valid and one for the insns that
>  are not valid.
>  
> -The splitter is allowed to split jump instructions into sequence of
> -jumps or create new jumps in while splitting non-jump instructions.  As
> -the control flow graph and branch prediction information needs to be updated,
> -several restriction apply.
> -
> -Splitting of jump instruction into sequence that over by another jump
> -instruction is always valid, as compiler expect identical behavior of new
> -jump.  When new sequence contains multiple jump instructions or new labels,
> -more assistance is needed.  Splitter is required to create only unconditional
> -jumps, or simple conditional jump instructions.  Additionally it must attach 
> a
> -@code{REG_BR_PROB} note to each conditional jump.  A global variable
> -@code{split_branch_probability} holds the probability of the original branch 
> in case
> -it was a simple conditional jump, @minus{}1 otherwise.  To simplify
> -recomputing of edge frequencies, the new sequence is required to have only
> -forward jumps to the newly created labels.
> +The splitter is allowed to split jump instructions into a sequence of jumps 
> or
> +create new jumps while splitting non-jump instructions.  As the control flow
> +graph and branch prediction information needs to be updated after the 
> splitter
> +runs, several restrictions apply.
> +
> +Splitting of a jump instruction into a sequence that has another jump
> +instruction to the same label is always valid, as the compiler expects
> +identical behavior of the new jump.  When the new sequence contains multiple
> +jump instructions or new labels, more assistance is needed.  The splitter is
> +permitted to create only unconditional jumps, or simple conditional jump
> +instructions.  Additionally it must attach a @code{REG_BR_PROB} note to each
> +conditional jump.  A global variable @code{split_branch_probability} holds 
> the
> +probability of the original branch in case it was a simple conditional jump,
> +@minus{}1 otherwise.  To simplify recomputing of edge frequencies, the new
> +sequence is permitted to have only forward jumps to the newly-created labels.
>  
>  @findex define_insn_and_split
>  For the common case where the pattern of a define_split exactly matches the
> -- 
> 2.30.2
> 
> brgds, H-P
> 


[committed] CRIS: Correct "T" to define_memory_constraint, not define_constraint

2023-03-27 Thread Hans-Peter Nilsson via Gcc-patches
This patch has no effect on builds using reload of libgcc, newlib libc, my
own at-a-glance-testsuite and coremark.  That somewhat surprisingly
also goes for LRA builds, even with all CRIS reload_in_progress
augmented to include lra_in_progress.  I just noticed it when checking
because another port had a similar fix, where it mattered for LRA.

* config/cris/constraints.md ("T"): Correct to
define_memory_constraint.
---
 gcc/config/cris/constraints.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/cris/constraints.md b/gcc/config/cris/constraints.md
index 5efb61364f46..fa9aa19d13e3 100644
--- a/gcc/config/cris/constraints.md
+++ b/gcc/config/cris/constraints.md
@@ -100,7 +100,7 @@ (define_memory_constraint "Q"
   || reload_completed)")))
 
 ;; Extra constraints.
-(define_constraint "T"
+(define_memory_constraint "T"
   "Memory three-address operand."
   ;; All are indirect-memory:
   (and (match_code "mem")
-- 
2.30.2



[committed] CRIS: Add peephole2 to handle gcc.target/cris/rld-legit1.c for LRA

2023-03-27 Thread Hans-Peter Nilsson via Gcc-patches
The test-case gcc.target/cris/rld-legit1.c is a reduced
test-case that required defining LEGITIMIZE_RELOAD_ADDRESS
to stop the address from being decomposed into several insns
by reload.  Valid but suboptimal code was generated.

(Before implementing that hook for CRIS, the same test-case
also exposed a bug in reload, and a fix was committed to
avoid an ICE; see e.g. git r0-71992-gff0d9879ab0f30 and
related commits.  But, post-cc0, reload no longer handles
this test-case without LEGITIMIZE_RELOAD_ADDRESS helping and
there'd again an be ICE for CRIS (again: only if
LEGITIMIZE_RELOAD_ADDRESS is disabled).  There's a patch to
reload to fix that, at
https://gcc.gnu.org/pipermail/gcc-patches/2023-February/612039.html)

But, LRA also does not handle that test-case gracefully, and
like reload without LEGITIMIZE_RELOAD_ADDRESS for CRIS,
decomposes the address into a suboptimal (but valid)
sequence, about as messy as that from reload, and
gcc.target/cris/rld-legit1.c would regress for LRA.  There's
nothing equivalent to LEGITIMIZE_RELOAD_ADDRESS for LRA.
(Stepping through LRA, I can't find an obvious place where
to put such a hook.  Granted, I haven't seen this kind of
messy decomposition in other code, so I'm not insisting a
LEGITIMIZE_RELOAD_ADDRESS-like hook is a good idea.)

These new peephole2's are required to not regress
gcc.target/cris/rld-legit1.c with LRA enabled for CRIS.
They don't appear to otherwise make a difference for neither
libgcc, newlib libc, my own at-a-glance tests nor coremark,
for neither LRA nor reload.

* config/cris/cris.md (BW2): New mode-iterator.
(lra_szext_decomposed, lra_szext_decomposed_indirect_with_offset): New
peephole2s.
---
 gcc/config/cris/cris.md | 50 +
 1 file changed, 50 insertions(+)

diff --git a/gcc/config/cris/cris.md b/gcc/config/cris/cris.md
index 2bea480a0200..c3de259983c6 100644
--- a/gcc/config/cris/cris.md
+++ b/gcc/config/cris/cris.md
@@ -183,6 +183,10 @@ (define_mode_iterator SI_ [SI])
 
 (define_mode_iterator WD [SI HI])
 (define_mode_iterator BW [HI QI])
+
+; Another "BW" for use where an independent iteration is needed.
+(define_mode_iterator BW2 [HI QI])
+
 (define_mode_attr S [(SI "HI") (HI "QI")])
 (define_mode_attr s [(SI "hi") (HI "qi")])
 (define_mode_attr m [(SI ".d") (HI ".w") (QI ".b")])
@@ -2832,6 +2836,52 @@ (define_peephole2 ; andqu
   operands[3] = gen_rtx_ZERO_EXTEND (SImode, op1);
   operands[4] = GEN_INT (trunc_int_for_mode (INTVAL (operands[1]), QImode));
 })
+
+;; Fix a decomposed szext: fuse it with the memory operand of the
+;; load.  This is typically the sign-extension part of a decomposed
+;; "indirect offset" address.
+(define_peephole2 ; lra_szext_decomposed
+  [(parallel
+[(set (match_operand:BW 0 "register_operand")
+ (match_operand:BW 1 "memory_operand"))
+ (clobber (reg:CC CRIS_CC0_REGNUM))])
+   (parallel
+[(set (match_operand:SI 2 "register_operand") (szext:SI (match_dup 0)))
+ (clobber (reg:CC CRIS_CC0_REGNUM))])]
+  "REGNO (operands[0]) == REGNO (operands[2])
+   || peep2_reg_dead_p (2, operands[0])"
+  [(parallel
+[(set (match_dup 2) (szext:SI (match_dup 1)))
+ (clobber (reg:CC CRIS_CC0_REGNUM))])])
+
+;; Re-compose a decomposed "indirect offset" address for a szext
+;; operation.  The non-clobbering "addi" is generated by LRA.
+;; This and lra_szext_decomposed is covered by cris/rld-legit1.c.
+(define_peephole2 ; lra_szext_decomposed_indirect_with_offset
+  [(parallel
+[(set (match_operand:SI 0 "register_operand")
+ (sign_extend:SI (mem:BW (match_operand:SI 1 "register_operand"
+ (clobber (reg:CC CRIS_CC0_REGNUM))])
+   (set (match_dup 0)
+   (plus:SI (match_dup 0) (match_operand:SI 2 "register_operand")))
+   (parallel
+[(set (match_operand:SI 3 "register_operand")
+ (szext:SI (mem:BW2 (match_dup 0
+ (clobber (reg:CC CRIS_CC0_REGNUM))])]
+  "(REGNO (operands[0]) == REGNO (operands[3])
+|| peep2_reg_dead_p (3, operands[0]))
+   && (REGNO (operands[0]) == REGNO (operands[1])
+   || peep2_reg_dead_p (3, operands[0]))"
+  [(parallel
+[(set
+  (match_dup 3)
+  (szext:SI
+   (mem:BW2 (plus:SI (szext:SI (mem:BW (match_dup 1))) (match_dup 2)
+ (clobber (reg:CC CRIS_CC0_REGNUM))])])
+
+;; Add operations with similar or same decomposed addresses here, when
+;; encountered - but only when covered by mentioned test-cases for at
+;; least one of the cases generalized in the pattern.
 
 ;; Local variables:
 ;; mode:emacs-lisp
-- 
2.30.2



[committed] CRIS: Improve bailing for eliminable compares for "addi" vs. "add"

2023-03-27 Thread Hans-Peter Nilsson via Gcc-patches
This patch affects a post-reload define_split for CRIS that transforms
a condition-code-clobbering addition into a non-clobbering addition.
(A "two-operand" addition between registers is the only insn that has
both a condition-code-clobbering and a non-clobbering variant for
CRIS.)  Many more "add.d":s are replaced by non-condition-code-
clobbering "addi":s after this patch, but most of the transformations
don't matter.

CRIS with LRA generated code that exposed a flaw with the original
patch: it bailed too easily, on *any* insn using the result of the
addition.  To wit, more effort than simply applying reg_mentioned_p is
needed to inspect the user, in the code to avoid munging an insn
sequence that cmpelim is supposed to handle.

With this patch coremark score for CRIS (*with reload*) improves by
less than 0.01% (a single "nop" is eliminated in
core_state_transition, in an execution path that affects ~1/20 of all
of the 10240 calls).  However, the original cause for this patch is to
not regress gcc.target/cris/pr93372-44.c for LRA, where otherwise a
needless "cmpq" is emitted.  For CRIS with LRA, the performance effect
on coremark isn't even measurable, except by reducing the size of the
executable due to affecting non-called library code.

* config/cris/cris.md ("*add3_addi"): Improve to bail only
for possible eliminable compares.
---
 gcc/config/cris/cris.md | 53 -
 1 file changed, 52 insertions(+), 1 deletion(-)

diff --git a/gcc/config/cris/cris.md b/gcc/config/cris/cris.md
index 2bea480a0200..30ff7e75c1bf 100644
--- a/gcc/config/cris/cris.md
+++ b/gcc/config/cris/cris.md
@@ -1362,12 +1362,63 @@ (define_split ;; "*add3_addi"
 {
   rtx reg = operands[0];
   rtx_insn *i = next_nonnote_nondebug_insn_bb (curr_insn);
+  rtx x, src, dest;
 
   while (i != NULL_RTX && (!INSN_P (i) || DEBUG_INSN_P (i)))
 i = next_nonnote_nondebug_insn_bb (i);
 
-  if (i == NULL_RTX || reg_mentioned_p (reg, i) || BARRIER_P (i))
+  /* We don't want to strip the clobber if the next insn possibly uses the
+ zeroness of the result.  Preferably fail only if we see a compare insn
+ that looks eliminable and with the register "reg" compared.  With some
+ effort we could also check for an equality test (EQ, NE) in the post-split
+ user, just not for now.  */
+  if (i == NULL_RTX)
 FAIL;
+
+  x = single_set (i);
+
+  /* We explicitly need to bail on a BARRIER, but that's implied by a failing
+ single_set test.  */
+  if (x == NULL_RTX)
+FAIL;
+
+  src = SET_SRC (x);
+  dest = SET_DEST (x);
+
+  /* Bail on (post-split) eliminable compares.  */
+  if (REG_P (dest) && REGNO (dest) == CRIS_CC0_REGNUM
+  && GET_CODE (src) == COMPARE)
+{
+  rtx cop0 = XEXP (src, 0);
+
+  if (REG_P (cop0) && REGNO (cop0) == REGNO (reg)
+ && XEXP (src, 1) == const0_rtx)
+   FAIL;
+}
+
+  /* Bail out if we see a (pre-split) cbranch or cstore where the comparison
+ looks eliminable and uses the destination register in this addition.  We
+ don't need to look very deep: a single_set which is a parallel clobbers
+ something, and (one of) that something, is always CRIS_CC0_REGNUM here.
+ Also, the entities we're looking for are two-element parallels.  A
+ split-up cbranch or cstore doesn't clobber CRIS_CC0_REGNUM.  A cbranch has
+ if_then_else as its source with a comparison operator as the condition,
+ and a cstore has a source with the comparison operator directly.  That
+ also matches dstep, so look for pc as destination for the if_then_else.
+ We error on the safe side if we happen to catch other conditional entities
+ and FAIL, that just means the split won't happen.  */
+  if (GET_CODE (PATTERN (i)) == PARALLEL && XVECLEN (PATTERN (i), 0) == 2)
+{
+  rtx cmp
+   = (GET_CODE (src) == IF_THEN_ELSE && dest == pc_rtx
+  ? XEXP (src, 0)
+  : (COMPARISON_P (src) ? src : NULL_RTX));
+  gcc_assert (cmp == NULL_RTX || COMPARISON_P (cmp));
+
+  if (cmp && REG_P (XEXP (cmp, 0)) && XEXP (cmp, 1) == const0_rtx
+ && REGNO (XEXP (cmp, 0)) == REGNO (reg))
+   FAIL;
+}
 })
 
 (define_insn "mul3"
-- 
2.30.2



[committed] CRIS: Remove unused constraint "R".

2023-03-27 Thread Hans-Peter Nilsson via Gcc-patches
gcc:
* config/cris/constraints.md ("R"): Remove unused constraint.
---
 gcc/config/cris/constraints.md | 10 --
 1 file changed, 10 deletions(-)

diff --git a/gcc/config/cris/constraints.md b/gcc/config/cris/constraints.md
index 05a1d24ef5a1..5efb61364f46 100644
--- a/gcc/config/cris/constraints.md
+++ b/gcc/config/cris/constraints.md
@@ -100,16 +100,6 @@ (define_memory_constraint "Q"
   || reload_completed)")))
 
 ;; Extra constraints.
-(define_constraint "R"
-  "An operand to BDAP or BIAP."
-   ;; A BIAP; r.S?
-  (ior (match_test "cris_biap_index_p (op, reload_in_progress
-  || reload_completed)")
-   ;; A [reg] or (int) [reg], maybe with post-increment.
-   (match_test "cris_bdap_index_p (op, reload_in_progress
-  || reload_completed)")
-   (match_test "CONSTANT_P (op)")))
-
 (define_constraint "T"
   "Memory three-address operand."
   ;; All are indirect-memory:
-- 
2.30.2



[COMMITTED] testsuite: Xfail gcc.dg/tree-ssa/ssa-fre-100.c for ! natural_alignment_32

2023-03-23 Thread Hans-Peter Nilsson via Gcc-patches
Tested native x86_64-linux and cris-elf.  The "recent patch
to gcc.dg/tree-ssa/pr100359.c" refers to r13-6838.
Committed as obvious after that commit.
-- >8 --
The test gcc.dg/tree-ssa/ssa-fre-100.c fails the
scan-tree-dump-not fre1 "baz" for at least m68k-linux,
pru-elf, and cris-elf according to posts on gcc-testresults.

GCC requires int-size-alignment for a target to see through
the "int *" dereference and perform value-numbering.  See
comments in PR91419 and also the recent patch to
gcc.dg/tree-ssa/pr100359.c  This is a flaw in gcc rather
than the target, so prefer an xfail rather than skipping
the test.

* gcc.dg/tree-ssa/ssa-fre-100.c: XFAIL for ! natural_alignment_32.
---
 gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-100.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-100.c 
b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-100.c
index ead76548f3df..1b6a3a398a4e 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-100.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-100.c
@@ -22,4 +22,4 @@ void foo (int *p, int n)
   while (--n);
 }
 
-/* { dg-final { scan-tree-dump-not "baz" "fre1" } } */
+/* { dg-final { scan-tree-dump-not "baz" "fre1" { xfail { ! 
natural_alignment_32 } } } } */
-- 
2.30.2



[PATCH] testsuite: Compile-only gcc.dg/tree-ssa/pr100359.c if ! natural_alignment_32

2023-03-21 Thread Hans-Peter Nilsson via Gcc-patches
(CC to respectively author and committer of pr100359.c.)

Tested cris-elf and native x86_64-linux: the two
scan-tree-dumps pass and x86_64-linux still links.  Ok to
commit?

-- >8 --
The test gcc.dg/tree-ssa/pr100359.c fails the "test for
excess errors" for at least m68k-linux, pru-elf, and
cris-elf according to posts on gcc-testresults.  For
cris-elf, the "excess errors" is a failure to link; an
undefined reference to foo, because the code has a call to
an extern function foo, which is not optimized away, and
which is not defined.  I guess it's the same for those other
targets.

>From comparative gdb sessions for native x86_64-linux and
cris-elf, I see tree-ssa-sccvn.cc:vn_reference_lookup_3
(called from the "pre" pass) requires int-size-alignment for
a target to see through the "int *" dereference, that the
expression is constant false and subsequently optimize away
the call to foo.  The conclusion is with substantially less
effort available from comments in PR91419.

The point of the test seems only incidental to
optimizing-out the call to foo, judging from the comments in
PR100359, so an alternative is compile it (not link it) for
all targets.  However, I chose to not change the nature of
the test where it passes.

* gcc.dg/tree-ssa/pr100359.c: Compile-only for ! natural_alignment_32.
---
 gcc/testsuite/gcc.dg/tree-ssa/pr100359.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr100359.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr100359.c
index 29243522caaf..236dbef41c4e 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr100359.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr100359.c
@@ -1,4 +1,5 @@
-/* { dg-do link } */
+/* { dg-do link { target natural_alignment_32 } } */
+/* { dg-do compile { target { ! natural_alignment_32 } } } */
 /* { dg-options "-O3 -fdump-tree-cunrolli-optimized" } */
 
 extern void foo(void);
-- 
2.30.2



Re: [PATCH v2] doc: md.texi (Insn Splitting): Tweak wording for readability.

2023-03-21 Thread Hans-Peter Nilsson via Gcc-patches
> From: Hans-Peter Nilsson 
> CC: , 
> Date: Tue, 14 Mar 2023 17:04:43 +0100

Ping on contents (formatting is approved):

> I needed to check what was allowed in a define_split, but
> had problems understanding what was meant by "Splitting of
> jump instruction into sequence that over by another jump
> instruction".
> 
>   * doc/md.texi (Insn Splitting): Tweak wording for readability.
> 
> Co-Authored-By: Sandra Loosemore 
> ---
>  gcc/doc/md.texi | 30 +++---
>  1 file changed, 15 insertions(+), 15 deletions(-)
> 
> diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> index 8e3113599fdc..134b227b9a93 100644
> --- a/gcc/doc/md.texi
> +++ b/gcc/doc/md.texi
> @@ -8756,21 +8756,21 @@ insns that don't.  Instead, write two separate 
> @code{define_split}
>  definitions, one for the insns that are valid and one for the insns that
>  are not valid.
>  
> -The splitter is allowed to split jump instructions into sequence of
> -jumps or create new jumps in while splitting non-jump instructions.  As
> -the control flow graph and branch prediction information needs to be updated,
> -several restriction apply.
> -
> -Splitting of jump instruction into sequence that over by another jump
> -instruction is always valid, as compiler expect identical behavior of new
> -jump.  When new sequence contains multiple jump instructions or new labels,
> -more assistance is needed.  Splitter is required to create only unconditional
> -jumps, or simple conditional jump instructions.  Additionally it must attach 
> a
> -@code{REG_BR_PROB} note to each conditional jump.  A global variable
> -@code{split_branch_probability} holds the probability of the original branch 
> in case
> -it was a simple conditional jump, @minus{}1 otherwise.  To simplify
> -recomputing of edge frequencies, the new sequence is required to have only
> -forward jumps to the newly created labels.
> +The splitter is allowed to split jump instructions into a sequence of jumps 
> or
> +create new jumps while splitting non-jump instructions.  As the control flow
> +graph and branch prediction information needs to be updated after the 
> splitter
> +runs, several restrictions apply.
> +
> +Splitting of a jump instruction into a sequence that has another jump
> +instruction to the same label is always valid, as the compiler expects
> +identical behavior of the new jump.  When the new sequence contains multiple
> +jump instructions or new labels, more assistance is needed.  The splitter is
> +permitted to create only unconditional jumps, or simple conditional jump
> +instructions.  Additionally it must attach a @code{REG_BR_PROB} note to each
> +conditional jump.  A global variable @code{split_branch_probability} holds 
> the
> +probability of the original branch in case it was a simple conditional jump,
> +@minus{}1 otherwise.  To simplify recomputing of edge frequencies, the new
> +sequence is permitted to have only forward jumps to the newly-created labels.
>  
>  @findex define_insn_and_split
>  For the common case where the pattern of a define_split exactly matches the
> -- 
> 2.30.2
> 
> brgds, H-P
> 


Re: [PATCH] testsuite: Handle default_packed targets in gcc.dg/plugin

2023-03-17 Thread Hans-Peter Nilsson via Gcc-patches
> From: David Malcolm 
> Date: Thu, 16 Mar 2023 14:42:58 -0400

> I think I prefer the top one-liner dg-skip-if approach you mentioned in
> your original email; it seems simplest.

Ok then.  There's also a choice between adding a
target-specifier (i.e. "{ target { ! default_packed } }") to
the dg-compile line, but a dg-skip-if is somewhat more
scalable and can, like here, be a bit more informative.

This is what I'll commit, as "testsuite: Skip some
gcc.dg/plugin tests for default_packed targets":

Avoid unweildy structure-layout-specific message-matching
expressions by exluding targets that lay out structures as
if they had been specified with __attribute__ ((__packed__)),
for tests where multiple messages depend on the structure
layout.

It's arguably a judgement call whether to skip some of these
tests or add multiple lines of matches depending on the
layout of structures.

* gcc.dg/plugin/infoleak-2.c,
gcc.dg/plugin/infoleak-CVE-2011-1078-1.c,
gcc.dg/plugin/infoleak-CVE-2011-1078-2.c,
gcc.dg/plugin/infoleak-CVE-2017-18549-1.c,
gcc.dg/plugin/infoleak-CVE-2017-18550-1.c,
gcc.dg/plugin/infoleak-antipatterns-1.c,
gcc.dg/plugin/infoleak-fixit-1.c: Skip for default_packed targets.
---
 gcc/testsuite/gcc.dg/plugin/infoleak-2.c| 1 +
 gcc/testsuite/gcc.dg/plugin/infoleak-CVE-2011-1078-1.c  | 1 +
 gcc/testsuite/gcc.dg/plugin/infoleak-CVE-2011-1078-2.c  | 1 +
 gcc/testsuite/gcc.dg/plugin/infoleak-CVE-2017-18549-1.c | 1 +
 gcc/testsuite/gcc.dg/plugin/infoleak-CVE-2017-18550-1.c | 1 +
 gcc/testsuite/gcc.dg/plugin/infoleak-antipatterns-1.c   | 1 +
 gcc/testsuite/gcc.dg/plugin/infoleak-fixit-1.c  | 1 +
 7 files changed, 7 insertions(+)

diff --git a/gcc/testsuite/gcc.dg/plugin/infoleak-2.c 
b/gcc/testsuite/gcc.dg/plugin/infoleak-2.c
index 252f8f25918a..43ab41b8a97b 100644
--- a/gcc/testsuite/gcc.dg/plugin/infoleak-2.c
+++ b/gcc/testsuite/gcc.dg/plugin/infoleak-2.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
 /* { dg-options "-fanalyzer" } */
 /* { dg-require-effective-target analyzer } */
+/* { dg-skip-if "structure layout assumption not met" { default_packed } } */
 
 #include 
 
diff --git a/gcc/testsuite/gcc.dg/plugin/infoleak-CVE-2011-1078-1.c 
b/gcc/testsuite/gcc.dg/plugin/infoleak-CVE-2011-1078-1.c
index 3616fbe176b3..8afce8eefac2 100644
--- a/gcc/testsuite/gcc.dg/plugin/infoleak-CVE-2011-1078-1.c
+++ b/gcc/testsuite/gcc.dg/plugin/infoleak-CVE-2011-1078-1.c
@@ -9,6 +9,7 @@
 /* { dg-do compile } */
 /* { dg-options "-fanalyzer" } */
 /* { dg-require-effective-target analyzer } */
+/* { dg-skip-if "structure layout assumption not met" { default_packed } } */
 
 #include 
 
diff --git a/gcc/testsuite/gcc.dg/plugin/infoleak-CVE-2011-1078-2.c 
b/gcc/testsuite/gcc.dg/plugin/infoleak-CVE-2011-1078-2.c
index 2096bda71798..1142cf22655a 100644
--- a/gcc/testsuite/gcc.dg/plugin/infoleak-CVE-2011-1078-2.c
+++ b/gcc/testsuite/gcc.dg/plugin/infoleak-CVE-2011-1078-2.c
@@ -3,6 +3,7 @@
 /* { dg-do compile } */
 /* { dg-options "-fanalyzer" } */
 /* { dg-require-effective-target analyzer } */
+/* { dg-skip-if "structure layout assumption not met" { default_packed } } */
 
 #include 
 
diff --git a/gcc/testsuite/gcc.dg/plugin/infoleak-CVE-2017-18549-1.c 
b/gcc/testsuite/gcc.dg/plugin/infoleak-CVE-2017-18549-1.c
index 8a1c816cc1b5..239c7d1df5d3 100644
--- a/gcc/testsuite/gcc.dg/plugin/infoleak-CVE-2017-18549-1.c
+++ b/gcc/testsuite/gcc.dg/plugin/infoleak-CVE-2017-18549-1.c
@@ -10,6 +10,7 @@
 /* { dg-do compile } */
 /* { dg-options "-fanalyzer" } */
 /* { dg-require-effective-target analyzer } */
+/* { dg-skip-if "structure layout assumption not met" { default_packed } } */
 
 #include 
 
diff --git a/gcc/testsuite/gcc.dg/plugin/infoleak-CVE-2017-18550-1.c 
b/gcc/testsuite/gcc.dg/plugin/infoleak-CVE-2017-18550-1.c
index 4272da96bab0..449348a1017f 100644
--- a/gcc/testsuite/gcc.dg/plugin/infoleak-CVE-2017-18550-1.c
+++ b/gcc/testsuite/gcc.dg/plugin/infoleak-CVE-2017-18550-1.c
@@ -10,6 +10,7 @@
 /* { dg-do compile } */
 /* { dg-options "-fanalyzer" } */
 /* { dg-require-effective-target analyzer } */
+/* { dg-skip-if "structure layout assumption not met" { default_packed } } */
 
 #include 
 
diff --git a/gcc/testsuite/gcc.dg/plugin/infoleak-antipatterns-1.c 
b/gcc/testsuite/gcc.dg/plugin/infoleak-antipatterns-1.c
index 500845364388..84789a7f157e 100644
--- a/gcc/testsuite/gcc.dg/plugin/infoleak-antipatterns-1.c
+++ b/gcc/testsuite/gcc.dg/plugin/infoleak-antipatterns-1.c
@@ -3,6 +3,7 @@
 /* { dg-do compile } */
 /* { dg-options "-fanalyzer" } */
 /* { dg-require-effective-target analyzer } */
+/* { dg-skip-if "structure layout assumption not met" { default_packed } } */
 
 typedef unsigned char u8;
 typedef unsigned __INT16_TYPE__ u16;
diff --git a/gcc/testsuite/gcc.dg/plugin/infoleak-fixit-1.c 
b/gcc/testsuite/gcc.dg/plugin/infoleak-fixit-1.c
index 6961b44f76b9..56158c12520a 100644
--- a/gcc/testsuite/gcc.dg/plugin/infoleak-fixit-1.c
+++ 

Re: [PATCH] testsuite: Handle default_packed targets in gcc.dg/plugin

2023-03-16 Thread Hans-Peter Nilsson via Gcc-patches
> From: Hans-Peter Nilsson 
> Date: Thu, 16 Mar 2023 19:25:05 +0100

> That doesn't seem like a good idea.  At a glance the
> *testcode* will be simpler, but the patch will be slightly
> larger

Bah, s/but the patch will be slightly larger/and the patch
will certainly be smaller, but because less is tested/.

brgds, H-P


Re: [PATCH] testsuite: Handle default_packed targets in gcc.dg/plugin

2023-03-16 Thread Hans-Peter Nilsson via Gcc-patches
> From: David Malcolm 
> Date: Thu, 16 Mar 2023 13:55:48 -0400

> On Thu, 2023-03-09 at 19:56 +0100, Hans-Peter Nilsson wrote:
> > It's not obvious to me whether considered best to include or
> > exclude these tests that depend on structure layout details.
> > If excluding, the obvious alternative to this patch is then
> > to add a top one-liner (to dg-skip-if the test for
> > default_packed targets or a similar excluding expression).
> > I'm fine either way, just suggesting the following, which
> > handles the cris-elf test-case failures I see for these
> > tests, and causes no change in results for native
> > x86_64-pc-linux-gnu.
> 
> Thanks for looking at this.
> 
> How about a third option: can the structs be explicitly marked as being
> packed, by adding __attribute__((__packed__)) to the various structs? 
> The tests are all about detecting problems with padding bits, and
> presumably we can have padding bits on all targets if we explicitly ask
> for them.
> 
> Does that make for a simpler patch?

Did I get you right: making the layout the same for all
targets, by -for all structs that in my patch needed
different layout- marking them with
__attribute__((__packed__)) and adjust numbers in warnings?

That doesn't seem like a good idea.  At a glance the
*testcode* will be simpler, but the patch will be slightly
larger and have a lot of "-" lines instead of "+" lines, as
the patch cause a lot of warnings to be dropped: you'll test
for absence of warnings instead of proper warnings.

Looks like you'll lose 24 of the padding tests; 30 lines
where I added "target ! default_packed" and 6 where I added
"target default_packed".

Perhaps I misunderstood?

brgds, H-P


> Dave
> 
> > 
> > Beware that some of the tests have lines with trailing
> > whitespace.  Where lines are changed in this patch, the
> > trailing whitespace is removed.
> 
> 
> > 
> > Ok to commit?
> > 
> > -- >8 --
> > It's a judgement call whether to just skip some of these
> > tests rather than trying to match messages depending on the
> > layout of structures, but better include than exclude.
> > 
> > * gcc.dg/plugin/infoleak-2.c,
> > gcc.dg/plugin/infoleak-CVE-2011-1078-1.c,
> > gcc.dg/plugin/infoleak-CVE-2011-1078-2.c,
> > gcc.dg/plugin/infoleak-CVE-2017-18549-1.c,
> > gcc.dg/plugin/infoleak-CVE-2017-18550-1.c,
> > gcc.dg/plugin/infoleak-antipatterns-1.c,
> > gcc.dg/plugin/infoleak-fixit-1.c: Handle default_packed
> > targets.
> > ---
> >  gcc/testsuite/gcc.dg/plugin/infoleak-2.c    | 13 ---
> > --
> >  .../gcc.dg/plugin/infoleak-CVE-2011-1078-1.c    | 10 +-
> >  .../gcc.dg/plugin/infoleak-CVE-2011-1078-2.c    | 10 +-
> >  .../gcc.dg/plugin/infoleak-CVE-2017-18549-1.c   | 10 +-
> >  .../gcc.dg/plugin/infoleak-CVE-2017-18550-1.c   |  7 ---
> >  .../gcc.dg/plugin/infoleak-antipatterns-1.c | 10 +-
> >  gcc/testsuite/gcc.dg/plugin/infoleak-fixit-1.c  | 10 ++
> >  7 files changed, 38 insertions(+), 32 deletions(-)
> > 
> > diff --git a/gcc/testsuite/gcc.dg/plugin/infoleak-2.c
> > b/gcc/testsuite/gcc.dg/plugin/infoleak-2.c
> > index 252f8f25918a..4ba484b3c6be 100644
> > --- a/gcc/testsuite/gcc.dg/plugin/infoleak-2.c
> > +++ b/gcc/testsuite/gcc.dg/plugin/infoleak-2.c
> > @@ -18,16 +18,19 @@ struct st
> >    int b:1; /* { dg-message "field 'b' is uninitialized \\(1 bit\\)"
> > "field" } */
> >     /* { dg-message "padding after field 'b' is uninitialized
> > \\(7 bits\\)" "padding" { target *-*-* } .-1 } */
> >    u8 d;    /* { dg-message "field 'd' is uninitialized \\(1 byte\\)"
> > } */
> > -  int c:7; /* { dg-message "padding after field 'c' is uninitialized
> > \\(9 bits\\)" } */
> > -  u16 e;   /* { dg-message "padding after field 'e' is uninitialized
> > \\(2 bytes\\)" } */  
> > +  int c:7; /* { dg-message "padding after field 'c' is uninitialized
> > \\(9 bits\\)" "padding" { target { ! default_packed } } } */
> > +   /* { dg-message "padding after field 'c' is uninitialized
> > \\(1 bit\\)" "padding" { target default_packed } .-1 } */
> > +  u16 e;   /* { dg-message "padding after field 'e' is uninitialized
> > \\(2 bytes\\)" "padding" { target { ! default_packed } } } */
> >  };
> >  
> >  void test (void __user *dst, u16 v)
> >  {
> >    struct st s; /* { dg-message "region created on stack here"
> > "where" } */
> > -  /* { dg-message "capacity: 12 bytes" "capacity" { target *-*-* }
> > .-1 } */
> > -  /* { dg-message "suggest forcing zero-initialization by providing
> > a '\\{0\\}' initializer" "fix-it" { target *-*-* } .-2 } */  
> > +  /* { dg-message "capacity: 12 bytes" "capacity" { target { !
> > default_packed } } .-1 } */
> > +  /* { dg-message "capacity: 9 bytes" "capacity" { target
> > default_packed } .-2 } */
> > +  /* { dg-message "suggest forcing zero-initialization by providing
> > a '\\{0\\}' initializer" "fix-it" { target *-*-* } .-3 } */
> >    s.e = v;
> >    

Ping: [PATCH] testsuite: Handle default_packed targets in gcc.dg/plugin

2023-03-16 Thread Hans-Peter Nilsson via Gcc-patches
Pinging this patch.

> From: Hans-Peter Nilsson 
> Date: Thu, 9 Mar 2023 19:56:16 +0100
> 
> It's not obvious to me whether considered best to include or
> exclude these tests that depend on structure layout details.
> If excluding, the obvious alternative to this patch is then
> to add a top one-liner (to dg-skip-if the test for
> default_packed targets or a similar excluding expression).
> I'm fine either way, just suggesting the following, which
> handles the cris-elf test-case failures I see for these
> tests, and causes no change in results for native
> x86_64-pc-linux-gnu.
> 
> Beware that some of the tests have lines with trailing
> whitespace.  Where lines are changed in this patch, the
> trailing whitespace is removed.
> 
> Ok to commit?
> 
> -- >8 --
> It's a judgement call whether to just skip some of these
> tests rather than trying to match messages depending on the
> layout of structures, but better include than exclude.
> 
>   * gcc.dg/plugin/infoleak-2.c,
>   gcc.dg/plugin/infoleak-CVE-2011-1078-1.c,
>   gcc.dg/plugin/infoleak-CVE-2011-1078-2.c,
>   gcc.dg/plugin/infoleak-CVE-2017-18549-1.c,
>   gcc.dg/plugin/infoleak-CVE-2017-18550-1.c,
>   gcc.dg/plugin/infoleak-antipatterns-1.c,
>   gcc.dg/plugin/infoleak-fixit-1.c: Handle default_packed targets.
> ---
>  gcc/testsuite/gcc.dg/plugin/infoleak-2.c| 13 -
>  .../gcc.dg/plugin/infoleak-CVE-2011-1078-1.c| 10 +-
>  .../gcc.dg/plugin/infoleak-CVE-2011-1078-2.c| 10 +-
>  .../gcc.dg/plugin/infoleak-CVE-2017-18549-1.c   | 10 +-
>  .../gcc.dg/plugin/infoleak-CVE-2017-18550-1.c   |  7 ---
>  .../gcc.dg/plugin/infoleak-antipatterns-1.c | 10 +-
>  gcc/testsuite/gcc.dg/plugin/infoleak-fixit-1.c  | 10 ++
>  7 files changed, 38 insertions(+), 32 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.dg/plugin/infoleak-2.c 
> b/gcc/testsuite/gcc.dg/plugin/infoleak-2.c
> index 252f8f25918a..4ba484b3c6be 100644
> --- a/gcc/testsuite/gcc.dg/plugin/infoleak-2.c
> +++ b/gcc/testsuite/gcc.dg/plugin/infoleak-2.c
> @@ -18,16 +18,19 @@ struct st
>int b:1; /* { dg-message "field 'b' is uninitialized \\(1 bit\\)" "field" 
> } */
> /* { dg-message "padding after field 'b' is uninitialized \\(7 
> bits\\)" "padding" { target *-*-* } .-1 } */
>u8 d;/* { dg-message "field 'd' is uninitialized \\(1 byte\\)" } */
> -  int c:7; /* { dg-message "padding after field 'c' is uninitialized \\(9 
> bits\\)" } */
> -  u16 e;   /* { dg-message "padding after field 'e' is uninitialized \\(2 
> bytes\\)" } */  
> +  int c:7; /* { dg-message "padding after field 'c' is uninitialized \\(9 
> bits\\)" "padding" { target { ! default_packed } } } */
> +   /* { dg-message "padding after field 'c' is uninitialized \\(1 
> bit\\)" "padding" { target default_packed } .-1 } */
> +  u16 e;   /* { dg-message "padding after field 'e' is uninitialized \\(2 
> bytes\\)" "padding" { target { ! default_packed } } } */
>  };
>  
>  void test (void __user *dst, u16 v)
>  {
>struct st s; /* { dg-message "region created on stack here" "where" } */
> -  /* { dg-message "capacity: 12 bytes" "capacity" { target *-*-* } .-1 } */
> -  /* { dg-message "suggest forcing zero-initialization by providing a 
> '\\{0\\}' initializer" "fix-it" { target *-*-* } .-2 } */  
> +  /* { dg-message "capacity: 12 bytes" "capacity" { target { ! 
> default_packed } } .-1 } */
> +  /* { dg-message "capacity: 9 bytes" "capacity" { target default_packed } 
> .-2 } */
> +  /* { dg-message "suggest forcing zero-initialization by providing a 
> '\\{0\\}' initializer" "fix-it" { target *-*-* } .-3 } */
>s.e = v;
>copy_to_user(dst, , sizeof (struct st)); /* { dg-warning "potential 
> exposure of sensitive information by copying uninitialized data from stack" 
> "warning" } */
> -  /* { dg-message "10 bytes are uninitialized" "note how much" { target 
> *-*-* } .-1 } */
> +  /* { dg-message "10 bytes are uninitialized" "note how much" { target { ! 
> default_packed } } .-1 } */
> +  /* { dg-message "7 bytes are uninitialized" "note how much" { target 
> default_packed } .-2 } */
>  }
> diff --git a/gcc/testsuite/gcc.dg/plugin/infoleak-CVE-2011-1078-1.c 
> b/gcc/testsuite/gcc.dg/plugin/infoleak-CVE-2011-1078-1.c
> index 3616fbe176b3..9269b911b22f 100644
> --- a/gcc/testsuite/gcc.dg/plugin/infoleak-CVE-2011-1078-1.c
> +++ b/gcc/testsuite/gcc.dg/plugin/infoleak-CVE-2011-1078-1.c
> @@ -51,7 +51,7 @@ struct socket {
>  
>  struct sco_conninfo {
>   __u16 hci_handle;
> - __u8  dev_class[3]; /* { dg-message "padding after field 'dev_class' is 
> uninitialized \\(1 byte\\)" } */
> + __u8  dev_class[3]; /* { dg-message "padding after field 'dev_class' is 
> uninitialized \\(1 byte\\)" "padding" { target { ! default_packed } } } */
>  };
>  
>  struct sco_conn {
> @@ -83,8 +83,8 @@ static int sco_sock_getsockopt_old_broken(struct socket 
> *sock, int optname, char
> 

[PATCH v2] doc: md.texi (Insn Splitting): Tweak wording for readability.

2023-03-14 Thread Hans-Peter Nilsson via Gcc-patches
> Date: Mon, 13 Mar 2023 22:31:21 -0600
> From: Sandra Loosemore 

> On 3/13/23 19:25, Hans-Peter Nilsson via Gcc-patches wrote:
> > Jan, did I get this right?  This was from your
> > r0-36413-g6b24c25948265c / svn r44249, now on its 22nd year!
> > 
> > I spot-checked the pdf for readability.  Also calling on a
> > doc maintainer to check grammos etc.  Ok to commit?
> > 
> > -- >8 --
> > I needed to check what was allowed in a define_split, but
> > had problems understanding what was meant by "Splitting of
> > jump instruction into sequence that over by another jump
> > instruction".
> > 
> > * doc/md.texi (Insn Splitting): Tweak wording for readability.
> 
> Thanks for noticing this!  I can't comment on technical correctness, but 
> I do have some further suggestions on wording below.

Thank you for the review!  Updated version below with your
suggestions.  When spot-checking the pdf I noticed a strange
split of the page after the next after the section I
changed: last on page 484 "17.17 Including Patterns in
Machine Descriptions", there's a "(include" last on the page
and "pathname)" on top of page 485.  I'm afraid this patch
triggered that.  IMHO it'd be wrong to diddle with
formatting of *that* in *this* patch, instead leaving it to
a follow-up-patch.  I think the obvious fix is to *not*
split up (include pathname)" because that just looks odd
even without the page end in-between.  Right?

-- >8 --
I needed to check what was allowed in a define_split, but
had problems understanding what was meant by "Splitting of
jump instruction into sequence that over by another jump
instruction".

* doc/md.texi (Insn Splitting): Tweak wording for readability.

Co-Authored-By: Sandra Loosemore 
---
 gcc/doc/md.texi | 30 +++---
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 8e3113599fdc..134b227b9a93 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -8756,21 +8756,21 @@ insns that don't.  Instead, write two separate 
@code{define_split}
 definitions, one for the insns that are valid and one for the insns that
 are not valid.
 
-The splitter is allowed to split jump instructions into sequence of
-jumps or create new jumps in while splitting non-jump instructions.  As
-the control flow graph and branch prediction information needs to be updated,
-several restriction apply.
-
-Splitting of jump instruction into sequence that over by another jump
-instruction is always valid, as compiler expect identical behavior of new
-jump.  When new sequence contains multiple jump instructions or new labels,
-more assistance is needed.  Splitter is required to create only unconditional
-jumps, or simple conditional jump instructions.  Additionally it must attach a
-@code{REG_BR_PROB} note to each conditional jump.  A global variable
-@code{split_branch_probability} holds the probability of the original branch 
in case
-it was a simple conditional jump, @minus{}1 otherwise.  To simplify
-recomputing of edge frequencies, the new sequence is required to have only
-forward jumps to the newly created labels.
+The splitter is allowed to split jump instructions into a sequence of jumps or
+create new jumps while splitting non-jump instructions.  As the control flow
+graph and branch prediction information needs to be updated after the splitter
+runs, several restrictions apply.
+
+Splitting of a jump instruction into a sequence that has another jump
+instruction to the same label is always valid, as the compiler expects
+identical behavior of the new jump.  When the new sequence contains multiple
+jump instructions or new labels, more assistance is needed.  The splitter is
+permitted to create only unconditional jumps, or simple conditional jump
+instructions.  Additionally it must attach a @code{REG_BR_PROB} note to each
+conditional jump.  A global variable @code{split_branch_probability} holds the
+probability of the original branch in case it was a simple conditional jump,
+@minus{}1 otherwise.  To simplify recomputing of edge frequencies, the new
+sequence is permitted to have only forward jumps to the newly-created labels.
 
 @findex define_insn_and_split
 For the common case where the pattern of a define_split exactly matches the
-- 
2.30.2

brgds, H-P


[PATCH] doc: md.texi (Insn Splitting): Tweak wording for readability.

2023-03-13 Thread Hans-Peter Nilsson via Gcc-patches
Jan, did I get this right?  This was from your
r0-36413-g6b24c25948265c / svn r44249, now on its 22nd year!

I spot-checked the pdf for readability.  Also calling on a
doc maintainer to check grammos etc.  Ok to commit?

-- >8 --
I needed to check what was allowed in a define_split, but
had problems understanding what was meant by "Splitting of
jump instruction into sequence that over by another jump
instruction".

* doc/md.texi (Insn Splitting): Tweak wording for readability.
---
 gcc/doc/md.texi | 30 +++---
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 8e3113599fdc..786365143179 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -8756,21 +8756,21 @@ insns that don't.  Instead, write two separate 
@code{define_split}
 definitions, one for the insns that are valid and one for the insns that
 are not valid.
 
-The splitter is allowed to split jump instructions into sequence of
-jumps or create new jumps in while splitting non-jump instructions.  As
-the control flow graph and branch prediction information needs to be updated,
-several restriction apply.
-
-Splitting of jump instruction into sequence that over by another jump
-instruction is always valid, as compiler expect identical behavior of new
-jump.  When new sequence contains multiple jump instructions or new labels,
-more assistance is needed.  Splitter is required to create only unconditional
-jumps, or simple conditional jump instructions.  Additionally it must attach a
-@code{REG_BR_PROB} note to each conditional jump.  A global variable
-@code{split_branch_probability} holds the probability of the original branch 
in case
-it was a simple conditional jump, @minus{}1 otherwise.  To simplify
-recomputing of edge frequencies, the new sequence is required to have only
-forward jumps to the newly created labels.
+The splitter is allowed to split jump instructions into a sequence of jumps or
+create new jumps while splitting non-jump instructions.  As the control flow
+graph and branch prediction information needs to be updated, several
+restrictions apply.
+
+Splitting of a jump instruction into a sequence that has another jump
+instruction is always valid, as the compiler expects identical behavior of the
+new jump.  When the new sequence contains multiple jump instructions or new
+labels, more assistance is needed.  The splitter is required to create only
+unconditional jumps, or simple conditional jump instructions.  Additionally it
+must attach a @code{REG_BR_PROB} note to each conditional jump.  A global
+variable @code{split_branch_probability} holds the probability of the original
+branch in case it was a simple conditional jump, @minus{}1 otherwise.  To
+simplify recomputing of edge frequencies, the new sequence is required to have
+only forward jumps to the newly created labels.
 
 @findex define_insn_and_split
 For the common case where the pattern of a define_split exactly matches the
-- 
2.30.2



[committed] testsuite: Tweak check_fork_available for CRIS

2023-03-10 Thread Hans-Peter Nilsson via Gcc-patches
This takes care of the failing gcc.dg/torture/ftrapv-1.c and
-ftrapv-2.c for cris-elf.

For simplicity, assume simulators are the GNU simulator (in the gdb
repo).  But cris-elf is newlib, so a newlib target forking?  Yes: the
I/O, etc. interface to the simulator uses the Linux/CRIS ABI.

* lib/target-supports.exp (check_fork_available): Don't signal
true for CRIS running on a simulator.
---
 gcc/testsuite/lib/target-supports.exp | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index a4fbc1998c70..335e24b23b12 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -2880,6 +2880,12 @@ proc check_fork_available {} {
# tell as we're doing partial links for kernel modules.
return 0
  }
+if { [istarget cris-*-*] } {
+   # Compiling and linking works, and an executable running e.g.
+   # gcc.dg/torture/ftrapv-1.c works on now-historical hardware,
+   # but the GNU simulator emits an error for the fork syscall.
+   return [check_effective_target_hw]
+}
 return [check_function_available "fork"]
 }
 
-- 
2.30.2



[committed] testsuite: gcc.dg/pr108117.c: Require effective-target scheduling

2023-03-10 Thread Hans-Peter Nilsson via Gcc-patches
Committed as obvious.
-- >8 --
Targets that don't support scheduling get a warning:
cc1: warning: instruction scheduling not supported on this target machine

Do like other target-generic tests unconditionally passing
-fschedule-insns: require effective target scheduling.

* gcc.dg/pr108117.c: Require effective-target scheduling.
---
 gcc/testsuite/gcc.dg/pr108117.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/testsuite/gcc.dg/pr108117.c b/gcc/testsuite/gcc.dg/pr108117.c
index ae151693e2f9..4b3bebe229e5 100644
--- a/gcc/testsuite/gcc.dg/pr108117.c
+++ b/gcc/testsuite/gcc.dg/pr108117.c
@@ -1,5 +1,6 @@
 /* { dg-do run } */
 /* { dg-require-effective-target nonlocal_goto } */
+/* { dg-require-effective-target scheduling } */
 /* { dg-options "-O2 -fschedule-insns" } */
 
 #include 
-- 
2.30.2



[committed] testsuite: gcc.dg/pr106397.c: Add -w to options

2023-03-10 Thread Hans-Peter Nilsson via Gcc-patches
Committed as obvious.
-- >8 --
Targets that don't support prefetching get a warning:
cc1: warning: '-fprefetch-loop-arrays' not supported for this target

Align with other tests passing -fprefetch-loop-arrays for
all targets: add "-w" to options.

* gcc.dg/pr106397.c: Add -w to options.
---
 gcc/testsuite/gcc.dg/pr106397.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/pr106397.c b/gcc/testsuite/gcc.dg/pr106397.c
index 2bc17f8cf806..2dd04b870775 100644
--- a/gcc/testsuite/gcc.dg/pr106397.c
+++ b/gcc/testsuite/gcc.dg/pr106397.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O3 -fprefetch-loop-arrays --param l2-cache-size=0 --param 
prefetch-latency=3 -fprefetch-loop-arrays" } */
+/* { dg-options "-O3 -fprefetch-loop-arrays -w --param l2-cache-size=0 --param 
prefetch-latency=3 -fprefetch-loop-arrays" } */
 /* { dg-additional-options "-march=i686 -msse" { target { { i?86-*-* 
x86_64-*-* } && ia32 } } } */
 
 int
-- 
2.30.2



[PATCH] testsuite: Handle default_packed targets in gcc.dg/plugin

2023-03-09 Thread Hans-Peter Nilsson via Gcc-patches
It's not obvious to me whether considered best to include or
exclude these tests that depend on structure layout details.
If excluding, the obvious alternative to this patch is then
to add a top one-liner (to dg-skip-if the test for
default_packed targets or a similar excluding expression).
I'm fine either way, just suggesting the following, which
handles the cris-elf test-case failures I see for these
tests, and causes no change in results for native
x86_64-pc-linux-gnu.

Beware that some of the tests have lines with trailing
whitespace.  Where lines are changed in this patch, the
trailing whitespace is removed.

Ok to commit?

-- >8 --
It's a judgement call whether to just skip some of these
tests rather than trying to match messages depending on the
layout of structures, but better include than exclude.

* gcc.dg/plugin/infoleak-2.c,
gcc.dg/plugin/infoleak-CVE-2011-1078-1.c,
gcc.dg/plugin/infoleak-CVE-2011-1078-2.c,
gcc.dg/plugin/infoleak-CVE-2017-18549-1.c,
gcc.dg/plugin/infoleak-CVE-2017-18550-1.c,
gcc.dg/plugin/infoleak-antipatterns-1.c,
gcc.dg/plugin/infoleak-fixit-1.c: Handle default_packed targets.
---
 gcc/testsuite/gcc.dg/plugin/infoleak-2.c| 13 -
 .../gcc.dg/plugin/infoleak-CVE-2011-1078-1.c| 10 +-
 .../gcc.dg/plugin/infoleak-CVE-2011-1078-2.c| 10 +-
 .../gcc.dg/plugin/infoleak-CVE-2017-18549-1.c   | 10 +-
 .../gcc.dg/plugin/infoleak-CVE-2017-18550-1.c   |  7 ---
 .../gcc.dg/plugin/infoleak-antipatterns-1.c | 10 +-
 gcc/testsuite/gcc.dg/plugin/infoleak-fixit-1.c  | 10 ++
 7 files changed, 38 insertions(+), 32 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/plugin/infoleak-2.c 
b/gcc/testsuite/gcc.dg/plugin/infoleak-2.c
index 252f8f25918a..4ba484b3c6be 100644
--- a/gcc/testsuite/gcc.dg/plugin/infoleak-2.c
+++ b/gcc/testsuite/gcc.dg/plugin/infoleak-2.c
@@ -18,16 +18,19 @@ struct st
   int b:1; /* { dg-message "field 'b' is uninitialized \\(1 bit\\)" "field" } 
*/
/* { dg-message "padding after field 'b' is uninitialized \\(7 
bits\\)" "padding" { target *-*-* } .-1 } */
   u8 d;/* { dg-message "field 'd' is uninitialized \\(1 byte\\)" } */
-  int c:7; /* { dg-message "padding after field 'c' is uninitialized \\(9 
bits\\)" } */
-  u16 e;   /* { dg-message "padding after field 'e' is uninitialized \\(2 
bytes\\)" } */  
+  int c:7; /* { dg-message "padding after field 'c' is uninitialized \\(9 
bits\\)" "padding" { target { ! default_packed } } } */
+   /* { dg-message "padding after field 'c' is uninitialized \\(1 
bit\\)" "padding" { target default_packed } .-1 } */
+  u16 e;   /* { dg-message "padding after field 'e' is uninitialized \\(2 
bytes\\)" "padding" { target { ! default_packed } } } */
 };
 
 void test (void __user *dst, u16 v)
 {
   struct st s; /* { dg-message "region created on stack here" "where" } */
-  /* { dg-message "capacity: 12 bytes" "capacity" { target *-*-* } .-1 } */
-  /* { dg-message "suggest forcing zero-initialization by providing a 
'\\{0\\}' initializer" "fix-it" { target *-*-* } .-2 } */  
+  /* { dg-message "capacity: 12 bytes" "capacity" { target { ! default_packed 
} } .-1 } */
+  /* { dg-message "capacity: 9 bytes" "capacity" { target default_packed } .-2 
} */
+  /* { dg-message "suggest forcing zero-initialization by providing a 
'\\{0\\}' initializer" "fix-it" { target *-*-* } .-3 } */
   s.e = v;
   copy_to_user(dst, , sizeof (struct st)); /* { dg-warning "potential 
exposure of sensitive information by copying uninitialized data from stack" 
"warning" } */
-  /* { dg-message "10 bytes are uninitialized" "note how much" { target *-*-* 
} .-1 } */
+  /* { dg-message "10 bytes are uninitialized" "note how much" { target { ! 
default_packed } } .-1 } */
+  /* { dg-message "7 bytes are uninitialized" "note how much" { target 
default_packed } .-2 } */
 }
diff --git a/gcc/testsuite/gcc.dg/plugin/infoleak-CVE-2011-1078-1.c 
b/gcc/testsuite/gcc.dg/plugin/infoleak-CVE-2011-1078-1.c
index 3616fbe176b3..9269b911b22f 100644
--- a/gcc/testsuite/gcc.dg/plugin/infoleak-CVE-2011-1078-1.c
+++ b/gcc/testsuite/gcc.dg/plugin/infoleak-CVE-2011-1078-1.c
@@ -51,7 +51,7 @@ struct socket {
 
 struct sco_conninfo {
__u16 hci_handle;
-   __u8  dev_class[3]; /* { dg-message "padding after field 'dev_class' is 
uninitialized \\(1 byte\\)" } */
+   __u8  dev_class[3]; /* { dg-message "padding after field 'dev_class' is 
uninitialized \\(1 byte\\)" "padding" { target { ! default_packed } } } */
 };
 
 struct sco_conn {
@@ -83,8 +83,8 @@ static int sco_sock_getsockopt_old_broken(struct socket 
*sock, int optname, char
 {
struct sock *sk = sock->sk;
/* [...snip...] */
-   struct sco_conninfo cinfo; /* { dg-message "region created on stack 
here" "where" } */
-  /* { dg-message "capacity: 6 bytes" 
"capacity" { target *-*-* } .-1 } */
+ 

[PATCH] testsuite: Fix omp-parallel-for-get-min.c and -for-1.c for non-openmp

2023-03-07 Thread Hans-Peter Nilsson via Gcc-patches
Committed as obvious.
-- >8 --
The recently added tests missed checking for "fopenmp" (see
other tests where "-fopenmp" is passed), which makes them
fail on non-openmp systems.

* gcc.dg/analyzer/omp-parallel-for-get-min.c,
gcc.dg/analyzer/omp-parallel-for-1.c: Require effective target fopenmp.
---
 gcc/testsuite/gcc.dg/analyzer/omp-parallel-for-1.c   | 1 +
 gcc/testsuite/gcc.dg/analyzer/omp-parallel-for-get-min.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/gcc/testsuite/gcc.dg/analyzer/omp-parallel-for-1.c 
b/gcc/testsuite/gcc.dg/analyzer/omp-parallel-for-1.c
index dae940dac200..cadacc842750 100644
--- a/gcc/testsuite/gcc.dg/analyzer/omp-parallel-for-1.c
+++ b/gcc/testsuite/gcc.dg/analyzer/omp-parallel-for-1.c
@@ -1,3 +1,4 @@
+/* { dg-require-effective-target fopenmp } */
 /* { dg-additional-options "-fopenmp -Wall" } */
 
 typedef struct _Image
diff --git a/gcc/testsuite/gcc.dg/analyzer/omp-parallel-for-get-min.c 
b/gcc/testsuite/gcc.dg/analyzer/omp-parallel-for-get-min.c
index a7e64e1a3a85..ba9f634cd716 100644
--- a/gcc/testsuite/gcc.dg/analyzer/omp-parallel-for-get-min.c
+++ b/gcc/testsuite/gcc.dg/analyzer/omp-parallel-for-get-min.c
@@ -1,5 +1,6 @@
 /* Reduced from ImageMagick-7.1.0-57's MagickCore/attribute.c: 
GetEdgeBackgroundColor */
 
+/* { dg-require-effective-target fopenmp } */
 /* { dg-additional-options "-fopenmp -Wall" } */
 
 extern double get_census (void);
-- 
2.30.2



Re: Ping: [PATCH 1/2] testsuite: Provide means to regexp in multiline patterns

2023-03-06 Thread Hans-Peter Nilsson via Gcc-patches
> From: Mike Stump 
> Date: Mon, 6 Mar 2023 02:05:35 -0800

> Ok

Thanks!  The server-side hook didn't like my ChangeLog
entry:

* lib/multiline.exp (_build_multiline_regex): Map
"{re:" to "(", ":re}" to ")" and ":re?}" to ")?".

It seems I forgot to validate that patch by
contrib/gcc-changelog/git_check_commit.py, which complains:

Checking c0debd6f586ef76f1ceabfed11d7eaf8f6d1b110: FAILED
ERR: bad wrapping of parenthesis: " "{re:" to "(", ":re}" to ")" and 
":re?}" to ")?"."

I gave in and took the easy way out; not fixing the bug in
that script, but instead "wrapped the parenthesis" to:

* lib/multiline.exp (_build_multiline_regex): Map
"{re:" to "(", similarly ")?" from ":re?}" and the
same without question mark.

I hope to make amends by fixing git_check_commit.py, if
given guidance.

brgds, H-P


[PATCH] testsuite: Support scanning tree-dumps

2023-03-06 Thread Hans-Peter Nilsson via Gcc-patches
This is sort-of a spin-off from effective_target_tail_call: I thought
that'd best be implemented by scanning a tree-dump, specifically
-fdump-tree-optimized, but the "tail call" found there is emitted for
*all* targets.  Debugged and ready to apply, putting it out for
consideration as someone will need it (or should use it) sooner rather
than later...  Best committed rather than sitting in mail-archives so:
Ok to apply?
-- >8 --
No planned usage.

* lib/target-supports.exp (check_compile): Support scanning tree-dumps.
---
 gcc/testsuite/lib/target-supports.exp | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index 4236c920baeb..0ca7a9680bb4 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -87,6 +87,7 @@ proc check_compile {basename type contents args} {
assembly { set output ${basename}[pid].s }
object { set output ${basename}[pid].o }
executable { set output ${basename}[pid].exe }
+   "tree-*" -
"rtl-*" {
set output ${basename}[pid].s
lappend options "additional_flags=-fdump-$type"
@@ -108,6 +109,9 @@ proc check_compile {basename type contents args} {
 if [regexp "rtl-(.*)" $type dummy rtl_type] {
set scan_output "[glob $src.\[0-9\]\[0-9\]\[0-9\]r.$rtl_type]"
file delete $output
+} elseif [regexp "tree-(.*)" $type dummy tree_type] {
+   set scan_output "[glob $src.\[0-9\]\[0-9\]\[0-9\]t.$tree_type]"
+   file delete $output
 }
 
 # Restore additional_sources.
-- 
2.30.2



[PATCH 3/3] testsuite: Gate gcc.dg/plugin/must-tail-call-1.c and -2.c on tail_call

2023-03-06 Thread Hans-Peter Nilsson via Gcc-patches
Borderline obvious when tail_call is available, so I'll then apply.
-- >8 --
While gcc.dg/plugin/must-tail-call-2.c passes for all targets even
without this, the error message is, for a target like cris-elf that
doesn't implement sibling calls: "error: cannot tail-call: machine
description does not have a sibcall_epilogue instruction pattern"
rather than "error: cannot tail-call: callee returns a structure".
Also, it'd be confusing to exclude must-tail-call-1.c but not
must-tail-call-2.c

* gcc.dg/plugin/must-tail-call-1.c, gcc.dg/plugin/must-tail-call-2.c:
Gate on effective target tail_call.
---
 gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c | 1 +
 gcc/testsuite/gcc.dg/plugin/must-tail-call-2.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c 
b/gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c
index 1495a48232a6..3a6d4cceaba7 100644
--- a/gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c
+++ b/gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c
@@ -1,3 +1,4 @@
+/* { dg-do compile { target tail_call } } */
 /* { dg-options "-fdelayed-branch" { target sparc*-*-* } } */
 
 extern void abort (void);
diff --git a/gcc/testsuite/gcc.dg/plugin/must-tail-call-2.c 
b/gcc/testsuite/gcc.dg/plugin/must-tail-call-2.c
index c6dfecd32458..d51d15cc0879 100644
--- a/gcc/testsuite/gcc.dg/plugin/must-tail-call-2.c
+++ b/gcc/testsuite/gcc.dg/plugin/must-tail-call-2.c
@@ -1,3 +1,4 @@
+/* { dg-do compile { target tail_call } } */
 /* Allow nested functions.  */
 /* { dg-options "-Wno-pedantic" } */
 
-- 
2.30.2



[PATCH 2/3] doc: Document testsuite check_effective_target_tail_call

2023-03-06 Thread Hans-Peter Nilsson via Gcc-patches
Will commit as obvious, when the 1/3 tail_call is applied.
-- >8 --
Spot-checked the PDF output for sanity.

* doc/sourcebuild.texi: Document check_effective_target_tail_call.
---
 gcc/doc/sourcebuild.texi | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index c348a1e47cc3..80bef7f0a0e2 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -2844,6 +2844,9 @@ Target supports named sections.
 Target uses natural alignment (aligned to type size) for types of
 32 bits or less.
 
+@item tail_call
+Target supports tail-call optimizations.
+
 @item target_natural_alignment_64
 Target uses natural alignment (aligned to type size) for types of
 64 bits or less.
-- 
2.30.2



[PATCH 1/3] testsuite: Add tail_call effective target

2023-03-06 Thread Hans-Peter Nilsson via Gcc-patches
Ok to commit?
-- >8 --
The RTL "expand" dump is the first RTL dump, and it also appears to be
the earliest trace of the target having implemented sibcalls.
Including the "," in the pattern searched for, to try and avoid
possible false matches, but there doesn't appear to be any identifiers
or target names nearby so this is just belts and suspenders.  Using
"tail_call" as a shorter and more commonly used term than a derivative
of "sibling calls", and expecting only gcc folks to have heard of
"sibcalls".

* lib/target-supports.exp (check_effective_target_tail_call): New.
---
 gcc/testsuite/lib/target-supports.exp | 9 +
 1 file changed, 9 insertions(+)

diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index 0ca7a9680bb4..958537b3b7c0 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -11684,6 +11684,15 @@ proc check_effective_target_frame_pointer_for_non_leaf 
{ } {
   return 0
 }
 
+# Return 1 if the target can perform tail-call optimizations of the
+# most trivial type.
+proc check_effective_target_tail_call { } {
+return [check_no_messages_and_pattern tail_call ",SIBCALL" rtl-expand {
+   __attribute__((__noipa__)) void foo (void) { }
+   __attribute__((__noipa__)) void bar (void) { foo(); }
+} {-O2 -fdump-rtl-expand-all}] ;# The "SIBCALL" note requires a detailed 
dump.
+}
+
 # Return 1 if the target's calling sequence or its ABI
 # create implicit stack probes at or prior to function entry.
 proc check_effective_target_caller_implicit_probes { } {
-- 
2.30.2



Ping: [PATCH 1/2] testsuite: Provide means to regexp in multiline patterns

2023-03-03 Thread Hans-Peter Nilsson via Gcc-patches
Ping...

> From: Hans-Peter Nilsson 
> Date: Fri, 24 Feb 2023 20:16:03 +0100
> 
> Ok to commit?
> -- >8 --
> Those multi-line-patterns are literal.  Sometimes a regexp
> needs to be matched.  This is a start: just three elements
> are supported: "(" ")" and the compound ")?" (and on second
> thought, it can be argued that "(...)" alone is not useful).
> Note that Tcl "string map" is documented to have the desired
> effect: a once-over but no re-recognitions of previously
> replaced mapped elements.  Also, drop a doubled "containing".
> 
> testsuite:
>   * lib/multiline.exp (_build_multiline_regex): Map
>   "{re:" to "(", ":re}" to ")" and ":re?}" to ")?".
> ---
>  gcc/testsuite/lib/multiline.exp | 7 +--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/testsuite/lib/multiline.exp b/gcc/testsuite/lib/multiline.exp
> index 5eccf2bbebc1..f746bc3a618e 100644
> --- a/gcc/testsuite/lib/multiline.exp
> +++ b/gcc/testsuite/lib/multiline.exp
> @@ -297,7 +297,7 @@ proc _get_lines { filename first_line last_line } {
>  
>  # Convert $multiline from a list of strings to a multiline regex
>  # We need to support matching arbitrary followup text on each line,
> -# to deal with comments containing containing DejaGnu directives.
> +# to deal with comments containing DejaGnu directives.
>  
>  proc _build_multiline_regex { multiline index } {
>  verbose "_build_multiline_regex: $multiline $index" 4
> @@ -307,7 +307,10 @@ proc _build_multiline_regex { multiline index } {
>   verbose "  line: $line" 4
>  
>   # We need to escape "^" and other regexp metacharacters.
> - set line [string map {"^" "\\^"
> + set line [string map {"\{re:" "("
> +   ":re?\}" ")?"
> +   ":re\}" ")"
> +   "^" "\\^"
> "(" "\\("
> ")" "\\)"
> "[" "\\["
> -- 
> 2.30.2
> 


[COMMITTED] testsuite: Skip gcc.dg/ipa/pr77653.c for CRIS

2023-03-03 Thread Hans-Peter Nilsson via Gcc-patches
CRIS defines DATA_ALIGNMENT such that alignment can be
applied differently to different data of the same type, when
"references to it must bind to the current definition"
(varasm.cc:align_variable).  Here, it means that more
alignment is then applied to g, but not f, so the test-case
fails because another message is emitted than the expected:
a same-alignment test dominates the not-discardable test,
and we get "Not unifying; original and alias have
incompatible alignments" rather than "Not unifying; alias
cannot be created; target is discardable".  Because this
DATA_ALIGNMENT behavior for CRIS depends on target options,
and this test is already artificial by the use of -fcommon,
better skip it.

* gcc.dg/ipa/pr77653.c: Skip for cris-*-*.
---
 gcc/testsuite/gcc.dg/ipa/pr77653.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/testsuite/gcc.dg/ipa/pr77653.c 
b/gcc/testsuite/gcc.dg/ipa/pr77653.c
index 2fddb7eab548..16df3fff6a41 100644
--- a/gcc/testsuite/gcc.dg/ipa/pr77653.c
+++ b/gcc/testsuite/gcc.dg/ipa/pr77653.c
@@ -1,5 +1,6 @@
 /* { dg-require-alias "" } */
 /* { dg-options "-O2 -fcommon -fdump-ipa-icf-details"  } */
+/* { dg-skip-if "Can align g more than f" { cris-*-* } } */
 
 int a, b, c, d, e, h, i, j, k, l;
 const int f;
-- 
2.30.2



[COMMITTED] testsuite: Skip gcc.dg/ifcvt-4.c for CRIS

2023-03-03 Thread Hans-Peter Nilsson via Gcc-patches
CRIS has no conditional execution and no conditional moves.

* gcc.dg/ifcvt-4.c: Add cris-*-* to skip list.
---
 gcc/testsuite/gcc.dg/ifcvt-4.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/ifcvt-4.c b/gcc/testsuite/gcc.dg/ifcvt-4.c
index 46245f0d0698..8b2583d00e92 100644
--- a/gcc/testsuite/gcc.dg/ifcvt-4.c
+++ b/gcc/testsuite/gcc.dg/ifcvt-4.c
@@ -2,7 +2,7 @@
 /* { dg-additional-options "-misel" { target { powerpc*-*-* } } } */
 /* { dg-additional-options "-march=z196" { target { s390x-*-* } } } */
 /* { dg-additional-options "-mtune-ctrl=^one_if_conv_insn" { target { i?86-*-* 
x86_64-*-* } } } */
-/* { dg-skip-if "Multiple set if-conversion not guaranteed on all subtargets" 
{ "arm*-*-* avr-*-* hppa*64*-*-* visium-*-*" riscv*-*-* msp430-*-* nios2-*-* 
pru-*-* } }  */
+/* { dg-skip-if "Multiple set if-conversion not guaranteed on all subtargets" 
{ "arm*-*-* avr-*-* cris-*-* hppa*64*-*-* visium-*-*" riscv*-*-* msp430-*-* 
nios2-*-* pru-*-* } }  */
 /* { dg-skip-if "" { "s390x-*-*" } { "-m31" } }  */
 
 typedef int word __attribute__((mode(word)));
-- 
2.30.2



[COMMITTED] testsuite: Fix various scan-assembler identifiers not handling _-prefix

2023-03-03 Thread Hans-Peter Nilsson via Gcc-patches
Committed as obvious.
-- >8 --
* g++.dg/cpp0x/pr84497.C: Handle USER_LABEL_PREFIX == "_" on
scan-assembler identifiers.
* gcc.dg/debug/btf/btf-enum64-1.c, gcc.dg/ipa/symver1.c: Ditto.
---
 gcc/testsuite/g++.dg/cpp0x/pr84497.C  | 6 +++---
 gcc/testsuite/gcc.dg/debug/btf/btf-enum64-1.c | 6 +++---
 gcc/testsuite/gcc.dg/ipa/symver1.c| 4 ++--
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/gcc/testsuite/g++.dg/cpp0x/pr84497.C 
b/gcc/testsuite/g++.dg/cpp0x/pr84497.C
index 7eb1aee474e3..be5a9d359ec5 100644
--- a/gcc/testsuite/g++.dg/cpp0x/pr84497.C
+++ b/gcc/testsuite/g++.dg/cpp0x/pr84497.C
@@ -34,6 +34,6 @@ extern thread_local Container container_obj;
 int main() { return !(_obj && _obj && _obj);}
 #endif
 
-// { dg-final { scan-assembler ".weak\[ \t\]*_ZTH8base_obj" } }
-// { dg-final { scan-assembler ".weak\[ \t\]*_ZTH11derived_obj" } }
-// { dg-final { scan-assembler ".weak\[ \t\]*_ZTH13container_obj" } }
+// { dg-final { scan-assembler ".weak\[ \t\]*_?_ZTH8base_obj" } }
+// { dg-final { scan-assembler ".weak\[ \t\]*_?_ZTH11derived_obj" } }
+// { dg-final { scan-assembler ".weak\[ \t\]*_?_ZTH13container_obj" } }
diff --git a/gcc/testsuite/gcc.dg/debug/btf/btf-enum64-1.c 
b/gcc/testsuite/gcc.dg/debug/btf/btf-enum64-1.c
index e443d4c8c00f..5d1487c1183a 100644
--- a/gcc/testsuite/gcc.dg/debug/btf/btf-enum64-1.c
+++ b/gcc/testsuite/gcc.dg/debug/btf/btf-enum64-1.c
@@ -3,9 +3,9 @@
 /* { dg-do compile } */
 /* { dg-options "-O0 -gbtf -dA" } */
 
-/* { dg-final { scan-assembler-times "\[\t \].size\[\t \]myenum1,\[\t \]8" 1 } 
} */
-/* { dg-final { scan-assembler-times "\[\t \].size\[\t \]myenum2,\[\t \]8" 1 } 
} */
-/* { dg-final { scan-assembler-times "\[\t \].size\[\t \]myenum3,\[\t \]8" 1 } 
} */
+/* { dg-final { scan-assembler-times "\[\t \].size\[\t \]_?myenum1,\[\t \]8" 1 
} } */
+/* { dg-final { scan-assembler-times "\[\t \].size\[\t \]_?myenum2,\[\t \]8" 1 
} } */
+/* { dg-final { scan-assembler-times "\[\t \].size\[\t \]_?myenum3,\[\t \]8" 1 
} } */
 /* { dg-final { scan-assembler-times "\[\t \]0x1303\[\t 
\]+\[^\n\]*btt_info" 2 } } */
 /* { dg-final { scan-assembler-times "\[\t \]0x9303\[\t 
\]+\[^\n\]*btt_info" 1 } } */
 /* { dg-final { scan-assembler-times "\[\t \]0xffaa\[\t 
\]+\[^\n\]*bte_value_lo32" 2 } } */
diff --git a/gcc/testsuite/gcc.dg/ipa/symver1.c 
b/gcc/testsuite/gcc.dg/ipa/symver1.c
index 2cd025836697..d120ed5d76b1 100644
--- a/gcc/testsuite/gcc.dg/ipa/symver1.c
+++ b/gcc/testsuite/gcc.dg/ipa/symver1.c
@@ -8,5 +8,5 @@ int foo()
   return 2;
 }
 
-/* { dg-final { scan-assembler ".symver.*foo, foo@VER_2" } } */
-/* { dg-final { scan-assembler ".symver.*foo, foo@VER_3" } } */
+/* { dg-final { scan-assembler ".symver.*foo, _?foo@VER_2" } } */
+/* { dg-final { scan-assembler ".symver.*foo, _?foo@VER_3" } } */
-- 
2.30.2



[COMMITTED] testsuite: Fix g++.dg/ext/attr-copy-2.C for default_packed targets

2023-03-01 Thread Hans-Peter Nilsson via Gcc-patches
Committed as obvious.  FWIW, I'm on the side that emitting
the warning when the reason is just that it's the default
layout, is bad.  A discussion took place years ago when the
warning was added.

-- >8 --
For targets where the byte-wise structure element layout is
the same as for attribute-packed, you get a warning when
that's specified.  Thus, expect a warning for such targets.
Note the exclusion of bitfields.

* g++.dg/ext/attr-copy-2.C: Fix for default_packed targets.
---
 gcc/testsuite/g++.dg/ext/attr-copy-2.C | 60 +-
 1 file changed, 30 insertions(+), 30 deletions(-)

diff --git a/gcc/testsuite/g++.dg/ext/attr-copy-2.C 
b/gcc/testsuite/g++.dg/ext/attr-copy-2.C
index 7776959d9f6b..ffd6f22ef56d 100644
--- a/gcc/testsuite/g++.dg/ext/attr-copy-2.C
+++ b/gcc/testsuite/g++.dg/ext/attr-copy-2.C
@@ -27,51 +27,51 @@ extern B 
 
 typedef struct C
 {
-  ATTR (copy ((struct A *)0)) short m_pa_0;
-  ATTR (copy ((struct A *)(1, 0))) int m_pa_1_0;
-  ATTR (copy ((struct A *)(0, 1))) long m_pa_0_1;
+  ATTR (copy ((struct A *)0)) short m_pa_0; // { dg-warning "attribute 
ignored" "" { target default_packed } }
+  ATTR (copy ((struct A *)(1, 0))) int m_pa_1_0; // { dg-warning "attribute 
ignored" "" { target default_packed } }
+  ATTR (copy ((struct A *)(0, 1))) long m_pa_0_1; // { dg-warning "attribute 
ignored" "" { target default_packed } }
 
-  ATTR (copy (*(struct A *)0)) short m_xpa_0;
-  ATTR (copy (*(struct A *)(1, 0))) int m_xpa_1_0;
-  ATTR (copy (*(struct A *)(0, 1))) long m_xpa_0_1;
+  ATTR (copy (*(struct A *)0)) short m_xpa_0; // { dg-warning "attribute 
ignored" "" { target default_packed } }
+  ATTR (copy (*(struct A *)(1, 0))) int m_xpa_1_0; // { dg-warning "attribute 
ignored" "" { target default_packed } }
+  ATTR (copy (*(struct A *)(0, 1))) long m_xpa_0_1; // { dg-warning "attribute 
ignored" "" { target default_packed } }
 
-  ATTR (copy (((struct A *)0)[0])) short m_arpa_0;
-  ATTR (copy (((struct A *)(1, 0))[0])) int m_arpa_1_0;
-  ATTR (copy (((struct A *)(0, 1))[0])) long m_arpa_0_1;
+  ATTR (copy (((struct A *)0)[0])) short m_arpa_0; // { dg-warning "attribute 
ignored" "" { target default_packed } }
+  ATTR (copy (((struct A *)(1, 0))[0])) int m_arpa_1_0; // { dg-warning 
"attribute ignored" "" { target default_packed } }
+  ATTR (copy (((struct A *)(0, 1))[0])) long m_arpa_0_1; // { dg-warning 
"attribute ignored" "" { target default_packed } }
 
-  ATTR (copy (a)) short m_a;
-  ATTR (copy (b.a)) int m_b_a;
-  ATTR (copy (b.pa)) long m_b_pa;
-  ATTR (copy (b.ra)) long m_b_ra;
+  ATTR (copy (a)) short m_a; // { dg-warning "attribute ignored" "" { target 
default_packed } }
+  ATTR (copy (b.a)) int m_b_a; // { dg-warning "attribute ignored" "" { target 
default_packed } }
+  ATTR (copy (b.pa)) long m_b_pa; // { dg-warning "attribute ignored" "" { 
target default_packed } }
+  ATTR (copy (b.ra)) long m_b_ra; // { dg-warning "attribute ignored" "" { 
target default_packed } }
 
-  ATTR (copy ()) short m_ara;
-  ATTR (copy ()) int m_arb_a;
-  ATTR (copy (*b.pa)) long m_xb_pa;
-  ATTR (copy (b.pa[0])) long m_arb_pa;
+  ATTR (copy ()) short m_ara; // { dg-warning "attribute ignored" "" { 
target default_packed } }
+  ATTR (copy ()) int m_arb_a; // { dg-warning "attribute ignored" "" { 
target default_packed } }
+  ATTR (copy (*b.pa)) long m_xb_pa; // { dg-warning "attribute ignored" "" { 
target default_packed } }
+  ATTR (copy (b.pa[0])) long m_arb_pa; // { dg-warning "attribute ignored" "" 
{ target default_packed } }
 
-  ATTR (copy (*pa)) short m_xpa;
-  ATTR (copy (pa[0])) short m_arpa;
+  ATTR (copy (*pa)) short m_xpa; // { dg-warning "attribute ignored" "" { 
target default_packed } }
+  ATTR (copy (pa[0])) short m_arpa; // { dg-warning "attribute ignored" "" { 
target default_packed } }
 
-  ATTR (copy (ra)) short m_ra;
+  ATTR (copy (ra)) short m_ra; // { dg-warning "attribute ignored" "" { target 
default_packed } }
 
-  ATTR (copy (ab[0].a)) int m_arab_a;
-  ATTR (copy (ab[1].pa)) long m_arab_pa;
-  ATTR (copy (*ab[2].pa)) int m_xarab_pa;
+  ATTR (copy (ab[0].a)) int m_arab_a; // { dg-warning "attribute ignored" "" { 
target default_packed } }
+  ATTR (copy (ab[1].pa)) long m_arab_pa; // { dg-warning "attribute ignored" 
"" { target default_packed } }
+  ATTR (copy (*ab[2].pa)) int m_xarab_pa; // { dg-warning "attribute ignored" 
"" { target default_packed } }
   ATTR (copy (ab[3].pa->bf)) unsigned int m_arab_pa_bf: 1;
   ATTR (copy (ab[4].ra.bf)) unsigned int m_arab_ra_bf: 1;
 
-  ATTR (copy (pb->a)) int m_pb_a;
-  ATTR (copy (pb->pa)) long m_pb_pa;
-  ATTR (copy (*pb->pa)) int m_xpb_pa;
+  ATTR (copy (pb->a)) int m_pb_a; // { dg-warning "attribute ignored" "" { 
target default_packed } }
+  ATTR (copy (pb->pa)) long m_pb_pa; // { dg-warning "attribute ignored" "" { 
target default_packed } }
+  ATTR (copy (*pb->pa)) int m_xpb_pa; // { dg-warning "attribute ignored" "" { 
target default_packed } }
   ATTR (copy (pb->pa->bf)) unsigned int m_pb_pa_bf: 1;

[COMMITTED] testsuite: Fix gcc.dg/attr-copy-6.c for user-label-prefixed targets

2023-03-01 Thread Hans-Peter Nilsson via Gcc-patches
Trivia: I copied that ASMNAME construct from the
18-year-minus-a-month old commit of r0-66993-gc5221531453e02,
where it fixed a similar testsuite error.  Committed as obvious.

-- >8 --
This fixes:
 Running /x/gcc/testsuite/gcc.dg/dg.exp ...
 ...
 FAIL: gcc.dg/attr-copy-6.c (test for excess errors)
for cris-elf, where gcc.log has:
Excess errors:
/x/gcc/testsuite/gcc.dg/attr-copy-6.c:91:3: error: 'fnoreturn_alias' aliased to 
undefined symbol 'fnoreturn_name'

Asm-declared identifiers need to prepend __USER_LABEL_PREFIX__
to the asm-name, something which is often overlooked because
it's empty for most targets.  N.B: attribute-alias does not need
the same treatment.  The identical construct added here, is in
several tests.

* gcc.dg/attr-copy-6.c: Prefix asm-declared name with
__USER_LABEL_PREFIX__.
---
 gcc/testsuite/gcc.dg/attr-copy-6.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/attr-copy-6.c 
b/gcc/testsuite/gcc.dg/attr-copy-6.c
index cf578bddb1b0..30a1317bf928 100644
--- a/gcc/testsuite/gcc.dg/attr-copy-6.c
+++ b/gcc/testsuite/gcc.dg/attr-copy-6.c
@@ -9,6 +9,10 @@
 #define ATTR(...)   __attribute__ ((__VA_ARGS__))
 #define ASRT(expr)   _Static_assert (expr, #expr)
 
+#define ASMNAME(cname)  ASMNAME2 (__USER_LABEL_PREFIX__, cname)
+#define ASMNAME2(prefix, cname) STRING (prefix) cname
+#define STRING(x)#x
+
 /* Variable that is local to this translation unit but that can
be modified from other units by calling reset_unit_local().  */
 static int unit_local;
@@ -79,7 +83,7 @@ extern _Noreturn void fnoreturn (void);
 
 extern __typeof (fnoreturn)
   ATTR (visibility ("hidden"))
-  fnoreturn __asm__ ("fnoreturn_name");
+  fnoreturn __asm__ (ASMNAME ("fnoreturn_name"));
 
 void fnoreturn (void)
 {
-- 
2.30.2



Re: [PATCH 1/2] testsuite: Fix analyzer errors for newlib-errno

2023-03-01 Thread Hans-Peter Nilsson via Gcc-patches
> Date: Thu, 2 Mar 2023 01:37:12 +0100
> From: Bernhard Reutner-Fischer 

> On Thu, 2 Mar 2023 00:54:33 +0100
> Hans-Peter Nilsson  wrote:
> 
> > > Date: Thu, 2 Mar 2023 00:23:36 +0100
> > > From: Bernhard Reutner-Fischer   
> > 
> > > On Wed, 1 Mar 2023 17:02:31 +0100
> > > Hans-Peter Nilsson via Gcc-patches  wrote:
> > >   
> > > > > From: Hans-Peter Nilsson 
> > > > > Date: Wed, 1 Mar 2023 16:36:46 +0100
> > > >   
> > > > > ... this is what I intend to commit later
> > > > > today, just keeping the added comment as brief as
> > > > > reasonable:
> > > > 
> > > > Except I see the hook for errno magic took care of
> > > > gcc.dg/analyzer/flex-without-call-summaries.c so I'll add
> > > > that to the list of handled "FAIL"s in the commit log.  Yay.  
> > > 
> > > But in the end it means we'll have to keep _[_]+errno{,_location} 'til
> > > we bump requirements or 10, 20 years or the end of the universe,
> > > doesn't it.
> > > Way fancy.  
> > 
> > Not sure I see your point?  The (other) identifiers are already there.
> 
> I'm certainly not opposed to this partiular identifier, no.
> 
> > 
> > (And you do realize that this is in the analyzer part of gcc, right?)
> 
> And yes, i'm well aware this is "just" the analyzer -- which is unfair
> to state like that and does not mean to imply any inferiority --
> particular in this spot.

(Your statement and values; it can be read as you putting
that as mine, which I hope was not intended.)  My point is
that the presence of those identifiers does not affects an
ABI or code-generating parts of gcc.  All the identifiers
are present for all targets - for all invocation of the
analyzer.  If that passes a nuisance threshold, it seems it
can be changed easily, say by moving it to a target-hook by
someone who cares deeply enough.

> Just let's ditch any specialcased identifier which was superseded
> reliably ASAP?

I'm certainly not opposed to *that*
brgds, H-P


Re: [PATCH 1/2] testsuite: Fix analyzer errors for newlib-errno

2023-03-01 Thread Hans-Peter Nilsson via Gcc-patches
> Date: Thu, 2 Mar 2023 00:23:36 +0100
> From: Bernhard Reutner-Fischer 

> On Wed, 1 Mar 2023 17:02:31 +0100
> Hans-Peter Nilsson via Gcc-patches  wrote:
> 
> > > From: Hans-Peter Nilsson 
> > > Date: Wed, 1 Mar 2023 16:36:46 +0100  
> > 
> > > ... this is what I intend to commit later
> > > today, just keeping the added comment as brief as
> > > reasonable:  
> > 
> > Except I see the hook for errno magic took care of
> > gcc.dg/analyzer/flex-without-call-summaries.c so I'll add
> > that to the list of handled "FAIL"s in the commit log.  Yay.
> 
> But in the end it means we'll have to keep _[_]+errno{,_location} 'til
> we bump requirements or 10, 20 years or the end of the universe,
> doesn't it.
> Way fancy.

Not sure I see your point?  The (other) identifiers are already there.

(And you do realize that this is in the analyzer part of gcc, right?)

brgds, H-P


Re: [PATCH 1/2] testsuite: Fix analyzer errors for newlib-errno

2023-03-01 Thread Hans-Peter Nilsson via Gcc-patches
> From: Hans-Peter Nilsson 
> Date: Wed, 1 Mar 2023 16:36:46 +0100

> ... this is what I intend to commit later
> today, just keeping the added comment as brief as
> reasonable:

Except I see the hook for errno magic took care of
gcc.dg/analyzer/flex-without-call-summaries.c so I'll add
that to the list of handled "FAIL"s in the commit log.  Yay.

brgds, H-P


Re: [PATCH 1/2] testsuite: Fix analyzer errors for newlib-errno

2023-03-01 Thread Hans-Peter Nilsson via Gcc-patches
> From: David Malcolm 
> Cc: gcc-patches@gcc.gnu.org
> Date: Wed, 01 Mar 2023 08:32:13 -0500

> Also, the analyzer uses region_model::set_errno in various
> known_function implementations, e.g. for functions that can succeed or
> fail, to set errno on the "failure" execution path to a new symbolic
> value that tests as > 0.

Ha, there's indeed errno-analyzing magic. :)

(That's the level you get when your first impression of
source code is by tracing through failing test-cases...)

> > So, how's this instead, pending full testing (only
> > analyzer.exp spot-tested)?
> 
> Looks good, though how about adding a mention of newlib to the comment
> immediately above (the one that talks about Solaris and OS X)?

I deliberately didn't, as the format of that comment seemed
to ask for redundantly repeating the definition for each
system (telling exactly how the #define looks etc.), and I
thought the heading "Other implementations of C standard
library" should have been sufficient, assuming access to the
git log when there's interest in more than the first lines.

But I hear you so this is what I intend to commit later
today, just keeping the added comment as brief as
reasonable:

"analyzer: Support errno for newlib

Without this definition, the errno definition for newlib
isn't recognized as such, and these tests fail for newlib
targets:

FAIL: gcc.dg/analyzer/call-summaries-errno.c  (test for warnings, line 16)
FAIL: gcc.dg/analyzer/call-summaries-errno.c (test for excess errors)
FAIL: gcc.dg/analyzer/errno-1.c  (test for warnings, line 20)
FAIL: gcc.dg/analyzer/errno-1.c (test for excess errors)
FAIL: gcc.dg/analyzer/isatty-1.c  (test for warnings, line 31)
FAIL: gcc.dg/analyzer/isatty-1.c  (test for warnings, line 35)
FAIL: gcc.dg/analyzer/isatty-1.c  (test for warnings, line 46)
FAIL: gcc.dg/analyzer/isatty-1.c  (test for warnings, line 56)
FAIL: gcc.dg/analyzer/isatty-1.c (test for excess errors)

gcc/analyzer:
* kf.cc (register_known_functions): Add __errno function for newlib.
---
 gcc/analyzer/kf.cc | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/analyzer/kf.cc b/gcc/analyzer/kf.cc
index 3a91b6bd6ebb..ed5f70398e1d 100644
--- a/gcc/analyzer/kf.cc
+++ b/gcc/analyzer/kf.cc
@@ -1033,9 +1033,11 @@ register_known_functions (known_function_manager )
and OS X like this:
 extern int * __error(void);
 #define errno (*__error())
+   and similarly __errno for newlib.
Add these as synonyms for "__errno_location".  */
 kfm.add ("___errno", make_unique ());
 kfm.add ("__error", make_unique ());
+kfm.add ("__errno", make_unique ());
   }
 
   /* Language-specific support functions.  */
-- 
2.30.2

> Thanks for fixing this

Thank *you* for the review!

brgds, H-P


Re: [PATCH 1/2] testsuite: Fix analyzer errors for newlib-errno

2023-02-28 Thread Hans-Peter Nilsson via Gcc-patches
> From: David Malcolm 
> Date: Tue, 28 Feb 2023 21:25:34 -0500

> On Wed, 2023-03-01 at 01:59 +0100, Hans-Peter Nilsson wrote:
> > > From: David Malcolm 
> > > Date: Tue, 28 Feb 2023 14:12:47 -0500
> > 
> > > On Tue, 2023-02-28 at 19:47 +0100, Hans-Peter Nilsson wrote:
> > > > Ok to commit?
> > > > -- >8 --
> > > > Investigating analyzer tesstsuite errors for cris-elf.  The same
> > > > are
> > > > seen for pru-elf according to posts to gcc-testresults@.
> > > > 
> > > > For glibc, errno is #defined as:
> > > >  extern int *__errno_location (void) __THROW __attribute_const__;
> > > >  # define errno (*__errno_location ())
> > > > while for newlib in its default configuration, it's:
> > > >  #define errno (*__errno())
> > > >  extern int *__errno (void);
> > > 
> > > We're already handling ___errno (three underscores) for Solaris as
> > > of
> > > 7c9717fcb5cf94ce1e7ef5c903058adf9980ff28; does it fix the issue if
> > > you 
> > > add __errno (two underscores) to analyzer/kf.cc's 
> > > register_known_functions in an analogous way to that commit?  (i.e.
> > > wiring it up to kf_errno_location, "teaching" the analyzer that
> > > that
> > > function returns a pointer to the "errno region")
> > 
> > But...there's already "support" for two underscores since
> > the commit you quote, so I guess not.  
> 
> Is there?  That commit adds support for:
>   __error
> but not for:
>   __errno
> unless there's something I'm missing.

No, it's me.  Apparently I can't spot the difference between
error and errno.  Sorry.

It's nice to know that the const-attribute (to signal that
the pointer points to a constant location) works
automatically, i.e. the Solaris and glibc definitions should
not (no longer?) be needed - unless there's other
errno-analyzing magic elsewhere too.  Either way, since
there's specific support for this errno kind of thing, that
certainly works for me.  The patch gets smaller too. :)

So, how's this instead, pending full testing (only
analyzer.exp spot-tested)?

-- >8 --
analyzer: Support errno for newlib

Without this definition, the errno definition for newlib
isn't recognized as such, and these tests fail for newlib
targets:

FAIL: gcc.dg/analyzer/call-summaries-errno.c  (test for warnings, line 16)
FAIL: gcc.dg/analyzer/call-summaries-errno.c (test for excess errors)
FAIL: gcc.dg/analyzer/errno-1.c  (test for warnings, line 20)
FAIL: gcc.dg/analyzer/errno-1.c (test for excess errors)
FAIL: gcc.dg/analyzer/isatty-1.c  (test for warnings, line 31)
FAIL: gcc.dg/analyzer/isatty-1.c  (test for warnings, line 35)
FAIL: gcc.dg/analyzer/isatty-1.c  (test for warnings, line 46)
FAIL: gcc.dg/analyzer/isatty-1.c  (test for warnings, line 56)
FAIL: gcc.dg/analyzer/isatty-1.c (test for excess errors)

gcc/analyzer:
* kf.cc (register_known_functions): Add __errno function for newlib.
---
 gcc/analyzer/kf.cc | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/analyzer/kf.cc b/gcc/analyzer/kf.cc
index 3a91b6bd6ebb..bfb9148f9ae7 100644
--- a/gcc/analyzer/kf.cc
+++ b/gcc/analyzer/kf.cc
@@ -1036,6 +1036,7 @@ register_known_functions (known_function_manager )
Add these as synonyms for "__errno_location".  */
 kfm.add ("___errno", make_unique ());
 kfm.add ("__error", make_unique ());
+kfm.add ("__errno", make_unique ());
   }
 
   /* Language-specific support functions.  */
-- 
2.30.2


Re: [PATCH 1/2] testsuite: Fix analyzer errors for newlib-errno

2023-02-28 Thread Hans-Peter Nilsson via Gcc-patches
> From: David Malcolm 
> Date: Tue, 28 Feb 2023 14:12:47 -0500

> On Tue, 2023-02-28 at 19:47 +0100, Hans-Peter Nilsson wrote:
> > Ok to commit?
> > -- >8 --
> > Investigating analyzer tesstsuite errors for cris-elf.  The same are
> > seen for pru-elf according to posts to gcc-testresults@.
> > 
> > For glibc, errno is #defined as:
> >  extern int *__errno_location (void) __THROW __attribute_const__;
> >  # define errno (*__errno_location ())
> > while for newlib in its default configuration, it's:
> >  #define errno (*__errno())
> >  extern int *__errno (void);
> 
> We're already handling ___errno (three underscores) for Solaris as of
> 7c9717fcb5cf94ce1e7ef5c903058adf9980ff28; does it fix the issue if you 
> add __errno (two underscores) to analyzer/kf.cc's 
> register_known_functions in an analogous way to that commit?  (i.e.
> wiring it up to kf_errno_location, "teaching" the analyzer that that
> function returns a pointer to the "errno region")

But...there's already "support" for two underscores since
the commit you quote, so I guess not.  I strongly believe
"the critical difference is that __attribute__
((__const__))" because I indeed proved it by adding it to
the newlib definition.

I doubt these tests actually *pass* for OS X, because that
one looks identical to the newlib definition as quoted in
that commit (compare to the newlib one I pasted in the quote
above).

It looks like that definition was added for good measure
along with the Solaris definition (that has the
attribute-const) but never tested.  Sorry, I don't have an
OS X to test it myself and according to a popular search
engine(...) nobody has posted gcc-testresults@ for anything
"apple" since gcc-4.7 era. :(

brgds, H-P

> 
> Dave
> 
> > 
> > The critical difference is that __attribute__ ((__const__)),
> > where glibc says that the caller will see the same value on
> > all calls (from the same context; read: same thread).  I'm
> > not sure the absence of __attribute__ ((__const__)) for the
> > newlib definition is deliberate, but I guess it can.
> > Either way, without the "const" attribute, it can't be known
> > that the same location will be returned the next time, so
> > analyzer-tests that depend the value being known it should
> > see UNKNOWN rather than TRUE, that's why the deliberate
> > check for UNKNOWN rather than xfailing the test.
> > 
> > For isatty-1.c, it's the same problem, but here it'd be
> > unweildy with the extra dg-lines, so better just skip it for
> > newlib targets.
> > 
> > testsuite:
> > * gcc.dg/analyzer/call-summaries-errno.c: Expect UNKNOWN
> > for newlib after having set errno.
> > * gcc.dg/analyzer/errno-1.c: Ditto.
> > * gcc.dg/analyzer/isatty-1.c: Skip for newlib targets.
> > ---
> >  gcc/testsuite/gcc.dg/analyzer/call-summaries-errno.c | 3 ++-
> >  gcc/testsuite/gcc.dg/analyzer/errno-1.c  | 3 ++-
> >  gcc/testsuite/gcc.dg/analyzer/isatty-1.c | 2 +-
> >  3 files changed, 5 insertions(+), 3 deletions(-)
> > 
> > diff --git a/gcc/testsuite/gcc.dg/analyzer/call-summaries-errno.c
> > b/gcc/testsuite/gcc.dg/analyzer/call-summaries-errno.c
> > index e4333b30bb77..cf4d9f7141e4 100644
> > --- a/gcc/testsuite/gcc.dg/analyzer/call-summaries-errno.c
> > +++ b/gcc/testsuite/gcc.dg/analyzer/call-summaries-errno.c
> > @@ -13,5 +13,6 @@ void test_sets_errno (int y)
> >sets_errno (y);
> >sets_errno (y);
> >  
> > -  __analyzer_eval (errno == y); /* { dg-warning "TRUE" } */  
> > +  __analyzer_eval (errno == y); /* { dg-warning "TRUE" "errno is at
> > a constant location" { target { ! newlib } } } */
> > +  /* { dg-warning "UNKNOWN" "errno is not known to be at a constant
> > location" { target { newlib } } .-1 } */
> >  }
> > diff --git a/gcc/testsuite/gcc.dg/analyzer/errno-1.c
> > b/gcc/testsuite/gcc.dg/analyzer/errno-1.c
> > index 6b9d28c10799..af0cc3d52a36 100644
> > --- a/gcc/testsuite/gcc.dg/analyzer/errno-1.c
> > +++ b/gcc/testsuite/gcc.dg/analyzer/errno-1.c
> > @@ -17,7 +17,8 @@ void test_storing_to_errno (int val)
> >  {
> >__analyzer_eval (errno == val); /* { dg-warning "UNKNOWN" } */
> >errno = val;
> > -  __analyzer_eval (errno == val); /* { dg-warning "TRUE" } */
> > +  __analyzer_eval (errno == val); /* { dg-warning "TRUE" "errno is
> > at a constant location" { target { ! newlib } } } */
> > +  /* { dg-warning "UNKNOWN" "errno is not known to be at a constant
> > location" { target { newlib } } .-1 } */
> >external_fn ();
> >__analyzer_eval (errno == val); /* { dg-warning "UNKNOWN" } */  
> >  }
> > diff --git a/gcc/testsuite/gcc.dg/analyzer/isatty-1.c
> > b/gcc/testsuite/gcc.dg/analyzer/isatty-1.c
> > index 389d2cdf3f18..450a7d71990d 100644
> > --- a/gcc/testsuite/gcc.dg/analyzer/isatty-1.c
> > +++ b/gcc/testsuite/gcc.dg/analyzer/isatty-1.c
> > @@ -1,4 +1,4 @@
> > -/* { dg-skip-if "" { powerpc*-*-aix* } } */
> > +/* { dg-skip-if "" { powerpc*-*-aix* || newlib } } */
> >  
> >  #include 
> >  #include "analyzer-decls.h"
> 


[PATCH 2/2] testsuite: Fix analyzer errors for newlib-fd

2023-02-28 Thread Hans-Peter Nilsson via Gcc-patches
Ok to commit?  (After this, there's
gcc.dg/analyzer/flex-without-call-summaries.c left to do.)

-- >8 --
Investigating analyzer testsuite errors for cris-elf.  The same are
seen for pru-elf according to posts to gcc-testresults@.

The test fd-access-mode-target-headers.c uses the analyzer
"sm-fd" which for this use requires (e.g.) that constants
O_ACCMODE, O_RDONLY and O_WRONLY are defined as literal
constants.  While for glibc, O_ACCMODE is defined as:
 #define O_ACCMODE 0003
in newlib, it's defined as:
 #define O_ACCMODE (O_RDONLY|O_WRONLY|O_RDWR)
and the analyzer is not able to make use of an expression
like this (even though O_RDONLY, O_WRONLY and O_RDWR are
defined as literal constants and the whole evaluates to 3).
Better do as for AIX and skip this test.

testsuite:
* gcc.dg/analyzer/fd-access-mode-target-headers.c: Skip for
newlib targets too.
---
 gcc/testsuite/gcc.dg/analyzer/fd-access-mode-target-headers.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/analyzer/fd-access-mode-target-headers.c 
b/gcc/testsuite/gcc.dg/analyzer/fd-access-mode-target-headers.c
index 847d47e06342..cf273b217d17 100644
--- a/gcc/testsuite/gcc.dg/analyzer/fd-access-mode-target-headers.c
+++ b/gcc/testsuite/gcc.dg/analyzer/fd-access-mode-target-headers.c
@@ -1,4 +1,4 @@
-/* { dg-skip-if "" { powerpc*-*-aix* } } */
+/* { dg-skip-if "" { powerpc*-*-aix* || newlib } } */
 
 #include 
 #include 
-- 
2.30.2



[PATCH 1/2] testsuite: Fix analyzer errors for newlib-errno

2023-02-28 Thread Hans-Peter Nilsson via Gcc-patches
Ok to commit?
-- >8 --
Investigating analyzer tesstsuite errors for cris-elf.  The same are
seen for pru-elf according to posts to gcc-testresults@.

For glibc, errno is #defined as:
 extern int *__errno_location (void) __THROW __attribute_const__;
 # define errno (*__errno_location ())
while for newlib in its default configuration, it's:
 #define errno (*__errno())
 extern int *__errno (void);

The critical difference is that __attribute__ ((__const__)),
where glibc says that the caller will see the same value on
all calls (from the same context; read: same thread).  I'm
not sure the absence of __attribute__ ((__const__)) for the
newlib definition is deliberate, but I guess it can.
Either way, without the "const" attribute, it can't be known
that the same location will be returned the next time, so
analyzer-tests that depend the value being known it should
see UNKNOWN rather than TRUE, that's why the deliberate
check for UNKNOWN rather than xfailing the test.

For isatty-1.c, it's the same problem, but here it'd be
unweildy with the extra dg-lines, so better just skip it for
newlib targets.

testsuite:
* gcc.dg/analyzer/call-summaries-errno.c: Expect UNKNOWN
for newlib after having set errno.
* gcc.dg/analyzer/errno-1.c: Ditto.
* gcc.dg/analyzer/isatty-1.c: Skip for newlib targets.
---
 gcc/testsuite/gcc.dg/analyzer/call-summaries-errno.c | 3 ++-
 gcc/testsuite/gcc.dg/analyzer/errno-1.c  | 3 ++-
 gcc/testsuite/gcc.dg/analyzer/isatty-1.c | 2 +-
 3 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/analyzer/call-summaries-errno.c 
b/gcc/testsuite/gcc.dg/analyzer/call-summaries-errno.c
index e4333b30bb77..cf4d9f7141e4 100644
--- a/gcc/testsuite/gcc.dg/analyzer/call-summaries-errno.c
+++ b/gcc/testsuite/gcc.dg/analyzer/call-summaries-errno.c
@@ -13,5 +13,6 @@ void test_sets_errno (int y)
   sets_errno (y);
   sets_errno (y);
 
-  __analyzer_eval (errno == y); /* { dg-warning "TRUE" } */  
+  __analyzer_eval (errno == y); /* { dg-warning "TRUE" "errno is at a constant 
location" { target { ! newlib } } } */
+  /* { dg-warning "UNKNOWN" "errno is not known to be at a constant location" 
{ target { newlib } } .-1 } */
 }
diff --git a/gcc/testsuite/gcc.dg/analyzer/errno-1.c 
b/gcc/testsuite/gcc.dg/analyzer/errno-1.c
index 6b9d28c10799..af0cc3d52a36 100644
--- a/gcc/testsuite/gcc.dg/analyzer/errno-1.c
+++ b/gcc/testsuite/gcc.dg/analyzer/errno-1.c
@@ -17,7 +17,8 @@ void test_storing_to_errno (int val)
 {
   __analyzer_eval (errno == val); /* { dg-warning "UNKNOWN" } */
   errno = val;
-  __analyzer_eval (errno == val); /* { dg-warning "TRUE" } */
+  __analyzer_eval (errno == val); /* { dg-warning "TRUE" "errno is at a 
constant location" { target { ! newlib } } } */
+  /* { dg-warning "UNKNOWN" "errno is not known to be at a constant location" 
{ target { newlib } } .-1 } */
   external_fn ();
   __analyzer_eval (errno == val); /* { dg-warning "UNKNOWN" } */  
 }
diff --git a/gcc/testsuite/gcc.dg/analyzer/isatty-1.c 
b/gcc/testsuite/gcc.dg/analyzer/isatty-1.c
index 389d2cdf3f18..450a7d71990d 100644
--- a/gcc/testsuite/gcc.dg/analyzer/isatty-1.c
+++ b/gcc/testsuite/gcc.dg/analyzer/isatty-1.c
@@ -1,4 +1,4 @@
-/* { dg-skip-if "" { powerpc*-*-aix* } } */
+/* { dg-skip-if "" { powerpc*-*-aix* || newlib } } */
 
 #include 
 #include "analyzer-decls.h"
-- 
2.30.2



  1   2   3   >