Make full use of context-sensitive ranges in access warnings

2021-10-23 Thread Martin Sebor via Gcc-patches

Somewhat belatedly following Aldy's lead on finishing
the conversion to Ranger, the attached patch modifies
gimple-ssa-warn-access and other passes that use
the pointer_query machinery to provide Ranger with
the statement it's being called to determine ranges for.
The changes are almost completely mechanical, involving
passing a GIMPLE statement around (and a range_query
pointer) all the way into the bowels of the pointer_query
class to make them available when range info is being
determined.

There might be some overlap with Aldy's tree-ssa-strlen.c
changes to do the same there.  I'll deal with any conflicts
when it comes time to commit the work.

The changes trigger a couple of -Wstringop-overread instances
in libstdc++ tests.  The warnings look valid for the IL but
the code they're in is unreachable.  One of the tests already
suppresses -Wstringop-overflow so also suppressing
-Wstringop-overread doesn't seem out of line.

Tested on x86_64-linux.

Martin

PS The warning for the u8path-char8_t.cc test is this:

/ssd/test/build/gcc-test/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/char_traits.h:355: 
warning: 'void* __builtin_memcpy(void*, const void*, long unsigned int)' 
reading between 16 and 4611686018427387903 bytes from a region of size 
10 [-Wstringop-overread]


The IL for it is below.  The loop iN BB 3 exits with __i_22 equal
to 10 so BBs 5, 6 and 7 are unreachable.  It's surprising to me
that the loop isn't optimized into something better (like a MEM
array assignment or memcpy).

   [local count: 1073741824]:
  MEM[(struct basic_string *)] ={v} {CLOBBER};
  MEM[(struct _Alloc_hider *)] ={v} {CLOBBER};
  MEM[(struct _Alloc_hider *)]._M_p = _M_local_buf;

   [local count: 8687547547]:
  # __i_109 = PHI <__i_22(3), 0(2)>
  __i_22 = __i_109 + 1;
  _24 = MEM[(const char_type &)"filename2" + __i_22 * 1];
  if (_24 != 0)
goto ; [89.00%]
  else
goto ; [11.00%]

   [local count: 1073741824]:   <<< __i_22 == 10 here
  if (__i_22 > 15)
goto ; [33.00%]
  else
goto ; [67.00%]

   [local count: 354334802]:
  if (__i_22 > 4611686018427387903)
goto ; [0.04%]
  else
goto ; [99.96%]   >>> __i_22 in [16, 4611686018427387903]

   [local count: 141736]:
  std::__throw_length_error ("basic_string::_M_create");

   [local count: 354193066]:
  _85 = __i_109 + 2;
  _42 = operator new (_85);
  s1._M_dataplus._M_p = _42;
  s1.D.30357._M_allocated_capacity = __i_22;
  __builtin_memcpy (_42, "filename2", __i_22);   << -Wstringop-overread
Make full use of context-sensitive ranges in access warnings.

gcc/ChangeLog:

	* builtins.c (check_strncat_sizes): Pass access_data ctor additional
	arguments.
	(expand_builtin_memcmp): Move code to gimple-ssa-warn-access.cc.
	(expand_builtin_fork_or_exec): Same.
	* gimple-array-bounds.cc (array_bounds_checker::check_mem_ref):
	Pass compute_objsize additional arguments.
	(inbounds_memaccess_p): Same.
	(array_bounds_checker::check_array_bounds): Add an assert.  Stash
	statement in a member.
	(check_array_bounds_dom_walker::before_dom_children):
	* gimple-array-bounds.h (array_bounds_checker::m_stmt): New member.
	* gimple-ssa-sprintf.c (get_destination_size): Add an argument.
	(handle_printf_call): Pass a new argument.
	* gimple-ssa-warn-access.cc (get_size_range): Add an argument.
	(check_access): Add an argument and pass it along to callees.
	(check_read_access): Make a member function.
	(pass_waccess::check_strcat): Pass access_data ctor additional
	arguments.
	(pass_waccess::check_strncat): Same.
	(pass_waccess::check_stxcpy): Same.
	(pass_waccess::check_stxncpy): Same.
	(pass_waccess::check_strncmp): Same.
	(pass_waccess::check_read_access):
	(pass_waccess::check_builtin):
	(pass_waccess::maybe_check_access_sizes):
	(pass_waccess::maybe_check_dealloc_call):
	* gimple-ssa-warn-access.h (check_read_access): Declare a new
	member function.
	* pointer-query.cc (compute_objsize_r): Add an argument.
	(gimple_call_return_array): Same.
	(gimple_call_alloc_size): Same.
	(access_ref::access_ref): Same.
	(access_ref::get_ref): Same.
	(pointer_query::get_ref): Same.
	(handle_min_max_size): Pass an arguments to callees.
	(handle_array_ref): Add an argument.
	(handle_mem_ref): Same.
	(compute_objsize): Same.
	* pointer-query.h (struct access_ref): Adjust signatures.
	(struct access_data): Same.
	(gimple_call_alloc_size): Add an argument.
	(gimple_parm_array_size): Same.
	(compute_objsize): Same.
	* tree-ssa-strlen.c (strlen_pass::adjust_last_stmt): Pass an additional
	argument to compute_objsize.
	(strlen_pass::maybe_warn_overflow): Same.
	(maybe_diag_stxncpy_trunc): Same.

gcc/testsuite/ChangeLog:

	* gcc.dg/Wstringop-overflow-22.c: Correct typos.
	* gcc.dg/Wstringop-overflow-77.c: New test.

libstdc++/testsuite/ChangeLog:

	* 21_strings/basic_string/capacity/1.cc: Also suppress
	-Wstringop-overread.
	* 27_io/filesystem/path/factory/u8path-char8_t.cc: Same.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index f1c3fea3583..7d0f61fc98b 100644
--- a/gcc/builtins.c
+++ 

Re: [PATCH] restore ancient -Waddress for weak symbols [PR33925]

2021-10-23 Thread Martin Sebor via Gcc-patches

On 10/4/21 3:37 PM, Jason Merrill wrote:

On 10/4/21 14:42, Martin Sebor wrote:

While resolving the recent -Waddress enhancement request (PR
PR102103) I came across a 2007 problem report about GCC 4 having
stopped warning for using the address of inline functions in
equality comparisons with null.  With inline functions being
commonplace in C++ this seems like an important use case for
the warning.

The change that resulted in suppressing the warning in these
cases was introduced inadvertently in a fix for PR 22252.

To restore the warning, the attached patch enhances
the decl_with_nonnull_addr_p() function to return true also for
weak symbols for which a definition has been provided.


I think you probably want to merge this function with 
fold-const.c:maybe_nonzero_address, which already handles more cases.


maybe_nonzero_address() doesn't behave quite like
decl_with_nonnull_addr_p() expects and I'm reluctant to muck
around with the former too much since it's used for codegen,
while the latter just for warnings.  (There is even a case
where the functions don't behave the same, and would result
in different warnings between C and C++ without some extra
help.)

So in the attached revision I just have maybe_nonzero_address()
call decl_with_nonnull_addr_p() and then refine the failing
(or uncertain) cases separately, with some overlap between
them.

Since I worked on this someone complained that some instances
of the warning newly enhanced under PR102103 aren't suppresed
in code resulting from macro expansion.  Since it's trivial,
I include the fix for that report in this patch as well.

Tested on x86_64-linux.

Martin
Restore ancient -Waddress for weak symbols [PR33925].

Resolves:
PR c/33925 - gcc -Waddress lost some useful warnings
PR c/102867 - -Waddress from macro expansion in readelf.c

gcc/c-family/ChangeLog:

	PR c++/33925
	PR c/102867
	* c-common.c (decl_with_nonnull_addr_p): Call maybe_nonzero_address
	and improve handling tof defined symbols.

gcc/c/ChangeLog:

	PR c++/33925
	PR c/102867
	* c-typeck.c (maybe_warn_for_null_address): Suppress warnings for
	code resulting from macro expansion.

gcc/cp/ChangeLog:

	PR c++/33925
	PR c/102867
	* typeck.c (warn_for_null_address): Suppress warnings for code
	resulting from macro expansion.

gcc/ChangeLog:

	PR c++/33925
	PR c/102867
	* doc/invoke.texi (-Waddress): Update.
	* fold-const.c (maybe_nonzero_address): Declare extern.  Simplify
	for readability.
	* tree.h (maybe_nonzero_address): Declare.

gcc/testsuite/ChangeLog:

	PR c++/33925
	PR c/102867
	* g++.dg/warn/Walways-true-2.C: Adjust to avoid a valid warning.
	* c-c++-common/Waddress-5.c: New test.
	* c-c++-common/Waddress-6.c: New test.
	* g++.dg/warn/Waddress-7.C: New test.
	* g++.dg/warn/Walways-true-2.C: Adjust to avoid a valid warning.
	* gcc.dg/weak/weak-3.c: Expect a warning.

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 32c7e3e8972..71cc74ac63d 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -3395,16 +3395,46 @@ c_wrap_maybe_const (tree expr, bool non_const)
 
 /* Return whether EXPR is a declaration whose address can never be NULL.
The address of the first struct member could be NULL only if it were
-   accessed through a NULL pointer, and such an access would be invalid.  */
+   accessed through a NULL pointer, and such an access would be invalid.
+   The address of a weak symbol may be null unless it has a definition.  */
 
 bool
 decl_with_nonnull_addr_p (const_tree expr)
 {
-  return (DECL_P (expr)
-	  && (TREE_CODE (expr) == FIELD_DECL
-	  || TREE_CODE (expr) == PARM_DECL
-	  || TREE_CODE (expr) == LABEL_DECL
-	  || !DECL_WEAK (expr)));
+  if (maybe_nonzero_address (const_cast (expr)) > 0)
+return true;
+
+  if (!DECL_P (expr))
+return false;
+
+  if (TREE_CODE (expr) == FIELD_DECL
+  || TREE_CODE (expr) == PARM_DECL
+  || TREE_CODE (expr) == LABEL_DECL)
+return true;
+
+  if (!VAR_OR_FUNCTION_DECL_P (expr))
+return false;
+
+  if (!DECL_WEAK (expr))
+/* Ordinary (non-weak) symbols have nonnull addresses.  */
+return true;
+
+  if (DECL_INITIAL (expr) && DECL_INITIAL (expr) != error_mark_node)
+/* Initialized weak symbols have nonnull addresses.  */
+return true;
+
+  if (DECL_EXTERNAL (expr) || !TREE_STATIC (expr))
+/* Uninitialized extern weak symbols and weak symbols with no
+   allocated stroage might have a null address.  */
+return false;
+
+  tree attribs = DECL_ATTRIBUTES (expr);
+  if (lookup_attribute ("weakref", attribs))
+/* Weakref symbols might have a null address unless their referent
+   is known not to.  Don't bother following weakref targets here.  */
+return false;
+
+  return true;
 }
 
 /* Prepare expr to be an argument of a TRUTH_NOT_EXPR,
diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index 0aac978c02e..0ceedfa7b04 100644
--- a/gcc/c/c-typeck.c
+++ b/gcc/c/c-typeck.c
@@ -11571,7 +11571,10 @@ build_vec_cmp (tree_code code, tree type,

Re: [PATCH] Always default to DWARF2_DEBUG if not specified, warn about deprecated STABS

2021-10-23 Thread Jeff Law via Gcc-patches




On 10/23/2021 2:00 PM, Jan-Benedict Glaw wrote:

Hi Richard,

On Tue, 2021-09-21 16:25:19 +0200, Richard Biener via Gcc-patches 
 wrote:

I have built all targets from contrib/config-list.mk to make sure we
don't run into the #error and the following makes the STABS usage
explicit for pdp11 and hppa with SOM.

I'm running build tests based on config-list.mk as well and see a good
number of targets failing, all about the same, ie. for moxie-elf:

That's odd.  My moxie-elf (and all the others I test) aren't seeing this.

http://gcc.gnu.org/jenkins

Jeff



Re: [PATCH] Convert strlen pass from evrp to ranger.

2021-10-23 Thread Jeff Law via Gcc-patches




On 10/21/2021 12:20 PM, Jeff Law wrote:





So if we're referring to those temporary const/copy propagations
"escaping" into Ranger, then I would fully expect that to cause
problems.  Essentially they're path sensitive const/copy propagations
and may not be valid on all the paths through the CFG to the statement
where the propagation occurs

Yeah.  disabling the global ranger should help, plus making sure you
don't use the ranger in the midst of the path sensitive changes.
I think we should first try to remove those temporary const/copy 
propagations.  As I noted in a different follow-up, I can't remember 
if they were done as part of the original non-copying threader or if 
they enabled further optimizations in the copying threader.  If its 
the former, then they can go away and that would be my preference. 
I'll try to poke at that over the weekend.
OK.  So those temporary propagations are still useful.  Here's a simple 
example (pr36550, but there are others):



[ ... ]
;;   basic block 3, loop depth 0, count 536870912 (estimated locally), 
maybe hot

;;    prev block 2, next block 4, flags: (NEW, VISITED)
;;    pred:   2 [50.0% (guessed)]  count:536870912 (estimated 
locally) (TRUE_VALUE,EXECUTABLE)

  # VUSE <.MEM_10>
  _20 = *argv_12(D);
  if (_20 != 0B)
    goto ; [70.00%]
  else
    goto ; [30.00%]
;;    succ:   4 [70.0% (guessed)]  count:375809640 (estimated 
locally) (TRUE_VALUE,EXECUTABLE)
;;    7 [30.0% (guessed)]  count:161061272 (estimated 
locally) (FALSE_VALUE,EXECUTABLE)


[ ... ]

;;   basic block 7, loop depth 0, count 536763538 (estimated locally), 
maybe hot

;;    prev block 6, next block 8, flags: (NEW, VISITED)
;;    pred:   3 [30.0% (guessed)]  count:161061272 (estimated 
locally) (FALSE_VALUE,EXECUTABLE)
;;    4 [always]  count:375809640 (estimated locally) 
(FALLTHRU,EXECUTABLE)

  # argv_32 = PHI 
  # bug_33 = PHI 
  # VUSE <.MEM_10>
  _3 = *argv_32;
  if (_3 == 0B)
    goto ; [18.09%]
  else
    goto ; [81.91%]
;;    succ:   9 [18.1% (guessed)]  count:97100524 (estimated 
locally) (TRUE_VALUE,EXECUTABLE)
;;    8 [81.9% (guessed)]  count:439663014 (estimated 
locally) (FALSE_VALUE,EXECUTABLE)



So when we reach block 7 directly from block 3 we'll have _20 = 
*argv_12(MEM_10) = _20 in the expression hash table and _20 = 0 in the 
const/copies table.


The first statement in block #7 loads *argv_32.  Without the temporary 
propagation we'll lookup *argv_32(MEM_10) in the hash table which misses 
and we're unable to thread the subsequent conditional in block #7.


With the temporary propagation instead of looking up *argv_32(MEM_10), 
we propagate argv_12 for argv32 and lookup *argv32(MEM_10) which hits 
with the LHS value _20 which we then lookup in const/copies getting the 
constant 0.   Thus we know *argv_32 will have the value 0 when block 7 
is reached directly via block 3.  That in turn allows us to know that 
the conditional at the end of block #7 is threadable when reached via 
block #3.


In this particular testcase we end up getting a failure because... 
drumroll the failure to thread 3->7->9 leaves an infeasible path in 
the CFG which in turn causes a bogus Wuninitialized warning.


Instead of actually propagating the arguments into the statement, we 
could revamp the hashing bits so that they used the data from the 
const/copy tables.    That would avoid twiddling the IL.  Though I'm 
still not sure how the IL twiddling could be escaping at this point.


jeff




SV: [PATCH] libgccjit: add some reflection functions in the jit C api

2021-10-23 Thread Petter Tomner via Gcc-patches
Hi!

+gcc_jit_type *
+gcc_jit_type_unqualified (gcc_jit_type *type)
+{
+  RETURN_NULL_IF_FAIL (type, NULL, NULL, "NULL type");
+
+  return (gcc_jit_type *)type->unqualified ();
+}

I think there is a problem with the current implementation of unqualified() 
that might be kinda surprising to
users, since it only removes one qualifier and also not "pointed to" qualifiers.

Unrelated to this patch that behavior should probably be documented, since it 
effects many entry-points
and makes cast to base type kinda mandatory when qualified types are mixed. 
Casts uses a recursive
check of types and is happy to cast any pointer to any pointer (no matter the 
level of pointerness) so
they work.

I guess the casts should get some shortcircuit for casts between same base 
types really,
to not bloat the GENERIC tree.

I made a utility function in another patch for comparing types that do full 
stripping (see bellow).

Maybe something like that could be usefull?

jit-recording.h:
+/* Strip all qualifiers and count pointer depth, returning true
+   if the types and pointer depth are the same, otherwise false. */
+static inline bool
+types_kinda_same (recording::type *a, recording::type *b)
+{
+  /* Handle trivial case here, to allow for inlining. */
+  return a == b || types_kinda_same_internal (a, b);
+}
jit-recording.c:
+/* Strip qualifiers and count pointer depth, returning true
+   if the types' base type and pointer depth are
+   the same, otherwise false.
+
+   Do not call this directly. Call 'types_kinda_same' */
+bool
+types_kinda_same_internal (recording::type *a, recording::type *b)
+{
+  int ptr_depth[2] = {};
+  recording::type *base_types[2];
+  recording::type *types[2] = {a, b};
+
+  /* Strip qualifiers and count pointerness */
+  for (int i = 0; i < 2; i++)
+{
+  recording::type *t = types[i];
+  while (true)
+   {
+ if (!t)
+   return false; /* Should only happen on bad input */
+
+ recording::type *pointed_to_type = t->is_pointer();
+ if (pointed_to_type != NULL)
+   {
+ ptr_depth[i]++;
+ t = pointed_to_type;
+ continue;
+   }
+
+ /* unqualified() returns 'this' on base types */
+ recording::type *next = t->unqualified ();
+ if (next == t)
+   {
+ base_types[i] = t;
+ break;
+   }
+ t = next;
+   }
+}
+
+  return base_types[0] == base_types[1] &&
+(ptr_depth[0] == ptr_depth[1]);
+}



Från: Jit  för Antoni Boucher via Jit 

Skickat: den 13 oktober 2021 04:09
Till: David Malcolm
Kopia: Antoni Boucher via Jit; gcc-patches@gcc.gnu.org
Ämne: Re: [PATCH] libgccjit: add some reflection functions in the jit C api
    
David: PING

Le lundi 27 septembre 2021 à 20:53 -0400, Antoni Boucher a écrit :
> I fixed an issue (it would show an error message when
> gcc_jit_type_dyncast_function_ptr_type was called on a type different
> than a function pointer type).
> 
> Here's the updated patch.
> 
> Le vendredi 18 juin 2021 à 16:37 -0400, David Malcolm a écrit :
> > On Fri, 2021-06-18 at 15:41 -0400, Antoni Boucher wrote:
> > > I have write access now.
> > 
> > Great.
> > 
> > > I'm not sure how I'm supposed to send my patches:
> > > should I put it in personal branches and you'll merge them?
> > 
> > Please send them to this mailing list for review; once they're
> > approved
> > you can merge them.
> > 
> > > 
> > > And for the MAINTAINERS file, should I just push to master right
> > > away,
> > > after sending it to the mailing list?
> > 
> > I think people just push the MAINTAINERS change and then let the
> > list
> > know, since it makes a good test that write access is working
> > correctly.
> > 
> > Dave
> > 
> > > 
> > > Thanks for your help!
> > > 
> > > Le vendredi 18 juin 2021 à 12:09 -0400, David Malcolm a écrit :
> > > > On Fri, 2021-06-18 at 11:55 -0400, Antoni Boucher wrote:
> > > > > Le vendredi 11 juin 2021 à 14:00 -0400, David Malcolm a
> > > > > écrit :
> > > > > > On Fri, 2021-06-11 at 08:15 -0400, Antoni Boucher wrote:
> > > > > > > Thank you for your answer.
> > > > > > > I attached the updated patch.
> > > > > > 
> > > > > > BTW you (or possibly me) dropped the mailing lists; was
> > > > > > that
> > > > > > deliberate?
> > > > > 
> > > > > Oh, my bad.
> > > > > 
> > > > 
> > > > [...]
> > > > 
> > > > 
> > > > > > 
> > > > > > 
> > > > > > > I have signed the FSF copyright attribution.
> > > > > > 
> > > > > > I can push changes on your behalf, but I'd prefer it if you
> > > > > > did
> > > > > > it,
> > > > > > especially given that you have various other patches you
> > > > > > want
> > > > > > to
> > > > > > get
> > > > > > in.
> > > > > > 
> > > > > > Instructions on how to get push rights to the git repo are
> > > > > > here:
> > > > > >   https://gcc.gnu.org/gitwrite.html
> > > > > > 
> > > > > > I can sponsor you.
> > > > > 
> > > > > Thanks.
> > > > > I did sign up to get push rights.
> > > > > Have you 

Re: [PATCH] Always default to DWARF2_DEBUG if not specified, warn about deprecated STABS

2021-10-23 Thread Jan-Benedict Glaw
Hi Richard,

On Tue, 2021-09-21 16:25:19 +0200, Richard Biener via Gcc-patches 
 wrote:
> I have built all targets from contrib/config-list.mk to make sure we
> don't run into the #error and the following makes the STABS usage
> explicit for pdp11 and hppa with SOM.

I'm running build tests based on config-list.mk as well and see a good
number of targets failing, all about the same, ie. for moxie-elf:

[all 2021-10-17 00:01:19] /usr/lib/gcc-snapshot/bin/g++  -fno-PIE -c  
-DIN_GCC_FRONTEND -g -O2 -DIN_GCC  -DCROSS_DIRECTORY_STRUCTURE   
-fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing 
-Wwrite-strings -Wcast-qual -Wno-error=format-diag -Wmissing-format-attribute 
-Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros 
-Wno-overlength-strings -Werror -fno-common  -DHAVE_CONFIG_H -I. -I. 
-I../../gcc/gcc -I../../gcc/gcc/. -I../../gcc/gcc/../include 
-I../../gcc/gcc/../libcpp/include -I../../gcc/gcc/../libcody  
-I../../gcc/gcc/../libdecnumber -I../../gcc/gcc/../libdecnumber/dpd 
-I../libdecnumber -I../../gcc/gcc/../libbacktrace   -o default-d.o -MT 
default-d.o -MMD -MP -MF ./.deps/default-d.TPo ../../gcc/gcc/config/default-d.c
[all 2021-10-17 00:01:19] In file included from ./tm_d.h:9,
[all 2021-10-17 00:01:19]  from 
../../gcc/gcc/config/default-d.c:22:
[all 2021-10-17 00:01:19] ../../gcc/gcc/defaults.h:908:2: error: #error You 
must define PREFERRED_DEBUGGING_TYPE if DWARF is not supported
[all 2021-10-17 00:01:19]   908 | #error You must define 
PREFERRED_DEBUGGING_TYPE if DWARF is not supported
[all 2021-10-17 00:01:19]   |  ^
[all 2021-10-17 00:01:20] make[1]: *** [Makefile:2330: default-d.o] Error 1
[all 2021-10-17 00:01:21] make[1]: Leaving directory 
'/var/lib/laminar/run/gcc-moxie-elf/13/toolchain-build/gcc'
[all 2021-10-17 00:01:21] make: *** [Makefile:4423: all-gcc] Error 2

Shall I try to ping all the maintainers?

MfG, JBG

-- 


signature.asc
Description: PGP signature


RE: [PATCH 2/2]AArch64: Add better costing for vector constants and operations

2021-10-23 Thread Tamar Christina via Gcc-patches


> -Original Message-
> From: Richard Sandiford 
> Sent: Saturday, October 23, 2021 11:40 AM
> To: Tamar Christina via Gcc-patches 
> Cc: Tamar Christina ; Richard Earnshaw
> ; nd ; Marcus Shawcroft
> 
> Subject: Re: [PATCH 2/2]AArch64: Add better costing for vector constants
> and operations
> 
> Tamar Christina via Gcc-patches  writes:
> >> I'm still a bit sceptical about treating the high-part cost as lower.
> >> ISTM that the subreg cases are the ones that are truly “free” and any
> >> others should have a normal cost.  So if CSE handled the subreg case
> >> itself (to model how the rtx would actually be generated) then
> >> aarch64 code would have to do less work.  I imagine that will be true for
> other targets as well.
> >
> > I guess the main problem is that CSE lacks context because it's not
> > until after combine that the high part becomes truly "free" when pushed
> into a high operation.
> 
> Yeah.  And the aarch64 code is just being asked to cost the operation it's
> given, which could for example come from an existing
> aarch64_simd_mov_from_high.  I think we should try to ensure that
> a aarch64_simd_mov_from_high followed by some arithmetic on
> the result is more expensive than the fused operation (when fusing is
> possible).
> 
> An analogy might be: if the cost code is given:
> 
>   (add (reg X) (reg Y))
> 
> then, at some later point, the (reg X) might be replaced with a 
> multiplication,
> in which case we'd have a MADD operation and the addition is effectively
> free.  Something similar would happen if (reg X) became a shift by a small
> amount on newer cores, although I guess then you could argue either that
> the cost of the add disappears or that the cost of the shift disappears.
> 
> But we shouldn't count ADD as free on the basis that it could be combined
> with a multiplication or shift in future.  We have to cost what we're given.  
> I
> think the same thing applies to the high part.
> 
> Here we're trying to prevent cse1 from replacing a DUP (lane) with a MOVI
> by saying that the DUP is strictly cheaper than the MOVI.
> I don't think that's really true though, and the cost tables in the patch say 
> that
> DUP is more expensive (rather than less expensive) than MOVI.

No we're not. The front end has already pushed the constant into each operation 
that needs it
which is the entire problem.

MOVI as I mentioned before is the one case where this is a toss up.  But there 
are far
more constants that cannot be created with a movi.  A simple example is

#include 

int8x16_t square(int8x16_t full, int8x8_t small) {
int8x16_t cst = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,15};
int8x8_t low = vget_high_s8 (cst);
int8x8_t res1 = vmul_s8 (small, low);
return vaddq_s8 (vmulq_s8 (full, cst), vcombine_s8 (res1, res1));
}

Where in Gimple we get

   [local count: 1073741824]:
  _2 = __builtin_aarch64_get_highv16qi ({ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 
12, 13, 15, 0 });
  _4 = _2 * small_3(D);
  _6 = full_5(D) * { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 15, 0 };
  _7 = __builtin_aarch64_combinev8qi (_4, _4);
  _8 = _6 + _7;
  return _8;

Regardless of what happens to __builtin_aarch64_get_highv16qi nothing will 
recreate the relationship
with cst, whether __builtin_aarch64_get_highv16qi is lowered or not, constant 
prop will still push in constants.

This codegen results in us rematerializing the constant twice.

square:
adrpx0, .LC0
ldr d2, [x0, #:lo12:.LC0]
adrpx0, .LC1
ldr q3, [x0, #:lo12:.LC1]
mul v1.8b, v2.8b, v1.8b
dup d2, v1.d[0]
ins v2.d[1], v1.d[0]
mla v2.16b, v0.16b, v3.16b
mov v0.16b, v2.16b
ret
.LC1:
.byte   0
.byte   1
.byte   2
.byte   3
.byte   4
.byte   5
.byte   6
.byte   7
.byte   8
.byte   9
.byte   10
.byte   11
.byte   12
.byte   13
.byte   15
.byte   0

Regardless whether it's pushed into a high operation or not this codegen it's 
still far more expensive to do this codegen.

> 
> Also, if I've understood correctly, it looks like we'd be relying on the
> vget_high of a constant remaining unfolded until RTL cse1.
> I think it's likely in future that we'd try to fold vget_high at the gimple 
> level
> instead, since that could expose more optimisations of a different kind.  The
> gimple optimisers would then fold vget_high(constant) in a similar way to
> cse1 does now.
> 
> So perhaps we should continue to allow the vget_high(constant) to be
> foloded in cse1 and come up with some way of coping with the folded form.

CSE1 doesn't fold it, because for CSE the cost is too high to do so. Which is 
what this costing was attempting to fix.
CSE simply does not touch it. It leaves it as

(insn 11 10 12 2 (set (reg:V16QI 95 [ _7 ])
(vec_concat:V16QI (vec_select:V8QI (reg:V16QI 95 [ _7 ])
   

Re: Cannot reproduce – Re: [r12-4632 Regression] FAIL: gfortran.dg/bind-c-intent-out-2.f90 -Os (test for excess errors) on Linux/x86_64

2021-10-23 Thread H.J. Lu via Gcc-patches
On Sat, Oct 23, 2021 at 5:34 AM H.J. Lu  wrote:
>
> On Sat, Oct 23, 2021 at 5:31 AM H.J. Lu  wrote:
> >
> > On Fri, Oct 22, 2021 at 11:20 PM Tobias Burnus  
> > wrote:
> > >
> > > Hi,
> > >
> > > for some reasons, I cannot reproduce this. I checked with that I am in
> > > sync with master – and I also tried -m32 and -march=cascadelake, running
> > > both manually and via DejaGNU but I it passes here.
> > >
> > > Can someone who sees it show the excess error? Or was that a spurious
> > > issue which is now  gone?
> >
> > spawn -ignore SIGHUP
> > /export/gnu/import/git/gcc-test-master-intel64-native/bld/gcc/testsuite/gfortran2/../../gfortran
> > -B/export/gnu/import/git/gcc-test-master-intel64-native/bld/gcc/testsuite/gfortran2/../../
> > -B/export/gnu/import/git/gcc-test-master-intel64-native/bld/x86_64-pc-linux-gnu/./libgfortran/
> > /export/gnu/import/git/gcc-test-master-intel64-native/src-master/gcc/testsuite/gfortran.dg/bind-c-intent-out-2.f90
> > -fdiagnostics-plain-output -fdiagnostics-plain-output -O0
> > -pedantic-errors -fsanitize=undefined -fcheck=all
> > -B/export/gnu/import/git/gcc-test-master-intel64-native/bld/x86_64-pc-linux-gnu/./libgfortran/.libs
> > -L/export/gnu/import/git/gcc-test-master-intel64-native/bld/x86_64-pc-linux-gnu/./libgfortran/.libs
> > -L/export/gnu/import/git/gcc-test-master-intel64-native/bld/x86_64-pc-linux-gnu/./libgfortran/.libs
> > -L/export/gnu/import/git/gcc-test-master-intel64-native/bld/x86_64-pc-linux-gnu/./libatomic/.libs
> > -B/export/gnu/import/git/gcc-test-master-intel64-native/bld/x86_64-pc-linux-gnu/./libquadmath/.libs
> > -L/export/gnu/import/git/gcc-test-master-intel64-native/bld/x86_64-pc-linux-gnu/./libquadmath/.libs
> > -L/export/gnu/import/git/gcc-test-master-intel64-native/bld/x86_64-pc-linux-gnu/./libquadmath/.libs
> > -lm -o ./bind-c-intent-out-2.exe
> > /usr/local/bin/ld: cannot find -lubsan: No such file or directory
> >
> > Shouldn't the test be placed under gfortran.dg/ubsan?
>
> I am checking it as an obvious fix.

>From d891ab1bc87bc5d855f6ee18337e517a2a90d759 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Sat, 23 Oct 2021 05:40:09 -0700
Subject: [PATCH] Move bind-c-intent-out-2.f90 to gfortran.dg/ubsan

Move bind-c-intent-out-2.f90 to gfortran.dg/ubsan for -fsanitize=undefined.

PR fortran/9262
* gfortran.dg/bind-c-intent-out-2.f90: Moved to ...
* gfortran.dg/ubsan/bind-c-intent-out-2.f90
---
 gcc/testsuite/gfortran.dg/{ => ubsan}/bind-c-intent-out-2.f90 | 0
 1 file changed, 0 insertions(+), 0 deletions(-)
 rename gcc/testsuite/gfortran.dg/{ => ubsan}/bind-c-intent-out-2.f90 (100%)

diff --git a/gcc/testsuite/gfortran.dg/bind-c-intent-out-2.f90
b/gcc/testsuite/gfortran.dg/ubsan/bind-c-intent-out-2.f90
similarity index 100%
rename from gcc/testsuite/gfortran.dg/bind-c-intent-out-2.f90
rename to gcc/testsuite/gfortran.dg/ubsan/bind-c-intent-out-2.f90

-- 
H.J.


Re: Cannot reproduce – Re: [r12-4632 Regression] FAIL: gfortran.dg/bind-c-intent-out-2.f90 -Os (test for excess errors) on Linux/x86_64

2021-10-23 Thread H.J. Lu via Gcc-patches
On Sat, Oct 23, 2021 at 5:31 AM H.J. Lu  wrote:
>
> On Fri, Oct 22, 2021 at 11:20 PM Tobias Burnus  
> wrote:
> >
> > Hi,
> >
> > for some reasons, I cannot reproduce this. I checked with that I am in
> > sync with master – and I also tried -m32 and -march=cascadelake, running
> > both manually and via DejaGNU but I it passes here.
> >
> > Can someone who sees it show the excess error? Or was that a spurious
> > issue which is now  gone?
>
> spawn -ignore SIGHUP
> /export/gnu/import/git/gcc-test-master-intel64-native/bld/gcc/testsuite/gfortran2/../../gfortran
> -B/export/gnu/import/git/gcc-test-master-intel64-native/bld/gcc/testsuite/gfortran2/../../
> -B/export/gnu/import/git/gcc-test-master-intel64-native/bld/x86_64-pc-linux-gnu/./libgfortran/
> /export/gnu/import/git/gcc-test-master-intel64-native/src-master/gcc/testsuite/gfortran.dg/bind-c-intent-out-2.f90
> -fdiagnostics-plain-output -fdiagnostics-plain-output -O0
> -pedantic-errors -fsanitize=undefined -fcheck=all
> -B/export/gnu/import/git/gcc-test-master-intel64-native/bld/x86_64-pc-linux-gnu/./libgfortran/.libs
> -L/export/gnu/import/git/gcc-test-master-intel64-native/bld/x86_64-pc-linux-gnu/./libgfortran/.libs
> -L/export/gnu/import/git/gcc-test-master-intel64-native/bld/x86_64-pc-linux-gnu/./libgfortran/.libs
> -L/export/gnu/import/git/gcc-test-master-intel64-native/bld/x86_64-pc-linux-gnu/./libatomic/.libs
> -B/export/gnu/import/git/gcc-test-master-intel64-native/bld/x86_64-pc-linux-gnu/./libquadmath/.libs
> -L/export/gnu/import/git/gcc-test-master-intel64-native/bld/x86_64-pc-linux-gnu/./libquadmath/.libs
> -L/export/gnu/import/git/gcc-test-master-intel64-native/bld/x86_64-pc-linux-gnu/./libquadmath/.libs
> -lm -o ./bind-c-intent-out-2.exe
> /usr/local/bin/ld: cannot find -lubsan: No such file or directory
>
> Shouldn't the test be placed under gfortran.dg/ubsan?

I am checking it as an obvious fix.

> > Tobias
> >
> > On 23.10.21 06:43, sunil.k.pandey wrote:
> > > On Linux/x86_64,
> > >
> > > 24e99e6ec1cc57f3660c00ff677c7feb16aa94d2 is the first bad commit
> > > commit 24e99e6ec1cc57f3660c00ff677c7feb16aa94d2
> > > Author: Tobias Burnus 
> > > Date:   Fri Oct 22 23:23:06 2021 +0200
> > >
> > >  Fortran: Avoid running into assert with -fcheck= + UBSAN
> > >
> > > caused
> > >
> > > FAIL: gfortran.dg/bind-c-intent-out-2.f90   -O0  (test for excess errors)
> > > FAIL: gfortran.dg/bind-c-intent-out-2.f90   -O1  (test for excess errors)
> > > FAIL: gfortran.dg/bind-c-intent-out-2.f90   -O2  (test for excess errors)
> > > FAIL: gfortran.dg/bind-c-intent-out-2.f90   -O3 -fomit-frame-pointer 
> > > -funroll-loops -fpeel-loops -ftracer -finline-functions  (test for excess 
> > > errors)
> > > FAIL: gfortran.dg/bind-c-intent-out-2.f90   -O3 -g  (test for excess 
> > > errors)
> > > FAIL: gfortran.dg/bind-c-intent-out-2.f90   -Os  (test for excess errors)
> > >
> > > with GCC configured with
> > >
> > > ../../gcc/configure 
> > > --prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-4632/usr
> > >  --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
> > > --with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet 
> > > --without-isl --enable-libmpx x86_64-linux --disable-bootstrap
> > >
> > > To reproduce:
> > >
> > > $ cd {build_dir}/gcc && make check 
> > > RUNTESTFLAGS="dg.exp=gfortran.dg/bind-c-intent-out-2.f90 
> > > --target_board='unix{-m32}'"
> > > $ cd {build_dir}/gcc && make check 
> > > RUNTESTFLAGS="dg.exp=gfortran.dg/bind-c-intent-out-2.f90 
> > > --target_board='unix{-m32\ -march=cascadelake}'"
> > > $ cd {build_dir}/gcc && make check 
> > > RUNTESTFLAGS="dg.exp=gfortran.dg/bind-c-intent-out-2.f90 
> > > --target_board='unix{-m64}'"
> > > $ cd {build_dir}/gcc && make check 
> > > RUNTESTFLAGS="dg.exp=gfortran.dg/bind-c-intent-out-2.f90 
> > > --target_board='unix{-m64\ -march=cascadelake}'"
> > >
> > > (Please do not reply to this email, for question about this report, 
> > > contact me at skpgkp2 at gmail dot com)
> > -
> > Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 
> > 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: 
> > Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; 
> > Registergericht München, HRB 106955
>
>
>
> --
> H.J.



-- 
H.J.


Re: Cannot reproduce – Re: [r12-4632 Regression] FAIL: gfortran.dg/bind-c-intent-out-2.f90 -Os (test for excess errors) on Linux/x86_64

2021-10-23 Thread H.J. Lu via Gcc-patches
On Fri, Oct 22, 2021 at 11:20 PM Tobias Burnus  wrote:
>
> Hi,
>
> for some reasons, I cannot reproduce this. I checked with that I am in
> sync with master – and I also tried -m32 and -march=cascadelake, running
> both manually and via DejaGNU but I it passes here.
>
> Can someone who sees it show the excess error? Or was that a spurious
> issue which is now  gone?

spawn -ignore SIGHUP
/export/gnu/import/git/gcc-test-master-intel64-native/bld/gcc/testsuite/gfortran2/../../gfortran
-B/export/gnu/import/git/gcc-test-master-intel64-native/bld/gcc/testsuite/gfortran2/../../
-B/export/gnu/import/git/gcc-test-master-intel64-native/bld/x86_64-pc-linux-gnu/./libgfortran/
/export/gnu/import/git/gcc-test-master-intel64-native/src-master/gcc/testsuite/gfortran.dg/bind-c-intent-out-2.f90
-fdiagnostics-plain-output -fdiagnostics-plain-output -O0
-pedantic-errors -fsanitize=undefined -fcheck=all
-B/export/gnu/import/git/gcc-test-master-intel64-native/bld/x86_64-pc-linux-gnu/./libgfortran/.libs
-L/export/gnu/import/git/gcc-test-master-intel64-native/bld/x86_64-pc-linux-gnu/./libgfortran/.libs
-L/export/gnu/import/git/gcc-test-master-intel64-native/bld/x86_64-pc-linux-gnu/./libgfortran/.libs
-L/export/gnu/import/git/gcc-test-master-intel64-native/bld/x86_64-pc-linux-gnu/./libatomic/.libs
-B/export/gnu/import/git/gcc-test-master-intel64-native/bld/x86_64-pc-linux-gnu/./libquadmath/.libs
-L/export/gnu/import/git/gcc-test-master-intel64-native/bld/x86_64-pc-linux-gnu/./libquadmath/.libs
-L/export/gnu/import/git/gcc-test-master-intel64-native/bld/x86_64-pc-linux-gnu/./libquadmath/.libs
-lm -o ./bind-c-intent-out-2.exe
/usr/local/bin/ld: cannot find -lubsan: No such file or directory

Shouldn't the test be placed under gfortran.dg/ubsan?

> Tobias
>
> On 23.10.21 06:43, sunil.k.pandey wrote:
> > On Linux/x86_64,
> >
> > 24e99e6ec1cc57f3660c00ff677c7feb16aa94d2 is the first bad commit
> > commit 24e99e6ec1cc57f3660c00ff677c7feb16aa94d2
> > Author: Tobias Burnus 
> > Date:   Fri Oct 22 23:23:06 2021 +0200
> >
> >  Fortran: Avoid running into assert with -fcheck= + UBSAN
> >
> > caused
> >
> > FAIL: gfortran.dg/bind-c-intent-out-2.f90   -O0  (test for excess errors)
> > FAIL: gfortran.dg/bind-c-intent-out-2.f90   -O1  (test for excess errors)
> > FAIL: gfortran.dg/bind-c-intent-out-2.f90   -O2  (test for excess errors)
> > FAIL: gfortran.dg/bind-c-intent-out-2.f90   -O3 -fomit-frame-pointer 
> > -funroll-loops -fpeel-loops -ftracer -finline-functions  (test for excess 
> > errors)
> > FAIL: gfortran.dg/bind-c-intent-out-2.f90   -O3 -g  (test for excess errors)
> > FAIL: gfortran.dg/bind-c-intent-out-2.f90   -Os  (test for excess errors)
> >
> > with GCC configured with
> >
> > ../../gcc/configure 
> > --prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-4632/usr
> >  --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
> > --with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet 
> > --without-isl --enable-libmpx x86_64-linux --disable-bootstrap
> >
> > To reproduce:
> >
> > $ cd {build_dir}/gcc && make check 
> > RUNTESTFLAGS="dg.exp=gfortran.dg/bind-c-intent-out-2.f90 
> > --target_board='unix{-m32}'"
> > $ cd {build_dir}/gcc && make check 
> > RUNTESTFLAGS="dg.exp=gfortran.dg/bind-c-intent-out-2.f90 
> > --target_board='unix{-m32\ -march=cascadelake}'"
> > $ cd {build_dir}/gcc && make check 
> > RUNTESTFLAGS="dg.exp=gfortran.dg/bind-c-intent-out-2.f90 
> > --target_board='unix{-m64}'"
> > $ cd {build_dir}/gcc && make check 
> > RUNTESTFLAGS="dg.exp=gfortran.dg/bind-c-intent-out-2.f90 
> > --target_board='unix{-m64\ -march=cascadelake}'"
> >
> > (Please do not reply to this email, for question about this report, contact 
> > me at skpgkp2 at gmail dot com)
> -
> Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
> München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
> Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
> München, HRB 106955



--
H.J.


Re: [PATCH 2/2]AArch64: Add better costing for vector constants and operations

2021-10-23 Thread Richard Sandiford via Gcc-patches
Tamar Christina via Gcc-patches  writes:
>> I'm still a bit sceptical about treating the high-part cost as lower.
>> ISTM that the subreg cases are the ones that are truly “free” and any others
>> should have a normal cost.  So if CSE handled the subreg case itself (to 
>> model
>> how the rtx would actually be generated) then aarch64 code would have to
>> do less work.  I imagine that will be true for other targets as well.
>
> I guess the main problem is that CSE lacks context because it's not until 
> after
> combine that the high part becomes truly "free" when pushed into a high 
> operation.

Yeah.  And the aarch64 code is just being asked to cost the operation
it's given, which could for example come from an existing
aarch64_simd_mov_from_high.  I think we should try to ensure that
a aarch64_simd_mov_from_high followed by some arithmetic on the
result is more expensive than the fused operation (when fusing is
possible).

An analogy might be: if the cost code is given:

  (add (reg X) (reg Y))

then, at some later point, the (reg X) might be replaced with a
multiplication, in which case we'd have a MADD operation and the
addition is effectively free.  Something similar would happen if
(reg X) became a shift by a small amount on newer cores, although
I guess then you could argue either that the cost of the add
disappears or that the cost of the shift disappears.

But we shouldn't count ADD as free on the basis that it could be
combined with a multiplication or shift in future.  We have to cost
what we're given.  I think the same thing applies to the high part.

Here we're trying to prevent cse1 from replacing a DUP (lane) with
a MOVI by saying that the DUP is strictly cheaper than the MOVI.
I don't think that's really true though, and the cost tables in the
patch say that DUP is more expensive (rather than less expensive)
than MOVI.

Also, if I've understood correctly, it looks like we'd be relying
on the vget_high of a constant remaining unfolded until RTL cse1.
I think it's likely in future that we'd try to fold vget_high
at the gimple level instead, since that could expose more
optimisations of a different kind.  The gimple optimisers would
then fold vget_high(constant) in a similar way to cse1 does now.

So perhaps we should continue to allow the vget_high(constant)
to be foloded in cse1 and come up with some way of coping with
the folded form.

Thanks,
Richard


Re: [PATCH] [PR testsuite/102857] Tweak ssa-dom-thread-7.c for aarch64.

2021-10-23 Thread Andrew Pinski via Gcc-patches
On Sat, Oct 23, 2021 at 2:15 AM Aldy Hernandez via Gcc-patches
 wrote:
>
> First, ssa-dom-thread-7 was looking at a dump file that was not
> being generated.  This probably happened in the detangling of the VRP
> threader from VRP, and I didn't notice because the test came back as
> with UNRESOLVED instead of FAIL.
>
> Second, aarch64 gets far more threads than other architectures (20
> versus 12).  The difference is sufficiently different to make the
> regex awkward.
>
> We already have special casing for aarch64 in other parts of this
> test, so perhaps it's simplest to have an arch specific test
> for the thread3 count.
>
> I don't know perhaps there's a better way.  I wake up with chills in
> the middle of the night thinking about this test ;-).

I guess you have never had a customer ask for coremark benchmark
numbers before :).
https://www.eembc.org/coremark/ (for reference).

Thanks,
Andrew Pinski


>
> Tested on x86-64 Linux and aarch64 Linux.
>
> OK?
>
> gcc/testsuite/ChangeLog:
>
> PR testsuite/102857
> * gcc.dg/tree-ssa/ssa-dom-thread-7.c: Add -fdump-tree-vrp2-stats.
> Tweak for aarch64.
> ---
>  gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c
> index 1da00a691c8..001319ab9e9 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c
> @@ -1,7 +1,6 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O2 -fdump-tree-thread1-stats -fdump-tree-thread2-stats 
> -fdump-tree-dom2-stats -fdump-tree-thread3-stats -fdump-tree-dom3-stats 
> -fdump-tree-vrp2-stats -fno-guess-branch-probability" } */
> +/* { dg-options "-O2 -fdump-tree-dom2-stats -fdump-tree-thread3-stats 
> -fdump-tree-dom3-stats -fdump-tree-vrp-thread2-stats 
> -fno-guess-branch-probability" } */
>
> -/* { dg-final { scan-tree-dump "Jumps threaded: 12"  "thread3" } } */
>  /* { dg-final { scan-tree-dump-not "Jumps threaded"  "dom2" } } */
>
>  /* aarch64 has the highest CASE_VALUES_THRESHOLD in GCC.  It's high enough
> @@ -9,6 +8,8 @@
> jump threading opportunities.  Skip the later tests on aarch64.  */
>  /* { dg-final { scan-tree-dump-not "Jumps threaded"  "dom3" { target { ! 
> aarch64*-*-* } } } } */
>  /* { dg-final { scan-tree-dump-not "Jumps threaded"  "vrp-thread2" { target 
> { ! aarch64*-*-* } } } } */
> +/* { dg-final { scan-tree-dump "Jumps threaded: 12"  "thread3" { target { ! 
> aarch64*-*-* } } } } */
> +/* { dg-final { scan-tree-dump "Jumps threaded: 20"  "thread3" { target { 
> aarch64*-*-* } } } } */
>
>  enum STATE {
>S0=0,
> --
> 2.31.1
>


[PATCH] [PR testsuite/102857] Tweak ssa-dom-thread-7.c for aarch64.

2021-10-23 Thread Aldy Hernandez via Gcc-patches
First, ssa-dom-thread-7 was looking at a dump file that was not
being generated.  This probably happened in the detangling of the VRP
threader from VRP, and I didn't notice because the test came back as
with UNRESOLVED instead of FAIL.

Second, aarch64 gets far more threads than other architectures (20
versus 12).  The difference is sufficiently different to make the
regex awkward.

We already have special casing for aarch64 in other parts of this
test, so perhaps it's simplest to have an arch specific test
for the thread3 count.

I don't know perhaps there's a better way.  I wake up with chills in
the middle of the night thinking about this test ;-).

Tested on x86-64 Linux and aarch64 Linux.

OK?

gcc/testsuite/ChangeLog:

PR testsuite/102857
* gcc.dg/tree-ssa/ssa-dom-thread-7.c: Add -fdump-tree-vrp2-stats.
Tweak for aarch64.
---
 gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c 
b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c
index 1da00a691c8..001319ab9e9 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c
@@ -1,7 +1,6 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-thread1-stats -fdump-tree-thread2-stats 
-fdump-tree-dom2-stats -fdump-tree-thread3-stats -fdump-tree-dom3-stats 
-fdump-tree-vrp2-stats -fno-guess-branch-probability" } */
+/* { dg-options "-O2 -fdump-tree-dom2-stats -fdump-tree-thread3-stats 
-fdump-tree-dom3-stats -fdump-tree-vrp-thread2-stats 
-fno-guess-branch-probability" } */
 
-/* { dg-final { scan-tree-dump "Jumps threaded: 12"  "thread3" } } */
 /* { dg-final { scan-tree-dump-not "Jumps threaded"  "dom2" } } */
 
 /* aarch64 has the highest CASE_VALUES_THRESHOLD in GCC.  It's high enough
@@ -9,6 +8,8 @@
jump threading opportunities.  Skip the later tests on aarch64.  */
 /* { dg-final { scan-tree-dump-not "Jumps threaded"  "dom3" { target { ! 
aarch64*-*-* } } } } */
 /* { dg-final { scan-tree-dump-not "Jumps threaded"  "vrp-thread2" { target { 
! aarch64*-*-* } } } } */
+/* { dg-final { scan-tree-dump "Jumps threaded: 12"  "thread3" { target { ! 
aarch64*-*-* } } } } */
+/* { dg-final { scan-tree-dump "Jumps threaded: 20"  "thread3" { target { 
aarch64*-*-* } } } } */
 
 enum STATE {
   S0=0,
-- 
2.31.1



Re: [PATCH] Objective-C: fix protocol list count type (pertinent to non-LP64)

2021-10-23 Thread Iain Sandoe via Gcc-patches
Hi Matt,

sorry for slow response,
unavoidable external factors have been keeping me away from the computer (for 
both $dayjob and and volunteer stuff).

> On 20 Oct 2021, at 04:51, Matt Jacobson via Gcc-patches 
>  wrote:
> 
> 
>> On Sep 26, 2021, at 11:45 PM, Matt Jacobson  wrote:
>> 
>> Fix protocol list layout for non-LP64.  clang and objc4 both give the 
>> `count` 
>> field as `long`, not `intptr_t`.  Those are the same on LP64, but not 
>> everywhere.  For non-LP64, this fixes binary compatibility with clang-built 
>> classes.
>> 
>> This was more complicated than I anticipated, because the relevant frontend 
>> code in fact had no AST type for `protocol_list_t`, instead emitting 
>> protocol 
>> lists as `protocol_t[]`, with the zeroth element actually being the integer 
>> count.  That made it nontrivial to change the count to `long`.  With this 
>> change, there is now a true `protocol_list_t` type in the AST.
>> 
>> Tested multiple ways.  On x86_64/Darwin, I confirmed with a test program 
>> that 
>> protocol conformances by classes, categories, and protocols works.  On AVR, 
>> I 
>> manually inspected the generated assembly to confirm that protocol lists 
>> gain 
>> an extra two bytes of `count`, matching clang.
>> 
>> Thank you for your time.
>> 
>> 
> 
> Friendly ping.  Please let me know if there’s anything I can clarify.

The patch is in my queue (it will not get lost), the rationale seems reasonable,
Iain



Re: [PATCH] Try to resolve paths in threader without looking further back.

2021-10-23 Thread Aldy Hernandez via Gcc-patches




On 10/22/21 5:59 PM, Martin Sebor wrote:

On 10/22/21 9:18 AM, Aldy Hernandez wrote:

On Fri, Oct 22, 2021 at 4:27 PM Martin Sebor  wrote:


On 10/22/21 5:22 AM, Aldy Hernandez wrote:

On Thu, Oct 21, 2021 at 4:51 PM Martin Sebor  wrote:



(By the way, I don't see range info in the access pass at -O0.
Should I?)


I assume you mean you don't see anything in the dump files.


I mean that I don't get accurate range info from the ranger
instance in any function.  I'd like the example below to trigger
a warning even at -O0 but it doesn't because n's range is
[0, UINT_MAX] instead of [7, UINT_MAX]:

    char a[4];

    void f (unsigned n)
    {
  if (n < 7)
    n = 7;
  __builtin_memset (a, 0, n);
    }


Breakpoint 5, get_size_range (query=0x0, bound=, range=0x7fffda10,
 bndrng=0x7fffdc98) at
/home/aldyh/src/gcc/gcc/gimple-ssa-warn-access.cc:1196
(gdb) p debug_ranger()
;; Function f

=== BB 2 
Imports: n_3(D)
Exports: n_3(D)
n_3(D)    unsigned int VARYING
  :
 if (n_3(D) <= 6)
   goto ; [INV]
 else
   goto ; [INV]

2->3  (T) n_3(D) : unsigned int [0, 6]
2->4  (F) n_3(D) : unsigned int [7, +INF]

=== BB 3 
  :
 n_4 = 7;

n_4 : unsigned int [7, 7]

=== BB 4 
  :
 # n_2 = PHI 
 _1 = (long unsigned int) n_2;
 __builtin_memset (, 0, _1);
 return;

_1 : long unsigned int [7, 4294967295]
n_2 : unsigned int [7, +INF]
Non-varying global ranges:
=:
_1  : long unsigned int [7, 4294967295]
n_2  : unsigned int [7, +INF]
n_4  : unsigned int [7, 7]

 From the above it looks like _1 at BB4 is [7, 4294967295].


Great!


  You probably want:

   range_of_expr (r, tree_for_ssa_1, gimple_for_the_memset_call)


That's what the function does.  But its caller doesn't have
access to the Gimple statement so it passes in null instead.
Presumably without it, range_of_expr() doesn't have enough
context to know what BB I'm asking about.  It does work
without the statement at -O but then there's just one BB
(the if statement becomes a MAX_EXPR) so there's just one
range.



BTW, debug_ranger() tells you everything ranger would know for the
given IL.  It's meant as a debugging aid.  You may want to look at
it's source to see how it calls the ranger.


Thanks for the tip.  I should do that.  There's a paradigm
shift from the old ways of working with ranges and the new
way, and it will take a bit of adjusting to.  I just haven't
spent enough time working with Ranger to be there.  But this
exchange alone was already very helpful!



You can query the ranger on any point in the IL.  However, if you don't 
give it context, it'll just return the globally known range.  In this 
case it'll be the global SSA range, which is unset because the usual 
setters haven't run at -O0 (evrp, VRP*).  So yes, you need to pass it 
correct context.


Yes, it's a paradigm shift.  The evrp engine worked by pushing state as 
you did a dom walk, so you could ask for SSA ranges that were specific 
to the path sensitive point you were in the walk.  The ranger is far 
more versatile, in which you can ask for a range on an edge, block, or 
statement, regardless of how you're iterating through the gimple.


You can also use the ranger to indirectly tell you about reachability. 
For example, if you ask for the range of x_6 at a point in the IL and 
the answer comes out as UNDEFINED, that point is unreachable.  That is, 
assuming x_6 is considered live at the query point.  IIRC, if you ask 
for x_6  at a point not dominated by the definition of x_6, the ranger 
will also return UNDEFINED.


The ranger API is designed to be minimal and simple:

bool range_of_stmt (irange , gimple *, tree name = NULL);
bool range_of_expr (irange , tree name, gimple * = NULL);
bool range_on_edge (irange , edge e, tree name);
void range_on_exit (irange , basic_block bb, tree name);

Andrew keeps threatening he'll write up some articles this year on the 
ranger and its reusable components. *prod* :)


Aldy



Cannot reproduce – Re: [r12-4632 Regression] FAIL: gfortran.dg/bind-c-intent-out-2.f90 -Os (test for excess errors) on Linux/x86_64

2021-10-23 Thread Tobias Burnus

Hi,

for some reasons, I cannot reproduce this. I checked with that I am in
sync with master – and I also tried -m32 and -march=cascadelake, running
both manually and via DejaGNU but I it passes here.

Can someone who sees it show the excess error? Or was that a spurious
issue which is now  gone?

Tobias

On 23.10.21 06:43, sunil.k.pandey wrote:

On Linux/x86_64,

24e99e6ec1cc57f3660c00ff677c7feb16aa94d2 is the first bad commit
commit 24e99e6ec1cc57f3660c00ff677c7feb16aa94d2
Author: Tobias Burnus 
Date:   Fri Oct 22 23:23:06 2021 +0200

 Fortran: Avoid running into assert with -fcheck= + UBSAN

caused

FAIL: gfortran.dg/bind-c-intent-out-2.f90   -O0  (test for excess errors)
FAIL: gfortran.dg/bind-c-intent-out-2.f90   -O1  (test for excess errors)
FAIL: gfortran.dg/bind-c-intent-out-2.f90   -O2  (test for excess errors)
FAIL: gfortran.dg/bind-c-intent-out-2.f90   -O3 -fomit-frame-pointer 
-funroll-loops -fpeel-loops -ftracer -finline-functions  (test for excess 
errors)
FAIL: gfortran.dg/bind-c-intent-out-2.f90   -O3 -g  (test for excess errors)
FAIL: gfortran.dg/bind-c-intent-out-2.f90   -Os  (test for excess errors)

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-4632/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg.exp=gfortran.dg/bind-c-intent-out-2.f90 --target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg.exp=gfortran.dg/bind-c-intent-out-2.f90 --target_board='unix{-m32\ 
-march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg.exp=gfortran.dg/bind-c-intent-out-2.f90 --target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg.exp=gfortran.dg/bind-c-intent-out-2.f90 --target_board='unix{-m64\ 
-march=cascadelake}'"

(Please do not reply to this email, for question about this report, contact me 
at skpgkp2 at gmail dot com)

-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955