RE: [PATCH PR94442] [AArch64] Redundant ldp/stp instructions emitted at -O3

2020-11-10 Thread xiezhiheng
> -Original Message-
> From: Richard Sandiford [mailto:richard.sandif...@arm.com]
> Sent: Tuesday, November 10, 2020 7:54 PM
> To: xiezhiheng 
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH PR94442] [AArch64] Redundant ldp/stp instructions
> emitted at -O3
> 
> xiezhiheng  writes:
> >> -Original Message-
> >> From: Richard Sandiford [mailto:richard.sandif...@arm.com]
> >> Sent: Tuesday, November 3, 2020 9:57 PM
> >> To: xiezhiheng 
> >> Cc: gcc-patches@gcc.gnu.org
> >> Subject: Re: [PATCH PR94442] [AArch64] Redundant ldp/stp instructions
> >> emitted at -O3
> >>
> >> Thanks, I pushed both patches to trunk.
> >>
> >
> > Thanks.  And I made two separate patches for these two groups, tbl/tbx
> intrinsics and
> > the rest of the arithmetic operation intrinsics.
> >
> > Note: It does not matter which patch is applied first.
> 
> I pushed the TBL/TBX one, but on the other patch:
> 
> > @@ -297,7 +297,7 @@
> >BUILTIN_VSDQ_I (USHIFTIMM, uqshl_n, 0, ALL)
> >
> >/* Implemented by aarch64_reduc_plus_.  */
> > -  BUILTIN_VALL (UNOP, reduc_plus_scal_, 10, ALL)
> > +  BUILTIN_VALL (UNOP, reduc_plus_scal_, 10, FP)
> 
> This is defined for integer and FP modes, so I think it should be
> NONE instead of FP.  We'll automatically add FLAGS_FP based on the
> mode where necessary.
> 

Sorry, and I have revised a new patch.
Bootstrapped and tested on aarch64 Linux platform.

Thanks,
Xie Zhiheng


diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 75092451216..d6a49d65214 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,9 @@
+2020-11-11  Zhiheng Xie  
+   Nannan Zheng  
+
+   * config/aarch64/aarch64-simd-builtins.def: Add proper FLAG
+   for arithmetic operation intrinsics.
+



arithmetic-operation-v2.patch
Description: arithmetic-operation-v2.patch


Re: [PATCH 1/3] Refactor copying decl section names

2020-11-10 Thread Alan Modra via Gcc-patches
On Tue, Nov 10, 2020 at 09:19:37PM -0700, Jeff Law wrote:
> I think the const char * is fine.  That should force resolution to the
> same routine we were using earlier.  Do you want to commit that fix or
> shall I?

Commited 693a79a355e1.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH 1/3] Refactor copying decl section names

2020-11-10 Thread Jeff Law via Gcc-patches


On 11/10/20 8:57 PM, Alan Modra wrote:
> On Tue, Nov 10, 2020 at 11:45:37AM -0700, Jeff Law wrote:
>> On 11/12/19 11:28 PM, Strager Neds wrote:
>>> * gcc/cgraph.h (symtab_node::get_section): Constify.
>>> (symtab_node::set_section): Declare new overload.
>>> * gcc/symtab.c (symtab_node::set_section): Define new overload.
>>> (symtab_node::copy_visibility_from): Use new overload of
>>> symtab_node::set_section.
>>> (symtab_node::resolve_alias): Same.
>>> * gcc/tree.h (set_decl_section_name): Declare new overload.
>>> * gcc/tree.c (set_decl_section_name): Define new overload.
>>> * gcc/c/c-decl.c (merge_decls): Use new overload of
>>> set_decl_section_name.
>>> * gcc/cp/decl.c (duplicate_decls): Same.
>>> * gcc/cp/method.c (use_thunk): Same.
>>> * gcc/cp/optimize.c (maybe_clone_body): Same.
>>> * gcc/d/decl.cc (finish_thunk): Same.
>>> * gcc/tree-emutls.c (get_emutls_init_templ_addr): Same.
>>> * gcc/cgraphclones.c (cgraph_node::create_virtual_clone): Use new
>>> overload of symtab_node::set_section.
>>> (cgraph_node::create_version_clone_with_body): Same.
>>> * gcc/trans-mem.c (ipa_tm_create_version): Same.
>> I adjusted the ChangeLog, added an entry for the coroutines.cc addition
>> I made and pushed this to the trunk.
> /home/alan/src/gcc/gcc/go/go-gcc.cc:2759:44: error: call of overloaded 
> ‘set_decl_section_name(tree_node*&, std::nullptr_t)’ is ambiguous
>set_decl_section_name (var_decl, NULL);
> ^
> In file included from /home/alan/src/gcc/gcc/go/go-gcc.cc:27:0:
> /home/alan/src/gcc/gcc/tree.h:4283:13: note: candidate: void 
> set_decl_section_name(tree, const char*)
>  extern void set_decl_section_name (tree, const char *);
>  ^
> /home/alan/src/gcc/gcc/tree.h:4284:13: note: candidate: void 
> set_decl_section_name(tree, const_tree)
>  extern void set_decl_section_name (tree, const_tree);
>  ^
>
> I'm using the obvious to me "(const char *) NULL" in the call to fix
> this, but you might like a different style C++ fix instead.

Ugh.  I don't typically build go.  Sorry about that.  Funny thing is I
saw the NULL when I was looking for other instances of
set_decl_section_name that should have been converted.  But the fact
that it would trigger an ambiguous overload didn't cross my mind.


I think the const char * is fine.  That should force resolution to the
same routine we were using earlier.  Do you want to commit that fix or
shall I?


jeff



Re: [PATCH] CFI-handling : Add a hook to allow target-specific Personality and LSDA indirections.

2020-11-10 Thread Jeff Law via Gcc-patches


On 11/7/20 8:41 AM, Iain Sandoe wrote:
> Hi
>
> ** Actually, this was originally posted during last stage-1, but I forgot to 
> keep
>pinging it…
>
> 
>
> At present, the output of .cfi_personality and .cfi_lsda assumes
> ELF semantics for indirections.  This isn't suitable for all targets
> and is one blocker to moving Darwin to use .cfi_.
>
> The patch adds a target hook that allows non-ELF targets to use
> indirections appropriate to their needs.
>
> tested across the Darwin range and on x86_64-linux-gnu,
> OK for master?
> thanks
> Iain
>
> --
>
> gcc/ChangeLog:
>
>   * config/darwin-protos.h (darwin_make_eh_symbol_indirect): New.
>   * config/darwin.c (darwin_make_eh_symbol_indirect): New. Use
>   Mach-O semantics for personality and ldsa indirections.
>   * config/darwin.h (TARGET_ASM_MAKE_EH_SYMBOL_INDIRECT): New.
>   * doc/tm.texi: Regenerate.
>   * doc/tm.texi.in: Add TARGET_ASM_MAKE_EH_SYMBOL_INDIRECT hook.
>   * dwarf2out.c (dwarf2out_do_cfi_startproc): If the target defines
>   a hook for indirecting personality and ldsa references, use that
>   otherwise default to ELF semantics.
>   * target.def (make_eh_symbol_indirect): New target hook.

OK

jeff




Re: [PATCH] Objective-C++ : Allow prefix attrs on linkage specs.

2020-11-10 Thread Jeff Law via Gcc-patches


On 11/7/20 8:11 AM, Iain Sandoe wrote:
> Hi,
>
> For Objective-C++/C, we cater for the possibility that a class interface
> (@interface) might be preceded by prefix attributes.  In the case of
> Objective-C++, the reference implementation (a.k.a. clang) also allows
> (and combines) prefix attributes that precede a linkage specification
> (but only on a single decl).
>
> Some discussion with Nathan here:
> https://gcc.gnu.org/pipermail/gcc/2020-October/234057.html
>
> The upshot is that clang’s behaviour is inconsistent (I can file a bug,
> I guess) - but since what is “well-formed” for Objective-C is defined in
> reality by what clang accepts - there is a body of code out there that
> depends on the behaviour (some variant of Hyrum’s law, or corollary
> to it, perhaps?).
>
> Inability to parse code including these patterns is blocking progress
> in modernising GNU Objective-C.. so I need to find a way forward.
>
> The compromise made here is to accept the sequence when parsing
> for Objective-C++, and to warn** that the attributes are discarded otherwise.
>
> This seems to me to be an improvement in diagnostics for regular C++
> (since it now says something pertinent to the actual problem and does
> the 'same as usual' when encountering an unhandled attribute).
>
> Tested across the Darwin patch, and on x86_64-linux-gnu,
> OK for master?
> thanks
> Iain
>
> ** trivially, that could be an error instead - but it seems we usually warn
> for unrecognised attributes.
>
> —— commit message
>
> For Objective-C++, this combines prefix attributes from before and
> after top level linkage specs.  The "reference implementation" for
> Objective-C++ allows this, and system headers depend on it.
>
> e.g.
>
> __attribute__((__deprecated__))
> extern "C" __attribute__((__visibility__("default")))
> @interface MyClass
> ...
> @end
>
> Would consider the list of prefix attributes to the interface for
> MyClass to include both the visibility and deprecated ones.
>
> When we are compiling regular C++, this emits a warning and discards
> any prefix attributes before a linkage spec.
>
> gcc/cp/ChangeLog:
>
>   * parser.c (cp_parser_declaration): Unless we are compiling for
>   Ojective-C++, warn about and discard any attributes that prefix
>   a linkage specification.

Absolutely willing to trust your judgment on ObjC++.  So, OK


jeff




Re: [PATCH][MIPS] PR target/77510 Reduce size of MIPS core automaton

2020-11-10 Thread Jeff Law via Gcc-patches


On 11/10/20 8:57 PM, Paul Hua wrote:
>>> This patch reduce reservation of model do not more than 10 cycles. The
>>> memory of genautomata down to 1GB.
>> How bad is the memory consumption before this change?
>>
> almost 2.4GB.
> see bugzilla comment #4.
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77510#c4

Thanks Paul, I didn't realize it was your patch until reviewing the BZ.


Given you've done as much on the MIPS in the last several years as
anyone and it's originally your patch, I think we should go forward with
it as long as you don't have any concerns.


jeff



Re: [PATCH 3/3] Improve efficiency of copying section from another tree

2020-11-10 Thread Jeff Law via Gcc-patches


On 11/12/19 11:29 PM, Strager Neds wrote:
> Several parts of GCC need to copy a section name from one tree (or
> symtab_node) to another. Currently, this is implemented naively:
>
> 1. Query the source's section name
> 2. Hash the section name string
> 3. Find the section_hash_entry in the symbol table
> 4. Increment the section_hash_entry's reference count
> 5. Assign the destination's section to the section_hash_entry
>
> Since we have the source's section_hash_entry, we can copy the section
> name from one symtab_node to another efficiently with the following
> algorithm:
>
> 1. Query the source's section_hash_entry
> 2. Increment the section_hash_entry's reference count
> 3. Assign the destination's section to the section_hash_entry
>
> Implement this algorithm in the overload of symtab_node::set_section
> which takes an existing symtab_node.
>
> I did not measure the performance impact of this patch. In
> particular, I do not know if this patch actually improves performance.
>
> This patch should not change behavior.
>
> Testing: Bootstrap on x86_64-linux-gnu with --disable-multilib
> --enable-checking=release --enable-languages=c,c++. Observe no change in
> test results.
>
> 2019-11-12  Matthew Glazar 
>
> * gcc/cgraph.h (symtab_node::set_section_for_node): Declare new
> overload.
> (symtab_node::set_section_from_string): Rename from set_section.
> (symtab_node::set_section_from_node): Declare.
> * gcc/symtab.c (symtab_node::set_section_for_node): Define new
> overload.
> (symtab_node::set_section_from_string): Rename from set_section.
> (symtab_node::set_section_from_node): Define.
> (symtab_node::set_section): Call renamed set_section_from_string.
> (symtab_node::set_section): Call new set_section_from_node.

I've fixed up the ChangeLog, re-tested and pushed this patch to the trunk.

Thanks again for your patience,

jeff




Re: [PATCH][MIPS] PR target/77510 Reduce size of MIPS core automaton

2020-11-10 Thread Paul Hua via Gcc-patches
> > This patch reduce reservation of model do not more than 10 cycles. The
> > memory of genautomata down to 1GB.
>
> How bad is the memory consumption before this change?
>

almost 2.4GB.
see bugzilla comment #4.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77510#c4


Re: [PATCH 1/3] Refactor copying decl section names

2020-11-10 Thread Alan Modra via Gcc-patches
On Tue, Nov 10, 2020 at 11:45:37AM -0700, Jeff Law wrote:
> 
> On 11/12/19 11:28 PM, Strager Neds wrote:
> > * gcc/cgraph.h (symtab_node::get_section): Constify.
> > (symtab_node::set_section): Declare new overload.
> > * gcc/symtab.c (symtab_node::set_section): Define new overload.
> > (symtab_node::copy_visibility_from): Use new overload of
> > symtab_node::set_section.
> > (symtab_node::resolve_alias): Same.
> > * gcc/tree.h (set_decl_section_name): Declare new overload.
> > * gcc/tree.c (set_decl_section_name): Define new overload.
> > * gcc/c/c-decl.c (merge_decls): Use new overload of
> > set_decl_section_name.
> > * gcc/cp/decl.c (duplicate_decls): Same.
> > * gcc/cp/method.c (use_thunk): Same.
> > * gcc/cp/optimize.c (maybe_clone_body): Same.
> > * gcc/d/decl.cc (finish_thunk): Same.
> > * gcc/tree-emutls.c (get_emutls_init_templ_addr): Same.
> > * gcc/cgraphclones.c (cgraph_node::create_virtual_clone): Use new
> > overload of symtab_node::set_section.
> > (cgraph_node::create_version_clone_with_body): Same.
> > * gcc/trans-mem.c (ipa_tm_create_version): Same.
> 
> I adjusted the ChangeLog, added an entry for the coroutines.cc addition
> I made and pushed this to the trunk.

/home/alan/src/gcc/gcc/go/go-gcc.cc:2759:44: error: call of overloaded 
‘set_decl_section_name(tree_node*&, std::nullptr_t)’ is ambiguous
   set_decl_section_name (var_decl, NULL);
^
In file included from /home/alan/src/gcc/gcc/go/go-gcc.cc:27:0:
/home/alan/src/gcc/gcc/tree.h:4283:13: note: candidate: void 
set_decl_section_name(tree, const char*)
 extern void set_decl_section_name (tree, const char *);
 ^
/home/alan/src/gcc/gcc/tree.h:4284:13: note: candidate: void 
set_decl_section_name(tree, const_tree)
 extern void set_decl_section_name (tree, const_tree);
 ^

I'm using the obvious to me "(const char *) NULL" in the call to fix
this, but you might like a different style C++ fix instead.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH 2/3] Refactor section name ref counting

2020-11-10 Thread Jeff Law via Gcc-patches


On 11/12/19 11:28 PM, Strager Neds wrote:
> symtab_node::set_section_for_node manages the reference count of
> section_hash_entry objects. I plan to add another function which needs
> to manage the reference count of these objects. To avoid duplicating
> code, factor the existing logic into reusable functions.
>
> This patch should not change behavior.
>
> Testing: Bootstrap on x86_64-linux-gnu with --disable-multilib
> --enable-checking=release --enable-languages=c,c++. Observe no change in
> test results.
>
> 2019-11-12  Matthew Glazar 
>
> * gcc/symtab.c (symtab_node::set_section_for_node): Extract reference
> counting logic into ...
> (retain_section_hash_entry): ... here (new function) and ...
> (release_section_hash_entry): ... here (new function).

I fixed some minor formatting problems, added function comments to the
factored-out code and re-tested this patch.  I'm pushing it to the trunk
momentarily.


jeff




Re: [Patch, microblaze]: Correct the const high double immediate value

2020-11-10 Thread Michael Eager

On 11/8/20 9:43 PM, Nagaraju Mekala wrote:
diff --git a/gcc/config/microblaze/microblaze.c 
b/gcc/config/microblaze/microblaze.c


index a0f81b7..d9341ec 100644
--- a/gcc/config/microblaze/microblaze.c
+++ b/gcc/config/microblaze/microblaze.c
@@ -2440,15 +2440,18 @@ print_operand (FILE * file, rtx op, int letter)
    else if (letter == 'h' || letter == 'j')
  {
-  long val[2];
+  long val[2], l[2];
    if (code == CONST_DOUBLE)
     {
   if (GET_MODE (op) == DFmode)
     REAL_VALUE_TO_TARGET_DOUBLE (*CONST_DOUBLE_REAL_VALUE (op), 
val);

   else
     {
- val[0] = CONST_DOUBLE_HIGH (op);
- val[1] = CONST_DOUBLE_LOW (op);
+ REAL_VALUE_TYPE rv;
+     REAL_VALUE_FROM_CONST_DOUBLE (rv, op);


REAL_VALUE_FROM_CONST_DOUBLE was removed from real.h in 2015.
Use CONST_DOUBLE_REAL_VALUE.


+ REAL_VALUE_TO_TARGET_DOUBLE (rv, l);
+ val[1] = l[WORDS_BIG_ENDIAN == 0];
+ val[0] = l[WORDS_BIG_ENDIAN != 0];
     }
     }
    else if (code == CONST_INT)



diff --git a/gcc/testsuite/gcc.target/microblaze/long.c 
b/gcc/testsuite/gcc.target/microblaze/long.c

new file mode 100644
index 000..4d45186
--- /dev/null
+++ b/gcc/testsuite/gcc.target/microblaze/long.c
@@ -0,0 +1,10 @@
+/* { dg-options "-O0" } */
+#define BASEADDR 0xF000ULL
+int main ()
+{
+  unsigned long long start;
+  start = (unsigned long long) BASEADDR;
+  return 0;
+}
+/* { dg-final { scan-assembler 
"addik\tr(\[0-9]\|\[1-2]\[0-9]\|3\[0-1]),r0,0x" } } */
+/* { dg-final { scan-assembler 
"addik\tr(\[0-9]\|\[1-2]\[0-9]\|3\[0-1]),r0,0xf000" } } */


It looks like this test case will pass without the patch.  The code 
generated before applying the patch is

addik   r4,r0,0x
addik   r5,r0,0xf000 #li => la

Can you provide a test case which fails without the patch but passes 
with the patch?


--
Michael Eager


[PATCH] RISC-V: Enable ifunc if it was supported in the binutils for linux toolchain.

2020-11-10 Thread Nelson Chu
gcc/
* configure: Regenerated.
* configure.ac: If ifunc was supported in the binutils for
linux toolchain, then set enable_gnu_indirect_function to yes.
---
 gcc/configure| 37 +
 gcc/configure.ac | 35 +++
 2 files changed, 72 insertions(+)

diff --git a/gcc/configure b/gcc/configure
index 9d2fd0d..dbda441 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -23407,6 +23407,43 @@ else
 fi
 
 
+case "${target}" in
+  riscv*-*-linux*)
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking linker ifunc IRELATIVE 
support" >&5
+$as_echo_n "checking linker ifunc IRELATIVE support... " >&6; }
+cat > conftest.s < /dev/null 2>&1 \
+   && $gcc_cv_ld -o conftest conftest.o > /dev/null 2>&1 \
+   && $gcc_cv_readelf --relocs --wide conftest \
+ | grep R_RISCV_IRELATIVE > /dev/null 2>&1; then
+  enable_gnu_indirect_function=yes
+fi
+rm -f conftest conftest.o conftest.s
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: 
$enable_gnu_indirect_function" >&5
+$as_echo "$enable_gnu_indirect_function" >&6; }
+;;
+esac
+
 gif=`if test x$enable_gnu_indirect_function = xyes; then echo 1; else echo 0; 
fi`
 
 cat >>confdefs.h <<_ACEOF
diff --git a/gcc/configure.ac b/gcc/configure.ac
index 73034bb..08f3034 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -2807,6 +2807,41 @@ Valid choices are 'yes' and 'no'.]) ;;
   esac],
  [enable_gnu_indirect_function="$default_gnu_indirect_function"])
 
+case "${target}" in
+  riscv*-*-linux*)
+AC_MSG_CHECKING(linker ifunc IRELATIVE support)
+cat > conftest.s < /dev/null 2>&1 \
+   && $gcc_cv_ld -o conftest conftest.o > /dev/null 2>&1 \
+   && $gcc_cv_readelf --relocs --wide conftest \
+ | grep R_RISCV_IRELATIVE > /dev/null 2>&1; then
+  enable_gnu_indirect_function=yes
+fi
+rm -f conftest conftest.o conftest.s
+AC_MSG_RESULT($enable_gnu_indirect_function)
+;;
+esac
+
 gif=`if test x$enable_gnu_indirect_function = xyes; then echo 1; else echo 0; 
fi`
 AC_DEFINE_UNQUOTED(HAVE_GNU_INDIRECT_FUNCTION, $gif,
 [Define if your system supports gnu indirect functions.])
-- 
2.7.4



Re: testsuite: Adjust pr96789.c to exclude vect_load_lanes

2020-11-10 Thread Kewen.Lin via Gcc-patches
Hi Richard,

Thanks for the review!

on 2020/11/10 下午7:31, Richard Sandiford wrote:
> "Kewen.Lin"  writes:
>> Hi,
>>
>> As Lyon pointed out, the newly introduced test case
>> gcc.dg/tree-ssa/pr96789.c fails on arm-none-linux-gnueabihf.
>> Loop vectorizer is able to vectorize the two loops which
>> operate on array tmp with load_lanes feature support.  It
>> makes dse3 get unexpected inputs and do nothing.
>>
>> This patch is to teach the case to respect vect_load_lanes,
>> meanwhile to guard the check only under vect_int.
> 
> I'm not sure this is the right check.  The test passes on aarch64,
> which also has load lanes, but apparently doesn't use them for this
> test.  I think the way the loop vectoriser handles the loops will
> depend a lot on target costs, which can vary in unpredictable ways.
> 

You are right, although aarch64 doesn't have this failure, it can fail
with explicit -march=armv8-a+sve.  It can vary as target features/costs
change.  The check is still fragile.

Your suggestion with -ftree-slp-vectorize below is better!

> Does it work if you instead change -ftree-vectorize to -ftree-slp-vectorize?
> Or does that defeat the purpose of the test?

It works, nice, thanks for the suggestion!

I appended one explicit -fno-tree-loop-vectorize to avoid it to fail
in case someone kicks off the testing with explicit -ftree-loop-vectorize.

The updated version is pasted below, is it ok for trunk?

BR,
Kewen
-
gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/pr96789.c: Adjusted by disabling loop vectorization.

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr96789.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr96789.c
index d6139a014d8..5704952309b 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr96789.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr96789.c
@@ -1,5 +1,8 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -funroll-loops -ftree-vectorize -fdump-tree-dse-details" 
} */
+/* Disable loop vectorization to avoid that loop vectorizer
+   optimizes those two loops that operate tmp array so that
+   subsequent dse3 won't eliminate expected tmp stores.  */
+/* { dg-options "-O2 -funroll-loops -ftree-slp-vectorize 
-fno-tree-loop-vectorize -fdump-tree-dse-details" } */

 /* Test if scalar cleanup pass takes effects, mainly check
its secondary pass DSE can remove dead stores on array


introduce overridable clear_cache emitter

2020-11-10 Thread Alexandre Oliva


This patch introduces maybe_emit_call_builtin___clear_cache for the
builtin expander machinery and the trampoline initializers to use to
clear the instruction cache, removing a source of inconsistencies and
subtle errors in low-level machinery.

I've adjusted all trampoline_init implementations that used to issue
explicit calls to __clear_cache or similar to use this new primitive.


Specifically on vxworks targets, we needed to drop the __clear_cache
symbol in libgcc, for reasons related with linking that I didn't need
to understand, and we wanted to call cacheTextUpdate directly, despite
the different calling conventions: the second argument is a length
rather than the end address.

So I introduced a target hook to enable target OS-level overriding of
builtin __clear_cache call emission, retaining nearly (*) the same
logic to govern the decision on whether to emit a call (or nothing, or
a machine-dependent insn) but enabling a call to a target
system-defined function with different calling conventions to be
issued, without having to modify .md files of the various
architectures supported by the target system to introduce or modify
clear_cache insns.

(*) I write "nearly" mainly because, when not optimizing, we'd issue a
call regardless, but since the call may now be overridden, I added it
to the set of builtins that are not directly turned into calls when
not optimizing, following the normal expansion path instead.  It
wouldn't be hard to skip the emission of cache-clearing insns when not
optimizing, but it didn't seem very important, especially for the new
uses from trampoline init.

Another difference that might be relevant is that now we expand
the begin and end arguments unconditionally.  This might make a
difference if they have side effects.  That's prettty much impossible
at expand time, but I thought I'd mention it.


I have NOT modified targets that did not issue cache-clearing calls in
trampoline init to use the new clear_cache-calling infrastructure even
if it would expand to nothing.  I have considered doing so, to have
__builtin___clear_cache and trampoline init call cacheTextUpdate on
all vxworks targets, but decided not to, since on targets that don't
do any cache clearing, cacheTextUpdate ought to be a no-op, even
though rs6000 seems to use icbi and dcbf instructions in the function
called to initialize a trampoline, but AFAICT not in the __clear_cache
builtin.  Hopefully target maintainers will have a look and take
advantage of this new piece of infrastructure to remove such
(apparent?) inconsistencies.  Not rs6000 and other that call asm-coded
trampoline setup instructions, for sure, but they might wish to
introduce a CLEAR_INSN_CACHE macro or a clear_cache expander if they
don't have one.


This was regstrapped on x86_64-linux-gnu, and a trivial backport was
tested on multiple vxworks targets.  Ok to install?


for  gcc/ChangeLog

* builtins.c (default_emit_call_builtin___clear_cache): New.
(maybe_emit_call_builtin___clear_cache): New.
(expand_builtin___clear_cache): Split into the above.
(expand_builtin): Do not issue clear_cache call any more.
* builtins.h (maybe_emit_call_builtin___clear_cache): Declare.
* config/aarch64/aarch64.c (aarch64_trampoline_init): Use
maybe_emit_call_builtin___clear_cache.
* config/arc/arc.c (arc_trampoline_init): Likewise.
* config/arm/arm.c (arm_trampoline_init): Likewise.
* config/c6x/c6x.c (c6x_initialize_trampoline): Likewise.
* config/csky/csky.c (csky_trampoline_init): Likewise.
* config/m68k/linux.h (FInALIZE_TRAMPOLINE): Likewise.
* config/tilegx/tilegx.c (tilegx_trampoline_init): Likewise.
* config/tilepro/tilepro.c (tilepro_trampoline_init): Ditto.
* config/vxworks.c: Include rtl.h, memmodel.h, and optabs.h.
(vxworks_emit_call_builtin___clear_cache): New.
* config/vxworks.h (CLEAR_INSN_CACHE): Drop.
(TARGET_EMIT_CALL_BUILTIN___CLEAR_CACHE): Define.
* target.def (trampoline_init): In the documentation, refer to
maybe_emit_call_builtin___clear_cache.
(emit_call_builtin___clear_cache): New.
* doc/tm.texi.in: Add new hook point.
(CLEAR_CACHE_INSN): Remove duplicate 'both'.
* doc/tm.texi: Rebuilt.
* targhooks.h (default_meit_call_builtin___clear_cache):
Declare.
* tree.h (BUILTIN_ASM_NAME_PTR): New.

for  libgcc/ChangeLog

* config/t-vxworks (LIB2ADD): Drop.
* config/t-vxworks7 (LIB2ADD): Likewise.
* config/vxcache.c: Remove.
---
 gcc/builtins.c   |   89 +++---
 gcc/builtins.h   |1 
 gcc/config/aarch64/aarch64.c |8 ++--
 gcc/config/arc/arc.c |8 ++--
 gcc/config/arm/arm.c |7 ++-
 gcc/config/c6x/c6x.c |8 ++--
 gcc/config/csky/csky.c   |7 ++-
 gcc/config/m68k/linux.h  |8 ++--
 

Re: [PATCH][MIPS] PR target/77510 Reduce size of MIPS core automaton

2020-11-10 Thread Xu Chenghua

Hi:

I test spec cpu 2000 on Loongson 3A1000(gs464),  there is no difference 
in performance.



On 11/10/20 11:45 PM, Paul Koning wrote:

On Nov 10, 2020, at 6:42 AM, Xu Chenghua  wrote:


Hi:

This patch reduce reservation of model do not more than 10 cycles. The memory 
of genautomata down to 1GB.

Does this make any significant difference in performance of the generated code? 
 The original cycle counts are from the CPU description, so trimming them to a 
max of 10 means the model no longer reflects the real implementation.  It may 
be the difference is too small to worry about.  Can you comment on this?

paul





Re: [PATCH][MIPS] PR target/77510 Reduce size of MIPS core automaton

2020-11-10 Thread Paul Koning via Gcc-patches
> On Nov 10, 2020, at 6:09 PM, Jeff Law via Gcc-patches 
>  wrote:
> 
>> ...
>> ChangeLog
>> 
>> gcc/
>> PR target/77510
>> * config/mips/gs464.md: Reduce reservation
>> duration to 10 cycles.
>> * config/mips/i6400.md: Likewise.
>> * config/mips/m5100.md: Likewise.
>> * config/mips/p5600.md: Likewise.
>> * config/mips/p6600.md: Likewise.
>> * config/mips/xlp.md: Likewise.
>> * config/mips/xlr.md: Likewise.
> 
> I doubt there's a measurable performance penalty for not fully modeling
> the latency of these instructions, but I'm quite hesitant to ACK this
> change without hearing from others in the MIPS space.   Alternately, if
> the affected models aren't being sold anymore, then I wouldn't worry
> about getting feedback on them.
> 
> 
> Jeff

XLR and XLP are listed as active products on their manufacturer's website 
(Broadcom).  I don't know about the others.  

paul




Re: [PATCH] c++: Improve static_assert diagnostic [PR97518]

2020-11-10 Thread Marek Polacek via Gcc-patches
On Tue, Nov 10, 2020 at 02:30:30PM -0500, Jason Merrill via Gcc-patches wrote:
> On 11/10/20 2:28 PM, Marek Polacek wrote:
> > On Tue, Nov 10, 2020 at 02:15:56PM -0500, Jason Merrill wrote:
> > > On 11/10/20 1:59 PM, Marek Polacek wrote:
> > > > On Tue, Nov 10, 2020 at 11:32:04AM -0500, Jason Merrill wrote:
> > > > > On 11/9/20 7:21 PM, Marek Polacek wrote:
> > > > > > Currently, when a static_assert fails, we only say "static 
> > > > > > assertion failed".
> > > > > > It would be more useful if we could also print the expression that
> > > > > > evaluated to false; this is especially useful when the condition 
> > > > > > uses
> > > > > > template parameters.  Consider the motivating example, in which we 
> > > > > > have
> > > > > > this line:
> > > > > > 
> > > > > >  static_assert(is_same::value);
> > > > > > 
> > > > > > if this fails, the user has to play dirty games to get the compiler 
> > > > > > to
> > > > > > print the template arguments.  With this patch, we say:
> > > > > > 
> > > > > >  static assertion failed due to requirement 'is_same > > > > > int>::value'
> > > > > 
> > > > > I'd rather avoid the word "requirement" here, to avoid confusion with
> > > > > Concepts.
> > > > > 
> > > > > Maybe have the usual failed error, and if we're showing the 
> > > > > expression, have
> > > > > a second inform to say e.g. "%qE evaluates to false"?
> > > > 
> > > > Works for me.
> > > > 
> > > > > Also, if we've narrowed the problem down to a particular 
> > > > > subexpression,
> > > > > let's only print that one.
> > > > 
> > > > Done.
> > > > 
> > > > > > which I think is much better.  However, always printing the 
> > > > > > condition that
> > > > > > evaluated to 'false' wouldn't be very useful: e.g. noexcept(fn) is
> > > > > > always parsed to true/false, so we would say "static assertion 
> > > > > > failed due
> > > > > > to requirement 'false'" which doesn't help.  So I wound up only 
> > > > > > printing
> > > > > > the condition when it was instantiation-dependent, that is, we 
> > > > > > called
> > > > > > finish_static_assert from tsubst_expr.
> > > > > > 
> > > > > > Moreover, this patch also improves the diagnostic when the condition
> > > > > > consists of a logical AND.  Say you have something like this:
> > > > > > 
> > > > > >  static_assert(fn1() && fn2() && fn3() && fn4() && fn5());
> > > > > > 
> > > > > > where fn4() evaluates to false and the other ones to true.  
> > > > > > Highlighting
> > > > > > the whole thing is not that helpful because it won't say which 
> > > > > > clause
> > > > > > evaluated to false.  With the find_failing_clause tweak in this 
> > > > > > patch
> > > > > > we emit:
> > > > > > 
> > > > > >  error: static assertion failed
> > > > > >6 | static_assert(fn1() && fn2() && fn3() && fn4() && fn5());
> > > > > >  |  ~~~^~
> > > > > > 
> > > > > > so you know right away what's going on.  Unfortunately, when you 
> > > > > > combine
> > > > > > both things, that is, have an instantiation-dependent expr and && in
> > > > > > a static_assert, we can't yet quite point to the clause that 
> > > > > > failed.  It
> > > > > > is because when we tsubstitute something like is_same::value, 
> > > > > > we
> > > > > > generate a VAR_DECL that doesn't have any location.  It would be 
> > > > > > awesome
> > > > > > if we could wrap it with a location wrapper, but I didn't see 
> > > > > > anything
> > > > > > obvious.
> > > > > 
> > > > > Hmm, I vaguely remember that at first we were using location wrappers 
> > > > > less
> > > > > in templates, but I thought that was fixed.  I don't see anything 
> > > > > setting
> > > > > suppress_location_wrappers, and it looks like tsubst_copy_and_build 
> > > > > should
> > > > > preserve a location wrapper.
> > > > 
> > > > The problem here is that tsubst_qualified_id produces a VAR_DECL and 
> > > > for those
> > > > CAN_HAVE_LOCATION_P is false.
> > > 
> > > Ah, then perhaps tsubst_qualified_id should call maybe_wrap_with_location 
> > > to
> > > preserve the location of the SCOPE_REF.
> > 
> > The SCOPE_REF's location is fine, we just throw it away and return a 
> > VAR_DECL.
> 
> Yes, I'm saying that tsubst_qualified_id should extract the location from
> the SCOPE_REF and pass it to maybe_wrap_with_location to give the VAR_DECL a
> location wrapper.

Nevermind, I wasn't checking the return value of maybe_wrap_with_location...
Maybe we should childproof it by marking it with WARN_UNUSED_RESULT.

Anyway, this patch does the tsubst_qualified_id tweak.  With that, the
static_assert diagnostic with && is pretty spot on!

Relatedly, don't create useless location wrappers for temporary variables.
Since they are compiler-generated and ignored, nobody should be interested
in their location in the source file.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
Retain the location when tsubstituting a qualified-id so that our
static_assert 

Re: [Patch, testsuite]: Update MicroBlaze strings test

2020-11-10 Thread Jeff Law via Gcc-patches


On 11/8/20 10:58 PM, Nagaraju Mekala wrote:
> Hello All,
>
> for new scan-assembly output resulting in use of $LC label
>
> gcc/testsuite/ChangeLog:
> * gcc.target/microblaze/others/strings1.c: Update
> to include $LC label.

Thanks.  Installed on the trunk.

jeff




[committed] libstdc++: Implement std::emit_on_flush etc.

2020-11-10 Thread Jonathan Wakely via Gcc-patches
This adds the manipulators for use with basic_osyncstream. In order to
detect when an arbitrary basic_ostream is the base class of a
basic_syncbuf object, introduce a new intermediate base class
that stores the data members. The new base class stores a pointer and
two bools, which wastes (sizeof(void*) - 2) bytes of padding. It would
be possible to use the two least significant bits of the pointer for the
two bools, at least for targets where alignof(basic_streambuf) > 2, but
that's left as a possible change for a future date.

Also define basic_syncbuf::overflow to override the virtual function in
the base class, so that single characters can be inserted into the
stream buffer. Previously the default basic_streambuf::overflow
implementation was used, which drops the character on the floor.

libstdc++-v3/ChangeLog:

* include/std/ostream (__syncbuf_base): New class template.
(emit_on_flush, noemit_on_flush, flush_emit): New manipulators.
* include/std/syncstream (basic_syncbuf): Derive from
__syncbuf_base instead of basic_streambuf.
(basic_syncbuf::operator=): Remove self-assignment check.
(basic_syncbuf::swap): Remove self-swap check.
(basic_syncbuf::emit): Do not skip pubsync() call if sequence
is empty.
(basic_syncbuf::sync): Remove no-op pubsync on stringbuf.
(basic_syncbuf::overflow): Define override.
* testsuite/27_io/basic_syncstream/basic_ops/1.cc: Test
basic_osyncstream::put(char_type).
* testsuite/27_io/basic_ostream/emit/1.cc: New test.

Tested powerpc64le-linux. Committed to trunk.

commit ecba8547dd398ad4b627756013dbd22be417d4da
Author: Jonathan Wakely 
Date:   Wed Nov 11 00:19:40 2020

libstdc++: Implement std::emit_on_flush etc.

This adds the manipulators for use with basic_osyncstream. In order to
detect when an arbitrary basic_ostream is the base class of a
basic_syncbuf object, introduce a new intermediate base class
that stores the data members. The new base class stores a pointer and
two bools, which wastes (sizeof(void*) - 2) bytes of padding. It would
be possible to use the two least significant bits of the pointer for the
two bools, at least for targets where alignof(basic_streambuf) > 2, but
that's left as a possible change for a future date.

Also define basic_syncbuf::overflow to override the virtual function in
the base class, so that single characters can be inserted into the
stream buffer. Previously the default basic_streambuf::overflow
implementation was used, which drops the character on the floor.

libstdc++-v3/ChangeLog:

* include/std/ostream (__syncbuf_base): New class template.
(emit_on_flush, noemit_on_flush, flush_emit): New manipulators.
* include/std/syncstream (basic_syncbuf): Derive from
__syncbuf_base instead of basic_streambuf.
(basic_syncbuf::operator=): Remove self-assignment check.
(basic_syncbuf::swap): Remove self-swap check.
(basic_syncbuf::emit): Do not skip pubsync() call if sequence
is empty.
(basic_syncbuf::sync): Remove no-op pubsync on stringbuf.
(basic_syncbuf::overflow): Define override.
* testsuite/27_io/basic_syncstream/basic_ops/1.cc: Test
basic_osyncstream::put(char_type).
* testsuite/27_io/basic_ostream/emit/1.cc: New test.

diff --git a/libstdc++-v3/include/std/ostream b/libstdc++-v3/include/std/ostream
index 9a80adf3a5ac..c203e31d7c9d 100644
--- a/libstdc++-v3/include/std/ostream
+++ b/libstdc++-v3/include/std/ostream
@@ -776,6 +776,73 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   __ret_os << __x;
   return __ret_os;
 }
+
+#if __cplusplus > 201703L && _GLIBCXX_USE_CXX11_ABI
+  template
+class __syncbuf_base : public basic_streambuf<_CharT, _Traits>
+{
+public:
+  static bool*
+  _S_get(basic_streambuf<_CharT, _Traits>* __buf) noexcept
+  {
+   if (auto __p = dynamic_cast<__syncbuf_base*>(__buf))
+ return &__p->_M_emit_on_sync;
+   return nullptr;
+  }
+
+protected:
+  __syncbuf_base(basic_streambuf<_CharT, _Traits>* __w = nullptr)
+  : _M_wrapped(__w)
+  { }
+
+  basic_streambuf<_CharT, _Traits>* _M_wrapped = nullptr;
+  bool _M_emit_on_sync = false;
+  bool _M_needs_sync = false;
+};
+
+  template
+inline basic_ostream<_CharT, _Traits>&
+emit_on_flush(basic_ostream<_CharT, _Traits>& __os)
+{
+  if (bool* __flag = __syncbuf_base<_CharT, _Traits>::_S_get(__os.rdbuf()))
+   *__flag = true;
+  return __os;
+}
+
+  template
+inline basic_ostream<_CharT, _Traits>&
+noemit_on_flush(basic_ostream<_CharT, _Traits>& __os)
+{
+  if (bool* __flag = __syncbuf_base<_CharT, _Traits>::_S_get(__os.rdbuf()))
+   *__flag = false;
+  return __os;
+}
+
+  template
+inline basic_ostream<_CharT, 

Re: [PATCH] Add a new pattern in 4-insn combine

2020-11-10 Thread Jeff Law via Gcc-patches


On 11/8/20 7:48 PM, HAO CHEN GUI via Gcc-patches wrote:
> Hi,
>
> This patch adds a new pattern in 4-insn combine. It supports the
> following sign_extend(op: zero_extend, zero_extend) optimization. In
> the patch, newpat is split twice. The first split becomes newi1pat and
> the second becomes newi2pat. They replace i1, i2 and i3 if all of them
> can be recognized.
>
> 7: r126:SI=zero_extend([r123:DI+0x1])
> 6: r125:SI=zero_extend([r123:DI])
> 8: r127:SI=r125:SI+r126:SI
> 9: r124:DI=sign_extend(r127:SI)
>
> are replaced by:
>
> 7: r125:DI=zero_extend([r123:DI])
> 8: r127:DI=zero_extend([r123:DI+0x1])
> 9: r124:DI=r127:DI+r125:DI
>
> The attachments are the patch diff file and change log file.
>
> Bootstrapped and tested on powerpc64le-linux-gnu with no regressions. 
> Is this okay for trunk? Any recommendations? Thanks a lot.
>
>
>
> ChangeLog
>
>   * combine.c (combine_validate_cost): Add an argument for newi1pat.
>   (try_combine): Add a 4-insn combine pattern for optimizing rtx
>   sign_extend (op:zero_extend, zero_extend).

It'd be nice to see motivating examples.  Depending on their structure,
we may get better results cleaning things up with match.pd patterns.  We
already have some which work in this space.


Whether or not the combiner change itself moves forward is up to Segher
though.

jeff

>
> patch.diff
>
> diff --git a/gcc/combine.c b/gcc/combine.c
> index c88382efbd3..73259e6a9ed 100644
> --- a/gcc/combine.c
> +++ b/gcc/combine.c
> @@ -851,10 +851,11 @@ do_SUBST_LINK (struct insn_link **into, struct 
> insn_link *newval)
>  
>  static bool
>  combine_validate_cost (rtx_insn *i0, rtx_insn *i1, rtx_insn *i2, rtx_insn 
> *i3,
> -rtx newpat, rtx newi2pat, rtx newotherpat)
> +rtx newpat, rtx newi2pat, rtx newotherpat,
> +rtx newi1pat)
>  {
>int i0_cost, i1_cost, i2_cost, i3_cost;
> -  int new_i2_cost, new_i3_cost;
> +  int new_i1_cost, new_i2_cost, new_i3_cost;
>int old_cost, new_cost;
>  
>/* Lookup the original insn_costs.  */
> @@ -915,6 +916,20 @@ combine_validate_cost (rtx_insn *i0, rtx_insn *i1, 
> rtx_insn *i2, rtx_insn *i3,
>new_i2_cost = 0;
>  }
>  
> +  if (newi1pat)
> +{
> + tmp = PATTERN (i1);
> + PATTERN (i1) = newi1pat;
> + tmpi = INSN_CODE (i1);
> + INSN_CODE (i1) = -1;
> + new_i1_cost = insn_cost (i1, optimize_this_for_speed_p);
> + PATTERN (i1) = tmp;
> + INSN_CODE (i1) = tmpi;
> + new_cost = new_i1_cost > 0 ? new_i1_cost + new_cost : 0;
> +}
> +  else
> +new_i1_cost = 0;
> +
>if (undobuf.other_insn)
>  {
>int old_other_cost, new_other_cost;
> @@ -958,7 +973,10 @@ combine_validate_cost (rtx_insn *i0, rtx_insn *i1, 
> rtx_insn *i2, rtx_insn *i3,
>   fprintf (dump_file, "%d + ", i1_cost);
>fprintf (dump_file, "%d + %d = %d\n", i2_cost, i3_cost, old_cost);
>  
> -  if (newi2pat)
> +  if (newi1pat)
> + fprintf (dump_file, "replacement costs %d + %d + %d = %d\n",
> +  new_i1_cost, new_i2_cost, new_i3_cost, new_cost);
> +  else if (newi2pat)
>   fprintf (dump_file, "replacement costs %d + %d = %d\n",
>new_i2_cost, new_i3_cost, new_cost);
>else
> @@ -973,7 +991,10 @@ combine_validate_cost (rtx_insn *i0, rtx_insn *i1, 
> rtx_insn *i2, rtx_insn *i3,
>INSN_COST (i3) = new_i3_cost;
>if (i1)
>  {
> -  INSN_COST (i1) = 0;
> +  if (newi1pat)
> + INSN_COST (i1) = new_i1_cost;
> +  else
> + INSN_COST (i1) = 0;
>if (i0)
>   INSN_COST (i0) = 0;
>  }
> @@ -2672,7 +2693,7 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, 
> rtx_insn *i0,
>int *new_direct_jump_p, rtx_insn *last_combined_insn)
>  {
>/* New patterns for I3 and I2, respectively.  */
> -  rtx newpat, newi2pat = 0;
> +  rtx newpat, newi2pat = 0, newi1pat = 0;
>rtvec newpat_vec_with_clobbers = 0;
>int substed_i2 = 0, substed_i1 = 0, substed_i0 = 0;
>/* Indicates need to preserve SET in I0, I1 or I2 in I3 if it is not
> @@ -2682,8 +2703,9 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, 
> rtx_insn *i0,
>int total_sets;
>/* Nonzero if I2's or I1's body now appears in I3.  */
>int i2_is_used = 0, i1_is_used = 0;
> -  /* INSN_CODEs for new I3, new I2, and user of condition code.  */
> +  /* INSN_CODEs for new I3, new I2, new I1 and user of condition code.  */
>int insn_code_number, i2_code_number = 0, other_code_number = 0;
> +  int i1_code_number = 0;
>/* Contains I3 if the destination of I3 is used in its source, which means
>   that the old life of I3 is being killed.  If that usage is placed into
>   I2 and not in I3, a REG_DEAD note must be made.  */
> @@ -2756,7 +2778,9 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, 
> rtx_insn *i0,
> else if (BINARY_P (src) && CONSTANT_P (XEXP (src, 1)))
>   ngood++;
> else if (GET_CODE (src) == ASHIFT || GET_CODE (src) == 

Re: [PATCH] Better __ashlDI3, __ashrDI3 and __lshrDI3 functions, plus fixed __bswapsi2 function

2020-11-10 Thread Andreas Schwab
On Nov 10 2020, Stefan Kanthak wrote:

> Eric Botcazou  wrote:
>
>>> The implementation of the __ashlDI3(), __ashrDI3() and __lshrDI3() functions
>>> is rather bad, it yields bad machine code at least on i386 and AMD64. Since
>>> GCC knows how to shift integers twice the register size these functions can
>>> be written as one-liners.
>> 
>> These functions are precisely meant to be used when GCC cannot do that.
>
> On which processor(s) is GCC unable to generate code for DWtype shifts?

On most 32-bit targets with -Os.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


Re: [PATCH] Better __ashlDI3, __ashrDI3 and __lshrDI3 functions, plus fixed __bswapsi2 function

2020-11-10 Thread Jakub Jelinek via Gcc-patches
On Tue, Nov 10, 2020 at 04:48:10PM -0700, Jeff Law via Gcc-patches wrote:
> > @@ -486,10 +425,10 @@
> >  SItype
> >  __bswapsi2 (SItype u)
> >  {
> > -  return u) & 0xff00) >> 24)
> > - | (((u) & 0x00ff) >>  8)
> > - | (((u) & 0xff00) <<  8)
> > - | (((u) & 0x00ff) << 24));
> > +  return u) & 0xff00u) >> 24)
> > + | (((u) & 0x00ffu) >>  8)
> > + | (((u) & 0xff00u) <<  8)
> > + | (((u) & 0x00ffu) << 24));
> 
> What's the point of this change?  I'm not sure how the signedness of the
> constant really matters here.

Note 0xff00 is implicitly 0xff00U because it doesn't fit into signed
int, and that is the only one where the logical vs. arithmetic right shift
really matters for correct behavior.
Whether (((u) & 0x00ff) >> 8) is expanded as logical or arithmetic shift
doesn't really matter for code behavior, though perhaps we could during
expansion check if the most significant bit is known to be zero as in this
case and ask target about arithmetic vs. logical right shift costs and
choose the less costly if they aren't even.

Jakub



[committed] libstdc++: Avoid bad_alloc exceptions when changing locales

2020-11-10 Thread Jonathan Wakely via Gcc-patches
For the --enable-clocale=generic configuration, the current code can
fail with a bad_alloc exception. This patch uses the nothrow version of
operator new and reports allocation failures by setting failbit in the
iostate variable.

* config/locale/generic/c_locale.cc (__set_C_locale()): New function
to set the "C" locale and return the name of the previous locale.
(__convert_to_v, __convert_to_v)
(__convert_to_v): Use __set_C_locale and set failbit on
error.

Tested sparc-sun-solaris2.11. Committed to trunk.

commit 5dfbc52264fc64db22e75f385be9efae3d0eba46
Author: Jonathan Wakely 
Date:   Tue Nov 10 19:23:15 2020

libstdc++: Avoid bad_alloc exceptions when changing locales

For the --enable-clocale=generic configuration, the current code can
fail with a bad_alloc exception. This patch uses the nothrow version of
operator new and reports allocation failures by setting failbit in the
iostate variable.

* config/locale/generic/c_locale.cc (__set_C_locale()): New function
to set the "C" locale and return the name of the previous locale.
(__convert_to_v, __convert_to_v)
(__convert_to_v): Use __set_C_locale and set failbit on
error.

diff --git a/libstdc++-v3/config/locale/generic/c_locale.cc 
b/libstdc++-v3/config/locale/generic/c_locale.cc
index 61ead91ec1ed..648b8e182cc7 100644
--- a/libstdc++-v3/config/locale/generic/c_locale.cc
+++ b/libstdc++-v3/config/locale/generic/c_locale.cc
@@ -52,6 +52,22 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   ~_Save_errno() { if (errno == 0) errno = _M_errno; }
   int _M_errno;
 };
+
+// calls setlocale(LC_ALL, "C") and returns a string containing the old
+// locale name. Caller must delete[] the string. Returns NULL on error.
+const char*
+__set_C_locale()
+{
+  char* __old = setlocale(LC_ALL, 0);
+  const size_t __len = strlen(__old) + 1;
+  char* __sav = new(nothrow) char[__len];
+  if (__sav)
+   {
+ memcpy(__sav, __old, __len);
+ setlocale(LC_ALL, "C");
+   }
+  return __sav;
+}
   }
 
   template<>
@@ -60,11 +76,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   const __c_locale&) throw()
 {
   // Assumes __s formatted for "C" locale.
-  char* __old = setlocale(LC_ALL, 0);
-  const size_t __len = strlen(__old) + 1;
-  char* __sav = new char[__len];
-  memcpy(__sav, __old, __len);
-  setlocale(LC_ALL, "C");
+  const char* __sav = __set_C_locale();
+  if (!__sav)
+   {
+ __err = ios_base::failbit;
+ return;
+   }
   char* __sanity;
   bool __overflow = false;
 
@@ -125,11 +142,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   const __c_locale&) throw()
 {
   // Assumes __s formatted for "C" locale.
-  char* __old = setlocale(LC_ALL, 0);
-  const size_t __len = strlen(__old) + 1;
-  char* __sav = new char[__len];
-  memcpy(__sav, __old, __len);
-  setlocale(LC_ALL, "C");
+  const char* __sav = __set_C_locale();
+  if (!__sav)
+   {
+ __err = ios_base::failbit;
+ return;
+   }
   char* __sanity;
 
 #if !__DBL_HAS_INFINITY__
@@ -170,11 +188,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   ios_base::iostate& __err, const __c_locale&) throw()
 {
   // Assumes __s formatted for "C" locale.
-  char* __old = setlocale(LC_ALL, 0);
-  const size_t __len = strlen(__old) + 1;
-  char* __sav = new char[__len];
-  memcpy(__sav, __old, __len);
-  setlocale(LC_ALL, "C");
+  const char* __sav = __set_C_locale();
+  if (!__sav)
+   {
+ __err = ios_base::failbit;
+ return;
+   }
 
 #if !__LDBL_HAS_INFINITY__
   const _Save_errno __save_errno;


Re: [PATCH] Objective-C : Implement NSObject attribute.

2020-11-10 Thread Jeff Law via Gcc-patches


On 11/10/20 4:43 PM, Iain Sandoe wrote:
> Hi Jeff,
>
> Jeff Law  wrote:
>>>
>>> This attribute allows pointers to be marked as pointers to
>>> an NSObject-compatible object.  This allows for additional
>
>>
>> OK with a suitable update to documentation.  I have no idea the
>> attribute is supposed to do.
>
> Joseph ack’d this and I applied it to master ..
>
> .. this was intentionally undocumented.
>
> I originally wrote documentation for it.  However, it turns out that it
> is undocumented in both clang (and as far as I can tell Apple’s)
> material.
>
> This is perhaps because it’s not really intended (or recommended)
> for end-user code, but for “internal” cases when writing libraries etc.
>
> WDYT? (I’m happy to ressurect the documentation I wrote and
> post/apply it).

Your call -- if users aren't supposed to use the attribute, then
documentation is much less important ;-)

Jeff




Re: [PATCH] Better __ashlDI3, __ashrDI3 and __lshrDI3 functions, plus fixed __bswapsi2 function

2020-11-10 Thread Jeff Law via Gcc-patches


On 11/10/20 10:59 AM, Stefan Kanthak via Gcc-patches wrote:
> The implementation of the __ashlDI3(), __ashrDI3() and __lshrDI3() functions
> is rather bad, it yields bad machine code at least on i386 and AMD64.
> Since GCC knows how to shift integers twice the register size these functions
> can be written as one-liners.
>
> The implementation of the __bswapsi2() function uses SIGNED instead of
> unsigned mask values; cf. __bswapdi2()
>
> Stefan Kanthak
>
> libgcc2.diff
>
> --- -/libgcc/libgcc2.h
> +++ +/libgcc/libgcc2.h
> @@ -391,7 +391,7 @@
>  extern DWtype __negdi2 (DWtype);
>  #endif
>  
> -extern DWtype __lshrdi3 (DWtype, shift_count_type);
> +extern UDWtype __lshrdi3 (UDWtype, shift_count_type);
>  extern DWtype __ashldi3 (DWtype, shift_count_type);
>  extern DWtype __ashrdi3 (DWtype, shift_count_type);
>  
> --- -/libgcc/libgcc2.c
> +++ +/libgcc/libgcc2.c
> @@ -398,30 +398,10 @@
>  /* Unless shift functions are defined with full ANSI prototypes,
> parameter b will be promoted to int if shift_count_type is smaller than 
> an int.  */
>  #ifdef L_lshrdi3
> -DWtype
> -__lshrdi3 (DWtype u, shift_count_type b)
> +UDWtype
> +__lshrdi3 (UDWtype u, shift_count_type b)

As has been pointed out, you can't implement these routines by doing the
operation in the same mode as the argument.  It's potentially
self-recursive.


THe whole point of these routines is to provide double-word capabilities
on targets that don't have intrinsic double-word capabilities. That's
why they're written in a non-obvious way using word sized operations.


>  
> @@ -486,10 +425,10 @@
>  SItype
>  __bswapsi2 (SItype u)
>  {
> -  return u) & 0xff00) >> 24)
> -   | (((u) & 0x00ff) >>  8)
> -   | (((u) & 0xff00) <<  8)
> -   | (((u) & 0x00ff) << 24));
> +  return u) & 0xff00u) >> 24)
> +   | (((u) & 0x00ffu) >>  8)
> +   | (((u) & 0xff00u) <<  8)
> +   | (((u) & 0x00ffu) << 24));

What's the point of this change?  I'm not sure how the signedness of the
constant really matters here.

jeff



Re: [PATCH] Objective-C : Implement NSObject attribute.

2020-11-10 Thread Iain Sandoe

Hi Jeff,

Jeff Law  wrote:


This attribute allows pointers to be marked as pointers to
an NSObject-compatible object.  This allows for additional




OK with a suitable update to documentation.  I have no idea the
attribute is supposed to do.


Joseph ack’d this and I applied it to master ..

.. this was intentionally undocumented.

I originally wrote documentation for it.  However, it turns out that it
is undocumented in both clang (and as far as I can tell Apple’s)
material.

This is perhaps because it’s not really intended (or recommended)
for end-user code, but for “internal” cases when writing libraries etc.

WDYT? (I’m happy to ressurect the documentation I wrote and
post/apply it).

thanks
Iain



Re: [PATCH] Better __ashlDI3, __ashrDI3 and __lshrDI3 functions, plus fixed __bswapsi2 function

2020-11-10 Thread Jeff Law via Gcc-patches


On 11/10/20 2:17 PM, Jakub Jelinek wrote:
> On Tue, Nov 10, 2020 at 02:09:20PM -0700, Jeff Law wrote:
>> On 11/10/20 1:14 PM, Jakub Jelinek via Gcc-patches wrote:
>>> On Tue, Nov 10, 2020 at 08:44:32PM +0100, Stefan Kanthak wrote:
 Eric Botcazou  wrote:

>> The implementation of the __ashlDI3(), __ashrDI3() and __lshrDI3() 
>> functions
>> is rather bad, it yields bad machine code at least on i386 and AMD64. 
>> Since
>> GCC knows how to shift integers twice the register size these functions 
>> can
>> be written as one-liners.
> These functions are precisely meant to be used when GCC cannot do that.
 On which processor(s) is GCC unable to generate code for DWtype shifts?
>>> E.g. avr-none, msp430-elf, pdp11-aout.
>>> And I see recursive __cmpdi2 calls on avr-none too.
>> ACK.  I'll pull those [u]cmpdi changes.  They were iffy at best, this
>> confirms the concerns we both had.
> To be precise, I haven't tried to build libgcc for all of those, just
> checked what code is produced for
> unsigned long long foo (unsigned long long x) { return x << 3; }
> etc. by all trunk cross-compilers I have lying around.
> It might be that their __ashlDI3 etc. are actually __ashlsi3 and they
> have some other implementation of __ashldi3 etc.

Well for avr libgcc's cmpdi compiles into something reasonable, but
ucmpdi calls cmpdi and I can't convince myself it's right.  Regardless,
the [u]cmpdi stuff was very much borderline from a correctness
standpoint and I'm going to revert that part.



>
> Anyway, I'm not sure if we really need to optimize too much functions which
> are never used; and the -ftrapv stuff should one day be reimplemented using
> -fsanitize=signed-integer-overflow with an abort behavior on overflows.
>
> libgcc functions which are used heavily of course should be optimized as
> much as possible.

Agreed.  The improvements to use the overflow intrinsics are somewhat
useful -- they make the code clearer and that's a win.  The potential
self-recursive stuff in [u]cmpdi and the shifts probably aren't useful
(I haven't looked closely at the shifts yet).  Stefan has a fundamental
mis-understanding of how those routines are used and more importantly
the constraints they have to work within.

Jeff

Jeff

>
>   Jakub
>



Re: [08/32] cpp mkdeps

2020-11-10 Thread Jeff Law via Gcc-patches


On 11/3/20 2:14 PM, Nathan Sidwell wrote:
> The final bit of preprocessor change is that to mkdeps.  We have a new
> kind of thing to be dependent upon -- a compiled module interface.
> That's also an additional target.
>
> This adds the logic to mkdeps to keep those dependencies separate from
> include file dependencies.  I made use of this when demonstrating a
> module-aware Make POC.
>
>
>
> 08-cpp-mkdeps.diff
>
OK with ChangeLog entry.

jeff




Re: [07/32] cpp main

2020-11-10 Thread Jeff Law via Gcc-patches


On 11/3/20 2:14 PM, Nathan Sidwell wrote:
> Here's the changes to starting the main file. I have added the ability
> to search the user or system include paths for the main file.  That's
> real helpful to users attempting to build header-units.  I'll discuss
> the CLI with the options patch.
>
> Also recording the location at which the main file starts is helpful
> for whole-file diagnostics.
>
>
>
> 07-cpp-main.diff
>
OK with ChangeLog entry.

jeff




Re: [PATCH] Objective-C : Implement NSObject attribute.

2020-11-10 Thread Jeff Law via Gcc-patches


On 11/6/20 1:18 PM, Iain Sandoe wrote:
> Hi
>
> Originally, I had this as a Darwin-only patch, however GNUStep
> also makes use of NSObject and similar constructs, so this really
> needs to be available to linux-gnu as well.
>
> tested across several Darwin versions and on x86_64-linux-gnu.
>
> OK for the c-family changes?
> thanks
> Iain
>
> =
>
> This attribute allows pointers to be marked as pointers to
> an NSObject-compatible object.  This allows for additional
> checking of assignment etc. when referring to pointers to
> opaque types (or type-erased via a void * cast).
>
> gcc/c-family/ChangeLog:
>
>   * c-attribs.c (handle_nsobject_attribute): New.
>   * c.opt: Add WNSObject-attribute.
>
> gcc/objc/ChangeLog:
>
>   * objc-act.c (objc_compare_types): Handle NSObject type
>   attributes.
>   (objc_type_valid_for_messaging): Likewise.
>
> gcc/testsuite/ChangeLog:
>
>   * obj-c++.dg/attributes/nsobject-01.mm: New test.
>   * objc.dg/attributes/nsobject-01.m: New test.

OK with a suitable update to documentation.  I have no idea the
attribute is supposed to do.


jeff




Re: [PATCH][MIPS] PR target/77510 Reduce size of MIPS core automaton

2020-11-10 Thread Jeff Law via Gcc-patches


On 11/10/20 4:42 AM, Xu Chenghua wrote:
>
> Hi:
>
> This patch reduce reservation of model do not more than 10 cycles. The
> memory of genautomata down to 1GB.

How bad is the memory consumption before this change?


>
> ChangeLog
>
> gcc/
>     PR target/77510
>     * config/mips/gs464.md: Reduce reservation
>     duration to 10 cycles.
>     * config/mips/i6400.md: Likewise.
>     * config/mips/m5100.md: Likewise.
>     * config/mips/p5600.md: Likewise.
>     * config/mips/p6600.md: Likewise.
>     * config/mips/xlp.md: Likewise.
>     * config/mips/xlr.md: Likewise.

I doubt there's a measurable performance penalty for not fully modeling
the latency of these instructions, but I'm quite hesitant to ACK this
change without hearing from others in the MIPS space.   Alternately, if
the affected models aren't being sold anymore, then I wouldn't worry
about getting feedback on them.


Jeff




Re: [Patch, fortran] PR83118 - [8/9/10/11 Regression] Bad intrinsic assignment of class(*) array component of derived type

2020-11-10 Thread Thomas Koenig via Gcc-patches

Hi Paul,


This all bootstraps and regtests on FC31/x86_64 - OK for master?


This is a sizable patch, and from what I can see, it all looks
plausible.  So, I's say OK for master (with one nit, below),
but maybe you could wait a day or so to give others the chance
to look it over, too.

The nit:


PR fortran/83118
* gfortran.dg/dependency_57.f90: Change to dg-run and test for correct
result.


I'd rather not change a test case unless it is needed; if something
breaks it, it is better to leave it as is for bisection.

Could you just make a new test from the run-time version?

Thanks a lot for tackling this thorny issue!

Best regards

Thomas




Re: [PATCH] Better __ashlDI3, __ashrDI3 and __lshrDI3 functions, plus fixed __bswapsi2 function

2020-11-10 Thread Jeff Law via Gcc-patches


On 11/10/20 1:14 PM, Jakub Jelinek via Gcc-patches wrote:
> On Tue, Nov 10, 2020 at 08:44:32PM +0100, Stefan Kanthak wrote:
>> Eric Botcazou  wrote:
>>
 The implementation of the __ashlDI3(), __ashrDI3() and __lshrDI3() 
 functions
 is rather bad, it yields bad machine code at least on i386 and AMD64. Since
 GCC knows how to shift integers twice the register size these functions can
 be written as one-liners.
>>> These functions are precisely meant to be used when GCC cannot do that.
>> On which processor(s) is GCC unable to generate code for DWtype shifts?
> E.g. avr-none, msp430-elf, pdp11-aout.
> And I see recursive __cmpdi2 calls on avr-none too.

FWIW, the tester only builds libgcc for avr-elf as there's no newlib
port.  msp430-elf builds newlib and will run the testsuite using the
simulator from binutils/gdb.  I don't test pdp11 at all.

While the avr port may advertise DImode shifts and such, under the hood
they're actually implemented as calls into hand written assembly in
libgcc and probably implement an internal ABI, particularly WRT register
usage.


jeff



[r11-4883 Regression] FAIL: gfortran.dg/gomp/workshare-reduction-57.f90 -O scan-tree-dump-times optimized "__builtin_GOMP_loop_dynamic_next " 1 on Linux/x86_64

2020-11-10 Thread sunil.k.pandey via Gcc-patches
On Linux/x86_64,

e929ef532ad52cde873dfc0849907b020ffc5afd is the first bad commit
commit e929ef532ad52cde873dfc0849907b020ffc5afd
Author: Tobias Burnus 
Date:   Tue Nov 10 18:28:18 2020 +0100

Fortran: OpenMP 5.0 (in_,task_)reduction clause extensions

caused

FAIL: gfortran.dg/gomp/workshare-reduction-26.f90   -O   scan-tree-dump-times 
optimized "__builtin_GOMP_loop_maybe_nonmonotonic_runtime_next " 1
FAIL: gfortran.dg/gomp/workshare-reduction-26.f90   -O   scan-tree-dump-times 
optimized "__builtin_GOMP_loop_start [^\n\r]*, 0, 0, " 1
FAIL: gfortran.dg/gomp/workshare-reduction-27.f90   -O   scan-tree-dump-times 
optimized "__builtin_GOMP_loop_runtime_next " 1
FAIL: gfortran.dg/gomp/workshare-reduction-27.f90   -O   scan-tree-dump-times 
optimized "__builtin_GOMP_loop_start [^\n\r]*, (?:2147483648|-2147483648), 0, " 
1
FAIL: gfortran.dg/gomp/workshare-reduction-28.f90   -O   scan-tree-dump-times 
optimized "__builtin_GOMP_loop_nonmonotonic_runtime_next " 1
FAIL: gfortran.dg/gomp/workshare-reduction-28.f90   -O   scan-tree-dump-times 
optimized "__builtin_GOMP_loop_start [^\n\r]*, 4, 0, " 1
FAIL: gfortran.dg/gomp/workshare-reduction-36.f90   -O   scan-tree-dump-times 
optimized "__builtin_GOMP_loop_nonmonotonic_dynamic_next " 1
FAIL: gfortran.dg/gomp/workshare-reduction-36.f90   -O   scan-tree-dump-times 
optimized "__builtin_GOMP_loop_start [^\n\r]*, 2, 1, " 1
FAIL: gfortran.dg/gomp/workshare-reduction-37.f90   -O   scan-tree-dump-times 
optimized "__builtin_GOMP_loop_dynamic_next " 1
FAIL: gfortran.dg/gomp/workshare-reduction-37.f90   -O   scan-tree-dump-times 
optimized "__builtin_GOMP_loop_start [^\n\r]*, (?:2147483650|-2147483646), 1, " 
1
FAIL: gfortran.dg/gomp/workshare-reduction-38.f90   -O   scan-tree-dump-times 
optimized "__builtin_GOMP_loop_nonmonotonic_dynamic_next " 1
FAIL: gfortran.dg/gomp/workshare-reduction-38.f90   -O   scan-tree-dump-times 
optimized "__builtin_GOMP_loop_start [^\n\r]*, 2, 1, " 1
FAIL: gfortran.dg/gomp/workshare-reduction-39.f90   -O   scan-tree-dump-times 
optimized "__builtin_GOMP_loop_nonmonotonic_dynamic_next " 1
FAIL: gfortran.dg/gomp/workshare-reduction-39.f90   -O   scan-tree-dump-times 
optimized "__builtin_GOMP_loop_start [^\n\r]*, 2, 3, " 1
FAIL: gfortran.dg/gomp/workshare-reduction-3.f90   -O   scan-tree-dump-times 
optimized "__builtin_GOMP_loop_nonmonotonic_runtime_next " 1
FAIL: gfortran.dg/gomp/workshare-reduction-3.f90   -O   scan-tree-dump-times 
optimized "__builtin_GOMP_loop_start [^\n\r]*, 4, 0, " 1
FAIL: gfortran.dg/gomp/workshare-reduction-40.f90   -O   scan-tree-dump-times 
optimized "__builtin_GOMP_loop_dynamic_next " 1
FAIL: gfortran.dg/gomp/workshare-reduction-40.f90   -O   scan-tree-dump-times 
optimized "__builtin_GOMP_loop_start [^\n\r]*, (?:2147483650|-2147483646), 3, " 
1
FAIL: gfortran.dg/gomp/workshare-reduction-41.f90   -O   scan-tree-dump-times 
optimized "__builtin_GOMP_loop_nonmonotonic_dynamic_next " 1
FAIL: gfortran.dg/gomp/workshare-reduction-41.f90   -O   scan-tree-dump-times 
optimized "__builtin_GOMP_loop_start [^\n\r]*, 2, 3, " 1
FAIL: gfortran.dg/gomp/workshare-reduction-42.f90   -O   scan-tree-dump-times 
optimized "__builtin_GOMP_loop_nonmonotonic_guided_next " 1
FAIL: gfortran.dg/gomp/workshare-reduction-42.f90   -O   scan-tree-dump-times 
optimized "__builtin_GOMP_loop_start [^\n\r]*, 3, 1, " 1
FAIL: gfortran.dg/gomp/workshare-reduction-43.f90   -O   scan-tree-dump-times 
optimized "__builtin_GOMP_loop_guided_next " 1
FAIL: gfortran.dg/gomp/workshare-reduction-43.f90   -O   scan-tree-dump-times 
optimized "__builtin_GOMP_loop_start [^\n\r]*, (?:2147483651|-2147483645), 1, " 
1
FAIL: gfortran.dg/gomp/workshare-reduction-44.f90   -O   scan-tree-dump-times 
optimized "__builtin_GOMP_loop_nonmonotonic_guided_next " 1
FAIL: gfortran.dg/gomp/workshare-reduction-44.f90   -O   scan-tree-dump-times 
optimized "__builtin_GOMP_loop_start [^\n\r]*, 3, 1, " 1
FAIL: gfortran.dg/gomp/workshare-reduction-45.f90   -O   scan-tree-dump-times 
optimized "__builtin_GOMP_loop_nonmonotonic_guided_next " 1
FAIL: gfortran.dg/gomp/workshare-reduction-45.f90   -O   scan-tree-dump-times 
optimized "__builtin_GOMP_loop_start [^\n\r]*, 3, 3, " 1
FAIL: gfortran.dg/gomp/workshare-reduction-46.f90   -O   scan-tree-dump-times 
optimized "__builtin_GOMP_loop_guided_next " 1
FAIL: gfortran.dg/gomp/workshare-reduction-46.f90   -O   scan-tree-dump-times 
optimized "__builtin_GOMP_loop_start [^\n\r]*, (?:2147483651|-2147483645), 3, " 
1
FAIL: gfortran.dg/gomp/workshare-reduction-47.f90   -O   scan-tree-dump-times 
optimized "__builtin_GOMP_loop_nonmonotonic_guided_next " 1
FAIL: gfortran.dg/gomp/workshare-reduction-47.f90   -O   scan-tree-dump-times 
optimized "__builtin_GOMP_loop_start [^\n\r]*, 3, 3, " 1
FAIL: gfortran.dg/gomp/workshare-reduction-56.f90   -O   scan-tree-dump-times 
optimized "__builtin_GOMP_doacross_post " 1
FAIL: gfortran.dg/gomp/workshare-reduction-56.f90   -O   scan-tree-dump-times 
optimized "__builtin_GOMP_doacross_wait " 1
FAIL: 

Re: [PATCH] Better __ashlDI3, __ashrDI3 and __lshrDI3 functions, plus fixed __bswapsi2 function

2020-11-10 Thread Jeff Law via Gcc-patches


On 11/10/20 12:44 PM, Stefan Kanthak wrote:
> Eric Botcazou  wrote:
>
>>> The implementation of the __ashlDI3(), __ashrDI3() and __lshrDI3() functions
>>> is rather bad, it yields bad machine code at least on i386 and AMD64. Since
>>> GCC knows how to shift integers twice the register size these functions can
>>> be written as one-liners.
>> These functions are precisely meant to be used when GCC cannot do that.
> On which processor(s) is GCC unable to generate code for DWtype shifts?

We'd have to look at them all, and it's just not worth the headache.  
Just to pick one off the top of my head that I know well would be the H8
which only has QI, HI and SI shifts/rotates.  I'd expect inherent
support for double-word shifts to be the exception rather than the rule
for 32bit embedded targets.


>
> JFTR: if GCC were not able to generate code for DWtype addition and 
> subtraction
>   it would also need __[u]addDI3() and __[u]subDI3() functions ... which
>   are but missing from libgcc.a

Umm, no.  Non-overflow trapping addition/subtraction needs just one
version, signedness doesn't matter.


jeff



PING: Re: [PATCH] Add analyzer plugin support and CPython GIL example

2020-11-10 Thread David Malcolm via Gcc-patches
I'd like to ping this patch:
  https://gcc.gnu.org/pipermail/gcc-patches/2020-October/556214.html
Are the non-analyzer parts OK for master?

Thanks
Dave

On Wed, 2020-10-14 at 21:21 -0400, David Malcolm wrote:
> This patch adds a new GCC plugin event: PLUGIN_ANALYZER_INIT, called
> when -fanalyzer is starting, allowing for GCC plugins to register
> additional state-machine-based checks within -fanalyzer.  The idea
> is that 3rd-party code might want to add domain-specific checks for
> its own APIs - with the caveat that the analyzer is itself still
> rather experimental.
> 
> As an example, the patch adds a proof-of-concept plugin to the
> testsuite
> for checking CPython code: verifying that code that relinquishes
> CPython's global interpreter lock doesn't attempt to do anything with
> PyObjects in the sections where the lock isn't held.  It also adds a
> warning about nested releases of the lock, which is forbidden.
> For example:
> 
> demo.c: In function 'foo':
> demo.c:11:3: warning: use of PyObject '*(obj)' without the GIL
>11 |   Py_INCREF (obj);
>   |   ^
>   'test': events 1-3
> |
> |   15 | void test (PyObject *obj)
> |  |  ^~~~
> |  |  |
> |  |  (1) entry to 'test'
> |   16 | {
> |   17 |   Py_BEGIN_ALLOW_THREADS
> |  |   ~~
> |  |   |
> |  |   (2) releasing the GIL here
> |   18 |   foo (obj);
> |  |   ~
> |  |   |
> |  |   (3) calling 'foo' from 'test'
> |
> +--> 'foo': events 4-5
>|
>|9 | foo (PyObject *obj)
>|  | ^~~
>|  | |
>|  | (4) entry to 'foo'
>|   10 | {
>|   11 |   Py_INCREF (obj);
>|  |   ~
>|  |   |
>|  |   (5) PyObject '*(obj)' used here without the GIL
>|
> 
> Doing so requires adding some logic for ignoring macro expansions in
> analyzer diagnostics, since the insides of Py_INCREF and
> Py_BEGIN_ALLOW_THREADS are not of interest to the user for these
> cases.
> 
> Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
> 
> Are the non-analyzer parts OK for master?
> 
> gcc/analyzer/ChangeLog:
>   * analyzer-pass.cc (pass_analyzer::execute): Move sorry call
> to...
>   (sorry_no_analyzer): New.
>   * analyzer.h (class state_machine): New forward decl.
>   (class logger): New forward decl.
>   (class plugin_analyzer_init_iface): New.
>   (sorry_no_analyzer): New decl.
>   * checker-path.cc (checker_path::fixup_locations): New.
>   * checker-path.h (checker_event::set_location): New.
>   (checker_path::fixup_locations): New decl.
>   * diagnostic-manager.cc
>   (diagnostic_manager::emit_saved_diagnostic): Call
>   checker_path::fixup_locations, and call fixup_location
>   on the primary location.
>   * engine.cc: Include "plugin.h".
>   (class plugin_analyzer_init_impl): New.
>   (impl_run_checkers): Invoke PLUGIN_ANALYZER_INIT callbacks.
>   * pending-diagnostic.h (pending_diagnostic::fixup_location):
> New
>   vfunc.
> 
> gcc/ChangeLog:
>   * doc/plugins.texi (Plugin callbacks): Add
> PLUGIN_ANALYZER_INIT.
>   * plugin.c (register_callback): Likewise.
>   (invoke_plugin_callbacks_full): Likewise.
>   * plugin.def (PLUGIN_ANALYZER_INIT): New event.
> 
> gcc/testsuite/ChangeLog:
>   * gcc.dg/plugin/analyzer_gil_plugin.c: New test.
>   * gcc.dg/plugin/gil-1.c: New test.
>   * gcc.dg/plugin/gil.h: New header.
>   * gcc.dg/plugin/plugin.exp (plugin_test_list): Add the new
> plugin
>   and test.
> ---
>  gcc/analyzer/analyzer-pass.cc |  18 +-
>  gcc/analyzer/analyzer.h   |  15 +
>  gcc/analyzer/checker-path.cc  |   9 +
>  gcc/analyzer/checker-path.h   |   4 +
>  gcc/analyzer/diagnostic-manager.cc|   9 +-
>  gcc/analyzer/engine.cc|  31 ++
>  gcc/analyzer/pending-diagnostic.h |   8 +
>  gcc/doc/plugins.texi  |   4 +
>  gcc/plugin.c  |   2 +
>  gcc/plugin.def|   4 +
>  .../gcc.dg/plugin/analyzer_gil_plugin.c   | 436
> ++
>  gcc/testsuite/gcc.dg/plugin/gil-1.c   |  90 
>  gcc/testsuite/gcc.dg/plugin/gil.h |  32 ++
>  gcc/testsuite/gcc.dg/plugin/plugin.exp|   2 +
>  14 files changed, 660 insertions(+), 4 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/plugin/analyzer_gil_plugin.c
>  create mode 100644 gcc/testsuite/gcc.dg/plugin/gil-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/plugin/gil.h
> 
> diff --git a/gcc/analyzer/analyzer-pass.cc b/gcc/analyzer/analyzer-
> pass.cc
> index a27421e46d4..1f65bf8b154 100644
> --- a/gcc/analyzer/analyzer-pass.cc
> +++ b/gcc/analyzer/analyzer-pass.cc
> @@ -83,9 

Re: [PATCH] Better __ashlDI3, __ashrDI3 and __lshrDI3 functions, plus fixed __bswapsi2 function

2020-11-10 Thread Jakub Jelinek via Gcc-patches
On Tue, Nov 10, 2020 at 02:09:20PM -0700, Jeff Law wrote:
> 
> On 11/10/20 1:14 PM, Jakub Jelinek via Gcc-patches wrote:
> > On Tue, Nov 10, 2020 at 08:44:32PM +0100, Stefan Kanthak wrote:
> >> Eric Botcazou  wrote:
> >>
>  The implementation of the __ashlDI3(), __ashrDI3() and __lshrDI3() 
>  functions
>  is rather bad, it yields bad machine code at least on i386 and AMD64. 
>  Since
>  GCC knows how to shift integers twice the register size these functions 
>  can
>  be written as one-liners.
> >>> These functions are precisely meant to be used when GCC cannot do that.
> >> On which processor(s) is GCC unable to generate code for DWtype shifts?
> > E.g. avr-none, msp430-elf, pdp11-aout.
> > And I see recursive __cmpdi2 calls on avr-none too.
> 
> ACK.  I'll pull those [u]cmpdi changes.  They were iffy at best, this
> confirms the concerns we both had.

To be precise, I haven't tried to build libgcc for all of those, just
checked what code is produced for
unsigned long long foo (unsigned long long x) { return x << 3; }
etc. by all trunk cross-compilers I have lying around.
It might be that their __ashlDI3 etc. are actually __ashlsi3 and they
have some other implementation of __ashldi3 etc.

Anyway, I'm not sure if we really need to optimize too much functions which
are never used; and the -ftrapv stuff should one day be reimplemented using
-fsanitize=signed-integer-overflow with an abort behavior on overflows.

libgcc functions which are used heavily of course should be optimized as
much as possible.

Jakub



Re: [PATCH] Better __ashlDI3, __ashrDI3 and __lshrDI3 functions, plus fixed __bswapsi2 function

2020-11-10 Thread Jeff Law via Gcc-patches


On 11/10/20 1:14 PM, Jakub Jelinek via Gcc-patches wrote:
> On Tue, Nov 10, 2020 at 08:44:32PM +0100, Stefan Kanthak wrote:
>> Eric Botcazou  wrote:
>>
 The implementation of the __ashlDI3(), __ashrDI3() and __lshrDI3() 
 functions
 is rather bad, it yields bad machine code at least on i386 and AMD64. Since
 GCC knows how to shift integers twice the register size these functions can
 be written as one-liners.
>>> These functions are precisely meant to be used when GCC cannot do that.
>> On which processor(s) is GCC unable to generate code for DWtype shifts?
> E.g. avr-none, msp430-elf, pdp11-aout.
> And I see recursive __cmpdi2 calls on avr-none too.

ACK.  I'll pull those [u]cmpdi changes.  They were iffy at best, this
confirms the concerns we both had.

jeff



Re: [stage1][PATCH] Change semantics of -frecord-gcc-switches and add -frecord-gcc-switches-format.

2020-11-10 Thread Qing Zhao via Gcc-patches
Jakub and Jeff,

PING^7 on the following patch proposed 8 months ago for gcc11:

https://gcc.gnu.org/pipermail/gcc-patches/2020-March/542198.html 


The deadline for gcc11 stage 1 is approaching.  The pinged patch is one that 
has been sent for review 8 months ago in order to 
Make into gcc11. 

And this is an important feature that our company is waiting for a long time. 

Could you please take a look at this patch and let us know whether it’s ready 
for commit into gcc11? 

Thanks a lot.

Qing




> On Oct 27, 2020, at 5:56 AM, Martin Liška  wrote:
> 
> PING^6
> 
> The patch is in review process for more than 6 months, can please any global
> reviewer take a look at it?
> 
> Thanks,
> Martin
> 
> On 9/25/20 4:55 PM, Martin Liška wrote:
>> PING^5
> 



Re: [PATCH] Practical Improvement to libgcc Complex Divide

2020-11-10 Thread Patrick McGehearty via Gcc-patches

2nd ping - submitted Sept 8, 2020, last comment Sept 9, 2020
This patch is a correctness bug fix and not a change to any user
visible interface.

It makes a major improvement to complex divide accuracy, including
algorithm improvements known for 8 years. It is highly desirable
to get these changes into gcc11 and not wait another year.

In recent years there have been substantial improvements in mathematical
functions in glibc and gcc. I'm hoping to continue those contributions
with the goal of making glibc/gcc the "go to" solution for portable
mathematical software development. Substantial delays in getting
reviews makes it more difficult to get management backing for
additional mathematical improvement projects.

Also, if reviewer has any unexpected questions about design decisions,
it would be helpful to ask them while I still remember why I made those
choices.

- Patrick McGehearty
patrick.mcgehea...@oracle.com


On 10/13/2020 9:54 AM, Patrick McGehearty via Gcc-patches wrote:

Ping - still need review of version 4 of this patch.
It has been over a month since the last comment.

- patrick



On 9/9/2020 2:13 AM, Richard Biener wrote:

On Tue, Sep 8, 2020 at 8:50 PM Patrick McGehearty via Gcc-patches
 wrote:

(Version 4)

(Added in version 4)
Fixed Changelog entry to include __divsc3, __divdc3, __divxc3, 
__divtc3.

Revised description to avoid incorrect use of "ulp (units last place)".
Modified float precison case to use double precision when double
precision hardware is available. Otherwise float uses the new 
algorithm.

Added code to scale subnormal numerator arguments when appropriate.
This change reduces 16 bit errors in double precision by a factor of 
140.

Revised results charts to match current version of code.
Added background of tuning approach.

Summary of Purpose

The following patch to libgcc/libgcc2.c __divdc3 provides an
opportunity to gain important improvements to the quality of answers
for the default complex divide routine (half, float, double, extended,
long double precisions) when dealing with very large or very small 
exponents.


The current code correctly implements Smith's method (1962) [2]
further modified by c99's requirements for dealing with NaN (not a
number) results. When working with input values where the exponents
are greater than *_MAX_EXP/2 or less than -(*_MAX_EXP)/2, results are
substantially different from the answers provided by quad precision
more than 1% of the time. This error rate may be unacceptable for many
applications that cannot a priori restrict their computations to the
safe range. The proposed method reduces the frequency of
"substantially different" answers by more than 99% for double
precision at a modest cost of performance.

Differences between current gcc methods and the new method will be
described. Then accuracy and performance differences will be discussed.

Background

This project started with an investigation related to
https://urldefense.com/v3/__https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59714__;!!GqivPVa7Brio!NjjEhnQQ38VNyP_v8nlAm9uVjvZldUnobfY5hZdq22cMMVauop64MFw3nOHIQXUmy8PToRw$ 
.  Study of Beebe[1]

provided an overview of past and recent practice for computing complex
divide. The current glibc implementation is based on Robert Smith's
algorithm [2] from 1962.  A google search found the paper by Baudin
and Smith [3] (same Robert Smith) published in 2012. Elen Kalda's
proposed patch [4] is based on that paper.

I developed two sets of test set by randomly distributing values over
a restricted range and the full range of input values. The current
complex divide handled the restricted range well enough, but failed on
the full range more than 1% of the time. Baudin and Smith's primary
test for "ratio" equals zero reduced the cases with 16 or more error
bits by a factor of 5, but still left too many flawed answers. Adding
debug print out to cases with substantial errors allowed me to see the
intermediate calculations for test values that failed. I noted that
for many of the failures, "ratio" was a subnormal. Changing the
"ratio" test from check for zero to check for subnormal reduced the 16
bit error rate by another factor of 12. This single modified test
provides the greatest benefit for the least cost, but the percentage
of cases with greater than 16 bit errors (double precision data) is
still greater than 0.027% (2.7 in 10,000).

Continued examination of remaining errors and their intermediate
computations led to the various tests of input value tests and scaling
to avoid under/overflow. The current patch does not handle some of the
rarest and most extreme combinations of input values, but the random
test data is only showing 1 case in 10 million that has an error of
greater than 12 bits. That case has 18 bits of error and is due to
subtraction cancellation. These results are significantly better
than the results reported by Baudin and Smith.

Support for half, float, double, extended, and long double precision
is included as all 

Re: [PATCH] Better __ashlDI3, __ashrDI3 and __lshrDI3 functions, plus fixed __bswapsi2 function

2020-11-10 Thread Jakub Jelinek via Gcc-patches
On Tue, Nov 10, 2020 at 08:44:32PM +0100, Stefan Kanthak wrote:
> Eric Botcazou  wrote:
> 
> >> The implementation of the __ashlDI3(), __ashrDI3() and __lshrDI3() 
> >> functions
> >> is rather bad, it yields bad machine code at least on i386 and AMD64. Since
> >> GCC knows how to shift integers twice the register size these functions can
> >> be written as one-liners.
> > 
> > These functions are precisely meant to be used when GCC cannot do that.
> 
> On which processor(s) is GCC unable to generate code for DWtype shifts?

E.g. avr-none, msp430-elf, pdp11-aout.
And I see recursive __cmpdi2 calls on avr-none too.

Jakub



[pushed] c++: Add 5 unfixed tests.

2020-11-10 Thread Marek Polacek via Gcc-patches
A couple of dg-ice tests.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/testsuite/ChangeLog:

PR c++/52830
PR c++/88982
PR c++/90799
PR c++/87765
PR c++/89565
* g++.dg/cpp0x/constexpr-52830.C: New test.
* g++.dg/cpp0x/vt-88982.C: New test.
* g++.dg/cpp1z/class-deduction76.C: New test.
* g++.dg/cpp1z/constexpr-lambda26.C: New test.
* g++.dg/cpp2a/nontype-class39.C: New test.
---
 gcc/testsuite/g++.dg/cpp0x/constexpr-52830.C  | 37 +++
 gcc/testsuite/g++.dg/cpp0x/vt-88982.C | 14 +++
 .../g++.dg/cpp1z/class-deduction76.C  | 25 +
 .../g++.dg/cpp1z/constexpr-lambda26.C | 13 +++
 gcc/testsuite/g++.dg/cpp2a/nontype-class39.C  | 12 ++
 5 files changed, 101 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/constexpr-52830.C
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/vt-88982.C
 create mode 100644 gcc/testsuite/g++.dg/cpp1z/class-deduction76.C
 create mode 100644 gcc/testsuite/g++.dg/cpp1z/constexpr-lambda26.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/nontype-class39.C

diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-52830.C 
b/gcc/testsuite/g++.dg/cpp0x/constexpr-52830.C
new file mode 100644
index 000..2c9d2f9b329
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-52830.C
@@ -0,0 +1,37 @@
+// PR c++/52830
+// { dg-do compile { target c++11 } }
+// { dg-ice "comptypes" }
+
+template struct eif { typedef void type; };
+template<>   struct eif {};
+
+template struct same
+{
+  static constexpr bool value = false;
+};
+template
+struct same
+{
+  static constexpr bool value = true;
+};
+
+
+struct foo {
+  template
+  void func(T && a,
+typename eif::value>::type * = 0);
+};
+
+template
+void
+foo::
+func(T && a,
+ typename eif::value>::type * )
+{
+}
+
+void do_stuff()
+{
+  foo f;
+  f.func(12);
+}
diff --git a/gcc/testsuite/g++.dg/cpp0x/vt-88982.C 
b/gcc/testsuite/g++.dg/cpp0x/vt-88982.C
new file mode 100644
index 000..cb9530dcee1
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/vt-88982.C
@@ -0,0 +1,14 @@
+// PR c++/88982
+// { dg-do compile { target c++11 } }
+// { dg-ice "tsubst_pack_expansion" }
+
+template struct A {
+  template class ...Cs, Cs ...Vs> struct B {
+B() {
+}
+  };
+};
+
+template using Int = int;
+template using Char = char;
+A::B b;
diff --git a/gcc/testsuite/g++.dg/cpp1z/class-deduction76.C 
b/gcc/testsuite/g++.dg/cpp1z/class-deduction76.C
new file mode 100644
index 000..23bb6e8fa9a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/class-deduction76.C
@@ -0,0 +1,25 @@
+// PR c++/90799
+// { dg-do compile { target c++17 } }
+// { dg-ice "unify" }
+
+template
+void foo() noexcept(T::value);
+
+struct S {
+static constexpr const bool value = true;
+
+template
+void bar() noexcept(T::value);
+};
+
+template
+constexpr bool is_noexcept_function(void(Args...) noexcept(is_noexcept)) 
noexcept {
+return is_noexcept;
+}
+
+template
+constexpr bool is_noexcept_member_function(void(S::*)(Args...) 
noexcept(is_noexcept)) noexcept {
+return is_noexcept;
+}
+
+static_assert(is_noexcept_function(foo));
diff --git a/gcc/testsuite/g++.dg/cpp1z/constexpr-lambda26.C 
b/gcc/testsuite/g++.dg/cpp1z/constexpr-lambda26.C
new file mode 100644
index 000..d6c8bae525f
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/constexpr-lambda26.C
@@ -0,0 +1,13 @@
+// PR c++/87765
+// { dg-do compile { target c++17 } }
+// { dg-ice "cxx_eval_constant_expression" }
+
+template 
+using foo = int;
+
+struct A {
+  constexpr int bar() const { return 42; }
+};
+
+void baz(A a) {
+  [=](auto c) { return foo { }; }; }
diff --git a/gcc/testsuite/g++.dg/cpp2a/nontype-class39.C 
b/gcc/testsuite/g++.dg/cpp2a/nontype-class39.C
new file mode 100644
index 000..f5f79a71ec2
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/nontype-class39.C
@@ -0,0 +1,12 @@
+// PR c++/89565
+// { dg-do compile { target c++20 } }
+// { dg-ice "resolve_args" }
+
+template 
+struct N{};
+
+template 
+struct S {};
+
+template 
+using NS = S;

base-commit: 8b9a92f794b8ad8011e6beb11a609efa635c4600
-- 
2.28.0



Re: [PATCH] Better __ashlDI3, __ashrDI3 and __lshrDI3 functions, plus fixed __bswapsi2 function

2020-11-10 Thread Stefan Kanthak
Eric Botcazou  wrote:

>> The implementation of the __ashlDI3(), __ashrDI3() and __lshrDI3() functions
>> is rather bad, it yields bad machine code at least on i386 and AMD64. Since
>> GCC knows how to shift integers twice the register size these functions can
>> be written as one-liners.
> 
> These functions are precisely meant to be used when GCC cannot do that.

On which processor(s) is GCC unable to generate code for DWtype shifts?

JFTR: if GCC were not able to generate code for DWtype addition and subtraction
  it would also need __[u]addDI3() and __[u]subDI3() functions ... which
  are but missing from libgcc.a

Stefan Kanthak


[committed] libstdc++: Fix more unspecified comparisons to null pointer [PR 97415]

2020-11-10 Thread Jonathan Wakely via Gcc-patches
This adds some more null checks to avoid a relational comparison with a
null pointer, similar to 78198b6021a9695054dab039340202170b88423c.

libstdc++-v3/ChangeLog:

PR libstdc++/97415
* include/std/sstream (basic_stringbuf::_M_update_egptr)
(basic_stringbuf::__xfer_bufptrs::__xfer_bufptrs): Check for
null before comparing pointers.

Tested powerpc64le-linux. Committed to trunk.

commit ced70ebaa372945ec8d73703d81e4a10d6d51c9b
Author: Jonathan Wakely 
Date:   Tue Nov 10 15:46:02 2020

libstdc++: Fix more unspecified comparisons to null pointer [PR 97415]

This adds some more null checks to avoid a relational comparison with a
null pointer, similar to 78198b6021a9695054dab039340202170b88423c.

libstdc++-v3/ChangeLog:

PR libstdc++/97415
* include/std/sstream (basic_stringbuf::_M_update_egptr)
(basic_stringbuf::__xfer_bufptrs::__xfer_bufptrs): Check for
null before comparing pointers.

diff --git a/libstdc++-v3/include/std/sstream b/libstdc++-v3/include/std/sstream
index 437e2ba2a5f8..9c50e4e83281 100644
--- a/libstdc++-v3/include/std/sstream
+++ b/libstdc++-v3/include/std/sstream
@@ -357,13 +357,16 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
   void
   _M_update_egptr()
   {
-   const bool __testin = _M_mode & ios_base::in;
-   if (this->pptr() && this->pptr() > this->egptr())
+   if (char_type* __pptr = this->pptr())
  {
-   if (__testin)
- this->setg(this->eback(), this->gptr(), this->pptr());
-   else
- this->setg(this->pptr(), this->pptr(), this->pptr());
+   char_type* __egptr = this->egptr();
+   if (!__egptr || __pptr > __egptr)
+ {
+   if (_M_mode & ios_base::in)
+ this->setg(this->eback(), this->gptr(), __pptr);
+   else
+ this->setg(__pptr, __pptr, __pptr);
+ }
  }
   }
 
@@ -396,7 +399,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
  _M_poff[0] = __from.pbase() - __str;
  _M_poff[1] = __from.pptr() - __from.pbase();
  _M_poff[2] = __from.epptr() - __str;
- if (__from.pptr() > __end)
+ if (!__end || __from.pptr() > __end)
__end = __from.pptr();
}
 


[committed] libstdc++: Add remaining C++20 additions to [P0408R7]

2020-11-10 Thread Jonathan Wakely via Gcc-patches
This adds the new overloads of basic_stringbuf::str, and the
corresponding overloads to basic_istringstream, basic_ostringstream and
basic_stringstream.

libstdc++-v3/ChangeLog:

* config/abi/pre/gnu.ver (GLIBCXX_3.4.21): Tighten patterns.
(GLIBCXX_3.4.29): Export new symbols.
* include/bits/alloc_traits.h (__allocator_like): New concept.
* include/std/sstream (basic_stringbuf::swap): Add exception
specification.
(basic_stringbuf::str() const): Add ref-qualifier. Use new
_M_high_mark function.
(basic_stringbuf::str(const SAlloc&) const): Define new function.
(basic_stringbuf::str() &&): Likewise.
(basic_stringbuf::str(const basic_string&)):
Likewise.
(basic_stringbuf::str(basic_string&&)): Likewise.
(basic_stringbuf::view() const): Use _M_high_mark.
(basic_istringstream::str, basic_ostringstream::str)
(basic_stringstream::str): Define new overloads.
* src/c++20/sstream-inst.cc (basic_stringbuf::str)
(basic_istringstream::str, basic_ostringstream::str)
(basic_stringstream::str): Explicit instantiation definitions
for new overloads.
* testsuite/27_io/basic_istringstream/view/char/1.cc: Add more
checks.
* testsuite/27_io/basic_istringstream/view/wchar_t/1.cc:
Likewise.
* testsuite/27_io/basic_ostringstream/view/char/1.cc:
Likewise.
* testsuite/27_io/basic_ostringstream/view/wchar_t/1.cc:
Likewise.
* testsuite/27_io/basic_stringstream/view/char/1.cc:
Likewise.
* testsuite/27_io/basic_stringstream/view/wchar_t/1.cc:
Likewise.
* testsuite/27_io/basic_istringstream/str/char/2.cc: New test.
* testsuite/27_io/basic_istringstream/str/wchar_t/2.cc: New test.
* testsuite/27_io/basic_ostringstream/str/char/3.cc: New test.
* testsuite/27_io/basic_ostringstream/str/wchar_t/3.cc: New test.
* testsuite/27_io/basic_stringbuf/str/char/4.cc: New test.
* testsuite/27_io/basic_stringbuf/str/wchar_t/4.cc: New test.
* testsuite/27_io/basic_stringstream/str/char/5.cc: New test.
* testsuite/27_io/basic_stringstream/str/wchar_t/5.cc.cc: New test.

Tested powerpc64le-linux. Committed to trunk.

commit 95cb0fc8c51841cc6a0e51490cb3769eb80fa34c
Author: Jonathan Wakely 
Date:   Tue Nov 10 15:57:04 2020

libstdc++: Add remaining C++20 additions to  [P0408R7]

This adds the new overloads of basic_stringbuf::str, and the
corresponding overloads to basic_istringstream, basic_ostringstream and
basic_stringstream.

libstdc++-v3/ChangeLog:

* config/abi/pre/gnu.ver (GLIBCXX_3.4.21): Tighten patterns.
(GLIBCXX_3.4.29): Export new symbols.
* include/bits/alloc_traits.h (__allocator_like): New concept.
* include/std/sstream (basic_stringbuf::swap): Add exception
specification.
(basic_stringbuf::str() const): Add ref-qualifier. Use new
_M_high_mark function.
(basic_stringbuf::str(const SAlloc&) const): Define new function.
(basic_stringbuf::str() &&): Likewise.
(basic_stringbuf::str(const basic_string&)):
Likewise.
(basic_stringbuf::str(basic_string&&)): Likewise.
(basic_stringbuf::view() const): Use _M_high_mark.
(basic_istringstream::str, basic_ostringstream::str)
(basic_stringstream::str): Define new overloads.
* src/c++20/sstream-inst.cc (basic_stringbuf::str)
(basic_istringstream::str, basic_ostringstream::str)
(basic_stringstream::str): Explicit instantiation definitions
for new overloads.
* testsuite/27_io/basic_istringstream/view/char/1.cc: Add more
checks.
* testsuite/27_io/basic_istringstream/view/wchar_t/1.cc:
Likewise.
* testsuite/27_io/basic_ostringstream/view/char/1.cc:
Likewise.
* testsuite/27_io/basic_ostringstream/view/wchar_t/1.cc:
Likewise.
* testsuite/27_io/basic_stringstream/view/char/1.cc:
Likewise.
* testsuite/27_io/basic_stringstream/view/wchar_t/1.cc:
Likewise.
* testsuite/27_io/basic_istringstream/str/char/2.cc: New test.
* testsuite/27_io/basic_istringstream/str/wchar_t/2.cc: New test.
* testsuite/27_io/basic_ostringstream/str/char/3.cc: New test.
* testsuite/27_io/basic_ostringstream/str/wchar_t/3.cc: New test.
* testsuite/27_io/basic_stringbuf/str/char/4.cc: New test.
* testsuite/27_io/basic_stringbuf/str/wchar_t/4.cc: New test.
* testsuite/27_io/basic_stringstream/str/char/5.cc: New test.
* testsuite/27_io/basic_stringstream/str/wchar_t/5.cc.cc: New test.

diff --git a/libstdc++-v3/config/abi/pre/gnu.ver 

[committed] libstdc++: Reorder constructors in

2020-11-10 Thread Jonathan Wakely via Gcc-patches
This groups all the constructors together, consistent with the synopses
in the C++20 standard.

libstdc++-v3/ChangeLog:

* include/std/sstream (basic_stringbug, basic_istringstream)
(basic_ostringstream, basic_stringstream): Reorder C++20
constructors to be declared next to other constructors.

Tested powerpc64le-linux. Committed to trunk.

commit f7c41c572bbc16d852104515e506936d447debbe
Author: Jonathan Wakely 
Date:   Tue Nov 10 19:12:03 2020

libstdc++: Reorder constructors in 

This groups all the constructors together, consistent with the synopses
in the C++20 standard.

libstdc++-v3/ChangeLog:

* include/std/sstream (basic_stringbug, basic_istringstream)
(basic_ostringstream, basic_stringstream): Reorder C++20
constructors to be declared next to other constructors.

diff --git a/libstdc++-v3/include/std/sstream b/libstdc++-v3/include/std/sstream
index 8cddda297017..d7200ab6ed88 100644
--- a/libstdc++-v3/include/std/sstream
+++ b/libstdc++-v3/include/std/sstream
@@ -149,37 +149,6 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
   : basic_stringbuf(std::move(__rhs), __xfer_bufptrs(__rhs, this))
   { __rhs._M_sync(const_cast(__rhs._M_string.data()), 0, 0); }
 
-  // 27.8.2.2 Assign and swap:
-
-  basic_stringbuf&
-  operator=(const basic_stringbuf&) = delete;
-
-  basic_stringbuf&
-  operator=(basic_stringbuf&& __rhs)
-  {
-   __xfer_bufptrs __st{__rhs, this};
-   const __streambuf_type& __base = __rhs;
-   __streambuf_type::operator=(__base);
-   this->pubimbue(__rhs.getloc());
-   _M_mode = __rhs._M_mode;
-   _M_string = std::move(__rhs._M_string);
-   __rhs._M_sync(const_cast(__rhs._M_string.data()), 0, 0);
-   return *this;
-  }
-
-  void
-  swap(basic_stringbuf& __rhs) noexcept(_Noexcept_swap::value)
-  {
-   __xfer_bufptrs __l_st{*this, std::__addressof(__rhs)};
-   __xfer_bufptrs __r_st{__rhs, this};
-   __streambuf_type& __base = __rhs;
-   __streambuf_type::swap(__base);
-   __rhs.pubimbue(this->pubimbue(__rhs.getloc()));
-   std::swap(_M_mode, __rhs._M_mode);
-   std::swap(_M_string, __rhs._M_string); // XXX not exception safe
-  }
-#endif // C++11
-
 #if __cplusplus > 201703L && _GLIBCXX_USE_CXX11_ABI
   explicit
   basic_stringbuf(const allocator_type& __a)
@@ -226,7 +195,38 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
 
   allocator_type get_allocator() const noexcept
   { return _M_string.get_allocator(); }
-#endif
+#endif // C++20
+
+  // 27.8.2.2 Assign and swap:
+
+  basic_stringbuf&
+  operator=(const basic_stringbuf&) = delete;
+
+  basic_stringbuf&
+  operator=(basic_stringbuf&& __rhs)
+  {
+   __xfer_bufptrs __st{__rhs, this};
+   const __streambuf_type& __base = __rhs;
+   __streambuf_type::operator=(__base);
+   this->pubimbue(__rhs.getloc());
+   _M_mode = __rhs._M_mode;
+   _M_string = std::move(__rhs._M_string);
+   __rhs._M_sync(const_cast(__rhs._M_string.data()), 0, 0);
+   return *this;
+  }
+
+  void
+  swap(basic_stringbuf& __rhs) noexcept(_Noexcept_swap::value)
+  {
+   __xfer_bufptrs __l_st{*this, std::__addressof(__rhs)};
+   __xfer_bufptrs __r_st{__rhs, this};
+   __streambuf_type& __base = __rhs;
+   __streambuf_type::swap(__base);
+   __rhs.pubimbue(this->pubimbue(__rhs.getloc()));
+   std::swap(_M_mode, __rhs._M_mode);
+   std::swap(_M_string, __rhs._M_string); // XXX not exception safe
+  }
+#endif // C++11
 
   // Getters and setters:
 
@@ -282,7 +282,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
else
  return _M_string;
   }
-#endif
+#endif // C++20
 
   /**
*  @brief  Setting a new buffer.
@@ -513,7 +513,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
   _M_mode(__rhs._M_mode), _M_string(std::move(__rhs._M_string), __a)
   { }
 #endif
-#endif
+#endif // C++11
 };
 
 
@@ -623,27 +623,6 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
   _M_stringbuf(std::move(__rhs._M_stringbuf))
   { __istream_type::set_rdbuf(&_M_stringbuf); }
 
-  // 27.8.3.2 Assign and swap:
-
-  basic_istringstream&
-  operator=(const basic_istringstream&) = delete;
-
-  basic_istringstream&
-  operator=(basic_istringstream&& __rhs)
-  {
-   __istream_type::operator=(std::move(__rhs));
-   _M_stringbuf = std::move(__rhs._M_stringbuf);
-   return *this;
-  }
-
-  void
-  swap(basic_istringstream& __rhs)
-  {
-   __istream_type::swap(__rhs);
-   _M_stringbuf.swap(__rhs._M_stringbuf);
-  }
-#endif
-
 #if __cplusplus > 201703L && _GLIBCXX_USE_CXX11_ABI
   basic_istringstream(ios_base::openmode __mode, const allocator_type& __a)
   : __istream_type(), _M_stringbuf(__mode | ios_base::in, __a)
@@ -674,7 +653,28 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
ios_base::openmode __mode = ios_base::in)
   

[PATCH v2] C-family : Add attribute 'unavailable'.

2020-11-10 Thread Iain Sandoe
Hi Jospeh,

Joseph Myers  wrote:

> This patch seems to be missing documentation for the new attribute in 
> extend.texi.

Apologies, for that omission, revised patch includes the documentation and
also addresses Richi’s comments.

documentation patch tested with “make pdf” and visual inspection of the output.

OK now?
thanks
Iain

—— commit message.

If an interface is marked 'deprecated' then, presumably, at some point it
will be withdrawn and no longer available.  The 'unavailable' attribute
makes it possible to mark up interfaces to indicate this status.  It is used
quite extensively in some codebases where a single set of headers can be used
to permit code generation for multiple system versions.

>From a configuration perspective, it also allows a compile test to determine
that an interface is missing - rather than requiring a link test.

The implementation follows the pattern of attribute deprecated, but produces
an error (where deprecation produces a warning).

This attribute has been implemented in clang for some years.

gcc/c-family/ChangeLog:

* c-attribs.c (handle_unavailable_attribute): New.

gcc/c/ChangeLog:

* c-decl.c (enum deprecated_states): Add unavailable state.
(merge_decls): Copy unavailability.
(quals_from_declspecs): Handle unavailable case.
(start_decl): Amend the logic handling suppression of nested
deprecation states to include unavailability.
(smallest_type_quals_location): Amend comment.
(grokdeclarator): Handle the unavailable deprecation state.
(declspecs_add_type): Set TREE_UNAVAILABLE from the decl specs.
* c-tree.h (struct c_declspecs): Add unavailable_p.
* c-typeck.c (build_component_ref): Handle unavailability.
(build_external_ref): Likewise.

gcc/cp/ChangeLog:

* call.c (build_over_call): Handle unavailable state in addition to
deprecation.
* class.c (type_build_ctor_call): Likewise.
(type_build_dtor_call): Likewise.
* cp-tree.h: Rename cp_warn_deprecated_use to
cp_handle_deprecated_or_unavailable.
* decl.c (duplicate_decls): Merge unavailability.
(grokdeclarator): Handle unavailability in addition to deprecation.
(type_is_unavailable): New.
(grokparms): Handle unavailability in addition to deprecation.
* decl.h (enum deprecated_states): Add
UNAVAILABLE_DEPRECATED_SUPPRESS.
* decl2.c (cplus_decl_attributes): Propagate unavailability to
templates.
(cp_warn_deprecated_use): Rename to ...
(cp_handle_deprecated_or_unavailable): ... this and amend to handle
the unavailable case. It remains a warning in the case of deprecation
but becomes an error in the case of unavailability.
(cp_warn_deprecated_use_scopes): Handle unavailability.
(mark_used): Likewise.
* parser.c (cp_parser_template_name): Likewise.
(cp_parser_template_argument): Likewise.
(cp_parser_parameter_declaration_list): Likewise.
* typeck.c (build_class_member_access_expr): Likewise.
(finish_class_member_access_expr): Likewise.
* typeck2.c (build_functional_cast_1): Likewise.

gcc/ChangeLog:

* doc/extend.texi: Document unavailable attribute.
* print-tree.c (print_node): Handle unavailable attribute.
* tree-core.h (struct tree_base): Add a bit to carry unavailability.
* tree.c (error_unavailable_use): New.
* tree.h (TREE_UNAVAILABLE): New.
(error_unavailable_use): New.

gcc/objc/ChangeLog:

* objc-act.c (objc_add_property_declaration): Register unavailable
attribute.
(maybe_make_artificial_property_decl): Set available.
(objc_maybe_build_component_ref): Generalise to the method prototype
to count availability.
(objc_build_class_component_ref): Likewise.
(build_private_template): Likewise.
(objc_decl_method_attributes): Handle unavailable attribute.
(lookup_method_in_hash_lists): Amend comments.
(objc_finish_message_expr): Handle unavailability in addition to
deprecation.
(start_class): Likewise.
(finish_class): Likewise.
(lookup_protocol): Likewise.
(objc_declare_protocol): Likewise.
(start_protocol): Register unavailable attribute.
(really_start_method): Likewise.
(objc_gimplify_property_ref): Emit error on encountering an
unavailable entity (and a warning for a deprecated one).

gcc/testsuite/ChangeLog:

* g++.dg/ext/attr-unavailable-1.C: New test.
* g++.dg/ext/attr-unavailable-2.C: New test.
* g++.dg/ext/attr-unavailable-3.C: New test.
* g++.dg/ext/attr-unavailable-4.C: New test.
* g++.dg/ext/attr-unavailable-5.C: New test.
* g++.dg/ext/attr-unavailable-6.C: New test.
* g++.dg/ext/attr-unavailable-7.C: New test.
* g++.dg/ext/attr-unavailable-8.C: New 

Re: [PATCH] c++: Improve static_assert diagnostic [PR97518]

2020-11-10 Thread Jason Merrill via Gcc-patches

On 11/10/20 2:28 PM, Marek Polacek wrote:

On Tue, Nov 10, 2020 at 02:15:56PM -0500, Jason Merrill wrote:

On 11/10/20 1:59 PM, Marek Polacek wrote:

On Tue, Nov 10, 2020 at 11:32:04AM -0500, Jason Merrill wrote:

On 11/9/20 7:21 PM, Marek Polacek wrote:

Currently, when a static_assert fails, we only say "static assertion failed".
It would be more useful if we could also print the expression that
evaluated to false; this is especially useful when the condition uses
template parameters.  Consider the motivating example, in which we have
this line:

 static_assert(is_same::value);

if this fails, the user has to play dirty games to get the compiler to
print the template arguments.  With this patch, we say:

 static assertion failed due to requirement 'is_same::value'


I'd rather avoid the word "requirement" here, to avoid confusion with
Concepts.

Maybe have the usual failed error, and if we're showing the expression, have
a second inform to say e.g. "%qE evaluates to false"?


Works for me.


Also, if we've narrowed the problem down to a particular subexpression,
let's only print that one.


Done.


which I think is much better.  However, always printing the condition that
evaluated to 'false' wouldn't be very useful: e.g. noexcept(fn) is
always parsed to true/false, so we would say "static assertion failed due
to requirement 'false'" which doesn't help.  So I wound up only printing
the condition when it was instantiation-dependent, that is, we called
finish_static_assert from tsubst_expr.

Moreover, this patch also improves the diagnostic when the condition
consists of a logical AND.  Say you have something like this:

 static_assert(fn1() && fn2() && fn3() && fn4() && fn5());

where fn4() evaluates to false and the other ones to true.  Highlighting
the whole thing is not that helpful because it won't say which clause
evaluated to false.  With the find_failing_clause tweak in this patch
we emit:

 error: static assertion failed
   6 | static_assert(fn1() && fn2() && fn3() && fn4() && fn5());
 |  ~~~^~

so you know right away what's going on.  Unfortunately, when you combine
both things, that is, have an instantiation-dependent expr and && in
a static_assert, we can't yet quite point to the clause that failed.  It
is because when we tsubstitute something like is_same::value, we
generate a VAR_DECL that doesn't have any location.  It would be awesome
if we could wrap it with a location wrapper, but I didn't see anything
obvious.


Hmm, I vaguely remember that at first we were using location wrappers less
in templates, but I thought that was fixed.  I don't see anything setting
suppress_location_wrappers, and it looks like tsubst_copy_and_build should
preserve a location wrapper.


The problem here is that tsubst_qualified_id produces a VAR_DECL and for those
CAN_HAVE_LOCATION_P is false.


Ah, then perhaps tsubst_qualified_id should call maybe_wrap_with_location to
preserve the location of the SCOPE_REF.


The SCOPE_REF's location is fine, we just throw it away and return a VAR_DECL.


Yes, I'm saying that tsubst_qualified_id should extract the location 
from the SCOPE_REF and pass it to maybe_wrap_with_location to give the 
VAR_DECL a location wrapper.


Jason



Re: [PATCH] c++: Improve static_assert diagnostic [PR97518]

2020-11-10 Thread Marek Polacek via Gcc-patches
On Tue, Nov 10, 2020 at 02:15:56PM -0500, Jason Merrill wrote:
> On 11/10/20 1:59 PM, Marek Polacek wrote:
> > On Tue, Nov 10, 2020 at 11:32:04AM -0500, Jason Merrill wrote:
> > > On 11/9/20 7:21 PM, Marek Polacek wrote:
> > > > Currently, when a static_assert fails, we only say "static assertion 
> > > > failed".
> > > > It would be more useful if we could also print the expression that
> > > > evaluated to false; this is especially useful when the condition uses
> > > > template parameters.  Consider the motivating example, in which we have
> > > > this line:
> > > > 
> > > > static_assert(is_same::value);
> > > > 
> > > > if this fails, the user has to play dirty games to get the compiler to
> > > > print the template arguments.  With this patch, we say:
> > > > 
> > > > static assertion failed due to requirement 'is_same > > > int>::value'
> > > 
> > > I'd rather avoid the word "requirement" here, to avoid confusion with
> > > Concepts.
> > > 
> > > Maybe have the usual failed error, and if we're showing the expression, 
> > > have
> > > a second inform to say e.g. "%qE evaluates to false"?
> > 
> > Works for me.
> > 
> > > Also, if we've narrowed the problem down to a particular subexpression,
> > > let's only print that one.
> > 
> > Done.
> > 
> > > > which I think is much better.  However, always printing the condition 
> > > > that
> > > > evaluated to 'false' wouldn't be very useful: e.g. noexcept(fn) is
> > > > always parsed to true/false, so we would say "static assertion failed 
> > > > due
> > > > to requirement 'false'" which doesn't help.  So I wound up only printing
> > > > the condition when it was instantiation-dependent, that is, we called
> > > > finish_static_assert from tsubst_expr.
> > > > 
> > > > Moreover, this patch also improves the diagnostic when the condition
> > > > consists of a logical AND.  Say you have something like this:
> > > > 
> > > > static_assert(fn1() && fn2() && fn3() && fn4() && fn5());
> > > > 
> > > > where fn4() evaluates to false and the other ones to true.  Highlighting
> > > > the whole thing is not that helpful because it won't say which clause
> > > > evaluated to false.  With the find_failing_clause tweak in this patch
> > > > we emit:
> > > > 
> > > > error: static assertion failed
> > > >   6 | static_assert(fn1() && fn2() && fn3() && fn4() && fn5());
> > > > |  ~~~^~
> > > > 
> > > > so you know right away what's going on.  Unfortunately, when you combine
> > > > both things, that is, have an instantiation-dependent expr and && in
> > > > a static_assert, we can't yet quite point to the clause that failed.  It
> > > > is because when we tsubstitute something like is_same::value, we
> > > > generate a VAR_DECL that doesn't have any location.  It would be awesome
> > > > if we could wrap it with a location wrapper, but I didn't see anything
> > > > obvious.
> > > 
> > > Hmm, I vaguely remember that at first we were using location wrappers less
> > > in templates, but I thought that was fixed.  I don't see anything setting
> > > suppress_location_wrappers, and it looks like tsubst_copy_and_build should
> > > preserve a location wrapper.
> > 
> > The problem here is that tsubst_qualified_id produces a VAR_DECL and for 
> > those
> > CAN_HAVE_LOCATION_P is false.
> 
> Ah, then perhaps tsubst_qualified_id should call maybe_wrap_with_location to
> preserve the location of the SCOPE_REF.

The SCOPE_REF's location is fine, we just throw it away and return a VAR_DECL.

I think I'll have to extend op_location_t and pass something reasonable from
tsubst_copy_and_build/TRUTH_AND_EXPR to cp_build_binary_op, and use it to set
the operands' locations after decay_conversion :(.

> > + /* See if we can find which clause was failing (for logical AND).  */
> > + tree bad = find_failing_clause (orig_condition);
> > + /* If not, or its location is unusable, fall back to the previous
> > +location.  */
> > + location_t cloc = location;
> > + if (cp_expr_location (bad) != UNKNOWN_LOCATION)
> > +   cloc = cp_expr_location (bad);
> > +
> > /* Report the error. */
> >   if (len == 0)
> > -error ("static assertion failed");
> > +   error_at (cloc, "static assertion failed");
> >   else
> > -error ("static assertion failed: %s",
> > -  TREE_STRING_POINTER (message));
> > +   error_at (cloc, "static assertion failed: %s",
> > + TREE_STRING_POINTER (message));
> > + if (show_expr_p)
> > +   inform (cloc, "%qE evaluates to false",
> > +   /* Nobody wants to see the artifical (bool) cast.  */
> 
> "artificial"
> 
> OK with that typo fixed.

Thanks a lot.

Marek



Re: [PATCH] c++: Improve static_assert diagnostic [PR97518]

2020-11-10 Thread Jason Merrill via Gcc-patches

On 11/10/20 1:59 PM, Marek Polacek wrote:

On Tue, Nov 10, 2020 at 11:32:04AM -0500, Jason Merrill wrote:

On 11/9/20 7:21 PM, Marek Polacek wrote:

Currently, when a static_assert fails, we only say "static assertion failed".
It would be more useful if we could also print the expression that
evaluated to false; this is especially useful when the condition uses
template parameters.  Consider the motivating example, in which we have
this line:

static_assert(is_same::value);

if this fails, the user has to play dirty games to get the compiler to
print the template arguments.  With this patch, we say:

static assertion failed due to requirement 'is_same::value'


I'd rather avoid the word "requirement" here, to avoid confusion with
Concepts.

Maybe have the usual failed error, and if we're showing the expression, have
a second inform to say e.g. "%qE evaluates to false"?


Works for me.


Also, if we've narrowed the problem down to a particular subexpression,
let's only print that one.


Done.


which I think is much better.  However, always printing the condition that
evaluated to 'false' wouldn't be very useful: e.g. noexcept(fn) is
always parsed to true/false, so we would say "static assertion failed due
to requirement 'false'" which doesn't help.  So I wound up only printing
the condition when it was instantiation-dependent, that is, we called
finish_static_assert from tsubst_expr.

Moreover, this patch also improves the diagnostic when the condition
consists of a logical AND.  Say you have something like this:

static_assert(fn1() && fn2() && fn3() && fn4() && fn5());

where fn4() evaluates to false and the other ones to true.  Highlighting
the whole thing is not that helpful because it won't say which clause
evaluated to false.  With the find_failing_clause tweak in this patch
we emit:

error: static assertion failed
  6 | static_assert(fn1() && fn2() && fn3() && fn4() && fn5());
|  ~~~^~

so you know right away what's going on.  Unfortunately, when you combine
both things, that is, have an instantiation-dependent expr and && in
a static_assert, we can't yet quite point to the clause that failed.  It
is because when we tsubstitute something like is_same::value, we
generate a VAR_DECL that doesn't have any location.  It would be awesome
if we could wrap it with a location wrapper, but I didn't see anything
obvious.


Hmm, I vaguely remember that at first we were using location wrappers less
in templates, but I thought that was fixed.  I don't see anything setting
suppress_location_wrappers, and it looks like tsubst_copy_and_build should
preserve a location wrapper.


The problem here is that tsubst_qualified_id produces a VAR_DECL and for those
CAN_HAVE_LOCATION_P is false.


Ah, then perhaps tsubst_qualified_id should call 
maybe_wrap_with_location to preserve the location of the SCOPE_REF.



If the user writes (bool) is_same::value,
this works well.  Maybe we could create an artificial (bool) for clauses of
the logical AND operator; decay_conversion in cp_build_binary_op will likely
do it anyway, it just doesn't have any location: cp_build_binary_op takes
an op_location_t, but that doesn't track locations of the operands.  Maybe
some hacks in tsubst_copy_and_build/TRUTH_AND_EXPR would help, I'm not sure.
  

Further tweak could be to print the failing clause of the condition if
possible, whether or not it was instantiation-dependent.


If not instantiation-dependent, underlining the clause seems sufficient.


Ack.
  
Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?


-- >8 --
Currently, when a static_assert fails, we only say "static assertion failed".
It would be more useful if we could also print the expression that
evaluated to false; this is especially useful when the condition uses
template parameters.  Consider the motivating example, in which we have
this line:

   static_assert(is_same::value);

if this fails, the user has to play dirty games to get the compiler to
print the template arguments.  With this patch, we say:

   error: static assertion failed
   note: 'is_same::value' evaluates to false

which I think is much better.  However, always printing the condition that
evaluated to 'false' wouldn't be very useful: e.g. noexcept(fn) is
always parsed to true/false, so we would say "'false' evaluates to false"
which doesn't help.  So I wound up only printing the condition when it was
instantiation-dependent, that is, we called finish_static_assert from
tsubst_expr.

Moreover, this patch also improves the diagnostic when the condition
consists of a logical AND.  Say you have something like this:

   static_assert(fn1() && fn2() && fn3() && fn4() && fn5());

where fn4() evaluates to false and the other ones to true.  Highlighting
the whole thing is not that helpful because it won't say which clause
evaluated to false.  With the find_failing_clause tweak in this patch
we emit:

   error: static assertion failed
   

Re: RFC: Monitoring old PRs, new dg directives [v2]

2020-11-10 Thread Marek Polacek via Gcc-patches
On Tue, Nov 10, 2020 at 03:15:00PM +0100, Thomas Schwinge wrote:
> Hi!
> 
> On 2020-08-10T17:30:21-0400, Marek Polacek via Gcc-patches 
>  wrote:
> > This patch adds a new DejaGNU directive, dg-ice, as outlined in the
> > proposal here:
> > https://gcc.gnu.org/pipermail/gcc-patches/2020-July/550913.html
> >
> > It means that it's expected that the compiler crashes with an internal
> > compiler error when compiling test with such a directive.
> 
> Thanks, I find this useful.

Glad you find it useful!

> So I have a testcase that currently ICEs...  ;-)
> 
> spawn -ignore SIGHUP [xgcc]
> during GIMPLE pass: omplower
> In file included from [...]:4:
> [...]: In function 'f_data':
> [...]:41:10: internal compiler error: in lower_omp_target, at 
> omp-low.c:11890
> [backtrace]
> Please submit a full bug report,
> with preprocessed source if appropriate.
> Please include the complete backtrace with any bug report.
> See  for instructions.
> compiler exited with status 1
> 
> I put '{ dg-ice }' into it, and do get the expected:
> 
> XFAIL: [...] (internal compiler error)'
> 
> But I also get an unexpected:
> 
> FAIL: [...] (test for excess errors)'
> 
> That's because of:
> 
> Excess errors:
> during GIMPLE pass: omplower
> 
> I can easily '{ dg-prune-output "during GIMPLE pass: omplower" }', of
> course, or should this lead-in be pruned internally, automatically?

I think prune_ices should prune this automatically.  Want me to submit a patch
for that?

> And then, this is a '{ dg-do run }' testcase, I see:
> 
> UNRESOLVED: 
> libgomp.oacc-c/../libgomp.oacc-c-c++-common/declare-vla-kernels-decompose.c 
> -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0  
> compilation failed to produce executable
> 
> Given '{ dg-ice }', it's obvious that we don't get an executable to run.

Yup.

> I thought that maybe '{ dg-ice }' testcases should automatically be
> demoted into '{ dg-do compile }', but that would be wrong for link-time
> ICEs.  But can we "expect" the UNRESOLVED, and turn that into another
> XFAIL?  Aha, actually I see that for '{ dg-do link }', this seems to
> behave as expected (no UNRESOLVED), so we might demote all '{ dg-ice }'
> testcases from '{ dg-do run }' to '{ dg-do link }', if that makes sense?

I think the dg-ice directive shouldn't change whether the test is
dg-do compile/run/link/assemble or something else.  Probably best to
use { dg-do link } if that works, and leave a comment that after fixing
the ICE the test should be turned into dg-do run, if that is what's
desired.

Marek



Re: [PATCH] c++: Improve static_assert diagnostic [PR97518]

2020-11-10 Thread Marek Polacek via Gcc-patches
On Tue, Nov 10, 2020 at 11:32:04AM -0500, Jason Merrill wrote:
> On 11/9/20 7:21 PM, Marek Polacek wrote:
> > Currently, when a static_assert fails, we only say "static assertion 
> > failed".
> > It would be more useful if we could also print the expression that
> > evaluated to false; this is especially useful when the condition uses
> > template parameters.  Consider the motivating example, in which we have
> > this line:
> > 
> >static_assert(is_same::value);
> > 
> > if this fails, the user has to play dirty games to get the compiler to
> > print the template arguments.  With this patch, we say:
> > 
> >static assertion failed due to requirement 'is_same::value'
> 
> I'd rather avoid the word "requirement" here, to avoid confusion with
> Concepts.
> 
> Maybe have the usual failed error, and if we're showing the expression, have
> a second inform to say e.g. "%qE evaluates to false"?

Works for me.

> Also, if we've narrowed the problem down to a particular subexpression,
> let's only print that one.

Done.

> > which I think is much better.  However, always printing the condition that
> > evaluated to 'false' wouldn't be very useful: e.g. noexcept(fn) is
> > always parsed to true/false, so we would say "static assertion failed due
> > to requirement 'false'" which doesn't help.  So I wound up only printing
> > the condition when it was instantiation-dependent, that is, we called
> > finish_static_assert from tsubst_expr.
> > 
> > Moreover, this patch also improves the diagnostic when the condition
> > consists of a logical AND.  Say you have something like this:
> > 
> >static_assert(fn1() && fn2() && fn3() && fn4() && fn5());
> > 
> > where fn4() evaluates to false and the other ones to true.  Highlighting
> > the whole thing is not that helpful because it won't say which clause
> > evaluated to false.  With the find_failing_clause tweak in this patch
> > we emit:
> > 
> >error: static assertion failed
> >  6 | static_assert(fn1() && fn2() && fn3() && fn4() && fn5());
> >|  ~~~^~
> > 
> > so you know right away what's going on.  Unfortunately, when you combine
> > both things, that is, have an instantiation-dependent expr and && in
> > a static_assert, we can't yet quite point to the clause that failed.  It
> > is because when we tsubstitute something like is_same::value, we
> > generate a VAR_DECL that doesn't have any location.  It would be awesome
> > if we could wrap it with a location wrapper, but I didn't see anything
> > obvious.
> 
> Hmm, I vaguely remember that at first we were using location wrappers less
> in templates, but I thought that was fixed.  I don't see anything setting
> suppress_location_wrappers, and it looks like tsubst_copy_and_build should
> preserve a location wrapper.

The problem here is that tsubst_qualified_id produces a VAR_DECL and for those
CAN_HAVE_LOCATION_P is false.  If the user writes (bool) is_same::value,
this works well.  Maybe we could create an artificial (bool) for clauses of
the logical AND operator; decay_conversion in cp_build_binary_op will likely
do it anyway, it just doesn't have any location: cp_build_binary_op takes
an op_location_t, but that doesn't track locations of the operands.  Maybe
some hacks in tsubst_copy_and_build/TRUTH_AND_EXPR would help, I'm not sure.
 
> > Further tweak could be to print the failing clause of the condition if
> > possible, whether or not it was instantiation-dependent.
> 
> If not instantiation-dependent, underlining the clause seems sufficient.

Ack.
 
Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
Currently, when a static_assert fails, we only say "static assertion failed".
It would be more useful if we could also print the expression that
evaluated to false; this is especially useful when the condition uses
template parameters.  Consider the motivating example, in which we have
this line:

  static_assert(is_same::value);

if this fails, the user has to play dirty games to get the compiler to
print the template arguments.  With this patch, we say:

  error: static assertion failed
  note: 'is_same::value' evaluates to false

which I think is much better.  However, always printing the condition that
evaluated to 'false' wouldn't be very useful: e.g. noexcept(fn) is
always parsed to true/false, so we would say "'false' evaluates to false"
which doesn't help.  So I wound up only printing the condition when it was
instantiation-dependent, that is, we called finish_static_assert from
tsubst_expr.

Moreover, this patch also improves the diagnostic when the condition
consists of a logical AND.  Say you have something like this:

  static_assert(fn1() && fn2() && fn3() && fn4() && fn5());

where fn4() evaluates to false and the other ones to true.  Highlighting
the whole thing is not that helpful because it won't say which clause
evaluated to false.  With the find_failing_clause tweak in this patch
we emit:

  error: static 

Re: [PATCH 1/3] Refactor copying decl section names

2020-11-10 Thread Jeff Law via Gcc-patches


On 11/12/19 11:28 PM, Strager Neds wrote:
> * gcc/cgraph.h (symtab_node::get_section): Constify.
> (symtab_node::set_section): Declare new overload.
> * gcc/symtab.c (symtab_node::set_section): Define new overload.
> (symtab_node::copy_visibility_from): Use new overload of
> symtab_node::set_section.
> (symtab_node::resolve_alias): Same.
> * gcc/tree.h (set_decl_section_name): Declare new overload.
> * gcc/tree.c (set_decl_section_name): Define new overload.
> * gcc/c/c-decl.c (merge_decls): Use new overload of
> set_decl_section_name.
> * gcc/cp/decl.c (duplicate_decls): Same.
> * gcc/cp/method.c (use_thunk): Same.
> * gcc/cp/optimize.c (maybe_clone_body): Same.
> * gcc/d/decl.cc (finish_thunk): Same.
> * gcc/tree-emutls.c (get_emutls_init_templ_addr): Same.
> * gcc/cgraphclones.c (cgraph_node::create_virtual_clone): Use new
> overload of symtab_node::set_section.
> (cgraph_node::create_version_clone_with_body): Same.
> * gcc/trans-mem.c (ipa_tm_create_version): Same.

I adjusted the ChangeLog, added an entry for the coroutines.cc addition
I made and pushed this to the trunk.


Thanks for your patience.

Jeff



Merge from trunk to gccgo branch

2020-11-10 Thread Ian Lance Taylor via Gcc-patches
I merged trunk revision cf392dbdf17e38026f8e3c0e9af7f5b87f63be56 to
the gccgo branch.

Ian


Re: [PATCH] Better __ashlDI3, __ashrDI3 and __lshrDI3 functions, plus fixed __bswapsi2 function

2020-11-10 Thread Jeff Law via Gcc-patches


On 11/10/20 11:09 AM, Jakub Jelinek via Gcc-patches wrote:
> On Tue, Nov 10, 2020 at 06:59:30PM +0100, Stefan Kanthak via Gcc-patches 
> wrote:
>> The implementation of the __ashlDI3(), __ashrDI3() and __lshrDI3() functions
>> is rather bad, it yields bad machine code at least on i386 and AMD64.
>> Since GCC knows how to shift integers twice the register size these functions
>> can be written as one-liners.
> This looks wrong.  If gcc knows how to do that inline, it will not call the
> out of line function at all.  The functions are there for the cases where
> gcc can't do that.
> So, your patch will shorten the routines on targets where those are never
> called, and completely break them on any other (resulting in infinite
> recursion there).

There's a way to test that.  It's possible the expanders know how to
handle these cases now.  I'd be surprised (as I was with the [u]cmpdi2),
but it's possible.  I can throw it into the tester which will spin all
those pesky embedded targets.  If we get infinite recursion it should
show up quite clearly.


Of course, if the expanders now handle all the cases, then these
routines in libgcc2 are dead code and this is a fairly pointless exercise.


jeff



Re: [PATCH] Better __ashlDI3, __ashrDI3 and __lshrDI3 functions, plus fixed __bswapsi2 function

2020-11-10 Thread Jakub Jelinek via Gcc-patches
On Tue, Nov 10, 2020 at 06:59:30PM +0100, Stefan Kanthak via Gcc-patches wrote:
> The implementation of the __ashlDI3(), __ashrDI3() and __lshrDI3() functions
> is rather bad, it yields bad machine code at least on i386 and AMD64.
> Since GCC knows how to shift integers twice the register size these functions
> can be written as one-liners.
> 
> The implementation of the __bswapsi2() function uses SIGNED instead of
> unsigned mask values; cf. __bswapdi2()

The 0xff00 constant is unsigned, the others are signed, but that doesn't
really matter as the high bits are masked off.

Jakub



Re: [PATCH] Better __ashlDI3, __ashrDI3 and __lshrDI3 functions, plus fixed __bswapsi2 function

2020-11-10 Thread Jakub Jelinek via Gcc-patches
On Tue, Nov 10, 2020 at 06:59:30PM +0100, Stefan Kanthak via Gcc-patches wrote:
> The implementation of the __ashlDI3(), __ashrDI3() and __lshrDI3() functions
> is rather bad, it yields bad machine code at least on i386 and AMD64.
> Since GCC knows how to shift integers twice the register size these functions
> can be written as one-liners.

This looks wrong.  If gcc knows how to do that inline, it will not call the
out of line function at all.  The functions are there for the cases where
gcc can't do that.
So, your patch will shorten the routines on targets where those are never
called, and completely break them on any other (resulting in infinite
recursion there).

Jakub



Re: [PATCH] Better __ashlDI3, __ashrDI3 and __lshrDI3 functions, plus fixed __bswapsi2 function

2020-11-10 Thread Eric Botcazou
> The implementation of the __ashlDI3(), __ashrDI3() and __lshrDI3() functions
> is rather bad, it yields bad machine code at least on i386 and AMD64. Since
> GCC knows how to shift integers twice the register size these functions can
> be written as one-liners.

These functions are precisely meant to be used when GCC cannot do that.

-- 
Eric Botcazou




[PATCH] Better __ashlDI3, __ashrDI3 and __lshrDI3 functions, plus fixed __bswapsi2 function

2020-11-10 Thread Stefan Kanthak via Gcc-patches
The implementation of the __ashlDI3(), __ashrDI3() and __lshrDI3() functions
is rather bad, it yields bad machine code at least on i386 and AMD64.
Since GCC knows how to shift integers twice the register size these functions
can be written as one-liners.

The implementation of the __bswapsi2() function uses SIGNED instead of
unsigned mask values; cf. __bswapdi2()

Stefan Kanthak

libgcc2.diff
Description: Binary data


Re: [PATCH] openmp: Retire nest-var ICV

2020-11-10 Thread Jakub Jelinek via Gcc-patches
On Mon, Nov 09, 2020 at 05:43:20PM +, Kwok Cheung Yeung wrote:
> commit b4feb16f3c84b8f82163a4cbba6a31d55fbb8e5b
> Author: Kwok Cheung Yeung 
> Date:   Mon Nov 9 09:34:39 2020 -0800
> 
> openmp: Retire nest-var ICV for OpenMP 5.0
> 
> This removes the nest-var ICV, expressing nesting in terms of the
> max-active-levels-var ICV instead.
> 
> 2020-11-09  Kwok Cheung Yeung  
> 
>   libgomp/
>   * env.c (gomp_global_icv): Remove nest_var field.
>   (gomp_max_active_levels_var): Initialize to 1.
>   (parse_boolean): Return true on success.
>   (handle_omp_display_env): Express OMP_NESTED in terms of
>   gomp_max_active_levels_var.
>   (initialize_env): Set gomp_max_active_levels_var from
>   OMP_MAX_ACTIVE_LEVELS, OMP_NESTED, OMP_NUM_THREADS and
>   OMP_PROC_BIND.
>   * icv.c (omp_set_nested): Express in terms of
>   gomp_max_active_levels_var.
>   (omp_get_nested): Likewise.
>   * libgomp.h (struct gomp_task_icv): Remove nest_var field.
>   * libgomp.texi (omp_get_nested): Update documentation.
>   (omp_set_nested): Likewise.
>   (OMP_MAX_ACTIVE_LEVELS): Likewise.
>   (OMP_NESTED): Likewise.
>   (OMP_NUM_THREADS): Likewise.
>   (OMP_PROC_BIND): Likewise.
>   * parallel.c (gomp_resolve_num_threads): Replace reference
>   to nest_var with gomp_max_active_levels_var.
>   * testsuite/libgomp.c/target-5.c: Remove additional options.
>   (main): Remove references to omp_get_nested and omp_set_nested.

One thing is that max-active-levels-var in 5.0 is per-device,
but in 5.1 per-data environment.  The question is if we should implement
the problematic 5.0 way or the 5.1 one.  E.g.:
#include 
#include 

int
main ()
{
  #pragma omp parallel
  {
omp_set_nested (1);
#pragma omp parallel num_threads(2)
printf ("Hello, world!\n");
  }
}
which used to be valid in 4.5 (where nest-var used to be per-data
environment) is in 5.0 racy (and in 5.1 will not be racy again).
Though, as these are deprecated APIs, perhaps we can just do the 5.0 way for
now.

> --- a/libgomp/libgomp.texi
> +++ b/libgomp/libgomp.texi
> @@ -489,8 +489,11 @@ represent their language-specific counterparts.
>  
>  Nested parallel regions may be initialized at startup by the 
>  @env{OMP_NESTED} environment variable or at runtime using
> -@code{omp_set_nested}.  If undefined, nested parallel regions are
> -disabled by default.
> +@code{omp_set_nested}.  Setting the maximum number of nested
> +regions to above one using the @env{OMP_MAX_ACTIVE_LEVELS}
> +environment variable or @code{omp_set_max_active_levels} will
> +also enable nesting.  If undefined, nested parallel regions
> +are disabled by default.

This doesn't really describe what env.c does.  If undefined, then
if OMP_NESTED is defined, it will be folloed, and if neither is
defined, the code sets the default based on
"OMP_NUM_THREADS or OMP_PROC_BIND is set to a
comma-separated list of more than one value"
as the spec says and only is disabled otherwise.

> @@ -1502,10 +1511,11 @@ disabled by default.
>  @item @emph{Description}:
>  Specifies the initial value for the maximum number of nested parallel
>  regions.  The value of this variable shall be a positive integer.
> -If undefined, the number of active levels is unlimited.
> +If undefined, the number of active levels is one, which effectively
> +disables nested regions.

Similarly.
>  
>  @item @emph{See also}:
> -@ref{omp_set_max_active_levels}
> +@ref{omp_set_max_active_levels}, @ref{OMP_NESTED}
>  
>  @item @emph{Reference}: 
>  @uref{https://www.openmp.org, OpenMP specification v4.5}, Section 4.9
> @@ -1541,11 +1551,13 @@ integer, and zero is allowed.  If undefined, the 
> default priority is
>  @item @emph{Description}:
>  Enable or disable nested parallel regions, i.e., whether team members
>  are allowed to create new teams.  The value of this environment variable 
> -shall be @code{TRUE} or @code{FALSE}.  If undefined, nested parallel 
> -regions are disabled by default.
> +shall be @code{TRUE} or @code{FALSE}.  If set to @code{TRUE}, the number
> +of maximum active nested regions supported will by default be set to the
> +maximum supported, otherwise it will be set to one.  If undefined, nested
> +parallel regions are disabled by default.

Again.

> --- a/libgomp/testsuite/libgomp.c/target-5.c
> +++ b/libgomp/testsuite/libgomp.c/target-5.c
> @@ -1,5 +1,3 @@
> -/* { dg-additional-options "-Wno-deprecated-declarations" } */
> -
>  #include 
>  #include 
>  
> @@ -7,17 +5,14 @@ int
>  main ()
>  {
>int d_o = omp_get_dynamic ();
> -  int n_o = omp_get_nested ();
>omp_sched_t s_o;
>int c_o;
>omp_get_schedule (_o, _o);
>int m_o = omp_get_max_threads ();
>omp_set_dynamic (1);
> -  omp_set_nested (1);
>omp_set_schedule (omp_sched_static, 2);
>omp_set_num_threads (4);
>int d = omp_get_dynamic ();
> -  int n = omp_get_nested ();
>omp_sched_t s;
>int c;
>

Re: [PATCH] IBM Z: Fix bootstrap breakage due to HAVE_TF macro

2020-11-10 Thread Andreas Krebbel via Gcc-patches
On 10.11.20 18:43, Ilya Leoshkevich wrote:
> Bootstrap and regtest running on s390x-redhat-linux with --enable-shared
> --with-system-zlib --enable-threads=posix --enable-__cxa_atexit
> --enable-checking=yes,rtl --enable-gnu-indirect-function
> --disable-werror --enable-languages=c,c++,fortran,objc,obj-c++
> --with-arch=arch13.  Ok for master?
> 
> 
> 
> Commit e627cda56865 ("IBM Z: Store long doubles in vector registers
> when possible") introduced HAVE_TF macro which expands to a logical
> "or" of HAVE_ constants.  Not all of these constants are available in
> GENERATOR_FILE context, so a hack was used: simply expand to true in
> this case, because the actual value matters only during compiler
> runtime and not during generation.
> 
> However, one aspect of this value matters during generation after all:
> whether or not it's a constant, which in this case it appears to be.
> This results in incorrect values in insn-flags.h and broken bootstrap
> for some configurations.
> 
> Fix by using a dummy value that is not a constant.
> 
> gcc/ChangeLog:
> 
> 2020-11-10  Ilya Leoshkevich  
> 
>   * config/s390/s390.h (HAVE_TF): Use opaque value when
>   GENERATOR_FILE is defined.

Ok. Thanks!

Andreas


[PATCH] IBM Z: Fix bootstrap breakage due to HAVE_TF macro

2020-11-10 Thread Ilya Leoshkevich via Gcc-patches
Bootstrap and regtest running on s390x-redhat-linux with --enable-shared
--with-system-zlib --enable-threads=posix --enable-__cxa_atexit
--enable-checking=yes,rtl --enable-gnu-indirect-function
--disable-werror --enable-languages=c,c++,fortran,objc,obj-c++
--with-arch=arch13.  Ok for master?



Commit e627cda56865 ("IBM Z: Store long doubles in vector registers
when possible") introduced HAVE_TF macro which expands to a logical
"or" of HAVE_ constants.  Not all of these constants are available in
GENERATOR_FILE context, so a hack was used: simply expand to true in
this case, because the actual value matters only during compiler
runtime and not during generation.

However, one aspect of this value matters during generation after all:
whether or not it's a constant, which in this case it appears to be.
This results in incorrect values in insn-flags.h and broken bootstrap
for some configurations.

Fix by using a dummy value that is not a constant.

gcc/ChangeLog:

2020-11-10  Ilya Leoshkevich  

* config/s390/s390.h (HAVE_TF): Use opaque value when
GENERATOR_FILE is defined.
---
 gcc/config/s390/s390.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/gcc/config/s390/s390.h b/gcc/config/s390/s390.h
index 8c028317b6b..bc579a3dadd 100644
--- a/gcc/config/s390/s390.h
+++ b/gcc/config/s390/s390.h
@@ -1187,8 +1187,9 @@ struct GTY(()) machine_function
 #define TARGET_INDIRECT_BRANCH_TABLE s390_indirect_branch_table
 
 #ifdef GENERATOR_FILE
-/* gencondmd.c is built before insn-flags.h.  */
-#define HAVE_TF(icode) true
+/* gencondmd.c is built before insn-flags.h.  Use an arbitrary opaque value
+   that cannot be optimized away by gen_insn.  */
+#define HAVE_TF(icode) TARGET_HARD_FLOAT
 #else
 #define HAVE_TF(icode) (HAVE_##icode##_fpr || HAVE_##icode##_vr)
 #endif
-- 
2.25.4



Re: [PATCH] Overflow-trapping integer arithmetic routines7code

2020-11-10 Thread Stefan Kanthak
Jeff Law  wrote:

> On 10/5/20 10:49 AM, Stefan Kanthak wrote:
>> The implementation of the functions __absv?i2(), __addv?i3() etc. for
>> trapping integer overflow provided in libgcc2.c is rather bad.
>> Same for __cmp?i2() and __ucmp?i2()
>>
>> At least for AMD64 and i386 processors GCC creates awful to horrible
>> code for them: see 
>> for some examples as well as the expected assembly.
>>
>> The attached diff/patch provides better implementations.
> 
> So you didn't indicate how this was tested.

I inspected the assembly generated for i386 and AMD64 and found it to be
correct. I also ran both the current and the patched functions in parallel
and compared their results, without detecting any deviation (except for
the greater speed of the patched functions).

> The addition, subtraction, negation and mulvsi3 are reasonable. The
> mulvsi3 change introduced a failure on m32r, but I'm highly confident
> it's an pre-existing simulator bug.

I only have AMD64 processors, so my tests are limited to these platforms.

> But the abs changes and the other other multiply changes have undesired
> LIBGCC2_BAD_CODE ifdefs that you need to clean up before we could accept
> those changes.

I included these alternative #ifdef'ed implementations only to demonstrate
that GCC generates bad^Wnot the best code there, especially for __mulvDI3(),
which I implemented using __umulsidi3() plus 2 __builtin_mul_overflow() and
2 __builtin_add_overflow() for Wtype instead of just a single
__builtin_mul_overflow() for DWtype. 

> What I don't understand is why the [u]cmpdi/cmpdi2 changes don't cause
> infinite recursion. I would expect the comparisons you're using to
> compute the return value to themselves trigger calls to [u]cmpdi2.

Why should the compiler generate calls to [u]cmpdi2() there?
It doesn't generate calls to the missing functions [u]adddi2() or
[u]subdi2() for addition and subtraction of DWtypes either!

JFTR: this also holds for the __ashlDI3(),  __ashrDI3() and __lshrDI3()
  functions, which should better be written as

__ashldi3 (DWtype u, shift_count_type b)
{
  return u << b;
}

__ashrdi3 (DWtype u, shift_count_type b)
{
  return u >> b;
}

__lshrdi3 (UDWtype u, shift_count_type b)
{
  return u >> b;
}

> I can speculate that the expanders are open-coding the comparison, but
> in that case I'd expect the libgcc2 routines to never be used, so
> optimizing them is pointless :-)

Both are documented, so users might call them directly.
Besides that, their source smells -- nobody with a sane mind would NOT
write (a > b) - (a < b) + 1, what every self-respecting compiler/optimiser
should translate into branch-free code.

> But I put them through my tester and couldn't see any evidence that
> they're causing problems though and reviewed the generated code on m32r
> since it was convenient to do so.
> 
> 
> Also note the ucmpdi changes require updating the prototype in libgcc2.h,
> otherwise you get build failures building gcc itself. I fixed that minor
> omission.

Argh, I missed that indeed.

> So with all that in mind, I installed everything except the bits which
> have the LIBGCC2_BAD_CODE ifdefs after testing on the various crosses. 
> If you could remove the ifdefs on the abs/mult changes and resubmit them
> it'd be appreciated.

Done.

regards
Stefan

libgcc2.patch
Description: Binary data


Re: [Patch] Fortran: OpenMP 5.0 (in_,task_)reduction clause extensions

2020-11-10 Thread Jakub Jelinek via Gcc-patches
On Tue, Nov 10, 2020 at 05:46:17PM +0100, Tobias Burnus wrote:
> Fortran: OpenMP 5.0 (in_,task_)reduction clause extensions
> 
> gcc/fortran/ChangeLog:
> 
>   PR fortran/95847
>   * dump-parse-tree.c (show_omp_clauses): Handle new reduction enums.
>   * gfortran.h (OMP_LIST_REDUCTION_INSCAN, OMP_LIST_REDUCTION_TASK,
>   OMP_LIST_IN_REDUCTION, OMP_LIST_TASK_REDUCTION): Add enums.
>   * openmp.c (enum omp_mask1): Add OMP_CLAUSE_IN_REDUCTION
>   and OMP_CLAUSE_TASK_REDUCTION.
>   (gfc_match_omp_clause_reduction): Extend reduction handling;
>   moved from ...
>   (gfc_match_omp_clauses): ... here. Add calls to it.
>   (OMP_TASK_CLAUSES, OMP_TARGET_CLAUSES, OMP_TASKLOOP_CLAUSES):
>   Add OMP_CLAUSE_IN_REDUCTION.
>   (gfc_match_omp_taskgroup): Add task_reduction matching.
>   (resolve_omp_clauses): Update for new reduction clause changes;
>   remove removed nonmonotonic-schedule restrictions.
>   (gfc_resolve_omp_parallel_blocks): Add new enums to switch.
>   * trans-openmp.c (gfc_omp_clause_default_ctor,
>   gfc_trans_omp_reduction_list, gfc_trans_omp_clauses,
>   gfc_split_omp_clauses): Handle updated reduction clause.
> 
> gcc/ChangeLog:
> 
>   PR fortran/95847
>   * gimplify.c (gimplify_scan_omp_clauses, gimplify_omp_loop): Use 'do'
>   instead of 'for' in error messages for Fortran.
>   * omp-low.c (check_omp_nesting_restrictions): Likewise
> 
> gcc/testsuite/ChangeLog:
> 
>   PR fortran/95847
>   * gfortran.dg/gomp/schedule-modifiers-2.f90: Remove some dg-error.
>   * gfortran.dg/gomp/reduction4.f90: New test.
>   * gfortran.dg/gomp/reduction5.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-1.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-2.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-3.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-4.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-5.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-6.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-7.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-8.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-9.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-10.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-11.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-12.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-13.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-14.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-15.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-16.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-17.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-18.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-19.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-20.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-21.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-22.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-23.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-24.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-25.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-26.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-27.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-28.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-29.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-30.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-31.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-32.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-33.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-34.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-35.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-36.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-37.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-38.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-39.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-40.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-41.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-42.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-43.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-44.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-45.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-46.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-47.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-48.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-49.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-50.f90: New test.
>   * gfortran.dg/gomp/workshare-reduction-51.f90: New test.
>   

Re: [PATCH] c-family: Avoid unnecessary work when -Wpragmas is being ignored

2020-11-10 Thread Patrick Palka via Gcc-patches
On Tue, 10 Nov 2020, David Malcolm wrote:

> On Mon, 2020-11-09 at 10:38 -0500, Patrick Palka wrote:
> > This speeds up handle_pragma_diagnostic by avoiding computing a
> > spelling
> > suggestion for an unrecognized option inside a #pragma directive when
> > -Wpragmas warnings are being suppressed.
> > 
> > In the range-v3 library, which contains many instances of
> > 
> >   #pragma GCC diagnostic push
> >   #pragma GCC diagnostic ignored "-Wpragmas"
> >   #pragma GCC diagnostic ignored "-Wfoo"
> >   ...
> >   #pragma GCC diagnostic pop
> > 
> > compile time is reduced by 33% in some of its tests.
> > 
> > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK
> > for
> > trunk?
> 
> Looks good to me.

Thanks David & Jeff.  Patch committed to trunk as r11-4849.

> 
> Dave
> 
> 



Re: [PATCH] fortran: Fix up gfc_typename CHARACTER length handling [PR97768]

2020-11-10 Thread Tobias Burnus

On 10.11.20 10:58, Jakub Jelinek via Gcc-patches wrote:


The earlier version of the patch has been successfully bootstrapped
and regtested (only with a few regressions) on x86_64-linux and i686-linux,
this version passed all the new and earlier problematic tests, ok for trunk
if it passes another bootstrap/regtest?


LGTM. Thanks!

Tobias


2020-11-10  Jakub Jelinek  

  PR fortran/97768
  * misc.c (gfc_typename): Use ex->value.character.length only if
  ex->expr_type == EXPR_CONSTANT.  If ex->ts.deferred, print : instead
  of length.  If ex->ts.u.cl && ex->ts.u.cl->length == NULL, print
  * instead of length.  Otherwise if character length is non-constant,
  print just CHARACTER or CHARACTER(KIND=N).

  * gfortran.dg/pr97768_1.f90: New test.
  * gfortran.dg/pr97768_2.f90: New test.

--- gcc/fortran/misc.c.jj 2020-11-09 23:01:02.978826528 +0100
+++ gcc/fortran/misc.c2020-11-10 10:41:22.087850720 +0100
@@ -224,10 +224,32 @@ gfc_typename (gfc_expr *ex)

if (ex->ts.type == BT_CHARACTER)
  {
-  if (ex->ts.u.cl && ex->ts.u.cl->length)
- length = gfc_mpz_get_hwi (ex->ts.u.cl->length->value.integer);
-  else
+  if (ex->expr_type == EXPR_CONSTANT)
  length = ex->value.character.length;
+  else if (ex->ts.deferred)
+ {
+   if (ex->ts.kind == gfc_default_character_kind)
+ return "CHARACTER(:)";
+   sprintf (buffer, "CHARACTER(:,%d)", ex->ts.kind);
+   return buffer;
+ }
+  else if (ex->ts.u.cl && ex->ts.u.cl->length == NULL)
+ {
+   if (ex->ts.kind == gfc_default_character_kind)
+ return "CHARACTER(*)";
+   sprintf (buffer, "CHARACTER(*,%d)", ex->ts.kind);
+   return buffer;
+ }
+  else if (ex->ts.u.cl == NULL
+|| ex->ts.u.cl->length->expr_type != EXPR_CONSTANT)
+ {
+   if (ex->ts.kind == gfc_default_character_kind)
+ return "CHARACTER";
+   sprintf (buffer, "CHARACTER(KIND=%d)", ex->ts.kind);
+   return buffer;
+ }
+  else
+ length = gfc_mpz_get_hwi (ex->ts.u.cl->length->value.integer);
if (ex->ts.kind == gfc_default_character_kind)
  sprintf (buffer, "CHARACTER(" HOST_WIDE_INT_PRINT_DEC ")", length);
else
--- gcc/testsuite/gfortran.dg/pr97768_1.f90.jj2020-11-10 
10:22:26.053445061 +0100
+++ gcc/testsuite/gfortran.dg/pr97768_1.f90   2020-11-10 10:22:26.053445061 
+0100
@@ -0,0 +1,25 @@
+! PR fortran/97768
+! { dg-do compile }
+
+module pr97768_1
+  interface operator(.in.)
+module procedure substr_in_str
+  end interface
+contains
+  pure function to_upper (in_str) result (string)
+character(len=*), intent(in) :: in_str
+character(len=len(in_str)) :: string
+string = in_str
+  end function to_upper
+  logical pure function substr_in_str (substring, string)
+character(len=*), intent(in) :: string, substring
+substr_in_str=.false.
+  end function
+end module
+function foo ()
+  use pr97768_1, only : to_upper, operator(.in.)
+  logical :: foo
+  character(len=8) :: str
+  str = 'abcde'
+  foo = 'b' .in. to_upper (str)
+end function foo
--- gcc/testsuite/gfortran.dg/pr97768_2.f90.jj2020-11-10 
10:22:26.053445061 +0100
+++ gcc/testsuite/gfortran.dg/pr97768_2.f90   2020-11-10 10:46:15.104602757 
+0100
@@ -0,0 +1,53 @@
+! PR fortran/97768
+! { dg-do compile }
+
+module pr97768_2
+  interface operator(.in.)
+module procedure substr_in_str
+  end interface
+contains
+  pure function to_upper (in_str) result (string)
+character(len=*), intent(in) :: in_str
+character(len=len(in_str)) :: string
+string = in_str
+  end function to_upper
+  logical pure function substr_in_str (substring, string)
+character(len=*), intent(in) :: string, substring
+substr_in_str=.false.
+  end function
+end module
+function foo ()
+  use pr97768_2, only : to_upper, operator(.in.)
+  logical :: foo
+  character(len=8) :: str
+  str = 'abcde'
+  foo = to_upper (str) .in. 32! { dg-error "are CHARACTER/INTEGER" }
+end function foo
+function bar (str)
+  use pr97768_2, only : operator(.in.)
+  logical :: bar
+  character(len=*) :: str
+  foo = str .in. 32   ! { dg-error "are 
CHARACTER\\(\\*\\)/INTEGER" }
+end function bar
+function baz (lenstr)
+  use pr97768_2, only : operator(.in.)
+  logical :: baz
+  integer :: lenstr
+  character(len=lenstr) :: str
+  str = 'abc'
+  foo = str .in. 32   ! { dg-error "are CHARACTER/INTEGER" }
+end function baz
+function qux ()
+  use pr97768_2, only : operator(.in.)
+  logical :: qux
+  character(len=8) :: str
+  str = 'def'
+  foo = str .in. 32   ! { dg-error "are CHARACTER\\(8\\)/INTEGER" }
+end function qux
+function corge ()
+  use pr97768_2, only : operator(.in.)
+  logical :: corge
+  character(len=:), allocatable :: str
+  str = 'ghijk'
+  foo = str .in. 32   ! { dg-error "are CHARACTER\\(:\\)/INTEGER" }
+end function corge

  Jakub


-
Mentor Graphics (Deutschland) 

Re: [PATCH] analyzer: remove dead code

2020-11-10 Thread Jeff Law via Gcc-patches


On 11/10/20 9:57 AM, David Malcolm via Gcc-patches wrote:
> On Fri, 2020-10-23 at 17:26 +0200, Martin Liška wrote:
>> Hey.
>>
>> I've noticed that when building GCC with Clang.
>> David what do you think about it?
> Thanks for the patch.
>
> Looks good to me, assuming it goes through normal bootstrap and
> regression-testing.
>
> That said, technically I'm not a maintainer for the analyzer, unless
> you count it as part of "diagnostic messages"; I've been making very
> liberal use of Jeff's permission to commit followup work when he
> approved the initial patch kit last year (in the hope that the steering
> committee will make it official at some point).

I think you should be able to ack stuff in the analyzer, I'd actually
deleted this from my queue on the assumption you'd take care of it. :-)


But just to be safe, Martin, the patch is fine :-)


jeff



Re: [PATCH] analyzer: remove dead code

2020-11-10 Thread David Malcolm via Gcc-patches
On Fri, 2020-10-23 at 17:26 +0200, Martin Liška wrote:
> Hey.
> 
> I've noticed that when building GCC with Clang.
> David what do you think about it?

Thanks for the patch.

Looks good to me, assuming it goes through normal bootstrap and
regression-testing.

That said, technically I'm not a maintainer for the analyzer, unless
you count it as part of "diagnostic messages"; I've been making very
liberal use of Jeff's permission to commit followup work when he
approved the initial patch kit last year (in the hope that the steering
committee will make it official at some point).

Dave



Re: [Patch] Fortran: OpenMP 5.0 (in_,task_)reduction clause extensions

2020-11-10 Thread Tobias Burnus

Hi Jakub,

thanks for the comments. I have now removed
OMP_CLAUSE_REDUCTION{_DEFAULT,_MODIFIER} and use 'openacc' as flag for OpenACC.
The only 'default:' check is now moved to resolution stage.

If I gathered it correctly, the splitting is complex but the
way it is implemented, it happens to be fine.

We discussed that the _LIST part gets too long (but that's something
for later) and that 'omp scan' best get implemented during stage1.
(Additionally, I believe that the ME always complains about 'inscan'
not being in 'omp scan' – and it can't as 'omp scan' is not yet
supported.)

On 10.11.20 13:16, Jakub Jelinek wrote:


+  for (int i = OMP_LIST_REDUCTION; i <= OMP_LIST_REDUCTION_TASK; i++)
+{
+  if (mask & GFC_OMP_MASK_TEAMS)
+clausesa[GFC_OMP_SPLIT_TEAMS].lists[i]
+  = code->ext.omp_clauses->lists[i];
+  if (mask & GFC_OMP_MASK_PARALLEL)
+clausesa[GFC_OMP_SPLIT_PARALLEL].lists[i]
+  = code->ext.omp_clauses->lists[i];
+  else if (mask & GFC_OMP_MASK_DO)
+clausesa[GFC_OMP_SPLIT_DO].lists[i]
+  = code->ext.omp_clauses->lists[i];
+  if (mask & GFC_OMP_MASK_SIMD)
+clausesa[GFC_OMP_SPLIT_SIMD].lists[i]
+  = code->ext.omp_clauses->lists[i];
+}

Seems in the middle-end we just ignore OMP_CLAUSE_REDUCTION_TASK on anything
but parallel/{for,do}/sections, so guess this is ok.



Tobias

-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter
Fortran: OpenMP 5.0 (in_,task_)reduction clause extensions

gcc/fortran/ChangeLog:

	PR fortran/95847
	* dump-parse-tree.c (show_omp_clauses): Handle new reduction enums.
	* gfortran.h (OMP_LIST_REDUCTION_INSCAN, OMP_LIST_REDUCTION_TASK,
	OMP_LIST_IN_REDUCTION, OMP_LIST_TASK_REDUCTION): Add enums.
	* openmp.c (enum omp_mask1): Add OMP_CLAUSE_IN_REDUCTION
	and OMP_CLAUSE_TASK_REDUCTION.
	(gfc_match_omp_clause_reduction): Extend reduction handling;
	moved from ...
	(gfc_match_omp_clauses): ... here. Add calls to it.
	(OMP_TASK_CLAUSES, OMP_TARGET_CLAUSES, OMP_TASKLOOP_CLAUSES):
	Add OMP_CLAUSE_IN_REDUCTION.
	(gfc_match_omp_taskgroup): Add task_reduction matching.
	(resolve_omp_clauses): Update for new reduction clause changes;
	remove removed nonmonotonic-schedule restrictions.
	(gfc_resolve_omp_parallel_blocks): Add new enums to switch.
	* trans-openmp.c (gfc_omp_clause_default_ctor,
	gfc_trans_omp_reduction_list, gfc_trans_omp_clauses,
	gfc_split_omp_clauses): Handle updated reduction clause.

gcc/ChangeLog:

	PR fortran/95847
	* gimplify.c (gimplify_scan_omp_clauses, gimplify_omp_loop): Use 'do'
	instead of 'for' in error messages for Fortran.
	* omp-low.c (check_omp_nesting_restrictions): Likewise

gcc/testsuite/ChangeLog:

	PR fortran/95847
	* gfortran.dg/gomp/schedule-modifiers-2.f90: Remove some dg-error.
	* gfortran.dg/gomp/reduction4.f90: New test.
	* gfortran.dg/gomp/reduction5.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-1.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-2.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-3.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-4.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-5.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-6.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-7.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-8.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-9.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-10.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-11.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-12.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-13.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-14.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-15.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-16.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-17.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-18.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-19.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-20.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-21.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-22.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-23.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-24.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-25.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-26.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-27.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-28.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-29.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-30.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-31.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-32.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-33.f90: New test.
	* gfortran.dg/gomp/workshare-reduction-34.f90: New test.
	* 

Re: [PATCH] c-family: Avoid unnecessary work when -Wpragmas is being ignored

2020-11-10 Thread David Malcolm via Gcc-patches
On Mon, 2020-11-09 at 10:38 -0500, Patrick Palka wrote:
> This speeds up handle_pragma_diagnostic by avoiding computing a
> spelling
> suggestion for an unrecognized option inside a #pragma directive when
> -Wpragmas warnings are being suppressed.
> 
> In the range-v3 library, which contains many instances of
> 
>   #pragma GCC diagnostic push
>   #pragma GCC diagnostic ignored "-Wpragmas"
>   #pragma GCC diagnostic ignored "-Wfoo"
>   ...
>   #pragma GCC diagnostic pop
> 
> compile time is reduced by 33% in some of its tests.
> 
> Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK
> for
> trunk?

Looks good to me.

Dave



Re: [PATCH] c++: Improve static_assert diagnostic [PR97518]

2020-11-10 Thread Jason Merrill via Gcc-patches

On 11/9/20 7:21 PM, Marek Polacek wrote:

Currently, when a static_assert fails, we only say "static assertion failed".
It would be more useful if we could also print the expression that
evaluated to false; this is especially useful when the condition uses
template parameters.  Consider the motivating example, in which we have
this line:

   static_assert(is_same::value);

if this fails, the user has to play dirty games to get the compiler to
print the template arguments.  With this patch, we say:

   static assertion failed due to requirement 'is_same::value'


I'd rather avoid the word "requirement" here, to avoid confusion with 
Concepts.


Maybe have the usual failed error, and if we're showing the expression, 
have a second inform to say e.g. "%qE evaluates to false"?


Also, if we've narrowed the problem down to a particular subexpression, 
let's only print that one.



which I think is much better.  However, always printing the condition that
evaluated to 'false' wouldn't be very useful: e.g. noexcept(fn) is
always parsed to true/false, so we would say "static assertion failed due
to requirement 'false'" which doesn't help.  So I wound up only printing
the condition when it was instantiation-dependent, that is, we called
finish_static_assert from tsubst_expr.

Moreover, this patch also improves the diagnostic when the condition
consists of a logical AND.  Say you have something like this:

   static_assert(fn1() && fn2() && fn3() && fn4() && fn5());

where fn4() evaluates to false and the other ones to true.  Highlighting
the whole thing is not that helpful because it won't say which clause
evaluated to false.  With the find_failing_clause tweak in this patch
we emit:

   error: static assertion failed
 6 | static_assert(fn1() && fn2() && fn3() && fn4() && fn5());
   |  ~~~^~

so you know right away what's going on.  Unfortunately, when you combine
both things, that is, have an instantiation-dependent expr and && in
a static_assert, we can't yet quite point to the clause that failed.  It
is because when we tsubstitute something like is_same::value, we
generate a VAR_DECL that doesn't have any location.  It would be awesome
if we could wrap it with a location wrapper, but I didn't see anything
obvious.


Hmm, I vaguely remember that at first we were using location wrappers 
less in templates, but I thought that was fixed.  I don't see anything 
setting suppress_location_wrappers, and it looks like 
tsubst_copy_and_build should preserve a location wrapper.



Further tweak could be to print the failing clause of the condition if
possible, whether or not it was instantiation-dependent.


If not instantiation-dependent, underlining the clause seems sufficient.


In passing, I've cleaned up some things:
* use iloc_sentinel when appropriate,
* it's nicer to call contextual_conv_bool instead of the rather verbose
   perform_implicit_conversion_flags,
* no need to check for INTEGER_CST before calling integer_zerop.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

gcc/cp/ChangeLog:

PR c++/97518
* cp-tree.h (finish_static_assert): Adjust declaration.
* parser.c (cp_parser_static_assert): Pass false to
finish_static_assert.
* pt.c (tsubst_expr): Pass true to finish_static_assert.
* semantics.c (find_failing_clause_r): New function.
(find_failing_clause): New function.
(finish_static_assert): Add a bool parameter.  Use
iloc_sentinel.  Call contextual_conv_bool instead of
perform_implicit_conversion_flags.  Don't check for INTEGER_CST before
calling integer_zerop.  Call find_failing_clause and maybe use its
location.  Print the original condition if SHOW_EXPR_P.

gcc/testsuite/ChangeLog:

PR c++/97518
* g++.dg/diagnostic/pr87386.C: Adjust expected output.
* g++.dg/diagnostic/static_assert1.C: New test.
* g++.dg/diagnostic/static_assert2.C: New test.

libcc1/ChangeLog:

PR c++/97518
* libcp1plugin.cc (plugin_add_static_assert): Pass false to
finish_static_assert.

libstdc++-v3/ChangeLog:

* testsuite/20_util/scoped_allocator/69293_neg.cc: Adjust dg-error.
* testsuite/20_util/uses_allocator/69293_neg.cc: Likewise.
* testsuite/20_util/uses_allocator/cons_neg.cc: Likewise.
* testsuite/26_numerics/random/pr60037-neg.cc: Likewise.
---
  gcc/cp/cp-tree.h  |  2 +-
  gcc/cp/parser.c   |  3 +-
  gcc/cp/pt.c   |  6 +-
  gcc/cp/semantics.c| 79 ---
  gcc/testsuite/g++.dg/diagnostic/pr87386.C |  2 +-
  .../g++.dg/diagnostic/static_assert1.C| 20 +
  .../g++.dg/diagnostic/static_assert2.C| 68 
  libcc1/libcp1plugin.cc|  2 +-
  .../20_util/scoped_allocator/69293_neg.cc |  2 +-
  

Re: [PATCH 1/3] Refactor copying decl section names

2020-11-10 Thread Jeff Law via Gcc-patches


On 11/12/19 11:28 PM, Strager Neds wrote:
> Sometimes, we need to copy a section name from one decl or symtab node
> to another. Currently, this is done by getting the source's section
> name and setting the destination's section name. For example:
>
> set_decl_section_name (dest, DECL_SECTION_NAME (source));
> dest->set_section (source->get_section ());
>
> This code could be more efficient. Section names are stored in an
> interning hash table, but the current interfaces of
> set_decl_section_name and symtab_node::set_section force unnecessary
> indirections (to get the section name) and hashing (to find the section
> name in the hash table).
>
> Overload set_decl_section_name and symtab_node::set_section to accept an
> existing symtab_node to copy the section name from:
>
> set_decl_section_name (dest, source);
> dest->set_section (*source);
>
> For now, implement these new functions as a simple wrapper around the
> existing functions. In the future, these functions can be implemented
> using just a pointer copy and an increment (for the reference count).
>
> This patch should not change behavior.
>
> Testing: Bootstrap on x86_64-linux-gnu with --disable-multilib
> --enable-checking=release --enable-languages=c,c++. Observe no change in
> test results.
>
> 2019-11-12  Matthew Glazar 
>
> * gcc/cgraph.h (symtab_node::get_section): Constify.
> (symtab_node::set_section): Declare new overload.
> * gcc/symtab.c (symtab_node::set_section): Define new overload.
> (symtab_node::copy_visibility_from): Use new overload of
> symtab_node::set_section.
> (symtab_node::resolve_alias): Same.
> * gcc/tree.h (set_decl_section_name): Declare new overload.
> * gcc/tree.c (set_decl_section_name): Define new overload.
> * gcc/c/c-decl.c (merge_decls): Use new overload of
> set_decl_section_name.
> * gcc/cp/decl.c (duplicate_decls): Same.
> * gcc/cp/method.c (use_thunk): Same.
> * gcc/cp/optimize.c (maybe_clone_body): Same.
> * gcc/d/decl.cc (finish_thunk): Same.
> * gcc/tree-emutls.c (get_emutls_init_templ_addr): Same.
> * gcc/cgraphclones.c (cgraph_node::create_virtual_clone): Use new
> overload of symtab_node::set_section.
> (cgraph_node::create_version_clone_with_body): Same.
> * gcc/trans-mem.c (ipa_tm_create_version): Same.

So only a year after the original post 


Presumably given that you're just adding an overload, there is no need
or intent to change all the callers to set_decl_section_name.   
Right?   I scanned the various remaining calls, some just use a string,
or build up a string based on a decl or something else.  I did find one
other call that should be converted in cp/coroutines.cc which almost
certainly went onto the trunk after your patch was submitted.  I've
adjusted that one.

Also note, your mailer seemed to mangle whitespace in your message.  It
makes it slightly harder to deal with, so if you plan on making regular
contributions, we should probably work to get that fixed.  If the
refactoring + subsequent bugfixes are all you're planning to do, then
I'll fix the whitespace issues manually.


Given the age of the patch, the need to manually apply bits (due to
whitespace issues) and the desire to fix coroutines.cc, I'm giving it a
fresh test spin. 


Jeff





[PATCH] [PR target/97727] arm: [testsuite] fix some simd tests on armbe

2020-11-10 Thread Andrea Corallo via Gcc-patches
Hi all,

I'd like to submit this patch to fix three testcases reported to be
failing on arm big endian on PR target/97727.

Okay for trunk?

Thanks

  Andrea
  
>From c2d5faf49647381edbcdce6a709b5ed046ce13a4 Mon Sep 17 00:00:00 2001
From: Andrea Corallo 
Date: Tue, 10 Nov 2020 15:15:27 +0100
Subject: [PATCH] [PR target/97727] arm: [testsuite] fix some simd tests on
 armbe

2020-11-10  Andrea Corallo  

PR target/97727
* gcc.target/arm/simd/bf16_vldn_1.c: Relax regexps not to fail on
big endian.
* gcc.target/arm/simd/vldn_lane_bf16_1.c: Likewise
* gcc.target/arm/simd/vmmla_1.c: Add -mfloat-abi=hard flag.
---
 .../gcc.target/arm/simd/bf16_vldn_1.c | 48 +--
 .../gcc.target/arm/simd/vldn_lane_bf16_1.c| 30 +++-
 gcc/testsuite/gcc.target/arm/simd/vmmla_1.c   |  2 +-
 3 files changed, 43 insertions(+), 37 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/simd/bf16_vldn_1.c 
b/gcc/testsuite/gcc.target/arm/simd/bf16_vldn_1.c
index 222e7af9453..663e76984c3 100644
--- a/gcc/testsuite/gcc.target/arm/simd/bf16_vldn_1.c
+++ b/gcc/testsuite/gcc.target/arm/simd/bf16_vldn_1.c
@@ -10,8 +10,8 @@
 /*
 **test_vld2_bf16:
 ** ...
-** vld2.16 {d0-d1}, \[r0\]
-** bx  lr
+** vld[0-9]+.16{d[0-9]+-d[0-9]+}, \[r[0-9]+\]
+** ...
 */
 bfloat16x4x2_t
 test_vld2_bf16 (bfloat16_t * ptr)
@@ -22,8 +22,8 @@ test_vld2_bf16 (bfloat16_t * ptr)
 /*
 **test_vld2q_bf16:
 ** ...
-** vld2.16 {d0-d3}, \[r0\]
-** bx  lr
+** vld[0-9]+.16{d[0-9]+-d[0-9]+}, \[r[0-9]+\]
+** ...
 */
 bfloat16x8x2_t
 test_vld2q_bf16 (bfloat16_t * ptr)
@@ -34,8 +34,8 @@ test_vld2q_bf16 (bfloat16_t * ptr)
 /*
 **test_vld2_dup_bf16:
 ** ...
-** vld2.16 {d0\[\], d1\[\]}, \[r0\]
-** bx  lr
+** vld[0-9]+.16{d[0-9]+\[\], d[0-9]+\[\]}, \[r[0-9]+\]
+** ...
 */
 bfloat16x4x2_t
 test_vld2_dup_bf16 (bfloat16_t * ptr)
@@ -46,8 +46,8 @@ test_vld2_dup_bf16 (bfloat16_t * ptr)
 /*
 **test_vld2q_dup_bf16:
 ** ...
-** vld2.16 {d0, d1, d2, d3}, \[r0\]
-** bx  lr
+** vld[0-9]+.16{d[0-9]+, d[0-9]+, d[0-9]+, d[0-9]+}, \[r[0-9]+\]
+** ...
 */
 bfloat16x8x2_t
 test_vld2q_dup_bf16 (bfloat16_t * ptr)
@@ -58,8 +58,8 @@ test_vld2q_dup_bf16 (bfloat16_t * ptr)
 /*
 **test_vld3_bf16:
 ** ...
-** vld3.16 {d0-d2}, \[r0\]
-** bx  lr
+** vld[0-9]+.16{d[0-9]+-d[0-9]+}, \[r[0-9]+\]
+** ...
 */
 bfloat16x4x3_t
 test_vld3_bf16 (bfloat16_t * ptr)
@@ -70,8 +70,8 @@ test_vld3_bf16 (bfloat16_t * ptr)
 /*
 **test_vld3q_bf16:
 ** ...
-** vld3.16 {d1, d3, d5}, \[r0\]
-** bx  lr
+** vld[0-9]+.16{d[0-9]+, d[0-9]+, d[0-9]+}, \[r[0-9]+\]
+** ...
 */
 bfloat16x8x3_t
 test_vld3q_bf16 (bfloat16_t * ptr)
@@ -82,8 +82,8 @@ test_vld3q_bf16 (bfloat16_t * ptr)
 /*
 **test_vld3_dup_bf16:
 ** ...
-** vld3.16 {d0\[\], d1\[\], d2\[\]}, \[r0\]
-** bx  lr
+** vld[0-9]+.16{d[0-9]+\[\], d[0-9]+\[\], d[0-9]+\[\]}, \[r[0-9]+\]
+** ...
 */
 bfloat16x4x3_t
 test_vld3_dup_bf16 (bfloat16_t * ptr)
@@ -94,8 +94,8 @@ test_vld3_dup_bf16 (bfloat16_t * ptr)
 /*
 **test_vld3q_dup_bf16:
 ** ...
-** vld3.16 {d0\[\], d1\[\], d2\[\]}, \[r0\]
-** bx  lr
+** vld[0-9]+.16{d[0-9]+\[\], d[0-9]+\[\], d[0-9]+\[\]}, \[r[0-9]+\]
+** ...
 */
 bfloat16x8x3_t
 test_vld3q_dup_bf16 (bfloat16_t * ptr)
@@ -106,8 +106,8 @@ test_vld3q_dup_bf16 (bfloat16_t * ptr)
 /*
 **test_vld4_bf16:
 ** ...
-** vld4.16 {d0-d3}, \[r0\]
-** bx  lr
+** vld4.16 {d[0-9]+-d[0-9]+}, \[r[0-9]+\]
+** ...
 */
 bfloat16x4x4_t
 test_vld4_bf16 (bfloat16_t * ptr)
@@ -118,8 +118,8 @@ test_vld4_bf16 (bfloat16_t * ptr)
 /*
 **test_vld4q_bf16:
 ** ...
-** vld4.16 {d1, d3, d5, d7}, \[r0\]
-** bx  lr
+** vld4.16 {d[0-9]+, d[0-9]+, d[0-9]+, d[0-9]+}, \[r[0-9]+\]
+** ...
 */
 bfloat16x8x4_t
 test_vld4q_bf16 (bfloat16_t * ptr)
@@ -130,8 +130,8 @@ test_vld4q_bf16 (bfloat16_t * ptr)
 /*
 **test_vld4_dup_bf16:
 ** ...
-** vld4.16 {d0\[\], d1\[\], d2\[\], d3\[\]}, \[r0\]
-** bx  lr
+** vld4.16 {d[0-9]+\[\], d[0-9]+\[\], d[0-9]+\[\], d[0-9]+\[\]}, 
\[r[0-9]+\]
+** ...
 */
 bfloat16x4x4_t
 test_vld4_dup_bf16 (bfloat16_t * ptr)
@@ -142,8 +142,8 @@ test_vld4_dup_bf16 (bfloat16_t * ptr)
 /*
 **test_vld4q_dup_bf16:
 ** ...
-** vld4.16 {d0\[\], d1\[\], d2\[\], d3\[\]}, \[r0\]
-** bx  lr
+** vld4.16 {d[0-9]+\[\], d[0-9]+\[\], d[0-9]+\[\], d[0-9]+\[\]}, 
\[r[0-9]+\]
+** ...
 */
 bfloat16x8x4_t
 test_vld4q_dup_bf16 (bfloat16_t * ptr)
diff --git a/gcc/testsuite/gcc.target/arm/simd/vldn_lane_bf16_1.c 
b/gcc/testsuite/gcc.target/arm/simd/vldn_lane_bf16_1.c
index 58153eddf62..b235b1f33b9 100644
--- a/gcc/testsuite/gcc.target/arm/simd/vldn_lane_bf16_1.c
+++ b/gcc/testsuite/gcc.target/arm/simd/vldn_lane_bf16_1.c
@@ -8,8 +8,9 @@
 
 /*
 **test_vld2_lane_bf16:
-** vld2.16 {d0\[2\], d1\[2\]}, \[r0\]
-** bx  lr
+** ...
+**

[PATCH/RFC v2] Add -fdiagnostics-path-format=html [v2]

2020-11-10 Thread David Malcolm via Gcc-patches
Here's an updated version of the HTML output idea from:
  https://gcc.gnu.org/pipermail/gcc-patches/2020-October/556848.html

I reworked the HTML path output to show stack frames as well as just
runs of events, using drop-shadows to give a 3D look.  The idea is to try
to highlight the stack of frames as if it were an actual stack of
overlapping cards.

Updated HTML for the example from the earlier patch can be seen here:
  
https://dmalcolm.fedorapeople.org/gcc/2020-11-05/html-examples/test.c.path-1.html
As before, other examples can be seen in that directory, such as:
  Signal handler issue:

https://dmalcolm.fedorapeople.org/gcc/2020-11-05/html-examples/signal-1.c.path-1.html
  Leak due to longjmp past a "free":

https://dmalcolm.fedorapeople.org/gcc/2020-11-05/html-examples/setjmp-7.c.path-1.html

Other changes in v2:
* switched j and k in keyboard navigation so that j is "next event"
and k is "previous event"
* fixed event element IDs, fixing a bug where it assumed they were in
  ascending order
* moved HTML printing code out of path_summary and event_range
* more selftest coverage

As before, this approach merely emits the path information; it doesn't
capture the associated diagnostic.  I'm working on an alternate approach
for v3 of the patch that does that; doing that requires reworking
pretty_printer.

I'm not sure on exactly the correct approach here; for v3 I'm
experimenting with an "html" option for -fdiagnostics-format= which
when selected would supplement the existing text output, by additionally
writing out an HTML file for each diagnostic group, with path
information captured there.  Doing it per group means that the classic
"warning"/"note" pairs would be grouped together in that output.

I'm not sure what that ought to do with paths though; part of the point
of doing it is to have less verbose output on stderr whilst capturing
the pertinent information "on the side" in the HTML file.

Thoughts?

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu, FWIW.

gcc/ChangeLog:
* common.opt (diagnostic_path_format): Add DPF_HTML.
* diagnostic-format-json.cc: Include "selftest.h".
* diagnostic-show-locus.c (colorizer::m_context): Replace with...
(colorizer::m_pp): ...this new field.
(layout::m_context): Make const.
(layout::m_is_html): New field.
(layout::m_html_label_writer): New field.
(colorizer::colorizer): Update for use of pp rather than context.
(colorizer::begin_state): Likewise.
(colorizer::finish_state): Likewise.
(colorizer::get_color_by_name): Likewise.
(layout::layout): Make "context" param const.  Add "pp", "is_html"
and "label_writer" params, with defaults.  Use them to initialize
new fields.  Update for change to m_colorizer.
(layout::print_source_line): Add HTML support.
(layout::start_annotation_line): Likewise.
(layout::print_annotation_line): Likewise.
(line_label::line_label): Make context const.  Add
"original_range_idx" param and use it to initialize...
(line_label::m_original_range_idx): ...this new field.
(layout::print_any_labels): Pass original index of range to
line_label ctor.  Add HTML support.
(get_affected_range): Make context const.
(get_printed_columns): Likewise.
(line_corrections::line_corrections): Likewise.
(line_corrections::m_context): Likewise.
(layout::print_line): Don't print fix-it hints for HTML support.
(diagnostic_show_locus_as_html): New.
(selftest::assert_html_eq): New.
(ASSERT_HTML_EQ): New.
(selftest::test_one_liner_simple_caret): Verify the HTML output.
(selftest::test_diagnostic_show_locus_fixit_lines): Likewise.
(selftest::test_html): New.
(selftest::diagnostic_show_locus_c_tests): Call it.
* diagnostic.c (diagnostic_show_any_path): Pass the location to
the print_path callback.
* diagnostic.h (enum diagnostic_path_format): Add DPF_HTML.
(diagnostic_context::num_html_paths): New field.
(diagnostic_context::print_path): Add location_t param.
(class html_label_writer): New.
(diagnostic_show_locus_as_html): New decl.
* doc/invoke.texi (Diagnostic Message Formatting Options): Add
"html" to -fdiagnostics-path-format=.
(-fdiagnostics-path-format=): Add html.
* pretty-print.c (pp_write_text_as_html_to_stream): New.
* pretty-print.h (pp_write_text_as_html_to_stream): New decl.
* selftest-diagnostic.c (selftest::html_printer::html_printer):
New ctor.
(selftest::html_printer::get_html_output): New.
* selftest-diagnostic.h (class selftest::html_printer): New.
* tree-diagnostic-path.cc: Define GCC_DIAG_STYLE where possible
and necessary.  Include "options.h" for dump_base_name.
(path_label::path_label): Add "colorize" param 

Re: [PATCH] Overflow-trapping integer arithmetic routines7code

2020-11-10 Thread Jeff Law via Gcc-patches


On 11/10/20 8:57 AM, Jakub Jelinek wrote:
> On Tue, Nov 10, 2020 at 08:29:58AM -0700, Jeff Law via Gcc-patches wrote:
>> On 10/5/20 10:49 AM, Stefan Kanthak wrote:
>>> The implementation of the functions __absv?i2(), __addv?i3() etc. for
>>> trapping integer overflow provided in libgcc2.c is rather bad.
>>> Same for __cmp?i2() and __ucmp?i2()
>>>
>>> At least for AMD64 and i386 processors GCC creates awful to horrible
>>> code for them: see 
>>> for some examples as well as the expected assembly.
>>>
>>> The attached diff/patch provides better implementations.
>> So you didn't indicate how this was tested.   The addition, subtraction,
>> negation and mulvsi3 are reasonable.  The mulvsi3 change introduced a
>> failure on m32r, but I'm highly confident it's an pre-existing simulator
>> bug.
> The formatting is wrong too (e.g. no space before ( when calling functions).

Ugh.  I fixed that in one tree, but not in the tree I committed from. 
I'll adjust momentarily.

jeff



Re: [PATCH] Overflow-trapping integer arithmetic routines7code

2020-11-10 Thread Jakub Jelinek via Gcc-patches
On Tue, Nov 10, 2020 at 08:29:58AM -0700, Jeff Law via Gcc-patches wrote:
> On 10/5/20 10:49 AM, Stefan Kanthak wrote:
> > The implementation of the functions __absv?i2(), __addv?i3() etc. for
> > trapping integer overflow provided in libgcc2.c is rather bad.
> > Same for __cmp?i2() and __ucmp?i2()
> >
> > At least for AMD64 and i386 processors GCC creates awful to horrible
> > code for them: see 
> > for some examples as well as the expected assembly.
> >
> > The attached diff/patch provides better implementations.
> 
> So you didn't indicate how this was tested.   The addition, subtraction,
> negation and mulvsi3 are reasonable.  The mulvsi3 change introduced a
> failure on m32r, but I'm highly confident it's an pre-existing simulator
> bug.

The formatting is wrong too (e.g. no space before ( when calling functions).

> What I don't understand is why the [u]cmpdi/cmpdi2 changes don't cause
> infinite recursion.   I would expect the comparisons you're using to
> compute the return value to themselves trigger calls to [u]cmpdi2.  I
> can speculate that the expanders are open-coding the comparison, but in
> that case I'd expect the libgcc2 routines to never be used, so
> optimizing them is pointless :-)

Yeah, I don't really understand these changes either.

Jakub



Re: [PATCH][MIPS] PR target/77510 Reduce size of MIPS core automaton

2020-11-10 Thread Paul Koning via Gcc-patches



> On Nov 10, 2020, at 6:42 AM, Xu Chenghua  wrote:
> 
> 
> Hi:
> 
> This patch reduce reservation of model do not more than 10 cycles. The memory 
> of genautomata down to 1GB.

Does this make any significant difference in performance of the generated code? 
 The original cycle counts are from the CPU description, so trimming them to a 
max of 10 means the model no longer reflects the real implementation.  It may 
be the difference is too small to worry about.  Can you comment on this?

paul




Re: [PATCH] C-family : Add attribute 'unavailable'.

2020-11-10 Thread Joseph Myers
This patch seems to be missing documentation for the new attribute in 
extend.texi.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] Overflow-trapping integer arithmetic routines7code

2020-11-10 Thread Jeff Law via Gcc-patches


On 10/5/20 10:49 AM, Stefan Kanthak wrote:
> The implementation of the functions __absv?i2(), __addv?i3() etc. for
> trapping integer overflow provided in libgcc2.c is rather bad.
> Same for __cmp?i2() and __ucmp?i2()
>
> At least for AMD64 and i386 processors GCC creates awful to horrible
> code for them: see 
> for some examples as well as the expected assembly.
>
> The attached diff/patch provides better implementations.

So you didn't indicate how this was tested.   The addition, subtraction,
negation and mulvsi3 are reasonable.  The mulvsi3 change introduced a
failure on m32r, but I'm highly confident it's an pre-existing simulator
bug.


But the abs changes and the other other multiply changes have undesired
LIBGCC2_BAD_CODE ifdefs that you need to clean up before we could accept
those changes.


What I don't understand is why the [u]cmpdi/cmpdi2 changes don't cause
infinite recursion.   I would expect the comparisons you're using to
compute the return value to themselves trigger calls to [u]cmpdi2.  I
can speculate that the expanders are open-coding the comparison, but in
that case I'd expect the libgcc2 routines to never be used, so
optimizing them is pointless :-)   But I put them through my tester and
couldn't see any evidence that they're causing problems though and
reviewed the generated code on m32r since it was convenient to do so.


Also note the ucmpdi changes require updating the prototype in
libgcc2.h, otherwise you get build failures building gcc itself.  I
fixed that minor omission.


So with all that in mind, I installed everything except the bits which
have the LIBGCC2_BAD_CODE ifdefs after testing on the various crosses. 
If you could remove the ifdefs on the abs/mult changes and resubmit them
it'd be appreciated.

Jeff



libgo patch committed: Update to Go 1.15.4 release

2020-11-10 Thread Ian Lance Taylor via Gcc-patches
This patch updates libgo to the 1.15.4 release.  Bootstrapped and ran
Go testsuite on x86_64-pc-linux-gnu.  Committed to mainline.

Ian
8815a8e97f8d0ee4066e737152c8198a80e9fe32
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index 149f20077e3..e62578fc781 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-ae20684902b82883d3d65f2cde0894c7cb3b995b
+893fa057e36ae6c9b2ac5ffdf74634c35b3489c6
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/libgo/MERGE b/libgo/MERGE
index 4b158c05c6e..ad239c92fa9 100644
--- a/libgo/MERGE
+++ b/libgo/MERGE
@@ -1,4 +1,4 @@
-1984ee00048b63eacd2155cd6d74a2d13e998272
+0e953add9656c32a788e06438cd7b533e968b7f8
 
 The first line of this file holds the git revision number of the
 last merge done from the master library sources.
diff --git a/libgo/VERSION b/libgo/VERSION
index 93c7b2d3d78..baff2224d62 100644
--- a/libgo/VERSION
+++ b/libgo/VERSION
@@ -1 +1 @@
-go1.15.3
+go1.15.4
diff --git a/libgo/go/cmd/go/internal/modfetch/coderepo_test.go 
b/libgo/go/cmd/go/internal/modfetch/coderepo_test.go
index f69c193b86c..9a0cd7dba69 100644
--- a/libgo/go/cmd/go/internal/modfetch/coderepo_test.go
+++ b/libgo/go/cmd/go/internal/modfetch/coderepo_test.go
@@ -655,11 +655,6 @@ var codeRepoVersionsTests = []struct {
path: "swtch.com/testmod",
versions: []string{"v1.0.0", "v1.1.1"},
},
-   {
-   vcs:  "git",
-   path: "gopkg.in/russross/blackfriday.v2",
-   versions: []string{"v2.0.0", "v2.0.1"},
-   },
{
vcs:  "git",
path: "gopkg.in/natefinch/lumberjack.v2",
diff --git a/libgo/go/compress/flate/deflate_test.go 
b/libgo/go/compress/flate/deflate_test.go
index 3362d256cf4..8a22b8e6f92 100644
--- a/libgo/go/compress/flate/deflate_test.go
+++ b/libgo/go/compress/flate/deflate_test.go
@@ -11,6 +11,7 @@ import (
"internal/testenv"
"io"
"io/ioutil"
+   "math/rand"
"reflect"
"runtime/debug"
"sync"
@@ -896,6 +897,62 @@ func TestBestSpeedMaxMatchOffset(t *testing.T) {
}
 }
 
+func TestBestSpeedShiftOffsets(t *testing.T) {
+   // Test if shiftoffsets properly preserves matches and resets 
out-of-range matches
+   // seen in https://github.com/golang/go/issues/4142
+   enc := newDeflateFast()
+
+   // testData may not generate internal matches.
+   testData := make([]byte, 32)
+   rng := rand.New(rand.NewSource(0))
+   for i := range testData {
+   testData[i] = byte(rng.Uint32())
+   }
+
+   // Encode the testdata with clean state.
+   // Second part should pick up matches from the first block.
+   wantFirstTokens := len(enc.encode(nil, testData))
+   wantSecondTokens := len(enc.encode(nil, testData))
+
+   if wantFirstTokens <= wantSecondTokens {
+   t.Fatalf("test needs matches between inputs to be generated")
+   }
+   // Forward the current indicator to before wraparound.
+   enc.cur = bufferReset - int32(len(testData))
+
+   // Part 1 before wrap, should match clean state.
+   got := len(enc.encode(nil, testData))
+   if wantFirstTokens != got {
+   t.Errorf("got %d, want %d tokens", got, wantFirstTokens)
+   }
+
+   // Verify we are about to wrap.
+   if enc.cur != bufferReset {
+   t.Errorf("got %d, want e.cur to be at bufferReset (%d)", 
enc.cur, bufferReset)
+   }
+
+   // Part 2 should match clean state as well even if wrapped.
+   got = len(enc.encode(nil, testData))
+   if wantSecondTokens != got {
+   t.Errorf("got %d, want %d token", got, wantSecondTokens)
+   }
+
+   // Verify that we wrapped.
+   if enc.cur >= bufferReset {
+   t.Errorf("want e.cur to be < bufferReset (%d), got %d", 
bufferReset, enc.cur)
+   }
+
+   // Forward the current buffer, leaving the matches at the bottom.
+   enc.cur = bufferReset
+   enc.shiftOffsets()
+
+   // Ensure that no matches were picked up.
+   got = len(enc.encode(nil, testData))
+   if wantFirstTokens != got {
+   t.Errorf("got %d, want %d tokens", got, wantFirstTokens)
+   }
+}
+
 func TestMaxStackSize(t *testing.T) {
// This test must not run in parallel with other tests as 
debug.SetMaxStack
// affects all goroutines.
diff --git a/libgo/go/compress/flate/deflatefast.go 
b/libgo/go/compress/flate/deflatefast.go
index 24f8be9d5db..6aa439f13d9 100644
--- a/libgo/go/compress/flate/deflatefast.go
+++ b/libgo/go/compress/flate/deflatefast.go
@@ -270,6 +270,7 @@ func (e *deflateFast) matchLen(s, t int32, src []byte) 
int32 {
 func (e *deflateFast) reset() {
e.prev = e.prev[:0]
// Bump the offset, so all matches will fail distance check.
+   // Nothing should be >= e.cur in 

Re: [PATCH] Use extremes of underlying type when building VR_ANTI_RANGEs.

2020-11-10 Thread Aldy Hernandez via Gcc-patches




On 11/10/20 3:26 PM, Andrew MacLeod wrote:

On 11/10/20 5:41 AM, Aldy Hernandez wrote:

[Andrew, this doesn't fix a PR per se.  It's just something I stumbled
upon in my other -fstrict-enums fixes.  It passes tests.  What do you
think?  Should we push it?]

Imagine an enum in an -fstrict-enums world:

enum foo { zero, one, two, three };

Building the inverse of [0,3] currently yields VARYING (instead of
[4,+INF]).  This is because the setter sees 0 and 3 as the extremes of
the type, and per the current code, calculates the inverse to VARYING.
BTW, it really should be UNDEFINED, but this is legacy code I'm afraid
to touch:


Didnt we conclude at one point that neither VARYING nor UNDEFINED can be 
inverted?  There was too much ambiguity about what it meant, so callers 
need to check for undefined and varying before they can use invert. This 
makes sure they get their expected results.


Right, but this is in the setter.  We aren't inverting a varying, but 
creating the inverse of a range that happens to behave as a varying 
under the old rules.  Per the last set of fixes (or even the 
normalization code we had before), [0,3] is not varying.  [0,MAX] is 
varying.  What I'm trying to avoid is ~[0,3] be VARYING.  It should be 
[4,MAX].



in fact, I see this in invert():

gcc_assert (!undefined_p () && !varying_p ());


but it comes after the legacy check..  so maybe there is some attempt to 
preserve legacy behaviour..  dunno.


But you are correct that this is legacy behavior.  I don't know if 
having legacy and multi-range behavior different in this regard is good, 
so I proposed this fix.


I'm a bit indifferent.  We could leave it as is.

Aldy



Re: [PATCH] generalized range_query class for multiple contexts

2020-11-10 Thread Martin Sebor via Gcc-patches

On 11/6/20 2:54 PM, Andrew MacLeod wrote:

On 11/6/20 11:13 AM, Martin Sebor wrote:

On 11/5/20 8:08 PM, Andrew MacLeod wrote:

On 11/5/20 7:50 PM, Martin Sebor wrote:

On 11/5/20 5:02 PM, Andrew MacLeod wrote:

On 11/5/20 4:02 PM, Martin Sebor wrote:

On 11/5/20 12:29 PM, Martin Sebor wrote:

On 10/1/20 11:25 AM, Martin Sebor wrote:

On 10/1/20 9:34 AM, Aldy Hernandez wrote:



On 10/1/20 3:22 PM, Andrew MacLeod wrote:
 > On 10/1/20 5:05 AM, Aldy Hernandez via Gcc-patches wrote:
 >>> Thanks for doing all this!  There isn't anything I don't 
understand
 >>> in the sprintf changes so no questions from me (well, 
almost none).

 >>> Just some comments:
 >> Thanks for your comments on the sprintf/strlen API conversion.
 >>
 >>> The current call statement is available in all functions 
that take
 >>> a directive argument, as dir->info.callstmt.  There should 
be no need
 >>> to also add it as a new argument to the functions that now 
need it.

 >> Fixed.
 >>
 >>> The change adds code along these lines in a bunch of places:
 >>>
 >>> + value_range vr;
 >>> + if (!query->range_of_expr (vr, arg, stmt))
 >>> +   vr.set_varying (TREE_TYPE (arg));
 >>>
 >>> I thought under the new Ranger APIs when a range couldn't be
 >>> determined it would be automatically set to the maximum for
 >>> the type.  I like that and have been moving in that direction
 >>> with my code myself (rather than having an API fail, have it
 >>> set the max range and succeed).
 >> I went through all the above idioms and noticed all are 
being used on
 >> supported types (integers or pointers). So range_of_expr 
will always

 >> return true.  I've removed the if() and the set_varying.
 >>
 >>> Since that isn't so in this case, I think it would still 
be nice
 >>> if the added code could be written as if the range were 
set to
 >>> varying in this case and (ideally) reduced to just 
initialization:

 >>>
 >>> value_range vr = some-function (query, stmt, arg);
 >>>
 >>> some-function could be an inline helper defined just for 
the sprintf
 >>> pass (and maybe also strlen which also seems to use the 
same pattern),
 >>> or it could be a value_range AKA irange ctor, or it could 
be a member

 >>> of range_query, whatever is the most appropriate.
 >>>
 >>> (If assigning/copying a value_range is thought to be too 
expensive,
 >>> declaring it first and then passing it to that helper to 
set it

 >>> would work too).
 >>>
 >>> In strlen, is the removed comment no longer relevant? 
(I.e., does

 >>> the ranger solve the problem?)
 >>>
 >>> -  /* The range below may be "inaccurate" if a 
constant has been
 >>> -    substituted earlier for VAL by this pass that 
hasn't been
 >>> -    propagated through the CFG. This shoud be fixed 
by the new
 >>> -    on-demand VRP if/when it becomes available 
(hopefully in

 >>> -    GCC 11).  */
 >> It should.
 >>
 >>> I'm wondering about the comment added to 
get_range_strlen_dynamic

 >>> and other places:
 >>>
 >>> + // FIXME: Use range_query instead of global ranges.
 >>>
 >>> Is that something you're planning to do in a followup or 
should

 >>> I remember to do it at some point?
 >> I'm not planning on doing it.  It's just a reminder that it 
would be

 >> beneficial to do so.
 >>
 >>> Otherwise I have no concern with the changes.
 >> It's not cleared whether Andrew approved all 3 parts of the 
patchset
 >> or just the valuation part.  I'll wait for his nod before 
committing

 >> this chunk.
 >>
 >> Aldy
 >>
 > I have no issue with it, so OK.

Pushed all 3 patches.

 >
 > Just an observation that should be pointed out, I believe 
Aldy has all
 > the code for converting to a ranger, but we have not pursued 
that any
 > further yet since there is a regression due to our lack of 
equivalence
 > processing I think?  That should be resolved in the coming 
month, but at
 > the moment is a holdback/concern for converting these 
passes... iirc.


Yes.  Martin, the take away here is that the strlen/sprintf 
pass has been converted to the new API, but ranger is still not 
up and running on it (even on the branch).


With the new API, all you have to do is remove all instances of 
evrp_range_analyzer and replace them with a ranger. That's it.
Below is an untested patch that would convert you to a ranger 
once it's contributed.


IIRC when I enabled the ranger for your pass a while back, 
there was one or two regressions due to missing equivalences, 
and the rest were because the tests were expecting an actual 
specific range, and the ranger returned a slightly 
different/better one. You'll need to adjust your tests.


Ack.  I'll be on the lookout for the ranger commit (if you hppen
to remember and CC me on it just in case I might miss it that would
be great).


I have applied the patch and ran some tests.  There are quite
a few failures (see the list below).  I have only looked at
a couple.  The one in in gcc.dg/tree-ssa/builtin-sprintf-warn-3.c
boils down to the 

Re: [PATCH] "used" attribute saves decl from linker garbage collection

2020-11-10 Thread Jozef Lawrynowicz
On Mon, Nov 09, 2020 at 12:31:09PM -0800, H.J. Lu via Gcc-patches wrote:
> On Mon, Nov 9, 2020 at 11:56 AM Jozef Lawrynowicz
>  wrote:
> >
> > On Mon, Nov 09, 2020 at 10:36:07AM -0800, H.J. Lu via Gcc-patches wrote:
> > > On Mon, Nov 9, 2020 at 9:41 AM Jozef Lawrynowicz
> > >  wrote:
> > > >
> > > > On Fri, Nov 06, 2020 at 04:39:33PM -0800, H.J. Lu via Gcc-patches wrote:
> > > > > On Fri, Nov 6, 2020 at 4:17 PM Jeff Law  wrote:
> > > > > >
> > > > > >
> > > > > > On 11/6/20 5:13 PM, H.J. Lu wrote:
> > > > > > > On Fri, Nov 6, 2020 at 4:01 PM Jeff Law  wrote:
> > > > > > >>
> > > > > > >> On 11/6/20 4:45 PM, H.J. Lu wrote:
> > > > > > >>> On Fri, Nov 6, 2020 at 3:37 PM Jeff Law  wrote:
> > > > > >  On 11/6/20 4:29 PM, H.J. Lu wrote:
> > > > > > > On Fri, Nov 6, 2020 at 3:22 PM Jeff Law  
> > > > > > > wrote:
> > > > > > >> On 11/5/20 7:34 AM, H.J. Lu via Gcc-patches wrote:
> > > > > > >>> On Thu, Nov 5, 2020 at 3:37 AM Jozef Lawrynowicz
> > > > > > >>>  wrote:
> > > > > >  On Thu, Nov 05, 2020 at 06:21:21AM -0500, Hans-Peter 
> > > > > >  Nilsson wrote:
> > > > > > > On Wed, 4 Nov 2020, H.J. Lu wrote:
> > > > > > >> .retain is ill-defined.   For example,
> > > > > > >>
> > > > > > >> [hjl@gnu-cfl-2 gcc]$ cat /tmp/x.c
> > > > > > >> static int xyzzy __attribute__((__used__));
> > > > > > >> [hjl@gnu-cfl-2 gcc]$ ./xgcc -B./ -S /tmp/x.c -fcommon
> > > > > > >> [hjl@gnu-cfl-2 gcc]$ cat x.s
> > > > > > >> .file "x.c"
> > > > > > >> .text
> > > > > > >> .retain xyzzy  < What does it do?
> > > > > > >> .local xyzzy
> > > > > > >> .comm xyzzy,4,4
> > > > > > >> .ident "GCC: (GNU) 11.0.0 20201103 (experimental)"
> > > > > > >> .section .note.GNU-stack,"",@progbits
> > > > > > >> [hjl@gnu-cfl-2 gcc]$
> > > > > > > To answer that question: it's up to the assembler, but 
> > > > > > > for ELF
> > > > > > > and SHF_GNU_RETAIN, it seems obvious it'd tell the 
> > > > > > > assembler to
> > > > > > > set SHF_GNU_RETAIN for the section where the symbol ends 
> > > > > > > up.
> > > > > > > We both know this isn't rocket science with binutils.
> > > > > >  Indeed, and my patch handles it trivially:
> > > > > >  https://sourceware.org/pipermail/binutils/2020-November/113993.html
> > > > > > 
> > > > > >    +void
> > > > > >    +obj_elf_retain (int arg ATTRIBUTE_UNUSED)
> > > > > >     snip 
> > > > > >    +  sym = get_sym_from_input_line_and_check ();
> > > > > >    +  symbol_get_obj (sym)->retain = 1;
> > > > > > 
> > > > > >    @@ -2624,6 +2704,9 @@ elf_frob_symbol (symbolS *symp, 
> > > > > >  int *puntp)
> > > > > >  }
> > > > > > }
> > > > > > 
> > > > > >    +  if (symbol_get_obj (symp)->retain)
> > > > > >    +elf_section_flags (S_GET_SEGMENT (symp)) |= 
> > > > > >  SHF_GNU_RETAIN;
> > > > > >    +
> > > > > >   /* Double check weak symbols.  */
> > > > > >   if (S_IS_WEAK (symp))
> > > > > > {
> > > > > > 
> > > > > >  We could check that the symbol named in the .retain 
> > > > > >  directive has
> > > > > >  already been defined, however this isn't compatible with 
> > > > > >  GCC
> > > > > >  mark_decl_preserved handling, since mark_decl_preserved is 
> > > > > >  called
> > > > > >  emitted before the local symbols are defined in the 
> > > > > >  assembly output
> > > > > >  file.
> > > > > > 
> > > > > >  GAS should at least validate that the symbol named in the 
> > > > > >  .retain
> > > > > >  directive does end up as a symbol though.
> > > > > > 
> > > > > > >>> Don't add .retain.
> > > > > > >> Why?  I don't see why you find it so objectionable.
> > > > > > >>
> > > > > > > An ELF symbol directive should operate on symbol table:
> > > > > > >
> > > > > > > http://www.sco.com/developers/gabi/latest/ch4.symtab.html
> > > > > > >
> > > > > > > not the section flags where the symbol is defined.
> > > > > >  I agree in general, but I think this is one of those cases 
> > > > > >  where it's
> > > > > >  not so clear.  And what you're talking about is an 
> > > > > >  implementation detail.
> > > > > > >>> There is no need for such a hack.  The proper thing to do in 
> > > > > > >>> ELF is
> > > > > > >>> to place such a symbol in a section with SHF_GNU_RETAIN flag.   
> > > > > > >>> This
> > > > > > >>> also avoids the question what to do with SHN_COMMON.
> > > > > > >> I'm not sure that's a good idea either.  Moving symbols into a 
> > > > > > >> section
> > > > > > >> other than they'd normally live doesn't seem all 

Re: [PATCH] Cleanup irange::set.

2020-11-10 Thread Richard Sandiford via Gcc-patches
Aldy Hernandez  writes:
>> (actually I can see 3245 ICEs on aarch64)
>> 
>> Can you fix it?
>
> Sure can.
>
> Richard, I seem to have incorrectly removed the early exit for varying, 
> and that affected the changes you made for poly ints.  Is there any 
> reason we can't just exit and set varying without checking for kind != 
> VR_VARYING?

No reason, it seemed more natural to drop to a lower kind with the old
code :-)  (But not with the new code.)

But it isn't obvious to me why the code is now structed the way it is.

  if (POLY_INT_CST_P (min) || POLY_INT_CST_P (max))
{
  set_varying (TREE_TYPE (min));
  return;
}

  // Nothing to canonicalize for symbolic ranges.
  if (TREE_CODE (min) != INTEGER_CST
  || TREE_CODE (max) != INTEGER_CST)
{
  m_kind = kind;
  m_base[0] = min;
  m_base[1] = max;
  m_num_ranges = 1;
  return;
}

  swap_out_of_order_endpoints (min, max, kind);
  if (kind == VR_VARYING)
{
  set_varying (TREE_TYPE (min));
  return;
}

Why do we want to check “min” and “max” being INTEGER_CST before “kind”
being VR_VARYING, and the potentially record VR_VARYING with specific
bounds?  And why do we want to swap the “min” and “max” before checking
whether “kind” is VR_VARYING (when we'll then drop the min and max anyway)?
I think this would benefit from a bit more commentary at least.

Thanks,
Richard


>
> The attached patch fixes the problem for aarch64.
>
> Testing in progress...
> Aldy
>
>  Early exit from irange::set for poly ints.
>
>  My previous cleanups to irange::set moved the early exit when
>  VARYING.  This caused poly int varyings to be created with
>  incorrect min/max.
>
>  We can just set varying and exit for all poly ints.
>
>  gcc/ChangeLog:
>
>  * value-range.cc (irange::set): Early exit for poly ints.
>
> diff --git a/gcc/value-range.cc b/gcc/value-range.cc
> index 61f7da278d6..178ef1b0a4f 100644
> --- a/gcc/value-range.cc
> +++ b/gcc/value-range.cc
> @@ -249,9 +249,11 @@ irange::set (tree min, tree max, value_range_kind kind)
> return;
>   }
>
> -  if (kind != VR_VARYING
> -  && (POLY_INT_CST_P (min) || POLY_INT_CST_P (max)))
> -kind = VR_VARYING;
> +  if (POLY_INT_CST_P (min) || POLY_INT_CST_P (max))
> +{
> +  set_varying (TREE_TYPE (min));
> +  return;
> +}
>
> // Nothing to canonicalize for symbolic ranges.
> if (TREE_CODE (min) != INTEGER_CST


Re: [PATCH] c, c++: Fix up -Wunused-value on COMPLEX_EXPRs [PR97748]

2020-11-10 Thread Jason Merrill via Gcc-patches

On 11/10/20 3:48 AM, Jakub Jelinek wrote:

On Mon, Nov 09, 2020 at 10:34:00PM +0100, Jakub Jelinek via Gcc-patches wrote:

On Mon, Nov 09, 2020 at 02:35:58PM -0500, Jason Merrill wrote:

How about calling warn_if_unused_value instead of the new
warn_if_unused_value_p?


That seems to work if I just replace the warning_at call with
warn_if_unused_value call (at least no regression in check-c++-all and
libstdc++ testsuite).
Initially I've tried just calling warn_if_unused_value without the NOP_EXPR
stripping and code/tclass checks, but that regressed a few tests, e.g.
g++.dg/warn/Wunused-14.C or c-c++-common/Wunused-var-9.c or 3 lines
in the new test, e.g. because STATEMENT_LIST or CLEANUP_POINT_EXPRs would
make it through and resulted in bogus warnings.


Bootstrapped/regtested on x86_64-linux and i686-linux successfully now.
If we wanted to simplify, I think the && !(code == ) part could be
dropped too, i.e. just
  if (tclass == tcc_comparison
  || tclass == tcc_unary
  || tclass == tcc_binary
  || code == VEC_PERM_EXPR
  || code == VEC_COND_EXPR)
warn_if_unused_value (e, loc);
because warn_if_unused_value already returns false on those codes.


OK with that change.

Jason



Re: Detect EAF flags in ipa-modref

2020-11-10 Thread Jan Hubicka
Hi,
here is updaed patch.

Honza

Bootstrapped/regtested x86_64-linux, OK (after the fnspec fixes)?

2020-11-10  Jan Hubicka  

* gimple.c: Include ipa-modref-tree.h and ipa-modref.h.
(gimple_call_arg_flags): Use modref to determine flags.
* ipa-modref.c: Include gimple-ssa.h, tree-phinodes.h,
tree-ssa-operands.h, stringpool.h and tree-ssanames.h.
(analyze_ssa_name_flags): Declare.
(modref_summary::useful_p): Summary is also useful if arg flags are
known.
(dump_eaf_flags): New function.
(modref_summary::dump): Use it.
(get_modref_function_summary): Be read for current_function_decl
being NULL.
(memory_access_to): New function.
(deref_flags): New function.
(call_lhs_flags): New function.
(analyze_parms): New function.
(analyze_function): Use it.
* ipa-modref.h (struct modref_summary): Add arg_flags.
* doc/invoke.texi (ipa-modref-max-depth): Document.
* params.opt (ipa-modref-max-depth): New param.

gcc/testsuite/ChangeLog:

2020-11-10  Jan Hubicka  

* gcc.dg/torture/pta-ptrarith-1.c: Escape parametrs.

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index d2a188d7c75..0bd76d2841e 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -12953,6 +12953,10 @@ memory locations using the mod/ref information.  This 
parameter ought to be
 bigger than @option{--param ipa-modref-max-bases} and @option{--param
 ipa-modref-max-refs}.
 
+@item ipa-modref-max-depth
+Specified the maximum depth of DFS walk used by modref escape analysis.
+Setting to 0 disables the analysis completely.
+
 @item profile-func-internal-id
 A parameter to control whether to use function internal id in profile
 database lookup. If the value is 0, the compiler uses an id that
diff --git a/gcc/gimple.c b/gcc/gimple.c
index 1afed88e1f1..da90716aa23 100644
--- a/gcc/gimple.c
+++ b/gcc/gimple.c
@@ -46,6 +46,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "asan.h"
 #include "langhooks.h"
 #include "attr-fnspec.h"
+#include "ipa-modref-tree.h"
+#include "ipa-modref.h"
 
 
 /* All the tuples have their operand vector (if present) at the very bottom
@@ -1532,24 +1534,45 @@ int
 gimple_call_arg_flags (const gcall *stmt, unsigned arg)
 {
   attr_fnspec fnspec = gimple_call_fnspec (stmt);
-
-  if (!fnspec.known_p ())
-return 0;
-
   int flags = 0;
 
-  if (!fnspec.arg_specified_p (arg))
-;
-  else if (!fnspec.arg_used_p (arg))
-flags = EAF_UNUSED;
-  else
+  if (fnspec.known_p ())
 {
-  if (fnspec.arg_direct_p (arg))
-   flags |= EAF_DIRECT;
-  if (fnspec.arg_noescape_p (arg))
-   flags |= EAF_NOESCAPE;
-  if (fnspec.arg_readonly_p (arg))
-   flags |= EAF_NOCLOBBER;
+  if (!fnspec.arg_specified_p (arg))
+   ;
+  else if (!fnspec.arg_used_p (arg))
+   flags = EAF_UNUSED;
+  else
+   {
+ if (fnspec.arg_direct_p (arg))
+   flags |= EAF_DIRECT;
+ if (fnspec.arg_noescape_p (arg))
+   flags |= EAF_NOESCAPE;
+ if (fnspec.arg_readonly_p (arg))
+   flags |= EAF_NOCLOBBER;
+   }
+}
+  tree callee = gimple_call_fndecl (stmt);
+  if (callee)
+{
+  cgraph_node *node = cgraph_node::get (callee);
+  modref_summary *summary = node ? get_modref_function_summary (node)
+   : NULL;
+
+  if (summary && summary->arg_flags.length () > arg)
+   {
+ int modref_flags = summary->arg_flags[arg];
+
+ /* We have possibly optimized out load.  Be conservative here.  */
+ if (!node->binds_to_current_def_p ())
+   {
+ if ((modref_flags & EAF_UNUSED) && !(flags & EAF_UNUSED))
+   modref_flags &= ~EAF_UNUSED;
+ if ((modref_flags & EAF_DIRECT) && !(flags & EAF_DIRECT))
+   modref_flags &= ~EAF_DIRECT;
+   }
+ flags |= modref_flags;
+   }
 }
   return flags;
 }
diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c
index 3f46bebed3c..30e76580fb0 100644
--- a/gcc/ipa-modref.c
+++ b/gcc/ipa-modref.c
@@ -61,6 +61,15 @@ along with GCC; see the file COPYING3.  If not see
 #include "ipa-fnsummary.h"
 #include "attr-fnspec.h"
 #include "symtab-clones.h"
+#include "gimple-ssa.h"
+#include "tree-phinodes.h"
+#include "tree-ssa-operands.h"
+#include "ssa-iterators.h"
+#include "stringpool.h"
+#include "tree-ssanames.h"
+
+static int analyze_ssa_name_flags (tree name,
+  vec _flags, int depth);
 
 /* We record fnspec specifiers for call edges since they depends on actual
gimple statements.  */
@@ -186,6 +195,8 @@ modref_summary::useful_p (int ecf_flags)
 {
   if (ecf_flags & (ECF_CONST | ECF_NOVOPS))
 return false;
+  if (arg_flags.length ())
+return true;
   if (loads && !loads->every_base)
 return true;
   if (ecf_flags & ECF_PURE)
@@ -355,6 +366,22 @@ dump_lto_records (modref_records_lto *tt, FILE 

Re: [PATCH] Use extremes of underlying type when building VR_ANTI_RANGEs.

2020-11-10 Thread Andrew MacLeod via Gcc-patches

On 11/10/20 5:41 AM, Aldy Hernandez wrote:

[Andrew, this doesn't fix a PR per se.  It's just something I stumbled
upon in my other -fstrict-enums fixes.  It passes tests.  What do you
think?  Should we push it?]

Imagine an enum in an -fstrict-enums world:

enum foo { zero, one, two, three };

Building the inverse of [0,3] currently yields VARYING (instead of
[4,+INF]).  This is because the setter sees 0 and 3 as the extremes of
the type, and per the current code, calculates the inverse to VARYING.
BTW, it really should be UNDEFINED, but this is legacy code I'm afraid
to touch:


Didnt we conclude at one point that neither VARYING nor UNDEFINED can be 
inverted?  There was too much ambiguity about what it meant, so callers 
need to check for undefined and varying before they can use invert.  
This makes sure they get their expected results.


in fact, I see this in invert():

gcc_assert (!undefined_p () && !varying_p ());


but it comes after the legacy check..  so maybe there is some attempt to 
preserve legacy behaviour..  dunno.





   if (is_min && is_max)
{
  /* We cannot deal with empty ranges, drop to varying.
 ???  This could be VR_UNDEFINED instead.  */
  set_varying (type);
  return;
}

However, the extremes of a VARYING are the extremes of the underlying
type ([0, 0xff..ff]), even in the presence of -fstrict-enums.

This fixes the setter to use the extremes of the underlying type.

gcc/ChangeLog:

* value-range.cc (irange::set): Use wi::{min,max}_value
instead of TYPE_{MIN,MAX}_VALUE when building anti-ranges.
(range_tests_strict_enum): New tests.
---
  gcc/value-range.cc | 27 +--
  1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/gcc/value-range.cc b/gcc/value-range.cc
index f83a824a982..bbd7d69a766 100644
--- a/gcc/value-range.cc
+++ b/gcc/value-range.cc
@@ -274,11 +274,13 @@ irange::set (tree min, tree max, value_range_kind kind)
// Anti-ranges that can be represented as ranges should be so.
if (kind == VR_ANTI_RANGE)
  {
-  /* For -fstrict-enums we may receive out-of-range ranges so consider
- values < -INF and values > INF as -INF/INF as well.  */
-  bool is_min = vrp_val_is_min (min);
-  bool is_max = vrp_val_is_max (max);
tree type = TREE_TYPE (min);
+  unsigned prec = TYPE_PRECISION (type);
+  signop sign = TYPE_SIGN (type);
+  wide_int type_max = wi::max_value (prec, sign);
+  wide_int type_min = wi::min_value (prec, sign);
+  bool is_min = wi::to_wide (min) == type_min;
+  bool is_max = wi::to_wide (max) == type_max;
  
if (is_min && is_max)

{
@@ -302,14 +304,14 @@ irange::set (tree min, tree max, value_range_kind kind)
  {
  tree one = build_int_cst (TREE_TYPE (max), 1);
  min = int_const_binop (PLUS_EXPR, max, one);
- max = vrp_val_max (TREE_TYPE (max));
+ max = wide_int_to_tree (type, type_max);
  kind = VR_RANGE;
  }
else if (is_max)
  {
  tree one = build_int_cst (TREE_TYPE (min), 1);
  max = int_const_binop (MINUS_EXPR, min, one);
- min = vrp_val_min (TREE_TYPE (min));
+ min = wide_int_to_tree (type, type_min);
  kind = VR_RANGE;
  }
  }
@@ -2276,6 +2278,19 @@ range_tests_strict_enum ()
ir1 = vr1;
ASSERT_TRUE (ir1 == vr1);
ASSERT_FALSE (ir1.varying_p ());
+
+  // Legacy inversion of [0, 3] should be [4, MAX].
+  vr1 = int_range<1> (build_int_cstu (rtype, 0), build_int_cstu (rtype, 3));
+  vr1.invert ();
+  tree type_max = wide_int_to_tree (rtype,
+   wi::max_value (TYPE_PRECISION (rtype),
+  TYPE_SIGN (rtype)));
+  ASSERT_TRUE (vr1 == int_range<1> (build_int_cstu (rtype, 4), type_max));
+
+  // Multi-range inversion of [0, 3] should be [4, MAX].
+  ir1 = int_range<2> (build_int_cstu (rtype, 0), build_int_cstu (rtype, 3));
+  ir1.invert ();
+  ASSERT_TRUE (vr1 == int_range<2> (build_int_cstu (rtype, 4), type_max));
  }
  
  static void




Re: RFC: Monitoring old PRs, new dg directives [v2]

2020-11-10 Thread Thomas Schwinge
Hi!

On 2020-08-10T17:30:21-0400, Marek Polacek via Gcc-patches 
 wrote:
> This patch adds a new DejaGNU directive, dg-ice, as outlined in the
> proposal here:
> https://gcc.gnu.org/pipermail/gcc-patches/2020-July/550913.html
>
> It means that it's expected that the compiler crashes with an internal
> compiler error when compiling test with such a directive.

Thanks, I find this useful.


So I have a testcase that currently ICEs...  ;-)

spawn -ignore SIGHUP [xgcc]
during GIMPLE pass: omplower
In file included from [...]:4:
[...]: In function 'f_data':
[...]:41:10: internal compiler error: in lower_omp_target, at 
omp-low.c:11890
[backtrace]
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.
compiler exited with status 1

I put '{ dg-ice }' into it, and do get the expected:

XFAIL: [...] (internal compiler error)'

But I also get an unexpected:

FAIL: [...] (test for excess errors)'

That's because of:

Excess errors:
during GIMPLE pass: omplower

I can easily '{ dg-prune-output "during GIMPLE pass: omplower" }', of
course, or should this lead-in be pruned internally, automatically?


And then, this is a '{ dg-do run }' testcase, I see:

UNRESOLVED: 
libgomp.oacc-c/../libgomp.oacc-c-c++-common/declare-vla-kernels-decompose.c 
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0  
compilation failed to produce executable

Given '{ dg-ice }', it's obvious that we don't get an executable to run.
I thought that maybe '{ dg-ice }' testcases should automatically be
demoted into '{ dg-do compile }', but that would be wrong for link-time
ICEs.  But can we "expect" the UNRESOLVED, and turn that into another
XFAIL?  Aha, actually I see that for '{ dg-do link }', this seems to
behave as expected (no UNRESOLVED), so we might demote all '{ dg-ice }'
testcases from '{ dg-do run }' to '{ dg-do link }', if that makes sense?


Grüße
 Thomas


> diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
> index 63216a0daba..967cb135cb4 100644
> --- a/gcc/doc/sourcebuild.texi
> +++ b/gcc/doc/sourcebuild.texi
> @@ -1172,6 +1172,16 @@ Expect the execute step of a test to fail if the 
> conditions (which are
>  the same as for @code{dg-skip-if}) are met.
>  @end table
>
> +@subsubsection Expect the compiler to crash
> +
> +@table @code
> +@item  @{ dg-ice @var{comment} [@{ @var{selector} @} [@{ @var{include-opts} 
> @} [@{ @var{exclude-opts} @}]]] @}
> +Expect the compiler to crash with an internal compiler error and return
> +a nonzero exit status if the conditions (which are the same as for
> +@code{dg-skip-if}) are met.  Used for tests that test bugs that have not been
> +fixed yet.
> +@end table
> +
>  @subsubsection Expect the test executable to fail
>
>  @table @code
> diff --git a/gcc/testsuite/lib/gcc-dg.exp b/gcc/testsuite/lib/gcc-dg.exp
> index 45d97024883..e8ad3052657 100644
> --- a/gcc/testsuite/lib/gcc-dg.exp
> +++ b/gcc/testsuite/lib/gcc-dg.exp
> @@ -308,13 +308,27 @@ proc gcc-dg-test-1 { target_compile prog do_what 
> extra_tool_flags } {
>  verbose "$target_compile $prog $output_file $compile_type $options" 4
>  set comp_output [$target_compile "$prog" "$output_file" "$compile_type" 
> $options]
>
> +global expect_ice
>  # Look for an internal compiler error, which sometimes masks the fact
>  # that we didn't get an expected error message.  XFAIL an ICE via
>  # dg-xfail-if and use { dg-prune-output ".*internal compiler error.*" }
>  # to avoid a second failure for excess errors.
> -if [string match "*internal compiler error*" $comp_output] {
> +# "Error reporting routines re-entered" ICE says "Internal" rather than
> +# "internal", so match that too.
> +if [string match {*[Ii]nternal compiler error*} $comp_output] {
>   upvar 2 name name
> - fail "$name (internal compiler error)"
> + if { $expect_ice == 0 } {
> +   fail "$name (internal compiler error)"
> + } else {
> +   # We expected an ICE and we got it.
> +   xfail "$name (internal compiler error)"
> +   # Prune the ICE from the output.
> +   set comp_output [prune_ices $comp_output]
> + }
> +} elseif { $expect_ice == 1 } {
> + upvar 2 name name
> + # We expected an ICE but we didn't get it.
> + xpass "$name (internal compiler error)"
>  }
>
>  if { $do_what == "repo" } {
> @@ -939,6 +953,7 @@ if { [info procs saved-dg-test] == [list] } {
>   global additional_prunes
>   global compiler_conditional_xfail_data
>   global shouldfail
> + global expect_ice
>   global testname_with_flags
>   global set_target_env_var
>   global set_compiler_env_var
> @@ -954,6 +969,7 @@ if { [info procs saved-dg-test] == [list] } {
>   set additional_sources_used ""
>   set additional_prunes ""
>   set shouldfail 0

Re: [PATCH] Cleanup irange::set.

2020-11-10 Thread Aldy Hernandez via Gcc-patches

(actually I can see 3245 ICEs on aarch64)

Can you fix it?


Sure can.

Richard, I seem to have incorrectly removed the early exit for varying, 
and that affected the changes you made for poly ints.  Is there any 
reason we can't just exit and set varying without checking for kind != 
VR_VARYING?


The attached patch fixes the problem for aarch64.

Testing in progress...
Aldy

Early exit from irange::set for poly ints.

My previous cleanups to irange::set moved the early exit when
VARYING.  This caused poly int varyings to be created with
incorrect min/max.

We can just set varying and exit for all poly ints.

gcc/ChangeLog:

* value-range.cc (irange::set): Early exit for poly ints.

diff --git a/gcc/value-range.cc b/gcc/value-range.cc
index 61f7da278d6..178ef1b0a4f 100644
--- a/gcc/value-range.cc
+++ b/gcc/value-range.cc
@@ -249,9 +249,11 @@ irange::set (tree min, tree max, value_range_kind kind)
   return;
 }

-  if (kind != VR_VARYING
-  && (POLY_INT_CST_P (min) || POLY_INT_CST_P (max)))
-kind = VR_VARYING;
+  if (POLY_INT_CST_P (min) || POLY_INT_CST_P (max))
+{
+  set_varying (TREE_TYPE (min));
+  return;
+}

   // Nothing to canonicalize for symbolic ranges.
   if (TREE_CODE (min) != INTEGER_CST



[Patch, fortran] PR83118 - [8/9/10/11 Regression] Bad intrinsic assignment of class(*) array component of derived type

2020-11-10 Thread Paul Richard Thomas via Gcc-patches
Hi Everyone,

I am afraid that this is a rather long sad story, mainly due to my efforts
with gfortran being interrupted by daytime work. I posted the first version
of the patch nearly a year ago but this was derailed by Tobias's question
at: https://gcc.gnu.org/legacy-ml/fortran/2019-11/msg00098.html

My recent attempt to post this patch were disrupted by the patch itself
disappearing from the posting. Thanks to Andre and Thomas for pointing this
out. Since then, I have been working on downstream PRs and this has led to
a reworking of the unposted version.

(i) The attached fixes the original problem and is tested by
gfortran.dg/unlimited_polymorphic_32.f03.
(ii) In fixing the original problem, a fair amount of effort was required
to get the element length correct for class temporaries produced by
dependencies in class assignment. This is reflected in the changes to
trans_array.c(gfc_alloc_allocatable_for_assignment) and the new function
get_class_info_from_ss.
(iii) Tobias's testcase in the above posting to the list didn't address
itself to class arrays of the original problem. However, it revealed that
reallocation was not occuring at all for scalar assignments.  This is fixed
by the large chunk in trans-expr.c(trans_class_assignment). The array case
is 'fixed' by testing for unequal element sizes between lhs and rhs before
reallocation in gfc_alloc_allocatable_for_assignment. This is difficult to
test for since, in most cases, the system returns that same address after
reallocation.
(iv) dependency_57.f90 segfaulted at runtime. The other work in
trans_class_assignment was required to fix this.
(v) A number of minor tidy ups were done including the new function
gfc_resize_class_size_with_len to eliminate some repeated code.

Note: Chunks of code are coming within scalarization loops that should be
outside:
  x->_vptr = (struct __vtype__STAR * {ref-all})
&__vtab_INTEGER_4_;
  x->_len = 0;
  D.3977 = x->_vptr->_size;
  D.3978 = x->_len;
  D.3979 = D.3978 > 0 ? D.3977 * D.3978 : D.3977;
also in many cases of class assignment, the lhs vptr is being set more than
once outside the loop when temporaries are involved. I will try to iron out
these issues later on.

This all bootstraps and regtests on FC31/x86_64 - OK for master?

Cheers

Paul

As well as the PR this patch fixes problems in handling class objects;
most importantly class array temporaries, required when dependences
occur in class assignment, and a correct implementation of reallocation
on assignment.

2020-11-10  Paul Thomas  

gcc/fortran
PR fortran/83118
* resolve.c (resolve_ordinary_assign): Generate a vtable if
necessary for scalar non-polymorphic rhs's to unlimited lhs's.
* trans-array.c (get_class_info_from_ss): New function.
(gfc_trans_allocate_array_storage): Defer obtaining class
element type until all sources of class exprs are tried. Use
class API rather than TREE_OPERAND. Look for class expressions
in ss->info by calling get_class_info_from_ss. After obtain
the element size for class descriptors. Where the element type
is unknown, cast the data as character(len=size) to overcome
unlimited polymorphic problems.
(structure_alloc_comps): Replace code that replicates the new
function gfc_resize_class_size_with_len.
(gfc_alloc_allocatable_for_assignment): Obtain element size
for lhs in cases of deferred characters and class enitities.
Move code for the element size of rhs to start of block. Clean
up extraction of class parameters throughout this function.
After the shape check test whether or not the lhs and rhs
element sizes are the same. Use earlier evaluation of
'cond_null'. Reallocation of lhs only to happen if size changes
or element size changes.
* trans-expr.c (gfc_resize_class_size_with_len): New function.
(gfc_conv_procedure_call): Ensure the vtable is present for
passing a non-class actual to an unlimited formal.
(trans_class_vptr_len_assignment): For expressions of type
BT_CLASS, extract the class expression if necessary. Use a
statement block outside the loop body. Ensure that 'rhs' is
of the correct type. Obtain rhs vptr in all circumstances.
(gfc_trans_assignment_1): Simplify some of the logic with
'realloc_flag'. Set 'vptr_copy' for all array assignments to
unlimited polymorphic lhs.
* trans-c (gfc_build_array_ref): Call gfc_resize_class_size_
with_len to correct span for unlimited polymorphic decls.
* trans.h : Add prototype for gfc_resize_class_size_with_len.

gcc/testsuite/
PR fortran/83118
* gfortran.dg/dependency_57.f90: Change to dg-run and test for correct
result.
* gfortran.dg/unlimited_polymorphic_32.f03: New test.


unlimited_polymorphic_32.f03
Description: Binary data
diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c
index 1641eb6ca10..daa947af9d1 100644
--- a/gcc/fortran/resolve.c
+++ b/gcc/fortran/resolve.c
@@ -11054,7 +11054,7 @@ resolve_ordinary_assign (gfc_code *code, gfc_namespace *ns)
 
   /* Make sure there is a vtable 

[PATCH] aarch64: Make use of RTL predicates

2020-11-10 Thread Andrea Corallo via Gcc-patches
Hi all,

I'd like to propose this patch to make use of RTL predicates into the
AArch64 back-end where possible.

Bootstrapped and regtested on aarch64-unknown-linux-gnu.

Okay for trunk?

Thanks

  Andrea

>From 9a027963f228b5c11aba30537cf888bba09bd6c2 Mon Sep 17 00:00:00 2001
From: Andrea Corallo 
Date: Tue, 10 Nov 2020 11:23:15 +
Subject: [PATCH] aarch64: Make use of RTL predicates

2020-11-10  Andrea Corallo  

* config/aarch64/aarch64.c (tls_symbolic_operand_type)
(aarch64_load_symref_appropriately, aarch64_mov128_immediate)
(aarch64_expand_mov_immediate)
(aarch64_maybe_expand_sve_subreg_move)
(aarch64_tls_referenced_p, aarch64_cannot_force_const_mem)
(aarch64_base_register_rtx_p, aarch64_classify_index)
(aarch64_classify_address, aarch64_symbolic_address_p)
(aarch64_reinterpret_float_as_int, aarch64_float_const_rtx_p)
(aarch64_can_const_movi_rtx_p, aarch64_select_cc_mode)
(aarch64_print_operand, aarch64_label_mentioned_p)
(aarch64_secondary_reload, aarch64_preferred_reload_class)
(aarch64_address_cost, aarch64_tls_symbol_p)
(aarch64_classify_symbol, aarch64_legitimate_pic_operand_p)
(aarch64_legitimate_constant_p)
(aarch64_sve_float_arith_immediate_p)
(aarch64_sve_float_mul_immediate_p, aarch64_mov_operand_p)
(fusion_load_store): Use RTL predicates where possible.
---
 gcc/config/aarch64/aarch64.c | 80 ++--
 1 file changed, 40 insertions(+), 40 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 97cb68980e9..26cac5f788b 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -2985,7 +2985,7 @@ tls_symbolic_operand_type (rtx addr)
   enum tls_model tls_kind = TLS_MODEL_NONE;
   poly_int64 offset;
   addr = strip_offset_and_salt (addr, );
-  if (GET_CODE (addr) == SYMBOL_REF)
+  if (SYMBOL_REF_P (addr))
 tls_kind = SYMBOL_REF_TLS_MODEL (addr);
 
   return tls_kind;
@@ -3126,7 +3126,7 @@ aarch64_load_symref_appropriately (rtx dest, rtx imm,
/* The operand is expected to be MEM.  Whenever the related insn
   pattern changed, above code which calculate mem should be
   updated.  */
-   gcc_assert (GET_CODE (mem) == MEM);
+   gcc_assert (MEM_P (mem));
MEM_READONLY_P (mem) = 1;
MEM_NOTRAP_P (mem) = 1;
emit_insn (insn);
@@ -3169,7 +3169,7 @@ aarch64_load_symref_appropriately (rtx dest, rtx imm,
mem = XVECEXP (XEXP (SET_SRC (insn), 0), 0, 0);
  }
 
-   gcc_assert (GET_CODE (mem) == MEM);
+   gcc_assert (MEM_P (mem));
MEM_READONLY_P (mem) = 1;
MEM_NOTRAP_P (mem) = 1;
emit_insn (insn);
@@ -4235,7 +4235,7 @@ aarch64_internal_mov_immediate (rtx dest, rtx imm, bool 
generate,
 bool
 aarch64_mov128_immediate (rtx imm)
 {
-  if (GET_CODE (imm) == CONST_INT)
+  if (CONST_INT_P (imm))
 return true;
 
   gcc_assert (CONST_WIDE_INT_NUNITS (imm) == 2);
@@ -5099,8 +5099,8 @@ aarch64_expand_mov_immediate (rtx dest, rtx imm)
 
   /* Check on what type of symbol it is.  */
   scalar_int_mode int_mode;
-  if ((GET_CODE (imm) == SYMBOL_REF
-   || GET_CODE (imm) == LABEL_REF
+  if ((SYMBOL_REF_P (imm)
+   || LABEL_REF_P (imm)
|| GET_CODE (imm) == CONST
|| GET_CODE (imm) == CONST_POLY_INT)
   && is_a  (mode, _mode))
@@ -5390,9 +5390,9 @@ bool
 aarch64_maybe_expand_sve_subreg_move (rtx dest, rtx src)
 {
   gcc_assert (BYTES_BIG_ENDIAN);
-  if (GET_CODE (dest) == SUBREG)
+  if (SUBREG_P (dest))
 dest = SUBREG_REG (dest);
-  if (GET_CODE (src) == SUBREG)
+  if (SUBREG_P (src))
 src = SUBREG_REG (src);
 
   /* The optimization handles two single SVE REGs with different element
@@ -8533,7 +8533,7 @@ aarch64_tls_referenced_p (rtx x)
   FOR_EACH_SUBRTX (iter, array, x, ALL)
 {
   const_rtx x = *iter;
-  if (GET_CODE (x) == SYMBOL_REF && SYMBOL_REF_TLS_MODEL (x) != 0)
+  if (SYMBOL_REF_P (x) && SYMBOL_REF_TLS_MODEL (x) != 0)
return true;
   /* Don't recurse into UNSPEC_TLS looking for TLS symbols; these are
 TLS offsets, not real symbol references.  */
@@ -8759,7 +8759,7 @@ aarch64_cannot_force_const_mem (machine_mode mode 
ATTRIBUTE_UNUSED, rtx x)
 
   poly_int64 offset;
   rtx base = strip_offset_and_salt (x, );
-  if (GET_CODE (base) == SYMBOL_REF || GET_CODE (base) == LABEL_REF)
+  if (SYMBOL_REF_P (base) || LABEL_REF_P (base))
 {
   /* We checked for POLY_INT_CST offsets above.  */
   if (aarch64_classify_symbol (base, offset.to_constant ())
@@ -8845,7 +8845,7 @@ static bool
 aarch64_base_register_rtx_p (rtx x, bool strict_p)
 {
   if (!strict_p
-  && GET_CODE (x) == SUBREG
+  && SUBREG_P (x)
   && contains_reg_of_mode[GENERAL_REGS][GET_MODE (SUBREG_REG (x))])
 x = SUBREG_REG (x);
 
@@ -8864,7 +8864,7 @@ aarch64_classify_index (struct aarch64_address_info 
*info, rtx x,
   int shift;
 
  

Re: Detect EAF flags in ipa-modref

2020-11-10 Thread Jan Hubicka
> > +  tree callee = gimple_call_fndecl (stmt);
> > +  if (callee)
> > +{
> > +  cgraph_node *node = cgraph_node::get (callee);
> > +  modref_summary *summary = node ? get_modref_function_summary (node)
> > +   : NULL;
> > +
> > +  if (summary && summary->arg_flags.length () > arg)
> 
> So could we make modref "transform" push this as fnspec attribute or
> would that not really be an optimization?

It was my original plan to synthetize fnspecs, but I think it is not
very good idea: we have the summary readily available and we can
represent information that fnspecs can't
(do not have artificial limits on number of parameters or counts)

I would preffer fnspecs to be used only for in-compiler declarations.
> > +
> > +/* Analyze EAF flags for SSA name NAME.
> > +   KNOWN_FLAGS is a cache for flags we already determined.
> > +   DEPTH is a recursion depth used to make debug output prettier.  */
> > +
> > +static int
> > +analyze_ssa_name_flags (tree name, vec *known_flags, int depth)
> 
> C++ has references which makes the access to known_flags nicer ;)

Yay, will chang that :)
> 
> > +{
> > +  imm_use_iterator ui;
> > +  gimple *use_stmt;
> > +  int flags = EAF_DIRECT | EAF_NOCLOBBER | EAF_NOESCAPE | EAF_UNUSED;
> > +
> > +  /* See if value is already computed.  */
> > +  if ((*known_flags)[SSA_NAME_VERSION (name)])
> > +{
> > +  /* Punt on cycles for now, so we do not need dataflow.  */
> > +  if ((*known_flags)[SSA_NAME_VERSION (name)] == 1)
> > +   {
> > + if (dump_file)
> > +   fprintf (dump_file,
> > +"%*sGiving up on a cycle in SSA graph\n", depth * 4, "");
> > + return 0;
> > +   }
> > +  return (*known_flags)[SSA_NAME_VERSION (name)] - 2;
> > +}
> > +  /* Recursion guard.  */
> > +  (*known_flags)[SSA_NAME_VERSION (name)] = 1;
> 
> This also guards against multiple evaluations of the same stmts
> but only in some cases?  Consider
> 
>   _1 = ..;
>   _2 = _1 + _3;
>   _4 = _1 + _5;
>   _6 = _2 + _4;
> 
> where we visit _2 = and _4 = from _1 but from both are going
> to visit _6.

Here we first push _6, then we go for _2 then for _1 evaluate _1,
evalueate _2, go for _4 and evaluate _4, and evaluate _6.
It is DFS and you need backward edge in DFS (comming from a PHI).

Cycles seems to somewhat matter for GCC: we do have a lot of functions
that walk linked lists that we could track otherwise.
> 
> Maybe I'm blind but you're not limiting depth?  Guess that asks
> for problems, esp. as you are recursing rather than using a
> worklist or so?
> 
> I see you try to "optimize" the walk by only visiting def->use
> links from parameters but then a RPO walk over all stmts would
> be simpler iteration-wise ...
We usually evaluate just small part of bigger functions (since we lose
track quite easily after hitting first memory store).  My plan was to
change this to actual dataflow once we have it well defined 
(this means after discussing EAF flags with you and adding the logic to
track callsites for true IPA pass that midly complicated things - for
every ssa name I track callsite/arg pair where it is passed to
either directly or indirectly.  Then this is translaed into call summary
and used by IPA pass to compute final flags)

I guess I can add --param ipa-modref-walk-depth for now and handle
dataflow incremntally?
In particular I am not sure if I should just write iterated RPO myself
or use tree-ssa-propagate.h (the second may be overkill).
> 
> > +  if (dump_file)
> > +{
> > +  fprintf (dump_file,
> > +  "%*sAnalyzing flags of ssa name: ", depth * 4, "");
> > +  print_generic_expr (dump_file, name);
> > +  fprintf (dump_file, "\n");
> > +}
> > +
> > +  FOR_EACH_IMM_USE_STMT (use_stmt, ui, name)
> > +{
> > +  if (flags == 0)
> > +   {
> > + BREAK_FROM_IMM_USE_STMT (ui);
> > +   }
> > +  if (is_gimple_debug (use_stmt))
> > +   continue;
> > +  if (dump_file)
> > +   {
> > + fprintf (dump_file, "%*s  Analyzing stmt:", depth * 4, "");
> > + print_gimple_stmt (dump_file, use_stmt, 0);
> > +   }
> > +
> > +  /* Gimple return may load the return value.  */
> > +  if (gimple_code (use_stmt) == GIMPLE_RETURN)
> 
>  if (greturn *ret = dyn_cast  (use_stmt))
> 
> makes the as_a below not needed, similar for the other cases (it
> also makes accesses cheaper, avoiding some checking code).

Looks cleaner indeed.
> 
> > +   {
> > + if (memory_access_to (gimple_return_retval
> > +  (as_a  (use_stmt)), name))
> > +   flags &= ~EAF_UNUSED;
> > +   }
> > +  /* Account for LHS store, arg loads and flags from callee function.  
> > */
> > +  else if (is_gimple_call (use_stmt))
> > +   {
> > + tree callee = gimple_call_fndecl (use_stmt);
> > +
> > + /* Recursion would require bit of propagation; give up for now.  */
> > + if (callee && recursive_call_p (current_function_decl, callee))
> > +   flags = 0;
> > + else
> > + 

[PATCH] tree-optimization/97769 - fix assert in peeling for alignment

2020-11-10 Thread Richard Biener
The following removes an assert that can not easily be adjusted to
cover the additional cases we now handle after the removal of
the same-align DRs vector.

Tested on x86_64-unknown-linux and aarch64, pushed.

2020-11-10  Richard Biener  

PR tree-optimization/97769
* tree-vect-data-refs.c (vect_update_misalignment_for_peel):
Remove assert.

* gcc.dg/vect/pr97769.c: New testcase.
---
 gcc/testsuite/gcc.dg/vect/pr97769.c | 32 +
 gcc/tree-vect-data-refs.c   |  7 +--
 2 files changed, 33 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/pr97769.c

diff --git a/gcc/testsuite/gcc.dg/vect/pr97769.c 
b/gcc/testsuite/gcc.dg/vect/pr97769.c
new file mode 100644
index 000..127f91aa8fd
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr97769.c
@@ -0,0 +1,32 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O3" } */
+
+typedef struct {
+int alg;
+int h1[8];
+unsigned d1[1];
+} tmp;
+typedef struct {
+tmp itmp;
+tmp otmp;
+} h1;
+h1 c;
+
+static void
+fn1(char *p1, int p2)
+{
+  int i = 0;
+  for (; i < 4; i++)
+*p1++ = p2;
+}
+
+static void
+fn2(tmp *p1)
+{
+  char *d = (char *)p1->d1;
+  int *b = p1->h1;
+  for (int a; a; a++, d += 4)
+fn1(d, *b++);
+}
+
+void fn3() { fn2(&()->otmp); }
diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index 8afd3044461..0efab495407 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -1186,14 +1186,9 @@ static void
 vect_update_misalignment_for_peel (dr_vec_info *dr_info,
   dr_vec_info *dr_peel_info, int npeel)
 {
-  /* It can be assumed that if dr_info has the same alignment as dr_peel,
- it is aligned in the vector loop.  */
+  /* If dr_info is aligned of dr_peel_info is, then mark it so.  */
   if (vect_dr_aligned_if_peeled_dr_is (dr_info, dr_peel_info))
 {
-  gcc_assert (!known_alignment_for_access_p (dr_info)
- || !known_alignment_for_access_p (dr_peel_info)
- || (DR_MISALIGNMENT (dr_info)
- == DR_MISALIGNMENT (dr_peel_info)));
   SET_DR_MISALIGNMENT (dr_info, 0);
   return;
 }
-- 
2.26.2


Re: Enable MOVDIRI, MOVDIR64B, CLDEMOTE and WAITPKG for march=tremont

2020-11-10 Thread Jakub Jelinek via Gcc-patches
On Tue, Nov 10, 2020 at 08:03:29PM +0800, Hongtao Liu via Gcc-patches wrote:
> --- a/gcc/config/i386/i386.h
> +++ b/gcc/config/i386/i386.h
> @@ -2437,6 +2437,7 @@ const wide_int_bitmask PTA_AVX512BF16 (0, 
> HOST_WIDE_INT_1U << 11);
>  const wide_int_bitmask PTA_WAITPKG (0, HOST_WIDE_INT_1U << 12);
>  const wide_int_bitmask PTA_MOVDIRI(0, HOST_WIDE_INT_1U << 13);
>  const wide_int_bitmask PTA_MOVDIR64B(0, HOST_WIDE_INT_1U << 14);
> +const wide_int_bitmask PTA_CLDEMOTE(0, HOST_WIDE_INT_1U << 16);

Formatting, there should be a space between PTA_* and (.
Please fix this up on both branches not just for the entries you've added
but also for the older ones, and please fix it on the trunk too:
const wide_int_bitmask PTA_MOVDIRI(0, HOST_WIDE_INT_1U << 13);
const wide_int_bitmask PTA_MOVDIR64B(0, HOST_WIDE_INT_1U << 14);
...
const wide_int_bitmask PTA_AMX_TILE(0, HOST_WIDE_INT_1U << 19);
const wide_int_bitmask PTA_AMX_INT8(0, HOST_WIDE_INT_1U << 20);
const wide_int_bitmask PTA_AMX_BF16(0, HOST_WIDE_INT_1U << 21);
...
const wide_int_bitmask PTA_HRESET(0, HOST_WIDE_INT_1U << 23);
on the trunk suffer from this.

Ok for branches and the above change to trunk is preapproved too,
but please bootstrap/regtest all backports next time.

Jakub



Re: [PATCH] Cleanup irange::set.

2020-11-10 Thread Andreas Schwab
This breaks aarch64.

spawn -ignore SIGHUP /opt/gcc/test/Build/gcc/xgcc -B/opt/gcc/test/Build/gcc/ 
-mabi=lp64 -fdiagnostics-plain-output -march=armv8.2-a+sve -O3 --save-temps 
-ffat-lto-objects -fno-ident -c -o abs_1.o 
/opt/gcc/test/gcc/testsuite/gcc.target/aarch64/sve/abs_1.c
during GIMPLE pass: dom
/opt/gcc/test/gcc/testsuite/gcc.target/aarch64/sve/abs_1.c: In function 
'vneg_int8_t':
/opt/gcc/test/gcc/testsuite/gcc.target/aarch64/sve/abs_1.c:13:6: internal 
compiler error: tree check: expected integer_cst, have poly_int_cst in to_wide, 
at tree.h:5950
0x6743ab tree_check_failed(tree_node const*, char const*, int, char const*, ...)
../../gcc/tree.c:9752
0x674f3b tree_int_cst_elt_check(tree_node const*, int, char const*, int, char 
const*)
../../gcc/tree.h:3502
0x1033d67 tree_int_cst_elt_check(tree_node const*, int, char const*, int, char 
const*)
../../gcc/tree.h:3437
0x1033d67 wi::to_wide(tree_node const*)
../../gcc/tree.h:5951
0x1033d67 irange::legacy_lower_bound(unsigned int) const
../../gcc/value-range.cc:420
0x103ad23 irange::lower_bound(unsigned int) const
../../gcc/value-range.h:497
0x183154f range_operator::fold_range(irange&, tree_node*, irange const&, irange 
const&) const
../../gcc/range-op.cc:159
0xfeae4b range_fold_binary_expr(int_range<1u>*, tree_code, tree_node*, 
int_range<1u> const*, int_range<1u> const*)
../../gcc/tree-vrp.c:1243
0x107dcaf vr_values::extract_range_from_binary_expr(value_range_equiv*, 
tree_code, tree_node*, tree_node*, tree_node*)
../../gcc/vr-values.c:853
0x1086917 vr_values::extract_range_from_assignment(value_range_equiv*, gassign*)
../../gcc/vr-values.c:1561
0x1747eb3 evrp_range_analyzer::record_ranges_from_stmt(gimple*, bool)
../../gcc/gimple-ssa-evrp-analyze.c:304
0xe3b047 dom_opt_dom_walker::before_dom_children(basic_block_def*)
../../gcc/tree-ssa-dom.c:1458
0x170cf7f dom_walker::walk(basic_block_def*)
../../gcc/domwalk.c:309
0xe38b03 execute
../../gcc/tree-ssa-dom.c:724

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


Re: [Patch] Fortran: OpenMP 5.0 (in_,task_)reduction clause extensions

2020-11-10 Thread Jakub Jelinek via Gcc-patches
On Tue, Nov 10, 2020 at 12:40:08AM +0100, Tobias Burnus wrote:
> --- a/gcc/fortran/openmp.c
> +++ b/gcc/fortran/openmp.c
> @@ -762,6 +762,10 @@ enum omp_mask1
>OMP_CLAUSE_SHARED,
>OMP_CLAUSE_COPYIN,
>OMP_CLAUSE_REDUCTION,
> +  OMP_CLAUSE_REDUCTION_DEFAULT,
> +  OMP_CLAUSE_REDUCTION_MODIFIER,

I don't understand the need for OMP_CLAUSE_REDUCTION_DEFAULT,
default modifier is allowed in all OpenMP reduction clauses and
your gfc_match_omp_clause_reduction function has the openacc argument.
So why can't you keep using OMP_CLAUSE_REDUCTION in place where
you use OMP_CLAUSE_REDUCTION_DEFAULT now and just decide based on !openacc
whether to parse any reduction modifiers?

One could probably get away even without OMP_CLAUSE_REDUCTION_MODIFIER,
just allow all the modifiers first and during resolving complain if
[OMP_LIST_REDUCTION_INSCAN] and/or [OMP_LIST_REDUCTION_TASK] is non-NULL
on constructs where it shouldn't.

When splitting clauses, OMP_LIST_REDUCTION_TASK applies only on the innermost
construct that accepts it (others should treat it as OMP_LIST_REDUCTION),
and inscan is severely limited to which constructs it can appear on.
For task modifier, do you diagnose somewhere what c_split_omp_clauses
diagnoses?
E.g. if task modifier appears with innermost combined construct simd/loop,
or if task modifier appears and not combined with parallel/do/sections?
Inscan is allowed only on do, simd, do simd, parallel do and parallel do sim.d

Also, I think there should be an error or sorry on the inscan modifier
somewhere until the rest of the scan support is implemented (in particular
the changes in the parsing of do/simd body with inscan reductions
including the parsing of the inclusive/exclusive directives in there).

>  #define OMP_DECLARE_SIMD_CLAUSES \
>(omp_mask (OMP_CLAUSE_SIMDLEN) | OMP_CLAUSE_LINEAR \
> | OMP_CLAUSE_UNIFORM  | OMP_CLAUSE_ALIGNED | OMP_CLAUSE_INBRANCH  
> \

I wonder why we have the occassional tabs in the middle of the OMP_*_CLAUSES
defines.
> +   | OMP_CLAUSE_SAFELEN | OMP_CLAUSE_LINEAR | OMP_CLAUSE_ALIGNED \
> +   | OMP_CLAUSE_SIMDLEN  | OMP_CLAUSE_IF | OMP_CLAUSE_ORDER  
> \

Here too.

> +  for (int i = OMP_LIST_REDUCTION; i <= OMP_LIST_REDUCTION_TASK; i++)
> + {
> +   if (mask & GFC_OMP_MASK_TEAMS)
> + clausesa[GFC_OMP_SPLIT_TEAMS].lists[i]
> +   = code->ext.omp_clauses->lists[i];
> +   if (mask & GFC_OMP_MASK_PARALLEL)
> + clausesa[GFC_OMP_SPLIT_PARALLEL].lists[i]
> +   = code->ext.omp_clauses->lists[i];
> +   else if (mask & GFC_OMP_MASK_DO)
> + clausesa[GFC_OMP_SPLIT_DO].lists[i]
> +   = code->ext.omp_clauses->lists[i];
> +   if (mask & GFC_OMP_MASK_SIMD)
> + clausesa[GFC_OMP_SPLIT_SIMD].lists[i]
> +   = code->ext.omp_clauses->lists[i];
> + }

Seems in the middle-end we just ignore OMP_CLAUSE_REDUCTION_TASK on anything
but parallel/{for,do}/sections, so guess this is ok.

Jakub



Re: [PATCH] Cleanup irange::set.

2020-11-10 Thread Christophe Lyon via Gcc-patches
Hi,

On Mon, 9 Nov 2020 at 15:43, Aldy Hernandez via Gcc-patches
 wrote:
>
> [This is actually part of a larger patch that actually changes
> behavior, but I thought I'd commit the non-invasive cleanups first
> which will simplify the upcoming work.]
>
> irange::set was doing more work than it should for legacy ranges.
> I cleaned up various unnecessary calls to swap_out_of_order_endpoints,
> as well as some duplicate code that could be done with normalize_min_max.
>
> I also removed an obsolete comment wrt sticky infinite overflows.
> Not only did the -INF/+INF(OVF) code get removed in 2017,
> but normalize_min_max() uses wide ints, which ignored overflows
> altogether.
>
> Pushed.
>
> gcc/ChangeLog:
>
> * value-range.cc (irange::swap_out_of_order_endpoints): Rewrite
> into static function.
> (irange::set): Cleanup redundant manipulations.
> * value-range.h (irange::normalize_min_max): Modify object
> in-place instead of modifying arguments.


This is causing regressions on aarch64, such as:
FAIL: gcc.target/aarch64/pr88838.c (internal compiler error)
FAIL: gcc.target/aarch64/pr88838.c (test for excess errors)
Excess errors:
during GIMPLE pass: dom
/gcc/testsuite/gcc.target/aarch64/pr88838.c:5:1: internal compiler
error: tree check: expected integer_cst, have poly_int_cst in to_wide,
at tree.h:5950
0x66741d tree_check_failed(tree_node const*, char const*, int, char const*, ...)
/gcc/tree.c:9754
0x6404e8 tree_check
/gcc/tree.h:3570
0x6404e8 to_wide
/gcc/tree.h:5950
0x7f0816 wi::to_wide(tree_node const*)
/gcc/tree.h:3505
0x1118bdc irange::legacy_upper_bound(unsigned int) const
/gcc/value-range.cc:447
0x1122cd4 irange::upper_bound(unsigned int) const
/gcc/value-range.h:510
0x19493be range_operator::fold_range(irange&, tree_node*, irange
const&, irange const&) const
/gcc/range-op.cc:159
0x19494f4 range_operator::fold_range(irange&, tree_node*, irange
const&, irange const&) const
/gcc/range-op.cc:74
0x10cfee6 range_fold_binary_expr(int_range<1u>*, tree_code,
tree_node*, int_range<1u> const*, int_range<1u> const*)
/gcc/tree-vrp.c:1243
0x116fda4 vr_values::extract_range_from_binary_expr(value_range_equiv*,
tree_code, tree_node*, tree_node*, tree_node*)
/gcc/vr-values.c:853
0x1178b3c vr_values::extract_range_from_assignment(value_range_equiv*, gassign*)
/gcc/vr-values.c:1564
0x185bcbe evrp_range_analyzer::record_ranges_from_stmt(gimple*, bool)
/gcc/gimple-ssa-evrp-analyze.c:304
0xee7f9b dom_opt_dom_walker::before_dom_children(basic_block_def*)
/gcc/tree-ssa-dom.c:1458
0x1817487 dom_walker::walk(basic_block_def*)
/gcc/domwalk.c:309
0xee4f8b execute
/gcc/tree-ssa-dom.c:724


(actually I can see 3245 ICEs on aarch64)

Can you fix it?

Thanks

> ---
>  gcc/value-range.cc | 70 --
>  gcc/value-range.h  | 28 +--
>  2 files changed, 37 insertions(+), 61 deletions(-)
>
> diff --git a/gcc/value-range.cc b/gcc/value-range.cc
> index 5827e812216..2124e229e0c 100644
> --- a/gcc/value-range.cc
> +++ b/gcc/value-range.cc
> @@ -131,13 +131,14 @@ irange::copy_to_legacy (const irange )
>  set (src.tree_lower_bound (), src.tree_upper_bound ());
>  }
>
> -// Swap min/max if they are out of order.  Return TRUE if further
> -// processing of the range is necessary, FALSE otherwise.
> +// Swap MIN/MAX if they are out of order and adjust KIND appropriately.
>
> -bool
> -irange::swap_out_of_order_endpoints (tree , tree ,
> - value_range_kind )
> +static void
> +swap_out_of_order_endpoints (tree , tree , value_range_kind )
>  {
> +  gcc_checking_assert (kind != VR_UNDEFINED);
> +  if (kind == VR_VARYING)
> +return;
>/* Wrong order for min and max, to swap them and the VR type we need
>   to adjust them.  */
>if (tree_int_cst_lt (max, min))
> @@ -149,8 +150,8 @@ irange::swap_out_of_order_endpoints (tree , tree ,
>  for VR_ANTI_RANGE empty range, so drop to varying as well.  */
>if (TYPE_PRECISION (TREE_TYPE (min)) == 1)
> {
> - set_varying (TREE_TYPE (min));
> - return false;
> + kind = VR_VARYING;
> + return;
> }
>
>one = build_int_cst (TREE_TYPE (min), 1);
> @@ -163,12 +164,11 @@ irange::swap_out_of_order_endpoints (tree , tree 
> ,
>  to varying in this case.  */
>if (tree_int_cst_lt (max, min))
> {
> - set_varying (TREE_TYPE (min));
> - return false;
> + kind = VR_VARYING;
> + return;
> }
>kind = kind == VR_RANGE ? VR_ANTI_RANGE : VR_RANGE;
>  }
> -  return true;
>  }
>
>  void
> @@ -253,13 +253,6 @@ irange::set (tree min, tree max, value_range_kind kind)
>&& (POLY_INT_CST_P (min) || POLY_INT_CST_P (max)))
>  kind = VR_VARYING;
>
> -  if (kind == VR_VARYING)
> -{
> -  set_varying (TREE_TYPE 

[PATCH] tree-optimization/97780 - fix ICE in fini_pre

2020-11-10 Thread Richard Biener
This deals with blocks elimination added.

Pushed as obvious.

2020-11-10  Richard Biener  

PR tree-optimization/97780
* tree-ssa-pre.c (fini_pre): Deal with added basic blocks
when freeing PHI_TRANS_TABLE.
---
 gcc/tree-ssa-pre.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/tree-ssa-pre.c b/gcc/tree-ssa-pre.c
index 160f3b4593a..90877e3c68e 100644
--- a/gcc/tree-ssa-pre.c
+++ b/gcc/tree-ssa-pre.c
@@ -4196,7 +4196,7 @@ fini_pre ()
 
   basic_block bb;
   FOR_ALL_BB_FN (bb, cfun)
-if (PHI_TRANS_TABLE (bb))
+if (bb->aux && PHI_TRANS_TABLE (bb))
   delete PHI_TRANS_TABLE (bb);
   free_aux_for_blocks ();
 }
-- 
2.26.2


Re: Enable MOVDIRI, MOVDIR64B, CLDEMOTE and WAITPKG for march=tremont

2020-11-10 Thread Hongtao Liu via Gcc-patches
On Tue, Nov 10, 2020 at 4:17 PM Hongtao Liu  wrote:
>
> On Tue, Nov 10, 2020 at 3:22 AM Jason Merrill via Gcc-patches
>  wrote:
> >
> > This patch was also applied to the GCC 9 and 10 branches and breaks those
> > builds, because PTA_CLDEMOTE is not defined.
> >
> Mine, let me fix it, sorry for that.
>

Since CLDEMOTE already existed in gcc9 and later, define PTA_CLDEMOTE
instead of removing it.
Similar for MOVDIRI, MOVDIR64B.

GCC10-Fix.patch

gcc/ChangeLog
* config/i386/i386-options.c (ix86_option_override_internal):
Handle PTA_CLDEMOTE.
* config/i386/i386.h (PTA_CLDEMOTE): Define.

GCC9-Fix.patch

gcc/ChangeLog
* config/i386/i386.c (ix86_option_override_internal):
Handle PTA_CLDEMOTE, PTA_MOVDIRI, PTA_MOVDIR64B.
* config/i386/i386.h (PTA_CLDEMOTE, PTA_MOVDIRI,
PTA_MOVDIR64B.): Define.

Bootstrap and regression test for i386 backend is ok on releases/gcc-9
and releases/gcc-10 branch.

Sorry for such unnecessary inconvenience.

> --
> BR,
> Hongtao



--
BR,
Hongtao
From 7744c0f741e26123e63a90494efd4a6c989df7c6 Mon Sep 17 00:00:00 2001
From: liuhongt 
Date: Tue, 10 Nov 2020 16:42:06 +0800
Subject: [PATCH] Fix missing defination of PTA_CLDEMOTE.

gcc/ChangeLog
	* config/i386/i386-options.c (ix86_option_override_internal):
	Handle PTA_CLDEMOTE.
	* config/i386/i386.h (PTA_CLDEMOTE): Define.
---
 gcc/config/i386/i386-options.c | 3 +++
 gcc/config/i386/i386.h | 1 +
 2 files changed, 4 insertions(+)

diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c
index 5c21fce06a4..14a57a17156 100644
--- a/gcc/config/i386/i386-options.c
+++ b/gcc/config/i386/i386-options.c
@@ -2226,6 +2226,9 @@ ix86_option_override_internal (bool main_args_p,
 	if (((processor_alias_table[i].flags & PTA_PTWRITE) != 0)
 	&& !(opts->x_ix86_isa_flags2_explicit & OPTION_MASK_ISA2_PTWRITE))
 	  opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA2_PTWRITE;
+	if (((processor_alias_table[i].flags & PTA_CLDEMOTE) != 0)
+	&& !(opts->x_ix86_isa_flags2_explicit & OPTION_MASK_ISA2_CLDEMOTE))
+	  opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA2_CLDEMOTE;
 
 	if ((processor_alias_table[i].flags
 	   & (PTA_PREFETCH_SSE | PTA_SSE)) != 0)
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index d75a86bb8da..08f265fc8e0 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -2437,6 +2437,7 @@ const wide_int_bitmask PTA_AVX512BF16 (0, HOST_WIDE_INT_1U << 11);
 const wide_int_bitmask PTA_WAITPKG (0, HOST_WIDE_INT_1U << 12);
 const wide_int_bitmask PTA_MOVDIRI(0, HOST_WIDE_INT_1U << 13);
 const wide_int_bitmask PTA_MOVDIR64B(0, HOST_WIDE_INT_1U << 14);
+const wide_int_bitmask PTA_CLDEMOTE(0, HOST_WIDE_INT_1U << 16);
 
 const wide_int_bitmask PTA_CORE2 = PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2
   | PTA_SSE3 | PTA_SSSE3 | PTA_CX16 | PTA_FXSR;
-- 
2.18.1

From a1b774edeedadf69c56529fa52a9e2ef92cb7957 Mon Sep 17 00:00:00 2001
From: liuhongt 
Date: Tue, 10 Nov 2020 16:48:08 +0800
Subject: [PATCH] Fix missing defination of PTA_CLDEMOTE, PTA_MOVDIRI,
 PTA_MOVDIR64B.

gcc/ChangeLog
	* config/i386/i386.c (ix86_option_override_internal):
	Handle PTA_CLDEMOTE, PTA_MOVDIRI, PTA_MOVDIR64B.
	* config/i386/i386.h (PTA_CLDEMOTE, PTA_MOVDIRI,
	PTA_MOVDIR64B.): Define.
---
 gcc/config/i386/i386.c | 9 +
 gcc/config/i386/i386.h | 3 +++
 2 files changed, 12 insertions(+)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index f968d033972..bd61be61fdf 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -4058,6 +4058,15 @@ ix86_option_override_internal (bool main_args_p,
 	if (((processor_alias_table[i].flags & PTA_PTWRITE) != 0)
 	&& !(opts->x_ix86_isa_flags2_explicit & OPTION_MASK_ISA_PTWRITE))
 	  opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA_PTWRITE;
+	if (((processor_alias_table[i].flags & PTA_MOVDIRI) != 0)
+	&& !(opts->x_ix86_isa_flags_explicit & OPTION_MASK_ISA_MOVDIRI))
+	  opts->x_ix86_isa_flags |= OPTION_MASK_ISA_MOVDIRI;
+	if (((processor_alias_table[i].flags & PTA_MOVDIR64B) != 0)
+	&& !(opts->x_ix86_isa_flags2_explicit & OPTION_MASK_ISA_MOVDIR64B))
+	  opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA_MOVDIR64B;
+	if (((processor_alias_table[i].flags & PTA_CLDEMOTE) != 0)
+	&& !(opts->x_ix86_isa_flags2_explicit & OPTION_MASK_ISA_CLDEMOTE))
+	  opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA_CLDEMOTE;
 
 	if ((processor_alias_table[i].flags
 	   & (PTA_PREFETCH_SSE | PTA_SSE)) != 0)
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 17d6218e991..1cc9e3d9463 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -2379,6 +2379,9 @@ const wide_int_bitmask PTA_PCONFIG (0, HOST_WIDE_INT_1U << 7);
 const wide_int_bitmask PTA_WBNOINVD (0, HOST_WIDE_INT_1U << 8);
 const wide_int_bitmask PTA_WAITPKG (0, HOST_WIDE_INT_1U << 9);
 const wide_int_bitmask PTA_PTWRITE (0, HOST_WIDE_INT_1U << 10);
+const wide_int_bitmask PTA_MOVDIRI(0, HOST_WIDE_INT_1U << 13);
+const wide_int_bitmask PTA_MOVDIR64B(0, 

Re: [PATCH] C-family : Add attribute 'unavailable'.

2020-11-10 Thread Richard Biener
On Tue, 10 Nov 2020, Iain Sandoe wrote:

> Hi Richard,
> 
> Richard Biener  wrote:
> 
> >On Mon, 9 Nov 2020, Iain Sandoe wrote:
> >
> >>Hi,
> >>
> >>I?ve been carrying this patch around on my Darwin branches for a very long
> >>time?
> >>
> >>tested across the Darwin patch and on x86_64-linux-gnu,
> >>OK for master?
> >>thanks
> >>Iain
> >>
> >>= commit message
> >>
> >>If an interface is marked 'deprecated' then, presumably, at some point it
> >>will be withdrawn and no longer available. The 'unavailable' attribute
> >>makes it possible to mark up interfaces to indicate this status. It is used
> >>quite extensively in some codebases where a single set of headers can be
> >>used
> >>to permit code generation for multiple system versions.
> >>
> >>From a configuration perspective, it also allows a compile test to determine
> >>that an interface is missing - rather than requiring a link test.
> >>
> >>The implementation follows the pattern of attribute deprecated, but produces
> >>an error (where deprecation produces a warning).
> >>
> >>This attribute has been implemented in clang for some years.
> >>
> >>gcc/c-family/ChangeLog:
> >>
> >>* c-attribs.c (handle_unavailable_attribute): New.
> >>
> >>gcc/c/ChangeLog:
> >>
> >>* c-decl.c (enum deprecated_states): Add unavailable state.
> >>(merge_decls): Copy unavailability.
> >>(quals_from_declspecs): Handle unavailable case.
> >>(start_decl): Amend the logic handling suppression of nested
> >>deprecation states to include unavailability.
> >>(smallest_type_quals_location): Amend comment.
> >>(grokdeclarator): Handle the unavailable deprecation state.
> >>(declspecs_add_type): Set TREE_UNAVAILABLE from the decl specs.
> >>* c-tree.h (struct c_declspecs): Add unavailable_p.
> >>* c-typeck.c (build_component_ref): Handle unavailability.
> >>(build_external_ref): Likewise.
> >>
> >>gcc/cp/ChangeLog:
> >>
> >>* call.c (build_over_call): Handle unavailable state in addition to
> >>deprecation.
> >>* class.c (type_build_ctor_call): Likewise.
> >>(type_build_dtor_call): Likewise.
> >>* cp-tree.h: Rename cp_warn_deprecated_use to
> >>cp_handle_deprecated_or_unavailable.
> >>* decl.c (duplicate_decls): Merge unavailability.
> >>(grokdeclarator): Handle unavailability in addition to deprecation.
> >>(type_is_unavailable): New.
> >>(grokparms): Handle unavailability in addition to deprecation.
> >>* decl.h (enum deprecated_states): Add
> >>UNAVAILABLE_DEPRECATED_SUPPRESS.
> >>* decl2.c (cplus_decl_attributes): Propagate unavailability to
> >>templates.
> >>(cp_warn_deprecated_use): Rename to ...
> >>(cp_handle_deprecated_or_unavailable): ... this and amend to handle
> >>the unavailable case. It remains a warning in the case of deprecation
> >>but becomes an error in the case of unavailability.
> >>(cp_warn_deprecated_use_scopes): Handle unavailability.
> >>(mark_used): Likewise.
> >>* parser.c (cp_parser_template_name): Likewise.
> >>(cp_parser_template_argument): Likewise.
> >>(cp_parser_parameter_declaration_list): Likewise.
> >>* typeck.c (build_class_member_access_expr): Likewise.
> >>(finish_class_member_access_expr): Likewise.
> >>* typeck2.c (build_functional_cast_1): Likewise.
> >>
> >>gcc/ChangeLog:
> >>
> >>* lto-streamer-out.c (hash_tree): Stream TREE_UNAVAILABLE.
> >>* print-tree.c (print_node): Handle unavailable attribute.
> >>* tree-core.h (struct tree_base): Add a bit to carry unavailability.
> >>* tree.c (error_unavailable_use): New.
> >>* tree.h (TREE_UNAVAILABLE): New.
> >>(error_unavailable_use): New.
> >
> >Why'd you need to stream this in LTO, more specifically only handle
> >it in hashing?  It should be all frontend operation.
> 
> I followed the existing implementation for TREE_DEPRECATED ().
> 
> Presumably, that entry should also be removed - I will remove and re-test (it
> wasn?t
> obvious to me why the entry was present, the deprecation/unavailable stuff
> should
> be done by end of gimplification).

deprecated_flag is overloaded, we likely use this accessor because
it isn't guarded by any tree code check.  We definitely need the bit
for SSA_NAMEs.

> >You're targeting only DECLs can you use a bit from decl_common where decl
> >specific bits exist instead?
> 
> .. except that we are also targeting types.

Ah, I see, using a bit in tree-base is fine then.  Please document
its use in near to the other flags like

   deprecated_flag:

   TREE_DEPRECATED in
   all decls
   all types

so overloads can be added for other tree kinds.

Richard.

> We could use a bit in decl_common for decls and in tree_type_common for types,
> I suppose, but that would then mean that every test for unavailable would grow
> an
> additional check to discriminate.
> 
> (if we do that, then presumably, as a separate patch, we might as well move
> TREE_DEPRECATED in the same way - since otherwise the impl. is different for
> no obvious reason).
>
> thanks
> Iain
> 
> 
> >
> >Thanks,
> >Richard.
> >
> >>gcc/objc/ChangeLog:
> >>
> >>* objc-act.c 

  1   2   >