[committed] [OG10] Permit calls to builtins and intrinsics in kernels loops

2020-08-22 Thread Sandra Loosemore
This is another small tweak to the kernels loop annotator.  Like the 
subject line says, it allows the annotator to consider loops that have 
calls to builtins (C/C++) or intrinsics (Fortran).  I've committed this 
to the OG10 branch since there is no one to review these patches right 
now, and it will be mashed into the rest of the kernels loop annotation 
code for review before submission for mainline.


-Sandra
commit 1c9af55d7ff76e2e6b633af33e6e6991a0ba4c48
Author: Sandra Loosemore 
Date:   Sat Aug 22 18:23:26 2020 -0700

Permit calls to builtins and intrinsics in kernels loops.

This tweak to the OpenACC kernels loop annotation relaxes the
restrictions on function calls in the loop body.  Normally calls to
functions not explicitly marked with a parallelism attribute are not
permitted, but C/C++ builtins and Fortran intrinsics have known
semantics so we can generally permit those without restriction.  If
any turn out to be problematical, we can add on here to recognize
them, or in the processing of the "auto" annotations.

2020-08-22  Sandra Loosemore  

	gcc/c-family/
	* c-omp.c (annotate_loops_in_kernels_regions): Test for
	calls to builtins.

	gcc/fortran/
	* openmp.c (check_expr_for_invalid_calls): Check for intrinsic
	functions.

	gcc/testsuite/
	* c-c++-common/goacc/kernels-loop-annotation-20.c: New.
	* gfortran.dg/goacc/kernels-loop-annotation-20.f95: New.

diff --git a/gcc/c-family/ChangeLog.omp b/gcc/c-family/ChangeLog.omp
index 51a3b9e..2c48153 100644
--- a/gcc/c-family/ChangeLog.omp
+++ b/gcc/c-family/ChangeLog.omp
@@ -1,3 +1,11 @@
+2020-08-22  Sandra Loosemore  
+
+	Allow annotation of loops containing calls to builtins in
+	kernels regions.
+
+	* c-omp.c (annotate_loops_in_kernels_regions): Test for
+	calls to builtins.
+
 2020-08-19  Sandra Loosemore  
 
 	Annotate inner loops in "acc kernels loop" directives (C/C++).
diff --git a/gcc/c-family/c-omp.c b/gcc/c-family/c-omp.c
index 24f2448..34523cee 100644
--- a/gcc/c-family/c-omp.c
+++ b/gcc/c-family/c-omp.c
@@ -2850,8 +2850,9 @@ annotate_loops_in_kernels_regions (tree *nodeptr, int *walk_subtrees,
   break;
 
 case CALL_EXPR:
-  /* Direct function calls to functions marked as OpenACC routines are
-	 allowed.  Reject indirect calls or calls to non-routines.  */
+  /* Direct function calls to builtins and functions marked as
+	 OpenACC routines are allowed.  Reject indirect calls or calls
+	 to non-routines.  */
   if (info->state >= as_in_kernels_loop)
 	{
 	  tree fn = CALL_EXPR_FN (node), fn_decl = NULL_TREE;
@@ -2865,8 +2866,9 @@ annotate_loops_in_kernels_regions (tree *nodeptr, int *walk_subtrees,
 	}
 	  if (fn_decl == NULL_TREE)
 	do_not_annotate_loop_nest (info, as_invalid_call, node);
-	  else if (!lookup_attribute ("oacc function",
-  DECL_ATTRIBUTES (fn_decl)))
+	  else if (!fndecl_built_in_p (fn_decl, BUILT_IN_NORMAL)
+		   && !lookup_attribute ("oacc function",
+	 DECL_ATTRIBUTES (fn_decl)))
 	do_not_annotate_loop_nest (info, as_invalid_call, node);
 	}
   break;
diff --git a/gcc/fortran/ChangeLog.omp b/gcc/fortran/ChangeLog.omp
index 13cc5a6..df5448d 100644
--- a/gcc/fortran/ChangeLog.omp
+++ b/gcc/fortran/ChangeLog.omp
@@ -1,3 +1,11 @@
+2020-08-22  Sandra Loosemore  
+
+	Permit calls to Fortran intrinsics when annotating loops in
+	kernels regions.
+
+	* openmp.c (check_expr_for_invalid_calls): Check for intrinsic
+	functions.
+
 2020-08-21  Tobias Burnus  
 
 	Backport from mainline
diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index d2e7b73..b297f4b 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -7094,9 +7094,12 @@ check_expr_for_invalid_calls (gfc_expr **exprp, int *walk_subtrees,
   switch (expr->expr_type)
 {
 case EXPR_FUNCTION:
-  if (expr->value.function.esym
-	  && (expr->value.function.esym->attr.oacc_routine_lop
-	  != OACC_ROUTINE_LOP_NONE))
+  /* Permit calls to Fortran intrinsic functions and to routines
+	 with an explicitly declared parallelism level.  */
+  if (expr->value.function.isym
+	  || (expr->value.function.esym
+	  && (expr->value.function.esym->attr.oacc_routine_lop
+		  != OACC_ROUTINE_LOP_NONE)))
 	return 0;
   /* Else fall through.  */
 
diff --git a/gcc/testsuite/ChangeLog.omp b/gcc/testsuite/ChangeLog.omp
index c5f2813..4adda2a 100644
--- a/gcc/testsuite/ChangeLog.omp
+++ b/gcc/testsuite/ChangeLog.omp
@@ -1,3 +1,11 @@
+2020-08-22   Sandra Loosemore  
+
+	Test cases for allowing calls to C/C++ builtins and Fortran
+	intrinsics in loops in kernels regions.
+
+	* c-c++-common/goacc/kernels-loop-annotation-20.c: New.
+	* gfortran.dg/goacc/kernels-loop-annotation-20.f95: New.
+
 2020-08-21  Tobias Burnus  
 
 	Backport from mainline
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-20.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-annotation-20.c
new file mode 100644
index 

Re: [PATCH 0/3] Power10 PCREL_OPT support

2020-08-22 Thread Bill Schmidt via Gcc-patches

On 8/20/20 6:33 PM, Segher Boessenkool wrote:

Hi!

On Tue, Aug 18, 2020 at 02:31:41AM -0400, Michael Meissner wrote:


In order to do this, the pass that converts the load address and load/store
must occur late in the compilation cycle.

That does not follow afaics.


Let me see if I can help explain this.

I think the issue is that this optimization creates a dependency that 
isn't directly represented in RTL.  We either have to figure out how to 
represent it, or we have to do this very late to avoid problems.


Suppose we are at a point where hard registers have been assigned, and 
the RTL looks like:


    addi  r5,r3,4
    sldi  r6,r5,2
    pld  r10,symbol@got@pcrel
    lwz  r5,0(r10)

Everything is fine for the optimization to take place, since the two 
instructions are adjacent and therefore we can't have any problems with 
r10 being redefined in between, or r5 being used. So we stick on the 
relocation telling the linker to change this if resolved during static 
link time to:


    addi  r5,r3,4
    sldi  r6,r5,2
    plwz  r5,symbol@pcrel
    nop

Now, suppose after we insert the relocation we get a reordering of 
instructions such as


    addi  r5,r3,4
    pld  r10,symbol@got@pcrel
    sldi  r6,r5,2
    lwz  r5,0(r10)

When the linker performs the replacement, we will now end up with

    addi  r5,r3,4
    plwz  r5,symbol@pcrel
    sldi  r6,r5,2
    nop

which has altered the semantics of the program.

What is necessary in order to allow this optimization to occur earlier 
is to make this hidden dependency explicit.  When the relocation is 
inserted, we have to change the "pld" instruction to have a specific 
clobber of (in this case) r5, which represents what will happen if the 
linker makes the substitution.


I agree that it's too fragile to force this to be the last pass, so I 
think if Mike can look into introducing a clobber of the hard register 
when performing the optimization, that would at least allow us to move 
this anywhere after reload.


I don't immediately see a solution that works prior to register 
allocation because we basically are representing two potential starting 
points of a live range, only one of which will survive in the final 
code.  That is too ugly a problem to hand to the register allocator.


Thanks,
Bill



RE: [PATCH] hppa: Improve expansion of ashldi3 when !TARGET_64BIT

2020-08-22 Thread Roger Sayle

Hi Dave,
I actually think using plus_xor_ior operator is useful.  It means that if 
combine,
inlining or some other RTL simplification generates these variants, these forms
will still be recognized by the backend.  It's more typing, but the compiler 
produces
better code.

Here's what I have so far, but please feel free to modify anything.  I'll leave 
the
rest to you.

With this patch:

unsigned long long rotl4(unsigned long long x)
{
  return (x<<4) | (x>>60);
}

unsigned long long rotr4(unsigned long long x)
{
  return (x<<60) | (x>>4);
}

which previously generated:

rotl4:  depd,z %r26,59,60,%r28
extrd,u %r26,3,4,%r26
bve (%r2)
or %r26,%r28,%r28

rotr4:  extrd,u %r26,59,60,%r28
depd,z %r26,3,4,%r26
bve (%r2)
or %r26,%r28,%r28

now produces:

rotl4:  bve (%r2)
shrpd %r26,%r26,60,%r28

rotr4:  bve (%r2)
shrpd %r26,%r26,4,%r28


I'm guessing this is very similar to what you were thinking (or what I 
described previously).

Many thanks again for trying out these patches/suggestions for me.

Best regards,
Roger
--

-Original Message-
From: John David Anglin  
Sent: 22 August 2020 23:09
To: Roger Sayle ; 'GCC Patches' 

Cc: 'Jeff Law' 
Subject: Re: [PATCH] hppa: Improve expansion of ashldi3 when !TARGET_64BIT

On 2020-08-22 12:01 p.m., Roger Sayle wrote:
> I suspect that the issue with the 64-bit patterns is that the second 
> variant of pa.md's define_insn "shrpdi4" is unlikely ever to match as 
> (minus:DI (const_int 64) x) is never "canonical" when x is itself a 
> CONST_INT.  Splitting this define_insn into two (or three see below) 
> separate forms; the first as it currently is and the second (as you 
> suggest) with
>   "TARGET_64BIT
> && INTVAL (operands[3]) + INTVAL (operands[4]) == 64"
> should do the trick.
I will go ahead and add the basic patterns.  It seems it would be best if I 
avoid using the "plus_xor_ior_operator".  It also seems the 32-bit patterns 
should avoid it.
>
> My first impression was that the DImode shrpd instructions would be 
> most useful for implementing TI mode shifts, but that TI mode isn't 
> supported by hppa64.  But then I noticed that the more immediate 
> benefit would be in supporting rotrdi3 and rotldi3 on TARGET_64BIT 
> that currently don't have expanders nor insns defined.  Here GCC 
> currently generates three instructions where a single shrpd would be 
> optimal.
It turns out we now need to support TI mode and __int128 for libgomp.  The 
hppa64-hpux target won't boot without it.  I had just added a change to support 
TI mode but it's untested.

Regards,
Dave

--
John David Anglin  dave.ang...@bell.net



patchh3.log
Description: Binary data
diff --git a/gcc/config/pa/pa.md b/gcc/config/pa/pa.md
index 6350c68..5f04c02 100644
--- a/gcc/config/pa/pa.md
+++ b/gcc/config/pa/pa.md
@@ -6604,32 +6604,82 @@
(set_attr "length" "4")])
 
 ; Shift right pair word 0 to 31 bits.
-(define_insn "shrpsi4"
-  [(set (match_operand:SI 0 "register_operand" "=r,r")
-   (ior:SI (ashift:SI (match_operand:SI 1 "register_operand" "r,r")
-  (minus:SI (const_int 32)
-(match_operand:SI 3 "shift5_operand" "q,n")))
-   (lshiftrt:SI (match_operand:SI 2 "register_operand" "r,r")
-(match_dup 3]
+(define_insn "*shrpsi4_1"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+   (match_operator:SI 4 "plus_xor_ior_operator"
+ [(ashift:SI (match_operand:SI 1 "register_operand" "r")
+ (minus:SI (const_int 32)
+   (match_operand:SI 3 "register_operand" "q")))
+  (lshiftrt:SI (match_operand:SI 2 "register_operand" "r")
+   (match_dup 3))]))]
   ""
-  "@
-   {vshd %1,%2,%0|shrpw %1,%2,%%sar,%0}
-   {shd|shrpw} %1,%2,%3,%0"
+  "{vshd %1,%2,%0|shrpw %1,%2,%%sar,%0}"
+  [(set_attr "type" "shift")
+   (set_attr "length" "4")])
+
+(define_insn "*shrpsi4_2"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+   (match_operator:SI 4 "plus_xor_ior_operator"
+ [(lshiftrt:SI (match_operand:SI 2 "register_operand" "r")
+   (match_operand:SI 3 "register_operand" "q"))
+  (ashift:SI (match_operand:SI 1 "register_operand" "r")
+ (minus:SI (const_int 32)
+   (match_dup 3)))]))]
+  ""
+  "{vshd %1,%2,%0|shrpw %1,%2,%%sar,%0}"
   [(set_attr "type" "shift")
(set_attr "length" "4")])
 
 ; Shift right pair doubleword 0 to 63 bits.
-(define_insn "shrpdi4"
-  [(set (match_operand:DI 0 "register_operand" "=r,r")
-   (ior:DI (ashift:DI (match_operand:SI 1 "register_operand" "r,r")
-  (minus:DI (const_int 64)
-(match_operand:DI 3 "shift6_operand" "q,n")))
-   (lshiftrt:DI (match_operand:DI 2 "register_operand" "r,r")
-(match_dup 3]
+(define_insn "*shrpdi4_1"
+  

Re: [PATCH] hppa: Improve expansion of ashldi3 when !TARGET_64BIT

2020-08-22 Thread John David Anglin
On 2020-08-22 12:01 p.m., Roger Sayle wrote:
> I suspect that the issue with the 64-bit patterns is that the second variant
> of 
> pa.md's define_insn "shrpdi4" is unlikely ever to match as (minus:DI
> (const_int 64) x)
> is never "canonical" when x is itself a CONST_INT.  Splitting this
> define_insn
> into two (or three see below) separate forms; the first as it currently is
> and the
> second (as you suggest) with 
>   "TARGET_64BIT
> && INTVAL (operands[3]) + INTVAL (operands[4]) == 64"
> should do the trick.
I will go ahead and add the basic patterns.  It seems it would be best if I 
avoid
using the "plus_xor_ior_operator".  It also seems the 32-bit patterns should 
avoid it.
>
> My first impression was that the DImode shrpd instructions would be most
> useful for implementing TI mode shifts, but that TI mode isn't supported by
> hppa64.  But then I noticed that the more immediate benefit would be in
> supporting rotrdi3 and rotldi3 on TARGET_64BIT that currently don't have
> expanders nor insns defined.  Here GCC currently generates three
> instructions
> where a single shrpd would be optimal.
It turns out we now need to support TI mode and __int128 for libgomp.  The 
hppa64-hpux target won't
boot without it.  I had just added a change to support TI mode but it's 
untested.

Regards,
Dave

-- 
John David Anglin  dave.ang...@bell.net



Re: [PATCH 0/6] Parallelize Intra-Procedural Optimizations using the LTO Engine.

2020-08-22 Thread Giuliano Belinassi via Gcc-patches
Hi, Josh

On 08/21, Josh Triplett wrote:
> On Thu, Aug 20, 2020 at 07:00:13PM -0300, Giuliano Belinassi wrote:
> > This patch series add a new flag "-fparallel-jobs=" to control if the
> > compiler should try to compile the current file in parallel.
> [...]
> > Bootstrapped and Regtested on Linux x86_64.
> > 
> > Giuliano Belinassi (6):
> >   Modify gcc driver for parallel compilation
> >   Implement a new partitioner for parallel compilation
> >   Implement fork-based parallelism engine
> >   Add `+' for Jobserver Integration
> >   Add invoke documentation
> >   New tests for parallel compilation feature
> 
> Very nice!

Thank you for your interest in this :)

> 
> I'm interested in testing this on a highly parallel system. What
> baseline do these patches apply to?  They don't seem to apply to GCC
> trunk.

Hummm, this was supposed to work on trunk out of the box. However,
there is a high probability that I messed up something while rebasing.
I will post a version 2 of it when I get more comments and when I fix
the Makefile issue that Joseph pointed out in other e-mail.

If you want to test it on a high parallel system, I think it will be
cool to see how it behaves also when --param=promote-statics=1, as it
increases parallelism opportunity. :)

> 
> Also, I tried to bootstrap the current tip of the devel/autopar_devel
> branch, but ended up with compiler segfaults that all look like this:
> ../../gcc/zlib/compress.c:86:1: internal compiler error: Segmentation fault
>86 | }
>   | ^

Well, there was once a bug in this branch when compiling with -flto that
caused the assembler output file not to be properly initialized early
enough, resulting in LTO LGEN stage writing into a invalid FILE pointer.
I fixed this during rebasing but I forgot to push to the autopar_devel
branch. In any case, I just pushed the recent changes to autopar_devel
which fix this issue.

In any case, -fparallel-jobs= should NOT be used together with -flto.
Although I used part of the LTO engine for development of this feature,
they are meant for distinct things. I guess I should give a warning
about that in next version :)

Also, I just tested bootstrap with

../gcc/configure --disable-multilib --enable-languages=c,c++

on x86_64 linux and it is working.

Thank you,
Giuliano.


[PATCH] x86: Only use general-purpose registers during CPUID check

2020-08-22 Thread H.J. Lu via Gcc-patches
On Sat, Aug 22, 2020 at 10:11 AM Uros Bizjak  wrote:
>
> On Sat, Aug 22, 2020 at 6:27 PM H.J. Lu  wrote:
> >
> > On Fri, Aug 21, 2020 at 9:45 AM H.J. Lu  wrote:
> > >
> > > On Fri, Aug 21, 2020 at 9:35 AM H.J. Lu  wrote:
> > > >
> > > > On Fri, Aug 21, 2020 at 9:29 AM Hongtao Liu  wrote:
> > > > >
> > > > > On Fri, Aug 21, 2020 at 11:50 PM Uros Bizjak  
> > > > > wrote:
> > > > > >
> > > > > > On Fri, Aug 21, 2020 at 5:41 PM Hongtao Liu  
> > > > > > wrote:
> > > > > > >
> > > > > > > On Fri, Aug 21, 2020 at 9:15 PM Uros Bizjak  
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > > > > gcc/
> > > > > > > > > > > PR target/88808
> > > > > > > > > > > * config/i386/i386.c 
> > > > > > > > > > > (ix86_preferred_reload_class): Allow
> > > > > > > > > > > QImode data go into mask registers.
> > > > > > > > > > > * config/i386/i386.md: (*movhi_internal): Adjust 
> > > > > > > > > > > constraints
> > > > > > > > > > > for mask registers.
> > > > > > > > > > > (*movqi_internal): Ditto.
> > > > > > > > > > > (*anddi_1): Support mask register operations
> > > > > > > > > > > (*and_1): Ditto.
> > > > > > > > > > > (*andqi_1): Ditto.
> > > > > > > > > > > (*andn_1): Ditto.
> > > > > > > > > > > (*_1): Ditto.
> > > > > > > > > > > (*qi_1): Ditto.
> > > > > > > > > > > (*one_cmpl2_1): Ditto.
> > > > > > > > > > > (*one_cmplsi2_1_zext): Ditto.
> > > > > > > > > > > (*one_cmplqi2_1): Ditto.
> > > > > > > > > > > (define_peephole2): Move constant 0/-1 directly 
> > > > > > > > > > > into mask
> > > > > > > > > > > registers.
> > > > > > > > > > > * config/i386/predicates.md (mask_reg_operand): 
> > > > > > > > > > > New predicate.
> > > > > > > > > > > * config/i386/sse.md (define_split): Add 
> > > > > > > > > > > post-reload splitters
> > > > > > > > > > > that would convert "generic" patterns to mask 
> > > > > > > > > > > patterns.
> > > > > > > > > > > (*knotsi_1_zext): New define_insn.
> > > > > > > > > > >
> > > > > > > > > > > gcc/testsuite/
> > > > > > > > > > > * gcc.target/i386/bitwise_mask_op-1.c: New test.
> > > > > > > > > > > * gcc.target/i386/bitwise_mask_op-2.c: New test.
> > > > > > > > > > > * gcc.target/i386/bitwise_mask_op-3.c: New test.
> > > > > > > > > > > * gcc.target/i386/avx512bw-pr88465.c: New 
> > > > > > > > > > > testcase.
> > > > > > > > > > > * gcc.target/i386/avx512bw-kunpckwd-1.c: Adjust 
> > > > > > > > > > > testcase.
> > > > > > > > > > > * gcc.target/i386/avx512bw-kunpckwd-3.c: Ditto.
> > > > > > > > > > > * gcc.target/i386/avx512dq-kmovb-5.c: Ditto.
> > > > > > > > > > > * gcc.target/i386/avx512f-kmovw-5.c: Ditto.
> > > > > > > > > >
> > > > > > > > > > A little nit, please put new splitters after the 
> > > > > > > > > > instruction pattern.
> > > > > > > > > >
> > > > > > > > > > OK for the whole patch set with the above change,
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Yes, thanks for the review.
> > > > > > > >
> > > > > > > > Please note that your patch introduces several testsuite fails 
> > > > > > > > with -m32:
> > > > > > > >
> > > > > > > > gcc -O2 -mavx512bitalg -mavx512bw -m32 -g 
> > > > > > > > avx512bitalg-vpopcntb-1.c
> > > > > > > >
> > > > > > >
> > > > > > > I can't reproduce this failure.
> > > > > >
> > > > > > Because you are running it on AVX512 enabled target.
> > > > > >
> > > > > > > > Program received signal SIGILL, Illegal instruction.
> > > > > > > > 0x080490ac in __get_cpuid_count (__edx=,
> > > > > > > > __ecx=, __ebx=, 
> > > > > > > > __eax= > > > > > > > pointer>,
> > > > > > > > __subleaf=0, __leaf=7) at 
> > > > > > > > /hdd/uros/gcc-build-fast/gcc/include/cpuid.h:316
> > > > > > > > 316   __cpuid_count (__leaf, __subleaf, *__eax, *__ebx, 
> > > > > > > > *__ecx, *__edx);
> > > > > > > >
> > > > > > > >0x080490a3 <+51>:cpuid
> > > > > > > >0x080490a5 <+53>:mov$0x1,%eax
> > > > > > > >0x080490aa <+58>:mov%ecx,%esi
> > > > > > > > => 0x080490ac <+60>:kmovd  %ebx,%k0
> > > > > > > >0x080490b0 <+64>:mov%edi,%ecx
> > > > > > > >0x080490b2 <+66>:mov%edi,%ebx
> > > > > > > >
> > > > > > > > kmov insn is generated for __cpuid_count function, where the 
> > > > > > > > binary
> > > > > > > > determines, if the new instructions are supported. The binary 
> > > > > > > > will
> > > > > > > > crash in the detection code if the processor lacks AVX512
> > > > > > > > instructions.
> > > > > > > >
> > > > > > >
> > > > > > > IMHO, the testcase shouldn't be run on processors without 
> > > > > > > AVX512BW.
> > > > > >
> > > > > > No, it could run, because it checks for AVX512BW at runtime.
> > > > > >
> > > > >
> > > > > Got it.
> > > > >
> > > > > > > Because in  avx512bitalg-vpopcntb-1.c, there's /*
> > > > > > > dg-require-effective-target avx512bw 

Re: [PATCH] x86: Disable SSE, AVX and AVX512 during CPUID check

2020-08-22 Thread Uros Bizjak via Gcc-patches
On Sat, Aug 22, 2020 at 6:27 PM H.J. Lu  wrote:
>
> On Fri, Aug 21, 2020 at 9:45 AM H.J. Lu  wrote:
> >
> > On Fri, Aug 21, 2020 at 9:35 AM H.J. Lu  wrote:
> > >
> > > On Fri, Aug 21, 2020 at 9:29 AM Hongtao Liu  wrote:
> > > >
> > > > On Fri, Aug 21, 2020 at 11:50 PM Uros Bizjak  wrote:
> > > > >
> > > > > On Fri, Aug 21, 2020 at 5:41 PM Hongtao Liu  
> > > > > wrote:
> > > > > >
> > > > > > On Fri, Aug 21, 2020 at 9:15 PM Uros Bizjak  
> > > > > > wrote:
> > > > > > >
> > > > > > > > > > gcc/
> > > > > > > > > > PR target/88808
> > > > > > > > > > * config/i386/i386.c (ix86_preferred_reload_class): 
> > > > > > > > > > Allow
> > > > > > > > > > QImode data go into mask registers.
> > > > > > > > > > * config/i386/i386.md: (*movhi_internal): Adjust 
> > > > > > > > > > constraints
> > > > > > > > > > for mask registers.
> > > > > > > > > > (*movqi_internal): Ditto.
> > > > > > > > > > (*anddi_1): Support mask register operations
> > > > > > > > > > (*and_1): Ditto.
> > > > > > > > > > (*andqi_1): Ditto.
> > > > > > > > > > (*andn_1): Ditto.
> > > > > > > > > > (*_1): Ditto.
> > > > > > > > > > (*qi_1): Ditto.
> > > > > > > > > > (*one_cmpl2_1): Ditto.
> > > > > > > > > > (*one_cmplsi2_1_zext): Ditto.
> > > > > > > > > > (*one_cmplqi2_1): Ditto.
> > > > > > > > > > (define_peephole2): Move constant 0/-1 directly 
> > > > > > > > > > into mask
> > > > > > > > > > registers.
> > > > > > > > > > * config/i386/predicates.md (mask_reg_operand): New 
> > > > > > > > > > predicate.
> > > > > > > > > > * config/i386/sse.md (define_split): Add 
> > > > > > > > > > post-reload splitters
> > > > > > > > > > that would convert "generic" patterns to mask 
> > > > > > > > > > patterns.
> > > > > > > > > > (*knotsi_1_zext): New define_insn.
> > > > > > > > > >
> > > > > > > > > > gcc/testsuite/
> > > > > > > > > > * gcc.target/i386/bitwise_mask_op-1.c: New test.
> > > > > > > > > > * gcc.target/i386/bitwise_mask_op-2.c: New test.
> > > > > > > > > > * gcc.target/i386/bitwise_mask_op-3.c: New test.
> > > > > > > > > > * gcc.target/i386/avx512bw-pr88465.c: New testcase.
> > > > > > > > > > * gcc.target/i386/avx512bw-kunpckwd-1.c: Adjust 
> > > > > > > > > > testcase.
> > > > > > > > > > * gcc.target/i386/avx512bw-kunpckwd-3.c: Ditto.
> > > > > > > > > > * gcc.target/i386/avx512dq-kmovb-5.c: Ditto.
> > > > > > > > > > * gcc.target/i386/avx512f-kmovw-5.c: Ditto.
> > > > > > > > >
> > > > > > > > > A little nit, please put new splitters after the instruction 
> > > > > > > > > pattern.
> > > > > > > > >
> > > > > > > > > OK for the whole patch set with the above change,
> > > > > > > > >
> > > > > > > >
> > > > > > > > Yes, thanks for the review.
> > > > > > >
> > > > > > > Please note that your patch introduces several testsuite fails 
> > > > > > > with -m32:
> > > > > > >
> > > > > > > gcc -O2 -mavx512bitalg -mavx512bw -m32 -g 
> > > > > > > avx512bitalg-vpopcntb-1.c
> > > > > > >
> > > > > >
> > > > > > I can't reproduce this failure.
> > > > >
> > > > > Because you are running it on AVX512 enabled target.
> > > > >
> > > > > > > Program received signal SIGILL, Illegal instruction.
> > > > > > > 0x080490ac in __get_cpuid_count (__edx=,
> > > > > > > __ecx=, __ebx=, 
> > > > > > > __eax= > > > > > > pointer>,
> > > > > > > __subleaf=0, __leaf=7) at 
> > > > > > > /hdd/uros/gcc-build-fast/gcc/include/cpuid.h:316
> > > > > > > 316   __cpuid_count (__leaf, __subleaf, *__eax, *__ebx, 
> > > > > > > *__ecx, *__edx);
> > > > > > >
> > > > > > >0x080490a3 <+51>:cpuid
> > > > > > >0x080490a5 <+53>:mov$0x1,%eax
> > > > > > >0x080490aa <+58>:mov%ecx,%esi
> > > > > > > => 0x080490ac <+60>:kmovd  %ebx,%k0
> > > > > > >0x080490b0 <+64>:mov%edi,%ecx
> > > > > > >0x080490b2 <+66>:mov%edi,%ebx
> > > > > > >
> > > > > > > kmov insn is generated for __cpuid_count function, where the 
> > > > > > > binary
> > > > > > > determines, if the new instructions are supported. The binary will
> > > > > > > crash in the detection code if the processor lacks AVX512
> > > > > > > instructions.
> > > > > > >
> > > > > >
> > > > > > IMHO, the testcase shouldn't be run on processors without AVX512BW.
> > > > >
> > > > > No, it could run, because it checks for AVX512BW at runtime.
> > > > >
> > > >
> > > > Got it.
> > > >
> > > > > > Because in  avx512bitalg-vpopcntb-1.c, there's /*
> > > > > > dg-require-effective-target avx512bw } */.
> > > > >
> > > > > This is to check the toolchain for support.
> > > > >
> > > > > > what's the version of your assembler?
> > > > >
> > > > > GNU assembler version 2.34-4.fc32
> > > > >
> > > >
> > > > If assembler supports avx512bw, but processor not, the test would pass
> > > > condition `dg-require-effective-target 

[PATCH] x86: Disable SSE, AVX and AVX512 during CPUID check

2020-08-22 Thread H.J. Lu via Gcc-patches
On Fri, Aug 21, 2020 at 9:45 AM H.J. Lu  wrote:
>
> On Fri, Aug 21, 2020 at 9:35 AM H.J. Lu  wrote:
> >
> > On Fri, Aug 21, 2020 at 9:29 AM Hongtao Liu  wrote:
> > >
> > > On Fri, Aug 21, 2020 at 11:50 PM Uros Bizjak  wrote:
> > > >
> > > > On Fri, Aug 21, 2020 at 5:41 PM Hongtao Liu  wrote:
> > > > >
> > > > > On Fri, Aug 21, 2020 at 9:15 PM Uros Bizjak  wrote:
> > > > > >
> > > > > > > > > gcc/
> > > > > > > > > PR target/88808
> > > > > > > > > * config/i386/i386.c (ix86_preferred_reload_class): 
> > > > > > > > > Allow
> > > > > > > > > QImode data go into mask registers.
> > > > > > > > > * config/i386/i386.md: (*movhi_internal): Adjust 
> > > > > > > > > constraints
> > > > > > > > > for mask registers.
> > > > > > > > > (*movqi_internal): Ditto.
> > > > > > > > > (*anddi_1): Support mask register operations
> > > > > > > > > (*and_1): Ditto.
> > > > > > > > > (*andqi_1): Ditto.
> > > > > > > > > (*andn_1): Ditto.
> > > > > > > > > (*_1): Ditto.
> > > > > > > > > (*qi_1): Ditto.
> > > > > > > > > (*one_cmpl2_1): Ditto.
> > > > > > > > > (*one_cmplsi2_1_zext): Ditto.
> > > > > > > > > (*one_cmplqi2_1): Ditto.
> > > > > > > > > (define_peephole2): Move constant 0/-1 directly into 
> > > > > > > > > mask
> > > > > > > > > registers.
> > > > > > > > > * config/i386/predicates.md (mask_reg_operand): New 
> > > > > > > > > predicate.
> > > > > > > > > * config/i386/sse.md (define_split): Add post-reload 
> > > > > > > > > splitters
> > > > > > > > > that would convert "generic" patterns to mask 
> > > > > > > > > patterns.
> > > > > > > > > (*knotsi_1_zext): New define_insn.
> > > > > > > > >
> > > > > > > > > gcc/testsuite/
> > > > > > > > > * gcc.target/i386/bitwise_mask_op-1.c: New test.
> > > > > > > > > * gcc.target/i386/bitwise_mask_op-2.c: New test.
> > > > > > > > > * gcc.target/i386/bitwise_mask_op-3.c: New test.
> > > > > > > > > * gcc.target/i386/avx512bw-pr88465.c: New testcase.
> > > > > > > > > * gcc.target/i386/avx512bw-kunpckwd-1.c: Adjust 
> > > > > > > > > testcase.
> > > > > > > > > * gcc.target/i386/avx512bw-kunpckwd-3.c: Ditto.
> > > > > > > > > * gcc.target/i386/avx512dq-kmovb-5.c: Ditto.
> > > > > > > > > * gcc.target/i386/avx512f-kmovw-5.c: Ditto.
> > > > > > > >
> > > > > > > > A little nit, please put new splitters after the instruction 
> > > > > > > > pattern.
> > > > > > > >
> > > > > > > > OK for the whole patch set with the above change,
> > > > > > > >
> > > > > > >
> > > > > > > Yes, thanks for the review.
> > > > > >
> > > > > > Please note that your patch introduces several testsuite fails with 
> > > > > > -m32:
> > > > > >
> > > > > > gcc -O2 -mavx512bitalg -mavx512bw -m32 -g avx512bitalg-vpopcntb-1.c
> > > > > >
> > > > >
> > > > > I can't reproduce this failure.
> > > >
> > > > Because you are running it on AVX512 enabled target.
> > > >
> > > > > > Program received signal SIGILL, Illegal instruction.
> > > > > > 0x080490ac in __get_cpuid_count (__edx=,
> > > > > > __ecx=, __ebx=, 
> > > > > > __eax= > > > > > pointer>,
> > > > > > __subleaf=0, __leaf=7) at 
> > > > > > /hdd/uros/gcc-build-fast/gcc/include/cpuid.h:316
> > > > > > 316   __cpuid_count (__leaf, __subleaf, *__eax, *__ebx, *__ecx, 
> > > > > > *__edx);
> > > > > >
> > > > > >0x080490a3 <+51>:cpuid
> > > > > >0x080490a5 <+53>:mov$0x1,%eax
> > > > > >0x080490aa <+58>:mov%ecx,%esi
> > > > > > => 0x080490ac <+60>:kmovd  %ebx,%k0
> > > > > >0x080490b0 <+64>:mov%edi,%ecx
> > > > > >0x080490b2 <+66>:mov%edi,%ebx
> > > > > >
> > > > > > kmov insn is generated for __cpuid_count function, where the binary
> > > > > > determines, if the new instructions are supported. The binary will
> > > > > > crash in the detection code if the processor lacks AVX512
> > > > > > instructions.
> > > > > >
> > > > >
> > > > > IMHO, the testcase shouldn't be run on processors without AVX512BW.
> > > >
> > > > No, it could run, because it checks for AVX512BW at runtime.
> > > >
> > >
> > > Got it.
> > >
> > > > > Because in  avx512bitalg-vpopcntb-1.c, there's /*
> > > > > dg-require-effective-target avx512bw } */.
> > > >
> > > > This is to check the toolchain for support.
> > > >
> > > > > what's the version of your assembler?
> > > >
> > > > GNU assembler version 2.34-4.fc32
> > > >
> > >
> > > If assembler supports avx512bw, but processor not, the test would pass
> > > condition `dg-require-effective-target avx512bw` and be runned.
> > > then crashed for illegal instruction.
> > >
> > > > Please add something like
> > > > X86_TUNE_INTER_UNIT_MOVES_FROM_MASK/X86_TUNE_INTER_UNIT_MOVES_TO_MASK
> > > > and enable them only for m_CORE_AVX512 (or perhaps m_INTEL).
> > > >
> > > > Handle this in inline_secondary_memory_needed to reject direct 

RE: [PATCH] hppa: Improve expansion of ashldi3 when !TARGET_64BIT

2020-08-22 Thread Roger Sayle


Hi Dave,

Many thanks for your help.

I suspect that the issue with the 64-bit patterns is that the second variant
of 
pa.md's define_insn "shrpdi4" is unlikely ever to match as (minus:DI
(const_int 64) x)
is never "canonical" when x is itself a CONST_INT.  Splitting this
define_insn
into two (or three see below) separate forms; the first as it currently is
and the
second (as you suggest) with 
"TARGET_64BIT
  && INTVAL (operands[3]) + INTVAL (operands[4]) == 64"
should do the trick.

My first impression was that the DImode shrpd instructions would be most
useful for implementing TI mode shifts, but that TI mode isn't supported by
hppa64.  But then I noticed that the more immediate benefit would be in
supporting rotrdi3 and rotldi3 on TARGET_64BIT that currently don't have
expanders nor insns defined.  Here GCC currently generates three
instructions
where a single shrpd would be optimal.

I'm sure that folks are aware of this but something I learned/found strange
was
the implications of using (match_operator:SI 5 "plus_xor_ior_operator" x y).
Clearly it's nice to have these patterns match IOR, PLUS and XOR (as
recently
pointed out by Jakub in a recent review), but using match_operator has two
side-effects.  The first is that recog's machinery doesn't know these
operators 
are commutative, so two variants of a define_insn using match_operator need
to be specified, i.e. (... y x).  The second side-effect is that for
generation/expansion
it's impossible (inefficient?) to use the corresponding gen_foo functions.
It would
be nice if match_operator:5 resulted in the gen_function taking an enum
rtx_code as the corresponding argument, to specify IOR or PLUS or XOR.
Alas no.  So instead I added the extremely light weight define_expand
for SImode shd_internal to select IOR arbitrarily as the preferred RTL.
You're right, that a similar strategy using a DImode shrpd_internal would
be required for the 64-bit form; but the two are different; it's not that
the
current name would be affected (or should be changed).

Hence the define_expand for rotrdi3 and rotldi3 would/should (for constant
shifts) call gen_shrpd_internal to create an insn that matches one of the
define_insns.

I'd be happy to write that patch, but I personally have no way of testing
it.
It's a shame there isn't a hppa64 machine on the GCC compile farm.

[p.s. I've ordered Gerry Kane's "PA-RISC 2.0 architecture" book from amazon,
so I'll hopefully understand more of what I'm talking about when it
arrives].

Thanks again.
Roger
--

-Original Message-
From: John David Anglin  
Sent: 22 August 2020 13:58
To: Roger Sayle ; 'GCC Patches'

Cc: 'Jeff Law' 
Subject: Re: [PATCH] hppa: Improve expansion of ashldi3 when !TARGET_64BIT

Hi Roger,

Started a test of your latest version.

It appears we miss 64-bit patterns similar to these:
(define_insn ""
  [(set (match_operand:SI 0 "register_operand" "=r")
    (match_operator:SI 5 "plus_xor_ior_operator"
  [(ashift:SI (match_operand:SI 1 "register_operand" "r")
  (match_operand:SI 3 "const_int_operand" "n"))
   (lshiftrt:SI (match_operand:SI 2 "register_operand" "r")
    (match_operand:SI 4 "const_int_operand" "n"))]))]
  "INTVAL (operands[3]) + INTVAL (operands[4]) == 32"
  "{shd|shrpw} %1,%2,%4,%0"
  [(set_attr "type" "shift")
   (set_attr "length" "4")])

(define_insn ""
  [(set (match_operand:SI 0 "register_operand" "=r")
    (match_operator:SI 5 "plus_xor_ior_operator"
  [(lshiftrt:SI (match_operand:SI 2 "register_operand" "r")
    (match_operand:SI 4 "const_int_operand" "n"))
   (ashift:SI (match_operand:SI 1 "register_operand" "r")
  (match_operand:SI 3 "const_int_operand" "n"))]))]
  "INTVAL (operands[3]) + INTVAL (operands[4]) == 32"
  "{shd|shrpw} %1,%2,%4,%0"
  [(set_attr "type" "shift")
   (set_attr "length" "4")])

I'm wondering if it would be useful to define the 64-bit equivalents using
the "shrpd" instruction.
If that's the case, then I think it would be good to rename "shd_internal"
to "shrpw_internal".

Regards,
Dave

On 2020-08-22 4:52 a.m., Roger Sayle wrote:
> Hi Dave,
>
> It's great to hear from you.  It's been a long while.
>
> Sorry, doh! yes, there's a mistake in my patch (that I introduced when 
> I renumbered the operands in the shd's define_expand to be the more 
> logical 0, 1, 2, 3, then 4).
> Sorry for the inconvenience [due to my lack of familiarity with 
> PA-RISC assembly].
> Hopefully you should get much better mileage out of the attached revision.
>
> Thanks again (and my sincere apologies), Roger
> --
>
> -Original Message-
> From: John David Anglin 
> Sent: 21 August 2020 20:00
> To: Roger Sayle ; 'GCC Patches'
> 
> Cc: 'Jeff Law' 
> Subject: Re: [PATCH] hppa: Improve expansion of ashldi3 when 
> !TARGET_64BIT
>
> Hi Roger,
>
> On 2020-08-21 8:53 a.m., Roger Sayle wrote:
>> I was wondering whether Dave or Jeff (or someone else 

[committed] analyzer: fix NULL deref false positives [PR94851]

2020-08-22 Thread David Malcolm via Gcc-patches
PR analyzer/94851 reports various false "NULL dereference" diagnostics.
The first case (comment #1) affects GCC 10.2 but no longer affects
trunk; I believe it was fixed by the state rewrite of
r11-2694-g808f4dfeb3a95f50f15e71148e5c1067f90a126d.

The patch adds a regression test for this case.

The other cases (comment #3 and comment #4) still affect trunk.
In both cases, the && in a conditional is optimized to bitwise &
  _1 = p_4 != 0B;
  _2 = p_4 != q_6(D);
  _3 = _1 & _2;
and the analyzer fails to fold this for the case where one (or both) of
the conditionals is false, and thus erroneously considers the path where
"p" is non-NULL despite being passed a NULL value.

Fix this by implementing folding for this case.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to master as r11-2807-gdf2b78d407a3fe8685343f7249b9c31c7e3af44d.

gcc/analyzer/ChangeLog:
PR analyzer/94851
* region-model-manager.cc
(region_model_manager::maybe_fold_binop): Fold bitwise "& 0" to 0.

gcc/testsuite/ChangeLog:
PR analyzer/94851
* gcc.dg/analyzer/pr94851-1.c: New test.
* gcc.dg/analyzer/pr94851-3.c: New test.
* gcc.dg/analyzer/pr94851-4.c: New test.
---
 gcc/analyzer/region-model-manager.cc  |  6 +++
 gcc/testsuite/gcc.dg/analyzer/pr94851-1.c | 46 +++
 gcc/testsuite/gcc.dg/analyzer/pr94851-3.c | 20 ++
 gcc/testsuite/gcc.dg/analyzer/pr94851-4.c | 24 
 4 files changed, 96 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/pr94851-1.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/pr94851-3.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/pr94851-4.c

diff --git a/gcc/analyzer/region-model-manager.cc 
b/gcc/analyzer/region-model-manager.cc
index 75402649a91..ce949322db7 100644
--- a/gcc/analyzer/region-model-manager.cc
+++ b/gcc/analyzer/region-model-manager.cc
@@ -451,6 +451,12 @@ region_model_manager::maybe_fold_binop (tree type, enum 
tree_code op,
   if (cst1 && integer_onep (cst1))
return arg0;
   break;
+case BIT_AND_EXPR:
+  if (cst1)
+   if (zerop (cst1) && INTEGRAL_TYPE_P (type))
+ /* "(ARG0 & 0)" -> "0".  */
+ return get_or_create_constant_svalue (build_int_cst (type, 0));
+  break;
 case TRUTH_ANDIF_EXPR:
 case TRUTH_AND_EXPR:
   if (cst1)
diff --git a/gcc/testsuite/gcc.dg/analyzer/pr94851-1.c 
b/gcc/testsuite/gcc.dg/analyzer/pr94851-1.c
new file mode 100644
index 000..56571318f91
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/pr94851-1.c
@@ -0,0 +1,46 @@
+/* { dg-additional-options "-O2" } */
+
+#include 
+#include 
+
+typedef struct AMARK {
+  struct AMARK *m_next;
+  char m_name;
+} AMARK;
+
+struct buf {
+  AMARK *b_amark;
+};
+
+struct buf *curbp;
+
+int pamark(void) {
+  int c;
+  AMARK *p = curbp->b_amark;
+  AMARK *last = curbp->b_amark;
+
+  c = getchar();
+
+  while (p != (AMARK *)NULL && p->m_name != (char)c) {
+last = p;
+p = p->m_next;
+  }
+
+  if (p != (AMARK *)NULL) {
+printf("over writing mark %c\n", c);
+  } else {
+if ((p = (AMARK *)malloc(sizeof(AMARK))) == (AMARK *)NULL)
+  return 0;
+
+p->m_next = (AMARK *)NULL;
+
+if (curbp->b_amark == (AMARK *)NULL)
+  curbp->b_amark = p;
+else
+  last->m_next = p;
+  }
+
+  p->m_name = (char)c;
+
+  return 1;
+}
diff --git a/gcc/testsuite/gcc.dg/analyzer/pr94851-3.c 
b/gcc/testsuite/gcc.dg/analyzer/pr94851-3.c
new file mode 100644
index 000..0f953970b00
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/pr94851-3.c
@@ -0,0 +1,20 @@
+/* { dg-additional-options "-O1" } */
+
+struct List {
+struct List *next;
+};
+
+void foo(struct List *p, struct List *q)
+{
+while (p && p != q){
+p = p->next;
+}
+}
+
+int main()
+{
+struct List x = {0};
+foo(0, );
+return 0;
+}
+
diff --git a/gcc/testsuite/gcc.dg/analyzer/pr94851-4.c 
b/gcc/testsuite/gcc.dg/analyzer/pr94851-4.c
new file mode 100644
index 000..2a15a5d7f5b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/pr94851-4.c
@@ -0,0 +1,24 @@
+/* { dg-additional-options "-O2" } */
+
+#include 
+
+struct List {
+struct List *next;
+};
+
+void foo(struct List *p, struct List *q)
+{
+while (p && p != q){
+struct List *next = p->next;
+free(p);
+p = next;
+}
+}
+
+int main()
+{
+struct List x = {0};
+foo(NULL, );
+return 0;
+}
+
-- 
2.26.2



[committed] analyzer: simplify store::eval_alias

2020-08-22 Thread David Malcolm via Gcc-patches
I have followup patches that add new conditions to store::eval_alias.
Rather than duplicate all conditions for symmetry, split it up and
call it on both (A, B) and (B, A).

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to master as c199723d7ed0032db095abc75b82a9710eaa5e56.

gcc/analyzer/ChangeLog:
* store.cc (store::eval_alias): Make const.  Split out 2nd half
into store::eval_alias_1 and call it twice for symmetry, avoiding
test duplication.
(store::eval_alias_1): New function, split out from the above.
* store.h (store::eval_alias): Make const.
(store::eval_alias_1): New decl.
---
 gcc/analyzer/store.cc | 45 +++
 gcc/analyzer/store.h  |  4 +++-
 2 files changed, 27 insertions(+), 22 deletions(-)

diff --git a/gcc/analyzer/store.cc b/gcc/analyzer/store.cc
index d854f4e504a..298088f6ef9 100644
--- a/gcc/analyzer/store.cc
+++ b/gcc/analyzer/store.cc
@@ -1544,7 +1544,7 @@ store::set_value (store_manager *mgr, const region 
*lhs_reg,
 
 tristate
 store::eval_alias (const region *base_reg_a,
-  const region *base_reg_b)
+  const region *base_reg_b) const
 {
   /* SSA names can't alias.  */
   tree decl_a = base_reg_a->maybe_get_decl ();
@@ -1554,31 +1554,34 @@ store::eval_alias (const region *base_reg_a,
   if (decl_b && TREE_CODE (decl_b) == SSA_NAME)
 return tristate::TS_FALSE;
 
+  /* Try both ways, for symmetry.  */
+  tristate ts_ab = eval_alias_1 (base_reg_a, base_reg_b);
+  if (ts_ab.is_false ())
+return tristate::TS_FALSE;
+  tristate ts_ba = eval_alias_1 (base_reg_b, base_reg_a);
+  if (ts_ba.is_false ())
+return tristate::TS_FALSE;
+  return tristate::TS_UNKNOWN;
+}
+
+/* Half of store::eval_alias; called twice for symmetry.  */
+
+tristate
+store::eval_alias_1 (const region *base_reg_a,
+const region *base_reg_b) const
+{
   if (const symbolic_region *sym_reg_a
   = base_reg_a->dyn_cast_symbolic_region ())
 {
   const svalue *sval_a = sym_reg_a->get_pointer ();
-  if (sval_a->get_kind () == SK_INITIAL
- && decl_b
- && !is_global_var (decl_b))
-   {
- /* The initial value of a pointer can't point to a local.  */
- return tristate::TS_FALSE;
-   }
-}
-  if (const symbolic_region *sym_reg_b
-  = base_reg_b->dyn_cast_symbolic_region ())
-{
-  const svalue *sval_b = sym_reg_b->get_pointer ();
-  if (sval_b->get_kind () == SK_INITIAL
- && decl_a
- && !is_global_var (decl_a))
-   {
- /* The initial value of a pointer can't point to a local.  */
- return tristate::TS_FALSE;
-   }
+  if (sval_a->get_kind () == SK_INITIAL)
+   if (tree decl_b = base_reg_b->maybe_get_decl ())
+ if (!is_global_var (decl_b))
+   {
+ /* The initial value of a pointer can't point to a local.  */
+ return tristate::TS_FALSE;
+   }
 }
-
   return tristate::TS_UNKNOWN;
 }
 
diff --git a/gcc/analyzer/store.h b/gcc/analyzer/store.h
index bc9dc2e0b5c..636a9547f2c 100644
--- a/gcc/analyzer/store.h
+++ b/gcc/analyzer/store.h
@@ -553,7 +553,7 @@ public:
   cluster_map_t::iterator end () const { return m_cluster_map.end (); }
 
   tristate eval_alias (const region *base_reg_a,
-  const region *base_reg_b);
+  const region *base_reg_b) const;
 
   template 
   void for_each_binding (BindingVisitor )
@@ -569,6 +569,8 @@ public:
 
 private:
   void remove_overlapping_bindings (store_manager *mgr, const region *reg);
+  tristate eval_alias_1 (const region *base_reg_a,
+const region *base_reg_b) const;
 
   cluster_map_t m_cluster_map;
 
-- 
2.26.2



[committed] analyzer: simplify region_model::push_frame

2020-08-22 Thread David Malcolm via Gcc-patches
region_model::push_frame was binding arguments for both the default SSA
name for each parameter, and the underlying parameter.

Simplify the generated states by only binding the default SSA name if
it exists, or the parameter if there is no default SSA name.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to master as 294b6da21bbd8297fe6aee497ac6c8e561637e70.

gcc/analyzer/ChangeLog:
* region-model.cc (region_model::push_frame): Bind the default
SSA name for each parm if it exists, falling back to the parm
itself otherwise, rather than doing both.

gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/malloc-ipa-8-double-free.c: Drop
-fanalyzer-verbose-state-changes.
---
 gcc/analyzer/region-model.cc  | 19 +++
 .../analyzer/malloc-ipa-8-double-free.c   | 10 +-
 2 files changed, 12 insertions(+), 17 deletions(-)

diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
index b8a0f9ffd3d..02bbfa54781 100644
--- a/gcc/analyzer/region-model.cc
+++ b/gcc/analyzer/region-model.cc
@@ -2353,17 +2353,12 @@ region_model::push_frame (function *fun, const 
vec *arg_svals,
 rest of the params as uninitialized.  */
  if (idx >= arg_svals->length ())
break;
+ tree parm_lval = iter_parm;
+ if (tree parm_default_ssa = ssa_default_def (fun, iter_parm))
+   parm_lval = parm_default_ssa;
+ const region *parm_reg = get_lvalue (parm_lval, ctxt);
  const svalue *arg_sval = (*arg_svals)[idx];
- const region *parm_reg = get_lvalue (iter_parm, ctxt);
  set_value (parm_reg, arg_sval, ctxt);
-
- /* Also do it for default SSA name (sharing the same value).  */
- tree parm_default_ssa = ssa_default_def (fun, iter_parm);
- if (parm_default_ssa)
-   {
- const region *defssa_reg = get_lvalue (parm_default_ssa, ctxt);
- set_value (defssa_reg, arg_sval, ctxt);
-   }
}
 }
   else
@@ -2375,10 +2370,10 @@ region_model::push_frame (function *fun, const 
vec *arg_svals,
   for (tree iter_parm = DECL_ARGUMENTS (fndecl); iter_parm;
   iter_parm = DECL_CHAIN (iter_parm))
{
- on_top_level_param (iter_parm, ctxt);
- tree parm_default_ssa = ssa_default_def (fun, iter_parm);
- if (parm_default_ssa)
+ if (tree parm_default_ssa = ssa_default_def (fun, iter_parm))
on_top_level_param (parm_default_ssa, ctxt);
+ else
+   on_top_level_param (iter_parm, ctxt);
}
 }
 
diff --git a/gcc/testsuite/gcc.dg/analyzer/malloc-ipa-8-double-free.c 
b/gcc/testsuite/gcc.dg/analyzer/malloc-ipa-8-double-free.c
index cdf5ac18324..580862b0138 100644
--- a/gcc/testsuite/gcc.dg/analyzer/malloc-ipa-8-double-free.c
+++ b/gcc/testsuite/gcc.dg/analyzer/malloc-ipa-8-double-free.c
@@ -1,6 +1,6 @@
 /* Example of a multilevel wrapper around malloc/free, with a double-'free'.  
*/
 
-/* { dg-additional-options "-fdiagnostics-show-line-numbers 
-fdiagnostics-path-format=inline-events -fanalyzer-checker=malloc 
-fanalyzer-verbose-state-changes -fdiagnostics-show-caret" } */
+/* { dg-additional-options "-fdiagnostics-show-line-numbers 
-fdiagnostics-path-format=inline-events -fanalyzer-checker=malloc 
-fdiagnostics-show-caret" } */
 /* { dg-enable-nn-line-numbers "" } */
 
 #include 
@@ -83,7 +83,7 @@ void test (int i)
   |   NN |   return malloc (size);
   |  |  ~
   |  |  |
-  |  |  (6) allocated here (state of '': 
'start' -> 'unchecked', NULL origin)
+  |  |  (6) allocated here
   |
<--+
|
@@ -96,7 +96,7 @@ void test (int i)
|   NN |   if (!result)
|  |  ~  
|  |  |
-   |  |  (8) assuming 'result' is non-NULL (state of 'result': 
'unchecked' -> 'nonnull', NULL origin)
+   |  |  (8) assuming 'result' is non-NULL
|  |  (9) following 'false' branch (when 'result' is 
non-NULL)...
|   NN | abort ();
|   NN |   result->i = i;
@@ -140,7 +140,7 @@ void test (int i)
   |   NN |   free (ptr);
   |  |   ~~
   |  |   |
-  |  |   (16) first 'free' here (state of 'ptr': 'nonnull' 
-> 'freed', NULL origin)
+  |  |   (16) first 'free' here
   |
<--+
|
@@ -187,7 +187,7 @@ void test (int i)
   |   NN |   free (ptr);
   |  |   ~~
   |  |   |
-  |  |   (23) second 'free' here; first 'free' was at (16) 
('ptr' is in state 'freed')
+  |  |   (23) second 'free' 

Re: [committed] libstdc++: Make __int128 meet integer-class requirements [PR 96042]

2020-08-22 Thread Marc Glisse

On Sat, 22 Aug 2020, Jonathan Wakely via Gcc-patches wrote:


On Sat, 22 Aug 2020 at 13:13, Jonathan Wakely  wrote:


On Sat, 22 Aug 2020 at 10:52, Marc Glisse wrote:

is there a particular reason to handle only __int128 this way, and not all
the non-standard integer types? It looks like it would be a bit simpler to
avoid a special case.


I have no objection to doing it for all of them, it just wasn't
necessary to solve the immediate problem that the library now uses
__int128 even when integral<__int128> is false. (Hmm, or is size_t  an
alias for __int20 on one arch, which would mean we do use it?)


Oh I remember why I didn't do that now. I did actually want to do it
that way initially.

The macros like __GLIBCXX_TYPE_INT_N_0 are not defined in strict mode,
so we have no generic way to name those types.


IIRC, those macros were introduced specifically to help libstdc++. If 
libstdc++ wants them defined in different circumstances, it should be fine 
to change the condition from "!flag_iso || int_n_data[i].bitsize == 
POINTER_SIZE" to whatever you need.


But now I understand why you did this, thanks.

--
Marc Glisse


RE: [PATCH] middle-end: Recognize idioms for bswap32 and bswap64 in match.pd.

2020-08-22 Thread Marc Glisse

On Sat, 15 Aug 2020, Roger Sayle wrote:

Here's version #2 of the patch to recognize bswap32 and bswap64 
incorporating your suggestions and feedback.  The test cases now confirm 
the transformation is applied when int is 32 bits and long is 64 bits, 
and should pass otherwise; the patterns now reuse (more) capturing 
groups, and the patterns have been made more generic to allow the 
ultimate type to be signed or unsigned (hence there are now two new 
gcc.dg tests).


Alas my efforts to allow the input argument to be signed, and use 
fold_convert to coerce it to the correct type before calling 
__builtin_bswap failed, with the error messages:


You can't use fold_convert for that (well, maybe if you restricted the 
transformation to GENERIC), but if I understand correctly, you are trying 
to do


(convert (BUILT_IN_BSWAP64 (convert:uint64_type_node @1))

? (untested)


From: Marc Glisse 

+(simplify
+  (bit_ior:c
+(lshift
+  (convert (BUILT_IN_BSWAP16 (convert (bit_and @0
+  INTEGER_CST@1
+  (INTEGER_CST@2))
+(convert (BUILT_IN_BSWAP16 (convert (rshift @3
+   INTEGER_CST@4)

I didn't realize we kept this useless bit_and when casting to a smaller
type.


I was confused when I wrote that and thought we were converting from int 
to uint16_t, but bswap16 actually takes an int on x86_64, probably because 
of the calling convention, so we are converting from unsigned int to int.


Having implementation details like the calling convention appear here in 
the intermediate language complicates things a bit. Can we assume that it 
is fine to build a call to bswap32/bswap64 taking uint32_t/uint64_t and 
that only bswap16 can be affected? Do most targets have a similar-enough 
calling convention that this transformation also works on them? It looks 
like aarch64 / powerpc64le / mips64el would like for bswap16->bswap32 a 
transformation of the same form as the one you wrote for bswap32->bswap64.



I was wondering what would happen if I start from an int instead of an 
unsigned int.


f (int x)
{
  short unsigned int _1;
  short unsigned int _2;
  short unsigned int _3;
  int _5;
  int _7;
  unsigned int _8;
  unsigned int _9;
  int _10;

   [local count: 1073741824]:
  _7 = x_4(D) & 65535;
  _1 = __builtin_bswap16 (_7);
  _8 = (unsigned int) x_4(D);
  _9 = _8 >> 16;
  _10 = (int) _9;
  _2 = __builtin_bswap16 (_10);
  _3 = _1 | _2;
  _5 = (int) _3;
  return _5;
}

Handling this in the same transformation with a pair of convert12? and 
some tests should be doable, but it gets complicated enough that it is 
fine to postpone that.


--
Marc Glisse


Re: [RISC-V] Add support for AddressSanitizer on RISC-V GCC

2020-08-22 Thread Kito Cheng via Gcc-patches
On Fri, Aug 21, 2020 at 12:04 AM Palmer Dabbelt  wrote:
>
> On Wed, 19 Aug 2020 02:25:37 PDT (-0700), gcc-patches@gcc.gnu.org wrote:
> > Hi Andrew:
> >
> > I am not sure the reason why some targets pick different numbers.
> > It seems it's not only target dependent but also OS dependent[1].
> >
> > For RV32, I think using 1<<29 like other 32 bit targets is fine.
> >
> > [1] 
> > https://github.com/llvm/llvm-project/blob/master/compiler-rt/lib/asan/asan_mapping.h#L159
> >
> > Hi Joshua:
> >
> > Could you update that for RV32, and this patch will be pending until
> > LLVM accepts the libsanitizer part.
>
> This is ABI, and Linux only supports kasan on rv64 right now so it's
> technically undefined.  It's probably best to avoid picking an arbitrary 
> number
> for rv32, as we still have some open questions WRT the kernel memory map over
> there.  I doubt that will get sorted out for a while, as the rv32 doesn't get 
> a
> lot of attention (though hopefully the glibc stuff will help out).

Yeah, I agree it's part of ABI, and I think this part could wait until
rv32 glibc upstream, the day seems not too far.

> > On Wed, Aug 19, 2020 at 4:48 PM Andrew Waterman  wrote:
> >>
> >> I'm having trouble understanding why different ports chose their
> >> various constants--e.g., SPARC uses 1<<29 for 32-bit and 1<<43 for
> >> 64-bit, whereas x86 uses 1<<29 and 0x7fff8000, respectively.  So I
> >> can't comment on the choice of the constant 1<<36 for RISC-V.  But
> >> isn't it a problem that 1<<36 is not a valid Pmode value for ILP32?
>
> This is for kasan (not regular asan), which requires some coordination between
> the kernel's memory map and the compiler's inline address sanitizer (as you
> can't just pick your own memory map).  Essentially what's going on is that
> there's an array of valid tags associated with each address, which is checked
> in-line by the compiler for performance reasons (IIRC it used to be library
> routines).  The compiler needs to know how to map between addresses and tags,
> which depends on the kernel's memory map -- essentially baking the kernel's
> memory map into the compiler.  That's why the constants seem somewhat
> arbitrary.

IIRC kasan will give the offset via -fasan-shadow-offset,
so TARGET_ASAN_SHADOW_OFFSET only meaningful for (user-space) asan.

>
> In order to save memory there's some lossyness in the address->tag mapping.
> Most 32-bit ports pick a tag array that's 1/8th of the memory size, which is
> where the 29 comes from.  I don't see any reason why that wouldn't be workable
> on rv32, but it seems better to make sure that's the case rather than just
> making up an ABI :)

I guess we could try it after rv32 glibc upstream.

>
> >> On Wed, Aug 19, 2020 at 1:02 AM Joshua via Gcc-patches
> >>  wrote:
> >> >
> >> > From: cooper.joshua 
> >> >
> >> > gcc/
> >> >
> >> > * config/riscv/riscv.c (asan_shadow_offset): Implement the 
> >> > offset of asan shadow memory for risc-v.
> >> > (asan_shadow_offset): new macro definition.
> >> > ---
> >> >
> >> >  gcc/config/riscv/riscv.c | 11 +++
> >> >  1 file changed, 11 insertions(+)
> >> >
> >> > diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c
> >> > index 63b0c38..b85b459 100644
> >> > --- a/gcc/config/riscv/riscv.c
> >> > +++ b/gcc/config/riscv/riscv.c
> >> > @@ -5292,6 +5292,14 @@ riscv_gpr_save_operation_p (rtx op)
> >> >return true;
> >> >  }
> >> >
> >> > +/* Implement TARGET_ASAN_SHADOW_OFFSET.  */
> >> > +
> >> > +static unsigned HOST_WIDE_INT
> >> > +riscv_asan_shadow_offset (void)
> >> > +{
> >> > +  return HOST_WIDE_INT_1U << 36;
> >> > +}
> >> > +
> >> >  /* Initialize the GCC target structure.  */
> >> >  #undef TARGET_ASM_ALIGNED_HI_OP
> >> >  #define TARGET_ASM_ALIGNED_HI_OP "\t.half\t"
> >> > @@ -5475,6 +5483,9 @@ riscv_gpr_save_operation_p (rtx op)
> >> >  #undef TARGET_NEW_ADDRESS_PROFITABLE_P
> >> >  #define TARGET_NEW_ADDRESS_PROFITABLE_P riscv_new_address_profitable_p
> >> >
> >> > +#undef TARGET_ASAN_SHADOW_OFFSET
> >> > +#define TARGET_ASAN_SHADOW_OFFSET riscv_asan_shadow_offset
> >> > +
> >> >  struct gcc_target targetm = TARGET_INITIALIZER;
> >> >
> >> >  #include "gt-riscv.h"
> >> > --
> >> > 2.7.4
> >> >


Re: [PATCH] middle-end: Simplify popcount/parity of bswap/rotate.

2020-08-22 Thread Marc Glisse

On Fri, 21 Aug 2020, Roger Sayle wrote:


This simple patch to match.pd optimizes away bit permutation
operations, specifically bswap and rotate, in calls to popcount and
parity.


Good idea.


Although this patch has been developed and tested on LP64,
it relies on there being no truncations or extensions to "marry up"
the appropriate PARITY, PARITYL and PARITYLL forms with either BSWAP32
or BSWAP64, assuming this transformation won't fire if the integral
types have different sizes.


There would be a convert_expr or similar if the sizes were different, so 
you are safe.


I wish we had only generic builtins/ifn instead of
__builtin_parity{,l,ll,imax}, and inconsistently __builtin_bswap{16,32,64,128},
that's very inconvenient to deal with... And genmatch generates a lot of
useless code because of that.

I didn't try but couldn't the transformations be merged to reduce repetition?

(for bitcnt (POPCOUNT PARITY)
 (for swap (BUILT_IN_BSWAP32 BUILT_IN_BSWAP64)
  (simplify
   (bitcnt (swap @0))
   (bitcnt @0)))
 (for rot (lrotate rrotate)
  (simplify
   (bitcnt (rot @0 @1))
   (bitcnt @0

I assume you skipped BUILT_IN_BSWAP16 because on 32+ bit targets, there
is an intermediate extension, and 16 bit targets are "rare"? And
BUILT_IN_BSWAP128 because on most platforms intmax_t is only 64 bits and
we don't have a 128-bit version of parity/popcount? (we have an IFN, but
it seldom appears by magic)

--
Marc Glisse


Re: [PATCH] hppa: Improve expansion of ashldi3 when !TARGET_64BIT

2020-08-22 Thread John David Anglin
Hi Roger,

Started a test of your latest version.

It appears we miss 64-bit patterns similar to these:
(define_insn ""
  [(set (match_operand:SI 0 "register_operand" "=r")
    (match_operator:SI 5 "plus_xor_ior_operator"
  [(ashift:SI (match_operand:SI 1 "register_operand" "r")
  (match_operand:SI 3 "const_int_operand" "n"))
   (lshiftrt:SI (match_operand:SI 2 "register_operand" "r")
    (match_operand:SI 4 "const_int_operand" "n"))]))]
  "INTVAL (operands[3]) + INTVAL (operands[4]) == 32"
  "{shd|shrpw} %1,%2,%4,%0"
  [(set_attr "type" "shift")
   (set_attr "length" "4")])

(define_insn ""
  [(set (match_operand:SI 0 "register_operand" "=r")
    (match_operator:SI 5 "plus_xor_ior_operator"
  [(lshiftrt:SI (match_operand:SI 2 "register_operand" "r")
    (match_operand:SI 4 "const_int_operand" "n"))
   (ashift:SI (match_operand:SI 1 "register_operand" "r")
  (match_operand:SI 3 "const_int_operand" "n"))]))]
  "INTVAL (operands[3]) + INTVAL (operands[4]) == 32"
  "{shd|shrpw} %1,%2,%4,%0"
  [(set_attr "type" "shift")
   (set_attr "length" "4")])

I'm wondering if it would be useful to define the 64-bit equivalents using the 
"shrpd" instruction.
If that's the case, then I think it would be good to rename "shd_internal" to 
"shrpw_internal".

Regards,
Dave

On 2020-08-22 4:52 a.m., Roger Sayle wrote:
> Hi Dave,
>
> It's great to hear from you.  It's been a long while.
>
> Sorry, doh! yes, there's a mistake in my patch (that I introduced when I
> renumbered
> the operands in the shd's define_expand to be the more logical 0, 1, 2, 3,
> then 4).
> Sorry for the inconvenience [due to my lack of familiarity with PA-RISC
> assembly].
> Hopefully you should get much better mileage out of the attached revision.
>
> Thanks again (and my sincere apologies),
> Roger
> --
>
> -Original Message-
> From: John David Anglin  
> Sent: 21 August 2020 20:00
> To: Roger Sayle ; 'GCC Patches'
> 
> Cc: 'Jeff Law' 
> Subject: Re: [PATCH] hppa: Improve expansion of ashldi3 when !TARGET_64BIT
>
> Hi Roger,
>
> On 2020-08-21 8:53 a.m., Roger Sayle wrote:
>> I was wondering whether Dave or Jeff (or someone else with access to 
>> real hardware) might "spin" this patch for me?
> This may be totally unrelated to this patch but I hit this error in stage2
> testing your change:
> build/genattrtab ../../gcc/gcc/common.md ../../gcc/gcc/config/pa/pa.md
> insn-conditions.md \
>     -Atmp-attrtab.c -Dtmp-dfatab.c -Ltmp-latencytab.c
> genattrtab: Internal error: abort in attr_alt_union, at genattrtab.c:2383
>
> It's great that you are back helpting with the middle-end.
>
> Regards,
> Dave
>
> --
> John David Anglin  dave.ang...@bell.net
>


-- 
John David Anglin  dave.ang...@bell.net



Re: [committed] libstdc++: Make __int128 meet integer-class requirements [PR 96042]

2020-08-22 Thread Jonathan Wakely via Gcc-patches
On Sat, 22 Aug 2020 at 13:13, Jonathan Wakely  wrote:
>
> On Sat, 22 Aug 2020 at 10:52, Marc Glisse wrote:
> > is there a particular reason to handle only __int128 this way, and not all
> > the non-standard integer types? It looks like it would be a bit simpler to
> > avoid a special case.
>
> I have no objection to doing it for all of them, it just wasn't
> necessary to solve the immediate problem that the library now uses
> __int128 even when integral<__int128> is false. (Hmm, or is size_t  an
> alias for __int20 on one arch, which would mean we do use it?)

Oh I remember why I didn't do that now. I did actually want to do it
that way initially.

The macros like __GLIBCXX_TYPE_INT_N_0 are not defined in strict mode,
so we have no generic way to name those types. It's possible to
special case __int128 because I know its name, and I know the macro to
test for its existence (__SIZEOF_INT128__).

To do it for all the other possible non-standard types I'd need to
know all the possible names they can have. I know __int20 is one
possible name on one target, but I don't know of the others. Since
__int128 is the only one we're actually relying on, I just solved the
problem for signed/unsigned __int128.


Re: [committed] libstdc++: Make __int128 meet integer-class requirements [PR 96042]

2020-08-22 Thread Jonathan Wakely via Gcc-patches
On Sat, 22 Aug 2020 at 13:18, JeanHeyd Meneide wrote:
>
> On Sat, Aug 22, 2020 at 8:14 AM Jonathan Wakely via Gcc-patches
>  wrote:
> > I really wish WG14 would just fix the intmax_t mess so we can make
> > them integral types unconditionally.
>
> We're trying, but we're struggling to reach a good consensus. Almost
> nobody's fully agreeing on one /particular/ solution (there's roughly
> 3 ideas of what to do).

I liked Jens Gustedt's original proposal, it looked perfect to me. I
should catch up on WG14 mails to see what the other proposals are and
where the discussion is going.


Re: [committed] libstdc++: Make __int128 meet integer-class requirements [PR 96042]

2020-08-22 Thread JeanHeyd Meneide via Gcc-patches
On Sat, Aug 22, 2020 at 8:14 AM Jonathan Wakely via Gcc-patches
 wrote:
> I really wish WG14 would just fix the intmax_t mess so we can make
> them integral types unconditionally.

We're trying, but we're struggling to reach a good consensus. Almost
nobody's fully agreeing on one /particular/ solution (there's roughly
3 ideas of what to do).

But the papers are coming in and it's on our radar so maybe we can fix
it for C2x and then pass the solution on to WG21!


Re: [committed] libstdc++: Make __int128 meet integer-class requirements [PR 96042]

2020-08-22 Thread Jonathan Wakely via Gcc-patches
On Sat, 22 Aug 2020 at 10:52, Marc Glisse wrote:
> is there a particular reason to handle only __int128 this way, and not all
> the non-standard integer types? It looks like it would be a bit simpler to
> avoid a special case.

I have no objection to doing it for all of them, it just wasn't
necessary to solve the immediate problem that the library now uses
__int128 even when integral<__int128> is false. (Hmm, or is size_t  an
alias for __int20 on one arch, which would mean we do use it?)

In case you didn't see it, I've created https://gcc.gnu.org/PR96710 to
try and make our treatment of __int128 a bit more useful and
consistent. Even if it's not a standard integer type, it's definitely
an object type. And although it doesn't fit the standard's definition
of a scalar type, I think it should (I prefer to think of scalar types
as objects that aren't classes or arrays, rather than the list of
specific types in the standard that attempts to be exhaustive). What
we do for __int128 could/should be done for the other non-standard
integers.

I really wish WG14 would just fix the intmax_t mess so we can make
them integral types unconditionally.


Re: [Patch, fortran] PR fortran/96727 - ICE with character length specified using specification function on assumed-rank array

2020-08-22 Thread Thomas Koenig via Gcc-patches

Hi Jose,

Proposed patch to PR96727 - ICE with character length specified using 
specification function on assumed-rank array.


Patch tested only on x86_64-pc-linux-gnu.

Add missing default error message for the assumed-rank array case.


Looks good with an adjusted ChangeLog/git commit message.
(If you have trouble with that, you can also send me
an e-mail).

Best regards

Thomas




Re: [Patch, fortran] PR fortran/96726 - ICE with user defined specification function on assumed-rank array

2020-08-22 Thread Thomas Koenig via Gcc-patches

Hi Jose,

Proposed patch to PR96726 - ICE with user defined specification function 
on assumed-rank array.


OK, you'll need a to work on the ChangeLog format to commit this
(like I wrote in my previous mail).

Thanks for the patch!

Regards

Thomas


Re: [committed] libstdc++: Make __int128 meet integer-class requirements [PR 96042]

2020-08-22 Thread Marc Glisse

On Wed, 19 Aug 2020, Jonathan Wakely via Gcc-patches wrote:


Because __int128 can be used as the difference type for iota_view, we
need to ensure that it meets the requirements of an integer-class type.
The requirements in [iterator.concept.winc] p10 include numeric_limits
being specialized and giving meaningful answers. Currently we only
specialize numeric_limits for non-standard integer types in non-strict
modes.  However, nothing prevents us from defining an explicit
specialization for any implementation-defined type, so it doesn't matter
whether std::is_integral<__int128> is true or not.

This patch ensures that the numeric_limits specializations for signed
and unsigned __int128 are defined whenever __int128 is available. It
also makes the __numeric_traits and __int_limits helpers work for
__int128, via a new __gnu_cxx::__is_integer_nonstrict trait.


Hello,

is there a particular reason to handle only __int128 this way, and not all 
the non-standard integer types? It looks like it would be a bit simpler to 
avoid a special case.


--
Marc Glisse


RE: [PATCH] hppa: Improve expansion of ashldi3 when !TARGET_64BIT

2020-08-22 Thread Roger Sayle

Hi Dave,

It's great to hear from you.  It's been a long while.

Sorry, doh! yes, there's a mistake in my patch (that I introduced when I
renumbered
the operands in the shd's define_expand to be the more logical 0, 1, 2, 3,
then 4).
Sorry for the inconvenience [due to my lack of familiarity with PA-RISC
assembly].
Hopefully you should get much better mileage out of the attached revision.

Thanks again (and my sincere apologies),
Roger
--

-Original Message-
From: John David Anglin  
Sent: 21 August 2020 20:00
To: Roger Sayle ; 'GCC Patches'

Cc: 'Jeff Law' 
Subject: Re: [PATCH] hppa: Improve expansion of ashldi3 when !TARGET_64BIT

Hi Roger,

On 2020-08-21 8:53 a.m., Roger Sayle wrote:
> I was wondering whether Dave or Jeff (or someone else with access to 
> real hardware) might "spin" this patch for me?
This may be totally unrelated to this patch but I hit this error in stage2
testing your change:
build/genattrtab ../../gcc/gcc/common.md ../../gcc/gcc/config/pa/pa.md
insn-conditions.md \
    -Atmp-attrtab.c -Dtmp-dfatab.c -Ltmp-latencytab.c
genattrtab: Internal error: abort in attr_alt_union, at genattrtab.c:2383

It's great that you are back helpting with the middle-end.

Regards,
Dave

--
John David Anglin  dave.ang...@bell.net

diff --git a/gcc/config/pa/pa.md b/gcc/config/pa/pa.md
index 6350c68..713ff17 100644
--- a/gcc/config/pa/pa.md
+++ b/gcc/config/pa/pa.md
@@ -6416,9 +6416,32 @@
   [(set (match_operand:DI 0 "register_operand" "")
(ashift:DI (match_operand:DI 1 "lhs_lshift_operand" "")
   (match_operand:DI 2 "arith32_operand" "")))]
-  "TARGET_64BIT"
+  ""
   "
 {
+  if (!TARGET_64BIT)
+{
+  if (REG_P (operands[0]) && GET_CODE (operands[2]) == CONST_INT)
+   {
+ unsigned HOST_WIDE_INT shift = UINTVAL (operands[2]);
+ if (shift >= 1 && shift <= 31)
+   {
+ rtx dst = operands[0];
+ rtx src = force_reg (DImode, operands[1]);
+ emit_insn (gen_shd_internal (gen_highpart (SImode, dst),
+  gen_lowpart (SImode, src),
+  GEN_INT (32-shift),
+  gen_highpart (SImode, src),
+  GEN_INT (shift)));
+ emit_insn (gen_ashlsi3 (gen_lowpart (SImode, dst),
+ gen_lowpart (SImode, src),
+ GEN_INT (shift)));
+ DONE;
+   }
+   }
+  /* Fallback to using optabs.c's expand_doubleword_shift.  */
+  FAIL;
+}
   if (GET_CODE (operands[2]) != CONST_INT)
 {
   rtx temp = gen_reg_rtx (DImode);
@@ -6705,6 +6728,15 @@
   [(set_attr "type" "shift")
(set_attr "length" "4")])
 
+(define_expand "shd_internal"
+  [(set (match_operand:SI 0 "register_operand")
+   (ior:SI
+ (lshiftrt:SI (match_operand:SI 1 "register_operand")
+  (match_operand:SI 2 "const_int_operand"))
+ (ashift:SI (match_operand:SI 3 "register_operand")
+(match_operand:SI 4 "const_int_operand"]
+  "")
+
 (define_insn ""
   [(set (match_operand:SI 0 "register_operand" "=r")
(and:SI (ashift:SI (match_operand:SI 1 "register_operand" "r")


Re: [PATCH] libgccjit: update some comments in libgccjit.c

2020-08-22 Thread Andrea Corallo
David Malcolm  writes:

> On Wed, 2020-08-19 at 09:24 +0200, Andrea Corallo wrote:
>> Hi all,
>> 
>> just a small patch updating some comments that apparently went out of
>> sync a while ago adding gcc_jit_context_new_rvalue_from_long.
>
>> Okay for trunk?
>
> Yes
>
> Thanks for fixing these
> Dave

Pushed as fc34d04b075.

Thanks

  Andrea