[Bug tree-optimization/98254] Failure to optimize simple pattern for __builtin_convertvector

2020-12-12 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98254

--- Comment #4 from rguenther at suse dot de  ---
On December 12, 2020 8:36:07 PM GMT+01:00, "jakub at gcc dot gnu.org"
 wrote:
>https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98254
>
>--- Comment #3 from Jakub Jelinek  ---
>(In reply to rguent...@suse.de from comment #2)
>> Should already be handled by vectorizing the CTOR.
>
>I've tried:
>
>typedef int __attribute__((vector_size(16))) V;
>
>V
>foo (short *a)
>{
>  return (V){a[0], a[1], a[2], a[3]};
>}
>
>V
>bar (int *a)
>{
>  return (V){a[0], a[1], a[2], a[3]};
>}
>
>and we don't do a vector (unaligned) read even in bar with -O3
>-fno-tree-slp-vectorize, it is just SLP vectorization that makes it
>vectorize.
>If we should handle foo as convertvector, we should handle bar in the
>same spot
>as vector load from memory.

I see. Forwprop handles some conversions already, would need to check what is
missing for this case.
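
For reference, the transformation being asked for in foo can be written out by hand. The following is only a sketch, assuming GCC 9 or later (where __builtin_convertvector is available); the names VS, foo_ctor, foo_convert and check_equivalence are made up for illustration and are not part of the PR.

```c
#include <assert.h>

typedef int   __attribute__((vector_size(16))) V;   /* 4 x int   */
typedef short __attribute__((vector_size(8)))  VS;  /* 4 x short */

/* Element-wise constructor, same shape as foo() in comment #3.  */
V foo_ctor(const short *a)
{
  return (V){a[0], a[1], a[2], a[3]};
}

/* Hand-written equivalent: one 8-byte load (memcpy makes no alignment
   assumption) followed by a widening short -> int conversion.  */
V foo_convert(const short *a)
{
  VS tmp;
  __builtin_memcpy(&tmp, a, sizeof tmp);
  return __builtin_convertvector(tmp, V);
}

/* Hypothetical self-check: both forms must agree lane by lane.  */
void check_equivalence(const short *a)
{
  V x = foo_ctor(a);
  V y = foo_convert(a);
  for (int i = 0; i < 4; i++)
    assert(x[i] == y[i]);
}
```

Comparing the -O2 assembly of foo_ctor and foo_convert is one way to see whether the CTOR is already being turned into a vector load plus conversion on a given target.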

[Bug target/98259] New: [11 Regression] error: 'void verify_insn_chain()' causes a section type conflict with 'void init_rtl_bb_info(basic_block)'

2020-12-12 Thread doko at debian dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98259

Bug ID: 98259
   Summary: [11 Regression] error: 'void verify_insn_chain()'
causes a section type conflict with 'void
init_rtl_bb_info(basic_block)'
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: doko at debian dot org
  Target Milestone: ---

seen with 20201212 with a profiled bootstrap on arm-linux-gnueabihf:

../../src/gcc/symtab.c: In static member function 'static void
symtab_node::verify_symtab_nodes()':
../../src/gcc/symtab.c:1349:1: error: 'void symtab_node::verify()' causes a
section type conflict with 'void symbol_table::symtab_initialize_asm_name_hash()'
 1349 | symtab_node::verify (void)
  | ^~~
../../src/gcc/symtab.c:259:1: note: 'void
symbol_table::symtab_initialize_asm_name_hash()' was declared here
  259 | symbol_table::symtab_initialize_asm_name_hash (void)
  | ^~~~
make[5]: *** [Makefile:1123: symtab.o] Error 1
make[5]: *** Waiting for unfinished jobs
../../src/gcc/cfgrtl.c: In function 'void cfg_layout_finalize()':
../../src/gcc/cfgrtl.c:4057:1: error: 'void verify_insn_chain()' causes a
section type conflict with 'void init_rtl_bb_info(basic_block)'
 4057 | verify_insn_chain (void)
  | ^
../../src/gcc/cfgrtl.c:5134:1: note: 'void init_rtl_bb_info(basic_block)' was
declared here
 5134 | init_rtl_bb_info (basic_block bb)
  | ^~~~
make[5]: *** [Makefile:1123: cfgrtl.o] Error 1

gcc is configured with:

 --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++
 --with-gcc-major-version-only
 --program-prefix=
 --enable-shared
 --enable-linker-build-id
 --disable-nls
 --enable-bootstrap
 --enable-clocale=gnu
 --enable-libstdcxx-debug
 --enable-libstdcxx-time=yes
 --with-default-libstdcxx-abi=new
 --enable-gnu-unique-object
 --disable-libitm
 --disable-libquadmath
 --disable-libquadmath-support
 --enable-plugin
 --with-system-zlib
 --enable-libphobos-checking=release
 --with-target-system-zlib=auto
 --enable-objc-gc=auto
 --enable-multiarch
 --enable-multilib
 --disable-sjlj-exceptions
 --with-arch=armv7-a
 --with-fpu=vfpv3-d16
 --with-float=hard
 --with-mode=thumb
 --disable-werror
 --enable-multilib
 --enable-checking=yes
 --build=arm-linux-gnueabihf
 --host=arm-linux-gnueabihf
 --target=arm-linux-gnueabihf

build target: profiledbootstrap-lean

[Bug libgomp/98258] New: Can't compile programs for both OpenMP (CPU) + OpenACC (GPU)

2020-12-12 Thread mehdi.chinoune at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98258

Bug ID: 98258
   Summary: Can't compile programs for both OpenMP (CPU) + OpenACC
(GPU)
   Product: gcc
   Version: 10.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libgomp
  Assignee: unassigned at gcc dot gnu.org
  Reporter: mehdi.chinoune at hotmail dot com
CC: jakub at gcc dot gnu.org
  Target Milestone: ---

Trying to use OpenMP (CPU) for some parts and OpenACC (GPU) for others.
I got:
mkoffload: fatal error: either '-fopenacc' or '-fopenmp' must be set

Another use is for multi-GPU programming, where OpenMP is used to distribute
work among different GPUs

[Bug libgomp/95150] Some offloaded programs crash with openmp

2020-12-12 Thread mehdi.chinoune at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95150

Chinoune  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |WONTFIX

--- Comment #8 from Chinoune  ---
Adding "parallel do" to openmp directive solves the problem.
The crash reappears with "collapse(2)" with both OpenMP and OpenACC.

program main
  implicit none
  integer, parameter :: sp = selected_real_kind(6,37)
  real(sp), allocatable :: a(:,:), b(:,:), c(:,:)
  character( len=5 ) :: val
  integer :: n, l, m
  integer :: i, j, k
  integer :: t1, t2
  real(sp) :: tic
  !
  call get_command_argument( 1, val )
  read( val, *) n
  l = n
  m = n
  !
  call system_clock( t1, tic)
  !
  allocate( a(l,m), b(m,n), c(l,n) )
  !
  call random_number(a)
  call random_number(b)
  c = 0._sp
  !
  !$acc data copyin(a,b) copy(c)
  !$acc parallel loop collapse(3)
  !$omp target teams distribute parallel do collapse(3) map( to:a,b ) map( tofrom:c )
  do j = 1, n
    do k = 1, m
      do i = 1, l
        c(i,j) = a(i,k)*b(k,j) + c(i,j)
      end do
    end do
  end do
  !$acc end data
  !
  call system_clock(t2)
  print*, n, (t2-t1)/tic, sum(c)
  !
end program main

$ gfortran -O3 -fopenmp -foffload=nvptx-none matmul.f90 -o test.x
$ for i in {1..5}; do ./test.x $((512*2**$i)); done
1024  0.28788   268377424.
2048   7.4010E-02   0.
4096  0.17002   0.
8192  0.57401   0.
   16384   2.1049   0.

[Bug libgomp/95150] Some offloaded programs crash with openmp

2020-12-12 Thread mehdi.chinoune at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95150

Chinoune  changed:

   What|Removed |Added

  Known to fail|10.1.0  |10.2.0
   Keywords||openacc
Version|10.1.0  |10.2.0

--- Comment #7 from Chinoune  ---
with OpenACC, I got a similar message:

libgomp: cuStreamSynchronize error: the launch timed out and was terminated

[Bug gcov-profile/98257] New: Replace Donald B. Johnson's cycle enumeration with iterative loop finding

2020-12-12 Thread i at maskray dot me via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98257

Bug ID: 98257
   Summary: Replace Donald B. Johnson's cycle enumeration with
iterative loop finding
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: gcov-profile
  Assignee: unassigned at gcc dot gnu.org
  Reporter: i at maskray dot me
CC: marxin at gcc dot gnu.org
  Target Milestone: ---

gcov used _J. C. Tiernan, An Efficient Search Algorithm to Find the Elementary
Circuits of a Graph, Comm ACM 1970_. The worst-case time bound is exponential
in the number of elementary circuits. It enumerated cycles (aka simple circuit,
aka elementary circuit) and performed cycle cancelling.

In 2016, the resolution to PR67992 switched to Donald B. Johnson's algorithm to
improve performance. The theoretical time complexity is $O((V+E)(c+1))$ where
$c$ is the number of cycles, which is exponential in the size of the graph.
(Boost attributed the algorithm to K. A. Hawick and H. A. James, and gcov
inherited this name. However, that paper did not improve Johnson's algorithm.)

Actually every step of cycle cancelling decreases the count of at least one arc
to 0, so there are at most $O(E)$ cycles. The resolution to PR90380 skipped
non-positive arcs and decreased the time complexity to $O(V*E^2)$ (in theory it
could be $O(E^2)$ but the implementation has a linear scan).

This is all unnecessary. We can just iteratively find cycles (using the
classical tri-color DFS) and perform cycle cancelling. There are at most O(E)
cycles and the overall time complexity is O(E^2). 

(
We are processing a reducible flow graph (there is no intuitive cycle count for
an irreducible flow graph).
Every natural loop is identified by a back edge. By constructing a dominator
tree, finding back edges, identifying natural loops and clearing the arc
counters (we will compute incoming counts so we clear counters to prevent
duplicates), the time complexity can be decreased to $O(depthOfNestedLoops*E)$.
In practice, the semi-NCA algorithm (time complexity: $O(V^2)$, but considered
faster than the almost linear Lengauer-Tarjan's algorithm) is not difficult to
implement, but identifying natural loops is troublesome. So the method is not
useful.)
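
The iterative approach described above can be sketched as follows. This is a minimal illustration, not gcov's code: the struct layout, the fixed-size limits and the names dfs and cancel_cycles are all assumptions. It also scans the whole arc array per node, so each DFS costs O(V*E) here; adjacency lists would be needed to reach the O(E^2) total stated above.

```c
#include <assert.h>
#include <limits.h>

#define MAX_NODES 64
#define MAX_ARCS  256

struct arc { int src, dst; long count; };

struct graph {
  int n_nodes, n_arcs;
  struct arc arcs[MAX_ARCS];
};

enum color { WHITE, GRAY, BLACK };

static enum color color[MAX_NODES];
static int path_arc[MAX_NODES];  /* arc leaving each node on the current DFS path */

/* Tri-color DFS over arcs with a positive count.  Returns the node at
   which a cycle closes (reached through a GRAY node), or -1.  */
static int dfs(struct graph *g, int u)
{
  color[u] = GRAY;
  for (int i = 0; i < g->n_arcs; i++) {
    struct arc *a = &g->arcs[i];
    if (a->src != u || a->count <= 0)
      continue;
    path_arc[u] = i;
    if (color[a->dst] == GRAY)
      return a->dst;                 /* back arc: a->dst starts the cycle */
    if (color[a->dst] == WHITE) {
      int head = dfs(g, a->dst);
      if (head >= 0)
        return head;
    }
  }
  color[u] = BLACK;
  return -1;
}

/* Repeatedly find one cycle and subtract its minimum arc count from every
   arc on it.  Each round zeroes at least one arc, so there are at most
   n_arcs rounds.  Returns the sum of the cancelled counts.  */
long cancel_cycles(struct graph *g)
{
  long total = 0;
  assert(g->n_nodes <= MAX_NODES && g->n_arcs <= MAX_ARCS);
  for (;;) {
    int head = -1;
    for (int v = 0; v < g->n_nodes; v++)
      color[v] = WHITE;
    for (int v = 0; v < g->n_nodes && head < 0; v++)
      if (color[v] == WHITE)
        head = dfs(g, v);
    if (head < 0)
      return total;                  /* no cycle with positive counts left */

    long min = LONG_MAX;
    int v = head;
    do {                             /* first pass: minimum count on the cycle */
      struct arc *a = &g->arcs[path_arc[v]];
      if (a->count < min)
        min = a->count;
      v = a->dst;
    } while (v != head);
    do {                             /* second pass: cancel it */
      struct arc *a = &g->arcs[path_arc[v]];
      a->count -= min;
      v = a->dst;
    } while (v != head);
    total += min;
  }
}
```

For a single loop (entry -> header, header <-> body with count 10, header -> exit) this cancels one cycle of count 10 and leaves the entry and exit arcs untouched.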

[Bug fortran/98253] Conflicting random_seed/random_init results

2020-12-12 Thread sgk at troutmask dot apl.washington.edu via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98253

--- Comment #9 from Steve Kargl  ---
On Sat, Dec 12, 2020 at 11:55:41PM +, damian at sourceryinstitute dot org
wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98253
> 
> Damian Rouson  changed:
> 
>What|Removed |Added
> 
>  Resolution|--- |FIXED
>  Status|WAITING |RESOLVED
> 
> --- Comment #8 from Damian Rouson  ---
> Steve, one more question.  How do you interpret the second sentence in the 
> text
> that I originally quoted: "In each execution of the program with the same
> execution environment, if the invoking image index value in the initial team 
> is
> the same, the value for PUT shall be the same."  This is in 16.9.155 Case (i)
> describing the relationship between random_init and random_seed.  I originally
> interpreted this quote to mean that each image would use the same seed each
> time the program runs, which would be a constraint on the PRNG.  I'm now
> thinking that the reference to PUT implies that the user is setting the seed
> and this is saying that the program must set the same seed each time a given image
> executes, but that seems like an odd constraint so I'm probably still horribly
> confused.  Feel free to mark this issue as invalid if this is starting to seem
> like a waste of time.  I'm just trying to understand.
> 
> Either way, an image number is defined for all programs whether or not there
> are coarrays anywhere in the program and whether or not the program is ever
> executed in multiple images -- for example, this_image() is just an intrinsic
> function rather than a (hypothetical) "coarray" intrinsic function.  This 
> point
> is most meaningful with a compiler like the Cray compiler, which requires no
> special flags to compile a program that invokes this_image().  In some sense,
> all Fortran programs are now parallel programs whether the user takes 
> advantage
> of that fact in any explicit way or not. I suspect that's the reason that
> IMAGE_DISTINCT is not optional. Possibly the committee deemed it better to
> require users to specify the desired behavior in multi-image execution. Even
> libraries that were never designed in any way to exploit parallelism can be
> linked into parallel programs so it seems better to have developers of such a
> library specify the desired behavior if their code is ultimately linked into a
> parallel program -- analogous to requiring that code be thread-safe even if 
> the
> code makes no explicit use of multi-threading.
> 

It's been a while since I implemented random_init(),
and thought about the combinations for the two
arguments.  Personally, I think the standard is
flawed.  If someone wants to review the wording of
the standard and the implementation details of 
random_init(), I am certainly not going to object.

[Bug fortran/98253] Conflicting random_seed/random_init results

2020-12-12 Thread damian at sourceryinstitute dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98253

Damian Rouson  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|WAITING |RESOLVED

--- Comment #8 from Damian Rouson  ---
Steve, one more question.  How do you interpret the second sentence in the text
that I originally quoted: "In each execution of the program with the same
execution environment, if the invoking image index value in the initial team is
the same, the value for PUT shall be the same."  This is in 16.9.155 Case (i)
describing the relationship between random_init and random_seed.  I originally
interpreted this quote to mean that each image would use the same seed each
time the program runs, which would be a constraint on the PRNG.  I'm now
thinking that the reference to PUT implies that the user is setting the seed
and this is saying that the program must set the same seed each time a given image
executes, but that seems like an odd constraint so I'm probably still horribly
confused.  Feel free to mark this issue as invalid if this is starting to seem
like a waste of time.  I'm just trying to understand.

Either way, an image number is defined for all programs whether or not there
are coarrays anywhere in the program and whether or not the program is ever
executed in multiple images -- for example, this_image() is just an intrinsic
function rather than a (hypothetical) "coarray" intrinsic function.  This point
is most meaningful with a compiler like the Cray compiler, which requires no
special flags to compile a program that invokes this_image().  In some sense,
all Fortran programs are now parallel programs whether the user takes advantage
of that fact in any explicit way or not. I suspect that's the reason that
IMAGE_DISTINCT is not optional. Possibly the committee deemed it better to
require users to specify the desired behavior in multi-image execution. Even
libraries that were never designed in any way to exploit parallelism can be
linked into parallel programs so it seems better to have developers of such a
library specify the desired behavior if their code is ultimately linked into a
parallel program -- analogous to requiring that code be thread-safe even if the
code makes no explicit use of multi-threading.

[Bug middle-end/98227] [11 Regression] ICE: tree check: expected tree that contains 'decl common' structure, have 'constructor' in get_section, at varasm.c:297 on riscv64-linux-gnu

2020-12-12 Thread wilson at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98227

--- Comment #5 from Jim Wilson  ---
My bootstrap with ada succeeded.  I used the same configure options except for
--prefix.  make check is still running.

[Bug fortran/98253] Conflicting random_seed/random_init results

2020-12-12 Thread damian at sourceryinstitute dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98253

--- Comment #7 from Damian Rouson  ---
I agree that it would have been better for image_distinct to be optional.  I
co-hosted the 2018 WG5 meeting at which there were lengthy discussions around
random number generation.  I don't recall whether making that argument optional
was discussed.  I assume it wouldn't break any existing code to make it
optional in a future standard.

[Bug tree-optimization/98256] [11 Regression] ICE at -Os and above: verify_ssa failed since r11-5957

2020-12-12 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98256

Jakub Jelinek  changed:

   What|Removed |Added

   Priority|P3  |P1
 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2020-12-12
   Target Milestone|--- |11.0
 Ever confirmed|0   |1
Summary|ICE at -Os and above:   |[11 Regression] ICE at -Os
   |verify_ssa failed   |and above: verify_ssa
   ||failed since r11-5957
 CC||jakub at gcc dot gnu.org
   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org

[Bug fortran/98253] Conflicting random_seed/random_init results

2020-12-12 Thread kargl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98253

--- Comment #6 from kargl at gcc dot gnu.org ---
(In reply to Dominique d'Humieres from comment #4)
> Invalid expectation?

Not sure.  This long response was composed before I saw Damian's reply.

At the risk of starting an existential argument, I'll provide
my understanding of the situation.

Prior to random_init(), Fortran had random_seed().  When J3
added random_number() and random_seed() to Fortran standard,
the individual who wrote the specification forgot to include
a statement about the state of the PRNG if random_seed() was 
not called.  So, this simple program

program foo
   real r
   call random_number(r)
   print *, r
end program foo

when compiled with ifort would give a different PRN on each
invocation.  A long time ago, when compiled with gfortran,
the program always gave the same PRN.  (Janne changed gfortran's
behavior when he replaced the KISS PRNG with xshiro++.)
The problem was that ifort used a different processor-dependent
set of seeds on each invocation whereas gfortran used the same
processor-dependent set of seeds on each invocation.  Both
behaviors are standard conforming.  To resolve the problem, J3
could not select one behavior over the other without causing
problems with a conforming program that relied on the old behavior.

Steve Lionel wrote the specification for random_init() in
hopes of fixing shortcomings of random_seed().  Unfortunately,
he (and/or J3) decided to conflate behavior for coarray
programs into the specification.  Consider the simple non-coarray
program:

% cat u.f90
program foo
   call random_init(repeatable=.false., image_distinct=.false.)
   do i = 1, 3
      call random_number(r)
      print '(F8.5)', r
   end do
   print *
   call random_init(repeatable=.false., image_distinct=.false.)
   do i = 1, 3
      call random_number(r)
      print '(F8.5)', r
   end do
end program foo

% gfcx -o z u.f90 && ./z
 0.78330
 0.40072
 0.22728

 0.44823
 0.12879
 0.50003

Exactly the behavior one would expect.  Reseeding the PRNG
uses a new set of processor-dependent seeds.  If 'z' is run
again, a different set of processor-dependent seeds is used.

Now change the code to have 'repeatable=.true.'

% gfcx -o z u.f90 && ./z
 0.67367
 0.06375
 0.69694

 0.67367
 0.06375
 0.69694

Exactly what one expects.  When the second random_init() is
called, the PRNG is re-initialized with the original set of
processor-dependent seeds.  What happens if the executable is
run again?  Well,
% ./z
 0.34318
 0.90421
 0.38122

 0.34318
 0.90421
 0.38122
A different set of processor-dependent seeds is used to initially
seed the PRNG, and when random_init() is called a second time, it
uses that "different set of processor-dependent seeds" to re-initialize
the PRNG.

So, where does existentialism enter into the issue?  When executable
'./z' is run the following occurs:
  1) 'image0' is instantiated
  2) random_init() is called
  3) the do-loop executes
  4) 'image0' is terminated.
A year later when './z' is run again, the following occurs:
  a) 'image0' is instantiated
  b) random_init() is called
  c) the do-loop executes
  d) 'image0' is terminated.

When 'image0' in a) is instantiated, 'image0' in 1) no longer exists.
There is no way to determine what set of processor-dependent seeds
were used for random_init() in 2) when random_init() is called in b).
It does not matter what value is assigned to image_distinct in the
above code.  Is 'image0' in a) the same as 'image0' in 1) or are these
images distinct?

IMO, image_distinct only applies when more than one image is
instantiated during the execution of a co-array program.  image_distinct
should have been an optional argument.  Suppose you have a
program that has num_images() return a value of 2.  You execute that
program and the following occurs:
  I) 'image0' is instantiated
 II) 'image1' is instantiated
III) one or more images call random_init()
 IV) work is done
  V) one or more images call random_init(), again.
 VI) image1 terminates
VII) image0 terminates
It is here that image_distinct can affect the seeding of PRNG.
When I developed random_init(), I spent a few days getting
OpenCoarrays installed on my system.  I then spent some time
trying to get a reasonable approach for dealing with
images (don't remember any consideration about teams).  The
comment in libgfortran/intrinsics/random_init.f90 details
what happens with combinations of 'repeatable' and 'image_distinct'.

[Bug fortran/98253] Conflicting random_seed/random_init results

2020-12-12 Thread damian at sourceryinstitute dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98253

--- Comment #5 from Damian Rouson  ---
Steve, thanks for all the time you put into implementing random_init and
responding to this PR.  My confusion stemmed from the first sentence that I
quoted from the standard. It states that the provided random_init call is
equivalent to a processor-dependent random_seed call so I was attempting to
replace my two random_seed calls with one random_init call. I see now that such
a replacement only works if one knows the correct, processor-dependent seed
values, but I also understand now that it would be pointless to do what I'm
trying to do.  Because the matching seeds would be processor-dependent, the
code wouldn't be portable. 

On a related note, I've been trying over time to evolve away from using
"coarray" as the blanket term for all parallel features.  Fortran now has so
many parallel features that don't necessarily involve coarrays. The
IMAGE_DISTINCT argument is one small example so I don't think IMAGE_DISTINCT
necessarily has anything to do with coarrays, but it does have to do with
multi-image execution.

[Bug fortran/90207] Debugging generated tree code

2020-12-12 Thread tkoenig at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90207

--- Comment #5 from Thomas Koenig  ---
https://gcc.gnu.org/pipermail/gcc-patches/2020-December/561720.html
allows debugging of the generated variables.

[Bug libgomp/95150] Some offloaded programs crash with openmp

2020-12-12 Thread mehdi.chinoune at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95150

Chinoune  changed:

   What|Removed |Added

 Resolution|WONTFIX |---
 Status|RESOLVED|UNCONFIRMED

--- Comment #6 from Chinoune  ---
Reopen, as I have reproduced the same crash with another GPU.

[Bug tree-optimization/98256] New: ICE at -Os and above: verify_ssa failed

2020-12-12 Thread zhendong.su at inf dot ethz.ch via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98256

Bug ID: 98256
   Summary: ICE at -Os and above: verify_ssa failed
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zhendong.su at inf dot ethz.ch
  Target Milestone: ---

[547] % gcctk -v
Using built-in specs.
COLLECT_GCC=gcctk
COLLECT_LTO_WRAPPER=/local/suz-local/software/local/gcc-trunk/libexec/gcc/x86_64-pc-linux-gnu/11.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-trunk/configure --disable-bootstrap
--prefix=/local/suz-local/software/local/gcc-trunk --enable-languages=c,c++
--disable-werror --enable-multilib --with-system-zlib
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 11.0.0 20201212 (experimental) [master revision
ff2dfdef2f2:87144b47033:815eb852a2d293331eba2e241a986b8641d4da1f] (GCC) 
[548] % 
[548] % gcctk -O1 -c small.c
[549] % 
[549] % gcctk -Os -c small.c
small.c: In function ‘g’:
small.c:3:6: error: definition in block 2 follows the use
3 | void g() { f(1 && ~a / b); }
  |  ^
for SSA_NAME: b.1_3 in statement:
_8 = .ADD_OVERFLOW (a.0_1, b.1_3);
during GIMPLE pass: widening_mul
small.c:3:6: internal compiler error: verify_ssa failed
0xfa18ab verify_ssa(bool, bool)
../../gcc-trunk/gcc/tree-ssa.c:1214
0xc26fe7 execute_function_todo
../../gcc-trunk/gcc/passes.c:2049
0xc27d92 execute_todo
../../gcc-trunk/gcc/passes.c:2096
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
[550] % 
[550] % cat small.c
extern void f (int);
unsigned a, b;
void g() { f(1 && ~a / b); }

[Bug tree-optimization/98255] [10/11 Regression] wrong code at -Os and above with -fPIC on x86_64-pc-linux-gnu

2020-12-12 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98255

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org,
   ||jamborm at gcc dot gnu.org
Summary|wrong code at -Os and above |[10/11 Regression] wrong
   |with -fPIC on   |code at -Os and above with
   |x86_64-pc-linux-gnu |-fPIC on
   ||x86_64-pc-linux-gnu
   Last reconfirmed||2020-12-12
   Target Milestone|--- |10.3
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW

--- Comment #1 from Jakub Jelinek  ---
Started with r10-917-g3b47da42de621c6c3bf7d2f9245df989aa7eb5a1

[Bug tree-optimization/98254] Failure to optimize simple pattern for __builtin_convertvector

2020-12-12 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98254

--- Comment #3 from Jakub Jelinek  ---
(In reply to rguent...@suse.de from comment #2)
> Should already be handled by vectorizing the CTOR.

I've tried:

typedef int __attribute__((vector_size(16))) V;

V
foo (short *a)
{
  return (V){a[0], a[1], a[2], a[3]};
}

V
bar (int *a)
{
  return (V){a[0], a[1], a[2], a[3]};
}

and we don't do a vector (unaligned) read even in bar with -O3
-fno-tree-slp-vectorize, it is just SLP vectorization that makes it vectorize.
If we should handle foo as convertvector, we should handle bar in the same spot
as vector load from memory.

[Bug c/98252] gcc 10 unaligned copy (with tree-loop-vectorize) produce wrong result

2020-12-12 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98252

Jakub Jelinek  changed:

   What|Removed |Added

 Resolution|--- |INVALID
 Status|UNCONFIRMED |RESOLVED

--- Comment #3 from Jakub Jelinek  ---
When there is UB, you can't make any assumptions; the program can do anything
once it reaches it.
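
The testcase is not quoted in this thread, so the following is only a sketch under an assumption: that the UB reported by -fsanitize=undefined comes from dereferencing a misaligned pointer, possibly combined with overlapping source and destination. The helper names load_u64, store_u64 and copy_bytes are hypothetical. Going through memcpy/memmove keeps such accesses well defined, so the vectorizer cannot assume alignment or non-overlap that the program does not guarantee.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Load a 64-bit value from any address; memcpy carries no alignment
   requirement, unlike *(const uint64_t *)p.  */
uint64_t load_u64(const void *p)
{
  uint64_t v;
  memcpy(&v, p, sizeof v);
  return v;
}

/* Store a 64-bit value to any address.  */
void store_u64(void *p, uint64_t v)
{
  memcpy(p, &v, sizeof v);
}

/* Copy n bytes when the two regions may overlap; memmove is defined for
   overlapping buffers, memcpy is not.  */
void copy_bytes(void *dst, const void *src, size_t n)
{
  assert(dst != NULL && src != NULL);
  memmove(dst, src, n);
}
```

Compilers typically lower these fixed-size memcpy calls to a single unaligned load or store where the target allows it, so the well-defined form usually costs nothing.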

[Bug tree-optimization/98255] New: wrong code at -Os and above with -fPIC on x86_64-pc-linux-gnu

2020-12-12 Thread zhendong.su at inf dot ethz.ch via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98255

Bug ID: 98255
   Summary: wrong code at -Os and above with -fPIC on
x86_64-pc-linux-gnu
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zhendong.su at inf dot ethz.ch
  Target Milestone: ---

[510] % gcctk -v
Using built-in specs.
COLLECT_GCC=gcctk
COLLECT_LTO_WRAPPER=/local/suz-local/software/local/gcc-trunk/libexec/gcc/x86_64-pc-linux-gnu/11.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-trunk/configure --disable-bootstrap
--prefix=/local/suz-local/software/local/gcc-trunk --enable-languages=c,c++
--disable-werror --enable-multilib --with-system-zlib
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 11.0.0 20201212 (experimental) [master revision
ff2dfdef2f2:87144b47033:815eb852a2d293331eba2e241a986b8641d4da1f] (GCC) 
[511] % 
[511] % gcctk -Os small.c; ./a.out
[512] % 
[512] % gcctk -Os -fPIC small.c
[513] % ./a.out
Segmentation fault
[514] % 
[514] % cat small.c
struct a {
  volatile unsigned b;
  unsigned c;
};

int d, *e, h, k, l;
static struct a f;
long g;
static unsigned i = 4294967294;
volatile int j;

long m() {
  char n[4][4][3] = {{{9, 2, 8}, {9, 2, 8}, {9, 2, 8}, {9}}, {{8}}, {{8}}, {{2}}};
  while (d) {
    for (; f.c < 4; f.c++) {
      *e = 0;
      h = n[f.c + 4][0][d];
    }
    while (g)
      return n[0][3][i];
    while (1) {
      if (k) {
        j = 0;
        if (j)
          continue;
      }
      if (l)
        break;
    }
  }
  return 0;
}

int main() {
  m();
  return 0;
}

[Bug fortran/97455] ICE on invalid code (wrong pointer assignment) in SELECT TYPE construct

2020-12-12 Thread dominiq at lps dot ens.fr via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97455

Dominique d'Humieres  changed:

   What|Removed |Added

   Last reconfirmed||2020-12-12
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW

--- Comment #1 from Dominique d'Humieres  ---
Confirmed since at least GCC7.
Note pr86551 is now fixed.

[Bug c/98252] gcc 10 unaligned copy (with tree-loop-vectorize) produce wrong result

2020-12-12 Thread a3at.mail at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98252

--- Comment #2 from Azat  ---
>If you compile your testcase with -fsanitize=undefined, you'll see that it 
>invokes UB.

Jakub, indeed I saw them, but is there any explanation (other than "UB") for why
it copies by 16 when the memory overlaps?

[Bug fortran/86551] [OOP] ICE on invalid code with select type

2020-12-12 Thread dominiq at lps dot ens.fr via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86551

--- Comment #5 from Dominique d'Humieres  ---
The ICE is gone for GCC10.2.1 and 11.0.

[Bug fortran/98253] Conflicting random_seed/random_init results

2020-12-12 Thread dominiq at lps dot ens.fr via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98253

Dominique d'Humieres  changed:

   What|Removed |Added

   Last reconfirmed||2020-12-12
 Status|UNCONFIRMED |WAITING
 Ever confirmed|0   |1

--- Comment #4 from Dominique d'Humieres  ---
Invalid expectation?

[Bug tree-optimization/98254] Failure to optimize simple pattern for __builtin_convertvector

2020-12-12 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98254

--- Comment #2 from rguenther at suse dot de  ---
On December 12, 2020 7:27:01 PM GMT+01:00, "jakub at gcc dot gnu.org"
 wrote:
>https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98254
>
>Jakub Jelinek  changed:
>
>   What|Removed |Added
>
>  CC||jakub at gcc dot gnu.org,
>   ||rguenth at gcc dot gnu.org
>
>--- Comment #1 from Jakub Jelinek  ---
>Guess a task for SLP vectorization.

Should already be handled by vectorizing the CTOR.

[Bug c/98252] gcc 10 unaligned copy (with tree-loop-vectorize) produce wrong result

2020-12-12 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98252

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek  ---
If you compile your testcase with -fsanitize=undefined, you'll see that it
invokes UB.

[Bug tree-optimization/98254] Failure to optimize simple pattern for __builtin_convertvector

2020-12-12 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98254

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org,
   ||rguenth at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek  ---
Guess a task for SLP vectorization.

[Bug fortran/98022] [9/10/11 Regression] ICE in gfc_assign_data_value, at fortran/data.c:468 since r9-3803-ga5fbc2f36a291cbe

2020-12-12 Thread sgk at troutmask dot apl.washington.edu via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98022

--- Comment #9 from Steve Kargl  ---
On Sat, Dec 12, 2020 at 05:54:43PM +, pault at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98022
> 
> --- Comment #8 from Paul Thomas  ---
> The example that you give shows that setting the undefined part to zero
> certainly is not correct. I updated my tree for the commit and am only just 
> now
> rebuilding. It'll be tomorrow before I put this right.
> 
> I guess that this is in the category of invalid but not forbidden. It's in the
> same category as:
>   complex :: a, b
>   a%im = 1.0
>   b = a
>   print *, a, b
> end
> 

Yes, it's invalid under the same portion of section 19 I quoted earlier.
'a' is undefined because 'a%re' is undefined.  I cannot find anything
in the Standard that requires an error or a warning message.

[Bug target/92729] [avr] Convert the backend to MODE_CC so it can be kept in future releases

2020-12-12 Thread abebeos at lazaridis dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92729

--- Comment #43 from abebeos at lazaridis dot com ---
The patch is now "out there", after further validation showed zero regressions
in the gcc/g++ testsuite across 2 different test setups:

https://gcc.gnu.org/pipermail/gcc-patches/2020-December/561718.html

My understanding of the process tells me:

- the relevant maintainers decide about the patch.
- if merged, then this issue can be closed. then...
- the bounty backers (and only they) decide about the claims to the bounty.

https://github.com/bountysource/core/wiki/Frequently-Asked-Questions#how-are-claims-processed

38 backers - it looks quite impossible for one malicious claimant to cheat the
system.

[Bug fortran/98253] Conflicting random_seed/random_init results

2020-12-12 Thread kargl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98253

--- Comment #3 from kargl at gcc dot gnu.org ---
Third thought.  Here are the programs you meant to write.  They omit error
checking; for example, how_to_use_random_init must be run before
how_to_seed_with_random_seed_like_random_init so that seed.cache exists.

program how_to_use_random_init

   implicit none

   integer fd, i, n
   integer, allocatable :: seeds(:)
   real r

   call random_init(repeatable=.true., image_distinct=.true.)

   call random_seed(size=n)
   allocate(seeds(n))
   call random_seed(get=seeds)
   open(newunit=fd,file='seed.cache',access='stream',status='replace')
   write(fd) seeds
   close(fd)

   do i=1,5
      call random_number(r)
      print *,r
   end do

end program how_to_use_random_init

program how_to_seed_with_random_seed_like_random_init

   implicit none

   integer fd, i, n
   integer, allocatable :: seeds(:)
   real r

   call random_seed(size=n)
   allocate(seeds(n))
   open(newunit=fd,file='seed.cache',access='stream',status='old')
   read(fd) seeds
   close(fd)

   call random_seed(put=seeds)
   do i=1,5
      call random_number(r)
      print *,r
   end do

end program how_to_seed_with_random_seed_like_random_init

[Bug fortran/98022] [9/10/11 Regression] ICE in gfc_assign_data_value, at fortran/data.c:468 since r9-3803-ga5fbc2f36a291cbe

2020-12-12 Thread pault at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98022

--- Comment #8 from Paul Thomas  ---
The example that you give shows that setting the undefined part to zero
certainly is not correct. I updated my tree for the commit and am only just now
rebuilding. It'll be tomorrow before I put this right.

I guess that this is in the category of invalid but not forbidden. It's in the
same category as:
  complex :: a, b
  a%im = 1.0
  b = a
  print *, a, b
end

Thanks

Paul

[Bug fortran/98253] Conflicting random_seed/random_init results

2020-12-12 Thread kargl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98253

--- Comment #2 from kargl at gcc dot gnu.org ---
On 2nd thought.

Of course, the results are different.

In your first example, you have

  call random_init(repeatable=.true., image_distinct=.true.)

which gets you processor-dependent seeds.  In your second
example, you have

  call random_seed(size=n)
  call random_seed(put=[(i,i=1,n)])

that is not processor-dependent.  You are explicitly seeding
the PRNG.

[Bug tree-optimization/98254] New: Failure to optimize simple pattern for __builtin_convertvector

2020-12-12 Thread gabravier at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98254

            Bug ID: 98254
           Summary: Failure to optimize simple pattern for
                    __builtin_convertvector
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gabravier at gmail dot com
  Target Milestone: ---

#include <stdint.h>

typedef int32_t __attribute__((vector_size(16))) v4i32;
typedef int16_t __attribute__((vector_size(8))) v4i16;

v4i32 f(short *a)
{
    return (v4i32){a[0], a[1], a[2], a[3]};
}

This can be optimized to `return __builtin_convertvector(*(v4i16 *)a, v4i32);`
(or at least, something very close to that, if aliasing is to be taken into
account). LLVM does this transformation, but GCC does not.
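
To make the equivalence claimed above concrete, here is a minimal sketch that
compares the element-wise constructor with a hand-written
__builtin_convertvector form. The function names are made up for this
illustration, and the memcpy load is just one assumed way to sidestep the
aliasing and alignment concerns the report mentions; it is not what either
compiler emits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef int32_t __attribute__((vector_size(16))) v4i32;
typedef int16_t __attribute__((vector_size(8))) v4i16;

/* Element-by-element constructor, as in the report. */
v4i32 widen_ctor(const int16_t *a)
{
    return (v4i32){a[0], a[1], a[2], a[3]};
}

/* Hand-written target form: load four int16_t lanes, then widen them all
   at once.  memcpy is used for the load so that no alignment or aliasing
   assumption is made about `a` (an assumption of this sketch). */
v4i32 widen_convert(const int16_t *a)
{
    v4i16 tmp;
    memcpy(&tmp, a, sizeof tmp);
    return __builtin_convertvector(tmp, v4i32);
}

/* Hypothetical self-check: both forms must agree lane by lane. */
void check_widen_forms_agree(void)
{
    const int16_t in[4] = {-3, 0, 7, 32767};
    v4i32 x = widen_ctor(in);
    v4i32 y = widen_convert(in);
    for (int i = 0; i < 4; i++)
        assert(x[i] == y[i]);
}
```

Calling check_widen_forms_agree() from a test program should pass silently;
comparing the assembly generated for the two functions at -O2 shows whether a
given compiler recognizes the pattern.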

[Bug fortran/98253] Conflicting random_seed/random_init results

2020-12-12 Thread kargl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98253

kargl at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kargl at gcc dot gnu.org

--- Comment #1 from kargl at gcc dot gnu.org ---
Of course, the results are different.  When I wrote random_init(), I asked
several times on the J3 list what image_distinct meant.  No one would provide
an answer.  I concluded that image_distinct only affects co-array programs. 
You can read the long comment in libgfortran/intrinsics/random_init.f90.

[Bug libstdc++/98003] FAIL: 27_io/basic_syncbuf/sync_ops/1.cc (test for excess errors)

2020-12-12 Thread dave.anglin at bell dot net via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98003

--- Comment #5 from dave.anglin at bell dot net ---
There is no --as-needed support.

I think either approach would simplify things as most targets don't need to
link against libatomic.

[Bug fortran/98022] [9/10/11 Regression] ICE in gfc_assign_data_value, at fortran/data.c:468 since r9-3803-ga5fbc2f36a291cbe

2020-12-12 Thread sgk at troutmask dot apl.washington.edu via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98022

--- Comment #7 from Steve Kargl  ---
On Sat, Dec 12, 2020 at 04:02:54PM +, pault at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98022
> 
> --- Comment #6 from Paul Thomas  ---
> (In reply to kargl from comment #4)
> > (In reply to Paul Thomas from comment #3)
> > 
> > >   function kn1() result(hm2)
> > > >     complex :: hm(1:2), hm2(1:2)
> > > >     data (hm(md)%re, md=1,2)/1.0, 2.0/
> > > >     hm2 = hm
> > >   end function kn1
> > 
> > Are you sure that this is valid Fortran?  I cannot
> > find anything in the Fortran standard that says hm%im
> > is defined.  Thus, 'hm2=hm' is referencing a variable
> > > that is not completely defined.
> > 
> > 
> > 19.6.1 Definition of objects and subobjects
> > 
> > 2 Arrays, including sections, and variables of derived, character,
> >   or complex type are objects that consist of zero or more subobjects.
> >   Associations may be established between variables and subobjects and
> >   between subobjects of different variables. These subobjects may become
> >   defined or undefined.
> > 
> > 5 A complex or character scalar object is defined if and only if all
> >   of its subobjects are defined.
> 
> Hi Steve,
> 
> I saw your comment a bit too late. I think that you are correct. I guess that,
> at very least, I should not zero out the undefined part of the complex object?
> That way it would be equivalent to using assignment to achieve the same thing
> or to partially define a derived type.
> 
> I'll post on clf.
> 

I recall looking at this PR a long time ago, but came up empty
with ideas on how to fix it.  You've made some progress.

It gets messy (at least to me) to determine if it is valid,
and comes from reading 8.6.7 "Data statement", carefully.
One gets to 

   A data-stmt-constant other than boz-literal-constant, null-init,
   or initial-data-target shall be compatible with its corresponding
   variable according to the rules of intrinsic assignment (10.2.1.2).
   The variable is initially defined with the value specified by the
   data-stmt-constant; if necessary, the value is converted according
   to the rules of intrinsic assignment (10.2.1.3) to a value that
   agrees in type, type parameters, and shape with the variable.

Now, we go to "what is a variable?"

   R902 variable  is designator

   R901 designator  is ...
                    or complex-part-designator
                    ...

   R915 complex-part-designator  is designator % RE
                                 or designator % IM


In your example, hm%re is real, so the rules for intrinsic
assignment to a real apply.

Of course, I could be wrong.

[Bug ada/98228] [11 Regression] ICE: Assert_Failure atree.adb:931: Error detected at s-gearop.adb:382:34 [a-ngrear.adb:313:7 [a-nllrar.ads:18:1]] on s390x-linux-gnu

2020-12-12 Thread doko at debian dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98228

--- Comment #3 from Matthias Klose  ---
I still see this with 20201212,
54f75d8fb3f:a415eda93e0:cc9b9c0b68233d38a26f7acd68cc5f9a8fc4d994

[Bug fortran/98253] New: Conflicting random_seed/random_init results

2020-12-12 Thread damian at sourceryinstitute dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98253

            Bug ID: 98253
           Summary: Conflicting random_seed/random_init results
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: fortran
          Assignee: unassigned at gcc dot gnu.org
          Reporter: damian at sourceryinstitute dot org
  Target Milestone: ---

16.9.155 Case (i) in the Fortran 2018 standard states

  CALL RANDOM_INIT (REPEATABLE=true, IMAGE_DISTINCT=true) is equivalent to  
  invoking RANDOM_SEED with a processor-dependent value for PUT that is 
  different on every invoking image. In each execution of the program with 
  the same execution environment, if the invoking image index value in the 
  initial team is the same, the value for PUT shall be the same.

but the two programs below give different results.

% cat random_init.f90
  implicit none
  integer i
  real r
  call random_init(repeatable=.true., image_distinct=.true.)
  do i=1,5
    call random_number(r)
    print *,r
  end do
end

% cat random_seed.f90 
  implicit none
  integer i, n
  real r
  call random_seed(size=n)
  call random_seed(put=[(i,i=1,n)])
  do i=1,5
    call random_number(r)
    print *,r
  end do
end

% /usr/local/Cellar/gnu/11.0.0/bin/gfortran random_init.f90
% ./a.out
  0.731217086
  0.652637541
  0.381399393
  0.817764997
  0.394176722

% /usr/local/Cellar/gnu/11.0.0/bin/gfortran random_seed.f90 
% ./a.out
  0.471070886
  0.117344737
  0.357547939
  0.318134785
  0.696753800

% /usr/local/Cellar/gnu/11.0.0/bin/gfortran --version
GNU Fortran (GCC) 11.0.0 20200804 (experimental)

[Bug fortran/98022] [9/10/11 Regression] ICE in gfc_assign_data_value, at fortran/data.c:468 since r9-3803-ga5fbc2f36a291cbe

2020-12-12 Thread pault at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98022

--- Comment #6 from Paul Thomas  ---
(In reply to kargl from comment #4)
> (In reply to Paul Thomas from comment #3)
> 
> >   function kn1() result(hm2)
> >     complex :: hm(1:2), hm2(1:2)
> >     data (hm(md)%re, md=1,2)/1.0, 2.0/
> >     hm2 = hm
> >   end function kn1
> 
> Are you sure that this is valid Fortran?  I cannot
> find anything in the Fortran standard that says hm%im
> is defined.  Thus, 'hm2=hm' is referencing a variable
> that is not completely defined.
> 
> 
> 19.6.1 Definition of objects and subobjects
> 
> 2 Arrays, including sections, and variables of derived, character,
>   or complex type are objects that consist of zero or more subobjects.
>   Associations may be established between variables and subobjects and
>   between subobjects of different variables. These subobjects may become
>   defined or undefined.
> 
> 5 A complex or character scalar object is defined if and only if all
>   of its subobjects are defined.

Hi Steve,

I saw your comment a bit too late. I think that you are correct. I guess that,
at very least, I should not zero out the undefined part of the complex object?
That way it would be equivalent to using assignment to achieve the same thing
or to partially define a derived type.

I'll post on clf.

Cheers

Paul

[Bug fortran/98022] [9/10/11 Regression] ICE in gfc_assign_data_value, at fortran/data.c:468 since r9-3803-ga5fbc2f36a291cbe

2020-12-12 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98022

--- Comment #5 from CVS Commits  ---
The master branch has been updated by Paul Thomas :

https://gcc.gnu.org/g:ff2dfdef2f2e01c579dd280daa1d81fbeb4d7ac5

commit r11-5959-gff2dfdef2f2e01c579dd280daa1d81fbeb4d7ac5
Author: Paul Thomas 
Date:   Sat Dec 12 14:01:08 2020 +

Fortran: Enable inquiry references in data statements [PR98022].

2020-12-12  Paul Thomas  

gcc/fortran
PR fortran/98022
* data.c (gfc_assign_data_value): Handle inquiry references in
the data statement object list.

gcc/testsuite/
PR fortran/98022
* gfortran.dg/data_inquiry_ref.f90: New test.

[Bug fortran/97920] [FINAL] -O2 segment fault due to extend derive type's member being partially allocated

2020-12-12 Thread tkoenig at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97920

Thomas Koenig  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |INVALID
             Status|WAITING                     |RESOLVED

--- Comment #3 from Thomas Koenig  ---
Paul is correct, the state of the pointers is undefined.

What you can do to correct this is to use

module m
  type t1
    real, dimension(:), pointer :: a => NULL()
  contains
    final :: t1f
  end type

  type, extends(t1) :: t2
    real, dimension(:), pointer :: b => NULL()
  contains
    final :: t2f
  end type

which will then run as expected.

[Bug tree-optimization/96685] Failure to optimize not+sub to add+not

2020-12-12 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96685

--- Comment #7 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:0bd675183d94e6bca100c3aaaf87ee9676fb3c26

commit r11-5958-g0bd675183d94e6bca100c3aaaf87ee9676fb3c26
Author: Jakub Jelinek 
Date:   Sat Dec 12 14:49:57 2020 +0100

match.pd: Add ~(X - Y) -> ~X + Y simplification [PR96685]

This patch adds the ~(X - Y) -> ~X + Y simplification requested
in the PR (plus also ~(X + C) -> ~X + (-C) for constants C that can
be safely negated).

The first two simplify blocks are what has been requested in the PR,
and they make the first testcase pass.
Unfortunately, that change also breaks the second testcase, because
while the same expressions appearing in the same stmt and split
across multiple stmts have been folded (not really) before, with
this optimization fold-const.c optimizes ~X + Y further into
(Y - X) - 1 in fold_binary_loc associate: code, but we have nothing
like that in GIMPLE and so end up with different expressions.

The last simplify is an attempt to deal with just this case.
I had to rule out the Y == -1U case there, because then we
reached infinite recursion as ~X + -1U was canonicalized by
the pattern into (-1U - X) + -1U but there is a canonicalization
-1 - A -> ~A that turns it back.  Furthermore, I had to make it #if
GIMPLE only, because it otherwise resulted in infinite recursion
when interacting with the associate: optimization.
The end result is that we pass all 3 testcases and thus canonicalize
the 3 possible forms of writing the same thing.

2020-12-12  Jakub Jelinek  

PR tree-optimization/96685
* match.pd (~(X - Y) -> ~X + Y): New optimization.
(~X + Y -> (Y - X) - 1): Likewise.

* gcc.dg/tree-ssa/pr96685-1.c: New test.
* gcc.dg/tree-ssa/pr96685-2.c: New test.
* gcc.dg/tree-ssa/pr96685-3.c: New test.
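
As a quick sanity check of the "3 possible forms" mentioned in the commit
message, the following sketch evaluates all three on unsigned operands, where
arithmetic wraps modulo 2^N and ~A equals -A - 1. The function names are
invented for this illustration and the sample values are arbitrary; the
sketch only exercises the identity and says nothing about how match.pd
implements it or about the signed cases the patch has to guard.

```c
#include <assert.h>
#include <limits.h>

/* The three spellings of the same value discussed in the commit message. */
unsigned not_of_sub(unsigned x, unsigned y)    { return ~(x - y); }
unsigned not_plus(unsigned x, unsigned y)      { return ~x + y; }
unsigned sub_minus_one(unsigned x, unsigned y) { return (y - x) - 1U; }

/* Hypothetical self-check over a few boundary values. */
void check_three_forms_agree(void)
{
    const unsigned samples[] = {0U, 1U, 2U, 12345U,
                                UINT_MAX / 2, UINT_MAX - 1, UINT_MAX};
    const int n = (int)(sizeof samples / sizeof samples[0]);

    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            unsigned x = samples[i], y = samples[j];
            assert(not_of_sub(x, y) == not_plus(x, y));
            assert(not_plus(x, y) == sub_minus_one(x, y));
        }
    }
}
```

With x = 5 and y = 3, each form yields UINT_MAX - 2, since x - y is 2 and
~2 is UINT_MAX - 2.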

[Bug tree-optimization/96272] Failure to optimize overflow check

2020-12-12 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96272

--- Comment #7 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:fe78528c05fdd562f21e12675781473b0fbe892e

commit r11-5957-gfe78528c05fdd562f21e12675781473b0fbe892e
Author: Jakub Jelinek 
Date:   Sat Dec 12 14:48:47 2020 +0100

widening_mul: Recognize another form of ADD_OVERFLOW [PR96272]

The following patch recognizes another form of hand written
__builtin_add_overflow (this time _p), in particular when
the code does unsigned
  if (x > ~0U - y)
or
  if (x <= ~0U - y)
it can be optimized (if the subtraction turned into ~y is single use)
into
  if (__builtin_add_overflow_p (x, y, 0U))
or
  if (!__builtin_add_overflow_p (x, y, 0U))
and generate better code, e.g. for the first function in the testcase:
-       movl    %esi, %eax
        addl    %edi, %esi
-       notl    %eax
-       cmpl    %edi, %eax
-       movl    $-1, %eax
-       cmovnb  %esi, %eax
+       jc      .L3
+       movl    %esi, %eax
+       ret
+.L3:
+       orl     $-1, %eax
        ret
on x86_64.  As for the jumps vs. conditional move case, that is some CE
issue with complex branch patterns we should fix up no matter what, but
in this case I'm actually not sure if branchy code isn't better, overflow
is something that isn't that common.

2020-12-12  Jakub Jelinek  

PR tree-optimization/96272
* tree-ssa-math-opts.c (uaddsub_overflow_check_p): Add OTHER
argument.  Handle BIT_NOT_EXPR.
(match_uaddsub_overflow): Optimize unsigned a > ~b into
__imag__ .ADD_OVERFLOW (a, b).
(math_opts_dom_walker::after_dom_children): Call
match_uaddsub_overflow even for BIT_NOT_EXPR.

* gcc.dg/tree-ssa/pr96272.c: New test.
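
The equivalence the commit message relies on can be checked with a small
sketch like the one below. The function names are hypothetical, and the
builtin used is __builtin_add_overflow (with a result pointer) rather than
the _p variant named above, purely so the sketch does not depend on the _p
form being available; the overflow predicate it returns is the same.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hand-written check from the commit message: x + y would wrap
   exactly when x exceeds the headroom left above y. */
bool overflows_by_hand(unsigned x, unsigned y)
{
    return x > ~0U - y;
}

/* Builtin form: returns true when the unsigned addition wraps. */
bool overflows_by_builtin(unsigned x, unsigned y)
{
    unsigned sum;
    return __builtin_add_overflow(x, y, &sum);
}

/* Hypothetical self-check over a few boundary values. */
void check_overflow_forms_agree(void)
{
    const unsigned samples[] = {0U, 1U, 2U, UINT_MAX / 2,
                                UINT_MAX / 2 + 1, UINT_MAX - 1, UINT_MAX};
    const int n = (int)(sizeof samples / sizeof samples[0]);

    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            assert(overflows_by_hand(samples[i], samples[j]) ==
                   overflows_by_builtin(samples[i], samples[j]));
}
```

The negated form in the commit message, x <= ~0U - y, is simply the logical
complement of overflows_by_hand.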

[Bug c/98252] New: gcc 10 unaligned copy (with tree-loop-vectorize) produce wrong result

2020-12-12 Thread a3at.mail at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98252

            Bug ID: 98252
           Summary: gcc 10 unaligned copy (with tree-loop-vectorize)
                    produce wrong result
           Product: gcc
           Version: 10.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: a3at.mail at gmail dot com
  Target Milestone: ---

Created attachment 49750
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49750&action=edit
test case

In the attachment there is an example of two functions:
- incremental_copy_fast_path
- incremental_copy_fast_path_safe

When compiled with -O1 -ftree-loop-vectorize, the safe variant
(incremental_copy_fast_path_safe) works correctly, while the other
(incremental_copy_fast_path) does not. It looks like the problem is that it
copies 16 bytes at a time (movdqu+movups), which does not look correct,
since the data may be changed after copying (because the memory overlaps).

Is this some problem in the code due to some UB because of unaligned
store/load? Or a compiler issue?

Thanks in advance!
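
The attachment is not reproduced in this message, so the following is only an
illustration of the overlap concern described above, not the reporter's code.
It is a minimal sketch, with invented function names, of why a forward
byte-by-byte copy and a single 16-byte block copy can give different results
when the destination starts one byte after the source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Forward byte-by-byte copy: each store is visible to later loads, so with
   dst == src + 1 the first source byte is replicated across the output. */
void copy_bytewise(unsigned char *dst, const unsigned char *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Block copy: all 16 bytes are loaded before any byte is stored, which is
   what a single wide load followed by a wide store amounts to. */
void copy_block16(unsigned char *dst, const unsigned char *src)
{
    unsigned char tmp[16];
    memcpy(tmp, src, sizeof tmp);
    memcpy(dst, tmp, sizeof tmp);
}

/* Hypothetical self-check: the two strategies disagree on overlap. */
void check_overlap_results_differ(void)
{
    unsigned char a[32], b[32];
    for (int i = 0; i < 32; i++)
        a[i] = b[i] = (unsigned char)i;

    copy_bytewise(a + 1, a, 16);   /* a[1..16] all become a[0] == 0 */
    copy_block16(b + 1, b);        /* b[1..16] become the old b[0..15] */

    assert(a[2] == 0);
    assert(b[2] == 1);
    assert(memcmp(a, b, sizeof a) != 0);
}
```

Code that depends on the byte-by-byte behaviour therefore cannot be replaced
by a wide copy unless the distance between source and destination is at least
the width of the copy.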