[Bug tree-optimization/114767] gfortran AVX2 complex multiplication by (0d0,1d0) suboptimal

2024-05-14 Thread mjr19 at cam dot ac.uk via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114767 --- Comment #7 from mjr19 at cam dot ac.uk ---
Another manifestation of this issue in GCC 13.1 and 14.1 is that the loop
  do i=1,n
     c(i)=a(i)*c(i)*(0d0,1d0)
  enddo
takes about twice as long to run as
  do i=1,n
     c(i)=a(i)*(0d0,1d0

[Bug tree-optimization/114324] [13/14/15 Regression] AVX2 vectorisation performance regression with gfortran 13/14

2024-05-01 Thread mjr19 at cam dot ac.uk via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114324 --- Comment #5 from mjr19 at cam dot ac.uk --- Note that bug 114767 also turns out to be a case in which the inability to alternate neg and nop along a vector leads to poor performance with some operations on the complex type. That optimisation

[Bug tree-optimization/114767] gfortran AVX2 complex multiplication by (0d0,1d0) suboptimal

2024-04-19 Thread mjr19 at cam dot ac.uk via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114767 --- Comment #6 from mjr19 at cam dot ac.uk --- I was starting to wonder whether this issue might be related to that in bug 114324, which is a slightly more complicated example in which multiplication by a purely imaginary number destroys

[Bug tree-optimization/114767] gfortran AVX2 complex multiplication by (0d0,1d0) suboptimal

2024-04-18 Thread mjr19 at cam dot ac.uk via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114767 --- Comment #4 from mjr19 at cam dot ac.uk ---
An issue which I suspect is related is shown by
  subroutine zradd(c,n)
    integer :: i,n
    complex(kind(1d0)) :: c(*)
    do i=1,n
       c(i)=c(i)+1d0
    enddo
  end subroutine
If compiled with gfortran

[Bug tree-optimization/114767] gfortran AVX2 complex multiplication by (0d0,1d0) suboptimal

2024-04-18 Thread mjr19 at cam dot ac.uk via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114767 --- Comment #2 from mjr19 at cam dot ac.uk ---
Ah, I see. An inability to alternate negation with noop also means that conjugation is treated suboptimally.
  do i=1,n
     c(i)=conjg(c(i))
  enddo
Here gfortran-13 and -14 are differently

[Bug fortran/114767] New: gfortran AVX2 complex multiplication by (0d0,1d0) suboptimal

2024-04-18 Thread mjr19 at cam dot ac.uk via Gcc-bugs
Priority: P3 Component: fortran Assignee: unassigned at gcc dot gnu.org Reporter: mjr19 at cam dot ac.uk Target Milestone: ---
Gfortran 14 shows considerable improvement over 13.1 on x86_64 AVX2 on the test case
  subroutine scale_i(c,n)
    integer :: i,n
    complex

[Bug tree-optimization/114324] [13/14 Regression] AVX2 vectorisation performance regression with gfortran 13/14

2024-03-15 Thread mjr19 at cam dot ac.uk via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114324 --- Comment #4 from mjr19 at cam dot ac.uk --- Created attachment 57713 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57713&action=edit Second testcase, very similar to first. Thank you for looking into this. The real code in question has m

[Bug fortran/114324] New: AVX2 vectorisation performance regression with gfortran 13/14

2024-03-13 Thread mjr19 at cam dot ac.uk via Gcc-bugs
Priority: P3 Component: fortran Assignee: unassigned at gcc dot gnu.org Reporter: mjr19 at cam dot ac.uk Target Milestone: --- Created attachment 57685 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57685&action=edit Test case of loop showing performance regress

[Bug fortran/92698] Unnecessary copy in overlapping array assignment

2019-11-30 Thread mjr19 at cam dot ac.uk
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92698 --- Comment #2 from mjr19 at cam dot ac.uk --- Thomas is quite correct that I had failed to mark the array as contiguous, at which point the double copy is more reasonable (although memcpy will also expect its arguments to be contiguous). He

[Bug fortran/92698] New: Unnecessary copy in overlapping array assignment

2019-11-27 Thread mjr19 at cam dot ac.uk
: fortran Assignee: unassigned at gcc dot gnu.org Reporter: mjr19 at cam dot ac.uk Target Milestone: ---
  subroutine cpy(a,src,dest,len)
    integer, intent(in) :: src,dest,len
    real(kind(1d0)), intent(inout) :: a(:)
    a(dest:dest+len-1)=a(src:src+len-1)
  end subroutine cpy