https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83064

            Bug ID: 83064
           Summary: DO CONCURRENT inconsistent results
           Product: gcc
           Version: 8.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: fortran
          Assignee: unassigned at gcc dot gnu.org
          Reporter: cfztol at hotmail dot com
  Target Milestone: ---

This bug is related to bug 83017. I'm trying to parallelize a pure function with
the do concurrent construct. The tables below show the compile flags and
results for the test program at the end of this message (GCC revision r254890):

Unrolled do-loop

Options                               Parallel  Correct
-Og    -ftree-parallelize-loops=2     N         Y
-O1    -ftree-parallelize-loops=2     Y         N - arbitrary
-O2    -ftree-parallelize-loops=2     Y         N - always zero
-O3    -ftree-parallelize-loops=2     Y         Y
-Ofast -ftree-parallelize-loops=2     Y         Y

Modulo inside do-loop, or indexed via host-associated array

Options                               Parallel  Correct
-Og    -ftree-parallelize-loops=2     N         Y
-O1    -ftree-parallelize-loops=2     Y         N - arbitrary
-O2    -ftree-parallelize-loops=2     Y         N - arbitrary
-O3    -ftree-parallelize-loops=2     Y         N - arbitrary
-Ofast -ftree-parallelize-loops=2     Y         N - arbitrary

So the loop is always parallelized unless the -Og optimization level is used.
However, the computed value of PI is correct in only a few cases, depending on
the optimization level and the details of the pure function. I think that is a
bug, especially since consecutive runs give different results (marked
"arbitrary" in the tables above).

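For reference, the test program appears to approximate PI with the Leibniz
series, truncated at ne terms and split into nsplit contiguous index ranges.
Written out, with n_e, n_s, l_i and h_i used here only as shorthand for ne,
nsplit, low(i) and high(i):

```latex
\[
  \pi \;\approx\; 4 \sum_{j=1}^{n_e} \frac{(-1)^{j+1}}{2j-1}
      \;=\; 4 \sum_{i=1}^{n_s} \; \sum_{j=l_i}^{h_i} \frac{(-1)^{j+1}}{2j-1},
  \qquad n_e = 2 \times 10^{8}, \quad n_s = 4 .
\]
```

Each inner sum corresponds to one pi(i). The iterations do not depend on one
another, so a "Correct" run should print a value close to PI regardless of how
they are scheduled.
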
Here's my test program:

program main
    use, intrinsic :: iso_fortran_env
    implicit none

    integer, parameter :: nsplit = 4
    integer(int64), parameter :: ne = 200000000
    integer(int64) :: stride, low(nsplit), high(nsplit), edof(ne), i
    real(real64), dimension(nsplit) :: pi

    edof(1::4) = 1
    edof(2::4) = 2
    edof(3::4) = 3
    edof(4::4) = 4

    stride = ceiling(real(ne)/nsplit)
    do i = 1, nsplit
        high(i) = stride*i
    end do
    do i = 2, nsplit
        low(i) = high(i-1) + 1
    end do
    low(1) = 1
    high(nsplit) = ne

    pi = 0
    do concurrent (i = 1:nsplit)
        pi(i) = sum(compute( low(i), high(i) ))
    end do
    print *, "PI", 4*sum(pi)
    print *, "PI", 4*atan(1.0)

contains

    pure function compute( low, high ) result( tmp )
        integer(int64), intent(in) :: low, high
        real(real64), dimension(nsplit) :: tmp
        integer(int64) :: j, k

        tmp = 0

        ! Unrolled loop
!         do j = low, high, 4
!             k = 1
!             tmp(k) = tmp(k) + (-1)**(j+1) / real( 2*j-1 )
!             k = 2
!             tmp(k) = tmp(k) + (-1)**(j+2) / real( 2*j+1 )
!             k = 3
!             tmp(k) = tmp(k) + (-1)**(j+3) / real( 2*j+3 )
!             k = 4
!             tmp(k) = tmp(k) + (-1)**(j+4) / real( 2*j+5 )
!         end do

        ! Loop with modulo operation
!         do j = low, high
!             k = mod( j, nsplit ) + 1
!             tmp(k) = tmp(k) + (-1)**(j+1) / real( 2*j-1 )
!         end do

        ! Loop with subscripting via host association
        do j = low, high
            k = edof(j)
            tmp(k) = tmp(k) + (-1.0_real64)**(j+1) / real( 2*j-1 )
        end do
    end function

end program main
