Re: Aw: Re: Add: [Bug fortran/121043] [16 Regression] Tests of OpenCoarray fail to pass, works on 15.1.1 20250712

Harald Anlauf Sat, 26 Jul 2025 10:04:11 -0700

> Sent: Saturday, 26 July 2025 at 15:17
> From: "Mikael Morin" <morin-mik...@orange.fr>
> To: "Harald Anlauf" <anl...@gmx.de>, ve...@gmx.de
> CC: fortran@gcc.gnu.org
> Subject: Re: Aw: Re: Add: [Bug fortran/121043] [16 Regression] Tests of 
> OpenCoarray fail to pass, works on 15.1.1 20250712
>
> On 24/07/2025 at 22:01, Harald Anlauf wrote:
> > Hi Andre,
> > 
> >> Sent: Thursday, 24 July 2025 at 10:25
> >> From: "Andre Vehreschild" <ve...@gmx.de>
> >> To: "Harald Anlauf" <anl...@gmx.de>
> >> CC: fortran@gcc.gnu.org
> >> Subject: Re: Add: [Bug fortran/121043] [16 Regression] Tests of 
> >> OpenCoarray fail to pass, works on 15.1.1 20250712
> >>
> >> Hi Harald,
> >>
> >> <snipp>
> >>
> >>>> Did that help?
> >>>
> >>> Actually this discussion is quite helpful to me, so I (and maybe
> >>> others) understand more of the underlying stuff.
> >>>
> >>> I now spent some time looking thru portions of the F2023 standard,
> >>> and I think that it answers many questions in that respect:
> >>>
> >>> - 10.2 Assignment, esp. 10.2.1.3 Interpretation of intrinsic
> >>> assignments
> >>>
> >>> - 11.1.3 ASSOCIATE construct
> >>>
> >>> - Transformational intrinsics: this_image, team_number, ...
> >>>
> >>> It seems to be clear in most cases on which image something is
> >>> evaluated, and in which order.
> >>>
> >>>> I mean, we are way off of the original question, which was if it
> >>>> is ok to always compute a function result on the image initiating a
> >>>> communication instead of in the caf_accessor.
> >>>
> >>> I am still confused what you mean by "initiating a communication".
> >>
> >> When you use OpenCoarrays and a Coindex gets executed a communication
> >> is triggered.
> > 
> > OK.
> > 
> >>> The function you are talking about takes an argument, interpreted
> >>> in the way defined by the standard, and each image evaluates its
> >>> portion.
> >>
> >> That is what I got confused about. I had the dumb idea to evaluate certain
> >> functions not on the calling image, but on the remote one. (Again,
> >> OpenCoarrays triggers a communication, when the coindex points to an
> >> image different from this_image()). My last patch remedies this.
> >> Function calls in an expression having a coindex are now always
> >> evaluated on the calling image.
> > 
> > Can you elaborate what you mean here?  It is very unclear to me.
> > Function evaluation, or subroutine calls require their arguments
> > to be evaluated before actually invoking the procedures.
> > 
> > See F2023:15.5.3 Function reference and 15.5.4 Subroutine reference,
> > where there is nothing special about coarrays or coindexed objects.
> > No need to even remotely think about doing anything on
> > different images.  This is also consistent with the text on
> > assignments, associate, etc.
> > 
> > So do I understand your comment that the coarray implementation
> > does (did?) not respect the standard here?  Does it satisfy the
> > standard now?
> > 
> >>
> >>> In code such as
> >>>
> >>>> if (this_image() == 1) caf(:, team_number(row_team))[1, team_number
> >>>> = -1] = row
> >>>
> >>> team_number is a transformational function, which I expect to get
> >>> evaluated on each image where the condition is fulfilled.  I don't
> >>> see any communication involved.
> >>
> >> Well, the expression caf(...)[1, team_number=-1] when this_image() /= 1
> >> triggers a communication. The program is writing into "remote" memory
> >> here. I.e. memory that belongs to image 1 in the initial team. When
> >> this code is executed by image 1 of the row_team, which maps to image 4
> >> in the initial team (just for simplicity; it may map to a different one,
> >> but let's assume it is mapped linearly here), then a portion of the caf
> >> array in the initial team of image 1 is updated.
> > 
> > Well, this does not look right to me.
> > 
> > The "if (this_image() == 1)" prevents the assignment from being
> > executed for this_image() /= 1, no matter what you like.
> > If that is your proposal, then I am out.
> > 
> >> When using
> >> OpenCoarrays, this means that a message is composed, sent to the remote
> >> image's communication thread, executed there and a result is returned
> >> indicating completion. This is where the communication is involved.
> >> GFortran creates an accessor routine for writing data into `caf(:,
> >> add_data%team_number_row_team) = data`. This routine is executed by the
> >> communication thread on the remote image. My latest patch now ensures
> >> that `add_data%team_number_row_team` is used instead of
> >> `current_team(add_data%row_team)`. The latter cannot be executed in
> >> the communication thread, because `row_team` is a pointer into memory
> >> of the calling image.
> >>
> >> Yes, I know. All of this is confusing, and it also took me a long time to
> >> understand it and figure out a way to do this fast and efficiently.
> >>
> >>> Then there is the assignment, which is difficult.  I haven't thought
> >>> long enough about the consistency between the condition which refers
> >>> to the current team, and coindex 1 of the initial team.
> >>> (This is why I asked about communicators and the like, as this assignment
> >>> might be correct only under very special conditions, or I just don't
> >>> understand it.)
> >>
> >> To my understanding that assignment is allowed by the standard. Any
> >> concerns?
> >>
> >>> So can you clarify that your code evaluates in the standard-defined
> >>> way?
> >>
> >> I hope the above did it.
> > 
> > No, unfortunately I either do not agree with your reasoning because
> > I do not understand it, or I simply do not understand coarrays.
> > 
> > MPI is way simpler to use.
> > 
> > Hopefully somebody else can help here.  I am lost...
> > 
> (For reference, the discussion started in another thread:
> https://gcc.gnu.org/pipermail/fortran/2025-July/062451.html)
> 
> Let's see if I understand the problem.  Consider this example:
> 
> program p
>    implicit none
>    integer :: img, data
>    integer, allocatable :: res(:)[:]
>    img = this_image()
>    data = img * img + 10  ! Something different on each image
>    allocate(res(num_images())[*], source=-1)
>    res(get_val())[1] = data
>    sync all  ! Wait until every image has written its element
>    if (this_image() == 1) print *, res
> contains
>    pure function get_val()
>      integer :: get_val
>      get_val = img
>    end function
> end program
> 
> The function get_val() returns the current image, so when assigning to 
> res(get_val())[1], it should be evaluated on the local image, otherwise 
> only the first element of res is populated (and with conflicting values).


What is important is what the standard says.  In the above code,
the array index is to be evaluated before the assignment takes place.
The following thus should be equivalent:

  res(img)[1] = data
  res(get_val())[1] = data
  res(this_image())[1] = data

All of these are the coarray equivalent of mpi_gather with communicator
mpi_comm_world and root=0 (= image 1).  Confirmed with NAG and ifx.
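
For anyone who wants to repeat the comparison, here is a minimal sketch that runs the three statements above one after the other and checks on image 1 that each of them fills res identically. It is only an illustration: the program name gather_demo, the helper check and the `sync all` statements are my additions and not part of the example quoted above, and it assumes a compiler with coarray support.

```fortran
program gather_demo   ! hypothetical name, for illustration only
   implicit none
   integer :: img, data
   integer, allocatable :: res(:)[:]

   img  = this_image()
   data = img * img + 10   ! Something different on each image
   allocate(res(num_images())[*], source=-1)   ! allocation synchronizes all images

   ! In each variant the subscript is evaluated on the executing image
   ! before the value is stored into res on image 1.
   res(img)[1] = data
   call check("res(img)")

   res(get_val())[1] = data
   call check("res(get_val())")

   res(this_image())[1] = data
   call check("res(this_image())")

contains
   pure function get_val()
      integer :: get_val
      get_val = img
   end function

   ! Hypothetical helper: verify the gathered values on image 1, then reset.
   subroutine check(label)
      character(*), intent(in) :: label
      integer :: i
      sync all   ! all remote writes are complete after this point
      if (this_image() == 1) then
         if (any(res /= [(i*i + 10, i = 1, num_images())])) &
            error stop "subscript was not evaluated on the executing image"
         print *, label, ": ", res
         res = -1   ! restore the initial state for the next variant
      end if
      sync all   ! nobody writes again before image 1 has reset res
   end subroutine
end program
```

If the subscript were evaluated on image 1 instead, only res(1) would ever be written and the check would stop with an error.  With four images, each variant should print 11 14 19 26.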

> Andre, Harald, is this the original topic of this thread?
> Is my reasoning correct?

Yes and no.  Your example has only the initial team with the associated
communicator (this is my layman's interpretation).

Andre's testcase in addition uses teams in ways I do not yet understand.

Harald
