and it's conceivable that you might have better performance with
>
>     CALL MPI_ISEND()
>     DO I = 1, N
>         call do_a_little_of_my_work()  ! no MPI progress is being made here
>         CALL MPI_TEST()            ! enough MPI progress is being made here
>                                    ! that the receiver has something to do
>     END DO
>     CALL MPI_WAIT()
>
> Whether performance improves or not is not guaranteed by the MPI standard.
>
> And the SECOND desire is to use Persistent communication for even better
> speedup.
>
> Right.  That's a separate issue.
>
>
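
Just to check that I understood the quoted MPI_ISEND/MPI_TEST pattern, here is how I would write it as a small self-contained program (the ring of neighbours, the buffer size, and do_a_little_of_my_work are placeholders I made up, not the real code):

    PROGRAM isend_test_overlap
       USE mpi
       IMPLICIT NONE
       INTEGER, PARAMETER :: NBUF = 1000, NCHUNKS = 50
       DOUBLE PRECISION :: sendbuf(NBUF), recvbuf(NBUF)
       INTEGER :: ierr, rank, nprocs, dest, src, sreq, rreq, i
       INTEGER :: stat(MPI_STATUS_SIZE)
       LOGICAL :: flag

       CALL MPI_INIT(ierr)
       CALL MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
       CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)

       sendbuf = DBLE(rank)
       dest = MOD(rank + 1, nprocs)            ! right neighbour in a ring
       src  = MOD(rank - 1 + nprocs, nprocs)   ! left neighbour in a ring

       ! post the receive first so the incoming message has somewhere to land
       CALL MPI_IRECV(recvbuf, NBUF, MPI_DOUBLE_PRECISION, src, 0, &
                      MPI_COMM_WORLD, rreq, ierr)
       CALL MPI_ISEND(sendbuf, NBUF, MPI_DOUBLE_PRECISION, dest, 0, &
                      MPI_COMM_WORLD, sreq, ierr)

       DO i = 1, NCHUNKS
          CALL do_a_little_of_my_work()         ! no MPI progress is made here
          CALL MPI_TEST(sreq, flag, stat, ierr) ! gives the library a chance to progress
       END DO

       CALL MPI_WAIT(sreq, stat, ierr)
       CALL MPI_WAIT(rreq, stat, ierr)
       CALL MPI_FINALIZE(ierr)

    CONTAINS

       SUBROUTINE do_a_little_of_my_work()
          ! stand-in for one slice of the real computation
       END SUBROUTINE do_a_little_of_my_work

    END PROGRAM isend_test_overlap
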

So at this point I am focusing on persistent communication. Based on your suggestions, I developed the following.

The sending and receiving buffers and the request array are declared in a global module and allocated in the main program. But the following is not working: I get segmentation fault messages from the line that was underlined in blue.

Main program starts------@@@@@@@@@@@@@@@@@@@@@@@

   CALL MPI_RECV_INIT for each neighboring process
   CALL MPI_SEND_INIT for each neighboring process

   Loop calling subroutine1 starts--------------------(10000 times in the main program)

      Call subroutine1

      Subroutine1 starts===================================

         Loop A starts here >>>>>>>>>>>>>>>>>>>> (three passes)

            Call subroutine2

            Subroutine2 starts----------------------------

               Pick local data from array U into separate arrays for each
               neighboring processor
               CALL MPI_STARTALL( )
               -------perform work that could be done with local data
               CALL MPI_WAITALL( )
               -------perform work using the received data

            Subroutine2 ends----------------------------

            -------perform work to update array U

         Loop A ends here >>>>>>>>>>>>>>>>>>>>

      Subroutine1 ends====================================

   Loop calling subroutine1 ends--------------------(10000 times in the main program)

   CALL MPI_Request_free( )

Main program ends------@@@@@@@@@@@@@@@@@@@@@@@
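
To make the structure concrete, here is a stripped-down but compilable sketch of what I am describing (the neighbour count, buffer size, tags, and the placeholder work steps are mine; the real code exchanges boundary slices of array U):

    MODULE halo_mod
       IMPLICIT NONE
       INTEGER, PARAMETER :: NNBR = 2                 ! placeholder neighbour count
       DOUBLE PRECISION, ALLOCATABLE :: sbuf(:,:), rbuf(:,:)
       INTEGER :: reqs(2*NNBR)                        ! recv requests, then send requests
    END MODULE halo_mod

    PROGRAM persistent_halo
       USE mpi
       USE halo_mod
       IMPLICIT NONE
       INTEGER, PARAMETER :: NBUF = 100               ! placeholder buffer size
       INTEGER :: ierr, rank, nprocs, n, step, nbr(NNBR)

       CALL MPI_INIT(ierr)
       CALL MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
       CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)

       ALLOCATE(sbuf(NBUF, NNBR), rbuf(NBUF, NNBR))   ! allocate BEFORE the *_INIT calls
       nbr(1) = MOD(rank + 1, nprocs)
       nbr(2) = MOD(rank - 1 + nprocs, nprocs)

       ! create the persistent requests once; pass the first element of each
       ! column (classic Fortran MPI idiom) so each request is bound to the
       ! actual module storage that will be reused every iteration
       DO n = 1, NNBR
          CALL MPI_RECV_INIT(rbuf(1, n), NBUF, MPI_DOUBLE_PRECISION, nbr(n), 0, &
                             MPI_COMM_WORLD, reqs(n), ierr)
          CALL MPI_SEND_INIT(sbuf(1, n), NBUF, MPI_DOUBLE_PRECISION, nbr(n), 0, &
                             MPI_COMM_WORLD, reqs(NNBR + n), ierr)
       END DO

       DO step = 1, 10000
          CALL subroutine1()
       END DO

       DO n = 1, 2*NNBR
          CALL MPI_REQUEST_FREE(reqs(n), ierr)
       END DO
       DEALLOCATE(sbuf, rbuf)
       CALL MPI_FINALIZE(ierr)

    CONTAINS

       SUBROUTINE subroutine1()
          INTEGER :: pass
          DO pass = 1, 3                              ! Loop A, three passes
             CALL subroutine2()
             ! ... perform work to update array U ...
          END DO
       END SUBROUTINE subroutine1

       SUBROUTINE subroutine2()
          INTEGER :: ierr2, stats(MPI_STATUS_SIZE, 2*NNBR)
          ! fill the SAME sbuf that was registered with MPI_SEND_INIT
          sbuf = 1.0d0                                ! stand-in for picking data from U
          CALL MPI_STARTALL(2*NNBR, reqs, ierr2)
          ! ... perform work that only needs local data ...
          CALL MPI_WAITALL(2*NNBR, reqs, stats, ierr2)
          ! ... perform work that uses rbuf (the received data) ...
       END SUBROUTINE subroutine2

    END PROGRAM persistent_halo

The intent is that the persistent requests are created only once, on the same module buffers that subroutine2 refills before every MPI_STARTALL, and that they are freed only after the 10000-step loop has finished.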

How should I tackle this?
