Re: [OMPI users] Is there a MPI routine that returns the value of "npernode" being used?

2022-04-02 Thread Gilles Gouaillardet via users
Ernesto,

Not directly.

But you can use MPI_Comm_split_type(..., MPI_COMM_TYPE_SHARED, ...) and then
call MPI_Comm_size(...) on the returned communicator: its size is the number
of ranks running on the local node.
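
For example, something along these lines (a minimal, untested sketch; the
variable names are just illustrative):

    #include <mpi.h>
    #include <iostream>

    int main(int argc, char** argv) {
        MPI_Init(&argc, &argv);

        int world_rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

        // Split MPI_COMM_WORLD into one communicator per shared-memory node.
        MPI_Comm node_comm;
        MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
                            MPI_INFO_NULL, &node_comm);

        // The size of the node-local communicator is the number of ranks
        // running on this node (the effective "npernode" for this node).
        int ranks_on_node;
        MPI_Comm_size(node_comm, &ranks_on_node);

        std::cout << "world rank " << world_rank << " shares its node with "
                  << ranks_on_node << " rank(s), including itself" << std::endl;

        MPI_Comm_free(&node_comm);
        MPI_Finalize();
        return 0;
    }

Note that if the ranks are not evenly spread across nodes, this value can
differ from node to node.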

Cheers,

Gilles

On Sun, Apr 3, 2022 at 5:52 AM Ernesto Prudencio via users <
users@lists.open-mpi.org> wrote:

> Thanks,
>
>
>
> Ernesto.
>
>


[OMPI users] Is there a MPI routine that returns the value of "npernode" being used?

2022-04-02 Thread Ernesto Prudencio via users
Thanks,

Ernesto.




Re: [OMPI users] 101 question on MPI_Bcast()

2022-04-02 Thread Gilles Gouaillardet via users
Ernesto,

MPI_Bcast() has no barrier semantic.
This means the root rank can return after the message is sent (a kind of
eager send) and before it has been received by the other ranks.
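
For reference, a minimal sketch of the intended pattern, where every rank
posts the broadcast (the buffer, count and root below are just illustrative):

    #include <mpi.h>
    #include <iostream>

    int main(int argc, char** argv) {
        MPI_Init(&argc, &argv);

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        // Every rank calls the collective; rank 0 (the root) supplies the
        // value, the other ranks receive it into the same buffer.
        int value = (rank == 0) ? 42 : 0;
        MPI_Bcast(&value, 1, MPI_INT, 0, MPI_COMM_WORLD);

        std::cout << "rank " << rank << " has value " << value << std::endl;

        MPI_Finalize();
        return 0;
    }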


Cheers,

Gilles

On Sat, Apr 2, 2022, 09:33 Ernesto Prudencio via users <
users@lists.open-mpi.org> wrote:

> I have an “extreme” case below, for the sake of example.
>
>
>
> Suppose one is running an MPI job with N >= 2 ranks, and at a certain
> moment the code does the following:
>
>
>
> .
>
> .
>
> .
>
> If (rank == 0) {
>
> MPI_Bcast(…);
>
> }
>
> .
>
> .
>
> .
>
> std::cout << “Here A, rank = “ << rank << std::endl;
>
> MPI_Barrier(…);
>
> std::cout << “Here B, rank = “ << rank << std::endl;
>
> .
>
> .
>
> .
>
>
>
> I thought rank 0 would never print the message “Here A”, because the MPI
> lib at rank 0 would be stuck in the MPI_Bcast waiting for all other ranks
> to notify (internally, in the MPI lib logic) that they have received the
> contents.
>
>
>
> But this seems not to be the case. Instead, the code behaves as follows:
>
>    1. MPI_Bcast() returns control to rank 0, so it (rank 0) prints
>    the “Here A” message (and all the other ranks print “Here A” as well).
>2. All ranks get to the barrier, and then all of them print the “Here
>B” message afterwards.
>
>
>
> Am I correct on the statements (1) and (2) above?
>
>
>
> Thanks,
>
>
>
> Ernesto.
>
>


Re: [OMPI users] 101 question on MPI_Bcast()

2022-04-02 Thread Protze, Joachim via users
Hi Ernesto,

Your program is erroneous from the MPI standard's perspective. That means you
are in "anything can happen" land. MPI implementations are typically optimized
for performance and assume correct MPI usage by the application.
In your situation, especially with a small message size (?), the broadcast from
the root will result in an eager send. Broadcast is not defined as a
synchronizing collective (unlike barrier, for example).
The message from your broadcast might actually match any collective later in
the code, leading to an even more obscure error pattern.
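
As a hypothetical illustration (not your actual code), the intentionally
erroneous sketch below posts an extra broadcast on the root only; that
broadcast can then pair with the next broadcast the other ranks post, so they
may silently receive the wrong data instead of failing loudly:

    #include <mpi.h>
    #include <iostream>

    int main(int argc, char** argv) {
        MPI_Init(&argc, &argv);

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        int a = 111, b = 222;

        if (rank == 0) {
            // Only the root posts this broadcast; the other ranks have no
            // matching call, so the message is left "in flight".
            MPI_Bcast(&a, 1, MPI_INT, 0, MPI_COMM_WORLD);
        }

        // Every rank posts this broadcast. On ranks != 0 it can be satisfied
        // by the unmatched broadcast of 'a' above, so they may end up with
        // 111 instead of 222, and the collectives stay misaligned afterwards.
        MPI_Bcast(&b, 1, MPI_INT, 0, MPI_COMM_WORLD);

        std::cout << "rank " << rank << " sees b = " << b << std::endl;

        MPI_Finalize();
        return 0;
    }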

To pinpoint such correctness errors in your application code, you can use tools 
like MUST (https://itc.rwth-aachen.de/must/), which will point out inconsistent 
(and therefore erroneous) use of collective communication.

- Joachim

From: users  on behalf of Ernesto Prudencio 
via users 
Sent: Saturday, April 2, 2022 2:29:07 AM
To: Open MPI Users 
Cc: Ernesto Prudencio 
Subject: [OMPI users] 101 question on MPI_Bcast()


I have an “extreme” case below, for the sake of example.



Suppose one is running an MPI job with N >= 2 ranks, and at a certain moment the 
code does the following:



.

.

.

If (rank == 0) {

MPI_Bcast(…);

}

.

.

.

std::cout << “Here A, rank = “ << rank << std::endl;

MPI_Barrier(…);

std::cout << “Here B, rank = “ << rank << std::endl;

.

.

.



I thought rank 0 would never print the message “Here A”, because the MPI lib at 
rank 0 would be stuck in the MPI_Bcast waiting for all other ranks to notify 
(internally, in the MPI lib logic) that they have received the contents.



But this seems not to be the case. Instead, the code behaves as follows:

  1.  MPI_Bcast() returns control to rank 0, so it (rank 0) prints the 
“Here A” message (and all the other ranks print “Here A” as well).
  2.  All ranks get to the barrier, and then all of them print the “Here B” 
message afterwards.



Am I correct on the statements (1) and (2) above?



Thanks,



Ernesto.

