Re: [petsc-users] How to specify different MPI communication patterns.

2024-05-21 Thread Jed Brown




Randall Mackie  writes:

> Dear PETSc team,
>
> A few years ago we were having some issue with MPI communications with large numbers of processes and subcomms, see this thread here:
>
> https://urldefense.us/v3/__https://lists.mcs.anl.gov/mailman/htdig/petsc-users/2020-April/040976.html__;!!G_uCfscf7eWS!fyPrzMKC4KZmxGO-HI0xUlOCbgwXod4O8q2h_6MjHqPLPj9ppLkgFkJUig-KqXgu6AX7pMhYtEpWOP_cCesCWcCk_Q$
>
> We are once again encountering strange issues when running our code on a new cluster and after a month of various tests we have not found a solution, but we think it has something to do with network traffic and high MPI communications, similar perhaps to the thread from 3 years ago.
>
> Is it still possible to change the communication pattern with the option -build_twosided_allreduce (and is that the right syntax?).

It's `-build_twosided allreduce` or `-build_twosided redscatter` to avoid `ibarrier`. You could also check that you have a recent MPI release, and even compare MPICH and Open MPI.
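
As a minimal sketch of that syntax (the file name `twosided.opts` and the executable `./app` are hypothetical, not from this thread): the option name and its value are separate tokens, and they can be kept in a PETSc options file rather than typed on the command line.

```
# twosided.opts (hypothetical file name)
# Select one algorithm for PetscCommBuildTwoSided; both of these avoid MPI_Ibarrier.
-build_twosided allreduce
# -build_twosided redscatter
```

Passing the file with `-options_file twosided.opts` should be equivalent to appending `-build_twosided allreduce` to the usual `mpiexec ... ./app` line. Adding `-options_left` asks PETSc to report, at exit, any options that were set but never queried, which may help catch a misspelled form such as `-build_twosided_allreduce`.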

> Are there other runtime options that we can try to change the MPI communication type for all underlying communications?
>
>
> Thank you,
>
> Randy M.



[petsc-users] How to specify different MPI communication patterns.

2024-05-21 Thread Randall Mackie

Dear PETSc team,

A few years ago we were having some issue with MPI communications with large 
numbers of processes and subcomms, see this thread here:

https://urldefense.us/v3/__https://lists.mcs.anl.gov/mailman/htdig/petsc-users/2020-April/040976.html__;!!G_uCfscf7eWS!fyPrzMKC4KZmxGO-HI0xUlOCbgwXod4O8q2h_6MjHqPLPj9ppLkgFkJUig-KqXgu6AX7pMhYtEpWOP_cCesCWcCk_Q$
 

We are once again encountering strange issues when running our code on a new 
cluster and after a month of various tests we have not found a solution, but we 
think it has something to do with network traffic and high MPI communications, 
similar perhaps to the thread from 3 years ago.

Is it still possible to change the communication pattern with the option 
-build_twosided_allreduce (and is that the right syntax?).

Are there other runtime options that we can try to change the MPI communication 
type for all underlying communications?


Thank you,

Randy M.