Hi folks
There are a few things I’d like to cover on Tuesday’s call:
* review of detailed launch timings - I’m seeing linear scaling with ppn (procs
per node) for the initialization code at the very beginning of MPI_Init. That
code consists of the following calls:
ompi_hook_base_mpi_init_top
ompi_mpi_thread_level
opal_init_util
ompi_register_mca_variables
opal_arch_set_fortran_logical_size
ompi_hook_base_mpi_init_top_post_opal
This now turns out to be the single largest time component in our startup, so
I’d like to understand what in that list is scale-dependent, and why. (See the
timing sketch after this list.)
* when I disable all but the shared memory BTL, MPI_Init errors out with the
message that procs on different nodes have no way to connect to each other.
However, we supposedly are not retrieving modex information until the first
message, and the app I’m running is a simple MPI_Init/MPI_Finalize - so there
is no communication. Why then do I error out during init? How does the
system “know” that the procs cannot communicate? (A minimal repro sketch is
also below, after this list.)
* discuss Artem’s question about behavior:
https://github.com/open-mpi/ompi/issues/3269
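
To make the timing item concrete, here is a minimal sketch of the kind of
external measurement I have in mind - it just brackets the whole of MPI_Init
with wall-clock timestamps and reports the slowest rank. It obviously cannot
break out the individual calls listed above (that still needs the internal
timing instrumentation); the program name and launch line are illustrative
assumptions only:

  /* init_timing.c - bracket MPI_Init with wall-clock timestamps.
   * Build:  mpicc -o init_timing init_timing.c
   * Run at varying ppn, e.g.:
   *   mpirun -np <N> --map-by ppr:<ppn>:node ./init_timing
   * (launch flags are illustrative - adjust to the local launcher)
   */
  #include <mpi.h>
  #include <stdio.h>
  #include <time.h>

  static double now(void)
  {
      struct timespec ts;
      clock_gettime(CLOCK_MONOTONIC, &ts);
      return ts.tv_sec + ts.tv_nsec / 1e9;
  }

  int main(int argc, char **argv)
  {
      double t0 = now();
      MPI_Init(&argc, &argv);
      double local = now() - t0;

      /* report the slowest rank's MPI_Init time */
      double max_init;
      int rank;
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Reduce(&local, &max_init, 1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);
      if (rank == 0)
          printf("max MPI_Init time across ranks: %.6f sec\n", max_init);

      MPI_Finalize();
      return 0;
  }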
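For the shared-memory-BTL item, the repro is essentially the trivial
init/finalize app below. The exact shared memory BTL name depends on the
release (vader on recent builds, sm on older ones), so treat the command line
as an assumption about how the run was invoked:

  /* btl_repro.c - trivial init/finalize app, no communication.
   * Build:  mpicc -o btl_repro btl_repro.c
   * Run across two nodes with only the shared memory + self BTLs enabled,
   * e.g.:  mpirun -np 2 --map-by node --mca btl self,vader ./btl_repro
   * No messages are ever sent, yet MPI_Init aborts complaining that procs
   * on different nodes have no way to reach each other.
   */
  #include <mpi.h>

  int main(int argc, char **argv)
  {
      MPI_Init(&argc, &argv);
      MPI_Finalize();
      return 0;
  }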
Thanks
Ralph