Public bug reported:

MPICH 3.3b2 deadlocks when large tags are used (the issue was identified by the PETSc team, but it affects other packages). The fix is one line:

commit c597c8d79deea220a42751fda0f01ce70764c260
Author: Min Si <[email protected]>
Date:   Wed Apr 18 10:15:25 2018 -0500

    ch3: Fix tag upper limit initialization

    The value of tag_ub is initialized in MPIR_Init_thread, but was
    incorrectly reset in ch3 device initialization. This patch fixes it.

    Signed-off-by: Ken Raffenetti <[email protected]>

diff --git a/src/mpid/ch3/src/mpid_init.c b/src/mpid/ch3/src/mpid_init.c
index f7664fd2e..298ef4bd5 100644
--- a/src/mpid/ch3/src/mpid_init.c
+++ b/src/mpid/ch3/src/mpid_init.c
@@ -157,7 +157,6 @@ int MPID_Init(int *argc, char ***argv, int requested, int *provided,
      * Set global process attributes.  These can be overridden by the channel
      * if necessary.
      */
-    MPIR_Process.attrs.tag_ub = INT_MAX;
     MPIR_Process.attrs.io = MPI_ANY_SOURCE;
 
     /*

See also https://github.com/pmodels/mpich/pull/3097/

I have confirmed that this bug is present in the current package distributed with Ubuntu 18.10.

I am aware of another bug that was fixed at a similar time and should also be patched on any Ubuntu releases that stick with mpich-3.3b2. (These are both fixed in mpich-3.3b3.)

commit 8edabc7373b82dd660019e53d246131765819294
Author: Rob Latham <[email protected]>
Date:   Tue Apr 17 11:20:25 2018 -0500

    fix uninitialized variable

    Closes pmodels/mpich#2892

diff --git a/src/mpi/romio/adio/ad_nfs/ad_nfs_read.c b/src/mpi/romio/adio/ad_nfs/ad_nfs_read.c
index e01cc21b0..5b8f0b88f 100644
--- a/src/mpi/romio/adio/ad_nfs/ad_nfs_read.c
+++ b/src/mpi/romio/adio/ad_nfs/ad_nfs_read.c
@@ -158,7 +158,7 @@ void ADIOI_NFS_ReadStrided(ADIO_File fd, void *buf, int count,
     ADIOI_Flatlist_node *flat_buf, *flat_file;
     ADIO_Offset i_offset, new_brd_size, brd_size, size;
 
-    int i, j, k, err, err_flag, st_index = 0;
+    int i, j, k, err, err_flag=0, st_index = 0;
     MPI_Count num, bufsize;
     int n_etypes_in_filetype;
     ADIO_Offset n_filetypes, etype_in_filetype, st_n_filetypes, size_in_filetype;

https://github.com/pmodels/mpich/commit/8edabc7373b82dd660019e53d246131765819294

** Affects: mpich (Ubuntu)
     Importance: Undecided
         Status: New

-- 
https://bugs.launchpad.net/bugs/1802372

Title:
  mpich-3.3b2 critical bug (deadlock) and patch
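
For readers who want to see why the tag_ub line matters, here is a rough model of the initialization order described in the first commit message. It is not MPICH code. The struct, the model_* function names, and the MODEL_CORE_TAG_UB value are all made up for illustration (the real limit computed by MPIR_Init_thread is not given in this report); only the INT_MAX assignment mirrors the line the patch removes. The sketch assumes the core layer sets the limit first and the device init runs afterwards.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for the limit the core layer computes.
 * The real MPICH value is not stated in the bug report. */
#define MODEL_CORE_TAG_UB 0x0FFFFFFF

struct model_attrs {
    int tag_ub;   /* largest tag the library advertises to applications */
};

/* Models MPIR_Init_thread: the core layer sets the limit first. */
static void model_core_init(struct model_attrs *attrs)
{
    assert(attrs != NULL);
    attrs->tag_ub = MODEL_CORE_TAG_UB;
}

/* Models the ch3 MPID_Init before the patch: the limit is overwritten. */
static void model_device_init_buggy(struct model_attrs *attrs)
{
    assert(attrs != NULL);
    attrs->tag_ub = INT_MAX;
}

/* Models MPID_Init after the patch: tag_ub is left untouched. */
static void model_device_init_fixed(struct model_attrs *attrs)
{
    assert(attrs != NULL);
    (void)attrs;
}

/* Runs both init steps in order and returns the advertised limit. */
int model_advertised_tag_ub(int patched)
{
    struct model_attrs attrs = { 0 };

    model_core_init(&attrs);
    if (patched)
        model_device_init_fixed(&attrs);
    else
        model_device_init_buggy(&attrs);
    return attrs.tag_ub;
}

/* In this model a tag is safe only if it fits the core layer's limit. */
int model_tag_is_safe(int tag)
{
    return tag >= 0 && tag <= MODEL_CORE_TAG_UB;
}
```

In the unpatched path the advertised limit is larger than what the core layer set up, so an application that picks tags near the advertised maximum (as PETSc presumably does) ends up outside the range the model treats as safe. With the patched path the two limits agree again.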
