Public bug reported:

MPICH 3.3b2 deadlocks when large tags are used (the issue was identified
by the PETSc team, but it affects other packages). The fix is one line:

commit c597c8d79deea220a42751fda0f01ce70764c260
Author: Min Si <[email protected]>
Date:   Wed Apr 18 10:15:25 2018 -0500

    ch3: Fix tag upper limit initialization

    The value of tag_ub is initialized in MPIR_Init_thread, but was
    incorrectly reset in ch3 device initialization. This patch fixes it.

    Signed-off-by: Ken Raffenetti <[email protected]>

diff --git a/src/mpid/ch3/src/mpid_init.c b/src/mpid/ch3/src/mpid_init.c
index f7664fd2e..298ef4bd5 100644
--- a/src/mpid/ch3/src/mpid_init.c
+++ b/src/mpid/ch3/src/mpid_init.c
@@ -157,7 +157,6 @@ int MPID_Init(int *argc, char ***argv, int requested, int *provided,
      * Set global process attributes.  These can be overridden by the channel
      * if necessary.
      */
-    MPIR_Process.attrs.tag_ub = INT_MAX;
     MPIR_Process.attrs.io = MPI_ANY_SOURCE;

     /*

See also https://github.com/pmodels/mpich/pull/3097/

I have confirmed that this bug is present in the current package
distributed with Ubuntu 18.10.
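
For anyone triaging this, here is a minimal sketch in plain C (no MPI needed)
of the failure mode the commit message describes. Everything in it is a
hypothetical stand-in: the struct, the function names, and the 20-bit limit
are assumptions for illustration, not MPICH's real definitions. It only shows
how a later init step can overwrite the tag upper bound set earlier, so that
an application reading the advertised bound would believe tags up to INT_MAX
are safe to use.

```c
#include <limits.h>

/* Hypothetical stand-in for MPIR_Process.attrs; not MPICH's real layout. */
struct process_attrs {
    int tag_ub;
};

/* Assumed limit, for illustration only; the real bound is defined in MPICH. */
#define SKETCH_USABLE_TAG_UB ((1 << 20) - 1)

/* Plays the role of MPIR_Init_thread: sets the bound the library can honor. */
void generic_init(struct process_attrs *attrs)
{
    attrs->tag_ub = SKETCH_USABLE_TAG_UB;
}

/* Plays the role of the ch3 MPID_Init. A nonzero reset_tag_ub models the
 * assignment that the patch removes. */
void device_init(struct process_attrs *attrs, int reset_tag_ub)
{
    if (reset_tag_ub)
        attrs->tag_ub = INT_MAX;
}

/* Returns the tag_ub an application would see after both init steps. */
int advertised_tag_ub(int reset_tag_ub)
{
    struct process_attrs attrs;
    generic_init(&attrs);
    device_init(&attrs, reset_tag_ub);
    return attrs.tag_ub;
}
```

In this sketch, advertised_tag_ub(1) models the unpatched behavior and
returns INT_MAX, while advertised_tag_ub(0) models the patched behavior and
returns the smaller bound set by the first init step.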


I am aware of another bug that was fixed around the same time and should also
be patched in any Ubuntu release that stays on mpich-3.3b2. (Both bugs are
fixed in mpich-3.3b3.)

commit 8edabc7373b82dd660019e53d246131765819294
Author: Rob Latham <[email protected]>
Date:   Tue Apr 17 11:20:25 2018 -0500

    fix uninitialized variable
    
    Closes pmodels/mpich#2892

diff --git a/src/mpi/romio/adio/ad_nfs/ad_nfs_read.c b/src/mpi/romio/adio/ad_nfs/ad_nfs_read.c
index e01cc21b0..5b8f0b88f 100644
--- a/src/mpi/romio/adio/ad_nfs/ad_nfs_read.c
+++ b/src/mpi/romio/adio/ad_nfs/ad_nfs_read.c
@@ -158,7 +158,7 @@ void ADIOI_NFS_ReadStrided(ADIO_File fd, void *buf, int count,
 
     ADIOI_Flatlist_node *flat_buf, *flat_file;
     ADIO_Offset i_offset, new_brd_size, brd_size, size;
-    int i, j, k, err, err_flag, st_index = 0;
+    int i, j, k, err, err_flag=0, st_index = 0;
     MPI_Count num, bufsize;
     int n_etypes_in_filetype;
     ADIO_Offset n_filetypes, etype_in_filetype, st_n_filetypes, size_in_filetype;


https://github.com/pmodels/mpich/commit/8edabc7373b82dd660019e53d246131765819294
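
To illustrate why the initializer matters, here is a minimal sketch in plain
C. It is not ROMIO's code: the function name, its parameters, and the
simulated per-chunk result are hypothetical. It assumes the flag is assigned
only on a failure path and read afterwards, which is the usual situation where
a missing initializer leaves the success path reading an indeterminate value.

```c
/* Hypothetical stand-in for a strided read loop; not ROMIO's code.
 * Returns nonzero if any chunk failed. A negative fail_at means that
 * no chunk fails. */
int read_chunks(int nchunks, int fail_at)
{
    int err_flag = 0;   /* the fix: start from "no error" */

    for (int i = 0; i < nchunks; i++) {
        /* Simulated per-chunk result: -1 on failure, 0 on success. */
        int err = (i == fail_at) ? -1 : 0;

        if (err == -1)
            err_flag = 1;   /* only ever assigned on the failure path */
    }

    /* Without the initializer, this would read an indeterminate value
     * whenever no chunk failed. */
    return err_flag;
}
```

With the initializer in place, a call in which no chunk fails, such as
read_chunks(4, -1), reliably reports success.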

** Affects: mpich (Ubuntu)
     Importance: Undecided
         Status: New

https://bugs.launchpad.net/bugs/1802372

Title:
  mpich-3.3b2 critical bug (deadlock) and patch
