Here is the test code reproducer:

      program test2
      implicit none
      include 'mpif.h'
      integer ierr, myid, numprocs,i1,i2,n,local_comm,
     $     icolor,ikey,rank,root

c
c...  MPI set-up
      ierr = 0
      call MPI_INIT(IERR)
      ierr = 1
      CALL MPI_COMM_SIZE(MPI_COMM_WORLD, numprocs, ierr)
      print *, ierr

      ierr = -1

      CALL MPI_COMM_RANK(MPI_COMM_WORLD, myid, ierr)

      ierr = -5
      i1 = ierr
      if (myid.eq.0) i1 = 1
      call mpi_allreduce(i1, i2, 1,MPI_integer,MPI_MIN,
     $     MPI_COMM_WORLD,ierr)

c
c...  even ranks join the sub-communicator, odd ranks pass MPI_UNDEFINED
      ikey = myid
      if (mod(myid,2).eq.0) then
         icolor = 0
      else
         icolor = MPI_UNDEFINED
      end if

      root = 0
c
c...  repeatedly split off a sub-communicator, reduce on it, free it
      do n = 1, 100000

         call MPI_COMM_SPLIT(MPI_COMM_WORLD, icolor,
     $        ikey, local_comm, ierr)

         if (mod(myid,2).eq.0) then
            CALL MPI_COMM_RANK(local_comm, rank, ierr)
            i2 = i1
            call mpi_reduce(i1, i2, 1,MPI_integer,MPI_MIN,
     $           root, local_comm,ierr)

            if (myid.eq.0.and.mod(n,10).eq.0)
     $           print *, n, i1, i2,icolor,ikey

            call mpi_comm_free(local_comm, ierr)
         end if

      end do
c      if (icolor.eq.0) call mpi_comm_free(local_comm, ierr)



      call MPI_BARRIER(MPI_COMM_WORLD, ierr)

      call MPI_FINALIZE(IERR)

      print *, myid, ierr

      end



-david
--
David Gunter
HPC-3: Parallel Tools Team
Los Alamos National Laboratory



On Apr 30, 2009, at 12:43 PM, David Gunter wrote:

Just to throw out more info on this, the test code runs fine on previous versions of OMPI. It only hangs on the 1.3 line when the cid reaches 65536.

-david
--
David Gunter
HPC-3: Parallel Tools Team
Los Alamos National Laboratory



On Apr 30, 2009, at 12:28 PM, Edgar Gabriel wrote:

cids are in fact not recycled in the block algorithm. The problem is that comm_free is not collective, so you cannot make any assumptions about whether other procs have also released that communicator.
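
For illustration, a minimal C sketch (not from the original report) of why an implementation cannot assume a cid is globally unused the moment one rank frees its handle: MPI_Comm_free does not synchronize the ranks, so other processes may still be holding and using their handles when one rank's free returns.

/* Minimal sketch, assuming a standard MPI C installation: rank 0 frees its
 * handle immediately while the other ranks keep using theirs for a while.
 * Every rank does eventually free the communicator, but nothing forces the
 * frees to happen at the same time. */
#include <mpi.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    int rank, sub_rank;
    MPI_Comm dup;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_dup(MPI_COMM_WORLD, &dup);

    if (rank == 0) {
        MPI_Comm_free(&dup);                 /* released right away */
        printf("rank 0 freed its handle\n");
    } else {
        sleep(2);                            /* still holding the handle ... */
        MPI_Comm_rank(dup, &sub_rank);       /* ... and using it locally */
        printf("rank %d still using the communicator\n", rank);
        MPI_Comm_free(&dup);
    }

    MPI_Finalize();
    return 0;
}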


Nevertheless, a cid in the communicator structure is a uint32_t, so it should not hit the 16k limit there yet. This is not new, so if there is a discrepancy between what the comm structure assumes a cid is and what the pml assumes, then this has been in the code since the very first days of Open MPI...
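
As a hypothetical illustration of the kind of discrepancy being described (the struct and field names below are invented, not Open MPI source): if the communicator stores the cid in 32 bits but some header only carries 16 bits of it, everything works until the cid reaches 65536, at which point the truncated value wraps back to 0.

/* Hypothetical sketch only; comm_like and header_like are invented names. */
#include <stdint.h>
#include <stdio.h>

struct comm_like   { uint32_t cid;    };   /* what the comm structure holds */
struct header_like { uint16_t ctx_id; };   /* what a 16-bit header would hold */

int main(void)
{
    struct comm_like   c = { 65536u };             /* first cid past 2^16 - 1 */
    struct header_like h = { (uint16_t) c.cid };   /* silently truncates */

    printf("cid = %u, header carries %u\n",
           (unsigned) c.cid, (unsigned) h.ctx_id); /* prints 65536 and 0 */
    return 0;
}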

Thanks
Edgar

Brian W. Barrett wrote:
On Thu, 30 Apr 2009, Ralph Castain wrote:
We seem to have hit a problem here - it looks like we are seeing a built-in limit on the number of communicators one can create in a program. The program basically does a loop, calling MPI_Comm_split each time through the loop to create a sub-communicator, does a reduce operation on the members of the sub-communicator, and then calls MPI_Comm_free to release it (this is a minimized reproducer for the real code). After 64k times through the loop, the program fails.

This looks remarkably like a 16-bit index that hits a max value and then blocks.

I have looked at the communicator code, but I don't immediately see such a field. Is anyone aware of some other place where we would have a limit that would cause this problem?

There's a maximum of 32768 communicator ids when using OB1 (each PML can set the max contextid, although the communicator code is the part that actually assigns a cid). Assuming that comm_free is actually properly called, there should be plenty of cids available for that pattern. However, I'm not sure I understand the block algorithm someone added to cid allocation - I'd have to guess that there's something funny with that routine and cids aren't being recycled properly.
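
To make the suspected failure mode concrete, here is a hedged, self-contained sketch; MAX_CID and the allocator below are invented for illustration and are not the actual Open MPI block algorithm. An allocator that only ever hands out increasing ids from a capped range runs dry once the cap is reached, no matter how many communicators have been freed in the meantime.

/* Hedged sketch; MAX_CID and alloc_cid are invented for illustration. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_CID 32768u                 /* cap like the one described for OB1 */

static uint32_t next_cid = 1;          /* monotonically increasing, never reused */

static bool alloc_cid(uint32_t *cid)
{
    if (next_cid >= MAX_CID)
        return false;                  /* range exhausted */
    *cid = next_cid++;
    return true;
}

int main(void)
{
    uint32_t cid;
    unsigned long rounds = 0;

    /* split/free loop: the communicator is "freed" each round, but its id
     * is never returned to the allocator, so the range still runs dry. */
    while (alloc_cid(&cid))
        rounds++;

    printf("ran dry after %lu allocations\n", rounds);
    return 0;
}

Recycling freed ids (for example, returning them to a free list when the communicator is released) would keep this split/free pattern running indefinitely, which is what the "cids aren't being recycled" guess points at.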
Brian

--
Edgar Gabriel
Assistant Professor
Parallel Software Technologies Lab      http://pstl.cs.uh.edu
Department of Computer Science          University of Houston
Philip G. Hoffman Hall, Room 524        Houston, TX-77204, USA
Tel: +1 (713) 743-3857                  Fax: +1 (713) 743-3335