David,

is this code representative of what your app is doing? That is, you have a base communicator (e.g. MPI_COMM_WORLD) which is split, freed, split again, freed again, etc.? In other words, the important aspect is that the same base communicator is used to derive new communicators again and again?

The reason I ask is twofold. One, you would in that case be one of the ideal beneficiaries of the block cid algorithm :-) (even if it fails you right now). Two, a fix for this scenario, which basically tries to reuse the last block used (and which would fix your case if the condition above holds), is roughly five lines of code; a sketch follows below. That would let us get a fix into the trunk and v1.3 quickly (keep in mind that the block-cid code has been in the trunk for two years and this is the first problem we have seen with it) and give us more time to develop a thorough solution for the worst case: a chain of communicators being created, e.g. communicator 1 is the basis for deriving a new comm 2, comm 2 is used to derive comm 3, and so on.
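To make this concrete, here is a rough sketch of the idea in C. This is NOT the actual comm_cid.c code: the names (struct cid_block, get_cid, put_cid, CID_BLOCK_LEN) are made up, and the collective agreement on cids between processes is glossed over entirely.

#include <stdint.h>
#include <stdio.h>

#define CID_BLOCK_LEN 64   /* made-up block size */

struct cid_block {
    uint32_t first;    /* first cid of the block owned by the base comm */
    uint32_t next;     /* next cid to hand out from the block           */
    uint32_t in_use;   /* cids from this block not yet freed            */
};

/* counter of never-used cids; 0 doubles as "no block yet" below */
static uint32_t global_next = 4;

static uint32_t get_cid(struct cid_block *b)
{
    if (0 == b->in_use && 0 != b->first) {
        /* the roughly-five-line fix: everything handed out from
         * this block has been freed again, so rewind and reuse it */
        b->next = b->first;
    }
    if (0 == b->first || b->next >= b->first + CID_BLOCK_LEN) {
        /* no block yet, or block exhausted: take a fresh one */
        b->first = b->next = global_next;
        global_next += CID_BLOCK_LEN;
    }
    b->in_use++;
    return b->next++;
}

static void put_cid(struct cid_block *b)
{
    b->in_use--;   /* would be called from comm_free */
}

int main(void)
{
    struct cid_block block = { 0, 0, 0 };
    for (int n = 0; n < 100000; n++) {
        uint32_t cid = get_cid(&block);   /* stands in for comm_split */
        put_cid(&block);                  /* stands in for comm_free  */
        if (cid >= 32768u) {              /* would exceed the PML limit */
            printf("ran out of cids at iteration %d\n", n);
            return 1;
        }
    }
    printf("100000 split/free iterations, cids stayed in one block\n");
    return 0;
}

In your split/free pattern, in_use drops back to zero after every iteration, so the block is rewound and reused and the cid space is never exhausted. A chain of derived communicators keeps older cids alive across blocks and defeats the rewind, which is why the worst case still needs the more thorough solution.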

Thanks
Edgar

David Gunter wrote:
Here is the test code reproducer:

      program test2
      implicit none
      include 'mpif.h'
      integer ierr, myid, numprocs, i1, i2, n, local_comm,
     $     icolor, ikey, rank, root

c
c...  MPI set-up; ierr is preset to junk values so the prints can
c     verify that the MPI calls actually overwrite it
      ierr = 0
      call MPI_INIT(ierr)
      ierr = 1
      call MPI_COMM_SIZE(MPI_COMM_WORLD, numprocs, ierr)
      print *, ierr

      ierr = -1
      call MPI_COMM_RANK(MPI_COMM_WORLD, myid, ierr)

      ierr = -5
      i1 = ierr
      if (myid.eq.0) i1 = 1
      call MPI_ALLREDUCE(i1, i2, 1, MPI_INTEGER, MPI_MIN,
     $     MPI_COMM_WORLD, ierr)

c
c...  even ranks form the sub-communicator; odd ranks pass
c     MPI_UNDEFINED and get MPI_COMM_NULL back from the split
      ikey = myid
      if (mod(myid,2).eq.0) then
         icolor = 0
      else
         icolor = MPI_UNDEFINED
      end if

c
c...  split/reduce/free loop: every iteration derives a fresh
c     communicator from MPI_COMM_WORLD and releases it again
      root = 0
      do n = 1, 100000

         call MPI_COMM_SPLIT(MPI_COMM_WORLD, icolor,
     $        ikey, local_comm, ierr)

         if (mod(myid,2).eq.0) then
            call MPI_COMM_RANK(local_comm, rank, ierr)
            i2 = i1
            call MPI_REDUCE(i1, i2, 1, MPI_INTEGER, MPI_MIN,
     $           root, local_comm, ierr)

            if (myid.eq.0.and.mod(n,10).eq.0)
     $           print *, n, i1, i2, icolor, ikey

            call MPI_COMM_FREE(local_comm, ierr)
         end if

      end do
c      if (icolor.eq.0) call mpi_comm_free(local_comm, ierr)

      call MPI_BARRIER(MPI_COMM_WORLD, ierr)

      call MPI_FINALIZE(ierr)

      print *, myid, ierr

      end



-david
--
David Gunter
HPC-3: Parallel Tools Team
Los Alamos National Laboratory



On Apr 30, 2009, at 12:43 PM, David Gunter wrote:

Just to throw out more info on this, the test code runs fine on previous versions of OMPI. It only hangs on the 1.3 line when the cid reaches 65536.

-david



On Apr 30, 2009, at 12:28 PM, Edgar Gabriel wrote:

cids are in fact not recycled in the block algorithm. The problem is that comm_free does not synchronize across the members of a communicator, so you cannot make any assumptions about whether the other processes have also released that communicator.
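As a small illustration (a hypothetical sketch using only standard MPI calls): comm_free returns without waiting for the other members, so one process may have released its cid long before its peers do.

#include <mpi.h>
#include <unistd.h>

/* Illustration only: rank 1 holds on to the communicator for a
 * second after rank 0 has already freed it.  During that window
 * the implementation cannot safely hand rank 0's old cid to a new
 * communicator, because rank 1 could still match traffic on it. */
int main(int argc, char **argv)
{
    int rank;
    MPI_Comm dup;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Comm_dup(MPI_COMM_WORLD, &dup);   /* collective: cid agreed on */

    if (1 == rank) {
        sleep(1);          /* rank 1 keeps the communicator alive */
    }
    MPI_Comm_free(&dup);   /* returns locally; no agreement that all
                              members have released the cid */

    MPI_Finalize();
    return 0;
}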


But nevertheless, a cid in the communicator structure is a uint32_t, so it should not hit the 16-bit limit there yet. This is not new, so if there is a discrepancy between what the comm structure assumes a cid is and what the PML assumes, then it has been in the code since the very first days of Open MPI...

Thanks
Edgar

Brian W. Barrett wrote:
On Thu, 30 Apr 2009, Ralph Castain wrote:
We seem to have hit a problem here - it looks like we are seeing a
built-in limit on the number of communicators one can create in a
program. The program basically does a loop, calling MPI_Comm_split each
time through the loop to create a sub-communicator, does a reduce
operation on the members of the sub-communicator, and then calls
MPI_Comm_free to release it (this is a minimized reproducer for the real
code). After 64k times through the loop, the program fails.

This looks remarkably like a 16-bit index that hits a max value and then
blocks.

I have looked at the communicator code, but I don't immediately see such a field. Is anyone aware of some other place where we would have a limit
that would cause this problem?

There's a maximum of 32768 communicator ids when using OB1 (each PML can set the max contextid, although the communicator code is the part that actually assigns a cid). Assuming that comm_free is actually properly called, there should be plenty of cids available for that pattern. However, I'm not sure I understand the block algorithm someone added to cid allocation; I'd have to guess that there's something funny with that routine and cids aren't being recycled properly.
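To sketch that division of labor (all names here are made up, not the real Open MPI symbols): the PML advertises the highest context id its match headers can carry (32768 for OB1, per the numbers in this thread), while the communicator code owns the actual assignment and stores the cid as a full uint32_t.

#include <stdint.h>
#include <stdio.h>

#define SKETCH_PML_MAX_CONTEXTID 32768u  /* hypothetical; OB1's limit */

static uint32_t sketch_next_cid = 4;     /* low cids reserved, arbitrary */

static int sketch_assign_cid(uint32_t *cid_out)
{
    if (sketch_next_cid >= SKETCH_PML_MAX_CONTEXTID) {
        /* if freed cids are never recycled, a split/free loop ends
         * up here after tens of thousands of iterations and
         * communicator creation fails or hangs */
        return -1;
    }
    *cid_out = sketch_next_cid++;
    return 0;
}

int main(void)
{
    uint32_t cid;
    unsigned long n = 0;
    while (0 == sketch_assign_cid(&cid)) {
        n++;   /* one split/free iteration without any recycling */
    }
    printf("cid space exhausted after %lu allocations\n", n);
    return 0;
}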
Brian
--
Edgar Gabriel
Assistant Professor
Parallel Software Technologies Lab      http://pstl.cs.uh.edu
Department of Computer Science          University of Houston
Philip G. Hoffman Hall, Room 524        Houston, TX-77204, USA
Tel: +1 (713) 743-3857                  Fax: +1 (713) 743-3335
