Hi Tim,

Great news! Happy calculating :-).
--
Samuel K. Gutierrez
Los Alamos National Laboratory

> Dear Samuel,
>
> Just as you replied I was trying that on the compute nodes. Surprise,
> surprise...the value returned as the hard and soft limits is 1024.
>
> Thanks for confirming my suspicions...
>
> Regards,
>
> Tim.
>
> On Mar 30, 2011, at 7:41 PM, Samuel K. Gutierrez wrote:
>
> Hi,
>
> It sounds like Open MPI is hitting your system's open file descriptor
> limit. If that's the case, one potential workaround is to have your
> system administrator raise file descriptor limits.
>
> On a compute node, what does "ulimit -a" show (using bash)?
>
> Hope that helps,
>
> --
> Samuel K. Gutierrez
> Los Alamos National Laboratory
>
> On Mar 30, 2011, at 5:22 PM, Timothy Stitt wrote:
>
> Dear OpenMPI developers,
>
> One of our users was running a benchmark on a 1032 core simulation. He had
> a successful run at 900 cores but when he stepped up to 1032 cores the job
> just stalled and his logs contained many occurrences of the following
> line:
>
> [d6copt368.crc.nd.edu][[25621,1],0][btl_tcp_component.c:885:mca_btl_tcp_component_accept_handler]
> accept() failed: Too many open files (24)
>
> The simulation has a single master task that communicates with all the
> other tasks to write out some I/O via the master. We are assuming the
> message is related to this bottleneck. Is there a 1024 limit on the number
> of open files/connections for instance?
>
> Can anyone confirm the meaning of this error and secondly provide a
> resolution that hopefully doesn't involve a code rewrite.
>
> Thanks in advance,
>
> Tim.
>
> Tim Stitt PhD (User Support Manager).
> Center for Research Computing | University of Notre Dame |
> P.O. Box 539, Notre Dame, IN 46556 | Phone: 574-631-5287 | Email:
> tst...@nd.edu
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
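
For anyone who finds this thread later: the numbers reported above line up with a simple descriptor count. The following is only a rough sketch in Python, not anything taken from Open MPI itself. It assumes the master rank ends up holding one TCP socket per peer plus the three standard streams; a real process will usually hold more descriptors than that. The helper names (fd_limits, descriptors_needed, fits) are made up for illustration.

```python
import resource


def fd_limits():
    """Return the (soft, hard) open-file-descriptor limits of this process.

    This is the same pair that "ulimit -Sn" and "ulimit -Hn" report in bash.
    """
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def descriptors_needed(n_ranks, reserved=3):
    """Rough lower bound on descriptors held by a master rank.

    Assumption: one socket per peer (n_ranks - 1) plus `reserved`
    descriptors for stdin, stdout and stderr.
    """
    return (n_ranks - 1) + reserved


def fits(n_ranks, soft_limit, reserved=3):
    """True if the estimated descriptor count stays within the soft limit."""
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return descriptors_needed(n_ranks, reserved) <= soft_limit


if __name__ == "__main__":
    soft, hard = fd_limits()
    print("soft limit:", soft, "hard limit:", hard)
    for ranks in (900, 1032):
        # 1024 is the limit reported on the compute nodes in this thread.
        print(ranks, "ranks ->", descriptors_needed(ranks),
              "descriptors, fits under 1024:", fits(ranks, 1024))
```

Under those assumptions, 900 ranks need about 902 descriptors and 1032 ranks need about 1034, which is consistent with the run succeeding at 900 cores and failing at 1032 with a limit of 1024. Since the hard limit on the nodes is also 1024, a user cannot raise the soft limit past it; the hard limit itself has to be raised by the system administrator.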