Ralph Castain wrote:
> On Mar 4, 2010, at 7:27 AM, Prentice Bisbal wrote:
>
>> Ralph Castain wrote:
>>> On Mar 3, 2010, at 12:16 PM, Prentice Bisbal wrote:
>>>
>>>> Eugene Loh wrote:
>>>>> Prentice Bisbal wrote:
>>>>>> Eugene Loh wrote:
>>>>>>> Prentice Bisbal wrote:
>>>>>>>> Is there a limit on how many MPI processes can run on a single host?
>>>>>
>>>>> Depending on which OMPI release you're using, I think you need something
>>>>> like 4*np up to 7*np (plus a few) descriptors. So, with 256, you need
>>>>> 1000+ descriptors. You're quite possibly up against your limit, though
>>>>> I don't know for sure that that's the problem here.
>>>>>
>>>>> You say you're running 1.2.8. That's "a while ago", so would you
>>>>> consider updating as a first step? Among other things, newer OMPIs will
>>>>> generate a much clearer error message if the descriptor limit is the
>>>>> problem.
>>>>
>>>> While 1.2.8 might be "a while ago", upgrading software just because it's
>>>> "old" is not a valid argument.
>>>>
>>>> I can install the latest version of OpenMPI, but it will take a little
>>>> while.
>>>
>>> Maybe not because it is "old", but Eugene is correct. The old versions of
>>> OMPI required more file descriptors than the newer versions.
>>>
>>> That said, you'll still need a minimum of 4x the number of procs on the
>>> node even with the latest release. I suggest talking to your sys admin
>>> about getting the limit increased. It sounds like it has been set
>>> unrealistically low.
>>
>> I *am* the system admin! ;)
>>
>> The file descriptor limit is the default for RHEL, 1024, so I would not
>> characterize it as "unrealistically low". I assume someone with much
>> more knowledge of OS design and administration than me came up with this
>> default, so I'm hesitant to change it without good reason. If there was
>> a good reason, I'd have no problem changing it. I have read that setting
>> it to more than 8192 can lead to system instability.
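Eugene's rule of thumb above (roughly 4*np up to 7*np descriptors, plus a few) is easy to sanity-check against the current limit. A minimal shell sketch — the multipliers come from his estimate in this thread, not from any OMPI documentation, so treat the numbers as rough:

```shell
#!/bin/sh
# Estimate descriptor needs for an np-process OMPI job (per Eugene's
# rule of thumb: ~4*np to ~7*np, plus a few) and compare to the limit.
np=256
low=$((4 * np))        # optimistic estimate
high=$((7 * np + 16))  # pessimistic estimate, plus a few for mpirun itself
limit=$(ulimit -n)     # current soft limit on open files

echo "np=$np needs roughly $low-$high descriptors; current limit: $limit"
if [ "$limit" != "unlimited" ] && [ "$limit" -lt "$high" ]; then
    echo "WARNING: descriptor limit may be too low for np=$np"
fi
```

With np=256 this lands at roughly 1024-1808 descriptors, which is exactly why the RHEL default of 1024 is borderline here.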
> Never heard that, and most HPC systems have it set a great deal higher
> without trouble.
I just read that the other day. Not sure where, though. Probably a forum
posting somewhere. I'll take your word for it that it's safe to increase
if necessary.

> However, the choice is yours. If you have a large SMP system, you'll
> eventually be forced to change it or severely limit its usefulness for MPI.
> RHEL sets it that low arbitrarily as a way of saving memory by keeping the
> fd table small, not because the OS can't handle it.
>
> Anyway, that is the problem. Nothing we (or any MPI) can do about it as the
> fd's are required for socket-based communications and to forward I/O.

Thanks, Ralph, that's exactly the answer I was looking for - where this
limit was coming from. I can see how on a large SMP system the fd limit
would have to be increased. In normal circumstances, my cluster nodes
should never have more than 8 MPI processes running at once (per node),
so I shouldn't be hitting that limit on my cluster.

>> This is an admittedly unusual situation - in normal use, no one would
>> ever want to run that many processes on a single system - so I don't see
>> any justification for modifying that setting.
>>
>> Yesterday I spoke to the researcher who originally asked me about this
>> limit - he just wanted to know what the limit was, and doesn't actually
>> plan to do any "real" work with that many processes on a single node,
>> rendering this whole discussion academic.
>>
>> I did install OpenMPI 1.4.1 yesterday, but I haven't had a chance to
>> test it yet. I'll post the results of testing here.
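For anyone who does need to raise the RHEL default, the usual mechanism is pam_limits reading /etc/security/limits.conf. A sketch of the relevant entries — the values 4096 and 8192 are illustrative assumptions, not numbers from this thread:

```shell
#!/bin/sh
# Sketch: entries pam_limits reads from /etc/security/limits.conf.
# Changes take effect for new login sessions only, not running shells.
conf='# <domain>  <type>  <item>   <value>
*           soft    nofile   4096
*           hard    nofile   8192'
echo "$conf"
```

The soft value is what processes actually hit by default; the hard value is the ceiling an unprivileged user can raise it to with `ulimit -n`.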
>>
>>>>>>>> I have a user trying to test his code on the command-line on a single
>>>>>>>> host before running it on our cluster like so:
>>>>>>>>
>>>>>>>> mpirun -np X foo
>>>>>>>>
>>>>>>>> When he tries to run it on a large number of processes (X = 256, 512),
>>>>>>>> the program fails, and I can reproduce this with a simple "Hello,
>>>>>>>> World" program:
>>>>>>>>
>>>>>>>> $ mpirun -np 256 mpihello
>>>>>>>> mpirun noticed that job rank 0 with PID 0 on node juno.sns.ias.edu
>>>>>>>> exited on signal 15 (Terminated).
>>>>>>>> 252 additional processes aborted (not shown)
>>>>>>>>
>>>>>>>> I've done some testing and found that X < 155 for this program to
>>>>>>>> work. Is this a bug, part of the standard, or a design/implementation
>>>>>>>> decision?
>>>>>>>
>>>>>>> One possible issue is the limit on the number of descriptors. The error
>>>>>>> message should be pretty helpful and descriptive, but perhaps you're
>>>>>>> using an older version of OMPI. If this is your problem, one workaround
>>>>>>> is something like this:
>>>>>>>
>>>>>>> unlimit descriptors
>>>>>>> mpirun -np 256 mpihello
>>>>>>>
>>>>>> Looks like I'm not allowed to set that as a regular user:
>>>>>>
>>>>>> $ ulimit -n 2048
>>>>>> -bash: ulimit: open files: cannot modify limit: Operation not permitted
>>>>>>
>>>>>> Since I am the admin, I could change that elsewhere, but I'd rather not
>>>>>> do that system-wide unless absolutely necessary.
>>>>>>
>>>>>>> though I guess the syntax depends on what shell you're running. Another
>>>>>>> is to set the MCA parameter opal_set_max_sys_limits to 1.
>>>>>>
>>>>>> That didn't work either:
>>>>>>
>>>>>> $ mpirun -mca opal_set_max_sys_limits 1 -np 256 mpihello
>>>>>> mpirun noticed that job rank 0 with PID 0 on node juno.sns.ias.edu
>>>>>> exited on signal 15 (Terminated).
>>>>>> 252 additional processes aborted (not shown)

--
Prentice Bisbal
Linux Software Support Specialist/System Administrator
School of Natural Sciences
Institute for Advanced Study
Princeton, NJ
_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
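A footnote on the "Operation not permitted" error quoted above: each process carries both a soft and a hard descriptor limit, and an unprivileged user can raise the soft limit only up to the hard one. `ulimit -n 2048` fails when the hard limit is still at the 1024 default. A quick sketch using standard shell builtins to see both:

```shell
#!/bin/sh
# Soft limit: what processes actually run into at runtime.
# Hard limit: the ceiling a non-root user may raise the soft limit to.
soft=$(ulimit -S -n)
hard=$(ulimit -H -n)
echo "soft=$soft hard=$hard"

# Raising the soft limit up to the hard limit needs no privileges;
# it is exceeding the hard limit that produces "Operation not permitted".
if [ "$hard" != "unlimited" ]; then
    ( ulimit -S -n "$hard" && echo "soft limit raised to $hard" )
fi
```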