Dear William and Bill, thanks a lot for your answers.
I already configured limits.conf a few days ago on all nodes. 'ulimit -n' (open files) gives 94000. That should be more than enough.

I did some more tests in the meantime. The file I am running is very simple; I attached it. I compiled it with 'mpicc teste.c' and got a.out as the executable.

The threshold seems to be 252. When I run on the master node:

qsub -pe orte 252 -V -j yes -cwd -S /bin/bash <<< "export LD_LIBRARY_PATH=${LD_LIBRARY_PATH} && mpiexec -n 252 a.out >> /home/ulrich/abc.out"

it runs, giving as output numerous lines like

[...]
Hello world from processor karun07, rank 58 out of 200 processors
Hello world from processor karun07, rank 59 out of 200 processors
[...]

Running

qsub -pe orte 253 -V -j yes -cwd -S /bin/bash <<< "export LD_LIBRARY_PATH=${LD_LIBRARY_PATH} && mpiexec -n 253 a.outn >> /home/ulrich/abc.out"

gives:

Errno: 24 (Too many open files)

When I now log in to any one of the nodes, no matter which, and run:

mpiexec -n 400 a.outn >> /home/ulrich/abc.out

that works fine, as it should.

I do not understand where the 252/253 threshold comes from, or why it works when mpiexec is run directly on a node. Did I overlook a configuration issue? I am not totally convinced that it is not a gridengine issue.

With kind regards,
ulrich

On 06/13/2016 12:47 PM, William Hay wrote:
> On Fri, Jun 10, 2016 at 07:24:47PM +0200, Ulrich Hiller wrote:
>> Hello,
>>
>> I have a problem submitting parallel jobs, e.g.:
>>
>
>> Your Open MPI job will likely hang until the failure resason is fixed
>> (e.g., more file descriptors and/or memory becomes available), and may
>> eventually timeout / abort.
>>
>> Local host: karun02
>> Errno: 24 (Too many open files)
>> Probable cause: Out of file descriptors
>> --------------------------------------------------------------------------
> This doesn't look like it has much to do with grid engine per se.
> I'd look at ulimit to see what is going on and tweak things
> to raise the number of open files allowed appropriately.
>
> On linux limits.conf would be the first place to look although
> shell startup scripts might lower the limits as well.
>
> William
>
#include <mpi.h>
#include <stdio.h>

int main(int argc, char** argv) {
    // Initialize the MPI environment
    MPI_Init(NULL, NULL);

    // Get the number of processes
    int world_size;
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    // Get the rank of the process
    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    // Get the name of the processor
    char processor_name[MPI_MAX_PROCESSOR_NAME];
    int name_len;
    MPI_Get_processor_name(processor_name, &name_len);

    // Print off a hello world message
    printf("Hello world from processor %s, rank %d"
           " out of %d processors\n",
           processor_name, world_rank, world_size);

    // Finalize the MPI environment.
    MPI_Finalize();
}
_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users