Dear William and Bill,

Thanks a lot for your answers.

I already configured limits.conf a few days ago on all nodes.
'ulimit -n' (open files) gives 94000. That should be more than enough.

I did some more tests in the meantime.


The file I am running is very simple; I have attached it.
I compiled it with 'mpicc teste.c', which produces a.out as the executable.

The threshold seems to be 252 processes. When I run this on the master node:
 qsub -pe orte 252 -V -j yes -cwd -S /bin/bash <<< "export
LD_LIBRARY_PATH=${LD_LIBRARY_PATH} && mpiexec -n 252 a.out >>
/home/ulrich/abc.out"

it runs and produces numerous output lines like
[...]
Hello world from processor karun07, rank 58 out of 200 processors
Hello world from processor karun07, rank 59 out of 200 processors
[...]

Running
qsub -pe orte 253 -V -j yes -cwd -S /bin/bash <<< "export
LD_LIBRARY_PATH=${LD_LIBRARY_PATH} && mpiexec -n 253 a.out >>
/home/ulrich/abc.out"

gives:
Errno:          24 (Too many open files)

When I now log in to any one of the nodes, no matter which, and run:
mpiexec -n 400 a.out >> /home/ulrich/abc.out

that works fine, as it should.

I do not understand where the 252/253 threshold comes from, or why it
works when mpiexec is run directly on a node. Did I overlook a
configuration issue?
I am not totally convinced that it is not a Grid Engine issue.

With kind regards, ulrich





On 06/13/2016 12:47 PM, William Hay wrote:
> On Fri, Jun 10, 2016 at 07:24:47PM +0200, Ulrich Hiller wrote:
>> Hello,
>>
>> I have a problem submitting parallel jobs, e.g.:
>>
> 
>> Your Open MPI job will likely hang until the failure resason is fixed
>> (e.g., more file descriptors and/or memory becomes available), and may
>> eventually timeout / abort.
>>
>>   Local host:     karun02
>>   Errno:          24 (Too many open files)
>>   Probable cause: Out of file descriptors
>> --------------------------------------------------------------------------
> This doesn't look like it has much to do with grid engine per se.
> I'd look at ulimit to see what is going on and tweak things
> to raise the number of open files allowed appropriately.
> 
> On Linux, limits.conf would be the first place to look although
> shell startup scripts might lower the limits as well.
> 
> William
> 
#include <mpi.h>
#include <stdio.h>

int main(int argc, char** argv) {
    // Initialize the MPI environment
    MPI_Init(NULL, NULL);

    // Get the number of processes
    int world_size;
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    // Get the rank of the process
    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    // Get the name of the processor
    char processor_name[MPI_MAX_PROCESSOR_NAME];
    int name_len;
    MPI_Get_processor_name(processor_name, &name_len);

    // Print off a hello world message
    printf("Hello world from processor %s, rank %d"
           " out of %d processors\n",
           processor_name, world_rank, world_size);

    // Finalize the MPI environment.
    MPI_Finalize();
}


_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users
