Re: [OMPI users] I got "ssh_exchange_identification" errors when I mpirun over 1500 times almost at the same time

2013-06-04 Thread vacate
Dear Ralph Castain, Thank you for your reply! Actually, I have already adjusted my /etc/security/limits.conf file: I raised the "soft nofile" and "hard nofile" values to 65535. So these days I have been trying other limit settings, including "soft memlock", "hard memlock", and

Re: [OMPI users] I got "ssh_exchange_identification" errors when I mpirun over 1500 times almost at the same time

2013-06-04 Thread vacate
Dear Sabuj Pattanayek, After your reply, I tried disabling my /etc/hosts.deny, but unfortunately it still didn't work. I have finally solved my problem, though: the reason was that my "soft nofile" and "hard nofile" values weren't set large enough, so I couldn't open that many files. Still, thanks for your
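The fix reported in this thread was raising the "soft nofile" and "hard nofile" limits. The following is a minimal sketch, assuming a Unix-like system, of how one might check the open-file limits that a process actually sees before launching many mpirun instances. The helper names nofile_limits and has_headroom are hypothetical, and 65535 is only the value mentioned earlier in the thread, not a recommendation.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def has_headroom(needed, soft):
    """True if a soft limit of `soft` allows at least `needed` descriptors.

    resource.RLIM_INFINITY means "no limit", so it always has headroom.
    """
    return soft == resource.RLIM_INFINITY or soft >= needed


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print("soft nofile:", soft, "hard nofile:", hard)
    # 65535 is the value quoted in the thread; adjust to the actual workload.
    print("room for 65535 descriptors:", has_headroom(65535, soft))
```

Running the check from the same shell or batch environment that starts mpirun matters, since limits set in /etc/security/limits.conf typically apply only to new login sessions.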

Re: [OMPI users] 1.7.1 Hang with MPI_THREAD_MULTIPLE set

2013-06-04 Thread Jeff Squyres (jsquyres)
On Jun 3, 2013, at 5:06 AM, Paul Kapinos wrote: > It is more or less well-known that MPI_THREAD_MULTIPLE disables the OpenFabrics > / InfiniBand networking in Open MPI: > > http://www.open-mpi.org/faq/?category=supported-systems#thread-support >

Re: [OMPI users] 1.7.1 Hang with MPI_THREAD_MULTIPLE set

2013-06-04 Thread W Spector
On 06/04/2013 03:23 AM, Jeff Squyres (jsquyres) wrote: On Jun 3, 2013, at 5:06 AM, Paul Kapinos wrote: It is more or less well-known that MPI_THREAD_MULTIPLE disables the OpenFabrics / InfiniBand networking in Open MPI:

Re: [OMPI users] Open MPI Checkpoint Restart

2013-06-04 Thread Neel Sunil Desai
Hi, So, I was able to remove the "cannot open shared file or object" errors. But I am not able to checkpoint yet. When I enter ompi-checkpoint PID of mpirun, it does not return anything (not even a new prompt). In my mca-params.conf file, I added sstore=stage

[OMPI users] "ssh: connect to host XXX.XXX.XXX.XX port 22: connection timed out" errors during mpirun

2013-06-04 Thread vacate
Hello everyone, After solving my first ssh_exchange_identification problem, I feel embarrassed to ask about another problem... :'(( I got some "*ssh: connect to host XXX.XXX.XXX.XX port 22: connection timed out*" errors when I ran mpirun over 2000 times almost at the same time. --- my bash shell script

[OMPI users] Force mpirun to only run under gridengine

2013-06-04 Thread Orion Poplawski
I'd like to be able to force mpirun to require being run under a gridengine environment. Any ideas on how to achieve this, if possible? -- Orion Poplawski Technical Manager 303-415-9701 x222 NWRA, Boulder/CoRA Office FAX: 303-415-9702 3380 Mitchell Lane

Re: [OMPI users] Force mpirun to only run under gridengine

2013-06-04 Thread Ralph Castain
There is an MCA param to require an allocation. On Jun 4, 2013, at 11:18 AM, Orion Poplawski wrote: > I'd like to be able to force mpirun to require being run under a gridengine > environment. Any ideas on how to achieve this, if possible? > > -- >

Re: [OMPI users] Force mpirun to only run under gridengine

2013-06-04 Thread Reuti
On 04.06.2013 at 20:38, Ralph Castain wrote: > There is an Mca param to require an allocation But this can be requested (or not) at execution time? Even a dedicated compilation with a built-in test for an allocation won't give the intended effect, as someone could use his own compilation of

Re: [OMPI users] 1.7.1 Hang with MPI_THREAD_MULTIPLE set

2013-06-04 Thread Jeff Squyres (jsquyres)
On Jun 4, 2013, at 8:20 AM, W Spector wrote: >> Yes, this is true -- MPI_THREAD_MULTIPLE support is fairly incomplete in >> Open MPI. > > One would hope a simple MPI_Barrier call would work though... Underneath, I am pretty sure that barrier is doing an MPI_WAITALL.

Re: [OMPI users] Force mpirun to only run under gridengine

2013-06-04 Thread Ralph Castain
Yes, current releases do not have a way of prohibiting user-override of MCA params, so a user could indeed circumvent the directive to require an allocation. The original intent of the parameter was to close a hole that allowed users to mistakenly overload the head node of a cluster by forgetting
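As a complement to the MCA-parameter approach discussed in this thread, a site could wrap mpirun in a script that refuses to start outside a Grid Engine job. The following is a minimal sketch, assuming Grid Engine exports JOB_ID for every job and PE_HOSTFILE for parallel-environment jobs; the function name in_gridengine_job is hypothetical and not part of Open MPI or Grid Engine.

```python
import os


def in_gridengine_job(env=None):
    """Guess whether we are inside a Grid Engine parallel job.

    Assumption: the scheduler sets JOB_ID and PE_HOSTFILE in the job
    environment. Both must be present and non-empty.
    """
    env = os.environ if env is None else env
    return bool(env.get("JOB_ID")) and bool(env.get("PE_HOSTFILE"))


if __name__ == "__main__":
    if in_gridengine_job():
        print("Grid Engine job detected; mpirun may proceed.")
    else:
        print("Not inside a Grid Engine job; refusing to launch mpirun.")
```

Like the MCA parameter, this check is advisory only: a user can set the variables by hand or call a privately built mpirun directly, which is the limitation raised in the replies above.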

Re: [MTT users] mtt setup

2013-06-04 Thread Jeff Squyres (jsquyres)
I believe that HLRS had to run the mtt-relay when it was running MTT before. See client/mtt-relay, and https://svn.open-mpi.org/trac/mtt/changeset/623. On Jun 4, 2013, at 5:19 AM, sethi wrote: > > Hello! > I am setting up mtt testing in my institute. Clusters in my >