[OMPI users] Collective component priorities and sm

2010-06-09 Thread Gus Correa
Dear OpenMPI experts, I am confused about the role and values of the priorities set for the OpenMPI collective components, and why the shared memory (sm) collective component disqualifies itself (apparently by default). I describe the problem, then ask some questions, if you have the patience to
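
A minimal sketch of inspecting and overriding collective component priorities from the shell; the exact parameter names and the a.out binary are assumptions to be confirmed with ompi_info on your own installation:

  # list the coll components and their priority parameters
  ompi_info --param coll all | grep priority

  # example: raise the sm collective component's priority for one run
  mpirun --mca coll_sm_priority 90 -np 4 ./a.out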

[OMPI users] Specifying slots in rankfile

2010-06-09 Thread Grzegorz Maj
Hi, I'd like mpirun to run tasks with specific ranks on specific hosts, but I don't want to provide any particular sockets/slots/cores. The following example uses just one host, but generally I'll use more. In my hostfile I just have: root@host01 slots=4 I was playing with my rankfile to achieve
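
For reference, a sketch of the standard rankfile syntax (with the explicit slots the poster is trying to avoid); host names and the a.out binary are placeholders:

  # rankfile: pin rank N to a host (and optionally a socket:core)
  rank 0=host01 slot=0
  rank 1=host01 slot=1
  rank 2=host01 slot=2
  rank 3=host01 slot=3

  # launch with
  mpirun -np 4 -rf rankfile ./a.out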

Re: [OMPI users] Specifying slots in rankfile

2010-06-09 Thread Grzegorz Maj
In my previous mail I said that slot=0-3 would be a solution. Unfortunately it gives me exactly the same segfault as in the case with *:* 2010/6/9 Grzegorz Maj : > Hi, > I'd like mpirun to run tasks with specific ranks on specific hosts, > but I don't want to provide any particular

[OMPI users] (no subject)

2010-06-09 Thread asmae . elbahlouli
asmae.elbahlo...@mpsa.com

[OMPI users] problem with the mpirun

2010-06-09 Thread asmae . elbahlouli
hello, I'm doing a tutorial on OpenFOAM, but when I run in parallel by typing "mpirun -np 30 foamProMesh -parallel | tee 2>&1 log/FPM.log" on the terminal window, after 10 minutes of running it iterates, but at the end I have: .. Feature refinement iteration 5 -- Marked for

Re: [OMPI users] problem with the mpirun

2010-06-09 Thread Jeff Squyres
On Jun 9, 2010, at 7:30 AM, wrote: > mpirun noticed that process rank 0 with PID 18900 on node linux-qv31 exited > on signal 9 (Killed). This error message means that some external agent (outside of Open MPI) killed your OpenFoam process. You might want to check

Re: [OMPI users] OpenMPI-Ranking problem

2010-06-09 Thread Chamila Janath
Dear Sir/Madam, I'm running OpenMPI version 1.4.2. The operating system is Ubuntu 9.10 with kernel version 2.6.31-14. $ mpirun -np 1 -cpus-per-proc 1 -bind-to-core a.out This works fine on a single-core P4 machine. $ mpirun -np 1 -bind-to-core a.out This

Re: [OMPI users] Collective component priorities and sm

2010-06-09 Thread Jeff Squyres
On Jun 9, 2010, at 12:43 AM, Gus Correa wrote: > btl_self_priority=0 (default value) > btl_sm_priority=0 (default value) These are ok. BTL selection is a combination of priority and reachability. The self BTL can *only* reach its own process. So process A will use the "self" BTL to talk to
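
A small sketch of explicitly selecting BTLs on the command line; note that "self" must always be included, since it is the only BTL a process can use to reach itself (the component list and the a.out binary are assumptions for illustration):

  # restrict point-to-point transports to loopback, shared memory and TCP
  mpirun --mca btl self,sm,tcp -np 4 ./a.out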

Re: [OMPI users] Threading models with openib

2010-06-09 Thread Jeff Squyres
On Jun 8, 2010, at 12:33 PM, David Turner wrote: > Please verify: if using openib BTL, the only threading model is > MPI_THREAD_SINGLE? Up to MPI_THREAD_SERIALIZED. > Is there a timeline for full support of MPI_THREAD_MULTIPLE in Open MPI's > openib BTL? IBM has been making some good
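
To see what thread support a given build provides, ompi_info reports it directly; the 1.4-era configure switch for building with thread support is, I believe, --enable-mpi-threads (treat the flag name as an assumption and check ./configure --help):

  # check what an existing installation was built with
  ompi_info | grep -i thread

  # build-time (assumed 1.4-series flag name)
  ./configure --enable-mpi-threads ...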

Re: [OMPI users] Specifying slots in rankfile

2010-06-09 Thread Ralph Castain
I would recommend using the sequential mapper instead: mpirun -mca rmaps seq You can then just list your hosts in your hostfile, and we will put the ranks sequentially on those hosts. So you get something like this host01 <= rank0 host01 <= rank1 host02 <= rank2 host03 <= rank3 host01 <=
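
A minimal sketch of the sequential-mapper approach Ralph describes; hostnames and the a.out binary are placeholders:

  # hostfile (one line per rank, in the order you want them placed):
  #   host01
  #   host01
  #   host02
  #   host03

  mpirun -mca rmaps seq -hostfile hostfile ./a.out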

[OMPI users] Running openMPI job with torque

2010-06-09 Thread Govind
Hi, I have installed the following OpenMPI packages on the worker node from the repo: openmpi-libs-1.4-4.el5.x86_64 openmpi-1.4-4.el5.x86_64 mpitests-openmpi-3.0-2.el5.x86_64 mpi-selector-1.0.2-1.el5.noarch torque-client-2.3.6-2cri.el5.x86_64 torque-2.3.6-2cri.el5.x86_64 torque-mom-2.3.6-2cri.el5.x86_64

Re: [OMPI users] Running openMPI job with torque

2010-06-09 Thread David Zhang
what does your my-script.sh look like? On Wed, Jun 9, 2010 at 8:17 AM, Govind wrote: > Hi, > > I have installed the following OpenMPI packages on the worker node from the repo > openmpi-libs-1.4-4.el5.x86_64 > openmpi-1.4-4.el5.x86_64 > mpitests-openmpi-3.0-2.el5.x86_64 >

Re: [OMPI users] Running openMPI job with torque

2010-06-09 Thread Govind
#!/bin/sh /usr/lib64/openmpi/1.4-gcc/bin/mpirun hello On 9 June 2010 16:21, David Zhang wrote: > what does your my-script.sh look like? > > On Wed, Jun 9, 2010 at 8:17 AM, Govind wrote: > >> Hi, >> >> I have installed the following OpenMPI packages on

Re: [OMPI users] Running openMPI job with torque

2010-06-09 Thread Ralph Castain
You need to include the path to "hello" unless it sits in your PATH environment! On Jun 9, 2010, at 9:37 AM, Govind wrote: > > #!/bin/sh > /usr/lib64/openmpi/1.4-gcc/bin/mpirun hello > > > On 9 June 2010 16:21, David Zhang wrote: > what does your my-script.sh looks

Re: [OMPI users] Running openMPI job with torque

2010-06-09 Thread Govind Songara
Thanks Ralph, after giving the full path of hello it runs. But it runs only on one rank: Hello World! from process 0 out of 1 on node56.beowulf.cluster There is also an error: >cat my-script.sh.e43 stty: standard input: Invalid argument On 9 June 2010 16:46, Ralph Castain wrote: > You

Re: [OMPI users] Specifying slots in rankfile

2010-06-09 Thread Grzegorz Maj
Thanks a lot, it works fine for me. But going back to my problems - is it some bug in Open MPI, or should I use the "slot=*" option in some other way? 2010/6/9 Ralph Castain : > I would recommend using the sequential mapper instead: > > mpirun -mca rmaps seq > > You can then just

Re: [OMPI users] Specifying slots in rankfile

2010-06-09 Thread Ralph Castain
I would have to look at the code, but I suspect it doesn't handle "*". Could be upgraded to do so, but that would depend on the relevant developer to do so :-) On Jun 9, 2010, at 10:16 AM, Grzegorz Maj wrote: > Thanks a lot, it works fine for me. > But going back to my problems - is it some

Re: [OMPI users] mpi_iprobe not behaving as expect

2010-06-09 Thread Eugene Loh
I'll take a stab at this since I don't remember seeing any other replies. At least in the original code you sent out, you used Isend/sleep/Wait to send messages. So, I'm guessing that part of the message is sent, Iprobe detects that a matching message is incoming, and then the receiver goes

Re: [OMPI users] Running openMPI job with torque

2010-06-09 Thread Ralph Castain
On Jun 9, 2010, at 10:00 AM, Govind Songara wrote: > Thanks Ralph after giving full path of hello it runs. > But it run only on one rank > Hello World! from process 0 out of 1 on node56.beowulf.cluster Just to check things out, I would do: mpirun --display-allocation --display-map -np 4

Re: [OMPI users] Running openMPI job with torque

2010-06-09 Thread Jeff Squyres
On Jun 9, 2010, at 12:31 PM, Ralph Castain wrote: >> >cat my-script.sh.e43 >> stty: standard input: Invalid argument > > Not really sure here - must be an error in the script itself. ...or an error in your shell startup files (e.g., $HOME/.bashrc). -- Jeff Squyres jsquy...@cisco.com For
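
Jeff's point about shell startup files is a common cause of the "stty: standard input: Invalid argument" message: the shell Torque starts for the job is non-interactive and has no terminal, so terminal-only commands in ~/.bashrc fail. A minimal guard (the stty line itself is just a placeholder for whatever terminal setup the file contains):

  # in ~/.bashrc: skip terminal-dependent commands when there is no tty
  if [ -t 0 ]; then
      stty erase '^H'    # example only; keep your own stty/tset lines here
  fi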

Re: [OMPI users] Running openMPI job with torque

2010-06-09 Thread Gus Correa
Hi Govind Besides what Ralph said, make sure your OpenMPI was built with Torque ("tm") support. Suggestion: Do: ompi_info --all | grep tm It should show lines like these: MCA ras: tm (MCA v2.0, API v2.0, Component v1.4.2) MCA plm: tm (MCA v2.0, API v2.0, Component v1.4.2) ... *** If your

Re: [OMPI users] Debug info on Darwin

2010-06-09 Thread Jeff Squyres
On Jun 4, 2010, at 5:02 PM, Peter Thompson wrote: > It was suggested by our CTO that if these files were compiled as to > produce STABS debug info, rather than DWARF, then the debug info would > be copied into the executables and shared libraries, and we would then > be able to debug with Open

Re: [OMPI users] Unable to connect to a server using MX MTL with TCP

2010-06-09 Thread Jeff Squyres
On Jun 5, 2010, at 7:52 AM, Scott Atchley wrote: > I do not think this is a supported scenario. George or Jeff can correct me, > but when you use the MX MTL you are using the pml cm and not the pml ob1. The > BTLs are part of ob1. When using the MX MTL, it cannot use the TCP BTL. > > You only

Re: [OMPI users] Behaviour of MPI_Cancel when using 'large' messages

2010-06-09 Thread Jeff Squyres
Yes, Open MPI does not implement cancels for sends. Cancels *could* be implemented in Open MPI, but no one has done so. There are three reasons why: 1. It can be really, really hard to implement cancels (lots of race conditions and corner cases involved). 2. Very, very few people ask for it

Re: [OMPI users] Running openMPI job with torque

2010-06-09 Thread Govind Songara
Hi Gus, OpenMPI was not built with tm support. The submission/execution hosts do not have any of the PBS environment variables set (PBS_O_WORKDIR, $PBS_NODEFILE). How can I set them? regards Govind On 9 June 2010 18:45, Gus Correa wrote: > Hi Govind > > Besides what

Re: [OMPI users] Running openMPI job with torque

2010-06-09 Thread Govind Songara
Hi Ralph, The allocation looks fine, but why does it show the number of slots as 1? The execution host has 4 processors, and the nodes file also defines np=4. == ALLOCATED NODES == Data for node: Name: node56.beowulf.cluster  Num slots: 1  Max slots: 0

Re: [OMPI users] Running openMPI job with torque

2010-06-09 Thread Gus Correa
Hi Govind Govind Songara wrote: > Hi Gus, > > OpenMPI was not built with tm support. > I suspected that. Reading your postings, it seems to be an OpenMPI rpm from a Linux distribution, which I would guess is generic and has no specific support for any resource manager like Torque. > The

Re: [OMPI users] Running openMPI job with torque

2010-06-09 Thread Gus Correa
Hi Govind It may work with the suggestion I sent you, even with the OpenMPI with no Torque support that you have. However, since you have Torque installed on your cluster, it may be better to install OpenMPI from the source code tarball, so as to have full Torque support built in, which is much
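
A sketch of the source build with Torque support that Gus recommends; the prefix and the tm location are assumptions, and --with-tm should point at the directory that contains Torque's headers and libraries:

  ./configure --prefix=/opt/openmpi-1.4.2 --with-tm=/usr
  make -j4
  make install

  # confirm the tm components are now present
  /opt/openmpi-1.4.2/bin/ompi_info | grep tm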

[OMPI users] port_name information between Linux and Windows

2010-06-09 Thread awwase
Dear all, I set up a client application on a Linux machine and a server application on a Windows machine. On the server side, the function MPI_Open_port will generate the following port_name: tag=0 port=2001 description=awwase-Laptop ifname=192.168.1.4 Now I have to pass this information into

Re: [OMPI users] Running openMPI job with torque

2010-06-09 Thread Gus Correa
Hi Govind Govind Songara wrote: > Hi Gus, > OpenMPI was not built with tm support. > The submission/execution hosts do not have any of the > PBS environment variables set > PBS_O_WORKDIR, $PBS_NODEFILE. > How can I set them? > regards > Govind > I missed the final part of your message, about
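
For completeness, the usual workaround when Open MPI lacks tm support is to hand Torque's node list to mpirun yourself inside the job script; a sketch under that assumption (the mpirun path and hello binary are the ones from this thread, the rest is assumed):

  #!/bin/sh
  # run from the directory the job was submitted from
  cd $PBS_O_WORKDIR

  # one MPI process per line in the Torque node file
  NP=$(wc -l < $PBS_NODEFILE)
  /usr/lib64/openmpi/1.4-gcc/bin/mpirun -np $NP -hostfile $PBS_NODEFILE ./hello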

Re: [OMPI users] Unable to connect to a server using MX MTL with TCP

2010-06-09 Thread Audet, Martin
Thanks to both Scott and Jeff! Next time I have a problem, I will check the README file first (Doh!). Also we might mitigate the problem by connecting the workstation to the Myrinet switch. Martin -Original Message- From: users-boun...@open-mpi.org

Re: [hwloc-users] Getting a graphics view for a non graphic system...

2010-06-09 Thread Jeff Squyres
On Jun 6, 2010, at 4:03 PM, Olivier Cessenat wrote: > What you write is clear to computer scientists, but I failed to figure > out what it meant. Sorry, it is clear now! FWIW, there's a section about "output formats" in the hwloc-ls.1 man page. It's probably worth adding a sentence in there