Re: [OMPI users] testing for openMPI

2012-06-07 Thread TERRY DONTJE
Try "ps -elf | grep hello". This should list all the processes named hello. The output includes each process's pid (it should be the 4th column), which you give to your debugger. For example, if the pid were 1234 you'd run "gdb -p 1234". Actually Jeff's suggestion of this being a firewall
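
To make the "4th column" remark concrete, here is a minimal sketch that pulls pids out of "ps -elf" output. It assumes the usual Linux procps layout (F S UID PID PPID C PRI NI ADDR SZ WCHAN STIME TTY TIME CMD); the function names are made up for illustration and other ps implementations may order the columns differently.

```python
import subprocess


def pids_named(ps_output, name):
    """Return pids of processes whose command line contains `name`.

    Assumes `ps -elf` style output where PID is the 4th column and the
    command (with arguments) is the 15th and last column.
    """
    pids = []
    for line in ps_output.splitlines():
        fields = line.split(None, 14)
        # Skip short lines and the header row (its 4th field is "PID").
        if len(fields) < 15 or not fields[3].isdigit():
            continue
        if name in fields[14]:
            pids.append(int(fields[3]))
    return pids


def running_pids(name):
    """Hypothetical helper: run `ps -elf` on this node and filter by name."""
    out = subprocess.run(["ps", "-elf"], capture_output=True, text=True).stdout
    return pids_named(out, name)
```

Each pid it returns is what you would hand to the debugger, e.g. "gdb -p 1234".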

Re: [OMPI users] testing for openMPI

2012-06-07 Thread Jeff Squyres
Exxxcellent. Good luck! On Jun 7, 2012, at 3:43 AM, Duke wrote: > On 6/7/12 5:32 PM, Jeff Squyres wrote: >> Check to ensure that you have firewalls disabled between your two machines; >> that's a common cause of hanging (i.e., Open MPI is trying to open >> connections and/or send data

Re: [OMPI users] testing for openMPI

2012-06-07 Thread TERRY DONTJE
Another sanity check to try is to see whether you can run your test program on just one of the nodes. If that works, MPI is more than likely having issues setting up connections between the nodes. --td On 6/7/2012 6:06 AM, Duke wrote: Hi again, Somehow the verbose flag (-v) did not work for me. I

Re: [OMPI users] testing for openMPI

2012-06-07 Thread Duke
On 6/7/12 5:32 PM, Jeff Squyres wrote: Check to ensure that you have firewalls disabled between your two machines; that's a common cause of hanging (i.e., Open MPI is trying to open connections and/or send data between your two nodes, and the packets are getting black-holed at the other

Re: [OMPI users] testing for openMPI

2012-06-07 Thread Duke
On 6/7/12 5:31 PM, TERRY DONTJE wrote: Can you get on one of the nodes and see the job's processes? If so can you then attach a debugger to it and get a stack? I wonder if the processes are stuck in MPI_Init? Thanks Terry for your suggestion, but could you please tell me how to do it? I can

Re: [OMPI users] testing for openMPI

2012-06-07 Thread Jeff Squyres
Check to ensure that you have firewalls disabled between your two machines; that's a common cause of hanging (i.e., Open MPI is trying to open connections and/or send data between your two nodes, and the packets are getting black-holed at the other side). Open MPI needs to be able to
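
One rough way to tell a firewall that drops packets from a port that is simply closed is a plain TCP probe run from one node against the other. The sketch below is only an illustration: probe_tcp is a made-up name, and the host and port you pass are placeholders (as far as I know Open MPI picks its TCP ports dynamically by default, so a single port tells you about that port only).

```python
import socket


def probe_tcp(host, port, timeout=3.0):
    """Try one TCP connection and classify the outcome.

    "open"    - something accepted the connection
    "refused" - the host answered, but nothing listens on that port
    "timeout" - no answer at all, which is what black-holed packets look like
    "error"   - anything else (e.g. no route to host, name lookup failure)
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        return "timeout"
    except ConnectionRefusedError:
        return "refused"
    except OSError:
        return "error"
```

A "timeout" between two nodes that can otherwise ping each other points toward packets being silently dropped, while "refused" usually means the packets arrived and no process was listening.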

Re: [OMPI users] testing for openMPI

2012-06-07 Thread TERRY DONTJE
Can you get on one of the nodes and see the job's processes? If so can you then attach a debugger to it and get a stack? I wonder if the processes are stuck in MPI_Init? --td On 6/7/2012 6:06 AM, Duke wrote: Hi again, Somehow the verbose flag (-v) did not work for me. I tried

Re: [OMPI users] testing for openMPI

2012-06-07 Thread Duke
Hi again, Somehow the verbose flag (-v) did not work for me. I tried --debug-daemons and got: [mpiuser@fantomfs40a ~]$ mpirun --debug-daemons -np 3 --machinefile /home/mpiuser/.mpi_hostfile ./test/mpihello Daemon was launched on hp430a - beginning to initialize Daemon [[34432,0],1] checking

Re: [OMPI users] testing for openMPI

2012-06-07 Thread Duke
Hi Jingcha, On 6/7/12 4:28 PM, Jingcha Joba wrote: Hello Duke, Welcome to the forum. The way openmpi schedules by default is to fill all the slots in a host, before moving on to next host. Check this link for some info: http://www.open-mpi.org/faq/?category=running#mpirun-scheduling Thanks

Re: [OMPI users] testing for openMPI

2012-06-07 Thread Jingcha Joba
Hello Duke, Welcome to the forum. The way openmpi schedules by default is to fill all the slots in a host, before moving on to next host. Check this link for some info: http://www.open-mpi.org/faq/?category=running#mpirun-scheduling -- Jingcha On Thu, Jun 7, 2012 at 2:11 AM, Duke
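
To illustrate the by-slot behaviour described above, here is a small simulation. It is not Open MPI's code, just a sketch of the rule "fill every slot on a host before moving to the next one"; schedule_by_slot is a made-up name, the slot counts in the example are assumptions, and oversubscription is not modelled.

```python
def schedule_by_slot(hosts, np):
    """Assign ranks 0..np-1 to hosts, filling each host's slots in order.

    hosts: list of (hostname, slots) pairs in hostfile order.
    Returns a list where index = rank and value = hostname.
    """
    placement = []
    for host, slots in hosts:
        for _ in range(slots):
            if len(placement) == np:
                return placement
            placement.append(host)
    if len(placement) < np:
        # Simplification: this sketch does not wrap around when slots run out.
        raise ValueError("not enough slots for %d processes" % np)
    return placement


# Hostnames taken from the thread; 2 slots each is an assumption.
print(schedule_by_slot([("fantomfs40a", 2), ("hp430a", 2)], 3))
```

With those assumed slot counts, ranks 0 and 1 land on the first host and only rank 2 goes to the second, which is why a small -np can appear to run on one machine only.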

[OMPI users] testing for openMPI

2012-06-07 Thread Duke
Hi folks, Please be gentle with the newest Open MPI user; I am totally new to this field. I just built a test cluster of 3 boxes running Scientific Linux 6.2 and Open MPI 1.5.3, and I wanted to test how the cluster works, but I can't figure out what was/is happening. On my master node,