Hello Sridhar,
Jeff Squyres wrote:
On Aug 17, 2005, at 8:23 AM, Sridhar Chirravuri wrote:
Can someone reply to my mail please?
I think you sent your first mail at 6:48am in my time zone (that is
4:48am Los Alamos time -- I strongly doubt that they are at work
yet...); I'm still processing my mail from last night and am just now
seeing your mail.
Global software development is challenging. :-)
Yes, as Jeff indicated, it was just 7:00 am when I started catching
up on email, and then my laptop died...
Here is the output of a sample MPI program that sends a char and
receives a char.
[root@micrompi-1 ~]# mpirun -np 2 ./a.out
Could not join a running, existing universe
Establishing a new one named: default-universe-12913
[0,0,0] mca_oob_tcp_init: calling orte_gpr.subscribe
[0,0,0] mca_oob_tcp_init: calling orte_gpr.put(orte-job-0)
[snipped]
[0,0,0]-[0,0,1] mca_oob_tcp_send: tag 2
[0,0,0]-[0,0,1] mca_oob_tcp_send: tag 2
So I'm assuming from your message that, since Pallas runs OK on
the same host, this case must be for multiple hosts?
Looks like the oob connections are coming up.
Have you tried turning on the mvapi btl debug to see what's
going on? Try turning off the oob debug and running with:
mpirun -np 2 -mca pml ob1 -mca btl_base_include self,mvapi -mca btl_base_debug 1 ./a.out
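
Since the command above is long and easy to mangle when it wraps in mail, here is a minimal sketch of a dry-run helper that prints it on a single line so you can check it before running. The function name build_mvapi_debug_cmd is made up for illustration and is not part of Open MPI; the flags are exactly the ones given above.

```shell
#!/bin/sh
# Dry-run helper: prints the mpirun command line instead of executing it.
# build_mvapi_debug_cmd is a hypothetical name, not an Open MPI tool.
# $1 = number of processes, $2 = path to the MPI program.
build_mvapi_debug_cmd() {
    nprocs="$1"
    prog="$2"
    # Flags copied from the command above: ob1 pml, self and mvapi btls,
    # btl debug output enabled.
    echo "mpirun -np $nprocs -mca pml ob1 -mca btl_base_include self,mvapi -mca btl_base_debug 1 $prog"
}

# Print the command for the two-process a.out test.
build_mvapi_debug_cmd 2 ./a.out
```

If the printed line looks right, paste it into the shell and run it as-is.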
Sorry this has been such a pain for you - let's see if the mvapi
debug points out anything.
Thanks,
Tim