Hi,
I am trying to reproduce something that worked last Friday on Amazon
EC2 instances, but I am having a problem. What worked last Friday,
as root, was this command:
mpirun -app app.ac
with app.ac being:
-H dns-entry-A -np 1 (linux command)
-H dns-entry-A -np 1 (linux command)
-H dns-entry-B -np 1 (linux command)
-H dns-entry-B -np 1 (linux command)
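For reference, an appfile with this layout can be generated rather than typed by hand. This is only a minimal sketch: make_appfile is a hypothetical helper name, and the host names and command passed to it are placeholders, not my real entries.

```shell
#!/bin/sh
# Hypothetical helper (not part of Open MPI): print an appfile with two
# single-process entries per host, in the same layout as app.ac above.
# $1 is the command to run; the remaining arguments are host names.
make_appfile() {
    cmd=$1
    shift
    for host in "$@"; do
        # Two entries per host, each starting one process (-np 1).
        # "-H" is passed as an argument so printf never sees it as an option.
        printf '%s %s -np 1 %s\n' -H "$host" "$cmd"
        printf '%s %s -np 1 %s\n' -H "$host" "$cmd"
    done
}

# Usage (placeholder host names):
#   make_appfile /bin/hostname dns-entry-A dns-entry-B > app.ac
```

Generating the file this way keeps the local and remote entries identical apart from the host name, which makes it easier to rule out a typo in one of the lines.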
Here’s the config file in root’s .ssh directory:
Host *
IdentityFile /root/.ssh/.derobee/.kagi
IdentitiesOnly yes
BatchMode yes
Yesterday and today I can’t get this to work. I simplified the command at
the end of each line of app.ac (it is now /bin/hostname). Below is the session:
-bash-3.2#
-bash-3.2# # I am on instance A, host name for inst A is:
-bash-3.2# hostname
domU-12-31-39-09-CD-C2
-bash-3.2#
-bash-3.2# nslookup domU-12-31-39-09-CD-C2
Server: 172.16.0.23
Address: 172.16.0.23#53
Non-authoritative answer:
Name: domU-12-31-39-09-CD-C2.compute-1.internal
Address: 10.210.210.48
-bash-3.2# cd .ssh
-bash-3.2#
-bash-3.2# cat config
Host *
IdentityFile /root/.ssh/.derobee/.kagi
IdentitiesOnly yes
BatchMode yes
-bash-3.2#
-bash-3.2# ll config
-rw-r--r-- 1 root root 103 Feb 15 17:18 config
-bash-3.2#
-bash-3.2# chmod 600 config
-bash-3.2#
-bash-3.2# # show I can go to inst B without password/passphrase
-bash-3.2#
-bash-3.2# ssh domU-12-31-39-09-E6-71.compute-1.internal
Last login: Tue Feb 15 17:18:46 2011 from 10.210.210.48
-bash-3.2#
-bash-3.2# hostname
domU-12-31-39-09-E6-71
-bash-3.2#
-bash-3.2# nslookup `hostname`
Server: 172.16.0.23
Address: 172.16.0.23#53
Non-authoritative answer:
Name: domU-12-31-39-09-E6-71.compute-1.internal
Address: 10.210.233.123
-bash-3.2# # and back to inst A is also no problem
-bash-3.2#
-bash-3.2# ssh domU-12-31-39-09-CD-C2.compute-1.internal
Last login: Tue Feb 15 17:36:19 2011 from 63.193.205.1
-bash-3.2#
-bash-3.2# hostname
domU-12-31-39-09-CD-C2
-bash-3.2#
-bash-3.2# # log out twice to go back to inst A
-bash-3.2# exit
logout
Connection to domU-12-31-39-09-CD-C2.compute-1.internal closed.
-bash-3.2#
-bash-3.2# exit
logout
Connection to domU-12-31-39-09-E6-71.compute-1.internal closed.
-bash-3.2#
-bash-3.2# hostname
domU-12-31-39-09-CD-C2
-bash-3.2#
-bash-3.2# cd ..
-bash-3.2#
-bash-3.2# pwd
/root
-bash-3.2#
-bash-3.2# ll
total 8
-rw-r--r-- 1 root root 260 Feb 15 17:24 app.ac
-rw-r--r-- 1 root root 130 Feb 15 17:34 app.ac2
-bash-3.2#
-bash-3.2# cat app.ac
-H domU-12-31-39-09-CD-C2.compute-1.internal -np 1 /bin/hostname
-H domU-12-31-39-09-CD-C2.compute-1.internal -np 1 /bin/hostname
-H domU-12-31-39-09-E6-71.compute-1.internal -np 1 /bin/hostname
-H domU-12-31-39-09-E6-71.compute-1.internal -np 1 /bin/hostname
-bash-3.2#
-bash-3.2# # when there is a remote machine (bottom 2 lines) it hangs
-bash-3.2# mpirun -app app.ac
mpirun: killing job...
--------------------------------------------------------------------------
mpirun noticed that the job aborted, but has no info as to the process
that caused that situation.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun was unable to cleanly terminate the daemons on the nodes shown
below. Additional manual cleanup may be required - please refer to
the "orte-clean" tool for assistance.
--------------------------------------------------------------------------
domU-12-31-39-09-E6-71.compute-1.internal - daemon did not report back
when launched
-bash-3.2#
-bash-3.2# cat app.ac2
-H domU-12-31-39-09-CD-C2.compute-1.internal -np 1 /bin/hostname
-H domU-12-31-39-09-CD-C2.compute-1.internal -np 1 /bin/hostname
-bash-3.2#
-bash-3.2# # when there is no remote machine, then mpirun works:
-bash-3.2# mpirun -app app.ac2
domU-12-31-39-09-CD-C2
domU-12-31-39-09-CD-C2
-bash-3.2#
-bash-3.2# hostname
domU-12-31-39-09-CD-C2
-bash-3.2#
-bash-3.2# # this gotta be ssh problem....
-bash-3.2#
-bash-3.2# # show no firewall is used
-bash-3.2# iptables --list
Chain INPUT (policy ACCEPT)
target prot opt source destination
Chain FORWARD (policy ACCEPT)
target prot opt source destination
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
-bash-3.2#
-bash-3.2# exit
logout
[tsakai@vixen ec2]$
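The session above only shows interactive ssh logins. As far as I understand, mpirun by default starts its remote daemon over ssh without a terminal, so a non-interactive check of every host in the appfile may be a closer test. The following is a minimal sketch under that assumption: appfile_hosts and check_hosts are hypothetical helper names, and the 10-second timeout is an arbitrary choice.

```shell
#!/bin/sh
# Hypothetical helper: list the unique host names that follow -H in an appfile.
appfile_hosts() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "-H") print $(i + 1) }' "$1" | sort -u
}

# Hypothetical helper: run /bin/hostname on each host over ssh with no
# interactive login and no prompts. BatchMode and ConnectTimeout are
# standard OpenSSH client options; the timeout value is arbitrary.
check_hosts() {
    for host in $(appfile_hosts "$1"); do
        if ssh -o BatchMode=yes -o ConnectTimeout=10 "$host" /bin/hostname; then
            echo "ok: $host"
        else
            echo "FAILED: $host"
        fi
    done
}

# Usage:
#   check_hosts app.ac
```

If this succeeds for both hosts, then ssh itself is probably not what keeps the remote daemon from reporting back.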
Would someone please point out what I am doing wrong?
Thank you.
Regards,
Tena