Ok, I understand.

Outside problem: could whoever manages the border of the 169.236.129 network be blocking incoming ssh (either by default, or maybe just to this box, for example if they specify what's let through on a host-by-host basis)? Also, can you connect to something outside from madrid (i.e. browse to google, or ssh to the box you're trying to connect in from), to confirm that madrid's connection to the outside network works ok?

Inside problem: what is the output of the hostname command when run on madrid?

Also, to answer DongInn's question, did you do this before you did the install/build:

http://svn.oscar.openclustergroup.org/trac/oscar/wiki/InstallGuideNetwork

If so, something went horribly wrong and we should figure out whether there's a bug in OSCAR we didn't know about. If not, that's probably where some of the initial confusion started.

--Joe
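For the outside question, a scripted version of the `telnet 169.236.129.234 22` test suggested earlier in the thread may help separate a silently dropped connection from a rejected one. This is only a rough sketch: the function name `probe_tcp` and the result labels are made up for illustration, and a "timeout" result is merely consistent with filtering at the border, not proof of it.

```python
import socket


def probe_tcp(host, port, timeout=5.0):
    """Classify one TCP connection attempt.

    Returns 'open', 'refused', 'timeout' or 'error'. The labels are
    illustrative names, not anything defined by ssh or OSCAR.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except (socket.timeout, TimeoutError):
        # No answer at all within the timeout: packets may be dropped on the way.
        return "timeout"
    except ConnectionRefusedError:
        # The host answered, but nothing accepted the connection on that port.
        return "refused"
    except OSError:
        # Anything else (no route, name resolution failure, ...).
        return "error"


# Example call, to be run from the "outside" box (address taken from the thread):
#   print(probe_tcp("169.236.129.234", 22))
```

Running the same call from a box inside 169.236.129 and from a box outside gives a quick comparison; if inside reports "open" and outside reports "timeout", that would point away from sshd on madrid and toward something in between.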
________________________________

From: [EMAIL PROTECTED] on behalf of Joseph Norris
Sent: Mon 8/11/2008 3:04 PM
To: oscar-users@lists.sourceforge.net
Subject: Re: [Oscar-users] Oscar installed - tests run - problem withheadnode

Allow me to clarify - sorry if I did not. When I say outside I am referring to outside the 169.236.129.xxx network.

From outside of this network - I cannot get a password prompt at all, and then I get:

ssh -vvv [EMAIL PROTECTED]
OpenSSH_4.4p1, OpenSSL 0.9.7l 28 Sep 2006
debug1: Reading configuration data /etc/ssh/ssh_config
debug2: ssh_connect: needpriv 0
debug1: Connecting to madrid.ucmerced.edu [169.236.129.234] port 22.
debug1: connect to address 169.236.129.234 port 22: Connection timed out
ssh: connect to host madrid.ucmerced.edu port 22: Connection timed out

I hope I have clarified this more. Thanks

Inside this network - from another box on 169.236.129 - there is a 30+ second wait for a password prompt. Once I am on the headnode I can ssh to any of my compute nodes with no problem.

Firewall - on the headnode - the ssh port is open.

Greenseid, Joseph M. wrote:
> ok, first, i want to make sure i understand.
>
> when connecting from "outside," you're connecting from somewhere where you're
> trying to access via the 169.236.129.234 address (eth0 interface).
>
> when connecting from "inside," you're doing it from somewhere else on your
> 10. network, trying to access via the eth1 interface, which is 10.0.0.2?
>
> the inside hang is probably because there's no /etc/hosts entry at all now
> for 10.0.0.2.
>
> as for the outside, that may be a misconfigured firewall or something -- can
> you ping the box (assuming you're not blocking ping)? can you `telnet
> 169.236.129.234 22` (telnet to the sshd port) to see if you can connect? i
> think that this problem is not oscar, but network or firewall related. can
> you connect to the outside world from madrid (i.e. can you ping google or
> access some web site or something to ensure that the network connection
> between madrid and the world is working)?
>
> --Joe
>
> ________________________________
>
> From: [EMAIL PROTECTED] on behalf of Joseph Norris
> Sent: Mon 8/11/2008 2:52 PM
> To: oscar-users@lists.sourceforge.net
> Subject: Re: [Oscar-users] Oscar installed - tests run - problem withheadnode
>
>
>
> Here is what I have now:
>
> # Do not remove the following line, or various programs
> # that require network functionality will fail.
>
> 127.0.0.1 localhost.localdomain localhost
>
> 169.236.129.234 madrid.ucmerced.edu madrid
>
>
> # These entries are managed by SIS, please don't modify them.
> 10.0.0.3 oscarnode1.ucmerced.edu oscarnode1
> 10.0.0.4 oscarnode2.ucmerced.edu oscarnode2
> 10.0.0.5 oscarnode3.ucmerced.edu oscarnode3
> 10.0.0.6 oscarnode4.ucmerced.edu oscarnode4
>
>
> I am still getting a 20-30 second wait for password from boxes inside my
> network - when I do the following from inside:
>
> ssh -vvv [EMAIL PROTECTED] - I get:
>
> OpenSSH_4.3p2, OpenSSL 0.9.8b 04 May 2006
> debug1: Reading configuration data /etc/ssh/ssh_config
> debug1: Applying options for *
> debug2: ssh_connect: needpriv 0
> debug1: Connecting to madrid.ucmerced.edu [169.236.129.234] port 22.
> debug1: Connection established.
> debug1: identity file /home/joseph/.ssh/identity type -1
> debug1: identity file /home/joseph/.ssh/id_rsa type -1
> debug1: identity file /home/joseph/.ssh/id_dsa type -1
> debug1: loaded 3 keys
> debug1: Remote protocol version 2.0, remote software version OpenSSH_4.3
> debug1: match: OpenSSH_4.3 pat OpenSSH*
> debug1: Enabling compatibility mode for protocol 2.0
> debug1: Local version string SSH-2.0-OpenSSH_4.3
> debug2: fd 3 setting O_NONBLOCK
> debug1: SSH2_MSG_KEXINIT sent
> debug1: SSH2_MSG_KEXINIT received
> debug2: kex_parse_kexinit: diffie-hellman-group-exchange-sha1,diffie-hellman-group14-sha1,diffie-hellman-group1-sha1
> debug2: kex_parse_kexinit: ssh-rsa,ssh-dss
> debug2: kex_parse_kexinit: aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,arcfour128,arcfour256,arcfour,aes192-cbc,aes256-cbc,[EMAIL PROTECTED],aes128-ctr,aes192-ctr,aes256-ctr
> debug2: kex_parse_kexinit: aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,arcfour128,arcfour256,arcfour,aes192-cbc,aes256-cbc,[EMAIL PROTECTED],aes128-ctr,aes192-ctr,aes256-ctr
> debug2: kex_parse_kexinit: hmac-md5,hmac-sha1,hmac-ripemd160,[EMAIL PROTECTED],hmac-sha1-96,hmac-md5-96
> debug2: kex_parse_kexinit: hmac-md5,hmac-sha1,hmac-ripemd160,[EMAIL PROTECTED],hmac-sha1-96,hmac-md5-96
> debug2: kex_parse_kexinit: none,[EMAIL PROTECTED],zlib
> debug2: kex_parse_kexinit: none,[EMAIL PROTECTED],zlib
> debug2: kex_parse_kexinit:
> debug2: kex_parse_kexinit:
> debug2: kex_parse_kexinit: first_kex_follows 0
> debug2: kex_parse_kexinit: reserved 0
> debug2: kex_parse_kexinit: diffie-hellman-group-exchange-sha1,diffie-hellman-group14-sha1,diffie-hellman-group1-sha1
> debug2: kex_parse_kexinit: ssh-rsa,ssh-dss
> debug2: kex_parse_kexinit: aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,arcfour128,arcfour256,arcfour,aes192-cbc,aes256-cbc,[EMAIL PROTECTED],aes128-ctr,aes192-ctr,aes256-ctr
> debug2: kex_parse_kexinit: aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,arcfour128,arcfour256,arcfour,aes192-cbc,aes256-cbc,[EMAIL PROTECTED],aes128-ctr,aes192-ctr,aes256-ctr
> debug2: kex_parse_kexinit: hmac-md5,hmac-sha1,hmac-ripemd160,[EMAIL PROTECTED],hmac-sha1-96,hmac-md5-96
> debug2: kex_parse_kexinit: hmac-md5,hmac-sha1,hmac-ripemd160,[EMAIL PROTECTED],hmac-sha1-96,hmac-md5-96
> debug2: kex_parse_kexinit: none,[EMAIL PROTECTED]
> debug2: kex_parse_kexinit: none,[EMAIL PROTECTED]
> debug2: kex_parse_kexinit:
> debug2: kex_parse_kexinit:
> debug2: kex_parse_kexinit: first_kex_follows 0
> debug2: kex_parse_kexinit: reserved 0
> debug2: mac_init: found hmac-md5
> debug1: kex: server->client aes128-cbc hmac-md5 none
> debug2: mac_init: found hmac-md5
> debug1: kex: client->server aes128-cbc hmac-md5 none
> debug1: SSH2_MSG_KEX_DH_GEX_REQUEST(1024<1024<8192) sent
> debug1: expecting SSH2_MSG_KEX_DH_GEX_GROUP
> debug2: dh_gen_key: priv key bits set: 122/256
> debug2: bits set: 505/1024
> debug1: SSH2_MSG_KEX_DH_GEX_INIT sent
> debug1: expecting SSH2_MSG_KEX_DH_GEX_REPLY
> debug3: check_host_in_hostfile: filename /home/joseph/.ssh/known_hosts
> debug3: check_host_in_hostfile: match line 13
> debug3: check_host_in_hostfile: filename /home/joseph/.ssh/known_hosts
> debug3: check_host_in_hostfile: match line 11
> debug1: Host 'madrid.ucmerced.edu' is known and matches the RSA host key.
> debug1: Found key in /home/joseph/.ssh/known_hosts:13
> debug2: bits set: 502/1024
> debug1: ssh_rsa_verify: signature correct
> debug2: kex_derive_keys
> debug2: set_newkeys: mode 1
> debug1: SSH2_MSG_NEWKEYS sent
> debug1: expecting SSH2_MSG_NEWKEYS
> debug2: set_newkeys: mode 0
> debug1: SSH2_MSG_NEWKEYS received
> debug1: SSH2_MSG_SERVICE_REQUEST sent
> debug2: service_accept: ssh-userauth
> debug1: SSH2_MSG_SERVICE_ACCEPT received
> debug2: key: /home/joseph/.ssh/identity ((nil))
> debug2: key: /home/joseph/.ssh/id_rsa ((nil))
> debug2: key: /home/joseph/.ssh/id_dsa ((nil))
>
> After the 30+ second wait I get a password prompt:
>
> debug3: packet_send2: adding 48 (len 64 padlen 16 extra_pad 64)
> debug2: we sent a password packet, wait for reply
> debug1: Authentication succeeded (password).
> debug1: channel 0: new [client-session]
> debug3: ssh_session2_open: channel_new: 0
> debug2: channel 0: send open
> debug1: Entering interactive session.
> debug2: callback start
> debug2: client_session2_setup: id 0
> debug2: channel 0: request pty-req confirm 0
> debug3: tty_make_modes: ospeed 38400
> debug3: tty_make_modes: ispeed 38400
> debug3: tty_make_modes: 1 3
> debug3: tty_make_modes: 2 28
> debug3: tty_make_modes: 3 127
> debug3: tty_make_modes: 4 21
<snip>
> debug3: tty_make_modes: 92 0
> debug3: tty_make_modes: 93 0
> debug1: Sending environment.
> debug3: Ignored env HOSTNAME
> debug3: Ignored env TERM
> debug3: Ignored env SHELL
> debug3: Ignored env HISTSIZE
> debug3: Ignored env SSH_CLIENT
> debug3: Ignored env CVSROOT
> debug3: Ignored env SSH_TTY
> debug3: Ignored env USER
> debug3: Ignored env LS_COLORS
> debug3: Ignored env MAIL
> debug3: Ignored env PATH
> debug3: Ignored env INPUTRC
> debug3: Ignored env PWD
> debug1: Sending env LANG = en_US.UTF-8
> debug2: channel 0: request env confirm 0
> debug3: Ignored env SSH_ASKPASS
> debug3: Ignored env SHLVL
> debug3: Ignored env HOME
> debug3: Ignored env LOGNAME
> debug3: Ignored env CVS_RSH
> debug3: Ignored env SSH_CONNECTION
> debug3: Ignored env LESSOPEN
> debug3: Ignored env G_BROKEN_FILENAMES
> debug3: Ignored env _
> debug2: channel 0: request shell confirm 0
> debug2: fd 3 setting TCP_NODELAY
> debug2: callback done
> debug2: channel 0: open confirm rwindow 0 rmax 32768
> debug2: channel 0: rcvd adjust 131072
> Last login: Mon Aug 11 11:46:46 2008 from 169.236.129.235
>
> From outside the network I get:
>
> [EMAIL PROTECTED]:]$ ssh -vvv [EMAIL PROTECTED]
> OpenSSH_4.4p1, OpenSSL 0.9.7l 28 Sep 2006
> debug1: Reading configuration data /etc/ssh/ssh_config
> debug2: ssh_connect: needpriv 0
> debug1: Connecting to madrid.ucmerced.edu [169.236.129.234] port 22.
>
>
> just hangs there and no password.
>
>
> In the messages log I am getting:
>
> Aug 11 11:39:32 madrid sshd[3899]: error: Bind to port 22 on 0.0.0.0 failed: Address already in use.
> Aug 11 11:39:34 madrid sshd[3761]: pam_unix(sshd:session): session closed for user joseph
> Aug 11 11:40:39 madrid sshd[3903]: Accepted password for joseph from 169.236.129.235 port 52030 ssh2
> Aug 11 11:40:39 madrid sshd[3903]: pam_unix(sshd:session): session opened for user joseph by (uid=0)
> Aug 11 11:41:30 madrid sudo: joseph : TTY=pts/1 ; PWD=/home/joseph ; USER=root ; COMMAND=/bin/bash
> Aug 11 11:46:46 madrid sshd[4536]: Accepted password for joseph from 169.236.129.235 port 54462 ssh2
> Aug 11 11:46:46 madrid sshd[4536]: pam_unix(sshd:session): session opened for user joseph by (uid=0)
> Aug 11 11:46:48 madrid sshd[4536]: pam_unix(sshd:session): session closed for user joseph
> Aug 11 11:48:35 madrid sshd[4606]: Accepted password for joseph from 169.236.129.235 port 38358 ssh2
> Aug 11 11:48:35 madrid sshd[4606]: pam_unix(sshd:session): session opened for user joseph by (uid=0)
>
>
> and nothing registered from login on the outside - no password and then
> times out.
>
>
>
> Greenseid, Joseph M. wrote:
>
>> very strange. one last question about the hosts file. the network setup
>> section of the install guide recommends separating the localhost name and
>> actual hostname into two separate lines in the hosts file before starting
>> the installation. did you do this before you started the install? if not,
>> i wonder if that had something to do with things getting so mangled.
>>
>> anyway, if you fix it up, hopefully it'll all work fine.
>>
>> --Joe
>>
>> ________________________________
>>
>> From: [EMAIL PROTECTED] on behalf of Joseph Norris
>> Sent: Mon 8/11/2008 2:26 PM
>> To: oscar-users@lists.sourceforge.net
>> Subject: Re: [Oscar-users] Oscar installed - tests run - problem withheadnode
>>
>>
>>
>> As you see the hosts file - this is how oscar left it. I did not begin
>> looking at this or editing it until I discovered this issue and there
>> was not an external IP in the hosts file.
>>
>> Greenseid, Joseph M. wrote:
>>
>>
>>> oscar deleted the public address line from your hosts file?
>>>
>>> --Joe
>>>
>>> ________________________________
>>>
>>> From: [EMAIL PROTECTED] on behalf of Joseph Norris
>>> Sent: Mon 8/11/2008 2:00 PM
>>> To: oscar-users@lists.sourceforge.net
>>> Subject: Re: [Oscar-users] Oscar installed - tests run - problem withheadnode
>>>
>>>
>>>
>>> This was the way oscar built my file and it was a bit odd to me also. I
>>> have eth0 aimed at the outside IP address of 169.236.129.234. eth1 is
>>> aimed at the nodes. Oscar built this host file in this way. I will
>>> modify it but leave the compute nodes in place.
>>>
>>> Greenseid, Joseph M. wrote:
>>>
>>>
>>>
>>>> joseph,
>>>>
>>>> i had the same thought as donginn -- having "madrid.ucmerced.edu" on both
>>>> the 127.0.0.1 line *AND* the 10.0.0.2 line may well be confusing the
>>>> system. as a general rule, i've taken to having the first line of the
>>>> hosts file be "127.0.0.1 localhost.localdomain localhost" and that's it,
>>>> and have the hostname on the line with the reachable IP addr.
>>>>
>>>> as for the ssh problems specifically, a verbose trace of the ssh
>>>> connection may yield some more specific information than our best guess at
>>>> the hosts file configuration -- try to add a "-v" to the ssh command and
>>>> see where it's stalling out.
>>>>
>>>> also, how do you connect to this box from "outside?" from this hosts
>>>> file, it looks like the head node has a 10. addr that it uses to talk to
>>>> the cluster nodes, but no separate "public" facing address, as is
>>>> customarily the case with clusters (a public facing addr on the head node,
>>>> and a private network that the head node and all the compute nodes are on,
>>>> so the compute nodes are not reachable directly from anywhere except
>>>> inside the cluster).
>>>>
>>>> --Joe
>>>>
>>>> ________________________________
>>>>
>>>> From: [EMAIL PROTECTED] on behalf of DongInn Kim
>>>> Sent: Mon 8/11/2008 1:08 PM
>>>> To: oscar-users@lists.sourceforge.net
>>>> Subject: Re: [Oscar-users] Oscar installed - tests run - problem withheadnode
>>>>
>>>>
>>>>
>>>> Hi Joseph,
>>>>
>>>> I don't know how to avoid the OSCAR sanity checking of network
>>>> configuration (especially /etc/hosts) because OSCAR does not like to have
>>>> any actual hostname rather than localhost.localdomain.
>>>>
>>>> http://svn.oscar.openclustergroup.org/trac/oscar/wiki/InstallGuideNetwork#NIC
>>>>
>>>> This is from the OSCAR install guide.
>>>>
>>>> Anyway, I am wondering if the 127.0.0.1 line caused the problem on your
>>>> test. I am not really sure though.
>>>>
>>>> Regards,
>>>>
>>>> - DongInn
>>>>
>>>>
>>>> Joseph Norris wrote:
>>>>
>>>>
>>>>
>>>>
>>>>> I was able to get oscar totally installed, tests run, X11 issues
>>>>> resolved etc... Now I have the following issue.
>>>>>
>>>>> On the head node I have ssh open on my firewall. When I log in from another
>>>>> box within my network it takes between 30-40 seconds to get a password
>>>>> prompt and from outside I get no password prompt at all. I discussed
>>>>> this with another sys admin and he suggested that I look at resolv.conf
>>>>> - however this has the same structure as the other redhat servers that I
>>>>> administrate and I can reach them just fine.
>>>>>
>>>>> I was wondering if my hosts file is correct? or how I should
>>>>> trouble-shoot this issue?
>>>>>
>>>>> Hosts file:
>>>>> 127.0.0.1 madrid.ucmerced.edu madrid localhost.localdomain localhost
>>>>> 10.0.0.2 madrid.ucmerced.edu oscar_server oscar_server nfs_oscar pbs_oscar
>>>>>
>>>>>
>>>>> # These entries are managed by SIS, please don't modify them.
>>>>> 10.0.0.3 oscarnode1.ucmerced.edu oscarnode1
>>>>> 10.0.0.4 oscarnode2.ucmerced.edu oscarnode2
>>>>> 10.0.0.5 oscarnode3.ucmerced.edu oscarnode3
>>>>> 10.0.0.6 oscarnode4.ucmerced.edu oscarnode4
>>>>>
>>>>>
>>>>> Thanks.
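The hosts-file concern raised in this thread (the same hostname sitting on both the 127.0.0.1 line and the 10.0.0.2 line) can be checked mechanically. The following is a minimal sketch, not part of OSCAR: the helper names `parse_hosts` and `loopback_conflicts` are invented here, the sample text is the original hosts file from the thread with the mail client's line wraps undone, and IPv6 loopback (::1) is deliberately ignored.

```python
from collections import defaultdict


def parse_hosts(text):
    """Map each hostname to the list of addresses it appears under, in file order."""
    names = defaultdict(list)
    for raw in text.splitlines():
        # Drop comments and surrounding whitespace; skip empty lines.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        address, *hostnames = line.split()
        for name in hostnames:
            if address not in names[name]:
                names[name].append(address)
    return dict(names)


def loopback_conflicts(text):
    """Return names bound to a 127.x address and also to a non-loopback address."""
    conflicts = {}
    for name, addresses in parse_hosts(text).items():
        has_loopback = any(a.startswith("127.") for a in addresses)
        has_other = any(not a.startswith("127.") for a in addresses)
        if has_loopback and has_other:
            conflicts[name] = addresses
    return conflicts


# Original hosts file from the thread (assumption: wrapped lines rejoined).
ORIGINAL_HOSTS = """\
127.0.0.1 madrid.ucmerced.edu madrid localhost.localdomain localhost
10.0.0.2 madrid.ucmerced.edu oscar_server oscar_server nfs_oscar pbs_oscar
10.0.0.3 oscarnode1.ucmerced.edu oscarnode1
"""

print(loopback_conflicts(ORIGINAL_HOSTS))
# prints {'madrid.ucmerced.edu': ['127.0.0.1', '10.0.0.2']}
```

On a live system the same check could be run against the contents of /etc/hosts; an empty result only means this particular overlap is absent, not that the file is otherwise correct.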