Somebody help please...
On Wed, Oct 29, 2008 at 12:41 PM, ali nazemian <[EMAIL PROTECTED]>wrote:
> Hi again.
> Let me explain my problem in more detail.
> Here is the output on the client node:
> client mac addr: XX XX XX XX XX XX ...
> client ip: 192.168.0.2 mask: 255.255.255.0 dhcp ip: 192.168.0.1
> gateway ip: 192.168.0.1
> PXE-E32: TFTP open timeout
> ...
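
A PXE-E32 error means the client got its DHCP lease but its TFTP open request was never answered. One way to check the TFTP side from another machine on the same network is to send a single read request by hand. The following is only a minimal sketch, assuming Python is available on the machine used for testing; the function names are made up for illustration, and the boot file name used in the usage note is an assumption, not something taken from this cluster.

```python
import socket
import struct

TFTP_PORT = 69                      # standard TFTP port (UDP)
OP_RRQ, OP_DATA, OP_ERROR = 1, 3, 5  # opcodes from RFC 1350


def build_rrq(filename, mode="octet"):
    """Build a TFTP read request: opcode 1, filename, NUL, mode, NUL."""
    return (struct.pack("!H", OP_RRQ)
            + filename.encode("ascii") + b"\0"
            + mode.encode("ascii") + b"\0")


def describe_reply(packet):
    """Summarize the first packet a TFTP server sends back."""
    if len(packet) < 4:
        return "malformed reply"
    opcode, number = struct.unpack("!HH", packet[:4])
    if opcode == OP_DATA:
        return "server answered with DATA block %d" % number
    if opcode == OP_ERROR:
        message = packet[4:].split(b"\0", 1)[0].decode("ascii", "replace")
        return "server answered with ERROR %d: %s" % (number, message)
    return "unexpected opcode %d" % opcode


def probe_tftp(server, filename, timeout=5.0, port=TFTP_PORT):
    """Send one read request and report whether anything answers."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_rrq(filename), (server, port))
            packet, _ = sock.recvfrom(2048)
        except socket.timeout:
            return "no reply within %.1f s" % timeout
        except OSError as exc:
            return "network error: %s" % exc
    return describe_reply(packet)
```

Calling probe_tftp("192.168.0.1", "pxelinux.0") would then show whether the server answers at all; "pxelinux.0" is a placeholder for whatever boot file the DHCP server hands out. Even an ERROR reply (for example, file not found) shows that the TFTP daemon is reachable, while "no reply" points at the daemon, a firewall, or the network in between.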
> At the same time, on the server node:
> ssh: connect to host oscarnode1.clusternet port 22: Connection timed out
> I used "nmap -a" to see which ports are open and which are not; here is the
> result:
> starting nmap 3.70 (http://...) at 2008-10-29 14:12 IRST
> no target machines/network specified!
> quitting!
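
The nmap output above says that no target was specified, so no port was actually scanned. If only the ssh port matters, a single TCP connection attempt is enough to tell the failure cases apart. This is a rough sketch, not an OSCAR tool; the function name and the result strings are invented for illustration.

```python
import errno
import socket


def check_tcp_port(host, port, timeout=3.0):
    """Try one TCP connection and classify the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except (socket.timeout, TimeoutError):
        # Nothing answered at all (compare "Connection timed out").
        return "timed out"
    except ConnectionRefusedError:
        # The host is up, but nothing listens on this port.
        return "refused"
    except OSError as exc:
        if exc.errno == errno.EHOSTUNREACH:
            # Compare "No route to host".
            return "no route to host"
        return "error: %s" % exc
```

Run on the head node, check_tcp_port("192.168.0.2", 22) would show whether the client's ssh port is reachable. Note that this only covers TCP; the TFTP service used during network boot runs over UDP and needs a separate check.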
> I couldn't run "ssh 192.168.0.1" on the client node because there is no
> command environment there to type commands into, so I ran "ssh 192.168.0.2"
> on the server instead. The result was:
> ssh: connect to host 192.168.0.2 port 22: No route to host
> I also booted the client from CD instead of from the network, and the same
> result appeared:
> connect to host 192.168.0.2 port 22: No route to host
>
> It seems I have a problem with my network, not with OSCAR. What do you think?
>
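
As a quick sanity check on the addressing reported by the client (192.168.0.2, mask 255.255.255.0, server 192.168.0.1), the sketch below confirms that both addresses sit in the same subnet. The helper name is made up for illustration.

```python
import ipaddress


def same_subnet(addr_a, addr_b, netmask):
    """Return True when addr_b lies in the network defined by addr_a and netmask."""
    network = ipaddress.ip_network("%s/%s" % (addr_a, netmask), strict=False)
    return ipaddress.ip_address(addr_b) in network


# Values taken from the client output quoted above.
print(same_subnet("192.168.0.2", "192.168.0.1", "255.255.255.0"))  # prints True
```

Since both machines are in the same /24, traffic between them should not need the gateway at all. One common reading of "No route to host" in that situation is that the server got no answer at the link layer (for example, no ARP reply) or that something on the path rejected the connection, which would fit a cabling, switch, or client-side problem rather than an addressing mistake. That is an interpretation, not something the output proves.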
>
> On Wed, Oct 29, 2008 at 1:36 AM, ali nazemian <[EMAIL PROTECTED]>wrote:
>
>> I am using 192.168.x.x only for testing. As a pilot, this network is not
>> connected to any other network, so 192.168.x.x shouldn't be a problem here;
>> however, I will use 10.0.x.x for the final cluster network.
>> I don't have access to those nodes right now, so I'll test ssh tomorrow and
>> let you know what result I get.
>>
>>
>> On Wed, Oct 29, 2008 at 12:03 AM, Michael Edwards <[EMAIL PROTECTED]>wrote:
>>
>>> It looks like the nodes didn't image, or are now unable to communicate
>>> with the head node in any event.
>>>
>>> Plug in a monitor and keyboard to your compute node and see if you
>>> have a login prompt. You should be able to log in as your root user
>>> from the head node. If it allows you to log in there (you will need
>>> the password) try running "ssh 192.168.0.1"
>>>
>>> It is quite possible that a switch or other device on your network is
>>> using the 192.168.0.1 network. I generally use the 10.0.0.1 network
>>> because of this.
>>>
>>> On Tue, Oct 28, 2008 at 4:18 PM, ali nazemian <[EMAIL PROTECTED]>
>>> wrote:
>>> > To your first question: it seems that all of the steps before step 7
>>> > completed successfully.
>>> > As for your second one, I don't know how to check that.
>>> > I think it may be a hardware problem with my switch, a 3Com 24-port
>>> > switch. Could that be the problem?
>>> >
>>> > On Tue, Oct 28, 2008 at 11:30 PM, Michael Edwards <[EMAIL PROTECTED]>
>>> > wrote:
>>> >>
>>> >> Did the image deployment complete successfully and the nodes reboot to
>>> >> the oscar image?
>>> >>
>>> >> Can you ssh to the compute nodes from the head node (without getting a
>>> >> password prompt)?
>>> >>
>>> >> On Tue, Oct 28, 2008 at 3:51 PM, ali nazemian <[EMAIL PROTECTED]>
>>> >> wrote:
>>> >> > Hi.
>>> >> > I want to set up a cluster for our HPC center using OSCAR, but I have a
>>> >> > problem with step 7, installing the cluster.
>>> >> > Here is my problem:
>>> >> > When I run step 7, after some time a "tftp time out" error appears on
>>> >> > the client node and the node terminates the boot agent, and "Received
>>> >> > disconnect from 192.168.0.2: 2: The connection is closed by SSH Server
>>> >> > Current FSM is SSH_Main_SSHProcess" appears on the server node.
>>> >> > Here is the complete log of step 7:
>>> >> >
>>> >> >
>>> --------------------------------------------------------------------------
>>> >> > --> Update Wizard Env (as needed)
>>> >> > --> Step 7: Running: ./post_install
>>> >> > Gathering processor count from oscarnode1.clusternet.
>>> >> > ssh: connect to host oscarnode1.clusternet port 22: Connection timed out
>>> >> > Improper count (0) returned from machine oscarnode1.clusternet at ./post_install line 83
>>> >> > main::get_numproc() called at ./post_install line 39
>>> >> > ssh: connect to host oscarnode1 port 22: Connection timed out
>>> >> > rsync: connection unexpectedly closed (0 bytes received so far) [sender]
>>> >> > rsync error: error in rsync protocol data stream (code 12) at io.c(359)
>>> >> > --> About to run /opt/oscar/packages/loghost/scripts/post_install for loghost
>>> >> > ************************* oscar_cluster *************************
>>> >> > --------- oscarnode1---------
>>> >> > ssh: connect to host oscarnode1 port 22: Connection timed out
>>> >> > --> About to run /opt/oscar/packages/ganglia/scripts/post_install for ganglia
>>> >> > [ganglia] Ganglia gmond configuration file modified, re-starting daemon...
>>> >> > Shutting down GANGLIA gmond: [  OK  ]
>>> >> > Starting GANGLIA gmond: [  OK  ]
>>> >> > editing /etc/gmetad.conf
>>> >> > match: gridname\s+.*
>>> >> > match: data_source\s+.*
>>> >> > [ganglia] Ganglia gmetad configuration file modified, re-starting daemon...
>>> >> > Shutting down GANGLIA gmetad: [  OK  ]
>>> >> > Starting GANGLIA gmetad: [  OK  ]
>>> >> > [ganglia] Starting up apache...
>>> >> > Stopping httpd: [  OK  ]
>>> >> > Starting httpd: [  OK  ]
>>> >> > [ganglia] Ganglia page is located at http://server.clusternet/ganglia/
>>> >> > ************************* oscar_cluster *************************
>>> >> > --------- oscarnode1---------
>>> >> > Received disconnect from 192.168.0.2: 2: The connection is closed by SSH Server
>>> >> > Current FSM is SSH_Main_SSHProcess
>>> >> > --> About to run /opt/oscar/packages/torque/scripts/post_install for torque
>>> >> > Received disconnect from 192.168.0.2: 2: The connection is closed by SSH Server
>>> >> > Current FSM is SSH_Main_SSHProcess
>>> >> > rsync: connection unexpectedly closed (0 bytes received so far) [sender]
>>> >> > rsync error: error in rsync protocol data stream (code 12) at io.c(359)
>>> >> > TORQUE mom config file updated with clienthost: server.clusternet
>>> >> > Pushing config file to clients...
>>> >> > Sending SIGHUP to all moms...
>>> >> > ************************* oscar_cluster *************************
>>> >> > --------- oscarnode1---------
>>> >> > Received disconnect from 192.168.0.2: 2: The connection is closed by SSH Server
>>> >> > Current FSM is SSH_Main_SSHProcess
>>> >> > [torque] Updating pbs_server nodes
>>> >> > /opt/pbs/bin/pbsnodes: Server has no node list
>>> >> > Shutting down TORQUE Server: [  OK  ]
>>> >> > Starting TORQUE Server: [  OK  ]
>>> >> > [torque] Creating TORQUE workq queue...
>>> >> > Max open servers: 4
>>> >> > set queue workq resources_max.ncpus = 0
>>> >> > set queue workq resources_max.nodect = 0
>>> >> > set queue workq resources_available.nodect = 0
>>> >> > set server resources_available.ncpus = 0
>>> >> > set server resources_available.nodect = 0
>>> >> > set server resources_available.nodes = 0
>>> >> > set server resources_max.ncpus = 0
>>> >> > set server resources_max.nodes = 0
>>> >> > set server scheduler_iteration = 60
>>> >> > set server log_events = 64
>>> >> > Shutting down MAUI Scheduler: [  OK  ]
>>> >> > Starting MAUI Scheduler: [  OK  ]
>>> >> > --> About to run /opt/oscar/packages/switcher/scripts/post_install for switcher
>>> >> > Setting default for tag mpi ("lam-7.1.2")
>>> >> > Attribute successfully set; new attribute setting will be effective for future shells
>>> >> > Received disconnect from 192.168.0.2: 2: The connection is closed by SSH Server
>>> >> > Current FSM is SSH_Main_SSHProcess
>>> >> > rsync: connection unexpectedly closed (0 bytes received so far) [sender]
>>> >> > rsync error: error in rsync protocol data stream (code 12) at io.c(359)
>>> >> > --> About to run /opt/oscar/packages/mta-config/scripts/post_install for mta-config
>>> >> > ************************************ WARNING ************************************
>>> >> > OSCAR could not set up the configuration for any mailing service on the server.
>>> >> > The current version of the mta-config package in OSCAR only supports the
>>> >> > Postfix mail transfer agent (MTA).
>>> >> > It looks like you have another MTA installed (e.g, sendmail or exim); as such,
>>> >> > please be aware that OSCAR will not automatically configure it.
>>> >> > ************************************ WARNING ************************************
>>> >> > --> About to run /opt/oscar/packages/ntpconfig/scripts/post_install for ntpconfig
>>> >> > Shutting down ntpd: [  OK  ]
>>> >> > Starting ntpd: [  OK  ]
>>> >> > ************************* oscar_cluster *************************
>>> >> > --------- oscarnode1---------
>>> >> > Received disconnect from 192.168.0.2: 2: The connection is closed by SSH Server
>>> >> > Current FSM is SSH_Main_SSHProcess
>>> >> > --> About to run /opt/oscar/packages/opium/scripts/post_install for opium
>>> >> > Not all hosts were accessible by c3! Will retry the update later
>>> >> > Could not find template for file switcher.ini
>>> >> > If this contains distro-specific lines, please create a template!
>>> >> > image:
>>> >> > $VAR1 = 'oscarimage';
>>> >> > ---------------
>>> >> > Received disconnect from 192.168.0.2: 2: The connection is closed by SSH Server
>>> >> > Current FSM is SSH_Main_SSHProcess
>>> >> > rsync: connection unexpectedly closed (0 bytes received so far) [sender]
>>> >> > rsync error: error in rsync protocol data stream (code 12) at io.c(359)
>>> >> > Could not find template for file gshadow
>>> >> > If this contains distro-specific lines, please create a template!
>>> >> > image:
>>> >> > $VAR1 = 'oscarimage';
>>> >> > ---------------
>>> >> > Received disconnect from 192.168.0.2: 2: The connection is closed by SSH Server
>>> >> > Current FSM is SSH_Main_SSHProcess
>>> >> > rsync: connection unexpectedly closed (0 bytes received so far) [sender]
>>> >> > rsync error: error in rsync protocol data stream (code 12) at io.c(359)
>>> >> > image:
>>> >> > $VAR1 = 'oscarimage';
>>> >> > ---------------
>>> >> > Received disconnect from 192.168.0.2: 2: The connection is closed by SSH Server
>>> >> > Current FSM is SSH_Main_SSHProcess
>>> >> > rsync: connection unexpectedly closed (0 bytes received so far) [sender]
>>> >> > rsync error: error in rsync protocol data stream (code 12) at io.c(359)
>>> >> > image:
>>> >> > $VAR1 = 'oscarimage';
>>> >> > ---------------
>>> >> > Received disconnect from 192.168.0.2: 2: The connection is closed by SSH Server
>>> >> > Current FSM is SSH_Main_SSHProcess
>>> >> > rsync: connection unexpectedly closed (0 bytes received so far) [sender]
>>> >> > rsync error: error in rsync protocol data stream (code 12) at io.c(359)
>>> >> > image:
>>> >> > $VAR1 = 'oscarimage';
>>> >> > ---------------
>>> >> > Received disconnect from 192.168.0.2: 2: The connection is closed by SSH Server
>>> >> > Current FSM is SSH_Main_SSHProcess
>>> >> > rsync: connection unexpectedly closed (0 bytes received so far) [sender]
>>> >> > rsync error: error in rsync protocol data stream (code 12) at io.c(359)
>>> >> > --> About to run /opt/oscar/packages/oda/scripts/post_install for oda
>>> >> > generating the /etc/odaserver file on all oscar clients
>>> >> > . /etc/profile.d/c3.sh && cexec 'echo oscar_server > /etc/odaserver'
>>> >> > ************************* oscar_cluster *************************
>>> >> > --------- oscarnode1---------
>>> >> > Received disconnect from 192.168.0.2: 2: The connection is closed by SSH Server
>>> >> > Current FSM is SSH_Main_SSHProcess
>>> >> > Cluster setup complete!
>>> >> > --> Step 7: Successfully completed the cluster install
>>> >> > --> Update Wizard Env (as needed)
>>> >> >
>>> >> >
>>> -----------------------------------------------------------------------------------
>>> >> > P.S.: I am using OSCAR 5 on CentOS 4.7 x86_64; I can't use CentOS 5.x
>>> >> > because it has a problem with my graphics cards.
>>> >> > Best regards.
>>> >> >
>>> >> > --
>>> >> > A.Nazemian
>>> >> >
>>> >> >
>>> >> >
>>> -------------------------------------------------------------------------
>>> >> > This SF.Net email is sponsored by the Moblin Your Move Developer's
>>> >> > challenge
>>> >> > Build the coolest Linux based applications with Moblin SDK & win
>>> great
>>> >> > prizes
>>> >> > Grand prize is a trip for two to an Open Source event anywhere in
>>> the
>>> >> > world
>>> >> > http://moblin-contest.org/redirect.php?banner_id=100&url=/
>>> >> > _______________________________________________
>>> >> > Oscar-users mailing list
>>> >> > Oscar-users@lists.sourceforge.net
>>> >> > https://lists.sourceforge.net/lists/listinfo/oscar-users
>>> >> >
>>> >> >
>>> >>
>>> >>
>>> >
>>> >
>>> >
>>> > --
>>> > A.Nazemian
>>> >
>>> >
>>> >
>>> >
>>>
>>>
>>
>>
>>
>> --
>> A.Nazemian
>>
>
>
>
> --
> A.Nazemian
>
--
A.Nazemian