I am using 192.168.x.x just for testing. As a pilot, this network isn't
connected to any other networks, so 192.168.x.x shouldn't be a problem in
this case; however, I will use 10.0.x.x for the final cluster network.
I don't have access to those nodes right now, so I'll test ssh tomorrow and
let you know what result I get.
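
For reference, the ssh test suggested below comes down to whether port 22 on
the compute node accepts a TCP connection from the head node. Here is a
minimal sketch of that check in Python, assuming it is run on the head node.
The names port_open and check_nodes are hypothetical helpers (not part of
OSCAR), and the 3-second timeout is an arbitrary choice.

```python
import socket


def port_open(host: str, port: int = 22, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and name-resolution failures.
        return False


def check_nodes(nodes, port: int = 22) -> dict:
    """Map each node name to whether its port is reachable."""
    return {node: port_open(node, port) for node in nodes}
```

Calling check_nodes(["oscarnode1"]) on the head node would show whether the
"Connection timed out" in the step 7 log is a plain reachability problem. It
does not test passwordless login itself.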

On Wed, Oct 29, 2008 at 12:03 AM, Michael Edwards <[EMAIL PROTECTED]> wrote:

> It looks like the nodes didn't image, or are now unable to communicate
> with the head node in any event.
>
> Plug in a monitor and keyboard to your compute node and see if you
> have a login prompt.  You should be able to log in as your root user
> from the head node.  If it allows you to log in there (you will need
> the password) try running "ssh 192.168.0.1"
>
> It is quite possible that a switch or other device on your network is
> using the 192.168.0.1 network.  I generally use the 10.0.0.1 network
> because of this.
>
> On Tue, Oct 28, 2008 at 4:18 PM, ali nazemian <[EMAIL PROTECTED]>
> wrote:
> > For your first question, it seems that all of the steps before step 7
> > completed successfully.
> > As for your second one, I don't know how to check that.
> > I think maybe it's a hardware problem with my switch; it's a 3Com 24-port
> > switch. Could that be my problem?
> >
> > On Tue, Oct 28, 2008 at 11:30 PM, Michael Edwards <[EMAIL PROTECTED]>
> > wrote:
> >>
> >> Did the image deployment complete successfully and the nodes reboot to
> >> the oscar image?
> >>
> >> Can you ssh to the compute nodes from the head node (without getting a
> >> password prompt)?
> >>
> >> On Tue, Oct 28, 2008 at 3:51 PM, ali nazemian <[EMAIL PROTECTED]>
> >> wrote:
> >> > Hi.
> >> > I want to set up clustering for our HPC center using OSCAR, but I have a
> >> > problem with step 7, installing the cluster.
> >> > Here is my problem:
> >> > When I run step 7, after some time a "tftp time out" error appears on
> >> > the client node and the node terminates the boot agent, and "Received
> >> > disconnect from 192.168.0.2: 2: The connection is closed by SSH Server
> >> > Current FSM is SSH_Main_SSHProcess" appears on the server node.
> >> > Here is the complete log of step 7:
> >> >
> >> >
> >> > --------------------------------------------------------------------------
> >> > --> Update Wizard Env (as needed)
> >> > --> Step 7: Running: ./post_install
> >> > Gathering processor count from oscarnode1.clusternet.
> >> > ssh: connect to host oscarnode1.clusternet port 22: Connection timed out
> >> > Improper count (0) returned from machine oscarnode1.clusternet at ./post_install line 83
> >> >     main::get_numproc() called at ./post_install line 39
> >> > ssh: connect to host oscarnode1 port 22: Connection timed out
> >> > rsync: connection unexpectedly closed (0 bytes received so far) [sender]
> >> > rsync error: error in rsync protocol data stream (code 12) at io.c(359)
> >> > --> About to run /opt/oscar/packages/loghost/scripts/post_install for loghost
> >> > ************************* oscar_cluster *************************
> >> > --------- oscarnode1---------
> >> > ssh: connect to host oscarnode1 port 22: Connection timed out
> >> > --> About to run /opt/oscar/packages/ganglia/scripts/post_install for ganglia
> >> > [ganglia] Ganglia gmond configuration file modified, re-starting daemon...
> >> > Shutting down GANGLIA gmond: [  OK  ]
> >> > Starting GANGLIA gmond: [  OK  ]
> >> > editing /etc/gmetad.conf
> >> > match: gridname\s+.*
> >> > match: data_source\s+.*
> >> > [ganglia] Ganglia gmetad configuration file modified, re-starting daemon...
> >> > Shutting down GANGLIA gmetad: [  OK  ]
> >> > Starting GANGLIA gmetad: [  OK  ]
> >> > [ganglia] Starting up apache...
> >> > Stopping httpd: [  OK  ]
> >> > Starting httpd: [  OK  ]
> >> > [ganglia] Ganglia page is located at http://server.clusternet/ganglia/
> >> > ************************* oscar_cluster *************************
> >> > --------- oscarnode1---------
> >> > Received disconnect from 192.168.0.2: 2: The connection is closed by SSH Server
> >> > Current FSM is SSH_Main_SSHProcess
> >> > --> About to run /opt/oscar/packages/torque/scripts/post_install for torque
> >> > Received disconnect from 192.168.0.2: 2: The connection is closed by SSH Server
> >> > Current FSM is SSH_Main_SSHProcess
> >> > rsync: connection unexpectedly closed (0 bytes received so far) [sender]
> >> > rsync error: error in rsync protocol data stream (code 12) at io.c(359)
> >> > TORQUE mom config file updated with clienthost: server.clusternet
> >> > Pushing config file to clients...
> >> > Sending SIGHUP to all moms...
> >> > ************************* oscar_cluster *************************
> >> > --------- oscarnode1---------
> >> > Received disconnect from 192.168.0.2: 2: The connection is closed by SSH Server
> >> > Current FSM is SSH_Main_SSHProcess
> >> > [torque] Updating pbs_server nodes
> >> > /opt/pbs/bin/pbsnodes: Server has no node list
> >> > Shutting down TORQUE Server: [  OK  ]
> >> > Starting TORQUE Server: [  OK  ]
> >> > [torque] Creating TORQUE workq queue...
> >> > Max open servers: 4
> >> > set queue workq resources_max.ncpus = 0
> >> > set queue workq resources_max.nodect = 0
> >> > set queue workq resources_available.nodect = 0
> >> > set server resources_available.ncpus = 0
> >> > set server resources_available.nodect = 0
> >> > set server resources_available.nodes = 0
> >> > set server resources_max.ncpus = 0
> >> > set server resources_max.nodes = 0
> >> > set server scheduler_iteration = 60
> >> > set server log_events = 64
> >> > Shutting down MAUI Scheduler: [  OK  ]
> >> > Starting MAUI Scheduler: [  OK  ]
> >> > --> About to run /opt/oscar/packages/switcher/scripts/post_install for switcher
> >> > Setting default for tag mpi ("lam-7.1.2")
> >> > Attribute successfully set; new attribute setting will be effective for future shells
> >> > Received disconnect from 192.168.0.2: 2: The connection is closed by SSH Server
> >> > Current FSM is SSH_Main_SSHProcess
> >> > rsync: connection unexpectedly closed (0 bytes received so far) [sender]
> >> > rsync error: error in rsync protocol data stream (code 12) at io.c(359)
> >> > --> About to run /opt/oscar/packages/mta-config/scripts/post_install for mta-config
> >> > ************************************ WARNING ************************************
> >> > OSCAR could not set up the configuration for any mailing service on the server.
> >> > The current version of the mta-config package in OSCAR only supports the
> >> > Postfix mail transfer agent (MTA).
> >> > It looks like you have another MTA installed (e.g, sendmail or exim); as such,
> >> > please be aware that OSCAR will not automatically configure it.
> >> > ************************************ WARNING ************************************
> >> > --> About to run /opt/oscar/packages/ntpconfig/scripts/post_install for ntpconfig
> >> > Shutting down ntpd: [  OK  ]
> >> > Starting ntpd: [  OK  ]
> >> > ************************* oscar_cluster *************************
> >> > --------- oscarnode1---------
> >> > Received disconnect from 192.168.0.2: 2: The connection is closed by SSH Server
> >> > Current FSM is SSH_Main_SSHProcess
> >> > --> About to run /opt/oscar/packages/opium/scripts/post_install for opium
> >> > Not all hosts were accessible by c3! Will retry the update later
> >> > Could not find template for file switcher.ini
> >> > If this contains distro-specific lines, please create a template!
> >> > image:
> >> > $VAR1 = 'oscarimage';
> >> > ---------------
> >> > Received disconnect from 192.168.0.2: 2: The connection is closed by SSH Server
> >> > Current FSM is SSH_Main_SSHProcess
> >> > rsync: connection unexpectedly closed (0 bytes received so far) [sender]
> >> > rsync error: error in rsync protocol data stream (code 12) at io.c(359)
> >> > Could not find template for file gshadow
> >> > If this contains distro-specific lines, please create a template!
> >> > image:
> >> > $VAR1 = 'oscarimage';
> >> > ---------------
> >> > Received disconnect from 192.168.0.2: 2: The connection is closed by SSH Server
> >> > Current FSM is SSH_Main_SSHProcess
> >> > rsync: connection unexpectedly closed (0 bytes received so far) [sender]
> >> > rsync error: error in rsync protocol data stream (code 12) at io.c(359)
> >> > image:
> >> > $VAR1 = 'oscarimage';
> >> > ---------------
> >> > Received disconnect from 192.168.0.2: 2: The connection is closed by SSH Server
> >> > Current FSM is SSH_Main_SSHProcess
> >> > rsync: connection unexpectedly closed (0 bytes received so far) [sender]
> >> > rsync error: error in rsync protocol data stream (code 12) at io.c(359)
> >> > image:
> >> > $VAR1 = 'oscarimage';
> >> > ---------------
> >> > Received disconnect from 192.168.0.2: 2: The connection is closed by SSH Server
> >> > Current FSM is SSH_Main_SSHProcess
> >> > rsync: connection unexpectedly closed (0 bytes received so far) [sender]
> >> > rsync error: error in rsync protocol data stream (code 12) at io.c(359)
> >> > image:
> >> > $VAR1 = 'oscarimage';
> >> > ---------------
> >> > Received disconnect from 192.168.0.2: 2: The connection is closed by SSH Server
> >> > Current FSM is SSH_Main_SSHProcess
> >> > rsync: connection unexpectedly closed (0 bytes received so far) [sender]
> >> > rsync error: error in rsync protocol data stream (code 12) at io.c(359)
> >> > --> About to run /opt/oscar/packages/oda/scripts/post_install for oda
> >> > generating the /etc/odaserver file on all oscar clients
> >> > . /etc/profile.d/c3.sh && cexec 'echo oscar_server > /etc/odaserver'
> >> > ************************* oscar_cluster *************************
> >> > --------- oscarnode1---------
> >> > Received disconnect from 192.168.0.2: 2: The connection is closed by SSH Server
> >> > Current FSM is SSH_Main_SSHProcess
> >> > Cluster setup complete!
> >> > --> Step 7: Successfully completed the cluster install
> >> > --> Update Wizard Env (as needed)
> >> >
> >> >
> >> > -----------------------------------------------------------------------------------
> >> > P.S.: I am using OSCAR 5 on CentOS 4.7 (x86_64). I can't use CentOS 5.x
> >> > because it has a problem with my graphics cards.
> >> > Best regards.
> >> >
> >> > --
> >> > A.Nazemian
> >> >
> >> >
> >> >
> -------------------------------------------------------------------------
> >> > This SF.Net email is sponsored by the Moblin Your Move Developer's
> >> > challenge
> >> > Build the coolest Linux based applications with Moblin SDK & win great
> >> > prizes
> >> > Grand prize is a trip for two to an Open Source event anywhere in the
> >> > world
> >> > http://moblin-contest.org/redirect.php?banner_id=100&url=/
> >> > _______________________________________________
> >> > Oscar-users mailing list
> >> > Oscar-users@lists.sourceforge.net
> >> > https://lists.sourceforge.net/lists/listinfo/oscar-users
> >> >
> >> >
> >>
> >>
> >
> >
> >
> > --
> > A.Nazemian
> >
> >
> >
>
>



-- 
A.Nazemian
