I checked what you said, and the firewall was indeed enabled. SSH was allowed
through it, but the firewall itself was still running, so I disabled it and
that solved the problem. After that, some new errors showed up on the client
node. They were about a partitioning problem on the client, which I think is
related to the ide.disk/scsi.disk file. So I have some questions about the
imaging process in the OSCAR installation; the answers will probably help me
install it without any errors:
1- In the "Build the image" step we have to choose a disk partition file. We
are told to choose scsi.disk for SCSI disks and ide.disk for IDE disks, but
what about SATA disks? As you know, IDE disks are named hda and SCSI disks are
named sda. I saw that SATA uses sda too, so should I choose scsi.disk?
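
One way to see which naming the kernel actually uses for the disk is to look
at /proc/partitions on the client. The sketch below is only an illustration,
assuming a Python 3 interpreter is available there and assuming that scsi.disk
describes /dev/sd* devices and ide.disk describes /dev/hd* devices; the
function names are made up for this example.

```python
import re


def disk_families(proc_partitions_text):
    """Return the whole-disk name prefixes ('sd', 'hd') listed in
    /proc/partitions-style text."""
    families = set()
    for line in proc_partitions_text.splitlines():
        fields = line.split()
        # Data rows have four columns: major, minor, #blocks, name.
        if len(fields) != 4 or not fields[0].isdigit():
            continue
        # Match whole disks only (sda, hdb), not partitions (sda1).
        match = re.match(r"(sd|hd)[a-z]+$", fields[3])
        if match:
            families.add(match.group(1))
    return families


def suggest_disk_file(families):
    """Map the detected naming to a disk file name.

    Assumption: scsi.disk is written for /dev/sd* and ide.disk for /dev/hd*.
    Returns None when the naming is mixed or unknown.
    """
    if families == {"sd"}:
        return "scsi.disk"
    if families == {"hd"}:
        return "ide.disk"
    return None


if __name__ == "__main__":
    try:
        with open("/proc/partitions") as handle:
            found = disk_families(handle.read())
    except OSError:
        found = set()
    print(found, suggest_disk_file(found))
```

The result only reflects the kernel that is running when the check is made,
so it is a hint rather than a guarantee for the kernel used during the
install.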
2- In the same step we have to choose the IP assignment method. The default
value is static. Which one should I choose, static or dhcp? Which one is more
efficient? I think static is better; what do you think?
3- Should the post-install action be reboot, beep, or something else? The
installation manual says we shouldn't choose reboot if we want to use network
boot installation, so I don't know which one is better and least error-prone
for me.
4- I found something when I tried to collect the MAC address of the client
node (I have just 2 nodes connected to a switch as a pilot project, one as the
server and the other as a client): the wrong MAC address was found. I think it
is the switch's MAC address, so I have to enter the client's MAC address
manually. Do you think this can cause errors in the installation process?
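
Since I would be typing the address by hand, a small check like the one below
could catch typos before the address is entered. It is only a sketch: the
helper names and the sample addresses are made up, and the colon-separated
upper-case form is an assumption about what is accepted, not something taken
from the OSCAR manual.

```python
import re


def normalize_mac(text):
    """Return a MAC address as 'AA:BB:CC:DD:EE:FF' or raise ValueError."""
    # Drop the usual separators (colon, dash, dot, whitespace).
    hex_digits = re.sub(r"[:\-.\s]", "", text)
    if not re.match(r"[0-9A-Fa-f]{12}$", hex_digits):
        raise ValueError("not a 48-bit MAC address: %r" % text)
    hex_digits = hex_digits.upper()
    return ":".join(hex_digits[i:i + 2] for i in range(0, 12, 2))


def same_mac(first, second):
    """Compare two MAC addresses regardless of case and separators."""
    return normalize_mac(first) == normalize_mac(second)


if __name__ == "__main__":
    # Hypothetical values: one as printed by the client, one as collected.
    typed = "00-1a-2b-3c-4d-5e"
    collected = "00:1A:2B:3C:4D:5F"
    print(normalize_mac(typed))
    print("match" if same_mac(typed, collected) else "different addresses")
```

Comparing the address that was collected automatically with the one shown by
the client itself would also confirm whether the collected one really belongs
to another device.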
Best regards.
On Fri, Oct 31, 2008 at 1:57 AM, <[EMAIL PROTECTED]> wrote:
> Have you checked the headnode, to make sure that your firewall is not
> running?
>
> for a GUI to turn off the firewall: system-config-securitylevel
>
> On Tue, Oct 28, 2008 at 2:51 PM, ali nazemian <[EMAIL PROTECTED]> wrote:
>
>> Hi.
>> I want to set up a cluster for our HPC center using OSCAR, but I have a
>> problem with step 7, installing the cluster.
>> Here is my problem:
>> After I run step 7, after some time a "tftp time out" error appears on the
>> client node and the node terminates the boot agent, and "Received
>> disconnect from 192.168.0.2: 2: The connection is closed by SSH Server
>> Current FSM is SSH_Main_SSHProcess" appears on the server node.
>> Here is the complete log of step 7:
>> --------------------------------------------------------------------------
>> --> Update Wizard Env (as needed)
>> --> Step 7: Running: ./post_install
>> Gathering processor count from oscarnode1.clusternet.
>> ssh: connect to host oscarnode1.clusternet port 22: Connection timed out
>> Improper count (0) returned from machine oscarnode1.clusternet at
>> ./post_install line 83
>> main::get_numproc() called at ./post_install line 39
>> ssh: connect to host oscarnode1 port 22: Connection timed out
>> rsync: connection unexpectedly closed (0 bytes received so far) [sender]
>> rsync error: error in rsync protocol data stream (code 12) at io.c(359)
>> --> About to run /opt/oscar/packages/loghost/scripts/post_install for
>> loghost
>> ************************* oscar_cluster *************************
>> --------- oscarnode1---------
>> ssh: connect to host oscarnode1 port 22: Connection timed out
>> --> About to run /opt/oscar/packages/ganglia/scripts/post_install for
>> ganglia
>> [ganglia] Ganglia gmond configuration file modified, re-starting daemon...
>> Shutting down GANGLIA gmond: [ OK ]
>> Starting GANGLIA gmond: [ OK ]
>> editing /etc/gmetad.conf
>> match: gridname\s+.*
>> match: data_source\s+.*
>> [ganglia] Ganglia gmetad configuration file modified, re-starting
>> daemon...
>> Shutting down GANGLIA gmetad: [ OK ]
>> Starting GANGLIA gmetad: [ OK ]
>> [ganglia] Starting up apache...
>> Stopping httpd: [ OK ]
>> Starting httpd: [ OK ]
>> [ganglia] Ganglia page is located at http://server.clusternet/ganglia/
>> ************************* oscar_cluster *************************
>> --------- oscarnode1---------
>> Received disconnect from 192.168.0.2: 2: The connection is closed by SSH
>> Server
>> Current FSM is SSH_Main_SSHProcess
>> --> About to run /opt/oscar/packages/torque/scripts/post_install for
>> torque
>> Received disconnect from 192.168.0.2: 2: The connection is closed by SSH
>> Server
>> Current FSM is SSH_Main_SSHProcess
>> rsync: connection unexpectedly closed (0 bytes received so far) [sender]
>> rsync error: error in rsync protocol data stream (code 12) at io.c(359)
>> TORQUE mom config file updated with clienthost: server.clusternet
>> Pushing config file to clients...
>> Sending SIGHUP to all moms...
>> ************************* oscar_cluster *************************
>> --------- oscarnode1---------
>> Received disconnect from 192.168.0.2: 2: The connection is closed by SSH
>> Server
>> Current FSM is SSH_Main_SSHProcess
>> [torque] Updating pbs_server nodes
>> /opt/pbs/bin/pbsnodes: Server has no node list
>> Shutting down TORQUE Server: [ OK ]
>> Starting TORQUE Server: [ OK ]
>> [torque] Creating TORQUE workq queue...
>> Max open servers: 4
>> set queue workq resources_max.ncpus = 0
>> set queue workq resources_max.nodect = 0
>> set queue workq resources_available.nodect = 0
>> set server resources_available.ncpus = 0
>> set server resources_available.nodect = 0
>> set server resources_available.nodes = 0
>> set server resources_max.ncpus = 0
>> set server resources_max.nodes = 0
>> set server scheduler_iteration = 60
>> set server log_events = 64
>> Shutting down MAUI Scheduler: [ OK ]
>> Starting MAUI Scheduler: [ OK ]
>> --> About to run /opt/oscar/packages/switcher/scripts/post_install for
>> switcher
>> Setting default for tag mpi ("lam-7.1.2")
>> Attribute successfully set; new attribute setting will be effective for
>> future shells
>> Received disconnect from 192.168.0.2: 2: The connection is closed by SSH
>> Server
>> Current FSM is SSH_Main_SSHProcess
>> rsync: connection unexpectedly closed (0 bytes received so far) [sender]
>> rsync error: error in rsync protocol data stream (code 12) at io.c(359)
>> --> About to run /opt/oscar/packages/mta-config/scripts/post_install for
>> mta-config
>> ************************************ WARNING
>> ************************************
>> OSCAR could not set up the configuration for any mailing service on the
>> server.
>> The current version of the mta-config package in OSCAR only supports the
>> Postfix mail transfer agent (MTA).
>> It looks like you have another MTA installed (e.g, sendmail or exim); as
>> such,
>> please be aware that OSCAR will not automatically configure it.
>> ************************************ WARNING
>> ************************************
>> --> About to run /opt/oscar/packages/ntpconfig/scripts/post_install for
>> ntpconfig
>> Shutting down ntpd: [ OK ]
>> Starting ntpd: [ OK ]
>> ************************* oscar_cluster *************************
>> --------- oscarnode1---------
>> Received disconnect from 192.168.0.2: 2: The connection is closed by SSH
>> Server
>> Current FSM is SSH_Main_SSHProcess
>> --> About to run /opt/oscar/packages/opium/scripts/post_install for opium
>> Not all hosts were accessible by c3! Will retry the update later
>> Could not find template for file switcher.ini
>> If this contains distro-specific lines, please create a template!
>> image:
>> $VAR1 = 'oscarimage';
>> ---------------
>> Received disconnect from 192.168.0.2: 2: The connection is closed by SSH
>> Server
>> Current FSM is SSH_Main_SSHProcess
>> rsync: connection unexpectedly closed (0 bytes received so far) [sender]
>> rsync error: error in rsync protocol data stream (code 12) at io.c(359)
>> Could not find template for file gshadow
>> If this contains distro-specific lines, please create a template!
>> image:
>> $VAR1 = 'oscarimage';
>> ---------------
>> Received disconnect from 192.168.0.2: 2: The connection is closed by SSH
>> Server
>> Current FSM is SSH_Main_SSHProcess
>> rsync: connection unexpectedly closed (0 bytes received so far) [sender]
>> rsync error: error in rsync protocol data stream (code 12) at io.c(359)
>> image:
>> $VAR1 = 'oscarimage';
>> ---------------
>> Received disconnect from 192.168.0.2: 2: The connection is closed by SSH
>> Server
>> Current FSM is SSH_Main_SSHProcess
>> rsync: connection unexpectedly closed (0 bytes received so far) [sender]
>> rsync error: error in rsync protocol data stream (code 12) at io.c(359)
>> image:
>> $VAR1 = 'oscarimage';
>> ---------------
>> Received disconnect from 192.168.0.2: 2: The connection is closed by SSH
>> Server
>> Current FSM is SSH_Main_SSHProcess
>> rsync: connection unexpectedly closed (0 bytes received so far) [sender]
>> rsync error: error in rsync protocol data stream (code 12) at io.c(359)
>> image:
>> $VAR1 = 'oscarimage';
>> ---------------
>> Received disconnect from 192.168.0.2: 2: The connection is closed by SSH
>> Server
>> Current FSM is SSH_Main_SSHProcess
>> rsync: connection unexpectedly closed (0 bytes received so far) [sender]
>> rsync error: error in rsync protocol data stream (code 12) at io.c(359)
>> --> About to run /opt/oscar/packages/oda/scripts/post_install for oda
>> generating the /etc/odaserver file on all oscar clients
>> . /etc/profile.d/c3.sh && cexec 'echo oscar_server > /etc/odaserver'
>> ************************* oscar_cluster *************************
>> --------- oscarnode1---------
>> Received disconnect from 192.168.0.2: 2: The connection is closed by SSH
>> Server
>> Current FSM is SSH_Main_SSHProcess
>> Cluster setup complete!
>> --> Step 7: Successfully completed the cluster install
>> --> Update Wizard Env (as needed)
>>
>> -----------------------------------------------------------------------------------
>> P.S.: I am using OSCAR 5 on CentOS 4.7 x86_64. I can't use CentOS 5.x
>> because it has a problem with my graphics cards.
>> Best regards.
>>
>> --
>> A.Nazemian
>>
>> -------------------------------------------------------------------------
>> This SF.Net email is sponsored by the Moblin Your Move Developer's
>> challenge
>> Build the coolest Linux based applications with Moblin SDK & win great
>> prizes
>> Grand prize is a trip for two to an Open Source event anywhere in the
>> world
>> http://moblin-contest.org/redirect.php?banner_id=100&url=/
>> _______________________________________________
>> Oscar-users mailing list
>> Oscar-users@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/oscar-users
>>
>>
>
--
A.Nazemian