Hi Bernard:
1. I collected the MAC addresses of the clients and assigned them to the
client nodes when I defined the clients.
2. The client can boot from the network (PXE), so I enabled PXE boot on it.
3. When I rebooted the client, it connected to the server, but it could not
find DISK0.
The following messages were printed when I started the client with PXE:
Loading ide-scsi... Assuming ide-scsi is compiled into the kernel, not needed, or already loaded.
get_hostname_by_hosts_file
Host file exists...
Searching for this machine's hostname in /scripts/hosts by IP: 192.168.1.4
This hosts name is: client4
run_pre_install_scripts
>>> 99all.harmless_example_script
I live in /var/lib/systemimager/scripts/pre-install.
choose_autoinstall_script
Using autoinstall script: /scripts/client4.sh
write_variables
run_autoinstall_script
>>> /scripts/client4.sh
get_arch
DISKORDER=sd,hd,cciss,ida,rd
enumerate_disks
DISKS=0
Undefined: DISK0
Killing off running processes.
write_variables
Thanks,
YoungJun
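
One way to read the DISKORDER and DISKS=0 lines above: the installer looks for block devices whose names start with the listed prefixes, in that order, and a count of 0 suggests the install kernel saw no matching device at all. The sketch below illustrates that idea against /proc/partitions-style text. It is not the SystemImager code; the function name, the name patterns, and the sample table are assumptions made for illustration.

```python
import re

# Prefix order copied from the DISKORDER line in the boot log.
DISKORDER = "sd,hd,cciss,ida,rd"


def enumerate_disks(partitions_text, diskorder=DISKORDER):
    """Return whole-disk names found in /proc/partitions-style text,
    grouped in DISKORDER prefix order (hypothetical helper)."""
    names = []
    for line in partitions_text.splitlines():
        fields = line.split()
        # Data rows look like: major minor #blocks name
        if len(fields) == 4 and fields[0].isdigit():
            names.append(fields[3])

    disks = []
    for prefix in diskorder.split(","):
        if prefix in ("sd", "hd"):
            # sda, hdb: letters only, so partitions such as sda1 are skipped
            pattern = re.compile(r"^%s[a-z]+$" % prefix)
        else:
            # cciss/c0d0, ida/c0d0, rd/c0d0: skip partitions such as c0d0p1
            pattern = re.compile(r"^%s/c\d+d\d+$" % prefix)
        disks.extend(sorted(n for n in names if pattern.match(n)))
    return disks


if __name__ == "__main__":
    with open("/proc/partitions") as handle:
        found = enumerate_disks(handle.read())
    print("DISKS=%d" % len(found))
    for index, name in enumerate(found):
        print("DISK%d=/dev/%s" % (index, name))
```

If a check like this also reports no disks from a shell on the client, the disk controller driver is a likely place to look next.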
----- Original Message -----
Sent: Tuesday, April 05, 2005 10:20 AM
Subject: RE: [Oscar-users] Error message while testing Cluster Setup- RHEL WS version 3 (Taroon Update 3)
Hi YoungJun:
Please refer to the installation manual for further details, but generally
there are two ways. Both require you to collect the MAC addresses of the
corresponding nodes, choose an image to tie to the nodes, and then reboot the
machines so that they get imaged.
If the nodes support PXE boot, it's just a matter of hitting 'Setup Network
Boot' in the Setup Networking menu and rebooting the machine. If not, you can
make an autoinstall floppy and use that instead. I would highly recommend that
you read the relevant sections of the installation guide before continuing.
Good luck.
Cheers,
Bernard
Hi Bernard:
Thank you for your help. I think I have built
the image successfully. How do I deploy it to the client? Should I use PXE
boot?
Thanks,
YoungJun
----- Original Message -----
Sent: Tuesday, April 05, 2005 2:06 AM
Subject: RE: [Oscar-users] Error message while testing Cluster Setup- RHEL WS version 3 (Taroon Update 3)
Hi YoungJun:
You do not need to do any manual installation of the OS on your client node
(it is only necessary on the headnode). All you need to do is use OSCAR to
build the image, and then deploy it.
Cheers,
Bernard
From: YoungJun Kim [mailto:[EMAIL PROTECTED]
Sent: Tue 05/04/2005 1:39 AM
To: Bernard Li
Subject: Re: [Oscar-users] Error message while testing Cluster Setup- RHEL WS version 3 (Taroon Update 3)
On my client node, I reinstalled from scratch (WS version 3 (Update 3)) and
did not change anything except the /etc/ssh/sshd_config file.
Here is the output from Step 7 of the installation.
===============================================================================
Running step 7 of the OSCAR wizard: Complete cluster setup
=============================================================================
--> Step 7: Running: ./post_install
Gathering processor count from client4.vrlab.
[EMAIL PROTECTED]'s password:
Updating database for machine client4.vrlab.
[EMAIL PROTECTED]'s password:
building file list ... done
wrote 44 bytes read 20 bytes 14.22 bytes/sec
total size is 339 speedup is 5.30
--> About to run /opt/oscar/packages/oda/scripts/post_install for oda
generating the /etc/odaserver file on all oscar clients
. /etc/profile.d/c3.sh && cexec 'echo oscar_server > /etc/odaserver'
[EMAIL PROTECTED]'s password:
************************* oscar_cluster *************************
--------- client4---------
--> About to run /opt/oscar/packages/torque/scripts/post_install for torque
[EMAIL PROTECTED]'s password:
mkstemp /var/spool/pbs/mom_priv/.config.Q5UZbo failed: No such file or directory
rsync error: some files could not be transferred (code 23) at main.c(620)
[EMAIL PROTECTED]'s password:
PBS mom config file updated with clienthost: server.vrlab
Pushing config file to clients...
building file list ... done
config
wrote 151 bytes read 36 bytes 53.43 bytes/sec
total size is 96 speedup is 0.51
Sending SIGHUP to all moms...
************************* oscar_cluster *************************
--------- client4---------
pbs_mom: no process killed
Updating pbs_server nodes
set node client4.vrlab np = 1
Shutting down PBS Server: [ OK ]
Starting PBS Server: [ OK ]
Creating pbs workq queue...
Max open servers: 4
set queue workq resources_max.ncpus = 1
set queue workq resources_max.nodect = 1
set queue workq resources_available.nodect = 1
set server resources_available.ncpus = 1
set server resources_available.nodect = 1
set server resources_available.nodes = 1
set server resources_max.ncpus = 1
set server resources_max.nodes = 1
set server scheduler_iteration = 60
set server log_events = 64
Shutting down MAUI Scheduler: vr [ OK ]
Starting MAUI Scheduler: [ OK ]
--> About to run /opt/oscar/packages/switcher/scripts/post_install for switcher
Setting default for tag mpi ("lam-7.0.6")
Attribute successfully set; new attribute setting will be effective for future shells
[EMAIL PROTECTED]'s password:
building file list ... done
switcher.ini
mkstemp /opt/env-switcher/etc/.switcher.ini.vs8c6H failed: No such file or directory
wrote 237 bytes read 36 bytes 109.20 bytes/sec
total size is 188 speedup is 0.69
rsync error: some files could not be transferred (code 23) at main.c(620)
--> About to run /opt/oscar/packages/pfilter/scripts/post_install for pfilter
(re)starting the pfilter firewall service on this server
/etc/init.d/pfilter restart
Restarting pfilter:vr [ OK ]
pushing out the clients pfilter firewall configuration file
. /etc/profile.d/c3.sh && cpush /etc/pfilter.conf.clients /etc/pfilter.conf
[EMAIL PROTECTED]'s password:
Permission denied, please try again.
[EMAIL PROTECTED]'s password:
building file list ... done
wrote 59 bytes read 20 bytes 12.15 bytes/sec
total size is 855 speedup is 10.82
(re)starting the pfilter firewall service on the clients
. /etc/profile.d/c3.sh && cexec /etc/init.d/pfilter restart
[EMAIL PROTECTED]'s password:
************************* oscar_cluster *************************
--------- client4---------
bash: line 1: /etc/init.d/pfilter: No such file or directory
--> About to run /opt/oscar/packages/opium/scripts/post_install for opium
[EMAIL PROTECTED]'s password:
building file list ... done
switcher.ini
mkstemp /opt/env-switcher/etc/.switcher.ini.mOMpxL failed: No such file or directory
wrote 237 bytes read 36 bytes 78.00 bytes/sec
total size is 188 speedup is 0.69
rsync error: some files could not be transferred (code 23) at main.c(620)
[EMAIL PROTECTED]'s password:
building file list ... done
wrote 46 bytes read 20 bytes 26.40 bytes/sec
total size is 596 speedup is 9.03
[EMAIL PROTECTED]'s password:
building file list ... done
passwd
wrote 81 bytes read 54 bytes 38.57 bytes/sec
total size is 2056 speedup is 15.23
[EMAIL PROTECTED]'s password:
building file list ... done
group
wrote 80 bytes read 48 bytes 28.44 bytes/sec
total size is 720 speedup is 5.62
[EMAIL PROTECTED]'s password:
building file list ... done
shadow
wrote 81 bytes read 48 bytes 51.60 bytes/sec
total size is 1245 speedup is 9.65
--> About to run /opt/oscar/packages/ntpconfig/scripts/post_install for ntpconfig
[EMAIL PROTECTED]'s password:
************************* oscar_cluster *************************
--------- client4---------
ntpd: Removing firewall opening for 127.127.1.0 port 123iptables: Bad rule (does a matching rule exist in that chain?) [FAILED]
Shutting down ntpd: [FAILED]
ntpd: Opening firewall for input from 127.127.1.0 port 123[ OK ]
Starting ntpd: [ OK ]
--> About to run /opt/oscar/packages/loghost/scripts/post_install for loghost
[EMAIL PROTECTED]'s password:
************************* oscar_cluster *************************
--------- client4---------
oscar_loghost already set
--> About to run /opt/oscar/packages/ganglia/scripts/post_install for ganglia
[EMAIL PROTECTED]'s password:
building file list ... done
gmond.conf
wrote 85 bytes read 72 bytes 62.80 bytes/sec
total size is 3710 speedup is 23.63
Shutting down GANGLIA gmond: [ OK ]
Shutting down GANGLIA gmetad: [ OK ]
[EMAIL PROTECTED]'s password:
************************* oscar_cluster *************************
--------- client4---------
bash: line 1: /etc/init.d/gmond: No such file or directory
Starting GANGLIA gmond: [ OK ]
[EMAIL PROTECTED]'s password:
************************* oscar_cluster *************************
--------- client4---------
bash: line 1: /etc/init.d/gmond: No such file or directory
Starting GANGLIA gmetad: [ OK ]
--> About to run /opt/oscar/packages/disable-services/scripts/post_install for disable-services
POSTFIX is running
Postfix is succesfully configured. : SERVER NODE
Shutting down postfix: [FAILED]
Starting postfix: [ OK ]
- finished configuring postfix
Cluster setup complete!
--> Step 7: Successfully completed the cluster install
Thanks,
YoungJun
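
Although Step 7 ends with "Successfully completed", several packages reported problems along the way (rsync code 23, "No such file or directory", "Permission denied", [FAILED]). A rough sketch for pulling those lines out and grouping them by package is below. It assumes the output has been saved with one message per line; the marker list and the function name are illustrative choices and not part of OSCAR.

```python
import re

# Substrings that mark a problem line; all of them appear in the Step 7 output.
FAILURE_MARKERS = (
    "No such file or directory",
    "rsync error",
    "Permission denied",
    "[FAILED]",
)

# Matches lines such as:
# --> About to run /opt/oscar/packages/torque/scripts/post_install for torque
PACKAGE_RE = re.compile(r"About to run \S+/post_install for (\S+)")


def failures_by_package(log_text):
    """Group problem lines under the package whose post_install printed them."""
    current = "(before first package)"
    result = {}
    for line in log_text.splitlines():
        match = PACKAGE_RE.search(line)
        if match:
            current = match.group(1)
            continue
        if any(marker in line for marker in FAILURE_MARKERS):
            result.setdefault(current, []).append(line.strip())
    return result


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python scan_step7.py step7.log
    with open(sys.argv[1]) as handle:
        for package, lines in failures_by_package(handle.read()).items():
            print(package)
            for text in lines:
                print("    " + text)
```

Run against the output above, this would list entries under torque, switcher, pfilter, opium, ntpconfig, ganglia and disable-services, which makes it easier to see that the client is missing files the post_install scripts expect.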
----- Original Message -----
Sent: Tuesday, April 05, 2005 1:08 AM
Subject: RE: [Oscar-users] Error message while testing Cluster Setup- RHEL WS version 3 (Taroon Update 3)
Hi YoungJun:
Is /home mounted on all your compute nodes? (It should be mounted off your
headnode.)
Also, have you done the step 'Complete Cluster Install'?
Cheers,
Bernard
From: [EMAIL PROTECTED] on behalf of YoungJun Kim
Sent: Tue 05/04/2005 1:01 AM
To: [email protected]
Subject: [Oscar-users] Error message while testing Cluster Setup- RHEL WS version 3 (Taroon Update 3)
Hi all,
I tried to test cluster setup and I have
the following errors.
Preparing user tests...
Performing user tests...
SSH ping test [PASSED]
SSH server->node
[EMAIL PROTECTED]. vrlab's password:
[EMAIL PROTECTED]'s password:
[EMAIL PROTECTED]'s password:
SSH server->node [FAILED]
SSH node->server
[EMAIL PROTECTED]. vrlab's password:
[EMAIL PROTECTED]'s password:
[EMAIL PROTECTED]'s password:
SSH node->server [FAILED]
PBS default queue definition [PASSED]
Checking for 1 free nodes: [FAILED]
Not enough free nodes. Tests incomplete.
Checking for 1 free nodes: [FAILED]
Not enough free nodes. Tests incomplete.
Checking for 1 free nodes: [FAILED]
Not enough free nodes. Tests incomplete.
Checking for 1 free nodes: [FAILED]
Not enough free nodes. Tests incomplete.
Ganglia test [FAILED]
There were issues running some user test scripts. Please check your logs
...Hit <ENTER> key to exit...
Any ideas?
Thank you,
YoungJun
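
The SSH tests above stop at password prompts, which suggests key-based login between the server and the node is not working. A quick way to check that outside the test suite is to run ssh in batch mode, where it fails instead of prompting. The sketch below assumes the host name client4.vrlab from the output and a root login (the real address is redacted in this thread); the function names are made up for illustration.

```python
import subprocess


def ssh_batch_command(host, user="root"):
    """Build an ssh command that fails instead of asking for a password."""
    return [
        "ssh",
        "-o", "BatchMode=yes",      # never prompt; exit non-zero if a password is needed
        "-o", "ConnectTimeout=5",   # give up quickly if the host is unreachable
        "%s@%s" % (user, host),
        "true",                     # harmless remote command
    ]


def passwordless_ssh_ok(host, user="root", runner=subprocess.run):
    """Return True only if the login succeeds without any prompt."""
    try:
        proc = runner(
            ssh_batch_command(host, user),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        # ssh binary missing or not executable
        return False
    return proc.returncode == 0


if __name__ == "__main__":
    # Assumption: node name taken from the test output; adjust as needed.
    for node in ["client4.vrlab"]:
        status = "ok" if passwordless_ssh_ok(node) else "needs a password or is unreachable"
        print("%s: %s" % (node, status))
```

The same check can be run from the node back to the server to cover the node->server test.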