Howell,
See my comments inline below.
Frank
On Fri, 2003-11-14 at 11:09, Howell Silverman wrote:
> Hello,
>
> I hope someone can help.
>
> Environment is 8 Systems total, Dual Xeon, the master and the 7 nodes
> each have 4GB of memory, Master has 120GB system drive, 40GB drives in
> each node.
>
> This is a standard load from RH 9 no changes.
>
> Problem Description
> When I choose vmlinux-2.4.20-6bigmem for
> the oscarimage:
>
> At step 2, "Configure Selected OSCAR Packages", there are
> 3 things to configure:
> "Environment Switcher" - pick mpich
> "ntpconfig" - pick default
> "kernel_picker" - pick /boot/vmlinux-2.4.20-6bigmem
>
> In the "kernel_picker", if I choose
> /boot/vmlinux-2.4.20-6bigmem, there is a further option
> whether or not to use loadable kernel modules.
>
> 1) choose "use loadable kernel modules"
>
> I can successfully build everything, and network boot
> the slave node. After the network boot, I changed back
> the boot order to the hard disk. Then the slave node cannot boot from
> the hard disk. Will check to see if there are any messages displayed.
Have a look at /etc/lilo.conf (if you can get onto one of the nodes). I
think you will see that the default image name in /etc/lilo.conf doesn't
match the label of any image. If you can watch the screen during an
install on the client, you will find that lilo fails to install the boot
block because of this.
A quick fix for this is to edit
/var/lib/systemimager/images/barossa/etc/systemconfig/systemconfig.conf
on your head node and make sure that the "DEFAULTBOOT" matches a "LABEL"
statement. After that, reinstall your clients.
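
The check described above can be sketched in Python. This is only a
minimal sketch: it assumes the file consists of simple "KEY = value"
lines, and the sample text and label names are made up for illustration,
not copied from a real systemconfig.conf.

```python
import re

# Hypothetical sample; the labels here are invented for illustration.
SAMPLE = """
[BOOT]
  DEFAULTBOOT = 2.4.20-6bigmem
[KERNEL0]
  PATH = /boot/vmlinux-2.4.20-6bigmem
  LABEL = linux
"""


def check_default_boot(conf_text):
    """Return (default, labels, ok) for systemconfig.conf-style text.

    Assumes plain 'KEY = value' lines; the real file format may differ.
    """
    default = None
    labels = []
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        m = re.match(r"^(\w+)\s*=\s*(.+)$", line)
        if not m:
            continue  # section headers and blank lines land here
        key, value = m.group(1).upper(), m.group(2).strip()
        if key == "DEFAULTBOOT":
            default = value
        elif key == "LABEL":
            labels.append(value)
    return default, labels, default in labels


if __name__ == "__main__":
    default, labels, ok = check_default_boot(SAMPLE)
    print("DEFAULTBOOT:", default)
    print("LABELs:", labels)
    print("match" if ok else "MISMATCH: lilo would likely fail")
```

To use it on the head node, read the systemconfig.conf file for your
image and pass its contents to check_default_boot().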
> 2) choose "not to use loadable kernel modules"
> Everything works until step 5. When I tried to "Add
> Clients" to oscar, the install_cluster hangs at:
>
> /opt/kernel_picker/bin/kernel_picker --bootkernel
> /boot/vmlinux-2.4.20-6bigmem --bootramdisk N
> --networkboot N --kernelversion --modulespath
Do you have multiple images available on your system? If so, you will
probably find that there is a non-graphical question sitting and waiting
for an answer (i.e. pick which image).
Terry Fleury fixed a number of these problems with kernel_picker.pl and
posted fixes both to CVS and on NCSA's OSCAR page. For full details,
check the mailing list archives for early September.
>
> Below are 3 problems we encountered on the master node.
>
> 1) Something strange happened on the master node this
> afternoon.
> Initially, the "df" command gives 76% used on hda3.
> And we wanted to find out which directory is so big.
> Surprisingly, "df -ms /" gives 6gig as used space, and
> the recycle bin is empty.
>
> After we went to the single user mode, and using "df",
> the percent used decreased steadily from 76% to 6%. And
> 6% is consistent with 6GB.
>
> We wonder what's going on?
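A big file was probably deleted while an active process still held it
open, so its space was not released until that process exited (for
example, when you dropped to single user mode). However, that is a guess.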
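
If it happens again, one way to test that guess is to look for open file
descriptors whose target has been deleted. The sketch below is a minimal
Python version, assuming a Linux /proc where such links end in
" (deleted)"; the function name is my own, and it needs to run as root
to see every process.

```python
import os


def deleted_open_files(proc_root="/proc"):
    """List (pid, target) pairs for descriptors whose file was deleted.

    Linux marks such descriptors by appending ' (deleted)' to the link
    target under /proc/<pid>/fd. Unreadable entries are skipped.
    """
    found = []
    for pid in os.listdir(proc_root):
        if not pid.isdigit():
            continue  # not a process directory
        fd_dir = os.path.join(proc_root, pid, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited or permission denied
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor closed while we were looking
            if target.endswith(" (deleted)"):
                found.append((int(pid), target))
    return found


if __name__ == "__main__":
    for pid, target in deleted_open_files():
        print(pid, target)
```

Any process listed against a large log or scratch file is a candidate;
restarting it should release the space.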
> 2) The log files in /var/log, like messages, secure, often
> grow to several GB. There are lots of white spaces in the
> log files. Is there a way to get rid of them?
They should be rotated and compressed regularly, which makes them much
less of a problem. The actual messages can't be changed, as they are
whatever the various daemons on many of the nodes put out.
> 3) Although I've commented out everything in /etc/crontab,
> sometimes the master node is writing to the hard disk,
> and the available blocks on the hard disk decrease rather
> fast, writing about 664000K every 10 minutes. Is there
> any way we can find which process is writing to the hard disk?
Use "/sbin/fuser -mu /" to show what is active on the partition.
After that, look at each process.
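
A complementary approach, which is my own suggestion and not something
fuser does, is to find out which files are growing and then work back to
the process that owns them. The sketch below is a minimal Python version
under that assumption: it records file sizes on one filesystem twice and
reports what grew. The function names are made up for this example.

```python
import os
import stat


def snapshot_sizes(root):
    """Map each regular file under root (same filesystem only) to its size."""
    root_dev = os.lstat(root).st_dev
    sizes = {}
    for dirpath, dirnames, filenames in os.walk(root):
        keep = []
        for d in dirnames:
            try:
                # Stay on one filesystem, so /proc and other mounts are skipped.
                if os.lstat(os.path.join(dirpath, d)).st_dev == root_dev:
                    keep.append(d)
            except OSError:
                pass  # directory vanished or is unreadable
        dirnames[:] = keep  # prune the walk in place
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue
            if stat.S_ISREG(st.st_mode):
                sizes[path] = st.st_size
    return sizes


def growth(before, after):
    """Return (bytes_grown, path) pairs, largest growth first."""
    grown = []
    for path, size in after.items():
        delta = size - before.get(path, 0)
        if delta > 0:
            grown.append((delta, path))
    return sorted(grown, reverse=True)
```

Take one snapshot of "/", wait ten minutes, take another, and print the
first few entries of growth(before, after). Running
"/sbin/fuser -u <file>" on the top entries should then name the process.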
> Is this normal?
--
ac3
Suite G16, Bay 7, Locomotive Workshop Phone: 02 9209 4600
Australian Technology Park Fax: 02 9209 4611
Eveleigh NSW 1430
_______________________________________________
Oscar-users mailing list
https://lists.sourceforge.net/lists/listinfo/oscar-users