At 01:57 PM 7/10/2002 -0700, Richard Ferri wrote:
Mike,
        Great work, and most excellent suggestions...
although you clearly don't have enough to do at your
day job -- wouldn't you love to become a 'highly paid'
OSCAR developer in your spare time?

I have individual comments below... Rich

Richard Ferri
IBM Linux Technology Center
[EMAIL PROTECTED]
845.433.7920

--- Mike Mettke <[EMAIL PROTECTED]> wrote:
>
> All,
>
> I successfully installed RH7.3 and 1.3b3, and it has
> been working for a couple of days now. All cluster
> tests pass successfully and all of my own LAM/MPI
> programs run.
>
> I took notes wrt what I did during the install and I
> hope they might be
> useful for others:
>
> hardware:
> master: P3 800MHz 256MB, Intel + 3com NIC
>
> 3 nodes: dual P3 800MHz EB 256 MB RAM on ASUS
> CUR-DLS motherboard,
> ServerWorks LE 3.0 chipset, dual onboard Intel 82559
> NIC
>
> 1 node: dual P3 800MHz EB 256 MB RAM on Tyan
> Tiger 200 motherboard,
> VIA Apollo Pro 133A chipset, onboard Intel 82559 NIC
>
> All BIOSes on all nodes were flashed to the latest
> version.
>
> 1. Master configuration:
>     Full (everything) RH7.3 install + all updates
> from Redhat.
>
> 2. cp all rpms from cdrom into /tftpboot/rpm
>
> 3. cp all update rpms into /tftpboot/rpm
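(Spelled out, steps 2 and 3 are roughly the following; the mount point and the
directory holding the downloaded errata are examples, so adjust to wherever
your media and updates actually live:)

    mount /mnt/cdrom                             # repeat for each RH7.3 CD
    cp /mnt/cdrom/RedHat/RPMS/*.rpm /tftpboot/rpm/
    umount /mnt/cdrom
    cp /path/to/updates/*.rpm /tftpboot/rpm/     # the downloaded RH7.3 errata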
>
> 4. rm /tftpboot/rpm/c3-2.71-2*.rpm
>     since 1.3b3 comes with its own version of c3
>
> 5. rpm -e tftp
>     rpm -e tftp-server
>     since 1.3b3 brings its own tftp packages
>
I'm curious... does the installed tftp-server work for OSCAR? If so, should we detect that and not stomp it?

> 6. start install_cluster eth0
>
> 7. put the full path to rsync (/usr/bin/rsync) into
> /etc/init.d/systemimager
If I read that problem correctly, it needs a bug registered. Can someone confirm?
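(Until that's sorted out, one way to make the edit, assuming the init script
invokes a bare "rsync" -- inspect the script first, since the exact invocation
may differ:)

    cp /etc/init.d/systemimager /etc/init.d/systemimager.orig
    perl -pi -e 's{(?<![/\w])rsync\b}{/usr/bin/rsync}g' /etc/init.d/systemimager
    grep rsync /etc/init.d/systemimager     # sanity-check the result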

>
> 8. restart install_cluster eth0
>
> 9. follow the installation wizard
>
> 10. I used my own rpm list, which I created from
> scratch. I added hdparm, gcc, lm-sensors, and the
> ntp packages and their dependencies beyond what was
> strictly needed. See the attached list.
>
Hopefully, once we have package selection capability, this will all become much nicer. I see us using a default.rpmlist that is thinned out as much as possible. Then we'd have some super basic packages that are nothing more than rpmlists which users could choose to supplement their installs with. For example, some packages might be:
* Compilers (gcc,g77)
* X
* Gnome
* KDE
etc.
It's an easy way to do different types of OS installs once we support package selection...
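(To make that concrete, a "Compilers" supplement could be as small as a file
like compilers.rpmlist -- the name is made up -- listing one package per line,
chosen alongside or appended to default.rpmlist:)

    cpp
    glibc-devel
    gcc
    g77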

>
> Some suggestions/wishlist:
> 0. Start dhcpd on the master only on the private
> interface (edit /etc/sysconfig/dhcpd accordingly).
> This prevents the dhcpd on the master node from
> responding to dhcp requests on the company network.
dhcp should currently only respond to collected MACs. It does, however, send out dhcp broadcasts even though it's not giving out addresses. I'm not familiar with the method to keep it quiet on a given interface.
However... there is another measure we should be taking. If the cluster is installed such that DHCP is not required, then the wizard should kill DHCP when it terminates. Likewise with TFTP, but I have a bug for that already (529709). This would prevent nodes from building without the wizard running, but that costs very little since the "Complete Cluster Setup" step must be run after a node is built before it is useful anyway.
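(For the interface restriction Mike suggests: the Red Hat dhcpd init script
reads /etc/sysconfig/dhcpd, so limiting the daemon to the cluster-side NIC
should be roughly the following -- eth0 matches Mike's setup, substitute
whatever your private interface is:)

    # /etc/sysconfig/dhcpd
    DHCPDARGS=eth0

    service dhcpd restart

(and turning the services off outside of install windows is just
"service dhcpd stop ; chkconfig dhcpd off", and likewise for tftp.)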

>
> 1. centralized logging at the head node, at least
> for critical stuff.

We opened a bug for this, planned for the 1.4 release
of oscar -- agreed that it's essential for debugging.
This is a good idea and trivial to do... I've got it ready to go once 1.3 is released.
It's actually in the Feature tracker... # 577222.
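(A minimal sketch of the syslog side, using the stock sysklogd -- "headnode"
stands in for whatever the master's hostname really is:)

    # on each compute node, append to /etc/syslog.conf:
    *.crit                          @headnode

    # on the master, let syslogd accept remote messages (/etc/sysconfig/syslog):
    SYSLOGD_OPTIONS="-r -m 0"

    # then restart syslog everywhere:
    service syslog restart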

>
> 2. ntp installation. My nodes have a drift of ~100
> ppm/day without it! Multicast or broadcast mode
> should be sufficient.
>

NTP support is also on the list for 1.4 (505572).
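(A rough sketch of the broadcast setup, with example addresses for a
192.168.0.0/24 private network -- adjust to the real subnet and to whatever
upstream time source is available:)

    # head node /etc/ntp.conf (in addition to an upstream "server" line):
    broadcast 192.168.0.255

    # compute node /etc/ntp.conf:
    broadcastclient
    disable auth            # or configure NTP keys instead

    # then on every machine:
    chkconfig ntpd on ; service ntpd restart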

> 3. /usr/spool/PBS/server_priv/nodes does not get
> correctly updated with
> the cpu count when install_cluster is restarted.
> This prevents the tests
> from successfully executing in the case of SMP
> nodes. I had to fix this
> file manually.

This sounds like a bug that should go into the bug
tracking system; for 1.3, it's a serious bug.
There already is a bug registered for this. It's not serious enough to stop 1.3, since restarting the wizard doesn't cause it. It is caused by the "Complete Cluster Setup" step being run before all the nodes are present: the PBS proc count defaults to 1 because it hasn't been collected yet, and it is not subsequently updated. The details are in bugs 579340 and 579350.
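(For anyone bitten by this before the fix: the manual repair is just making
each SMP node's entry in /usr/spool/PBS/server_priv/nodes carry its processor
count -- the node names below are examples:

    node1 np=2
    node2 np=2
    node3 np=2

-- and then restarting pbs_server so it rereads the file.)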

>
> 4. ScaLAPACK installation. Nice for benchmarking.

You're the second person to suggest ScaLAPACK today;
if we don't already have it in the tracking system
I'll add it.

>
> 5. Maybe: lm-sensors installation and configuration.
>    xpbsmon (or gmond?) could then display the
> temperature and fan speed of the individual nodes.
> For large, unsupervised clusters this could be
> critical to prevent costly mishaps when the air
> conditioning fails.

someone from ganglia land, please respond?
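(In the meantime, something crude would at least surface the readings from the
head node -- this assumes lm_sensors is already configured on the nodes and
that C3 is set up, which OSCAR does for you:)

    cexec 'sensors | egrep -i "temp|fan"'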

>
> 6. direct PXE netbooting all the time. The kernel
> could then either
> install the node, or if the node already has the
> correct image, use
> pivot_root to simply use the existing image. Doing
> this means:
> a) potential for greater fault-tolerance in case the
> local hd gets
> corrupted.
> b) less sysadmin overhead since the master and node
> setup stays
> constant. No bios parameter switching, no messing
> with dhcpd.conf on the
> head node.
> This idea bears fruit especially for larger
> clusters.
>

Interesting.  We used this approach (always attempting
to network boot) on the RS/6000 SP.  Agreed, the
advantages are less manipulation of the device boot
list, and centralized management of the nodes -- the
master tells the slave what mode to boot into.  The
problem I saw was that some nodes never recover from a
failed network boot.  They are supposed to percolate
to the next item on the bootlist, but some just
happily start net boot all over again -- I think these
nodes are in the minority, and I'd like to say it's
actually a deficiency on the part of the firmware.

Other than these 'wacky' nodes, I agree that
centralized management of booting is more scalable...
The method Mike mentions is what Xcat and a few other distributions use. Unfortunately, it does depend on PXE being there. Since OSCAR has to address the universal case, this is just too specific to depend on. The "always netboot" method also makes it nice and easy to rebuild nodes on a whim, although SIS has similar features anyway.
Moreover, there are security implications to leaving TFTP and DHCP on all the time (though, depending on how the user installed the cluster, that is sometimes necessary).
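(For the curious, the mechanism Mike describes usually ends up as a pxelinux
config that the master rewrites per node -- purely illustrative, nothing in
OSCAR/SIS generates this today:)

    DEFAULT local
    PROMPT 0

    # node already has a good image: boot the local disk
    LABEL local
        LOCALBOOT 0

    # flip DEFAULT to "install" on the master to rebuild this node
    LABEL install
        KERNEL kernel
        APPEND initrd=initrd.img root=/dev/ram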

> 7. Option of using IP ranges for dhcpd.conf instead
> of MAC addresses.
> This avoids the whole MAC address snooping process
> during installation,
> but makes nodes unidentifiable (unless you ssh to
> the node and issue a
> "beep" command). In my opinion, I'd be happy not
> knowing which node is
> which, as long as I can positively identify any node
> when the need (hw
> problems) arises. But this is only my opinion ...
>
>
This is cool.  We opted to be able to identify the
node over ease of MAC address collection, but you may
be aware that Scyld, for example, takes the opposite
approach -- they collect all the MACs easily, but you
don't know which node is which. The debate here rages
on...
I don't think it even needs to be a debate... Mike mentioned that maybe there should be the "Option" of dhcp'ing out the images regardless of MAC.... I would personally call that risky, but if someone were to code the option... it might appeal to some. The MAC collection can be much improved in other ways too...
* Import MAC list from file
* Assign ALL collected MACs in sequence (I installed a 64-node cluster, and manually clicking through the Tk GUI to assign each MAC to each node was very painful)
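(For reference, the range-based variant is just the stock dhcpd.conf form --
addresses below are examples for a 192.168.0.0/24 private network -- as opposed
to the per-MAC host entries OSCAR writes today:)

    subnet 192.168.0.0 netmask 255.255.255.0 {
        range 192.168.0.100 192.168.0.200;
        option routers 192.168.0.1;
        next-server 192.168.0.1;
        filename "pxelinux.0";
    }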

Thx for all the info Mike!

Jeremy

>
> Comments and suggestions are always appreciated.
>
> regards,
> Mike
>
> Wireless Advanced Technology Lab
> Bell Labs - Lucent Technologies
>
>
>
>
>
>
> filesystem
> glibc-common
> glib
> chkconfig
> popt
> pcre
> bzip2-libs
> mingetty
> termcap
> bash
> bzip2
> openssl
> cracklib
> perl
> libappconfig-perl
> which
> cracklib-dicts
> systemimager-client
> info
> gawk
> ed
> grep
> procps
> pam
> modutils
> cyrus-sasl-md5
> libuser
> tcsh
> shadow-utils
> openpbs-oscar
> openpbs-oscar-mom
> bdflush
> modules
> env-switcher
> lam-module
> sysklogd
> initscripts
> dev
> gzip
> losetup
> mkinitrd
> kernel-smp
> openssh
> openssh-clients
> lm_sensors
> nfs-utils
> ntp
> binutils
> kernel-headers
> gcc
> setup
> basesystem
> glibc
> gdbm
> zlib
> e2fsprogs
> rsync
> mktemp
> net-tools
> libtermcap
> ganglia-monitor-core
> textutils
> glib2
> db3
> c3-ckillnode
> systemimager-common
> words
> ncurses
> systemconfigurator
> iputils
> make
> sed
> diffutils
> fileutils
> lam
> sh-utils
> cyrus-sasl
> openldap
> mount
> SysVinit
> rpm
> mpich-oscar
> pvm
> tcl
> pvm-modules
> mpich-oscar-module
> iproute
> util-linux
> usermode
> less
> tar
> findutils
> lilo
> hdparm
> openssh-server
> pump
> portmap
> libcap
> pfilter
> cpp
> glibc-devel
> psmisc
>




_______________________________________________
Oscar-users mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/oscar-users

