Hi Fellow Oscar Users,

Does anyone out there have a cluster based on Oscar that is using blcr
(Berkeley Lab Checkpoint/Restart -
https://ftg.lbl.gov/CheckpointRestart/CheckpointRestart.shtml)? I am
particularly interested in how you build it, install it, and handle kernel
updates on your nodes.

-- On a side note, I have manually built torque (version 2.5.2) to support
blcr.

I have a small (60 node) Scientific Linux cluster. I am in the process of
building a test headnode and 4 nodes so that we can recompile program codes
etc. with the newer versions of OpenMPI, Scientific Linux etc., ready for my
next cluster upgrade. I have built blcr into an rpm on the first node I
imaged and now use the script below to install it on the other nodes.

----

#!/bin/bash

######################################
# Cooper Lees - c...@ansto.gov.au
# Script to install blcr ...
# Last Updated: 20100809
######################################

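# Abort with a message if the previous command failed.
# Must be called immediately after the command being checked.
function errorCheck() {
        if [ $? -ne 0 ]; then
                echo "ERROR: $*"
                exit 69
        fi
}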

KERNEL="2.6.18-194.8.1.el5"
RPM_DIR=/data1/dist/blcr/current

yum install -y kernel-devel kernel-headers glibc-devel
errorCheck "Unable to install blcr deps ..."

# Install for a node
if [ "$1" != "-u" ]; then
        rpm -ivh ${RPM_DIR}/blcr*.rpm
else
        rpm -Uv ${RPM_DIR}/blcr*.rpm
fi
errorCheck "Problem with rpm install of blcr ..."

# Load Modules
/sbin/insmod /lib/modules/${KERNEL}/extra/blcr_imports.ko
/sbin/insmod /lib/modules/${KERNEL}/extra/blcr.ko

/etc/init.d/blcr start
errorCheck "Problem with starting blcr ..."

----
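
The script above hard-codes KERNEL, so it has to be edited on every kernel
upgrade. One possible refinement is to check that the blcr modules exist for
whichever kernel a node is actually running before trying to load them. This
is a rough sketch only. It assumes the rpm installs blcr_imports.ko and
blcr.ko under /lib/modules/<kernel>/extra, as the insmod lines above suggest.
The function name blcr_modules_present and its optional module-root argument
are made up for illustration.

```shell
#!/bin/bash

# Hypothetical helper: succeed only if both blcr modules exist for a kernel.
# $1 = kernel version (e.g. the output of `uname -r`)
# $2 = module root, defaults to /lib/modules (overridable for testing)
blcr_modules_present() {
        local kernel="$1"
        local root="${2:-/lib/modules}"
        local mod
        for mod in blcr_imports.ko blcr.ko; do
                if [ ! -f "${root}/${kernel}/extra/${mod}" ]; then
                        echo "MISSING: ${root}/${kernel}/extra/${mod}" >&2
                        return 1
                fi
        done
        return 0
}
```

Calling it as blcr_modules_present "$(uname -r)" just before the insmod lines
would flag a node that has booted a kernel the rpm was not built for, rather
than failing later at module load time.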

I had to modify grub on each node to boot kernel '0' by default so that it
was easier to get the matching 'kernel-devel' package through yum (as it
pulls down the version for the current / latest installed kernel). I plan to
rebuild the blcr rpm each time I do a kernel upgrade on the nodes. Has anyone
got a smarter system or any other ideas?
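
For reference, the grub change described above could be scripted along these
lines. This is a sketch only, assuming GRUB legacy with a single "default=N"
line in the config file. The function name set_grub_default is hypothetical,
and the default path /boot/grub/grub.conf is an assumption about the nodes.

```shell
#!/bin/bash

# Hypothetical helper: set the default boot entry in a GRUB legacy config.
# $1 = entry index (0 = first "title" stanza in the file)
# $2 = config file, assumed to be /boot/grub/grub.conf if not given
set_grub_default() {
        local index="$1"
        local conf="${2:-/boot/grub/grub.conf}"
        # Refuse anything that is not a plain non-negative integer
        case "${index}" in
                ''|*[!0-9]*) echo "ERROR: bad index '${index}'" >&2; return 1 ;;
        esac
        # Only rewrite if a default= line is already there
        grep -q '^default=' "${conf}" || return 1
        # Keep a backup copy, then regenerate the config from the backup
        cp -p "${conf}" "${conf}.bak" || return 1
        sed "s/^default=.*/default=${index}/" "${conf}.bak" > "${conf}"
}
```

Running set_grub_default 0 on a node leaves the previous config in
grub.conf.bak, so the change is easy to undo if the new kernel does not boot.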

Thanks,
--
Cooper Ry Lees
HPC / UNIX Systems Administrator - Information Management Services (IMS)
Australian Nuclear Science and Technology Organisation
T  +61 2 9717 3853
F  +61 2 9717 9273
M  +61 403 739 446
E  cooper.l...@ansto.gov.au
www.ansto.gov.au <http://www.ansto.gov.au>

_______________________________________________
Oscar-users mailing list
Oscar-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/oscar-users
