Hi Fellow Oscar Users,

Just an email to ask if anyone out there has a cluster based on Oscar and is using BLCR (Berkeley Lab Checkpoint/Restart - https://ftg.lbl.gov/CheckpointRestart/CheckpointRestart.shtml). I am particularly interested in how you build it, how you install it, and how you handle it when your nodes are due for a kernel update.
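To make the kernel update question concrete: the case I want to catch is a node running a kernel for which no blcr modules have been built. Below is a minimal sketch of such a check, not something I run in production. It assumes the modules are named blcr_imports.ko and blcr.ko and sit under /lib/modules/<kernel>/extra, as they do on my nodes; the function name have_blcr_modules and its optional module-root argument are hypothetical and only there for illustration.

```bash
#!/bin/bash
# Sketch: report whether blcr kernel modules exist for a given kernel.
# Assumption: modules are installed as <root>/<kernel>/extra/blcr*.ko

# have_blcr_modules KERNEL [MODULE_ROOT]   (hypothetical helper)
# Returns 0 if both modules are present, 1 if either is missing.
have_blcr_modules() {
    local kernel="$1"
    local root="${2:-/lib/modules}"   # overridable so it can be tested
    local mod
    for mod in blcr_imports.ko blcr.ko; do
        if [ ! -f "${root}/${kernel}/extra/${mod}" ]; then
            echo "missing: ${root}/${kernel}/extra/${mod}" >&2
            return 1
        fi
    done
    return 0
}
```

On a node this could be called as

    have_blcr_modules "$(uname -r)" || echo "blcr needs rebuilding for $(uname -r)"

so that a kernel update without a matching blcr rebuild is flagged rather than silently leaving the node without checkpoint support.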
On a side note, I have manually built torque to support blcr (version 2.5.2).

I have a small (60 node) Scientific Linux cluster. I am in the process of building a test headnode and 4 nodes so that we can recompile program codes etc. with the newer versions of OpenMPI, Scientific Linux etc., ready for my next cluster upgrade. I have built blcr into an rpm on the first node I imaged and now use the following script (below) to install it on the other nodes.

----
#!/bin/bash
######################################
# Cooper Lees - c...@ansto.gov.au
# Script to install blcr ...
# Last Updated: 20100809
######################################

# Abort with a message if the previous command failed.
# Must be called immediately after the command being checked.
function errorCheck() {
    if [ $? -ne 0 ]; then
        echo "ERROR: $*"
        exit 69
    fi
}

# Kernel the blcr rpm was built against
KERNEL="2.6.18-194.8.1.el5"
RPM_DIR=/data1/dist/blcr/current

yum install -y kernel-devel kernel-headers glibc-devel
errorCheck "Unable to install blcr deps ..."

# Install for a node (pass -u to upgrade an existing install)
if [ "$1" != "-u" ]; then
    rpm -ivh ${RPM_DIR}/blcr*.rpm
else
    rpm -Uv ${RPM_DIR}/blcr*.rpm
fi
errorCheck "Problem with rpm install of blcr ..."

# Load Modules
/sbin/insmod /lib/modules/${KERNEL}/extra/blcr_imports.ko
/sbin/insmod /lib/modules/${KERNEL}/extra/blcr.ko

/etc/init.d/blcr start
errorCheck "Problem with starting blcr ..."
----

I had to modify grub on each node to boot kernel 0 by default so that it was easier to get the matching kernel-devel package through yum (as yum pulls down the version for the current / latest installed kernel).

I plan to rebuild the blcr rpm each time I do a kernel upgrade on the nodes.

Has anyone got a smarter system or any other ideas?

Thanks,

--
Cooper Ry Lees
HPC / UNIX Systems Administrator - Information Management Services (IMS)
Australian Nuclear Science and Technology Organisation
T +61 2 9717 3853
F +61 2 9717 9273
M +61 403 739 446
E cooper.l...@ansto.gov.au
www.ansto.gov.au
_______________________________________________
Oscar-users mailing list
Oscar-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/oscar-users