On Mon, 16 Jul 2012, Mark Dixon wrote:
...
* Transparent (from the job script's perspective) serial BLCR
  integration

Could you post the recipe/code?  DMTCP is facing the knife for exactly
that, but C++ encourages displacement activities.

Sure, I'll dig it out (to follow under new thread)
...

Hi Dave,

OK, same thread but amended subject line. As requested, here's the code I used to integrate BLCR transparently with Gridengine (sorry for the delay).

I've not touched it since 2008, when I was only starting to rediscover a love for bash scripting - but just before I discovered the "local" keyword and the "for" loop syntax that copes with whitespace. So apologies for the style.

I think the thing I find most unsatisfying about it is the way it handles BLCR's requirement that the kernel version be the same across checkpoints. It nails the job to the first kernel it ran on (exec hosts need the kernel_ver complex set) - unfortunately this creates a race condition for task array jobs. A better idea would be to pick up the default kernel version in the sge_request file and ditch the relevant bits in the attached scripts.
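
To make the kernel pinning concrete, here is a minimal sketch of the resource-list rewrite that chkpt.sh does before calling qalter, pulled out into a standalone function. The name pin_kernel_ver and the sample resource strings are illustrative assumptions, not part of the attached scripts. Unlike the original, the sketch also avoids a leading comma when the job has no hard resource list.

```shell
#!/bin/bash
# Hypothetical helper (not in the attached scripts).
# Usage: pin_kernel_ver "<hard resource list>" "<kernel release>"
pin_kernel_ver() {
    local res_list=$1 kernel=$2
    case ",${res_list}," in
        *,kernel_ver=*)
            # A kernel_ver request already exists: replace its value in place
            printf '%s\n' "${res_list}" | sed "s/kernel_ver=[^,]*/kernel_ver=${kernel}/"
            ;;
        *)
            # No request yet: append one, without a leading comma if the list is empty
            printf '%s\n' "${res_list:+${res_list},}kernel_ver=${kernel}"
            ;;
    esac
}

# Sample inputs (made up for illustration)
pin_kernel_ver "h_rt=3600,h_vmem=2G" "$(uname -r)"
pin_kernel_ver "h_rt=3600,kernel_ver=old" "$(uname -r)"
pin_kernel_ver "" "$(uname -r)"
```

The result would then be passed to "qalter -l" as in chkpt.sh. It is still a per-job setting, so the task array race described above remains.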

Also, there has obviously been progress in BLCR's limitations since 2008 - which the scripts do not take advantage of.

I include chroot-cwd.c for completeness, as the scenario I used BLCR checkpoint for required it (I was cycle scavenging on a visualisation cluster running a different OS, so had the compute node image installed in a subdirectory).

* The queue starter_method needs to execute starter.sh.

* "chkpt.sh <ckpt|migr|clean> $job_pid" did the actual checkpoint/migration business.

* env.sh is a library and needs to be present for the other scripts to function.

The checkpoint interface we used looked like this:

ckpt_name          blcr
interface          application-level
ckpt_command       /usr/local/sge6.0/ckpt/blcr/chkpt.sh ckpt $job_pid
migr_command       /usr/local/sge6.0/ckpt/blcr/chkpt.sh migr $job_pid
restart_command    none
clean_command      /usr/local/sge6.0/ckpt/blcr/chkpt.sh clean $job_pid
ckpt_dir           /nobackup/root/ckpt
signal             none
when               xs
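
For orientation, the attached env.sh and chkpt.sh keep two alternating context files per job under the checkpoint directory, and only move the "current" pointer once a new checkpoint has been written successfully. A minimal sketch of that two-slot rotation follows. next_slot and context_file are hypothetical names mirroring getNextVersion and ckptGetFile, and a temporary directory stands in for the real checkpoint directory.

```shell
#!/bin/bash
# Sketch only: mirrors getNextVersion/ckptGetFile from env.sh under
# hypothetical names, using a throwaway directory instead of ckpt_dir.

next_slot() {
    # Alternate between slot 0 and slot 1
    echo $(( ($1 + 1) % 2 ))
}

context_file() {
    # $1 = per-job checkpoint directory, $2 = slot number
    echo "$1/context.$2"
}

job_dir=$(mktemp -d)
echo 0 > "${job_dir}/current"
: > "$(context_file "${job_dir}" 0)"    # pretend slot 0 already holds a checkpoint

current=$(cat "${job_dir}/current")
new=$(next_slot "${current}")

# Stand-in for cr_checkpoint writing the new context file
if : > "$(context_file "${job_dir}" "${new}")"; then
    # Flip the pointer first, then drop the superseded file
    echo "${new}" > "${job_dir}/current"
    rm -f "$(context_file "${job_dir}" "${current}")"
fi

ls "${job_dir}"
```

If the checkpoint step fails, the pointer and the previous context file are left untouched, so the job can still restart from the last good checkpoint.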

All the best,

Mark
--
-----------------------------------------------------------------
Mark Dixon                       Email    : [email protected]
HPC/Grid Systems Support         Tel (int): 35429
Information Systems Services     Tel (ext): +44(0)113 343 5429
University of Leeds, LS2 9JT, UK
-----------------------------------------------------------------
#!/bin/bash

# Copyright (C) 2008 University of Leeds
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.

# Handle checkpointing with BLCR
#
# 19Feb08 Mark Dixon
#
# $Id: chkpt.sh 2753 2012-10-02 15:41:28Z issmcd $

set -e

# Setup variables and environment
. ${SGE_ROOT}/ckpt/blcr/env.sh

blcrCheckpoint() {
    printUser "Checkpointing job"

    # Lock job to current kernel version (BLCR requirement)
    # WARNING: this is a per job setting. If a task array is submitted and there
    #          are multiple kernels to choose from, there will be problems!
    . ${SGE_ROOT}/default/common/settings.sh
    CKPT_KERNEL=`uname -r`
    CKPT_RES_LIST=`qstat -j $JOB_ID | egrep '^hard resource_list:' | sed 's/hard resource_list: *//'`
    if echo $CKPT_RES_LIST | egrep "kernel_ver=" >/dev/null; then
        CKPT_RES_LIST=`echo $CKPT_RES_LIST | sed 's/kernel_ver=[^,]*/kernel_ver='${CKPT_KERNEL}'/'`
    else
        # Avoid a leading comma when the job has no hard resource list
        CKPT_RES_LIST="${CKPT_RES_LIST:+${CKPT_RES_LIST},}kernel_ver=${CKPT_KERNEL}"
    fi
    execute "qalter -l ${CKPT_RES_LIST} $JOB_ID"

    # Process arguments
    CKPT_ARGS=
    if [ "kill" = "$1" ]; then
        print "Job will be killed after checkpoint."
        CKPT_ARGS="--term"
    fi

    # Determine PID to checkpoint.
    CKPT_PID=`findProcess cr_restart ${CKPT_PPID}`

    # If no PID found, must not have been restarted before
    if [ -z "${CKPT_PID}" ]; then
        # Get first child of CKPT_PPID
        print "No restarted process found. Checkpoint default process."
        CKPT_PID=`pgrep -P ${CKPT_PPID} | head -n 1`
    fi

    # cr_checkpoint fails with "cr_async.c:191 thread_init: pthread_create() returned 12" if this is too high
    # (ignore any error: means stack size was smaller anyway)
    ulimit -s 8192 || true

    # Checkpoint child of CKPT_PPID
    #execute "${CR_PATH}/cr_checkpoint ${CKPT_ARGS} -f ${CKPT_FILE_NEW} --tree ${CKPT_PID}"
    print "cr_checkpoint ${CKPT_ARGS} -f ${CKPT_FILE_NEW} --tree ${CKPT_PID}"
    # Capture the exit status explicitly: under "set -e" a bare failing
    # command would abort the script before the error is reported below
    CKPT_STAT=0
    ${SGE_ROOT}/ckpt/blcr/chroot-cwd "${CKPT_CHROOT}" /usr/bin/env \
        LD_LIBRARY_PATH="$LD_LIBRARY_PATH" "${CR_PATH}/cr_checkpoint" ${CKPT_ARGS} \
        -f ${CKPT_FILE_NEW} --tree ${CKPT_PID} || CKPT_STAT=$?
    if [ "0" = "${CKPT_STAT}" ]; then
        printUser "Checkpoint completed successfully"
        ckptSetVersion "${CKPT_JOBDIR}" "${CKPT_VERSION_NEW}"
        if [ -f "${CKPT_FILE}" ]; then
            execute "rm -f ${CKPT_FILE}"
        fi
    else
        printUser "Checkpoint error: ${CKPT_STAT}"
    fi

    exit ${CKPT_STAT}
}

#blcrClean() {
#    printUser "Cleaning job"
#    for FILE in ${CKPT_FILE_NEW} ${CKPT_FILE} ${CKPT_VERSION_FILE} ${CKPT_JOBDIR}/job_script; do
#        if [ -f "${FILE}" ]; then
#            execute "rm -f ${FILE}"
#        fi
#    done
#
#    execute "rmdir ${CKPT_JOBDIR}"
#}

# Save arguments
CKPT_ACTION=$1
CKPT_PPID=$2

print "$0 $*"

case "$CKPT_ACTION" in
    ckpt)
        blcrCheckpoint
        ;;
    migr)
        blcrCheckpoint kill
        ;;
    clean)
        # Gets invoked when we exit 99 (try another host) - don't want this
        #blcrClean
        ;;
    *)
        print "Unknown action: $CKPT_ACTION"
        ;;
esac

exit 0
/*
 * Copyright (C) 2008 University of Leeds
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */

/*
 * chroots to the specified directory, preserving the
 * cwd, and executes program. To be run suid root.
 *
 * 26Feb08 Mark Dixon
 *
 * $Id: chroot-cwd.c 2753 2012-10-02 15:41:28Z issmcd $
 */

#define _GNU_SOURCE /* needed for setresuid() and setresgid() */
#include <unistd.h>
#include <sys/types.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>

void exit_perror(char *mesg) {
    perror(mesg);
    exit(1);
}

int main(int argc, char **argv) {
    /* temp. drop privs */
    uid_t suid = geteuid();
    gid_t sgid = getegid();
    if (setresuid(getuid(), getuid(), suid)) exit_perror("Could not temp drop privs");
    if (setresgid(getgid(), getgid(), sgid)) exit_perror("Could not temp drop privs");

    /* check we have enough args */
    if (argc < 3) {
        fprintf(stderr,"%s NEWROOT COMMAND [ARGS...]\n", argv[0]);
        exit(1);
    }

    /* set and sanity-check args, pass remainder to application */
    char **eargv = argv;
    char  *root  = argv[1]; eargv++;
    char  *comm  = argv[2]; eargv++;
    if (root  == NULL || strlen(root)  == 0) exit_perror("Bad root argument");
    if (root[0] != '/')                      exit_perror("root argument does not start with /");
    if (comm  == NULL || strlen(comm)  == 0) exit_perror("Bad comm argument");

    /* get current dir  */
    char *cwd = getcwd(NULL, 0); /* Uses a Linux extension to POSIX */
    if (! cwd) exit_perror("Bad cwd");

    /* temp. regain privs */
    if (setresuid(-1, suid, -1)) exit_perror("Could not regain privs");
    if (setresgid(-1, sgid, -1)) exit_perror("Could not regain privs");

    /* chroot */
    if (chdir(root))  exit_perror("Bad chroot");
    if (chroot(root)) exit_perror("Could not chroot");

    /* drop privs */
    if (setresuid(getuid(), getuid(), getuid())) exit_perror("Could not drop privs");
    if (setresgid(getgid(), getgid(), getgid())) exit_perror("Could not drop privs");

    /* check we've really dropped privs */
    if (! setreuid(-1, 0)) exit_perror("Failed to drop privs");
    if (! setregid(-1, 0)) exit_perror("Failed to drop privs");

    /* chdir back to where we started, but now in the chroot */
    if (chdir(cwd)) exit_perror("Could not chdir to chroot cwd");

    /* No longer need to know current dir */
    free(cwd);

    /* execute code */
    /* execv only returns if it failed */
    execv(comm, eargv);
    exit_perror("Could not execute command");
}
# Copyright (C) 2008 University of Leeds
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.

# Source to setup BLCR environment
#
# 21Feb08 Mark Dixon
#
# $Id: env.sh 2753 2012-10-02 15:41:28Z issmcd $

print() {
    touch "${CKPT_LOG}"
    echo "`date`: `hostname`: $1" >> "${CKPT_LOG}"
    sync
}

printUser() {
    touch "${CKPT_USER_LOG}"
    echo "`date`: `hostname`: $1" >> "${CKPT_USER_LOG}"
    print "$1"
    sync
}

execute() {
    print "Execute: $1"
    $1 >> ${CKPT_LOG} 2>&1
#    if [ $? != 0 ]; then
#        echo "Command failed" >> ${CKPT_LOG}
#        exit 1
#    fi
}

getNextVersion() {
    echo $(( ($1 +1) % 2))
}

findProcess() {
    NAME=$1
    PROCESS=$2

    if [ "`ps -p ${PROCESS} -o comm --no-headers`"  = "${NAME}" ]; then
        echo $PROCESS
        return 0
    fi

    # Check each child
    for CHILD in `pgrep -P $PROCESS`; do
        GOT=`findProcess "${NAME}" "${CHILD}"`

        if [ -n "$GOT" ]; then
            echo $GOT
            return 0
        fi
    done

    return 0
}

# Argument: CKPT_JOBID
# Returns : CKPT_JOBDIR
ckptGetDir() {
    echo ${CKPT_DIR}/$1
}

# Argument: JOB_ID
# Argument: SGE_TASK_ID
# Returns : CKPT_JOBID
ckptGetJobId() {
    if [ "undefined" = "$2" ]; then
        echo $1.1
    else
        echo $1.$2
    fi
}

# Argument: CKPT_JOBDIR
# Returns : CKPT_VERSION
ckptGetVersion() {
    CKPT_VERSION_FILE=$1/current
    if [ ! -f "${CKPT_VERSION_FILE}" ]; then
        echo 0 > "${CKPT_VERSION_FILE}"
        echo 0
    else
        echo `cat "${CKPT_VERSION_FILE}"`
    fi
}

# Argument: CKPT_JOBDIR
# Argument: CKPT_VERSION
ckptSetVersion() {
    CKPT_VERSION_FILE=$1/current
    echo $2 > "${CKPT_VERSION_FILE}"
}

# Argument: CKPT_JOBDIR
# Argument: CKPT_VERSION
ckptGetFile() {
    echo "$1/context.$2"
}

CKPT_DIR=/nobackup/ckpt
CKPT_JOBID=`ckptGetJobId "${JOB_ID}" "${SGE_TASK_ID}"`

# Init ckpt directory
CKPT_JOBDIR=`ckptGetDir "${CKPT_JOBID}"`
if [ ! -d "${CKPT_JOBDIR}" ]; then
    mkdir -p "${CKPT_JOBDIR}"
fi

# Init system log
CKPT_LOG=${CKPT_JOBDIR}/log

# Init user log
if [ "undefined" = "${SGE_TASK_ID}" ]; then
    CKPT_USER_LOG=${PWD}/${REQNAME}.co${JOB_ID}
else
    CKPT_USER_LOG=${PWD}/${REQNAME}.co${CKPT_JOBID}
fi

# Get current and next ckpt versions
CKPT_VERSION=`ckptGetVersion "${CKPT_JOBDIR}"`
CKPT_VERSION_NEW=`getNextVersion ${CKPT_VERSION}`

# Define current and next checkpoint filenames
CKPT_FILE=`ckptGetFile "${CKPT_JOBDIR}" "${CKPT_VERSION}"`
CKPT_FILE_NEW=`ckptGetFile "${CKPT_JOBDIR}" "${CKPT_VERSION_NEW}"`

# Init BLCR
CR_PATH="/apps/blcr/everest/`uname -r`/0.6.4/bin"
LD_LIBRARY_PATH="/apps/blcr/everest/`uname -r`/0.6.4/lib:${LD_LIBRARY_PATH}"
LD_LIBRARY_PATH="/apps/blcr/everest/`uname -r`/0.6.4/lib64:${LD_LIBRARY_PATH}"

# Define location of everest node image
CKPT_CHROOT=/everest

#!/bin/bash

# Copyright (C) 2008 University of Leeds
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.

# Start/restart a program under BLCR
#
# 26Feb08 Mark Dixon
#
# $Id: starter.sh 2753 2012-10-02 15:41:28Z issmcd $

set -e

# Setup variables and environment
. ${SGE_ROOT}/ckpt/blcr/env.sh

print "$0 called with $*"

BLCR_EXECUTE=$SGE_STARTER_SHELL_PATH

BLCR_FLAGS=
#if [ "true" = "$SGE_STARTER_USE_LOGIN_SHELL" ]; then
#    FLAGS="${FLAGS} -l"
#fi

if [ "0" = "${RESTARTED}" -a -z "${CKPT_RESTART_JOBID}" ]; then
    printUser "Starting job"

    # Ensure everything needed for restart is copied to checkpoint area
    for ARG in $*; do
        if [ "${ARG}" = "${JOB_SCRIPT}" ]; then
            execute "cp ${JOB_SCRIPT} ${CKPT_JOBDIR}/job_script"
            BLCR_FLAGS="${BLCR_FLAGS} ${CKPT_JOBDIR}/job_script"
        else
            BLCR_FLAGS="${BLCR_FLAGS} $ARG"
        fi
    done

    print "cr_run ${BLCR_EXECUTE} ${BLCR_FLAGS} on `hostname`"
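    # Capture the exit status explicitly: under "set -e" a failing job would
    # otherwise abort the script before its status is logged
    BLCR_STAT=0
    ${SGE_ROOT}/ckpt/blcr/chroot-cwd "${CKPT_CHROOT}" /usr/bin/env \
        LD_LIBRARY_PATH="$LD_LIBRARY_PATH" "${CR_PATH}/cr_run" ${BLCR_EXECUTE} \
        ${BLCR_FLAGS} || BLCR_STAT=$?
    print "Exit: ${BLCR_STAT}"
    exit ${BLCR_STAT}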
else
    printUser "Restarting job"

    # cr_restart fails with "cr_async.c:191 thread_init: pthread_create() returned 12" if this is too high
    # (ignore any error: means stack size was smaller anyway)
    ##DEBUG I hope this isn't inherited by the restarted job?
    ulimit -s 8192 || true

    # If CKPT_FILE does not exist and CKPT_RESTART_JOBID does, we are
    # restarting an old job.
    if [ ! -f "${CKPT_FILE}" -a -n "${CKPT_RESTART_JOBID}" ]; then
        # If CKPT_RESTART_TASKID is not set, start number 1
        if [ -z "${CKPT_RESTART_TASKID}" ]; then
            CKPT_RESTART_TASKID=1
        fi

        CKPT_RESTART_JOBID=`ckptGetJobId "${CKPT_RESTART_JOBID}" "${CKPT_RESTART_TASKID}"`
        CKPT_RESTART_JOBDIR=`ckptGetDir "${CKPT_RESTART_JOBID}"`
        CKPT_RESTART_VERSION=`ckptGetVersion "${CKPT_RESTART_JOBDIR}"`

        # Bend normal variables to restart this job
        CKPT_FILE=`ckptGetFile "${CKPT_RESTART_JOBDIR}" "${CKPT_RESTART_VERSION}"`

        printUser "This will restart old job ${CKPT_RESTART_JOBID}"
    fi

    print "cr_restart ${CKPT_FILE} on `hostname`"
    CKPT_OUTPUT=`${SGE_ROOT}/ckpt/blcr/chroot-cwd "${CKPT_CHROOT}" /usr/bin/env LD_LIBRARY_PATH="$LD_LIBRARY_PATH" "${CR_PATH}/cr_restart" ${CKPT_FILE} 2>&1 || true`

    # Check for BLCR errors
    if [ -n "${CKPT_OUTPUT}" ]; then
        print "BLCR output: ${CKPT_OUTPUT}"

        if [ "Restart failed: Device or resource busy" = "${CKPT_OUTPUT}" ]; then
            # May have had a PID clash
            printUser "Restart failed (device or resource busy): trying on another host"
            exit 99
        fi

        # If we got to here, job failed to start: do checks
        if [ ! -f "${CKPT_FILE}" ]; then
            printUser "Checkpoint file ${CKPT_FILE} does not exist. Aborting job."
            exit 1
        fi

        for FILE in "${SGE_ROOT}/ckpt/blcr/chroot-cwd" "/usr/bin/env" "${CR_PATH}/cr_restart"; do
            if [ ! -x "${FILE}" ]; then
                printUser "Executable ${FILE} does not exist or is not executable. Aborting job."
                exit 1
            fi
        done

        printUser "Unknown restart failure. Aborting job."
        exit 1
    fi
fi
