Hello again.

I'm working on the launch code to handle my checkpoints, but I'm a little
stuck on how to set the path to my checkpoint and to the executable
(ompi_blcr_context.PID). I took a look at the code in
odls_base_default_fns.c, and this piece of code caught my attention:

#if OPAL_ENABLE_FT_CR == 1
    /*
     * OPAL CRS components need the opportunity to take action before a process
     * is forked.
     * Needs access to:
     *   - Environment
     *   - Rank/ORTE Name
     *   - Binary to exec
     */
    if( NULL != opal_crs.crs_prelaunch ) {
        if( OPAL_SUCCESS != (rc = opal_crs.crs_prelaunch(child->name->vpid,
                                                         orte_sstore_base_prelaunch_location,
                                                         &(app->app),
                                                         &(app->cwd),
                                                         &(app->argv),
                                                         &(app->env) ) ) ) {
            ORTE_ERROR_LOG(rc);
            goto CLEANUP;
        }
    }
#endif

I've seen that I can set the *location* with an MCA parameter, but I'm
working on uncoordinated checkpointing and I also pass checkpoints from one
node to another, so that, for example, node 2 can restore a process that was
residing on node 1. Every node therefore keeps the checkpoint files of its
own processes in one folder, and the checkpoints of processes residing
somewhere else in another folder.
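
To make this layout concrete, here is a minimal sketch of how I picture the
path being built on each node. It is only an illustration under my own
assumptions: the sub-folder names "local" and "remote", the helper
build_ckpt_path() and its parameters are hypothetical and are not part of
Open MPI; only the ompi_blcr_context.PID file name comes from my setup.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical sub-folder names (assumption, not defined by Open MPI). */
#define CKPT_LOCAL_DIR  "local"   /* checkpoints of processes that ran on this node */
#define CKPT_REMOTE_DIR "remote"  /* checkpoints received from other nodes */

/*
 * Build the path of the BLCR context file for one process.
 *   base     - per-node checkpoint root directory (assumption)
 *   is_local - nonzero if the process originally ran on this node
 *   pid      - PID used in the ompi_blcr_context.PID file name
 * Returns 0 on success, -1 if the path does not fit in buf.
 */
static int build_ckpt_path(char *buf, size_t len, const char *base,
                           int is_local, long pid)
{
    int n;

    assert(buf != NULL && base != NULL);
    n = snprintf(buf, len, "%s/%s/ompi_blcr_context.%ld", base,
                 is_local ? CKPT_LOCAL_DIR : CKPT_REMOTE_DIR, pid);
    /* snprintf reports the length it wanted; treat truncation as an error. */
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

With a root of /tmp/ckpt, a process with PID 1234 that ran on this node
would map to /tmp/ckpt/local/ompi_blcr_context.1234, and one that came from
another node to /tmp/ckpt/remote/ompi_blcr_context.1234. What I am missing
is a way to hand a per-process path like this to the launch code.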

Is there any way to set where the checkpoints are stored and the names of
their executables, taking my situation into account?

Best Regards.

Hugo Meyer
