Hello again. I'm working on the launch code to handle my checkpoints, but I'm a little stuck on how to set the path to my checkpoint and the executable (ompi_blcr_context.PID). I took a look at the code in odls_base_default_fns.c, and this piece of code caught my attention:

#if OPAL_ENABLE_FT_CR == 1
    /*
     * OPAL CRS components need the opportunity to take action before a process
     * is forked.
     * Needs access to:
     *   - Environment
     *   - Rank/ORTE Name
     *   - Binary to exec
     */
    if( NULL != opal_crs.crs_prelaunch ) {
        if( OPAL_SUCCESS != (rc = opal_crs.crs_prelaunch(child->name->vpid,
                                                         orte_sstore_base_prelaunch_location,
                                                         &(app->app),
                                                         &(app->cwd),
                                                         &(app->argv),
                                                         &(app->env) ) ) ) {
            ORTE_ERROR_LOG(rc);
            goto CLEANUP;
        }
    }
#endif

I've seen that I can set the *location* with an MCA parameter, but I'm working with uncoordinated checkpoints and also passing checkpoints from one node to another, so that, for example, node 2 can restore a process that was residing on node 1. So every node keeps the checkpoint files of its own processes in one folder, and the checkpoints of processes residing elsewhere in another folder.

Is there any way to set where the checkpoints are stored, and their executable names, taking my situation into account?

Best Regards.

Hugo Meyer