Thanks for your fix.

You say that the environment is only taken into account when a
variable is registered. opal-restart.c sets another variable in the
environment. Does the following still work?

opal-restart.c:

    (void) mca_base_var_env_name("crs", &tmp_env_var);
    opal_setenv(tmp_env_var,
                expected_crs_comp,
                true, &environ);
    free(tmp_env_var);
    tmp_env_var = NULL;

This is how the preferred checkpointer is selected. In
opal_crs_base_select() the following then happens:

    if( OPAL_SUCCESS != mca_base_select("crs",
                                        opal_crs_base_framework.framework_output,
                                        &opal_crs_base_framework.framework_components,
                                        (mca_base_module_t **) &best_module,
                                        (mca_base_component_t **) &best_component) ) {
        /* This will only happen if no component was selected */
        exit_status = OPAL_ERROR;
        goto cleanup;
    }

Does the environment variable set via mca_base_var_env_name() influence
which crs module is selected during mca_base_select()? Or do I also have
to change this to mca_base_var_set_value() to select the preferred crs
module?
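
To make the concern concrete, here is a toy sketch of a variable that
reads the environment only at registration time. Everything in it
(toy_var, toy_var_register, toy_var_set_value, TOY_MCA_crs) is
hypothetical and is not the Open MPI mca_base_var API; it only
illustrates the ordering question, assuming a POSIX setenv():

```c
#define _POSIX_C_SOURCE 200809L  /* for setenv() */

#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy variable; NOT the Open MPI mca_base_var API. */
struct toy_var {
    char value[64];
};

/* Bounded copy that always leaves the buffer NUL-terminated. */
static void toy_copy(struct toy_var *var, const char *src)
{
    strncpy(var->value, src, sizeof(var->value) - 1);
    var->value[sizeof(var->value) - 1] = '\0';
}

/* Register: the environment is consulted exactly once, here. */
static void toy_var_register(struct toy_var *var, const char *env_name,
                             const char *default_value)
{
    const char *from_env = getenv(env_name);

    toy_copy(var, (NULL != from_env) ? from_env : default_value);
}

/* Explicit setter: the only way to change the value after registration. */
static void toy_var_set_value(struct toy_var *var, const char *new_value)
{
    toy_copy(var, new_value);
}

/* Walk through both orderings; returns 0 if all checks pass. */
static int toy_demo(void)
{
    struct toy_var crs;

    /* Case 1: environment set BEFORE registration, so it is seen. */
    setenv("TOY_MCA_crs", "first", 1);
    toy_var_register(&crs, "TOY_MCA_crs", "none");
    assert(0 == strcmp(crs.value, "first"));

    /* Case 2: environment changed AFTER registration, so it is not seen. */
    setenv("TOY_MCA_crs", "second", 1);
    assert(0 == strcmp(crs.value, "first"));

    /* Only the explicit setter changes the registered value now. */
    toy_var_set_value(&crs, "second");
    assert(0 == strcmp(crs.value, "second"));

    return 0;
}
```

If the crs variable is registered after the opal_setenv() call, the
first case would apply and the environment value should be seen. If it
is registered before, only the explicit setter would take effect. Which
of the two holds in opal-restart.c is exactly the question above.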

                Adrian


On Mon, Mar 17, 2014 at 08:47:16AM -0600, Nathan Hjelm wrote:
> Good catch. Fixing now.
> 
> -Nathan
> 
> On Mon, Mar 17, 2014 at 02:50:02PM +0100, Adrian Reber wrote:
> > On Fri, Mar 14, 2014 at 10:18:06PM +0000, Hjelm, Nathan T wrote:
> > > The preferred way is to use mca_base_var_find and then call
> > > mca_base_var_[set|get]_value. For performance's sake we only look at the
> > > environment when the variable is registered.
> > 
> > I believe I found a bug in mca_base_var_set_value() when it is used with bool variables:
> > 
> > #0  0x00007f6e0d8fb800 in mca_base_var_enum_bool_sfv (self=0x7f6e0dbabc20 <mca_base_var_enum_bool>, value=0,
> >     string_value=0x0) at ../../../../opal/mca/base/mca_base_var_enum.c:82
> > #1  0x00007f6e0d8f45d6 in mca_base_var_set_value (vari=120, value=0x4031e6, size=0, source=MCA_BASE_VAR_SOURCE_DEFAULT,
> >     source_file=0x0) at ../../../../opal/mca/base/mca_base_var.c:636
> > #2  0x0000000000401e44 in main (argc=7, argv=0x7fffa72a0a78) at ../../../../opal/tools/opal-restart/opal-restart.c:223
> > 
> > I am using set_value like this:
> > 
> > bool test=false;
> > mca_base_var_set_value(idx, &test, 0, MCA_BASE_VAR_SOURCE_DEFAULT, NULL);
> > 
> > As the size is ignored I am just setting it to '0'.
> > 
> > mca_base_var_set_value() does
> > 
> > ret = var->mbv_enumerator->string_from_value(var->mbv_enumerator, ((int *) value)[0], NULL);
> > 
> > which calls mca_base_var_enum_bool_sfv() with the last parameter set to NULL:
> > 
> > static int mca_base_var_enum_bool_sfv (mca_base_var_enum_t *self, const int value,
> >                                        const char **string_value)
> > {
> >     *string_value = value ? "true" : "false";
> > 
> >     return OPAL_SUCCESS;
> > }
> > 
> > and here it dereferences the last parameter (string_value), which has
> > been set to NULL. As I cannot find any usage of mca_base_var_set_value()
> > with bool variables, this code path has probably not been used until now.
> > 
> >             Adrian
> > _______________________________________________
> > devel mailing list
> > de...@open-mpi.org
> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> > Link to this post: 
> > http://www.open-mpi.org/community/lists/devel/2014/03/14354.php
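
The crash described in the quoted report comes from writing through a
NULL output pointer. A minimal sketch of one possible guard follows. The
names toy_enum_bool_sfv, toy_demo_bool and TOY_SUCCESS are hypothetical
stand-ins, and this is not necessarily the fix that was committed:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_SUCCESS 0  /* hypothetical stand-in for OPAL_SUCCESS */

/*
 * Toy version of a bool "string from value" callback. A caller that only
 * wants to validate the value may pass NULL for string_value, so the
 * pointer is checked before it is written through.
 */
static int toy_enum_bool_sfv(int value, const char **string_value)
{
    if (NULL != string_value) {
        *string_value = value ? "true" : "false";
    }

    return TOY_SUCCESS;
}

/* Exercise both call styles; returns 0 if all checks pass. */
static int toy_demo_bool(void)
{
    const char *text = NULL;

    /* Validation-only call, as in the backtrace (string_value=0x0). */
    assert(TOY_SUCCESS == toy_enum_bool_sfv(0, NULL));

    /* Normal call that asks for the string form. */
    assert(TOY_SUCCESS == toy_enum_bool_sfv(1, &text));
    assert(0 == strcmp(text, "true"));

    return 0;
}
```

The alternative would be for the caller to always pass a valid pointer,
which keeps the callback simpler but requires every call site to comply.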






-- 
Adrian Reber <adr...@lisas.de>            http://lisas.de/~adrian/
printk(KERN_ERR "msp3400: chip reset failed, penguin on i2c bus?\n");
        2.2.16 /usr/src/linux/drivers/char/msp3400.c
