I'm not sure, but I think that line 659 of the file orte/mca/ess/env/ess_env_module.c should contain

if (ORTE_SUCCESS != (ret = orte_ess_base_build_nidmap(orte_process_info.sync_buf, &nidmap, *jmap*))) {

But it currently contains

if (ORTE_SUCCESS != (ret = orte_ess_base_build_nidmap(orte_process_info.sync_buf, &nidmap, *&jmap->pmap*))) {

No?

Leonardo


Leonardo Fialho wrote:
Hi All,

I think there is an error in the trunk version when trying to restore a checkpoint.

The function orte_util_decode_pidmap, while attempting to execute the following code,

   /* store the data */
   for (i=0; i < num_procs; i++) {
       pmap.node = nodes[i];
       pmap.local_rank = local_rank[i];
       pmap.node_rank = node_rank[i];
       opal_value_array_set_item(procs, i, &pmap);
   }

produces a segmentation fault:

[nodo2:18027] *** Process received signal ***
[nodo2:18027] Signal: Segmentation fault (11)
[nodo2:18027] Signal code: Address not mapped (1)
[nodo2:18027] Failing at address: (nil)

I tried to trace the problem, and I think it occurs on the line opal_value_array_set_item(procs, i, &pmap);
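
To illustrate the kind of failure I suspect, here is a minimal standalone sketch. It is not the OPAL code: every `toy_` name is hypothetical, and I am only assuming that the crash at address (nil) comes from storing into an array that was never initialized or is NULL. With a guard like this, such a store returns an error code rather than writing through a null pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a growable value array; NOT the OPAL type. */
typedef struct {
    unsigned char *items;  /* backing storage, NULL until first store */
    size_t item_size;      /* bytes per element, 0 means "not initialized" */
    size_t capacity;       /* elements allocated */
    size_t size;           /* elements in use */
} toy_value_array_t;

/* Hypothetical record with the same three fields as the loop above. */
typedef struct {
    int node;
    int local_rank;
    int node_rank;
} toy_pmap_t;

/* Prepare the array for elements of item_size bytes. Returns 0 on success. */
static int toy_array_init(toy_value_array_t *a, size_t item_size)
{
    if (NULL == a || 0 == item_size) {
        return -1;
    }
    a->items = NULL;
    a->item_size = item_size;
    a->capacity = 0;
    a->size = 0;
    return 0;
}

/* Copy *item into slot index, growing the storage when needed.
 * Returns -1 (no write at all) for a NULL or uninitialized array. */
static int toy_array_set_item(toy_value_array_t *a, size_t index,
                              const void *item)
{
    if (NULL == a || NULL == item || 0 == a->item_size) {
        return -1;
    }
    if (index >= a->capacity) {
        size_t new_cap = (0 == a->capacity) ? 4 : a->capacity;
        while (new_cap <= index) {
            new_cap *= 2;
        }
        unsigned char *grown = realloc(a->items, new_cap * a->item_size);
        if (NULL == grown) {
            return -1;
        }
        /* Zero the newly added slots so skipped indices are well defined. */
        memset(grown + a->capacity * a->item_size, 0,
               (new_cap - a->capacity) * a->item_size);
        a->items = grown;
        a->capacity = new_cap;
    }
    memcpy(a->items + index * a->item_size, item, a->item_size);
    if (index >= a->size) {
        a->size = index + 1;
    }
    return 0;
}
```

In this sketch a zero-filled `toy_value_array_t` is rejected by `toy_array_set_item` until `toy_array_init` has been called on it. A similar check on `procs` just before the loop might help confirm whether the array handed to orte_util_decode_pidmap is the problem.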

Thanks,



--
Leonardo Fialho
Computer Architecture and Operating Systems Department - CAOS
Universidad Autonoma de Barcelona - UAB
ETSE, Edifcio Q, QC/3088
http://www.caos.uab.es
Phone: +34-93-581-2888
Fax: +34-93-581-2478
