Thanks, Ralph, for your reply.

2011/3/21 Ralph Castain <r...@open-mpi.org>

> You should never access a pointer array's data area that way (i.e., by
> index against the raw data). You really should do:
>
> if (NULL == (proc = (orte_proc_t*)opal_pointer_array_get_item(jdata->procs, vpid))) {
>       /* error report */
> }
>
>
I've changed my code to do this, but I'm getting the same result: NULL
when asking about a dead process.
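
To check my understanding of the lookup, here is a minimal self-contained
sketch of how I assume a bounds-checked getter over a sparse pointer array
behaves. Every name in it (sketch_pointer_array_t, sketch_get_item,
sketch_proc_t, sketch_proc_state) is hypothetical and written only for
illustration; it is not the Open MPI implementation. If the real accessor
works along these lines, a NULL result alone cannot tell me whether the
object was removed or was never stored at that index.

```c
#include <stddef.h>

/* Hypothetical stand-in for a sparse pointer array; NOT Open MPI's type. */
typedef struct {
    void **addr;  /* slot storage; an empty slot holds NULL */
    int    size;  /* number of slots */
} sketch_pointer_array_t;

/* Bounds-checked lookup: NULL for an out-of-range index or an empty slot. */
void *sketch_get_item(const sketch_pointer_array_t *arr, int index)
{
    if (NULL == arr || index < 0 || index >= arr->size) {
        return NULL;
    }
    return arr->addr[index];
}

/* Hypothetical process record, reduced to the fields of interest. */
typedef struct {
    int vpid;
    int state;
} sketch_proc_t;

enum { SKETCH_RUNNING = 1, SKETCH_ABORTED = 2 };

/* Returns the recorded state, or -1 when no object exists for that vpid. */
int sketch_proc_state(const sketch_pointer_array_t *procs, int vpid)
{
    sketch_proc_t *proc = (sketch_proc_t *)sketch_get_item(procs, vpid);
    if (NULL == proc) {
        return -1;  /* nothing stored here: removed, or never populated */
    }
    return proc->state;
}
```

In this sketch a failed process that is still stored returns its state
(SKETCH_ABORTED), while an empty or out-of-range slot returns -1, which is
the case I seem to be hitting.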


> The errmgr generally doesn't remove a process object upon failure - it just
> sets its state to some appropriate value. However, depending upon where you
> are trying to do this, and the history that got you down this code path, it
> is possible.
>

I'm writing this code in errmgr_orted.c, and it is executed when a
process fails.


>
> Also, remember that if you are in a daemon, then the jdata objects are not
> populated. The daemons work exclusively from the orte_local_jobdata and
> orte_local_children lists, so you would have to find your process there.
>

That's why I'm asking the HNP for the jdata using
*ORTE_DAEMON_REPORT_JOB_INFO_CMD*; I assume that it has the information
about the dead process.

Any idea?

Best regards.

Hugo Meyer
