This came up in today's telecon and I promised to send this to George -
however, it occurred to me that others may also want to know.

If you want to dump info for debugging purposes, and you can get into
orterun/mpirun (e.g., via gdb), you can dump info on anything with the
commands below. NOTE: gdb will frequently truncate the output from these
commands, which is why there are so many of them and why they are somewhat
detailed. I tend to bury the more verbose ones in the code itself when
debugging so I can be sure to see the entire output.

orte_gpr.dump_all(0): this will dump *everything* in the registry to opal
output stream 0 (or whatever one you care to designate), including all the
info on trigger status (e.g., whether it has fired or not).

orte_gpr.dump_segment(segment-name): this will provide the info stored on
any segment of the registry. Standard segments worth looking at include:

1. "orte-job-1": shows info on all procs in your initial applications,
including their reported state

2. "orte-node": what nodes are known to the system, and anything about their
status

3. "orte-job-0": info on all daemons in the system
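
For example, a gdb session along these lines should work. This is only a
sketch: <pid> is a placeholder for the process id of your running mpirun, it
assumes mpirun was built with debugging symbols, and the exact argument lists
should be checked against orte/mca/gpr/gpr.h.

```
$ gdb -p <pid>
(gdb) call orte_gpr.dump_all(0)
(gdb) call orte_gpr.dump_segment("orte-job-1")
(gdb) call orte_gpr.dump_segment("orte-node")
```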


orte_gpr.dump_triggers(0): status and info on all triggers. The "0" argument
indicates that you want them all dumped to the screen. Since gdb doesn't
like getting too much info, you can use this argument to specify how many
you want to see starting from the end of the list (i.e., "5" says give me
the five last triggers that were defined).

orte_gpr.dump_subscriptions(0): same as dump_triggers, only for
subscriptions.
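
As a sketch of how the count argument can be used from a gdb prompt that is
already attached to mpirun (assuming debugging symbols are available):

```
(gdb) call orte_gpr.dump_triggers(5)
(gdb) call orte_gpr.dump_subscriptions(0)
```

The first call should show only the five most recently defined triggers; the
second dumps every subscription.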

More of these are defined, but they are fairly self-explanatory - you can
see them all listed in orte/mca/gpr/gpr.h.

Also, don't forget that you can dump *any* data type object using the
orte_dss.dump command - see orte/dss/dss.h for a description.

Hope that helps!
Ralph

