[OMPI devel] Dumping process status etc.

2007-05-22 Thread Ralph H Castain
This came up in today's telecon and I promised to send this to George -
however, it occurred to me that others may also want to know.

If you want to dump info for debugging purposes, and you can get into
orterun/mpirun (e.g., via gdb), you can dump info on anything with the
calls below. (NOTE: gdb will frequently truncate the output from these
commands, which is why there are so many of them and why they are somewhat
detailed. When debugging, I tend to bury the more verbose ones in the code
itself so I can be sure to see the entire output.)

orte_gpr.dump_all(0): this will dump *everything* in the registry to opal
output stream 0 (or whatever one you care to designate), including all the
info on trigger status (e.g., whether it has fired or not).
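As a rough sketch of how such a call might be issued without an interactive session, the snippet below writes a small gdb command file. The file name gpr-dump-all.gdb is arbitrary, and I am assuming gdb can resolve orte_gpr in the attached mpirun (i.e., it was built with debugging symbols). The dump is produced by mpirun itself, so I would expect it to show up wherever mpirun's own output goes rather than in gdb's output.

```shell
# Write a small gdb command file (the file name is an arbitrary choice).
cat > gpr-dump-all.gdb <<'EOF'
set pagination off
call orte_gpr.dump_all(0)
detach
quit
EOF

# Then attach to the running mpirun/orterun and execute the file, e.g.:
#   gdb -p <pid-of-mpirun> -x gpr-dump-all.gdb
```

Detaching before quitting leaves mpirun running, so the dump can be repeated later.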

orte_gpr.dump_segment(segment-name): this will provide the info stored on
any segment of the registry. Standard segments worth looking at include:

1. "orte-job-1": shows info on all procs in your initial applications,
including their reported state

2. "orte-node": what nodes are known to the system, and anything about their
status

3. "orte-job-0": info on all daemons in the system
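
To look at the three standard segments in one pass, something like the following could generate the gdb commands. This is only a sketch: the file name gpr-segments.gdb is made up, and it assumes dump_segment takes just the segment name as a string, as written above (check orte/mca/gpr/gpr.h for the real signature).

```shell
# Build a gdb command file with one dump_segment call per standard segment.
{
    echo 'set pagination off'
    for seg in orte-job-1 orte-node orte-job-0; do
        printf 'call orte_gpr.dump_segment("%s")\n' "$seg"
    done
    echo 'detach'
    echo 'quit'
} > gpr-segments.gdb

# Show what was generated; run it with:
#   gdb -p <pid-of-mpirun> -x gpr-segments.gdb
cat gpr-segments.gdb
```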


orte_gpr.dump_triggers(0): status and info on all triggers. The "0" argument
indicates that you want them all dumped to the screen. Since gdb doesn't
like getting too much info, you can use this argument to specify how many
you want to see, counting back from the end of the list (e.g., "5" says give
me the last five triggers that were defined).

orte_gpr.dump_subscriptions(0): same as above, only for subscriptions
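
To keep the output small enough for gdb, the count argument can be parameterized. The helper below is a hypothetical sketch (tail_dump_script and gpr-tail.gdb are names invented here); it also assumes the count argument works the same way for dump_subscriptions, which is how I read "same as above".

```shell
# Emit gdb commands that dump only the last N triggers and subscriptions.
# N defaults to 0, which is described above as "dump them all".
tail_dump_script() {
    n=${1:-0}
    echo 'set pagination off'
    printf 'call orte_gpr.dump_triggers(%s)\n' "$n"
    printf 'call orte_gpr.dump_subscriptions(%s)\n' "$n"
    echo 'detach'
    echo 'quit'
}

# Example: commands for the last five of each; run the result with
#   gdb -p <pid-of-mpirun> -x gpr-tail.gdb
tail_dump_script 5 > gpr-tail.gdb
```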

More of these are defined, and they are fairly self-explanatory; you can see
them all listed in orte/mca/gpr/gpr.h.

Also, don't forget that you can dump *any* data type object using the
orte_dss.dump command - see orte/dss/dss.h for a description.

Hope that helps!
Ralph




Re: [OMPI devel] [devel-core] Dumping process status etc.

2007-05-22 Thread Josh Hursey
You can also use the orte-ps tool to get a dump of the GPR. On the machine
where 'mpirun' is running:

 shell$ orte-ps --dump

This will call orte_gpr.dump_all(0) and push the output to the terminal.

It gives quick-and-dirty access to this information at any point in time.
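
Since the dump goes to the terminal, it can be handy to capture it in a file for later comparison. A minimal sketch follows; the function name snapshot_gpr and the default file name are made up, and because I am not sure whether the dump arrives on stdout or stderr, both are redirected.

```shell
# Save the output of "orte-ps --dump" to a file (default: gpr-dump.txt).
# Returns 127 if orte-ps is not on the PATH.
snapshot_gpr() {
    out=${1:-gpr-dump.txt}
    if ! command -v orte-ps >/dev/null 2>&1; then
        echo "orte-ps not found in PATH" >&2
        return 127
    fi
    # Capture both streams, since the stream used for the dump is not known.
    orte-ps --dump >"$out" 2>&1
}
```

For example, "snapshot_gpr before.txt" and later "snapshot_gpr after.txt", followed by "diff before.txt after.txt", would show what changed in the registry between the two points in time.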


-- Josh
