On Dec 27, 2005, at 4:30 PM, Leslie Watter wrote:

Hi people,

I'm trying to develop a new btl module using LLC protocol.
I've based the code on the TCP btl code and I'm experiencing some problems like:
------
mpirun noticed that job rank 1 with PID 1763 on node "servidor" exited
on signal 11.
1 process killed (possibly by Open MPI)
------

Is there a way that I can debug this or find out where the code is being killed?

The easiest way is to use TotalView, but that can be expensive. If you are only running a couple of processes in your job, you can use gdb to debug them. This is easiest when you use ssh to log in to the remote nodes, since ssh can set up X11 forwarding for you. Running the command:

  mpirun -np X -d xterm -e gdb <my app>

will start X xterms, with one gdb process in each xterm. From there, you can debug the processes to figure out what your code is doing.
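
If X11 forwarding is not available, a common alternative is to have each process print its PID and wait until a debugger attaches. The following is only a minimal sketch of that idea: wait_for_debugger and its parameters are hypothetical names, not part of Open MPI, and the timeout is there just so a forgotten call cannot hang the job forever.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper, not part of Open MPI.
 * Prints this process's PID, then polls *released once per second for
 * at most max_seconds. Returns 1 if the flag was set (for example from
 * an attached gdb), 0 if the timeout expired first. */
int wait_for_debugger(volatile int *released, int max_seconds)
{
    int waited = 0;

    assert(released != NULL);
    fprintf(stderr, "PID %ld waiting for debugger attach\n", (long)getpid());

    while (!*released && waited < max_seconds) {
        sleep(1);   /* poll slowly so the wait does not burn CPU */
        waited++;
    }
    return *released ? 1 : 0;
}
```

Assuming it is called early in the code under test, you could attach with "gdb -p <PID>" on the node in question, move up to the wait_for_debugger frame, run "set var *released = 1", and then continue to the crash.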

I'm using BTL_OUTPUT to trace the execution, is there another way?

This is probably the easiest way. The macros are already there, and in optimized builds none of that code is executed (which is nice for performance).
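
To illustrate the general pattern (this is not the actual BTL_OUTPUT definition), here is a small sketch of a trace macro that compiles to nothing unless a debug flag is defined. MY_TRACE, MY_ENABLE_DEBUG, my_trace_printf and my_trace_demo are all made-up names for the example.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper: prints "[file:line] message" to stderr. */
void my_trace_printf(const char *file, int line, const char *fmt, ...)
{
    va_list ap;

    assert(fmt != NULL);
    fprintf(stderr, "[%s:%d] ", file, line);
    va_start(ap, fmt);
    vfprintf(stderr, fmt, ap);
    va_end(ap);
    fputc('\n', stderr);
}

/* With MY_ENABLE_DEBUG defined the macro forwards to the helper;
 * otherwise it expands to an empty statement, so its arguments are
 * never evaluated and no call is emitted. */
#ifdef MY_ENABLE_DEBUG
#define MY_TRACE(...) my_trace_printf(__FILE__, __LINE__, __VA_ARGS__)
#else
#define MY_TRACE(...) do { } while (0)
#endif

/* Returns how many times the macro's argument was evaluated:
 * 1 in a debug build, 0 when tracing is compiled out. */
int my_trace_demo(void)
{
    int evaluated = 0;

    MY_TRACE("sending fragment %d", ++evaluated);
    return evaluated;
}
```

Compiling with -DMY_ENABLE_DEBUG turns the trace on. One consequence of this design is that trace arguments must not have side effects the rest of the code relies on, since they disappear in optimized builds.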

Good luck!

Brian


--
  Brian Barrett
  Open MPI developer
  http://www.open-mpi.org/

