Hi, I would like to profile the openib BTL code (and a few other routines in the ompi layer) to find performance bottlenecks. Could anyone who has done something similar post instructions on how to go about it? I'm using the GNU compilers on a SLES 9 box (ppc64). So far I have been able to step through the PML/BTL layers with gdb and have a fair understanding of this part of the code. It would be really helpful to hear from the core developers how they go about diagnosing performance problems in Open MPI.
-Aleph