I've now seen several cases where people have a working, installed MPI library, yet when they come to use padb they discover a build problem with the Message Queue DLL which prevents it from working.
The most common problem is unresolved symbols in the DLL, which means the debugger cannot dlopen it at all or, in some cases, can only dlopen it with RTLD_LAZY, which introduces other problems.

Attached is a patch to the OpenMPI sources which adds a simple test program, to be built and run as part of the build procedure, that verifies the DLL can be loaded without error.

The test program itself is good; I'm less happy about the autoconf integration. The patch adds a check-local target to the debuggers Makefile (the only one in the source tree) which fails if there is a problem with the DLL. This causes "make check" to fail, but the check is not run by either "make" or "make install". As such it's a step forward, but it would be better if the test were performed in the make stage; I haven't figured out how to do this, however.

Ashley,

-- 
Ashley Pittman, Bath, UK.

Padb - A parallel job inspection tool for cluster computing
http://padb.pittman.org.uk
Index: ompi/debuggers/dlopen_test.c
===================================================================
--- ompi/debuggers/dlopen_test.c	(revision 0)
+++ ompi/debuggers/dlopen_test.c	(revision 0)
@@ -0,0 +1,40 @@
+
+#include <dlfcn.h>
+#include <stdio.h>
+
+int main (int argc, char *argv[]) {
+
+    char *filename = NULL;
+    void *dlhandle;
+
+    if ( argc > 1 ) {
+        filename = argv[1];
+    } else {
+        printf("You must supply a filename to open\n");
+        return 10;
+    }
+
+    printf("Trying to dlopen file %s\n",filename);
+
+    dlhandle = dlopen(filename,RTLD_NOW);
+
+    if ( dlhandle ) {
+        printf("File opened with RTLD_NOW, all passed\n");
+        return 0;
+    }
+
+    printf("Failed to open with RTLD_NOW: %s\n",dlerror());
+
+    printf("Retrying with RTLD_LAZY\n");
+
+    dlhandle = dlopen(filename,RTLD_LAZY);
+
+    if ( dlhandle ) {
+        printf("File opened with RTLD_LAZY\n");
+        return 1;
+    }
+
+    printf("Failed to open with RTLD_LAZY: %s\n",dlerror());
+
+    return 2;
+}
Index: ompi/debuggers/Makefile.am
===================================================================
--- ompi/debuggers/Makefile.am	(revision 22102)
+++ ompi/debuggers/Makefile.am	(working copy)
@@ -19,6 +19,7 @@
 
 noinst_LTLIBRARIES = libdebuggers.la libompi_debugger_canary.la
 pkglib_LTLIBRARIES = libompi_dbg_msgq.la
+check_PROGRAMS = dlopen_test
 
 # This is not quite in the Automake spirit, but we have to do it.
 # Since the totalview portion of the library must be built with -g, we
@@ -36,6 +37,13 @@
         ompi_common_dll_defs.h \
         msgq_interface.h ompi_msgq_dll_defs.h
 
+dlopen_test_SOURCES = dlopen_test.c
+
+check-local:
+	./dlopen_test$(EXEEXT) .libs/libompi_dbg_msgq.so
+
+dlopen_test_CFLAGS = -ldl
+
 libdebuggers_la_SOURCES = \
         $(headers) \
         ompi_debuggers.c
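
On the open question of running the test in the make stage: one possibility, offered only as an untested sketch, is Automake's all-local hook, which is run as part of a plain "make". The fragment below is a hypothetical variant of the Makefile.am hunk above, not part of the attached patch. It assumes that ompi/debuggers/Makefile.am does not already define noinst_PROGRAMS or all-local, that the build is not a cross-compile (the test program has to run on the build host), and it keeps the patch's platform-specific .libs/libompi_dbg_msgq.so path. It also moves -ldl from the per-target CFLAGS to LDADD, which is where Automake expects libraries to be listed.

```make
# Hypothetical, untested variant of the Makefile.am changes in the patch.

# noinst_PROGRAMS are built by a plain "make" but are never installed,
# whereas check_PROGRAMS are only built by "make check".
noinst_PROGRAMS = dlopen_test
dlopen_test_SOURCES = dlopen_test.c

# Libraries belong in LDADD so that they follow the objects on the link line.
dlopen_test_LDADD = -ldl

# all-local is run as part of "make all". The explicit prerequisites make
# sure both the test program and the DLL exist first, even under "make -j".
# The recipe line must start with a tab.
all-local: dlopen_test$(EXEEXT) libompi_dbg_msgq.la
	./dlopen_test$(EXEEXT) .libs/libompi_dbg_msgq.so
```

With this arrangement any non-zero exit from dlopen_test, including the RTLD_LAZY-only case (exit code 1), would stop "make" itself rather than only "make check". Whether failing the whole build is acceptable is a policy decision for the maintainers.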