I've now seen several cases where people have a functional, installed
MPI library, yet when they come to use padb they discover a build
problem with the Message Queue DLL which prevents it from working.

The most common problem is unresolved symbols in the DLL, meaning the
debugger cannot dlopen it at all, or in some cases can only dlopen it
with RTLD_LAZY, which defers symbol resolution until first use and so
introduces other problems.

Attached is a patch to the OpenMPI sources which adds a simple test
program, built and run as part of the build procedure, that verifies
the DLL can be loaded without error.

The test program itself is good, but I'm less happy about the automake
integration. The patch adds a check-local target to the debuggers
Makefile.am (the only one in the source tree) which fails if there is a
problem with the DLL. This causes "make check" to fail, but that target
isn't run by either "make" or "make install". As such it's a step
forward, but it would be better if the test was performed in the make
stage; I haven't figured out how to do this, however.
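
One possible way to run the test during plain "make" is sketched below.
It is untested against the Open MPI tree and rests on assumptions: the
program is listed in noinst_PROGRAMS rather than check_PROGRAMS so that
it is built by "make all", and it is run from Automake's all-local hook
in place of check-local. The explicit prerequisites on all-local are
there so that a parallel make builds the program and the library before
running the test. The .libs/libompi_dbg_msgq.so path is taken from the
patch and assumes a native ELF build; running the program at build time
would not work when cross-compiling.

```make
# Sketch only: replaces the check_PROGRAMS and check-local lines of the patch.

# noinst_PROGRAMS are built by "make all" but never installed.
noinst_PROGRAMS = dlopen_test
dlopen_test_SOURCES = dlopen_test.c
dlopen_test_LDADD = -ldl

# all-local is run as part of "make all"; a non-zero exit status from
# dlopen_test makes the build fail.
all-local: dlopen_test$(EXEEXT) libompi_dbg_msgq.la
	./dlopen_test$(EXEEXT) .libs/libompi_dbg_msgq.so
```

The drawback of this arrangement is that a DLL which only loads with
RTLD_LAZY (exit status 1) would now break the whole build rather than
just "make check".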

Ashley,

-- 

Ashley Pittman, Bath, UK.

Padb - A parallel job inspection tool for cluster computing
http://padb.pittman.org.uk
Index: ompi/debuggers/dlopen_test.c
===================================================================
--- ompi/debuggers/dlopen_test.c	(revision 0)
+++ ompi/debuggers/dlopen_test.c	(revision 0)
@@ -0,0 +1,40 @@
+
+#include <dlfcn.h>
+#include <stdio.h>
+
+int main (int argc, char *argv[]) {
+	
+	char *filename = NULL;
+	void *dlhandle;
+	
+	if ( argc > 1 ) {
+		filename = argv[1];
+	} else {
+		printf("You must supply a filename to open\n");
+		return 10;
+	}
+
+	printf("Trying to dlopen file %s\n",filename);
+
+	dlhandle = dlopen(filename,RTLD_NOW);
+
+	if ( dlhandle ) {
+		printf("File opened with RTLD_NOW, all passed\n");
+		return 0;
+	}
+
+	printf("Failed to open with RTLD_NOW: %s\n",dlerror());
+	
+	printf("Retrying with RTLD_LAZY\n");
+
+	dlhandle = dlopen(filename,RTLD_LAZY);
+
+	if ( dlhandle ) {
+		printf("File opened with RTLD_LAZY\n");
+		return 1;
+	}
+
+	printf("Failed to open with RTLD_LAZY: %s\n",dlerror());
+
+	return 2;
+}
Index: ompi/debuggers/Makefile.am
===================================================================
--- ompi/debuggers/Makefile.am	(revision 22102)
+++ ompi/debuggers/Makefile.am	(working copy)
@@ -19,6 +19,7 @@

 noinst_LTLIBRARIES = libdebuggers.la libompi_debugger_canary.la
 pkglib_LTLIBRARIES = libompi_dbg_msgq.la
+check_PROGRAMS = dlopen_test

 # This is not quite in the Automake spirit, but we have to do it.
 # Since the totalview portion of the library must be built with -g, we
@@ -36,6 +37,13 @@
         ompi_common_dll_defs.h \
         msgq_interface.h ompi_msgq_dll_defs.h

+dlopen_test_SOURCES = dlopen_test.c
+
+check-local:
+	./dlopen_test$(EXEEXT) .libs/libompi_dbg_msgq.so
+
+dlopen_test_LDADD = -ldl
+
 libdebuggers_la_SOURCES = \
         $(headers) \
         ompi_debuggers.c
