FYI:

On several systems where Jeff's tarball for pr410 ran fine yesterday, I am
seeing errors in today's tarball related to either libibverbs or mca_oob_ud.


Issue #1:
On Solaris, verbs support is now rejected at configure time.
Configure output appears below as "1)".

Issue #2:
On Linux I get undefined symbols at either build time (from
oob_ud_component.o when static linking orte-clean) or at runtime (dynamic
linker again complaining about undefined symbol(s) in mca_oob_ud.so).  In
both cases I would venture a guess that some linker option (-L or -l ?) is
missing.
Outputs appear below as "2a)" and "2b)".
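
One quick way to back up the "missing -l" guess is to collect the undefined
symbol names from the linker or loader output and see whether they share a
prefix. The sketch below is only an illustration: the function names
(undefined_symbols, prefix_counts) are made up for this example and are not
part of Open MPI, and the sample text is abbreviated from the outputs in 2a)
and 2b). If every name starts with "ibv_", that points at libibverbs
(-libverbs) being absent from the link line.

```python
import re
from collections import Counter

# Matches GNU ld's "undefined reference to `sym'" and the dynamic
# loader's "undefined symbol: sym" message formats.
UNDEF_RE = re.compile(r"undefined (?:reference to [`'](\w+)'|symbol: (\w+))")


def undefined_symbols(log_text):
    """Return the sorted, de-duplicated undefined symbol names in log_text."""
    found = set()
    for match in UNDEF_RE.finditer(log_text):
        found.add(match.group(1) or match.group(2))
    return sorted(found)


def prefix_counts(symbols):
    """Count symbols by the text before their first underscore."""
    return Counter(name.split("_", 1)[0] for name in symbols)


if __name__ == "__main__":
    # Abbreviated sample; paste the real build or run output here instead.
    sample = (
        "oob_ud_component.c:220: undefined reference to `ibv_open_device'\n"
        "oob_ud_component.c:228: undefined reference to `ibv_query_device'\n"
        "mca_oob_ud.so: undefined symbol: ibv_get_device_list\n"
    )
    symbols = undefined_symbols(sample)
    print(symbols)
    print(prefix_counts(symbols))
```

This only summarizes the log text; it does not inspect the libraries
themselves, so it supports the guess rather than proving it.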

I am currently assuming these failures with Jeff's tarball for PR410
reflect recent changes in 'master' (e.g. I assume Jeff rebased his branch
since the previous tarballs), but I don't have time to confirm that.  I
believe that the failure in the static linking case eliminates Jeff's
dlopen-related work from consideration as a root cause, but I could
certainly be mistaken.

If these two issues are known, or have already been fixed in 'master', then
just say so and I'll drop this.
Otherwise, let me know what additional files/output you want to see and who
to send it to.

-Paul

1) Configure output from Solaris-11 on x86-64, using GNU compilers:

--- MCA component common:verbs (m4 configuration macro)
checking for MCA component common:verbs compile mode... dso
checking if want to add padding to the openib BTL control header... no
checking for fcntl.h... (cached) yes
checking sys/poll.h usability... yes
checking sys/poll.h presence... yes
checking for sys/poll.h... yes
checking infiniband/verbs.h usability... yes
checking infiniband/verbs.h presence... yes
checking for infiniband/verbs.h... yes
looking for library without search path
checking for library containing ibv_open_device... -libverbs
checking number of arguments to ibv_create_cq... unknown
configure: WARNING: Can not determine number of args to ibv_create_cq.
configure: WARNING: Not building component.
checking if ConnectX XRC support is enabled... no
checking if ConnectIB XRC support is enabled... no
checking if dynamic SL is enabled... no
configure: WARNING: Verbs support requested (via --with-verbs) but not
found.
configure: WARNING: If you are using libibverbs v1.0 (i.e., OFED v1.0 or
v1.1), you *MUST* have both the libsysfs headers and libraries installed.
Later versions of libibverbs do not require libsysfs.
configure: error: Aborting.

2a) Failure at build time on Linux with "--enable-static --disable-shared"
(truncated):

/bin/sh ../../../libtool  --tag=CC   --mode=link gcc -std=gnu99  -g
-finline-functions -fno-strict-aliasing -pthread   -o orte-clean
orte-clean.o ../../../orte/libopen-rte.la ../../../opal/libopen-pal.la -lrt
-lm -lutil
  -lrt -lm -lutil
libtool: link: gcc -std=gnu99 -g -finline-functions
-fno-strict-aliasing -pthread -o orte-clean orte-clean.o
../../../orte/.libs/libopen-rte.a -L/usr/syscom/opt/torque/4.1.4/lib
/usr/syscom/opt/torque/4.1.4/lib/libtorque.so -lxml2 -lz -lcrypto -lssl -lpthread
/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-pr410-v2-linux-x86_64-static/BLD/opal/.libs/libopen-pal.a
../../../opal/.libs/libopen-pal.a -ldl -lrt -lm -lutil -pthread -Wl,-rpath
 -Wl,/usr/syscom/opt/torque/4.1.4/lib -Wl,-rpath
-Wl,/usr/syscom/opt/torque/4.1.4/lib
../../../orte/.libs/libopen-rte.a(oob_ud_component.o): In function
`mca_oob_ud_device_setup':/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-pr410-v2-linux-x86_64-static/openmpi-gitclone/orte/mca/oob/ud/oob_ud_component.c:220:
undefined reference to `ibv_open_device'
/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-pr410-v2-linux-x86_64-static/openmpi-gitclone/orte/mca/oob/ud/oob_ud_component.c:228:
undefined reference to `ibv_query_device'
/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-pr410-v2-linux-x86_64-static/openmpi-gitclone/orte/mca/oob/ud/oob_ud_component.c:236:
undefined reference to `ibv_create_comp_channel'
/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-pr410-v2-linux-x86_64-static/openmpi-gitclone/orte/mca/oob/ud/oob_ud_component.c:244:
undefined reference to `ibv_alloc_pd'
../../../orte/.libs/libopen-rte.a(oob_ud_component.o): In function
`mca_oob_ud_component_startup':
/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-pr410-v2-linux-x86_64-static/openmpi-gitclone/orte/mca/oob/ud/oob_ud_component.c:291:
undefined reference to `ibv_get_device_list'
[and many more]


2b)  Failure at run time on Linux with only "normal" configure options:

$ mpirun -mca btl sm,self -np 2 examples/ring_c
/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-pr410-v2-linux-x86_64-icc-11.1/INST/bin/mpirun:
symbol lookup error:
/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-pr410-v2-linux-x86_64-icc-11.1/INST/lib/openmpi/mca_oob_ud.so:
undefined symbol: ibv_get_device_list




-- 
Paul H. Hargrove                          phhargr...@lbl.gov
Computer Languages & Systems Software (CLaSS) Group
Computer Science Department               Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
