I usually run "mpirun -np 2 ./test". I always run on a single node. The message appears with either 1 or 2 GPUs on that node.
2013/1/24 Rolf vandeVaart <rvandeva...@nvidia.com>

> Thanks for this report. I will look into this. Can you tell me what your
> mpirun command looked like and do you know what transport you are running
> over?
>
> Specifically, is this on a single node or multiple nodes?
>
> Rolf
>
> *From:* devel-boun...@open-mpi.org [mailto:devel-boun...@open-mpi.org] *On
> Behalf Of* Alessandro Fanfarillo
> *Sent:* Thursday, January 24, 2013 4:11 AM
> *To:* de...@open-mpi.org
> *Subject:* [OMPI devel] CUDA support doesn't work starting from
> 1.9a1r27862
>
> Dear all,
>
> I would like to report a bug for the CUDA support on the last 5 trunk
> versions.
>
> The attached code is a simply send/receive test case which correctly works
> with version 1.9a1r27844.
>
> Starting from version 1.9a1r27862 up to 1.9a1r27897 I get the following
> message:
>
> ./test: symbol lookup error:
> /usr/local/openmpi/lib/openmpi/mca_pml_ob1.so: undefined symbol:
> progress_one_cuda_htod_event
> ./test: symbol lookup error:
> /usr/local/openmpi/lib/openmpi/mca_pml_ob1.so: undefined symbol:
> progress_one_cuda_htod_event
> --------------------------------------------------------------------------
> mpirun has exited due to process rank 0 with PID 21641 on
> node ip-10-16-24-100 exiting improperly. There are three reasons this
> could occur:
>
> 1. this process did not call "init" before exiting, but others in
> the job did. This can cause a job to hang indefinitely while it waits
> for all processes to call "init". By rule, if one process calls "init",
> then ALL processes must call "init" prior to termination.
>
> 2. this process called "init", but exited without calling "finalize".
> By rule, all processes that call "init" MUST call "finalize" prior to
> exiting or it will be considered an "abnormal termination"
>
> 3. this process called "MPI_Abort" or "orte_abort" and the mca parameter
> orte_create_session_dirs is set to false. In this case, the run-time cannot
> detect that the abort call was an abnormal termination. Hence, the only
> error message you will receive is this one.
>
> This may have caused other processes in the application to be
> terminated by signals sent by mpirun (as reported here).
>
> You can avoid this message by specifying -quiet on the mpirun command line.
>
> --------------------------------------------------------------------------
>
> I'm using gcc-4.7.2 and CUDA 4.2. The test fails also with CUDA 4.1.
>
> Thanks in advance.
>
> Best regards.
>
> Alessandro Fanfarillo