Dear Meep users,

I am having a problem running MEEP-MPI with Open MPI on a cluster. The calculation itself runs smoothly, but errors occur as soon as it begins to write .h5 files. I tested a file (1.ctl) on two nodes, "main" and "main-1". The output is:

creating output file "./1-eps-000000.00.h5"...
[main-1:31223] *** An error occurred in MPI_Bcast
[main-1:31223] *** on communicator MPI COMMUNICATOR 36 DUP FROM 28
[main-1:31223] *** MPI_ERR_TRUNCATE: message truncated
[main-1:31223] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
[main][[31448,1],0][../../../../../../ompi/mca/btl/tcp/btl_tcp_frag.c:216:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
--------------------------------------------------------------------------
mpirun has exited due to process rank 2 with PID 31219 on node main-1 exiting without calling "finalize". This may have caused other processes in the application to be terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
[main:28388] 1 more process has sent help message help-mpi-errors.txt / mpi_errors_are_fatal
[main:28388] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
Both nodes have two slots. The system is Ubuntu 9.04. Open MPI and libhdf5-openmpi were installed with the command

apt-get install openmpi-bin libopenmpi-dev libhdf5-openmpi-dev libhdf5-openmpi-1.6.6-0

Meep-1.1.1 was configured with

./configure --with-mpi MPICXX=mpiCC

The configure output is:

checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... /bin/mkdir -p
checking for gawk... no
checking for mawk... mawk
checking whether make sets $(MAKE)... yes
checking whether to enable maintainer-specific portions of Makefiles... no
checking for g++... g++
checking for C++ compiler default output file name... a.out
checking whether the C++ compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables...
checking for suffix of object files... o
checking whether we are using the GNU C++ compiler... yes
checking whether g++ accepts -g... yes
checking for style of include used by make... GNU
checking dependency style of g++... gcc3
checking for mpic++... mpiCC
checking for MPI_Init... yes
checking for mpi.h... yes
checking for extra flag needed to combine stdio.h and mpi.h... none
checking build system type... i686-pc-linux-gnu
checking host system type... i686-pc-linux-gnu
checking for gcc... gcc
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking dependency style of gcc... gcc3
checking for C++ compiler vendor... gnu
checking whether C++ compiler accepts -malign-double... yes
checking whether C++ compiler accepts -fstrict-aliasing... yes
checking whether C++ compiler accepts -ffast-math... yes
checking for gcc architecture flag...
checking for x86 cpuid 0 output... 3:756e6547:6c65746e:49656e69
checking for x86 cpuid 1 output... f41:1020800:641d:bfebfbff
checking whether C++ compiler accepts -march=prescott... yes
checking for gcc architecture flag... -march=prescott
checking whether C++ compiler accepts -O3 -malign-double -fstrict-aliasing -march=prescott... yes
checking for a sed that does not truncate output... /bin/sed
checking for grep that handles long lines and -e... /bin/grep
checking for egrep... /bin/grep -E
checking for fgrep... /bin/grep -F
checking for ld used by gcc... /usr/bin/ld
checking if the linker (/usr/bin/ld) is GNU ld... yes
checking for BSD- or MS-compatible name lister (nm)... /usr/bin/nm -B
checking the name lister (/usr/bin/nm -B) interface... BSD nm
checking whether ln -s works... yes
checking the maximum length of command line arguments... 1572864
checking whether the shell understands some XSI constructs... yes
checking whether the shell understands "+="... yes
checking for /usr/bin/ld option to reload object files... -r
checking for objdump... objdump
checking how to recognize dependent libraries... pass_all
checking for ar... ar
checking for strip... strip
checking for ranlib... ranlib
checking command to parse /usr/bin/nm -B output from gcc object... ok
checking how to run the C preprocessor... gcc -E
checking for ANSI C header files... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking for dlfcn.h... yes
checking whether we are using the GNU C++ compiler... (cached) yes
checking whether mpiCC accepts -g... (cached) yes
checking dependency style of mpiCC... (cached) gcc3
checking how to run the C++ preprocessor... mpiCC -E
checking for objdir... .libs
checking if gcc supports -fno-rtti -fno-exceptions... no
checking for gcc option to produce PIC... -fPIC -DPIC
checking if gcc PIC flag -fPIC -DPIC works... yes
checking if gcc static flag -static works... yes
checking if gcc supports -c -o file.o... yes
checking if gcc supports -c -o file.o... (cached) yes
checking whether the gcc linker (/usr/bin/ld) supports shared libraries... yes
checking dynamic linker characteristics... GNU/Linux ld.so
checking how to hardcode library paths into programs... immediate
checking whether stripping libraries is possible... yes
checking if libtool supports shared libraries... yes
checking whether to build shared libraries... no
checking whether to build static libraries... yes
checking for ld used by mpiCC... /usr/bin/ld
checking if the linker (/usr/bin/ld) is GNU ld... yes
checking whether the mpiCC linker (/usr/bin/ld) supports shared libraries... yes
checking for mpiCC option to produce PIC... -fPIC -DPIC
checking if mpiCC PIC flag -fPIC -DPIC works... yes
checking if mpiCC static flag -static works... yes
checking if mpiCC supports -c -o file.o... yes
checking if mpiCC supports -c -o file.o... (cached) yes
checking whether the mpiCC linker (/usr/bin/ld) supports shared libraries... yes
checking dynamic linker characteristics... GNU/Linux ld.so
checking how to hardcode library paths into programs... immediate
checking for latex2html... latex2html
checking for sin in -lm... yes
checking for fftw_plan_dft_1d in -lfftw3... yes
checking for pkg-config... /usr/bin/pkg-config
checking for harminv >= 1.1... yes
checking HARMINV_CFLAGS...
checking HARMINV_LIBS... -L/usr/lib/gcc/i486-linux-gnu/4.2.3 -L/usr/lib/gcc/i486-linux-gnu/4.2.3/../../../../lib -L/lib/../lib -L/usr/lib/../lib -L/usr/lib/gcc/i486-linux-gnu/4.2.3/../../.. -lharminv -llapack -lblas -lgfortranbegin -lgfortran -lm -lgcc_s
checking mpb.h usability... no
checking mpb.h presence... no
checking for mpb.h... no
checking for cblas_cgemm... yes
checking for gsl_sf_bessel_Jn in -lgsl... yes
checking for deflate in -lz... yes
checking for H5Pcreate in -lhdf5... yes
checking hdf5.h usability... yes
checking hdf5.h presence... yes
checking for hdf5.h... yes
checking for H5Pset_mpi... no
checking for H5Pset_fapl_mpio... yes
Looks like we have got 2 processors
checking for guile-config... guile-config
checking if linking to guile works... yes
checking for scm_make_smob_type... yes
checking for SCM_SMOB_PREDICATE... no
checking for SCM_SMOB_DATA... no
checking for SCM_NEWSMOB... no
checking how to activate readline in Guile... ice-9 readline
checking for libctl dir... /usr/local/share/libctl
checking for gen-ctl-io... gen-ctl-io
checking for ctl_get_vector3 in -lctl... yes
checking ctl.h usability... yes
checking ctl.h presence... yes
checking for ctl.h... yes
checking whether libctl version is at least 3.0.3... ok
checking for basename in -lgen... no
checking for feenableexcept... yes
checking whether feenableexcept declaration is usable... yes
checking whether to catch and ignore SIGFPE signals... no
checking whether time.h and sys/time.h may both be included... yes
checking sys/time.h usability... yes
checking sys/time.h presence... yes
checking for sys/time.h... yes
checking for BSDgettimeofday... no
checking for gettimeofday... yes
checking for cblas_ddot... yes
checking for cblas_daxpy... yes
checking for C/C++ restrict keyword... __restrict
configure: creating ./config.status
config.status: creating Makefile
config.status: creating meep-pkgconfig
config.status: creating src/Makefile
config.status: creating tests/Makefile
config.status: creating examples/Makefile
config.status: creating libctl/Makefile
config.status: creating libctl/meep.scm
config.status: creating config.h
config.status: config.h is unchanged
config.status: executing depfiles commands
config.status: executing libtool commands

The .h5 output is fine if only one node is used. I also found that serial HDF5 (hdf5-serial) works across the two computers, although the output is slow.

Can anyone help with this? Perhaps the problem is that the system is not clean: MPICH, LAM, etc. were installed on it before.

A minor point: "export GUILE_WARN_DEPRECATED=no" doesn't work on these machines. The Guile version is 1.8.5.
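
On the suspicion that leftover MPICH or LAM installations are involved, a check along the following lines might help narrow things down on each node. It is only a sketch under stated assumptions: the function name find_mpi_tools and the list of tool names in TOOLS are chosen here for illustration, not taken from Meep or Open MPI, and the script only reports which launchers and compiler wrappers are visible on PATH. It does not prove which MPI library meep-mpi or HDF5 was actually linked against.

```python
import os

# Launcher and compiler-wrapper names to look for. This list is an
# assumption for illustration; adjust it to the tools actually in use.
TOOLS = ("mpirun", "mpiexec", "mpicc", "mpiCC", "mpic++")


def find_mpi_tools(path=None, tools=TOOLS):
    """Map each tool name to every resolved executable found on PATH.

    More than one distinct target for a name suggests that several MPI
    installations are visible at the same time.
    """
    if path is None:
        path = os.environ.get("PATH", "")
    found = {}
    for directory in path.split(os.pathsep):
        if not directory:
            continue
        for tool in tools:
            candidate = os.path.join(directory, tool)
            if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
                # Follow symlinks so that alternatives-style links show
                # the implementation they really point to.
                target = os.path.realpath(candidate)
                targets = found.setdefault(tool, [])
                if target not in targets:
                    targets.append(target)
    return found


if __name__ == "__main__":
    for tool, targets in sorted(find_mpi_tools().items()):
        note = "  <-- more than one on PATH" if len(targets) > 1 else ""
        print(f"{tool}: {', '.join(targets)}{note}")
```

Running the script on both "main" and "main-1" and comparing the output would show whether the two nodes resolve mpirun and mpiCC to the same implementation, and whether any name resolves to more than one.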
Thanks in advance.

Best wishes,
Jiangjun Zheng

