Dear Meep users,

I am having a problem running MEEP-MPI with Open MPI on a cluster. The
calculation itself runs smoothly, but errors occur as soon as it begins to
write .h5 files. Testing a file (1.ctl) on two nodes, "main" and "main-1",
gives:
creating output file "./1-eps-000000.00.h5"...
[main-1:31223] *** An error occurred in MPI_Bcast
[main-1:31223] *** on communicator MPI COMMUNICATOR 36 DUP FROM 28
[main-1:31223] *** MPI_ERR_TRUNCATE: message truncated
[main-1:31223] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
[main][[31448,1],0][../../../../../../ompi/mca/btl/tcp/btl_tcp_frag.c:216:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
--------------------------------------------------------------------------
mpirun has exited due to process rank 2 with PID 31219 on
node main-1 exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
[main:28388] 1 more process has sent help message help-mpi-errors.txt / mpi_errors_are_fatal
[main:28388] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages

Each of the two nodes has two slots. The system is Ubuntu 9.04. Open MPI and
libhdf5-openmpi were installed with "apt-get install openmpi-bin
libopenmpi-dev libhdf5-openmpi-dev libhdf5-openmpi-1.6.6-0". Meep 1.1.1 was
configured with "./configure --with-mpi MPICXX=mpiCC". The configure output shows:
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... /bin/mkdir -p
checking for gawk... no
checking for mawk... mawk
checking whether make sets $(MAKE)... yes
checking whether to enable maintainer-specific portions of Makefiles... no
checking for g++... g++
checking for C++ compiler default output file name... a.out
checking whether the C++ compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables... 
checking for suffix of object files... o
checking whether we are using the GNU C++ compiler... yes
checking whether g++ accepts -g... yes
checking for style of include used by make... GNU
checking dependency style of g++... gcc3
checking for mpic++... mpiCC
checking for MPI_Init... yes
checking for mpi.h... yes
checking for extra flag needed to combine stdio.h and mpi.h... none
checking build system type... i686-pc-linux-gnu
checking host system type... i686-pc-linux-gnu
checking for gcc... gcc
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking dependency style of gcc... gcc3
checking for C++ compiler vendor... gnu
checking whether C++ compiler accepts -malign-double... yes
checking whether C++ compiler accepts -fstrict-aliasing... yes
checking whether C++ compiler accepts -ffast-math... yes
checking for gcc architecture flag... 
checking for x86 cpuid 0 output... 3:756e6547:6c65746e:49656e69
checking for x86 cpuid 1 output... f41:1020800:641d:bfebfbff
checking whether C++ compiler accepts -march=prescott... yes
checking for gcc architecture flag... -march=prescott
checking whether C++ compiler accepts -O3 -malign-double -fstrict-aliasing 
-march=prescott... yes
checking for a sed that does not truncate output... /bin/sed
checking for grep that handles long lines and -e... /bin/grep
checking for egrep... /bin/grep -E
checking for fgrep... /bin/grep -F
checking for ld used by gcc... /usr/bin/ld
checking if the linker (/usr/bin/ld) is GNU ld... yes
checking for BSD- or MS-compatible name lister (nm)... /usr/bin/nm -B
checking the name lister (/usr/bin/nm -B) interface... BSD nm
checking whether ln -s works... yes
checking the maximum length of command line arguments... 1572864
checking whether the shell understands some XSI constructs... yes
checking whether the shell understands "+="... yes
checking for /usr/bin/ld option to reload object files... -r
checking for objdump... objdump
checking how to recognize dependent libraries... pass_all
checking for ar... ar
checking for strip... strip
checking for ranlib... ranlib
checking command to parse /usr/bin/nm -B output from gcc object... ok
checking how to run the C preprocessor... gcc -E
checking for ANSI C header files... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking for dlfcn.h... yes
checking whether we are using the GNU C++ compiler... (cached) yes
checking whether mpiCC accepts -g... (cached) yes
checking dependency style of mpiCC... (cached) gcc3
checking how to run the C++ preprocessor... mpiCC -E
checking for objdir... .libs
checking if gcc supports -fno-rtti -fno-exceptions... no
checking for gcc option to produce PIC... -fPIC -DPIC
checking if gcc PIC flag -fPIC -DPIC works... yes
checking if gcc static flag -static works... yes
checking if gcc supports -c -o file.o... yes
checking if gcc supports -c -o file.o... (cached) yes
checking whether the gcc linker (/usr/bin/ld) supports shared libraries... yes
checking dynamic linker characteristics... GNU/Linux ld.so
checking how to hardcode library paths into programs... immediate
checking whether stripping libraries is possible... yes
checking if libtool supports shared libraries... yes
checking whether to build shared libraries... no
checking whether to build static libraries... yes
checking for ld used by mpiCC... /usr/bin/ld
checking if the linker (/usr/bin/ld) is GNU ld... yes
checking whether the mpiCC linker (/usr/bin/ld) supports shared libraries... yes
checking for mpiCC option to produce PIC... -fPIC -DPIC
checking if mpiCC PIC flag -fPIC -DPIC works... yes
checking if mpiCC static flag -static works... yes
checking if mpiCC supports -c -o file.o... yes
checking if mpiCC supports -c -o file.o... (cached) yes
checking whether the mpiCC linker (/usr/bin/ld) supports shared libraries... yes
checking dynamic linker characteristics... GNU/Linux ld.so
checking how to hardcode library paths into programs... immediate
checking for latex2html... latex2html
checking for sin in -lm... yes
checking for fftw_plan_dft_1d in -lfftw3... yes
checking for pkg-config... /usr/bin/pkg-config
checking for harminv >= 1.1... yes
checking HARMINV_CFLAGS...  
checking HARMINV_LIBS... -L/usr/lib/gcc/i486-linux-gnu/4.2.3 
-L/usr/lib/gcc/i486-linux-gnu/4.2.3/../../../../lib -L/lib/../lib 
-L/usr/lib/../lib -L/usr/lib/gcc/i486-linux-gnu/4.2.3/../../.. -lharminv 
-llapack -lblas -lgfortranbegin -lgfortran -lm -lgcc_s  
checking mpb.h usability... no
checking mpb.h presence... no
checking for mpb.h... no
checking for cblas_cgemm... yes
checking for gsl_sf_bessel_Jn in -lgsl... yes
checking for deflate in -lz... yes
checking for H5Pcreate in -lhdf5... yes
checking hdf5.h usability... yes
checking hdf5.h presence... yes
checking for hdf5.h... yes
checking for H5Pset_mpi... no
checking for H5Pset_fapl_mpio... yes
Looks like we have got 2 processors
checking for guile-config... guile-config
checking if linking to guile works... yes
checking for scm_make_smob_type... yes
checking for SCM_SMOB_PREDICATE... no
checking for SCM_SMOB_DATA... no
checking for SCM_NEWSMOB... no
checking how to activate readline in Guile... ice-9 readline
checking for libctl dir... /usr/local/share/libctl
checking for gen-ctl-io... gen-ctl-io
checking for ctl_get_vector3 in -lctl... yes
checking ctl.h usability... yes
checking ctl.h presence... yes
checking for ctl.h... yes
checking whether libctl version is at least 3.0.3... ok
checking for basename in -lgen... no
checking for feenableexcept... yes
checking whether feenableexcept declaration is usable... yes
checking whether to catch and ignore SIGFPE signals... no
checking whether time.h and sys/time.h may both be included... yes
checking sys/time.h usability... yes
checking sys/time.h presence... yes
checking for sys/time.h... yes
checking for BSDgettimeofday... no
checking for gettimeofday... yes
checking for cblas_ddot... yes
checking for cblas_daxpy... yes
checking for C/C++ restrict keyword... __restrict
configure: creating ./config.status
config.status: creating Makefile
config.status: creating meep-pkgconfig
config.status: creating src/Makefile
config.status: creating tests/Makefile
config.status: creating examples/Makefile
config.status: creating libctl/Makefile
config.status: creating libctl/meep.scm
config.status: creating config.h
config.status: config.h is unchanged
config.status: executing depfiles commands
config.status: executing libtool commands

The .h5 output is fine if only one node is used. I also found that the serial
HDF5 build (hdf5-serial) works across the two computers, although output is
slow. Can anyone help with this? Perhaps the cause is that the system is not
clean: MPICH, LAM, etc. were installed on it before.
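
One way to test the suspicion that leftover MPICH or LAM files are interfering
is to check, on each node, which MPI installation the tools on the PATH
actually resolve to. The following is a minimal sketch, not a verified fix: it
assumes the MPI build of Meep is installed as "meep-mpi", and the helper name
report_tool is made up for illustration.

```shell
#!/bin/sh
# Sketch: show which MPI installation each tool resolves to on this node.
# Run it on every node and compare the results.

# report_tool is a hypothetical helper, not part of Meep or Open MPI.
report_tool() {
    # Print the fully resolved path of command $1, or "not found".
    tool_path=$(command -v "$1" 2>/dev/null) || tool_path=""
    if [ -n "$tool_path" ]; then
        # readlink -f follows symlinks such as /etc/alternatives entries
        echo "$1 -> $(readlink -f "$tool_path")"
    else
        echo "$1 -> not found"
    fi
}

# "meep-mpi" is an assumed binary name; adjust it to the local install.
for tool in mpirun mpiCC meep-mpi; do
    report_tool "$tool"
done

# Open MPI's wrapper compilers print their underlying command line with -showme.
if command -v mpiCC >/dev/null 2>&1; then
    mpiCC -showme || echo "mpiCC does not accept -showme (maybe not Open MPI)"
fi

# List the MPI and HDF5 shared libraries the Meep binary is linked against.
if command -v meep-mpi >/dev/null 2>&1; then
    ldd "$(command -v meep-mpi)" | grep -i -E 'mpi|hdf5' \
        || echo "no MPI/HDF5 shared libraries listed"
fi
```

If the output differs between "main" and "main-1", or the library list mixes
in MPICH or LAM libraries, that mismatch would be worth resolving before
looking further.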

A minor issue: "export GUILE_WARN_DEPRECATED=no" has no effect on these
machines. The Guile version is 1.8.5.

Thanks in advance.

Best wishes,
Jiangjun Zheng


