I am not convinced that the problem mentioned in that thread is the same as yours. To figure out whether the problem arises from ScaLAPACK, remove -D__SCALAPACK from DFLAGS and recompile: the code will then fall back to (much slower) internal routines for parallel dense-matrix diagonalization. You may also try running with dense-matrix diagonalization restricted to a single process (-nd 1; I am not sure it is honored, though).
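For example, something along these lines (untested sketch; the DFLAGS line is the one from the make.sys you posted below, and the input/output file names are placeholders):

   # in make.sys, drop -D__SCALAPACK from the preprocessor flags:
   DFLAGS = -D__INTEL -D__FFTW3 -D__MPI -D__PARA
   # then rebuild from scratch:
   make clean; make pw

   # alternatively, without recompiling, restrict dense diagonalization
   # to a single process at run time:
   mpirun -np 32 pw.x -nd 1 -inp pwscf.in > pwscf.out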
You should also report how you are running your code and, if you are using exotic parallelization levels such as "band groups" (-nb N), check whether the problem you see is related to their use.
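For instance, if your job script contains something like

   mpirun -np 32 pw.x -nb 4 -inp pwscf.in > pwscf.out   # hypothetical command line

try again with -nb 1 (or without -nb altogether) and see whether the MPI_Group_incl error goes away.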
Paolo

On Thu, Dec 1, 2016 at 11:37 PM, Ryan Herchig <[email protected]> wrote:

> Hello all,
>
> I am running pw.x in Quantum ESPRESSO version 5.4.0; however, if I try
> to run the job using more than 2 nodes with 8 cores each, I receive the
> following error:
>
> Fatal error in PMPI_Group_incl: Invalid rank, error stack:
> PMPI_Group_incl(185).............: MPI_Group_incl(group=0x88000004, n=4,
> ranks=0x2852700, new_group=0x7fff57564668) failed
> MPIR_Group_check_valid_ranks(253): Invalid rank in rank array at index 3;
> value is 33 but must be in the range 0 to 31
>
> I am building/running on a local cluster maintained by the university I
> attend. Each node has 2 x Intel Xeon E5-2670 (eight-core) CPUs, 32 GB of
> RAM, and QDR InfiniBand. I found a previous thread,
>
> https://www.mail-archive.com/[email protected]/msg27702.html
>
> involving espresso-5.3.0, in which another user seemed to be experiencing
> the same issue; there it was determined that "The problem is related to
> the obscure hacks needed to convince Scalapack to work in a subgroup of
> processors." The suggestion in that post was to change a line in
> Modules/mp_global.f90 and recompile. However, I am running spin-collinear
> vdW-DF calculations, which I believe require at least version 5.4.0, and
> the lines in the relevant subroutine of mp_global.f90 have changed;
> furthermore, following the suggestion of the previous post does not fix
> the issue. It instead produces the following compilation error:
>
> mp_global.f90(97): error #6631: A non-optional actual argument must be
> present when invoking a procedure with an explicit interface.
> [NPARENT_COMM]
>     CALL mp_start_diag ( ndiag_, intra_BGRP_comm )
> ---------^
> mp_global.f90(97): error #6631: A non-optional actual argument must be
> present when invoking a procedure with an explicit interface.
> [MY_PARENT_ID]
>     CALL mp_start_diag ( ndiag_, intra_BGRP_comm )
> ---------^
> compilation aborted for mp_global.f90 (code 1)
>
> Does this problem with the ScaLAPACK libraries persist in the newer
> versions, or could these errors have a separate origin? Possibly
> something I am doing wrong during the build? I have included the make.sys
> that I am using for "make pw" below. If the error is due to the ScaLAPACK
> libraries, is there a workaround which would allow the use of additional
> processors when running calculations? Thank you in advance.
>
> Thank you,
> Ryan Herchig
> University of South Florida, Department of Physics
>
>
> .SUFFIXES :
> .SUFFIXES : .o .c .f .f90
>
> .f90.o:
>         $(MPIF90) $(F90FLAGS) -c $<
>
> # .f.o and .c.o: do not modify
>
> .f.o:
>         $(F77) $(FFLAGS) -c $<
>
> .c.o:
>         $(CC) $(CFLAGS) -c $<
>
> TOPDIR = /work/r/rch/espresso-5.4.0
>
> MANUAL_DFLAGS =
> DFLAGS = -D__INTEL -D__FFTW3 -D__MPI -D__PARA -D__SCALAPACK
> FDFLAGS = $(DFLAGS) $(MANUAL_DFLAGS)
>
> IFLAGS = -I../include -I/apps/intel/2015/composer_xe_2015.3.187/mkl/include:/apps/intel/2015/composer_xe_2015.3.187/tbb/include
>
> MOD_FLAG = -I
>
> MPIF90 = mpif90
> #F90 = ifort
> CC = icc
> F77 = ifort
>
> CPP = cpp
> CPPFLAGS = -P -C -traditional $(DFLAGS) $(IFLAGS)
>
> CFLAGS = -O3 $(DFLAGS) $(IFLAGS)
> F90FLAGS = $(FFLAGS) -nomodule -fpp $(FDFLAGS) $(IFLAGS) $(MODFLAGS)
> FFLAGS = -O2 -assume byterecl -g -traceback
>
> FFLAGS_NOOPT = -O0 -assume byterecl -g -traceback
>
> FFLAGS_NOMAIN = -nofor_main
>
> LD = mpif90
> LDFLAGS =
> LD_LIBS =
>
> BLAS_LIBS = -lmkl_intel_lp64 -lmkl_sequential -lmkl_core
> BLAS_LIBS_SWITCH = external
>
> LAPACK_LIBS = -L/apps/intel/2015/composer_xe_2015.3.187/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core
> LAPACK_LIBS_SWITCH = external
>
> ELPA_LIBS_SWITCH = disabled
> SCALAPACK_LIBS = -lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_ilp64
>
> FFT_LIBS = -L/apps/intel/2015/composer_xe_2015.3.187/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core
>
> MPI_LIBS =
>
> MASS_LIBS =
>
> AR = ar
> ARFLAGS = ruv
>
> RANLIB = ranlib
>
> FLIB_TARGETS = all
>
> LIBOBJS = ../clib/clib.a ../iotk/src/libiotk.a
> LIBS = $(SCALAPACK_LIBS) $(LAPACK_LIBS) $(FFT_LIBS) $(BLAS_LIBS) $(MPI_LIBS) $(MASS_LIBS) $(LD_LIBS)
>
> WGET = wget -O
>
> PREFIX = /work/r/rch/espresso-5.4.0/EXE

--
Paolo Giannozzi, Dip. Scienze Matematiche Informatiche e Fisiche,
Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
Phone +39-0432-558216, fax +39-0432-558222
_______________________________________________
Pw_forum mailing list
[email protected]
http://pwscf.org/mailman/listinfo/pw_forum
