Hi,
Thank you for replying.
More details:
1. input data:
&control
calculation='scf'
restart_mode='from_scratch',
pseudo_dir = '../pot/',
outdir='./out/'
prefix='BaTiO3'
/
&system
nbnd = 48
ibrav = 0, nat = 5, ntyp = 3
ecutwfc = 50
occupations='smearing', smearing='gaussian', degauss=0.02
/
&electrons
conv_thr = 1.0e-8
/
ATOMIC_SPECIES
Ba 137.327 Ba.pbe-mt_fhi.UPF
Ti 204.380 Ti.pbe-mt_fhi.UPF
O 15.999 O.pbe-mt_fhi.UPF
ATOMIC_POSITIONS
Ba 0.0000000000000000 0.0000000000000000 0.0000000000000000
Ti 0.5000000000000000 0.5000000000000000 0.4819999933242795
O 0.5000000000000000 0.5000000000000000 0.0160000007599592
O 0.5000000000000000 -0.0000000000000000 0.5149999856948849
O 0.0000000000000000 0.5000000000000000 0.5149999856948849
K_POINTS (automatic)
11 11 11 0 0 0
CELL_PARAMETERS {angstrom}
3.999800000000001 0.000000000000000 0.000000000000000
0.000000000000000 3.999800000000001 0.000000000000000
0.000000000000000 0.000000000000000 4.018000000000000
2. number of processors:
I tested with 24 cores and with 8 cores, and both yield the same result.
3. type of parallelization:
I'm not sure what you mean. I run pw.x with:
mpirun -np 24 pw.x < BTO.scf.in >> output
'which mpirun' output:
/opt/intel/compilers_and_libraries_2016.3.210/linux/mpi/intel64/bin/mpirun
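For context, "type of parallelization" in pw.x usually refers to the command-line options that split the MPI processes into groups, such as -nk (k-point pools) and -nd (linear-algebra group). The command above passes none of them, so pw.x chooses its defaults. Below is a minimal sketch of how such a command line could be assembled. The helper name build_pw_command is hypothetical and not part of Quantum ESPRESSO, and the sanity checks in it are assumptions made for illustration, not rules enforced by pw.x in exactly this form.

```python
import shlex


def build_pw_command(nproc, input_file, nk=None, nd=None,
                     mpirun="mpirun", pw="pw.x"):
    """Assemble an mpirun command line for pw.x (hypothetical helper).

    nk -> value passed to -nk (number of k-point pools)
    nd -> value passed to -nd (size of the linear-algebra group)
    """
    if nproc < 1:
        raise ValueError("nproc must be at least 1")

    cmd = [mpirun, "-np", str(nproc), pw]

    if nk is not None:
        # Assumption for this sketch: pools should divide the processes evenly.
        if nk < 1 or nproc % nk != 0:
            raise ValueError("nk must be a positive divisor of nproc")
        cmd += ["-nk", str(nk)]

    if nd is not None:
        # Assumption for this sketch: the linear-algebra group cannot be
        # larger than the number of processes in one pool.
        per_pool = nproc // (nk or 1)
        if nd < 1 or nd > per_pool:
            raise ValueError("nd must be between 1 and the pool size")
        cmd += ["-nd", str(nd)]

    # Pass the input file with -i instead of shell redirection.
    cmd += ["-i", input_file]
    return " ".join(shlex.quote(part) for part in cmd)


if __name__ == "__main__":
    # Same layout as the run above: no explicit parallelization options.
    print(build_pw_command(24, "BTO.scf.in"))
    # Explicit layout: 4 pools, serial subspace diagonalization.
    print(build_pw_command(24, "BTO.scf.in", nk=4, nd=1))
```

Varying these options (for example -nd 1) may help show whether the error depends on the parallelization layout.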
4. when the error occurs:
In the middle of the run. The last few lines of the output are:
total cpu time spent up to now is 32.9 secs
total energy = -105.97885119 Ry
Harris-Foulkes estimate = -105.99394457 Ry
estimated scf accuracy < 0.03479229 Ry
iteration # 7 ecut= 50.00 Ry beta=0.70
Davidson diagonalization with overlap
ethr = 1.45E-04, avg # of iterations = 2.7
total cpu time spent up to now is 37.3 secs
total energy = -105.99039982 Ry
Harris-Foulkes estimate = -105.99025175 Ry
estimated scf accuracy < 0.00927902 Ry
iteration # 8 ecut= 50.00 Ry beta=0.70
Davidson diagonalization with overlap
5. Error message:
Something like:
Fatal error in PMPI_Cart_sub: Other MPI error, error stack:
PMPI_Cart_sub(242)...................: MPI_Cart_sub(comm=0xc400fcf3,
remain_dims=0x7ffc03ae5f38, comm_new=0x7ffc03ae5e90) failed
PMPI_Cart_sub(178)...................:
MPIR_Comm_split_impl(270)............:
MPIR_Get_contextid_sparse_group(1330): Too many communicators (0/16384 free on
this process; ignore_id=0)
Fatal error in PMPI_Cart_sub: Other MPI error, error stack:
PMPI_Cart_sub(242)...................: MPI_Cart_sub(comm=0xc400fcf3,
remain_dims=0x7ffd10080408, comm_new=0x7ffd10080360) failed
PMPI_Cart_sub(178)...................:
Cheers!
Chong
________________________________
From: [email protected] <[email protected]> on behalf of
Paolo Giannozzi <[email protected]>
Sent: Sunday, May 15, 2016 3:43 PM
To: PWSCF Forum
Subject: Re: [Pw_forum] mpi error using pw.x
Please tell us what is wrong and we will fix it.
Seriously: nobody can answer your question unless you specify, as a strict
minimum, input data, number of processors and type of parallelization that
trigger the error, and where the error occurs (at startup, later, in the middle
of the run, ...).
Paolo
On Sun, May 15, 2016 at 7:50 AM, Chong Wang
<[email protected]> wrote:
I compiled Quantum ESPRESSO 5.4 with Intel MPI and MKL 2016 Update 3.
However, when I ran pw.x the following errors were reported:
...
MPIR_Get_contextid_sparse_group(1330): Too many communicators (0/16384 free on
this process; ignore_id=0)
Fatal error in PMPI_Cart_sub: Other MPI error, error stack:
PMPI_Cart_sub(242)...................: MPI_Cart_sub(comm=0xc400fcf3,
remain_dims=0x7ffde1391dd8, comm_new=0x7ffde1391d30) failed
PMPI_Cart_sub(178)...................:
MPIR_Comm_split_impl(270)............:
MPIR_Get_contextid_sparse_group(1330): Too many communicators (0/16384 free on
this process; ignore_id=0)
Fatal error in PMPI_Cart_sub: Other MPI error, error stack:
PMPI_Cart_sub(242)...................: MPI_Cart_sub(comm=0xc400fcf3,
remain_dims=0x7ffc02ad7eb8, comm_new=0x7ffc02ad7e10) failed
PMPI_Cart_sub(178)...................:
MPIR_Comm_split_impl(270)............:
MPIR_Get_contextid_sparse_group(1330): Too many communicators (0/16384 free on
this process; ignore_id=0)
Fatal error in PMPI_Cart_sub: Other MPI error, error stack:
PMPI_Cart_sub(242)...................: MPI_Cart_sub(comm=0xc400fcf3,
remain_dims=0x7fffb24e60f8, comm_new=0x7fffb24e6050) failed
PMPI_Cart_sub(178)...................:
MPIR_Comm_split_impl(270)............:
MPIR_Get_contextid_sparse_group(1330): Too many communicators (0/16384 free on
this process; ignore_id=0)
I googled this and found that it might be caused by hitting the OS limit on the
number of open files. However, after I increased the number of open files per
process from 1024 to 40960, the error persisted.
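A limit raised in one shell does not necessarily reach processes started elsewhere, so it can be worth confirming the value a process actually sees. The following is a minimal sketch using Python's standard resource module (Unix only). The function name open_file_limits is hypothetical, and the script reports the limit of the Python process itself, not of pw.x.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft, hard


def describe(value):
    """Render a limit value, treating RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


if __name__ == "__main__":
    soft, hard = open_file_limits()
    print("soft limit on open files:", describe(soft))
    print("hard limit on open files:", describe(hard))
```

Running it in the same environment that starts the job (for example through the same mpirun) shows whether the new limit is in effect there.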
What's wrong here?
Chong Wang
Ph. D. candidate
Institute for Advanced Study, Tsinghua University, Beijing, 100084
_______________________________________________
Pw_forum mailing list
[email protected]
http://pwscf.org/mailman/listinfo/pw_forum
--
Paolo Giannozzi, Dip. Scienze Matematiche Informatiche e Fisiche,
Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
Phone +39-0432-558216, fax +39-0432-558222