Dear Siesta users and developers,
I want to run some LDA+U calculations with the ldau version of Siesta. I
compiled it with the same makefile that I used for siesta-3.2, and I have
tested the Fe_ldau example in the Siesta subdirectory. With only 1 core the
run completes successfully, but with more cores the code appears to hang
after the output shown below: it does not move on, yet it never stops or
reports an error. By the way, with the same ldau build I can run normal
calculations without LDA+U on many cores successfully.
So, what is the problem? Any advice or comments will be appreciated.
Siesta Version:
siesta-2.6.8--ldau-reference-6-dm-fix
Architecture : x86_64-unknown-linux-gnu--Intel
Compiler flags: mpif90 -g -O2
PARALLEL version
* Running on 3 nodes in parallel
>> Start of run: 14-MAR-2015 17:39:52
***********************
* WELCOME TO SIESTA *
***********************
reinit: Reading from standard input
************************** Dump of input data file ****************************
# $Id: Fe.fdf,v 1.1 1999/04/20 12:52:43 emilio Exp $
# -----------------------------------------------------------------------------
# FDF for bcc iron
#
# GGA, Ferromagnetic.
# Scalar-relativistic pseudopotential with non-linear partial-core correction
#
# E. Artacho, April 1999
# -----------------------------------------------------------------------------
SystemName bcc Fe ferro GGA # Descriptive name of the system
SystemLabel Fe # Short name for naming files
# Output options
WriteCoorStep
WriteMullikenPop 1
# Species and atoms
NumberOfSpecies 1
NumberOfAtoms 1
%block ChemicalSpeciesLabel
1 26 Fe
%endblock ChemicalSpeciesLabel
# Basis
PAO.EnergyShift 50 meV
PAO.BasisSize DZP
%block PAO.Basis
Fe 2
0 2 P
6. 0.
2 2
0. 0.
%endblock PAO.Basis
LatticeConstant 2.87 Ang
%block LatticeVectors
0.50000 0.500000 0.500000
0.50000 -0.500000 0.500000
0.50000 0.500000 -0.500000
%endblock LatticeVectors
KgridCutoff 15. Ang
%block BandLines
1 0.00000 0.000000 0.000000 \Gamma
40 2.00000 0.000000 0.000000 H
28 1.00000 1.000000 0.000000 N
28 0.00000 0.000000 0.000000 \Gamma
34 1.00000 1.000000 1.000000 P
%endblock BandLines
xc.functional GGA # Exchange-correlation functional
xc.authors PBE # Exchange-correlation version
SpinPolarized true # Logical parameters are: yes or no
MeshCutoff 150. Ry # Mesh cutoff. real space mesh
# SCF options
MaxSCFIterations 40 # Maximum number of SCF iter
DM.MixingWeight 0.1 # New DM amount for next SCF cycle
DM.Tolerance 1.d-3 # Tolerance in maximum difference
# between input and output DM
DM.UseSaveDM true # to use continuation files
DM.NumberPulay 3
SolutionMethod diagon # OrderN or Diagon
ElectronicTemperature 25 meV # Temp. for Fermi smearing
# MD options
MD.TypeOfRun cg # Type of dynamics:
MD.NumCGsteps 0 # Number of CG steps for
# coordinate optimization
MD.MaxCGDispl 0.1 Ang # Maximum atomic displacement
# in one CG step (Bohr)
MD.MaxForceTol 0.04 eV/Ang # Tolerance in the maximum
# atomic force (Ry/Bohr)
# Atomic coordinates
AtomicCoordinatesFormat Fractional
%block AtomicCoordinatesAndAtomicSpecies
0.000000000000 0.000000000000 0.000000000000 1
%endblock AtomicCoordinatesAndAtomicSpecies
LDAU.FirstIteration .false.
LDAU.PopTol 5.0d-4
LDAU.ThresholdTol 1.0d-2
LDAU.ProjectorGenerationMethod 2
%block LDAU.proj
Fe 1 # number of shells of projectors
n=3 2 # n, l
2.00 0.0000 # U(eV), J(eV)
0.000 0.0000 # rc, \omega
%endblock LDAU.proj
************************** End of input data file *****************************
reinit: -----------------------------------------------------------------------
reinit: System Name: bcc Fe ferro GGA
reinit: -----------------------------------------------------------------------
reinit: System Label: Fe
reinit: -----------------------------------------------------------------------
initatom: Reading input for the pseudopotentials and atomic orbitals
Species number: 1 Label: Fe Atomic number: 26
Ground state valence configuration: 4s02 3d06
Reading pseudopotential information in formatted form from Fe.psf
Valence configuration for pseudopotential generation:
4s( 2.00) rc: 2.41
4p( 0.00) rc: 2.53
3d( 6.00) rc: 2.29
4f( 0.00) rc: 2.29
Repaobasis: processing %block PAO.Basis
Repaobasis: species: Fe
Repaobasis: Number of shells= 2
Repaobasis: Shell with n,l= 4 0
Repaobasis: Shell with n,l= 3 2
For Fe, standard SIESTA heuristics set lmxkb to 3
(one more than the basis l, including polarization orbitals).
Use PS.lmax or PS.KBprojectors blocks to override.
Reldauproj: processing %block LDAU.proj
Reldauproj: species: Fe
Reldauproj: Number of shells= 1
Reldauproj: Shell with n,l= 3 2
Reldauproj: end processing %block LDAU.proj
<basis_specs>
================================================================================
Fe Z= 26 Mass= 55.850 Charge= 0.0000
Lmxo=2 Lmxkb= 3 BasisType=split Semic=F
L=0 Nsemic=0
n=4 nzeta=2 polorb= F
splnorm: 0.15000
vcte: 0.0000
rinn: 0.0000
rcs: 6.0000 0.0000
lambdas: 1.0000 1.0000
L=1 Nsemic=0
n=4 nzeta=1 polorb= T
splnorm: 0.15000
vcte: 0.0000
rinn: 0.0000
rcs: 0.0000
lambdas: 1.0000
L=2 Nsemic=0
n=3 nzeta=2 polorb= F
splnorm: 0.15000
vcte: 0.0000
rinn: 0.0000
rcs: 0.0000 0.0000
lambdas: 1.0000 1.0000
-
L=0 Nkbl=1 erefs: 0.17977+309
L=1 Nkbl=1 erefs: 0.17977+309
L=2 Nkbl=1 erefs: 0.17977+309
L=3 Nkbl=1 erefs: 0.17977+309
-
L=2 Nldau_semic=1
n=3
U, J=: 0.14700 0.0000
vcte: 0.0000
rinn: 0.0000
rcs: 0.0000
lambdas: 1.0000
=
</basis_specs>
ATOM: Species begin__________________________
ATOM: Called for Fe (Z = 26)
read_Read: Pseudopotential generation method:
read_vps: ATM 3.2.2 Troullier-Martins
Total valence charge: 8.00000
ATOM: Pseudopotential generated from an ionic configuration
ATOM: with net charge 0.00
xc_check: Exchange-correlation functional:
xc_check: GGA Perdew, Burke & Ernzerhof 1996
V l=0 =-2*Zval/r beyond r= 2.3499
V l=1 =-2*Zval/r beyond r= 2.4704
V l=2 =-2*Zval/r beyond r= 2.2353
V l=3 =-2*Zval/r beyond r= 2.2353
All V_l potentials equal beyond r= 2.4704
This should be close to max(r_c) in ps generation
All pots = -2*Zval/r beyond r= 2.4704
VLOCAL1: 99.0% of the norm of Vloc inside 7.113 Ry
VLOCAL1: 99.9% of the norm of Vloc inside 16.210 Ry
ATOM: Maximum radius for 4*pi*r*r*local-pseudopot. charge 2.97985
atom: Maximum radius for r*vlocal+2*Zval: 2.87017
--------------------------------------------
KB: Generation of KB projectors
KB: L= 0
KB: Number of Kleinman-Bylander projectors: 1
KB: Generating projector: 1
KB: Projector kind: standard
radial_log schro: updating the rc to: 25.5822751650800
GHOST: No ghost state for L = 0
KB: L= 1
KB: Number of Kleinman-Bylander projectors: 1
KB: Generating projector: 1
KB: Projector kind: standard
radial_log schro: updating the rc to: 46.6140229130900
GHOST: No ghost state for L = 1
KB: L= 2
KB: Number of Kleinman-Bylander projectors: 1
KB: Generating projector: 1
KB: Projector kind: standard
radial_log schro: updating the rc to: 18.9517908592700
GHOST: No ghost state for L = 2
KB: L= 3
KB: Number of Kleinman-Bylander projectors: 1
KB: Generating projector: 1
KB: Projector kind: standard
radial_log schro: updating the rc to: 120.530480482200
GHOST: No ghost state for L = 3
KBgen: Kleinman-Bylander projectors:
l= 0 rc= 2.764525 el= -0.389815 Ekb= 3.431041 kbcos= 0.254043
l= 1 rc= 2.799300 el= -0.098222 Ekb= 1.732346 kbcos= 0.192007
l= 2 rc= 2.564764 el= -0.551796 Ekb=-12.271205 kbcos= -0.715516
l= 3 rc= 2.870167 el= 0.003006 Ekb= -1.371972 kbcos= 0.000000
KBgen: Total number of Kleinman-Bylander projectors: 16
--------------------------------------------
BASIS_GEN begin
SPLIT: Orbitals with angular momentum L= 0
SPLIT: Basis orbitals for state 4s
radial_log schro: updating the rc to: 6.00076868208500
izeta = 1
lambda = 1.000000
rc = 6.000769
Total energy = -0.361656
kinetic = 0.369283
potential(screened) = -0.730939
potential(ionic) = -6.228540
izeta = 2
rmatch = 5.926225
splitnorm = 0.150000
Total energy = -0.304336
kinetic = 0.545465
potential(screened) = -0.849801
potential(ionic) = -6.653009
POLgen: Polarization orbital for state 4s
izeta = 1
lambda = 1.000000
rc = 6.000769
Total energy = 0.018885
kinetic = 0.674242
potential(screened) = -0.655358
potential(ionic) = -5.783059
SPLIT: Orbitals with angular momentum L= 2
radial_log schro: updating the rc to: 18.9517908592700
SPLIT: PAO cut-off radius determinated from an
SPLIT: energy shift= 0.003675 Ry
SPLIT: Basis orbitals for state 3d
radial_log schro: updating the rc to: 4.79169190888500
izeta = 1
lambda = 1.000000
rc = 4.791692
Total energy = -0.548667
kinetic = 8.561292
potential(screened) = -9.109958
potential(ionic) = -18.089190
izeta = 2
rmatch = 2.291856
splitnorm = 0.150000
Total energy = -0.137047
kinetic = 11.793848
potential(screened) = -11.930895
potential(ionic) = -21.624702
Basis: Total Species Charge = 8.0000
Basis: Species Exc (eV) = -105.1478
BASISgen end
LDAUprojgen begin
LDAUprojs with angular momentum L= 2
LDAUproj generation method 2
LDAUproj corresponding to state 3d
radial_log schro: updating the rc to: 18.9517908592700
LDAUproj cut-off radious determined from a
cutoff norm parameter = 0.900000
LDAUproj is an extended PAO orbital cut off with a
Fermi function 1/[1+exp(r-rc)/w] with
rc= 2.022544
w = 0.050000
LDAUproj cutoff radious 2.320685
LDAUprojgen end
ATOM: Species end_____________________________
na: Computing Vna for species 1
Vna: chval, zval: 8.00000 8.00000
Vna: Cut-off radius for the neutral-atom potential: 6.000769
na: Finished computing Vna for species 1
prinput: Basis input
----------------------------------------------------------
PAO.BasisType split
%block ChemicalSpeciesLabel
1 26 Fe # Species index, atomic number, species label
%endblock ChemicalSpeciesLabel
%block PAO.Basis # Define Basis set
Fe 3 # Species label, number of l-shells
n=4 0 2 P 1 # n, l, Nzeta, Polarization, NzetaPol
6.001 5.926
1.000 1.000
n=3 2 2 # n, l, Nzeta
4.792 2.292
1.000 1.000
%endblock PAO.Basis
prinput: ----------------------------------------------------------------------
coor: Atomic-coordinates input format = Fractional
siesta: Atomic coordinates (Bohr) and species
siesta: 0.00000 0.00000 0.00000 1 1
siesta: System type = bulk
initatomlists: Number of atoms, orbitals, and projectors: 1 15 16
siesta: ******************** Simulation parameters ****************************
siesta:
siesta: The following are some of the parameters of the simulation.
siesta: A complete list of the parameters used, including default values,
siesta: can be found in file out.fdf
siesta:
redata: Non-Collinear-spin run = F
redata: SpinPolarized (Up/Down) run = T
redata: Number of spin components = 2
redata: Long output = F
redata: Maximum wall-clock time = unlimited
redata: Number of Atomic Species = 1
redata: Charge density info will appear in .RHO file
redata: Write Mulliken Pop. = Atomic and Orbital charges
redata: Mesh Cutoff = 150.0000 Ry
redata: Net charge of the system = 0.0000 |e|
redata: Max. number of SCF Iter = 40
redata: Performing Pulay mixing using = 3 iterations
redata: Mix DM in first SCF step ? = F
redata: Write Pulay info on disk? = F
redata: New DM Mixing Weight = 0.1000
redata: New DM Occupancy tolerance = 0.000000000001
redata: No kicks to SCF
redata: DM Mixing Weight for Kicks = 0.5000
redata: DM Tolerance for SCF = 0.001000
redata: Require Energy convergence for SCF = F
redata: DM Energy tolerance for SCF = 0.000100 eV
redata: Require Harris convergence for SCF = F
redata: DM Harris energy tolerance for SCF = 0.000100 eV
redata: Antiferro initial spin density = F
redata: Using Saved Data (generic) = F
redata: Use continuation files for DM = T
redata: Neglect nonoverlap interactions = F
redata: Method of Calculation = Diagonalization
redata: Divide and Conquer = T
redata: Electronic Temperature = 0.0018 Ry
redata: Fix the spin of the system = F
redata: Dynamics option = CG coord. optimization
redata: Variable cell = F
redata: Use continuation files for CG = F
redata: Max atomic displ per move = 0.1890 Bohr
redata: Maximum number of CG moves = 0
redata: Force tolerance = 0.0016 Ry/Bohr
redata: ***********************************************************************
Total number of electrons: 8.000000
Total ionic charge: 8.000000
* ProcessorY, Blocksize: 1 5
Kpoints in: 1183 . Kpoints trimmed: 1099
siesta: k-grid: Number of k-points = 1099
siesta: k-grid: Cutoff (effective) = 16.156 Ang
siesta: k-grid: Supercell and displacements
siesta: k-grid: 0 13 0 0.000
siesta: k-grid: 0 0 13 0.000
siesta: k-grid: 13 0 0 0.000
Naive supercell factors: 8 8 8
superc: Internal auxiliary supercell: 8 x 8 x 8 = 512
superc: Number of atoms, orbitals, and projectors: 512 7680 8192
Best,
Xiaoming Wang
Postdoc
Rutgers