Thank you very much.

Setting diag.paralleloverk T worked.

The new "good" timings are as follows:

2cpu-------
Start of run             0.000
-------------- end of scf step            35.661
-------------- end of scf step            64.727
-------------- end of scf step            93.779
-------------- end of scf step           122.809
-------------- end of scf step           151.815
-------------- end of scf step           180.833
-------------- end of scf step           209.842
-------------- end of scf step           238.843
-------------- end of scf step           267.856
-------------- end of scf step           296.861
-------------- end of scf step           325.862
-------------- end of scf step           354.869
-------------- end of scf step           358.900
--- end of geometry step           358.910
End of run           359.189

4cpu-------
Start of run             0.000
-------------- end of scf step            19.362
-------------- end of scf step            33.702
-------------- end of scf step            47.954
-------------- end of scf step            62.193
-------------- end of scf step            76.570
-------------- end of scf step            90.962
-------------- end of scf step           105.142
-------------- end of scf step           119.510
-------------- end of scf step           133.742
-------------- end of scf step           147.781
-------------- end of scf step           162.087
-------------- end of scf step           176.279
-------------- end of scf step           178.423
--- end of geometry step           178.436
End of run           178.689

8cpu-------
Start of run             0.000
-------------- end of scf step            10.202
-------------- end of scf step            18.363
-------------- end of scf step            26.259
-------------- end of scf step            34.359
-------------- end of scf step            42.234
-------------- end of scf step            50.079
-------------- end of scf step            57.931
-------------- end of scf step            65.881
-------------- end of scf step            73.824
-------------- end of scf step            81.722
-------------- end of scf step            89.570
-------------- end of scf step            97.569
-------------- end of scf step            98.847
--- end of geometry step            99.129
End of run           100.093

16cpu-------
Start of run             0.000
-------------- end of scf step             9.298
-------------- end of scf step            13.831
-------------- end of scf step            18.210
-------------- end of scf step            22.586
-------------- end of scf step            26.970
-------------- end of scf step            31.506
-------------- end of scf step            35.914
-------------- end of scf step            40.253
-------------- end of scf step            44.620
-------------- end of scf step            49.005
-------------- end of scf step            53.384
-------------- end of scf step            57.777
-------------- end of scf step            58.435
--- end of geometry step            58.505
End of run            58.906
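
The scaling in these four runs can be summarised with a short script. This is only a sketch: the "End of run" times are copied from the listings above, and the function name `scaling_table` and the choice of the 2-CPU run as the baseline are assumptions made for illustration.

```python
# End-of-run wall-clock times (seconds), copied from the CLOCK listings above.
end_of_run = {2: 359.189, 4: 178.689, 8: 100.093, 16: 58.906}


def scaling_table(times, base=None):
    """Return (ncpu, seconds, speedup, efficiency) rows.

    Speedup and efficiency are measured relative to the run with `base`
    CPUs (by default the smallest run available).
    """
    base = min(times) if base is None else base
    rows = []
    for ncpu in sorted(times):
        speedup = times[base] / times[ncpu]
        # Ideal speedup relative to the baseline is ncpu / base.
        efficiency = speedup / (ncpu / base)
        rows.append((ncpu, times[ncpu], speedup, efficiency))
    return rows


for ncpu, secs, speedup, eff in scaling_table(end_of_run):
    print(f"{ncpu:3d} CPUs  {secs:9.3f} s  speedup {speedup:5.2f}  efficiency {eff:6.1%}")
```

Relative to the 2-CPU run, this gives speedups of roughly 2.0, 3.6 and 6.1 on 4, 8 and 16 CPUs, i.e. parallel efficiencies of about 100%, 90% and 76%.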

Marcel Mohr wrote:
Did you try

diag.paralleloverk T  ?

Regards
Marcel

________________________________________________________________________
Marcel Mohr            Institut für Festkörperphysik, TU Berlin
marcel(at)physik.tu-berlin.de    Sekr. EW 5-4
TEL: +49-30-314 24442        Hardenbergstr. 36
FAX: +49-30-314 27705        10623 Berlin


On Wed, 18 Feb 2009, Mehmet Topsakal wrote:

Hi,

I'm an experienced user of VASP and am now trying to learn Siesta. In my simple tests I have found that Siesta's parallelization is clearly poorer than VASP's. I'm using the latest Intel ifort (11) and MKL 10.1. My system is a quad-core Xeon 2.33 with InfiniBand (4 cores per node).

I chose "siesta-2.0.2/Tests/si64/" as input. To see realistic effects, I increased the number of k-points,
displaced one atom, and increased the mesh cutoff.

With 2, 4, and 8 CPUs, the parallelization of VASP is nearly linear. However, Siesta's performance is poor.

The CLOCK results are below.

2CPU -----------------
Start of run             0.000
-------------- end of scf step            65.702
-------------- end of scf step           123.005
-------------- end of scf step           179.978
-------------- end of scf step           236.833
-------------- end of scf step           293.613
-------------- end of scf step           350.316
-------------- end of scf step           407.082
-------------- end of scf step           463.780
-------------- end of scf step           520.457
-------------- end of scf step           577.072
-------------- end of scf step           633.713
-------------- end of scf step           690.372
-------------- end of scf step           747.105
-------------- end of scf step           803.680
-------------- end of scf step           860.304
-------------- end of scf step           916.886
-------------- end of scf step           920.865
--- end of geometry step           920.887

4CPU -----------------
Start of run             0.000
-------------- end of scf step            52.757
-------------- end of scf step            99.481
-------------- end of scf step           145.754
-------------- end of scf step           191.974
-------------- end of scf step           238.180
-------------- end of scf step           284.612
-------------- end of scf step           330.736
-------------- end of scf step           377.200
-------------- end of scf step           423.579
-------------- end of scf step           469.623
-------------- end of scf step           515.901
-------------- end of scf step           561.912
-------------- end of scf step           608.275
-------------- end of scf step           654.488
-------------- end of scf step           700.990
-------------- end of scf step           747.432
-------------- end of scf step           749.578
--- end of geometry step           749.604
End of run           749.843

8CPU -----------------
Start of run             0.000
-------------- end of scf step            57.490
-------------- end of scf step           106.014
-------------- end of scf step           154.452
-------------- end of scf step           202.971
-------------- end of scf step           251.328
-------------- end of scf step           299.604
-------------- end of scf step           348.336
-------------- end of scf step           396.550
-------------- end of scf step           445.203
-------------- end of scf step           493.459
-------------- end of scf step           541.900
-------------- end of scf step           590.203
-------------- end of scf step           638.980
-------------- end of scf step           687.550
-------------- end of scf step           735.906
-------------- end of scf step           784.315
-------------- end of scf step           785.593
--- end of geometry step           785.667
End of run           786.080
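
To compare the runs step by step rather than by total time, the cumulative time stamps can be turned into per-step durations. The following is a minimal sketch, assuming the listing has exactly the "end of scf step" format shown above; the regular expression, the function name `scf_step_durations`, and the two short samples (the first three steps of the 4-CPU and 8-CPU listings) are illustrative choices, not part of Siesta.

```python
import re

# Matches the cumulative time stamp on each "end of scf step" line.
STEP = re.compile(r"end of scf step\s+([0-9.]+)")


def scf_step_durations(clock_text):
    """Return the wall-clock seconds spent in each SCF step of a CLOCK listing."""
    stamps = [float(m.group(1)) for m in STEP.finditer(clock_text)]
    previous = 0.0  # "Start of run" is reported at 0.000
    durations = []
    for stamp in stamps:
        durations.append(stamp - previous)
        previous = stamp
    return durations


# First three SCF steps of the 4-CPU and 8-CPU listings above.
sample_4cpu = """\
Start of run             0.000
-------------- end of scf step            52.757
-------------- end of scf step            99.481
-------------- end of scf step           145.754
"""
sample_8cpu = """\
Start of run             0.000
-------------- end of scf step            57.490
-------------- end of scf step           106.014
-------------- end of scf step           154.452
"""

for label, text in (("4 CPU", sample_4cpu), ("8 CPU", sample_8cpu)):
    steps = ", ".join(f"{d:.3f}" for d in scf_step_durations(text))
    print(f"{label}: {steps}")
```

For these samples a typical SCF step takes about 46 s on 4 CPUs and about 48 s on 8 CPUs, so doubling the CPU count made each step slightly slower.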

I've done more tests and observed no difference. Even the serial version
is faster than some of the parallel jobs :(.

Am I doing something wrong?

Thanks.

I'm sending my input and output files below.

# -----------------------------------------------------------------------------
# FDF for a cubic c-Si supercell with 64 atoms
#
# E. Artacho, April 1999
# -----------------------------------------------------------------------------

SystemName          64-atom silicon
SystemLabel         si64

NumberOfAtoms       64
NumberOfSpecies     1

%block ChemicalSpeciesLabel
1  14  Si
%endblock ChemicalSpeciesLabel

PAO.BasisSize       SZ
PAO.EnergyShift     300 meV

LatticeConstant     5.430 Ang
%block LatticeVectors
2.000  0.000  0.000
0.000  2.000  0.000
0.000  0.000  2.000
%endblock LatticeVectors

%block kgrid_Monkhorst_Pack
7 0 0 0.0
0 7 0 0.0
0 0 7 0.0
%endblock kgrid_Monkhorst_Pack

MeshCutoff          100.0 Ry

MaxSCFIterations    50
DM.MixingWeight      0.3
DM.NumberPulay       3
DM.Tolerance         1.d-4
DM.UseSaveDM
SolutionMethod       diagon
ElectronicTemperature  25 meV
MD.TypeOfRun         cg
MD.NumCGsteps        0
MD.MaxCGDispl         0.1  Ang
MD.MaxForceTol        0.04 eV/Ang

AtomicCoordinatesFormat  ScaledCartesian
%block AtomicCoordinatesAndAtomicSpecies
  0.1000    0.1000    0.1000   1 #  Si  1
  0.250    0.250    0.250   1 #  Si  2
  0.000    0.500    0.500   1 #  Si  3
  0.250    0.750    0.750   1 #  Si  4
  0.500    0.000    0.500   1 #  Si  5
  0.750    0.250    0.750   1 #  Si  6
  0.500    0.500    0.000   1 #  Si  7
  0.750    0.750    0.250   1 #  Si  8
  1.000    0.000    0.000   1 #  Si  9
  1.250    0.250    0.250   1 #  Si 10
  1.000    0.500    0.500   1 #  Si 11
  1.250    0.750    0.750   1 #  Si 12
  1.500    0.000    0.500   1 #  Si 13
  1.750    0.250    0.750   1 #  Si 14
  1.500    0.500    0.000   1 #  Si 15
  1.750    0.750    0.250   1 #  Si 16
  0.000    1.000    0.000   1 #  Si 17
  0.250    1.250    0.250   1 #  Si 18
  0.000    1.500    0.500   1 #  Si 19
  0.250    1.750    0.750   1 #  Si 20
  0.500    1.000    0.500   1 #  Si 21
  0.750    1.250    0.750   1 #  Si 22
  0.500    1.500    0.000   1 #  Si 23
  0.750    1.750    0.250   1 #  Si 24
  0.000    0.000    1.000   1 #  Si 25
  0.250    0.250    1.250   1 #  Si 26
  0.000    0.500    1.500   1 #  Si 27
  0.250    0.750    1.750   1 #  Si 28
  0.500    0.000    1.500   1 #  Si 29
  0.750    0.250    1.750   1 #  Si 30
  0.500    0.500    1.000   1 #  Si 31
  0.750    0.750    1.250   1 #  Si 32
  1.000    1.000    0.000   1 #  Si 33
  1.250    1.250    0.250   1 #  Si 34
  1.000    1.500    0.500   1 #  Si 35
  1.250    1.750    0.750   1 #  Si 36
  1.500    1.000    0.500   1 #  Si 37
  1.750    1.250    0.750   1 #  Si 38
  1.500    1.500    0.000   1 #  Si 39
  1.750    1.750    0.250   1 #  Si 40
  1.000    0.000    1.000   1 #  Si 41
  1.250    0.250    1.250   1 #  Si 42
  1.000    0.500    1.500   1 #  Si 43
  1.250    0.750    1.750   1 #  Si 44
  1.500    0.000    1.500   1 #  Si 45
  1.750    0.250    1.750   1 #  Si 46
  1.500    0.500    1.000   1 #  Si 47
  1.750    0.750    1.250   1 #  Si 48
  0.000    1.000    1.000   1 #  Si 49
  0.250    1.250    1.250   1 #  Si 50
  0.000    1.500    1.500   1 #  Si 51
  0.250    1.750    1.750   1 #  Si 52
  0.500    1.000    1.500   1 #  Si 53
  0.750    1.250    1.750   1 #  Si 54
  0.500    1.500    1.000   1 #  Si 55
  0.750    1.750    1.250   1 #  Si 56
  1.000    1.000    1.000   1 #  Si 57
  1.250    1.250    1.250   1 #  Si 58
  1.000    1.500    1.500   1 #  Si 59
  1.250    1.750    1.750   1 #  Si 60
  1.500    1.000    1.500   1 #  Si 61
  1.750    1.250    1.750   1 #  Si 62
  1.500    1.500    1.000   1 #  Si 63
  1.750    1.750    1.250   1 #  Si 64
%endblock AtomicCoordinatesAndAtomicSpecies


-------------------------

Siesta Version: siesta-2.0.2
Architecture  : intel9-cmkl8-mpi
Compiler flags: mpiifort -O2 -mp
PARALLEL version

* Running on    8 nodes in parallel
Start of run:  18-FEB-2009  22:34:52

***********************
* WELCOME TO SIESTA *
***********************

reinit: Reading from standard input

************************** Dump of input data file ****************************
# -----------------------------------------------------------------------------
# FDF for a cubic c-Si supercell with 64 atoms
#
# E. Artacho, April 1999
# -----------------------------------------------------------------------------
SystemName          64-atom silicon
SystemLabel         si64
NumberOfAtoms       64
NumberOfSpecies     1
%block ChemicalSpeciesLabel
1  14  Si
%endblock ChemicalSpeciesLabel
PAO.BasisSize       SZ
PAO.EnergyShift     300 meV
LatticeConstant     5.430 Ang
%block LatticeVectors
2.000  0.000  0.000
0.000  2.000  0.000
0.000  0.000  2.000
%endblock LatticeVectors
%block kgrid_Monkhorst_Pack
7 0 0 0.0
0 7 0 0.0
0 0 7 0.0
%endblock kgrid_Monkhorst_Pack
MeshCutoff          100.0 Ry
MaxSCFIterations    50
DM.MixingWeight      0.3
DM.NumberPulay       3
DM.Tolerance         1.d-4
DM.UseSaveDM
SolutionMethod       diagon
ElectronicTemperature  25 meV
MD.TypeOfRun         cg
MD.NumCGsteps        0
MD.MaxCGDispl         0.1  Ang
MD.MaxForceTol        0.04 eV/Ang
AtomicCoordinatesFormat  ScaledCartesian
%block AtomicCoordinatesAndAtomicSpecies
  0.1000    0.1000    0.1000   1 #  Si  1
  0.250    0.250    0.250   1 #  Si  2
  0.000    0.500    0.500   1 #  Si  3
  0.250    0.750    0.750   1 #  Si  4
  0.500    0.000    0.500   1 #  Si  5
  0.750    0.250    0.750   1 #  Si  6
  0.500    0.500    0.000   1 #  Si  7
  0.750    0.750    0.250   1 #  Si  8
  1.000    0.000    0.000   1 #  Si  9
  1.250    0.250    0.250   1 #  Si 10
  1.000    0.500    0.500   1 #  Si 11
  1.250    0.750    0.750   1 #  Si 12
  1.500    0.000    0.500   1 #  Si 13
  1.750    0.250    0.750   1 #  Si 14
  1.500    0.500    0.000   1 #  Si 15
  1.750    0.750    0.250   1 #  Si 16
  0.000    1.000    0.000   1 #  Si 17
  0.250    1.250    0.250   1 #  Si 18
  0.000    1.500    0.500   1 #  Si 19
  0.250    1.750    0.750   1 #  Si 20
  0.500    1.000    0.500   1 #  Si 21
  0.750    1.250    0.750   1 #  Si 22
  0.500    1.500    0.000   1 #  Si 23
  0.750    1.750    0.250   1 #  Si 24
  0.000    0.000    1.000   1 #  Si 25
  0.250    0.250    1.250   1 #  Si 26
  0.000    0.500    1.500   1 #  Si 27
  0.250    0.750    1.750   1 #  Si 28
  0.500    0.000    1.500   1 #  Si 29
  0.750    0.250    1.750   1 #  Si 30
  0.500    0.500    1.000   1 #  Si 31
  0.750    0.750    1.250   1 #  Si 32
  1.000    1.000    0.000   1 #  Si 33
  1.250    1.250    0.250   1 #  Si 34
  1.000    1.500    0.500   1 #  Si 35
  1.250    1.750    0.750   1 #  Si 36
  1.500    1.000    0.500   1 #  Si 37
  1.750    1.250    0.750   1 #  Si 38
  1.500    1.500    0.000   1 #  Si 39
  1.750    1.750    0.250   1 #  Si 40
  1.000    0.000    1.000   1 #  Si 41
  1.250    0.250    1.250   1 #  Si 42
  1.000    0.500    1.500   1 #  Si 43
  1.250    0.750    1.750   1 #  Si 44
  1.500    0.000    1.500   1 #  Si 45
  1.750    0.250    1.750   1 #  Si 46
  1.500    0.500    1.000   1 #  Si 47
  1.750    0.750    1.250   1 #  Si 48
  0.000    1.000    1.000   1 #  Si 49
  0.250    1.250    1.250   1 #  Si 50
  0.000    1.500    1.500   1 #  Si 51
  0.250    1.750    1.750   1 #  Si 52
  0.500    1.000    1.500   1 #  Si 53
  0.750    1.250    1.750   1 #  Si 54
  0.500    1.500    1.000   1 #  Si 55
  0.750    1.750    1.250   1 #  Si 56
  1.000    1.000    1.000   1 #  Si 57
  1.250    1.250    1.250   1 #  Si 58
  1.000    1.500    1.500   1 #  Si 59
  1.250    1.750    1.750   1 #  Si 60
  1.500    1.000    1.500   1 #  Si 61
  1.750    1.250    1.750   1 #  Si 62
  1.500    1.500    1.000   1 #  Si 63
  1.750    1.750    1.250   1 #  Si 64
%endblock AtomicCoordinatesAndAtomicSpecies
************************** End of input data file *****************************

reinit: -----------------------------------------------------------------------
reinit: System Name: 64-atom silicon
reinit: -----------------------------------------------------------------------
reinit: System Label: si64
reinit: -----------------------------------------------------------------------

initatom: Reading input for the pseudopotentials and atomic orbitals ----------
Species number:            1  Label: Si Atomic number:          14
Ground state valence configuration:   3s02  3p02
Reading pseudopotential information in formatted form from Si.psf
For Si, standard SIESTA heuristics set lmxkb to 2
(one more than the basis l, including polarization orbitals).
Use PS.lmax or PS.KBprojectors blocks to override.

<basis_specs>
===============================================================================
Si Z= 14 Mass= 28.090 Charge= 0.0000
Lmxo=1 Lmxkb=2 BasisType=split Semic=F
L=0  Nsemic=0  Cnfigmx=3
        n=1  nzeta=1  polorb=0
           vcte: 0.0000
           rinn: 0.0000
            rcs: 0.0000
        lambdas: 1.0000
L=1  Nsemic=0  Cnfigmx=3
        n=1  nzeta=1  polorb=0
           vcte: 0.0000
           rinn: 0.0000
            rcs: 0.0000
        lambdas: 1.0000
-------------------------------------------------------------------------------
L=0  Nkbl=1  erefs: 0.17977+309
L=1  Nkbl=1  erefs: 0.17977+309
L=2  Nkbl=1  erefs: 0.17977+309
===============================================================================
</basis_specs>

atom: Called for Si  (Z =  14)

read_vps: Pseudopotential generation method:
read_vps: ATM3 Troullier-Martins
read_vps: Valence configuration (pseudopotential and basis set generation):
3s( 2.00) rc: 1.89
3p( 2.00) rc: 1.89
3d( 0.00) rc: 1.89
Total valence charge:    4.00000

xc_check: Exchange-correlation functional:
xc_check: Ceperley-Alder
V l=0 = -2*Zval/r beyond r=  2.5494
V l=1 = -2*Zval/r beyond r=  2.5494
V l=2 = -2*Zval/r beyond r=  2.5494
All V_l potentials equal beyond r=  1.8652
This should be close to max(r_c) in ps generation
All pots = -2*Zval/r beyond r=  2.5494
Using large-core scheme for Vlocal

atom: Estimated core radius    2.54944

atom: Including non-local core corrections could be a good idea
atom: Maximum radius for 4*pi*r*r*local-pseudopot. charge    2.85303
atom: Maximum radius for r*vlocal+2*Zval:    2.58151
GHOST: No ghost state for L =  0
GHOST: No ghost state for L =  1
GHOST: No ghost state for L =  2

KBgen: Kleinman-Bylander projectors:
l= 0 rc= 1.936440 el= -0.796617 Ekb= 4.661340 kbcos= 0.299756
l= 1 rc= 1.936440 el= -0.307040 Ekb= 1.494238 kbcos= 0.301471
l= 2 rc= 1.936440 el= 0.002313 Ekb= -2.808672 kbcos= -0.054903

KBgen: Total number of  Kleinman-Bylander projectors:    9
atom: -------------------------------------------------------------------------

atom: SANKEY-TYPE ORBITALS:

SPLIT: Orbitals with angular momentum L= 0

SPLIT: Basis orbitals for state 3s

SPLIT: PAO cut-off radius determined from an
SPLIT: energy shift=  0.022049 Ry

 izeta = 1
               lambda =    1.000000
                   rc =    4.883716
               energy =   -0.773554
              kinetic =    0.585471
  potential(screened) =   -1.359025
     potential(ionic) =   -3.840954

SPLIT: Orbitals with angular momentum L= 1

SPLIT: Basis orbitals for state 3p

SPLIT: PAO cut-off radius determined from an
SPLIT: energy shift=  0.022049 Ry

 izeta = 1
               lambda =    1.000000
                   rc =    6.116033
               energy =   -0.285742
              kinetic =    0.892202
  potential(screened) =   -1.177944
     potential(ionic) =   -3.446720
atom: Total number of Sankey-type orbitals:  4

atm_pop: Valence configuration(local Pseudopot. screening):
3s( 2.00)
3p( 2.00)
Vna: chval, zval: 4.00000 4.00000

Vna:  Cut-off radius for the neutral-atom potential:   6.116033

atom: _________________________________________________________________________

prinput: Basis input ----------------------------------------------------------

PAO.BasisType split

%block ChemicalSpeciesLabel
1 14 Si # Species index, atomic number, species label
%endblock ChemicalSpeciesLabel

%block PAO.Basis                 # Define Basis set
Si          2                    # Species label, number of l-shells
n=3   0   1                         # n, l, Nzeta
 4.884
 1.000
n=3   1   1                         # n, l, Nzeta
 6.116
 1.000
%endblock PAO.Basis

prinput: ----------------------------------------------------------------------


siesta: ******************** Simulation parameters ****************************
siesta:
siesta: The following are some of the parameters of the simulation.
siesta: A complete list of the parameters used, including default values,
siesta: can be found in file out.fdf
siesta:
coor:   Atomic-coordinates input format  =     Cartesian coordinates
coor:                                          (in units of alat)
redata: SpinPolarized run                =     F
redata: Non-Collinear-spin run           =     F
redata: Number of spin components        =     1
redata: Long output                      =     F
redata: Number of Atomic Species         =     1
redata: Charge density info will appear in .RHO file
redata: Write Mulliken Pop.              =     NO
redata: Mesh Cutoff                      =   100.0000  Ry
redata: Net charge of the system         =     0.0000 |e|
redata: Max. number of SCF Iter          =    50
redata: Performing Pulay mixing using    =     3 iterations
redata: Mix DM in first SCF step ?       =     F
redata: Write Pulay info on disk?        =     F
redata: New DM Mixing Weight             =     0.3000
redata: New DM Occupancy tolerance       = 0.000000000001
redata: No kicks to SCF
redata: DM Mixing Weight for Kicks       =     0.5000
redata: DM Tolerance for SCF             =     0.000100
redata: Require Energy convergence for SCF =     F
redata: DM Energy tolerance for SCF      =     0.000100 eV
redata: Using Saved Data (generic)   =     F
redata: Use continuation files for DM    =     T
redata: Neglect nonoverlap interactions  =     F
redata: Method of Calculation            =     Diagonalization
redata: Divide and Conquer               =     T
redata: Electronic Temperature           =     0.0018  Ry
redata: Fix the spin of the system       =     F
redata: Dynamics option                  =     CG coord. optimization
redata: Variable cell                    =     F
redata: Use continuation files for CG    =     F
redata: Max atomic displ per move        =     0.1890  Bohr
redata: Maximum number of CG moves       =     0
redata: Force tolerance                  =     0.0016  Ry/Bohr
redata: ***********************************************************************

siesta: Atomic coordinates (Bohr) and species
siesta:      1.02612   1.02612   1.02612  1        1
siesta:      2.56530   2.56530   2.56530  1        2
siesta:      0.00000   5.13061   5.13061  1        3
siesta:      2.56530   7.69591   7.69591  1        4
siesta:      5.13061   0.00000   5.13061  1        5
siesta:      7.69591   2.56530   7.69591  1        6
siesta:      5.13061   5.13061   0.00000  1        7
siesta:      7.69591   7.69591   2.56530  1        8
siesta:     10.26122   0.00000   0.00000  1        9
siesta:     12.82652   2.56530   2.56530  1       10
siesta:     10.26122   5.13061   5.13061  1       11
siesta:     12.82652   7.69591   7.69591  1       12
siesta:     15.39183   0.00000   5.13061  1       13
siesta:     17.95713   2.56530   7.69591  1       14
siesta:     15.39183   5.13061   0.00000  1       15
siesta:     17.95713   7.69591   2.56530  1       16
siesta:      0.00000  10.26122   0.00000  1       17
siesta:      2.56530  12.82652   2.56530  1       18
siesta:      0.00000  15.39183   5.13061  1       19
siesta:      2.56530  17.95713   7.69591  1       20
siesta:      5.13061  10.26122   5.13061  1       21
siesta:      7.69591  12.82652   7.69591  1       22
siesta:      5.13061  15.39183   0.00000  1       23
siesta:      7.69591  17.95713   2.56530  1       24
siesta:      0.00000   0.00000  10.26122  1       25
siesta:      2.56530   2.56530  12.82652  1       26
siesta:      0.00000   5.13061  15.39183  1       27
siesta:      2.56530   7.69591  17.95713  1       28
siesta:      5.13061   0.00000  15.39183  1       29
siesta:      7.69591   2.56530  17.95713  1       30
siesta:      5.13061   5.13061  10.26122  1       31
siesta:      7.69591   7.69591  12.82652  1       32
siesta:     10.26122  10.26122   0.00000  1       33
siesta:     12.82652  12.82652   2.56530  1       34
siesta:     10.26122  15.39183   5.13061  1       35
siesta:     12.82652  17.95713   7.69591  1       36
siesta:     15.39183  10.26122   5.13061  1       37
siesta:     17.95713  12.82652   7.69591  1       38
siesta:     15.39183  15.39183   0.00000  1       39
siesta:     17.95713  17.95713   2.56530  1       40
siesta:     10.26122   0.00000  10.26122  1       41
siesta:     12.82652   2.56530  12.82652  1       42
siesta:     10.26122   5.13061  15.39183  1       43
siesta:     12.82652   7.69591  17.95713  1       44
siesta:     15.39183   0.00000  15.39183  1       45
siesta:     17.95713   2.56530  17.95713  1       46
siesta:     15.39183   5.13061  10.26122  1       47
siesta:     17.95713   7.69591  12.82652  1       48
siesta:      0.00000  10.26122  10.26122  1       49
siesta:      2.56530  12.82652  12.82652  1       50
siesta:      0.00000  15.39183  15.39183  1       51
siesta:      2.56530  17.95713  17.95713  1       52
siesta:      5.13061  10.26122  15.39183  1       53
siesta:      7.69591  12.82652  17.95713  1       54
siesta:      5.13061  15.39183  10.26122  1       55
siesta:      7.69591  17.95713  12.82652  1       56
siesta:     10.26122  10.26122  10.26122  1       57
siesta:     12.82652  12.82652  12.82652  1       58
siesta:     10.26122  15.39183  15.39183  1       59
siesta:     12.82652  17.95713  17.95713  1       60
siesta:     15.39183  10.26122  15.39183  1       61
siesta:     17.95713  12.82652  17.95713  1       62
siesta:     15.39183  15.39183  10.26122  1       63
siesta:     17.95713  17.95713  12.82652  1       64

initatomlists: Number of atoms, orbitals, and projectors: 64 256 576

siesta: System type = bulk

* ProcessorY, Blocksize:    2  24


siesta: k-grid: Number of k-points =   196
siesta: k-grid: Cutoff (effective) =    38.010 Ang
siesta: k-grid: Supercell and displacements
siesta: k-grid:    7   0   0      0.000
siesta: k-grid:    0   7   0      0.000
siesta: k-grid:    0   0   7      0.000
Naive supercell factors:     2    2    2

superc: Internal auxiliary supercell:     2 x     2 x     2  =       8
superc: Number of atoms, orbitals, and projectors:    512  2048  4608

* Maximum dynamic memory allocated =     2 MB

siesta:                 ==============================
                          Begin CG move =      0
                      ==============================

superc: Internal auxiliary supercell:     2 x     2 x     2  =       8
superc: Number of atoms, orbitals, and projectors:    512  2048  4608

outcell: Unit cell vectors (Ang):
     10.860000    0.000000    0.000000
      0.000000   10.860000    0.000000
      0.000000    0.000000   10.860000

outcell: Cell vector modules (Ang)   :   10.860000   10.860000   10.860000
outcell: Cell angles (23,13,12) (deg):     90.0000     90.0000     90.0000
outcell: Cell volume (Ang**3)        :   1280.8241

iodm: Reading Density Matrix from files

InitMesh: MESH =    72 x    72 x    72 =      373248
InitMesh: Mesh cutoff (required, used) =   100.000   121.481 Ry

* Maximum dynamic memory allocated =    16 MB

stepf: Fermi-Dirac step function

siesta: Program's energy decomposition (eV):
siesta: Eions   =     12185.667955
siesta: Ena     =      3677.030483
siesta: Ekin    =      2533.047298
siesta: Enl     =       968.173549
siesta: DEna    =       223.889159
siesta: DUscf   =         7.221156
siesta: DUext   =         0.000000
siesta: Exc     =     -2052.773864
siesta: eta*DQ  =         0.000000
siesta: Emadel  =         0.000000
siesta: Ekinion =         0.000000
siesta: Eharris =     -6831.494141
siesta: Etot    =     -6829.080174
siesta: FreeEng =     -6829.080174

siesta: iscf   Eharris(eV)      E_KS(eV)   FreeEng(eV)   dDmax  Ef(eV)
siesta:    1    -6831.4941    -6829.0802    -6829.0802  0.8471 -4.1021
timer: Routine,Calls,Time,% = IterSCF        1     409.882  95.74
elaps: Routine,Calls,Wall,% = IterSCF        1      54.954  95.62
siesta:    2    -6991.2181    -6793.0431    -6793.0595  2.3562 -3.2951
siesta:    3    -6824.7451    -6820.1704    -6820.2293  0.1763 -3.3640
siesta:    4    -6824.5535    -6820.5807    -6820.5807  0.1358 -3.3201
siesta:    5    -6824.5095    -6822.2506    -6822.2506  0.0365 -3.3021
siesta:    6    -6824.4949    -6823.4785    -6823.4785  0.0209 -3.2419
siesta:    7    -6824.4898    -6823.7839    -6823.7839  0.0098 -3.2572
siesta:    8    -6824.4891    -6824.0534    -6824.0534  0.0068 -3.2609
siesta:    9    -6824.4887    -6824.2947    -6824.2947  0.0034 -3.2623
siesta:   10    -6824.4887    -6824.4058    -6824.4058  0.0022 -3.2616
siesta:   11    -6824.4886    -6824.4619    -6824.4619  0.0013 -3.2621
siesta:   12    -6824.4886    -6824.5091    -6824.5091  0.0006 -3.2601
siesta:   13    -6824.4886    -6824.5173    -6824.5173  0.0003 -3.2612
siesta:   14    -6824.4886    -6824.5084    -6824.5084  0.0002 -3.2607
siesta:   15    -6824.4886    -6824.4984    -6824.4984  0.0001 -3.2617
siesta:   16    -6824.4886    -6824.4945    -6824.4945  0.0001 -3.2614

siesta: E_KS(eV) =            -6824.4909

siesta: E_KS - E_eggbox =     -6824.4909

siesta: Atomic forces (eV/Ang):
----------------------------------------
 Tot   -0.003537   -0.003538    0.000192
----------------------------------------
 Max   38.428666
 Res    6.467893    sqrt( Sum f_i^2 / 3N )
----------------------------------------
 Max   38.428666    constrained

Stress-tensor-Voigt (kbar): -89.67 -89.67 -89.67 -35.32 -35.32 -35.32
Target enthalpy (eV/cell)    -6824.4909

* Maximum dynamic memory allocated =    18 MB

siesta: Program's energy decomposition (eV):
siesta: Eions   =     12185.667955
siesta: Ena     =      3677.030483
siesta: Ekin    =      2539.205733
siesta: Enl     =       967.468346
siesta: DEna    =       224.163197
siesta: DUscf   =         5.849143
siesta: DUext   =         0.000000
siesta: Exc     =     -2052.539858
siesta: eta*DQ  =         0.000000
siesta: Emadel  =         0.000000
siesta: Ekinion =         0.000000
siesta: Eharris =     -6824.488618
siesta: Etot    =     -6824.490910
siesta: FreeEng =     -6824.490910

siesta: Final energy (eV):
siesta:       Kinetic =    2539.205733
siesta:       Hartree =     403.167281
siesta:    Ext. field =       0.000000
siesta:   Exch.-corr. =   -2052.539858
siesta:  Ion-electron =   -3058.454032
siesta:       Ion-ion =   -4655.870035
siesta:       Ekinion =       0.000000
siesta:         Total =   -6824.490910

siesta: Atomic forces (eV/Ang):
siesta:      1  -38.428666  -38.428665  -38.428665
siesta:      2   34.546099   34.546099   34.546159
siesta:      3    0.093356    0.525941    0.526005
siesta:      4    0.038862    0.015971    0.016038
siesta:      5    0.525942    0.093356    0.526006
siesta:      6    0.015971    0.038862    0.016038
siesta:      7    0.525942    0.525942    0.093356
siesta:      8    0.015971    0.015970    0.038930
siesta:      9   -0.000772    0.027848    0.027848
siesta:     10    0.002679    0.024390    0.024458
siesta:     11   -0.004722   -0.022084   -0.022016
siesta:     12   -0.034580   -0.013288   -0.013220
siesta:     13   -0.024803   -0.096815    0.000279
siesta:     14    0.044873   -0.124528   -0.089206
siesta:     15   -0.024803    0.000212   -0.096815
siesta:     16    0.044868   -0.089236   -0.124512
siesta:     17    0.027847   -0.000772    0.027848
siesta:     18    0.024390    0.002678    0.024458
siesta:     19   -0.096815   -0.024803    0.000279
siesta:     20   -0.124528    0.044873   -0.089206
siesta:     21   -0.022084   -0.004721   -0.022015
siesta:     22   -0.013288   -0.034580   -0.013221
siesta:     23    0.000212   -0.024803   -0.096815
siesta:     24   -0.089236    0.044868   -0.124511
siesta:     25    0.027848    0.027848   -0.000704
siesta:     26    0.024390    0.024390    0.002746
siesta:     27   -0.096815    0.000212   -0.024735
siesta:     28   -0.124528   -0.089236    0.045002
siesta:     29    0.000212   -0.096815   -0.024736
siesta:     30   -0.089236   -0.124528    0.045002
siesta:     31   -0.022084   -0.022084   -0.004653
siesta:     32   -0.013288   -0.013288   -0.034512
siesta:     33    0.000347    0.000347   -0.127758
siesta:     34   -0.004896   -0.004897    0.070552
siesta:     35    0.003778    0.042498    0.078022
siesta:     36   -0.027315    0.059364    0.024089
siesta:     37    0.042498    0.003778    0.078021
siesta:     38    0.059364   -0.027315    0.024089
siesta:     39    0.121082    0.121082   -0.270484
siesta:     40    1.759977    1.759977   -0.406439
siesta:     41    0.000347   -0.127758    0.000414
siesta:     42   -0.004896    0.070484   -0.004829
siesta:     43    0.003778    0.077954    0.042566
siesta:     44   -0.027315    0.024059    0.059492
siesta:     45    0.121082   -0.270485    0.121150
siesta:     46    1.759976   -0.406451    1.760067
siesta:     47    0.042498    0.077953    0.003846
siesta:     48    0.059377    0.024059   -0.027267
siesta:     49   -0.127758    0.000347    0.000414
siesta:     50    0.070484   -0.004896   -0.004829
siesta:     51   -0.270485    0.121082    0.121150
siesta:     52   -0.406451    1.759976    1.760067
siesta:     53    0.077954    0.003778    0.042566
siesta:     54    0.024059   -0.027315    0.059492
siesta:     55    0.077954    0.042497    0.003846
siesta:     56    0.024059    0.059377   -0.027267
siesta:     57    0.013447    0.013447    0.013515
siesta:     58    0.012255    0.012255    0.012323
siesta:     59   -0.015450   -0.007735   -0.007667
siesta:     60   -0.060732   -0.033153   -0.033057
siesta:     61   -0.007735   -0.015450   -0.007667
siesta:     62   -0.033153   -0.060732   -0.033057
siesta:     63   -0.007735   -0.007735   -0.015382
siesta:     64   -0.033144   -0.033144   -0.060693
siesta: ----------------------------------------
siesta:    Tot   -0.003537   -0.003538    0.000192

siesta: Stress tensor (static) (eV/Ang**3):
siesta:    -0.055967   -0.022047   -0.022047
siesta:    -0.022047   -0.055967   -0.022047
siesta:    -0.022047   -0.022047   -0.055967

siesta: Cell volume =       1280.824056 Ang**3

siesta: Pressure (static):
siesta:                Solid            Molecule  Units
siesta:           0.00060955          0.00006551  Ry/Bohr**3
siesta:           0.05596681          0.00601497  eV/Ang**3
siesta:          89.66967743          9.63714334  kBar

* Maximum dynamic memory allocated : Node    0 =    18 MB
* Maximum dynamic memory allocated : Node    1 =    18 MB
* Maximum dynamic memory allocated : Node    2 =    17 MB
* Maximum dynamic memory allocated : Node    3 =    16 MB
* Maximum dynamic memory allocated : Node    4 =    16 MB
* Maximum dynamic memory allocated : Node    5 =    16 MB
* Maximum dynamic memory allocated : Node    6 =    16 MB
* Maximum dynamic memory allocated : Node    7 =    16 MB

* Maximum memory occured during redistribXZ

timer: CPU execution times:
timer:  Routine       Calls   Time/call    Tot.time        %
timer:  siesta            1    6170.780    6170.780   100.00
timer:  Setup             1      15.267      15.267     0.25
timer:  bands             1       0.103       0.103     0.00
timer:  writewave         1       0.006       0.006     0.00
timer:  KSV_init          1       0.005       0.005     0.00
timer:  IterMD            1    6152.373    6152.373    99.70
timer:  hsparse           2       0.483       0.967     0.02
timer:  overfsm           2       0.114       0.227     0.00
timer:  IterSCF          17     361.680    6148.565    99.64
timer:  kinefsm           2       0.090       0.180     0.00
timer:  nlefsm            2       1.881       3.761     0.06
timer:  DHSCF            17       2.933      49.856     0.81
timer:  DHSCF1            1       0.776       0.776     0.01
timer:  DHSCF2            1       5.447       5.447     0.09
timer:  REORD           104       0.001       0.094     0.00
timer:  POISON           18       0.171       3.072     0.05
timer:  DHSCF3           17       2.259      38.404     0.62
timer:  rhoofd           17       0.822      13.976     0.23
timer:  cellXC           17       0.262       4.460     0.07
timer:  vmat             17       0.919      15.619     0.25
timer:  diagon           16     380.542    6088.668    98.67
timer:  cdiag          6272       0.890    5581.808    90.46
timer:  cdiag1         6272       0.051     320.242     5.19
timer:  cdiag2         6272       0.186    1168.382    18.93
timer:  cdiag3         6272       0.606    3800.731    61.59
timer:  cdiag4         6272       0.023     142.532     2.31
timer:  DHSCF4            1       5.182       5.182     0.08
timer:  dfscf             1       4.015       4.015     0.07
timer:  optical           1       0.083       0.083     0.00


elaps: ELAPSED times:
elaps:  Routine       Calls   Time/call    Tot.time        %
elaps:  siesta            1     786.057     786.057   100.00
elaps:  Setup             1       1.986       1.986     0.25
elaps:  bands             1       0.000       0.000     0.00
elaps:  writewave         1       0.001       0.001     0.00
elaps:  KSV_init          1       0.010       0.010     0.00
elaps:  IterMD            1     783.558     783.558    99.68
elaps:  hsparse           2       0.091       0.182     0.02
elaps:  overfsm           2       0.026       0.051     0.01
elaps:  IterSCF          17      46.062     783.053    99.62
elaps:  kinefsm           2       0.012       0.025     0.00
elaps:  nlefsm            2       0.264       0.529     0.07
elaps:  DHSCF            17       0.376       6.390     0.81
elaps:  DHSCF1            1       0.112       0.112     0.01
elaps:  DHSCF2            1       0.690       0.690     0.09
elaps:  REORD           104       0.000       0.010     0.00
elaps:  POISON           18       0.021       0.387     0.05
elaps:  DHSCF3           17       0.289       4.915     0.63
elaps:  rhoofd           17       0.109       1.860     0.24
elaps:  cellXC           17       0.035       0.600     0.08
elaps:  vmat             17       0.116       1.977     0.25
elaps:  diagon           16      48.434     774.949    98.59
elaps:  cdiag          6272       0.113     707.234    89.97
elaps:  cdiag1         6272       0.007      41.328     5.26
elaps:  cdiag2         6272       0.024     149.477    19.02
elaps:  cdiag3         6272       0.077     482.557    61.39
elaps:  cdiag4         6272       0.003      18.369     2.34
elaps:  DHSCF4            1       0.663       0.663     0.08
elaps:  dfscf             1       0.524       0.524     0.07
elaps:  optical           1       0.008       0.008     0.00

End of run:  18-FEB-2009  22:47:59
Job /RS/progs/LSF/7.0/linux2.6-glibc2.3-x86_64/bin/intelmpi_wrapper -np 8 /RS/users/mehmet.topsakal/rs/rsbin/siesta

TID   HOST_NAME  COMMAND_LINE     STATUS                  TERMINATION_TIME
===== ========== ================ ======================= ===================
00000 d118       /RS/users/mehmet Done                    02/18/2009 22:47:59
00001 d118       /RS/users/mehmet Done                    02/18/2009 22:47:59
00002 d118       /RS/users/mehmet Done                    02/18/2009 22:47:59
00003 d118       /RS/users/mehmet Done                    02/18/2009 22:47:59
00004 d086       /RS/users/mehmet Done                    02/18/2009 22:47:59
00005 d086       /RS/users/mehmet Done                    02/18/2009 22:47:59
00006 d086       /RS/users/mehmet Done                    02/18/2009 22:47:59
00007 d086       /RS/users/mehmet Done                    02/18/2009 22:47:59

