Just to be clear, I am not Paolo! If you need more memory, you should not increase the number of cores to a huge number. Instead, you can ask for more nodes but use fewer cores per node, so that each process gets a larger share of the node's memory.
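As a rough illustration of the arithmetic behind this advice, the Python sketch below computes how much memory each MPI process can use when a node is only partly populated. It takes the 12 cores and 16 GB per node from the machine description in the quoted message. It assumes that whole nodes are reserved for the job and that all of a node's memory is available to the processes (in practice the operating system takes some). The function name `layout` is purely illustrative.

```python
# Figures from the quoted machine description: two six-core CPUs and 16 GB per node.
CORES_PER_NODE = 12
MEM_PER_NODE_GB = 16.0


def layout(nodes, ranks_per_node):
    """Return (total MPI ranks, cores reserved, memory per rank in GB).

    Assumption: nodes are reserved whole, so the reserved core count
    depends only on the number of nodes, not on how many ranks run on each.
    """
    if not 1 <= ranks_per_node <= CORES_PER_NODE:
        raise ValueError("ranks_per_node must be between 1 and CORES_PER_NODE")
    total_ranks = nodes * ranks_per_node
    reserved_cores = nodes * CORES_PER_NODE
    mem_per_rank_gb = MEM_PER_NODE_GB / ranks_per_node
    return total_ranks, reserved_cores, mem_per_rank_gb


if __name__ == "__main__":
    # Compare fully populated nodes with half-populated nodes.
    for rpn in (12, 6):
        ranks, cores, mem = layout(16, rpn)
        print(f"16 nodes, {rpn} ranks/node: {ranks} ranks, "
              f"{cores} cores reserved, {mem:.2f} GB per rank")
```

With 16 nodes, going from 12 to 6 processes per node halves the process count from 192 to 96 while the reserved cores stay at 192, and roughly doubles the memory per process from about 1.33 GB to about 2.67 GB.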
For instance, you can ask for 16 nodes and use 6 cores per node. Check your environment, but it is highly likely that you need to use something like

size = 192
aprun -n 96 -N 6 pw.x ...

If you would rather not waste half of each node, look into the OpenMP options.

----------------------------------------------------
Duy Le
Postdoctoral Associate
Department of Physics
University of Central Florida.
Website: http://www.physics.ucf.edu/~dle

On Tue, Jun 11, 2013 at 3:57 PM, vijaya subramanian <vijaya65 at hotmail.com> wrote:
> Hi Paolo
> I am running an scf calculation on gold slabs. I have somewhat limited
> resources on a supercomputer
> and would like to optimize my runs. (Cray XT5 with 9,408 compute nodes
> interconnected with the SeaStar router through HyperTransport. The SeaStars
> are all interconnected in a 3-D torus topology. It is a massively parallel
> processing (MPP) machine. Each compute node has two six-core 2.6 GHz AMD
> Opterons for a total of 112,896 cores. All nodes have 16 Gbytes of DDR2
> memory: 1.33 Gbytes of memory per core.)
> A 54 gold atom slab scf calculation worked best with 120
> processors/npool=2/ndiag=49/ntg6.
> 240 processors and I get very good speed. 64 processors and I get an out of
> memory issue.
> When I use a larger unit cell I run into problems.
> I have attached two files with different configurations of gold atoms in a
> slab calculation with larger unit cells.
> The unit cells are different, one has six layers of gold atoms (unit cell -
> 16.12x48.36x60.8 in Bohr) and the other 2 layers of gold atoms (unit
> cell-54.x43.x54.).
> For some reason I cannot get the 160 atom problem to work. (>2000 still
> doesn't work). For the 6 layer 162 atom problem(nproc=720 works). If I use
> fewer number of processors I get an out of memory
> problem.
> Do you have any suggestions for what the problem may be?
>
> I have given partial output for the two calcs below:
> 160 atoms-1200 processors-the run failed before the diagonalization began.
> Parallelization info
> --------------------
> sticks:   dense  smooth     PW     G-vecs:    dense   smooth
> Min         105      31      8                24383     3975
> Max         106      32      9                24398     4042
> Sum       75823   22755   5881             17559633  2885465 37
>
>
> bravais-lattice index     = 0
> lattice parameter (alat)  = 54.5658 a.u.
> unit-cell volume          = 129972.7994 (a.u.)^3
> number of atoms/cell      = 160
> number of atomic types    = 1
> number of electrons       = 1760.00
> number of Kohn-Sham states= 2112
> kinetic-energy cutoff     = 30.0000 Ry
> charge density cutoff     = 400.0000 Ry
> convergence threshold     = 1.0E-06
> mixing beta               = 0.7000
> number of iterations used = 8 plain mixing
> Exchange-correlation      = SLA PW PBX PBC ( 1 4 3 4 0)
> EXX-fraction              = 0.00
> ........
> Dense grid: 17559633 G-vectors FFT dimensions: ( 360, 288, 360)
>
> Smooth grid: 2885465 G-vectors FFT dimensions: ( 192, 160, 192)
>
> Largest allocated arrays      est. size (Mb)   dimensions
> Kohn-Sham Wavefunctions          32.87 Mb     ( 1020, 2112)
> NL pseudopotentials              42.33 Mb     ( 510, 5440)
> Each V/rho on FFT grid            1.58 Mb     ( 103680)
> Each G-vector array               0.19 Mb     ( 24385)
> G-vector shells                   0.09 Mb     ( 11350)
> Largest temporary arrays      est. size (Mb)   dimensions
> Auxiliary wavefunctions         131.48 Mb     ( 1020, 8448)
> Each subspace H/S matrix          3.36 Mb     ( 469, 469)
> Each <psi_i|beta_j> matrix      350.63 Mb     ( 5440, 2, 2112)
> Arrays for rho mixing            12.66 Mb     ( 103680, 8)
>
> Initial potential from superposition of free atoms
> Check: negative starting charge= -0.028620
>
> starting charge 1759.98221, renormalised to 1760.00000
>
> negative rho (up, down): 0.286E-01 0.000E+00
> Starting wfc are 2880 randomized atomic wfcs
> Application 5992317 exit signals: Killed
>
> 162 atom run:
> Parallelization info
> --------------------
> sticks:   dense  smooth     PW     G-vecs:    dense   smooth      PW
> Min          34      10      2                 8950     1450     178
> Max          35      11      3                 8981     1509     229
> Sum       24841    7453   2003              6454371  1060521  148169
>
>
> bravais-lattice index     = 0
> lattice parameter (alat)  = 16.1227 a.u.
> unit-cell volume          = 47776.5825 (a.u.)^3
> number of atoms/cell      = 162
> number of atomic types    = 1
> number of electrons       = 1782.00
> number of Kohn-Sham states= 2138
> kinetic-energy cutoff     = 30.0000 Ry
> charge density cutoff     = 400.0000 Ry
> convergence threshold     = 1.0E-06
> mixing beta               = 0.7000
> number of iterations used = 8 plain mixing
> Exchange-correlation      = SLA PW PBX PBC ( 1 4 3 4 0)
> EXX-fraction              = 0.00
> Non magnetic calculation with spin-orbit
>
>
> _______________________________________________
> Pw_forum mailing list
> Pw_forum at pwscf.org
> http://pwscf.org/mailman/listinfo/pw_forum
