Dear developers,
    I tried to use the -ndiag option in an SCF run. The tutorials say that "most CPU time is spent in linear-algebra operations, implemented in the BLAS and LAPACK libraries, and in FFT", and that linear-algebra parallelization serves to "distribute and parallelize matrix diagonalization and matrix-matrix multiplications needed in iterative diagonalization (SCF)". So I expected that adding -ndiag would give an obvious speedup. To test this, I artificially constructed a 2x2 supercell of copper containing 8 atoms and ran

mpiexec.hydra -n 12 pw.x -ndiag 9 -in cu_supercell.scf.in > cu_supercell.scf.out

But I didn't see any improvement over the run without -ndiag; the timing is almost the same. I checked the output file, and it does show that

Subspace diagonalization in iterative solution of the eigenvalue problem:
     one sub-group per band group will be used
     scalapack distributed-memory algorithm (size of sub-group:  3*  3 procs)

I also tested a more practical case, a material with 23 atoms in the primitive cell, and again saw no improvement. So I am wondering: what is wrong? Is -ndiag not effective for SCF?
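For reference, here is how I compare where the time goes. This is only a sketch: the file mock_scf.out and the timing values in it are made up, and in practice I point the grep at the real cu_supercell.scf.out. Only the time reported for cdiaghg (the subspace diagonalization routine) can shrink with -ndiag, so if that line is a small fraction of the PWSCF total, -ndiag cannot help much.

```shell
# Create a mock fragment of pw.x output so the grep below is runnable;
# the routine names are real pw.x timing labels, the numbers are invented.
printf '%s\n' \
  '     cdiaghg      :      1.23s CPU      1.30s WALL (      50 calls)' \
  '     PWSCF        :     45.67s CPU     46.00s WALL' \
  > mock_scf.out

# cdiaghg = subspace diagonalization (what -ndiag parallelizes);
# PWSCF   = total run time. Compare the two lines between runs.
grep -E 'cdiaghg|PWSCF' mock_scf.out
```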


Best regards


    

_______________________________________________
Pw_forum mailing list
[email protected]
http://pwscf.org/mailman/listinfo/pw_forum
