Re: [QE-users] Unfold.x colormap (6.6)

2021-01-04 Thread William Hewett
Hi Pietro,

Thanks for your detailed answer, and also for the unfold.x code and the
recent updates. I've used it several times now and have found your examples
very helpful. I'll keep your description in mind when reading the PRB; it
often helps (me) to have a rough answer in mind when reading through the
details.

On BandUP, I looked into this some months ago and found that the code could
not work with versions of QE past 6.3 (I think - maybe there was some
change to the output of pw.x after this version?). I've just looked at the
github page again (https://github.com/band-unfolding/bandup) and it has not
been updated yet, so still not compatible.

Thanks for your offer, I may contact you in the future,

Cheers

Will Hewett
Post Doctoral Researcher
Victoria University of Wellington
New Zealand



On Tue, Jan 5, 2021 at 10:34 AM Pietro Bonfa' wrote:

> Dear Will,
>
> the quantities computed by the code are presented in Phys. Rev. B 85,
> 085201 (2012).
>
> More precisely, you can collect the quantity in Eq. 8 (detailed in
> Appendix A) for the k points specified in input. Roughly speaking, that
> is how much the unfolded states with equivalent k vector contribute to
> the folded state at a given energy. Admittedly this is far from clear but
> the article mentioned above describes very accurately both the problem
> and the proposed solution. Still I hope that my awkward description will
> be sufficient to understand that, in a perfect supercell, this number
> depends on the degeneracy of the eigenvalue in the base cell. If the
> symmetry is low enough (and forgetting about spin), this number is
> either 0 or 1.
>
> The picture that you are sharing is actually Eq. 9, which is the
> quantity described above times a Dirac delta function of the energy,
> which is approximated with a Gaussian.
>
> As a consequence, the 'weight' in a perfect supercell depends on the
> degeneracy (or almost degeneracy) of a state, the discretization of the
> energy interval and the width of the Gaussian approximating the Dirac
> delta.
> I hope this partially answers your question.
>
> That being said, as you may have read, I've always been advocating for
> BandUP, since unfold.x was just my exercise to learn Fortran and the
> internals of QE.
> However, lately I found a little bit of time to fix some problems, add
> some tests and parallel execution, so I'm a little more confident than
> before in its correctness. Still, let me remind you to carefully check
> your results and feel free to contact me for further details.
>
> Best regards,
> Pietro
>
> --
> Pietro Bonfà
> Department of Mathematical, Physical and Computer Sciences
> University of Parma
>
>
>
>
>
> On 1/3/21 5:37 AM, William Hewett wrote:
> > Hi all,
> >
> > I'm using QE to calculate the band structure of rare-earth nitride
> > materials, currently looking at the effect of nitrogen vacancies and the
> > resulting states created. I'm running calculations on 3x3x3 (primitive)
> > supercells, then using unfold.x to process the results.
> >
> > Unfold.x gives an output where each point (k, energy) also has a
> > 'weight'. Clearly this is zero where no states are present and non-zero
> > where they are present. My question is, what exactly is this weight?
> > Some sort of DOS for a single k-point?
> >
> > i.e. in the image below most points in the VB (near 6 on the y-axis) are
> > quite dark, while some points in the CB (flat 4f bands near -1) are
> > lighter in color.
> >
> > image.png
> >
> > Kind regards,
> >
> > Will Hewett
> > Post Doctoral Researcher
> > Victoria University of Wellington
> > New Zealand
> >
> >
> > ___
> > Quantum ESPRESSO is supported by MaX (www.max-centre.eu)
> > users mailing list users@lists.quantum-espresso.org
> > https://lists.quantum-espresso.org/mailman/listinfo/users
> >

[QE-users] convergence problem for a molecule

2021-01-04 Thread Dr. K. C. Bhamu
Dear QE Users
I wish you all a very happy new year 2021!

I am running a calculation on a molecule with QE 6.6 but am facing a
convergence problem. I tried with and without mixing_mode = 'local-TF',
and neither option works.

Could someone please have a look and suggest a solution?



&CONTROL
  calculation = 'relax'
  restart_mode = 'from_scratch'
  outdir = './tmp'
  pseudo_dir = '/home/kcbhamu/PPs'
  prefix = 'pwscf'
! disk_io = 'none'
  verbosity = 'default'
  etot_conv_thr = 0.0001
  forc_conv_thr = 0.001
  nstep = 400
  tstress = .true.
  tprnfor = .true.
/

&SYSTEM
  ibrav=1,
  celldm(1)=47.2431531141d0,
  nat=45,
  ntyp=2,
  ecutwfc=65,
  ecutrho=650,
  occupations='smearing',
  smearing='mv',
  degauss=0.005d0,
  vdw_corr = 'DFT-D3'
  assume_isolated='mt'
/

&ELECTRONS
  electron_maxstep=999
  conv_thr=1d-06,
  mixing_beta=0.2,
  mixing_mode='local-TF'
/

&IONS
  ion_dynamics = 'bfgs'
/

ATOMIC_SPECIES
C   12.010700  C.pbe-n-rrkjus_psl.1.0.0.UPF
H    1.007900  H.pbe-rrkjus_psl.1.0.0.UPF


ATOMIC_POSITIONS {crystal}
   H   0.3386670148d0   0.4101931225d0   0.5567254731d0
   C   0.4036222773d0   0.4149138472d0   0.4333448459d0
   H   0.3896346255d0   0.3765041141d0   0.4161167921d0
   C   0.4581434036d0   0.4058061800d0   0.4574431361d0
   H   0.4912475090d0   0.4017946150d0   0.4290030859d0
   C   0.469124d0   0.4031817060d0   0.5099401207d0
   C   0.4253264110d0   0.4071145213d0   0.5513729998d0
   H   0.4327529164d0   0.3774401630d0   0.5831675023d0
   H   0.4276381009d0   0.4464313224d0   0.5714430496d0
   C   0.3695729050d0   0.3993621172d0   0.5273547231d0
   H   0.3636345597d0   0.3567170026d0   0.5178739313d0
   C   0.3632663685d0   0.4323236400d0   0.4762330101d0
   H   0.3222111650d0   0.4290965819d0   0.4608472936d0
   H   0.3700270318d0   0.4749720278d0   0.4855397879d0
   C   0.3550607872d0   0.4623307485d0   0.3556503474d0
   C   0.4072533598d0   0.4549686753d0   0.3866434814d0
   H   0.3606425820d0   0.4892876373d0   0.3214193048d0
   H   0.3233410756d0   0.4794322728d0   0.3807732671d0
   H   0.3403880199d0   0.4238800875d0   0.3400782741d0
   H   0.4209431037d0   0.4937648655d0   0.4025522295d0
   H   0.4387880407d0   0.4413449118d0   0.3590270889d0
   C   0.5253811195d0   0.3963553146d0   0.5303629030d0
   H   0.5275121376d0   0.3597814346d0   0.5550954410d0
   H   0.5527498198d0   0.3902494747d0   0.4963382387d0
   C   0.6241002137d0   0.4781543529d0   0.6196978697d0
   H   0.6647510385d0   0.4689076884d0   0.6339462788d0
   H   0.5991190333d0   0.4830564471d0   0.6558173241d0
   C   0.6245138950d0   0.5312336801d0   0.5887880064d0
   H   0.6519223460d0   0.5264394769d0   0.5543229817d0
   C   0.5684449407d0   0.5426728657d0   0.566474d0
   H   0.5687315910d0   0.5794127480d0   0.5422455469d0
   H   0.5411292134d0   0.5502889434d0   0.6003493599d0
   C   0.5467291274d0   0.4961698023d0   0.5329872592d0
   H   0.5063039439d0   0.5054115901d0   0.5182523433d0
   C   0.5462494021d0   0.4434602904d0   0.5642948220d0
   C   0.6023141698d0   0.4317080266d0   0.5862549648d0
   H   0.6016470576d0   0.3946830496d0   0.6101287345d0
   H   0.6295099445d0   0.4241153738d0   0.5522682608d0
   C   0.6550325759d0   0.6297412310d0   0.5941577656d0
   C   0.6461033707d0   0.5769127641d0   0.6236837684d0
   H   0.6727321964d0   0.6599556430d0   0.6206848621d0
   H   0.6175101186d0   0.6464185973d0   0.5785534114d0
   H   0.6823409744d0   0.6243625417d0   0.5600843513d0
   H   0.6183422352d0   0.5832463756d0   0.6574159578d0
   H   0.6841825362d0   0.5638719462d0   0.6416905666d0

K_POINTS (gamma)



Thank you very much
K C Bhamu
University of Ulsan
ROK

Re: [QE-users] Unfold.x colormap (6.6)

2021-01-04 Thread Pietro Bonfa'

Dear Will,

the quantities computed by the code are presented in Phys. Rev. B 85, 
085201 (2012).


More precisely, you can collect the quantity in Eq. 8 (detailed in 
Appendix A) for the k points specified in input. Roughly speaking, that
is how much the unfolded states with equivalent k vector contribute to
the folded state at a given energy. Admittedly this is far from clear but
the article mentioned above describes very accurately both the problem 
and the proposed solution. Still I hope that my awkward description will 
be sufficient to understand that, in a perfect supercell, this number 
depends on the degeneracy of the eigenvalue in the base cell. If the 
symmetry is low enough (and forgetting about spin), this number is 
either 0 or 1.


The picture that you are sharing is actually Eq. 9, which is the 
quantity described above times a Dirac delta function of the energy, 
which is approximated with a Gaussian.


As a consequence, the 'weight' in a perfect supercell depends on the 
degeneracy (or almost degeneracy) of a state, the discretization of the 
energy interval and the width of the Gaussian approximating the Dirac delta.
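
A minimal sketch of this dependence, assuming the plotted quantity is a sum of per-state weights, each multiplied by a normalized Gaussian standing in for the Dirac delta. This is not code from unfold.x; the function name, eigenvalues, weights, energy grids and widths are made up for illustration.

```python
import numpy as np


def spectral_function(energies, weights, e_grid, sigma):
    """Sum over states of weight * normalized Gaussian centered on the eigenvalue."""
    energies = np.asarray(energies, dtype=float)[:, None]
    weights = np.asarray(weights, dtype=float)[:, None]
    e_grid = np.asarray(e_grid, dtype=float)[None, :]
    gauss = np.exp(-0.5 * ((e_grid - energies) / sigma) ** 2)
    gauss /= sigma * np.sqrt(2.0 * np.pi)
    return (weights * gauss).sum(axis=0)


# Fine energy grid (hypothetical units), spacing 0.01
fine_grid = np.linspace(-2.0, 2.0, 401)

# One non-degenerate state with weight 1
single = spectral_function([0.0], [1.0], fine_grid, sigma=0.05)

# Two degenerate states, each with weight 1: the peak doubles
double = spectral_function([0.0, 0.0], [1.0, 1.0], fine_grid, sigma=0.05)

# Same single state with a Gaussian twice as wide: the peak height halves
broad = spectral_function([0.0], [1.0], fine_grid, sigma=0.10)

# Coarse grid whose nearest points sit one sigma away from the eigenvalue:
# the sampled maximum drops although the state itself is unchanged
coarse_grid = np.arange(-2.0, 2.0, 0.1) + 0.05
coarse = spectral_function([0.0], [1.0], coarse_grid, sigma=0.05)

print("peak, single state      :", single.max())
print("peak, doubly degenerate :", double.max())
print("peak, wider Gaussian    :", broad.max())
print("peak, coarse energy grid:", coarse.max())
```

In this toy setting the same state shows up with a different color intensity depending only on degeneracy, Gaussian width and energy sampling, so brightness alone does not distinguish these effects.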

I hope this partially answers your question.

That being said, as you may have read, I've always been advocating for
BandUP, since unfold.x was just my exercise to learn Fortran and the
internals of QE.
However, lately I found a little bit of time to fix some problems, add
some tests and parallel execution, so I'm a little more confident than
before in its correctness. Still, let me remind you to carefully check
your results and feel free to contact me for further details.


Best regards,
Pietro

--
Pietro Bonfà
Department of Mathematical, Physical and Computer Sciences
University of Parma





On 1/3/21 5:37 AM, William Hewett wrote:

Hi all,

I'm using QE to calculate the band structure of rare-earth nitride 
materials, currently looking at the effect of nitrogen vacancies and the 
resulting states created. I'm running calculations on 3x3x3 (primitive) 
supercells, then using unfold.x to process the results.


Unfold.x gives an output where each point (k, energy) also has a
'weight'. Clearly this is zero where no states are present and non-zero
where they are present. My question is, what exactly is this weight?
Some sort of DOS for a single k-point?


i.e. in the image below most points in the VB (near 6 on the y-axis) are 
quite dark, while some points in the CB (flat 4f bands near -1) are 
lighter in color.


image.png

Kind regards,

Will Hewett
Post Doctoral Researcher
Victoria University of Wellington
New Zealand



Re: [QE-users] [QE-GPU] computing force and stress - time cost

2021-01-04 Thread Iurii TIMROV
Dear Mohammad,


> 2. You are using DFT+U, more precisely
> U_projection_type = 'ortho-atomic'
> The GPU acceleration of this functionality is limited and some portions
> of the algorithm will still run on the CPU. In addition, I fear that the
> evaluation of forces with that projection method scales pretty badly with
> the number of atoms, but I'll let the experts (and developers) of this new
> functionality comment further on this last point.


Yes, indeed, in DFT+U with the 'ortho-atomic' Hubbard manifold the calculations 
of Hubbard forces and stress take much more time than with the 'atomic' Hubbard 
manifold. See Phys. Rev. B 102, 235159 (2020), in particular see Appendix C.


As Pietro said, the ortho-atomic Hubbard forces and stress were ported to the 
GPU version of QE but this is still not optimal and can be improved in the 
future.


Greetings,

Iurii


--
Dr. Iurii TIMROV
Postdoctoral Researcher
STI - IMX - THEOS and NCCR - MARVEL
Swiss Federal Institute of Technology Lausanne (EPFL)
CH-1015 Lausanne, Switzerland
+41 21 69 34 881
http://people.epfl.ch/265334

From: users on behalf of Pietro Bonfa'
Sent: Monday, January 4, 2021 11:52:31 AM
To: users@lists.quantum-espresso.org
Subject: Re: [QE-users] [QE-GPU] computing force and stress - time cost

Dear Mohammad,

Paolo is right, but a couple of comments can already be made:

1. You are not using OpenMP parallelism, and I believe you have more
than 2 cores in your system. In order to achieve a decent speedup (at
least in the SCF) it's mandatory to enable OpenMP and exploit the whole
CPU power.

2. You are using DFT+U, more precisely

 U_projection_type = 'ortho-atomic'

The GPU acceleration of this functionality is limited and some portions
of the algorithm will still run on the CPU. In addition, I fear that the
evaluation of forces with that projection method scales pretty badly with
the number of atoms, but I'll let the experts (and developers) of this new
functionality comment further on this last point.

Best regards and happy new year,
Pietro




On 1/4/21 10:07 AM, Paolo Giannozzi wrote:
> The most important piece of information (the final time report) is not
> contained in your 45Mb output.
>
> Paolo
>
> On Mon, Jan 4, 2021 at 6:33 AM Mohammad Moaddeli
> <mohammad.moadd...@gmail.com> wrote:
>
> Dear Pietro,
>
> It takes about 22 hours to perform an scf (about 2 hours to perform
> diagonalization until convergence is achieved, and about 20 hours to
> compute force and stress).
> Here is the google drive link containing input and output files:
> 
> https://drive.google.com/file/d/1DFtLqFvrc8CFo1_q_jjnMFErvXpWEAHB/view?usp=sharing
> 
> 
>
> hp.x will be performed after vc-relax is done.
>
> Thanks in advance,
> Mohammad
>
> On Mon, Jan 4, 2021 at 2:38 AM Pietro Bonfa' wrote:
>
> Dear Mohammad,
>
> the performance of the GPU code depends dramatically on the portions of
> computation that are still performed on the CPU. Only a portion of all
> contributions to forces have been accelerated, and what is left out may
> be optimized for MPI parallelism rather than OpenMP.
>
> That being said, the behavior that you report is definitely unusual.
> Would you mind sharing input and output files?
> Best regards,
> Pietro
>
>
>
> On 1/3/21 8:52 AM, Mohammad Moaddeli wrote:
>  > Dear all,
>  >
>  > GPU enabled QE v.6.7 is compiled on a VOLTA card. I am trying to run a
>  > vc-relax for a bulk containing 48 atoms. Although diagonalization
>  > (davidson) is about 3x faster than CPU, it takes a lot of time (a couple
>  > of hours) to compute force and stress. Is this something related to the
>  > code itself?
>  >
>  > Best,
>  >
>  > Mohammad Moaddeli
>  > ShirazU

Re: [QE-users] [QE-GPU] computing force and stress - time cost

2021-01-04 Thread Pietro Bonfa'

Dear Mohammad,

Paolo is right, but a couple of comments can already be made:

1. You are not using OpenMP parallelism, and I believe you have more
than 2 cores in your system. In order to achieve a decent speedup (at
least in the SCF) it's mandatory to enable OpenMP and exploit the whole
CPU power.


2. You are using DFT+U, more precisely

U_projection_type = 'ortho-atomic'

The GPU acceleration of this functionality is limited and some portions
of the algorithm will still run on the CPU. In addition, I fear that the
evaluation of forces with that projection method scales pretty badly with
the number of atoms, but I'll let the experts (and developers) of this new
functionality comment further on this last point.
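
Regarding point 1, a hybrid MPI + OpenMP launch could be prepared roughly as in the sketch below. It assumes a pw.x binary built with OpenMP support and an `mpirun` launcher on the PATH. The helper name, the thread count and the input file name are hypothetical and not taken from this thread.

```python
import os


def build_pw_launch(n_mpi, n_threads, input_file, pw_exe="pw.x"):
    """Return (env, cmd) for a hybrid MPI + OpenMP run of pw.x.

    OMP_NUM_THREADS is the standard OpenMP environment variable that sets
    the number of threads per MPI process.
    """
    env = dict(os.environ)
    env["OMP_NUM_THREADS"] = str(n_threads)
    cmd = ["mpirun", "-np", str(n_mpi), pw_exe, "-in", input_file]
    return env, cmd


# Hypothetical layout: 1 MPI process driving the GPU, 8 OpenMP threads
env, cmd = build_pw_launch(n_mpi=1, n_threads=8, input_file="vc-relax.in")
print("OMP_NUM_THREADS =", env["OMP_NUM_THREADS"])
print(" ".join(cmd))

# To actually start the run on a machine with QE installed:
# import subprocess
# subprocess.run(cmd, env=env, check=True)
```

The thread count would normally be chosen so that MPI processes times OpenMP threads matches the number of physical cores available to the job.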


Best regards and happy new year,
Pietro




On 1/4/21 10:07 AM, Paolo Giannozzi wrote:
The most important piece of information (the final time report) is not 
contained in your 45Mb output.


Paolo

On Mon, Jan 4, 2021 at 6:33 AM Mohammad Moaddeli
<mohammad.moadd...@gmail.com> wrote:


Dear Pietro,

It takes about 22 hours to perform an scf (about 2 hours to perform
diagonalization until convergence is achieved, and about 20 hours to
compute force and stress).
Here is the google drive link containing input and output files:

https://drive.google.com/file/d/1DFtLqFvrc8CFo1_q_jjnMFErvXpWEAHB/view?usp=sharing



hp.x will be performed after vc-relax is done.

Thanks in advance,
Mohammad

On Mon, Jan 4, 2021 at 2:38 AM Pietro Bonfa' <pietro.bo...@unipr.it> wrote:

Dear Mohammad,

the performance of the GPU code depends dramatically on the portions of
computation that are still performed on the CPU. Only a portion of all
contributions to forces have been accelerated, and what is left out may
be optimized for MPI parallelism rather than OpenMP.

That being said, the behavior that you report is definitely unusual.
Would you mind sharing input and output files?

Best regards,
Pietro



On 1/3/21 8:52 AM, Mohammad Moaddeli wrote:
 > Dear all,
 >
 > GPU enabled QE v.6.7 is compiled on a VOLTA card. I am trying to run a
 > vc-relax for a bulk containing 48 atoms. Although diagonalization
 > (davidson) is about 3x faster than CPU, it takes a lot of time (a couple
 > of hours) to compute force and stress. Is this something related to the
 > code itself?
 >
 > Best,
 >
 > Mohammad Moaddeli
 > ShirazU

Re: [QE-users] [QE-GPU] computing force and stress - time cost

2021-01-04 Thread Paolo Giannozzi
The most important piece of information (the final time report) is not
contained in your 45Mb output.

Paolo

On Mon, Jan 4, 2021 at 6:33 AM Mohammad Moaddeli <
mohammad.moadd...@gmail.com> wrote:

> Dear Pietro,
>
> It takes about 22 hours to perform an scf (about 2 hours to perform
> diagonalization until convergence is achieved, and about 20 hours to
> compute force and stress).
> Here is the google drive link containing input and output files:
>
> https://drive.google.com/file/d/1DFtLqFvrc8CFo1_q_jjnMFErvXpWEAHB/view?usp=sharing
>
> hp.x will be performed after vc-relax is done.
>
> Thanks in advance,
> Mohammad
>
> On Mon, Jan 4, 2021 at 2:38 AM Pietro Bonfa' wrote:
>
>> Dear Mohammad,
>>
>> the performance of the GPU code depends dramatically on the portions of
>> computation that are still performed on the CPU. Only a portion of all
>> contributions to forces have been accelerated, and what is left out may
>> be optimized for MPI parallelism rather than openmp.
>>
>> That being said, the behavior that you report is definitely unusual.
>> Would you mind sharing input and output files?
>>
>> Best regards,
>> Pietro
>>
>>
>>
>> On 1/3/21 8:52 AM, Mohammad Moaddeli wrote:
>> > Dear all,
>> >
>> > GPU enabled QE v.6.7 is compiled on a VOLTA card. I am trying to run a
>> > vc-relax for a bulk containing 48 atoms. Although diagonalization
>> > (davidson) is about 3x faster than CPU, it takes a lot of time (a couple
>> > of hours) to compute force and stress. Is this something related to the
>> > code itself?
>> >
>> > Best,
>> >
>> > Mohammad Moaddeli
>> > ShirazU
>> >



-- 
Paolo Giannozzi, Dip. Scienze Matematiche Informatiche e Fisiche,
Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
Phone +39-0432-558216, fax +39-0432-558222