Re: [Wien] A question about the Rkm

2016-01-10 Thread Peter Blaha
a) Clearly, for a nanowire simulation the mpi-parallelization is best.
Unfortunately, on some clusters mpi is not set up properly, or users do
not use the proper mkl-libraries for the particular mpi. Please use the
Intel link-library advisor, as was mentioned in previous posts.
The mkl-scalapack will NOT work unless you use the proper version of the
blacs_lp64 library.


b) As a short-term solution you should:
i) Use parallelization with OMP_NUM_THREADS=2. This speeds up the
calculation by nearly a factor of 2 and uses 2 cores in a single lapw1
without any increase in memory (see the sketch after this list).
ii) Reduce the number of k-points. I'm pretty sure you can reduce it to
2-4 for the scf and structure optimization. This will save memory due to
fewer k-parallel jobs.
iii) During structure optimization you will end up with very small Si-H
and C-H distances. So I'd reduce the H sphere right now to about 0.6,
but keep Si and C large (for C use around 1.2). With such a setup, a
preliminary structure optimization can be done with RKMAX=2.0, which
should later be checked with 2.5 and 3.0.
iv) Use iterative diagonalization! After the first cycle, this will
speed up the scf by a factor of 5!
v) And of course, reconsider the size of your "vacuum", i.e. the
separation of your wires. "Vacuum" is VERY expensive in terms of memory
and one should not set it too large without testing. Optimize your wire
with small a,b; then increase the vacuum later on (x supercell) and
check whether forces reappear and whether distances, band structure,
... change.
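
For orientation, a minimal sketch (sh syntax) of how suggestions i), ii)
and iv) could be applied; the host name, k-mesh and convergence flags are
placeholders (the convergence flags are copied from the job script quoted
later in this thread), so check the usersguide of your WIEN2k version for
the exact options:

---
# run from the case directory
export OMP_NUM_THREADS=2      # i) two OpenMP threads per lapw1

x kgen                        # ii) enter e.g. 4 k-points when prompted

cat > .machines <<EOF
granularity:1
1:localhost
1:localhost
1:localhost
1:localhost
EOF
# 4 k-parallel jobs x 2 threads on one 16-core node

# iii) RKMAX is the first number on the second line of case.in1(c)
# iv) -it switches on iterative diagonalization
runsp_lapw -p -it -min -ec 0.0001 -cc 0.001 -fc 0.5
---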




Re: [Wien] A question about the Rkm

2016-01-10 Thread Hu, Wenhao

(I accidentally replied with the wrong title. To ensure consistency, I am
sending this post again. Maybe the mailing list manager can delete the wrong
post for me.)

Hi, Peter:

Thank you very much for your reply. By following your suggestion, I unified
the versions of all the libraries so that they are compiled with, or
consistent with, Intel Composer XE 2015 (MKL, fftw, openmpi, etc.) and
recompiled wien2k. The version of my openmpi is 1.6.5. However, I still get
the same problem. Besides the message I posted earlier, I also have the
following backtrace information from the process:

lapw1c_mpi:14596 terminated with signal 11 at PC=2ab4dac4df79 SP=7fff78b8e310.  
Backtrace:

lapw1c_mpi:14597 terminated with signal 11 at PC=2b847d2a1f79 SP=7fff8ef89690.  
Backtrace:
/opt/openmpi-intel-composer_xe_2015.3.187/1.6.5/lib/libmpi.so.1(MPI_Comm_size+0x59)[0x2ab4dac4df79]
/opt/openmpi-intel-composer_xe_2015.3.187/1.6.5/lib/libmpi.so.1(MPI_Comm_size+0x59)[0x2b847d2a1f79]
/Users/wenhhu/wien2k14/lapw1c_mpi(blacs_pinfo_+0x92)[0x49cf02]
/Users/wenhhu/wien2k14/lapw1c_mpi(blacs_pinfo_+0x92)[0x49cf02]
/opt/intel/composer_xe_2015.3.187/mkl/lib/intel64/libmkl_scalapack_lp64.so(sl_init_+0x21)[0x2b8478d2e171]
/opt/intel/composer_xe_2015.3.187/mkl/lib/intel64/libmkl_scalapack_lp64.so(sl_init_+0x21)[0x2ab4d66da171]
/Users/wenhhu/wien2k14/lapw1c_mpi(parallel_mp_init_parallel_+0x63)[0x463cd3]
/Users/wenhhu/wien2k14/lapw1c_mpi(parallel_mp_init_parallel_+0x63)[0x463cd3]
/Users/wenhhu/wien2k14/lapw1c_mpi(gtfnam_+0x22)[0x426372]
/Users/wenhhu/wien2k14/lapw1c_mpi(MAIN__+0x6c)[0x4493dc]
/Users/wenhhu/wien2k14/lapw1c_mpi(main+0x2e)[0x40d19e]
/Users/wenhhu/wien2k14/lapw1c_mpi(gtfnam_+0x22)[0x426372]
/Users/wenhhu/wien2k14/lapw1c_mpi(MAIN__+0x6c)[0x4493dc]
/Users/wenhhu/wien2k14/lapw1c_mpi(main+0x2e)[0x40d19e]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x339101ed5d]
/Users/wenhhu/wien2k14/lapw1c_mpi[0x40d0a9]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x339101ed5d]
/Users/wenhhu/wien2k14/lapw1c_mpi[0x40d0a9]

Do you think it is still a problem with my MKL, or are there other issues
I am missing?

Best,
Wenhao




Re: [Wien] A question about the Rkm

2016-01-10 Thread Gavin Abo
From the backtrace, it does look like it crashed in libmpi.so.1, which 
I believe is an Open MPI library.  I don't know if it will solve the 
problem or not, but I would try a different Open MPI version or 
recompile Open MPI (while tweaking the configuration options [ 
https://software.intel.com/en-us/articles/performance-tools-for-software-developers-building-open-mpi-with-the-intel-compilers 
]).
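
For reference, building Open MPI with the Intel compilers generally follows
the pattern below; the version number and install prefix are placeholders,
and the Intel article linked above lists the recommended configure options:

---
tar xzf openmpi-1.10.1.tar.gz && cd openmpi-1.10.1
./configure --prefix=$HOME/openmpi-1.10.1-intel \
            CC=icc CXX=icpc F77=ifort FC=ifort
make -j 8 && make install
# then put the new prefix first in PATH/LD_LIBRARY_PATH and relink WIEN2k
---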


composer_xe_2015.3.187 => ifort version 15.0.3 [ 
https://software.intel.com/en-us/articles/intel-compiler-and-composer-update-version-numbers-to-compiler-version-number-mapping 
]


In the post at the following link on the Intel forum it looks like 
openmpi-1.10.1rc2 (or newer) was recommended for ifort 15.0 (or newer) 
to resolve a Fortran run-time library (RTL) issue:


https://software.intel.com/en-us/forums/intel-fortran-compiler-for-linux-and-mac-os-x/topic/564266


Re: [Wien] A question about the Rkm

2016-01-10 Thread Laurence Marks
The most common problem is use of the wrong version of blacs -- which the
Intel link advisor will provide information about.

I have very, very rarely seen anything beyond a wrong version of blacs.
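
To illustrate: with Intel MKL and Open MPI (LP64 interface), the
ScaLAPACK/BLACS part of the parallel link options entered through siteconfig
(RP_LIBS) has to pull in the Open MPI flavour of BLACS rather than the
default Intel MPI one. A sketch, to be checked against the link advisor's
actual output for your MKL version:

---
# correct BLACS for Open MPI:
-lmkl_scalapack_lp64 -lmkl_blacs_openmpi_lp64 \
-lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread -lm
# linking -lmkl_blacs_intelmpi_lp64 against Open MPI instead can segfault
# early in lapw1_mpi (blacs_pinfo / MPI_Comm_size), as in the backtrace above.
---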


Re: [Wien] A question about the Rkm

2016-01-09 Thread Hu, Wenhao
Hi, Marks and Peter:

Thank you for your suggestions. Regarding your reply, I have several
follow-up questions. Actually, I'm using a mid-sized cluster at my
university, whose standard nodes have 16 cores and 64 GB of memory. The
calculation I'm doing is k-point-parallelized but not MPI-parallelized.
From the :RKM flag I posted in my first email, I estimate that the matrix
size I need for a Rkmax=5+ will be at least 4. In my current calculation,
the lapw1 program occupies as much as 3 GB in each slot (1 k point/slot), so
I estimate the memory for each slot will be at least 12 GB. I have 8
k-points, so at least 96 GB of memory will be required (if my estimate is
correct). Considering the computational resources I currently have, this is
far too memory-demanding. On our clusters, there is a 4 GB memory limit per
slot on a standard node. Although I can request a high-memory node, those
nodes are in high demand among cluster users. Do you have any suggestions
for accomplishing this calculation within the limitations of my cluster?

About the details of my calculation, the material I'm looking at is a
hydrogen-terminated silicon carbide with 56 atoms. A 1x1x14 k-mesh is used
for k-point sampling. The radius of 1.2 actually comes from setrmt_lapw.
Indeed, the radius of hydrogen is too large, and I'm adjusting it
continually as the optimization progresses. The reason I have such a huge
matrix is mainly the size of my unit cell: I'm using a large unit cell to
isolate the coupling between neighboring nanowires.

Besides the above questions, I also ran into some problems with the mpi
calculation. Following Marks' suggestion on parallel calculation, I wanted
to test the efficiency of an mpi calculation, since I had only used
k-point-parallelized calculations before. The MPI installed on my cluster
is openmpi. In the output file, I get the following error:

---
 LAPW0 END

lapw1c_mpi:19058 terminated with signal 11 at PC=2b56d9118f79 SP=7fffc23d6890.  
Backtrace:
...
mpirun has exited due to process rank 14 with PID 19061 on
node neon-compute-2-25.local exiting improperly. There are two reasons this 
could occur:

1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.

2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"

This may have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--
Uni_+6%.scf1up_1: No such file or directory.
grep: *scf1up*: No such file or directory

---

The job script I’m using is:

---
!/bin/csh -f
# -S /bin/sh
#
#$ -N uni_6
#$ -q MF
#$ -m be
#$ -M wenhao...@uiowa.edu
#$ -pe smp 16
#$ -cwd
#$ -j y

cp $PE_HOSTFILE hostfile
echo "PE_HOSTFILE:"
echo $PE_HOSTFILE
rm .machines
echo granularity:1 >>.machines
while read hostname slot useless; do
i=0
l0=$hostname
while [ $i -lt $slot ]; do
echo 1:$hostname:2 >>.machines
let i=i+2
done
done>.machines

runsp_lapw -p -min -ec 0.0001 -cc 0.001 -fc 0.5
---

Is there any mistake I made or something missing in my script?
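
For comparison, a minimal sh sketch of the .machines generation: the posted
script mixes a csh shebang with Bourne-shell loop syntax, and the outer loop
appears never to read the hostfile ("done>.machines" redirects output, not
input). The host names come from SGE's $PE_HOSTFILE (host, slots, queue,
...); the 2-cores-per-job split is kept from the original and is
illustrative only:

---
#!/bin/sh
#$ -pe smp 16
#$ -cwd

rm -f .machines
echo "granularity:1" >> .machines

# one k-parallel job per 2 slots on every host in the SGE hostfile
while read hostname slots rest; do
    i=0
    while [ "$i" -lt "$slots" ]; do
        echo "1:$hostname:2" >> .machines
        i=$((i+2))
    done
done < "$PE_HOSTFILE"

runsp_lapw -p -min -ec 0.0001 -cc 0.001 -fc 0.5
---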

Thank you very much for your help.

Wenhao


Re: [Wien] A question about the Rkm

2016-01-08 Thread Peter Blaha
I do not know many compounds for which an RMT=1.2 bohr for H makes any
sense (maybe LiH). Use setrmt and follow its suggestion. Usually, H
spheres for C-H or O-H bonds should be less than 0.6 bohr. Experimental
H positions are often very unreliable.
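
A typical invocation, with the case name and the percentage as placeholders
(the options and the name of the proposed struct file can differ between
WIEN2k versions, so check the output of setrmt_lapw first):

---
setrmt_lapw case -r 3    # propose RMT values, reduced by ~3%
# inspect the proposed radii, then accept them:
cp case.struct_setrmt case.struct
---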


How many k-points? Often 1 k-point is enough for 50+ atoms (at least at
the beginning), in particular when you have an insulator.


Otherwise, follow the suggestions of L.Marks about parallelization.

On 08.01.2016 at 07:28, Hu, Wenhao wrote:

Hi, all:

I have some confusion about the Rkm in calculations with 50+ atoms. In my
wien2k, NATMAX and NUME are set to 15000 and 1700. With the highest NE and
NAT, the Rkmax can only be as large as 2.05, which is much lower than the
value suggested on the FAQ page of WIEN2k (the smallest atom in my case is
an H atom with a radius of 1.2). By checking the :RKM flag in case.scf, I
get the following information:

:RKM  : MATRIX SIZE 11292    LOs: 979  RKM= 2.05  WEIGHT= 1.00  PGR:

With such a matrix size, a single cycle can take as long as two and a half
hours. Although I can increase NATMAX and NUME to raise Rkmax, the
calculation will be much slower, which will make the optimization
calculation almost impossible. Before making a convergence test on Rkmax,
can anyone tell me whether such an Rkmax is a reasonable value?

If any further information is needed, please let me know. Thanks in advance.

Best,
Wenhao



--
--
Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna
Phone: +43-1-58801-165300 FAX: +43-1-58801-165982
Email: bl...@theochem.tuwien.ac.at    WIEN2k: http://www.wien2k.at
WWW:   http://www.imc.tuwien.ac.at/staff/tc_group_e.php
--


Re: [Wien] A question about the Rkm

2016-01-08 Thread Laurence Marks
An RKMAX (the RKM value in case.scf) of 2.05 is too small for a minimum RMT
of 1.2; the results of the calculation will be very poor. I estimate that
you need a value of at least 5, which will mean a much larger matrix size
and a longer calculation.
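
(Very roughly, the LAPW basis size grows as (RKmax/RMT_min)^3, so going from
RKM = 2.05 to about 5 means on the order of (5/2.05)^3, i.e. roughly 15 times
more basis functions; the memory for the N x N matrices then grows by a
factor of about 200, and the diagonalization time grows even faster.)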

You need to use a more powerful computer or (better) a cluster. I
estimate that something like 64 cores will give you a fairly fast result.
If all you have is a single-core computer, either give up on the calculation
or increase NATMAX & NUME and (if you have enough memory) wait.

N.B., depending upon your architecture the multithreaded mkl (which you
seem to be using) is either faster or slower than mpi. On my systems mpi is
faster even for small systems; on Peter Blaha's, he says that the
multithreading is faster.




-- 
Professor Laurence Marks
Department of Materials Science and Engineering
Northwestern University
www.numis.northwestern.edu
Corrosion in 4D: MURI4D.numis.northwestern.edu
Co-Editor, Acta Cryst A
"Research is to see what everybody else has seen, and to think what nobody
else has thought"
Albert Szent-Gyorgi
___
Wien mailing list
Wien@zeus.theochem.tuwien.ac.at
http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
SEARCH the MAILING-LIST at:  
http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html