Re: [Wien] [please pay attention] query for mpi job file

2017-01-19 Thread Dr. K. C. Bhamu
 Thank you very much, Prof. Lyudmila.
Please see my updated, reduced query below.


> I do not use mpi, only simple parallelization over k-points, so I will
> answer only some of your questions.
> > (1) is it ok with mpiifort or mpicc or it should have mpifort or
> mpicc??
>
> I do not know and I even do not understand the question.
>

I compiled Wien2k_16 with mpiifort and mpiicc, so my question is whether
mpiifort and mpiicc are correct, or whether I should use mpifort and mpicc
(note the double "i").
I hope the question is now better framed.
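
As a quick self-check (only a sketch; this assumes Intel MPI is installed and its
wrappers are in the PATH): mpiifort/mpiicc are the Intel MPI wrappers around
ifort/icc, while mpifort/mpicc wrap whatever compiler the MPI library was built
against, so one can ask the wrappers directly:

  mpiifort -v      # should report the Intel MPI wrapper with ifort underneath
  mpiicc -v        # should report icc underneath
  mpifort -show    # prints the underlying compile line (Intel MPI/MPICH-style wrappers; Open MPI uses --showme)

If both namings end up calling ifort/icc with the same MPI library, they should
build the same code.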


>
> > (2) how to know that job is running with mpi parallelization?
>
> IMHO, the simplest way is from dayfile:
>

It is a good idea to check this in case.dayfile.


> cycle 1 (Wed. Sept. 21 21:59:09 SAMT 2016)   (60/99 to go)
> >   lapw0 -p    (21:59:09) starting parallel lapw0 at Wed. Sept. 21 21:59:09 SAMT 2016
>  .machine0 : processors
> running lapw0 in single mode  <-***this is no mpi--)
> 10.221u 0.064s 0:10.35 99.3%    0+0k 0+28016io 0pf+0w
> >   lapw1  -up -p    -c (21:59:19) starting parallel lapw1 at Wed. Sept. 21 21:59:19 SAMT 2016
> ->  starting parallel LAPW1 jobs at Wed. Sept. 21 21:59:19 SAMT 2016
> running LAPW1 in parallel mode (using .machines) <---***this is k-point parallel.--)
> 9 number_of_parallel_jobs <-***this is k-point parallel.--)
> localhost(12) 131.805u 1.038s 2:13.24 99.6% 0+0k 0+94072io 0pf+0w
> ...
> localhost(12) 122.034u 1.234s 2:03.67 99.6% 0+0k 0+81472io 0pf+0w
>    Summary of lapw1para: <--***this is k-point parallel.--)
>
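
For a quick check without reading the whole dayfile, one could grep for exactly
those lines (a sketch, using only the strings shown in the dayfile above;
replace case.dayfile with the actual case name):

  grep -e "running lapw0" -e "running LAPW1" -e "number_of_parallel_jobs" case.dayfile

"running ... in single mode" then means no mpi, while "running LAPW1 in parallel
mode (using .machines)" together with number_of_parallel_jobs indicates the
k-point parallel run.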


Thank you very much for the detailed answer.


>
> > the *.err file seems as:
>
>> cp: cannot stat `CuGaO2.scfdmup': No such file or directory  >>>
>>
> I don't know, and I am afraid nobody knows without info
>

This is not a problem; it happens by default for the runsp_c_lapw case, as set
up by Prof. Peter to save computational time. I found the explanation in a
three-year-old answer from Prof. Peter.


> Mond. Sept 19 15:10:29 SAMT 2016> (x) lapw1 -up -p -c
> Mond. Sept 19 15:12:52 SAMT 2016> (x) lapw1 -dn -p -c
> Mond. Sept 19 15:15:09 SAMT 2016> (x) lapw2 -up -p -c ...


Okay, that is because you are running the run_lapw -c case.

>> (3) I want to know how to change the variables below in the job file so
>> that I can run mpi more effectively
>> # the following number / 4 = number of nodes
>> #$ -pe mpich 32
>> set mpijob=1??
>> set jobs_per_node=4??
>>  the definition above requests 32 cores and we have 4 cores /node.
>>  We request only k-point parallel, thus mpijob=1
>>  the resulting machines names are in $TMPDIR/machines
>> setenv OMP_NUM_THREADS 1???
>>
>
> I don't know.
>


Okay, maybe someone else can look into this.
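
In case it is useful until someone who knows the sge.job template answers: below
is only a guessed sketch of how those variables might be set for an mpi run on
4-core nodes, assuming mpijob means the number of cores per mpi job and
jobs_per_node means the number of parallel jobs started per node (please verify
against the comments in the sge.job script before using):

  # hypothetical mpi setup for 32 requested cores on nodes with 4 cores each
  #$ -pe mpich 32            # 32 cores / 4 cores per node = 8 nodes
  set mpijob=4               # assumption: each mpi job spans the 4 cores of one node
  set jobs_per_node=1        # assumption: one such job per node, k-points spread over the 8 jobs
  setenv OMP_NUM_THREADS 1   # no extra threading, so mpi processes do not oversubscribe cores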


>
>> (4) The job with 32 cores and with 64 cores (with "set mpijob=2") takes
>> ~equal time for the scf cycles.
>>
>
> From your log file it looks like you do not have any parallelization, so
> in both cases you have equal time.
>

Yes, it may be. But if I use "set mpijob=1" then the k-point parallelization
runs well.
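
If it helps with testing, the difference between pure k-point and mpi
parallelization should also be visible in the .machines file that the job script
writes into the case directory. A hand-written sketch with placeholder hostnames
(the exact syntax should be checked against the WIEN2k user's guide): for
k-point parallelization only, one host per line,

  1:node01
  1:node02
  granularity:1
  extrafine:1

whereas for mpi several cores of a node are listed on one line (plus a lapw0
line), e.g.

  lapw0:node01 node01 node01 node01
  1:node01 node01 node01 node01
  granularity:1
  extrafine:1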



Thank you very much.


Sincerely

Bhamu


>
>


Re: [Wien] [please pay attention] query for mpi job file

2017-01-18 Thread Lyudmila Dobysheva

18.01.2017 22:35, Dr. K. C. Bhamu wrote:
> On Tue, Jan 17, 2017 at 10:50 PM, Dr. K. C. Bhamu wrote:


I do not use mpi, only simple parallelization over k-points, so I will 
answer only some of your questions.
> (1) is it ok with mpiifort or mpicc or it should have mpifort or mpicc??


I do not know and I even do not understand the question.

> (2) how to know that job is running with mpi parallelization?

IMHO, the simplest way is from dayfile:
cycle 1 (Wed. Sept. 21 21:59:09 SAMT 2016)   (60/99 to go)
>   lapw0 -p    (21:59:09) starting parallel lapw0 at Wed. Sept. 21 21:59:09 SAMT 2016
 .machine0 : processors
running lapw0 in single mode  <-***this is no mpi--)
10.221u 0.064s 0:10.35 99.3%    0+0k 0+28016io 0pf+0w
>   lapw1  -up -p    -c (21:59:19) starting parallel lapw1 at Wed. Sept. 21 21:59:19 SAMT 2016
->  starting parallel LAPW1 jobs at Wed. Sept. 21 21:59:19 SAMT 2016
running LAPW1 in parallel mode (using .machines) <---***this is k-point parallel.--)
9 number_of_parallel_jobs <-***this is k-point parallel.--)
localhost(12) 131.805u 1.038s 2:13.24 99.6% 0+0k 0+94072io 0pf+0w
...
localhost(12) 122.034u 1.234s 2:03.67 99.6% 0+0k 0+81472io 0pf+0w
   Summary of lapw1para: <--***this is k-point parallel.--)

> the *.err file seems as:

cp: cannot stat `CuGaO2.scfdmup': No such file or directory  >>>
why is this an error? I want to overcome it.


I don't know, and I am afraid nobody knows without info


The :log file
Tue Jan 17 22:16:14 IST 2017> (x) lapw0
Tue Jan 17 22:16:17 IST 2017> (x) orb -up
Tue Jan 17 22:16:17 IST 2017> (x) orb -dn
Tue Jan 17 22:16:17 IST 2017> (x) lapw1 -up -orb
Tue Jan 17 22:17:26 IST 2017> (x) lapw2 -up -orb


In my case the log file gives (k-point parallel; I do not know about mpi):
">   (runsp_lapw) options: -cc 0.005 -i 60 -p
Mond. Sept 19 15:10:18 SAMT 2016> (x) lapw0 -p
Mond. Sept 19 15:10:29 SAMT 2016> (x) lapw1 -up -p -c
Mond. Sept 19 15:12:52 SAMT 2016> (x) lapw1 -dn -p -c
Mond. Sept 19 15:15:09 SAMT 2016> (x) lapw2 -up -p -c ...
"


(3) I want to know how to change the variables below in the job file so
that I can run mpi more effectively
# the following number / 4 = number of nodes
#$ -pe mpich 32
set mpijob=1??
set jobs_per_node=4??
 the definition above requests 32 cores and we have 4 cores /node.
 We request only k-point parallel, thus mpijob=1
 the resulting machines names are in $TMPDIR/machines
setenv OMP_NUM_THREADS 1???


I don't know.


(4) The job with 32 cores and with 64 cores (with "set mpijob=2") takes
~equal time for the scf cycles.


From your log file it looks like you do not have any parallelization, 
so in both cases you have equal time.


Best wishes
  Lyudmila Dobysheva
--
Phys.-Techn. Institute of Ural Br. of Russian Ac. of Sci.
426001 Izhevsk, ul.Kirova 132
RUSSIA
--
Tel.:7(3412) 432045(office), 722529(Fax)
E-mail: l...@ftiudm.ru, lyuk...@mail.ru (office)
lyuk...@gmail.com (home)
Skype:  lyuka17 (home), lyuka18 (office)
http://ftiudm.ru/content/view/25/103/lang,english/
--


Re: [Wien] [please pay attention] query for mpi job file

2017-01-18 Thread Dr. K. C. Bhamu
Dear Experts,

Could someone please help me with running an mpi job, regarding the query below:

Sincerely
Bhamu
On Tue, Jan 17, 2017 at 10:50 PM, Dr. K. C. Bhamu wrote:

> Dear Experts
>
> I just installed Wien2k_16 on an sge cluster (linuxifc) with 40 nodes, each
> node having 16 cores and each core having 4GB RAM (~2GB/processor), with a 40
> Gbps Infiniband interconnect. I used the "mpiifort" and "mpiicc" compilers
> with the scalapack, blas, fftw3 and blacs libraries (without ELPA and
> LIBXC-3.0.0). I also specified the number of cores per node (16) during
> configuration (the compiler options are given at the bottom of the email).
>
> Now I have submitted the job using the sge script:
>
> http://susi.theochem.tuwien.ac.at/reg_user/faq/sge.job
>
> with set mpijob=2 instead of set mpijob=1.
>
>
> I specified
>   PARAMETER  (NMATMAX=   19000)
>   PARAMETER  (NUME=   6000)
>
> Now I have a few queries:
> (1) is it ok with mpiifort or mpiicc, or should it have mpifort or mpicc??
> (2) how to know that job is running with mpi parallelization?
>
>
> the basic outputs are:
>
> [bhamu@gu CuGaO2]$ testpara1_lapw
> .processes: No such file or directory.
> (standard_in) 1: syntax error
>
> #
> # TESTPARA1 #
> #
>
> Tue Jan 17 22:14:57 IST 2017
>
>lapw1para was not yet executed
>
> the *.err file seems as:
> LAPW0 END
>  ORB   END
>  ORB   END
>  LAPW1 END
>  LAPW2 END
> cp: cannot stat `CuGaO2.scfdmup': No such file or directory  >>> why
> is this an error? I want to overcome it.
>  CORE  END
>  CORE  END
>  MIXER END
>
> The :log file
>
> Tue Jan 17 22:16:14 IST 2017> (x) lapw0
> Tue Jan 17 22:16:17 IST 2017> (x) orb -up
> Tue Jan 17 22:16:17 IST 2017> (x) orb -dn
> Tue Jan 17 22:16:17 IST 2017> (x) lapw1 -up -orb
> Tue Jan 17 22:17:26 IST 2017> (x) lapw2 -up -orb
> Tue Jan 17 22:17:44 IST 2017> (x) lcore -up
> Tue Jan 17 22:17:44 IST 2017> (x) lcore -dn
> Tue Jan 17 22:17:45 IST 2017> (x) mixer -orb
>
>
> (3) I want to know how to change the variables below in the job file so that I
> can run mpi more effectively
>
> # the following number / 4 = number of nodes
> #$ -pe mpich 32
> set mpijob=1??
> set jobs_per_node=4??
>
>  the definition above requests 32 cores and we have 4 cores /node.
>  We request only k-point parallel, thus mpijob=1
>  the resulting machines names are in $TMPDIR/machines
>
> setenv OMP_NUM_THREADS 1???
>
>
> (4) The job with 32 cores and with 64 cores (with "set mpijob=2") takes ~equal
> time for the scf cycles.
>
>
>
> The other compilers options set as:
>
>
>Recommended options for system linuxifc are:
>
>  RP_LIB(SCALAPACK+PBLAS): -lmkl_scalapack_lp64 
> -lmkl_blacs_intelmpi_lp64 $(R_LIBS)
>  FPOPT(par.comp.options): -O1 -FR -mp1 -w -prec_div -pc80 -pad -ip 
> -DINTEL_VML -traceback -assume buffered_io -I$(MKLROOT)/include
>  MPIRUN command : mpirun -np _NP_ -machinefile _HOSTS_ _EXEC_
>
>Current settings:
>
>  FFTW_LIB + FFTW_OPT: -lfftw3_mpi -lfftw3 -L/usr/include/lib  +  
> -DFFTW3 -I/usr/include/include (already set)
>  ELPA_LIB + ELPA_OPT:   +   (already set)
>  RP  RP_LIB(SCALAPACK+PBLAS): -lmkl_scalapack_lp64 
> -lmkl_blacs_intelmpi_lp64 $(R_LIBS)
>  FP  FPOPT(par.comp.options): -O1 -FR -mp1 -w -prec_div -pc80 -pad -ip 
> -DINTEL_VML -traceback -assume buffered_io -I$(MKLROOT)/include
>  MP  MPIRUN command : mpirun -np _NP_ -machinefile _HOSTS_ _EXEC_
>  CN  CORES_PER_NODE : 16
>
>
> For any other supporting information please let me know.
>
>
> Sincerely
>
> Bhamu
>
>
___
Wien mailing list
Wien@zeus.theochem.tuwien.ac.at
http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
SEARCH the MAILING-LIST at:  
http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html