Recompile with LI, since mpirun is supported (after loading the proper MPI module).

PS: Ask them whether -np and -machinefile can still be used. Otherwise you cannot mix k-point parallelization and MPI parallelization, and for smaller cases it is certainly a severe limitation to have only ONE mpi job with many k-points, a small matrix size and many mpi cores.
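
For illustration, a .machines file that mixes both levels (hypothetical node
names; two k-point groups with 4 mpi cores each) could look roughly like:

1:e0467:4
1:e0468:4
lapw0: e0467:4
granularity:1
extrafine:1

The WIEN2k parallel scripts then start one mpirun per k-point group, roughly as
"mpirun -np 4 -machinefile .machine1 ...", which is why the -np and -machinefile
options matter on such a cluster.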

On 23.04.2021 at 16:04, leila mollabashi wrote:
Dear Prof. Peter Blaha and WIEN2k users,

Thank you for your assistance.

Here is the admin's reply:

  * the mpirun/mpiexec command is supported after loading the proper module
    (I suggest openmpi/4.1.0 with gcc 6.2.0 or icc)
  * you have to describe the needed resources (I suggest --nodes and
    --ntasks-per-node; please use a "whole node", so ntasks-per-node =
    28, 32 or 48, depending on the partition)
  * Yes, our cluster has "tight integration with mpi", but the
    other way around: our MPI libraries are compiled with SLURM
    support, so when you describe the resources at the beginning of the
    batch script, you do not have to use the "-np" and "-machinefile"
    options for mpirun/mpiexec

  * the error message "btl_openib_component.c:1699:init_one_device" is
    caused by an "old" MPI library, so please recompile your application
    (WIEN2k) using openmpi/4.1.0_icc19
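
For example, a batch script header following these recommendations could look
like this (only a sketch; partition name, core count and executable are
placeholders to be adapted):

#!/bin/bash -l
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=28   # whole node: 28, 32 or 48 depending on the partition
#SBATCH --time=20:00:00
#SBATCH -p standard            # assumed partition name

module load openmpi/4.1.0_icc19
# the SLURM-integrated Open MPI takes the process count and host list from
# the allocation, so no -np / -machinefile is given here
mpiexec ./my_mpi_program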

Now, should I compile WIEN2k with LS or LI?

Sincerely yours,

Leila Mollabashi


On Wed, Apr 14, 2021 at 10:34 AM Peter Blaha <pbl...@theochem.tuwien.ac.at> wrote:

    It cannot initialize an mpi job, because it is missing the interface
    software.

    You need to ask the computing center / system administrators how one
    executes an mpi job on this computer.

    It could be that "mpirun" is not supported on this machine. You may try
    a WIEN2k installation with system "LS" in siteconfig. This will
    configure the parallel environment/commands using "slurm" commands like
    srun -K -N_nodes_ -n_NP_ ..., replacing mpirun.
    We used it once on our hpc machine, since it was recommended by the
    computing center people. However, it turned out that the standard mpirun
    installation was more stable, because the "slurm controller" died too
    often, leading to many random crashes. Anyway, if your system has what
    is called "tight integration of mpi", it might be necessary.

    On 13.04.2021 at 21:47, leila mollabashi wrote:
     > Dear Prof. Peter Blaha and WIEN2k users,
     >
     > Then by running x lapw1 -p:
     >
     > starting parallel lapw1 at Tue Apr 13 21:04:15 CEST 2021
     >
     > ->  starting parallel LAPW1 jobs at Tue Apr 13 21:04:15 CEST 2021
     >
     > running LAPW1 in parallel mode (using .machines)
     >
     > 2 number_of_parallel_jobs
     >
     > [1] 14530
     >
     > [e0467:14538] mca_base_component_repository_open: unable to open
     > mca_btl_uct: libucp.so.0: cannot open shared object file: No such
     > file or directory (ignored)
     >
     > WARNING: There was an error initializing an OpenFabrics device.
     >
     >    Local host:   e0467
     >
     >    Local device: mlx4_0
     >
     > MPI_ABORT was invoked on rank 1 in communicator MPI_COMM_WORLD
     >
     > with errorcode 0.
     >
     > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
     >
     > You may or may not see output from other processes, depending on
     >
     > exactly when Open MPI kills them.
     >
     >
     > --------------------------------------------------------------------------
     >
     > [e0467:14567] 1 more process has sent help message
     > help-mpi-btl-openib.txt / error in device init
     >
     > [e0467:14567] 1 more process has sent help message
     > help-mpi-btl-openib.txt / error in device init
     >
     > [e0467:14567] Set MCA parameter "orte_base_help_aggregate" to 0 to see
     > all help / error messages
     >
     > [warn] Epoll MOD(1) on fd 27 failed.  Old events were 6; read change
     > was 0 (none); write change was 2 (del): Bad file descriptor
     >
     >> Somewhere there should be some documentation how one runs an mpi job
     >> on your system.
     >
     > I only found this:
     >
     > Before submitting a task, it should be encapsulated in an appropriate
     > script understandable to the queue system, e.g.:
     >
     > /home/users/user/submit_script.sl
     >
     > Sample SLURM script:
     >
     > #!/bin/bash -l
     > #SBATCH -N 1
     > #SBATCH --mem 5000
     > #SBATCH --time=20:00:00
     > /sciezka/do/pliku/binarnego/plik_binarny.in > /sciezka/do/pliku/wyjsciowego.out
     >
     > To submit a task to a specific queue, use the #SBATCH -p parameter,
     > e.g.:
     >
     > #!/bin/bash -l
     > #SBATCH -N 1
     > #SBATCH --mem 5000
     > #SBATCH --time=20:00:00
     > #SBATCH -p standard
     > /sciezka/do/pliku/binarnego/plik_binarny.in > /sciezka/do/pliku/wyjsciowego.out
     >
     > The task must then be submitted using the *sbatch* command:
     >
     > sbatch /home/users/user/submit_script.sl
     >
     > *Submitting interactive tasks*
     >
     >
     > Interactive tasks can be divided into two groups:
     >
     > · interactive task (working in text mode)
     >
     > · interactive task
     >
     > *Interactive task (working in text mode)*
     >
     >
     > Submitting an interactive task is very simple and in the simplest case
     > comes down to issuing the command below.
     >
     > srun --pty /bin/bash
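     >
     > To request a whole node interactively, resources can also be given to
     > srun (values only as an example):
     >
     > srun --pty --nodes=1 --ntasks-per-node=28 -p standard /bin/bash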
     >
     > Sincerely yours,
     >
     > Leila Mollabashi
     >
     >
     > On Wed, Apr 14, 2021 at 12:03 AM leila mollabashi
     > <le.mollaba...@gmail.com> wrote:
     >
     >     Dear Prof. Peter Blaha and WIEN2k users,
     >
     >     Thank you for your assistance.
     >
     >     > At least now the error: "lapw0 not found" is gone. Do you
     >     > understand why??
     >
     >     Yes, I think it is because now the path is clearly known.
     >
     >     >How many slots do you get by this srun command ?
     >
     >     Usually I go to a node with 28 CPUs.
     >
     >     >Is this the node with the name  e0591 ???
     >
     >     Yes, it is.
     >
     >     > Of course the .machines file must be consistent (dynamically
     >     > adapted) with the actual nodename.
     >
     >     Yes, to do this I use my script.
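     >
     >     A minimal sketch of such a script (only an illustration, not the
     >     actual script; it puts one k-point group on each allocated host):
     >
     >     rm -f .machines
     >     for h in $(scontrol show hostnames $SLURM_JOB_NODELIST); do
     >         echo "1:$h" >> .machines   # one k-point group per node
     >     done
     >     echo "granularity:1" >> .machines
     >     echo "extrafine:1" >> .machines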
     >
     >     When I use "srun --pty -n 8 /bin/bash", which goes to a node with 8
     >     free cores, and run x lapw0 -p, then this happens:
     >
     >     starting parallel lapw0 at Tue Apr 13 20:50:49 CEST 2021
     >
     >     -------- .machine0 : 4 processors
     >
     >     [1] 12852
     >
     >     [e0467:12859] mca_base_component_repository_open: unable to open
     >     mca_btl_uct: libucp.so.0: cannot open shared object file: No such
     >     file or directory (ignored)
     >
     >     [e0467][[56319,1],1][btl_openib_component.c:1699:init_one_device]
     >     error obtaining device attributes for mlx4_0 errno says Protocol not
     >     supported
     >
     >     [e0467:12859] mca_base_component_repository_open: unable to open
     >     mca_pml_ucx: libucp.so.0: cannot open shared object file: No such
     >     file or directory (ignored)
     >
     >     LAPW0 END
     >
     >     [1]    Done              mpirun -np 4 -machinefile .machine0
     >            /home/users/mollabashi/v19.2/lapw0_mpi lapw0.def >> .time00
     >
     >     Sincerely yours,
     >
     >     Leila Mollabashi
     >
     >


--
--------------------------------------------------------------------------
Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna
Phone: +43-1-58801-165300             FAX: +43-1-58801-165982
Email: bl...@theochem.tuwien.ac.at    WIEN2k: http://www.wien2k.at
WWW:   http://www.imc.tuwien.ac.at
-------------------------------------------------------------------------
_______________________________________________
Wien mailing list
Wien@zeus.theochem.tuwien.ac.at
http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
SEARCH the MAILING-LIST at:  
http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html
