Re: [OMPI users] Set maximum number of CPU (or threads) for a user
Hello,

> Hi,
>
> I am using OpenMPI 4.1.1 (without a scheduler) and I want to limit the
> number of CPUs or threads a specific user (or host, if I must) can use.
> Is there a configuration or environment variable I can use? This user
> has not been following our resource management policy, so simply asking
> him to use -np 2 will not suffice.
>
> Thank you,
>
> Dave Martin

In the same direction as the previous answers: did you consider the user.slice feature in systemd? This would let you use cgroups to constrain a user's complete login sessions. No extra software is needed. See https://www.freedesktop.org/software/systemd/man/systemd.resource-control.html, especially "CPUQuota". These limits can be set in a user slice.

Best Regards

Christof

--
Dr. rer. nat. Christof Köhler   email: c.koeh...@uni-bremen.de
Universitaet Bremen/FB1/BCCMS   phone: +49-(0)421-218-62334
Am Fallturm 1/ TAB/ Raum 3.06   fax: +49-(0)421-218-62770
28359 Bremen
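To illustrate the suggestion above, a per-user CPU quota can be set via a systemd drop-in for that user's slice. This is only a sketch: the UID 1000 and the 200% value (roughly two CPUs' worth of time) are assumptions, not values from the thread.

```ini
# /etc/systemd/system/user-1000.slice.d/50-cpuquota.conf
# Hypothetical example: cap all login sessions of the user with UID 1000
# at 200% CPU time (the equivalent of two fully busy CPUs).
[Slice]
CPUQuota=200%
```

The same limit can also be applied at runtime with `systemctl set-property user-1000.slice CPUQuota=200%`; after editing the drop-in file, a `systemctl daemon-reload` is needed for it to take effect.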
Re: [OMPI users] psec warning when launching with srun
Hello Gilles,

thank you very much for the prompt patch. I can confirm that configure now prefers the external PMIx. I can also confirm that the munge warnings and PMIx errors we observed are gone. An MPI hello world runs successfully with srun --mpi=pmix and --mpi=pmi2.

I noticed that configure complained loudly about a missing external libevent (i.e. the libevent-devel package), but did not complain at all that an external hwloc-devel was also missing.

Best Regards

Christof

On Sat, May 20, 2023 at 06:54:54PM +0900, Gilles Gouaillardet wrote:
> Christof,
>
> Open MPI switching to the internal PMIx is a bug I addressed in
> https://github.com/open-mpi/ompi/pull/11704
>
> Feel free to manually download and apply the patch; you will then need
> recent autotools and to run
>     ./autogen.pl --force
>
> Another option is to manually edit the configure file.
>
> Look for the following snippet:
>
>     # Final check - if they didn't point us explicitly at an external version
>     # but we found one anyway, use the internal version if it is higher
>     if test "$opal_external_pmix_version" != "internal" && (test -z "$with_pmix" || test "$with_pmix" = "yes")
>     then :
>       if test "$opal_external_pmix_version" != "3x"
>
> and replace the last line with
>
>     if test $opal_external_pmix_version_major -lt 3
>
> Cheers,
>
> Gilles
>
> On Sat, May 20, 2023 at 6:13 PM christof.koehler--- via users <users@lists.open-mpi.org> wrote:
>
> > Hello Z. Matthias Krawutschke,
> >
> > On Fri, May 19, 2023 at 09:08:08PM +0200, Zhéxué M. Krawutschke wrote:
> > > Hello Christoph,
> > > what exactly is your problem with OpenMPI and Slurm?
> > > Do you compile the products yourself? Which Linux distribution and
> > > version are you using?
> > >
> > > If you compile the software yourself, could you please tell me what the
> > > "configure" command looks like and which MUNGE version is in use? From the
> > > distribution or compiled by yourself?
> > >
> > > I would be very happy to take on this topic and help you. You can also
> > > reach me at +49 176 67270992.
> > > Best regards from Berlin
> >
> > Please refer to (especially the end of) my first mail in this thread,
> > which is available here:
> > https://www.mail-archive.com/users@lists.open-mpi.org/msg35141.html
> >
> > I believe this contains the relevant information you are requesting. The
> > second mail, which you are replying to, was just additional information.
> > My apologies if this led to confusion.
> >
> > Please let me know if any relevant information is missing from my first
> > email. At the bottom of this email I include the ompi_info output as a
> > further addendum.
> >
> > To summarize: I would like to understand where the munge warning
> > and PMIx error described in the first email (and the github link
> > included) come from. The explanation in the github issue
> > does not appear to be correct, as all munge libraries are
> > available everywhere. To me, it appears at the moment that OpenMPI's
> > configure erroneously decides to build and use the internal PMIx
> > instead of the (presumably) newer externally available PMIx,
> > leading to launcher problems with srun.
> >
> > Best Regards
> >
> > Christof
> >
> >                  Package: Open MPI root@admin.service Distribution
> >                 Open MPI: 4.1.5
> >   Open MPI repo revision: v4.1.5
> >    Open MPI release date: Feb 23, 2023
> >                 Open RTE: 4.1.5
> >   Open RTE repo revision: v4.1.5
> >    Open RTE release date: Feb 23, 2023
> >                     OPAL: 4.1.5
> >       OPAL repo revision: v4.1.5
> >        OPAL release date: Feb 23, 2023
> >                  MPI API: 3.1.0
> >             Ident string: 4.1.5
> >                   Prefix: /cluster/mpi/openmpi/4.1.5/gcc-11.3.1
> >  Configured architecture: x86_64-pc-linux-gnu
> >           Configure host: admin.service
> >            Configured by: root
> >            Configured on: Wed May 17 18:45:42 UTC 2023
> >   Configure command line: '--enable-mpi1-compatibility'
> >                           '--enable-orterun-prefix-by-default'
> >                           '--with-ofi=/cluster/libraries/libfabric/1.18.0/'
> >                           '--with-slurm' '--with-pmix'
> >                           '--with-pmix-libdir=/usr/lib64' '--with-pmi'
> >            MCA sharedfp: individual (MCA v2.1.0, API v2.0.0, Component v4.1.5)
> >            MCA sharedfp: sm (MCA v2.1.0, API v2.0.0, Component v4.1.5)
> >                MCA topo: basic (MCA v2.1.0, API v2.2.0, Component v4.1.5)
> >                MCA topo: treematch (MCA v2.1.0, API v2.2.0, Component v4.1.5)
> >           MCA vprotocol: pessimist (MCA v2.1.0, API v2.0.0, Component v4.1.5)
>
> > Z. Matthias Krawutschke
> >
> > On Thursday, May 18, 2023 at 5:47 PM, christof.koehler--- via users
> > <users@lists.open-mpi.org> wrote:
> > Hello again,
> >
> > I should add that the openmpi configure decides to use the internal pmix:
> >
> > configure: WARNING: discovered external PMIx version is less than internal version 3.x
> > configure: WARNING: using internal PMIx
> > ...
> > checking if user requested PMI support... yes
> > checking for pmi.h in /usr/include... not found
> > checking for pmi.h in /usr/include/slurm... found
> > checking pmi.h usability... yes
> > checking pmi.h presence... yes
> > checking for pmi.h... yes
> > checking for libpmi in /usr/lib64... found
> > checking for PMI_Init in -lpmi... yes
> > checking for pmi2.h in /usr/include... not found
> > checking for pmi2.h in /usr/include/slurm... found
> > checking pmi2.h usability... yes
> > checking pmi2.h presence... yes
> > checking for pmi2.h... yes
> > checking for libpmi2 in /usr/lib64... found
> > checking for PMI2_Init in -lpmi2... yes
> > checking for pmix.h in ... not found
> > checking for pmix.h in /include... not found
> > checking can PMI support be built... yes
> > checking if user requested internal PMIx support(yes)... no
> > checking for pmix.h in /usr... not found
> > checking for pmix.h in /usr/include... found
> > checking libpmix.* in /usr/lib64... found
> > checking PMIx version... version file found
> > checking version 4x... found
> > checking PMIx version to be used... internal
> >
> > I am not sure how it decides that; the external one is already a quite
> > new version.
> >
> > # srun --mpi=list
> > MPI plugin types are...
> > pmix
> > cray_shasta
> > none
> > pmi2
> > specific pmix plugin versions available: pmix_v4
> >
> > Best Regards
> >
> > Christof
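Gilles' manual edit of the configure file can also be scripted. This is only a sketch against a stand-in file (`configure.fragment` is a placeholder; the real edit targets Open MPI's generated configure script, whose exact line content may differ between releases):

```shell
# Stand-in for the relevant line in Open MPI's generated configure script.
printf '%s\n' 'if test "$opal_external_pmix_version" != "3x"' > configure.fragment

# Replace the string comparison with the numeric major-version check
# from the fix in https://github.com/open-mpi/ompi/pull/11704
sed -i 's/if test "\$opal_external_pmix_version" != "3x"/if test $opal_external_pmix_version_major -lt 3/' configure.fragment

cat configure.fragment
```

Applying the upstream patch with `./autogen.pl --force`, as Gilles suggests, is the cleaner route; editing configure directly only avoids the need for recent autotools.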
Re: [OMPI users] psec warning when launching with srun
Hello again,

I should add that the openmpi configure decides to use the internal pmix:

configure: WARNING: discovered external PMIx version is less than internal version 3.x
configure: WARNING: using internal PMIx
...
checking if user requested PMI support... yes
checking for pmi.h in /usr/include... not found
checking for pmi.h in /usr/include/slurm... found
checking pmi.h usability... yes
checking pmi.h presence... yes
checking for pmi.h... yes
checking for libpmi in /usr/lib64... found
checking for PMI_Init in -lpmi... yes
checking for pmi2.h in /usr/include... not found
checking for pmi2.h in /usr/include/slurm... found
checking pmi2.h usability... yes
checking pmi2.h presence... yes
checking for pmi2.h... yes
checking for libpmi2 in /usr/lib64... found
checking for PMI2_Init in -lpmi2... yes
checking for pmix.h in ... not found
checking for pmix.h in /include... not found
checking can PMI support be built... yes
checking if user requested internal PMIx support(yes)... no
checking for pmix.h in /usr... not found
checking for pmix.h in /usr/include... found
checking libpmix.* in /usr/lib64... found
checking PMIx version... version file found
checking version 4x... found
checking PMIx version to be used... internal

I am not sure how it decides that; the external one is already a quite new version.

# srun --mpi=list
MPI plugin types are...
pmix
cray_shasta
none
pmi2
specific pmix plugin versions available: pmix_v4

Best Regards

Christof
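A plausible reading of why configure decides this (based on the version-check snippet discussed elsewhere in this thread): the check is a string comparison against "3x", so a newer "4x" external PMIx also takes the internal-PMIx branch. A minimal shell sketch of that logic:

```shell
# Sketch of the buggy check: any version string other than "3x",
# including a newer "4x", triggers the internal-PMIx fallback.
opal_external_pmix_version=4x
if test "$opal_external_pmix_version" != "3x"; then
  echo "string check: falls back to internal PMIx"
fi

# The fixed check compares the major version numerically instead:
opal_external_pmix_version_major=4
if test "$opal_external_pmix_version_major" -lt 3; then
  echo "numeric check: internal PMIx"
else
  echo "numeric check: external PMIx is used"
fi
```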
[OMPI users] psec warning when launching with srun
Hello everybody,

we are seeing the symptoms described in
https://github.com/open-mpi/ompi/issues/11557

However, according to the system's package manager (dnf), all munge-related packages on the build node and the execution node are identical; see details at the bottom. So the explanation given by Ralph Castain in the git issue, which I read to refer to a missing munge library, does not appear to explain the warning.

I would like to note that PMIX_MCA_psec=native is sufficient to make the warning go away; it is not necessary to disable munge completely. Launching with srun --mpi=pmi2 instead works fine anyway. Related to that: pmix has a configure switch to disable munge support. Would it be possible and/or advisable to disable munge in the pmix build?

I have to note that we are also seeing

PMIX ERROR: ERROR in file gds_ds12_lock_pthread.c at line 168

for each rank launched. This might be a separate issue or related. Setting PMIX_MCA_gds=^ds12 makes this go away.

With or without setting PMIX_MCA_psec and PMIX_MCA_gds, the job launches and is executed. Still, I would like to understand this better in case we have a broken slurm or mpi setup which would need to be corrected.

Best Regards

Christof

System details: Rocky Linux 9.1, slurm 23.02.2, openpmix 4.2.4rc1, openmpi 4.1.5. slurm and pmix were compiled in the same step to rpms in a mock build environment, using the spec files included with them. I can provide the build logs of the rpms. Both are not from distribution repositories.
openmpi was configured with

./configure --enable-mpi1-compatibility --enable-orterun-prefix-by-default \
    --with-ofi=/cluster/libraries/libfabric/1.18.0/ --with-slurm \
    --with-pmix --with-pmix-libdir=/usr/lib64 \
    --with-pmi --with-pmi-libdir=/usr/lib64

Installed munge packages from the distribution according to dnf:

on the build node
munge.x86_64        0.5.13-13.el9
munge-devel.x86_64  0.5.13-13.el9
munge-libs.x86_64   0.5.13-13.el9

on the execution node
munge.x86_64        0.5.13-13.el9
munge-devel.x86_64  0.5.13-13.el9
munge-libs.x86_64   0.5.13-13.el9
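For reference, the two workarounds described above could be combined in a job script. This is only a sketch: the srun invocation and the hello_mpi binary name are placeholders, not taken from the thread.

```shell
# Workarounds described above (sketch; hello_mpi is a placeholder binary).
export PMIX_MCA_psec=native   # makes the munge psec warning go away
export PMIX_MCA_gds=^ds12     # makes the gds_ds12_lock_pthread.c errors go away

# Actual launch (placeholder, commented out for illustration):
# srun --mpi=pmix ./hello_mpi

echo "psec=$PMIX_MCA_psec gds=$PMIX_MCA_gds"
```

Note that these only suppress the symptoms; as discussed above, the underlying cause appears to be configure selecting the internal PMIx over the external one.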