--
Ole Holm Nielsen
PhD, Senior HPC Officer
Department of Physics, Technical University of Denmark,
Fysikvej Building 309, DK-2800 Kongens Lyngby, Denmark
E-mail: ole.h.niel...@fysik.dtu.dk
Homepage: http://dcwww.fysik.dtu.dk/~ohnielse/
Mobile: (+45) 5180 1620
umbers: https://developer.nvidia.com/cuda-gpus.
Great, thanks a lot for sharing your insights!
EasyBuild developers: Could you kindly add the above two URLs to the
documentation at
https://docs.easybuild.io/en/latest/version-specific/help.html ?
Best regards,
Ole
1 cpu tests failed:
//tensorflow/core/common_runtime:graph_constructor_test
== 2021-05-26 15:30:49,721 easyblock.py:298 INFO Closing log for
application name TensorFlow version 2.4.1
Can anyone suggest a fix for this issue?
Thanks,
Ole
separated by a dot, for example: 3.5,5.0,7.2 (type comma-separated list)
This makes me none the wiser! Can anyone tell me what these numeric
values are supposed to mean, and how I pick the right values for the GPUs
in my nodes?
Thanks,
Ole
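The values are CUDA compute capabilities in major.minor form, one per GPU model in the nodes. A minimal sketch for collecting them into the comma-separated list the option expects follows. The helper name join_caps is made up for illustration, and the nvidia-smi query shown in the comment assumes a recent driver that supports the compute_cap field.

```shell
# Assumption: a recent NVIDIA driver whose nvidia-smi supports the
# compute_cap query field. Run this on each type of GPU node:
#   nvidia-smi --query-gpu=compute_cap --format=csv,noheader

# join_caps (hypothetical helper): read one "major.minor" value per line
# on stdin, drop duplicates, sort numerically and join with commas.
join_caps() {
    sort -u -t. -k1,1n -k2,2n | paste -sd, -
}
```

The result could then be passed on the command line, for example as --cuda-compute-capabilities="$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | join_caps)".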
n I tell OpenBLAS that we have an Intel Ice Lake CPU?
It seems that OpenBLAS knows about neither Ice Lake nor Cascade Lake :-(
https://github.com/xianyi/OpenBLAS/blob/develop/TargetList.txt
Thanks,
Ole
-20210528.152213.BZayz.log
and figure out what the actual problem is.
On 5/28/21 3:26 PM, Ole Holm Nielsen wrote:
Hi Simon,
On 5/28/21 2:50 PM, Simon Branford wrote:
OpenBLAS recently added IceLake detection:
https://github.com/xianyi/OpenBLAS/pull/3233
Thanks a lot for the info! It seems that Ice Lake
. For each of those (2) tests there should be some more
detailed output
Search for "At least 2 gpu tests failed" and look below.
FYI: Setting EASYBUILD_TMPDIR to a large directory is not required.
Temporary files are usually small.
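A minimal sketch of the log search suggested above. The helper name show_failed_tests and the default of 20 context lines are assumptions made for illustration, not part of EasyBuild.

```shell
# show_failed_tests LOGFILE [CONTEXT_LINES]  (hypothetical helper)
# Print every line containing "tests failed", with its line number,
# followed by the given number of lines of context (default: 20).
show_failed_tests() {
    grep -n -A "${2:-20}" 'tests failed' "$1"
}
```

Usage would be show_failed_tests followed by the path of the EasyBuild log file for the failed build.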
On 27.05.21 at 13:02, Ole Holm Nielsen wrote:
On 5/27/2
On 5/27/21 10:46 AM, Alexander Grund wrote:
> Alexandre: should we look for patterns like "No space left on device"
in the Bazel output and highlight them better, perhaps with a concrete
suggestion to use --tmpdir to avoid the usage of /tmp?
We could in general put something into EasyBuild,
I can gzip it and send you a URL?
Thanks,
Ole
FYI: Setting EASYBUILD_TMPDIR to a large directory is not required.
Temporary files are usually small.
On 27.05.21 at 13:02, Ole Holm Nielsen wrote:
On 5/27/21 10:46 AM, Alexander Grund wrote:
> Alexandre: should we look for patterns like &
On 5/27/21 10:46 AM, Alexander Grund wrote:
/home/modules/software/binutils/2.35-GCCcore-10.2.0/bin/ld.gold: fatal
error:
bazel-out/k8-opt/bin/tensorflow/core/common_runtime/graph_constructor_test:
No space left on device
What device might that be? As shown above, I have quite a bit of disk
-fosscuda-2019b-Python-3.7.4.eb --robot
--cuda-compute-capabilities=6.1,7.5 --buildpath=/dev/shm
--tmpdir=/scratch/eb-build
Yes, I configured that with:
export EASYBUILD_BUILDPATH=/run/user/$UID/eb_build
ulimit -s 2000240
export EASYBUILD_TMPDIR=/scratch/$USER
Thanks,
Ole
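A minimal sketch for checking how much room is left in the directories configured above. The helper names free_kb and report_space are made up for illustration. Note that /dev/shm and /run/user/$UID are normally tmpfs mounts backed by RAM, so "No space left on device" may refer to the build path rather than to a disk.

```shell
# free_kb DIR  (hypothetical helper)
# Print the available space, in kilobytes, of the filesystem holding DIR.
free_kb() {
    df -Pk "$1" | awk 'NR==2 {print $4}'
}

# report_space DIR...  (hypothetical helper)
# Print one line per directory with its available space.
report_space() {
    for d in "$@"; do
        printf '%s\t%s KB free\n' "$d" "$(free_kb "$d")"
    done
}
```

For the settings above this could be called as report_space "$EASYBUILD_BUILDPATH" "$EASYBUILD_TMPDIR".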
, but perhaps it
rings a bell for Alexander (in CC).
There are a couple of fixes related to PyTorch that will be included in
the upcoming EasyBuild v4.4.0 release (which will be released tomorrow
hopefully), so keep an eye out for that...
regards,
Kenneth
On 01/06/2021 09:56, Ole Holm Nielsen wrote
for the update Ole!
regards,
Kenneth
On 03/06/2021 07:42, Ole Holm Nielsen wrote:
Hi Kenneth,
I can confirm that with EasyBuild v4.4.0 the PyTorch 1.8.1 installation
went smoothly and without any problems:
$ eb PyTorch-1.8.1-foss-2020b.eb -r
Best regards,
Ole
On 6/1/21 5:13 PM, Kenneth Hoste wrote
(CI tests are failing currently), but you
can try installing that using "eb --from-pr 12814".
regards,
Kenneth
On 03/06/2021 10:17, Alexander Grund wrote:
There is an open PR in the easyconfigs repo. Check that :)
On 03.06.21 at 10:16, Ole Holm Nielsen wrote:
Our users
om easybuild.tools.config import build_option
File
"/home/modules/software/EasyBuild/4.4.0/lib/python3.6/site-packages/easybuild/tools/config.py",
line 47, in
from easybuild.base.frozendict import FrozenDictKnownKeys
ModuleNotFoundError: No module named 'easybuild.base.frozendict'
On 04/06/
On 6/4/21 2:22 PM, Kenneth Hoste wrote:
Hi Ole,
On 04/06/2021 14:09, Ole Holm Nielsen wrote:
On 6/4/21 1:54 PM, Kenneth Hoste wrote:
Please try running this, which will probably reveal the problem:
python3 -O -m easybuild.main
$ python3 -O -m easybuild.main
Traceback (most recent call
On 6/4/21 2:41 PM, Kenneth Hoste wrote:
On 04/06/2021 14:36, Ole Holm Nielsen wrote:
$ ls -la
/home/modules/software/EasyBuild/4.4.0/lib/python3.6/site-packages/easybuild/base
total 172
drwxr-xr-x. 3 modules modules 147 Jun 4 07:30 .
drwxr-xr-x. 8 modules modules 137 Jun 3 09:42 ..
-rw
Hi Alexander,
On 6/4/21 1:46 PM, Alexander Grund wrote:
==
ERROR: test_process_group_as_module_member
(__main__.C10dProcessGroupSerialization)
--
Traceback
+6b4b10ec.x86_64
Is there a fix for the inability to locate the python command?
Thanks,
Ole
Hi Simon,
On 5/28/21 2:50 PM, Simon Branford wrote:
OpenBLAS recently added IceLake detection:
https://github.com/xianyi/OpenBLAS/pull/3233
Thanks a lot for the info! It seems that Ice Lake gets detected as
CPUTYPE_SKYLAKEX?
This has been patched in EasyBuild for OpenBLAS 0.3.12 and
rflow/core/platform/liberror.a', configuration:
f6bc5b6107d950b9fac2186352cdfdfe45c6815016e3edc9f32af940b50d30a6,
execution platform: @local_execution_config_platform//:platform]
ERROR:
/run/user/983/
n I tell OpenBLAS that we have an Intel Ice Lake CPU?
It seems that OpenBLAS knows about neither Ice Lake nor Cascade Lake :-(
https://github.com/xianyi/OpenBLAS/blob/develop/TargetList.txt
Thanks,
Ole
-zAAAvr/tmpnh77Vl/lib/python3.8/site-packages:$PYTHONPATH
&& cd test && PYTHONUNBUFFERED=1
/home/modules/software/Python/3.8.6-GCCcore-10.2.0/bin/python run_test.py
--verbose -x distributed/rpc/test_process_group_agent test_quantization "
exited with exit code 1 and ou
== 2021-06
AssertionError(_build_err_msg())
E AssertionError:
E Arrays are not almost equal to 6 decimals
E ACTUAL: array([2.+1.j, 1.+2.j], dtype=complex64)
E DESIRED: array([nan+nanj, nan+nanj], dtype=complex64)
E AssertionError:
E Arrays
em on
our own cluster.
Thanks,
Ole
On 12-03-2021 10:35, Ole Holm Nielsen wrote:
Thanks a lot for pointing at the solution. I have asked the sysadmin if
he can install the libnl3-devel RPM. Hopefully that will resolve the
issue for us.
I can report that after installing the libnl3-devel RPM and rebuilding
the libfabric
Dear Easybuilders,
I have received a request to provide the software AFLOW - Automatic FLOW
for Materials Discovery (http://aflow.org) with an installation page at
http://aflow.org/install-aflow/
Question: Has anyone been working on an EB module for AFLOW?
Thanks a lot,
Ole
additional feedback on this is
welcome (in particular in the GitHub issue).
regards,
Kenneth
On 11/03/2021 16:11, Ole Holm Nielsen wrote:
Dear EasyBuilders,
I'm trying to get EasyBuild modules up and running on an external
cluster with AMD EPYC 7351 and running CentOS 7.6. With EB 4.3.3 I
can't
MPI binaries on multiple clusters but set this on
OmniPath:
OMPI_MCA_btl='^openib'
OMPI_MCA_osc='^ucx'
OMPI_MCA_pml='^ucx'
to disable UCX and openib at runtime. If you include UCX in EB's OpenMPI
it will not compile in "openib" so the first one of those three would not
be
-easyblocks/pull/2611
<https://github.com/easybuilders/easybuild-easyblocks/pull/2611> and the
linked issues.
For an immediate fix, you just need to limit the number of cores used for
the build, e.g. use the eb option `--parallel=12`
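A minimal sketch of capping the build parallelism as suggested above. The helper name cap_parallel is made up for illustration, and the limit of 12 is simply the value from the example.

```shell
# cap_parallel CORES LIMIT  (hypothetical helper)
# Print CORES, or LIMIT if CORES is larger than LIMIT.
cap_parallel() {
    cores=$1
    limit=$2
    if [ "$cores" -gt "$limit" ]; then
        echo "$limit"
    else
        echo "$cores"
    fi
}
```

On a Linux node this could be used as --parallel="$(cap_parallel "$(nproc)" 12)" on the eb command line.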
On Mon, 15 Nov 2021 at 09:06, Ole Holm N
Hi Åke,
On 12/3/21 08:27, Åke Sandgren wrote:
On 02-12-2021 14:18, Åke Sandgren wrote:
On 12/2/21 2:06 PM, Ole Holm Nielsen wrote:
These are updated observations of running OpenMPI codes with an
Omni-Path network fabric on AlmaLinux 8.5:
Using the foss-2021b toolchain and OpenMPI/4.1.1-GCC
installed PMIx modules? Can
anyone give the exact command used to rebuild any given PMIx module
including the mentioned PRs?
Slurm users: Check if your Slurm has been built with PMIx support by:
$ srun --mpi=list
in which case you must rebuild Slurm without PMIx!
Thanks,
Ole
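A minimal sketch for scripting the Slurm check above. The helper name has_pmix is made up for illustration. It only reports whether the string "pmix" appears in the plugin list and makes no statement about what to do in either case.

```shell
# has_pmix  (hypothetical helper)
# Read the output of "srun --mpi=list" on stdin and succeed if any
# listed MPI plugin type mentions pmix (case-insensitive).
has_pmix() {
    grep -qi 'pmix'
}

# Example use on a Slurm cluster (not run here). stderr is merged in,
# on the assumption that some srun versions print the list there:
#   srun --mpi=list 2>&1 | has_pmix && echo "PMIx plugin listed"
```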
Dear Kenneth,
On 9/28/23 10:49, Kenneth Hoste wrote:
Not seeing the problem with OpenBLAS 0.3.23 is encouraging, that probably
means a fix is hiding in either OpenBLAS 0.3.22 or 0.3.23 that we may be
able to backport to 0.3.21.
I don't see anything obvious in the release notes though (see
nox IB
PCIe adapter lying around and mount it in my server? Or maybe a
relatively new Omni-Path adapter?
Would that make the OpenMPI EB module happy, and would the module work
with our future nodes?
Thanks,
Ole
the fix and patch OpenBLAS 0.3.21 to
fix the problem you're seeing...
So is there any hope that foss-2022b.eb with OpenBLAS/0.3.21-GCC-12.2.0
can be made to work correctly on AMD Genoa nodes?
Thanks,
Ole
On 28/09/2023 09:26, Ole Holm Nielsen wrote:
It's interesting that while attempting
difference here appears to be GCC version 12.2.0 versus 11.3.0!
Any ideas about what's causing this error in the tests?
Perhaps GCC version 12.2.0 tries to use the new AVX-512 instructions in
AMD Genoa and has a bug?
Thanks,
Ole
On 9/26/23 08:04, Ole Holm Nielsen wrote:
I'm starting EasyBuild up
?
We're using EB 4.8.1.
Thanks,
Ole
nce we now have used the latest GCC 12.3.0, and we have installed an OPA
fabric, the problem would seem to be related to having the AMD "Genoa"
hardware.
Does anyone have suggestions for building OpenMPI successfully on this
platform?
Thanks,
Ole
looking at the PR now. If the test reports are all successful I will
merge it and it will be part of the next EasyBuild release 4.8.2. If you
want to install it before the next release you do this with `--from-pr`, e.g.:
eb --from-pr 18731
Best,
Sebastian
On 02.10.23 13:51, Ole Holm Nielsen wrote
d: warning:
/tmp/eb-74m3kzgo/ccy1Gkzg.o: missing .note.GNU-stack section implies
executable stack
/
== 2023-09-25 16:13:04,293 easyblock.py:328 INFO Closing log for
application name OpenBLAS version 0.3.21
Thanks,
Ole
h Infiniband or Omni-Path, just plain
Ethernet.
builds correctly and passes all its tests.
Best regards,
Ole
On 10/3/23 09:51, Ole Holm Nielsen wrote:
I'm starting EasyBuild up on our new AMD "Genoa" platform with 1 AMD EPYC
9124 16-Core Processor with 2 threads/core, 384 GB RAM, Omni-Path (OPA)
fabric, and AlmaLinux 8.8 OS.
I
ive me permission to copy
over the content of your email to create the issue.
Thanks,
Alan
On Wed, 25 May 2022 at 10:54, Ole Holm Nielsen <ole.h.niel...@fysik.dtu.dk> wrote:
Hi Easybuilders,
I'm testing the upgrade of our compute nodes from AlmaLinux 8.5 to 8.6
(t
11:09, Ole Holm Nielsen wrote:
Hi Alan,
Thanks a lot for the feedback! I've opened a new issue now:
https://github.com/easybuilders/easybuild-easyconfigs/issues/15651
Best regards,
Ole
On 6/9/22 10:52, Alan O'Cais wrote:
Ole,
Can you please copy this over to an issue in
https://github.
':
['3f9a18517e33f006a9c2fc4f43f01b54abfe6ff2eae7322424f31069296b615c'],
}),
Can anyone suggest a workaround such as getting a copy of 0.62?
Thanks,
Ole
fetched this tarball:
https://cpan.metacpan.org/authors/id/V/VP/VPIT/Variable-Magic-0.62.tar.gz
/Ole
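A minimal sketch for checking a manually fetched tarball against the SHA256 checksum listed in the easyconfig before using it. The helper name verify_sha256 is made up for illustration, and sha256sum from GNU coreutils is assumed to be available.

```shell
# verify_sha256 FILE EXPECTED  (hypothetical helper)
# Succeed only if the SHA256 checksum of FILE equals EXPECTED.
verify_sha256() {
    actual=$(sha256sum "$1" | awk '{print $1}')
    [ "$actual" = "$2" ]
}
```

Usage would be verify_sha256 followed by the tarball path and the checksum from the easyconfig, checking the exit status afterwards.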
-Original Message-
From: easybuild-requ...@lists.ugent.be On
Behalf Of Ole Holm Nielsen
Sent: 23 November 2022 12:17
To: easybuild@lists.ugent.be
Subject: [easybuild] Build of Perl
Hi Loris,
On 12/2/22 08:27, Loris Bennett wrote:
How do I force a total rebuild of, say, a foss toolchain for a different
CPU architecture?
Up to now I had a homogeneous cluster with Intel Xeon CPUs, now we have
acquired some nodes with AMD Epyc CPUs for which I need to build
software.
I have
On 2/15/23 15:30, Jure Pečar wrote:
On Wed, 15 Feb 2023 15:26:26 +0100
Ole Holm Nielsen wrote:
Is "foss" also the preferred toolchain on AMD Rome and Genoa?
For now, yes.
There's some work going on to create a toolchain around AMD compilers but
it's questionable how much you g
Hi Loris,
I'm being hit by that libxc download problem too! It's very bad for the
modules that we're trying to build :-(
Should EB use this download URL instead?
https://gitlab.com/libxc/libxc/-/archive/5.2.3/libxc-5.2.3.tar.gz
Thanks,
Ole
On 3/28/23 11:41, Loris Bennett wrote:
The EC
Hi Loris,
It would be great if you could make a PR against libxc! I'm not that
experienced with Git usage ;-)
Thanks,
Ole
On 3/28/23 13:03, Loris Bennett wrote:
Hi Ole,
Ole Holm Nielsen writes:
Hi Loris,
I'm being hit by that libxc download problem too! It's very bad for
the modules
s for sharing any advice and experiences!
Best regards,
Ole
-Original Message-
From: easybuild-requ...@lists.ugent.be On
Behalf Of Ole Holm Nielsen
Sent: 26 May 2023 09:29
To: easybuild
Subject: [easybuild] EB file for building a recent version of Meep?
CAUTION: This email originated from outside the organisation. Do not click
links or open attachments
of
Meep with modern compilers and Python3?
Thanks,
Ole
,
and in there you will find the missing dependencies. You can either clone this
tree locally and add it to your robot search path, or copy out the
dependencies you need manually.
On 26 May 2023, at 10:48, Ole Holm Nielsen wrote:
Hi Simon,
Fantastic, thanks a lot for the quick answer! However
st/2023-04-20/rust-std-1.69.0-x86_64-unknown-linux-gnu.tar.xz
== 2023-12-08 20:28:03,383 easyblock.py:328 INFO Closing log for
application name Rust version 1.70.0
Can anyone suggest a fix?
Thanks,
Ole
a lot! This command does the job:
$ eb Rust-1.70.0-GCCcore-12.3.0.eb --include-easyblocks-from-pr=3038
On 8 December 2023 20:33:34 CET, Ole Holm Nielsen
wrote:
I'm trying to build Rust-1.70.0-GCCcore-12.3.0.eb but this fails
because of a download failure. I don't know how t
about AMD's ROCm RPM packages?
Thanks a lot,
Ole
sa 72, 4a planta
20018 Donostia-San Sebastián,
Gipuzkoa, Spain
/External Collaborator./
Nano-Bio Spectroscopy group
Departamento de Física de Materiales
Universidad del País Vasco (UPV/EHU)
Donostia-San Sebastián, Gipuzkoa, Spain
The Max Planck Institute for the Structure and Dynamics of Matter (MP