Dear Ake,
Indeed, we have Mellanox IB:
[:~]$ lspci | grep -i mellanox
02:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3]
[:~]$ ibstatus
Infiniband device 'mlx4_0' port 1 status:
default gid: fe80:0000:0000:0000:7079:9003:0007:f538
base lid: 0x76
sm lid: 0x1
state: 4: ACTIVE
phys state: 5: LinkUp
rate: 56 Gb/sec (4X FDR)
link_layer: InfiniBand
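
For comparing nodes or clusters, the ibstatus output above can be condensed to one line per port. This is only a minimal sketch: summarize_ibstatus is a hypothetical helper name, and it assumes the field labels shown above (state:, phys state:, link_layer:) appear in that form.

```shell
# Hypothetical helper: condense ibstatus-style text (read from stdin)
# to one line per port: device, port number, logical state, link layer.
summarize_ibstatus() {
    awk '
        /^Infiniband device/       { dev = $3; port = $5 }
        /state:/ && !/phys state:/ { state = $NF }
        /link_layer:/              { print dev, "port", port, state, $NF }
    '
}
```

It could be used as "ibstatus | summarize_ibstatus" on each cluster, which makes it easier to spot a port that is not ACTIVE or that runs with an Ethernet link layer.
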
I had no problem installing foss-2023a.eb on another cluster, which has:
$ lspci | grep -i mellanox
86:00.0 Infiniband controller: Mellanox Technologies MT27800 Family [ConnectX-5]
Thank you for any hints!
quim
Message from Åke Sandgren <[email protected]> on Fri, 15 March 2024
at 12:18:
> lspci | grep -i mellanox
> will show if you have any mellanox devices on the system, some of these
> could be used as normal ethernet devices though
>
> ibstatus
> will show if any of those are running in Infiniband mode
>
> If ibstatus doesn't exist then you probably don't have infiniband, or at
> least lack the packages for using it.
>
> ________________________________________
> From: [email protected] <[email protected]>
> on behalf of Joaquim Jornet Somoza <[email protected]>
> Sent: Friday, March 15, 2024 11:44
> To: [email protected]
> Subject: Re: [easybuild] Failure in UCX-1.14.1-GCCcore-12.3.0.eb when
> installing foss-2023a.eb
>
> Dear Ake,
>
> How can I check this?
>
> Thank you!
>
> On Fri, 15 Mar 2024, 7:58, Åke Sandgren <[email protected]> wrote:
> No, there is no bug there.
>
> Which MOFED stack version are you using?
> Or does your system lack Infiniband?
>
> ________________________________________
> From: [email protected] <[email protected]>
> on behalf of Joaquim Jornet Somoza <[email protected]>
> Sent: Thursday, March 14, 2024 16:02
> To: [email protected]
> Subject: [easybuild] Failure in UCX-1.14.1-GCCcore-12.3.0.eb when
> installing foss-2023a.eb
>
> Dear easybuilders,
>
> I am trying to install foss-2023a.eb on RH7.7 servers, but when
> installing UCX-1.14.1-GCCcore-12.3.0.eb, the installation fails with the
> following error:
> ...
> libtool: compile: gcc -DHAVE_CONFIG_H -I. -I../../.. "-DCPU_FLAGS=|avx"
> -I/dev/shm/easybuild/UCX/1.14.1/GCCcore-12.3.0/ucx-1.14.1/src
> -I/dev/shm/easybuild/UCX/1.14.1/GCCcore-12.3.0/ucx-1.14.1
> -I/dev/shm/easybuild/UCX/1.14.1/GCCcore-12.3.0/ucx-1.14.1/src
> -I/software/easybuild/x86_64/software/numactl/2.0.16-GCCcore-12.3.0/include
> -I/software/easybuild/x86_64/software/zlib/1.2.13-GCCcore-12.3.0/include
> -I/software/easybuild/x86_64/software/pkgconf/1.9.5-GCCcore-12.3.0/include
> -I/software/easybuild/x86_64/software/binutils/2.40-GCCcore-12.3.0/include
> -O3 -g -Wall -Werror -mavx -funwind-tables -Wno-missing-field-initializers
> -Wno-unused-parameter -Wno-unused-label -Wno-long-long -Wno-endif-labels
> -Wno-sign-compare -Wno-multichar -Wno-deprecated-declarations -Winvalid-pch
> -Wno-pointer-sign -Werror-implicit-function-declaration
> -Wno-format-zero-length -Wnested-externs -Wshadow
> -Werror=declaration-after-statement -O2 -ftree-vectorize -march=native
> -fno-math-errno -fPIC -MT rc/verbs/libuct_ib_la-rc_verbs_ep.lo -MD -MP -MF
> rc/verbs/.deps/libuct_ib_la-rc_verbs_ep.Tpo -c rc/verbs/rc_verbs_ep.c -o
> rc/verbs/libuct_ib_la-rc_verbs_ep.o >/dev/null 2>&1
> base/ib_md.c: In function 'uct_ib_md_access_flags':
> base/ib_md.c:638:25: error: 'IBV_ACCESS_ON_DEMAND' undeclared (first use
> in this function); did you mean 'IBV_EXP_ACCESS_ON_DEMAND'?
> 638 | access_flags |= IBV_ACCESS_ON_DEMAND;
> | ^~~~~~~~~~~~~~~~~~~~
> | IBV_EXP_ACCESS_ON_DEMAND
> base/ib_md.c:638:25: note: each undeclared identifier is reported only
> once for each function it appears in
> base/ib_md.c: In function 'uct_ib_mem_reg_internal':
> base/ib_md.c:751:24: error: 'IBV_ACCESS_ON_DEMAND' undeclared (first use
> in this function); did you mean 'IBV_EXP_ACCESS_ON_DEMAND'?
> 751 | if (access_flags & IBV_ACCESS_ON_DEMAND) {
> | ^~~~~~~~~~~~~~~~~~~~
> | IBV_EXP_ACCESS_ON_DEMAND
> base/ib_md.c: In function 'uct_ib_md_global_odp_init':
> base/ib_md.c:1449:54: error: 'IBV_ACCESS_ON_DEMAND' undeclared (first use
> in this function); did you mean 'IBV_EXP_ACCESS_ON_DEMAND'?
> 1449 | UCT_IB_MEM_ACCESS_FLAGS | IBV_ACCESS_ON_DEMAND,
> | ^~~~~~~~~~~~~~~~~~~~
> | IBV_EXP_ACCESS_ON_DEMAND
>
>
> Any hint on how to fix it? Is there a bug with the IBV_ACCESS_ON_DEMAND
> variable?
>
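
The compiler's suggestion of IBV_EXP_ACCESS_ON_DEMAND hints that the verbs headers picked up during the build declare only the IBV_EXP_ variant, which would fit Åke's question about the MOFED version. A minimal sketch for checking this on both clusters follows. Note that check_verbs_symbol is a hypothetical helper, and the default header path /usr/include/infiniband/verbs.h is an assumption about where the libibverbs headers are installed.

```shell
# Hypothetical helper: report whether a verbs header declares a symbol.
# Usage: check_verbs_symbol SYMBOL [HEADER]
# Returns 0 if found, 1 if absent, 2 if the header cannot be read.
check_verbs_symbol() {
    symbol=$1
    # Assumed default location of the libibverbs header.
    header=${2:-/usr/include/infiniband/verbs.h}
    if [ ! -r "$header" ]; then
        echo "missing: $header"
        return 2
    fi
    # -w matches the whole word only, -q suppresses the matching lines.
    if grep -qw "$symbol" "$header"; then
        echo "found: $symbol"
    else
        echo "absent: $symbol"
        return 1
    fi
}
```

Running "check_verbs_symbol IBV_ACCESS_ON_DEMAND" on the ConnectX-3 and the ConnectX-5 cluster should show whether the two systems differ in the headers they provide. If the installed stack keeps its headers elsewhere, pass that path as the second argument.
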
--
----------------------------------------------------------------------------------------------------------------------------------------
*Dr. Joaquim Jornet Somoza*
*Técnico Superior de Cálculo Científico *
Servicios Generales a la Investigación (*SGIker*)
Universidad del País Vasco (*UPV/EHU*)
email: [email protected]
Edificio Joxe Maria Korta (Campus Gipuzkoa)
Av. Tolosa 72, 4a planta
20018 Donostia-San Sebastián,
Gipuzkoa, Spain
*External Collaborator.*
Nano-Bio Spectroscopy group
Departamento de Física de Materiales
Universidad del País Vasco (UPV/EHU)
Donostia-San Sebastián, Gipuzkoa, Spain
The Max Planck Institute for the Structure and Dynamics of Matter (MPSD)
Bldg. 99 (CFEL)
Luruper Chaussee 149
22761 Hamburg, Germany