Hello Kenneth,

I found a workaround of sorts for the 'sanity_check' failure. Setting the environment variable with export FI_PROVIDER=tcp helped, and after the (successful) installation the MLNX IB stack still seems to work. At least I have found no errors so far. Maybe this is helpful for others who are also running sles15.3.
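The workaround above can be sketched as a small shell helper. This is only an illustration under stated assumptions: the helper name with_fi_provider and the ./mpi_test path are hypothetical placeholders, and only the FI_PROVIDER=tcp setting itself comes from the tests described in this thread.

```shell
#!/bin/sh
# with_fi_provider is a hypothetical helper name; it is not part of
# EasyBuild, Intel MPI, or libfabric.
# Usage: with_fi_provider <provider> <command> [args...]
with_fi_provider() {
    if [ "$#" -lt 2 ]; then
        echo "usage: with_fi_provider <provider> <command> [args...]" >&2
        return 2
    fi
    provider="$1"
    shift
    # env sets FI_PROVIDER only for the child process, so the calling
    # shell's environment stays unchanged.
    env FI_PROVIDER="$provider" "$@"
}

# Example calls (commented out; they need Intel MPI / EasyBuild installed,
# and ./mpi_test is a placeholder path):
# with_fi_provider tcp mpirun -np 36 ./mpi_test
# with_fi_provider tcp eb intel-2019b.eb --robot
```

Compared with a plain export, this keeps the setting scoped to a single command, so another provider can be tried in the same session without unsetting anything first.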
Actually this thread provides a quantum of insight:
https://community.intel.com/t5/Intel-oneAPI-HPC-Toolkit/New-MPI-error-with-Intel-2019-1-unable-to-run-MPI-hello-world/m-p/1158382

Cheers,
-Frank

> -----Original Message-----
> From: [email protected] <[email protected]> On Behalf Of Heckes, Frank
> Sent: Monday, 21 February 2022 09:37
> To: Kenneth Hoste <[email protected]>
> Cc: [email protected]
> Subject: RE: [easybuild] intel2019b (test fails)
>
> Hello Kenneth,
> Many thanks for your quick reply. Unfortunately not, I'm working on
> sles15.3. I'm still running eb version 4.5.1 and saw the fixes for sles in
> 4.5.3.
> I'll update this first and will check another release for the intel compiler.
> Cheers,
> -Frank
>
> > -----Original Message-----
> > From: Kenneth Hoste <[email protected]>
> > Sent: Thursday, 17 February 2022 14:43
> > To: Heckes, Frank <[email protected]>
> > Cc: [email protected]
> > Subject: Re: [easybuild] intel2019b (test fails)
> >
> > Dear Frank,
> >
> > You're probably running into this on a RHEL8 derivative?
> >
> > Intel MPI 2019 update 5 is known to be broken on those OS versions, see
> > https://github.com/easybuilders/easybuild-easyconfigs/issues/11762 .
> >
> > The best way to handle this is probably to use a custom intel-2019b.eb that
> > uses Intel MPI 2019 update 7, which should work...
> >
> > regards,
> >
> > Kenneth
> >
> > On 17/02/2022 09:52, Heckes, Frank wrote:
> > > Hi all,
> > >
> > > I didn’t find a solution for my problem either in the mail archive or
> > > via Google.
> > > I tried to build intel-2019b.eb. The process runs successfully until it
> > > reaches the sanity check:
> > >
> > > == sanity checking...
> > > == ... (took 1 secs)
> > > == FAILED: Installation ended unsuccessfully (build directory:
> > > /opt/local/easybuild/build/impi/2018.5.288/iccifort-2019.5.281): build
> > > failed (first 300 chars): Sanity check failed: sanity check command
> > > mpirun -n 36
> > > /opt/local/easybuild/build/impi/2018.5.288/iccifort-2019.5.281/mpi_test
> > > exited with code 11 (output:
> > >
> > > As iccifort-2019.5.281 is already available, I loaded this module in
> > > another session. Starting the test manually leads to the errors below
> > > (see ‘Errors without FI- environment variables’).
> > > Setting the env. var. with export FI_PROVIDER=tcp fixes the problem. Now
> > > the test completes:
> > >
> > > mpirun -np 36 /opt/local/easybuild/build/impi/2018.5.288/iccifort-2019.5.281/mpi_test
> > > Hello world: rank 0 of 36 running on atlas52
> > > Hello world: rank 1 of 36 running on atlas52
> > > Hello world: rank 2 of 36 running on atlas52
> > > . . .
> > >
> > > When I assign verbs, the error appears again (the node has a valid
> > > OFED stack, an IP address assigned to the HCA, and is operational for
> > > other MPI apps).
> > >
> > > Two questions:
> > >
> > > * How can I set up the env. variables so that eb will use them during
> > >   the test? (Doing export FI_PROVIDER=…; eb intel2019b --robot doesn’t
> > >   help.)
> > > * Although I can see the verbs provider (running fi_info), I ran into
> > >   an error. Did I miss a dependency for Intel MPI?
> > >
> > > Many thanks in advance for any help and advice.
> > >
> > > Cheers,
> > >
> > > -Frank Heckes
> > >
> > > ------------------------------ Errors without FI- environment variables
> > >
> > > mpirun -np 36 /opt/local/easybuild/build/impi/2018.5.288/iccifort-2019.5.281/mpi_test
> > >
> > > Abort(1091471) on node 0 (rank 0 in comm 0): Fatal error in PMPI_Init:
> > > Other MPI error, error stack:
> > > MPIR_Init_thread(703).......:
> > > MPID_Init(923)..............:
> > > MPIDI_OFI_mpi_init_hook(883): OFI addrinfo() failure
> > >
> > > Abort(1091471) on node 11 (rank 11 in comm 0): Fatal error in PMPI_Init:
> > > Other MPI error, error stack:
> > > MPIR_Init_thread(703).......:
> > > MPID_Init(923)..............:
> > > MPIDI_OFI_mpi_init_hook(883): OFI addrinfo() failure
> > >
> > > Abort(1091471) on node 12 (rank 12 in comm 0): Fatal error in PMPI_Init:
> > > Other MPI error, error stack:
> > > MPIR_Init_thread(703).......:
> > > MPID_Init(923)..............:
> > > MPIDI_OFI_mpi_init_hook(883): OFI addrinfo() failure
> > >
> > > Abort(1091471) on node 14 (rank 14 in comm 0): Fatal error in PMPI_Init:
> > > Other MPI error, error stack:
> > > MPIR_Init_thread(703).......:
> > > MPID_Init(923)..............:
> > > MPIDI_OFI_mpi_init_hook(883): OFI addrinfo() failure

