[easybuild] CUDA on machine with OmniPath

2023-11-11 Thread Jakob Schiotz
Hi Easybuilders,

I am trying to get the newest PyTorch to work with the foss/2023a toolchain.  
So far, the CPU-only version seems to work (PR #19184), and I am trying to get 
the CUDA version to work.

In the test suite, many tests fail with

  fi_info: error while loading shared libraries: libpsm_infinipath.so.1: cannot open shared object file: No such file or directory

It looks like it is trying to load an OmniPath driver.  However, our few GPU
nodes have neither OmniPath nor InfiniBand.  They do share the EasyBuild
modules with the CPU nodes, and of course OpenMPI is built with OmniPath
support.  That does not normally cause trouble for MPI jobs that stay within
a single node, which is our use case on the GPU nodes.
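
One way to confirm on a GPU node that the library named in the error is really missing there is to ask the dynamic loader directly. This is a minimal sketch, not something from the thread: the helper name can_load is hypothetical, and it only checks the loader search path of the current environment (so load the same modules first).

```python
import ctypes


def can_load(libname="libpsm_infinipath.so.1"):
    """Return True if the dynamic loader can open `libname` in this environment.

    `can_load` is a hypothetical helper; the default name is the library
    reported in the error message above.
    """
    try:
        ctypes.CDLL(libname)
    except OSError:
        # Raised when the shared object cannot be found or opened.
        return False
    return True


if __name__ == "__main__":
    name = "libpsm_infinipath.so.1"
    print(f"{name}: {'loadable' if can_load(name) else 'NOT loadable'}")
```

Running it on both a CPU node and a GPU node should show whether the difference is simply that the library is installed on one and not the other.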

I suspect the problem is due to PyTorch depending on NCCL and magma, and both 
of these depend on UCX-CUDA.  Does anyone have a suggestion for how to handle 
this?  Can one build a version of NCCL and magma without UCX-CUDA?  Can one 
disable it with an environment variable?  Or something else?
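
For the environment-variable route, one possible experiment is sketched below. Open MPI reads MCA parameters from environment variables prefixed with OMPI_MCA_, and a leading ^ excludes the listed components. The component names used here (psm, psm2, ofi) and the helper names are assumptions; check which components this Open MPI build actually has with ompi_info. Whether this avoids the fi_info failure is untested, and it would not help if the fi_info binary itself is linked against libpsm_infinipath.

```python
import os
import subprocess


def env_without_fabric(base=None, excluded_mtl=("psm", "psm2", "ofi")):
    """Return a copy of the environment asking Open MPI to skip some MTLs.

    Hypothetical helper. The component names in `excluded_mtl` are
    assumptions about the local Open MPI build.
    """
    env = dict(os.environ if base is None else base)
    # "^a,b" means: use any available component except a and b.
    env["OMPI_MCA_mtl"] = "^" + ",".join(excluded_mtl)
    return env


def run_without_fabric(cmd):
    """Run `cmd` (a list of arguments) with the modified environment."""
    return subprocess.run(cmd, env=env_without_fabric(), check=False)


if __name__ == "__main__":
    print(env_without_fabric({"PATH": "/usr/bin"}))
```

If this turns out to help, the same variable could be set for the GPU nodes only, which would avoid maintaining a separate module tree.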

We would rather avoid having a completely different module tree for these few
nodes, but we will set one up if it is necessary.

Best regards

Jakob




[easybuild] HPC, Big Data, and Data Science Devroom at FOSDEM'24

2023-11-11 Thread Kenneth Hoste

HPC, Big Data, and Data Science Devroom at FOSDEM'24


https://hpc-bigdata-fosdem24.github.io - https://fosdem.org/2024

** Call for Participation **

Submission deadline: Friday 1 Dec 2023
Devroom date: Sat 3 Feb 2024


We are proud to announce the 9th edition of the HPC, Big Data, and Data
Science devroom at FOSDEM 2024 in Brussels (Belgium).


It is organised by representatives of the HPC, Big Data, and Data 
Science communities, who are joining forces to bring them together.


The devroom will take place during the FOSDEM'24 weekend, on Saturday 3 
February 2024, and is open for everybody to join (no registration required).


Join us to enjoy a variety of talks, demos and interesting discussions 
on open-source HPC, Big Data, and Data Science.



Sounds interesting? Submit your talk proposal, and see you at FOSDEM'24!

Submissions are lightweight: basically a talk title + short description,
in plain text.



Please visit the website for more information, and consider sharing this 
call for participation with friends and colleagues.


Contact us via email at hpc-bigdata-devr...@lists.fosdem.org with 
questions or concerns, or via @fosdem_hpc on Twitter/X.


--

Kenneth Hoste - HPC team at Ghent University, Belgium
Adam Huffman - Big Data Institute, University of Oxford, UK