Re: ROCm for AMD GPUs at Ubuntu Summit 2024

2025-02-18 Thread Cordell Bloor

Thanks for the vote of confidence!

On 2025-02-11 02:47, Drew Parsons wrote:

On 2025-02-11 09:57, PICCA Frederic-Emmanuel wrote:

On 2025-02-11 09:32, Cordell Bloor wrote:

On 2025-02-02 15:22, Cordell Bloor wrote:

One of the most important things I'd like to share is the list of
packages I found that have AMDGPU support upstream, and that could
have support enabled within Debian.

Same here, I have a bunch of packages which use OpenCL, and it would 
be great to add autopkgtests with the OpenCL AMD implementation.


So all packages with pocl-opencl-icd in the autopkgtest Depends 
should be good candidates for this instrumentation.


"me too".  Debian's GPU support is in general somewhat underdone, 
partly due to cuda's historical nonfreeness.


Improving our GPU support is a good theme for a GSOC project.


I've submitted a GSoC project proposal [1]. If any Debian Science Team 
member would like to co-mentor, please feel free to add yourself. I can 
always refer a student to the appropriate resources for questions when 
I'm stumped myself, but it certainly doesn't hurt to have a dedicated 
co-mentor.


Sincerely,
Cory Bloor

[1]: https://wiki.debian.org/SummerOfCode2025/Projects#SummerOfCode2025.2FApprovedProjects.2FEnhancingPackagesWithROCm.Enhancing_Debian_packages_with_ROCm_GPU_acceleration




Re: ROCm for AMD GPUs at Ubuntu Summit 2024

2025-02-17 Thread Cordell Bloor

Hi Drew,

On 2025-02-11 02:47, Drew Parsons wrote:
If we get it running well, we'll start wanting to have debci GPU 
testing too to be sure it continues to work. Might be possible to 
build upon my GSOC project last year with Nikolaos Chatzikonstantinou 
for that step, though good also if debian's own CI infrastructure can 
cover GPU.


In the long term, I think the goal is to enhance Debian's own CI 
infrastructure. Right now, we're running a fairly large set of 
correctness tests on a GPU-enhanced fork of the DebCI software [1]. 
However, the validation suite is not yet in a state that I'm happy 
with. At the moment, I'm just worrying about verifying correctness.


In any case, I took a look at your GSoC project. It was interesting, 
though perhaps not immediately applicable. AMD upstream does extensive 
performance regression testing and it is quite a lot of work, so I 
appreciate how difficult that task is!


Sincerely,
Cory Bloor

[1]: https://ci.rocm.debian.net/



Re: ROCm for AMD GPUs at Ubuntu Summit 2024

2025-02-11 Thread McKinstry, Alastair


From: Drew Parsons 
Date: Tuesday, 11 February 2025 at 09:47
To: PICCA Frederic-Emmanuel 
Cc: Andrius Merkys , Cordell Bloor , 
debian-science 
Subject: Re: ROCm for AMD GPUs at Ubuntu Summit 2024


On 2025-02-11 09:57, PICCA Frederic-Emmanuel wrote:

> Same here, I have a bunch of packages which use OpenCL, and it would be
> great to add autopkgtests with the OpenCL AMD implementation.
>
> So all packages with pocl-opencl-icd in the autopkgtest Depends should
> be good candidates for this instrumentation.
>


"me too".  Debian's GPU support is in general somewhat underdone, partly
due to cuda's historical nonfreeness.

Improving our GPU support is a good theme for a GSOC project.

Hypre has some GPU support, good to improve and test its activation. I
think PETSc might have some too.

If we get it running well, we'll start wanting to have debci GPU testing
too to be sure it continues to work. Might be possible to build upon my
GSOC project last year with Nikolaos Chatzikonstantinou for that step,
though good also if debian's own CI infrastructure can cover GPU.

Drew

Just to add:
I’ve added HIP/ROCm support to mpich 4.3.0 in experimental (it's present in UCX, 
so OpenMPI uses this); I’m working on enabling it in atlas-ecmwf (just working 
on one known bug), and am finalizing packaging of UCC (Unified Collective 
Communication), which adds GPU (HIP) collectives support to UCX for MPI etc.
Regards
Alastair



Re: ROCm for AMD GPUs at Ubuntu Summit 2024

2025-02-11 Thread Drew Parsons

On 2025-02-11 09:57, PICCA Frederic-Emmanuel wrote:

On 2025-02-11 09:32, Cordell Bloor wrote:

On 2025-02-02 15:22, Cordell Bloor wrote:

One of the most important things I'd like to share is the list of
packages I found that have AMDGPU support upstream, and that could
have support enabled within Debian.


Same here, I have a bunch of packages which use OpenCL, and it would be 
great to add autopkgtests with the OpenCL AMD implementation.


So all packages with pocl-opencl-icd in the autopkgtest Depends should 
be good candidates for this instrumentation.





"me too".  Debian's GPU support is in general somewhat underdone, partly 
due to CUDA's historical nonfreeness.


Improving our GPU support is a good theme for a GSoC project.

Hypre has some GPU support, good to improve and test its activation. I 
think PETSc might have some too.


If we get it running well, we'll start wanting to have debci GPU testing 
too to be sure it continues to work. Might be possible to build upon my 
GSoC project last year with Nikolaos Chatzikonstantinou for that step, 
though good also if Debian's own CI infrastructure can cover GPU.


Drew



Re: ROCm for AMD GPUs at Ubuntu Summit 2024

2025-02-11 Thread PICCA Frederic-Emmanuel
> On 2025-02-11 09:32, Cordell Bloor wrote:
>> On 2025-02-02 15:22, Cordell Bloor wrote:
>>> One of the most important things I'd like to share is the list of
>>> packages I found that have AMDGPU support upstream, and that could
>>> have support enabled within Debian. These mostly fall under the domain
>>> of the Debian Science Team: adios2, blaspp, cp2k, cupy, dbcsr,
>>> ectrans, elpa, gloo, hpx, hwloc, hypre, kokkos, lammps, lapackpp,
>>> llama-cpp, magma, mfem, mpich, onnxruntime, papi, paraview, petsc,
>>> pyfr, pytorch, slepc, spfft, sundials, superlu-dist, trilinos, and
>>> whisper-cpp.


Same here, I have a bunch of packages which use OpenCL, and it would be great 
to add autopkgtests with the OpenCL AMD implementation.

So all packages with pocl-opencl-icd in the autopkgtest Depends should be good 
candidates for this instrumentation.

An example of such instrumentation, which should be simplified, is here:

https://sources.debian.org/src/clpeak/1.1.3-1/debian/tests/

The upstream-binaries should be factored out for all packages :). This is 
something to discuss with the AMDGPU team.
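
To make the "pocl-opencl-icd in the autopkgtest Depends" criterion concrete, 
here is a minimal sketch in Python of how one might pick out the test stanzas 
in a debian/tests/control file that depend on a given package. It is only an 
illustration: the helper names (parse_stanzas, depends_on) and the sample 
control text are hypothetical, the sample is not taken from clpeak or any real 
package, and the parser handles only the simple deb822-style layout shown.

```python
import re

# Hypothetical debian/tests/control content, for illustration only.
EXAMPLE_CONTROL = """\
Tests: opencl-smoke
Depends: @,
 pocl-opencl-icd (>= 1.0)
Restrictions: allow-stderr

Tests: cpu-only
Depends: @
"""


def parse_stanzas(control_text):
    """Split deb822-style text into a list of {lowercased field: value} dicts."""
    stanzas, current, last_key = [], {}, None
    for line in control_text.splitlines():
        if line.startswith("#"):
            continue  # comment line
        if not line.strip():
            # A blank line ends the current stanza.
            if current:
                stanzas.append(current)
            current, last_key = {}, None
        elif line[0] in " \t" and last_key:
            # Continuation line: append to the previous field.
            current[last_key] += " " + line.strip()
        elif ":" in line:
            key, _, value = line.partition(":")
            last_key = key.strip().lower()
            current[last_key] = value.strip()
    if current:
        stanzas.append(current)
    return stanzas


def depends_on(stanza, package):
    """Return True if the stanza's Depends field names `package` exactly."""
    deps = re.split(r"[,|]", stanza.get("depends", ""))
    # Keep only the package name, dropping version, arch and profile qualifiers.
    names = [
        re.split(r"[\s(\[<:]", dep.strip(), maxsplit=1)[0]
        for dep in deps
        if dep.strip()
    ]
    return package in names


if __name__ == "__main__":
    for stanza in parse_stanzas(EXAMPLE_CONTROL):
        if depends_on(stanza, "pocl-opencl-icd"):
            print(stanza.get("tests", "(unnamed test)"))
```

Running something like this over the debian/tests/control files of unpacked 
source packages would give a first candidate list; the results would still 
need to be checked by hand.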

Cheers


Fred




Re: ROCm for AMD GPUs at Ubuntu Summit 2024

2025-02-10 Thread Andrius Merkys

Hi,

On 2025-02-11 09:32, Cordell Bloor wrote:

On 2025-02-02 15:22, Cordell Bloor wrote:
One of the most important things I'd like to share is the list of 
packages I found that have AMDGPU support upstream, and that could 
have support enabled within Debian. These mostly fall under the domain 
of the Debian Science Team: adios2, blaspp, cp2k, cupy, dbcsr, 
ectrans, elpa, gloo, hpx, hwloc, hypre, kokkos, lammps, lapackpp, 
llama-cpp, magma, mfem, mpich, onnxruntime, papi, paraview, petsc, 
pyfr, pytorch, slepc, spfft, sundials, superlu-dist, trilinos, and 
whisper-cpp.


I think that these package enhancements would make for a good Google 
Summer of Code project. I'd be happy to mentor a new contributor in 
preparing updates for these packages that enable AMD GPU support. I 
would imagine that's very likely something that the Debian Science Team 
would support, but I feel I should nevertheless ask explicitly before 
trying to engage a third party to do this work.


So, is this something that the Debian Science Team would support? And, 
are there any other barriers to enablement that I should try to help with?


I cannot speak for everyone, but I personally would be glad to see
AMDGPU support enabled in packages I am interested in, in particular 
elpa, lammps, onnxruntime, pytorch and spfft. I maintain only the 
latter, though.


Best,
Andrius



Re: ROCm for AMD GPUs at Ubuntu Summit 2024

2025-02-10 Thread Cordell Bloor

Howdy folks,

On 2025-02-02 15:22, Cordell Bloor wrote:
One of the most important things I'd like to share is the list of 
packages I found that have AMDGPU support upstream, and that could 
have support enabled within Debian. These mostly fall under the domain 
of the Debian Science Team: adios2, blaspp, cp2k, cupy, dbcsr, 
ectrans, elpa, gloo, hpx, hwloc, hypre, kokkos, lammps, lapackpp, 
llama-cpp, magma, mfem, mpich, onnxruntime, papi, paraview, petsc, 
pyfr, pytorch, slepc, spfft, sundials, superlu-dist, trilinos, and 
whisper-cpp.


I think that these package enhancements would make for a good Google 
Summer of Code project. I'd be happy to mentor a new contributor in 
preparing updates for these packages that enable AMD GPU support. I 
would imagine that's very likely something that the Debian Science Team 
would support, but I feel I should nevertheless ask explicitly before 
trying to engage a third party to do this work.


So, is this something that the Debian Science Team would support? And, 
are there any other barriers to enablement that I should try to help with?


Sincerely,
Cory Bloor