Hi Team,

(CC'ed debian-science, but please redirect discussion to -ai@l.d.o)

Previously I made this decision entirely on my own. But now that
more people are working on the pytorch-related packages, I'd like
to ask for your comments on what range of Nvidia GPUs we want to
support in the pytorch-cuda package before the freeze.

The correspondence between GPU model and CUDA architecture can
be found on this website: https://developer.nvidia.com/cuda-gpus
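
For a quick local check, the compute capability a card reports can
also be queried from PyTorch itself; a minimal sketch, assuming a
CUDA-enabled build of torch is installed:

  import torch

  # get_device_capability() returns (major, minor),
  # e.g. (6, 1) for a GTX 1080 or (8, 6) for an RTX 3090.
  if torch.cuda.is_available():
      major, minor = torch.cuda.get_device_capability(0)
      name = torch.cuda.get_device_name(0)
      print(f"{name} -> compute capability {major}.{minor}")
  else:
      print("no CUDA device visible")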

I made a commit several minutes ago:
https://salsa.debian.org/deeplearning-team/pytorch/-/commit/eeb7758aea5a751191c25182feb628360e4c3704
which finalizes the CUDA architecture list as
  6.1;7.5;8.6+PTX

This basically covers GPUs from 8-year-old cards like the GTX 1080
(6.1) and Titan X (6.1) up to the RTX 3090 (8.6). Newer GPUs like the
RTX 4090 can still run this build without problems, because the "+PTX"
suffix embeds PTX code that the CUDA driver JIT-compiles for newer
architectures at runtime. GPUs older than the GTX 10 series will not
run this build.
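
As a sanity check, the architectures a given build was actually
compiled for can be listed from Python; a minimal sketch, assuming the
pytorch-cuda build of torch is importable:

  import torch

  # Lists the targets baked into this build; for "6.1;7.5;8.6+PTX"
  # I'd expect something like ['sm_61', 'sm_75', 'sm_86', 'compute_86'],
  # where 'compute_86' is the embedded PTX that newer GPUs JIT-compile.
  print(torch.cuda.get_arch_list())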

Our neighbor Arch Linux uses a much longer list to better support
specific architectures, but they raised the baseline to the
Nvidia V100 (7.0) and Nvidia Titan V (7.0):
https://gitlab.archlinux.org/archlinux/packaging/packages/python-pytorch/-/blob/main/PKGBUILD?ref_type=heads
  export TORCH_CUDA_ARCH_LIST="7.0 7.2 7.5 8.0 8.6 8.7 8.9 9.0 9.0a"

I kept support for quite old devices like the GTX 1080 Ti (6.1) here,
because my impression is that lots of Debian users like to run Debian
on ancient hardware. That's probably why m68k, hppa, and i386 are
still around.

I refrained from adding even older GPUs to the list, because they lag
behind modern hardware by a giant margin. For instance, the GTX 1080 Ti
already lags well behind a laptop RTX 4070:
https://gpu.userbenchmark.com/Compare/Nvidia-RTX-4070-Laptop-vs-Nvidia-GTX-1080-Ti/m2033663vs3918
I don't really expect the potential users of even older GPUs to
find pytorch-cuda useful in any sense.

Any thoughts? I personally lean toward freezing at "6.1;7.5;8.6+PTX"
to keep the binary size small. Some time ago I ran into a linker
overflow when building with a long architecture list, so I'd rather
avoid that.



By the way, Arch Linux's pytorch-rocm architecture list is also
super long:
  export PYTORCH_ROCM_ARCH="gfx900;gfx906;gfx908;gfx90a;gfx1030;gfx1100;gfx1101;gfx942"
https://gitlab.archlinux.org/archlinux/packaging/packages/python-pytorch/-/blob/main/PKGBUILD?ref_type=heads#L250

Their binary package sizes still look under control, though:
https://mirrors.edge.kernel.org/archlinux/pool/packages/
python-pytorch-2.6.0-4-x86_64.pkg.tar.zst          16-Feb-2025 18:51     82M
python-pytorch-cuda-2.6.0-4-x86_64.pkg.tar.zst     16-Feb-2025 18:52    324M
python-pytorch-opt-2.6.0-4-x86_64.pkg.tar.zst      16-Feb-2025 18:52     82M
python-pytorch-opt-cuda-2.6.0-4-x86_64.pkg.tar.zst 16-Feb-2025 18:54    325M
python-pytorch-opt-rocm-2.6.0-4-x86_64.pkg.tar.zst 16-Feb-2025 18:55    339M
python-pytorch-rocm-2.6.0-4-x86_64.pkg.tar.zst     16-Feb-2025 18:56    338M
