Hi Cory, Helmut,

On 2026-03-10 23:32, Cordell Bloor wrote:
> Yes, I tried the same a while back. I gave up and decided to update to
> rccl from ROCm 6.4.
>
> It was easy enough to move to rccl from ROCm 6.4 (albeit with a reduced
> set of supported GPUs), but I got hung up on a missing symbol error.
> Upstream had dropped a function without changing the SONAME. In talking
> to them, they justified it on the basis that the function never worked
> anyway. I still think we need a dummy implementation that returns an
> error just for satisfying the linker. That's where I left off.

On 2026-03-11 12:20, Helmut Grohne wrote:
> I appreciate your attention to detail. While being attentive to dropped
> symbols is good as a general rule, I suggest that there may be
> exceptions.
>
> librccl1 does not have any reverse dependencies in Debian (trixie nor
> sid).

In this particular case, I would concur with Helmut and propose that
librccl1 be updated without bumping SOVER. If the function behind the
symbol never worked anyway, then I think it's unlikely that some binary
out there in the wild (usefully) links it, so dropping it should be low
risk. That would just break what is already broken. And as Helmut
pointed out, we know that nothing in Debian links to this.

>> The incomplete rccl update is on my salsa account.

If that helps with getting pytorch-rocm unstuck, I'm willing to take a
look at an attempt to upload this (6.4), unless anyone objects.

Best,
Christian

