Re: [IMPORTANT] Debian images no longer works for GPU driver installation with apt update
On Tue, Aug 29, 2023 at 10:53:35AM -0700, Wenyan Hu wrote:
> Hi Team,
>
> Google's GPU driver installation no longer works for Debian images since
> 08/23/2023.
>
> Google is using
> https://cloud.google.com/compute/docs/gpus/install-drivers-gpu#installation_scripts
> to install GPU drivers on Debian images. It executes scripts as
> https://raw.githubusercontent.com/GoogleCloudPlatform/compute-gpu-installation/main/linux/install_gpu_driver.py
> .

Maybe take that up with Google - their script and their engineers to fix?

> Now for the latest images, if running scripts as below:
>
> ```
> $ apt-cache search linux-headers | grep -i $(uname -r)
> $ apt update
> $ apt install -y linux-headers-5.10.0-24-cloud-amd64 software-properties-common pciutils gcc make dkms
> ```
> After running `apt update`, we find the package
> linux-headers-5.10.0-24-cloud-amd64 no longer exists, which results in the
> `apt install linux-headers-5.10.0-24-cloud-amd64` failure.

apt update and then apt upgrade, maybe, to bring the system up to date rather than just updating the list of available packages?

> It seems after the `apt update`, the index somehow no longer points to
> linux-headers-5.10.0-23-cloud-amd64 and linux-headers-5.10.0-24-cloud-amd64.
>
> I know it would work if we (1) not do `apt update`, or (2) install newer
> kernel packages.

If you *only* do apt update, nothing has changed.

> But for (1), the apt update is required for our other package
> installations, (2) the kernel package update needs VM rebooting.

This is always likely to be the case: if you do a kernel package update you *should* reboot to the new kernel, since this is likely to have fixed security or other problems.

> It is now blocking my team's image release and product functionality
> because we need the GPU driver installation without VM rebooting.

As above, you will need to reboot the VM eventually.

> Could you please prioritize the issue and provide some tips or support?

This is an end user support and information list: we can't provide an immediate fix to the issue. I would have suggested the debian cloud image team if I hadn't already suggested taking this up with Google.

> Thanks,
> Wenyan

With every good wish,
Andy Cater
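
A note on the reboot point in the reply above: whether a VM is still running an older kernel than the newest one installed can be checked mechanically. The following is a minimal sketch, not something posted in the thread. The helper names `newest_kernel` and `reboot_pending` are hypothetical, it relies on GNU `sort -V`, and the commented wiring assumes that each installed kernel has a directory under `/lib/modules`, which may not hold if stale module directories are left behind.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of any Debian or Google tool.

# Print the highest kernel release from a newline-separated list on stdin.
# Uses GNU sort's -V (version sort), so 5.10.0-9 sorts below 5.10.0-24.
newest_kernel() {
    sort -V | tail -n 1
}

# Succeed (exit 0) when the running release ($1) differs from the newest
# installed release ($2), i.e. a reboot into the newer kernel is pending.
reboot_pending() {
    [ "$1" != "$2" ]
}

# Example wiring on a real system (assumption: one /lib/modules entry per
# installed kernel):
#   newest=$(ls /lib/modules | newest_kernel)
#   if reboot_pending "$(uname -r)" "$newest"; then
#       echo "running $(uname -r), newest installed is $newest: reboot pending"
#   fi
```

Keeping the comparison in small functions that take plain strings means the logic can be exercised without touching the package database.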
Re: Debian images no longer works for GPU driver installation with apt update
On 29 Aug 2023 10:53 -0700, from wen...@google.com (Wenyan Hu):
> Hi Team,

You're in the wrong place. This is a mailing list for Debian _users_ willing to help each other out. Some subscribers might be officially involved with the Debian project in various capacities, but that's not the primary purpose of the debian-user mailing list.

> $ apt-cache search linux-headers | grep -i $(uname -r)
> $ apt update
> $ apt install -y linux-headers-5.10.0-24-cloud-amd64

It looks like you might be relying on package versions being the same before and after running apt update. That is an invalid assumption.

> Could you please prioritize the issue and provide some tips or support?

If something within the Debian distribution is broken, see https://www.debian.org/Bugs/ for how to file a bug report.

However, that your script is apparently expecting a specific version of a kernel package, or making assumptions about what changes happen as part of an apt{,-get} update, hardly seems to me as though something in Debian is broken. That seems more likely to be a problem with assumptions made in your script. Kernel packages are upgraded regularly in this manner.

See also http://www.catb.org/esr/faqs/smart-questions.html#urgent

-- 
Michael Kjörling 🔗 https://michael.kjorling.se
“Remember when, on the Internet, nobody cared that you were a dog?”
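
To illustrate the point about not assuming a fixed package version: a script could look up, after `apt update`, whether headers matching the running kernel are still listed, and otherwise fall back to the flavour metapackage. This is a rough sketch under stated assumptions, not the logic of Google's installer. The function names are hypothetical, the release is assumed to follow Debian's `<version>-<abi>-<flavour>` pattern, and the metapackage name (for example `linux-headers-cloud-amd64`) is derived from that pattern rather than confirmed for every flavour.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical.

# Map a kernel release such as 5.10.0-24-cloud-amd64 to its headers package.
headers_pkg() {
    printf 'linux-headers-%s\n' "$1"
}

# Map a release to the flavour metapackage, e.g. linux-headers-cloud-amd64.
# Assumes the <version>-<abi>-<flavour> release naming.
headers_metapkg() {
    flavour=$(printf '%s\n' "$1" | sed -E 's/^[0-9.]+-[0-9]+-//')
    printf 'linux-headers-%s\n' "$flavour"
}

# Read available package names on stdin, one per line. Print the exact
# headers package for release $1 if it is listed, else the metapackage.
choose_headers_pkg() {
    want=$(headers_pkg "$1")
    if grep -qxF -- "$want"; then
        printf '%s\n' "$want"
    else
        headers_metapkg "$1"
    fi
}

# Example wiring (run as root, after `apt update`):
#   pkg=$(apt-cache search --names-only '^linux-headers-' | cut -d' ' -f1 \
#         | choose_headers_pkg "$(uname -r)")
#   apt install -y "$pkg"
```

If the fallback is taken, the installed headers belong to a newer kernel than the one running, so any module built against them only becomes usable after the matching kernel image is installed and the VM is rebooted, which is consistent with the replies in this thread.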
[IMPORTANT] Debian images no longer works for GPU driver installation with apt update
Hi Team,

Google's GPU driver installation no longer works for Debian images since 08/23/2023.

Google is using https://cloud.google.com/compute/docs/gpus/install-drivers-gpu#installation_scripts to install GPU drivers on Debian images. It executes scripts as https://raw.githubusercontent.com/GoogleCloudPlatform/compute-gpu-installation/main/linux/install_gpu_driver.py .

Now for the latest images, if running scripts as below:

```
$ apt-cache search linux-headers | grep -i $(uname -r)
$ apt update
$ apt install -y linux-headers-5.10.0-24-cloud-amd64 software-properties-common pciutils gcc make dkms
```

After running `apt update`, we find the package linux-headers-5.10.0-24-cloud-amd64 no longer exists, which results in the `apt install linux-headers-5.10.0-24-cloud-amd64` failure.

It seems after the `apt update`, the index somehow no longer points to linux-headers-5.10.0-23-cloud-amd64 and linux-headers-5.10.0-24-cloud-amd64.

I know it would work if we (1) not do `apt update`, or (2) install newer kernel packages. But for (1), the apt update is required for our other package installations, (2) the kernel package update needs VM rebooting.

It is now blocking my team's image release and product functionality because we need the GPU driver installation without VM rebooting.

Could you please prioritize the issue and provide some tips or support?

Thanks,
Wenyan