Re: [IMPORTANT] Debian images no longer works for GPU driver installation with apt update

2023-08-29 Thread Andrew M.A. Cater
On Tue, Aug 29, 2023 at 10:53:35AM -0700, Wenyan Hu wrote:
> Hi Team,
> 
> Google's GPU driver installation no longer works for Debian images since
> 08/23/2023.
> 
> Google uses the installation scripts documented at
> https://cloud.google.com/compute/docs/gpus/install-drivers-gpu#installation_scripts
> to install GPU drivers on Debian images. That procedure runs the script at
> https://raw.githubusercontent.com/GoogleCloudPlatform/compute-gpu-installation/main/linux/install_gpu_driver.py
> 

Maybe take that up with Google - their script and their engineers to fix?

> On the latest images, we run the following commands:
> 
> ```
> $ apt-cache search linux-headers | grep -i $(uname -r)
> $ apt update
> $ apt install -y linux-headers-5.10.0-24-cloud-amd64 \
>     software-properties-common pciutils gcc make dkms
> ```
> After running `apt update`, the package
> linux-headers-5.10.0-24-cloud-amd64 is no longer available, so
> `apt install linux-headers-5.10.0-24-cloud-amd64` fails.
> 

apt update and then apt upgrade, maybe, to bring the system up to date
rather than just updating the list of available packages?
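
A minimal sketch of that sequence, for illustration only. The `APT_GET`
variable and the `refresh_and_upgrade` function are hypothetical names
introduced here and are not part of Google's script. `--with-new-pkgs` is
included on the assumption that a kernel ABI bump arrives under a new
package name, which a plain `apt-get upgrade` would hold back.

```shell
#!/bin/sh
# APT_GET is a hypothetical override so the sequence can be previewed:
# with APT_GET=echo the arguments are printed instead of running apt-get.
APT_GET="${APT_GET:-apt-get}"

refresh_and_upgrade() {
    # Refresh the package lists only; no package is installed or removed.
    $APT_GET update || return 1
    # Upgrade installed packages. --with-new-pkgs lets apt-get pull in
    # newly required packages, such as a kernel with a new ABI name.
    $APT_GET -y upgrade --with-new-pkgs
}
```

Setting `APT_GET=echo` before calling `refresh_and_upgrade` shows the two
steps without touching the system.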

> It seems that after `apt update`, the index no longer lists
> linux-headers-5.10.0-23-cloud-amd64 or linux-headers-5.10.0-24-cloud-amd64.
> 
> I know it would work if we (1) skip `apt update`, or (2) install the newer
> kernel packages.

If you *only* do apt update, nothing installed has changed: only the
lists of available packages have been refreshed.

> But for (1), `apt update` is required for our other package
> installations, and for (2), the kernel package update requires a VM reboot.
> 

This is always likely to be the case: if you do a kernel package update
you *should* reboot to the new kernel, since this is likely to have fixed
security or other problems.
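
One rough way to tell whether such a reboot is pending is to compare the
running kernel with the newest one installed. This is only a sketch: the
function names are hypothetical, `sort -V` is the GNU coreutils version
sort, and the wiring in the comments assumes every installed kernel has a
directory under /lib/modules, which may not hold if stale directories are
left behind.

```shell
#!/bin/sh
# newest_release: read kernel release strings on stdin (one per line)
# and print the highest one in version order (GNU sort -V).
newest_release() {
    sort -V | tail -n 1
}

# reboot_needed RUNNING NEWEST: succeed when the running kernel release
# differs from the newest installed release.
reboot_needed() {
    [ "$1" != "$2" ]
}

# Example wiring (assumption: /lib/modules has one directory per kernel):
#   newest=$(ls /lib/modules | newest_release)
#   reboot_needed "$(uname -r)" "$newest" && echo "reboot pending"
```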

> It is now blocking my team's image release and product functionality
> because we need to install the GPU driver without rebooting the VM.
> 

As above, you will need to reboot the VM eventually.

> Could you please prioritize the issue and provide some tips or support?
> 

This is an end user support and information list: we can't provide an
immediate fix to the issue. I would have suggested the Debian cloud image
team if I hadn't already suggested taking this up with Google.

> Thanks,
> Wenyan

With every good wish,

Andy Cater



Re: Debian images no longer works for GPU driver installation with apt update

2023-08-29 Thread Michael Kjörling
On 29 Aug 2023 10:53 -0700, from wen...@google.com (Wenyan Hu):
> Hi Team,

You're in the wrong place. This is a mailing list for Debian _users_
willing to help each other out. Some subscribers might be officially
involved with the Debian project in various capacities, but that's not
the primary purpose of the debian-user mailing list.


> $ apt-cache search linux-headers | grep -i $(uname -r)
> $ apt update
> $ apt install -y linux-headers-5.10.0-24-cloud-amd64

It looks like you might be relying on package versions being the same
before and after running apt update. That is an invalid assumption.
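
A script could instead derive the headers package name from the running
kernel and check, after the update, whether the refreshed index still
offers it. The following is a sketch under assumptions: `headers_pkg` and
`has_candidate` are hypothetical helper names, and the check relies on the
English "Candidate:" line printed by `apt-cache policy`, hence `LC_ALL=C`
in the wiring.

```shell
#!/bin/sh
# headers_pkg RELEASE: print the headers package name for a kernel release.
headers_pkg() {
    printf 'linux-headers-%s\n' "$1"
}

# has_candidate: read `apt-cache policy PKG` output on stdin and succeed
# only if it names an installable candidate version. Empty output or
# "Candidate: (none)" counts as not available.
has_candidate() {
    grep -Eq '^[[:space:]]*Candidate: [^(]'
}

# Example wiring, to be run after `apt update`:
#   pkg=$(headers_pkg "$(uname -r)")
#   if LC_ALL=C apt-cache policy "$pkg" | has_candidate; then
#       apt-get install -y "$pkg"
#   else
#       echo "$pkg is not in the refreshed index" >&2
#   fi
```

When the check fails, the script can report the situation clearly rather
than failing inside the install step.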


> Could you please prioritize the issue and provide some tips or support?

If something within the Debian distribution is broken, see
https://www.debian.org/Bugs/ for how to file a bug report.

However, that your script is apparently expecting a specific version
of a kernel package, or making assumptions about what changes happen
as part of an apt{,-get} update, hardly seems to me as though
something in Debian is broken. That seems more likely to be a problem
with assumptions made in your script. Kernel packages are upgraded
regularly in this manner.

See also http://www.catb.org/esr/faqs/smart-questions.html#urgent

-- 
Michael Kjörling 🔗 https://michael.kjorling.se
“Remember when, on the Internet, nobody cared that you were a dog?”



[IMPORTANT] Debian images no longer works for GPU driver installation with apt update

2023-08-29 Thread Wenyan Hu
Hi Team,

Google's GPU driver installation no longer works for Debian images since
08/23/2023.

Google uses the installation scripts documented at
https://cloud.google.com/compute/docs/gpus/install-drivers-gpu#installation_scripts
to install GPU drivers on Debian images. That procedure runs the script at
https://raw.githubusercontent.com/GoogleCloudPlatform/compute-gpu-installation/main/linux/install_gpu_driver.py

On the latest images, we run the following commands:

```
$ apt-cache search linux-headers | grep -i $(uname -r)
$ apt update
$ apt install -y linux-headers-5.10.0-24-cloud-amd64 \
    software-properties-common pciutils gcc make dkms
```
After running `apt update`, the package
linux-headers-5.10.0-24-cloud-amd64 is no longer available, so
`apt install linux-headers-5.10.0-24-cloud-amd64` fails.

It seems that after `apt update`, the index no longer lists
linux-headers-5.10.0-23-cloud-amd64 or linux-headers-5.10.0-24-cloud-amd64.

I know it would work if we (1) skip `apt update`, or (2) install the newer
kernel packages.
But for (1), `apt update` is required for our other package
installations, and for (2), the kernel package update requires a VM reboot.

It is now blocking my team's image release and product functionality
because we need to install the GPU driver without rebooting the VM.

Could you please prioritize the issue and provide some tips or support?

Thanks,
Wenyan