Re: [IMPORTANT] Debian images no longer works for GPU driver installation with apt update

2023-08-29 Thread Andrew M.A. Cater
On Tue, Aug 29, 2023 at 10:53:35AM -0700, Wenyan Hu wrote:
> Hi Team,
> 
> Google's GPU driver installation no longer works for Debian images since
> 08/23/2023.
> 
> Google uses the procedure at
> https://cloud.google.com/compute/docs/gpus/install-drivers-gpu#installation_scripts
> to install GPU drivers on Debian images. It runs the script at
> https://raw.githubusercontent.com/GoogleCloudPlatform/compute-gpu-installation/main/linux/install_gpu_driver.py
> 

Maybe take that up with Google? It is their script, so it is for their
engineers to fix.

> Now, on the latest images, we run the following commands:
> 
> ```
> $ apt-cache search linux-headers | grep -i "$(uname -r)"
> $ apt update
> $ apt install -y linux-headers-5.10.0-24-cloud-amd64 \
>     software-properties-common pciutils gcc make dkms
> ```
> After running `apt update`, we find that the package
> linux-headers-5.10.0-24-cloud-amd64 no longer exists, which causes
> `apt install linux-headers-5.10.0-24-cloud-amd64` to fail.
> 

apt update and then apt upgrade, maybe, to bring the system up to date
rather than just updating the list of available packages?

> It seems that after `apt update`, the index no longer lists
> linux-headers-5.10.0-23-cloud-amd64 or linux-headers-5.10.0-24-cloud-amd64.
> 
> I know it would work if we (1) skip `apt update`, or (2) install newer
> kernel packages.

If you *only* run apt update, nothing on the system itself has changed: it
just refreshes the list of available packages.
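
One way to see what the refreshed package list actually offers is to ask apt
whether a headers package matching the running kernel still has an install
candidate. The following is a minimal sketch, not part of Google's script: the
function names are hypothetical, and it assumes a Debian release string of the
form <version>-<abi>-<flavour> (for example 5.10.0-24-cloud-amd64).

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical.

# Headers package that matches a given kernel release exactly.
headers_pkg() {
    printf 'linux-headers-%s\n' "$1"
}

# Debian metapackage that tracks the newest headers for the same flavour,
# e.g. 5.10.0-24-cloud-amd64 -> linux-headers-cloud-amd64.
# Assumes the release string looks like <version>-<abi>-<flavour>.
headers_metapkg() {
    printf 'linux-headers-%s\n' "${1#*-*-}"
}

# Succeeds if apt currently has an installable candidate for the package.
# "Candidate: (none)" or no output at all means it is not available.
has_candidate() {
    LC_ALL=C apt-cache policy "$1" 2>/dev/null | grep -q 'Candidate: [^(]'
}

# Print the exact headers package if it is still available, otherwise the
# metapackage for the same flavour. Defaults to the running kernel.
choose_headers_pkg() {
    release=${1:-$(uname -r)}
    exact=$(headers_pkg "$release")
    if has_candidate "$exact"; then
        printf '%s\n' "$exact"
    else
        headers_metapkg "$release"
    fi
}
```

After sourcing this, something like `apt update && apt install -y "$(choose_headers_pkg)"`
would avoid hard-coding a kernel version. Note that when the fallback is taken,
the metapackage pulls in headers for the newest kernel, which only match the
running kernel after a kernel upgrade and a reboot.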

> But for (1), `apt update` is required for our other package
> installations, and for (2), the kernel package update requires rebooting
> the VM.
> 

This is always likely to be the case: if you do a kernel package update
you *should* reboot to the new kernel, since this is likely to have fixed
security or other problems.
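
A rough way to tell whether a reboot is pending is to compare the running
kernel with the newest kernel image installed on disk. This is only a sketch:
the function names are hypothetical, it assumes kernel images are installed as
/boot/vmlinuz-<release>, and it relies on `sort -V` from GNU coreutils.

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical.

# Print the highest kernel release that has an image in the boot directory.
# Assumes images are named vmlinuz-<release>; defaults to /boot.
newest_installed_kernel() {
    boot_dir=${1:-/boot}
    for image in "$boot_dir"/vmlinuz-*; do
        # Skip the unexpanded pattern when no image exists.
        [ -e "$image" ] && printf '%s\n' "${image##*/vmlinuz-}"
    done | sort -V | tail -n 1
}

# Succeeds if the running kernel differs from the newest installed one.
# Arguments (both optional): running release, boot directory.
reboot_needed() {
    running=${1:-$(uname -r)}
    newest=$(newest_installed_kernel "${2:-/boot}")
    [ -n "$newest" ] && [ "$running" != "$newest" ]
}
```

For example, `reboot_needed && echo "reboot to pick up the new kernel"` could
be run after an upgrade, before trying to build modules against the headers.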

> It is now blocking my team's image release and product functionality
> because we need the GPU driver installation without VM rebooting.
> 

As above, you will need to reboot the VM eventually.

> Could you please prioritize the issue and provide some tips or support?
> 

This is an end user support and information list: we can't provide an
immediate fix to the issue. I would have suggested the Debian cloud image
team if I hadn't already suggested taking this up with Google.

> Thanks,
> Wenyan

With every good wish,

Andy Cater



[IMPORTANT] Debian images no longer works for GPU driver installation with apt update

2023-08-29 Thread Wenyan Hu
Hi Team,

Google's GPU driver installation no longer works for Debian images since
08/23/2023.

Google uses the procedure at
https://cloud.google.com/compute/docs/gpus/install-drivers-gpu#installation_scripts
to install GPU drivers on Debian images. It runs the script at
https://raw.githubusercontent.com/GoogleCloudPlatform/compute-gpu-installation/main/linux/install_gpu_driver.py

Now, on the latest images, we run the following commands:

```
$ apt-cache search linux-headers | grep -i "$(uname -r)"
$ apt update
$ apt install -y linux-headers-5.10.0-24-cloud-amd64 \
    software-properties-common pciutils gcc make dkms
```
After running `apt update`, we find that the package
linux-headers-5.10.0-24-cloud-amd64 no longer exists, which causes
`apt install linux-headers-5.10.0-24-cloud-amd64` to fail.

It seems that after `apt update`, the index no longer lists
linux-headers-5.10.0-23-cloud-amd64 or linux-headers-5.10.0-24-cloud-amd64.

I know it would work if we (1) skip `apt update`, or (2) install newer
kernel packages.
But for (1), `apt update` is required for our other package
installations, and for (2), the kernel package update requires rebooting
the VM.

It is now blocking my team's image release and product functionality
because we need the GPU driver installation without VM rebooting.

Could you please prioritize the issue and provide some tips or support?

Thanks,
Wenyan