Bug#1035024: unblock: nvidia-cudnn/8.7.0.84~cuda11.8+1 (pre-approval)

2023-05-08 Thread Paul Gevers

Hi Mo,

On 08-05-2023 00:55, M. Zhou wrote:

On Sun, 2023-05-07 at 22:03 +0200, Paul Gevers wrote:

On 27-04-2023 21:31, M. Zhou wrote:

4. debconf template default choice is changed to "I Agree".
  This package is in the non-free section. Only by setting the debconf default
  choice to "I Agree" can we correctly build pytorch-cuda in sbuild; otherwise
  the bin:nvidia-cudnn package is installed but the cuDNN libraries are not
  downloaded.


Are we legally allowed to do this? If so, why even ask the question?


According to the upstream license and the package content, the URL points
to a distributable tarball depending on the user's agreement.
The debconf question shows the full license text and asks the
user whether to accept the terms. These terms were deemed problematic
by ftp-masters if we upload the binary blobs directly into the archive.


I may not have phrased my question correctly. What I mean is that if a 
user installs the package in a non-interactive way, do you believe they 
agreed with the license? If not, is it OK to install the package even if 
the user didn't agree with it? If the answer is, the user must accept 
the license, then I believe that the default can't be to accept it. If 
it's acceptable to install without the user seeing the license and 
accepting it, then why even ask the question?



At least, building the reverse dependency pytorch-cuda via sbuild, where
the binary blobs will be pulled and linked against, is legal according to
the license. Uploading the binary form of pytorch-cuda is ok as well.


That's nice already.


Other binary distributions like ArchLinux, Anaconda, and even PyTorch
upstream have been redistributing the cuDNN binaries for years though.


I have no idea if and how they would ask for license agreements.


Although I hate dealing with annoying non-free license texts, I think
it is not safe to remove the debconf question prompt, because the license
seems to pose even more restrictions than its dependency CUDA devkit.


I conclude from this part that it's NOT ok to skip the debconf question 
which is what happens if the user runs the install with non-interactive 
debconf.



PS wasn't an autopkgtest feasible such that this didn't need to be on
our radar? (too late for that now, but still)


It looks like I have to refresh my memory; I thought autopkgtest wouldn't
be run for non-free packages.


Right. It was recently pointed out to me that the ci.d.n infrastructure 
failed to support that (always, or since last year), but the migration 
software of the Release Team always has supported it. That was a bug, 
not by design. It's fixed now.



Writing the test scripts is easy, but I think
that's not needed if I can get a manual removal or refusal.


Indeed, because we're too close to the full freeze to help you; you'll 
need an unblock anyways.


Paul




Bug#1035024: unblock: nvidia-cudnn/8.7.0.84~cuda11.8+1 (pre-approval)

2023-05-07 Thread M. Zhou
On Sun, 2023-05-07 at 22:03 +0200, Paul Gevers wrote:
> Control: tags -1 moreinfo
> 
> Hi Mo,
> 
> On 27-04-2023 21:31, M. Zhou wrote:
> > So, generally updating the package is simply to update the binary
> > tarball URL in the script, along with the exact version number,
> > which is very trivial.
> 
> So why didn't you ask for only this?

I thought about this choice. This package is hardly useful without
the fake library (for fixing dh_shlibdeps resolving), because it
cannot serve as a component in the dependency chain for its future
reverse dependencies. Even if the current testing package entered
the next stable, it is still hardly useful.

So if this is not going to be approved, I would prefer to get it removed
from testing and prepare for the stable backports instead.

> > 4. debconf template default choice is changed to "I Agree".
> >  This package is in the non-free section. Only by setting the debconf
> >  default choice to "I Agree" can we correctly build pytorch-cuda in
> >  sbuild; otherwise the bin:nvidia-cudnn package is installed but the
> >  cuDNN libraries are not downloaded.
> 
> Are we legally allowed to do this? If so, why even ask the question?

According to the upstream license and the package content, the URL points
to a distributable tarball depending on the user's agreement.
The debconf question shows the full license text and asks the
user whether to accept the terms. These terms were deemed problematic
by ftp-masters if we upload the binary blobs directly into the archive.

At least, building the reverse dependency pytorch-cuda via sbuild, where
the binary blobs will be pulled and linked against, is legal according to
the license. Uploading the binary form of pytorch-cuda is ok as well.

Other binary distributions like ArchLinux, Anaconda, and even PyTorch
upstream have been redistributing the cuDNN binaries for years though.

Although I hate dealing with annoying non-free license texts, I think
it is not safe to remove the debconf question prompt, because the license
seems to pose even more restrictions than its dependency CUDA devkit.

> Paul
> PS wasn't an autopkgtest feasible such that this didn't need to be on 
> our radar? (too late for that now, but still)

It looks like I have to refresh my memory; I thought autopkgtest wouldn't
be run for non-free packages. Writing the test scripts is easy, but I think
that's not needed if I can get a manual removal or refusal.
I checked the license: some simple test scripts that exercise the downloader
script and do a test compilation / linking will not violate the license.
I will add them in the future.

Both would work for me. I'm ok with stable backports. After all, pytorch-cuda
will only be available through backports.


P.S.
I really hate dealing with this package with a complicated end user
agreement. It led to my years-long procrastination over the pytorch-cuda
preparation. But, I was still forced to get it done solely due to the
nvidia monopoly if we want a mature and high-performance version
of pytorch. That said, the debian-ai@l.d.o team is diligently working
on AMD's open-source ROCm, which provides alternatives for nvidia
CUDA and cuDNN. When the ROCm stack is ready in our archive, I
want to gradually give up the cuda branch and the corresponding
effort -- pytorch-rocm can enter main, while pytorch-cuda can never.



Bug#1035024: unblock: nvidia-cudnn/8.7.0.84~cuda11.8+1 (pre-approval)

2023-05-07 Thread Paul Gevers

Control: tags -1 moreinfo

Hi Mo,

On 27-04-2023 21:31, M. Zhou wrote:

So, generally updating the package is simply to update the binary
tarball URL in the script, along with the exact version number,
which is very trivial.


So why didn't you ask for only this?


4. debconf template default choice is changed to "I Agree".
 This package is in the non-free section. Only by setting the debconf default
 choice to "I Agree" can we correctly build pytorch-cuda in sbuild; otherwise
 the bin:nvidia-cudnn package is installed but the cuDNN libraries are not
 downloaded.


Are we legally allowed to do this? If so, why even ask the question?

Paul
PS wasn't an autopkgtest feasible such that this didn't need to be on 
our radar? (too late for that now, but still)




Bug#1035024: unblock: nvidia-cudnn/8.7.0.84~cuda11.8+1 (pre-approval)

2023-04-27 Thread M. Zhou
Package: release.debian.org
Severity: normal
User: release.debian@packages.debian.org
Usertags: unblock
X-Debbugs-Cc: nvidia-cu...@packages.debian.org
Control: affects -1 + src:nvidia-cudnn

Please unblock package nvidia-cudnn. Not yet uploaded to unstable,
just asking for a pre-approval.

[ Reason ]

Our current package version in testing is 8.5.0.96~cuda11.7,
but the nvidia-cuda-toolkit version in testing is 11.8.89~11.8.0-2.
So there is a minor version mismatch in the cuda version
(one 11.7, and the other 11.8).
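
A minimal sketch of how this mismatch could be checked mechanically. The
helper name cuda_series and the two naming conventions it recognizes (a
"~cudaX.Y" suffix, or a leading "X.Y") are assumptions inferred from the
two version strings above, not an established packaging rule.

```python
import re


def cuda_series(debian_version):
    """Return (major, minor) of the CUDA release a version string refers to.

    Hypothetical helper: it assumes either a '~cudaX.Y' suffix
    (nvidia-cudnn style) or a leading 'X.Y' (nvidia-cuda-toolkit style).
    """
    match = re.search(r"~cuda(\d+)\.(\d+)", debian_version)
    if match is None:
        match = re.match(r"(\d+)\.(\d+)", debian_version)
    if match is None:
        raise ValueError("no CUDA series found in %r" % debian_version)
    return int(match.group(1)), int(match.group(2))


cudnn = cuda_series("8.5.0.96~cuda11.7")
toolkit = cuda_series("11.8.89~11.8.0-2")
# The two packages match only when both resolve to the same CUDA series.
print(cudnn, toolkit, "match" if cudnn == toolkit else "mismatch")
```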

This package is a downloader script that downloads the Nvidia
binary blob releases of the cuDNN library, and installs the library
to the system directories for building reverse dependencies.

So, generally updating the package is simply to update the binary
tarball URL in the script, along with the exact version number,
which is very trivial.

But unfortunately, during the cuda11.7 to cuda11.8 update,
I also introduced many changes to the package's maintainer
scripts to let the package correctly support the pytorch-cuda build.
I'm the upstream of this package, and this looks like a low risk
update to me. But I'm not sure what the release team will think,
so I am asking for upload permission in advance.

[ Impact ]
(What is the impact for the user if the unblock isn't granted?)

Nearly no impact. This package is new and does not exist
in the previous stable releases. To the best of my knowledge,
there is only one tentative reverse dependency pytorch-cuda,
which is not present in testing.

[ Tests ]
(What automated or manual tests cover the affected code?)

The updated package is now able to correctly support the
build of pytorch-cuda. I tested the built package with both
Nvidia MX250 (laptop) and RTX 2060 (laptop) GPUs. It works
correctly.

[ Risks ]

There is a small risk. The additional code is very simple.
It has no reverse dependencies in testing. There
is no alternative to this package. I'm the upstream author
of the script, and I can provide stable updates on my own
even if something goes wrong.

[ Checklist ]
  [x] all changes are documented in the d/changelog
  [x] I reviewed all changes and I approve them
  [x] attach debdiff against the package in testing

[ Other info ]
(Anything else the release team should know.)

The debdiff contains necessary changes to make the package
correctly support the pytorch-cuda build (with sbuild).

Specifically:

1. A fake library is compiled (from a nearly empty C file cudnn-fake.c)
   with the soname of the library to be downloaded. This seems to be
   the only way to make apt/dpkg believe that the libcudnn.so.* is
   really provided by this binary package.
   This solves the "libcudnn_* cannot be found in any system package"
   error from dh_shlibdeps.

2. Added curl as an alternative binary blob downloader.

3. Updated the postinst and prerm scripts for handling installed files.
   In the current testing version, when we want to remove this package,
   we use some manually written glob patterns to remove the downloaded
   cudnn library. This implementation is not very safe when the user has
   manually installed another instance of cudnn on the system: the glob
   patterns will also match those files, so they are removed during postrm.

   In the proposed version (see debdiff), I record a list of the files that
   are installed from the tarball to the system, and the postrm process uses
   the exact recorded installation paths for removal. I think this is a safer
   implementation than removal by glob pattern match.

4. debconf template default choice is changed to "I Agree".
   This package is in the non-free section. Only by setting the debconf
   default choice to "I Agree" can we correctly build pytorch-cuda in sbuild;
   otherwise the bin:nvidia-cudnn package is installed but the cuDNN
   libraries are not downloaded.

5. More code comments (maintenance notes) in the script, and the upgraded
   binary blob URL.
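
The idea behind item 3 (record what was installed, then remove exactly
that) can be sketched as follows. This is only an illustration in Python
and not the code in the debdiff: the function names, the manifest file
name installed.list, and the directory layout are all hypothetical, and
symlinks are not handled.

```python
import tempfile
from pathlib import Path


def install_with_manifest(src_dir, dest_dir, manifest):
    """Copy every file under src_dir into dest_dir and record each path."""
    installed = []
    for src in sorted(Path(src_dir).rglob("*")):
        if src.is_file():
            dest = Path(dest_dir) / src.relative_to(src_dir)
            dest.parent.mkdir(parents=True, exist_ok=True)
            dest.write_bytes(src.read_bytes())
            installed.append(str(dest))
    Path(manifest).write_text("".join(p + "\n" for p in installed))
    return installed


def remove_from_manifest(manifest):
    """Remove only the recorded paths; files not in the list are untouched."""
    manifest = Path(manifest)
    removed = []
    if not manifest.exists():
        return removed
    for line in manifest.read_text().splitlines():
        if line and Path(line).is_file():
            Path(line).unlink()
            removed.append(line)
    manifest.unlink()
    return removed


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "tarball/lib").mkdir(parents=True)
    (root / "tarball/lib/libcudnn.so.8").write_text("from tarball")
    (root / "system/lib").mkdir(parents=True)
    # A copy the user installed by hand; a glob like libcudnn* would hit it.
    (root / "system/lib/libcudnn_user_copy.so").write_text("user copy")
    install_with_manifest(root / "tarball", root / "system",
                          root / "installed.list")
    remove_from_manifest(root / "installed.list")
    print(sorted(p.name for p in (root / "system/lib").iterdir()))
```

Because removal is driven by the recorded list, the hand-installed file
survives, which is the property a glob pattern cannot guarantee.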

unblock nvidia-cudnn/8.7.0.84~cuda11.8+1
Thank you for using reportbug

diff -Nru nvidia-cudnn-8.5.0.96~cuda11.7/cudnn-fake.c nvidia-cudnn-8.7.0.84~cuda11.8/cudnn-fake.c
--- nvidia-cudnn-8.5.0.96~cuda11.7/cudnn-fake.c	1969-12-31 19:00:00.0 -0500
+++ nvidia-cudnn-8.7.0.84~cuda11.8/cudnn-fake.c	2023-03-21 18:49:17.0 -0400
@@ -0,0 +1,8 @@
+/* This is a fake library. We want dpkg-shlibdeps to believe that the
+ * shared object libcudnn.so.8 is provided in this package.
+ */
+int
+__cudnn_fake_library__()
+{
+return 0;
+}
diff -Nru nvidia-cudnn-8.5.0.96~cuda11.7/debian/changelog nvidia-cudnn-8.7.0.84~cuda11.8/debian/changelog
--- nvidia-cudnn-8.5.0.96~cuda11.7/debian/changelog	2023-02-17 23:24:39.0 -0500
+++ nvidia-cudnn-8.7.0.84~cuda11.8/debian/changelog	2023-03-21 18:49:17.0 -0400
@@ -1,3 +1,33 @@
+nvidia-cudnn (8.7.0.84~cuda11.8) experimental; urgency=medium
+
+  * Upgrade to cuDNN v8.7.0.84
+  * Set the debconf template default choice to "I Agree".
+Only in this way can we use the binary packa