Re: [RFC] DPT Policy: Canonise recommendation against PyPi-provided upstream source tarballs
On 2021-06-27 23:49:18 -0300 (-0300), Emmanuel Arias wrote:
[...]
> if we package from PyPi sdists that don't contain the testsuite, we
> end up with packages without any tests, and that isn't good.
>
> Also, I'm not sure, but the docs aren't on PyPi either, are they?
[...]

This depends entirely on how upstream is creating their sdists. They might certainly choose to omit tests or even documentation, but I think that's becoming less popular now that wheels exist. A wheel is expected to omit basically everything except the application, licensing information and some metadata. This has reduced the pressure on upstreams with massive suites of tests or volumes of documentation to strip them out of sdists, making it more likely they'll ship full source distributions that way.
--
Jeremy Stanley
Re: [RFC] DPT Policy: Canonise recommendation against PyPi-provided upstream source tarballs
Hi everybody,

On 6/26/21 7:51 PM, Louis-Philippe Véronneau wrote:
> To me, the most important thing is that all packages must at least run
> the upstream testsuite when it exists (I'm planning on writing a policy
> proposal saying this after the freeze). If PyPi releases include them, I
> think it's fine (but they often don't).

I was a little surprised that nobody in the discussion had taken the tests into account (or if I missed it, sorry); thanks pollo for bringing it up. I don't have exact numbers, but I have seen many Python packages without autopkgtest, and if we package from PyPi sdists that don't contain the testsuite, we end up with packages without any tests, and that isn't good.

Also, I'm not sure, but the docs aren't on PyPi either, are they?

On the other hand, files like .git* or CI files can easily be removed, so I don't see any problem with those.

Cheers,
--
Emmanuel Arias
@eamanu
yaerobi.com
Re: [RFC] DPT Policy: Canonise recommendation against PyPi-provided upstream source tarballs
On 2021-06-26 18:51:26 -0400 (-0400), Louis-Philippe Véronneau wrote:
[...]
> To me, the most important thing is that all packages must at least
> run the upstream testsuite when it exists (I'm planning on writing
> a policy proposal saying this after the freeze). If PyPi releases
> include them, I think it's fine (but they often don't).

When you do write that, you'll of course want to clarify what "the upstream testsuite" really means too. Lots of projects have vast testing which is simply not feasible to replicate within Debian for a number of reasons. Running some battery of upstream tests makes sense, but testsuites which require root access outside a chroot, integration tests orchestrated across multiple machines, access to unusual sorts of accelerator or network hardware, and so on can easily comprise part of "the upstream testsuite."
--
Jeremy Stanley
Re: [RFC] DPT Policy: Canonise recommendation against PyPi-provided upstream source tarballs
On 2021-06-25 16 h 42, Nicholas D Steeves wrote:
> Hi Team!
>
> I feel like there is probably consensus against the use of PyPi-provided
> upstream source tarballs in preference for what will usually be a GitHub
> release tarball, so I made an MR to this effect (moderate recommendation
> rather than a "must" directive):
>
> https://salsa.debian.org/python-team/tools/python-modules/-/merge_requests/16

I don't often use PyPi releases because of the issues mentioned in the MR, but I think Jeremy's point is valid.

IMO, rewording the text so that it clearly says "should" and not "must" would fix the issues at hand, as long as people justify their usage of PyPi when it's "The Right Thing" in a file somewhere.

To me, the most important thing is that all packages must at least run the upstream testsuite when it exists (I'm planning on writing a policy proposal saying this after the freeze). If PyPi releases include them, I think it's fine (but they often don't).

--
  ⢀⣴⠾⠻⢶⣦⠀
  ⣾⠁⢠⠒⠀⣿⡁  Louis-Philippe Véronneau
  ⢿⡄⠘⠷⠚⠋   po...@debian.org / veronneau.org
  ⠈⠳⣄
Re: [RFC] DPT Policy: Canonise recommendation against PyPi-provided upstream source tarballs
On Fri, 25 Jun 2021 at 18:29:19 -0400, Nicholas D Steeves wrote: > Take for example the > case where upstream exclusively supports a Flatpak and/or Snap > package... Flatpak and Snap aren't source package formats (like Autotools "make dist" or Meson "meson dist" or Python sdist), they're binary package formats (like .deb or Python wheels). I don't know Snap infrastructure well, but Flatpak apps are built from a manifest that lists one or more source projects, referenced as either a VCS commit with a known-good commit identifier (usually git) or an archive with a known-good hash (usually tar and sha256). The manifest format and the upstream-recommended Flathub "app store" infrastructure try to push authors towards building from source, although as with .deb, technically it's possible to release an archive containing binary blobs and use it as the "source" (which is how proprietary apps like com.valvesoftware.SteamLink work, similar to many packages in the non-free archive area). If the upstream only provides source via their VCS, then obviously we have to use `git archive` or equivalent because we have no other way to get a flat-file version, and the experimental dpkg-source format "3.0 (git)" isn't currently allowed in the Debian archive. If the upstream releases tarball artifacts and builds their Flatpak app from those, we can use those too. I think the problem case here is when the upstream releases something that has the name and format we would associate with a source release, but has contents that are somewhere between a pure source release and a binary release. Autotools "make dist" has always been a bit like this (it contains a pre-generated build system so that people can build on platforms where m4 and perl aren't available, and it's common to include pre-generated convenience copies of things like gtk-doc documentation); Python sdist archives are sometimes similar. 
In both Autotools and setuptools, it's also far too easy to have files in the VCS but accidentally omit them from the source distribution, by not listing them in Autotools EXTRA_DIST or in setuptools MANIFEST.in. What I have generally done to resolve this problem is to use the upstream's official source releases ("make dist" or sdist), and if they are missing files that we want, send merge requests to add them to the next release (for example https://gitlab.gnome.org/GNOME/gi-docgen/-/commit/5fcaba6f and https://github.com/containers/bubblewrap/commit/1c775f43), and if necessary work around missing files by shipping them in debian/ (for example https://salsa.debian.org/gnome-team/gi-docgen/-/commit/f16845d9). Several upstreams of projects I work on, notably GNOME, have been switching from Autotools to Meson, and one of the reasons I'm in favour of this tendency is that the Meson "meson dist" archive is a lightly filtered version of `git archive` (it excludes `.gitignore` and other highly git-specific files, but includes everything else), making it harder for upstreams to accidentally omit necessary source code from their source releases. smcv
Re: [RFC] DPT Policy: Canonise recommendation against PyPi-provided upstream source tarballs
On 2021-06-26 02:04, Paul Wise wrote:
> I would like to see #2 split into two separate tarballs, one for the
> exact copy of the git tree and one containing the data about the other
> tarball. Then use dpkg-source v3 secondary tarballs to add the data
> about the git repo to the Debian source package.

IIRC, the last time I tried multiple tarballs, I got stuck with pristine-tar. I'm not sure whether I couldn't work out how to commit or whether the problem was with checkout, though. Do you happen to know if this is an issue?

PS: Just for the record: I'm always(?) using upstream sources from git, not PyPi, because the latter are typically missing unit tests, which we want to run in Debian.
Re: [RFC] DPT Policy: Canonise recommendation against PyPi-provided upstream source tarballs
File names on PyPI are write-once. Once a specific file name has been used it can never be used again (even if the entire project was deleted and recreated). Projects can delete uploaded files (and, as mentioned, releases can be yanked, but yanking is just extra metadata beside the file), but file content can never change, only be removed.

> On Jun 25, 2021, at 11:47 PM, Brian Thompson wrote:
>
> On Fri, Jun 25, 2021 at 07:01:39PM -0400, Nicholas D Steeves wrote:
>> Does PyPi provide immutable releases?
>
> From experience, I can tell you that yes, releases cannot be overwritten,
> but they can be "yanked". Pypi states that a yanked release is:
>
> "A release that is always ignored by an installer, unless it is the
> only release that matches a version specifier (using either '==' or
> '===')."
>
> --
> Best regards,
>
> Brian T
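A minimal sketch of how the write-once and yanking properties could be inspected, assuming the payload shape served by PyPI's JSON API for a project (a `releases` mapping from version to file entries carrying `filename`, `packagetype`, `digests` and `yanked`). The `EXAMPLE` payload, the project name `demo` and the function name `sdist_files` are hypothetical and only illustrate the idea; fetching the real payload over the network is left out.

```python
from typing import Any


def sdist_files(payload: dict[str, Any], version: str) -> list[dict[str, Any]]:
    """Return filename, sha256 digest and yanked flag for each sdist of a version."""
    files = payload.get("releases", {}).get(version, [])
    return [
        {
            "filename": entry["filename"],
            "sha256": entry.get("digests", {}).get("sha256"),
            "yanked": bool(entry.get("yanked", False)),
        }
        for entry in files
        if entry.get("packagetype") == "sdist"
    ]


# Hypothetical payload, trimmed to the fields used above (assumed shape).
EXAMPLE = {
    "releases": {
        "1.0": [
            {
                "filename": "demo-1.0.tar.gz",
                "packagetype": "sdist",
                "digests": {"sha256": "ab" * 32},
                "yanked": True,
            },
            {
                "filename": "demo-1.0-py3-none-any.whl",
                "packagetype": "bdist_wheel",
                "digests": {"sha256": "cd" * 32},
                "yanked": True,
            },
        ]
    }
}

for item in sdist_files(EXAMPLE, "1.0"):
    print(item["filename"], item["sha256"], "yanked" if item["yanked"] else "active")
```

Because a file name can never be reused, recording the sha256 digest next to the file name is enough to notice if a later download ever differs; a yanked flag only changes how installers treat the release, not the file itself.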
Re: [RFC] DPT Policy: Canonise recommendation against PyPi-provided upstream source tarballs
On 2021-06-26 02:04:40 +0000 (+0000), Paul Wise wrote:
> On Fri, Jun 25, 2021 at 11:42 PM Jeremy Stanley wrote:
[..]
> > 2. Cryptographically signed tarballs of the file tree corresponding
> >    to a tag in the Git repository, with versioning, revision
> >    history, release notes and authorship extracted into files
> >    included directly within the tarball.
>
> I would like to see #2 split into two separate tarballs, one for the
> exact copy of the git tree and one containing the data about the other
> tarball. Then use dpkg-source v3 secondary tarballs to add the data
> about the git repo to the Debian source package.
[...]

You might like to see them split, but why is the exact copy of the work tree the only legitimate way to export data from a Git repository? Adding egg-info to the tarball creates a *Python Source Distribution* which is a long-standing standard method for distributing source code of Python software. Those files could even be checked directly into the repository, so that the work tree was itself also a valid sdist. The only reason the projects I work on don't do that is because some of it would be redundant with the metadata from the revision control system.

You could of course create your own split tarballs of the work tree and the additional metadata files, but to what end? If upstream is already delivering them together in a release tarball, how is making your own beneficial when it still has to be done by the package maintainer before assembling the source package? Users of Debian don't benefit, because they still can't recreate your split tarball if they wanted without also having a copy of the upstream Git repository anyway. It just seems like make-work.

> Probably we should start systematically comparing upstream VCS repos
> with upstream sdists and reacting to the differences. So far, I've
> reacted by ignoring the sdists completely.

I highly recommend it.
We explicitly test that our sdists don't omit files from the Git worktree (sans .git* files like .gitignore and .gitreview which make no sense outside the context of a Git repository). On the other hand, I've found at least one case where a copyright statement in a Debian package refers to an AUTHORS file shipped as part of the sdist, but since the maintainer chose to package it from Git instead and did not generate that file when doing so, it's not included in the packaged version distributed in Debian. (Not linking the bug report here as I don't want it to seem like I'm picking on the maintainer.)

Just to reiterate, as an upstream we don't consider the work trees of our Git repos to be complete source distributions. They can be used along with the versioning and history tracked as part of the repository to generate a complete source distribution, and that's what we officially release. Downstream distributions are encouraged to either use our release tarballs or clones of our Git repositories to recreate the same files we would release, but if you choose to do neither of those you're likely to miss something.
--
Jeremy Stanley
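The comparison discussed in this message (VCS work tree versus sdist, ignoring .git* files) can be sketched roughly as follows. This is an illustration under stated assumptions, not the test any upstream actually runs: the set of tracked paths is assumed to come from something like `git ls-files`, the sdist is assumed to have a single top-level `name-version/` directory, and the function names are made up for the example.

```python
import tarfile
from pathlib import PurePosixPath


def sdist_members(tar: tarfile.TarFile) -> set[str]:
    """Paths of regular files in an sdist, with the top-level directory stripped."""
    paths = set()
    for member in tar.getmembers():
        if not member.isfile():
            continue
        parts = PurePosixPath(member.name).parts
        if len(parts) > 1:
            paths.add("/".join(parts[1:]))
    return paths


def is_git_only(path: str) -> bool:
    """True for files such as .gitignore or .gitreview, at any depth."""
    return PurePosixPath(path).name.startswith(".git")


def compare(tracked: set[str], shipped: set[str]) -> tuple[set[str], set[str]]:
    """Return (tracked but not shipped, shipped but not tracked)."""
    expected = {path for path in tracked if not is_git_only(path)}
    return expected - shipped, shipped - expected
```

The first set returned by `compare` lists files a maintainer would lose by packaging from the sdist (for example a missing test directory); the second lists files that exist only in the sdist, such as a generated AUTHORS file or PKG-INFO, which would be lost by packaging from a bare work tree.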
Re: [RFC] DPT Policy: Canonise recommendation against PyPi-provided upstream source tarballs
On Fri, Jun 25, 2021 at 07:01:39PM -0400, Nicholas D Steeves wrote:
> Does PyPi provide immutable releases?

From experience, I can tell you that yes, releases cannot be overwritten, but they can be "yanked". Pypi states that a yanked release is:

"A release that is always ignored by an installer, unless it is the only release that matches a version specifier (using either '==' or '===')."

--
Best regards,

Brian T
Re: [RFC] DPT Policy: Canonise recommendation against PyPi-provided upstream source tarballs
On Fri, Jun 25, 2021 at 11:42 PM Jeremy Stanley wrote:

> 1. Cryptographically signed tags in a Git repository, with
>    versioning, revision history, release notes and authorship either
>    embedded within or tied to the Git metadata.
>
> 2. Cryptographically signed tarballs of the file tree corresponding
>    to a tag in the Git repository, with versioning, revision
>    history, release notes and authorship extracted into files
>    included directly within the tarball.

I would like to see #2 split into two separate tarballs, one for the exact copy of the git tree and one containing the data about the other tarball. Then use dpkg-source v3 secondary tarballs to add the data about the git repo to the Debian source package.

> Saying that a raw dump of the file content from a revision control
> system is recommended over using upstream's sdists presumes all
> upstreams are the same. They're not, and which is preferable (or
> doable, or even legal) differs from one to another. Just because
> some sdists, or even many, are not suitable as a basis for packaging
> doesn't mean that sdists are a bad idea to base packages on. Yes,
> basing packages on bad sdists is bad, it's hard to disagree with
> that.

Probably we should start systematically comparing upstream VCS repos with upstream sdists and reacting to the differences. So far, I've reacted by ignoring the sdists completely.

--
bye,
pabs

https://wiki.debian.org/PaulWise
Re: [RFC] DPT Policy: Canonise recommendation against PyPi-provided upstream source tarballs
On Fri, Jun 25, 2021 at 9:17 PM Jeremy Stanley wrote: > The proposal is somewhat akin to saying that a > tarball created via `make dist` is unsuitable for packaging. This is definitely true; they generally contain generated files (configure, Makefile.in) and embedded code copies (missing install-sh depcomp config.sub config.guess etc), neither of which should be part of the "source". -- bye, pabs https://wiki.debian.org/PaulWise
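As a rough illustration of the point about `make dist` tarballs, the sketch below flags members of a file listing whose names match the generated or copied-in files named in this message. The name list is taken only from the message and is not exhaustive; `generated_files` and the example paths are hypothetical.

```python
from pathlib import PurePosixPath

# File names mentioned in the message above; an assumption, not a complete list.
GENERATED_NAMES = {
    "configure",
    "Makefile.in",
    "missing",
    "install-sh",
    "depcomp",
    "config.sub",
    "config.guess",
}


def generated_files(paths: list[str]) -> list[str]:
    """Return the paths whose final component is a known generated/copied file."""
    return sorted(p for p in paths if PurePosixPath(p).name in GENERATED_NAMES)


listing = [
    "foo-1.0/configure",
    "foo-1.0/configure.ac",
    "foo-1.0/src/Makefile.in",
    "foo-1.0/src/main.c",
]
print(generated_files(listing))
```

Matching on file names alone is a heuristic: a project could legitimately track a file called `missing`, so a result from this sketch is a prompt to look closer rather than proof that a file is generated.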
Re: [RFC] DPT Policy: Canonise recommendation against PyPi-provided upstream source tarballs
On Fri, Jun 25, 2021 at 8:49 PM Nicholas D Steeves wrote: > I feel like there is probably consensus against the use of PyPi-provided > upstream source tarballs in preference for what will usually be a GitHub > release tarball I think this should be a Debian-wide default and documented in Debian Policy. I plan to bring this up more widely after the bullseye release. -- bye, pabs https://wiki.debian.org/PaulWise
Re: [RFC] DPT Policy: Canonise recommendation against PyPi-provided upstream source tarballs
Hi Jeremy!

Wow, you've given me a lot to think about. Thank you :-) Yes, I agree with you that my MR doesn't adequately address the much more heterogeneous reality. (and is also indelicate, lacks nuance, etc)

I'll take a day or two to think about this, and also to take into account what everyone else has written before making revisions for v2. If I do it right now my work won't be rigorous enough, nor fair to the contributions others have made to this thread. If nothing else it will require careful outlining...

Maybe the heading should be:

How to choose a source for the tarball (and why this can be difficult)
======================================================================

;-)

Take care,
Nicholas
Re: [RFC] DPT Policy: Canonise recommendation against PyPi-provided upstream source tarballs
On 2021-06-25 19:01:39 -0400 (-0400), Nicholas D Steeves wrote:
[...]
> And yes, I agree moderate is better, but I must sadly confess
> ignorance to the technical reasons why PyPI is sometimes more
> appropriate. Without technical reasons it seems like a case of
> ideological compromise (based on the standards I've been mentored
> to and the feedback I've received over the years).

Hopefully my other replies here and in Salsa have provided some fairly large counterexamples for you. If those still aren't entirely clear, I'm happy to go into deeper detail or broaden to related examples elsewhere in the ecosystem.
--
Jeremy Stanley
Re: [RFC] DPT Policy: Canonise recommendation against PyPi-provided upstream source tarballs
On 2021-06-25 18:29:19 -0400 (-0400), Nicholas D Steeves wrote: > A recommendation is non-binding, and the intent of this proposal is to > say that the most "sourceful" form of source is the *most* suitable for > Debian packages. The inverse of this is that `make dist` is less > suitable for Debian packages. Neither formulation of this premise > applies to a scope outside of Debian. In other words, just because a > particular form of source packaging and distribution is not considered > ideal in Debian does not in any comment on its suitability for other > purposes. Would you prefer to see a note like "PyPi is a good thing for > the Python ecosystem, but sdists are not the preferred form of Debian > source tarballs"? To reset this discussion, take the case of an upstream like the one I'm involved with. For each project, two forms of source release are made available: 1. Cryptographically signed tags in a Git repository, with versioning, revision history, release notes and authorship either embedded within or tied to the Git metadata. 2. Cryptographically signed tarballs of the file tree corresponding to a tag in the Git repository, with versioning, revision history, release notes and authorship extracted into files included directly within the tarball. If some alternative mechanism is used to grab only the work tree from a checkout of the Git repository, critical information about the software is lost, making it uninstallable in some cases (can't figure out its own version), or even illegal to redistribute (missing authors list referenced from the copyright license). So in this case you have a few options: package from upstream's Git repository, package from upstream's "release tarball" (which happens to be in Python sdist format because the egg-info is used to hold information extracted from their Git metadata), or use something which is neither of those and then have to rely on one of them anyway to supply the missing bits. 
> It's also worth mentioning that upstream's "official release" > preference is not necessarily relevant to a Debian context. Take > for example the case where upstream exclusively supports a Flatpak > and/or Snap package... [...] The problem is that you seem to want to talk in absolutes. Sure some (I'll wager many) Python projects can be reasonably packaged from a flat dump of the file content in their revision control. There are many which can't. Sure some upstreams may only want to release Flatpaks or Snaps, or may even be openly hostile to getting packaged in distributions at all. There are also quite a few which don't host their revision control in platforms which provide raw tarball exports generated on the fly. Some sdist tarballs leave out files, I agree, but they don't have to (ours don't, we only add more in order to supply the exported revision control metadata). Saying that a raw dump of the file content from a revision control system is recommended over using upstream's sdists presumes all upstreams are the same. They're not, and which is preferable (or doable, or even legal) differs from one to another. Just because some sdists, or even many, are not suitable as a basis for packaging doesn't mean that sdists are a bad idea to base packages on. Yes, basing packages on bad sdists is bad, it's hard to disagree with that. > Thinking about an ideal solution, and the interesting PBR case, I > remember that gbp is supposed to be able to associate gbp tags with > upstream commits (or possibly tags), so maybe it's also possible to do > this: > > 1. When gbp import-orig finds a new release > 2. Fetch upstream remote as well > 3. Run PBR against the upstream release tag > 4. Stage this[ese] file[s] > 5. Either append them to the upstream tarball before committing to the >pristine-tar branch, or generate the upstream tarball from the >upstream branch (intent being that the upstream branch's HEAD should >be identical to the contents of that tarball) > 6. 
Gbp creates upstream/x.y tag
> 7. Gbp merges to Debian packaging branch.

You'll either need a copy of the upstream Git repository or at least some of the files generated from that repository's metadata which has been embedded in the release tarball. I understand the desire to not put files into Debian source packages which can be generated at package build time from other files in Debian, but when those files can't be generated without the presence of the Git repository itself which *isn't* files in Debian, using the generated copies supplied (and signed!) by upstream seems no different than many other sorts of data which get shipped in Debian source packages.
--
Jeremy Stanley
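To make concrete what "files generated from that repository's metadata" can look like, here is a minimal sketch that reads the version recorded in an sdist's top-level PKG-INFO file, which uses an email-header style layout. It assumes the usual `name-version/PKG-INFO` location; the function name `sdist_version` is hypothetical, and this is not a description of how PBR or any particular upstream produces the file.

```python
import tarfile
from email.parser import Parser
from typing import Optional


def sdist_version(tar: tarfile.TarFile) -> Optional[str]:
    """Return the Version field of the top-level PKG-INFO, or None if absent."""
    for member in tar.getmembers():
        parts = member.name.split("/")
        if member.isfile() and len(parts) == 2 and parts[1] == "PKG-INFO":
            handle = tar.extractfile(member)
            if handle is None:
                continue
            # PKG-INFO is a set of RFC 822 style headers, so the stdlib
            # email parser can read it.
            metadata = Parser().parsestr(handle.read().decode("utf-8"))
            return metadata["Version"]
    return None
```

A bare export of the work tree has no such file, which is one way the "can't figure out its own version" problem described earlier in this message shows up when the version is derived from Git tags.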
Re: [RFC] DPT Policy: Canonise recommendation against PyPi-provided upstream source tarballs
Hi Simon,

Simon McVittie writes:

> On Fri, 25 Jun 2021 at 16:42:42 -0400, Nicholas D Steeves wrote:
>> I feel like there is probably consensus against the use of PyPi-provided
>> upstream source tarballs in preference for what will usually be a GitHub
>> release tarball
>
> This is not really consistent with what devref says:
>
>     The defining characteristic of a pristine source tarball is that the
>     .orig.tar.{gz,bz2,xz} file is byte-for-byte identical to a tarball
>     officially distributed by the upstream author
>
>     https://www.debian.org/doc/manuals/developers-reference/best-pkging-practices.en.html#best-practices-for-orig-tar-gz-bz2-xz-files
>
> Sites like Github and Gitlab that generate tarballs from git contents
> don't (can't?) guarantee that the exported tarball will never change -

I agree 100%

> I'm fairly sure `git archive` doesn't try to make that guarantee - so it
> seems hard to say that the official source code release artifact is always
> the one that appears as a side-effect of the upstream project's git hosting
> platform.
>

Also agreed 100%. This line of inquiry is actually why I think using upstream tags is best, but even then it's possible upstream will delete the tag and push a new one. Does PyPi provide immutable releases? If so, yes, I agree there's a strong argument to be made for using PyPi vis à vis DevRef within a DPT context where upstream git tags (and history) are not merged :-)

> That doesn't *necessarily* mean that the equivalent of a `git archive`
> is always the wrong thing (and indeed there are a lot of packages where
> it's the only reasonably easily-obtained thing that is suitable for our
> requirements), but I don't think it's as simple or clear-cut as you
> are implying.
>

Also agreed 100%, but I've learned people often look at comprehensive proposals as tldr, so I wanted to try a discussion-based approach ;-)

> devref also says:
>
>     A repackaged .orig.tar.{gz,bz2,xz} ...
>     should, except where impossible
>     for legal reasons, preserve the entire building and portability
>     infrastructure provided by the upstream author. For example, it is
>     not a sufficient reason for omitting a file that it is used only
>     when building on MS-DOS. Similarly, a Makefile provided by upstream
>     should not be omitted even if the first thing your debian/rules does
>     is to overwrite it by running a configure script.
>
> I think devref goes too far on this - for projects where the official
> upstream release artifact contains a significant amount of content we
> don't want (convenience copies, portability glue, generated files, etc.),
> checking the legal status of everything can end up being more work than
> the actual packaging, and that's work that isn't improving the quality of
> our operating system (which is, after all, the point).
>

I agree, and will support a proposal to modify DevRef to this end, because as far as I know the source tarballs in our archive aren't part of a secondary project to archive upstream tarballs as-released (eg: a kind of "ark" or source-bank, like a seed-bank, for DFSG software)...but maybe that is a secondary objective?

> However, PyPI sdist archives are (at least in some cases) upstream's
> official source code release artifact, so I think a blanket recommendation
> that we ignore them probably goes too far in the other direction.
>
> I'd prefer to mention both options and have "use your best judgement,
> like you have to do for every other aspect of the packaging" as a
> recommendation :-)
>

So far the text I've been able to come up with to address this is something like:

    In some cases PyPI sdist archives may be the most appropriate
    upstream source tarball (then your "use your best judgement..."
    as a conclusion) :-)

It would be really nice to include technical reasons that describe cases where PyPI is more appropriate, but I don't know any.
My experience in Debian thus far has been that "what most closely fulfils Debian ideals" is always preferable to upstream preference. Yes, that's arguably insular, but I thought there was consensus on this.

And yes, I agree moderate is better, but I must sadly confess ignorance to the technical reasons why PyPI is sometimes more appropriate. Without technical reasons it seems like a case of ideological compromise (based on the standards I've been mentored to and the feedback I've received over the years).

Thanks!
Nicholas
Re: [RFC] DPT Policy: Canonise recommendation against PyPi-provided upstream source tarballs
Hi Scott,

Scott Talbert writes:

> On Fri, 25 Jun 2021, Jeremy Stanley wrote:
>
[snip]
> I tend to agree about PyPI being the official releases for a lot of
> projects. "GitHub tarballs" also tend to include other undesirable stuff
> for distribution like upstream CI/CD configuration files, etc.
>

Would you please expand on "etc"? It seems like it would be reasonable to exclude CI/CD files via the watch file, for reasons similar to those for excluding an upstream-provided debian/ subdir.

Thanks!
Nicholas
Re: [RFC] DPT Policy: Canonise recommendation against PyPi-provided upstream source tarballs
Hi Jeremy,

Thank you for your comments, reply follows inline:

Jeremy Stanley writes:

> On 2021-06-25 16:42:42 -0400 (-0400), Nicholas D Steeves wrote:
>> I feel like there is probably consensus against the use of PyPi-provided
>> upstream source tarballs in preference for what will usually be a GitHub
>> release tarball, so I made an MR to this effect (moderate recommendation
>> rather than a "must" directive):
>>
>> https://salsa.debian.org/python-team/tools/python-modules/-/merge_requests/16
>>
>> Comments, corrections, requests for additional information, and
>> objections welcome :-) I'm also curious if there isn't consensus by
>> this point and if it requires further discussion
>
> I work on a vast ecosystem of Python-based projects which consider
> the sdist tarballs they upload to PyPI to be their official release
> tarballs, because they encode information otherwise only available
> in revision control metadata (version information, change history,
> copyright holders). The proposal is somewhat akin to saying that a
> tarball created via `make dist` is unsuitable for packaging.
>

A recommendation is non-binding, and the intent of this proposal is to say that the most "sourceful" form of source is the *most* suitable for Debian packages. The inverse of this is that `make dist` is less suitable for Debian packages. Neither formulation of this premise applies to a scope outside of Debian. In other words, just because a particular form of source packaging and distribution is not considered ideal in Debian does not in any way comment on its suitability for other purposes. Would you prefer to see a note like "PyPi is a good thing for the Python ecosystem, but sdists are not the preferred form of Debian source tarballs"?

It's also worth mentioning that upstream's "official release" preference is not necessarily relevant to a Debian context. Take for example the case where upstream exclusively supports a Flatpak and/or Snap package...
> "GitHub tarballs" (aside from striking me as a blatant endorsement
> of a wholly non-free software platform) lack this metadata, being
> only a copy of the file contents from source control while missing
> other relevant context Git would normally provide.

"GitHub [and Gitlab!] tarballs" are fairly well understood, and it takes fewer words to talk about them than to write about integrating a merging or rebasing tag-based workflow (possibly with excluded files with a merge driver) in a team that has standardised on git-buildpackage. I might have out-of-date info, btw. Would it still upset the DSA if DPT packages' watch files polled using the lightweight git driver? I also prefer to have upstream git history :-)

Thinking about an ideal solution, and the interesting PBR case, I remember that gbp is supposed to be able to associate gbp tags with upstream commits (or possibly tags), so maybe it's also possible to do this:

1. When gbp import-orig finds a new release
2. Fetch upstream remote as well
3. Run PBR against the upstream release tag
4. Stage this[ese] file[s]
5. Either append them to the upstream tarball before committing to the
   pristine-tar branch, or generate the upstream tarball from the
   upstream branch (intent being that the upstream branch's HEAD should
   be identical to the contents of that tarball)
6. Gbp creates upstream/x.y tag
7. Gbp merges to Debian packaging branch.

Cheers,
Nicholas
Re: [RFC] DPT Policy: Canonise recommendation against PyPi-provided upstream source tarballs
On Fri, 25 Jun 2021 at 16:42:42 -0400, Nicholas D Steeves wrote:
> I feel like there is probably consensus against the use of PyPi-provided
> upstream source tarballs in preference for what will usually be a GitHub
> release tarball

This is not really consistent with what devref says:

    The defining characteristic of a pristine source tarball is that the
    .orig.tar.{gz,bz2,xz} file is byte-for-byte identical to a tarball
    officially distributed by the upstream author

    https://www.debian.org/doc/manuals/developers-reference/best-pkging-practices.en.html#best-practices-for-orig-tar-gz-bz2-xz-files

Sites like Github and Gitlab that generate tarballs from git contents don't (can't?) guarantee that the exported tarball will never change - I'm fairly sure `git archive` doesn't try to make that guarantee - so it seems hard to say that the official source code release artifact is always the one that appears as a side-effect of the upstream project's git hosting platform.

That doesn't *necessarily* mean that the equivalent of a `git archive` is always the wrong thing (and indeed there are a lot of packages where it's the only reasonably easily-obtained thing that is suitable for our requirements), but I don't think it's as simple or clear-cut as you are implying.

devref also says:

    A repackaged .orig.tar.{gz,bz2,xz} ... should, except where impossible
    for legal reasons, preserve the entire building and portability
    infrastructure provided by the upstream author. For example, it is
    not a sufficient reason for omitting a file that it is used only
    when building on MS-DOS. Similarly, a Makefile provided by upstream
    should not be omitted even if the first thing your debian/rules does
    is to overwrite it by running a configure script.
I think devref goes too far on this - for projects where the official upstream release artifact contains a significant amount of content we don't want (convenience copies, portability glue, generated files, etc.), checking the legal status of everything can end up being more work than the actual packaging, and that's work that isn't improving the quality of our operating system (which is, after all, the point). However, PyPI sdist archives are (at least in some cases) upstream's official source code release artifact, so I think a blanket recommendation that we ignore them probably goes too far in the other direction. I'd prefer to mention both options and have "use your best judgement, like you have to do for every other aspect of the packaging" as a recommendation :-) smcv
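The "byte-for-byte identical" criterion quoted from devref in this message can be checked mechanically. The sketch below compares SHA-256 digests of two already opened binary streams; the helper names `sha256_of` and `is_pristine` are made up for the example, and how the upstream copy is obtained (and whether its signature is verified) is outside its scope.

```python
import hashlib
import io
from typing import BinaryIO


def sha256_of(stream: BinaryIO, chunk_size: int = 65536) -> str:
    """Hex SHA-256 digest of a binary stream, read in chunks."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def is_pristine(orig: BinaryIO, upstream: BinaryIO) -> bool:
    """True when the .orig tarball and the upstream release have equal digests."""
    return sha256_of(orig) == sha256_of(upstream)


# In-memory stand-ins for two tarballs; real use would open files in "rb" mode.
print(is_pristine(io.BytesIO(b"same bytes"), io.BytesIO(b"same bytes")))
```

A tarball regenerated on the fly by a hosting platform passes this check only as long as the platform keeps producing identical bytes, which is the guarantee the message says is not made.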
Re: [RFC] DPT Policy: Canonise recommendation against PyPi-provided upstream source tarballs
On Fri, 25 Jun 2021, Jeremy Stanley wrote:

>> I feel like there is probably consensus against the use of PyPi-provided
>> upstream source tarballs in preference for what will usually be a GitHub
>> release tarball, so I made an MR to this effect (moderate recommendation
>> rather than a "must" directive):
>>
>> https://salsa.debian.org/python-team/tools/python-modules/-/merge_requests/16
>>
>> Comments, corrections, requests for additional information, and
>> objections welcome :-) I'm also curious if there isn't consensus by
>> this point and if it requires further discussion
>
> I work on a vast ecosystem of Python-based projects which consider
> the sdist tarballs they upload to PyPI to be their official release
> tarballs, because they encode information otherwise only available
> in revision control metadata (version information, change history,
> copyright holders). The proposal is somewhat akin to saying that a
> tarball created via `make dist` is unsuitable for packaging.
>
> "GitHub tarballs" (aside from striking me as a blatant endorsement
> of a wholly non-free software platform) lack this metadata, being
> only a copy of the file contents from source control while missing
> other relevant context Git would normally provide.

I tend to agree about PyPI being the official releases for a lot of projects. "GitHub tarballs" also tend to include other undesirable stuff for distribution like upstream CI/CD configuration files, etc.

Scott
Re: [RFC] DPT Policy: Canonise recommendation against PyPi-provided upstream source tarballs
On 2021-06-25 16:42:42 -0400 (-0400), Nicholas D Steeves wrote:
> I feel like there is probably consensus against the use of PyPi-provided
> upstream source tarballs in preference for what will usually be a GitHub
> release tarball, so I made an MR to this effect (moderate recommendation
> rather than a "must" directive):
>
> https://salsa.debian.org/python-team/tools/python-modules/-/merge_requests/16
>
> Comments, corrections, requests for additional information, and
> objections welcome :-) I'm also curious if there isn't consensus by
> this point and if it requires further discussion

I work on a vast ecosystem of Python-based projects which consider the sdist tarballs they upload to PyPI to be their official release tarballs, because they encode information otherwise only available in revision control metadata (version information, change history, copyright holders). The proposal is somewhat akin to saying that a tarball created via `make dist` is unsuitable for packaging.

"GitHub tarballs" (aside from striking me as a blatant endorsement of a wholly non-free software platform) lack this metadata, being only a copy of the file contents from source control while missing other relevant context Git would normally provide.
--
Jeremy Stanley