So, first, I owe you and the FTP team an apology. I was totally convinced that there had been public discussions of tag2upload involving the FTP team more recent than the 2019 one. I either got confused with other discussions or ran into the increasingly common problem of thinking that things that happened ten years ago happened only two years ago. Regardless, I really should have checked, I didn't, I made an incorrect assumption, and I apologize.

I would not, as a general rule, assume that any delegate decision made five years ago still holds today, and if I had not made erroneous assumptions about the timeline, I would have phrased several of my messages here differently. This is entirely my fault. So far, from this thread, it looks like the decision from 2019 may still stand, but I think there are still places to explore.

Joerg Jaspert <[email protected]> writes:

> On 17261 March 1977, Russ Allbery wrote:
>> Why is this your red line? Is it only that you don't want to add
>> another system to the trusted set, or is there something more specific
>> that you're concerned about?

> There ought to be one point that is doing this step, not many, yes.
> Includes that it is the delegated work and task description of FTPMaster
> to do this, though that can be addressed by either us ending up running
> it, or adjusting delegations. Not sure the latter ends up with happy
> people, but is one existing way.

Elsewhere in this thread, Jessica Clarke made the excellent suggestion that perhaps the authentication check concern could be resolved by dak providing an API for performing the authentication and authorization check. I am embarrassed that I didn't think of that; thank you very much to Jessica for that suggestion. That gives me some hope that this point has a relatively neat solution, so I'm going to focus on exactly what dak needs the uploader signature to cover in order to accept the package.
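To illustrate what I understand that suggestion to mean (this is purely hypothetical; dak has no such endpoint today, and every name below is invented), the tag2upload service could ask dak something like:

    $ # Entirely hypothetical API: ask dak whether this key may upload
    $ # this source package.  Host, path, and parameters are invented.
    $ curl -s 'https://dak.example.org/api/upload-authorized' \
        --data-urlencode 'source=foo' \
        --data-urlencode 'fingerprint=0123456789ABCDEF0123456789ABCDEF01234567'
    {"authorized": true}

The point is only that the authorization decision would stay inside dak, with tag2upload acting as a client rather than duplicating the keyring and ACL logic.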
> Also, currently we have the nicety that we store all signatures directly
> besides the source package, available for everyone to go and check.
> Linking back to the actual Uploader, not to a random service key. You
> can take that, run a gpgv on it and via the checksums of the files then
> see that, sure, this is the code that the maintainer took and uploaded.
> You do *not* need to trust any other random key on that. Not that of
> tag2upload. *AND* not that of FTPMaster.

The dgit-repos server similarly archives the signed Git tag with the Git tree over which it is a signature, ensuring that this is independent of Salsa, where the tag could potentially be deleted by someone. This is not in the archive, of course, but I don't see any technical reason why some version of that data couldn't also be uploaded to the archive if one wanted to use the archive as a highly distributed backup of the dgit-repos server.

There is, however, the long-standing concern about any variation on the 3.0 (git) source package format: the Git tree the maintainer signed may contain non-free code somewhere in its history. So here too, I'm not sure that this is inherently a blocker, although in the past the FTP team has been reluctant to include in the archive the data that is required to preserve a complete record of what is signed by a Git tag. (One obvious potential solution is to only put a shallow clone in the archive, so you can verify the signature but some of the content-addressable store references are unresolved.)

> Unsure those are the right words. We want to have the uploader create a
> signature over the content they want to have appear in the archive. In a
> way, that this signature can be taken and placed beside the source, and
> then independently verified. *Currently* this is done using .dsc files.
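For anyone who hasn't done that independent check by hand, it looks roughly like this today (a sketch; the file names are placeholders, and the keyring path assumes the debian-keyring package is installed):

    $ gpgv --keyring /usr/share/keyrings/debian-keyring.gpg foo_1.2-1.dsc
    $ # then compare the output of this against the Checksums-Sha256
    $ # field of the now-verified .dsc:
    $ sha256sum foo_1.2.orig.tar.xz foo_1.2-1.debian.tar.xz

(dscverify from devscripts automates both steps.)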
Okay, so again I think it's easier to talk about specifics, so let me make this concrete by using myself as the use case. I use the git-debrebase workflow for maintaining most of my Debian packages. What this means, for those who aren't familiar with it, is that my workflow looks like the following. (This is idealized; I'm still migrating my packages fully to this workflow, so the specifics currently vary somewhat.)

1. I start a new package by creating a new Git branch based on the Git tag of the latest upstream release. I then add the debian/* directory with packaging files and commit that directly to the resulting branch.

2. I work on the package, freely making commits to both the debian/* files and the upstream source to fix problems and adjust the software for Debian. The only constraint that I have to follow is that I can't make a commit that changes both files in debian/* and files outside of debian/* at the same time. Other than that, I can treat this branch like a completely normal Git branch and do development like I would in any other Git repository, without doing anything special for the Debian packaging.

3. When upstream makes a new release, I can *rebase* my changes on top of the new upstream release rather than doing a merge, with all the messiness that a merge involves. For me, this is huge. I can drop changes that have been merged upstream, rework changes that need to be done differently based on upstream changes, and don't have to wrestle with a long and messy merge history of conflict resolutions that grows over time. Instead, I can always see a simple list of the changes that I've applied to the current upstream release. This is exactly the workflow that I use with development forks of non-Debian packages in Git. (I do have to remember to run git debrebase conclude here to make the magic work.)

4. When I'm ready to upload, currently I run dgit locally. dgit looks at my Git repository, finds all of the commits that modify the upstream source, extracts the commit metadata, creates nice patches based on those commits with proper metadata taken from the Git commit metadata, and uses them to construct a normal 3.0 (quilt) source package. Anyone working with the source package can treat it exactly like any other 3.0 (quilt) source package and has no need to care that I use the git-debrebase workflow. (See the sketch after this list for what steps 3 and 4 look like as commands.)

Making all of this work involves some Git trickery that I know some people dislike (for example, all of this is serialized as a sequence of Git changes that are fast-forwardable, *including the rebases*, which is dark magic), but for me this is an excellent workflow. The development experience matches my mental model of the relationship between the Debian packaging and the upstream code, and the Git trickery is all hidden from me behind a nice interface.
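To make steps 3 and 4 concrete, here is roughly what they look like as commands (a sketch; the version number is invented, and the details depend on how upstream tags releases):

    $ git debrebase new-upstream 1.3   # rebase my delta queue onto the 1.3 upstream tag
    $ # ...rework or drop individual commits as the rebase replays them...
    $ git debrebase conclude           # re-serialize the result in fast-forward form
    $ dgit push-source                 # build the 3.0 (quilt) source package and upload it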
Now, I would like to use tag2upload rather than running dgit locally to make the upload. I want to move my testing into Salsa CI so that my overall workflow more closely matches the way that I do all of my development in my day job. Salsa CI is great about not getting lazy and skipping test steps just because I am in a hurry to get a package uploaded, and I can capture every test that was useful and not have to remember to re-run it. (This is the part that I haven't done yet; I know I want to do it and have not yet found the time.)

What signed artifact do I need to provide so that the FTP team will be comfortable accepting my tag2upload-built source package? Note, importantly, that the source package contains things that are not present in any file in the working tree of a local Git checkout of my package. The patch descriptions, committer information, and related metadata are where they are supposed to be in Git: in the metadata of the corresponding Git commits, not in a file in my working tree. The transformation that puts that data into a 3.0 (quilt) source package is not rocket science, but it's not trivial either.

The signed artifact that I'm naturally providing is a signature across the entire Git tree, which includes all of the history and thus all of the data that goes into the source package. So everything that goes into the source package *is signed*, by me, when I trigger a tag2upload upload. The problem comes when dak wants to verify the correspondence between that data structure and the source package. It certainly can verify that my Git tag is valid, and it can verify that the tag specifies the correct source package, version, and so forth. But if it wants to verify that the construction of the debian/patches/* directory is correct, I think it would have to perform the same transformation on my Git history that dgit and tag2upload perform.

> I basically assume that the uploader *does* need to have their source
> locally, no matter what. (Their git cloned).

Yes, I agree. I don't think there's any way to avoid this: the source has to be on the same system as the key, or close to it in the case of secure key storage, in order for the uploader to sign it and know what they are signing.

> I also do assume that the uploader will build things, to see if the
> stuff they are going to "push to the archive" (and our users) actually
> does what they intended it to do - and to test it.

This is the assumption that I think is no longer valid given Salsa CI. It used to be that building locally was the only way to test a package; now we can do equally well, and often better, by letting Salsa CI do the hard work.

> Well, if the maintainers system is broken in, it makes no difference if
> a git tag or a dsc or whatever else is signed.

This is more true than I would like it to be. In the case of a Debian maintainer who doesn't have any sort of hardware key storage and does all their Debian development on the same system on which they read mail, browse the web, open random downloaded PDFs, try random software, and so on, I think it is simply true, and it's one of the things that I worry about with our existing security model.

However, I don't think this is *necessarily* true for all maintainers, and tag2upload creates the *possibility* of doing better. Whether we will take advantage of that possibility, I don't know. But creating a tag2upload tag requires GnuPG and Git and not much else, and other people can see exactly the Git contents that were signed. Better security models are possible even with *.dsc files, of course, but I think tag2upload opens the door to a few additional improvements, such as moving the source package construction off the maintainer's system, and, more importantly, it forces exactly the content that was signed to be uploaded to Salsa, which provides that data in a somewhat richer form that gives us some additional detection and tracing capabilities.
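And because the tag and the tree it covers are public, anyone can repeat the core check. A sketch, with a hypothetical repository and tag name, assuming the signer's key is in your local keyring:

    $ git clone https://salsa.debian.org/debian/foo.git && cd foo
    $ git verify-tag debian/1.2-1   # verify the uploader's OpenPGP signature on the tag
    $ # the tag object's first line names the signed commit, which via Git's
    $ # content hashing pins the tree and its entire history:
    $ git cat-file tag debian/1.2-1 | head -n1
    object 9fceb02...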
-- 
Russ Allbery ([email protected])             <https://www.eyrie.org/~eagle/>
