Re: Validating tarballs against git repositories
Otto Kekäläinen writes:

> On Tue, 2 Apr 2024 at 17:19, Jeremy Stanley wrote:

>> On 2024-04-02 16:44:54 -0700 (-0700), Russ Allbery wrote:
>> [...]

>>> I think a shallow clone of depth 1 is sufficient, although that's
>>> not sufficient to get the correct version number from Git in all
>>> cases.

>> [...]
>>
>> Some tools (python3-reno, for example) want to inspect the commits
>> and historical tags on branches, in order to do things like
>> assembling release notes documents. I don't know if any reno-using
>> projects packaged in Debian get release notes included, but if they
>> do then shallow clones would break that process. The python3-pbr

> You could use --depth=99 perhaps?

> Usually the difference of having depth=1 or 99 isn't that big unless
> there was a recent large refactoring. Git repositories that are very
> big (e.g. LibreOffice, MariaDB) have hundreds of thousands of
> commits, and by doing a depth=99 clone you avoid 99.995% of the
> history, and in projects where the changelog/release notes are based
> on git commits, 99 commits is probably enough.

I suppose that's *possible*, but I'd want to see some concrete survey
evidence to support it. I'm fairly sure that 99 commits would be
insufficient to build a change log in all cases even for some of my
small packages on which I'm the only developer, let alone for a project
with any significant commit volume and a policy of separating unrelated
changes into separate commits.

My guess is that the sweet spots are --depth=1 and a full checkout, that
it's not generally possible to tell in advance which one a given package
needs (in other words, it's best handled as a configuration option), and
that it's probably not worth the effort to mess around with any
intermediate depth. I suspect we'll find that the vast majority of
packages work fine with --depth=1, and the remaining cases should just
use a full checkout to avoid creating fragile assumptions that may work
today and break tomorrow.
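For reference, here is a quick illustration of the two sweet spots,
using a throwaway local repository (the repository and paths below are
invented for the demonstration):

```shell
set -e
tmp=$(mktemp -d)

# Build a toy "upstream" repository with three commits.
git init -q "$tmp/upstream"
for i in 1 2 3; do
    git -C "$tmp/upstream" -c user.name=Demo -c user.email=demo@example.org \
        commit -q --allow-empty -m "commit $i"
done

# A --depth=1 clone sees only the newest commit...
git clone -q --depth=1 "file://$tmp/upstream" "$tmp/shallow"
git -C "$tmp/shallow" rev-list --count HEAD    # prints 1

# ...but can be upgraded to a full checkout later if the package
# turns out to need the whole history.
git -C "$tmp/shallow" fetch -q --unshallow
git -C "$tmp/shallow" rev-list --count HEAD    # prints 3
```

The --unshallow upgrade path is one reason a boolean configuration
option (shallow or full) seems more robust than guessing at an
intermediate depth.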
-- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Representing Debian Metadata in Git
Chris Hofstaedtler writes:

> My *feeling* is we should do the opposite - that is, represent less
> Debian stuff in git, and especially do it in less Debian-specific
> ways. IOW, no git extensions, no setup with multiple branches that
> contain more or less unrelated things, etc.

+1

I think this is particularly important for attracting new contributors
and easing the onboarding process. There are a lot of odd
Debian-specific things that people have to learn because they're
necessary to make Debian work. I am dubious that the Git representation
is one of them, and would rather continue down the path of providing
Debian tools and processes that reduce the delta between how Debian
packaging uses Git and how most free software development uses Git.

--
Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: RFC: packages that can be installed by the user but not used as build dependencies
Lisandro Damián Nicanor Pérez Meyer writes:

> So, what about if we could have [meta] packages that can be installed
> by the user but not used as Build-Depends entries? Please note that
> for the moment I'm targeting more at the idea itself rather than at
> the implementation (but I'll certainly love to know if you have an
> idea on how this could be implemented).

> At one point I thought of adding a Lintian test checking for this
> kind of usage, but first and foremost I would like to know if you
> think this is a viable/acceptable idea, maybe even adding a special
> section in our policy.

I could have sworn that we already had tags like that in Lintian.
Certainly, this is a concept that has existed in Debian for some time.
There have always been metapackages or other similar cases that are only
intended for end users and would make no sense as build dependencies,
such as all of the task-* packages.

Lintian feels like the right place to put a test like this. If there are
dependencies like that which could potentially cause serious issues,
those could even be an auto-reject tag. I'm not sure that Policy would
have much to say about this unless we need some mechanism for labeling
such packages other than an MR to Lintian. The important information is
the list of packages that shouldn't be used this way, and the hard part
is probably gathering that list.

--
Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Accepting DEP14?
Chris Hofstaedtler writes:

> "latest" is ill-named. What do you expect to find in a branch that's
> called debian/latest?

> Packaging for unstable? For experimental? What if both evolve in
> parallel? Yes, some packages do that.

We discussed this a lot during the drafting of DEP14, and the reason the
standard allows either convention is that the right choice depends on
the package: there were two separate perspectives with no consensus that
one was universally better.

Maintainers of packages that normally upload to unstable, temporarily
move into experimental during freezes while considering it the same line
of development, and then move back into unstable after the release
preferred debian/latest, since it matched how they thought about the
line of development. People who maintained separate unstable and
experimental lines of development preferred debian/unstable and
debian/experimental.

Personally, I use debian/unstable but do experimental development in
that same branch if it's "targeting unstable," which is either the best
or the worst of both worlds, depending on your perspective.

--
Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: RFC: Sensible-editor sensible-utils alternative and update
Simon McVittie writes:

> The approach to this that will work consistently is to launch the
> handler asynchronously (in the background), and not attempt to find
> out whether it has exited or not. So for example an interactive shell
> script might do something like this:
>
> #!/bin/bash
> # note that disown is a bashism
> xdg-open "$document" &
> disown $!
> echo "Press Enter when you have finished editing $document..."
> read

What this is telling me is that ideally someone should tighten the
definition of EDITOR in Policy 11.4, which is the specification
satisfied by sensible-editor, to make it clear that GUI editors with
these sorts of properties are not valid things to point EDITOR at
unless flags are present to make them behave in a way that satisfies
the expectations of programs that use EDITOR.

I don't have a strong opinion on the merits of trying to figure out how
to invoke the editor with the proper flags to make it follow the
expectations of EDITOR if EDITOR is not set, but we do need to be
careful not to invoke programs that would cause, e.g., git commit
--amend to immediately exit with no changes to the commit message, and
to do that we probably need to write down what those expectations are.
I think the Policy language was written at a time when we just assumed
there was an obvious way for editors to behave that didn't include
things like backgrounding themselves.

--
Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
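As a concrete illustration of the "flags" caveat, here is a hedged
shell-profile sketch (the editor choices are only examples of editors
that document a foreground flag, not recommendations):

```shell
# GUI editors generally need an explicit "stay in the foreground" flag
# before they satisfy the blocking behavior that EDITOR consumers such
# as git commit expect.
export EDITOR="gvim -f"        # -f: do not fork into the background
# export EDITOR="code --wait"  # VS Code's equivalent flag
```

Without such a flag, the editor process returns immediately and the
calling program sees an empty or unchanged file, which is exactly the
git commit --amend failure mode described above.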
Re: Bug#1075905: ITP: python-fraction -- Fraction carries out all the fraction operations including addition, subtraction, multiplication, division, reciprocation
Yogeswaran Umasankar writes:

> As I look further, it appears that standard Python types such as
> float or decimal.Decimal do not provide exact representation of
> rational numbers (fractions) without potential loss of precision. The
> 'fraction' package seems to yield exact results because those
> functions work directly on fractions (to my limited understanding).

> decimal.Decimal is better than float, but it only extends to
> arbitrary-precision decimal arithmetic, not exact representation of
> rational numbers. Given that these libraries could potentially serve
> as dependencies for tensor-related packages and beyond, should we
> consider bringing in 'fraction' or restrict ourselves to float (which
> is the fallback in moarchiving if fraction is unavailable)?

I think the suggestion is to use the Python standard library module
"fractions" specifically:

https://docs.python.org/3/library/fractions.html

--
Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
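A short demonstration that the stdlib module already provides the exact
rational arithmetic being asked for, with no extra dependency:

```shell
python3 - <<'EOF'
from fractions import Fraction

# float accumulates binary rounding error...
print(0.1 + 0.2 == 0.3)                                      # False

# ...while Fraction stays exact, including when parsing decimal strings.
print(Fraction("0.1") + Fraction("0.2") == Fraction("0.3"))  # True
print(Fraction(1, 3) + Fraction(1, 6))                       # 1/2
EOF
```

Since fractions has been in the standard library since Python 2.6,
packaging a third-party 'fraction' module for this use case should not
be necessary.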
Re: Q: Ubuntu PPA induced version ordering mess.
Alec Leamas writes:

> So, at least three possible paths:
>
> 1. Persuade users to uninstall PPA packages before installing
>    official packages, and also generate generation 2 PPA packages
>    with sane versions like 5.10.x.
> 2. Use versions like 9000.5.10, 9000.5.12, etc.
> 3. Use an epoch.

> Of these I would say that 1. is a **very** hard sell upstream. Users
> are used to just updating and will try, fail, and cause friction.
> 2. and 3. both add something which must be kept forever. Given this
> choice I tend to think that the epoch is the lesser evil, mostly
> because the package version could match the "real" version.

I would use an epoch. It sounds like the PPA was in serious use by the
intended users and they're going to be switching to your packages. You
are trying to make that easy and avoid obvious and easily-foreseen
problems, and I think that's good: that's exactly what a maintainer
should do. If it were just a handful of people whom you could walk
through the transition, that might be different, but it sounds like
that's not the case.

2 is a hard sell to upstream for psychological reasons. Maybe it
shouldn't be, maybe upstream should be fine with this, but as you say
upstream in practice isn't going to be fine with this, and honestly if I
were upstream I probably wouldn't be either, even if I knew I should be.
It's hard enough to get people to use version numbers properly. Getting
them to use a "weird" version number that their users might be confused
by for the rest of time is going to be difficult. Changing the version
number only in Debian is even worse: that's just horribly confusing for
users and will be forever, and the confusion is going to affect upstream
as well. Basically, you'd be burning a lot of social capital with
upstream for no really good reason, and you probably still wouldn't be
able to convince them. I don't think it's worth it.

I would just use the epoch.
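For what it's worth, the resulting changelog entry is unremarkable (the
package name, version, and maintainer below are invented for
illustration):

```
foo (1:5.10.1-1) unstable; urgency=medium

  * New upstream release. Add an epoch so that this version supersedes
    the 9000.x versions previously published in the upstream PPA.

 -- A. Maintainer <maint@example.org>  Tue, 02 Apr 2024 12:00:00 +0000
```

Because a missing epoch is treated as epoch 0 and epochs are compared
first, 1:5.10.1-1 sorts above 9000.5.12-1 (dpkg --compare-versions
'1:5.10.1-1' gt '9000.5.12-1' succeeds), which is exactly the upgrade
property you need.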
I know people really hate them and they have a few weird and annoying
properties, but we have a bunch of packages with epochs and it's mostly
fine. It's something you'll have to keep working around forever, but not
in a way that's really that hard to deal with, IMO. (I would also warn
upstream that you're doing that, so that they know what the weird "1:"
thing means in bug reports in the future and why it's there.)

This feels like exactly the type of situation that epochs were designed
for: upstream was releasing packages with weird version numbers and now
they're effectively going back to normal version numbers that are much
smaller. In other words, to quote Policy, "situations where the upstream
version numbering scheme changes." Yes, in this case it was only in
their packages and not in their software releases, but that still counts
when they have an existing user base that has those packages installed.

--
Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Reviving schroot as used by sbuild
PICCA Frederic-Emmanuel writes:

>> Ah, thank you, I didn't realize that existed. That sounds like a
>> nice generalization of the file system snapshot approach.

> I think that this is how the sbuild-debian-developer-setup script
> sets up chroots.

Yeah, I think all my contribution to this thread accomplished was to
demonstrate that I set up sbuild years ago based on a wiki article about
btrfs and don't know what I'm talking about. :) Apologies for that.

--
Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Reviving schroot as used by sbuild
Guillem Jover writes:

> I manage my chroots with schroot (but not via sbuild, for dog fooding
> purposes :), and use type=directory and union-type=overlay so that I
> get a fast and persistent base, independent of the underlying
> filesystem, with fresh instances per session. (You can access the
> base via the source: names.) I never liked the type=file stuff, as
> it's slow to setup and maintain.

Ah, thank you, I didn't realize that existed. That sounds like a nice
generalization of the file system snapshot approach.

--
Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Reviving schroot as used by sbuild
Simon McVittie writes:

> Persisting a container root filesystem between multiple operations
> comes with some serious correctness issues if there are "hooks" that
> can modify it destructively on each operation: see
> <https://bugs.debian.org/499014> and
> <https://bugs.debian.org/994836>. As a result of that, I think the
> only model that should be used in new systems is to have some concept
> of a session (like schroot type=file, but unlike schroot
> type=directory) so that those "hooks" only run once, on session
> creation, preventing them from arbitrarily reverting/overwriting
> changes that are subsequently made by packages installed into the
> chroot/container (for example dbus' creation of the messagebus
> uid/gid in #499014, and exim4's creation of Debian-exim in #994836).

I'm not entirely sure that I'm following the nuances of this discussion,
so this may be irrelevant, but I think type=btrfs-snapshot provides the
ideal properties for container file systems. It unfortunately requires
file system support and therefore cannot be used unless you've already
embraced a file system with subvolumes, but if you have, you get all of
the speed of a persistent container root file system with none of the
correctness issues, because you get a fresh (and almost instant) clone
of a canonical root file system that is discarded after each build. I
use that in combination with a cron job that updates the source
subvolume daily to ensure that it's fully patched.

Unfortunately, there's no way that we can rely on this, but it would be
nice to continue to support it for those who are already using a
supported underlying file system.

--
Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
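For readers who haven't used it, a schroot.conf entry for that setup
looks roughly like this (chroot name and paths are invented; treat the
keys as a sketch against schroot.conf(5)):

```
[unstable-btrfs]
type=btrfs-snapshot
description=Debian unstable, btrfs snapshot per session
btrfs-source-subvolume=/srv/chroots/unstable
btrfs-snapshot-directory=/srv/chroots/snapshots
users=builder
root-users=builder
```

Each session snapshots the source subvolume into the snapshot
directory, so setup hooks run against a fresh clone that is discarded
when the session ends.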
Re: About i386 support
r...@neoquasar.org writes:

> Then it's not a problem in the first place. If you can't reproduce a
> bug with a reasonable effort, then it is unconfirmed and you can stop
> worrying about it.

I think you're confusing two different types of reproduction.
Architecture porting bugs are often hardware-specific. The bug may be
100% reproducible on one instance of the architecture, an instance that
you do not own and do not have access to. So the package is reliably
broken for a user trying to use that architecture, and yet the porter
has limited ability to triage or debug it because they don't have
access to that hardware.

This is one of the reasons why projects (not just Debian) drop support
for architectures. Once the *maintainers* no longer have easy access to
instances of an architecture, it's very hard to support, even if users
keep trying to use that architecture and run into problems that are
reproducible for them.

That's the first hurdle. The second hurdle is that frequently the cause
of these problems is deep inside the compiler, the kernel, or some
other complex piece of upstream code. There are a very limited number
of people who have the ability to track down and fix problems like
that, since doing so can require a lot of toolchain expertise. It's not
a simple thing to commit to.

Debian relies fairly heavily on a whole ecosystem of upstream
developers to do a lot of the difficult work of supporting
architectures, including the kernel, GCC, binutils, etc. If that
ecosystem stops supporting an architecture, it will be very difficult
for Debian to keep supporting it, and doing so usually requires the
people interested in keeping that architecture working to also become
upstream kernel, GCC, etc. developers.

--
Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: File descriptor hard limit is now bumped to the kernel max
Simon McVittie writes:

> On Thu, 06 Jun 2024 at 18:39:15 +0200, Marco d'Itri wrote:

>> Something did, because inn would start reporting ~1G available fds
>> and then explode, and that patch solved the issue. :-)

> It might be worthwhile to try to track down what larger component did
> this, because inheriting a larger rlim_cur without opt-in can also
> break users of select(2) as described in
> <https://0pointer.net/blog/file-descriptor-limits.html>.

I took a quick look at the old INN source and didn't see anything
obvious. I was half-expecting it to do something like set the soft
limit to the hard limit (that sounds like a very INN sort of thing to
do), but if so, I couldn't find it in a quick search.

--
Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
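For anyone who wants to hunt for this pattern themselves, the suspected
behavior is easy to reproduce; this is an illustration of the pattern,
not code from INN:

```shell
python3 - <<'EOF'
import resource

soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("before:", soft, hard)

# The "very INN sort of thing": raise the soft fd limit to the hard
# limit. Every child inherits the large rlim_cur, which breaks select(2)
# users once descriptors exceed FD_SETSIZE (usually 1024).
resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))

soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
assert soft == hard
print("after:", soft, hard)
EOF
```

Grepping a source tree for setrlimit calls that pass rlim_max as
rlim_cur is a reasonable way to find the larger component responsible.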
Re: DEP17 /usr-move: debootstrap set uploaded
Marc Haber writes:

> Helmut Grohne wrote:

>> Thanks for bearing with me and also thanks to all the people
>> (release team and affected package maintainers in particular) who
>> support this work.

> Thank you for doing this work. I have rarely seen a change of this
> magnitude in Debian that was managed on this professional level. I
> especially praise the way you have communicated the progress.

100% agreed. The care and excellence that you've brought to this work
has been exceptional.

--
Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Make /tmp/ a tmpfs and cleanup /var/tmp/ on a timer by default [was: Re: systemd: tmpfiles.d not cleaning /var/tmp by default]
Matthew Garrett writes:

> On Mon, May 06, 2024 at 07:42:11AM -0700, Russ Allbery wrote:

>> Historically, deleting anything in /var/tmp that hadn't been
>> accessed in over seven days was a perfectly reasonable and typical
>> configuration. These days, we have the complication that it's fairly
>> common to turn off atime updates for performance reasons, which
>> makes it a bit harder to implement that policy when /var/tmp isn't
>> its own partition and thus inherits that setting from the rest of
>> the system.

> Apologies for being a bit late to this, but is this true?
> relatime-type setups will still update atime if the time between the
> previous update and the access is larger than some threshold, so you
> lose some degree of granularity but the rough policy should still
> apply.

You are correct, and I completely forgot about that.

--
Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: MBF: drop dependencies on system-log-daemon
Simon McVittie writes:

> I know fail2ban and logcheck do read plain-text logs (although as
> mentioned, fail2ban already has native Journal-reading support too),
> and I would guess that fwlogwatch, snort and xwatch probably also
> read the logs.

logcheck also has native journal-reading support. Note that its
dependency is only Suggests. I have not checked whether that dependency
is there for some other reason.

--
Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Documenting packaging workflows
Johannes Schauer Marin Rodrigues writes:

> I would be *very* interested in more in-depth write-ups of the
> workflows other DDs prefer to use, how they use them and what they
> think makes them better than the alternatives.

> Personally, I start packaging something without git; once I'm
> satisfied I use "gbp import-dsc" to create a packaging git with
> pristine-tar (and that will *not* have DEP14 branches and it will use
> "master" instead of "main"), and then I push that to salsa and do
> more fixes until the pipeline succeeds and lintian is happy. My
> patches are managed using quilt in d/patches, and upstream git is not
> part of my packaging git. I upload using "dgit --quilt=gbp
> push-source".

> Would another workflow make me more happy? Probably! But how would I
> get to know them or get convinced that they are better? Maybe I'm
> missing out on existing write-ups or video recordings which explain
> how others do their packaging and why it is superior?

One of the lesser-known things found in the dgit package is a set of
man pages describing several packaging workflows. These certainly don't
exhaust the packaging workflows that people use (for one thing, they
are designed to explain how to use dgit in each workflow, and thus of
course assume that you want to do that), but they're both succinct and
fairly thorough, and I found reading them very helpful. dgit(1) has a
list at the start.

dgit-maint-debrebase(7) is the workflow that I now use, pretty much
exactly as described. The primary thing that I like about it is that I
never have to deal with externalized patches or with quilt (I used
quilt for years and have developed a real dislike for it and its odd
quirks -- I would rather only deal with Git's odd quirks), and I can
use a git-rebase-like command in much the same way that I routinely use
rebase for feature branches when doing upstream development.
Then some Git magic happens behind the scenes to make this safe, and
while I don't understand the details, it has always worked fine, so I
don't really care how it works. :)

I like having a Git history from as early in the process as possible
and I want the upstream Git history to refer to while I work on
packaging (and want to be able to cherry-pick upstream commits), so I
generally start with a tagged release from the upstream Git repository,
create a branch based on it, and start writing and committing debian/*
files.

--
Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: finally end single-person maintainership
Salvo Tomaselli writes:

> If the debian/ directory is on salsa, but the rest of the project is
> somewhere else, then this no longer works. I have to tag in 2
> different places, I have 2 different repositories to push to, and so
> on.

For what it's worth, what I do for the packages for which I'm also
upstream is just add Salsa as another remote and, after I upload a new
version of the Debian package, push to Salsa as well (yes, including
all the upstream branches; why not, the Debian branches are based on
them anyway, so it's not much more space). One of these days I'll get
CI set up properly, and then it will be worthwhile to push to Salsa
*before* I upload the package and let it do some additional checking.

It's still an additional step, and I still sometimes forget to do it,
but after some one-time setup, it's a fairly trivial amount of work.
It's more work to accept a merge request on Salsa and update the
repositories appropriately, since there are two repositories in play,
but in that case I'm getting a contribution that I might not have
gotten otherwise, so to me that seems worth it.

I used to try to keep the debian directory in a separate repository, or
to keep the Debian Git branches in a separate repository, and all of
that was just annoying and tedious and didn't feel like it accomplished
much. Just pushing the same branches everywhere is easy and seems to
accomplish the same thing.

--
Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
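The one-time setup is nothing exotic; here is a sketch using throwaway
local repositories standing in for the upstream host and Salsa (the
repository names and the debian/latest branch are invented for the
demonstration):

```shell
set -e
tmp=$(mktemp -d)

# Bare repository standing in for the Salsa project.
git init -q --bare "$tmp/salsa.git"

# Working repository with an upstream branch plus a Debian branch.
git init -q "$tmp/work"
cd "$tmp/work"
git -c user.name=Demo -c user.email=demo@example.org \
    commit -q --allow-empty -m "upstream work"
git branch -q debian/latest

# Add Salsa as a second remote and push everything, upstream branches
# and tags included.
git remote add salsa "$tmp/salsa.git"
git push -q salsa --all
git push -q salsa --tags

git ls-remote --heads salsa    # both branches now exist on "salsa"
```

After this, the extra step after each upload is just the two push
commands (or a shell alias wrapping them).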
Re: finally end single-person maintainership
Stefano Rivera writes:

> On the other hand, dgit is only useful if you have a certain view of
> the world, one that hasn't aligned with how I've done Debian
> packaging. I mean, an entirely git-centric view where you let go of
> trying to maintain your patch stack.

dgit has no problems with you maintaining your patch stack, at least as
I understand that statement. I personally use the
dgit-maint-debrebase(7) workflow, which is a fancy way of maintaining
your patch stack using an equivalent of git rebase, since I love git
rebase and use it all the time. But I used the dgit-maint-gbp(7)
workflow, which is basically just the normal git-buildpackage workflow,
for years, still use it for some of my packages, and it works fine.
Maybe you mean something different by this than I think you meant.

--
Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: finally end single-person maintainership
Simon Richter writes:

> A better approach would not treat Debian metadata as git data. Even
> the most vocal advocate of switching everything to Salsa writes in
> his MR that the changelog should not be touched in a commit, because
> it creates conflicts, and instead a manual step will need to be
> performed later.

This is not a Debian-specific problem and has nothing to do with any
special properties of our workflows or differences between packaging
and other software maintenance tasks. It's a common issue faced by
everyone who has ever maintained a software package in Git and wanted
to publish a change log. There are oodles of tools and workflows to
handle this problem, ranging from writing the change log from the Git
commits when you're making the release to accumulating fragments of
release notes in separate files and using a tool to merge them. dch's
approach of using the Git commit messages is one of the standard
solutions, one that will be familiar to many people who have faced this
same problem in other contexts.

The hard part with these sorts of problems is agreeing on the tool and
workflow to use to solve them, something Debian struggles with more
than most software projects because we lack a decision-making body that
can say things like "we're going to use scriv" and make it stick. But
that isn't because packaging is a special problem unsuited to Git; Git
has a rich ecosystem with many effective solutions to problems of this
sort. It's because we've chosen a governance model that intentionally
makes central decision-making, and therefore consistency and
coordination, difficult, in exchange for other perceived benefits.

--
Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Confused about libnuma1 naming
"J.J. Martzki" writes:

> Package 'libnuma1' is built from numactl, and there seems to be no
> 'libnuma'. Why is it named 'libnuma1' rather than 'libnuma'?

Shared library packages should be named after the library SONAME, which
generally includes a version (as it does here). See:

https://www.debian.org/doc/debian-policy/ch-sharedlibs.html#run-time-shared-libraries

--
Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: De-vendoring gnulib in Debian packages
Ansgar 🙀 writes:

> In ecosystems like NPM, Cargo, Golang, Python and so on pinning to
> specific versions is also "explicitly intended to be used"; they just
> sometimes don't include convenience copies directly as they have
> tooling to download these (which is not allowed in Debian).

Yeah, this is a somewhat different case that isn't well-documented in
Policy at the moment.

> (Arguably Debian should use those more often as keeping all software
> at the same dependency version is a futile effort IMHO...)

There's a straight tradeoff with security effort: more security work is
required for every additional copy of a library that exists in Debian
stable.

(And, of course, some languages have better support than others for
having multiple simultaneously-installed versions of the same library.
Python's support for this is not great; the ecosystem expectation is
that one uses separate virtualenvs, which don't really solve the Debian
build dependency problem.)

--
Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: De-vendoring gnulib in Debian packages
"Theodore Ts'o" writes:

> The best solution to this is to try to encourage people to put those
> autoconf macros that they are manually maintaining, and that can't be
> supplied otherwise, in acinclude.m4, which is now included by default
> by autoconf in addition to aclocal.m4.

Or use a subdirectory named something like m4, so that you can put each
conceptually separate macro in a separate file and not mush everything
together, and use:

    AC_CONFIG_MACRO_DIR([m4])

(and set ACLOCAL_AMFLAGS = -I m4 in Makefile.am if you're also using
Automake).

> Note that how we treat gnulib is a bit different from how we treat
> other C shared libraries, where we claim that *all* libraries must be
> dynamically linked, and that including source code by reference is
> against Debian Policy, precisely because of the toil needed to update
> all of the binary packages should some security vulnerability get
> discovered in the library which is either linked statically or
> included by code duplication.

> And yet, we seem to have given a pass for gnulib, probably because it
> would be too awkward to enforce that rule *everywhere*, so apparently
> we've turned a blind eye.

No, there's an explicit exception for cases like gnulib. Policy 4.13:

    Some software packages include in their distribution convenience
    copies of code from other software packages, generally so that
    users compiling from source don't have to download multiple
    packages. Debian packages should not make use of these convenience
    copies unless the included package is explicitly intended to be
    used in this way.

--
Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Avoiding /var/tmp for long-running compute (was: Make /tmp/ a tmpfs and cleanup /var/tmp/ on a timer by default [was: Re: systemd: tmpfiles.d not cleaning /var/tmp by default])
"Jonathan Dowland" writes:

> Else-thread, Russ begs people to stop doing this. I agree people
> shouldn't! We should also work on education and promotion of the
> alternatives.

Also, helping people use better tools for managing workloads like this,
tools that make their lives easier and have better semantics, thus
improving life for everyone.

I'm suggesting solutions that I don't have time to help implement, and
of course it will take a long time for better tools to filter into all
those clusters, so this doesn't address the immediate problem of this
thread (hence the subject change). But based on my past experience with
these types of systems, I bet a lot of the patterns captured in
software are older ones. Linux has a *lot* of facilities today that it
didn't have, or that at least weren't widely used, five years ago. It
would be great to help some of those improvements filter down, because
they can make a lot of these problems go away.

For example, take the case of scratch space for batch computing. The
logical lifespan of temporary files for a batch computing job is the
lifetime of the job, whatever that may be. (I know there are
exceptions, but here I'm just talking about defaults.) Previously one
would have to build support into the batch job management system for
creating and managing those per-job temporary directories, and ensure
the jobs support TMPDIR or other environment variables to control where
they store data, and everyone was doing this independently. (I've done
a *lot* of this kind of thing, once upon a time.)

But now we have mount namespaces, and systemd has PrivateTmp, which
builds on top of them. So if the job is managed by an execution
manager, it can create per-job temporary directories, it may already
support (as systemd does) the semantics of deleting the contents of
those directories on job exit, and it can bind-mount those directories
into the process space with the process none the wiser.
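The systemd side of that is a single directive; a sketch of a
hypothetical execution-manager template unit (the unit name and paths
are invented):

```
# batch-job@.service
[Service]
ExecStart=/usr/local/bin/run-batch-job %i
# Each job instance sees its own private /tmp and /var/tmp via a mount
# namespace; the backing directories are removed when the service stops.
PrivateTmp=yes
```

The crufty job still writes to /tmp as it always has, but its scratch
files now live and die with the job.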
I think all of the desirable glue may not fully be there (controlling
what underlying file system is used for PrivateTmp, ensuring those
directories are also excluded from normal cleanup, etc.), but this is
very close to a much better way of handling this problem that still
exposes /tmp and /var/tmp to the job so that none of the often-crufty
scientific computing software has to change.

The new capabilities that Linux has gained from namespaces are
marvellous and solve a whole lot of problems that I didn't realize were
even solvable, and right now I suspect there are huge opportunities for
substantial improvements without a whole lot of effort by just plumbing
those facilities through to higher-level layers like batch systems.
Whole classes of long-standing problems would just disappear, or at
least be far, far easier to manage.

Substantial, substantial caveat: I have been out of this world for a
while, and maybe most of this work has already been done? That would be
amazing. The best possible response to this post would be for someone
to tell me I'm five years behind and the batch systems have already
picked up this work and we can just point people at them.

--
Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Make /tmp/ a tmpfs and cleanup /var/tmp/ on a timer by default [was: Re: systemd: tmpfiles.d not cleaning /var/tmp by default]
Simon Richter writes: > On 5/8/24 07:05, Russ Allbery wrote: >> It sounds like that is what kicked off this discussion, but moving /tmp >> to tmpfs also usually makes programs that use /tmp run faster. I >> believe that was the original motivation for tmpfs back in the day. > IIRC it started out as an implementation of POSIX SHM, and was later > generalized. I believe you're correct for Linux specifically but not in general for UNIX. For example, I'm fairly sure this is not the case on Solaris, which was the first place I encountered tmpfs and where tmpfs /tmp was the default starting in Solaris 2.1 in 1992. tmpfs was present in SunOS in 1987, so I'm pretty sure it predates POSIX shared memory. Linux was very, very late to the tmpfs world. > When /var runs full, the problem is probably initrd building. I'm not quite sure what to make of this statement. On my systems, /var contains all sorts of rather large things, such as PostgreSQL databases, INN spool files, and mail spools. I have filled up /var on many systems over the years, and it's never been by building initrd images. > Taking a quick look around all my machines, the accumulated cruft in > /var/tmp is on the order of kilobytes -- mostly reportbug files, and a > few from audacity -- and these machines have not been reinstalled in the > last ten years. Yes, I don't think many programs use it. I think that's a good thing; the specific semantics of /var/tmp are only useful in fairly narrow situations, and overfilling it is fairly dangerous. Back in the day, /var/tmp was the thing that you used if /tmp was too small (because it was usually tmpfs). For example, using sort -T /var/tmp to sort large files is an old UNIX rune. And, of course, students would use it because they ran out of quota in their home directories and then get upset when their files got deleted automatically, back in the days of shared UNIX login clusters. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Make /tmp/ a tmpfs and cleanup /var/tmp/ on a timer by default [was: Re: systemd: tmpfiles.d not cleaning /var/tmp by default]
Richard Lewis writes: > btw, i'm not trying to argue against the change, but i dont yet > understand the rationale (which id like to be put into the > release-notes): is there perhaps something more compelling than "other > distributions and upstream already do this"? It sounds like that is what kicked off this discussion, but moving /tmp to tmpfs also usually makes programs that use /tmp run faster. I believe that was the original motivation for tmpfs back in the day. For /var/tmp, I think the primary motivation to garbage-collect those files is that filling up /var/tmp is often quite bad for the system. It's frequently not on its own partition, but is shared with at least /var, and filling up /var can be very bad. It can result in bounced mail, unstable services, and other serious problems. Most modern desktop systems now have large enough drives that this isn't as much of a concern as it used to be, but VMs often still have quite small / partitions and put /var/tmp on that partition. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Bug#966621: Make /tmp/ a tmpfs and cleanup /var/tmp/ on a timer by default [was: Re: systemd: tmpfiles.d not cleaning /var/tmp by default]
Richard Lewis writes: > Luca Boccassi writes: >> what would break where, and how to fix it? > Another one for you to investigate: I believe apt source and 'apt-get > source' download and extract things into /tmp, as in the mmdebootstap > example mentioned by someone else, this will create "old" files that > could immediately be flagged for deletion causing surprises. > (People restoring from backups might also find this an issue) systemd-tmpfiles respects atime and ctime by default, not just mtime, so I think this would only be a problem on file systems that didn't support those attributes. atime is often turned off, but I believe support for ctime is fairly universal among the likely file systems for /var/tmp, and I believe tmpfs supports all three. (I'm not 100% sure, though, so please correct me if I'm wrong.) -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Make /tmp/ a tmpfs and cleanup /var/tmp/ on a timer by default [was: Re: systemd: tmpfiles.d not cleaning /var/tmp by default]
Hakan Bayındır writes: > The applications users use create these temporary files without users' > knowledge. They work in their own directories, but applications create > another job dependent state files in both /tmp and /var/tmp. These are > different programs and I assure you they’re not created there because > user (or we) configured something. These files live there during the > lifetime of the job, and cleaned afterwards by the application. Then someone should fix those applications, because that behavior will result in user data loss if they're not fixed. However, first one should check whether the applications are just honoring TMPDIR or equivalent variables, in which case TMPDIR on batch systems often should be set to a user-specific or job-specific persistent directory for exactly this reason. That way you can use a user-specific cleanup strategy, such as purging that directory when all of the user's jobs have finished. I understand your point, which is that this pattern is out there in the wild and Debian is in danger of breaking existing usage patterns by matching the defaults of other distributions. This is a valid point, and I appreciate you making it. My replies are not intended to dispute that point, but to say that the burden of addressing this buggy behavior should not rest entirely on Debian. What the combination of batch system and application is doing is semantically incorrect and is dangerous, and it really should be fixed. Even if Debian changes nothing, at some point someone will deploy workers with a different base operating system and be very surprised when these files are automatically deleted. We were automatically cleaning /tmp and /var/tmp on commercial UNIX systems in 1995 and fixing broken applications that didn't honor TMPDIR. This is not a new problem. Nor is having /var/tmp fill up and cause all sorts of system problems because someone turned off /var/tmp cleaning while trying to work around broken applications. 
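A minimal sketch of that wrapper pattern (the variable names and paths are hypothetical; a real batch system would do this in its job prolog and epilog):

```shell
# Hypothetical per-job wrapper: give the job its own scratch
# directory and point TMPDIR at it, so well-behaved applications
# never touch the shared /tmp or /var/tmp at all.
set -eu

JOB_ID="${JOB_ID:-demo.1}"
SCRATCH="$(mktemp -d "/tmp/job-${JOB_ID}.XXXXXX")"
export TMPDIR="$SCRATCH"

# Run the actual job here; it inherits TMPDIR.
sh -c 'touch "$TMPDIR/state-file"'

# Per-job cleanup: the scratch space dies with the job.
rm -rf "$SCRATCH"
```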
-- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Make /tmp/ a tmpfs and cleanup /var/tmp/ on a timer by default [was: Re: systemd: tmpfiles.d not cleaning /var/tmp by default]
Hakan Bayındır writes: > Dear Russ, >> If you are running a long-running task that produces data that you >> care about, make a directory for it to use, whether in your home >> directory, /opt, /srv, whatever. > Sorry but, clusters, batch systems and other automated systems doesn't > work that way. Yours might not, but I spent 20 years maintaining clusters and batch systems and I assure you that's how mine worked. > That's not an extension of the home directory in any way. After users > submit their jobs to the cluster, they neither have access to the > execution node, nor they can pick and choose where to put their files. > These files may stay there up to a couple of weeks, and deleting > everything periodically will probably corrupt the jobs of these users > somehow. Using /var/tmp for this purpose is not a good design decision. Directories are free; they can make a new one and point the files of batch jobs there. They don't have to overload a directory that historically has different semantics and is often periodically cleared. I get that this may not be your design or something you have control over, so telling you this doesn't directly help, but the point still stands. Again, obviously the people configuring that cluster can configure it however they want, including overriding the /var/tmp cleanup policy. But they're playing with fire by training users to use /var/tmp, and it's going to result in someone getting their data deleted at some point, regardless of what Debian does. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Make /tmp/ a tmpfs and cleanup /var/tmp/ on a timer by default [was: Re: systemd: tmpfiles.d not cleaning /var/tmp by default]
Hakan Bayındır writes: > Consider a long running task, which will take days or weeks (which is > the norm in simulation and science domains in general). System emitted a > warning after three days, that it'll delete my files in three days. My > job won't be finished, and I'll be losing three days of work unless I > catch that warning. I have to admit that I'm a little surprised at the number of people who are apparently using /var/tmp for things that are clearly not temporary files in the traditional UNIX sense. Clearly this bit of folk knowledge is not as widespread as I thought, so we have to figure out how to deal with that, but periodically deleting files out of /var/tmp has been common (not universal, but common) UNIX practice for at least thirty years. Whatever we do with /var/tmp retention, I beg people to stop using /var/tmp for data you're keeping for longer than a few days and care about losing. That's not what it's for, and you *will* be bitten by this someday, somewhere, because even with existing Debian configuration many people run tmpreaper or similar programs. If you are running a long-running task that produces data that you care about, make a directory for it to use, whether in your home directory, /opt, /srv, whatever. /var/tmp's primary purpose historically was to support things like temporary recovery files that needed to survive a system crash, but which were still expected to be *temporary* in that one would then either use the recovery file or expect it to be deleted. Not as an extension of people's home directory. Your system is your system, so of course you can configure /var/tmp however you want and no one is going to stop you, but a lot of people on this thread are describing habits that are going to lose their data if they use a different distribution or even a differently-configured Debian distribution with tmpreaper installed. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Bug#966621: Make /tmp/ a tmpfs and cleanup /var/tmp/ on a timer by default [was: Re: systemd: tmpfiles.d not cleaning /var/tmp by default]
Luca Boccassi writes: > Richard Lewis wrote: >> - tmux stores sockets in /tmp/tmux-$UID >> - I think screen might use /tmp/screens >> I suppose if you detached for a long time you might find yourself >> unable to reattach. >> I think you can change the location of these. > And those are useful only as long as screen/tmux are still running, > right (I don't really use either that much)? If so, a flock is the right > solution for these Also, using /tmp as a path for those sockets was always a questionable decision. I believe current versions of screen use /run/screen, which is a more reasonable location. Using a per-user directory would be even better, although I think screen intentionally supports shared screens between users (which is a somewhat terrifying feature from a security standpoint, but that's a different argument). -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Make /tmp/ a tmpfs and cleanup /var/tmp/ on a timer by default [was: Re: systemd: tmpfiles.d not cleaning /var/tmp by default]
Andrey Rakhmatullin writes: > On Mon, May 06, 2024 at 10:40:00AM +0200, Michael Biebl wrote: >> I'm not sure if we have software on long running servers which place >> files in /tmp and /var/tmp and expect files to not be deleted during >> runtime, even if not accessed for a long time. This is certainly an >> issue to be aware of and keep an eye on. > Note that FHS mandates it for /var/tmp: "Files and directories located > in /var/tmp must not be deleted when the system is booted. Although data > stored in /var/tmp is typically deleted in a site-specific manner, it is > recommended that deletions occur at a less frequent interval than /tmp." It mandates that it not be cleaned on *boot*. Not that it never be cleaned during runtime. It anticipates that it be cleaned periodically, just less frequently than /tmp. There is a specific prohibition against clearing /var/tmp on reboot because /var/tmp historically has been used to store temporary files whose whole reason for existence is that they need to survive a reboot, such as vi recover files, but are still safe to delete periodically. Historically, deleting anything in /var/tmp that hadn't been accessed in over seven days was a perfectly reasonable and typical configuration. These days, we have the complication that it's fairly common to turn off atime updates for performance reasons, which makes it a bit harder to implement that policy when /var/tmp isn't its own partition and thus inherits that setting from the rest of the system. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
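For reference, that kind of tiered policy is what systemd's tmpfiles.d(5) age field expresses; the upstream defaults look roughly like this (my recollection of upstream's tmp.conf, which Debian may of course override):

```
# tmpfiles.d sketch: "q" (a variant of "d") creates the directory
# if needed and cleans out entries whose timestamps (mtime, and
# where available atime/ctime) are older than the age field.
q /tmp 1777 root root 10d
q /var/tmp 1777 root root 30d
```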
Re: Silent hijacking and stripping records from changelog
Jonas Smedegaard writes: > Quoting Jonathan Dowland (2024-04-17 17:29:11) >> On Wed Apr 17, 2024 at 10:39 AM BST, Jonas Smedegaard wrote: >>> Interesting: Can you elaborate on those examplary contributions of >>> yours which highlighted a need for maintaining all Haskell packages in >>> same git repo? >> My Haskell contributions (which I did not enumerate) are tangential to >> the use of a monorepo. But it strikes me as an odd choice for you to >> describe them as examplary. Paired with you seeming to file me on "the >> opposing side", your mail reads to me as unnecessarily snarky. Please >> do not CC me for listmail. > I can see why it might come across as snarky. It was not intended that > way. > I just meant to write describe your contributions as examples, but I > realize now that with your emphasizing it that I wrongly described them > as extraordinary examples. I suspect (based on Jonas's domain) this is one of those subtle problems when English isn't your first language. The English language is full of weird connotation traps. For anyone else who may not be aware of this subtle shade of meaning, an English dictionary will partly lie to you about the common meaning of "exemplary" (which I assume is what Jonas meant by "examplary"). Yes, it means "serving as an example," but it specifically means serving as an *ideal* example: something that should be held up as being particularly excellent or worthy of imitation. If you ask someone "could you elaborate on your exemplary contributions," a native English speaker is going to assume you're being sarcastic about 90% of the time. In common usage, that phrase usually carries a tone closer to "please do enlighten us about your amazing contributions" than what Jonas actually intended. 
I keep having to remind myself of this in Debian since many Debian contributors have *excellent* written English skills (certainly massively better than my language skills in any language other than English), so it's easy to fall into the trap of assuming that they're completely fluent. But English is full of problems like this that will trip up even highly advanced non-native speakers. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Debian openssh option review: considering splitting out GSS-API key exchange
Florian Lohoff writes: > These times have long gone and tcp wrapper as a security mechanism has > lost its reliability, this is why people started moving away from tcp > wrapper (which i think is a shame) > I personally moved to nftables which is nearly as simple once you get > your muscle memory set. If ssh is your only candidate of network service > you could also use match statements in /etc/ssh/sshd_config.d/.

For what it's worth, I have iptables (I know, it's nftables under the hood now, but I'm still using the iptables syntax because the number of hours in each day is annoyingly low) on every system I run, and I still use TCP wrappers for ssh restrictions for one host. That's because I have users who use various ISPs, and for some of those ISPs, DNS-based restrictions are less maintenance work than playing whack-a-mole with their ever-changing IP blocks.

Yes, yes, I know this isn't actually secure, etc., but that's fine, I'm not using it as a primary security measure. I'm using it to narrow the number of hosts on the Internet that can exploit an sshd vulnerability, and to reduce the amount of annoying automated exploit attempts I get. (Exactly the kind of thing that helps mildly against situations like the xz backdoor.)

That said, the point that I could switch over to Match blocks in the sshd configuration is well-taken, and not wanting to take an hour to rewrite my rules in a different configuration format is probably not a good enough reason to keep a dependency in a security-critical, network-exposed service. I'm mildly grumbly because it's yet another thing I have to change just to keep things from breaking, but such is life. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
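For anyone contemplating the same migration, the Match-based version might look roughly like this (the source networks are documentation-range placeholders, and this is an untested sketch of the idea, not known-good configuration):

```
# /etc/ssh/sshd_config.d/restrict.conf (sketch)
# Match Address takes a pattern list with '!' negation, so this
# block applies to every source address *except* the listed
# networks, and refuses all users from those other addresses.
Match Address *,!192.0.2.0/24,!198.51.100.0/24
    DenyUsers *
```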
Re: Validating tarballs against git repositories
Stefano Rivera writes: > Then you haven't come across any that are using this mechanism to > install data, yet. You're only seeing the version determination. You > will, at some point run into this problem. It's getting more popular. Yup, we use this mechanism heavily at work, since it avoids having to separately maintain a MANIFEST.in file. Anything that's checked in to Git in the appropriate trees ships with the module. But this means that you have to build the module from a Git repository, if you're not using the artifact uploaded to PyPI (which expands out all the information derived from Git). If I correctly remember the failure mode, which I sometimes run into during local development if I forget to git add new data files, the data files are just not installed since nothing tells the build system they should be included with the module. I think a shallow clone of depth 1 is sufficient, although that's not sufficient to get the correct version number from Git in all cases. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
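For what it's worth, in the Python world the usual implementation of this mechanism is setuptools-scm, whose Git-based file finder produces exactly the behaviour described above; a minimal sketch of the configuration (project details elided):

```toml
# pyproject.toml (sketch)
[build-system]
requires = ["setuptools>=64", "setuptools-scm>=8"]
build-backend = "setuptools.build_meta"

# Enabling the plugin both derives the version from Git tags and
# registers a file finder, so anything tracked by Git under the
# package tree ships with the sdist -- and anything not yet
# `git add`ed is silently omitted, the failure mode noted above.
[tool.setuptools_scm]
```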
Re: Validating tarballs against git repositories
Adrian Bunk writes: > On Mon, Apr 01, 2024 at 11:17:21AM -0400, Theodore Ts'o wrote: >> Yeah, that too. There are still people building e2fsprogs on AIX, >> Solaris, and other legacy Unix systems, and I'd hate to break them, or >> require a lot of pain for people who are building on MacPorts, et. al. >>... > Everything you mention should already be supported by Meson. Meson honestly sounds great, and I personally love the idea of using a build system whose language is a bit more like Python, since I use that language professionally anyway. (It would be nice if it *was* Python rather than yet another ad hoc language, but I also get why they may want to restrict it.) The prospect of converting 25 years of portability code from M4 into a new language is daunting, however. For folks new to this ecosystem, what resources are already available? Are there large libraries of tests already out there akin to gnulib and the Autoconf Archive? Is there a really good "porting from Autotools" guide for Meson that goes beyond the very cursory guide in the Meson documentation? The problem with this sort of migration is that it is an immense amount of work just to get back to where you started. I look at the amount of effort and start thinking things like "well, if I'm going to rewrite a bunch of things anyway, maybe I should just rewrite the software in Rust instead." -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Debian openssh option review: considering splitting out GSS-API key exchange
Christoph Anton Mitterer writes: > Actually I think that most sites where I "need"/use GSSAPI... only > require the ticket for AFS, and do actually allow pubkey auth (but > right now, one doesn't have AFS access then). In past discussions of this patch, this has not been the case. One of the advantages of GSSAPI key exchange is that you can disable public keys for all of your hosts and never manage known hosts, instead only using the system Kerberos keytabs. Since in a Kerberos environment you have to put keytabs on every host *anyway*, and that *is* the host's identity in a Kerberos environment, this reduces the number of key infrastructures you have to manage by one, which matters to some Kerberos deployments. This arguably gives you better security in that specific environment because keytabs do not rely on leap-of-faith initial authentication; the server is always properly authenticated, even on first connect. > Not sure if there's a simple out of the box way to just transfer that > but without all the other GSSAPI stuff? If you want your ticket to refresh remotely when you refresh it locally, which is often needed for Kerberos applications like AFS, you do need key exchange, since that's the mechanism that allows that to happen. (I use both GSSAPI and tcpwrappers, so Colin's proposal would mean more work for me, but given the situation, I'm willing to rework the way that I use ssh to avoid both going forward. More features are nice, but I can see the merits of simplicity here. But I no longer maintain a large infrastructure built on Kerberos, so I'm not putting as much weight on the GSSAPI support as I used to.) -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
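For reference, the client-side options involved look like this (these option names come from the GSS-API key exchange patch under discussion, not stock OpenSSH, and the host pattern is hypothetical):

```
# ~/.ssh/config (sketch; only valid with the GSS-API key
# exchange patch applied)
Host *.example.org
    GSSAPIAuthentication yes
    GSSAPIKeyExchange yes
    GSSAPIDelegateCredentials yes
    # Re-key the connection when the local ticket is renewed, so
    # the delegated ticket on the remote side is refreshed too.
    GSSAPIRenewalForcesRekey yes
```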
Re: xz backdoor
Bastian Blank writes: > I don't understand what you are trying to say. If we add a hard check > to lintian for m4/*, set it to auto-reject, then it is fully irrelevant > if the upload is a tarball or git. Er, well, there goes every C package for which I'm upstream, all of which have M4 macros in m4/* that do not come from an external source. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Command /usr/bin/mv wrong message in German
Russell Stuart writes: > The reason I'm replying is after one, probably two decades this still > annoys me: >$ dpkg -S /etc/profile >dpkg-query: no path found matching pattern /etc/profile > It was put their by the Debian install, and I'm unlikely to change it. > Its fairly important security wise. It would be nice if "dpkg -S" told > me base-files.deb installed it. It would be nice if debsums told me if > it changed. There are lots of files like this, such as /etc/environment > and /etc/hosts. There are some directories like /etc/apt/trusted.gpg.d/ > which should only have files claimed by some .deb.

Guillem has a plan for addressing this, I believe as part of metadata tracking, that would allow such files to be registered by their packages and then tracked by dpkg. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Validating tarballs against git repositories
Luca Boccassi writes: > On Sat, 30 Mar 2024 at 15:44, Russ Allbery wrote: >> Luca Boccassi writes: >>> In the end, massaged tarballs were needed to avoid rerunning >>> autoconfery on twelve thousands different proprietary and >>> non-proprietary Unix variants, back in the day. In 2024, we do >>> dh_autoreconf by default so it's all moot anyway. >> This is true from Debian's perspective. This is much less obviously >> true from upstream's perspective, and there are some advantages to >> aligning with upstream about what constitutes the release artifact. > My point is that, while there will be for sure exceptions here and > there, by and large the need for massaged tarballs comes from projects > using autoconf and wanting to ship source archives that do not require > to run the autoconf machinery. Just as a data point, literally every C project for which I am upstream ships additional files in the release tarballs that are not in Git for reasons unrelated to Autoconf and friends. Most of this is pregenerated documentation (primarily man pages generated from POD), but it also includes generated test data and other things. The reason is similar: regenerating those files requires tools that may not be present on an older system (like a mess of random Perl modules) or, in the case of the man pages, may be old and thus produce significantly inferior output. > However, we as in Debian do not have this problem. We can and do re-run > the autoconf machinery on every build. And at least on the main forges, > the autogenerated (and thus out of reach from this kind of attacks) > tarball is always present too - the massaged tarball is an _addition_, > not a _substitution_. Hence: we should really really think about forcing > all packages, by policy, to use the autogenerated tarball by default > instead of the autoconf one, when both are present, unless extenuating > circumstances (that have to be documented) are present. 
I think this is probably right as long as by "autogenerated" you mean basing the Debian package on a signed upstream Git tag and *locally* generating a tarball to satisfy Debian's .orig.tar.gz requirement, not using GitHub's autogenerated tarball that has all sorts of other potential issues. Just to note, though, this means that we lose the upstream signature in the archive. The only place the upstream signature would then live is in Salsa. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
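To make the "locally generated" step concrete, here is a self-contained sketch (the package name "foo" and the version are hypothetical):

```shell
# Build a throwaway repository, tag the release, and derive the
# .orig tarball deterministically from the tagged tree.  In real
# use the tag would be upstream's signed tag, checked first with
# `git verify-tag` against upstream's key.
set -eu
repo="$(mktemp -d)"
cd "$repo"
git init -q .
echo 'hello' > README
git add README
git -c user.email=demo@example.invalid -c user.name=Demo \
    commit -q -m 'release 1.0'
git -c user.email=demo@example.invalid -c user.name=Demo \
    tag -a -m 'foo 1.0' v1.0
git archive --format=tar.gz --prefix=foo-1.0/ \
    -o foo_1.0.orig.tar.gz v1.0
```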
Re: xz backdoor
Sirius writes: > Would throwing away these unmodified (?) macros packaged projects may be > carrying for hysterical raisins in favour of just using the autoconf > native macros reduce the attack-surface a potential malicious actor > would have at their disposal, or would it simply be a "putting all eggs > in one basket" and just make things worse? And by how much vis-a-vis the > effort to do it?

Most of the macros of this type are not from Autoconf. They're from either gnulib or the Autoconf Archive. In both cases, blindly upgrading to a newer upstream version may break things, I believe. I'm not as sure about gnulib, but the Autoconf Archive is a huge collection of things of varying quality and does not necessarily make any guarantees about APIs.

> I think that what I am trying to get at is this: is there low-hanging > fruit that for minimal effort would disproportionately improve things > from a security perspective. (I have an inkling that this is a question > that every distribution is wrestling with today.)

I think the right way to think about this is to say that the Autoconf ecosystem is rife with embedded code copies and, because the normal way of using this code is to make a copy, is also somewhat lax about making breaking changes, since the expectation is that you only update during your release process, when you can fix up any changes.

(That code is also notoriously hard to read, both because M4 is a language with fairly noisy syntax and because the only tools assumed to be available in the output scripts are a very minimal Bourne shell and standard POSIX shell utilities, so there's a lot of the type of programming that only shell aficionados can love. That was the problem with detecting this backdoor: the sort of chain of tr and eval and whatnot that injected the backdoor is what, e.g., all of Libtool looks like, at least on a first superficial glance.)
I know all this adds up to "why are we using this stuff anyway," but the amount of hard-won portability knowledge that's baked into these tools is IMMENSE, and while probably 75% of it is now irrelevant because the systems that needed it are long-dead, no one can agree on what 75% that is or figure out which useful 25% to extract. And rewriting it in some other programming language is daunting and feels like churn rather than progress. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Validating tarballs against git repositories
Simon Josefsson writes: > Russ Allbery writes: >> I believe you're talking about two different things. I think Sean is >> talking about preimage resistance, which assumes that the known-good >> repository is trusted, and I believe Simon is talking about >> manufactured collisions where the attacker controls both the good and >> the bad repository. > Right. I think the latter describes the xz scenario: someone could have > pushed a maliciously crafted commit with a SHA1 collision commit id, so > there are two different git repositories with that commit id, and a > signed git tag on that commit id authenticates both trees, opening up > for uncertainty about what was intended to be used. Unless I'm missing > some detail of how git signed tag verification works that would catch > this. This is also my understanding. >> The dgit and tag2upload design probably (I'd have to think about it >> some more, ideally while bouncing the problem off of someone else, >> because I've recycled those brain cells for other things) only needs >> preimage resistance, but the general case of a malicious upstream may >> be vulnerable to manufactured collisions. > It is not completely clear to me: How about if some malicious person > pushed a commit to salsa, asked a DD to "please review this repository > and sign a tag to make the upload"? The DD would presumably sign a > commit id that authenticate two different git trees, one with the > exploit and one without it. Oh, hm, yes, this is a good point. I had forgotten that tag2upload was intended to work by pushing a tag to Salsa. This means an attacker can potentially race Salsa CI to move that tag to the malicious tree before the tree is fetched by tag from Salsa, or reuse the signed tag with a different repository with the same SHA-1. 
The first, most obvious step is that one has to make sure that a signed tag is restricted to a specific package and version and not portable to a different package and/or version that has the same SHA-1 hash due to attacker construction. There are several obvious ways that could be done; the one that comes immediately to mind is to require that the tag message be the source package name and version number, which is good practice anyway.

I think any remaining issues could be addressed with a fairly simple modification to the protocol: rather than pushing the signed tag to Salsa, the DD reviewer should push the signed tag to a separate archive server similar to that used by dgit today. As long as the first time the signed tag leaves the DD's system is in conjunction with a push of the corresponding reviewed tree to secure project systems, this avoids the substitution problem. The tag could then be pushed back to Salsa, either by the DD or by the service.

This unfortunately means that one couldn't use the Salsa CI service to do the source package construction, and one has to know about this extra server. I think that restriction comes from the fact that we're worried an attacker may be able to manipulate the Salsa Git repository (through force pushes and tag replacements, for example), whereas the separate dedicated archive server can be more restrictive and never allow force pushes or tag moves, and can reject any attempt to push a SHA-1 hash that has already been seen.

Another possible option would be to prevent force pushes and tag moves in Salsa, since I think one of those operations would be required to pull off this attack, but maybe I'm missing something. One of the things I'm murky on is exactly what Git operations are required to substitute the two trees with identical SHA-1 hashes. That property is going to break Git in weird ways, and I'm not sure what that means for one's ability to manipulate a Git repository over the protocols that Salsa exposes.
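Concretely, a tag carrying that binding might look like this (the package name and tag-naming convention are hypothetical; a real tag2upload client would fix the exact format, and would use `git tag -s` so the tag is GPG-signed, whereas plain `-a` is used here only so the sketch runs without keys):

```shell
# Self-contained sketch: the tag message explicitly names the
# source package and version, so the signature over the tag
# cannot be replayed against a different package or version.
set -eu
repo="$(mktemp -d)"
cd "$repo"
git init -q .
git -c user.email=dd@example.invalid -c user.name=DD \
    commit -q --allow-empty -m 'packaging work'
git -c user.email=dd@example.invalid -c user.name=DD \
    tag -a -m 'frobnicator 1.2-1' debian/1.2-1
# The archive side can then check the message against the tag name:
git tag -l --format='%(contents:subject)' 'debian/*'
```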
Obviously it would be ideal if Git used stronger hashes than SHA-1 for tags, so that one need worry less about all of this. Even if my analysis is wrong, I think there are some fairly obvious and trivial additions to the tag2upload process that would prevent this attack, such as building a Merkle tree of the reviewed source tree using a SHA-256 hash and embedding the top hash of that tree in the body of the signed tag where it can be verified by the archive infrastructure. That might be a good idea *anyway*, although it does have the unfortunate side effect of requiring a local client to produce a correct tag rather than using standard Git signed tags. Uploading to Debian currently already semi-requires a custom local client, so to me this isn't a big deal, although I think there was some hope to avoid that. (These variations unfortunately don't help with the upstream problem.) -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: xz backdoor
Christian Kastner writes:

> This is both out of convenience (I want my workstation to be based on
> stable) and precisely because of the afforded isolation.

I personally specifically want my workstation to be running unstable, so I'm watching to see if that's considered unsafe (either immediately, today, or in theory, in the future). If I have to use a stable host, I admit I will be sad.

I've been using unstable for my personal client and development (not server, never exposing services to the Internet) systems for well over a decade (and, before that, testing systems for as long as I've been working on Debian), and for me it's a much nicer experience than using stable. It also lets me directly and practically dogfood Debian, which has resulted in a fair number of bug reports. (This is an analysis specific to me, not general advice, and relies heavily on the fact that I'm very good at working around weird problems that transiently arise in unstable.)

But this does come with a security risk, because it means a compromised package could compromise my system much faster than if I were using testing or, certainly, stable. That's not a security trade-off that I can responsibly make entirely for myself, since it affects people who are using Debian as well. So I don't get to have the final decision here.

-- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Validating tarballs against git repositories
Jeremy Stanley writes: > On 2024-03-29 23:29:01 -0700 (-0700), Russ Allbery wrote: > [...] >> if the Git repository is somewhere other than GitHub, the >> malicious possibilities are even broader. > [...] > I would not be so quick to make the same leap of faith. GitHub is > not itself open source, nor is it transparently operated. It's a > proprietary commercial service, with all the trust challenges that > represents. Long, long before XZ was a twinkle in anyone's eye, > malicious actors were already regularly getting their agents hired > onto development teams to compromise commercial software. Just look > at the Juniper VPN backdoor debacle for a fairly well-documented > example (but there's strong evidence this practice dates back well > before free/libre open source software even, at least to the 1970s). This is a valid point: let me instead say that the malicious possibilities are *different*. All of your points about GitHub are valid, but the counterexample I had in mind is one where the malicious upstream runs the entire Git hosting architecture themselves and can make completely arbitrary changes to the Git repository freely. I don't think we know everything that is possible to do in that situation. I think it would be difficult (not impossible, but difficult) to get into that position at GitHub, whereas it is commonplace among self-hosted projects. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Validating tarballs against git repositories
Simon Josefsson writes: > Sean Whitton writes: >> We did some analysis on the SHA1 vulnerabilities and determined that >> they did not meaningfully affect dgit & tag2upload's design. > Can you share that analysis? As far as I understand, it is possible for > a malicious actor to create a git repository with the same commit id as > HEAD, with different historic commits and tree content. I thought a > signed tag is merely a signed reference to a particular commit id. If > that commit id is a SHA1 reference, that opens up for ambiguity given > recent (well, 2019) results on SHA1. Of course, I may be wrong in any > of the chain, so would appreciate explanation of how this doesn't work. I believe you're talking about two different things. I think Sean is talking about preimage resistance, which assumes that the known-good repository is trusted, and I believe Simon is talking about manufactured collisions where the attacker controls both the good and the bad repository. The dgit and tag2upload design probably (I'd have to think about it some more, ideally while bouncing the problem off of someone else, because I've recycled those brain cells for other things) only needs preimage resistance, but the general case of a malicious upstream may be vulnerable to manufactured collisions. (So far as I know, preimage attacks against *MD5* are still infeasible, let alone against SHA-1.) -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
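[Editorial note: to make concrete what a SHA-1 object id actually commits to, here is how Git hashes a blob. The `blob <size>\0<content>` framing is Git's real object encoding; commits and tags are hashed the same way over their own headers, which is why a signed tag ultimately pins nothing more than a SHA-1 of a commit object.]

```python
import hashlib

def git_blob_sha1(content: bytes) -> str:
    """Hash file content the way `git hash-object` does:
    SHA-1 over a "blob <size>\\0" header plus the raw bytes."""
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()

# The well-known object id of the empty blob:
print(git_blob_sha1(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

A collision attack needs the attacker to construct *two* objects with the same SHA-1, which is feasible for SHA-1 since 2017/2019; a preimage attack would need a second object matching an honest, pre-existing hash, which remains infeasible. That is exactly the distinction drawn above.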
Re: Validating tarballs against git repositories
Ingo Jürgensmann writes:

> This reminds me of https://xkcd.com/2347/ - and I think that’s getting a
> more common threat vector for FLOSS: pick up some random lib that is
> widely used, insert some malicious code and have fun. Then also imagine
> stuff that automates builds in other ways like docker containers, Ruby,
> Rust, pip that pull stuff from the network and installs it without
> further checks.
> I hope (and am confident) that Debian as a project will react
> accordingly to prevent this happening again.

Debian has precisely the same problem. We have more work to do than we possibly can do with the resources we have, there is some funding but not a lot of funding so most of the work is hobby work stolen from scarce free time, and we're under a lot of pressure to encourage and incorporate the work of new maintainers. And 99% of the time trusting the people who step up to help works out great.

The hardest part about defending against social engineering is that it doesn't attack the weaknesses of a community. It attacks its *strengths*: trust, collaboration, and mutual assistance.

-- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Validating tarballs against git repositories
Luca Boccassi writes: > In the end, massaged tarballs were needed to avoid rerunning autoconfery > on twelve thousands different proprietary and non-proprietary Unix > variants, back in the day. In 2024, we do dh_autoreconf by default so > it's all moot anyway. This is true from Debian's perspective. This is much less obviously true from upstream's perspective, and there are some advantages to aligning with upstream about what constitutes the release artifact. > When using Meson/CMake/home-grown makefiles there's no meaningful > difference on average, although I'm sure there are corner cases and > exceptions here and there. Yes, perhaps it's time to switch to a different build system, although one of the reasons I've personally been putting this off is that I do a lot of feature probing for library APIs that have changed over time, and I'm not sure how one does that in the non-Autoconf build systems. Meson's Porting from Autotools [1] page, for example, doesn't seem to address this use case at all. [1] https://mesonbuild.com/Porting-from-autotools.html Maybe the answer is "you should give up on portability to older systems as the cost of having a cleaner build system," and that's not an entirely unreasonable thing to say, but that's going to be a hard sell for a lot of upstreams that care immensely about this. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Validating tarballs against git repositories
't want to do that, and thus force people to fork packages rather than join in maintaining the existing package. This is an aside, but this is why my personal policy for my own projects that I no longer have to maintain is to orphan them and require that someone fork them, not add additional contributors to my repository or release infrastructure. I do not have the resources to vet new maintainers -- if I had that time to spend on the projects, I wouldn't have orphaned them -- and therefore I want to explicitly disclaim any responsibility for what the new maintainer may do. Someone else will have to judge whether they are trustworthy. But I'm not sure that distributions are in a good position to do that *either*. > But, I will definitely concede that, had I seen a commit that changed > that line in the m4, there's a good chance my eyes would have glazed > over it. This is why I am somewhat skeptical that forcing everything into Git commits is as much of a benefit as people are hoping. This particular attacker thought it was better to avoid the Git repository, so that is evidence in support of that approach, and it's certainly more helpful, once you know something bad has happened, to be able to use all the Git tools to figure out exactly what happened. But I'm not sure we're fully accounting for the fact that tags can be moved, branches can be force-pushed, and if the Git repository is somewhere other than GitHub, the malicious possibilities are even broader. We could narrow those possibilities somewhat by maintaining Debian-controlled mirrors of upstream Git repositories so that we could detect rewritten history. (There are a whole lot of reasons why I think dgit is a superior model for archive management. One of them is that it captures the full Git history of upstream at the point of the upload on Debian-controlled infrastructure if the maintainer of the package bases it on upstream's Git tree.) -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
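[Editorial sketch: the Debian-controlled mirror idea above reduces to a fast-forward check — a ref update on the mirror is suspicious unless the old tip is an ancestor of the new tip. This toy version runs the rule over an in-memory commit graph; real tooling would ask git itself, e.g. `git merge-base --is-ancestor`.]

```python
def is_ancestor(parents: dict[str, list[str]], old: str, new: str) -> bool:
    """Return True if `old` is reachable from `new` via parent links,
    i.e. the update from old to new is a fast-forward and history
    before `old` was not rewritten."""
    stack, seen = [new], set()
    while stack:
        commit = stack.pop()
        if commit == old:
            return True
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents.get(commit, []))
    return False

# Commits A -> B -> C on an upstream branch:
parents = {"C": ["B"], "B": ["A"], "A": []}
print(is_ancestor(parents, "B", "C"))  # True: normal fast-forward push
print(is_ancestor(parents, "C", "B"))  # False: history was rewritten
```

A mirror that refuses non-fast-forward updates (and tag moves) gives Debian an independent record against which rewritten upstream history can be detected.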
Re: xz backdoor
Moritz Mühlenhoff writes: > Russ Allbery wrote: >> I think this question can only be answered with reverse-engineering of >> the backdoors, and I personally don't have the skills to do that. > In the pre-disclosure discussion permission was asked to share the > payload with a company specialising in such reverse engineering. If that > went through, I'd expect results to be publicly available in the next > days. Excellent, thank you. For those who didn't read the analysis on oss-security yet, note that the initial investigation of the injected exploit indicates that it deactivates itself if argv[0] is not /usr/sbin/sshd, so there are good reasons to believe that the problem is bounded to testing or unstable systems running the OpenSSH server. If true, this is a huge limiting factor and in many ways quite relieving compared to what could have happened. But the stakes are high enough that hopefully we'll get detailed confirmation from people with expertise in understanding this sort of thing. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: xz backdoor
Russ Allbery writes: > Sirius writes: >> This is quite actively discussed on Fedora lists. >> https://www.openwall.com/lists/oss-security/2024/ >> https://www.openwall.com/lists/oss-security/2024/03/29/4 >> Worth taking a look if action need to be taken on Debian. > The version of xz-utils was reverted to 5.4.5 in unstable yesterday by > the security team and migrated to testing today. Anyone running an > unstable or testing system should urgently upgrade. I think the big open question we need to ask now is what exactly the backdoor (or, rather, backdoors; we know there were at least two versions over time) did. If they only target sshd, that's one thing, and we have a bound on systems possibly affected. But liblzma is linked directly or indirectly into all sorts of things such as, to give an obvious example, apt-get. A lot of Debian developers use unstable or testing systems. If the exploit was also exfiltrating key material, backdooring systems that didn't use sshd, etc., we have a lot more cleanup to do. I think this question can only be answered with reverse-engineering of the backdoors, and I personally don't have the skills to do that. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: xz backdoor
Sirius writes: > This is quite actively discussed on Fedora lists. > https://www.openwall.com/lists/oss-security/2024/ > https://www.openwall.com/lists/oss-security/2024/03/29/4 > Worth taking a look if action need to be taken on Debian. The version of xz-utils was reverted to 5.4.5 in unstable yesterday by the security team and migrated to testing today. Anyone running an unstable or testing system should urgently upgrade. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: 64-bit time_t transition in progress in unstable
Eric Valette writes:

> You can force the migration by explicitly adding the package that it
> propose to remove (e.g gdb for libelf, ...)
> I managed to upgrade all packages you mention in your mail that way.
> Only libkf5akonadisearch-bin libkf5akonadisearch-plugins
> libkf5akonadisearchcore5t64 libkf5akonadisearchpim5t64
> libkf5akonadisearchxapian5t64 are missing because there are bugs in the
> Provides: for api /or the packe depending on the T64 ABI are not yet
> rebuild. I opened a bug for that

Ah, yes, that worked. It took some experimentation to figure out which packages could be forced and which ones were causing removals.

I'm down to only libzvbi-common having problems, which I can't manage to force without removing xine-ui. If I attempt to install them both together, I get this failure:

    The following packages have unmet dependencies:
     libxine2 : Depends: libxine2-plugins (= 1.2.13+hg20230710-2) but it is not going to be installed or
                         libxine2-misc-plugins (= 1.2.13+hg20230710-2+b3) but it is not going to be installed
     libxine2-ffmpeg : Depends: libavcodec60 (>= 7:6.0)
                       Depends: libavformat60 (>= 7:6.0)

The apt resolver seems to be struggling pretty hard to make sense of the correct upgrade path.

-- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: 64-bit time_t transition in progress in unstable
Steve Langasek writes:

> So once the libuuidt64 revert is done (later today?), if apt
> dist-upgrade is NOT working, I think we should want to see some apt
> output showing what's not working.

My current list of unupgradable packages on amd64 is:

    gir1.2-gstreamer-1.0/unstable 1.24.0-1 amd64 [upgradable from: 1.22.10-1]
    libegl-mesa0/unstable 24.0.2-1 amd64 [upgradable from: 24.0.1-1]
    libgbm1/unstable 24.0.2-1 amd64 [upgradable from: 24.0.1-1]
    libgl1-mesa-dri/unstable 24.0.2-1 amd64 [upgradable from: 24.0.1-1]
    libglapi-mesa/unstable 24.0.2-1 amd64 [upgradable from: 24.0.1-1]
    libglx-mesa0/unstable 24.0.2-1 amd64 [upgradable from: 24.0.1-1]
    libgstreamer1.0-0/unstable 1.24.0-1 amd64 [upgradable from: 1.22.10-1]
    libldb2/unstable 2:2.8.0+samba4.19.5+dfsg-4 amd64 [upgradable from: 2:2.8.0+samba4.19.5+dfsg-1]
    libspa-0.2-modules/unstable 1.0.3-1.1 amd64 [upgradable from: 1.0.3-1]
    libzvbi-common/unstable 0.2.42-1.2 all [upgradable from: 0.2.42-1.1]
    mesa-va-drivers/unstable 24.0.2-1 amd64 [upgradable from: 24.0.1-1]
    samba-libs/unstable 2:4.19.5+dfsg-4 amd64 [upgradable from: 2:4.19.5+dfsg-1]

Doing a bit of exploration, the root problems seem to be:

    libdebuginfod1 : Depends: libelf1 (= 0.190-1+b1)
    libdw1 : Depends: libelf1 (= 0.190-1+b1)
    libxine2-misc-plugins : Depends: libsmbclient (>= 2:4.0.3+dfsg1)
    libgl1-mesa-dri : Depends: libglapi-mesa (= 24.0.1-1)

I'm not sure what's blocking the chain ending in libelf1 since t64 versions of those libraries seem to be available, but attempting to force it would remove gdb and jupyter if that helps.

-- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: 64-bit time_t transition in progress in unstable
Kevin Bowling writes: > Are there instructions on how to progress an unstable system through > this, or is the repo currently in a known inconsistent state? I have > tried upgrading various packages to work through deps but I am unable to > do a dist-upgrade for a while. It doesn't look like the migration is finished yet, so this is expected. There are a whole lot of packages that need to be rebuilt and a whole lot of libraries, so some edge cases will doubtless take a while to sort out. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: usrmerge breaks POSIX
Russ Allbery writes: > Thorsten Glaser writes: >> Right… and why does pkexec check against /etc/shells? > pkexec checks against /etc/shells because this is the traditional way to > determine whether the user is in a restricted shell, and pkexec is > essentially a type of sudo and should be unavailable to anyone who is > using a restricted shell. Apologies, this turns out to be incorrect. I assumed this based on my prior experience with other programs that tested /etc/shells without doing my research properly. I should have been less certain here. After some research with git blame, it appears that pkexec checks SHELL against /etc/shells because pkexec passes SHELL to the program that it executes (possibly in a different security context) and was worried about users being able to manipulate and potentially compromise programs across that security boundary by setting SHELL to some attacker-controlled value. It is using /etc/shells as a list of possible valid values for that variable that are safe to pass on. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: usrmerge breaks POSIX
Russ Allbery writes:

> That definitely should not be the case and any restricted shell that
> adds itself to /etc/shells is buggy. See chsh(1):
>     The only restriction placed on the login shell is that the command
>     name must be listed in /etc/shells, unless the invoker is the
>     superuser, and then any value may be added. An account with a
>     restricted login shell may not change her login shell. For this
>     reason, placing /bin/rsh in /etc/shells is discouraged since
>     accidentally changing to a restricted shell would prevent the user
>     from ever changing her login shell back to its original value.

To follow up on this, currently rbash is added to /etc/shells, which is surprising to me and which I assume is what you were referring to. This seems directly contrary to the chsh advice. I can't find a reference to this in bash's changelog and am not sure of the reasons for this, though, so presumably I'm missing something.

I was only able to find this discussion of why pkexec checks $SHELL, and it doesn't support my assumption that it was an intentional security measure, so I may well be wrong in that part of my analysis. Apologies for that; I clearly should have done more research. git blame points to a commit that only references this thread:

https://lists.freedesktop.org/archives/polkit-devel/2009-December/000282.html

which seems to imply that this was done to match sudo behavior and because the author believed this was the right way to validate the SHELL setting.

-- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: usrmerge breaks POSIX
Vincent Lefevre writes:

> On 2024-02-15 14:14:46 -0800, Russ Allbery wrote:
>> and pkexec is essentially a type of sudo and should be unavailable to
>> anyone who is using a restricted shell.
> The pkexec source doesn't say that the goal is to check whether
> the user is in a restricted shell.

So far as I am aware, the only purpose served by /etc/shells historically and currently is to (a) prevent users from shooting themselves in the foot by using chsh to change their shell to something that isn't a shell, and (b) detect users who are not "normal users" and therefore should have restricted access to system services. See shells(5), for example:

    Be aware that there are programs which consult this file to find out
    if a user is a normal user; for example, FTP daemons traditionally
    disallow access to users with shells not included in this file.

> Also note than even in a restricted shell, the user may set $SHELL to a
> non-restricted shell.

This is generally not the case; see, for example, rbash(1):

    It behaves identically to bash with the exception that the following
    are disallowed or not performed:

    [...]

    * setting or unsetting the values of SHELL, PATH, HISTFILE, ENV, or
      BASH_ENV

> Moreover, /etc/shells also contains restricted shells.

That definitely should not be the case and any restricted shell that adds itself to /etc/shells is buggy. See chsh(1):

    The only restriction placed on the login shell is that the command
    name must be listed in /etc/shells, unless the invoker is the
    superuser, and then any value may be added. An account with a
    restricted login shell may not change her login shell. For this
    reason, placing /bin/rsh in /etc/shells is discouraged since
    accidentally changing to a restricted shell would prevent the user
    from ever changing her login shell back to its original value.

-- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: usrmerge breaks POSIX
Thorsten Glaser writes: > Right… and why does pkexec check against /etc/shells? pkexec checks against /etc/shells because this is the traditional way to determine whether the user is in a restricted shell, and pkexec is essentially a type of sudo and should be unavailable to anyone who is using a restricted shell. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: usrmerge breaks POSIX
Thorsten Glaser writes:

> Dixi quod…
>> Russ Allbery dixit:
>>> My guess is that pkexec is calling realpath to canonicalize the path
>>> before checking for it in /etc/shells, although I have not confirmed
>>> this.
>> Now that would be weird and should be fixed…
> Another question that probably should be answered first is that why
> pkexec (whatever that is) checks against /etc/shells and if that’s
> correct.

Okay, I have done more research. My speculation that pkexec might use realpath was wrong. It does only check the contents of the SHELL environment variable. See:

https://gitlab.freedesktop.org/polkit/polkit/-/blob/master/src/programs/pkexec.c?ref_type=heads#L343
https://gitlab.freedesktop.org/polkit/polkit/-/blob/master/src/programs/pkexec.c?ref_type=heads#L405

It does check whether $SHELL is found in /etc/shells. So your question about what is setting the $SHELL variable is a good one, although I think I would still argue that it's not the most effective way to solve the issue.

> I’d be really appreciative if I did not have to add extra nōn-canonical
> paths to /etc/shells for bugs in unrelated software.

I understand the appeal of that stance, but the problem with it is that there is no enforcement of this definition of canonical. I know that you consider /bin/mksh to be the correct path, but /usr/bin/mksh is also present and works exactly the same. chsh will prevent unprivileged users from changing their shell to the /usr/bin path because of /etc/shells, but not if someone makes that change as root. Also, I'm not sure useradd cares, or possibly other ways of adding a user with a shell (Puppet, for instance). Or, for that matter, just editing /etc/passwd as root, which I admit is how I usually set the shells of users because I've been using UNIX for too long.
Having only the /bin paths is fragile because it creates an expectation that every user who sets the shell is going to know that /bin/mksh is the correct path and /usr/bin/mksh is the wrong path and will not use the latter. I'm not sure how they're supposed to receive this information; I don't think it's going to be obvious to everyone who may be involved in setting the shell. We can tell everyone who ends up with /usr/bin/mksh that they need to change it to /bin/mksh, but this seems kind of tedious and annoying, and I'm not seeing the downside to registering both paths. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
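[Editorial sketch of the check pkexec performs, simplified from the pkexec.c lines linked above: the literal $SHELL string must appear in /etc/shells, with no canonicalization, which is exactly why a /usr/bin path missing from the file causes a denial.]

```python
def shell_allowed(shell: str, etc_shells: list[str]) -> bool:
    """Mimic pkexec's validation of $SHELL: the exact string must
    appear among the /etc/shells entries. No realpath canonicalization
    is done, so /bin/mksh and /usr/bin/mksh are distinct entries."""
    return shell in etc_shells

# mksh registering only the /bin path, as described in this thread:
shells = ["/bin/sh", "/bin/bash", "/bin/mksh"]
print(shell_allowed("/bin/mksh", shells))      # True
print(shell_allowed("/usr/bin/mksh", shells))  # False: pkexec denies access
```

Registering both the /bin and /usr/bin paths makes the set membership test succeed regardless of which alias ends up in $SHELL.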
Re: usrmerge breaks POSIX
Thorsten Glaser writes: > Russ Allbery dixit: >> 3. Something else that I don't yet understand happened that caused pkexec >>to detect the shell as /usr/bin/mksh instead of /bin/mksh. I'm not > What sets $SHELL for the reporter’s case? Fix that instead. login(1) > sets it to the path from passwd(5), which hopefully is from shells(5). My guess is that pkexec is calling realpath to canonicalize the path before checking for it in /etc/shells, although I have not confirmed this. Regardless, I think we should list both paths in /etc/shells because both paths are valid and there are various benign reasons why one might see the other path. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: usrmerge breaks POSIX
Vincent Lefevre writes:

> On 2024-02-14 17:16:23 -0800, Russ Allbery wrote:
> Quoting https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=817168
> | with usrmerge, some programs - such as pkexec, or LEAP's bitmask
> | on top of that- fails to run. Specifically, the error I get is:
> |
> | The value for the SHELL variable was not found the /etc/shells file
>> You mentioned /etc/shells in your previous message, but /etc/shells on my
>> system contains both the /usr/bin and the /bin paths, so I'm still at a
>> complete loss.
> Not for mksh.

Okay, thank you. I think I understand now. The problem is:

1. mksh uses custom postinst code to add itself to /etc/shells that does
   not add the /usr/bin versions of the mksh paths, only the /bin
   versions.

2. pkexec uses /etc/shells as an authorization mechanism to not allow
   access from people who use restricted shells, so if it detects your
   shell as /usr/bin/mksh instead of the (expected by /etc/shells)
   /bin/mksh path, it will deny access.

3. Something else that I don't yet understand happened that caused pkexec
   to detect the shell as /usr/bin/mksh instead of /bin/mksh. I'm not
   sure what this is, but I can guess at a few things that could cause
   this, so it's not surprising to me that it happened.

That pkexec uses /etc/shells in this way is a bit surprising, but I understand the goal. The intent is to keep people who are using restricted shells from accessing pkexec. I'm not sure this is the best way to achieve that security goal, but I can also see the potential for introducing security vulnerabilities in existing systems if we relaxed them now.

I think the obvious solution is to ensure that both the /bin and /usr/bin paths for mksh are registered in /etc/shells. In other words, I think we have a missing usrmerge-related transition here that we should just fix.
I'm copying Thorsten on this message in case he hasn't noticed this thread, but if I were you I'd just file a bug against mksh asking for the /usr/bin paths to also be added to /etc/shells to match the new behavior of add-shell. Hopefully most shells are using add-shell, and thus won't have this problem, but any other shell package in Debian that is intended to provide a non-restricted shell but is not using add-shell to manipulate /etc/shells will need a similar fix. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: usrmerge breaks POSIX
Vincent Lefevre writes: > On 2024-02-14 10:41:44 -0800, Russ Allbery wrote: >> I'm sorry, this is probably a really obvious question, but could you >> explain the connection between the subject of your mail message and the >> body of your mail message? I can't see any relationship, so I guess I >> need it spelled out for me in small words. >> (I believe /etc/shells enforcement is done via PAM or in specific >> programs that impose this as an additional non-POSIX restriction. This >> is outside the scope of POSIX.) > What's the point of having a standard if programs are allowed to > reject user settings for arbitrary and undocumented reasons? I have literally no idea what you're talking about. It would be really helpful if you would describe what program rejected your setting and what you expected to happen instead. You mentioned /etc/shells in your previous message, but /etc/shells on my system contains both the /usr/bin and the /bin paths, so I'm still at a complete loss. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: usrmerge breaks POSIX
Vincent Lefevre writes: > POSIX says: > SHELL This variable shall represent a pathname of the user's > preferred command language interpreter. If this interpreter > does not conform to the Shell Command Language in XCU > Chapter 2 (on page 2345), utilities may behave differently > from those described in POSIX.1-2017. > There is no requirement to match one of the /etc/shells pathnames. > The user or scripts should be free to use any arbitrary pathname to > the command language interpreter available on the system, and Debian > should ensure that this is allowed, in particular the one give by > the realpath command. I'm sorry, this is probably a really obvious question, but could you explain the connection between the subject of your mail message and the body of your mail message? I can't see any relationship, so I guess I need it spelled out for me in small words. (I believe /etc/shells enforcement is done via PAM or in specific programs that impose this as an additional non-POSIX restriction. This is outside the scope of POSIX.) -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Proposal for how to deal with Go/Rust/etc security bugs
Simon Josefsson writes: > I want to explore if there is a possibility to change status quo, and > what would be required to do so. > Given how often gnulib is vendored for C code in Debian, and other > similar examples, I don't think of this problem as purely a Go/Rust > problem. The parallel argument that we should not support coreutils, > sed, tar, gzip etc because they included vendored copies of gnulib code > is not reasonable. Since there are now a bunch of messages on this thread of people grumbling about Rust and Go and semi-proposing not even trying to package that software (and presumably removing python3-cryptography and everything that depends on it? I'm not sure where people think this argument is going), I wanted to counterbalance that by saying I completely agree with Simon's exploration here. Rebuilding a bunch of software after a security fix is not a completely intractable problem that we have no idea how to even approach. It's just CPU cycles and good metadata plus ensuring that our software can be rebuilt, something that we already promise. Some aspects of making this work will doubtless be *annoying*, but it doesn't seem outside of our capabilities as a project. Dealing with older versions is of course much more of a problem, particularly if upstream is not backporting security fixes, but this is a problem is inherent in having stable releases, that upstreams have been grumbly about long before either Rust or Go even existed, and that we have nonetheless dealt with throughout the whole history of Debian. There is no one-size-fits-all solution, but we have historically managed to muddle through in a mostly acceptable way. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
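[Editorial sketch of why mass rebuilds after a security fix are "just CPU cycles and good metadata": given a map from each package to the packages whose code is statically linked into it, the rebuild order is an ordinary topological sort. The package names and dependency data here are hypothetical.]

```python
from graphlib import TopologicalSorter

# build_deps[pkg] = packages whose code is statically linked into pkg.
# Hypothetical example data: a fix lands in rust-ring, so everything
# downstream of it must be rebuilt against the fixed code.
build_deps = {
    "rust-ring": [],
    "rust-rustls": ["rust-ring"],
    "some-rust-app": ["rust-rustls", "rust-ring"],
}

# Rebuild in dependency order: each package is rebuilt only after
# everything it vendors or statically links has been rebuilt.
order = list(TopologicalSorter(build_deps).static_order())
print(order)  # rust-ring first, some-rust-app last
```

The hard part in practice is the metadata (knowing what is statically linked into what), not the scheduling itself.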
Re: Policy: should libraries depend on services (daemons) that they can speak to?
Roger Lynn writes:

> On 15/01/2024 18:00, Russ Allbery wrote:
>> When you have the case of an application that optionally wants to do
>> foo, a shared library that acts as a client, and a daemon that does
>> foo, there are three options:
>>
>> 1. Always install the shared library and daemon even though it's an
>>    optional feature, because the shared library is a link dependency
>>    for the application and the shared library viewed in isolation does
>>    require the daemon be running to do anything useful.
>>
>> 2. Weaken the dependency between the shared library and the daemon so
>>    that the shared library can be installed without the daemon even
>>    though it's objectively useless in that situation because it's the
>>    easiest and least annoying way to let the application be installed
>>    without the daemon, and that's the goal. The shared library is
>>    usually tiny and causes no problems by being installed; it just
>>    doesn't work.
>>
>> 3. Weaken the dependency between the application and the shared
>>    library, which means the application has to dynamically load the
>>    shared library rather than just link with it. This is in some ways
>>    the most "correct" from a dependency perspective, but it's annoying
>>    to do, introduces new error handling cases in the application, and
>>    I suspect often upstream will flatly refuse to take such a patch.
>
> Unless I have misunderstood, I think you may have missed another option:
>
> 4. Let the leaf application declare the appropriate dependency on the
>    daemon, because the application writer/packager is in the best
>    position to know how important the functionality provided by the
>    daemon is to the application. This could be considered to be option
>    2b, and a "suggests" dependency of the library on the daemon may
>    still be appropriate.

I was thinking of this as a special case of 2, but yes, it's a sufficiently common special case that it's worth calling out on its own.
I'm not sure that this whole discussion belongs in Policy because it's very hard to make policy recommendations here without a lot of case-specific details, but a section in the Developers Guide or some similar resource about how to think about these cases seems like it might be useful. It does come up pretty regularly. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Policy: should libraries depend on services (daemons) that they can speak to?
"Theodore Ts'o" writes: > I'll argue that best practice is that upstream should make the shared > library useful *without* the daemon, but if the daemon is present, > perhaps the shared library can do a better job. Eh, I think this too depends on precisely what the shared library is for. The obvious example of where this doesn't work is when the shared library is a client for a local system service, and its entire point is to dispatch calls to that service, but the library and service combined implement an optional feature in some of the programs linked to it. I think that's a relatively common case and the sort of case that provokes most of the desire to not make shared libraries have hard dependencies on their services. There are a bunch of services that do not support (and often would never reasonably support) network connections to their underlying services. An obvious example is a library and service pair that represents a way to manage privilege escalation with isolation on the local system. You cannot make the shared library useful without the daemon because the entire point of the shared library and daemon pair is to not give those permissions to the process containing the shared library. When you have the case of an application that optionally wants to do foo, a shared library that acts as a client, and a daemon that does foo, there are three options: 1. Always install the shared library and daemon even though it's an optional feature, because the shared library is a link dependency for the application and the shared library viewed in isolation does require the daemon be running to do anything useful. 2. Weaken the dependency between the shared library and the daemon so that the shared library can be installed without the daemon even though it's objectively useless in that situation because it's the easiest and least annoying way to let the application be installed without the daemon, and that's the goal. 
The shared library is usually tiny and causes no problems by being installed; it just doesn't work. 3. Weaken the dependency between the application and the shared library, which means the application has to dynamically load the shared library rather than just link with it. This is in some ways the most "correct" from a dependency perspective, but it's annoying to do, introduces new error handling cases in the application, and I suspect often upstream will flatly refuse to take such a patch. We do 2 a lot because it's pragmatic and it doesn't really cause any practical problems, even though it technically means that we're not properly representing the dependencies of the shared library. We in general try not to do 1 for reasons that I think are sound. Minimizing the footprint of applications for people who don't want optional features is something that I personally value a lot in Debian. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
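The dynamic-loading approach of option 3 can be made concrete. The following is an illustration only, not any particular package's code; the library name is a stand-in, and the pattern is sketched in Python via ctypes rather than in C:

```python
import ctypes

def load_optional_client(soname):
    """Try to load an optional client library at runtime.

    Returns the loaded library handle, or None if it is not installed.
    This is option 3: the application degrades gracefully instead of
    carrying a hard link-time dependency on the shared library.
    """
    try:
        return ctypes.CDLL(soname)
    except OSError:
        return None

# The application guards every use of the optional feature.
# "libm.so.6" is a stand-in that exists on glibc-based Linux systems.
client = load_optional_client("libm.so.6")
if client is None:
    print("optional feature unavailable; continuing without it")
```

This is also where the new error handling cases mentioned above come from: every call path into the optional feature now needs the "is it loaded" check, which is exactly the work upstreams tend to refuse.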
Re: RFC: advise against using Proton Mail for Debian work?
Jeremy Stanley writes: > Or build and sign the .tar.gz, then provide the .tar.gz file to the > upload automation on GitHub for publishing to PyPI. Oh, yes, that would work. You'd want to unpack that tarball and re-run the tests and whatnot, but all very doable. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: RFC: advise against using Proton Mail for Debian work?
Salvo Tomaselli writes: > I am currently not using any service to upload to pypi. But this > requires the occasional creation and deletion of global tokens. > The only way to avoid global tokens is to upload from github, in which > case I can no longer sign the .tar.gz. Well, you *can*, but you would have to then download the .tar.gz from PyPI, perform whatever checks you need to in order to ensure it is a faithful copy of the source release, and then sign it and put that .asc file somewhere (such as a GitHub release artifact). But it's an annoying process and I'm not sure anyone has automated it. > A signature isn't the same as a checksum. Probably nobody was using them > because there was no way to check them automatically. I suspect this chicken-and-egg problem is the heart of it. There are similar mechanisms for Perl modules that, last I checked, no one really used, although I think there was some recent movement towards maybe integrating it a bit more. It's very hard to create a critical mass of people who care enough to keep all the pieces working. PGP signatures definitely seem to be a minority interest among most upstream language communities. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: reference Debian package of multiple binaries sharing one man page
Andrey Rakhmatullin writes: > On Fri, Nov 10, 2023 at 11:44:06AM -0800, Russ Allbery wrote: >> The good news is that if you're using debhelper, you don't have to care >> about how man handles these indirections and can just use a symlink. >> Install the man page into usr/share/man/man1 under whatever name is >> canonical (possibly by using dh_installman), and then create a symlink >> in usr/share/man/man1 from the other man page name to that file. >> dh_installman will then clean this all up for you and create proper .so >> links and you don't have to care about the proper syntax. > Isn't it the other way around? The whole idea of using .so is to tell > dh_installman(1) to create symlinks. Oh, indeed, you're right and I misread that. So I think you can just use symlinks, period, and not worry about .so (although you have to handle nodoc builds correctly). -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: reference Debian package of multiple binaries sharing one man page
Norwid Behrnd writes: > Recently, I started to upgrade the Debian package about > `markdownlint`,[1] a syntax checker. The initially packaged version > 0.12.0 provided a binary of name `ruby-mdl` which now becomes a > transition dummy package in favour of the functionally updated > `markdownlint`. > I wonder how to properly prepare an adjusted man page for both binaries, > because lintian warns about the absence for `usr/bin/mdl`.[2] I think the problem is that you put the .so line in the wrong file. It should be the entire contents of the file corresponding to the deprecated binary name, not a line in the file that represents the current binary name. The good news is that if you're using debhelper, you don't have to care about how man handles these indirections and can just use a symlink. Install the man page into usr/share/man/man1 under whatever name is canonical (possibly by using dh_installman), and then create a symlink in usr/share/man/man1 from the other man page name to that file. dh_installman will then clean this all up for you and create proper .so links and you don't have to care about the proper syntax. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
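For the concrete case above (names taken from the message; treat them as illustrative), the page for the deprecated name `mdl.1` would contain nothing but a .so request pointing at the canonical page:

```
.so man1/markdownlint.1
```

Alternatively, with the symlink approach, a line such as `usr/share/man/man1/markdownlint.1 usr/share/man/man1/mdl.1` in the package's debian/<package>.links file (handled by dh_link) produces the same indirection without hand-writing any roff.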
Re: Bug#1041731: Hyphens in man pages
"G. Branden Robinson" writes: > How about this? > \- Minus sign. \- produces the basic Latin hyphen‐minus > specifying Unix command‐line options and frequently used in > file names. “-” is a hyphen in roff; some output devices > replace it with U+2010 (hyphen) or similar. Sorry for my original message, which was very poorly worded and probably incredibly confusing. Let me try to make less of a hash of it. I think what I'm proposing is something like: \- Basic Latin hyphen-minus (U+002D) or ASCII hyphen. This is the character used for Unix command-line options and frequently in file names. It is non-breaking; roff will not wrap lines at this character. "-" (without the "\") is a true hyphen in roff, which is a different character; some output devices replace it with U+2010 (hyphen) or similar. What I was trying to get at but didn't express very well was to include the specific Unicode code point and to avoid the term "minus sign" because this character is not a minus sign in typography at all (although it is used that way in code). A minus sign is U+2212 and looks substantially different because it is designed to match the appearance of the plus sign. (For example, the line is often at a different height.) I don't know if *roff has a way of producing that character apart from providing it as Unicode. The above also explicitly says that it's non-breaking (I believe that's the case, although please tell me if I got that wrong) and is more (perhaps excessively) explicit about distinguishing it from "-" because of all the confusion about this. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Hyphens in man pages
Minor point, but since you posted it "G. Branden Robinson" writes: > ... > \- Minus sign or basic Latin hyphen‐minus. \- produces the > Unix command‐line option dash in the output. “-” is a > hyphen in the roff language; some output devices replace it > with U+2010 (hyphen) or similar. The official name of "the Unix command-line option dash" is the hyphen-minus character (U+002D). Given how much confusion there is about this, and particularly given how ambiguous the word "dash" is in typography (the hyphen-minus is one of 25 dashes in Unicode), you may want to say that explicitly in addition to saying that it's the character used in UNIX command-line options (and, arguably as importantly, in UNIX command names). -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Hyphens in man pages
Wookey writes: > I was left not actually knowing what - and \- represent, nor which one I > _should_ be using in my man pages. And that seems to be the one thing we > should be telling the 'average maintainer'. - turns into a real hyphen (‐, U+2010). \- turns into the ASCII hyphen-minus that we use for options, programming, and so forth (U+002D). I think my position at this point as pod2man maintainer (not yet implemented in podlators) is that every occurrence of - in POD source will be translated into \-, rather than using the current heuristics, and people who meant to use ‐ should type it directly in the POD source. pod2man now supports Unicode fairly well and will pass that along to *roff, which presumably will do the right thing with it after character set translation. Currently, pod2man uses an extensive set of heuristics, but I think this is a lost cause. I cannot think of any heuristic that will understand that the - in apt-get should be U+002D (so that one can search for the command as it is typed), but the - in apt-like should be U+2010, since this is an English hyphenated expression talking about programs that are similar to apt. This is simply not information that POD has available to it unless the user writing the document uses Unicode hyphens. I believe the primary formatting degradation will be for very long hyphenated phrases like super-long-adjectival-phrase-intended-as-a-joke, because *roff will now not break on those hyphens that have been turned into \-. People will have to rewrite them using proper Unicode hyphens to get proper formatting. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
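The planned pod2man behavior described above amounts to a blanket translation rather than heuristics. This sketch is mine, not the podlators code, but the rule it implements is the one stated in the message:

```python
def escape_pod_hyphens(text):
    """Translate every ASCII '-' (U+002D) in POD text into the roff
    escape \\-, which renders as a non-breaking hyphen-minus.

    A true Unicode hyphen (U+2010) in the source contains no U+002D,
    so it is left alone and passed through to *roff for character
    set translation.
    """
    return text.replace("-", "\\-")

print(escape_pod_hyphens("apt-get"))       # apt\-get: searchable as typed
print(escape_pod_hyphens("apt\u2010like")) # unchanged: U+2010, not U+002D
```

The cost is the one Russ names: long chains of ASCII hyphens become entirely non-breaking, so authors who want line breaks must switch those hyphens to U+2010 themselves.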
Re: Is there a generic canonical way for a package script to check network connectivity?
Jonathan Kamens writes: > Regarding what I'm trying to accomplish, as part of the revamp of > apt-listchanges I need to rebuild the database that apt-listchanges uses > to determine which changelog and NEWS entries it has already shown to the > user. This can mostly be done from files installed on the local machine, > but not for packages which don't ship a changelog.Debian file and instead > expect the user to fetch it over the network with "apt changelog". Based on some other private conversation, I think there may be an underlying misunderstanding here, which is quite inobvious if you're just looking at Debian packages without having read all the previous discussions that got us here. Either that, or I have some incorrect assumptions, and someone should correct me. :) I believe that the following statements are true: Every Debian package either ships changelog.Debian or symlinks its doc directory to another package that ships changelog.Debian. In the latter case, that is a declaration that the package has no unique changelog entries and its changelog is always and exactly the changelog of the other package. (This is used to deduplicate files among packages that are always built together from the same source and are usually installed together.) So there are no packages in Debian that expect the user to fetch the changelog over the network; the changelog is always guaranteed to be part of the content installed on disk. It can just be indirected through another package (if the packages follow some strict limitations). What *does* happen is that some packages (well, all packages that have been rebuilt with current debhelper, I think) have *truncated* changelogs, in order to prevent the changelog from wasting a lot of disk space with old entries, and the *full* changelog is only available via the network. 
But the guarantee for truncated changelogs is that all entries newer than the release date of oldstable are retained, so since Debian doesn't support skip-version upgrades, apt-listchanges should never need the content that is dropped by truncation. In other words, the intent is to guarantee that all the information that apt-listchanges needs is present on disk, but it would have to deal with the /usr/share/doc symlinks. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
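Given the guarantees described above, the lookup apt-listchanges needs is roughly the following. This is a sketch under those stated assumptions; the function name and fallback order are mine, not apt-listchanges code:

```python
import os

def changelog_path(package, doc_root="/usr/share/doc"):
    """Find the installed Debian changelog for a package.

    If doc_root/<package> is a symlink to another package's doc
    directory, that is a declaration that the target package's
    changelog is exactly this package's changelog, so following the
    symlink with realpath is the correct behavior.
    """
    real_dir = os.path.realpath(os.path.join(doc_root, package))
    for name in ("changelog.Debian.gz", "changelog.Debian"):
        candidate = os.path.join(real_dir, name)
        if os.path.exists(candidate):
            return candidate
    return None  # should not happen for a policy-compliant package
```

The point of the sketch is the realpath call: the changelog is always on disk, but for deduplicated packages it lives under another package's name.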
Re: Control header sent with done email didn't do what I expected, should it have?
Marvin Renich writes: > I've seen differing opinions about closing "wontfix" bugs, but as a > user, I appreciate when they are left open. Whether it is a simple > wishlist feature request or a crash when the user abuses the software, > if I go to file the same or similar bug at a later time, if the bug is > closed I will not see it and file a duplicate. If it is left open, I > can see the maintainer has already thought about it and intentionally > decided not to fix it, so I can save the trouble of refiling. Also, I > might gain some insight about the circumstances. I think it's a trade-off. There are some bugs that seem unlikely to ever come up again or that aren't helpfully worded, and I'm more willing to close those. Also, in the abstract, I don't like using the BTS as a documentation system, which is sort of what collecting wontfix amounts to. If it's something that I think is going to come up a lot, it feels better to put it into the actual documentation (README.Debian, a bug script if it's reported really often, etc.). You're also expecting everyone filing a bug to read through all the existing wontfix bugs (at least their titles), which in some cases is fine but in some cases can become overwhelming. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: debian/copyright format and SPDX
Sune Vuorela writes: > I do think that this is another point of "we should kill our babies if > they don't take off". And preferably faster if/when "we lost" the race. > We carried around the debian menu for a decade or so after we failed to > gain traction and people centered on desktop files. > We failed to gain traction on the structure of the copyright file, and > spdx is the one who has won here. I generally agree with everything you're saying, but I don't think it applies to the structure of the copyright file. Last I checked, SPDX even recommends that people use our format for complicated copyright summaries that their native format can't represent. It is hampered by being in a language that no one has a readily-available parser for, and I wish I'd supported the push for it to be in YAML at the time since YAML has been incredibly successful in the format wars due to the wild success of Kubernetes (which is heavily based on YAML at the UI layer although it uses JSON on the wire), but it's still one of the best if not the best format available for its purpose. (Yes, I know, the YAML spec is a massive mess, etc. It's also better than any other structured file format I've used among those with readily available parsers in every programming language, and you can use a very stripped-down version of it without object references and the like. TOML unfortunately failed miserably on nested tables in a way that makes it mostly unusable for a lot of applications YAML does well on.) -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: lpr/lpd
Simon Richter writes: > And yes, it is quicker for me to copy another printcap entry and swap > out the host name than it is to find out how to authenticate to CUPS, > set up the printer, print a test page then remove and recreate it > because the generated "short" name I need to pipe data into lpr isn't > short. I will definitely be looking into rlpr. Since I wrote my original message, I noticed that rlpr is orphaned. I no longer work in an office and print things about once a year, so I no longer use the package, but it was a lifesaver when I was working in an office regularly and I do recommend it. If anyone else who still prints regularly prefers the simple command-line interface, you may want to consider adopting it, although it looks like you're likely to have to adopt upstream as well since it seems to have disappeared. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: lpr/lpd
Christoph Biedl writes: > Well, not me. But the thing that puzzles me is the popcon numbers: lpr > has 755, lprng 233. > Assuming most of these installation were not done deliberately but are > rather by-catch, or: Caused by some package that eventually draws them > in, via a dependency that has "lpr" (or "lprng") in the first place > instead of "cups-bsd | lpr". For lpr, that might be xpaint. For lprng, I > have no idea. And there's little chance to know. It at least used to be that you could print directly to a remote printer with lpr and a pretty simple /etc/printcap entry that you could write manually. I used to use that mechanism to print to an office printer until I discovered rlpr, which is even better for that use case. It's possible some of those installations are people doing that, rather than via dependencies or other things (in which case they probably should move to rlpr). -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
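For anyone curious what a hand-written entry of that sort looked like, it was on the order of the following (host and queue names invented for illustration; capability details from memory, so treat this as a sketch rather than a tested configuration):

```
# /etc/printcap: print straight to a remote LPD queue
office|third floor printer:\
	:rm=printhost.example.com:\
	:rp=office:\
	:sd=/var/spool/lpd/office:\
	:sh:
```

Here rm names the remote host, rp the remote queue, sd the local spool directory, and sh suppresses burst pages. rlpr removes even this much setup by taking the host and queue on the command line.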
Re: Do not plan to support /usr/lib/pam.d for Debian pam
Marco d'Itri writes: > On Sep 15, Sam Hartman wrote: >> I have significant discomfort aligning what you say (pam is the last >> blocker) with what several people said earlier in the week. What I >> heard is that there was no project consensus to do this, and that >> people were running experiments to see what is possible. > Indeed. I did the experiments and they were unexpectedly positive: pam > is the only blocker for booting _the base system_. > I never expected that everything would immediately work just fine with > an empty /etc: my goal is to have support for this in the base system > and selected packages. This started as an experiment: you were going to try running the base system in this mode with existing packages and see what happens. You ran that experiment and got results: it doesn't work, but PAM appears to be the only blocker. So far, so good. You ran an experiment, the result was that the thing you want to do doesn't work, and now you understand what changes would be required to move forward. However, and this is very important, *no one has decided that you get to do that work in Debian*. Insofar as this is just a personal goal, sure, that's none of the business of anyone else. But if you want this to be a *project* goal, you're skipping a few important steps. The biggest ones are that there is no *plan* and no *agreement*. By plan, I mean an actual document spelling out in detail, not email messages with a few sentences about something that is familiar to you but not to other people who haven't been thinking about this, what base system support would look like. And by agreement, I mean that the maintainers of base system components agree that this is something that we are working towards as a project and something that they would not break lightly. 
Right now, any base system package maintainer could decide that putting configuration files in /etc makes sense for reasons of their own limited to their specific package and further break support for booting a system in this mode, and there are no grounds to ask them not to do this. Because you don't have an *agreement*. I feel like there is a tendency to consider work on Debian to be purely technical. If you turn it on and smoke doesn't come out, it works, so we have implemented that thing, and the goal is accomplished. This doesn't work, precisely because other people break your goal later (because they were never asked or never agreed with that goal), and then they are very confused about why you're upset and why your problems are now their problems. Or, worse, their packages are broken as collateral damage in accomplishing some goal, and you then argue that it's their problem to fix their packages, even though there was no agreement about that goal. Accomplishing things like this in Debian has a large social component that I think is being neglected. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: /usr/-only image
Luca Boccassi writes: > Perhaps 'modifications' was the wrong term, I meant the whole system > that handles the configuration. Correct me if I'm wrong, but AFAIK that > is all Debian-specific. Arch, Fedora and Suse do not have this issue. Speaking as the author of several PAM modules, Debian's PAM configuration system is also vastly superior to that of Arch, Fedora, and SuSE, which require that I as upstream provide complicated and tedious installation documentation for how people can configure my modules. It's a stark contrast with Debian, where I can just ship a configuration file and have everything happen automatically and correctly despite requiring some quite complex PAM syntax. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Do not plan to support /usr/lib/pam.d for Debian pam
Sam Hartman writes: >>>>>> "Peter" == Peter Pentchev writes: > Peter> Hm, what happens if a sysadmin deliberately removed a file > Peter> that the distribution ships in /etc, trying to make sure that > Peter> some specific service could never possibly succeed if it > Peter> should ever attempt PAM authentication? Then, if there is a > Peter> default shipped in /usr, the service authentication attempts > Peter> may suddenly start succeeding when the PAM packages are > Peter> upgraded on an existing system. > This might be an issue in general, but it is not an issue for PAM. PAM > falls back on the other service if a service configuration cannot be > found. I think that makes it an even more subtle problem, doesn't it? Currently, my understanding is that if I delete /etc/pam.d/lightdm, PAM falls back on /etc/pam.d/other. But if we define a search path for PAM configuration such that it first looks in /etc/pam.d and then in /usr/lib/pam.d, and I delete /etc/pam.d/lightdm, wouldn't PAM then fall back on /usr/lib/pam.d/lightdm and not /etc/pam.d/other? Unlike Peter's example, that would be a silent error; authentication may well succeed, but without running, say, pam_limits.so. I don't know if anyone is making this specific configuration change, but if they are, I think that result would be surprising. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
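A toy model of the two lookup schemes makes the failure mode explicit. This is not Linux-PAM's actual code; the directories are reduced to sets of service names purely for illustration:

```python
def pam_config_for(service, etc, usr=None):
    """Return (directory, name) of the config a PAM lookup would use.

    etc and usr are the sets of per-service files present in
    /etc/pam.d and /usr/lib/pam.d respectively.  Passing usr=None
    models today's behavior with no /usr search path.
    """
    if service in etc:
        return ("/etc/pam.d", service)
    if usr is not None and service in usr:
        return ("/usr/lib/pam.d", service)
    return ("/etc/pam.d", "other")  # the catch-all fallback

# Today: deleting /etc/pam.d/lightdm means the "other" config applies.
print(pam_config_for("lightdm", etc={"other"}))
# With a /usr search path: the shipped default silently takes over
# instead of "other", which is the surprising case described above.
print(pam_config_for("lightdm", etc={"other"}, usr={"lightdm"}))
```

The sysadmin's deletion is meaningful in the first scheme and silently undone in the second, without any error being reported anywhere.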
Re: /usr/-only image
Marc Haber writes: > I'd go so far that the systemd/udev way is a strategy to cope with > nearly non-existent conffile handling on non-Debian distributions. We > didn't do ourselves a favor by blindly adopting this scheme, while > we're having a vastly superior package manager that handles conffiles > and conffile changes so nicely. > Please consider not throwing this advantage away for the rest of our > distribution. I've been using Debian for a lot of years now, and while describing our configuration handling as vastly superior is possibly warranted (it's been a long time since I've tested the competition so I don't know from first-hand experience), saying that changes are handled nicely doesn't fit my experience. I have spent hours resolving configuration changes on Debian systems that turn out to be changes in comments or settings that I never changed, and even more hours maintaining absurdly complicated code that tries to handle in-place updates of all-in-one configuration files, extract information from them that needs to be used by maintainer scripts, or juggle the complicated interaction between debconf and the state machine of possible user changes to the file outside of debconf. This is certainly something that we put a lot of effort into, and those of us who have used Debian for a long time are used to it, but I wouldn't describe it as nice. Most of this problem is not of our creation. Managing configuration files in an unbounded set of possible syntaxes, many of which are ad hoc and have no standard parser and often do not support fragments in directories, is an inherently impossible problem, and we try very hard to carve out pieces of it that we can handle. But there are many packages for which a split configuration with a proper directory of overrides and a standard configuration syntax would be a *drastic* improvement over our complex single-file configuration management tools such as ucf, let alone over basic dpkg configuration file management. 
-- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Bug#885698: What licenses should be included in /usr/share/common-licenses?
Jonas Smedegaard writes: > Strictly speaking it is not (as I was more narrowly focusing on) that > the current debian/copyright spec leaves room for *ambiguity*, but > instead that there is a real risk of making mistakes when replacing with > centrally defined ones (e.g. redefining a local "Expat" from locally > meaning "MIT-ish legalese as stated in this project" to falsely mean > "the MIT-ish legalese that SPDX labels MIT"). Right, the existing copyright format defines a few standard labels and says that you should only use those labels when the license text matches, but it doesn't stress that "matches" means absolutely word-for-word identical. I suspect, although I haven't checked, that we've made at least a few mistakes where some license text that's basically equivalent to Expat is labelled as Expat even though the text is not word-for-word identical. Given that currently all labels in debian/copyright are essentially local and the full text is there (except for common-licenses, where apart from BSD the licenses normally are used verbatim), this is not currently really a bug. But we could turn it into a bug quite quickly if we relied on the license short name to look up the text. To take an example that I've been trying to get rid of for over a decade, many of the /usr/share/common-licenses/BSD references currently in the archive are incorrect. There are a few cases where the code is literally copyrighted only by the Regents of the University of California and uses exactly that license text, but this is not the case for a lot of them. It looks like a few people have even tried to say "use common-licenses but change the name in the license" rather than reproducing the license text, which I don't believe meets the terms of the license (although it's of course very unlikely that anyone would sue over it). 
A quick code search turns up the following examples, all of which I believe are wrong: https://sources.debian.org/src/mrpt/1:2.10.0+ds-3/doc/man-pages/pod/simul-beacons.pod/?hl=35#L35 https://sources.debian.org/src/gridengine/8.1.9+dfsg-11/debian/scripts/init_cluster/?hl=7#L7 https://sources.debian.org/src/rust-hyphenation/0.7.1-1/debian/copyright/?hl=278#L278 https://sources.debian.org/src/nim/1.6.14-1/debian/copyright/?hl=64#L64 https://sources.debian.org/src/yade/2023.02a-2/debian/copyright/?hl=78#L78 An example of one that probably is okay, although ideally we still wouldn't do this because there are other copyrights in the source: https://sources.debian.org/src/lpr/1:2008.05.17.3+nmu1/debian/copyright/?hl=15#L15 This problem potentially would happen a lot with the BSD licenses, since the copyright-format document points to SPDX and SPDX, since it only cares about labeling legally-equivalent documents, allows the license text to vary around things like the name of the person you're not supposed to say endorsed your software while still receiving the same label. We therefore cannot use solely SPDX as a way of determining whether we can substitute the text of the license automatically for people, because there are SPDX labels for a lot of licenses for which we'd need to copy and paste the exact license text because it varies. At least if I understand what our goals would be. (License texts that have portions that vary between packages they apply to are a menace and make everything much harder, and I really wish people would stop using them, but of course the world of software development is not going to listen to me.) -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Bug#885698: What licenses should be included in /usr/share/common-licenses?
Jonas Smedegaard writes: > If you mean to say that ambiguous MIT declarations exist in > debian/copyright files written using the machine-readable format, then > please point to an example, as I cannot imagine how that would look. I can see it: people use License: Expat but then include some license that is essentially, but not precisely, the same as Expat. If we then tell people that they can omit the text of the license and we'll fill it in automatically, they'll remove the actual text and we'll fill it in with the wrong thing. This is just a bug in handling the debian/copyright file, though. If we take this approach, we'll need to be very explicit that you can only use whatever triggers the automatic inclusion of the license text if your license text is word-for-word identical. Otherwise, you'll need to cut and paste it into the file as always. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
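The check such tooling would need is strict: normalize whitespace only, and require everything else to match exactly. A sketch of that rule (not anything we currently ship; the license snippets below are truncated fragments used only to show the comparison):

```python
def is_verbatim_license(candidate, reference):
    """True only if two license texts are word-for-word identical.

    Line wrapping and indentation are normalized away, but any
    difference in the actual words ("without fee" versus "free of
    charge", a changed name) means the short label must not trigger
    automatic substitution of the reference text.
    """
    return candidate.split() == reference.split()

expat_clause = "Permission is hereby granted, free of charge, to any person"
rewrapped = "Permission is hereby granted,\n   free of charge, to any person"
variant = "Permission is hereby granted, without fee, to any person"

print(is_verbatim_license(rewrapped, expat_clause))  # True: wrapping only
print(is_verbatim_license(variant, expat_clause))    # False: different words
```

Anything short of this word-for-word test would reintroduce exactly the Expat-that-isn't-Expat bug described above.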
Re: /usr/-only image
Simon Richter writes: > This would not work for a package like postfix, which absolutely > requires system-specific configuration, and we'd have to be careful with > packages like postgresql where there is a default configuration that > works fine for hobbyists that we do not make life too difficult for > professional users. I don't think there's any desire to avoid system-specific configuration. The model instead is that the package comes with a set of defaults, and if you don't set something in the local configuration in /etc, the default is used. I think this is exactly the model used by Postfix for main.cf. There are a few mandatory settings, but for the most part you can omit any setting and the default is used. The defaults are just hard-coded (at least so far as I know) rather than stored in separate configuration files in /usr, which doesn't make a fundamental difference. The problem configuration files are ones like Postfix's master.cf, where a whole ton of stuff almost no one ever changes is mixed into the same file that you're supposed to change for local configuration and there's no merger process. And honestly I have always hated the Postfix master.cf file, dating back to before systemd even existed. I think it's a bad configuration design. That of course is just my opinion and doesn't get us anywhere closer to using a defaults plus overrides syntax for master.cf, even assuming that upstream would consider it. There are a ton of packages with configuration syntaxes that were created a very long time ago or have accumulated over time. I maintain one of them upstream, INN, and I'll be the first to say that the INN configuration syntax is *awful*, and I have actively contributed to making it what it is. There are dozens of files, they use about fourteen completely separate and incompatible syntaxes, there's boilerplate in some places and defaults in other places, and learning all the ins and outs of the configuration is a full-time job. 
It's nonsense, and it's badly designed, and if I were writing it from scratch I'd replace the whole thing with simplified YAML or some similar well-known syntax with a schema and good editor support and a data model that supports configuration merging. And the chances of any of that happening when I have more free software projects lying on the floor in pieces than I have ones I'm managing to keep in the air is... low, even though I do have a much more active comaintainer. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
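The defaults-plus-overrides model described in this message can be sketched in a few lines. The option names below are invented for illustration and are not Postfix's or INN's real settings; a real implementation would read the override file from /etc.

```python
# A minimal sketch of the "defaults plus overrides" configuration model:
# the package ships hard-coded defaults, and a small local file in /etc
# only sets the keys that differ. Option names here are hypothetical.

def merge_config(defaults, overrides):
    """Return a config where local overrides win over shipped defaults."""
    merged = dict(defaults)
    merged.update(overrides)
    return merged

# Defaults as shipped by the package (hard-coded, or stored under /usr).
DEFAULTS = {
    "max_connections": 50,
    "spool_dir": "/var/spool/news",
    "log_level": "info",
}

# Pretend this came from a small file in /etc that only sets what differs.
local_overrides = {"log_level": "debug"}

config = merge_config(DEFAULTS, local_overrides)
```

The point of the model is that the local file stays tiny and upgrade-friendly: everything the administrator did not touch tracks the package's defaults automatically.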
Re: Bug#885698: What licenses should be included in /usr/share/common-licenses?
Jonas Smedegaard writes: > I have so far worked the most on identifying and grouping source data, > putting only little attention (yet - but do dream big...) towards > parsing and processing debian/copyright files e.g. to compare and assess > how well aligned the file is with the content it is supposed to cover. > So if I understand your question correctly and you are not looking for > the output of `licensecheck --list-licenses`, then unfortunately I have > nothing exciting to offer. I think that's mostly correct. I was wondering what would happen if one ran licensecheck debian/copyright, but unfortunately it doesn't look like it does anything useful. I tried it on one of my packages (remctl) that has a bunch of different licenses, and it just said: debian/copyright: MIT License and apparently ignored all of the other licenses present (FSFAP, FSFFULLR, ISC, X11, GPL-2.0-or-later with Autoconf-exception-generic, and GPL-3.0-or-later with Autoconf-exception-generic). It also doesn't notice that some of the MIT licenses are variations that contain people's names. (I still put all the Autoconf build machinery licenses in my debian/copyright file because of the tooling I use to manage my copyright file, which I also use upstream. I probably should change that, but I need to either switch to licensecheck or rewrite my horrible script.) Also, presumably it doesn't know about copyright-format since it wouldn't be expecting that in source files, so it wouldn't know to include licenses referenced in License stanzas without the license text included. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
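As an illustration of what a more useful inspection might report, here is a toy sketch that lists every distinct license expression named in a DEP-5 debian/copyright. It is a naive line-based reading, not a real copyright-format parser (it ignores continuation lines and stanza structure), and the sample stanzas are invented:

```python
# Toy sketch: collect every distinct "License:" expression from a DEP-5
# debian/copyright file, in order of first appearance. Deliberately
# naive; a real tool would parse the full copyright-format syntax.

def licenses_in_copyright(text):
    found = []
    for line in text.splitlines():
        if line.startswith("License:"):
            expr = line[len("License:"):].strip()
            if expr and expr not in found:
                found.append(expr)
    return found

# Invented sample stanzas for illustration.
sample = """\
Files: *
Copyright: 2024 Example Author
License: MIT

Files: m4/*
License: FSFAP

Files: configure
License: GPL-2.0-or-later with Autoconf-exception-generic
"""
```

Run on the sample, this reports all three expressions rather than stopping at the first, which is roughly the behavior the message above found missing.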
Re: /usr/-only image
Luca Boccassi writes: > On Sun, 10 Sept 2023 at 18:55, Nils Kattenbeck wrote: >> I am looking to generate a Debian image with only a /usr and /var >> partition as per discoverable partition specification. However, it >> seems to me like the omission of /etc leads to several issues in core >> packages and logging in becomes impossible. >> Is this an unsupported use case and if yes, is there ongoing work to >> eventually support this? >> Many packages in Fedora for example are already configured to support >> this using systemd-sysuser, systemd-tmpfiles, and other declarative >> means stored in /usr/ to create any required files upon boot. > It is being slowly worked towards, but we are still at the prerequisites > at this time. Hopefully we'll have some usable experiments for the > Trixie timeline, but nothing definite yet. Just to make this explicit, one of the prerequisites that has not yet happened is for Debian to agree that this is even something that we intend to do. So far as I know, no one has ever made a detailed, concrete proposal for what the implications of this would be for Debian, what the transition plan would look like, and how to address the various issues that will arise. Moving configuration files out of /etc, in particular, is something I feel confident saying that we do not have any sort of project consensus on, and is not something Debian as a project (as opposed to individuals within the project) is currently planning on working on. That doesn't mean we won't eventually do this, or that people aren't working on other prerequisites, or that it's not something that we're considering. But I just want to make clear that we are so early in this process that it is not at all clear that we are even going to do this at all, and there is a substantial discussion that would need to happen and detailed design proposal that would need to be written before there is any chance whatsoever that Debian will officially support this configuration. 
(This does not rule out the possibility that certain carefully-crafted configurations with a subset of packages may work in this mode, of course.) -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
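For concreteness, the declarative mechanisms Nils mentions (systemd-sysusers, systemd-tmpfiles) look roughly like the fragments below. These are hypothetical entries for an illustrative daemon, not files shipped by any real Debian package; the format follows the sysusers.d(5) and tmpfiles.d(5) schemes of declaring users and state directories in /usr so they can be recreated at boot without /etc.

```
# /usr/lib/sysusers.d/exampled.conf — create a system user at boot
u exampled - "Example daemon" /var/lib/exampled

# /usr/lib/tmpfiles.d/exampled.conf — create state directories at boot
d /var/lib/exampled 0750 exampled exampled -
d /var/log/exampled 0750 exampled exampled -
```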
Re: Bug#885698: What licenses should be included in /usr/share/common-licenses?
Johannes Schauer Marin Rodrigues writes: > I very much like this idea. The main reason maintainers want more > licenses in /usr/share/common-licenses/ is so that they do not anymore > have humongous d/copyright files with all license texts copypasted over > and over again. If long texts could be reduced to a reference that get > expanded by a machine it would make debian/copyright look much nicer and > would make it easier to maintain while at the same time shipping the > full license text in the binary package. > Does anybody know why such an approach would be a bad idea? I can think of a few possible problems: * I'm not sure if we generate binary package copyright files at build time right now, and if all of our tooling deals with this. I had thought that we prohibited this, but it looks like it's only a Policy should and there isn't a mention of it in the reject FAQ, so I think I was remembering the rule for debian/control instead. Of course, even if tools don't support this now, they could always be changed. * If ftp-master has to review the copyright files of each binary package separate from the copyright file of the source package (I think this would be an implication of generating the copyright files during build time), and the binary copyright files have fully-expanded licenses, that sounds like kind of a pain for the ftp-master reviewers. Maybe we can deal with this with better tooling, but someone would need to write that. * If we took this to its logical end point and did this with the GPL as well, we would add 20,000 copies of the GPL to the archive and install a *lot* of copies on the system. Admittedly text files are small and disks are large, but this still seems a little excessive. So maybe we still need to do something with common-licenses? -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
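The build-time expansion idea can be sketched as follows. The license texts are stubbed out with a dict here, where a real tool would read /usr/share/common-licenses, and the whole thing is illustrative rather than a description of any existing tool:

```python
# Sketch of build-time expansion: replace a bare License reference in a
# copyright file with the full text. Texts are faked with a dict; a real
# tool would load them from /usr/share/common-licenses.

LICENSE_TEXTS = {
    "GPL-2": "GNU GENERAL PUBLIC LICENSE Version 2 ... (full text)",
    "Apache-2.0": "Apache License Version 2.0 ... (full text)",
}

def expand_references(copyright_text):
    out = []
    for line in copyright_text.splitlines():
        out.append(line)
        if line.startswith("License:"):
            name = line.split(":", 1)[1].strip()
            if name in LICENSE_TEXTS:
                # Indent the continuation line as copyright-format requires.
                out.append(" " + LICENSE_TEXTS[name])
    return "\n".join(out)
```

Unknown license names pass through untouched, so a source copyright file with project-specific licenses would still build; the ftp-master review concern above is that the reviewed artifact would then differ from the source file.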
Re: What licenses should be included in /usr/share/common-licenses?
Jeremy Stanley writes: > I'm surprised, for example, by the absence of the ISC license given that > not only ISC's software but much of that originating from the OpenBSD > ecosystem uses it. My personal software projects also use the ISC > license. Are you aggregating the "License:" field in copyright files > too, or is it really simply a hard-coded list of matching patterns? It's only a hard-coded list of matching patterns, and it doesn't match any of the short licenses because historically I wasn't considering them (with the exception of common-licenses references to the BSD license, which I kind of would like to make an RC bug and clean up so that we could remove the BSD license from common-licenses on the grounds that it's specific to only the University of California and confuses people). If we go with any sort of threshold, the script will need serious improvements. That was something else I wanted to ask: I've invested all of a couple of hours in this script, and would be happy to throw it away in favor of something that tries to do a more proper job of classifying the licenses referenced in debian/copyright. Has someone already done this (Jonas, perhaps)? -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: What licenses should be included in /usr/share/common-licenses?
Russ Allbery writes: > In order to structure the discussion and prod people into thinking about > the implications, I will make the following straw man proposal. This is > what I would do if the decision was entirely up to me: > Licenses will be included in common-licenses if they meet all of the > following criteria: > * The license is DFSG-free. > * Exactly the same license wording is used by all works covered by it. > * The license applies to at least 100 source packages in Debian. > * The license text is longer than 25 lines. In the thread so far, there's been a bit of early convergence around my threshold of 100 packages above. I want to make sure people realize that this is a very conservative threshold that would mean saying no to most new license inclusion requests. My guess is that with the threshold set at 100, we will probably add around eight new licenses with the 25 line threshold (AGPL-2, Artistic-2.0, CC-BY 3.0, CC-BY 4.0, CC-BY-SA 3.0, CC-BY-SA 4.0, and OFL-1.1, and I'm not sure about some of those because the CC licenses have variants that would each have to reach the threshold independently; my current ad hoc script does not distinguish between the variants), and maybe 10 to 12 total without that threshold (adding Expat, zlib, some of the BSD licenses). This would essentially be continuing current practice except with more transparent and consistent criteria. It would mean not including a lot of long legal license texts that people have complained about having to duplicate, such as the CDDL, CeCILL licenses, probably the EPL, the Unicode license, etc. If that's what people want, that's what we'll do; as I said, that's what I would do if the choice were left entirely up to me. But I want to make sure I give the folks who want a much more relaxed standard a chance to speak up. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: What licenses should be included in /usr/share/common-licenses?
Jonas Smedegaard writes: > Quoting Hideki Yamane (2023-09-10 11:00:07) >> Hmm, how about providing license-common package and that depends on >> "license-common-list", and ISO image provides both, then? It would be >> no regressions. I do wonder why we've never done this. Does anyone know? common-licenses is in an essential package so it doesn't require a dependency and is always present, and we've leaned on that in the past in justifying not including those licenses in the binary packages themselves, but I'm not sure why a package dependency wouldn't be legally equivalent. We allow symlinking the /usr/share/doc directory in some cases where there is a dependency, so we don't strictly require every binary package have a copyright file. >> I expect license-common-list data as below >> >> license-short-name: URL >> GPL-2: file:///usr/share/common-licenses/GPL-2 >> Boost-1.0: https://spdx.org/licenses/BSL-1.0.html > Ah, so what you propose is to use file URIs. > I guess Russ' response above was a concern over using http(s) URIs > towards a non-local resource. Yes, I think the https URL is an essential part of the first proposal, since it avoids needing to ship a copy of all of the licenses. But I'm dubious that would pass legal muster. The alternative proposal as I understand it would be to have a license-common package that includes full copies of all the licenses with some more relaxed threshold requirement and have packages that use one of those licenses depend on that package. (This would obviously require a maintainer be found for the license-common package.) > License: Apache-2.0 > Reference: /usr/share/common-licenses/Apache-2.0 This is separate from this particular bug, but I would love to see the pointer to common-licenses turned into a formal field of this type in the copyright format, rather than being an ad hoc comment. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: What licenses should be included in /usr/share/common-licenses?
Hideki Yamane writes: > Russ Allbery wrote: >> Licenses will be included in common-licenses if they meet all of the >> following criteria: > How about just pointing SPDX licenses URL for whole license text and > lists DFSG-free licenses from that? (but yes, we should adjust short > name of licenses for DEP-5 and SPDX for it). Can we do this legally? If we can, it certainly has substantial merits, but I'm not sure that this satisfies the requirement in a lot of licenses to distribute a copy of the license along with the work. Some licenses may allow that to be provided as a URL, but I don't think they all do (which makes sense since people may receive Debian on physical media and not have Internet access). -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
What licenses should be included in /usr/share/common-licenses?
* The license applies to at least 100 source packages in Debian.
* The license text is longer than 25 lines.

I will attempt to guide and summarize discussion on this topic. No decision will be made immediately; I will summarize what I've heard first and be transparent about what direction I think the discussion is converging towards (if any).

Finally, as promised, here is the count of source packages in unstable that use the set of licenses that I taught my script to look for. This is likely not accurate; the script uses a bunch of heuristics and guesswork.

  AGPL 3                    277
  Apache 2.0               5274
  Artistic                 4187
  Artistic 2.0              337
  BSD (common-licenses)      42
  CC-BY 1.0                   3
  CC-BY 2.0                  15
  CC-BY 2.5                  13
  CC-BY 3.0                 240
  CC-BY 4.0                 159
  CC-BY-SA 1.0                8
  CC-BY-SA 2.0               48
  CC-BY-SA 2.5               16
  CC-BY-SA 3.0              425
  CC-BY-SA 4.0              237
  CC0-1.0                  1069
  CDDL                       67
  CeCILL                     30
  CeCILL-B                   13
  CeCILL-C                    9
  GFDL (any)                569
  GFDL (symlink)             55
  GFDL 1.2                  289
  GFDL 1.3                  231
  GPL (any)               20006
  GPL (symlink)            1331
  GPL 1                    4033
  GPL 2                   10466
  GPL 3                    6783
  LGPL (any)               5019
  LGPL (symlink)            265
  LGPL 2                   3850
  LGPL 2.1                 2926
  LGPL 3                   1526
  LaTeX PPL                  46
  LaTeX PPL (any)            40
  LaTeX PPL 1.3c             32
  MPL 1.1                   165
  MPL 2.0                   361
  SIL OFL 1.0                11
  SIL OFL 1.1               258

-- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: debian/copyright format and SPDX
Jonas Smedegaard writes: > Only issue I am aware of is that SPDX shortname "MIT" equals Debian > shortname "Expat". There was also some sort of weirdly ideological argument with the FSF about what identifiers to use for the GPL and related licenses, which resulted in SPDX using an "-only" and "-or-later" syntax in the identifier at the insistence of the FSF rather than a separate generic syntax the way that we do. https://spdx.org/licenses/ is the current license list and assigned short identifiers. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: debian/copyright format and SPDX
Jeremy Stanley writes: > Since Debian's machine-readable format has been around longer than > either of the newer formats you mentioned, it seems like it would make > more sense for the tools to incorporate a parser for it rather than > create needless churn in the package archive just to transform an > established standard into whatever the format-du-jour happens to be (and > then halfway through another new format gains popularity, and the > process starts all over again). I don't think the file format is the most interesting part of SPDX. They don't really have a competing format equivalent to the functionality of our copyright files (at least that I've seen; I vaguely follow their lists). Last time I looked, they were doing a lot with XML, which I don't think anyone would adopt for new formats these days. (YAML or TOML or something like that is now a lot more popular.) In terms of file formats, writing a lossy converter from Debian copyright files to whatever format is of interest for BOMs would probably do most of the job. The really interesting part of SPDX is the license list and the canonical name assignment, which is *way* more active and *way* more mature at this point than the equivalent in Debian. They have a much larger license list, which is currently being bolstered by Fedora, and the new licenses and rules for deduplicating them are reviewed by lawyers as part of their maintenance process. Their identifiers are also increasingly used in upstream software in SPDX-License-Identifier pseudo-headers. I have no idea how to do a transition, but I do think Debian would benefit from adopting the SPDX license identifiers where one exists, and possibly from joining forces with Fedora to submit and get identifiers assigned to the licenses that we see that are not yet registered. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
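A mapping layer of the kind such a transition would need might start out like this. The table below is a small illustrative subset, not a complete or authoritative concordance between Debian and SPDX short names:

```python
# Sketch of a Debian -> SPDX short-name mapping. Only known divergences
# need entries; identical names pass through. Illustrative subset only.

DEBIAN_TO_SPDX = {
    "Expat": "MIT",               # SPDX "MIT" is Debian's "Expat"
    "GPL-2": "GPL-2.0-only",      # SPDX spells out -only / -or-later
    "GPL-2+": "GPL-2.0-or-later",
    "GPL-3": "GPL-3.0-only",
    "GPL-3+": "GPL-3.0-or-later",
}

def to_spdx(debian_name):
    """Return the SPDX identifier, assuming identity when not mapped."""
    return DEBIAN_TO_SPDX.get(debian_name, debian_name)
```

The identity fallback reflects that many short names (Apache-2.0, for instance) are already the same in both lists; the hard part of a real transition is auditing which ones are not.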
Re: DEP-5 copyright with different licenses for two parts of the same file
Marc Haber writes: > Now, how do I write this in a DEP-5 copyright file? Having two stanzas > for the same file gets flagged by Lintian as an Error, and the DEP-5 > syntax doesn't seem to allow to mention two Licenses in the License: > line. This is the intended purpose of "and": cases where one file is covered by multiple licenses simultaneously. So, basically: License: LGPL-2+ and manpage-license or whatever the right tag for that second license is. This is a bit confusing when the licenses conflict, but I think it's close enough to capturing what's going on here, and you can explain further in a Comment. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
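Spelled out as a complete pair of stanzas, that suggestion might look like the following. "manpage-license" is a placeholder tag as in the message above, and the file name, names, and license text are invented for illustration:

```
Files: debian/foo.1
Copyright: 2024 Example Author
License: LGPL-2+ and manpage-license
Comment: The manual page source is under the LGPL; the generated page
 additionally carries the permission notice below.

License: manpage-license
 Permission is granted to copy, distribute and/or modify this manual
 page under the terms stated above.
```

Each tag in the "and" expression then gets its own standalone License stanza with the full text, which is what keeps Lintian satisfied while still recording both licenses for the same file.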
Re: Questionable Package Present in Debian: fortune-mod
ort of project-wide content-based decision on package vetting, and certainly against applying the Code of Conduct to something that does not have at all the same context as what the Code of Conduct was designed to address. If we were going to write a project content policy (which I'm dubious we really need to do, or that it would be worth the emotional effort required), I think it would look much different than the Code of Conduct because it would have different goals. It wouldn't be about building a community or encouraging productive collaboration, because the contents of our archive don't need to do either of those things. Lots of people use Debian who are not members of any shared community, and this is a feature. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: [RFC] Extending project standards to services linked through Vcs-*
Dominik George writes: > Hi, >> have you considered dgit? > no, as that's something entirely different. dgit does not > manage source packages in Git, it provides a Git frontend > to source packages not managed in Git. No, this is not really true. There's a lot of misunderstanding about dgit. It does in fact manage source packages in Git. You are thinking of the use of dgit for packages that don't use dgit in their upload flow. In those cases, yes, dgit creates a synthetic Git repository that only includes one commit per upload to Debian. Better than nothing, but not really managing the source package in Git. However, if one uses dgit in one's upload flow, all relevant Git changes are pushed to the dgit Git repository. You can clone the dgit repository and get exactly the Git repository that the package maintainer used to develop and upload the package, just as if you were using a Git forge. Obviously, dgit doesn't have the other functions of a Git forge, such as issue tracking, CI, or merge requests. But it does manage source packages in Git. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: [RFC] Extending project standards to services linked through Vcs-*
Dominik George writes: > On Mon, Aug 21, 2023 at 09:48:26AM -0700, Russ Allbery wrote: >> This implies that Salsa is happy to create accounts for people under >> the age of 13, since the implicit statement here is that Debian's own >> Git hosting infrastructure is less excluding than GitHub. >> That's a somewhat surprising statement to me, given the complicated >> legal issues involved in taking personal data from someone that young, >> so I want to double-check: is that in fact the case? > That is, in fact, the case. > And no, it's not not legally complicated to collect personal data from > children. If we, for now, only look at COPPA and GDPR, the laws relevant > for the US and EU, respectively, the situation is: [...] Thank you! This is good to know and I'm very happy that this is the case. I'm glad people have done the research that I hadn't done and worked out what was required! -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: [RFC] Extending project standards to services linked through Vcs-*
Dominik George writes: > For the GitHub case, the problematic terms would be that in order to > register for a GitHub account, users must be at least 13 or 16 years old > (depending on the jurisdiction) and must not live in a country under US > embargoes. This implies that Salsa is happy to create accounts for people under the age of 13, since the implicit statement here is that Debian's own Git hosting infrastructure is less excluding than GitHub. That's a somewhat surprising statement to me, given the complicated legal issues involved in taking personal data from someone that young, so I want to double-check: is that in fact the case? (US embargoes are indeed going to be a problem for any service hosted in the United States, and possibly an issue, depending on the details, for any maintainer with US citizenship even if they're using a site hosted elsewhere. I would not dare to venture an analysis without legal advice.)
Re: systemd-analyze security as a release goal
"Trent W. Buck" writes: > As someone who does that kind of thing a lot, I'd rather have > the increased annoyance of opt-out hardening than > the reduced security of opt-in hardening. > Even if it means I occasionally need to patch site-local rules into > /etc/apparmor.d/local/usr.bin.msmtp or > /etc/systemd/system/libvirtd.service.d/override.conf. I also feel this way but there are a bunch of people who really, really don't, and also it's not entirely obvious when hardening is failing or what overrides you need to add. So making this the default is hard, because it fundamentally breaks the "it has to work out of the box" property that people expect. Making it be semi-normal for daemons to not work out of the box depending on what configuration options or other packages you have installed is a hard sell. That makes me want some way to opt in to "hardening that might break something," but I'm not sure the best way to do that. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
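For concreteness, the opt-out pattern under discussion looks roughly like the drop-ins below: a package ships hardening directives, and a site that hits breakage relaxes individual directives in /etc. The unit name and the specific directive choices are illustrative, not from any real Debian package:

```
# Hypothetical hardening shipped by the package
# (/usr/lib/systemd/system/exampled.service.d/hardening.conf):
[Service]
ProtectSystem=strict
ProtectHome=yes
NoNewPrivileges=yes

# Site-local opt-out for one directive that breaks a local setup
# (/etc/systemd/system/exampled.service.d/override.conf):
[Service]
ProtectHome=no
```

The difficulty named in the message stands regardless of syntax: nothing tells the administrator *which* directive caused a failure, so the override file is easy to write only once you already know what to put in it.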