Re: Validating tarballs against git repositories
Otto Kekäläinen writes:

> On Tue, 2 Apr 2024 at 17:19, Jeremy Stanley wrote:

>> On 2024-04-02 16:44:54 -0700 (-0700), Russ Allbery wrote:
>> [...]

>>> I think a shallow clone of depth 1 is sufficient, although that's
>>> not sufficient to get the correct version number from Git in all
>>> cases.

>> [...]
>>
>> Some tools (python3-reno, for example) want to inspect the commits
>> and historical tags on branches, in order to do things like
>> assembling release notes documents. I don't know if any reno-using
>> projects packaged in Debian get release notes included, but if they
>> do then shallow clones would break that process. The python3-pbr

> You could use --depth=99 perhaps?

> Usually the difference of having depth=1 or 99 isn't that big unless
> there was a recent large refactoring. Git repositories that are very
> big (e.g. LibreOffice, MariaDB) have hundreds of thousands of
> commits, and by doing a depth=99 clone you avoid 99.995% of the
> history, and in projects where the changelog/release notes are based
> on git commits, 99 commits is probably enough.

I suppose that's *possible*, but I'd want to see some concrete survey
evidence to support it. I'm fairly sure that 99 commits would be
insufficient to build a change log in all cases even for some of my
small packages on which I'm the only developer, let alone for a project
with any significant commit volume and a policy of separating unrelated
changes into separate commits.

My guess is that the sweet spots are --depth=1 and a full checkout, that
it's not generally possible to tell in advance which one a given package
needs (in other words, it's best handled as a configuration option), and
that it's probably not worth the effort to mess around with any
intermediate depth. I suspect we'll find that the vast majority of
packages work fine with --depth=1, and the remaining cases should just
use a full checkout to avoid creating fragile assumptions that may work
today and break tomorrow.
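For reference, here is a quick illustration of the two sweet spots,
using a throwaway local repository (the repository and paths below are
invented for the demonstration):

```shell
set -e
tmp=$(mktemp -d)

# Build a toy "upstream" repository with three commits.
git init -q "$tmp/upstream"
for i in 1 2 3; do
    git -C "$tmp/upstream" -c user.name=Demo -c user.email=demo@example.org \
        commit -q --allow-empty -m "commit $i"
done

# A --depth=1 clone sees only the newest commit...
git clone -q --depth=1 "file://$tmp/upstream" "$tmp/shallow"
git -C "$tmp/shallow" rev-list --count HEAD    # prints 1

# ...but can be upgraded to a full checkout later if the package
# turns out to need the whole history.
git -C "$tmp/shallow" fetch -q --unshallow
git -C "$tmp/shallow" rev-list --count HEAD    # prints 3
```

The --unshallow upgrade path is one reason a boolean configuration
option (shallow or full) seems more robust than guessing at an
intermediate depth.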
-- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Representing Debian Metadata in Git
Chris Hofstaedtler writes:

> My *feeling* is we should do the opposite - that is, represent less
> Debian stuff in git, and especially do it in less Debian-specific
> ways. IOW, no git extensions, no setup with multiple branches that
> contain more or less unrelated things, etc.

+1

I think this is particularly important for attracting new contributors
and easing the onboarding process. There are a lot of odd
Debian-specific things that people have to learn because they're
necessary to make Debian work. I am dubious that the Git representation
is one of them, and would rather continue down the path of providing
Debian tools and processes that reduce the delta between how Debian
packaging uses Git and how most free software development uses Git.

--
Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: RFC: packages that can be installed by the user but not used as build dependencies
Lisandro Damián Nicanor Pérez Meyer writes:

> So, what about if we could have [meta] packages that can be installed
> by the user but not used as Build-Depends entries? Please note that
> for the moment I'm targeting more at the idea itself rather than at
> the implementation (but I'll certainly love to know if you have an
> idea on how this could be implemented).

> At one point I thought of adding a Lintian test checking for this
> kind of usage, but first and foremost I would like to know if you
> think this is a viable/acceptable idea, maybe even adding a special
> section in our policy.

I could have sworn that we already had tags like that in Lintian.
Certainly, this is a concept that has existed in Debian for some time.
There have always been metapackages or other similar cases that are only
intended for end users and would make no sense as build dependencies,
such as all of the task-* packages.

Lintian feels like the right place to put a test like this. If there are
dependencies like that which could potentially cause serious issues,
those could even be an auto-reject tag. I'm not sure that Policy would
have much to say about this unless we need some mechanism for labeling
such packages other than an MR to Lintian. The important information is
the list of packages that shouldn't be used this way, and the hard part
is probably gathering that list.

--
Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Accepting DEP14?
Chris Hofstaedtler writes:

> "latest" is ill-named. What do you expect to find in a branch that's
> called debian/latest?

> Packaging for unstable? For experimental? What if both evolve in
> parallel? Yes, some packages do that.

We discussed this a lot during the drafting of DEP14, and the reason the
standard allows either convention is that the right choice depends on
the package: there were two separate perspectives with no consensus that
one was universally better.

Maintainers of packages that normally upload to unstable, temporarily
move into experimental during freezes while considering it the same line
of development, and then move back into unstable after the release
preferred debian/latest, since it matched how they thought about the
line of development. People who maintained separate unstable and
experimental lines of development preferred debian/unstable and
debian/experimental.

Personally, I use debian/unstable but do experimental development in
that same branch if it's "targeting unstable," which is either the best
or the worst of both worlds, depending on your perspective.

--
Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: RFC: Sensible-editor sensible-utils alternative and update
Simon McVittie writes:

> The approach to this that will work consistently is to launch the
> handler asynchronously (in the background), and not attempt to find
> out whether it has exited or not. So for example an interactive shell
> script might do something like this:
>
> #!/bin/bash
> # note that disown is a bashism
> xdg-open "$document" &
> disown $!
> echo "Press Enter when you have finished editing $document..."
> read

What this is telling me is that ideally someone should tighten the
definition of EDITOR in Policy 11.4, which is the specification
satisfied by sensible-editor, to make it clear that GUI editors with
these sorts of properties are not valid things to point EDITOR at
unless flags are present to make them behave in a way that satisfies
the expectations of programs that use EDITOR.

I don't have a strong opinion on the merits of trying to figure out how
to invoke the editor with the proper flags to make it follow the
expectations of EDITOR if EDITOR is not set, but we do need to be
careful not to invoke programs that would cause, e.g., git commit
--amend to immediately exit with no changes to the commit message, and
to do that we probably need to write down what those expectations are.
I think the Policy language was written at a time when we just assumed
there was an obvious way for editors to behave that didn't include
things like backgrounding themselves.

--
Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
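As a concrete illustration of the "flags" caveat, here is a hedged
shell-profile sketch (the editor choices are only examples of editors
that document a foreground flag, not recommendations):

```shell
# GUI editors generally need an explicit "stay in the foreground" flag
# before they satisfy the blocking behavior that EDITOR consumers such
# as git commit expect.
export EDITOR="gvim -f"        # -f: do not fork into the background
# export EDITOR="code --wait"  # VS Code's equivalent flag
```

Without such a flag, the editor process returns immediately and the
calling program sees an empty or unchanged file, which is exactly the
git commit --amend failure mode described above.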
Re: Bug#1075905: ITP: python-fraction -- Fraction carries out all the fraction operations including addition, subtraction, multiplication, division, reciprocation
Yogeswaran Umasankar writes:

> As I look further, it appears that standard Python types such as
> float or decimal.Decimal do not provide exact representation of
> rational numbers (fractions) without potential loss of precision. The
> 'fraction' package seems to yield exact results because those
> functions work directly on fractions (to my limited understanding).

> decimal.Decimal is better than float, but it only extends to
> arbitrary-precision decimal arithmetic, not exact representation of
> rational numbers. Given that these libraries could potentially serve
> as dependencies for tensor-related packages and beyond, should we
> consider bringing in 'fraction' or restrict ourselves to float (which
> is the fallback in moarchiving if fraction is unavailable)?

I think the suggestion is to use the Python standard library module
"fractions" specifically:

https://docs.python.org/3/library/fractions.html

--
Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
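A short demonstration that the stdlib module already provides the exact
rational arithmetic being asked for, with no extra dependency:

```shell
python3 - <<'EOF'
from fractions import Fraction

# float accumulates binary rounding error...
print(0.1 + 0.2 == 0.3)                                      # False

# ...while Fraction stays exact, including when parsing decimal strings.
print(Fraction("0.1") + Fraction("0.2") == Fraction("0.3"))  # True
print(Fraction(1, 3) + Fraction(1, 6))                       # 1/2
EOF
```

Since fractions has been in the standard library since Python 2.6,
packaging a third-party 'fraction' module for this use case should not
be necessary.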
Re: Q: Ubuntu PPA induced version ordering mess.
Alec Leamas writes:

> So, at least three possible paths:
>
> 1. Persuade users to uninstall PPA packages before installing
>    official packages, and also generate generation 2 PPA packages
>    with sane versions like 5.10.x.
> 2. Use versions like 9000.5.10, 9000.5.12, etc.
> 3. Use an epoch.

> Of these I would say that 1. is a **very** hard sell upstream. Users
> are used to just updating and will try, fail, and cause friction.
> 2. and 3. both add something which must be kept forever. Given this
> choice I tend to think that the epoch is the lesser evil, mostly
> because the package version could match the "real" version.

I would use an epoch. It sounds like the PPA was in serious use by the
intended users and they're going to be switching to your packages. You
are trying to make that easy and avoid obvious and easily-foreseen
problems, and I think that's good: that's exactly what a maintainer
should do. If it were just a handful of people whom you could walk
through the transition, that might be different, but it sounds like
that's not the case.

2 is a hard sell to upstream for psychological reasons. Maybe it
shouldn't be, maybe upstream should be fine with this, but as you say
upstream in practice isn't going to be fine with this, and honestly if I
were upstream I probably wouldn't be either, even if I knew I should be.
It's hard enough to get people to use version numbers properly. Getting
them to use a "weird" version number that their users might be confused
by for the rest of time is going to be difficult. Changing the version
number only in Debian is even worse: that's just horribly confusing for
users and will be forever, and the confusion is going to affect upstream
as well. Basically, you'd be burning a lot of social capital with
upstream for no really good reason, and you probably still wouldn't be
able to convince them. I don't think it's worth it.

I would just use the epoch.
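For what it's worth, the resulting changelog entry is unremarkable (the
package name, version, and maintainer below are invented for
illustration):

```
foo (1:5.10.1-1) unstable; urgency=medium

  * New upstream release. Add an epoch so that this version supersedes
    the 9000.x versions previously published in the upstream PPA.

 -- A. Maintainer <maint@example.org>  Tue, 02 Apr 2024 12:00:00 +0000
```

Because a missing epoch is treated as epoch 0 and epochs are compared
first, 1:5.10.1-1 sorts above 9000.5.12-1 (dpkg --compare-versions
'1:5.10.1-1' gt '9000.5.12-1' succeeds), which is exactly the upgrade
property you need.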
I know people really hate them and they have a few weird and annoying
properties, but we have a bunch of packages with epochs and it's mostly
fine. It's something you'll have to keep working around forever, but not
in a way that's really that hard to deal with, IMO. (I would also warn
upstream that you're doing that, so that they know what the weird "1:"
thing means in bug reports in the future and why it's there.)

This feels like exactly the type of situation that epochs were designed
for: upstream was releasing packages with weird version numbers and now
they're effectively going back to normal version numbers that are much
smaller. In other words, to quote Policy, "situations where the upstream
version numbering scheme changes." Yes, in this case it was only in
their packages and not in their software releases, but that still counts
when they have an existing user base that has those packages installed.

--
Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Reviving schroot as used by sbuild
PICCA Frederic-Emmanuel writes:

>> Ah, thank you, I didn't realize that existed. That sounds like a
>> nice generalization of the file system snapshot approach.

> I think that this is how the sbuild-debian-developer-setup script
> sets up chroots.

Yeah, I think all my contribution to this thread accomplished was to
demonstrate that I set up sbuild years ago based on a wiki article about
btrfs and don't know what I'm talking about. :) Apologies for that.

--
Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Reviving schroot as used by sbuild
Guillem Jover writes:

> I manage my chroots with schroot (but not via sbuild, for dog fooding
> purposes :), and use type=directory and union-type=overlay so that I
> get a fast and persistent base, independent of the underlying
> filesystem, with fresh instances per session. (You can access the
> base via the source: names.) I never liked the type=file stuff, as
> it's slow to setup and maintain.

Ah, thank you, I didn't realize that existed. That sounds like a nice
generalization of the file system snapshot approach.

--
Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Reviving schroot as used by sbuild
Simon McVittie writes:

> Persisting a container root filesystem between multiple operations
> comes with some serious correctness issues if there are "hooks" that
> can modify it destructively on each operation: see
> <https://bugs.debian.org/499014> and
> <https://bugs.debian.org/994836>. As a result of that, I think the
> only model that should be used in new systems is to have some concept
> of a session (like schroot type=file, but unlike schroot
> type=directory) so that those "hooks" only run once, on session
> creation, preventing them from arbitrarily reverting/overwriting
> changes that are subsequently made by packages installed into the
> chroot/container (for example dbus' creation of the messagebus
> uid/gid in #499014, and exim4's creation of Debian-exim in #994836).

I'm not entirely sure that I'm following the nuances of this discussion,
so this may be irrelevant, but I think type=btrfs-snapshot provides the
ideal properties for container file systems. It unfortunately requires
file system support and therefore cannot be used unless you've already
embraced a file system with subvolumes, but if you have, you get all of
the speed of a persistent container root file system with none of the
correctness issues, because you get a fresh (and almost instant) clone
of a canonical root file system that is discarded after each build. I
use that in combination with a cron job that updates the source
subvolume daily to ensure that it's fully patched.

Unfortunately, there's no way that we can rely on this, but it would be
nice to continue to support it for those who are already using a
supported underlying file system.

--
Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
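For readers who haven't used it, a schroot.conf entry for that setup
looks roughly like this (chroot name and paths are invented; treat the
keys as a sketch against schroot.conf(5)):

```
[unstable-btrfs]
type=btrfs-snapshot
description=Debian unstable, btrfs snapshot per session
btrfs-source-subvolume=/srv/chroots/unstable
btrfs-snapshot-directory=/srv/chroots/snapshots
users=builder
root-users=builder
```

Each session snapshots the source subvolume into the snapshot
directory, so setup hooks run against a fresh clone that is discarded
when the session ends.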
Re: About i386 support
r...@neoquasar.org writes:

> Then it's not a problem in the first place. If you can't reproduce a
> bug with a reasonable effort, then it is unconfirmed and you can stop
> worrying about it.

I think you're confusing two different types of reproduction.
Architecture porting bugs are often hardware-specific. The bug may be
100% reproducible on one instance of the architecture, an instance that
you do not own and do not have access to. So the package is reliably
broken for a user trying to use that architecture, and yet the porter
has limited ability to triage or debug it because they don't have
access to that hardware.

This is one of the reasons why projects (not just Debian) drop support
for architectures. Once the *maintainers* no longer have easy access to
instances of an architecture, it's very hard to support, even if users
keep trying to use that architecture and run into problems that are
reproducible for them.

That's the first hurdle. The second hurdle is that frequently the cause
of these problems is deep inside the compiler, the kernel, or some
other complex piece of upstream code. There are a very limited number
of people who have the ability to track down and fix problems like
that, since doing so can require a lot of toolchain expertise. It's not
a simple thing to commit to.

Debian relies fairly heavily on a whole ecosystem of upstream
developers to do a lot of the difficult work of supporting
architectures, including the kernel, GCC, binutils, etc. If that
ecosystem stops supporting an architecture, it will be very difficult
for Debian to keep supporting it, and doing so usually requires the
people interested in keeping that architecture working to also become
upstream kernel, GCC, etc. developers.

--
Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: File descriptor hard limit is now bumped to the kernel max
Simon McVittie writes:

> On Thu, 06 Jun 2024 at 18:39:15 +0200, Marco d'Itri wrote:

>> Something did, because inn would start reporting ~1G available fds
>> and then explode, and that patch solved the issue. :-)

> It might be worthwhile to try to track down what larger component did
> this, because inheriting a larger rlim_cur without opt-in can also
> break users of select(2) as described in
> <https://0pointer.net/blog/file-descriptor-limits.html>.

I took a quick look at the old INN source and didn't see anything
obvious. I was half-expecting it to do something like set the soft
limit to the hard limit (that sounds like a very INN sort of thing to
do), but if so, I couldn't find it in a quick search.

--
Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
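For anyone who wants to hunt for this pattern themselves, the suspected
behavior is easy to reproduce; this is an illustration of the pattern,
not code from INN:

```shell
python3 - <<'EOF'
import resource

soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("before:", soft, hard)

# The "very INN sort of thing": raise the soft fd limit to the hard
# limit. Every child inherits the large rlim_cur, which breaks select(2)
# users once descriptors exceed FD_SETSIZE (usually 1024).
resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))

soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
assert soft == hard
print("after:", soft, hard)
EOF
```

Grepping a source tree for setrlimit calls that pass rlim_max as
rlim_cur is a reasonable way to find the larger component responsible.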
Re: DEP17 /usr-move: debootstrap set uploaded
Marc Haber writes:

> Helmut Grohne wrote:

>> Thanks for bearing with me and also thanks to all the people
>> (release team and affected package maintainers in particular) who
>> support this work.

> Thank you for doing this work. I have rarely seen a change of this
> magnitude in Debian that was managed on this professional level. I
> especially praise the way you have communicated the progress.

100% agreed. The care and excellence that you've brought to this work
has been exceptional.

--
Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Make /tmp/ a tmpfs and cleanup /var/tmp/ on a timer by default [was: Re: systemd: tmpfiles.d not cleaning /var/tmp by default]
Matthew Garrett writes:

> On Mon, May 06, 2024 at 07:42:11AM -0700, Russ Allbery wrote:

>> Historically, deleting anything in /var/tmp that hadn't been
>> accessed in over seven days was a perfectly reasonable and typical
>> configuration. These days, we have the complication that it's fairly
>> common to turn off atime updates for performance reasons, which
>> makes it a bit harder to implement that policy when /var/tmp isn't
>> its own partition and thus inherits that setting from the rest of
>> the system.

> Apologies for being a bit late to this, but is this true?
> relatime-type setups will still update atime if the time between the
> previous update and the access is larger than some threshold, so you
> lose some degree of granularity but the rough policy should still
> apply.

You are correct, and I completely forgot about that.

--
Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: MBF: drop dependencies on system-log-daemon
Simon McVittie writes:

> I know fail2ban and logcheck do read plain-text logs (although as
> mentioned, fail2ban already has native Journal-reading support too),
> and I would guess that fwlogwatch, snort and xwatch probably also
> read the logs.

logcheck also has native journal-reading support. Note that its
dependency is only Suggests. I have not checked whether that dependency
is there for some other reason.

--
Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Documenting packaging workflows
Johannes Schauer Marin Rodrigues writes:

> I would be *very* interested in more in-depth write-ups of the
> workflows other DDs prefer to use, how they use them and what they
> think makes them better than the alternatives.

> Personally, I start packaging something without git; once I'm
> satisfied I use "gbp import-dsc" to create a packaging git with
> pristine-tar (and that will *not* have DEP14 branches and it will use
> "master" instead of "main"), and then I push that to salsa and do
> more fixes until the pipeline succeeds and lintian is happy. My
> patches are managed using quilt in d/patches, and upstream git is not
> part of my packaging git. I upload using "dgit --quilt=gbp
> push-source".

> Would another workflow make me more happy? Probably! But how would I
> get to know them or get convinced that they are better? Maybe I'm
> missing out on existing write-ups or video recordings which explain
> how others do their packaging and why it is superior?

One of the lesser-known things found in the dgit package is a set of
man pages describing several packaging workflows. These certainly don't
exhaust the packaging workflows that people use (for one thing, they
are designed to explain how to use dgit in each workflow, and thus of
course assume that you want to do that), but they're both succinct and
fairly thorough, and I found reading them very helpful. dgit(1) has a
list at the start.

dgit-maint-debrebase(7) is the workflow that I now use, pretty much
exactly as described. The primary thing that I like about it is that I
never have to deal with externalized patches or with quilt (I used
quilt for years and have developed a real dislike for it and its odd
quirks -- I would rather only deal with Git's odd quirks), and I can
use a git-rebase-like command in much the same way that I routinely use
rebase for feature branches when doing upstream development.
Then some Git magic happens behind the scenes to make this safe, and
while I don't understand the details, it has always worked fine, so I
don't really care how it works. :)

I like having a Git history from as early in the process as possible
and I want the upstream Git history to refer to while I work on
packaging (and want to be able to cherry-pick upstream commits), so I
generally start with a tagged release from the upstream Git repository,
create a branch based on it, and start writing and committing debian/*
files.

--
Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: finally end single-person maintainership
Salvo Tomaselli writes:

> If the debian/ directory is on salsa, but the rest of the project is
> somewhere else, then this no longer works. I have to tag in 2
> different places, I have 2 different repositories to push to, and so
> on.

For what it's worth, what I do for the packages for which I'm also
upstream is just add Salsa as another remote and, after I upload a new
version of the Debian package, push to Salsa as well (yes, including
all the upstream branches; why not, the Debian branches are based on
them anyway, so it's not much more space). One of these days I'll get
CI set up properly, and then it will be worthwhile to push to Salsa
*before* I upload the package and let it do some additional checking.

It's still an additional step, and I still sometimes forget to do it,
but after some one-time setup, it's a fairly trivial amount of work.
It's more work to accept a merge request on Salsa and update the
repositories appropriately, since there are two repositories in play,
but in that case I'm getting a contribution that I might not have
gotten otherwise, so to me that seems worth it.

I used to try to keep the debian directory in a separate repository, or
to keep the Debian Git branches in a separate repository, and all of
that was just annoying and tedious and didn't feel like it accomplished
much. Just pushing the same branches everywhere is easy and seems to
accomplish the same thing.

--
Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
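The one-time setup is nothing exotic; here is a sketch using throwaway
local repositories standing in for the upstream host and Salsa (the
repository names and the debian/latest branch are invented for the
demonstration):

```shell
set -e
tmp=$(mktemp -d)

# Bare repository standing in for the Salsa project.
git init -q --bare "$tmp/salsa.git"

# Working repository with an upstream branch plus a Debian branch.
git init -q "$tmp/work"
cd "$tmp/work"
git -c user.name=Demo -c user.email=demo@example.org \
    commit -q --allow-empty -m "upstream work"
git branch -q debian/latest

# Add Salsa as a second remote and push everything, upstream branches
# and tags included.
git remote add salsa "$tmp/salsa.git"
git push -q salsa --all
git push -q salsa --tags

git ls-remote --heads salsa    # both branches now exist on "salsa"
```

After this, the extra step after each upload is just the two push
commands (or a shell alias wrapping them).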
Re: finally end single-person maintainership
Stefano Rivera writes:

> On the other hand, dgit is only useful if you have a certain view of
> the world, one that hasn't aligned with how I've done Debian
> packaging. I mean, an entirely git-centric view where you let go of
> trying to maintain your patch stack.

dgit has no problems with you maintaining your patch stack, at least as
I understand that statement. I personally use the
dgit-maint-debrebase(7) workflow, which is a fancy way of maintaining
your patch stack using an equivalent of git rebase, since I love git
rebase and use it all the time. But I used the dgit-maint-gbp(7)
workflow, which is basically just the normal git-buildpackage workflow,
for years, still use it for some of my packages, and it works fine.
Maybe you mean something different by this than I think you meant.

--
Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: finally end single-person maintainership
Simon Richter writes:

> A better approach would not treat Debian metadata as git data. Even
> the most vocal advocate of switching everything to Salsa writes in
> his MR that the changelog should not be touched in a commit, because
> it creates conflicts, and instead a manual step will need to be
> performed later.

This is not a Debian-specific problem and has nothing to do with any
special properties of our workflows or differences between packaging
and other software maintenance tasks. It's a common issue faced by
everyone who has ever maintained a software package in Git and wanted
to publish a change log. There are oodles of tools and workflows to
handle this problem, ranging from writing the change log from the Git
commits when you're making the release to accumulating fragments of
release notes in separate files and using a tool to merge them. dch's
approach of using the Git commit messages is one of the standard
solutions, one that will be familiar to many people who have faced this
same problem in other contexts.

The hard part with these sorts of problems is agreeing on the tool and
workflow to use to solve them, something Debian struggles with more
than most software projects because we lack a decision-making body that
can say things like "we're going to use scriv" and make it stick. But
that isn't because packaging is a special problem unsuited to Git; Git
has a rich ecosystem with many effective solutions to problems of this
sort. It's because we've chosen a governance model that intentionally
makes central decision-making, and therefore consistency and
coordination, difficult, in exchange for other perceived benefits.

--
Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Confused about libnuma1 naming
"J.J. Martzki" writes:

> Package 'libnuma1' is built from numactl, and there seems to be no
> 'libnuma'. Why is it named 'libnuma1' rather than 'libnuma'?

Shared library packages should be named after the library SONAME, which
generally includes a version (as it does here). See:

https://www.debian.org/doc/debian-policy/ch-sharedlibs.html#run-time-shared-libraries

--
Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: De-vendoring gnulib in Debian packages
Ansgar 🙀 writes:

> In ecosystems like NPM, Cargo, Golang, Python and so on pinning to
> specific versions is also "explicitly intended to be used"; they just
> sometimes don't include convenience copies directly as they have
> tooling to download these (which is not allowed in Debian).

Yeah, this is a somewhat different case that isn't well-documented in
Policy at the moment.

> (Arguably Debian should use those more often as keeping all software
> at the same dependency version is a futile effort IMHO...)

There's a straight tradeoff with security effort: more security work is
required for every additional copy of a library that exists in Debian
stable.

(And, of course, some languages have better support than others for
having multiple simultaneously-installed versions of the same library.
Python's support for this is not great; the ecosystem expectation is
that one uses separate virtualenvs, which don't really solve the Debian
build dependency problem.)

--
Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: De-vendoring gnulib in Debian packages
"Theodore Ts'o" writes:

> The best solution to this is to try to encourage people to put those
> autoconf macros that they are manually maintaining, and that can't be
> supplied otherwise, in acinclude.m4, which is now included by default
> by autoconf in addition to aclocal.m4.

Or use a subdirectory named something like m4, so that you can put each
conceptually separate macro in a separate file and not mush everything
together, and use:

    AC_CONFIG_MACRO_DIR([m4])

(and set ACLOCAL_AMFLAGS = -I m4 in Makefile.am if you're also using
Automake).

> Note that how we treat gnulib is a bit different from how we treat
> other C shared libraries, where we claim that *all* libraries must be
> dynamically linked, and that including source code by reference is
> against Debian Policy, precisely because of the toil needed to update
> all of the binary packages should some security vulnerability get
> discovered in the library which is either linked statically or
> included by code duplication.

> And yet, we seem to have given a pass for gnulib, probably because it
> would be too awkward to enforce that rule *everywhere*, so apparently
> we've turned a blind eye.

No, there's an explicit exception for cases like gnulib. Policy 4.13:

    Some software packages include in their distribution convenience
    copies of code from other software packages, generally so that
    users compiling from source don't have to download multiple
    packages. Debian packages should not make use of these convenience
    copies unless the included package is explicitly intended to be
    used in this way.

--
Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Avoiding /var/tmp for long-running compute (was: Make /tmp/ a tmpfs and cleanup /var/tmp/ on a timer by default [was: Re: systemd: tmpfiles.d not cleaning /var/tmp by default])
"Jonathan Dowland" writes:

> Else-thread, Russ begs people to stop doing this. I agree people
> shouldn't! We should also work on education and promotion of the
> alternatives.

Also, helping people use better tools for managing workloads like this,
tools that make their lives easier and have better semantics, thus
improving life for everyone.

I'm suggesting solutions that I don't have time to help implement, and
of course it will take a long time for better tools to filter into all
those clusters, so this doesn't address the immediate problem of this
thread (hence the subject change). But based on my past experience with
these types of systems, I bet a lot of the patterns captured in
software are older ones. Linux has a *lot* of facilities today that it
didn't have, or that at least weren't widely used, five years ago. It
would be great to help some of those improvements filter down, because
they can make a lot of these problems go away.

For example, take the case of scratch space for batch computing. The
logical lifespan of temporary files for a batch computing job is the
lifetime of the job, whatever that may be. (I know there are
exceptions, but here I'm just talking about defaults.) Previously one
would have to build support into the batch job management system for
creating and managing those per-job temporary directories, and ensure
the jobs support TMPDIR or other environment variables to control where
they store data, and everyone was doing this independently. (I've done
a *lot* of this kind of thing, once upon a time.)

But now we have mount namespaces, and systemd has PrivateTmp, which
builds on top of them. So if the job is managed by an execution
manager, it can create per-job temporary directories, it may already
support (as systemd does) the semantics of deleting the contents of
those directories on job exit, and it can bind-mount those directories
into the process space with the process none the wiser.
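The systemd side of that is a single directive; a sketch of a
hypothetical execution-manager template unit (the unit name and paths
are invented):

```
# batch-job@.service
[Service]
ExecStart=/usr/local/bin/run-batch-job %i
# Each job instance sees its own private /tmp and /var/tmp via a mount
# namespace; the backing directories are removed when the service stops.
PrivateTmp=yes
```

The crufty job still writes to /tmp as it always has, but its scratch
files now live and die with the job.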
I think all of the desirable glue may not fully be there (controlling
what underlying file system is used for PrivateTmp, ensuring those
directories are also excluded from normal cleanup, etc.), but this is
very close to a much better way of handling this problem that still
exposes /tmp and /var/tmp to the job so that none of the often-crufty
scientific computing software has to change.

The new capabilities that Linux has gained from namespaces are
marvellous and solve a whole lot of problems that I didn't realize were
even solvable, and right now I suspect there are huge opportunities for
substantial improvements without a whole lot of effort by just plumbing
those facilities through to higher-level layers like batch systems.
Whole classes of long-standing problems would just disappear, or at
least be far, far easier to manage.

Substantial, substantial caveat: I have been out of this world for a
while, and maybe most of this work has already been done? That would be
amazing. The best possible response to this post would be for someone
to tell me I'm five years behind and the batch systems have already
picked up this work and we can just point people at them.

--
Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Make /tmp/ a tmpfs and cleanup /var/tmp/ on a timer by default [was: Re: systemd: tmpfiles.d not cleaning /var/tmp by default]
Simon Richter writes: > On 5/8/24 07:05, Russ Allbery wrote: >> It sounds like that is what kicked off this discussion, but moving /tmp >> to tmpfs also usually makes programs that use /tmp run faster. I >> believe that was the original motivation for tmpfs back in the day. > IIRC it started out as an implementation of POSIX SHM, and was later > generalized. I believe you're correct for Linux specifically but not in general for UNIX. For example, I'm fairly sure this is not the case on Solaris, which was the first place I encountered tmpfs and where tmpfs /tmp was the default starting in Solaris 2.1 in 1992. tmpfs was present in SunOS in 1987, so I'm pretty sure it predates POSIX shared memory. Linux was very, very late to the tmpfs world. > When /var runs full, the problem is probably initrd building. I'm not quite sure what to make of this statement. On my systems, /var contains all sorts of rather large things, such as PostgreSQL databases, INN spool files, and mail spools. I have filled up /var on many systems over the years, and it's never been by building initrd images. > Taking a quick look around all my machines, the accumulated cruft in > /var/tmp is on the order of kilobytes -- mostly reportbug files, and a > few from audacity -- and these machines have not been reinstalled in the > last ten years. Yes, I don't think many programs use it. I think that's a good thing; the specific semantics of /var/tmp are only useful in fairly narrow situations, and overfilling it is fairly dangerous. Back in the day, /var/tmp was the thing that you used if /tmp was too small (because it was usually tmpfs). For example, using sort -T /var/tmp to sort large files is an old UNIX rune. And, of course, students would use it because they ran out of quota in their home directories and then get upset when their files got deleted automatically, back in the days of shared UNIX login clusters. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Make /tmp/ a tmpfs and cleanup /var/tmp/ on a timer by default [was: Re: systemd: tmpfiles.d not cleaning /var/tmp by default]
Richard Lewis writes: > btw, i'm not trying to argue against the change, but i dont yet > understand the rationale (which id like to be put into the > release-notes): is there perhaps something more compelling than "other > distributions and upstream already do this"? It sounds like that is what kicked off this discussion, but moving /tmp to tmpfs also usually makes programs that use /tmp run faster. I believe that was the original motivation for tmpfs back in the day. For /var/tmp, I think the primary motivation to garbage-collect those files is that filling up /var/tmp is often quite bad for the system. It's frequently not on its own partition, but is shared with at least /var, and filling up /var can be very bad. It can result in bounced mail, unstable services, and other serious problems. Most modern desktop systems now have large enough drives that this isn't as much of a concern as it used to be, but VMs often still have quite small / partitions and put /var/tmp on that partition. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Bug#966621: Make /tmp/ a tmpfs and cleanup /var/tmp/ on a timer by default [was: Re: systemd: tmpfiles.d not cleaning /var/tmp by default]
Richard Lewis writes: > Luca Boccassi writes: >> what would break where, and how to fix it? > Another one for you to investigate: I believe apt source and 'apt-get > source' download and extract things into /tmp, as in the mmdebootstap > example mentioned by someone else, this will create "old" files that > could immediately be flagged for deletion causing surprises. > (People restoring from backups might also find this an issue) systemd-tmpfiles respects atime and ctime by default, not just mtime, so I think this would only be a problem on file systems that didn't support those attributes. atime is often turned off, but I believe support for ctime is fairly universal among the likely file systems for /var/tmp, and I believe tmpfs supports all three. (I'm not 100% sure, though, so please correct me if I'm wrong.) -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Make /tmp/ a tmpfs and cleanup /var/tmp/ on a timer by default [was: Re: systemd: tmpfiles.d not cleaning /var/tmp by default]
Hakan Bayındır writes: > The applications users use create these temporary files without users' > knowledge. They work in their own directories, but applications create > another job dependent state files in both /tmp and /var/tmp. These are > different programs and I assure you they’re not created there because > user (or we) configured something. These files live there during the > lifetime of the job, and cleaned afterwards by the application. Then someone should fix those applications, because that behavior will result in user data loss if they're not fixed. However, first one should check whether the applications are just honoring TMPDIR or equivalent variables, in which case TMPDIR on batch systems often should be set to a user-specific or job-specific persistent directory for exactly this reason. That way you can use a user-specific cleanup strategy, such as purging that directory when all of the user's jobs have finished. I understand your point, which is that this pattern is out there in the wild and Debian is in danger of breaking existing usage patterns by matching the defaults of other distributions. This is a valid point, and I appreciate you making it. My replies are not intended to dispute that point, but to say that the burden of addressing this buggy behavior should not rest entirely on Debian. What the combination of batch system and application is doing is semantically incorrect and is dangerous, and it really should be fixed. Even if Debian changes nothing, at some point someone will deploy workers with a different base operating system and be very surprised when these files are automatically deleted. We were automatically cleaning /tmp and /var/tmp on commercial UNIX systems in 1995 and fixing broken applications that didn't honor TMPDIR. This is not a new problem. Nor is having /var/tmp fill up and cause all sorts of system problems because someone turned off /var/tmp cleaning while trying to work around broken applications. 
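A minimal sketch of that wrapper pattern (the variable names and paths are hypothetical; a real batch system would do this in its job prolog and epilog):

```shell
# Hypothetical per-job wrapper: give the job its own scratch
# directory and point TMPDIR at it, so well-behaved applications
# never touch the shared /tmp or /var/tmp at all.
set -eu

JOB_ID="${JOB_ID:-demo.1}"
SCRATCH="$(mktemp -d "/tmp/job-${JOB_ID}.XXXXXX")"
export TMPDIR="$SCRATCH"

# Run the actual job here; it inherits TMPDIR.
sh -c 'touch "$TMPDIR/state-file"'

# Per-job cleanup: the scratch space dies with the job.
rm -rf "$SCRATCH"
```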
-- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Make /tmp/ a tmpfs and cleanup /var/tmp/ on a timer by default [was: Re: systemd: tmpfiles.d not cleaning /var/tmp by default]
Hakan Bayındır writes: > Dear Russ, >> If you are running a long-running task that produces data that you >> care about, make a directory for it to use, whether in your home >> directory, /opt, /srv, whatever. > Sorry but, clusters, batch systems and other automated systems doesn't > work that way. Yours might not, but I spent 20 years maintaining clusters and batch systems and I assure you that's how mine worked. > That's not an extension of the home directory in any way. After users > submit their jobs to the cluster, they neither have access to the > execution node, nor they can pick and choose where to put their files. > These files may stay there up to a couple of weeks, and deleting > everything periodically will probably corrupt the jobs of these users > somehow. Using /var/tmp for this purpose is not a good design decision. Directories are free; they can make a new one and point the files of batch jobs there. They don't have to overload a directory that historically has different semantics and is often periodically cleared. I get that this may not be your design or something you have control over, so telling you this doesn't directly help, but the point still stands. Again, obviously the people configuring that cluster can configure it however they want, including overriding the /var/tmp cleanup policy. But they're playing with fire by training users to use /var/tmp, and it's going to result in someone getting their data deleted at some point, regardless of what Debian does. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Make /tmp/ a tmpfs and cleanup /var/tmp/ on a timer by default [was: Re: systemd: tmpfiles.d not cleaning /var/tmp by default]
Hakan Bayındır writes: > Consider a long running task, which will take days or weeks (which is > the norm in simulation and science domains in general). System emitted a > warning after three days, that it'll delete my files in three days. My > job won't be finished, and I'll be losing three days of work unless I > catch that warning. I have to admit that I'm a little surprised at the number of people who are apparently using /var/tmp for things that are clearly not temporary files in the traditional UNIX sense. Clearly this bit of folk knowledge is not as widespread as I thought, so we have to figure out how to deal with that, but periodically deleting files out of /var/tmp has been common (not universal, but common) UNIX practice for at least thirty years. Whatever we do with /var/tmp retention, I beg people to stop using /var/tmp for data you're keeping for longer than a few days and care about losing. That's not what it's for, and you *will* be bitten by this someday, somewhere, because even with existing Debian configuration many people run tmpreaper or similar programs. If you are running a long-running task that produces data that you care about, make a directory for it to use, whether in your home directory, /opt, /srv, whatever. /var/tmp's primary purpose historically was to support things like temporary recovery files that needed to survive a system crash, but which were still expected to be *temporary* in that one would then either use the recovery file or expect it to be deleted. Not as an extension of people's home directory. Your system is your system, so of course you can configure /var/tmp however you want and no one is going to stop you, but a lot of people on this thread are describing habits that are going to lose their data if they use a different distribution or even a differently-configured Debian distribution with tmpreaper installed. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Bug#966621: Make /tmp/ a tmpfs and cleanup /var/tmp/ on a timer by default [was: Re: systemd: tmpfiles.d not cleaning /var/tmp by default]
Luca Boccassi writes: > Richard Lewis wrote: >> - tmux stores sockets in /tmp/tmux-$UID >> - I think screen might use /tmp/screens >> I suppose if you detached for a long time you might find yourself >> unable to reattach. >> I think you can change the location of these. > And those are useful only as long as screen/tmux are still running, > right (I don't really use either that much)? If so, a flock is the right > solution for these Also, using /tmp as a path for those sockets was always a questionable decision. I believe current versions of screen use /run/screen, which is a more reasonable location. Using a per-user directory would be even better, although I think screen intentionally supports shared screens between users (which is a somewhat terrifying feature from a security standpoint, but that's a different argument). -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Make /tmp/ a tmpfs and cleanup /var/tmp/ on a timer by default [was: Re: systemd: tmpfiles.d not cleaning /var/tmp by default]
Andrey Rakhmatullin writes: > On Mon, May 06, 2024 at 10:40:00AM +0200, Michael Biebl wrote: >> I'm not sure if we have software on long running servers which place >> files in /tmp and /var/tmp and expect files to not be deleted during >> runtime, even if not accessed for a long time. This is certainly an >> issue to be aware of and keep an eye on. > Note that FHS mandates it for /var/tmp: "Files and directories located > in /var/tmp must not be deleted when the system is booted. Although data > stored in /var/tmp is typically deleted in a site-specific manner, it is > recommended that deletions occur at a less frequent interval than /tmp." It mandates that it not be cleaned on *boot*. Not that it never be cleaned during runtime. It anticipates that it be cleaned periodically, just less frequently than /tmp. There is a specific prohibition against clearing /var/tmp on reboot because /var/tmp historically has been used to store temporary files whose whole reason for existence is that they need to survive a reboot, such as vi recover files, but are still safe to delete periodically. Historically, deleting anything in /var/tmp that hadn't been accessed in over seven days was a perfectly reasonable and typical configuration. These days, we have the complication that it's fairly common to turn off atime updates for performance reasons, which makes it a bit harder to implement that policy when /var/tmp isn't its own partition and thus inherits that setting from the rest of the system. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
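For reference, that kind of tiered policy is what systemd's tmpfiles.d(5) age field expresses; the upstream defaults look roughly like this (my recollection of upstream's tmp.conf, which Debian may of course override):

```
# tmpfiles.d sketch: "q" (a variant of "d") creates the directory
# if needed and cleans out entries whose timestamps (mtime, and
# where available atime/ctime) are older than the age field.
q /tmp 1777 root root 10d
q /var/tmp 1777 root root 30d
```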
Re: Silent hijacking and stripping records from changelog
Jonas Smedegaard writes: > Quoting Jonathan Dowland (2024-04-17 17:29:11) >> On Wed Apr 17, 2024 at 10:39 AM BST, Jonas Smedegaard wrote: >>> Interesting: Can you elaborate on those examplary contributions of >>> yours which highlighted a need for maintaining all Haskell packages in >>> same git repo? >> My Haskell contributions (which I did not enumerate) are tangential to >> the use of a monorepo. But it strikes me as an odd choice for you to >> describe them as examplary. Paired with you seeming to file me on "the >> opposing side", your mail reads to me as unnecessarily snarky. Please >> do not CC me for listmail. > I can see why it might come across as snarky. It was not intended that > way. > I just meant to write describe your contributions as examples, but I > realize now that with your emphasizing it that I wrongly described them > as extraordinary examples. I suspect (based on Jonas's domain) this is one of those subtle problems when English isn't your first language. The English language is full of weird connotation traps. For anyone else who may not be aware of this subtle shade of meaning, an English dictionary will partly lie to you about the common meaning of "exemplary" (which I assume is what Jonas meant by "examplary"). Yes, it means "serving as an example," but it specifically means serving as an *ideal* example: something that should be held up as being particularly excellent or worthy of imitation. If you ask someone "could you elaborate on your exemplary contributions," a native English speaker is going to assume you're being sarcastic about 90% of the time. In common usage, that phrase usually carries a tone closer to "please do enlighten us about your amazing contributions" than what Jonas actually intended. 
I keep having to remind myself of this in Debian since many Debian contributors have *excellent* written English skills (certainly massively better than my language skills in any language other than English), so it's easy to fall into the trap of assuming that they're completely fluent. But English is full of problems like this that will trip up even highly advanced non-native speakers. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Debian openssh option review: considering splitting out GSS-API key exchange
Florian Lohoff writes: > These times have long gone and tcp wrapper as a security mechanism has > lost its reliability, this is why people started moving away from tcp > wrapper (which i think is a shame) > I personally moved to nftables which is nearly as simple once you get > your muscle memory set. If ssh is your only candidate of network service > you could also use match statements in /etc/ssh/sshd_config.d/.

For what it's worth, I have iptables (I know, it's nftables under the hood now, but I'm still using the iptables syntax because the number of hours in each day is annoyingly low) on every system I run, and I still use TCP wrappers for ssh restrictions for one host. That's because I have users who use various ISPs, and for some of those ISPs, DNS-based restrictions are less maintenance work than playing whack-a-mole with their ever-changing IP blocks.

Yes, yes, I know this isn't actually secure, etc., but that's fine, I'm not using it as a primary security measure. I'm using it to narrow the number of hosts on the Internet that can exploit an sshd vulnerability, and to reduce the amount of annoying automated exploit attempts I get. (Exactly the kind of thing that helps mildly against situations like the xz backdoor.)

That said, the point that I could switch over to Match blocks in the sshd configuration is well-taken, and not wanting to take an hour to rewrite my rules in a different configuration format is probably not a good enough reason to keep a dependency in a security-critical, network-exposed service. I'm mildly grumbly because it's yet another thing I have to change just to keep things from breaking, but such is life. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
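For anyone contemplating the same migration, the Match-based version might look roughly like this (the source networks are documentation-range placeholders, and this is an untested sketch of the idea, not known-good configuration):

```
# /etc/ssh/sshd_config.d/restrict.conf (sketch)
# Match Address takes a pattern list with '!' negation, so this
# block applies to every source address *except* the listed
# networks, and refuses all users from those other addresses.
Match Address *,!192.0.2.0/24,!198.51.100.0/24
    DenyUsers *
```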
Re: Validating tarballs against git repositories
Stefano Rivera writes: > Then you haven't come across any that are using this mechanism to > install data, yet. You're only seeing the version determination. You > will, at some point run into this problem. It's getting more popular. Yup, we use this mechanism heavily at work, since it avoids having to separately maintain a MANIFEST.in file. Anything that's checked in to Git in the appropriate trees ships with the module. But this means that you have to build the module from a Git repository, if you're not using the artifact uploaded to PyPI (which expands out all the information derived from Git). If I correctly remember the failure mode, which I sometimes run into during local development if I forget to git add new data files, the data files are just not installed since nothing tells the build system they should be included with the module. I think a shallow clone of depth 1 is sufficient, although that's not sufficient to get the correct version number from Git in all cases. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
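For what it's worth, in the Python world the usual implementation of this mechanism is setuptools-scm, whose Git-based file finder produces exactly the behaviour described above; a minimal sketch of the configuration (project details elided):

```toml
# pyproject.toml (sketch)
[build-system]
requires = ["setuptools>=64", "setuptools-scm>=8"]
build-backend = "setuptools.build_meta"

# Enabling the plugin both derives the version from Git tags and
# registers a file finder, so anything tracked by Git under the
# package tree ships with the sdist -- and anything not yet
# `git add`ed is silently omitted, the failure mode noted above.
[tool.setuptools_scm]
```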
Re: Validating tarballs against git repositories
Adrian Bunk writes: > On Mon, Apr 01, 2024 at 11:17:21AM -0400, Theodore Ts'o wrote: >> Yeah, that too. There are still people building e2fsprogs on AIX, >> Solaris, and other legacy Unix systems, and I'd hate to break them, or >> require a lot of pain for people who are building on MacPorts, et. al. >>... > Everything you mention should already be supported by Meson. Meson honestly sounds great, and I personally love the idea of using a build system whose language is a bit more like Python, since I use that language professionally anyway. (It would be nice if it *was* Python rather than yet another ad hoc language, but I also get why they may want to restrict it.) The prospect of converting 25 years of portability code from M4 into a new language is daunting, however. For folks new to this ecosystem, what resources are already available? Are there large libraries of tests already out there akin to gnulib and the Autoconf Archive? Is there a really good "porting from Autotools" guide for Meson that goes beyond the very cursory guide in the Meson documentation? The problem with this sort of migration is that it is an immense amount of work just to get back to where you started. I look at the amount of effort and start thinking things like "well, if I'm going to rewrite a bunch of things anyway, maybe I should just rewrite the software in Rust instead." -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Debian openssh option review: considering splitting out GSS-API key exchange
Christoph Anton Mitterer writes: > Actually I think that most sites where I "need"/use GSSAPI... only > require the ticket for AFS, and do actually allow pubkey auth (but > right now, one doesn't have AFS access then). In past discussions of this patch, this has not been the case. One of the advantages of GSSAPI key exchange is that you can disable public keys for all of your hosts and never manage known hosts, instead only using the system Kerberos keytabs. Since in a Kerberos environment you have to put keytabs on every host *anyway*, and that *is* the host's identity in a Kerberos environment, this reduces the number of key infrastructures you have to manage by one, which matters to some Kerberos deployments. This arguably gives you better security in that specific environment because keytabs do not rely on leap-of-faith initial authentication; the server is always properly authenticated, even on first connect. > Not sure if there's a simple out of the box way to just transfer that > but without all the other GSSAPI stuff? If you want your ticket to refresh remotely when you refresh it locally, which is often needed for Kerberos applications like AFS, you do need key exchange, since that's the mechanism that allows that to happen. (I use both GSSAPI and tcpwrappers, so Colin's proposal would mean more work for me, but given the situation, I'm willing to rework the way that I use ssh to avoid both going forward. More features are nice, but I can see the merits of simplicity here. But I no longer maintain a large infrastructure built on Kerberos, so I'm not putting as much weight on the GSSAPI support as I used to.) -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
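For reference, the client-side options involved look like this (these option names come from the GSS-API key exchange patch under discussion, not stock OpenSSH, and the host pattern is hypothetical):

```
# ~/.ssh/config (sketch; only valid with the GSS-API key
# exchange patch applied)
Host *.example.org
    GSSAPIAuthentication yes
    GSSAPIKeyExchange yes
    GSSAPIDelegateCredentials yes
    # Re-key the connection when the local ticket is renewed, so
    # the delegated ticket on the remote side is refreshed too.
    GSSAPIRenewalForcesRekey yes
```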
Re: xz backdoor
Bastian Blank writes: > I don't understand what you are trying to say. If we add a hard check > to lintian for m4/*, set it to auto-reject, then it is fully irrelevant > if the upload is a tarball or git. Er, well, there goes every C package for which I'm upstream, all of which have M4 macros in m4/* that do not come from an external source. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Command /usr/bin/mv wrong message in German
Russell Stuart writes: > The reason I'm replying is after one, probably two decades this still > annoys me: >$ dpkg -S /etc/profile >dpkg-query: no path found matching pattern /etc/profile > It was put their by the Debian install, and I'm unlikely to change it. > Its fairly important security wise. It would be nice if "dpkg -S" told > me base-files.deb installed it. It would be nice if debsums told me if > it changed. There are lots of files like this, such as /etc/environment > and /etc/hosts. There are some directories like /etc/apt/trusted.gpg.d/ > which should only have files claimed by some .deb.

Guillem has a plan for addressing this, I believe as part of metadata tracking, that would allow such files to be registered by their packages and then tracked by dpkg. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Validating tarballs against git repositories
Luca Boccassi writes: > On Sat, 30 Mar 2024 at 15:44, Russ Allbery wrote: >> Luca Boccassi writes: >>> In the end, massaged tarballs were needed to avoid rerunning >>> autoconfery on twelve thousands different proprietary and >>> non-proprietary Unix variants, back in the day. In 2024, we do >>> dh_autoreconf by default so it's all moot anyway. >> This is true from Debian's perspective. This is much less obviously >> true from upstream's perspective, and there are some advantages to >> aligning with upstream about what constitutes the release artifact. > My point is that, while there will be for sure exceptions here and > there, by and large the need for massaged tarballs comes from projects > using autoconf and wanting to ship source archives that do not require > to run the autoconf machinery. Just as a data point, literally every C project for which I am upstream ships additional files in the release tarballs that are not in Git for reasons unrelated to Autoconf and friends. Most of this is pregenerated documentation (primarily man pages generated from POD), but it also includes generated test data and other things. The reason is similar: regenerating those files requires tools that may not be present on an older system (like a mess of random Perl modules) or, in the case of the man pages, may be old and thus produce significantly inferior output. > However, we as in Debian do not have this problem. We can and do re-run > the autoconf machinery on every build. And at least on the main forges, > the autogenerated (and thus out of reach from this kind of attacks) > tarball is always present too - the massaged tarball is an _addition_, > not a _substitution_. Hence: we should really really think about forcing > all packages, by policy, to use the autogenerated tarball by default > instead of the autoconf one, when both are present, unless extenuating > circumstances (that have to be documented) are present. 
I think this is probably right as long as by "autogenerated" you mean basing the Debian package on a signed upstream Git tag and *locally* generating a tarball to satisfy Debian's .orig.tar.gz requirement, not using GitHub's autogenerated tarball that has all sorts of other potential issues. Just to note, though, this means that we lose the upstream signature in the archive. The only place the upstream signature would then live is in Salsa. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
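To make the "locally generated" step concrete, here is a self-contained sketch (the package name "foo" and the version are hypothetical):

```shell
# Build a throwaway repository, tag the release, and derive the
# .orig tarball deterministically from the tagged tree.  In real
# use the tag would be upstream's signed tag, checked first with
# `git verify-tag` against upstream's key.
set -eu
repo="$(mktemp -d)"
cd "$repo"
git init -q .
echo 'hello' > README
git add README
git -c user.email=demo@example.invalid -c user.name=Demo \
    commit -q -m 'release 1.0'
git -c user.email=demo@example.invalid -c user.name=Demo \
    tag -a -m 'foo 1.0' v1.0
git archive --format=tar.gz --prefix=foo-1.0/ \
    -o foo_1.0.orig.tar.gz v1.0
```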
Re: xz backdoor
Sirius writes: > Would throwing away these unmodified (?) macros packaged projects may be > carrying for hysterical raisins in favour of just using the autoconf > native macros reduce the attack-surface a potential malicious actor > would have at their disposal, or would it simply be a "putting all eggs > in one basket" and just make things worse? And by how much vis-a-vis the > effort to do it?

Most of the macros of this type are not from Autoconf. They're from either gnulib or the Autoconf Archive. In both cases, blindly upgrading to a newer upstream version may break things, I believe. I'm not as sure about gnulib, but the Autoconf Archive is a huge collection of things of varying quality and does not necessarily make any guarantees about APIs.

> I think that what I am trying to get at is this: is there low-hanging > fruit that for minimal effort would disproportionately improve things > from a security perspective. (I have an inkling that this is a question > that every distribution is wrestling with today.)

I think the right way to think about this is to say that the Autoconf ecosystem is rife with embedded code copies and, because the normal way of using this code is to make a copy, is also somewhat lax about making breaking changes, since the expectation is that you only update during your release process, when you can fix up any changes.

(That code is also notoriously hard to read, both because M4 is a language with fairly noisy syntax and because the only tools assumed to be available in the output scripts are a very minimal Bourne shell and standard POSIX shell utilities, so there's a lot of the type of programming that only shell aficionados can love. That was the problem with detecting this backdoor: the sort of chain of tr and eval and whatnot that injected the backdoor is what, e.g., all of Libtool looks like, at least on a first superficial glance.)
I know all this adds up to "why are we using this stuff anyway," but the amount of hard-won portability knowledge that's baked into these tools is IMMENSE, and while probably 75% of it is now irrelevant because the systems that needed it are long-dead, no one can agree on what 75% that is or figure out which useful 25% to extract. And rewriting it in some other programming language is daunting and feels like churn rather than progress. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Validating tarballs against git repositories
Simon Josefsson writes: > Russ Allbery writes: >> I believe you're talking about two different things. I think Sean is >> talking about preimage resistance, which assumes that the known-good >> repository is trusted, and I believe Simon is talking about >> manufactured collisions where the attacker controls both the good and >> the bad repository. > Right. I think the latter describes the xz scenario: someone could have > pushed a maliciously crafted commit with a SHA1 collision commit id, so > there are two different git repositories with that commit id, and a > signed git tag on that commit id authenticates both trees, opening up > for uncertainty about what was intended to be used. Unless I'm missing > some detail of how git signed tag verification works that would catch > this. This is also my understanding. >> The dgit and tag2upload design probably (I'd have to think about it >> some more, ideally while bouncing the problem off of someone else, >> because I've recycled those brain cells for other things) only needs >> preimage resistance, but the general case of a malicious upstream may >> be vulnerable to manufactured collisions. > It is not completely clear to me: How about if some malicious person > pushed a commit to salsa, asked a DD to "please review this repository > and sign a tag to make the upload"? The DD would presumably sign a > commit id that authenticate two different git trees, one with the > exploit and one without it. Oh, hm, yes, this is a good point. I had forgotten that tag2upload was intended to work by pushing a tag to Salsa. This means an attacker can potentially race Salsa CI to move that tag to the malicious tree before the tree is fetched by tag from Salsa, or reuse the signed tag with a different repository with the same SHA-1. 
The first, most obvious step is that one has to make sure that a signed tag is restricted to a specific package and version and not portable to a different package and/or version that has the same SHA-1 hash due to attacker construction. There are several obvious ways that could be done; the one that comes immediately to mind is to require that the tag message be the source package name and version number, which is good practice anyway.

I think any remaining issues could be addressed with a fairly simple modification to the protocol: rather than pushing the signed tag to Salsa, the DD reviewer should push the signed tag to a separate archive server similar to that used by dgit today. As long as the first time the signed tag leaves the DD's system is in conjunction with a push of the corresponding reviewed tree to secure project systems, this avoids the substitution problem. The tag could then be pushed back to Salsa, either by the DD or by the service.

This unfortunately means that one couldn't use the Salsa CI service to do the source package construction, and one has to know about this extra server. I think that restriction comes from the fact that we're worried an attacker may be able to manipulate the Salsa Git repository (through force pushes and tag replacements, for example), whereas the separate dedicated archive server can be more restrictive and never allow force pushes or tag moves, and can reject any attempt to push a SHA-1 hash that has already been seen.

Another possible option would be to prevent force pushes and tag moves in Salsa, since I think one of those operations would be required to pull off this attack, but maybe I'm missing something. One of the things I'm murky on is exactly what Git operations are required to substitute the two trees with identical SHA-1 hashes. That property is going to break Git in weird ways, and I'm not sure what that means for one's ability to manipulate a Git repository over the protocols that Salsa exposes.
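Concretely, a tag carrying that binding might look like this (the package name and tag-naming convention are hypothetical; a real tag2upload client would fix the exact format, and would use `git tag -s` so the tag is GPG-signed, whereas plain `-a` is used here only so the sketch runs without keys):

```shell
# Self-contained sketch: the tag message explicitly names the
# source package and version, so the signature over the tag
# cannot be replayed against a different package or version.
set -eu
repo="$(mktemp -d)"
cd "$repo"
git init -q .
git -c user.email=dd@example.invalid -c user.name=DD \
    commit -q --allow-empty -m 'packaging work'
git -c user.email=dd@example.invalid -c user.name=DD \
    tag -a -m 'frobnicator 1.2-1' debian/1.2-1
# The archive side can then check the message against the tag name:
git tag -l --format='%(contents:subject)' 'debian/*'
```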
Obviously it would be ideal if Git used stronger hashes than SHA-1 for tags, so that one need worry less about all of this. Even if my analysis is wrong, I think there are some fairly obvious and trivial additions to the tag2upload process that would prevent this attack, such as building a Merkle tree of the reviewed source tree using a SHA-256 hash and embedding the top hash of that tree in the body of the signed tag where it can be verified by the archive infrastructure. That might be a good idea *anyway*, although it does have the unfortunate side effect of requiring a local client to produce a correct tag rather than using standard Git signed tags. Uploading to Debian currently already semi-requires a custom local client, so to me this isn't a big deal, although I think there was some hope to avoid that. (These variations unfortunately don't help with the upstream problem.) -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: xz backdoor
Christian Kastner writes:

> This is both out of convenience (I want my workstation to be based on
> stable) and precisely because of the afforded isolation.

I personally specifically want my workstation to be running unstable, so I'm watching to see if that's considered unsafe (either immediately, today, or in theory, in the future). If I have to use a stable host, I admit I will be sad.

I've been using unstable for my personal client and development (not server, never exposing services to the Internet) systems for well over a decade (and, before that, testing systems for as long as I've been working on Debian), and for me it's a much nicer experience than using stable. It also lets me directly and practically dogfood Debian, which has resulted in a fair number of bug reports. (This is an analysis specific to me, not general advice, and relies heavily on the fact that I'm very good at working around weird problems that transiently arise in unstable.)

But this does come with a security risk, because it means a compromised package could compromise my system much faster than if I were using testing or, certainly, stable. That's not a security trade-off that I can responsibly make entirely for myself, since it affects people who are using Debian as well. So I don't get to have the final decision here.

-- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Validating tarballs against git repositories
Jeremy Stanley writes: > On 2024-03-29 23:29:01 -0700 (-0700), Russ Allbery wrote: > [...] >> if the Git repository is somewhere other than GitHub, the >> malicious possibilities are even broader. > [...] > I would not be so quick to make the same leap of faith. GitHub is > not itself open source, nor is it transparently operated. It's a > proprietary commercial service, with all the trust challenges that > represents. Long, long before XZ was a twinkle in anyone's eye, > malicious actors were already regularly getting their agents hired > onto development teams to compromise commercial software. Just look > at the Juniper VPN backdoor debacle for a fairly well-documented > example (but there's strong evidence this practice dates back well > before free/libre open source software even, at least to the 1970s). This is a valid point: let me instead say that the malicious possibilities are *different*. All of your points about GitHub are valid, but the counterexample I had in mind is one where the malicious upstream runs the entire Git hosting architecture themselves and can make completely arbitrary changes to the Git repository freely. I don't think we know everything that is possible to do in that situation. I think it would be difficult (not impossible, but difficult) to get into that position at GitHub, whereas it is commonplace among self-hosted projects. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Validating tarballs against git repositories
Simon Josefsson writes: > Sean Whitton writes: >> We did some analysis on the SHA1 vulnerabilities and determined that >> they did not meaningfully affect dgit & tag2upload's design. > Can you share that analysis? As far as I understand, it is possible for > a malicious actor to create a git repository with the same commit id as > HEAD, with different historic commits and tree content. I thought a > signed tag is merely a signed reference to a particular commit id. If > that commit id is a SHA1 reference, that opens up for ambiguity given > recent (well, 2019) results on SHA1. Of course, I may be wrong in any > of the chain, so would appreciate explanation of how this doesn't work. I believe you're talking about two different things. I think Sean is talking about preimage resistance, which assumes that the known-good repository is trusted, and I believe Simon is talking about manufactured collisions where the attacker controls both the good and the bad repository. The dgit and tag2upload design probably (I'd have to think about it some more, ideally while bouncing the problem off of someone else, because I've recycled those brain cells for other things) only needs preimage resistance, but the general case of a malicious upstream may be vulnerable to manufactured collisions. (So far as I know, preimage attacks against *MD5* are still infeasible, let alone against SHA-1.) -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
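[Editorial note: to make concrete what a SHA-1 object id actually commits to, here is how Git hashes a blob. The `blob <size>\0<content>` framing is Git's real object encoding; commits and tags are hashed the same way over their own headers, which is why a signed tag ultimately pins nothing more than a SHA-1 of a commit object.]

```python
import hashlib

def git_blob_sha1(content: bytes) -> str:
    """Hash file content the way `git hash-object` does:
    SHA-1 over a "blob <size>\\0" header plus the raw bytes."""
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()

# The well-known object id of the empty blob:
print(git_blob_sha1(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

A collision attack needs the attacker to construct *two* objects with the same SHA-1, which is feasible for SHA-1 since 2017/2019; a preimage attack would need a second object matching an honest, pre-existing hash, which remains infeasible. That is exactly the distinction drawn above.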
Re: Validating tarballs against git repositories
Ingo Jürgensmann writes:

> This reminds me of https://xkcd.com/2347/ - and I think that’s getting a
> more common threat vector for FLOSS: pick up some random lib that is
> widely used, insert some malicious code and have fun. Then also imagine
> stuff that automates builds in other ways like docker containers, Ruby,
> Rust, pip that pull stuff from the network and installs it without
> further checks.
> I hope (and am confident) that Debian as a project will react
> accordingly to prevent this happening again.

Debian has precisely the same problem. We have more work to do than we possibly can do with the resources we have, there is some funding but not a lot of funding so most of the work is hobby work stolen from scarce free time, and we're under a lot of pressure to encourage and incorporate the work of new maintainers. And 99% of the time trusting the people who step up to help works out great.

The hardest part about defending against social engineering is that it doesn't attack the weaknesses of a community. It attacks its *strengths*: trust, collaboration, and mutual assistance.

-- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Validating tarballs against git repositories
Luca Boccassi writes: > In the end, massaged tarballs were needed to avoid rerunning autoconfery > on twelve thousands different proprietary and non-proprietary Unix > variants, back in the day. In 2024, we do dh_autoreconf by default so > it's all moot anyway. This is true from Debian's perspective. This is much less obviously true from upstream's perspective, and there are some advantages to aligning with upstream about what constitutes the release artifact. > When using Meson/CMake/home-grown makefiles there's no meaningful > difference on average, although I'm sure there are corner cases and > exceptions here and there. Yes, perhaps it's time to switch to a different build system, although one of the reasons I've personally been putting this off is that I do a lot of feature probing for library APIs that have changed over time, and I'm not sure how one does that in the non-Autoconf build systems. Meson's Porting from Autotools [1] page, for example, doesn't seem to address this use case at all. [1] https://mesonbuild.com/Porting-from-autotools.html Maybe the answer is "you should give up on portability to older systems as the cost of having a cleaner build system," and that's not an entirely unreasonable thing to say, but that's going to be a hard sell for a lot of upstreams that care immensely about this. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Validating tarballs against git repositories
't want to do that, and thus force people to fork packages rather than join in maintaining the existing package. This is an aside, but this is why my personal policy for my own projects that I no longer have to maintain is to orphan them and require that someone fork them, not add additional contributors to my repository or release infrastructure. I do not have the resources to vet new maintainers -- if I had that time to spend on the projects, I wouldn't have orphaned them -- and therefore I want to explicitly disclaim any responsibility for what the new maintainer may do. Someone else will have to judge whether they are trustworthy. But I'm not sure that distributions are in a good position to do that *either*. > But, I will definitely concede that, had I seen a commit that changed > that line in the m4, there's a good chance my eyes would have glazed > over it. This is why I am somewhat skeptical that forcing everything into Git commits is as much of a benefit as people are hoping. This particular attacker thought it was better to avoid the Git repository, so that is evidence in support of that approach, and it's certainly more helpful, once you know something bad has happened, to be able to use all the Git tools to figure out exactly what happened. But I'm not sure we're fully accounting for the fact that tags can be moved, branches can be force-pushed, and if the Git repository is somewhere other than GitHub, the malicious possibilities are even broader. We could narrow those possibilities somewhat by maintaining Debian-controlled mirrors of upstream Git repositories so that we could detect rewritten history. (There are a whole lot of reasons why I think dgit is a superior model for archive management. One of them is that it captures the full Git history of upstream at the point of the upload on Debian-controlled infrastructure if the maintainer of the package bases it on upstream's Git tree.) -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
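[Editorial sketch: the Debian-controlled mirror idea above reduces to a fast-forward check — a ref update on the mirror is suspicious unless the old tip is an ancestor of the new tip. This toy version runs the rule over an in-memory commit graph; real tooling would ask git itself, e.g. `git merge-base --is-ancestor`.]

```python
def is_ancestor(parents: dict[str, list[str]], old: str, new: str) -> bool:
    """Return True if `old` is reachable from `new` via parent links,
    i.e. the update from old to new is a fast-forward and history
    before `old` was not rewritten."""
    stack, seen = [new], set()
    while stack:
        commit = stack.pop()
        if commit == old:
            return True
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents.get(commit, []))
    return False

# Commits A -> B -> C on an upstream branch:
parents = {"C": ["B"], "B": ["A"], "A": []}
print(is_ancestor(parents, "B", "C"))  # True: normal fast-forward push
print(is_ancestor(parents, "C", "B"))  # False: history was rewritten
```

A mirror that refuses non-fast-forward updates (and tag moves) gives Debian an independent record against which rewritten upstream history can be detected.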
Re: xz backdoor
Moritz Mühlenhoff writes: > Russ Allbery wrote: >> I think this question can only be answered with reverse-engineering of >> the backdoors, and I personally don't have the skills to do that. > In the pre-disclosure discussion permission was asked to share the > payload with a company specialising in such reverse engineering. If that > went through, I'd expect results to be publicly available in the next > days. Excellent, thank you. For those who didn't read the analysis on oss-security yet, note that the initial investigation of the injected exploit indicates that it deactivates itself if argv[0] is not /usr/sbin/sshd, so there are good reasons to believe that the problem is bounded to testing or unstable systems running the OpenSSH server. If true, this is a huge limiting factor and in many ways quite relieving compared to what could have happened. But the stakes are high enough that hopefully we'll get detailed confirmation from people with expertise in understanding this sort of thing. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: xz backdoor
Russ Allbery writes: > Sirius writes: >> This is quite actively discussed on Fedora lists. >> https://www.openwall.com/lists/oss-security/2024/ >> https://www.openwall.com/lists/oss-security/2024/03/29/4 >> Worth taking a look if action need to be taken on Debian. > The version of xz-utils was reverted to 5.4.5 in unstable yesterday by > the security team and migrated to testing today. Anyone running an > unstable or testing system should urgently upgrade. I think the big open question we need to ask now is what exactly the backdoor (or, rather, backdoors; we know there were at least two versions over time) did. If they only target sshd, that's one thing, and we have a bound on systems possibly affected. But liblzma is linked directly or indirectly into all sorts of things such as, to give an obvious example, apt-get. A lot of Debian developers use unstable or testing systems. If the exploit was also exfiltrating key material, backdooring systems that didn't use sshd, etc., we have a lot more cleanup to do. I think this question can only be answered with reverse-engineering of the backdoors, and I personally don't have the skills to do that. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: xz backdoor
Sirius writes: > This is quite actively discussed on Fedora lists. > https://www.openwall.com/lists/oss-security/2024/ > https://www.openwall.com/lists/oss-security/2024/03/29/4 > Worth taking a look if action need to be taken on Debian. The version of xz-utils was reverted to 5.4.5 in unstable yesterday by the security team and migrated to testing today. Anyone running an unstable or testing system should urgently upgrade. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: 64-bit time_t transition in progress in unstable
Eric Valette writes:

> You can force the migration by explicitly adding the package that it
> propose to remove (e.g gdb for libelf, ...)
> I managed to upgrade all packages you mention in your mail that way.
> Only libkf5akonadisearch-bin libkf5akonadisearch-plugins
> libkf5akonadisearchcore5t64 libkf5akonadisearchpim5t64
> libkf5akonadisearchxapian5t64 are missing because there are bugs in the
> Provides: for api /or the packe depending on the T64 ABI are not yet
> rebuild. I opened a bug for that

Ah, yes, that worked. It took some experimentation to figure out which packages could be forced and which ones were causing removals.

I'm down to only libzvbi-common having problems, which I can't manage to force without removing xine-ui. If I attempt to install them both together, I get this failure:

    The following packages have unmet dependencies:
     libxine2 : Depends: libxine2-plugins (= 1.2.13+hg20230710-2) but it is not going to be installed or
                         libxine2-misc-plugins (= 1.2.13+hg20230710-2+b3) but it is not going to be installed
     libxine2-ffmpeg : Depends: libavcodec60 (>= 7:6.0)
                       Depends: libavformat60 (>= 7:6.0)

The apt resolver seems to be struggling pretty hard to make sense of the correct upgrade path.

-- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: 64-bit time_t transition in progress in unstable
Steve Langasek writes:

> So once the libuuidt64 revert is done (later today?), if apt
> dist-upgrade is NOT working, I think we should want to see some apt
> output showing what's not working.

My current list of unupgradable packages on amd64 is:

    gir1.2-gstreamer-1.0/unstable 1.24.0-1 amd64 [upgradable from: 1.22.10-1]
    libegl-mesa0/unstable 24.0.2-1 amd64 [upgradable from: 24.0.1-1]
    libgbm1/unstable 24.0.2-1 amd64 [upgradable from: 24.0.1-1]
    libgl1-mesa-dri/unstable 24.0.2-1 amd64 [upgradable from: 24.0.1-1]
    libglapi-mesa/unstable 24.0.2-1 amd64 [upgradable from: 24.0.1-1]
    libglx-mesa0/unstable 24.0.2-1 amd64 [upgradable from: 24.0.1-1]
    libgstreamer1.0-0/unstable 1.24.0-1 amd64 [upgradable from: 1.22.10-1]
    libldb2/unstable 2:2.8.0+samba4.19.5+dfsg-4 amd64 [upgradable from: 2:2.8.0+samba4.19.5+dfsg-1]
    libspa-0.2-modules/unstable 1.0.3-1.1 amd64 [upgradable from: 1.0.3-1]
    libzvbi-common/unstable 0.2.42-1.2 all [upgradable from: 0.2.42-1.1]
    mesa-va-drivers/unstable 24.0.2-1 amd64 [upgradable from: 24.0.1-1]
    samba-libs/unstable 2:4.19.5+dfsg-4 amd64 [upgradable from: 2:4.19.5+dfsg-1]

Doing a bit of exploration, the root problems seem to be:

    libdebuginfod1 : Depends: libelf1 (= 0.190-1+b1)
    libdw1 : Depends: libelf1 (= 0.190-1+b1)
    libxine2-misc-plugins : Depends: libsmbclient (>= 2:4.0.3+dfsg1)
    libgl1-mesa-dri : Depends: libglapi-mesa (= 24.0.1-1)

I'm not sure what's blocking the chain ending in libelf1 since t64 versions of those libraries seem to be available, but attempting to force it would remove gdb and jupyter if that helps.

-- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: 64-bit time_t transition in progress in unstable
Kevin Bowling writes: > Are there instructions on how to progress an unstable system through > this, or is the repo currently in a known inconsistent state? I have > tried upgrading various packages to work through deps but I am unable to > do a dist-upgrade for a while. It doesn't look like the migration is finished yet, so this is expected. There are a whole lot of packages that need to be rebuilt and a whole lot of libraries, so some edge cases will doubtless take a while to sort out. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: usrmerge breaks POSIX
Russ Allbery writes: > Thorsten Glaser writes: >> Right… and why does pkexec check against /etc/shells? > pkexec checks against /etc/shells because this is the traditional way to > determine whether the user is in a restricted shell, and pkexec is > essentially a type of sudo and should be unavailable to anyone who is > using a restricted shell. Apologies, this turns out to be incorrect. I assumed this based on my prior experience with other programs that tested /etc/shells without doing my research properly. I should have been less certain here. After some research with git blame, it appears that pkexec checks SHELL against /etc/shells because pkexec passes SHELL to the program that it executes (possibly in a different security context) and was worried about users being able to manipulate and potentially compromise programs across that security boundary by setting SHELL to some attacker-controlled value. It is using /etc/shells as a list of possible valid values for that variable that are safe to pass on. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: usrmerge breaks POSIX
Russ Allbery writes:

> That definitely should not be the case and any restricted shell that
> adds itself to /etc/shells is buggy. See chsh(1):
>     The only restriction placed on the login shell is that the command
>     name must be listed in /etc/shells, unless the invoker is the
>     superuser, and then any value may be added. An account with a
>     restricted login shell may not change her login shell. For this
>     reason, placing /bin/rsh in /etc/shells is discouraged since
>     accidentally changing to a restricted shell would prevent the user
>     from ever changing her login shell back to its original value.

To follow up on this, currently rbash is added to /etc/shells, which is surprising to me and which I assume is what you were referring to. This seems directly contrary to the chsh advice. I can't find a reference to this in bash's changelog and am not sure of the reasons for this, though, so presumably I'm missing something.

I was only able to find this discussion of why pkexec checks $SHELL, and it doesn't support my assumption that it was an intentional security measure, so I may well be wrong in that part of my analysis. Apologies for that; I clearly should have done more research. git blame points to a commit that only references this thread:

https://lists.freedesktop.org/archives/polkit-devel/2009-December/000282.html

which seems to imply that this was done to match sudo behavior and because the author believed this was the right way to validate the SHELL setting.

-- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: usrmerge breaks POSIX
Vincent Lefevre writes:

> On 2024-02-15 14:14:46 -0800, Russ Allbery wrote:
>> and pkexec is essentially a type of sudo and should be unavailable to
>> anyone who is using a restricted shell.
> The pkexec source doesn't say that the goal is to check whether
> the user is in a restricted shell.

So far as I am aware, the only purpose served by /etc/shells historically and currently is to (a) prevent users from shooting themselves in the foot by using chsh to change their shell to something that isn't a shell, and (b) detect users who are not "normal users" and therefore should have restricted access to system services. See shells(5), for example:

    Be aware that there are programs which consult this file to find out
    if a user is a normal user; for example, FTP daemons traditionally
    disallow access to users with shells not included in this file.

> Also note than even in a restricted shell, the user may set $SHELL to a
> non-restricted shell.

This is generally not the case; see, for example, rbash(1):

    It behaves identically to bash with the exception that the following
    are disallowed or not performed:

    [...]

    * setting or unsetting the values of SHELL, PATH, HISTFILE, ENV, or
      BASH_ENV

> Moreover, /etc/shells also contains restricted shells.

That definitely should not be the case and any restricted shell that adds itself to /etc/shells is buggy. See chsh(1):

    The only restriction placed on the login shell is that the command
    name must be listed in /etc/shells, unless the invoker is the
    superuser, and then any value may be added. An account with a
    restricted login shell may not change her login shell. For this
    reason, placing /bin/rsh in /etc/shells is discouraged since
    accidentally changing to a restricted shell would prevent the user
    from ever changing her login shell back to its original value.

-- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: usrmerge breaks POSIX
Thorsten Glaser writes: > Right… and why does pkexec check against /etc/shells? pkexec checks against /etc/shells because this is the traditional way to determine whether the user is in a restricted shell, and pkexec is essentially a type of sudo and should be unavailable to anyone who is using a restricted shell. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: usrmerge breaks POSIX
Thorsten Glaser writes:

> Dixi quod…
>> Russ Allbery dixit:
>>> My guess is that pkexec is calling realpath to canonicalize the path
>>> before checking for it in /etc/shells, although I have not confirmed
>>> this.
>> Now that would be weird and should be fixed…
> Another question that probably should be answered first is that why
> pkexec (whatever that is) checks against /etc/shells and if that’s
> correct.

Okay, I have done more research. My speculation that pkexec might use realpath was wrong. It does only check the contents of the SHELL environment variable. See:

https://gitlab.freedesktop.org/polkit/polkit/-/blob/master/src/programs/pkexec.c?ref_type=heads#L343
https://gitlab.freedesktop.org/polkit/polkit/-/blob/master/src/programs/pkexec.c?ref_type=heads#L405

It does check whether $SHELL is found in /etc/shells. So your question about what is setting the $SHELL variable is a good one, although I think I would still argue that it's not the most effective way to solve the issue.

> I’d be really appreciative if I did not have to add extra nōn-canonical
> paths to /etc/shells for bugs in unrelated software.

I understand the appeal of that stance, but the problem with it is that there is no enforcement of this definition of canonical. I know that you consider /bin/mksh to be the correct path, but /usr/bin/mksh is also present and works exactly the same. chsh will prevent unprivileged users from changing their shell to the /usr/bin path because of /etc/shells, but not if someone makes that change as root. Also, I'm not sure useradd cares, or possibly other ways of adding a user with a shell (Puppet, for instance). Or, for that matter, just editing /etc/passwd as root, which I admit is how I usually set the shells of users because I've been using UNIX for too long.
Having only the /bin paths is fragile because it creates an expectation that every user who sets the shell is going to know that /bin/mksh is the correct path and /usr/bin/mksh is the wrong path and will not use the latter. I'm not sure how they're supposed to receive this information; I don't think it's going to be obvious to everyone who may be involved in setting the shell. We can tell everyone who ends up with /usr/bin/mksh that they need to change it to /bin/mksh, but this seems kind of tedious and annoying, and I'm not seeing the downside to registering both paths. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
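[Editorial sketch of the check pkexec performs, simplified from the pkexec.c lines linked above: the literal $SHELL string must appear in /etc/shells, with no canonicalization, which is exactly why a /usr/bin path missing from the file causes a denial.]

```python
def shell_allowed(shell: str, etc_shells: list[str]) -> bool:
    """Mimic pkexec's validation of $SHELL: the exact string must
    appear among the /etc/shells entries. No realpath canonicalization
    is done, so /bin/mksh and /usr/bin/mksh are distinct entries."""
    return shell in etc_shells

# mksh registering only the /bin path, as described in this thread:
shells = ["/bin/sh", "/bin/bash", "/bin/mksh"]
print(shell_allowed("/bin/mksh", shells))      # True
print(shell_allowed("/usr/bin/mksh", shells))  # False: pkexec denies access
```

Registering both the /bin and /usr/bin paths makes the set membership test succeed regardless of which alias ends up in $SHELL.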
Re: usrmerge breaks POSIX
Thorsten Glaser writes: > Russ Allbery dixit: >> 3. Something else that I don't yet understand happened that caused pkexec >>to detect the shell as /usr/bin/mksh instead of /bin/mksh. I'm not > What sets $SHELL for the reporter’s case? Fix that instead. login(1) > sets it to the path from passwd(5), which hopefully is from shells(5). My guess is that pkexec is calling realpath to canonicalize the path before checking for it in /etc/shells, although I have not confirmed this. Regardless, I think we should list both paths in /etc/shells because both paths are valid and there are various benign reasons why one might see the other path. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: usrmerge breaks POSIX
Vincent Lefevre writes:

> On 2024-02-14 17:16:23 -0800, Russ Allbery wrote:
> Quoting https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=817168
> | with usrmerge, some programs - such as pkexec, or LEAP's bitmask
> | on top of that- fails to run. Specifically, the error I get is:
> |
> | The value for the SHELL variable was not found the /etc/shells file
>> You mentioned /etc/shells in your previous message, but /etc/shells on my
>> system contains both the /usr/bin and the /bin paths, so I'm still at a
>> complete loss.
> Not for mksh.

Okay, thank you. I think I understand now. The problem is:

1. mksh uses custom postinst code to add itself to /etc/shells that does
   not add the /usr/bin versions of the mksh paths, only the /bin
   versions.

2. pkexec uses /etc/shells as an authorization mechanism to not allow
   access from people who use restricted shells, so if it detects your
   shell as /usr/bin/mksh instead of the (expected by /etc/shells)
   /bin/mksh path, it will deny access.

3. Something else that I don't yet understand happened that caused pkexec
   to detect the shell as /usr/bin/mksh instead of /bin/mksh. I'm not
   sure what this is, but I can guess at a few things that could cause
   this, so it's not surprising to me that it happened.

That pkexec uses /etc/shells in this way is a bit surprising, but I understand the goal. The intent is to keep people who are using restricted shells from accessing pkexec. I'm not sure this is the best way to achieve that security goal, but I can also see the potential for introducing security vulnerabilities in existing systems if we relaxed them now.

I think the obvious solution is to ensure that both the /bin and /usr/bin paths for mksh are registered in /etc/shells. In other words, I think we have a missing usrmerge-related transition here that we should just fix.
I'm copying Thorsten on this message in case he hasn't noticed this thread, but if I were you I'd just file a bug against mksh asking for the /usr/bin paths to also be added to /etc/shells to match the new behavior of add-shell. Hopefully most shells are using add-shell, and thus won't have this problem, but any other shell package in Debian that is intended to provide a non-restricted shell but is not using add-shell to manipulate /etc/shells will need a similar fix. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: usrmerge breaks POSIX
Vincent Lefevre writes: > On 2024-02-14 10:41:44 -0800, Russ Allbery wrote: >> I'm sorry, this is probably a really obvious question, but could you >> explain the connection between the subject of your mail message and the >> body of your mail message? I can't see any relationship, so I guess I >> need it spelled out for me in small words. >> (I believe /etc/shells enforcement is done via PAM or in specific >> programs that impose this as an additional non-POSIX restriction. This >> is outside the scope of POSIX.) > What's the point of having a standard if programs are allowed to > reject user settings for arbitrary and undocumented reasons? I have literally no idea what you're talking about. It would be really helpful if you would describe what program rejected your setting and what you expected to happen instead. You mentioned /etc/shells in your previous message, but /etc/shells on my system contains both the /usr/bin and the /bin paths, so I'm still at a complete loss. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: usrmerge breaks POSIX
Vincent Lefevre writes: > POSIX says: > SHELL This variable shall represent a pathname of the user's > preferred command language interpreter. If this interpreter > does not conform to the Shell Command Language in XCU > Chapter 2 (on page 2345), utilities may behave differently > from those described in POSIX.1-2017. > There is no requirement to match one of the /etc/shells pathnames. > The user or scripts should be free to use any arbitrary pathname to > the command language interpreter available on the system, and Debian > should ensure that this is allowed, in particular the one give by > the realpath command. I'm sorry, this is probably a really obvious question, but could you explain the connection between the subject of your mail message and the body of your mail message? I can't see any relationship, so I guess I need it spelled out for me in small words. (I believe /etc/shells enforcement is done via PAM or in specific programs that impose this as an additional non-POSIX restriction. This is outside the scope of POSIX.) -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Proposal for how to deal with Go/Rust/etc security bugs
Simon Josefsson writes: > I want to explore if there is a possibility to change status quo, and > what would be required to do so. > Given how often gnulib is vendored for C code in Debian, and other > similar examples, I don't think of this problem as purely a Go/Rust > problem. The parallel argument that we should not support coreutils, > sed, tar, gzip etc because they included vendored copies of gnulib code > is not reasonable. Since there are now a bunch of messages on this thread of people grumbling about Rust and Go and semi-proposing not even trying to package that software (and presumably removing python3-cryptography and everything that depends on it? I'm not sure where people think this argument is going), I wanted to counterbalance that by saying I completely agree with Simon's exploration here. Rebuilding a bunch of software after a security fix is not a completely intractable problem that we have no idea how to even approach. It's just CPU cycles and good metadata plus ensuring that our software can be rebuilt, something that we already promise. Some aspects of making this work will doubtless be *annoying*, but it doesn't seem outside of our capabilities as a project. Dealing with older versions is of course much more of a problem, particularly if upstream is not backporting security fixes, but this is a problem is inherent in having stable releases, that upstreams have been grumbly about long before either Rust or Go even existed, and that we have nonetheless dealt with throughout the whole history of Debian. There is no one-size-fits-all solution, but we have historically managed to muddle through in a mostly acceptable way. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
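[Editorial sketch of why mass rebuilds after a security fix are "just CPU cycles and good metadata": given a map from each package to the packages whose code is statically linked into it, the rebuild order is an ordinary topological sort. The package names and dependency data here are hypothetical.]

```python
from graphlib import TopologicalSorter

# build_deps[pkg] = packages whose code is statically linked into pkg.
# Hypothetical example data: a fix lands in rust-ring, so everything
# downstream of it must be rebuilt against the fixed code.
build_deps = {
    "rust-ring": [],
    "rust-rustls": ["rust-ring"],
    "some-rust-app": ["rust-rustls", "rust-ring"],
}

# Rebuild in dependency order: each package is rebuilt only after
# everything it vendors or statically links has been rebuilt.
order = list(TopologicalSorter(build_deps).static_order())
print(order)  # rust-ring first, some-rust-app last
```

The hard part in practice is the metadata (knowing what is statically linked into what), not the scheduling itself.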
Re: Policy: should libraries depend on services (daemons) that they can speak to?
Roger Lynn writes:

> On 15/01/2024 18:00, Russ Allbery wrote:
>> When you have the case of an application that optionally wants to do
>> foo, a shared library that acts as a client, and a daemon that does
>> foo, there are three options:
>>
>> 1. Always install the shared library and daemon even though it's an
>>    optional feature, because the shared library is a link dependency
>>    for the application and the shared library viewed in isolation does
>>    require the daemon be running to do anything useful.
>>
>> 2. Weaken the dependency between the shared library and the daemon so
>>    that the shared library can be installed without the daemon even
>>    though it's objectively useless in that situation because it's the
>>    easiest and least annoying way to let the application be installed
>>    without the daemon, and that's the goal. The shared library is
>>    usually tiny and causes no problems by being installed; it just
>>    doesn't work.
>>
>> 3. Weaken the dependency between the application and the shared
>>    library, which means the application has to dynamically load the
>>    shared library rather than just link with it. This is in some ways
>>    the most "correct" from a dependency perspective, but it's annoying
>>    to do, introduces new error handling cases in the application, and
>>    I suspect often upstream will flatly refuse to take such a patch.
>
> Unless I have misunderstood, I think you may have missed another option:
>
> 4. Let the leaf application declare the appropriate dependency on the
>    daemon, because the application writer/packager is in the best
>    position to know how important the functionality provided by the
>    daemon is to the application. This could be considered to be option
>    2b, and a "suggests" dependency of the library on the daemon may
>    still be appropriate.

I was thinking of this as a special case of 2, but yes, it's a sufficiently common special case that it's worth calling out on its own.
I'm not sure that this whole discussion belongs in Policy because it's very hard to make policy recommendations here without a lot of case-specific details, but a section in the Developers Guide or some similar resource about how to think about these cases seems like it might be useful. It does come up pretty regularly. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Policy: should libraries depend on services (daemons) that they can speak to?
"Theodore Ts'o" writes: > I'll argue that best practice is that upstream should make the shared > library useful *without* the daemon, but if the daemon is present, > perhaps the shared library can do a better job. Eh, I think this too depends on precisely what the shared library is for. The obvious example of where this doesn't work is when the shared library is a client for a local system service, and its entire point is to dispatch calls to that service, but the library and service combined implement an optional feature in some of the programs linked to it. I think that's a relatively common case and the sort of case that provokes most of the desire to not make shared libraries have hard dependencies on their services. There are a bunch of services that do not support (and often would never reasonably support) network connections to their underlying services. An obvious example is a library and service pair that represents a way to manage privilege escalation with isolation on the local system. You cannot make the shared library useful without the daemon because the entire point of the shared library and daemon pair is to not give those permissions to the process containing the shared library. When you have the case of an application that optionally wants to do foo, a shared library that acts as a client, and a daemon that does foo, there are three options: 1. Always install the shared library and daemon even though it's an optional feature, because the shared library is a link dependency for the application and the shared library viewed in isolation does require the daemon be running to do anything useful. 2. Weaken the dependency between the shared library and the daemon so that the shared library can be installed without the daemon even though it's objectively useless in that situation because it's the easiest and least annoying way to let the application be installed without the daemon, and that's the goal. 
The shared library is usually tiny and causes no problems by being installed; it just doesn't work. 3. Weaken the dependency between the application and the shared library, which means the application has to dynamically load the shared library rather than just link with it. This is in some ways the most "correct" from a dependency perspective, but it's annoying to do, introduces new error handling cases in the application, and I suspect often upstream will flatly refuse to take such a patch. We do 2 a lot because it's pragmatic and it doesn't really cause any practical problems, even though it technically means that we're not properly representing the dependencies of the shared library. We in general try not to do 1 for reasons that I think are sound. Minimizing the footprint of applications for people who don't want optional features is something that I personally value a lot in Debian. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
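The dynamic-loading approach of option 3 can be made concrete. The following is an illustration only, not any particular package's code; the library name is a stand-in, and the pattern is sketched in Python via ctypes rather than in C:

```python
import ctypes

def load_optional_client(soname):
    """Try to load an optional client library at runtime.

    Returns the loaded library handle, or None if it is not installed.
    This is option 3: the application degrades gracefully instead of
    carrying a hard link-time dependency on the shared library.
    """
    try:
        return ctypes.CDLL(soname)
    except OSError:
        return None

# The application guards every use of the optional feature.
# "libm.so.6" is a stand-in that exists on glibc-based Linux systems.
client = load_optional_client("libm.so.6")
if client is None:
    print("optional feature unavailable; continuing without it")
```

This is also where the new error handling cases mentioned above come from: every call path into the optional feature now needs the "is it loaded" check, which is exactly the work upstreams tend to refuse.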
Re: RFC: advise against using Proton Mail for Debian work?
Jeremy Stanley writes: > Or build and sign the .tar.gz, then provide the .tar.gz file to the > upload automation on GitHub for publishing to PyPI. Oh, yes, that would work. You'd want to unpack that tarball and re-run the tests and whatnot, but all very doable. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: RFC: advise against using Proton Mail for Debian work?
Salvo Tomaselli writes: > I am currently not using any service to upload to pypi. But this > requires the occasional creation and deletion of global tokens. > The only way to avoid global tokens is to upload from github, in which > case I can no longer sign the .tar.gz. Well, you *can*, but you would have to then download the .tar.gz from PyPI, perform whatever checks you need to in order to ensure it is a faithful copy of the source release, and then sign it and put that .asc file somewhere (such as a GitHub release artifact). But it's an annoying process and I'm not sure anyone has automated it. > A signature isn't the same as a checksum. Probably nobody was using them > because there was no way to check them automatically. I suspect this chicken-and-egg problem is the heart of it. There are similar mechanisms for Perl modules that, last I checked, no one really used, although I think there was some recent movement towards maybe integrating it a bit more. It's very hard to create a critical mass of people who care enough to keep all the pieces working. PGP signatures definitely seem to be a minority interest among most upstream language communities. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: reference Debian package of multiple binaries sharing one man page
Andrey Rakhmatullin writes: > On Fri, Nov 10, 2023 at 11:44:06AM -0800, Russ Allbery wrote: >> The good news is that if you're using debhelper, you don't have to care >> about how man handles these indirections and can just use a symlink. >> Install the man page into usr/share/man/man1 under whatever name is >> canonical (possibly by using dh_installman), and then create a symlink >> in usr/share/man/man1 from the other man page name to that file. >> dh_installman will then clean this all up for you and create proper .so >> links and you don't have to care about the proper syntax. > Isn't it the other way around? The whole idea of using .so is to tell > dh_installman(1) to create symlinks. Oh, indeed, you're right and I misread that. So I think you can just use symlinks, period, and not worry about .so (although you have to handle nodoc builds correctly). -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: reference Debian package of multiple binaries sharing one man page
Norwid Behrnd writes: > Recently, I started to upgrade the Debian package about > `markdownlint`,[1] a syntax checker. The initially packaged version > 0.12.0 provided a binary of name `ruby-mdl` which now becomes a > transition dummy package in favour of the functionally updated > `markdownlint`. > I wonder how to properly prepare an adjusted man page for both binaries, > because lintian warns about the absence for `usr/bin/mdl`.[2] I think the problem is that you put the .so line in the wrong file. It should be the entire contents of the file corresponding to the deprecated binary name, not a line in the file that represents the current binary name. The good news is that if you're using debhelper, you don't have to care about how man handles these indirections and can just use a symlink. Install the man page into usr/share/man/man1 under whatever name is canonical (possibly by using dh_installman), and then create a symlink in usr/share/man/man1 from the other man page name to that file. dh_installman will then clean this all up for you and create proper .so links and you don't have to care about the proper syntax. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
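For the concrete case above (names taken from the message; treat them as illustrative), the page for the deprecated name `mdl.1` would contain nothing but a .so request pointing at the canonical page:

```
.so man1/markdownlint.1
```

Alternatively, with the symlink approach, a line such as `usr/share/man/man1/markdownlint.1 usr/share/man/man1/mdl.1` in the package's debian/<package>.links file (handled by dh_link) produces the same indirection without hand-writing any roff.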
Re: Bug#1041731: Hyphens in man pages
"G. Branden Robinson" writes: > How about this? > \- Minus sign. \- produces the basic Latin hyphen‐minus > specifying Unix command‐line options and frequently used in > file names. “-” is a hyphen in roff; some output devices > replace it with U+2010 (hyphen) or similar. Sorry for my original message, which was very poorly worded and probably incredibly confusing. Let me try to make less of a hash of it. I think what I'm proposing is something like: \- Basic Latin hyphen-minus (U+002D) or ASCII hyphen. This is the character used for Unix command-line options and frequently in file names. It is non-breaking; roff will not wrap lines at this character. "-" (without the "\") is a true hyphen in roff, which is a different character; some output devices replace it with U+2010 (hyphen) or similar. What I was trying to get at but didn't express very well was to include the specific Unicode code point and to avoid the term "minus sign" because this character is not a minus sign in typography at all (although it is used that way in code). A minus sign is U+2212 and looks substantially different because it is designed to match the appearance of the plus sign. (For example, the line is often at a different height.) I don't know if *roff has a way of producing that character apart from providing it as Unicode. The above also explicitly says that it's non-breaking (I believe that's the case, although please tell me if I got that wrong) and is more (perhaps excessively) explicit about distinguishing it from "-" because of all the confusion about this. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Hyphens in man pages
Minor point, but since you posted it "G. Branden Robinson" writes: > ... > \- Minus sign or basic Latin hyphen‐minus. \- produces the > Unix command‐line option dash in the output. “-” is a > hyphen in the roff language; some output devices replace it > with U+2010 (hyphen) or similar. The official name of "the Unix command-line option dash" is the hyphen-minus character (U+002D). Given how much confusion there is about this, and particularly given how ambiguous the word "dash" is in typography (the hyphen-minus is one of 25 dashes in Unicode), you may want to say that explicitly in addition to saying that it's the character used in UNIX command-line options (and, arguably as importantly, in UNIX command names). -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Hyphens in man pages
Wookey writes: > I was left not actually knowing what - and \- represent, nor which one I > _should_ be using in my man pages. And that seems to be the one thing we > should be telling the 'average maintainer'. - turns into a real hyphen (‐, U+2010). \- turns into the ASCII hyphen-minus that we use for options, programming, and so forth (U+002D). I think my position at this point as pod2man maintainer (not yet implemented in podlators) is that every occurrence of - in POD source will be translated into \-, rather than using the current heuristics, and people who meant to use ‐ should type it directly in the POD source. pod2man now supports Unicode fairly well and will pass that along to *roff, which presumably will do the right thing with it after character set translation. Currently, pod2man uses an extensive set of heuristics, but I think this is a lost cause. I cannot think of any heuristic that will understand that the - in apt-get should be U+002D (so that one can search for the command as it is typed), but the - in apt-like should be U+2010, since this is an English hyphenated expression talking about programs that are similar to apt. This is simply not information that POD has available to it unless the user writing the document uses Unicode hyphens. I believe the primary formatting degradation will be for very long hyphenated phrases like super-long-adjectival-phrase-intended-as-a-joke, because *roff will now not break on those hyphens that have been turned into \-. People will have to rewrite them using proper Unicode hyphens to get proper formatting. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
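The planned pod2man behavior described above amounts to a blanket translation rather than heuristics. This sketch is mine, not the podlators code, but the rule it implements is the one stated in the message:

```python
def escape_pod_hyphens(text):
    """Translate every ASCII '-' (U+002D) in POD text into the roff
    escape \\-, which renders as a non-breaking hyphen-minus.

    A true Unicode hyphen (U+2010) in the source contains no U+002D,
    so it is left alone and passed through to *roff for character
    set translation.
    """
    return text.replace("-", "\\-")

print(escape_pod_hyphens("apt-get"))       # apt\-get: searchable as typed
print(escape_pod_hyphens("apt\u2010like")) # unchanged: U+2010, not U+002D
```

The cost is the one Russ names: long chains of ASCII hyphens become entirely non-breaking, so authors who want line breaks must switch those hyphens to U+2010 themselves.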
Re: Is there a generic canonical way for a package script to check network connectivity?
Jonathan Kamens writes: > Regarding what I'm trying to accomplish, as part of the revamp of > apt-listchanges I need to rebuild the database that apt-listchanges uses > to determine which changelog and NEWS entries it has already shown to the > user. This can mostly be done from files installed on the local machine, > but not for packages which don't ship a changelog.Debian file and instead > expect the user to fetch it over the network with "apt changelog". Based on some other private conversation, I think there may be an underlying misunderstanding here, which is quite inobvious if you're just looking at Debian packages without having read all the previous discussions that got us here. Either that, or I have some incorrect assumptions, and someone should correct me. :) I believe that the following statements are true: Every Debian package either ships changelog.Debian or symlinks its doc directory to another package that ships changelog.Debian. In the latter case, that is a declaration that the package has no unique changelog entries and its changelog is always and exactly the changelog of the other package. (This is used to deduplicate files among packages that are always built together from the same source and are usually installed together.) So there are no packages in Debian that expect the user to fetch the changelog over the network; the changelog is always guaranteed to be part of the content installed on disk. It can just be indirected through another package (if the packages follow some strict limitations). What *does* happen is that some packages (well, all packages that have been rebuilt with current debhelper, I think) have *truncated* changelogs, in order to prevent the changelog from wasting a lot of disk space with old entries, and the *full* changelog is only available via the network. 
But the guarantee for truncated changelogs is that all entries newer than the release date of oldstable are retained, so since Debian doesn't support skip-version upgrades, apt-listchanges should never need the content that is dropped by truncation. In other words, the intent is to guarantee that all the information that apt-listchanges needs is present on disk, but it would have to deal with the /usr/share/doc symlinks. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
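Given the guarantees described above, the lookup apt-listchanges needs is roughly the following. This is a sketch under those stated assumptions; the function name and fallback order are mine, not apt-listchanges code:

```python
import os

def changelog_path(package, doc_root="/usr/share/doc"):
    """Find the installed Debian changelog for a package.

    If doc_root/<package> is a symlink to another package's doc
    directory, that is a declaration that the target package's
    changelog is exactly this package's changelog, so following the
    symlink with realpath is the correct behavior.
    """
    real_dir = os.path.realpath(os.path.join(doc_root, package))
    for name in ("changelog.Debian.gz", "changelog.Debian"):
        candidate = os.path.join(real_dir, name)
        if os.path.exists(candidate):
            return candidate
    return None  # should not happen for a policy-compliant package
```

The point of the sketch is the realpath call: the changelog is always on disk, but for deduplicated packages it lives under another package's name.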
Re: Control header sent with done email didn't do what I expected, should it have?
Marvin Renich writes: > I've seen differing opinions about closing "wontfix" bugs, but as a > user, I appreciate when they are left open. Whether it is a simple > wishlist feature request or a crash when the user abuses the software, > if I go to file the same or similar bug at a later time, if the bug is > closed I will not see it and file a duplicate. If it is left open, I > can see the maintainer has already thought about it and intentionally > decided not to fix it, so I can save the trouble of refiling. Also, I > might gain some insight about the circumstances. I think it's a trade-off. There are some bugs that seem unlikely to ever come up again or that aren't helpfully worded, and I'm more willing to close those. Also, in the abstract, I don't like using the BTS as a documentation system, which is sort of what collecting wontfix amounts to. If it's something that I think is going to come up a lot, it feels better to put it into the actual documentation (README.Debian, a bug script if it's reported really often, etc.). You're also expecting everyone filing a bug to read through all the existing wontfix bugs (at least their titles), which in some cases is fine but in some cases can become overwhelming. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: debian/copyright format and SPDX
Sune Vuorela writes: > I do think that this is another point of "we should kill our babies if > they don't take off". And preferably faster if/when "we lost" the race. > We carried around the debian menu for a decade or so after we failed to > gain traction and people centered on desktop files. > We failed to gain traction on the structure of the copyright file, and > spdx is the one who has won here. I generally agree with everything you're saying, but I don't think it applies to the structure of the copyright file. Last I checked, SPDX even recommends that people use our format for complicated copyright summaries that their native format can't represent. It is hampered by being in a language that no one has a readily-available parser for, and I wish I'd supported the push for it to be in YAML at the time since YAML has been incredibly successful in the format wars due to the wild success of Kubernetes (which is heavily based on YAML at the UI layer although it uses JSON on the wire), but it's still one of the best if not the best format available for its purpose. (Yes, I know, the YAML spec is a massive mess, etc. It's also better than any other structured file format I've used among those with readily available parsers in every programming language, and you can use a very stripped-down version of it without object references and the like. TOML unfortunately failed miserably on nested tables in a way that makes it mostly unusable for a lot of applications YAML does well on.) -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: lpr/lpd
Simon Richter writes: > And yes, it is quicker for me to copy another printcap entry and swap > out the host name than it is to find out how to authenticate to CUPS, > set up the printer, print a test page then remove and recreate it > because the generated "short" name I need to pipe data into lpr isn't > short. I will definitely be looking into rlpr. Since I wrote my original message, I noticed that rlpr is orphaned. I no longer work in an office and print things about once a year, so I no longer use the package, but it was a lifesaver when I was working in an office regularly and I do recommend it. If anyone else who still prints regularly prefers the simple command-line interface, you may want to consider adopting it, although it looks like you're likely to have to adopt upstream as well since it seems to have disappeared. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: lpr/lpd
Christoph Biedl writes: > Well, not me. But the thing that puzzles me is the popcon numbers: lpr > has 755, lprng 233. > Assuming most of these installation were not done deliberately but are > rather by-catch, or: Caused by some package that eventually draws them > in, via a dependency that has "lpr" (or "lprng") in the first place > instead of "cups-bsd | lpr". For lpr, that might be xpaint. For lprng, I > have no idea. And there's little chance to know. It at least used to be that you could print directly to a remote printer with lpr and a pretty simple /etc/printcap entry that you could write manually. I used to use that mechanism to print to an office printer until I discovered rlpr, which is even better for that use case. It's possible some of those installations are people doing that, rather than via dependencies or other things (in which case they probably should move to rlpr). -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
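For anyone curious what a hand-written entry of that sort looked like, it was on the order of the following (host and queue names invented for illustration; capability details from memory, so treat this as a sketch rather than a tested configuration):

```
# /etc/printcap: print straight to a remote LPD queue
office|third floor printer:\
	:rm=printhost.example.com:\
	:rp=office:\
	:sd=/var/spool/lpd/office:\
	:sh:
```

Here rm names the remote host, rp the remote queue, sd the local spool directory, and sh suppresses burst pages. rlpr removes even this much setup by taking the host and queue on the command line.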
Re: Do not plan to support /usr/lib/pam.d for Debian pam
Marco d'Itri writes: > On Sep 15, Sam Hartman wrote: >> I have significant discomfort aligning what you say (pam is the last >> blocker) with what several people said earlier in the week. What I >> heard is that there was no project consensus to do this, and that >> people were running experiments to see what is possible. > Indeed. I did the experiments and they were unexpectedly positive: pam > is the only blocker for booting _the base system_. > I never expected that everything would immediately work just fine with > an empty /etc: my goal is to have support for this in the base system > and selected packages. This started as an experiment: you were going to try running the base system in this mode with existing packages and see what happens. You ran that experiment and got results: it doesn't work, but PAM appears to be the only blocker. So far, so good. You ran an experiment, the result was that the thing you want to do doesn't work, and now you understand what changes would be required to move forward. However, and this is very important, *no one has decided that you get to do that work in Debian*. Insofar as this is just a personal goal, sure, that's none of the business of anyone else. But if you want this to be a *project* goal, you're skipping a few important steps. The biggest ones are that there is no *plan* and no *agreement*. By plan, I mean an actual document spelling out in detail, not email messages with a few sentences about something that is familiar to you but not to other people who haven't been thinking about this, what base system support would look like. And by agreement, I mean that the maintainers of base system components agree that this is something that we are working towards as a project and something that they would not break lightly. 
Right now, any base system package maintainer could decide that putting configuration files in /etc makes sense for reasons of their own limited to their specific package and further break support for booting a system in this mode, and there are no grounds to ask them not to do this. Because you don't have an *agreement*. I feel like there is a tendency to consider work on Debian to be purely technical. If you turn it on and smoke doesn't come out, it works, so we have implemented that thing, and the goal is accomplished. This doesn't work, precisely because other people break your goal later (because they were never asked or never agreed with that goal), and then they are very confused about why you're upset and why your problems are now their problems. Or, worse, their packages are broken as collateral damage in accomplishing some goal, and you then argue that it's their problem to fix their packages, even though there was no agreement about that goal. Accomplishing things like this in Debian has a large social component that I think is being neglected. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: /usr/-only image
Luca Boccassi writes: > Perhaps 'modifications' was the wrong term, I meant the whole system > that handles the configuration. Correct me if I'm wrong, but AFAIK that > is all Debian-specific. Arch, Fedora and Suse do not have this issue. Speaking as the author of several PAM modules, Debian's PAM configuration system is also vastly superior to that of Arch, Fedora, and SuSE, which require that I as upstream provide complicated and tedious installation documentation for how people can configure my modules. It's a stark contrast with Debian, where I can just ship a configuration file and have everything happen automatically and correctly despite requiring some quite complex PAM syntax. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Do not plan to support /usr/lib/pam.d for Debian pam
Sam Hartman writes: >>>>>> "Peter" == Peter Pentchev writes: > Peter> Hm, what happens if a sysadmin deliberately removed a file > Peter> that the distribution ships in /etc, trying to make sure that > Peter> some specific service could never possibly succeed if it > Peter> should ever attempt PAM authentication? Then, if there is a > Peter> default shipped in /usr, the service authentication attempts > Peter> may suddenly start succeeding when the PAM packages are > Peter> upgraded on an existing system. > This might be an issue in general, but it is not an issue for PAM. PAM > falls back on the other service if a service configuration cannot be > found. I think that makes it an even more subtle problem, doesn't it? Currently, my understanding is that if I delete /etc/pam.d/lightdm, PAM falls back on /etc/pam.d/other. But if we define a search path for PAM configuration such that it first looks in /etc/pam.d and then in /usr/lib/pam.d, and I delete /etc/pam.d/lightdm, wouldn't PAM then fall back on /usr/lib/pam.d/lightdm and not /etc/pam.d/other? Unlike Peter's example, that would be a silent error; authentication may well succeed, but without running, say, pam_limits.so. I don't know if anyone is making this specific configuration change, but if they are, I think that result would be surprising. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
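A toy model of the two lookup schemes makes the failure mode explicit. This is not Linux-PAM's actual code; the directories are reduced to sets of service names purely for illustration:

```python
def pam_config_for(service, etc, usr=None):
    """Return (directory, name) of the config a PAM lookup would use.

    etc and usr are the sets of per-service files present in
    /etc/pam.d and /usr/lib/pam.d respectively.  Passing usr=None
    models today's behavior with no /usr search path.
    """
    if service in etc:
        return ("/etc/pam.d", service)
    if usr is not None and service in usr:
        return ("/usr/lib/pam.d", service)
    return ("/etc/pam.d", "other")  # the catch-all fallback

# Today: deleting /etc/pam.d/lightdm means the "other" config applies.
print(pam_config_for("lightdm", etc={"other"}))
# With a /usr search path: the shipped default silently takes over
# instead of "other", which is the surprising case described above.
print(pam_config_for("lightdm", etc={"other"}, usr={"lightdm"}))
```

The sysadmin's deletion is meaningful in the first scheme and silently undone in the second, without any error being reported anywhere.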
Re: /usr/-only image
Marc Haber writes: > I'd go so far that the systemd/udev way is a strategy to cope with > nearly non-existent conffile handling on non-Debian distributions. We > didn't do ourselves a favor by blindly adopting this scheme, while > we're having a vastly superior package manager that handles conffiles > and conffile changes so nicely. > Please consider not throwing this advantage away for the rest of our > distribution. I've been using Debian for a lot of years now, and while describing our configuration handling as vastly superior is possibly warranted (it's been a long time since I've tested the competition so I don't know from first-hand experience), saying that changes are handled nicely doesn't fit my experience. I have spent hours resolving configuration changes on Debian systems that turn out to be changes in comments or settings that I never changed, and even more hours maintaining absurdly complicated code that tries to handle in-place updates of all-in-one configuration files, extract information from them that needs to be used by maintainer scripts, or juggle the complicated interaction between debconf and the state machine of possible user changes to the file outside of debconf. This is certainly something that we put a lot of effort into, and those of us who have used Debian for a long time are used to it, but I wouldn't describe it as nice. Most of this problem is not of our creation. Managing configuration files in an unbounded set of possible syntaxes, many of which are ad hoc and have no standard parser and often do not support fragments in directories, is an inherently impossible problem, and we try very hard to carve out pieces of it that we can handle. But there are many packages for which a split configuration with a proper directory of overrides and a standard configuration syntax would be a *drastic* improvement over our complex single-file configuration management tools such as ucf, let alone over basic dpkg configuration file management. 
-- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Bug#885698: What licenses should be included in /usr/share/common-licenses?
Jonas Smedegaard writes: > Strictly speaking it is not (as I was more narrowly focusing on) that > the current debian/copyright spec leaves room for *ambiguity*, but > instead that there is a real risk of making mistakes when replacing with > centrally defined ones (e.g. redefining a local "Expat" from locally > meaning "MIT-ish legalese as stated in this project" to falsely mean > "the MIT-ish legalese that SPDX labels MIT"). Right, the existing copyright format defines a few standard labels and says that you should only use those labels when the license text matches, but it doesn't stress that "matches" means absolutely word-for-word identical. I suspect, although I haven't checked, that we've made at least a few mistakes where some license text that's basically equivalent to Expat is labelled as Expat even though the text is not word-for-word identical. Given that currently all labels in debian/copyright are essentially local and the full text is there (except for common-licenses, where apart from BSD the licenses normally are used verbatim), this is not currently really a bug. But we could turn it into a bug quite quickly if we relied on the license short name to look up the text. To take an example that I've been trying to get rid of for over a decade, many of the /usr/share/common-licenses/BSD references currently in the archive are incorrect. There are a few cases where the code is literally copyrighted only by the Regents of the University of California and uses exactly that license text, but this is not the case for a lot of them. It looks like a few people have even tried to say "use common-licenses but change the name in the license" rather than reproducing the license text, which I don't believe meets the terms of the license (although it's of course very unlikely that anyone would sue over it). 
A quick code search turns up the following examples, all of which I believe are wrong: https://sources.debian.org/src/mrpt/1:2.10.0+ds-3/doc/man-pages/pod/simul-beacons.pod/?hl=35#L35 https://sources.debian.org/src/gridengine/8.1.9+dfsg-11/debian/scripts/init_cluster/?hl=7#L7 https://sources.debian.org/src/rust-hyphenation/0.7.1-1/debian/copyright/?hl=278#L278 https://sources.debian.org/src/nim/1.6.14-1/debian/copyright/?hl=64#L64 https://sources.debian.org/src/yade/2023.02a-2/debian/copyright/?hl=78#L78 An example of one that probably is okay, although ideally we still wouldn't do this because there are other copyrights in the source: https://sources.debian.org/src/lpr/1:2008.05.17.3+nmu1/debian/copyright/?hl=15#L15 This problem potentially would happen a lot with the BSD licenses, since the copyright-format document points to SPDX and SPDX, since it only cares about labeling legally-equivalent documents, allows the license text to vary around things like the name of the person you're not supposed to say endorsed your software while still receiving the same label. We therefore cannot use solely SPDX as a way of determining whether we can substitute the text of the license automatically for people, because there are SPDX labels for a lot of licenses for which we'd need to copy and paste the exact license text because it varies. At least if I understand what our goals would be. (License texts that have portions that vary between packages they apply to are a menace and make everything much harder, and I really wish people would stop using them, but of course the world of software development is not going to listen to me.) -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: Bug#885698: What licenses should be included in /usr/share/common-licenses?
Jonas Smedegaard writes: > If you mean to say that ambiguous MIT declarations exist in > debian/copyright files written using the machine-readable format, then > please point to an example, as I cannot imagine how that would look. I can see it: people use License: Expat but then include some license that is essentially, but not precisely, the same as Expat. If we then tell people that they can omit the text of the license and we'll fill it in automatically, they'll remove the actual text and we'll fill it in with the wrong thing. This is just a bug in handling the debian/copyright file, though. If we take this approach, we'll need to be very explicit that you can only use whatever triggers the automatic inclusion of the license text if your license text is word-for-word identical. Otherwise, you'll need to cut and paste it into the file as always. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
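The check such tooling would need is strict: normalize whitespace only, and require everything else to match exactly. A sketch of that rule (not anything we currently ship; the license snippets below are truncated fragments used only to show the comparison):

```python
def is_verbatim_license(candidate, reference):
    """True only if two license texts are word-for-word identical.

    Line wrapping and indentation are normalized away, but any
    difference in the actual words ("without fee" versus "free of
    charge", a changed name) means the short label must not trigger
    automatic substitution of the reference text.
    """
    return candidate.split() == reference.split()

expat_clause = "Permission is hereby granted, free of charge, to any person"
rewrapped = "Permission is hereby granted,\n   free of charge, to any person"
variant = "Permission is hereby granted, without fee, to any person"

print(is_verbatim_license(rewrapped, expat_clause))  # True: wrapping only
print(is_verbatim_license(variant, expat_clause))    # False: different words
```

Anything short of this word-for-word test would reintroduce exactly the Expat-that-isn't-Expat bug described above.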
Re: /usr/-only image
Simon Richter writes: > This would not work for a package like postfix, which absolutely > requires system-specific configuration, and we'd have to be careful with > packages like postgresql where there is a default configuration that > works fine for hobbyists that we do not make life too difficult for > professional users. I don't think there's any desire to avoid system-specific configuration. The model instead is that the package comes with a set of defaults, and if you don't set something in the local configuration in /etc, the default is used. I think this is exactly the model used by Postfix for main.cf. There are a few mandatory settings, but for the most part you can omit any setting and the default is used. The defaults are just hard-coded (at least so far as I know) rather than stored in separate configuration files in /usr, which doesn't make a fundamental difference. The problem configuration files are ones like Postfix's master.cf, where a whole ton of stuff almost no one ever changes is mixed into the same file that you're supposed to change for local configuration and there's no merger process. And honestly I have always hated the Postfix master.cf file, dating back to before systemd even existed. I think it's a bad configuration design. That of course is just my opinion and doesn't get us anywhere closer to using a defaults plus overrides syntax for master.cf, even assuming that upstream would consider it. There are a ton of packages with configuration syntaxes that were created a very long time ago or have accumulated over time. I maintain one of them upstream, INN, and I'll be the first to say that the INN configuration syntax is *awful*, and I have actively contributed to making it what it is. There are dozens of files, they use about fourteen completely separate and incompatible syntaxes, there's boilerplate in some places and defaults in other places, and learning all the ins and outs of the configuration is a full-time job. 
It's nonsense, and it's badly designed, and if I were writing it from scratch I'd replace the whole thing with simplified YAML or some similar well-known syntax with a schema and good editor support and a data model that supports configuration merging. And the chances of any of that happening when I have more free software projects lying on the floor in pieces than I have ones I'm managing to keep in the air is... low, even though I do have a much more active comaintainer. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
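The defaults-plus-overrides model described in this message can be sketched in a few lines. The option names below are invented for illustration and are not Postfix's or INN's real settings; a real implementation would read the override file from /etc.

```python
# A minimal sketch of the "defaults plus overrides" configuration model:
# the package ships hard-coded defaults, and a small local file in /etc
# only sets the keys that differ. Option names here are hypothetical.

def merge_config(defaults, overrides):
    """Return a config where local overrides win over shipped defaults."""
    merged = dict(defaults)
    merged.update(overrides)
    return merged

# Defaults as shipped by the package (hard-coded, or stored under /usr).
DEFAULTS = {
    "max_connections": 50,
    "spool_dir": "/var/spool/news",
    "log_level": "info",
}

# Pretend this came from a small file in /etc that only sets what differs.
local_overrides = {"log_level": "debug"}

config = merge_config(DEFAULTS, local_overrides)
```

The point of the model is that the local file stays tiny and upgrade-friendly: everything the administrator did not touch tracks the package's defaults automatically.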
Re: Bug#885698: What licenses should be included in /usr/share/common-licenses?
Jonas Smedegaard writes: > I have so far worked the most on identifying and grouping source data, > putting only little attention (yet - but do dream big...) towards > parsing and processing debian/copyright files e.g. to compare and assess > how well aligned the file is with the content it is supposed to cover. > So if I understand your question correctly and you are not looking for > the output of `licensecheck --list-licenses`, then unfortunately I have > nothing exciting to offer. I think that's mostly correct. I was wondering what would happen if one ran licensecheck debian/copyright, but unfortunately it doesn't look like it does anything useful. I tried it on one of my packages (remctl) that has a bunch of different licenses, and it just said: debian/copyright: MIT License and apparently ignored all of the other licenses present (FSFAP, FSFFULLR, ISC, X11, GPL-2.0-or-later with Autoconf-exception-generic, and GPL-3.0-or-later with Autoconf-exception-generic). It also doesn't notice that some of the MIT licenses are variations that contain people's names. (I still put all the Autoconf build machinery licenses in my debian/copyright file because of the tooling I use to manage my copyright file, which I also use upstream. I probably should change that, but I need to either switch to licensecheck or rewrite my horrible script.) Also, presumably it doesn't know about copyright-format since it wouldn't be expecting that in source files, so it wouldn't know to include licenses referenced in License stanzas without the license text included. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
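As an illustration of what a more useful inspection might report, here is a toy sketch that lists every distinct license expression named in a DEP-5 debian/copyright. It is a naive line-based reading, not a real copyright-format parser (it ignores continuation lines and stanza structure), and the sample stanzas are invented:

```python
# Toy sketch: collect every distinct "License:" expression from a DEP-5
# debian/copyright file, in order of first appearance. Deliberately
# naive; a real tool would parse the full copyright-format syntax.

def licenses_in_copyright(text):
    found = []
    for line in text.splitlines():
        if line.startswith("License:"):
            expr = line[len("License:"):].strip()
            if expr and expr not in found:
                found.append(expr)
    return found

# Invented sample stanzas for illustration.
sample = """\
Files: *
Copyright: 2024 Example Author
License: MIT

Files: m4/*
License: FSFAP

Files: configure
License: GPL-2.0-or-later with Autoconf-exception-generic
"""
```

Run on the sample, this reports all three expressions rather than stopping at the first, which is roughly the behavior the message above found missing.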
Re: /usr/-only image
Luca Boccassi writes: > On Sun, 10 Sept 2023 at 18:55, Nils Kattenbeck wrote: >> I am looking to generate a Debian image with only a /usr and /var >> partition as per discoverable partition specification. However, it >> seems to me like the omission of /etc leads to several issues in core >> packages and logging in becomes impossible. >> Is this an unsupported use case and if yes, is there ongoing work to >> eventually support this? >> Many packages in Fedora for example are already configured to support >> this using systemd-sysuser, systemd-tmpfiles, and other declarative >> means stored in /usr/ to create any required files upon boot. > It is being slowly worked towards, but we are still at the prerequisites > at this time. Hopefully we'll have some usable experiments for the > Trixie timeline, but nothing definite yet. Just to make this explicit, one of the prerequisites that has not yet happened is for Debian to agree that this is even something that we intend to do. So far as I know, no one has ever made a detailed, concrete proposal for what the implications of this would be for Debian, what the transition plan would look like, and how to address the various issues that will arise. Moving configuration files out of /etc, in particular, is something I feel confident saying that we do not have any sort of project consensus on, and is not something Debian as a project (as opposed to individuals within the project) is currently planning on working on. That doesn't mean we won't eventually do this, or that people aren't working on other prerequisites, or that it's not something that we're considering. But I just want to make clear that we are so early in this process that it is not at all clear that we are even going to do this at all, and there is a substantial discussion that would need to happen and detailed design proposal that would need to be written before there is any chance whatsoever that Debian will officially support this configuration. 
(This does not rule out the possibility that certain carefully-crafted configurations with a subset of packages may work in this mode, of course.) -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
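For concreteness, the declarative mechanisms Nils mentions (systemd-sysusers, systemd-tmpfiles) look roughly like the fragments below. These are hypothetical entries for an illustrative daemon, not files shipped by any real Debian package; the format follows the sysusers.d(5) and tmpfiles.d(5) schemes of declaring users and state directories in /usr so they can be recreated at boot without /etc.

```
# /usr/lib/sysusers.d/exampled.conf — create a system user at boot
u exampled - "Example daemon" /var/lib/exampled

# /usr/lib/tmpfiles.d/exampled.conf — create state directories at boot
d /var/lib/exampled 0750 exampled exampled -
d /var/log/exampled 0750 exampled exampled -
```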
Re: Bug#885698: What licenses should be included in /usr/share/common-licenses?
Johannes Schauer Marin Rodrigues writes: > I very much like this idea. The main reason maintainers want more > licenses in /usr/share/common-licenses/ is so that they do not anymore > have humongous d/copyright files with all license texts copypasted over > and over again. If long texts could be reduced to a reference that get > expanded by a machine it would make debian/copyright look much nicer and > would make it easier to maintain while at the same time shipping the > full license text in the binary package. > Does anybody know why such an approach would be a bad idea? I can think of a few possible problems: * I'm not sure if we generate binary package copyright files at build time right now, and if all of our tooling deals with this. I had thought that we prohibited this, but it looks like it's only a Policy should and there isn't a mention of it in the reject FAQ, so I think I was remembering the rule for debian/control instead. Of course, even if tools don't support this now, they could always be changed. * If ftp-master has to review the copyright files of each binary package separate from the copyright file of the source package (I think this would be an implication of generating the copyright files during build time), and the binary copyright files have fully-expanded licenses, that sounds like kind of a pain for the ftp-master reviewers. Maybe we can deal with this with better tooling, but someone would need to write that. * If we took this to its logical end point and did this with the GPL as well, we would add 20,000 copies of the GPL to the archive and install a *lot* of copies on the system. Admittedly text files are small and disks are large, but this still seems a little excessive. So maybe we still need to do something with common-licenses? -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
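The build-time expansion idea can be sketched as follows. The license texts are stubbed out with a dict here, where a real tool would read /usr/share/common-licenses, and the whole thing is illustrative rather than a description of any existing tool:

```python
# Sketch of build-time expansion: replace a bare License reference in a
# copyright file with the full text. Texts are faked with a dict; a real
# tool would load them from /usr/share/common-licenses.

LICENSE_TEXTS = {
    "GPL-2": "GNU GENERAL PUBLIC LICENSE Version 2 ... (full text)",
    "Apache-2.0": "Apache License Version 2.0 ... (full text)",
}

def expand_references(copyright_text):
    out = []
    for line in copyright_text.splitlines():
        out.append(line)
        if line.startswith("License:"):
            name = line.split(":", 1)[1].strip()
            if name in LICENSE_TEXTS:
                # Indent the continuation line as copyright-format requires.
                out.append(" " + LICENSE_TEXTS[name])
    return "\n".join(out)
```

Unknown license names pass through untouched, so a source copyright file with project-specific licenses would still build; the ftp-master review concern above is that the reviewed artifact would then differ from the source file.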
Re: What licenses should be included in /usr/share/common-licenses?
Jeremy Stanley writes: > I'm surprised, for example, by the absence of the ISC license given that > not only ISC's software but much of that originating from the OpenBSD > ecosystem uses it. My personal software projects also use the ISC > license. Are you aggregating the "License:" field in copyright files > too, or is it really simply a hard-coded list of matching patterns? It's only a hard-coded list of matching patterns, and it doesn't match any of the short licenses because historically I wasn't considering them (with the exception of common-licenses references to the BSD license, which I kind of would like to make an RC bug and clean up so that we could remove the BSD license from common-licenses on the grounds that it's specific to only the University of California and confuses people). If we go with any sort of threshold, the script will need serious improvements. That was something else I wanted to ask: I've invested all of a couple of hours in this script, and would be happy to throw it away in favor of something that tries to do a more proper job of classifying the licenses referenced in debian/copyright. Has someone already done this (Jonas, perhaps)? -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: What licenses should be included in /usr/share/common-licenses?
Russ Allbery writes: > In order to structure the discussion and prod people into thinking about > the implications, I will make the following straw man proposal. This is > what I would do if the decision was entirely up to me: > Licenses will be included in common-licenses if they meet all of the > following criteria: > * The license is DFSG-free. > * Exactly the same license wording is used by all works covered by it. > * The license applies to at least 100 source packages in Debian. > * The license text is longer than 25 lines. In the thread so far, there's been a bit of early convergence around my threshold of 100 packages above. I want to make sure people realize that this is a very conservative threshold that would mean saying no to most new license inclusion requests. My guess is that with the threshold set at 100, we will probably add around eight new licenses with the 25 line threshold (AGPL-2, Artistic-2.0, CC-BY 3.0, CC-BY 4.0, CC-BY-SA 3.0, CC-BY-SA 4.0, and OFL-1.1, and I'm not sure about some of those because the CC licenses have variants that would each have to reach the threshold independently; my current ad hoc script does not distinguish between the variants), and maybe 10 to 12 total without that threshold (adding Expat, zlib, some of the BSD licenses). This would essentially be continuing current practice except with more transparent and consistent criteria. It would mean not including a lot of long legal license texts that people have complained about having to duplicate, such as the CDDL, CeCILL licenses, probably the EPL, the Unicode license, etc. If that's what people want, that's what we'll do; as I said, that's what I would do if the choice were left entirely up to me. But I want to make sure I give the folks who want a much more relaxed standard a chance to speak up. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: What licenses should be included in /usr/share/common-licenses?
Jonas Smedegaard writes: > Quoting Hideki Yamane (2023-09-10 11:00:07) >> Hmm, how about providing license-common package and that depends on >> "license-common-list", and ISO image provides both, then? It would be >> no regressions. I do wonder why we've never done this. Does anyone know? common-licenses is in an essential package so it doesn't require a dependency and is always present, and we've leaned on that in the past in justifying not including those licenses in the binary packages themselves, but I'm not sure why a package dependency wouldn't be legally equivalent. We allow symlinking the /usr/share/doc directory in some cases where there is a dependency, so we don't strictly require every binary package have a copyright file. >> I expect license-common-list data as below >> >> license-short-name: URL >> GPL-2: file:///usr/share/common-licenses/GPL-2 >> Boost-1.0: https://spdx.org/licenses/BSL-1.0.html > Ah, so what you propose is to use file URIs. > I guess Russ' response above was a concern over using http(s) URIs > towards a non-local resource. Yes, I think the https URL is an essential part of the first proposal, since it avoids needing to ship a copy of all of the licenses. But I'm dubious that would pass legal muster. The alternative proposal as I understand it would be to have a license-common package that includes full copies of all the licenses with some more relaxed threshold requirement and have packages that use one of those licenses depend on that package. (This would obviously require a maintainer be found for the license-common package.) > License: Apache-2.0 > Reference: /usr/share/common-licenses/Apache-2.0 This is separate from this particular bug, but I would love to see the pointer to common-licenses turned into a formal field of this type in the copyright format, rather than being an ad hoc comment. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: What licenses should be included in /usr/share/common-licenses?
Hideki Yamane writes: > Russ Allbery wrote: >> Licenses will be included in common-licenses if they meet all of the >> following criteria: > How about just pointing SPDX licenses URL for whole license text and > lists DFSG-free licenses from that? (but yes, we should adjust short > name of licenses for DEP-5 and SPDX for it). Can we do this legally? If we can, it certainly has substantial merits, but I'm not sure that this satisfies the requirement in a lot of licenses to distribute a copy of the license along with the work. Some licenses may allow that to be provided as a URL, but I don't think they all do (which makes sense since people may receive Debian on physical media and not have Internet access). -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
What licenses should be included in /usr/share/common-licenses?
* The license applies to at least 100 source packages in Debian.
* The license text is longer than 25 lines.

I will attempt to guide and summarize discussion on this topic. No decision will be made immediately; I will summarize what I've heard first and be transparent about what direction I think the discussion is converging towards (if any).

Finally, as promised, here is the count of source packages in unstable that use the set of licenses that I taught my script to look for. This is likely not accurate; the script uses a bunch of heuristics and guesswork.

  AGPL 3                    277
  Apache 2.0               5274
  Artistic                 4187
  Artistic 2.0              337
  BSD (common-licenses)      42
  CC-BY 1.0                   3
  CC-BY 2.0                  15
  CC-BY 2.5                  13
  CC-BY 3.0                 240
  CC-BY 4.0                 159
  CC-BY-SA 1.0                8
  CC-BY-SA 2.0               48
  CC-BY-SA 2.5               16
  CC-BY-SA 3.0              425
  CC-BY-SA 4.0              237
  CC0-1.0                  1069
  CDDL                       67
  CeCILL                     30
  CeCILL-B                   13
  CeCILL-C                    9
  GFDL (any)                569
  GFDL (symlink)             55
  GFDL 1.2                  289
  GFDL 1.3                  231
  GPL (any)               20006
  GPL (symlink)            1331
  GPL 1                    4033
  GPL 2                   10466
  GPL 3                    6783
  LGPL (any)               5019
  LGPL (symlink)            265
  LGPL 2                   3850
  LGPL 2.1                 2926
  LGPL 3                   1526
  LaTeX PPL                  46
  LaTeX PPL (any)            40
  LaTeX PPL 1.3c             32
  MPL 1.1                   165
  MPL 2.0                   361
  SIL OFL 1.0                11
  SIL OFL 1.1               258

-- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: debian/copyright format and SPDX
Jonas Smedegaard writes: > Only issue I am aware of is that SPDX shortname "MIT" equals Debian > shortname "Expat". There was also some sort of weirdly ideological argument with the FSF about what identifiers to use for the GPL and related licenses, which resulted in SPDX using an "-only" and "-or-later" syntax in the identifier at the insistence of the FSF rather than a separate generic syntax the way that we do. https://spdx.org/licenses/ is the current license list and assigned short identifiers. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: debian/copyright format and SPDX
Jeremy Stanley writes: > Since Debian's machine-readable format has been around longer than > either of the newer formats you mentioned, it seems like it would make > more sense for the tools to incorporate a parser for it rather than > create needless churn in the package archive just to transform an > established standard into whatever the format-du-jour happens to be (and > then halfway through another new format gains popularity, and the > process starts all over again). I don't think the file format is the most interesting part of SPDX. They don't really have a competing format equivalent to the functionality of our copyright files (at least that I've seen; I vaguely follow their lists). Last time I looked, they were doing a lot with XML, which I don't think anyone would adopt for new formats these days. (YAML or TOML or something like that is now a lot more popular.) In terms of file formats, writing a lossy converter from Debian copyright files to whatever format is of interest for BOMs would probably do most of the job. The really interesting part of SPDX is the license list and the canonical name assignment, which is *way* more active and *way* more mature at this point than the equivalent in Debian. They have a much larger license list, which is currently being bolstered by Fedora, and the new licenses and rules for deduplicating them are reviewed by lawyers as part of their maintenance process. Their identifiers are also increasingly used in upstream software in SPDX-License-Identifier pseudo-headers. I have no idea how to do a transition, but I do think Debian would benefit from adopting the SPDX license identifiers where one exists, and possibly from joining forces with Fedora to submit and get identifiers assigned to the licenses that we see that are not yet registered. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
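A mapping layer of the kind such a transition would need might start out like this. The table below is a small illustrative subset, not a complete or authoritative concordance between Debian and SPDX short names:

```python
# Sketch of a Debian -> SPDX short-name mapping. Only known divergences
# need entries; identical names pass through. Illustrative subset only.

DEBIAN_TO_SPDX = {
    "Expat": "MIT",               # SPDX "MIT" is Debian's "Expat"
    "GPL-2": "GPL-2.0-only",      # SPDX spells out -only / -or-later
    "GPL-2+": "GPL-2.0-or-later",
    "GPL-3": "GPL-3.0-only",
    "GPL-3+": "GPL-3.0-or-later",
}

def to_spdx(debian_name):
    """Return the SPDX identifier, assuming identity when not mapped."""
    return DEBIAN_TO_SPDX.get(debian_name, debian_name)
```

The identity fallback reflects that many short names (Apache-2.0, for instance) are already the same in both lists; the hard part of a real transition is auditing which ones are not.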
Re: DEP-5 copyright with different licenses for two parts of the same file
Marc Haber writes: > Now, how do I write this in a DEP-5 copyright file? Having two stanzas > for the same file gets flagged by Lintian as an Error, and the DEP-5 > syntax doesn't seem to allow to mention two Licenses in the License: > line. This is the intended purpose of "and": cases where one file is covered by multiple licenses simultaneously. So, basically: License: LGPL-2+ and manpage-license or whatever the right tag for that second license is. This is a bit confusing when the licenses conflict, but I think it's close enough to capturing what's going on here, and you can explain further in a Comment. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
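Spelled out as a complete pair of stanzas, that suggestion might look like the following. "manpage-license" is a placeholder tag as in the message above, and the file name, names, and license text are invented for illustration:

```
Files: debian/foo.1
Copyright: 2024 Example Author
License: LGPL-2+ and manpage-license
Comment: The manual page source is under the LGPL; the generated page
 additionally carries the permission notice below.

License: manpage-license
 Permission is granted to copy, distribute and/or modify this manual
 page under the terms stated above.
```

Each tag in the "and" expression then gets its own standalone License stanza with the full text, which is what keeps Lintian satisfied while still recording both licenses for the same file.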
Re: Questionable Package Present in Debian: fortune-mod
ort of project-wide content-based decision on package vetting, and certainly against applying the Code of Conduct to something that does not have at all the same context as what the Code of Conduct was designed to address. If we were going to write a project content policy (which I'm dubious we really need to do, or that it would be worth the emotional effort required), I think it would look much different than the Code of Conduct because it would have different goals. It wouldn't be about building a community or encouraging productive collaboration, because the contents of our archive don't need to do either of those things. Lots of people use Debian who are not members of any shared community, and this is a feature. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: [RFC] Extending project standards to services linked through Vcs-*
Dominik George writes: > Hi, >> have you considered dgit? > no, as that's something entirely different. dgit does not > manage source packages in Git, it provides a Git frontend > to source packages not managed in Git. No, this is not really true. There's a lot of misunderstanding about dgit. It does in fact manage source packages in Git. You are thinking of the use of dgit for packages that don't use dgit in their upload flow. In those cases, yes, dgit creates a synthetic Git repository that only includes one commit per upload to Debian. Better than nothing, but not really managing the source package in Git. However, if one uses dgit in one's upload flow, all relevant Git changes are pushed to the dgit Git repository. You can clone the dgit repository and get exactly the Git repository that the package maintainer used to develop and upload the package, just as if you were using a Git forge. Obviously, dgit doesn't have the other functions of a Git forge, such as issue tracking, CI, or merge requests. But it does manage source packages in Git. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: [RFC] Extending project standards to services linked through Vcs-*
Dominik George writes: > On Mon, Aug 21, 2023 at 09:48:26AM -0700, Russ Allbery wrote: >> This implies that Salsa is happy to create accounts for people under >> the age of 13, since the implicit statement here is that Debian's own >> Git hosting infrastructure is less excluding than GitHub. >> That's a somewhat surprising statement to me, given the complicated >> legal issues involved in taking personal data from someone that young, >> so I want to double-check: is that in fact the case? > That is, in fact, the case. > And no, it's not not legally complicated to collect personal data from > children. If we, for now, only look at COPPA and GDPR, the laws relevant > for the US and EU, respectively, the situation is: [...] Thank you! This is good to know and I'm very happy that this is the case. I'm glad people have done the research that I hadn't done and worked out what was required! -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
Re: [RFC] Extending project standards to services linked through Vcs-*
Dominik George writes: > For the GitHub case, the problematic terms would be that in order to > register for a GitHub account, users must be at least 13 or 16 years old > (depending on the jurisdiction) and must not live in a country under US > embargoes. This implies that Salsa is happy to create accounts for people under the age of 13, since the implicit statement here is that Debian's own Git hosting infrastructure is less excluding than GitHub. That's a somewhat surprising statement to me, given the complicated legal issues involved in taking personal data from someone that young, so I want to double-check: is that in fact the case? (US embargoes are indeed going to be a problem for any service hosted in the United States, and possibly an issue, depending on the details, for any maintainer with US citizenship even if they're using a site hosted elsewhere. I would not dare to venture an analysis without legal advice.)
Re: systemd-analyze security as a release goal
"Trent W. Buck" writes: > As someone who does that kind of thing a lot, I'd rather have > the increased annoyance of opt-out hardening than > the reduced security of opt-in hardening. > Even if it means I occasionally need to patch site-local rules into > /etc/apparmor.d/local/usr.bin.msmtp or > /etc/systemd/system/libvirtd.service.d/override.conf. I also feel this way but there are a bunch of people who really, really don't, and also it's not entirely obvious when hardening is failing or what overrides you need to add. So making this the default is hard, because it fundamentally breaks the "it has to work out of the box" property that people expect. Making it be semi-normal for daemons to not work out of the box depending on what configuration options or other packages you have installed is a hard sell. That makes me want some way to opt in to "hardening that might break something," but I'm not sure the best way to do that. -- Russ Allbery (r...@debian.org) <https://www.eyrie.org/~eagle/>
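For concreteness, the opt-out pattern under discussion looks roughly like the drop-ins below: a package ships hardening directives, and a site that hits breakage relaxes individual directives in /etc. The unit name and the specific directive choices are illustrative, not from any real Debian package:

```
# Hypothetical hardening shipped by the package
# (/usr/lib/systemd/system/exampled.service.d/hardening.conf):
[Service]
ProtectSystem=strict
ProtectHome=yes
NoNewPrivileges=yes

# Site-local opt-out for one directive that breaks a local setup
# (/etc/systemd/system/exampled.service.d/override.conf):
[Service]
ProtectHome=no
```

The difficulty named in the message stands regardless of syntax: nothing tells the administrator *which* directive caused a failure, so the override file is easy to write only once you already know what to put in it.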