Bug#1050256: [pkg-apparmor] Bug#1050256: autopkgtest fails on debci
On Tue, 2023-09-19 at 07:17 +0200, Salvatore Bonaccorso wrote:
> On Sun, Sep 17, 2023 at 12:01:37PM +0530, intrigeri wrote:
> > In the last month or so, a number of people from various Debian teams and other distributions have been tracking down a regression that affects systems upgraded to Bookworm: services that use certain systemd facilities such as PrivateNetwork=yes fail to start in LXC/LXD containers. Among other things, this breaks the autopkgtests of many packages, such as systemd, on ci.debian.net (#1050256). This was tracked down to a kernel regression, for which a fix landed in Linux 6.2:
> >
> > 1cf26c3d2c4c apparmor: fix apparmor mediating locking non-fs unix sockets
> >
> > Work is ongoing to backport the fix to linux-stable/linux-6.1.y. I'm Cc'ing John and Mathias who have been working on this.
> >
> > FYI, ideally this would be fixed in the upcoming Bookworm point-release (12.2, early October).
>
> Thanks for the details. Has this already been sent to the stable maintainers? I do not see it yet on the stable list.

I believe that John has been working on the fix for the 6.1 branch, although I don't know what the status is. I don't have the necessary familiarity with apparmor internals to attempt to backport the fix myself, but I'll be very happy to test once it's available.

Mathias
Bug#1050256: [pkg-apparmor] Bug#1050256: autopkgtest fails on debci
Control: tags -1 + confirmed moreinfo

Hi,

On Sun, Sep 17, 2023 at 12:01:37PM +0530, intrigeri wrote:
> Control: reassign -1 src:linux
> Control: retitle -1 AppArmor breaks locking non-fs Unix sockets
> Control: affects -1 src:apparmor src:lxc src:systemd src:pdns src:policykit-1
> Control: found -1 6.1.38-1
> Control: found -1 6.1.38-2
> Control: notfound -1 6.3.1-1~exp1
>
> Hi Debian Kernel Team,
>
> In the last month or so, a number of people from various Debian teams and other distributions have been tracking down a regression that affects systems upgraded to Bookworm: services that use certain systemd facilities such as PrivateNetwork=yes fail to start in LXC/LXD containers. Among other things, this breaks the autopkgtests of many packages, such as systemd, on ci.debian.net (#1050256). This was tracked down to a kernel regression, for which a fix landed in Linux 6.2:
>
> 1cf26c3d2c4c apparmor: fix apparmor mediating locking non-fs unix sockets
>
> Work is ongoing to backport the fix to linux-stable/linux-6.1.y. I'm Cc'ing John and Mathias who have been working on this.
>
> FYI, ideally this would be fixed in the upcoming Bookworm point-release (12.2, early October).

Thanks for the details. Has this already been sent to the stable maintainers? I do not see it yet on the stable list.

Regards,
Salvatore
Bug#1050256: autopkgtest fails on debci
Hi all,

On 09-09-2023 13:06, Paul Gevers wrote:
> All ci.d.n workers (except riscv64) now run the kernel from bookworm-backports. systemd passes its autopkgtest again in unstable, testing and stable.

We're having issues [1] with the (backports and) unstable kernel on our main amd64 host, so we reverted to the stable kernel for amd64.

Paul

[1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1052130
Bug#1038315: [pkg-apparmor] Bug#1050256: autopkgtest fails on debci
Dear lxd and systemd maintainers,

Michael Biebl (2023-09-11):
> When you do the reassignment, you should probably merge this bug report with #1038315 and #1042880, now that we know what the root cause is.

FTR I did not dare to merge these myself: perhaps you want to keep separate bug reports to track workarounds, on top of #1050256 which tracks the root cause, or something.

Cheers,
--
intrigeri
Bug#1050256: [pkg-apparmor] Bug#1050256: autopkgtest fails on debci
Control: reassign -1 src:linux
Control: retitle -1 AppArmor breaks locking non-fs Unix sockets
Control: affects -1 src:apparmor src:lxc src:systemd src:pdns src:policykit-1
Control: found -1 6.1.38-1
Control: found -1 6.1.38-2
Control: notfound -1 6.3.1-1~exp1

Hi Debian Kernel Team,

In the last month or so, a number of people from various Debian teams and other distributions have been tracking down a regression that affects systems upgraded to Bookworm: services that use certain systemd facilities such as PrivateNetwork=yes fail to start in LXC/LXD containers. Among other things, this breaks the autopkgtests of many packages, such as systemd, on ci.debian.net (#1050256). This was tracked down to a kernel regression, for which a fix landed in Linux 6.2:

1cf26c3d2c4c apparmor: fix apparmor mediating locking non-fs unix sockets

Work is ongoing to backport the fix to linux-stable/linux-6.1.y. I'm Cc'ing John and Mathias who have been working on this.

FYI, ideally this would be fixed in the upcoming Bookworm point-release (12.2, early October).

Current workarounds:

- ci.debian.net was upgraded to the bookworm-backports kernel
- various package maintainers have added workarounds such as disabling PrivateNetwork=yes for autopkgtests

Cheers,
--
intrigeri
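The found/notfound versions above, together with the statement that the fix landed in Linux 6.2, suggest a rough triage check. The following is only a sketch: the function names are hypothetical, and the comparison looks solely at the upstream major.minor version, so it cannot tell whether a later 6.1.y package carries a backport of the fix.

```python
import re

# The thread states that the fix first landed upstream in Linux 6.2.
FIX_UPSTREAM = (6, 2)


def upstream_version(debian_version: str) -> tuple[int, ...]:
    """Return the leading dotted numeric part, e.g. '6.1.38-2' -> (6, 1, 38)."""
    match = re.match(r"\d+(?:\.\d+)*", debian_version)
    if match is None:
        raise ValueError(f"no numeric version prefix in {debian_version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def predates_upstream_fix(debian_version: str) -> bool:
    """Heuristic only: True if the upstream series is older than 6.2.

    A 6.1.y kernel that includes a backported fix would still be
    reported as predating the fix by this check.
    """
    return upstream_version(debian_version)[:2] < FIX_UPSTREAM


if __name__ == "__main__":
    for version in ("6.1.38-1", "6.1.38-2", "6.3.1-1~exp1"):
        print(version, predates_upstream_fix(version))
```

The check deliberately ignores the Debian revision; the bug's own found/notfound metadata remains the authoritative record.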
Bug#1050256: autopkgtest fails on debci
On Mon, 2023-09-11 at 13:45 +0200, Michael Biebl wrote:
> On 09.09.23 at 14:20, intrigeri wrote:
>
> > At this stage it seems clear that the bug and the corresponding ideal fix are in the AppArmor part of src:linux, and the bug affects at least src:apparmor and src:lxc. I'd like to reflect this in the metadata of #1050256 by reassigning the bug to Linux, and adding "affects" indications. I'll do so in the next few days unless someone objects soon.
>
> It also affects at least src:systemd, src:pdns, src:policykit-1. All those packages have added workarounds for this issue. I'll revert the workaround in systemd and notify the maintainers of pdns and policykit-1.
>
> > Doing so will also be an opportunity for me to sum up the problem for the maintainers of src:linux, and let them know about our desired timeline: ideally this would be fixed in the upcoming Bookworm point-release.

I haven't heard any objections, so please feel free to reassign this bug. As you said, this will give the src:linux maintainers a heads up, even if the patch isn't quite ready yet (but hopefully in time for the 12.2 point release).

Mathias
Bug#1050256: autopkgtest fails on debci
On Mon, 2023-09-04 at 12:39 -0700, John Johansen wrote:
> On 9/4/23 12:32, Michael Biebl wrote:
> > John, could you help with getting this fix into 6.1.x?
>
> yes, I am working on a patch.

Hi John,

I wanted to check in to see if you've had a chance to work on that patch for the 6.1 kernel. The deadline for package updates being included in the 12.2 point release is in roughly two weeks, but given this will be a patch for the kernel I'd really like to have something tested and handed over to the src:linux team well before then.

Thanks,
Mathias
Bug#1050256: [pkg-apparmor] Bug#1050256: autopkgtest fails on debci
Control: severity -1 important

On 09.09.23 at 14:20, intrigeri wrote:
> Hi again,
>
> Thank you all for working both on workarounds for Debian CI and on a proper upstream Linux kernel fix. Impressive cross-team work! :)

+1

> At this stage it seems clear that the bug and the corresponding ideal fix are in the AppArmor part of src:linux, and the bug affects at least src:apparmor and src:lxc. I'd like to reflect this in the metadata of #1050256 by reassigning the bug to Linux, and adding "affects" indications. I'll do so in the next few days unless someone objects soon.

It also affects at least src:systemd, src:pdns, src:policykit-1. All those packages have added workarounds for this issue. I'll revert the workaround in systemd and notify the maintainers of pdns and policykit-1.

> Doing so will also be an opportunity for me to sum up the problem for the maintainers of src:linux, and let them know about our desired timeline: ideally this would be fixed in the upcoming Bookworm point-release.
>
> This being said, if said timeline can't be met in src:linux, it'll be up to the maintainers of LXC in Debian to decide what they want to do in the upcoming Bookworm point-release.
>
> If I misunderstood something important, please let me know.

Sounds good to me.

For now, given that all the debci hosts are running the backports kernel, I'm downgrading the severity again.

When you do the reassignment, you should probably merge this bug report with #1038315 and #1042880, now that we know what the root cause is.

Regards,
Michael
Bug#1050256: [pkg-apparmor] Bug#1050256: autopkgtest fails on debci
Hi again, Thank you all for working both on workarounds for Debian CI and on a proper upstream Linux kernel fix. Impressive cross-team work! :) At this stage it seems clear that the bug and the corresponding ideal fix are in the AppArmor part of src:linux, and the bug affects at least src:apparmor and src:lxc. I'd like to reflect this in the metadata of #1050256 by reassigning the bug to Linux, and adding "affects" indications. I'll do so in the next few days unless someone objects soon. Doing so will also be an opportunity for me to sum up the problem for the maintainers of src:linux, and let them know about our desired timeline: ideally this would be fixed in the upcoming Bookworm point-release. This being said, if said timeline can't be met in src:linux, it'll be up to the maintainers of LXC in Debian to decide what they want to do in the upcoming Bookworm point-release. If I misunderstood something important, please let me know. Cheers, -- intrigeri
Bug#1050256: autopkgtest fails on debci
Hi,

On 03-09-2023 10:50, Paul Gevers wrote:
> I have manually upgraded the s390x host and rebooted, so that can serve as a test arch.

All ci.d.n workers (except riscv64) now run the kernel from bookworm-backports. systemd passes its autopkgtest again in unstable, testing and stable.

Paul
Bug#1050256: autopkgtest fails on debci
On 9/4/23 12:32, Michael Biebl wrote:
> On 04.09.23 at 20:23, Mathias Gibbens wrote:
> > On Mon, 2023-09-04 at 01:00 -0700, John Johansen wrote:
> > > I took a quick look through v6.1..v6.3.1
> > >
> > > there is a patch that I think is the likely fix, it first landed in v6.2
> > >
> > > 1cf26c3d2c4c apparmor: fix apparmor mediating locking non-fs unix sockets
> >
> > Thanks for the pointer John -- I think that is the fix we've been looking for! Commit 1cf26c3d2c4c doesn't apply cleanly to the v6.1 tree due to the other commits from the patchset of Oct 3, 2022 that modified a bunch of the apparmor code. Because I couldn't quickly cherry-pick all the changes without amassing a large diff, I made the small proof-of-concept patch at the end of this message and applied it to the 6.1.38-4 kernel from bookworm. Booting with the patched kernel allows services to start up in containers without any issues. :)
> >
> > So, I think the next step should be to get that commit properly backported to the v6.1 longterm tree and included in an upstream release. Hopefully that would be able to happen in enough time so that it is bundled with the kernel updates for bookworm's point release next month. If not, we should be sure to get it into Debian's packaging so at least there's a proper fix available.
>
> Thanks for the update Mathias, this looks very promising. A stable update of the Linux 6.1.x kernel would obviously be the ideal solution.
>
> John, could you help with getting this fix into 6.1.x?

yes, I am working on a patch.
Bug#1050256: autopkgtest fails on debci
On 04.09.23 at 20:23, Mathias Gibbens wrote:
> On Mon, 2023-09-04 at 01:00 -0700, John Johansen wrote:
> > I took a quick look through v6.1..v6.3.1
> >
> > there is a patch that I think is the likely fix, it first landed in v6.2
> >
> > 1cf26c3d2c4c apparmor: fix apparmor mediating locking non-fs unix sockets
>
> Thanks for the pointer John -- I think that is the fix we've been looking for! Commit 1cf26c3d2c4c doesn't apply cleanly to the v6.1 tree due to the other commits from the patchset of Oct 3, 2022 that modified a bunch of the apparmor code. Because I couldn't quickly cherry-pick all the changes without amassing a large diff, I made the small proof-of-concept patch at the end of this message and applied it to the 6.1.38-4 kernel from bookworm. Booting with the patched kernel allows services to start up in containers without any issues. :)
>
> So, I think the next step should be to get that commit properly backported to the v6.1 longterm tree and included in an upstream release. Hopefully that would be able to happen in enough time so that it is bundled with the kernel updates for bookworm's point release next month. If not, we should be sure to get it into Debian's packaging so at least there's a proper fix available.

Thanks for the update Mathias, this looks very promising. A stable update of the Linux 6.1.x kernel would obviously be the ideal solution.

John, could you help with getting this fix into 6.1.x?

Regards,
Michael
Bug#1050256: autopkgtest fails on debci
On Mon, 2023-09-04 at 01:00 -0700, John Johansen wrote:
> I took a quick look through v6.1..v6.3.1
>
> there is a patch that I think is the likely fix, it first landed in v6.2
>
> 1cf26c3d2c4c apparmor: fix apparmor mediating locking non-fs unix sockets

Thanks for the pointer John -- I think that is the fix we've been looking for! Commit 1cf26c3d2c4c doesn't apply cleanly to the v6.1 tree due to the other commits from the patchset of Oct 3, 2022 that modified a bunch of the apparmor code. Because I couldn't quickly cherry-pick all the changes without amassing a large diff, I made the small proof-of-concept patch at the end of this message and applied it to the 6.1.38-4 kernel from bookworm. Booting with the patched kernel allows services to start up in containers without any issues. :)

So, I think the next step should be to get that commit properly backported to the v6.1 longterm tree and included in an upstream release. Hopefully that would be able to happen in enough time so that it is bundled with the kernel updates for bookworm's point release next month. If not, we should be sure to get it into Debian's packaging so at least there's a proper fix available.

I'm happy to help test any proposed patch for this fix on my end.

Mathias

> --- a/security/apparmor/lib.c 2023-09-04 16:08:28.818066140 +
> +++ b/security/apparmor/lib.c 2023-09-04 16:09:17.56661 +
> @@ -355,6 +355,9 @@
>   perms->allow |= map_other(dfa_other_allow(dfa, state));
>   perms->audit |= map_other(dfa_other_audit(dfa, state));
>   perms->quiet |= map_other(dfa_other_quiet(dfa, state));
> +
> + // For testing only!
> + perms->allow |= AA_MAY_LOCK;
>  }
>
>  /**
Bug#1050256: [pkg-apparmor] Bug#1050256: autopkgtest fails on debci
Hello,

On Saturday, 2 September 2023, 01:13:11 CEST, Mathias Gibbens wrote:
> A minimal reproducer is to install bookworm and create a container with a systemd service using a hardening option like PrivateNetwork=yes. With the latest bookworm kernel (6.1.38-4), the service will fail. But, grab a kernel from testing (6.4.11-1) and then things work -- with no other changes required. I tried the "oldest" kernel on snapshot.d.o post 6.1 series (6.3.1+1~exp1 [1]) and the service works properly with that version as well. So, something changed in the kernel (either upstream or in Debian's packaging) between 6.1 and 6.3 that "unbreaks" services within lxc containers.

I asked in #apparmor, and John answered:

[11:04:33] can someone have a look at https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1050256 ? Short version: Debian gets unix denials when running lxc with kernel 6.1.38 from bookworm, but things work with kernel 6.3.1
[19:19:41] cboltz: ok, I will try and look at it today
[07:00:34] cboltz: I didn't see anything that would cause unix failures in a first pass. I will take another pass at it tomorrow
[10:01:30] cboltz: commit 1cf26c3d2c4c apparmor: fix apparmor mediating locking non-fs unix sockets

So you could test whether the bookworm kernel with 1cf26c3d2c4c applied on top fixes the issue.

To answer a question from a later mail:

On Sunday, 3 September 2023, 02:56:05 CEST, Michael Biebl wrote:
> I also tested downgrading apparmor to 2.13.6-10 (i.e. the version from oldstable) on a bookworm system.
>
> This was also sufficient to unbreak lxc.
>
> So it "looks" like apparmor 3.x makes assumptions about the kernel that are not fulfilled by the kernel 6.1.x in bookworm.

The difference is in the abi levels: without an abi/ include specified, unix rules don't get enforced (= allow everything), while with abi/3.0 and AppArmor >= 3.x userspace, unix rules get enforced.

abi/3.0 got introduced in AppArmor 3.0, and my guess is that the abi/3.0 include was also added to the lxc profile.

Actually the explanation might be slightly different (same result, but without abi/3.0 in the lxc profile): It looks like the Debian AppArmor maintainers pinned the abi to /etc/apparmor.d/abi/kernel-5.4-outoftree-network which, like abi/3.0, includes enforcing unix rules.

(Note: I'm only looking at https://salsa.debian.org/apparmor-team/apparmor.git/ since I don't have a Debian machine running.)

For completeness: 2.13.x doesn't support abi at all (besides ignoring abi/* includes if it finds them in a profile), so even if you have a profile with abi/3.0, unix rules won't be enforced.

There's an exception: Ubuntu kernels carry some patches to enable unix and some other rules even with older AppArmor versions.

Regards,

Christian Boltz
--
in my experience it's safe to assume developers never test
[Stephan Kulow in opensuse-factory]
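The abi explanation in this message can be restated as a small decision function. This is only a simplified model of what the message describes, not of AppArmor's actual implementation; the function name and its parameters are hypothetical, and "abi_with_unix_mediation" stands for either an abi/3.0 include in the profile or a pinned abi that includes unix mediation.

```python
def unix_rules_enforced(
    userspace: tuple[int, int],
    abi_with_unix_mediation: bool,
    ubuntu_kernel_patches: bool = False,
) -> bool:
    """Simplified model of when unix rules are enforced, per the message above."""
    if ubuntu_kernel_patches:
        # Described exception: Ubuntu kernels enable unix rules even
        # with older AppArmor userspace versions.
        return True
    if userspace < (3, 0):
        # 2.13.x does not support abi at all and ignores abi/* includes.
        return False
    # With 3.x userspace, enforcement depends on the abi in effect.
    return abi_with_unix_mediation


if __name__ == "__main__":
    # AppArmor 2.13 from oldstable: unix rules not enforced.
    print(unix_rules_enforced((2, 13), abi_with_unix_mediation=True))
    # AppArmor 3.0 with an abi that includes unix mediation: enforced.
    print(unix_rules_enforced((3, 0), abi_with_unix_mediation=True))
```

Under this model, downgrading userspace to 2.13.x stops unix rules from being enforced, which is consistent with the observation that the downgrade was enough to unbreak lxc.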
Bug#1050256: autopkgtest fails on debci
I took a quick look through v6.1..v6.3.1

There is a patch that I think is the likely fix; it first landed in v6.2:

1cf26c3d2c4c apparmor: fix apparmor mediating locking non-fs unix sockets

It matches up with the reported audit logs. Unfortunately it does not have a Fixes tag, but as best I can figure it should be applied all the way back to:

56974a6fcfef apparmor: add base infastructure for socket mediation

How and where this bug surfaces partly depends on the userspace policy and compiler, which combines the feature set supported by the kernel with what the policy claims to support. So it is possible to have an affected kernel but not trigger the bug.
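The commit title refers to locking Unix sockets that have no filesystem path. As a rough illustration of that kind of operation, the sketch below creates an anonymous AF_UNIX socket pair and tries to take a lock on one end. It is an assumption that this resembles what the affected services do; the thread itself does not show the failing call, and the function name is hypothetical.

```python
import fcntl
import socket


def try_lock_unix_socketpair() -> bool:
    """Create an anonymous AF_UNIX socket pair and try to lock one end.

    Returns True if the lock was taken and released, False if the
    operating system refused the lock.
    """
    left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        # Non-blocking exclusive lock on a socket that has no path on disk.
        fcntl.lockf(left.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
        fcntl.lockf(left.fileno(), fcntl.LOCK_UN)
        return True
    except OSError:
        return False
    finally:
        left.close()
        right.close()


if __name__ == "__main__":
    print("lock on non-fs unix socket succeeded:", try_lock_unix_socketpair())
```

Whether this particular call is denied inside a confined container depends on the kernel, the policy and the userspace compiler, as the message explains, so the result is not a reliable detector for the bug.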
Bug#1050256: autopkgtest fails on debci
On 03.09.23 at 10:50, Paul Gevers wrote:
> Hi,
>
> On 03-09-2023 02:56, Michael Biebl wrote:
> > Do the debci maintainers / lxc maintainers / release team have any preference regarding a/, b/ and c/ ?
>
> One part of me likes the ci.d.n infrastructure to run stable as an example of "eat your own dogfood". Another part of me agrees with Antonio that it makes sense for it to run a backports kernel, to be as close to testing as we reasonably (maintenance-wise) can get. Because we have a known issue at hand, the balance goes to backports for me. If Antonio doesn't beat me to it, I'll get to it (although I don't know yet how to do that in our configuration [1] and exclude riscv64 too).
>
> I have manually upgraded the s390x host and rebooted, so that can serve as a test arch.

Seems it worked, the latest run succeeded:
https://ci.debian.net/data/autopkgtest/testing/s390x/s/systemd/37374052/log.gz

Thanks!
Bug#1050256: autopkgtest fails on debci
Hi,

On 03-09-2023 02:56, Michael Biebl wrote:
> My main concern is to "stop the bleeding" quickly, so to speak, especially/mainly for debci.

I agree with you, but also consider that with this issue being there since ~ April 2023 we don't need to rush.

> I guess we have three options here:
> a/ upgrade the kernels to the one from backports as suggested by Antonio
> b/ disable apparmor confinement for lxc on debci via some debci specific configuration
> c/ disable apparmor confinement for lxc in bookworm via a stable upload of the lxc package

> That said, I would be fine with a/ and b/ as well, as this would buy us time to investigate this issue without being under the pressure of causing debci failures.

What I fear a bit is that if we do any of the three, Debian infra is not affected anymore, which removes some incentive to find the root cause.

> Those debci failures are hard to debug and I would like to avoid having individual maintainers waste time on it.

a, b, or c means that Debian maintainers don't need to dive into it anymore, but who knows which downstream project (volunteers or paid alike) will need to look into the problem in the future if we don't fix it inside packaging?

> Do the debci maintainers / lxc maintainers / release team have any preference regarding a/, b/ and c/ ?

One part of me likes the ci.d.n infrastructure to run stable as an example of "eat your own dogfood". Another part of me agrees with Antonio that it makes sense for it to run a backports kernel, to be as close to testing as we reasonably (maintenance-wise) can get. Because we have a known issue at hand, the balance goes to backports for me. If Antonio doesn't beat me to it, I'll get to it (although I don't know yet how to do that in our configuration [1] and exclude riscv64 too).

I have manually upgraded the s390x host and rebooted, so that can serve as a test arch.

Paul

[1] https://salsa.debian.org/ci-team/debian-ci-config
Bug#1050256: autopkgtest fails on debci
Control: severity -1 serious

I'm tentatively raising this to RC, mainly to make this issue more visible for other maintainers.
Bug#1050256: autopkgtest fails on debci
Hi everyone,

On 02.09.23 at 13:09, Antonio Terceiro wrote:
> On Fri, Sep 01, 2023 at 11:13:11PM +, Mathias Gibbens wrote:
> > I don't think we have a good understanding of the root cause of this issue. Initially we thought this was a known upstream issue with all-but very recent versions of apparmor and a corresponding lxc profile fix [0]. However, it appears this is a different issue that somehow depends on the interaction of bookworm's versions of the kernel, apparmor, and/or lxc.

Nod

> > A minimal reproducer is to install bookworm and create a container with a systemd service using a hardening option like PrivateNetwork=yes. With the latest bookworm kernel (6.1.38-4), the service will fail. But, grab a kernel from testing (6.4.11-1) and then things work -- with no other changes required. I tried the "oldest" kernel on snapshot.d.o post 6.1 series (6.3.1+1~exp1 [1]) and the service works properly with that version as well. So, something changed in the kernel (either upstream or in Debian's packaging) between 6.1 and 6.3 that "unbreaks" services within lxc containers.

Right, these are my findings as well.

I also tested downgrading apparmor to 2.13.6-10 (i.e. the version from oldstable) on a bookworm system.

This was also sufficient to unbreak lxc.

So it "looks" like apparmor 3.x makes assumptions about the kernel that are not fulfilled by the kernel 6.1.x in bookworm.

> > Given that simply installing a newer kernel fixes things, I am hesitant to start making changes to lxc until we actually understand what's changed when running the newer kernel and how it's affecting lxc's behavior.

My main concern is to "stop the bleeding" quickly, so to speak, especially/mainly for debci.

I guess we have three options here:
a/ upgrade the kernels to the one from backports as suggested by Antonio
b/ disable apparmor confinement for lxc on debci via some debci specific configuration
c/ disable apparmor confinement for lxc in bookworm via a stable upload of the lxc package

The MR I proposed is c/, as I don't know how to implement a/ or b/. That said, I would be fine with a/ and b/ as well, as this would buy us time to investigate this issue without being under the pressure of causing debci failures.

Those debci failures are hard to debug and I would like to avoid having individual maintainers waste time on it.

Do the debci maintainers / lxc maintainers / release team have any preference regarding a/, b/ and c/ ?

Michael
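The reproducer quoted in this message needs a systemd service with a hardening option such as PrivateNetwork=yes inside the container. A minimal sketch of such a unit, rendered from Python, might look like the following; the description string, the oneshot type and the /bin/true command are all assumptions chosen for illustration, not taken from the thread.

```python
def render_test_unit(hardening_option: str = "PrivateNetwork=yes") -> str:
    """Render a minimal, hypothetical systemd unit that uses one hardening option."""
    lines = [
        "[Unit]",
        "Description=Hypothetical reproducer unit for Bug#1050256",
        "",
        "[Service]",
        "Type=oneshot",
        # Any trivial command will do; the point is whether the service
        # can start at all with the hardening option in place.
        "ExecStart=/bin/true",
        hardening_option,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_test_unit())
```

The rendered text would be installed as a .service file inside the container and started there; per the reports in this thread, the start fails on the bookworm 6.1.38-4 kernel and succeeds on the newer kernels that were tested.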
Bug#1050256: autopkgtest fails on debci
On Fri, Sep 01, 2023 at 11:13:11PM +, Mathias Gibbens wrote:
> Control: block 1038315 by -1
> Control: block 1042880 by -1
>
> I don't think we have a good understanding of the root cause of this issue. Initially we thought this was a known upstream issue with all-but very recent versions of apparmor and a corresponding lxc profile fix [0]. However, it appears this is a different issue that somehow depends on the interaction of bookworm's versions of the kernel, apparmor, and/or lxc.
>
> A minimal reproducer is to install bookworm and create a container with a systemd service using a hardening option like PrivateNetwork=yes. With the latest bookworm kernel (6.1.38-4), the service will fail. But, grab a kernel from testing (6.4.11-1) and then things work -- with no other changes required. I tried the "oldest" kernel on snapshot.d.o post 6.1 series (6.3.1+1~exp1 [1]) and the service works properly with that version as well. So, something changed in the kernel (either upstream or in Debian's packaging) between 6.1 and 6.3 that "unbreaks" services within lxc containers.
>
> Given that simply installing a newer kernel fixes things, I am hesitant to start making changes to lxc until we actually understand what's changed when running the newer kernel and how it's affecting lxc's behavior.

Thanks for the investigation. This led me to think of something that would work around this issue, but maybe has bigger consequences.

I'm wondering whether we should, as a policy, run backports kernels on the ci.debian.net workers. Given the most important use case is testing testing¹, having a kernel that is closest to the one in testing might make sense.

¹ pun intended

Of course, this does not prevent having QEMU workers, and I want to provide that at some point. But since we won't be able to have QEMU for all architectures anyway, I still think running backports kernels in the lxc workers might be a valid strategy.
Bug#1050256: autopkgtest fails on debci
Control: block 1038315 by -1
Control: block 1042880 by -1

I don't think we have a good understanding of the root cause of this issue. Initially we thought this was a known upstream issue with all-but very recent versions of apparmor and a corresponding lxc profile fix [0]. However, it appears this is a different issue that somehow depends on the interaction of bookworm's versions of the kernel, apparmor, and/or lxc.

A minimal reproducer is to install bookworm and create a container with a systemd service using a hardening option like PrivateNetwork=yes. With the latest bookworm kernel (6.1.38-4), the service will fail. But, grab a kernel from testing (6.4.11-1) and then things work -- with no other changes required. I tried the "oldest" kernel on snapshot.d.o post 6.1 series (6.3.1+1~exp1 [1]) and the service works properly with that version as well. So, something changed in the kernel (either upstream or in Debian's packaging) between 6.1 and 6.3 that "unbreaks" services within lxc containers.

Given that simply installing a newer kernel fixes things, I am hesitant to start making changes to lxc until we actually understand what's changed when running the newer kernel and how it's affecting lxc's behavior.

On Thu, 2023-08-31 at 19:54 +0200, Christian Boltz wrote:
> That said - the DENIED log entry translates to
>
> unix send type=dgram,
>
> You could try if adding this rule to the lxc-autopkgtest-lxc-iomhit_* profile helps - but if the issue is really on the kernel side, my hope is limited).

I have tried tweaking the apparmor profile that's generated for containers (the relevant part is defined in the variable AA_PROFILE_UNIX_SOCKETS in src/lxc/lsm/apparmor.c), but haven't had any success in a workaround. I am not super familiar with apparmor, so maybe I'm not specifying things right, but I've previously tried the sort of rules Christian suggested, none of which have had any effect.

On Fri, 2023-09-01 at 13:23 +0200, Michael Biebl wrote:
> The only way to fix the container was to use the aforementioned `lxc.apparmor.profile = unconfined`.
> I think we should do that as the breakage is rather widespread and I already see individual packages trying to work around that to at least keep debci afloat.

I strongly dislike the idea of blanketly disabling apparmor profiles by default for all lxc installs, since apparmor is one of the ways of helping to ensure isolation of containers. For the specific instance of debci, /etc/lxc/default.conf can be modified post-lxc install to change lxc.apparmor.profile from "generated" to "unconfined" for the time being.

Mathias

---
[0] -- https://github.com/lxc/lxc/issues/4333
[1] -- https://snapshot.debian.org/package/linux-signed-amd64/6.3.1%2B1~exp1/
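The debci-specific change described above (switching lxc.apparmor.profile from "generated" to "unconfined" in /etc/lxc/default.conf) can be sketched as a small text transformation. This is a minimal sketch under assumptions: the function name is hypothetical, it operates on the file's text rather than touching the file, and it is not how debci was actually reconfigured.

```python
import re

# Matches a line such as "lxc.apparmor.profile = generated".
_PROFILE_LINE = re.compile(
    r"^([ \t]*lxc\.apparmor\.profile[ \t]*=[ \t]*).*$", re.MULTILINE
)


def set_apparmor_profile(conf_text: str, value: str = "unconfined") -> str:
    """Return conf_text with lxc.apparmor.profile set to value.

    Existing assignments are rewritten in place; if there is none,
    a new line is appended.
    """
    if _PROFILE_LINE.search(conf_text):
        return _PROFILE_LINE.sub(lambda m: m.group(1) + value, conf_text)
    separator = "" if not conf_text or conf_text.endswith("\n") else "\n"
    return f"{conf_text}{separator}lxc.apparmor.profile = {value}\n"


if __name__ == "__main__":
    before = "lxc.apparmor.profile = generated\n"
    print(set_apparmor_profile(before), end="")
```

As the message stresses, this removes AppArmor confinement for containers created from that configuration, so it is only suitable as a temporary, host-specific measure.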
Bug#1050256: [pkg-apparmor] Bug#1050256: autopkgtest fails on debci
On 01.09.23 at 13:23, Michael Biebl wrote:
> The only way to fix the container was to use the aforementioned `lxc.apparmor.profile = unconfined`.
> I think we should do that as the breakage is rather widespread and I already see individual packages trying to work around that to at least keep debci afloat.
>
> See e.g.:
> https://salsa.debian.org/systemd-team/systemd/-/merge_requests/211
> https://salsa.debian.org/debian/pdns/-/commit/637e54ef73386541086da430553b82db78266bac
>
> or disabling the systemd hardening options completely:
> https://salsa.debian.org/utopia-team/polkit/-/blob/master/debian/patches/debian/Don-t-use-PrivateNetwork-yes-for-the-systemd-unit.patch
>
> This is not a good outcome of this and the problem will become more apparent with debci running on bookworm now.

I went ahead and submitted https://salsa.debian.org/lxc-team/lxc/-/merge_requests/18 since I don't see another solution atm.

Looping in the release team as well for their input.

Regards,
Michael
Bug#1050256: [pkg-apparmor] Bug#1050256: autopkgtest fails on debci
On 31.08.23 at 19:54, Christian Boltz wrote:
> Hello,
>
> On Thursday, 31 August 2023, 08:41:59 CEST, Michael Biebl wrote:
> > What we found so far is that the AppArmor policy of lxc breaks any systemd service using PrivateNetwork=yes or PrivateIPC=yes when being run under lxc (running under bookworm using the bookworm kernel).
> > I wonder what the best course of action is here.
> > Should we disable the AA policy of lxc via a stable upload of the lxc package until the root cause is found?
> >
> > Unfortunately I know too little about AppArmor and lxc's AppArmor policy and my attempts to ask around for help weren't successful so far.
>
> Two quick hints, but let me warn you that I'm not familiar with lxc and also didn't check the content of the lxc-autopkgtest-lxc-iomhit_* profile.
>
> https://github.com/lxc/lxc/issues/4333 indicates that this issue was fixed in a (much) newer kernel - but that's probably not news to you since you wrote that comment ;-)
>
> That said - the DENIED log entry translates to
>
> unix send type=dgram,
>
> You could try if adding this rule to the lxc-autopkgtest-lxc-iomhit_* profile helps (but if the issue is really on the kernel side, my hope is limited).
>
> For testing, you could also try with a more broad unix send, or even unix, rule - but please don't add these broader rules to the production profile.

I have no idea where to add that and what specific syntax I should use. The profile above seems to be autogenerated and I only found a binary file with that name in /var/cache/apparmor.

The only way to fix the container was to use the aforementioned `lxc.apparmor.profile = unconfined`.
I think we should do that as the breakage is rather widespread and I already see individual packages trying to work around that to at least keep debci afloat.

See e.g.:
https://salsa.debian.org/systemd-team/systemd/-/merge_requests/211
https://salsa.debian.org/debian/pdns/-/commit/637e54ef73386541086da430553b82db78266bac

or disabling the systemd hardening options completely:
https://salsa.debian.org/utopia-team/polkit/-/blob/master/debian/patches/debian/Don-t-use-PrivateNetwork-yes-for-the-systemd-unit.patch

This is not a good outcome of this and the problem will become more apparent with debci running on bookworm now.

Regards,
Michael
Bug#1050256: [pkg-apparmor] Bug#1050256: autopkgtest fails on debci
Hello,

On Thursday, 31 August 2023, 08:41:59 CEST, Michael Biebl wrote:
> What we found so far is that the AppArmor policy of lxc breaks any systemd service using PrivateNetwork=yes or PrivateIPC=yes when being run under lxc (running under bookworm using the bookworm kernel).
> I wonder what the best course of action is here.
> Should we disable the AA policy of lxc via a stable upload of the lxc package until the root cause is found?
>
> Unfortunately I know too little about AppArmor and lxc's AppArmor policy and my attempts to ask around for help weren't successful so far.

Two quick hints, but let me warn you that I'm not familiar with lxc and also didn't check the content of the lxc-autopkgtest-lxc-iomhit_* profile.

https://github.com/lxc/lxc/issues/4333 indicates that this issue was fixed in a (much) newer kernel - but that's probably not news to you since you wrote that comment ;-)

That said - the DENIED log entry translates to
    unix send type=dgram,
You could try if adding this rule to the lxc-autopkgtest-lxc-iomhit_* profile helps (but if the issue is really on the kernel side, my hope is limited).

For testing, you could also try with a more broad unix send, or even unix, rule - but please don't add these broader rules to the production profile.

Regards, Christian Boltz
--
you need a certificate, nobody knows how to do that securely (including the CAs ;-) [Bernd Paysan, https://bugs.kde.org/show_bug.cgi?id=131083]
Bug#1050256: autopkgtest fails on debci
Hello everyone,

On Thu, 2023-08-31 at 08:55 +0200, Michael Biebl wrote:
> > What we found so far is that the AppArmor policy of lxc breaks any systemd service using PrivateNetwork=yes or PrivateIPC=yes when being run under lxc (running under bookworm using the bookworm kernel).
>
> I.e. by setting `lxc.apparmor.profile = unconfined` in /etc/lxc/default.conf and regenerating the autopkgtest container on bookworm, the failures are gone.

The same applies to systemd services using DynamicUser=yes.

Kind regards, Dan
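For anyone wanting to check which units on a system use the settings named so far in this thread (PrivateNetwork=yes, PrivateIPC=yes, DynamicUser=yes), here is a minimal Python sketch. The function names and the scanned directory are illustrative assumptions, not part of any existing tool. It reads unit files literally, so it ignores drop-ins and later overrides and may therefore over-report.

```python
from pathlib import Path

# Directives reported in this thread as failing under LXC with the bookworm kernel.
AFFECTED = ("PrivateNetwork", "PrivateIPC", "DynamicUser")
# Spellings systemd accepts for a true boolean value.
TRUTHY = ("yes", "true", "on", "1")


def affected_directives(unit_text):
    """Return the sorted names of affected directives enabled in a unit file's text."""
    found = set()
    for raw in unit_text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments and section headers such as [Service].
        if not line or line.startswith(("#", ";")) or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() in AFFECTED and value.strip().lower() in TRUTHY:
            found.add(key.strip())
    return sorted(found)


def scan_units(directory):
    """Map unit file name -> affected directives for each *.service file in directory."""
    hits = {}
    for path in sorted(Path(directory).glob("*.service")):
        names = affected_directives(path.read_text(errors="replace"))
        if names:
            hits[path.name] = names
    return hits
```

Calling `scan_units("/lib/systemd/system")` inside the container would list candidate units; the path is only an example and may differ between systems.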
Bug#1050256: autopkgtest fails on debci
On 31.08.23 at 08:41, Michael Biebl wrote:
> On Tue, 22 Aug 2023 16:08:24 +0200 Michael Biebl wrote:
> > Source: systemd
> > Version: 254.1-2
> > Severity: important
> >
> > Looking at https://ci.debian.net/packages/s/systemd/unstable/amd64/ , systemd has been failing on debci since about the beginning of May.
> > Asking around on #debci, this might be kernel related, as the debci related systems were upgraded to bookworm around that time.
>
> What we found so far is that the AppArmor policy of lxc breaks any systemd service using PrivateNetwork=yes or PrivateIPC=yes when being run under lxc (running under bookworm using the bookworm kernel).
> I wonder what the best course of action is here.
> Should we disable the AA policy of lxc via a stable upload of the lxc package until the root cause is found?
>
> Unfortunately I know too little about AppArmor and lxc's AppArmor policy and my attempts to ask around for help weren't successful so far.

I.e. by setting `lxc.apparmor.profile = unconfined` in /etc/lxc/default.conf and regenerating the autopkgtest container on bookworm, the failures are gone.
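The workaround described above amounts to one key in /etc/lxc/default.conf. As a rough Python sketch of applying it to the file's text: the helper name `set_lxc_key` is hypothetical, and the sketch only transforms a string and never writes to disk. It assumes the key is single-valued, which should hold for `lxc.apparmor.profile` but not for list-like LXC keys. As noted in the message, the container still has to be regenerated afterwards.

```python
def set_lxc_key(config_text, key, value):
    """Return config_text with `key = value` set exactly once.

    An existing assignment of the key is replaced in place; if there is
    none, the assignment is appended. Comments and other keys are kept.
    """
    out = []
    replaced = False
    for line in config_text.splitlines():
        is_comment = line.lstrip().startswith("#")
        name = line.split("=", 1)[0].strip()
        if "=" in line and not is_comment and name == key:
            if not replaced:
                out.append(f"{key} = {value}")
                replaced = True
            # Any further assignment of the same key is dropped.
            continue
        out.append(line)
    if not replaced:
        out.append(f"{key} = {value}")
    return "\n".join(out) + "\n"
```

For example, `set_lxc_key(text, "lxc.apparmor.profile", "unconfined")` returns the updated text, which can be reviewed before it replaces the real file.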
Bug#1050256: autopkgtest fails on debci
On Tue, 22 Aug 2023 16:08:24 +0200 Michael Biebl wrote:
> Source: systemd
> Version: 254.1-2
> Severity: important
>
> Looking at https://ci.debian.net/packages/s/systemd/unstable/amd64/ , systemd has been failing on debci since about the beginning of May.
> Asking around on #debci, this might be kernel related, as the debci related systems were upgraded to bookworm around that time.

What we found so far is that the AppArmor policy of lxc breaks any systemd service using PrivateNetwork=yes or PrivateIPC=yes when being run under lxc (running under bookworm using the bookworm kernel).
I wonder what the best course of action is here.
Should we disable the AA policy of lxc via a stable upload of the lxc package until the root cause is found?

Unfortunately I know too little about AppArmor and lxc's AppArmor policy and my attempts to ask around for help weren't successful so far.

Regards, Michael
Bug#1050256: autopkgtest fails on debci
On 23.08.23 at 14:32, Michael Biebl wrote:
> I see the following error in the journal:
>
> Aug 23 14:23:50 debian audit[4096]: AVC apparmor="DENIED" operation="file_lock" profile="lxc-autopkgtest-lxc-iomhit_" pid=4096 comm="(ostnamed)" family="unix" sock_type="dgram" protocol=0 requested_mask="send"
> Aug 23 14:23:50 debian kernel: audit: type=1400 audit(1692793430.788:33): apparmor="DENIED" operation="file_lock" profile="lxc-autopkgtest-lxc-iomhit_" pid=4096 comm="(ostnamed)" family="unix" sock_type="dgram" protocol=0 requested_mask="send"
> Aug 23 14:23:50 debian kernel: audit: type=1400 audit(1692793430.788:34): apparmor="DENIED" operation="file_lock" profile="lxc-autopkgtest-lxc-iomhit_" pid=4096 comm="(ostnamed)" family="unix" sock_type="dgram" protocol=0 requested_mask="send"
> Aug 23 14:23:50 debian audit[4096]: AVC apparmor="DENIED" operation="file_lock" profile="lxc-autopkgtest-lxc-iomhit_" pid=4096 comm="(ostnamed)" family="unix" sock_type="dgram" protocol=0 requested_mask="send"
>
> With the 6.4 kernel, no such error happens.
>
> So, this looks to me like an AppArmor issue, thus reassigning to the apparmor package.

It appears this was already reported separately as https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1038315 and the corresponding upstream bug https://github.com/lxc/lxc/issues/4333

Apparently any service using PrivateNetwork=yes and running inside lxc will trigger this AppArmor violation.
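To search a journal for the same denial, the key=value fields in the audit lines quoted above can be pulled apart with a short Python sketch. Both function names are hypothetical, and the check only encodes the pattern seen in this thread (a denied `file_lock` operation reported for a unix socket); it is not a general AppArmor log parser.

```python
import re

# key=value pairs, where the value is either "quoted" or a bare token.
_FIELD = re.compile(r'(\w+)=("([^"]*)"|\S+)')


def parse_apparmor_event(line):
    """Extract key=value fields from one audit line; quoted values lose their quotes."""
    fields = {}
    for key, raw, quoted in _FIELD.findall(line):
        fields[key] = quoted if raw.startswith('"') else raw
    return fields


def is_unix_socket_lock_denial(fields):
    """True if the event matches the denial pattern reported in this thread."""
    return (
        fields.get("apparmor") == "DENIED"
        and fields.get("operation") == "file_lock"
        and fields.get("family") == "unix"
    )
```

Feeding each journal line through `parse_apparmor_event` and filtering with `is_unix_socket_lock_denial` gives the affected `profile` and `comm` values for comparison against the ones reported here.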
Bug#1050256: autopkgtest fails on debci
Control: reassign -1 apparmor
Control: affects -1 src:systemd
Control: retitle -1 apparmor makes systemd autopkgtests fail on bookworm
Control: found -1 3.0.8-3

The plot thickens...

On 23.08.23 at 13:20, Michael Biebl wrote:
> On Tue, 22 Aug 2023 16:08:24 +0200 Michael Biebl wrote:
> > Source: systemd
> > Version: 254.1-2
> > Severity: important
> >
> > Looking at https://ci.debian.net/packages/s/systemd/unstable/amd64/ , systemd has been failing on debci since about the beginning of May.
> > Asking around on #debci, this might be kernel related, as the debci related systems were upgraded to bookworm around that time.
>
> Small update:
> I can reproduce the failures in a bookworm (qemu) VM, using LXC.
> Upgrading only the kernel to the one from trixie [1] is sufficient to make autopkgtest pass.

... so does disabling AppArmor with the bookworm kernel.

For completeness' sake, the failing tests are:

# autopkgtest systemd -- lxc autopkgtest-bookworm
784s hostnamed            FAIL non-zero exit status 1
784s localed-locale       FAIL non-zero exit status 1
784s localed-x11-keymap   FAIL non-zero exit status 1
784s networkd-test.py     FAIL non-zero exit status 1
784s boot-and-services    FAIL non-zero exit status 1
784s unit-tests           FAIL non-zero exit status 1

# autopkgtest systemd -- lxc autopkgtest-trixie
782s hostnamed            FAIL non-zero exit status 1
782s localed-locale       FAIL non-zero exit status 1
782s networkd-test.py     FAIL non-zero exit status 1
782s boot-and-services    FAIL non-zero exit status 1

Running e.g.
# autopkgtest --test-name=hostnamed systemd -- lxc autopkgtest-trixie
I see the following error in the journal:

Aug 23 14:23:50 debian audit[4096]: AVC apparmor="DENIED" operation="file_lock" profile="lxc-autopkgtest-lxc-iomhit_" pid=4096 comm="(ostnamed)" family="unix" sock_type="dgram" protocol=0 requested_mask="send"
Aug 23 14:23:50 debian kernel: audit: type=1400 audit(1692793430.788:33): apparmor="DENIED" operation="file_lock" profile="lxc-autopkgtest-lxc-iomhit_" pid=4096 comm="(ostnamed)" family="unix" sock_type="dgram" protocol=0 requested_mask="send"
Aug 23 14:23:50 debian kernel: audit: type=1400 audit(1692793430.788:34): apparmor="DENIED" operation="file_lock" profile="lxc-autopkgtest-lxc-iomhit_" pid=4096 comm="(ostnamed)" family="unix" sock_type="dgram" protocol=0 requested_mask="send"
Aug 23 14:23:50 debian audit[4096]: AVC apparmor="DENIED" operation="file_lock" profile="lxc-autopkgtest-lxc-iomhit_" pid=4096 comm="(ostnamed)" family="unix" sock_type="dgram" protocol=0 requested_mask="send"

With the 6.4 kernel, no such error happens.

So, this looks to me like an AppArmor issue, thus reassigning to the apparmor package.

Dear AppArmor maintainers: can you please have a look?
If you need further information, please let me know.

Regards, Michael
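The two result lists above can be compared mechanically. Below is a minimal Python sketch that assumes summary lines have the shape shown in this message (optional elapsed-seconds prefix, test name, status, detail) and recognises only PASS, FAIL and SKIP; both function names are hypothetical.

```python
import re

# e.g. "784s hostnamed            FAIL non-zero exit status 1"
_RESULT = re.compile(
    r"^(?:\d+s\s+)?(?P<name>\S+)\s+(?P<status>PASS|FAIL|SKIP)\b\s*(?P<detail>.*)$"
)


def parse_summary(text):
    """Map test name -> (status, detail) for every summary line found in text."""
    results = {}
    for line in text.splitlines():
        match = _RESULT.match(line.strip())
        if match:
            results[match.group("name")] = (match.group("status"), match.group("detail"))
    return results


def failed(results):
    """Return the set of test names whose status is FAIL."""
    return {name for name, (status, _detail) in results.items() if status == "FAIL"}
```

With the two runs parsed into `bookworm` and `trixie`, `failed(bookworm) - failed(trixie)` yields the tests that fail only in the bookworm container, and `failed(bookworm) & failed(trixie)` the ones failing in both.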
Bug#1050256: autopkgtest fails on debci
On Tue, 22 Aug 2023 16:08:24 +0200 Michael Biebl wrote:
> Source: systemd
> Version: 254.1-2
> Severity: important
>
> Looking at https://ci.debian.net/packages/s/systemd/unstable/amd64/ , systemd has been failing on debci since about the beginning of May.
> Asking around on #debci, this might be kernel related, as the debci related systems were upgraded to bookworm around that time.

Small update:
I can reproduce the failures in a bookworm (qemu) VM, using LXC.
Upgrading only the kernel to the one from trixie [1] is sufficient to make autopkgtest pass.

[1] 6.4.0-2-amd64
Bug#1050256: autopkgtest fails on debci
Source: systemd
Version: 254.1-2
Severity: important

Looking at https://ci.debian.net/packages/s/systemd/unstable/amd64/ , systemd has been failing on debci since about the beginning of May.
Asking around on #debci, this might be kernel related, as the debci related systems were upgraded to bookworm around that time.