[systemd-devel] logind device access weird behavior

2025-04-01 Thread serenissi
I noticed some odd behavior with logind-managed devices (a DRM node). I
have two users, localuser and testuser; the former has a session on seat0
(this is important). I attached DRM card1 to a new seat, `seat1`, and set
mode 777 on the device node /dev/dri/card1. Now the ACL looks like


# file: dev/dri/card1
# owner: root
# group: video
user::rwx
group::---
mask::rwx
other::rwx

as expected. Now if I run `sudo -u testuser cat /dev/dri/card1` from a
localuser shell, the device opens as expected. However, running the same
cat as localuser results in "Permission denied".


But if I add another ACL entry with `setfacl -m u:localuser:rw
/dev/dri/card1`, `cat /dev/dri/card1` suddenly works. In this case the
ACL is


# file: dev/dri/card1
# owner: root
# group: video
user::rwx
user:localuser:rw-
group::---
mask::rw-
other::rwx

Intuitively, the `other::rwx` entry should make the `user:localuser`
entry redundant here, yet adding that entry is what makes access work.


My hunch is eBPF, but I couldn't find where this logic is defined in the
systemd tree. Could anyone here help me with that?



~ serene




Re: [systemd-devel] logind device access weird behavior

2025-04-01 Thread Mantas Mikulėnas
It sounds as if your original user is in the "video" group, so it receives
the 'group' permissions and not 'other' permissions. (They are not additive
in the POSIX model like they would be in Windows.)

Even though the device node had no specific ACL entries, it still *had* an
ACL in general, so the 'group' permission bits no longer affect actual
group permissions: they change the overall ACL access mask (and so can
limit access for all entries at once, but not grant access).

So doing "chmod 777" actually did the equivalent of setting
"u::rwx,m::rwx,o::rwx" while the "g::---" entry was left unchanged with no
permissions. If you're not the owner but are in the 'video' group, you
therefore get no access.

Use "setfacl -m g::rwx" to change the main group access entry instead.
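
As a rough illustration of the evaluation order described above, here is a
simplified Python model. It is a sketch, not kernel code: the names Acl,
chmod and allowed are made up for this example, privileged overrides are
ignored, and the group memberships (localuser in 'video', testuser not)
are assumptions based on the symptoms reported in the thread.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set

R, W, X = 4, 2, 1  # permission bits, as in chmod


@dataclass
class Acl:
    owner: str
    group: str
    user_obj: int                 # user::
    group_obj: int                # group::
    other: int                    # other::
    mask: Optional[int] = None    # mask:: (absent in a minimal ACL)
    named_users: Dict[str, int] = field(default_factory=dict)
    named_groups: Dict[str, int] = field(default_factory=dict)


def chmod(acl: Acl, mode: int) -> None:
    """Apply a three-digit mode the way chmod does when an ACL may be present."""
    acl.user_obj = (mode >> 6) & 7
    acl.other = mode & 7
    if acl.mask is None:
        acl.group_obj = (mode >> 3) & 7   # no mask: group bits are group::
    else:
        acl.mask = (mode >> 3) & 7        # mask present: group bits are mask::


def allowed(acl: Acl, user: str, groups: Set[str], want: int) -> bool:
    """First matching class wins; classes are never combined.
    Privileged overrides (root, CAP_DAC_OVERRIDE) are ignored here."""
    mask = acl.mask if acl.mask is not None else (R | W | X)
    if user == acl.owner:
        return (acl.user_obj & want) == want
    if user in acl.named_users:
        return (acl.named_users[user] & mask & want) == want
    matching = [p for g, p in acl.named_groups.items() if g in groups]
    if acl.group in groups:
        matching.append(acl.group_obj)
    if matching:
        # Being in any listed group means other:: is never consulted.
        return any((p & mask & want) == want for p in matching)
    return (acl.other & want) == want


# The two ACLs from the thread (group memberships are assumed).
before = Acl("root", "video", user_obj=7, group_obj=0, other=7, mask=7)
after = Acl("root", "video", user_obj=7, group_obj=0, other=7, mask=6,
            named_users={"localuser": R | W})

print(allowed(before, "localuser", {"localuser", "video"}, R))  # False
print(allowed(before, "testuser", {"testuser"}, R))             # True
print(allowed(after, "localuser", {"localuser", "video"}, R))   # True
```

In this model the named-user entry helps only because it is checked before
the group class; "other" is never reached for a member of 'video'.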



Re: [systemd-devel] repart: Value too large for defined data type

2025-04-01 Thread Thayne Harbaugh
On Thu, 2025-01-09 at 14:47 -0700, Thayne Harbaugh wrote:

I *finally* got back around to investigating the below EOVERFLOW
failure - additional details in-line:

> I have a mkosi build that is failing with the following message:
> 
>   >$ mkosi build
>   ...
>   /var/tmp/.#repart020c1a929b048b02 successfully formatted as ext4 (label 
> "root", uuid c8ed6cd3-04b0-4667-8f2f-af9487b8b986)
>   Automatically determined minimal disk image size as 2.2G.
>   Sized 
> '/home/thayne/.cache/mkosi/mkosi-workspace-37l8iaf4/staging/test_1.2.3.raw' 
> to 2.2G.
>   Applying changes to 
> /home/thayne/.cache/mkosi/mkosi-workspace-37l8iaf4/staging/test_1.2.3.raw.
>   Failed to read file attributes of 
> /home/thayne/.cache/mkosi/mkosi-workspace-37l8iaf4/staging/test_1.2.3.raw: 
> Value too large for defined data type
>   ? "systemd-repart --empty=allow --size=auto --dry-run=no --json=pretty 
> --no-pager --offline=yes --seed 1feaec73-f24d-454c-bd15-4f80926e951e 
> /home/thayne/.cache/mkosi/mkosi-workspace-37l8iaf4/staging/test_1.2.3.raw 
> --root=/buildroot --empty=create --defer-partitions esp,xbootldr 
> --generate-fstab=/etc/fstab --generate-crypttab=/etc/crypttab --definitions 
> /source/mkosi.repart" returned non-zero exit code 1.

The above failure is specifically triggered by the following line in
repart.c:prepare_temporary_file():

  r = read_attr_fd(fdisk_get_devfd(context->fdisk_context), &attrs);

Then chattr-util.c:read_attr_fd() calls the following line:

  return RET_NERRNO(ioctl(fd, FS_IOC_GETFLAGS, ret));
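
To check whether that ioctl fails on a given path independently of
systemd-repart, something like the following Python sketch may help. It is
not taken from systemd: the helper names ioc_read and read_attr are made
up, and the request number is computed assuming the generic _IOC layout
used on x86 and ARM (some architectures encode the direction bits
differently).

```python
import errno
import fcntl
import os
import struct

_IOC_READ = 2  # generic asm-generic/ioctl.h value; an assumption for other arches


def ioc_read(type_char: str, nr: int, size: int) -> int:
    """Equivalent of the kernel's _IOR(type, nr, size) for the generic layout."""
    return (_IOC_READ << 30) | (size << 16) | (ord(type_char) << 8) | nr


# include/uapi/linux/fs.h defines FS_IOC_GETFLAGS as _IOR('f', 1, long).
FS_IOC_GETFLAGS = ioc_read("f", 1, struct.calcsize("l"))


def read_attr(path: str):
    """Return (flags, None) on success or (None, errno name) on failure."""
    fd = os.open(path, os.O_RDONLY | os.O_CLOEXEC)
    try:
        buf = bytearray(struct.calcsize("l"))
        try:
            fcntl.ioctl(fd, FS_IOC_GETFLAGS, buf)  # fills buf in place
        except OSError as e:
            return None, errno.errorcode.get(e.errno, str(e.errno))
        # Despite the 'long' in the definition, the kernel stores 32 bits.
        return struct.unpack("I", bytes(buf[:4]))[0], None
    finally:
        os.close(fd)
```

Calling read_attr() on the staging image path from inside the container
and again from the host would show whether the failure follows the mount;
an "EOVERFLOW" result corresponds to the "Value too large for defined data
type" message above.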

> It is a build that has been running without problems for some time.
> Recently it has changed from incorporating systemd v256 to v257. 
> When I switch it back to v256 it succeeds and does not error. 

The repart.c:prepare_temporary_file() changed recently with this commit:

commit b9c0b6c011fd0b30d3484d21d70cef6f5ae2fc0a
Author: Daan De Meyer 
Date:   Tue Jul 23 21:43:13 2024 +0200

repart: Make partition files NOCOW if the disk image is NOCOW

https://github.com/systemd/systemd/commit/b9c0b6c011f

> I have tried to narrow it further.
> 
>   * The build runs inside of a Docker container with the mkosi source
>     tree mounted inside.  The container is started with the
> following command:
> 
>   >$ docker run -it --privileged -v "/dir/to/source:/source" ubuntu:24.04
> 
>   * The container has mkosi 24.3 inside
> 
>   * The failure occurs with both erofs+best and ext4+guess
> 
>   * The mkosi configuration is the following:
> 
>   mkosi.conf
>   ==========
> 
>     [Distribution]
>     Distribution=ubuntu
>     # noble == 24.04
>     Release=noble
>     Repositories=noble,noble-security,noble-updates
>     Architecture=x86-64
> 
>     [Output]
>     Format=disk
>     ImageId=test
>     ImageVersion=1.2.3
> 
>     [Content]
>     RootPassword=tomato
>     Bootable=yes
>     Bootloader=systemd-boot
> 
>     Packages=
>         linux-image-6.8.0-51-generic
> 
>   mkosi.repart/10-esp.conf
>   ========================
> 
>     [Partition]
>     Type=esp
>     Format=vfat
>     CopyFiles=/boot:/
>     CopyFiles=/efi:/
>     SizeMinBytes=2048M
> 
>   mkosi.repart/20-rootfs.conf
>   ===========================
> 
>     [Partition]
>     Type=root
>     Label=root
>     #Format=erofs
>     #Minimize=best
>     Format=ext4
>     Minimize=guess
>     MountPoint=/:ro
>     CopyFiles=/:/
>     ReadOnly=on
>     Encrypt=key-file
>     EncryptedVolume=root:/run/fscrypt.sock:luks,headless,x-initrd.attach
> 
> While initially the failure seemed to correlate directly with systemd
> v257 packages being injected into mkosi.packages I have since been
> able to reproduce the failure with the upstream Ubuntu 24.04/noble
> v255 version of systemd.
> 
> It seems that building on a mount inside of a Docker container is a
> necessary factor in causing the failure.  While I have taken a quick
> glance at src/repart/repart.c in prepare_temporary_file(),
> context_split() and context_minimize(), I have not done any serious
> digging yet.

I'm running Linux kernel 6.11.10 and Docker 26.1.5.

> Any ideas about the specifics of what causes this failure and what it
> will take to fix it?

It seems to me that this is related to this comment from the Linux
kernel source tree in include/uapi/linux/fs.h:

  /*
   * Inode flags (FS_IOC_GETFLAGS / FS_IOC_SETFLAGS)
   *
   * Note: for historical reasons, these flags were originally used and
   * defined for use by ext2/ext3, and then other file systems started
   * using these flags so they wouldn't need to write their own version
   * of chattr/lsattr (which was shipped as part of e2fsprogs).  You
   * should think twice before trying to use these flags in new
   * contexts, or trying to assign these flags, since they are used both
   * as the UAPI and the on-disk encoding for ext2/3/4.  Also, we are
   * alm

Re: [systemd-devel] logind device access weird behavior

2025-04-01 Thread serenissi
Right! Stupid me. It just occurred to me that this is Debian (which adds
local users to the video group) and that `group::---` looks suspicious. I
spun up the VM to confirm and saw this reply in my inbox. Thanks!


~serene
