On Thu, 2025-01-09 at 14:47 -0700, Thayne Harbaugh wrote:

I *finally* got back around to investigating the below EOVERFLOW
failure - additional details in-line:

> I have a mkosi build that is failing with the following message:
>
> >$ mkosi build
> ...
> /var/tmp/.#repart020c1a929b048b02 successfully formatted as ext4 (label
> "root", uuid c8ed6cd3-04b0-4667-8f2f-af9487b8b986)
> Automatically determined minimal disk image size as 2.2G.
> Sized
> '/home/thayne/.cache/mkosi/mkosi-workspace-37l8iaf4/staging/test_1.2.3.raw'
> to 2.2G.
> Applying changes to
> /home/thayne/.cache/mkosi/mkosi-workspace-37l8iaf4/staging/test_1.2.3.raw.
> Failed to read file attributes of
> /home/thayne/.cache/mkosi/mkosi-workspace-37l8iaf4/staging/test_1.2.3.raw:
> Value too large for defined data type
> ‣ "systemd-repart --empty=allow --size=auto --dry-run=no --json=pretty
> --no-pager --offline=yes --seed 1feaec73-f24d-454c-bd15-4f80926e951e
> /home/thayne/.cache/mkosi/mkosi-workspace-37l8iaf4/staging/test_1.2.3.raw
> --root=/buildroot --empty=create --defer-partitions esp,xbootldr
> --generate-fstab=/etc/fstab --generate-crypttab=/etc/crypttab --definitions
> /source/mkosi.repart" returned non-zero exit code 1.

The above failure is specifically triggered by the following line in
repart.c:prepare_temporary_file():

    r = read_attr_fd(fdisk_get_devfd(context->fdisk_context), &attrs);

Then chattr-util.c:read_attr_fd() calls the following line:

    return RET_NERRNO(ioctl(fd, FS_IOC_GETFLAGS, ret));

> It is a build that has been running without problems for some time.
> Recently it has changed from incorporating systemd v256 to v257.
> When I switch it back to v256 it succeeds and does not error.

The repart.c:prepare_temporary_file() changed recently with this commit:

    commit b9c0b6c011fd0b30d3484d21d70cef6f5ae2fc0a
    Author: Daan De Meyer <daan.j.deme...@gmail.com>
    Date:   Tue Jul 23 21:43:13 2024 +0200

        repart: Make partition files NOCOW if the disk image is NOCOW

    https://github.com/systemd/systemd/commit/b9c0b6c011f

> I have tried to narrow it further.
>
> * The build runs inside of a Docker container with the mkosi source
>   tree mounted inside.
>   The container is started with the following command:
>
> >$ docker run -it --privileged -v "/dir/to/source:/source" ubuntu:24.04
>
> * The container has mkosi 24.3 inside
>
> * The failure occurs with both erofs+best and ext4+guess
>
> * The mkosi configuration is the following:
>
> mkosi.conf
> ==========
>
> [Distribution]
> Distribution=ubuntu
> # noble == 24.04
> Release=noble
> Repositories=noble,noble-security,noble-updates
> Architecture=x86-64
>
> [Output]
> Format=disk
> ImageId=test
> ImageVersion=1.2.3
>
> [Content]
> RootPassword=tomato
> Bootable=yes
> Bootloader=systemd-boot
>
> Packages=
>         linux-image-6.8.0-51-generic
>
> mkosi.repart/10-esp.conf
> ========================
>
> [Partition]
> Type=esp
> Format=vfat
> CopyFiles=/boot:/
> CopyFiles=/efi:/
> SizeMinBytes=2048M
>
> mkosi.repart/20-rootfs.conf
> ===========================
>
> [Partition]
> Type=root
> Label=root
> #Format=erofs
> #Minimize=best
> Format=ext4
> Minimize=guess
> MountPoint=/:ro
> CopyFiles=/:/
> ReadOnly=on
> Encrypt=key-file
> EncryptedVolume=root:/run/fscrypt.sock:luks,headless,x-initrd.attach
>
> While initially the failure seemed to correlate directly with systemd
> v257 packages being injected into mkosi.packages I have since been
> able to reproduce the failure with the upstream Ubuntu 24.04/noble
> v255 version of systemd.
>
> It seems that building on a mount inside of a Docker container is
> necessary factor to cause the failure. While I have made a quick
> glance at src/repart/repart.c in prepare_temporary_file(),
> context_split() and context_minimize() I have not done any serious
> digging yet.

I'm running Linux kernel 6.11.10 and Docker 26.1.5.

> Any ideas about the specifics of what causes this failure and what it
> will take to fix it?

It seems to me that this is related to this comment from the Linux
kernel source tree in include/uapi/linux/fs.h:

    /*
     * Inode flags (FS_IOC_GETFLAGS / FS_IOC_SETFLAGS)
     *
     * Note: for historical reasons, these flags were originally used and
     * defined for use by ext2/ext3, and then other file systems started
     * using these flags so they wouldn't need to write their own version
     * of chattr/lsattr (which was shipped as part of e2fsprogs). You
     * should think twice before trying to use these flags in new
     * contexts, or trying to assign these flags, since they are used both
     * as the UAPI and the on-disk encoding for ext2/3/4. Also, we are
     * almost out of 32-bit flags. :-)
     *
     * We have recently hoisted FS_IOC_FSGETXATTR / FS_IOC_FSSETXATTR from
     * XFS to the generic FS level interface. This uses a structure that
     * has padding and hence has more room to grow, so it may be more
     * appropriate for many new use cases.
     ...
     */

I'm uncertain which file system layer - or other translation layer -
introduced by Docker is causing the EOVERFLOW error. A quick scan
through the kernel code hints that inodes, uid/gid translations and a
few other possibilities can return EOVERFLOW.

Maybe chattr-util.c:read_attr_fd() might be better-implemented using
the newer FS_IOC_FSGETXATTR ioctl? Maybe there's a different way to
detect FS_NOCOW_FL?

I'm continuing to poke at this. Please send me suggestions.
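
In case it helps anyone reproduce this outside of systemd-repart, below
is a minimal sketch of a probe that issues the same FS_IOC_GETFLAGS
ioctl as read_attr_fd() and, for comparison, FS_IOC_FSGETXATTR on the
same file descriptor. The helper names (probe_getflags(),
probe_fsgetxattr(), probe_path()) are made up for this mail and are not
systemd functions. As far as I can tell from include/uapi/linux/fs.h
there is no FS_XFLAG_* counterpart of FS_NOCOW_FL, so the second call is
only meant to show whether the newer ioctl hits the same error, not to
replace the NOCOW check.

```c
/* Sketch only: the probe_*() helpers are hypothetical and were written
 * for this mail; they are not part of systemd. */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>

/* Same ioctl that chattr-util.c:read_attr_fd() issues.
 * Returns 0 and stores the flags in *ret, or returns -errno. */
static int probe_getflags(int fd, unsigned *ret) {
        unsigned attrs = 0;

        if (ioctl(fd, FS_IOC_GETFLAGS, &attrs) < 0)
                return -errno;

        *ret = attrs;
        return 0;
}

/* The newer interface mentioned in the kernel comment.
 * Returns 0 and stores fsx_xflags in *ret, or returns -errno. */
static int probe_fsgetxattr(int fd, unsigned *ret) {
        struct fsxattr fsx;

        memset(&fsx, 0, sizeof(fsx));
        if (ioctl(fd, FS_IOC_FSGETXATTR, &fsx) < 0)
                return -errno;

        *ret = fsx.fsx_xflags;
        return 0;
}

/* Opens path read-only, runs both probes and prints one line per ioctl.
 * Returns the FS_IOC_GETFLAGS result (0 or -errno), or -errno if the
 * open() itself fails. */
static int probe_path(const char *path) {
        unsigned flags = 0, xflags = 0;
        int fd, r, q;

        assert(path);

        fd = open(path, O_RDONLY);
        if (fd < 0) {
                r = -errno;
                printf("%s: open failed: %s\n", path, strerror(-r));
                return r;
        }

        r = probe_getflags(fd, &flags);
        if (r < 0)
                printf("%s: FS_IOC_GETFLAGS failed: %s\n", path, strerror(-r));
        else
                printf("%s: FS_IOC_GETFLAGS 0x%08x, FS_NOCOW_FL %s\n", path, flags,
                       (flags & FS_NOCOW_FL) ? "set" : "not set");

        q = probe_fsgetxattr(fd, &xflags);
        if (q < 0)
                printf("%s: FS_IOC_FSGETXATTR failed: %s\n", path, strerror(-q));
        else
                printf("%s: FS_IOC_FSGETXATTR xflags 0x%08x\n", path, xflags);

        close(fd);
        return r;
}
```

Calling probe_path() from a small main() on the staging .raw file, once
inside the container and once on the host path behind the bind mount,
should at least show whether the EOVERFLOW follows the mount or the
file system underneath it.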