Re: [systemd-devel] timed out waiting for device dev-disk-by\x2duuid
On Sa, 04.01.20 18:58, Georg Großmann (ge...@grossmann-technologies.de) wrote:

> Has this issue been fixed on either systemd or on the BTRFS side in the
> meantime? I am currently testing a BTRFS raid1 with two disks in my
> virtualbox. I have installed a bootloader on both disks. After removing
> one of the disks I always get stuck at "timed out waiting for device
> dev-disk-by\x2duuid". Or is it still a 100% manual task to get the raid1
> up again?

systemd does not implement a policy manager that decides when it's time to stop waiting for additional members of a raid device. This is something the btrfs folks should put together. How long to wait, for how many of how many devices, possibly asking for user input, is complex and specific enough for the btrfs folks to take care of, and is outside of systemd's realm.

systemd, with its internal code, will deal with the obvious case (i.e. all members appeared), and provides hooks for policy managers to hook into (the ID_BTRFS_READY udev property), but it doesn't provide those policy managers, and that's unlikely to change.

Sorry,

Lennart

--
Lennart Poettering, Berlin

___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel
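[Editorial note: for illustration, a policy manager could hook into the ID_BTRFS_READY property mentioned above with a udev rule along these lines. The rule file name and the helper path are hypothetical; nothing like this is shipped by systemd.]

```
# /etc/udev/rules.d/99-btrfs-policy.rules (hypothetical sketch)
# When the btrfs udev built-in reports a device whose filesystem is not
# yet complete (ID_BTRFS_READY=0), hand the device node to a site-local
# policy helper that decides how long to wait and whether to mount degraded.
SUBSYSTEM=="block", ENV{ID_FS_TYPE}=="btrfs", ENV{ID_BTRFS_READY}=="0", \
  RUN+="/usr/local/sbin/btrfs-degraded-policy $devnode"
```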
Re: [systemd-devel] timed out waiting for device dev-disk-by\x2duuid
04.01.2020 20:58, Georg Großmann wrote:

> Has this issue been fixed on either systemd or on the BTRFS side in the
> meantime?

No. Each project says that from its side there is nothing to fix.

> I am currently testing a BTRFS raid1 with two disks in my
> virtualbox. I have installed a bootloader on both disks. After removing
> one of the disks I always get stuck at "timed out waiting for device
> dev-disk-by\x2duuid". Or is it still a 100% manual task to get the raid1
> up again?
Re: [systemd-devel] timed out waiting for device dev-disk-by\x2duuid
Has this issue been fixed on either systemd or on the BTRFS side in the meantime? I am currently testing a BTRFS raid1 with two disks in my virtualbox. I have installed a bootloader on both disks. After removing one of the disks I always get stuck at "timed out waiting for device dev-disk-by\x2duuid". Or is it still a 100% manual task to get the raid1 up again?

Georg
Re: [systemd-devel] timed out waiting for device dev-disk-by\x2duuid
On May 17, 2014, at 5:30 PM, Chris Murphy li...@colorremedies.com wrote:

No, the system definitely does not attempt to mount it if there's a missing device. Systemd never executes /bin/mount at all in that case. A prerequisite for the mount attempt is this line:

[1.621517] localhost.localdomain systemd[1]: dev-disk-by\x2duuid-9ff63135\x2dce42\x2d4447\x2da6de\x2dd7c9b4fb6d66.device changed dead -> plugged

That line only appears if all devices are present. Otherwise the mount attempt doesn't happen and the system just hangs. However, if I do rd.break=pre-mount and get to a dracut shell, this command works:

mount -t btrfs -o subvol=root,ro,degraded -U uuid

The volume UUID is definitely present even though not all devices are present. So actually in this case it's confusing why this uuid hasn't gone from dead to plugged. Until it's plugged, the mount command won't happen.

Two-device Btrfs raid1, sda3 and sdb3. When both are available I get these lines:

[2.168697] localhost.localdomain systemd-udevd[109]: creating link '/dev/disk/by-uuid/9ff63135-ce42-4447-a6de-d7c9b4fb6d66' to '/dev/sda3'
[2.170232] localhost.localdomain systemd-udevd[135]: creating link '/dev/disk/by-uuid/9ff63135-ce42-4447-a6de-d7c9b4fb6d66' to '/dev/sdb3'

That precipitates systemd changing the by-uuid .device from dead to plugged. If I remove one device, then udev does not create a link from the uuid to /dev for the remaining device. Therefore the expected uuid never appears to systemd, so it doesn't attempt to mount it, and the system hangs indefinitely.

Chris Murphy
Re: [systemd-devel] timed out waiting for device dev-disk-by\x2duuid
On May 16, 2014, at 11:31 AM, Goffredo Baroncelli kreij...@inwind.it wrote:

> On 05/15/2014 11:54 PM, Chris Murphy wrote:
>> On May 15, 2014, at 2:57 PM, Goffredo Baroncelli kreij...@libero.it wrote:
>>> [...]
>> The udev rule right now is asking if all Btrfs member devices are present, and it sounds like that answer is no with a missing device; so a mount isn't even attempted by systemd, rather than attempting a degraded mount specifically for the root=UUID device(s).
>
> Who is in charge of mounting the filesystem?

Ultimately systemd mounts the defined root file system to /sysroot. It knows what volume to mount based on the boot parameter root=UUID=, but it doesn't even try to mount it until the volume UUID for the root fs has appeared. Until then, the mount command isn't even issued.

> What I found is that dracut waits until all the btrfs devices are present:
>
> cat /usr/lib/dracut/modules.d/90btrfs/80-btrfs.rules
>
> SUBSYSTEM!="block", GOTO="btrfs_end"
> ACTION!="add|change", GOTO="btrfs_end"
> ENV{ID_FS_TYPE}!="btrfs", GOTO="btrfs_end"
> RUN+="/sbin/btrfs device scan $env{DEVNAME}"
> RUN+="/sbin/initqueue --finished --unique --name btrfs_finished /sbin/btrfs_finished"
> LABEL="btrfs_end"
>
> and
>
> cat btrfs_finished.sh
>
> #!/bin/sh
> # -*- mode: shell-script; indent-tabs-mode: nil; sh-basic-offset: 4; -*-
> # ex: ts=8 sw=4 sts=4 et filetype=sh
>
> type getarg >/dev/null 2>&1 || . /lib/dracut-lib.sh
>
> btrfs_check_complete() {
>     local _rootinfo _dev
>     _dev="${1:-/dev/root}"
>     [ -e "$_dev" ] || return 0
>     _rootinfo=$(udevadm info --query=env --name="$_dev" 2>/dev/null)
>     if strstr "$_rootinfo" "ID_FS_TYPE=btrfs"; then
>         info "Checking, if btrfs device complete"
>         unset __btrfs_mount
>         mount -o ro "$_dev" /tmp >/dev/null 2>&1
>         __btrfs_mount=$?
>         [ $__btrfs_mount -eq 0 ] && umount "$_dev" >/dev/null 2>&1
>         return $__btrfs_mount
>     fi
>     return 0
> }
>
> btrfs_check_complete $1
> exit $?
>
> It seems that when a new btrfs device appears, the system attempts to mount it. If it succeeds, then it is assumed that all devices are present.

No, the system definitely does not attempt to mount it if there's a missing device.
Systemd never executes /bin/mount at all in that case. A prerequisite for the mount attempt is this line:

[1.621517] localhost.localdomain systemd[1]: dev-disk-by\x2duuid-9ff63135\x2dce42\x2d4447\x2da6de\x2dd7c9b4fb6d66.device changed dead -> plugged

That line only appears if all devices are present, so the mount attempt doesn't happen and the system just hangs. However, if I do rd.break=pre-mount and get to a dracut shell, this command works:

mount -t btrfs -o subvol=root,ro,degraded -U uuid

The volume UUID is definitely present even though not all devices are present. So actually in this case it's confusing why this uuid hasn't gone from dead to plugged. Until it's plugged, the mount command won't happen.

> To allow a degraded boot, it should be sufficient to replace
>
>     mount -o ro "$_dev" /tmp >/dev/null 2>&1
>
> with
>
>     OPTS=ro
>     grep -q degraded /proc/cmdline && OPTS="$OPTS,degraded"
>     mount -o "$OPTS" "$_dev" /tmp >/dev/null 2>&1

The problem isn't that the degraded mount option isn't being used by systemd. The problem is that systemd isn't changing the device from dead to plugged. And the problem there is that there are actually four possible states for an array, yet btrfs device ready apparently only distinguishes between 1 and not-1 (i.e. 2, 3, 4):

1. All devices ready.
2. Minimum number of data/metadata devices ready; allow degraded rw mount.
3. Minimum number of data devices not ready, but enough metadata devices are ready; allow degraded ro mount.
4. Minimum number of data/metadata devices not ready; degraded mount not possible.

So I think it's a question for the btrfs list to see what the long-term strategy is, given that rootflags=degraded alone does not work on systemd systems. Once I'm on 208-16 on Fedora 20, I get the same hang as on Rawhide. So actually I have to force power off, reboot with mount option rd.break=pre-mount, mount the volume manually, and exit twice. And that's fine for me, but it's non-obvious for most users.
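[Editorial note: those four states suggest the shape a policy helper might take. A minimal sketch, assuming hypothetical device/mountpoint arguments and no existing tooling, that simply escalates through the mount options, since only the kernel knows which state the array is actually in:]

```shell
#!/bin/sh
# Hypothetical policy sketch: try a normal mount, then degraded rw,
# then degraded ro, and report which of the four states we ended up in.
# The device and mountpoint arguments are purely illustrative.
mount_escalating() {
    dev=$1; mnt=$2
    if mount -t btrfs "$dev" "$mnt" 2>/dev/null; then
        echo "all devices ready"
    elif mount -t btrfs -o degraded "$dev" "$mnt" 2>/dev/null; then
        echo "degraded rw"
    elif mount -t btrfs -o degraded,ro "$dev" "$mnt" 2>/dev/null; then
        echo "degraded ro"
    else
        echo "not mountable"
        return 1
    fi
}
```

A real helper would also need a timeout before each escalation step, which is exactly the policy question discussed in this thread.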
The thing to put to the Btrfs list is how they expect this to work down the road. Right now, the way md does this, it doesn't do anything at all. It's actually dracut scripts that check for the existence of the rootfs volume UUID up to 240 times, with a 0.5 second sleep between each attempt. After 240 failed attempts, dracut runs mdadm -R, which forcibly runs the array with the available devices (i.e. degraded assembly); at that moment the volume UUID becomes available, the device goes from dead to plugged, and systemd mounts it. And boot continues normally.

So maybe Btrfs can leverage that same loop used for md degraded booting. But after the loop completes, then what? I don't see how systemd gets informed to use an additional mount option degraded conditionally. I think the equivalent for dracut's mdadm
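[Editorial note: the dracut-style wait loop described above might look roughly like this. This is a sketch of the idea, not dracut's actual code; the function name and the mdadm fallback in the comment are illustrative.]

```shell
#!/bin/sh
# Poll for an expected device node, as the dracut md logic described
# above does: up to 240 attempts with a 0.5s sleep between them.
wait_for_dev() {
    dev=$1; tries=${2:-240}
    while [ "$tries" -gt 0 ]; do
        [ -e "$dev" ] && return 0
        sleep 0.5
        tries=$((tries - 1))
    done
    return 1
}

# After the loop gives up, dracut's md path forces degraded assembly, e.g.:
#   wait_for_dev /dev/disk/by-uuid/$UUID || mdadm -R /dev/md0
```

The open question raised in the thread remains: after such a loop gives up for btrfs, something still has to add the degraded mount option, and nothing currently tells systemd to do that.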
Re: [systemd-devel] timed out waiting for device dev-disk-by\x2duuid
Here's an example:

$ cat /etc/fstab
UUID=8c618270-30ae-4a18-a921-1d99034c35a5 /     ext4 defaults 1 1
UUID=c40ada21-740e-49d9-bbd1-2c2a7c10b028 /boot ext4 defaults 1 2
UUID=85e74fda-7354-4384-8baf-4338e84b9ebe swap  swap defaults 0 0

# journalctl -b -x
...
:04:10 kernel: Kernel command line: BOOT_IMAGE=/vmlinuz-3.15.0-0.rc5.git2.8.fc21.x86_64 root=UUID=8c618270-30ae-4a18-a921-1d99034c35a5 ... initrd=/initramfs-3.15.0-0.rc5.git2.8.fc21.x86_64.img ...
:04:10 systemd[1]: Expecting device dev-disk-by\x2duuid-8c618270\x2d30ae\x2d4a18\x2da921\x2d1d99034c35a5.device...
...
:04:11 systemd[1]: Found device SAMSUNG_HD103SJ 3.
-- Subject: Unit dev-disk-by\x2duuid-8c618270\x2d30ae\x2d4a18\x2da921\x2d1d99034c35a5.device has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit dev-disk-by\x2duuid-8c618270\x2d30ae\x2d4a18\x2da921\x2d1d99034c35a5.device has finished starting up.
--
-- The start-up result is done.
:04:11 systemd[1]: Starting File System Check on /dev/disk/by-uuid/8c618270-30ae-4a18-a921-1d99034c35a5...
-- Subject: Unit systemd-fsck@dev-disk-by\x2duuid-8c618270\x2d30ae\x2d4a18\x2da921\x2d1d99034c35a5.service has begun with start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit systemd-fsck@dev-disk-by\x2duuid-8c618270\x2d30ae\x2d4a18\x2da921\x2d1d99034c35a5.service has begun starting up.
...
:04:12 systemd-fsck[259]: /dev/sda3: clean, 1133241/6627328 files, 12143849/26479136 blocks
:04:12 kernel: fsck (260) used greatest stack depth: 4592 bytes left
:04:12 systemd[1]: Started File System Check on /dev/disk/by-uuid/8c618270-30ae-4a18-a921-1d99034c35a5.
-- Subject: Unit systemd-fsck@dev-disk-by\x2duuid-8c618270\x2d30ae\x2d4a18\x2da921\x2d1d99034c35a5.service has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit systemd-fsck@dev-disk-by\x2duuid-8c618270\x2d30ae\x2d4a18\x2da921\x2d1d99034c35a5.service has finished starting up.
--
-- The start-up result is done.
:04:12 systemd[1]: Mounting /sysroot...
-- Subject: Unit sysroot.mount has begun with start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit sysroot.mount has begun starting up.
:04:12 kernel: EXT4-fs (sda3): mounted filesystem with ordered data mode. Opts: (null)
:04:12 systemd[1]: Mounted /sysroot.
-- Subject: Unit sysroot.mount has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit sysroot.mount has finished starting up.
--
-- The start-up result is done.
:04:12 kernel: mount (273) used greatest stack depth: 4096 bytes left
:04:12 systemd[1]: Starting Initrd Root File System.
-- Subject: Unit initrd-root-fs.target has begun with start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit initrd-root-fs.target has begun starting up.
:04:12 systemd[1]: Reached target Initrd Root File System.
-- Subject: Unit initrd-root-fs.target has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit initrd-root-fs.target has finished starting up.
--
-- The start-up result is done.
:04:12 systemd[1]: Starting Reload Configuration from the Real Root...
-- Subject: Unit initrd-parse-etc.service has begun with start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit initrd-parse-etc.service has begun starting up.
:04:12 systemd[1]: Reloading.
:04:12 kernel: systemd-fstab-g (281) used greatest stack depth: 4032 bytes left
:04:12 systemd[1]: Started Reload Configuration from the Real Root.
-- Subject: Unit initrd-parse-etc.service has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit initrd-parse-etc.service has finished starting up.
--
-- The start-up result is done.
:04:12 systemd[1]: Starting Initrd File Systems.
-- Subject: Unit initrd-fs.target has begun with start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit initrd-fs.target has begun starting up.
:04:12 systemd[1]: Reached target Initrd File Systems.
-- Subject: Unit initrd-fs.target has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit initrd-fs.target has finished starting up.
--
-- The start-up result is done.
...
:04:24 kernel: systemd-udevd (400) used greatest stack depth: 3200 bytes left
...
:05:45 systemd[1]: Job
Re: [systemd-devel] timed out waiting for device dev-disk-by\x2duuid
And this is the output when working properly:

[1.509163] localhost.localdomain systemd[1]: Expecting device dev-disk-by\x2duuid-8c618270\x2d30ae\x2d4a18\x2da921\x2d1d99034c35a5.device...
-- Subject: Unit dev-disk-by\x2duuid-8c618270\x2d30ae\x2d4a18\x2da921\x2d1d99034c35a5.device has finished start-up
-- Unit dev-disk-by\x2duuid-8c618270\x2d30ae\x2d4a18\x2da921\x2d1d99034c35a5.device has finished starting up.
[2.694743] localhost.localdomain systemd[1]: Starting File System Check on /dev/disk/by-uuid/8c618270-30ae-4a18-a921-1d99034c35a5...
-- Subject: Unit systemd-fsck@dev-disk-by\x2duuid-8c618270\x2d30ae\x2d4a18\x2da921\x2d1d99034c35a5.service has begun with start-up
-- Unit systemd-fsck@dev-disk-by\x2duuid-8c618270\x2d30ae\x2d4a18\x2da921\x2d1d99034c35a5.service has begun starting up.
[2.871497] localhost.localdomain systemd[1]: Started File System Check on /dev/disk/by-uuid/8c618270-30ae-4a18-a921-1d99034c35a5.
-- Subject: Unit systemd-fsck@dev-disk-by\x2duuid-8c618270\x2d30ae\x2d4a18\x2da921\x2d1d99034c35a5.service has finished start-up
-- Unit systemd-fsck@dev-disk-by\x2duuid-8c618270\x2d30ae\x2d4a18\x2da921\x2d1d99034c35a5.service has finished starting up.
[6.974099] localhost.localdomain systemd[1]: Expecting device dev-disk-by\x2duuid-c40ada21\x2d740e\x2d49d9\x2dbbd1\x2d2c2a7c10b028.device...
-- Subject: Unit dev-disk-by\x2duuid-c40ada21\x2d740e\x2d49d9\x2dbbd1\x2d2c2a7c10b028.device has begun with start-up
-- Unit dev-disk-by\x2duuid-c40ada21\x2d740e\x2d49d9\x2dbbd1\x2d2c2a7c10b028.device has begun starting up.
-- Subject: Unit dev-disk-by\x2duuid-c40ada21\x2d740e\x2d49d9\x2dbbd1\x2d2c2a7c10b028.device has finished start-up
-- Unit dev-disk-by\x2duuid-c40ada21\x2d740e\x2d49d9\x2dbbd1\x2d2c2a7c10b028.device has finished starting up.
[ 11.609924] localhost.localdomain systemd[1]: Starting File System Check on /dev/disk/by-uuid/c40ada21-740e-49d9-bbd1-2c2a7c10b028...
-- Subject: Unit systemd-fsck@dev-disk-by\x2duuid-c40ada21\x2d740e\x2d49d9\x2dbbd1\x2d2c2a7c10b028.service has begun with start-up
-- Unit systemd-fsck@dev-disk-by\x2duuid-c40ada21\x2d740e\x2d49d9\x2dbbd1\x2d2c2a7c10b028.service has begun starting up.
-- Subject: Unit dev-disk-by\x2duuid-85e74fda\x2d7354\x2d4384\x2d8baf\x2d4338e84b9ebe.device has finished start-up
-- Unit dev-disk-by\x2duuid-85e74fda\x2d7354\x2d4384\x2d8baf\x2d4338e84b9ebe.device has finished starting up.
[ 11.843937] localhost.localdomain systemd[1]: Activating swap /dev/disk/by-uuid/85e74fda-7354-4384-8baf-4338e84b9ebe...
-- Subject: Unit dev-disk-by\x2duuid-85e74fda\x2d7354\x2d4384\x2d8baf\x2d4338e84b9ebe.swap has begun with start-up
-- Unit dev-disk-by\x2duuid-85e74fda\x2d7354\x2d4384\x2d8baf\x2d4338e84b9ebe.swap has begun starting up.
[ 12.116326] localhost.localdomain systemd[1]: Activated swap /dev/disk/by-uuid/85e74fda-7354-4384-8baf-4338e84b9ebe.
-- Subject: Unit dev-disk-by\x2duuid-85e74fda\x2d7354\x2d4384\x2d8baf\x2d4338e84b9ebe.swap has finished start-up
-- Unit dev-disk-by\x2duuid-85e74fda\x2d7354\x2d4384\x2d8baf\x2d4338e84b9ebe.swap has finished starting up.
[ 12.545564] localhost.localdomain systemd[1]: Started File System Check on /dev/disk/by-uuid/c40ada21-740e-49d9-bbd1-2c2a7c10b028.
-- Subject: Unit systemd-fsck@dev-disk-by\x2duuid-c40ada21\x2d740e\x2d49d9\x2dbbd1\x2d2c2a7c10b028.service has finished start-up
-- Unit systemd-fsck@dev-disk-by\x2duuid-c40ada21\x2d740e\x2d49d9\x2dbbd1\x2d2c2a7c10b028.service has finished starting up.

poma
Re: [systemd-devel] timed out waiting for device dev-disk-by\x2duuid
On Mon, 12.05.14 20:48, Chris Murphy (li...@colorremedies.com) wrote:

> Two-device Btrfs volume, with one device missing (simulated), will not boot, even with rootflags=degraded set, which is currently required to enable Btrfs degraded mounts. Upon reaching a dracut shell after basic.target fails with a timeout, I can mount -o subvol=root,degraded, exit, and continue boot normally with just the single device.
>
> The problem seems to be that systemd (udev?) is not finding the volume by uuid for some reason, and therefore not attempting to mount it. But I don't know why it can't find it, or even how the find-by-uuid mechanism works this early in boot. So I'm not sure if this is a systemd or udev bug, or a dracut or kernel bug. The problem happens with systemd 208-9.fc20 with kernel 3.11.10-301.fc20, and systemd 212-4.fc21 with kernel 3.15.0-0.rc5.git0.1.fc21.

As soon as btrfs reports that a file system is ready, systemd will pick it up. This is handled with the btrfs udev built-in, and invoked via /usr/lib/udev/rules.d/64-btrfs.rules. rootflags has no influence on that, as at that point it is not clear whether the block device will be the one that carries the root file system, or any other file system.

Not sure what we should be doing about this. Maybe introduce a new btrfs=degraded switch that acts globally, and influences the udev built-in? Kay?

Lennart

--
Lennart Poettering, Red Hat
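[Editorial note: the rule file Lennart mentions is short. As of the systemd versions discussed in this thread it read roughly as follows; this is reproduced from memory, so treat it as a sketch rather than the authoritative file.]

```
# /usr/lib/udev/rules.d/64-btrfs.rules (approximate contents)
SUBSYSTEM!="block", GOTO="btrfs_end"
ACTION=="remove", GOTO="btrfs_end"
ENV{ID_FS_TYPE}!="btrfs", GOTO="btrfs_end"

# let the kernel know about this btrfs filesystem, and check if it is complete
IMPORT{builtin}="btrfs ready $devnode"

# mark the device as not ready while members are still missing
ENV{ID_BTRFS_READY}=="0", ENV{SYSTEMD_READY}="0"

LABEL="btrfs_end"
```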
Re: [systemd-devel] timed out waiting for device dev-disk-by\x2duuid
On Thu, 15.05.14 19:29, Lennart Poettering (lenn...@poettering.net) wrote:

> On Mon, 12.05.14 20:48, Chris Murphy (li...@colorremedies.com) wrote:
>> Two-device Btrfs volume, with one device missing (simulated), will not boot, even with rootflags=degraded set, which is currently required to enable Btrfs degraded mounts. Upon reaching a dracut shell after basic.target fails with a timeout, I can mount -o subvol=root,degraded, exit, and continue boot normally with just the single device.
>>
>> The problem seems to be that systemd (udev?) is not finding the volume by uuid for some reason, and therefore not attempting to mount it. But I don't know why it can't find it, or even how the find-by-uuid mechanism works this early in boot. So I'm not sure if this is a systemd or udev bug, or a dracut or kernel bug. The problem happens with systemd 208-9.fc20 with kernel 3.11.10-301.fc20, and systemd 212-4.fc21 with kernel 3.15.0-0.rc5.git0.1.fc21.
>
> As soon as btrfs reports that a file system is ready, systemd will pick it up. This is handled with the btrfs udev built-in, and invoked via /usr/lib/udev/rules.d/64-btrfs.rules. rootflags has no influence on that, as at that point it is not clear whether the block device will be the one that carries the root file system, or any other file system.
>
> Not sure what we should be doing about this. Maybe introduce a new btrfs=degraded switch that acts globally, and influences the udev built-in? Kay?

So, as it turns out, there's no kernel API available to check whether a btrfs raid array is now complete enough for a degraded mount to succeed. There's only a way to check whether it is fully complete. And even if we had an API for this, how would this even work at all? I mean, just having a catch-all switch to boot in degraded mode is really dangerous if people have more than one array, and we might end up mounting an fs in degraded mode that actually is fully available if we just waited 50ms longer...
I mean this is even the problem with just one array: if you have redundancy across 3 disks, when do you start mounting the thing when degraded mode is requested on the kernel command line? As soon as degraded mounting is possible (thus fucking up possibly all 3 disks that happened to show up last), or later?

I have no idea how this all should work really; it's a giant mess. There probably needs to be some btrfs userspace daemon thing that watches btrfs arrays and does some magic if they time out. But for now I am pretty sure we should just leave everything in fully manual mode, that's the safest thing to do...

Lennart

--
Lennart Poettering, Red Hat
Re: [systemd-devel] timed out waiting for device dev-disk-by\x2duuid
On 05/15/2014 08:16 PM, Lennart Poettering wrote:

> On Thu, 15.05.14 19:29, Lennart Poettering (lenn...@poettering.net) wrote:
>> On Mon, 12.05.14 20:48, Chris Murphy (li...@colorremedies.com) wrote:
>> [...]
>
> So, as it turns out, there's no kernel API available to check whether a btrfs raid array is now complete enough for a degraded mount to succeed. There's only a way to check whether it is fully complete. And even if we had an API for this, how would this even work at all?

In what way should this be different from the normal RAID system? In both cases there are two timeouts: the first one is for waiting for the full set of devices, the second one is for the minimal set of disks needed for degraded mode. If even the second timeout has passed, then we should consider the filesystem not buildable. How is it handled for the RAID system? Knowing that, we should consider applying the same strategies to btrfs (maybe we need some userspace tool to do that).

> I mean, just having a catch-all switch to boot in degraded mode is really dangerous if people have more than one array, and we might end up mounting an fs in degraded mode that actually is fully available if we just waited 50ms longer...
>
> I mean this is even the problem with just one array: if you have redundancy across 3 disks, when do you start mounting the thing when degraded mode is requested on the kernel command line? As soon as degraded mounting is possible (thus fucking up possibly all 3 disks that happened to show up last), or later?
>
> I have no idea how this all should work really; it's a giant mess. There probably needs to be some btrfs userspace daemon thing that watches btrfs arrays and does some magic if they time out. But for now I am pretty sure we should just leave everything in fully manual mode, that's the safest thing to do...
> Lennart

--
gpg @keyserver.linux.it: Goffredo Baroncelli (kreijackATinwind.it)
Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5
Re: [systemd-devel] timed out waiting for device dev-disk-by\x2duuid
On May 15, 2014, at 12:16 PM, Lennart Poettering lenn...@poettering.net wrote:

> On Thu, 15.05.14 19:29, Lennart Poettering (lenn...@poettering.net) wrote:
>> On Mon, 12.05.14 20:48, Chris Murphy (li...@colorremedies.com) wrote:
>>> Two-device Btrfs volume, with one device missing (simulated), will not boot, even with rootflags=degraded set, which is currently required to enable Btrfs degraded mounts. Upon reaching a dracut shell after basic.target fails with a timeout, I can mount -o subvol=root,degraded, exit, and continue boot normally with just the single device.
>>>
>>> The problem seems to be that systemd (udev?) is not finding the volume by uuid for some reason, and therefore not attempting to mount it. But I don't know why it can't find it, or even how the find-by-uuid mechanism works this early in boot. So I'm not sure if this is a systemd or udev bug, or a dracut or kernel bug. The problem happens with systemd 208-9.fc20 with kernel 3.11.10-301.fc20, and systemd 212-4.fc21 with kernel 3.15.0-0.rc5.git0.1.fc21.
>>
>> As soon as btrfs reports that a file system is ready, systemd will pick it up. This is handled with the btrfs udev built-in, and invoked via /usr/lib/udev/rules.d/64-btrfs.rules. rootflags has no influence on that, as at that point it is not clear whether the block device will be the one that carries the root file system, or any other file system.
>>
>> Not sure what we should be doing about this. Maybe introduce a new btrfs=degraded switch that acts globally, and influences the udev built-in? Kay?
>
> So, as it turns out, there's no kernel API available to check whether a btrfs raid array is now complete enough for a degraded mount to succeed. There's only a way to check whether it is fully complete. And even if we had an API for this, how would this even work at all?
> I mean, just having a catch-all switch to boot in degraded mode is really dangerous if people have more than one array, and we might end up mounting an fs in degraded mode that actually is fully available if we just waited 50ms longer...
>
> I mean this is even the problem with just one array: if you have redundancy across 3 disks, when do you start mounting the thing when degraded mode is requested on the kernel command line? As soon as degraded mounting is possible (thus fucking up possibly all 3 disks that happened to show up last), or later?
>
> I have no idea how this all should work really; it's a giant mess. There probably needs to be some btrfs userspace daemon thing that watches btrfs arrays and does some magic if they time out. But for now I am pretty sure we should just leave everything in fully manual mode, that's the safest thing to do…

Is it that the existing udev rule either doesn't know, or doesn't have a way of knowing, that rootflags=degraded should enable only the root=UUID device to bypass the ready rule? Does udev expect a different readiness state to attempt a mount than a manual mount from a dracut shell does? I'm confused why the Btrfs volume is not ready for systemd, which then doesn't even attempt to mount it, and yet at a dracut shell it is ready when I do the mount manually. That seems like two readiness states.

I'd say it's not udev's responsibility, but rather Btrfs kernel code's, to make sure things don't get worse with the file system, regardless of what devices it's presented with. At the time it tries to do the mount, it has its own logic for normal and degraded mounts, checking whether the minimum number of devices is present or not, and if not it fails. The degraded mount is also per volume, not global.

For example, if I remove a device, boot degraded, and work for a few hours making lots of changes (even doing a system update, which is probably insane to do), I can later reboot with the stale device attached and Btrfs figures it out, passively.
That means it figures out if there's a newer copy when a file is read, forwards the newest copy to user space, and fixes the stale copy on the previously missing device. A manual balance ensures all new files also have redundancy. I think it's intended eventually to have a smarter balance catch-up filter that can also run automatically in such a case. In any case the file system isn't trashed.

Chris Murphy
Re: [systemd-devel] timed out waiting for device dev-disk-by\x2duuid
On Thu, May 15, 2014 at 10:57 PM, Goffredo Baroncelli kreij...@libero.it wrote:

> On 05/15/2014 08:16 PM, Lennart Poettering wrote:
>> On Thu, 15.05.14 19:29, Lennart Poettering (lenn...@poettering.net) wrote:
>>> On Mon, 12.05.14 20:48, Chris Murphy (li...@colorremedies.com) wrote:
>>> [...]
>>
>> So, as it turns out, there's no kernel API available to check whether a btrfs raid array is now complete enough for a degraded mount to succeed. There's only a way to check whether it is fully complete. And even if we had an API for this, how would this even work at all?
>
> In what way should this be different from the normal RAID system? In both cases there are two timeouts: the first one is for waiting for the full set of devices, the second one is for the minimal set of disks needed for degraded mode. If even the second timeout has passed, then we should consider the filesystem not buildable. How is it handled for the RAID system? Knowing that, we should consider applying the same strategies to btrfs (maybe we need some userspace tool to do that).

RAID is not handled by systemd; it is handled by other tools, or not at all. Initrds have some logic here, but nothing convincing, and it is just the same mess as this.

Kay
Re: [systemd-devel] timed out waiting for device dev-disk-by\x2duuid
On Thu, May 15, 2014 at 10:57 PM, Chris Murphy li...@colorremedies.com wrote:

> On May 15, 2014, at 12:16 PM, Lennart Poettering lenn...@poettering.net wrote:
>> On Thu, 15.05.14 19:29, Lennart Poettering (lenn...@poettering.net) wrote:
>>> On Mon, 12.05.14 20:48, Chris Murphy (li...@colorremedies.com) wrote:
>>>> Two-device Btrfs volume, with one device missing (simulated), will not boot, even with rootflags=degraded set, which is currently required to enable Btrfs degraded mounts. Upon reaching a dracut shell after basic.target fails with a timeout, I can mount -o subvol=root,degraded, exit, and continue boot normally with just the single device.
>>>>
>>>> The problem seems to be that systemd (udev?) is not finding the volume by uuid for some reason, and therefore not attempting to mount it. But I don't know why it can't find it, or even how the find-by-uuid mechanism works this early in boot. So I'm not sure if this is a systemd or udev bug, or a dracut or kernel bug. The problem happens with systemd 208-9.fc20 with kernel 3.11.10-301.fc20, and systemd 212-4.fc21 with kernel 3.15.0-0.rc5.git0.1.fc21.
>>>
>>> As soon as btrfs reports that a file system is ready, systemd will pick it up. This is handled with the btrfs udev built-in, and invoked via /usr/lib/udev/rules.d/64-btrfs.rules. rootflags has no influence on that, as at that point it is not clear whether the block device will be the one that carries the root file system, or any other file system.
>>>
>>> Not sure what we should be doing about this. Maybe introduce a new btrfs=degraded switch that acts globally, and influences the udev built-in? Kay?
>>
>> So, as it turns out, there's no kernel API available to check whether a btrfs raid array is now complete enough for a degraded mount to succeed. There's only a way to check whether it is fully complete. And even if we had an API for this, how would this even work at all?
>> I mean, just having a catch-all switch to boot in degraded mode is really dangerous if people have more than one array, and we might end up mounting an fs in degraded mode that actually is fully available if we just waited 50ms longer...
>>
>> I mean this is even the problem with just one array: if you have redundancy across 3 disks, when do you start mounting the thing when degraded mode is requested on the kernel command line? As soon as degraded mounting is possible (thus fucking up possibly all 3 disks that happened to show up last), or later?
>>
>> I have no idea how this all should work really; it's a giant mess. There probably needs to be some btrfs userspace daemon thing that watches btrfs arrays and does some magic if they time out. But for now I am pretty sure we should just leave everything in fully manual mode, that's the safest thing to do…
>
> Is it that the existing udev rule either doesn't know, or doesn't have a way of knowing, that rootflags=degraded should enable only the root=UUID device to bypass the ready rule? Does udev expect a different readiness state to attempt a mount than a manual mount from a dracut shell does? I'm confused why the Btrfs volume is not ready for systemd, which then doesn't even attempt to mount it, and yet at a dracut shell it is ready when I do the mount manually. That seems like two readiness states.

Btrfs in the kernel has only one readiness state, and that is what udev reacts to.

> I'd say it's not udev's responsibility, but rather Btrfs kernel code's, to make sure things don't get worse with the file system, regardless of what devices it's presented with. At the time it tries to do the mount, it has its own logic for normal and degraded mounts, checking whether the minimum number of devices is present or not, and if not it fails. The degraded mount is also per volume, not global.
For example if I remove a device, and boot degraded and work for a few hours making lots of changes (even doing a system update, which is probably insane to do), I can later reboot with the stale device attached and Btrfs figures it out, passively. That means it figures out if there's a newer copy when a file is read, and forwards the newest copy to user space, and fixes the stale copy on the previously missing device. A manual balance ensures all new files also have redundancy. I think it's intended eventually to have a smarter balance catch up filter that can also run automatically in such a case. In any case the file system isn't trashed. The problem is when to actively force to degrade things when devices do not show up in time. That is nothing the kernel can know, it would need to be userspace making that decision. But udev does not really have that information at that level, it would need to try until the kernel is satisfied mounting a volume degraded. This all is probably not a job for udev or systemd, but for a specialized storage daemon which has explicit configuration/policy in which way to mess around with the user's data. This is not an area where we should try to be smart; falling back
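[Editorial note: for reference, the readiness mechanism discussed above — the btrfs udev built-in setting the ID_BTRFS_READY property, which in turn gates SYSTEMD_READY — looks roughly like the sketch below. This is paraphrased from 64-btrfs.rules of that era; the exact contents vary between systemd versions, so check the file shipped by your distribution.]

```
# /usr/lib/udev/rules.d/64-btrfs.rules (abridged sketch, not verbatim)

SUBSYSTEM!="block", GOTO="btrfs_end"
ACTION=="remove", GOTO="btrfs_end"
ENV{ID_FS_TYPE}!="btrfs", GOTO="btrfs_end"

# Register the device with the btrfs kernel code and import the
# ID_BTRFS_READY property, which says whether all members are present.
IMPORT{builtin}="btrfs ready $devnode"

# If the array is incomplete, tell systemd not to consider the device
# usable yet; the /dev/disk/by-uuid device unit stays unavailable.
ENV{ID_BTRFS_READY}=="0", ENV{SYSTEMD_READY}="0"

LABEL="btrfs_end"
```

This is the hook Lennart refers to: an external policy manager could decide to proceed degraded, but nothing in udev itself implements a timeout or quorum policy.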
Re: [systemd-devel] timed out waiting for device dev-disk-by\x2duuid
On May 15, 2014, at 2:57 PM, Goffredo Baroncelli <kreij...@libero.it> wrote:
> On 05/15/2014 08:16 PM, Lennart Poettering wrote:
>> On Thu, 15.05.14 19:29, Lennart Poettering (lenn...@poettering.net) wrote:
>>> On Mon, 12.05.14 20:48, Chris Murphy (li...@colorremedies.com) wrote:
>>> [...]
>> So, as it turns out, there's no kernel API available to check whether a
>> btrfs raid array is now complete enough for degraded mode to succeed.
>> There's only a way to check whether it is fully complete. And even if
>> we had an API for this, how would this even work at all?
>
> How should this be any different from a normal RAID system?

I think it's because with md, array assembly is separate from the fs
mount. I don't know what timeout it uses, but it does do automatic
degraded assembly eventually. Once assembled (degraded or normal), the md
device is ready, and that's when udev rules start to apply and systemd
will try to mount the root fs. On Btrfs, however, degraded assembly and
fs mount are combined into one step. So without degraded assembly first,
it sounds like udev+systemd don't even try to mount the fs. At least
that's my rudimentary understanding.

The udev rule right now is asking if all Btrfs member devices are
present, and it sounds like the answer is no with a missing device; so a
mount isn't even attempted by systemd, rather than attempting a degraded
mount specifically for the root=UUID device(s).

What is parsing the boot parameters ro, root=, and rootflags=? Are those
recognized by the kernel or by systemd?

> In both cases there are two timeouts: the first one is for waiting for
> the full set of disks, the second one is for the minimal set of disks
> needed for a degraded mode. If even the second timeout passes, then we
> should consider the filesystem not assemblable. How is it handled for
> the RAID system? Knowing that, we should consider applying the same
> strategies for btrfs (maybe we need some userspace tool to do that).

Well, that sounds like a userspace tool that needs to be in the initramfs
so it can do this logic before systemd even attempts to mount the rootfs.
Or it's done by kernel code, if it's possible for it to parse root=UUID
and only care about the member devices for that volume.

Chris Murphy
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel
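[Editorial note: the two-timeout policy Goffredo describes can be sketched as a pure decision function. This is a hypothetical illustration of the policy logic only — no such tool exists in systemd or btrfs-progs; device counting, timers, and the actual mount call are deliberately left out.]

```python
from enum import Enum


class Action(Enum):
    MOUNT = "mount"                     # all members present: mount normally
    WAIT = "wait"                       # keep waiting for more members
    MOUNT_DEGRADED = "mount-degraded"   # first timeout passed, quorum present
    FAIL = "fail"                       # second timeout passed without quorum


def decide(present: int, total: int, min_degraded: int,
           elapsed: float, full_timeout: float,
           degraded_timeout: float) -> Action:
    """Two-timeout policy: wait up to `full_timeout` seconds for all
    member devices, then up to `degraded_timeout` seconds for the minimal
    set that allows a degraded mount; after that, give up."""
    if present >= total:
        return Action.MOUNT
    if elapsed < full_timeout:
        return Action.WAIT
    if present >= min_degraded:
        return Action.MOUNT_DEGRADED
    if elapsed < degraded_timeout:
        return Action.WAIT
    return Action.FAIL
```

A daemon built around this would re-evaluate the decision on every uevent for a member device and on a timer tick, which is exactly the kind of policy Lennart argues belongs outside udev.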
Re: [systemd-devel] timed out waiting for device dev-disk-by\x2duuid
On Thu, 15.05.14 15:54, Chris Murphy (li...@colorremedies.com) wrote:

> The udev rule right now is asking if all Btrfs member devices are
> present, and it sounds like the answer is no with a missing device; so a
> mount isn't even attempted by systemd, rather than attempting a degraded
> mount specifically for the root=UUID device(s).

Yes, correct. And my suspicion is that if any logic more complex than
this shall take place, then it should probably be managed by some kind of
storage daemon still to be written, not by udev. udev doesn't do
timeouts and has no clue about raid arrays. If things get this complex,
there really needs to be some other component in place that can handle
this...

> What is parsing the boot parameters ro, root=, and rootflags=? Are
> those recognized by the kernel or by systemd?

When an initrd is used, it is the initrd; otherwise, the kernel itself.

Lennart

--
Lennart Poettering, Red Hat
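[Editorial note: Lennart's answer — the initrd parses these parameters when one is used — can be illustrated with a minimal POSIX-shell sketch. The cmdline string below is a hypothetical example; a real initrd reads /proc/cmdline and handles quoting and many more parameters.]

```shell
#!/bin/sh
# Hypothetical kernel command line; an initrd would read /proc/cmdline.
cmdline="ro root=UUID=abcd-ef01 rootflags=subvol=root,degraded"

# Print the value of key=value for the given key, if present.
get_param() {
    for word in $cmdline; do
        case "$word" in
            "$1"=*) echo "${word#*=}" ;;
        esac
    done
}

root=$(get_param root)
rootflags=$(get_param rootflags)

echo "root=$root"
echo "rootflags=$rootflags"
```

Note that `${word#*=}` strips only up to the first `=`, so values that themselves contain `=` (like `rootflags=subvol=root,degraded`) survive intact, and the flag-style `ro` parameter is ignored because it never matches a `key=` pattern.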