Re: [systemd-devel] timed out waiting for device dev-disk-by\x2duuid

2020-01-13 Thread Lennart Poettering
On Sat, 04.01.20 18:58, Georg Großmann (ge...@grossmann-technologies.de) wrote:

> Has this issue been fixed on either the systemd or the BTRFS side in the
> meantime? I am currently testing a BTRFS raid1 with two disks in my
> VirtualBox. I have installed a bootloader on both disks. After removing
> one of the disks I always get stuck at "timed out waiting for device
> dev-disk-by\x2duuid". Or is it still a 100% manual task to get the raid1
> up again?

systemd does not implement a policy manager that decides when it's
time to stop waiting for additional members of a raid device. This
is something the btrfs folks should put together. How long to wait for
how many of how many devices, possibly asking for user input: that is
complex and specific enough for the btrfs folks to take care of, and
it is outside of systemd's realm.

systemd with its internal code will deal with the obvious case
(i.e. all members appeared), and provides hooks for policy managers to
hook into (the ID_BTRFS_READY udev property), but it doesn't provide
those policy managers, and that's unlikely to change.
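
For illustration only, here is roughly what a policy manager built on
that hook could do once it has decided to go degraded. The udev
properties, the udevadm calls and 'btrfs device ready' are real; the
rule file name and the surrounding decision logic are made up:

DEV=/dev/sda3    # some member device of the array

if ! btrfs device ready "$DEV"; then   # non-zero exit: members still missing
    # Override 64-btrfs.rules, which sets SYSTEMD_READY=0 while
    # ID_BTRFS_READY=0; a higher-numbered rule wins.
    echo 'SUBSYSTEM=="block", ENV{ID_FS_TYPE}=="btrfs", ENV{SYSTEMD_READY}="1"' \
        > /run/udev/rules.d/99-btrfs-degraded-override.rules
    udevadm control --reload
    udevadm trigger --action=change "$DEV"   # replay the event; the .device shows up
fi

(The actual mount would still need the degraded mount option, e.g. via
rootflags=degraded.)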

Sorry,

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] timed out waiting for device dev-disk-by\x2duuid

2020-01-11 Thread Andrei Borzenkov
On 04.01.2020 20:58, Georg Großmann wrote:
> Has this issue been fixed on either the systemd or the BTRFS side in the
> meantime?

No. Each project says that from its side there is nothing to fix.

> I am currently testing a BTRFS raid1 with two disks in my
> VirtualBox. I have installed a bootloader on both disks. After removing
> one of the disks I always get stuck at "timed out waiting for device
> dev-disk-by\x2duuid". Or is it still a 100% manual task to get the raid1
> up again?
> 


Re: [systemd-devel] timed out waiting for device dev-disk-by\x2duuid

2020-01-04 Thread Georg Großmann
Has this issue been fixed on either the systemd or the BTRFS side in the
meantime? I am currently testing a BTRFS raid1 with two disks in my
VirtualBox. I have installed a bootloader on both disks. After removing
one of the disks I always get stuck at "timed out waiting for device
dev-disk-by\x2duuid". Or is it still a 100% manual task to get the raid1
up again?

Georg




Re: [systemd-devel] timed out waiting for device dev-disk-by\x2duuid

2014-05-18 Thread Chris Murphy

On May 17, 2014, at 5:30 PM, Chris Murphy li...@colorremedies.com wrote:
 
 No, the system definitely does not attempt to mount it if there's a missing 
 device. Systemd never executes /bin/mount at all in that case. A prerequisite 
 for the mount attempt is this line:
 
 [1.621517] localhost.localdomain systemd[1]: 
 dev-disk-by\x2duuid-9ff63135\x2dce42\x2d4447\x2da6de\x2dd7c9b4fb6d66.device 
 changed dead -> plugged
 
 That line only appears if all devices are present. And the mount attempt
 doesn't happen. The system just hangs.
 
 However, if I do an rd.break=pre-mount, and get to a dracut shell, this
 command works:
 
 mount -t btrfs -o subvol=root,ro,degraded -U uuid
 
 The volume UUID is definitely present even though not all devices are 
 present. So actually in this case it's confusing why this uuid hasn't gone 
 from dead to plugged. Until it's plugged, the mount command won't happen.

2 device Btrfs raid1, sda3 and sdb3. When both are available I get these lines:

[2.168697] localhost.localdomain systemd-udevd[109]: creating link 
'/dev/disk/by-uuid/9ff63135-ce42-4447-a6de-d7c9b4fb6d66' to '/dev/sda3'
[2.170232] localhost.localdomain systemd-udevd[135]: creating link 
'/dev/disk/by-uuid/9ff63135-ce42-4447-a6de-d7c9b4fb6d66' to '/dev/sdb3'

That precipitates systemd changing the by-uuid .device unit from dead to plugged. If 
I remove one device, then udev does not create the by-uuid link for the 
remaining device. Therefore the expected uuid never appears to systemd, 
and thus it doesn't attempt to mount it, and the system hangs indefinitely.
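
A quick way to see the state udev is acting on, using the member device
from above (the commands are real; the output is what I'd expect in the
degraded case, not a verbatim capture):

$ btrfs device ready /dev/sda3; echo $?    # non-zero while members are missing
1
$ udevadm info --query=env --name=/dev/sda3 | grep READY
ID_BTRFS_READY=0
SYSTEMD_READY=0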




Chris Murphy



Re: [systemd-devel] timed out waiting for device dev-disk-by\x2duuid

2014-05-17 Thread Chris Murphy

On May 16, 2014, at 11:31 AM, Goffredo Baroncelli kreij...@inwind.it wrote:

 On 05/15/2014 11:54 PM, Chris Murphy wrote:
 
 On May 15, 2014, at 2:57 PM, Goffredo Baroncelli kreij...@libero.it
 wrote:
 [...]
 
 The udev rule right now is asking if all Btrfs member devices are
 present, and it sounds like the answer is no with a missing device;
 so a mount isn't even attempted by systemd, rather than systemd attempting a
 degraded mount specifically for the root=UUID device(s).
 
 Who is in charge of mounting the filesystem?

Ultimately systemd mounts the defined root file system to /sysroot. It knows 
what volume to mount based on the boot parameter root=UUID=, but it doesn't even 
try to mount it until the device carrying that volume UUID has appeared. Until 
then, the mount command isn't even issued.
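
For reference, what systemd is waiting on amounts to something like this
generated unit (paraphrased from what systemd-fstab-generator produces for
root=UUID=..., not a verbatim dump):

# sysroot.mount (roughly)
[Unit]
Before=initrd-root-fs.target

[Mount]
What=/dev/disk/by-uuid/9ff63135-ce42-4447-a6de-d7c9b4fb6d66
Where=/sysroot
Options=ro,subvol=root

systemd adds an implicit requirement on the backing .device unit for What=,
and it is that device unit which sits in state dead until udev announces
the device; that is exactly the wait that times out here.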


 
 What I found is that dracut waits until all the btrfs devices are present:
 
 cat /usr/lib/dracut/modules.d/90btrfs/80-btrfs.rules
 SUBSYSTEM!="block", GOTO="btrfs_end"
 ACTION!="add|change", GOTO="btrfs_end"
 ENV{ID_FS_TYPE}!="btrfs", GOTO="btrfs_end"
 RUN+="/sbin/btrfs device scan $env{DEVNAME}"
 
 RUN+="/sbin/initqueue --finished --unique --name btrfs_finished /sbin/btrfs_finished"
 
 LABEL="btrfs_end"
 
 
 and 
 
 
 cat btrfs_finished.sh
 #!/bin/sh
 # -*- mode: shell-script; indent-tabs-mode: nil; sh-basic-offset: 4; -*-
 # ex: ts=8 sw=4 sts=4 et filetype=sh
 
 type getarg >/dev/null 2>&1 || . /lib/dracut-lib.sh
 
 btrfs_check_complete() {
     local _rootinfo _dev
     _dev="${1:-/dev/root}"
     [ -e "$_dev" ] || return 0
     _rootinfo=$(udevadm info --query=env --name="$_dev" 2>/dev/null)
     if strstr "$_rootinfo" "ID_FS_TYPE=btrfs"; then
         info "Checking, if btrfs device complete"
         unset __btrfs_mount
         # a plain (non-degraded) mount only succeeds if all members are present
         mount -o ro "$_dev" /tmp >/dev/null 2>&1
         __btrfs_mount=$?
         [ $__btrfs_mount -eq 0 ] && umount "$_dev" >/dev/null 2>&1
         return $__btrfs_mount
     fi
     return 0
 }
 
 btrfs_check_complete $1
 exit $?
 

 
 It seems that when a new btrfs device appears, the system attempts to mount 
 it. If it succeeds, then it is assumed that all devices are present.

No, the system definitely does not attempt to mount it if there's a missing 
device. Systemd never executes /bin/mount at all in that case. A prerequisite 
for the mount attempt is this line:

[1.621517] localhost.localdomain systemd[1]: 
dev-disk-by\x2duuid-9ff63135\x2dce42\x2d4447\x2da6de\x2dd7c9b4fb6d66.device 
changed dead -> plugged

That line only appears if all devices are present. And the mount attempt doesn't 
happen. The system just hangs.

However, if I do an rd.break=pre-mount, and get to a dracut shell, this command 
works:

mount -t btrfs -o subvol=root,ro,degraded -U uuid

The volume UUID is definitely present even though not all devices are present. 
So actually in this case it's confusing why this uuid hasn't gone from dead to 
plugged. Until it's plugged, the mount command won't happen.



 To allow a degraded boot, it should be sufficient to replace
 
 
   mount -o ro "$_dev" /tmp >/dev/null 2>&1
 
 with
 
   OPTS=ro
   grep -q degraded /proc/cmdline && OPTS="$OPTS,degraded"
   mount -o "$OPTS" "$_dev" /tmp >/dev/null 2>&1

The problem isn't that the degraded mount option isn't being used by systemd. 
The problem is that systemd isn't changing the device from dead to plugged.

And the problem there is that there are actually four possible states for an 
array, yet 'btrfs device ready' apparently only distinguishes between state 1 and 
everything else (i.e. states 2, 3, 4):

1. All devices ready.
2. Minimum number of data/metadata devices ready, allow degraded rw mount.
3. Minimum number of data devices not ready, but enough metadata devices are 
ready, allow degraded ro mount.
4. Minimum number of data/metadata devices not ready, degraded mount not 
possible.

So I think it's a question for the btrfs list to see what the long term 
strategy is, given that rootflags=degraded alone does not work on 
systemd systems. Now that I'm on 208-16 on Fedora 20, I get the same hang as on 
Rawhide. So I actually have to force power off, reboot with the kernel parameter 
rd.break=pre-mount, mount the volume manually, and exit twice. And that's fine 
for me, but it's non-obvious for most users.

The thing to put to the Btrfs list is how they expect this to work down 
the road.

Right now, the way md does this, it doesn't do anything at all. It's actually 
dracut scripts that check for the existence of the rootfs volume UUID up to 240 
times, with a 0.5 second sleep between attempts. After 240 failed attempts, 
dracut runs mdadm -R, which forcibly runs the array with the available devices 
(i.e. degraded assembly). At that moment the volume UUID becomes available, the 
device goes from dead to plugged, and systemd mounts it. And boot continues 
normally.
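
Roughly, that dracut logic looks like this (the real script lives in
dracut's 90mdraid module; the loop below is paraphrased from memory, only
the counts are as described above, and ROOT_UUID and the md node are
placeholders):

i=0
while [ $i -lt 240 ]; do
    # root=UUID=... is satisfied as soon as this symlink exists
    [ -e "/dev/disk/by-uuid/$ROOT_UUID" ] && exit 0
    sleep 0.5
    i=$((i + 1))
done
# the volume never showed up complete: force-run the array degraded
mdadm -R /dev/md0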

So maybe Btrfs can leverage that same loop used for md degraded booting. But 
after the loop completes, then what? I don't see how systemd gets informed to 
use the additional mount option 'degraded' conditionally. I think the equivalent 
for dracut's mdadm 

Re: [systemd-devel] timed out waiting for device dev-disk-by\x2duuid

2014-05-16 Thread poma

Here's an example:

$ cat /etc/fstab
UUID=8c618270-30ae-4a18-a921-1d99034c35a5 /     ext4  defaults  1 1
UUID=c40ada21-740e-49d9-bbd1-2c2a7c10b028 /boot ext4  defaults  1 2
UUID=85e74fda-7354-4384-8baf-4338e84b9ebe swap  swap  defaults  0 0

# journalctl -b -x
...
 :04:10 kernel: Kernel command line: 
BOOT_IMAGE=/vmlinuz-3.15.0-0.rc5.git2.8.fc21.x86_64 
root=UUID=8c618270-30ae-4a18-a921-1d99034c35a5 ... 
initrd=/initramfs-3.15.0-0.rc5.git2.8.fc21.x86_64.img
...
 :04:10 systemd[1]: Expecting device 
dev-disk-by\x2duuid-8c618270\x2d30ae\x2d4a18\x2da921\x2d1d99034c35a5.device...
...
 :04:11 systemd[1]: Found device SAMSUNG_HD103SJ 3.
 -- Subject: Unit 
dev-disk-by\x2duuid-8c618270\x2d30ae\x2d4a18\x2da921\x2d1d99034c35a5.device has 
finished start-up
 -- Defined-By: systemd
 -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
 -- 
 -- Unit 
dev-disk-by\x2duuid-8c618270\x2d30ae\x2d4a18\x2da921\x2d1d99034c35a5.device has 
finished starting up.
 -- 
 -- The start-up result is done.
 :04:11 systemd[1]: Starting File System Check on 
/dev/disk/by-uuid/8c618270-30ae-4a18-a921-1d99034c35a5...
 -- Subject: Unit 
systemd-fsck@dev-disk-by\x2duuid-8c618270\x2d30ae\x2d4a18\x2da921\x2d1d99034c35a5.service
 has begun with start-up
 -- Defined-By: systemd
 -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
 -- 
 -- Unit 
systemd-fsck@dev-disk-by\x2duuid-8c618270\x2d30ae\x2d4a18\x2da921\x2d1d99034c35a5.service
 has begun starting up.
...
 :04:12 systemd-fsck[259]: /dev/sda3: clean, 1133241/6627328 files, 
12143849/26479136 blocks
 :04:12 kernel: fsck (260) used greatest stack depth: 4592 bytes left
 :04:12 systemd[1]: Started File System Check on 
/dev/disk/by-uuid/8c618270-30ae-4a18-a921-1d99034c35a5.
 -- Subject: Unit 
systemd-fsck@dev-disk-by\x2duuid-8c618270\x2d30ae\x2d4a18\x2da921\x2d1d99034c35a5.service
 has finished start-up
 -- Defined-By: systemd
 -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
 -- 
 -- Unit 
systemd-fsck@dev-disk-by\x2duuid-8c618270\x2d30ae\x2d4a18\x2da921\x2d1d99034c35a5.service
 has finished starting up.
 -- 
 -- The start-up result is done.
 :04:12 systemd[1]: Mounting /sysroot...
 -- Subject: Unit sysroot.mount has begun with start-up
 -- Defined-By: systemd
 -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
 -- 
 -- Unit sysroot.mount has begun starting up.
 :04:12 kernel: EXT4-fs (sda3): mounted filesystem with ordered data mode. 
Opts: (null)
 :04:12 systemd[1]: Mounted /sysroot.
 -- Subject: Unit sysroot.mount has finished start-up
 -- Defined-By: systemd
 -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
 -- 
 -- Unit sysroot.mount has finished starting up.
 -- 
 -- The start-up result is done.

 :04:12 kernel: mount (273) used greatest stack depth: 4096 bytes left
 :04:12 systemd[1]: Starting Initrd Root File System.
 -- Subject: Unit initrd-root-fs.target has begun with start-up
 -- Defined-By: systemd
 -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
 -- 
 -- Unit initrd-root-fs.target has begun starting up.
 :04:12 systemd[1]: Reached target Initrd Root File System.
 -- Subject: Unit initrd-root-fs.target has finished start-up
 -- Defined-By: systemd
 -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
 -- 
 -- Unit initrd-root-fs.target has finished starting up.
 -- 
 -- The start-up result is done.
 :04:12 systemd[1]: Starting Reload Configuration from the Real Root...
 -- Subject: Unit initrd-parse-etc.service has begun with start-up
 -- Defined-By: systemd
 -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
 -- 
 -- Unit initrd-parse-etc.service has begun starting up.
 :04:12 systemd[1]: Reloading.
 :04:12 kernel: systemd-fstab-g (281) used greatest stack depth: 4032 bytes left
 :04:12 systemd[1]: Started Reload Configuration from the Real Root.
 -- Subject: Unit initrd-parse-etc.service has finished start-up
 -- Defined-By: systemd
 -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
 -- 
 -- Unit initrd-parse-etc.service has finished starting up.
 -- 
 -- The start-up result is done.
 :04:12 systemd[1]: Starting Initrd File Systems.
 -- Subject: Unit initrd-fs.target has begun with start-up
 -- Defined-By: systemd
 -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
 -- 
 -- Unit initrd-fs.target has begun starting up.
 :04:12 systemd[1]: Reached target Initrd File Systems.
 -- Subject: Unit initrd-fs.target has finished start-up
 -- Defined-By: systemd
 -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
 -- 
 -- Unit initrd-fs.target has finished starting up.
 -- 
 -- The start-up result is done.
...
 :04:24 kernel: systemd-udevd (400) used greatest stack depth: 3200 bytes left
...
   :05:45 systemd[1]: Job 

Re: [systemd-devel] timed out waiting for device dev-disk-by\x2duuid

2014-05-16 Thread poma

And this is the output when it is working properly:

 [1.509163] localhost.localdomain systemd[1]: Expecting device 
dev-disk-by\x2duuid-8c618270\x2d30ae\x2d4a18\x2da921\x2d1d99034c35a5.device...
 -- Subject: Unit 
dev-disk-by\x2duuid-8c618270\x2d30ae\x2d4a18\x2da921\x2d1d99034c35a5.device has 
finished start-up
 -- Unit 
dev-disk-by\x2duuid-8c618270\x2d30ae\x2d4a18\x2da921\x2d1d99034c35a5.device has 
finished starting up.
 [2.694743] localhost.localdomain systemd[1]: Starting File System Check on 
/dev/disk/by-uuid/8c618270-30ae-4a18-a921-1d99034c35a5...
 -- Subject: Unit 
systemd-fsck@dev-disk-by\x2duuid-8c618270\x2d30ae\x2d4a18\x2da921\x2d1d99034c35a5.service
 has begun with start-up
 -- Unit 
systemd-fsck@dev-disk-by\x2duuid-8c618270\x2d30ae\x2d4a18\x2da921\x2d1d99034c35a5.service
 has begun starting up.
 [2.871497] localhost.localdomain systemd[1]: Started File System Check on 
/dev/disk/by-uuid/8c618270-30ae-4a18-a921-1d99034c35a5.
 -- Subject: Unit 
systemd-fsck@dev-disk-by\x2duuid-8c618270\x2d30ae\x2d4a18\x2da921\x2d1d99034c35a5.service
 has finished start-up
 -- Unit 
systemd-fsck@dev-disk-by\x2duuid-8c618270\x2d30ae\x2d4a18\x2da921\x2d1d99034c35a5.service
 has finished starting up.
 [6.974099] localhost.localdomain systemd[1]: Expecting device 
dev-disk-by\x2duuid-c40ada21\x2d740e\x2d49d9\x2dbbd1\x2d2c2a7c10b028.device...
 -- Subject: Unit 
dev-disk-by\x2duuid-c40ada21\x2d740e\x2d49d9\x2dbbd1\x2d2c2a7c10b028.device has 
begun with start-up
 -- Unit 
dev-disk-by\x2duuid-c40ada21\x2d740e\x2d49d9\x2dbbd1\x2d2c2a7c10b028.device has 
begun starting up.
 -- Subject: Unit 
dev-disk-by\x2duuid-c40ada21\x2d740e\x2d49d9\x2dbbd1\x2d2c2a7c10b028.device has 
finished start-up
 -- Unit 
dev-disk-by\x2duuid-c40ada21\x2d740e\x2d49d9\x2dbbd1\x2d2c2a7c10b028.device has 
finished starting up.
 [   11.609924] localhost.localdomain systemd[1]: Starting File System Check on 
/dev/disk/by-uuid/c40ada21-740e-49d9-bbd1-2c2a7c10b028...
 -- Subject: Unit 
systemd-fsck@dev-disk-by\x2duuid-c40ada21\x2d740e\x2d49d9\x2dbbd1\x2d2c2a7c10b028.service
 has begun with start-up
 -- Unit 
systemd-fsck@dev-disk-by\x2duuid-c40ada21\x2d740e\x2d49d9\x2dbbd1\x2d2c2a7c10b028.service
 has begun starting up.
 -- Subject: Unit 
dev-disk-by\x2duuid-85e74fda\x2d7354\x2d4384\x2d8baf\x2d4338e84b9ebe.device has 
finished start-up
 -- Unit 
dev-disk-by\x2duuid-85e74fda\x2d7354\x2d4384\x2d8baf\x2d4338e84b9ebe.device has 
finished starting up.
 [   11.843937] localhost.localdomain systemd[1]: Activating swap 
/dev/disk/by-uuid/85e74fda-7354-4384-8baf-4338e84b9ebe...
 -- Subject: Unit 
dev-disk-by\x2duuid-85e74fda\x2d7354\x2d4384\x2d8baf\x2d4338e84b9ebe.swap has 
begun with start-up
 -- Unit 
dev-disk-by\x2duuid-85e74fda\x2d7354\x2d4384\x2d8baf\x2d4338e84b9ebe.swap has 
begun starting up.
 [   12.116326] localhost.localdomain systemd[1]: Activated swap 
/dev/disk/by-uuid/85e74fda-7354-4384-8baf-4338e84b9ebe.
 -- Subject: Unit 
dev-disk-by\x2duuid-85e74fda\x2d7354\x2d4384\x2d8baf\x2d4338e84b9ebe.swap has 
finished start-up
 -- Unit 
dev-disk-by\x2duuid-85e74fda\x2d7354\x2d4384\x2d8baf\x2d4338e84b9ebe.swap has 
finished starting up.
 [   12.545564] localhost.localdomain systemd[1]: Started File System Check on 
/dev/disk/by-uuid/c40ada21-740e-49d9-bbd1-2c2a7c10b028.
 -- Subject: Unit 
systemd-fsck@dev-disk-by\x2duuid-c40ada21\x2d740e\x2d49d9\x2dbbd1\x2d2c2a7c10b028.service
 has finished start-up
 -- Unit 
systemd-fsck@dev-disk-by\x2duuid-c40ada21\x2d740e\x2d49d9\x2dbbd1\x2d2c2a7c10b028.service
 has finished starting up.


poma




Re: [systemd-devel] timed out waiting for device dev-disk-by\x2duuid

2014-05-15 Thread Lennart Poettering
On Mon, 12.05.14 20:48, Chris Murphy (li...@colorremedies.com) wrote:

 Two device Btrfs volume, with one device missing (simulated) will not
 boot, even with rootflags=degraded set which is currently required to
 enable Btrfs degraded mounts. Upon reaching a dracut shell after
 basic.target fails with time out, I can mount -o subvol=root,degraded
 and exit and continue boot normally with just the single device.
 
 The problem seems to be that systemd (udev?) is not finding the volume
 by uuid for some reason, and therefore not attempting to mount it. But
 I don't know why it can't find it, or even how the find by uuid
 mechanism works this early in boot. So I'm not sure if this is a
 systemd or udev bug, or a dracut, or kernel bug.
 
 The problem happens with systemd 208-9.fc20 with kernel
 3.11.10-301.fc20, and systemd 212-4.fc21 and kernel
 3.15.0-0.rc5.git0.1.fc21.

As soon as btrfs reports that a file system is ready, systemd will pick
it up. This is handled with the btrfs udev built-in, and invoked via
/usr/lib/udev/rules.d/64-btrfs.rules. rootflags has no influence on
that, as at that point it is not clear whether the block device will be
the one that carries the root file system, or any other file system.
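
For reference, the rule in question is tiny. From memory it looks roughly
like this (exact contents vary between systemd versions):

# /usr/lib/udev/rules.d/64-btrfs.rules
SUBSYSTEM!="block", GOTO="btrfs_end"
ACTION=="remove", GOTO="btrfs_end"
ENV{ID_FS_TYPE}!="btrfs", GOTO="btrfs_end"

# let the kernel know about this btrfs filesystem, and check if it is complete
IMPORT{builtin}="btrfs ready $devnode"

# mark the device as not ready to be used by the system while incomplete
ENV{ID_BTRFS_READY}=="0", ENV{SYSTEMD_READY}="0"

LABEL="btrfs_end"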

Not sure what we should be doing about this. Maybe introduce a new
btrfs=degraded switch that acts globally, and influences the udev built-in?

Kay?

Lennart

-- 
Lennart Poettering, Red Hat


Re: [systemd-devel] timed out waiting for device dev-disk-by\x2duuid

2014-05-15 Thread Lennart Poettering
On Thu, 15.05.14 19:29, Lennart Poettering (lenn...@poettering.net) wrote:

 On Mon, 12.05.14 20:48, Chris Murphy (li...@colorremedies.com) wrote:
 
  Two device Btrfs volume, with one device missing (simulated) will not
  boot, even with rootflags=degraded set which is currently required to
  enable Btrfs degraded mounts. Upon reaching a dracut shell after
  basic.target fails with time out, I can mount -o subvol=root,degraded
  and exit and continue boot normally with just the single device.
  
  The problem seems to be that systemd (udev?) is not finding the volume
  by uuid for some reason, and therefore not attempting to mount it. But
  I don't know why it can't find it, or even how the find by uuid
  mechanism works this early in boot. So I'm not sure if this is a
  systemd or udev bug, or a dracut, or kernel bug.
  
  The problem happens with systemd 208-9.fc20 with kernel
  3.11.10-301.fc20, and systemd 212-4.fc21 and kernel
  3.15.0-0.rc5.git0.1.fc21.
 
 As soon as btrfs reports that a file system is ready, systemd will pick
 it up. This is handled with the btrfs udev built-in, and invoked via
 /usr/lib/udev/rules.d/64-btrfs.rules. rootflags has no influence on
 that, as at that point it is not clear whether the block device will be
  the one that carries the root file system, or any other file system.
 
 Not sure what we should be doing about this. Maybe introduce a new
 btrfs=degraded switch that acts globally, and influences the udev built-in?
 
 Kay?

So, as it turns out there's no kernel API available to check whether a
btrfs raid array is now complete enough for degraded mode to
succeed. There's only a way to check whether it is fully complete.

And even if we had an API for this, how would this even work at all? I
mean, just having a catchall switch to boot in degraded mode is really
dangerous if people have more than one array and we might end up
mounting an fs in degraded mode that actually is fully available if we
just waited 50ms longer...

I mean this is even the problem with just one array: if you have
redundancy of 3 disks, when do you start mounting the thing when
degraded mode is requested on the kernel command line? as soon as
degraded mounting is possible (thus fucking up possibly all 3 disks
that happened to show up last), or later?

I have no idea how this all should work really, it's a giant mess. There
probably needs to be some btrfs userspace daemon thing that watches
btrfs arrays and does some magic if they time out.
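
Sketching what I mean (purely hypothetical, nothing like this exists;
every name below is made up except 'btrfs device ready'):

btrfs_watch_array() {
    dev=$1; timeout=$2
    t=0
    # 'btrfs device ready' exits 0 only once all members have appeared
    while ! btrfs device ready "$dev"; do
        if [ $t -ge "$timeout" ]; then
            go_degraded "$dev"    # the policy decision; made-up helper
            return
        fi
        sleep 1; t=$((t + 1))
    done
}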

But for now I am pretty sure we should just leave everything in fully
manual mode, that's the safest thing to do...

Lennart

-- 
Lennart Poettering, Red Hat


Re: [systemd-devel] timed out waiting for device dev-disk-by\x2duuid

2014-05-15 Thread Goffredo Baroncelli
On 05/15/2014 08:16 PM, Lennart Poettering wrote:
 On Thu, 15.05.14 19:29, Lennart Poettering (lenn...@poettering.net) wrote:
 
 On Mon, 12.05.14 20:48, Chris Murphy (li...@colorremedies.com) wrote:

[...]
 
 So, as it turns out there's no kernel API available to check whether a
 btrfs raid array is now complete enough for degraded mode to
 succeed. There's only a way to check whether it is fully complete.
 
 And even if we had an API for this, how would this even work at all? 

How should this be different from the normal RAID system?

In both cases there are two timeouts: the first one is for waiting for the full 
set of devices, the second one is for the minimal set of disks needed for a 
degraded mode. If even the second timeout passes, then we should consider the 
filesystem not assemblable.

How is it handled for the RAID system? Knowing that, we should consider applying 
the same strategies for btrfs (maybe we need some userspace tool to do that).

 I mean, just having a catchall switch to boot in degraded mode is really
 dangerous if people have more than one array and we might end up
 mounting an fs in degraded mode that actually is fully available if we
 just waited 50ms longer...
 
 I mean this is even the problem with just one array: if you have
 redundancy of 3 disks, when do you start mounting the thing when
 degraded mode is requested on the kernel command line? as soon as
 degraded mounting is possible (thus fucking up possibly all 3 disks
 that happened to show up last), or later?
 
 I have no idea how this all should work really, it's a giant mess. There
 probably needs to be some btrfs userspace daemon thing that watches
 btrfs arrays and does some magic if they time out.
 
 But for now I am pretty sure we should just leave everything in fully
 manual mode, that's the safest thing to do...
 
 Lennart
 


-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5


Re: [systemd-devel] timed out waiting for device dev-disk-by\x2duuid

2014-05-15 Thread Chris Murphy

On May 15, 2014, at 12:16 PM, Lennart Poettering lenn...@poettering.net wrote:

 On Thu, 15.05.14 19:29, Lennart Poettering (lenn...@poettering.net) wrote:
 
 On Mon, 12.05.14 20:48, Chris Murphy (li...@colorremedies.com) wrote:
 
 Two device Btrfs volume, with one device missing (simulated) will not
 boot, even with rootflags=degraded set which is currently required to
 enable Btrfs degraded mounts. Upon reaching a dracut shell after
 basic.target fails with time out, I can mount -o subvol=root,degraded
 and exit and continue boot normally with just the single device.
 
 The problem seems to be that systemd (udev?) is not finding the volume
 by uuid for some reason, and therefore not attempting to mount it. But
 I don't know why it can't find it, or even how the find by uuid
 mechanism works this early in boot. So I'm not sure if this is a
 systemd or udev bug, or a dracut, or kernel bug.
 
 The problem happens with systemd 208-9.fc20 with kernel
 3.11.10-301.fc20, and systemd 212-4.fc21 and kernel
 3.15.0-0.rc5.git0.1.fc21.
 
 As soon as btrfs reports that a file system is ready, systemd will pick
 it up. This is handled with the btrfs udev built-in, and invoked via
 /usr/lib/udev/rules.d/64-btrfs.rules. rootflags has no influence on
 that, as at that point it is not clear whether the block device will be
 the one that carries the root file system, or any other file system.
 
 Not sure what we should be doing about this. Maybe introduce a new
 btrfs=degraded switch that acts globally, and influences the udev built-in?
 
 Kay?
 
 So, as it turns out there's no kernel API available to check whether a
 btrfs raid array is now complete enough for degraded mode to
 succeed. There's only a way to check whether it is fully complete.
 
 And even if we had an API for this, how would this even work at all? I
 mean, just having a catchall switch to boot in degraded mode is really
 dangerous if people have more than one array and we might end up
 mounting an fs in degraded mode that actually is fully available if we
 just waited 50ms longer...
 
 I mean this is even the problem with just one array: if you have
 redundancy of 3 disks, when do you start mounting the thing when
 degraded mode is requested on the kernel command line? as soon as
 degraded mounting is possible (thus fucking up possibly all 3 disks
 that happened to show up last), or later?
 
 I have no idea how this all should work really, it's a giant mess. There
 probably needs to be some btrfs userspace daemon thing that watches
 btrfs arrays and does some magic if they time out.
 
 But for now I am pretty sure we should just leave everything in fully
 manual mode, that's the safest thing to do…

Is it that the existing udev rule either doesn't know, or doesn't have a way of 
knowing, that rootflags=degraded should enable only the root=UUID device to 
bypass the ready rule?

Does udev expect a different readiness state before attempting a mount than a 
manual mount from the dracut shell does? I'm confused why the Btrfs volume is not 
ready for systemd, which then doesn't even attempt to mount it, and yet at a 
dracut shell it is ready when I do the mount manually. That seems like two 
readiness states.

I'd say it's not udev's responsibility, but rather that of the Btrfs kernel code, 
to make sure things don't get worse with the file system, regardless of what 
devices it's presented with. At the time it tries to do the mount, it has its own 
logic for normal and degraded mounts to decide whether the minimum number of 
devices is present, and if not it fails. The degraded mount is also per volume, 
not global.

For example, if I remove a device, boot degraded, and work for a few hours 
making lots of changes (even doing a system update, which is probably insane to 
do), I can later reboot with the stale device attached and Btrfs figures it 
out, passively. That means it figures out if there's a newer copy when a file 
is read, forwards the newest copy to user space, and fixes the stale copy 
on the previously missing device. A manual balance ensures all new files also 
have redundancy. I think the intent is eventually to have a smarter balance 
catch-up filter that can also run automatically in such a case. In any case 
the file system isn't trashed.



Chris Murphy


Re: [systemd-devel] timed out waiting for device dev-disk-by\x2duuid

2014-05-15 Thread Kay Sievers
On Thu, May 15, 2014 at 10:57 PM, Goffredo Baroncelli
kreij...@libero.it wrote:
 On 05/15/2014 08:16 PM, Lennart Poettering wrote:
 On Thu, 15.05.14 19:29, Lennart Poettering (lenn...@poettering.net) wrote:

 On Mon, 12.05.14 20:48, Chris Murphy (li...@colorremedies.com) wrote:

 [...]

 So, as it turns out there's no kernel APi available to check whether a
 btrfs raid array is now complete enough for degraded mode to
 succeed. There's only a way to check whether it is fully complete.

 And even if we had an API for this, how would this even work at all?

 How should this be different from the normal RAID system?

 In both cases there are two timeouts: the first one is for waiting for the full
 set of devices, the second one is for the minimal set of disks needed for a
 degraded mode. If even the second timeout passes, then we should consider the
 filesystem not assemblable.

 How is it handled for the RAID system? Knowing that, we should consider
 applying the same strategies for btrfs (maybe we need some userspace tool to do
 that).

RAID is not handled by systemd, it is handled by other tools or not at
all. Initrds have some logic here, but nothing convincing, and it is
just the same mess as this.

Kay


Re: [systemd-devel] timed out waiting for device dev-disk-by\x2duuid

2014-05-15 Thread Kay Sievers
On Thu, May 15, 2014 at 10:57 PM, Chris Murphy li...@colorremedies.com wrote:
 On May 15, 2014, at 12:16 PM, Lennart Poettering lenn...@poettering.net 
 wrote:
 On Thu, 15.05.14 19:29, Lennart Poettering (lenn...@poettering.net) wrote:
 On Mon, 12.05.14 20:48, Chris Murphy (li...@colorremedies.com) wrote:

 Two device Btrfs volume, with one device missing (simulated) will not
 boot, even with rootflags=degraded set which is currently required to
 enable Btrfs degraded mounts. Upon reaching a dracut shell after
 basic.target fails with time out, I can mount -o subvol=root,degraded
 and exit and continue boot normally with just the single device.

 The problem seems to be that systemd (udev?) is not finding the volume
 by uuid for some reason, and therefore not attempting to mount it. But
 I don't know why it can't find it, or even how the find by uuid
 mechanism works this early in boot. So I'm not sure if this is a
 systemd or udev bug, or a dracut, or kernel bug.

 The problem happens with systemd 208-9.fc20 with kernel
 3.11.10-301.fc20, and systemd 212-4.fc21 and kernel
 3.15.0-0.rc5.git0.1.fc21.

 As soon as btrfs reports that a file system is ready, systemd will pick
 it up. This is handled with the btrfs udev built-in, and invoked via
 /usr/lib/udev/rules.d/64-btrfs.rules. rootflags has no influence on
 that, as at that point it is not clear whether the block device will be
 the one that carries the root file system, or any other file system.

 Not sure what we should be doing about this. Maybe introduce a new
 btrfs=degraded switch that acts globally, and influences the udev built-in?

 Kay?

 So, as it turns out there's no kernel API available to check whether a
 btrfs raid array is now complete enough for degraded mode to
 succeed. There's only a way to check whether it is fully complete.

 And even if we had an API for this, how would this even work at all? I
 mean, just having a catchall switch to boot in degraded mode is really
 dangerous if people have more than one array and we might end up
 mounting an fs in degraded mode that actually is fully available if we
 just waited 50ms longer...

 I mean this is even the problem with just one array: if you have
 redundancy of 3 disks, when do you start mounting the thing when
 degraded mode is requested on the kernel command line? as soon as
 degraded mounting is possible (thus fucking up possibly all 3 disks
 that happened to show up last), or later?

 I have no idea how this all should work really, it's a giant mess. There
 probably needs to be some btrfs userspace daemon thing that watches
 btrfs arrays and does some magic if they time out.

 But for now I am pretty sure we should just leave everything in fully
 manual mode, that's the safest thing to do…

 Is it that the existing udev rule either doesn't know, or doesn't have a way 
 of knowing, that rootflags=degraded should enable only the root=UUID device 
 to bypass the ready rule?

 Does udev expect a different readiness state before attempting a mount than a
 manual mount from the dracut shell does? I'm confused why the Btrfs volume is
 not ready for systemd, which then doesn't even attempt to mount it, and yet at
 a dracut shell it is ready when I do the mount manually. That seems like two
 readiness states.


The btrfs kernel code exposes only one state, and that is what udev reacts to.

 I'd say it's not udev's responsibility, but rather that of the Btrfs kernel
 code, to make sure things don't get worse with the file system, regardless of
 what devices it's presented with. At the time it tries to do the mount, it has
 its own logic for normal and degraded mounts to decide whether the minimum
 number of devices is present, and if not it fails. The degraded mount is also
 per volume, not global.

 For example, if I remove a device, boot degraded, and work for a few hours
 making lots of changes (even doing a system update, which is probably insane
 to do), I can later reboot with the stale device attached and Btrfs figures
 it out, passively. That means it figures out if there's a newer copy when a
 file is read, forwards the newest copy to user space, and fixes the
 stale copy on the previously missing device. A manual balance ensures all new
 files also have redundancy. I think the intent is eventually to have a
 smarter balance catch-up filter that can also run automatically in such a
 case. In any case the file system isn't trashed.

The problem is deciding when to actively force things into degraded mode when
devices do not show up in time. That is nothing the kernel can know; it would
need to be userspace making that decision. But udev does not really
have that information at that level; it would have to keep trying until the
kernel is satisfied mounting a volume degraded.

This all is probably not a job for udev or systemd, but for a
specialized storage daemon which has explicit configuration/policy about
the ways in which it may mess around with the user's data.
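
Purely as illustration of what explicit configuration/policy could mean,
such a daemon might read something like this (entirely hypothetical; no
such tool or file format exists):

# /etc/btrfs-assembly-policy.conf (hypothetical)
[array 9ff63135-ce42-4447-a6de-d7c9b4fb6d66]
wait-complete-sec=30          # first timeout: wait for all member devices
allow-degraded=yes            # may we proceed without them at all?
degraded-extra-wait-sec=60    # second timeout before forcing a degraded mount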

This is not an area where we should try to be smart; falling back 

Re: [systemd-devel] timed out waiting for device dev-disk-by\x2duuid

2014-05-15 Thread Chris Murphy

On May 15, 2014, at 2:57 PM, Goffredo Baroncelli kreij...@libero.it wrote:

 On 05/15/2014 08:16 PM, Lennart Poettering wrote:
 On Thu, 15.05.14 19:29, Lennart Poettering (lenn...@poettering.net) wrote:
 
 On Mon, 12.05.14 20:48, Chris Murphy (li...@colorremedies.com) wrote:
 
 [...]
 
 So, as it turns out there's no kernel API available to check whether a
 btrfs raid array is now complete enough for degraded mode to
 succeed. There's only a way to check whether it is fully complete.
 
 And even if we had an API for this, how would this even work at all? 
 
 How should this be different from the normal RAID system?

I think it's because with md, the array assembly is separate from the fs mount. I 
don't know what timeout it uses, but it does do automatic degraded assembly 
eventually. Once assembled (degraded or normal), the md device is ready, and 
that's when udev rules start to apply and systemd will try to mount the root fs.

However on Btrfs, the degraded assembly and fs mount concepts are combined. So 
without degraded assembly first, it sounds like udev+systemd don't even try to 
mount the fs.

At least that's my rudimentary understanding.

The udev rule right now is asking if all Btrfs member devices are present, and 
it sounds like the answer is no with a missing device; so a mount isn't even 
attempted by systemd, rather than systemd attempting a degraded mount 
specifically for the root=UUID device(s).

What is parsing the boot parameters ro, root=, and rootflags=? Are those 
recognized by the kernel or systemd?



 
 In both cases there are two timeouts: the first one is for waiting for the full 
 set of devices, the second one is for the minimal set of disks needed for a 
 degraded mode. If even the second timeout passes, then we should consider the 
 filesystem not assemblable.

 How is it handled for the RAID system? Knowing that, we should consider 
 applying the same strategies for btrfs (maybe we need some userspace tool to do 
 that).

Well, that sounds like a userspace tool that should be in the initramfs so it 
can do this logic before systemd even attempts to mount the rootfs. Or it's done 
by kernel code, if it's possible for it to parse root=UUID and only care about 
the member devices for that volume.


Chris Murphy



Re: [systemd-devel] timed out waiting for device dev-disk-by\x2duuid

2014-05-15 Thread Lennart Poettering
On Thu, 15.05.14 15:54, Chris Murphy (li...@colorremedies.com) wrote:

 The udev rule right now is asking if all Btrfs member devices are
 present, and it sounds like the answer is no with a missing device; so
 a mount isn't even attempted by systemd, rather than systemd attempting a
 degraded mount specifically for the root=UUID device(s).

Yes, correct. 

And my suspicion is that if any more complex logic than this shall take
place, then this should probably be managed by some kind of storage
daemon still to be written, not udev. udev doesn't do timeouts, and has
no clue about raid arrays. If things get this complex there really needs
to be some other component in place that can handle this...

 What is parsing the boot parameters ro, root=, and rootflags=? Are
 those recognized by the kernel or systemd?

When an initrd is used it is the initrd, otherwise the kernel itself.

Lennart

-- 
Lennart Poettering, Red Hat