Re: [systemd-devel] btrfs raid not ready but systemd tries to mount it anyway

2020-10-17 Thread Daniel J. R. May
On Fri, 2020-10-16 at 19:02 +0200, Michał Zegan wrote:
> As a workaround I am almost sure you can instruct dracut to include
> the file, can't you?

Hi Michal,

Yes you are right. I've got it working reliably by doing:

$ sudo dracut --add btrfs --force

Thank you for the prompt! 

Cheers, Dan



___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] btrfs raid not ready but systemd tries to mount it anyway

2020-10-16 Thread Daniel J. R. May
On Fri, 2020-10-16 at 15:16 +0200, Lennart Poettering wrote:
> So the btrfs ready ioctl is called and the device considered by the
> kernel btrfs implementation to be ready (i.e. assembled from all
> component devices) with this line:
> 
> [   27.804250] systemd-udevd[712]: sde: /usr/lib/udev/rules.d/64-btrfs.rules:15 RUN '/usr/bin/udevadm trigger -s block -p ID_BTRFS_READY=0'
> 
> (that's line 35792)
> 
> And srv.mount then later mounts the thing correctly.
> 
> Where's the problem supposed to be?
> 
I will try and summarise the situation in one message to help make
things clear. Here is a summarised history of the problem:

1. The original problem: 
A 10 HDD BTRFS volume with 4 drives connected to the motherboard and 6
drives connected to the HBA *fails* to mount automatically at boot time.
The log for this (without any special debugging options set in grub) is
here:

https://drive.google.com/file/d/1o1-7smQAjg3LKP98EFfeHbb_jZdNz_jT/view?usp=sharing

2. Debugging with "rd.udev.debug systemd.log_level=debug":
The same 10 HDD BTRFS volume with 4 drives connected to the motherboard
and 6 drives connected to the HBA *fails* to mount automatically at boot
time. The log for this with "rd.udev.debug systemd.log_level=debug" set
in grub is here:

https://drive.google.com/file/d/1jVHjAQ8CY9vABtM2giPTB6XeZCclm7R-/view?usp=sharing


3. Debugging with "udev.log_priority=debug systemd.log_level=debug":
The same 10 HDD BTRFS volume with 4 drives connected to the motherboard
and 6 drives connected to the HBA *correctly* mounts automatically at
boot time. The log for this with "udev.log_priority=debug
systemd.log_level=debug" set in grub is here:

https://drive.google.com/file/d/1-x26JS5gcZxwZx7zWoKduEg9ycwUHKqO/view?usp=sharing

This was the log you were looking at when you asked "Where's
the problem supposed to be?"

4. Changing HDD connections:
When I changed the hard drive connections so that all 10 HDDs are
connected via the HBA the BTRFS volume *correctly* mounts automatically
at boot time.


I hope that is a bit clearer.

So although putting all the HDDs on the HBA seems to fix the problem for
me, I thought that I should report my findings.

Thank you for all your time and effort looking into this.

Best wishes,

Dan






Re: [systemd-devel] btrfs raid not ready but systemd tries to mount it anyway

2020-10-16 Thread Andrei Borzenkov
16.10.2020 18:51, Lennart Poettering wrote:
>>
>> The btrfs udev rule file appears to be missing in the initrd. The
>> block devices with the btrfs file systems on them will thus be marked
>> ready in systemd instantly instead of being delayed until all other
>> devices of the same btrfs fs have shown up in udev too.
>>
>> Fix your initrd.
> 
> So my educated guess is that this is a dracut bug: it excludes the
> btrfs udev rule file from the initrd unless the root fs is btrfs.
> 
> But this doesn't work, because the absence of that file means that all
> btrfs file systems will be marked as ready instantly as they appear,
> which then blows up later if during later boot btrfs file systems that
> are backed by multiple devices shall be mounted.
> 
> It's basically a race: if your block devices appear in the initrd
> already, then you lose,

This is unmanageable. It means we now have to include half of the kernel and
user space, because god knows what udev rules could require. How is
dracut even supposed to know what to include?

The initrd is needed to mount root. Period. If you broke it by carrying over
incomplete udev database state from the initrd, it is up to *you* to fix it,
rather than forcing a copy of the root system into the initrd.

> because all such devices will instantly be
> marked "ready to be mounted" because the udev rule file is missing
> there. However, if the block devices take longer to appear, and are
> thus first seen after the initrd→host transition, then all will be
> good, as the udev rules file for it exists there, and the devices are
> not marked ready until all necessary devices have shown up in udev.
> 
> Fix is: dracut should just include the file unconditionally. It's
> tiny.
> 
> If it really, really insists on not including it on systems where btrfs
> isn't used, then it should scan the host for any btrfs use at all. It's
> not sufficient to check whether the rootfs is btrfs.
> 
> Anyway, please report to dracut.
> 
> Lennart
> 
> --
> Lennart Poettering, Berlin


Re: [systemd-devel] btrfs raid not ready but systemd tries to mount it anyway

2020-10-16 Thread Michał Zegan
As a workaround I am almost sure you can instruct dracut to include the
file, can't you?

On 16.10.2020 at 17:45, Lennart Poettering wrote:
> On Fri, 16.10.20 16:26, Daniel J. R. May (daniel@danieljrmay.com) wrote:
> 
>> On Fri, 2020-10-16 at 15:16 +0200, Lennart Poettering wrote:
>>> So the btrfs ready ioctl is called and the device considered by the
>>> kernel btrfs implementation to be ready (i.e. assembled from all
>>> component devices) with this line:
>>>
>>> [   27.804250] systemd-udevd[712]: sde: /usr/lib/udev/rules.d/64-btrfs.rules:15 RUN '/usr/bin/udevadm trigger -s block -p ID_BTRFS_READY=0'
>>>
>>> (that's line 35792)
>>>
>>> And srv.mount then later mounts the thing correctly.
>>>
>>> Where's the problem supposed to be?
>>>
>> I will try and summarise the situation in one message to help make
>> things clear. Here is a summarised history of the problem:
>>
>> 1. The original problem:
>> A 10 HDD BTRFS volume with 4 drives connected to the motherboard and 6
>> drives connected to the HBA *fails* to mount automatically at boot time.
>> The log for this (without any special debugging options set in grub) is
>> here:
>>
>> https://drive.google.com/file/d/1o1-7smQAjg3LKP98EFfeHbb_jZdNz_jT/view?usp=sharing
>>
>> 2. Debugging with "rd.udev.debug systemd.log_level=debug":
>> The same 10 HDD BTRFS volume with 4 drives connected to the motherboard
>> and 6 drives connected to the HBA *fails* to mount automatically at boot
>> time. The log for this with "rd.udev.debug systemd.log_level=debug" set
>> in grub is here:
>>
>> https://drive.google.com/file/d/1jVHjAQ8CY9vABtM2giPTB6XeZCclm7R-/view?usp=sharing
> 
> The btrfs udev rule file appears to be missing in the initrd. The
> block devices with the btrfs file systems on them will thus be marked
> ready in systemd instantly instead of being delayed until all other
> devices of the same btrfs fs have shown up in udev too.
> 
> Fix your initrd.
> 
> Lennart
> 
> --
> Lennart Poettering, Berlin


Re: [systemd-devel] btrfs raid not ready but systemd tries to mount it anyway

2020-10-16 Thread Lennart Poettering
On Fri, 16.10.20 17:45, Lennart Poettering (lenn...@poettering.net) wrote:

> > 2. Debugging with "rd.udev.debug systemd.log_level=debug":
> > The same 10 HDD BTRFS volume with 4 drives connected to the motherboard
> > and 6 drives connected to the HBA *fails* to mount automatically at boot
> > time. The log for this with "rd.udev.debug systemd.log_level=debug" set
> > in grub is here:
> >
> > https://drive.google.com/file/d/1jVHjAQ8CY9vABtM2giPTB6XeZCclm7R-/view?usp=sharing
>
> The btrfs udev rule file appears to be missing in the initrd. The
> block devices with the btrfs file systems on them will thus be marked
> ready in systemd instantly instead of being delayed until all other
> devices of the same btrfs fs have shown up in udev too.
>
> Fix your initrd.

So my educated guess is that this is a dracut bug: it excludes the
btrfs udev rule file from the initrd unless the root fs is btrfs.

But this doesn't work, because the absence of that file means that all
btrfs file systems will be marked as ready instantly as they appear,
which then blows up later if during later boot btrfs file systems that
are backed by multiple devices shall be mounted.

It's basically a race: if your block devices appear in the initrd
already, then you lose, because all such devices will instantly be
marked "ready to be mounted" because the udev rule file is missing
there. However, if the block devices take longer to appear, and are
thus first seen after the initrd→host transition, then all will be
good, as the udev rules file for it exists there, and the devices are
not marked ready until all necessary devices have shown up in udev.
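
To make the race concrete, here is a toy model in Python. It is only a sketch of the reasoning above, not of systemd's actual code: the `Device` class, both function names, and all timings are hypothetical. It assumes that a device seen in the initrd without the btrfs rule counts as ready as soon as the host takes over, and that otherwise readiness waits for the last device.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    name: str
    appears_at: float  # seconds after boot; illustrative values only


def mount_attempt_time(devices, switch_root_at, rule_in_initrd):
    """Earliest time at which the file system would be considered mountable."""
    all_present_at = max(d.appears_at for d in devices)
    seen_in_initrd = any(d.appears_at < switch_root_at for d in devices)
    if seen_in_initrd and not rule_in_initrd:
        # No btrfs rule ran in the initrd, so nothing held the device back:
        # it counts as ready as soon as the host system takes over.
        return switch_root_at
    # The rule was applied to every device, so readiness waits for the last one.
    return all_present_at


def is_safe(devices, switch_root_at, rule_in_initrd):
    """True if every device has appeared by the time of the mount attempt."""
    attempt = mount_attempt_time(devices, switch_root_at, rule_in_initrd)
    return all(d.appears_at <= attempt for d in devices)


# Hypothetical timings: 4 on-board disks appear early, 6 HBA disks appear late.
onboard = [Device(f"sd{c}", 2.0) for c in "abcd"]
hba = [Device(f"sd{c}", 30.0) for c in "efghij"]
mixed = onboard + hba

print(is_safe(mixed, switch_root_at=10.0, rule_in_initrd=False))  # False
print(is_safe(mixed, switch_root_at=10.0, rule_in_initrd=True))   # True
```

With all ten disks appearing only after the switch root, the model reports a safe mount even without the rule in the initrd, which matches the observation that moving every disk to the HBA made the problem go away.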

Fix is: dracut should just include the file unconditionally. It's
tiny.

If it really, really insists on not including it on systems where btrfs
isn't used, then it should scan the host for any btrfs use at all. It's
not sufficient to check whether the rootfs is btrfs.

Anyway, please report to dracut.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] btrfs raid not ready but systemd tries to mount it anyway

2020-10-16 Thread Lennart Poettering
On Fri, 16.10.20 16:26, Daniel J. R. May (daniel@danieljrmay.com) wrote:

> On Fri, 2020-10-16 at 15:16 +0200, Lennart Poettering wrote:
> > So the btrfs ready ioctl is called and the device considered by the
> > kernel btrfs implementation to be ready (i.e. assembled from all
> > component devices) with this line:
> >
> > [   27.804250] systemd-udevd[712]: sde: /usr/lib/udev/rules.d/64-btrfs.rules:15 RUN '/usr/bin/udevadm trigger -s block -p ID_BTRFS_READY=0'
> >
> > (that's line 35792)
> >
> > And srv.mount then later mounts the thing correctly.
> >
> > Where's the problem supposed to be?
> >
> I will try and summarise the situation in one message to help make
> things clear. Here is a summarised history of the problem:
>
> 1. The original problem:
> A 10 HDD BTRFS volume with 4 drives connected to the motherboard and 6
> drives connected to the HBA *fails* to mount automatically at boot time.
> The log for this (without any special debugging options set in grub) is
> here:
>
> https://drive.google.com/file/d/1o1-7smQAjg3LKP98EFfeHbb_jZdNz_jT/view?usp=sharing
>
> 2. Debugging with "rd.udev.debug systemd.log_level=debug":
> The same 10 HDD BTRFS volume with 4 drives connected to the motherboard
> and 6 drives connected to the HBA *fails* to mount automatically at boot
> time. The log for this with "rd.udev.debug systemd.log_level=debug" set
> in grub is here:
>
> https://drive.google.com/file/d/1jVHjAQ8CY9vABtM2giPTB6XeZCclm7R-/view?usp=sharing

The btrfs udev rule file appears to be missing in the initrd. The
block devices with the btrfs file systems on them will thus be marked
ready in systemd instantly instead of being delayed until all other
devices of the same btrfs fs have shown up in udev too.
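
One way to see whether the rule has been applied to a given device is to look at its udev properties. The following Python sketch parses the KEY=VALUE output of `udevadm info --query=property`. `ID_BTRFS_READY` is the property named in the log line quoted elsewhere in this thread; `SYSTEMD_READY` is the property documented in systemd.device(5). The function names are hypothetical, and on a running system this only shows the state after the host's rules have been applied, not what happened inside the initrd.

```python
import subprocess


def parse_properties(text):
    """Turn KEY=VALUE lines (as printed by udevadm info) into a dict."""
    props = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def btrfs_rule_applied(props):
    """True if a btrfs member device carries the property set by the btrfs rule."""
    return props.get("ID_FS_TYPE") == "btrfs" and "ID_BTRFS_READY" in props


def held_back(props):
    """True if the properties say the device must not be treated as ready yet."""
    return props.get("ID_BTRFS_READY") == "0" or props.get("SYSTEMD_READY") == "0"


def query_device(devnode):
    """Query udev for one device node, e.g. /dev/sde (requires udevadm)."""
    out = subprocess.run(
        ["udevadm", "info", "--query=property", f"--name={devnode}"],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_properties(out)
```

A btrfs member device for which `btrfs_rule_applied()` is false was processed without the rule, which is the situation described above.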

Fix your initrd.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] btrfs raid not ready but systemd tries to mount it anyway

2020-10-16 Thread Lennart Poettering
On Mon, 12.10.20 14:42, Chris Murphy (li...@colorremedies.com) wrote:

> > When precisely it returns success or failure is entirely up to the
> > btrfs kernel code. systemd/udev doesn't have any control on that. The
> > udev btrfs builtin is too trivial for that: it just calls the ioctl
> > and that pretty much is it.
>
> What does this line mean? Does it mean the 'btrfs ready' ioctl has
> been called at this moment and the device is ready? i.e. this specific
> device is ready now, but not before now?
>
> [   30.923721] kernel: BTRFS: device label BTRFS_RAID1_srv devid 1 transid 60815 /dev/sdg scanned by systemd-udevd (710)

It's generated by the kernel whenever userspace calls
BTRFS_IOC_DEVICES_READY (also in a few other cases, e.g. if you never
call that but mount the fs right away).

That ioctl is called by udev whenever a btrfs device pops up, and it
basically tells the btrfs layer in the kernel to consider it, and then
maybe assemble something from it.

> Because I see six such lines for this file system before the mount
> attempt. And four such lines after the mount attempt. If "all devices
> ready" is not true until the last such line appears, then the mount is
> happening too soon for some reason.

It should be generated at least once for each device belonging to the
file system, because udev calls the ioctl whenever a btrfs device pops up.

But note that due to the initrd transition and the retriggering of udev
rules you might see this more than once per device.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] btrfs raid not ready but systemd tries to mount it anyway

2020-10-16 Thread Lennart Poettering
On Fri, 16.10.20 13:21, Daniel J. R. May (daniel@danieljrmay.com) wrote:

> Log available here:
> https://drive.google.com/file/d/1-x26JS5gcZxwZx7zWoKduEg9ycwUHKqO/view?usp=sharing

So the btrfs ready ioctl is called and the device considered by the
kernel btrfs implementation to be ready (i.e. assembled from all
component devices) with this line:

[   27.804250] systemd-udevd[712]: sde: /usr/lib/udev/rules.d/64-btrfs.rules:15 RUN '/usr/bin/udevadm trigger -s block -p ID_BTRFS_READY=0'

(that's line 35792)

And srv.mount then later mounts the thing correctly.

Where's the problem supposed to be?

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] btrfs raid not ready but systemd tries to mount it anyway

2020-10-12 Thread Chris Murphy
On Mon, Oct 12, 2020 at 1:33 AM Lennart Poettering wrote:
>
> On Sun, 11.10.20 14:57, Chris Murphy (li...@colorremedies.com) wrote:
>
> > Hi,
> >
> > A Fedora 32 (systemd-245.8-2.fc32) user has a 10-drive Btrfs raid1 set
> > to mount in /etc/fstab:
> >
> > UUID=f89f0a16-  /srv   btrfs  defaults,nofail,x-systemd.requires=/  0 0
> >
> > For some reason, systemd is trying to mount this file system before
> > all ten devices are ready. Supposedly this rule applies:
> > https://github.com/systemd/systemd/blob/master/rules.d/64-btrfs.rules.in
>
> udev calls the btrfs ready ioctl whenever a new btrfs fs block device
> shows up. The ioctl will fail as long as not all devices that make up
> the fs have shown up. It succeeds once all devices for the fs are
> there, i.e. for n=10 devices it will return failure 9 times and
> success the one final time.
>
> When precisely it returns success or failure is entirely up to the
> btrfs kernel code. systemd/udev doesn't have any control on that. The
> udev btrfs builtin is too trivial for that: it just calls the ioctl
> and that pretty much is it.

What does this line mean? Does it mean the 'btrfs ready' ioctl has
been called at this moment and the device is ready? i.e. this specific
device is ready now, but not before now?

[   30.923721] kernel: BTRFS: device label BTRFS_RAID1_srv devid 1 transid 60815 /dev/sdg scanned by systemd-udevd (710)

Because I see six such lines for this file system before the mount
attempt. And four such lines after the mount attempt. If "all devices
ready" is not true until the last such line appears, then the mount is
happening too soon for some reason.
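
Counting those lines by hand is error-prone, so the same comparison can be scripted. The Python sketch below assumes the log line format shown above (file systems without a label may be logged differently) and takes the timestamp of the mount attempt as a parameter, `mount_ts`, which has to be read from the journal separately. The function and parameter names are hypothetical.

```python
import re

# Matches kernel lines such as:
# [   30.923721] kernel: BTRFS: device label BTRFS_RAID1_srv devid 1 transid 60815 /dev/sdg scanned by systemd-udevd (710)
SCAN_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+kernel: BTRFS: device label (?P<label>\S+) "
    r"devid (?P<devid>\d+) transid \d+ (?P<dev>\S+) scanned by"
)


def scans_around(lines, label, mount_ts):
    """Return (devids scanned before mount_ts, devids first scanned after it)."""
    before, after = set(), set()
    for line in lines:
        match = SCAN_RE.search(line)
        if not match or match.group("label") != label:
            continue
        target = before if float(match.group("ts")) < mount_ts else after
        target.add(int(match.group("devid")))
    # A device rescanned later (e.g. after a udev retrigger) is not "late".
    return before, after - before
```

If the second set is non-empty, the mount was attempted before every member device had been scanned.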


> For historical reasons udev log level is independent from the rest of
> systemd log level. Thus use udev.log_priority=debug to turn on udev
> debug logging.

I'll have him retry with udev.log_priority=debug and if I get a moment
I'll try to reproduce. The difficulty is that reproducing truly missing
devices is easy and appears to work, whereas in this case they are
merely late being scanned for whatever reason (maybe they take longer
to spin up, maybe the HBA they're connected to is just slow or has a
later-loading driver, etc.).


-- 
Chris Murphy


Re: [systemd-devel] btrfs raid not ready but systemd tries to mount it anyway

2020-10-12 Thread Chris Murphy
On Sun, Oct 11, 2020 at 11:56 PM Andrei Borzenkov  wrote:
>
> 11.10.2020 23:57, Chris Murphy wrote:
> > Hi,
> >
> > A Fedora 32 (systemd-245.8-2.fc32) user has a 10-drive Btrfs raid1 set
> > to mount in /etc/fstab:
> >
> > UUID=f89f0a16-  /srv   btrfs  defaults,nofail,x-systemd.requires=/  0 0
> >
> > For some reason, systemd is trying to mount this file system before
> > all ten devices are ready. Supposedly this rule applies:
> > https://github.com/systemd/systemd/blob/master/rules.d/64-btrfs.rules.in
> >
> > Fedora does have /usr/lib/udev/rules.d/64-btrfs.rules but I find no
> > reference at all to this rule when the user boots with 'rd.udev.debug
> > systemd.log_level=debug'. The entire journal is here:
> >
> > https://drive.google.com/file/d/1jVHjAQ8CY9vABtM2giPTB6XeZCclm7R-/view
> >
>
> Educated guess: the rule is missing in the initrd and you do not run udev
> trigger after switch root.

I will ask the user to double check their initrd, but mine definitely
has it without any initrd/dracut related customizations.

$ sudo lsinitrd initramfs-5.8.8-200.fc32.x86_64.img | grep btrfs
btrfs

-rw-r--r--   1 root root  616 May 29 12:35 usr/lib/udev/rules.d/64-btrfs.rules
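
The manual check above can also be scripted, which is handy for re-checking an image after it has been regenerated. This is a minimal Python sketch that assumes `lsinitrd` is installed and prints a listing like the one shown; the function names are hypothetical, and the caller has to supply the image path and sufficient privileges to read it.

```python
import subprocess

RULE_PATH = "usr/lib/udev/rules.d/64-btrfs.rules"


def listing_has_btrfs_rule(listing):
    """True if any line of an lsinitrd listing ends with the rule path."""
    return any(line.rstrip().endswith(RULE_PATH) for line in listing.splitlines())


def initrd_has_btrfs_rule(image):
    """List the given initramfs image with lsinitrd and look for the rule file."""
    listing = subprocess.run(
        ["lsinitrd", image], check=True, capture_output=True, text=True
    ).stdout
    return listing_has_btrfs_rule(listing)
```

Calling `initrd_has_btrfs_rule()` with the path of the initramfs image in use returns False on a system affected by the problem discussed in this thread.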

-- 
Chris Murphy


Re: [systemd-devel] btrfs raid not ready but systemd tries to mount it anyway

2020-10-12 Thread Lennart Poettering
On Sun, 11.10.20 14:57, Chris Murphy (li...@colorremedies.com) wrote:

> Hi,
>
> A Fedora 32 (systemd-245.8-2.fc32) user has a 10-drive Btrfs raid1 set
> to mount in /etc/fstab:
>
> UUID=f89f0a16-  /srv   btrfs  defaults,nofail,x-systemd.requires=/  0 0
>
> For some reason, systemd is trying to mount this file system before
> all ten devices are ready. Supposedly this rule applies:
> https://github.com/systemd/systemd/blob/master/rules.d/64-btrfs.rules.in

udev calls the btrfs ready ioctl whenever a new btrfs fs block device
shows up. The ioctl will fail as long as not all devices that make up
the fs have shown up. It succeeds once all devices for the fs are
there, i.e. for n=10 devices it will return failure 9 times and
success the one final time.

When precisely it returns success or failure is entirely up to the btrfs kernel
code. systemd/udev doesn't have any control on that. The udev btrfs
builtin is too trivial for that: it just calls the ioctl and that
pretty much is it.
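
The counting behaviour described above can be illustrated with a toy model in Python. This is not the kernel's implementation and does not call the real ioctl; the class and method names are hypothetical stand-ins for "register a device, then report whether all devices of the file system are present".

```python
class BtrfsReadyModel:
    """Toy stand-in for the kernel's per-filesystem device bookkeeping."""

    def __init__(self, num_devices):
        self.num_devices = num_devices
        self.seen = set()

    def device_ready(self, devid):
        """Mimic one 'ready' query: record devid, report whether all are present."""
        self.seen.add(devid)
        return len(self.seen) == self.num_devices


fs = BtrfsReadyModel(num_devices=10)
results = [fs.device_ready(devid) for devid in range(1, 11)]
print(results.count(False), results.count(True))  # prints "9 1"
```

In this model, once the last device has been seen, a repeated query for any earlier device also reports success.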

> x-systemd.automount,noauto,nofail,x-systemd.requires=/
>
> In fact, I'm not sure x-systemd.requires is needed because / must be
> mounted successfully to read /etc/fstab in the first place; in order
> to know to mount this file system at /srv

x-systemd.requires=/ is a NOP. The root fs is always mounted. In
systemd the unit "-.mount" that wraps it is marked with the
"perpetual" flag internally, which is set on all units that are always,
unconditionally up as long as userspace exists. Thus requiring the
root fs is useless: it's always there and implicitly required by all
userspace.

Also note: in the initrd the host root fs is at /sysroot/, not at
/. During the initrd→host transition we do a switch root, after all,
which turns /sysroot/ into /.

> Anyway I'm mainly confused why the btrfs udev rule is seemingly not
> applied in this case.

For historical reasons udev log level is independent from the rest of
systemd log level. Thus use udev.log_priority=debug to turn on udev
debug logging.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] btrfs raid not ready but systemd tries to mount it anyway

2020-10-11 Thread Andrei Borzenkov
11.10.2020 23:57, Chris Murphy wrote:
> Hi,
> 
> A Fedora 32 (systemd-245.8-2.fc32) user has a 10-drive Btrfs raid1 set
> to mount in /etc/fstab:
> 
> UUID=f89f0a16-  /srv   btrfs  defaults,nofail,x-systemd.requires=/  0 0
> 
> For some reason, systemd is trying to mount this file system before
> all ten devices are ready. Supposedly this rule applies:
> https://github.com/systemd/systemd/blob/master/rules.d/64-btrfs.rules.in
> 
> Fedora does have /usr/lib/udev/rules.d/64-btrfs.rules but I find no
> reference at all to this rule when the user boots with 'rd.udev.debug
> systemd.log_level=debug'. The entire journal is here:
> 
> https://drive.google.com/file/d/1jVHjAQ8CY9vABtM2giPTB6XeZCclm7R-/view
> 

Educated guess: the rule is missing in the initrd and you do not run udev
trigger after switch root.


> I expect a workaround would be to use mount option:
> 
> x-systemd.automount,noauto,nofail,x-systemd.requires=/
> 
> In fact, I'm not sure x-systemd.requires is needed because / must be
> mounted successfully to read /etc/fstab in the first place; in order
> to know to mount this file system at /srv
> 

That makes no sense. You cannot boot without root being present; it is
controlled outside of the normal boot sequence.

> Anyway I'm mainly confused why the btrfs udev rule is seemingly not
> applied in this case.
> 
