Bug#867368: systemd-udevd: delays boot for 2+ minutes when LVM PV is inside a LV

2020-05-31 Thread Oliver Brakmann
Hi all,

I recently ran into this problem on my KVM host after I migrated all my
VMs to use unpartitioned virtual disks for everything except the /boot disk.

A little investigation revealed that the problem was related to pvscan
processes started by the 69-lvm-metad.rules udev script.

At the end of that script there is a conditional check that results in
the two branches "systemd_background" and "direct_pvscan" (lines 92 and
93).  The error case uses the "direct_pvscan" branch and results in the
delays.  Changing the script so that the "systemd_background" branch is
always used instead (just add a 'GOTO="systemd_background"' line before
line 92) completely solves the delays for me, both on virtual and
physical systems.  I also tested online hotplug and did not notice
regressions with the change.

Reverting the change from line 475 in the debian-patches file
(https://sources.debian.org/src/lvm2/2.03.07-1/debian/patches/debian-changes/#L475)
however does not bring any improvement.

I hope this helps anyone running into this problem.

Regards
Oliver



Bug#867368: systemd-udevd: delays boot for 2+ minutes when LVM PV is inside a LV

2018-04-02 Thread Michael Biebl
Control: reassign -1 lvm2

On Thu, 6 Jul 2017 13:22:38 -0700 Rob Leslie  wrote:
> > On Jul 6, 2017, at 8:03 AM, Felipe Sateler  wrote:
> > 
> > > % systemd-analyze
> > > Startup finished in 2min 55.491s (kernel) + 13.076s (userspace) = 3min 
> > > 8.568s
> > 
> > This suggests the kernel is the one slowing things down.
> 
> I initially thought so too, but attempts to debug the kernel with e.g.
> 'initcall_debug' were not fruitful. Instead indications point to something
> happening in early initramfs userspace.
> 
> > Can you try booting with the old jessie kernel? This looks like a kernel 
> > regression.
> 
> I have confirmed that the problem still exists even when booting a stretch
> system using the most recent jessie kernel.
> 
> I have tried modifying the initramfs scripts to arrange for systemd-udevd to 
> be
> passed the '--debug' option. Here are some relevant excerpts:
> 
> > % grep 793 systemd-udevd.debug
> > seq 793 queued, 'change' 'block'
> > seq 793 running
> > seq 793 '/devices/virtual/block/dm-2' is taking a long time
> 
> > % tail -n 5 systemd-udevd.debug 
> > giving up waiting for workers to finish
> > event loop failed: Connection timed out
> > Unload module index
> > Unloaded link configuration context.
> > timeout '/sbin/lvm pvscan --cache --activate ay --major 253 --minor 2'
> 
> 
> This of course suggests the problem may actually lie with lvm2. Please feel 
> free
> to reassign if you agree, or suggest a further course of investigation.

This does look like it might be a lvm2 bug indeed, which is why I'm
going to reassign this bug now, as I think the lvm2 maintainers have a
better chance helping you debug this.
If this turns out to be a bug in udev, please reassign back.

Regards,
Michael


-- 
Why is it that all of the instruments seeking intelligent life in the
universe are pointed away from Earth?



signature.asc
Description: OpenPGP digital signature


Bug#867368: systemd-udevd: delays boot for 2+ minutes when LVM PV is inside a LV

2017-11-04 Thread Christian Schwamborn

I'm experiencing the same behavior with a lvm pv on top of a md mirror:

Nov  2 22:03:13 zero systemd-udevd[515]: seq 4167 
'/devices/virtual/block/md4' is taking a long time


pvscan
  PV /dev/sda   VG BACKUP0 lvm2 [9.10 TiB / 98.00 GiB free]
  PV /dev/md4   VG MAIN0   lvm2 [9.10 TiB / 273.93 GiB free]
  Total: 2 [18.19 TiB] / in use: 2 [18.19 TiB] / in no VG: 0 [0   ]


cat /proc/mdstat
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] 
[raid4] [raid10]


md4 : active raid1 sdc4[1] sdb4[0]
  9765778432 blocks super 1.2 [2/2] [UU]
  bitmap: 4/73 pages [16KB], 65536KB chunk

I first thought, the delay has something to do with the network setup, 
as it appears during the 'Raise network interfaces' message when systemd 
counts about two and a half minute before continuing, but I think that's 
more a coincidence.




Bug#867368: systemd-udevd: delays boot for 2+ minutes when LVM PV is inside a LV

2017-07-06 Thread Rob Leslie
> On Jul 6, 2017, at 8:03 AM, Felipe Sateler  wrote:
> 
> > % systemd-analyze
> > Startup finished in 2min 55.491s (kernel) + 13.076s (userspace) = 3min 
> > 8.568s
> 
> This suggests the kernel is the one slowing things down.

I initially thought so too, but attempts to debug the kernel with e.g.
'initcall_debug' were not fruitful. Instead indications point to something
happening in early initramfs userspace.

> Can you try booting with the old jessie kernel? This looks like a kernel 
> regression.

I have confirmed that the problem still exists even when booting a stretch
system using the most recent jessie kernel.

I have tried modifying the initramfs scripts to arrange for systemd-udevd to be
passed the '--debug' option. Here are some relevant excerpts:

> % grep 793 systemd-udevd.debug
> seq 793 queued, 'change' 'block'
> seq 793 running
> seq 793 '/devices/virtual/block/dm-2' is taking a long time

> % tail -n 5 systemd-udevd.debug 
> giving up waiting for workers to finish
> event loop failed: Connection timed out
> Unload module index
> Unloaded link configuration context.
> timeout '/sbin/lvm pvscan --cache --activate ay --major 253 --minor 2'


This of course suggests the problem may actually lie with lvm2. Please feel free
to reassign if you agree, or suggest a further course of investigation.

Thanks.

-- 
Rob Leslie
r...@mars.org



Bug#867368: systemd-udevd: delays boot for 2+ minutes when LVM PV is inside a LV

2017-07-06 Thread Felipe Sateler
On Wed, Jul 5, 2017 at 11:38 PM, Rob Leslie  wrote:

> Package: udev
> Version: 232-25
> Severity: important
> File: /lib/systemd/systemd-udevd
>
> Dear Maintainer,
>
> After upgrading to stretch I've discovered that some machines are taking a
> long time to boot -- more than two minutes -- with no immediately obvious
> cause. For example:
>
> > % systemd-analyze
> > Startup finished in 2min 55.491s (kernel) + 13.076s (userspace) = 3min
> 8.568s
>

This suggests the kernel is the one slowing things down.


>
> During most of this time, nothing appears on the console except a cursor.
> Kernel and system logs show nothing to pinpoint the issue other than long
> time gaps, e.g.:
>
> > [3.194336] clocksource: Switched to clocksource tsc
> > [  121.499692] md: linear personality registered for level -1
>
> Adding 'debug' to the kernel command line however I got this clue:
>
> > [   64.386629] systemd-udevd[88]: seq 905 '/devices/virtual/block/dm-5'
> is taking a long time
>

This is just udev complaining. Udev does not generate the devices, it waits
for the kernel to notify it.


>
> After some further investigation it appears the problem stems from having
> an
> LVM PV inside another LV. Here is how I am able to recreate the problem on
> a
> fresh stretch install:
>
> > # lvcreate -n pvtest -L 1g $vg
> > # pvcreate /dev/$vg/pvtest
>
> (where $vg is some existing volume group name)
>
> Now rebooting the system will result in the long boot delay.
>
> This sort of setup did not cause any problems in jessie. An obvious
> workaround
> is not to put PVs inside other LVs, but this is sometimes a convenient
> arrangement and I would hope it could be made to work without an excessive
> boot delay.
>


Can you try booting with the old jessie kernel? This looks like a kernel
regression.


-- 

Saludos,
Felipe Sateler


Bug#867368: systemd-udevd: delays boot for 2+ minutes when LVM PV is inside a LV

2017-07-05 Thread Rob Leslie
Package: udev
Version: 232-25
Severity: important
File: /lib/systemd/systemd-udevd

Dear Maintainer,

After upgrading to stretch I've discovered that some machines are taking a
long time to boot -- more than two minutes -- with no immediately obvious
cause. For example:

> % systemd-analyze
> Startup finished in 2min 55.491s (kernel) + 13.076s (userspace) = 3min 8.568s

During most of this time, nothing appears on the console except a cursor.
Kernel and system logs show nothing to pinpoint the issue other than long
time gaps, e.g.:

> [3.194336] clocksource: Switched to clocksource tsc
> [  121.499692] md: linear personality registered for level -1

Adding 'debug' to the kernel command line however I got this clue:

> [   64.386629] systemd-udevd[88]: seq 905 '/devices/virtual/block/dm-5' is 
> taking a long time

After some further investigation it appears the problem stems from having an
LVM PV inside another LV. Here is how I am able to recreate the problem on a
fresh stretch install:

> # lvcreate -n pvtest -L 1g $vg
> # pvcreate /dev/$vg/pvtest

(where $vg is some existing volume group name)

Now rebooting the system will result in the long boot delay.

This sort of setup did not cause any problems in jessie. An obvious workaround
is not to put PVs inside other LVs, but this is sometimes a convenient
arrangement and I would hope it could be made to work without an excessive
boot delay.

Thanks.


-- Package-specific info:

-- System Information:
Debian Release: 9.0
  APT prefers stable
  APT policy: (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 4.9.0-3-amd64 (SMP w/4 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), 
LANGUAGE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages udev depends on:
ii  adduser  3.115
ii  dpkg 1.18.24
ii  libacl1  2.2.52-3+b1
ii  libblkid12.29.2-1
ii  libc62.24-11+deb9u1
ii  libkmod2 23-2
ii  libselinux1  2.6-3+b1
ii  libudev1 232-25
ii  lsb-base 9.20161125
ii  procps   2:3.3.12-3
ii  util-linux   2.29.2-1

udev recommends no packages.

udev suggests no packages.

Versions of packages udev is related to:
ii  systemd  232-25

-- no debconf information