Public bug reported:

This is reproducible in Bionic and later.

Here's an example running 'focal':

$ lsb_release -cs
focal

$ uname -r
5.3.0-24-generic

How to trigger it:

$ sosreport -o block

or, more precisely, the command inside the block plugin that causes the situation:
$ parted -s /dev/$(losetup -f) unit s print

https://github.com/sosreport/sos/blob/master/sos/plugins/block.py#L52
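
For reference, here is a minimal way to reproduce it by hand (on this machine
the next unused device is /dev/loop2; the number will vary per system):

$ losetup -f
/dev/loop2
$ parted -s /dev/loop2 unit s print

On a freshly booted system, the parted command above produces the I/O errors.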

But if I run the same command on the loop device after the next unused one, in
this case /dev/loop3 (which is also unused), there are no errors.

While I agree that sosreport shouldn't query unused loop devices, there
is definitely something going on with the next unused loop device.

What is the difference between loop2 and loop3, or any of the other unused loop devices?

Three things I have noticed so far:
* The loop device needs to be the next unused loop device (losetup -f).
* A reboot is needed if some loop modification (snap install, loop mount, ...)
has been made at runtime.
* loop2 (or whatever the next unused device is) has non-zero stats, as opposed
to the other unused loop devices. The stats exist right after the system boots,
for the next unused device only (see the check just below).
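
A direct way to verify that third point right after boot (a small sketch;
basename just turns the /dev/loopN path returned by losetup into its sysfs
name):

$ cat /sys/block/$(basename $(losetup -f))/stat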

/sys/block/loop2/stat
::::::::::::::
2 0 10 0 1 0 0 0 0 0 0

2  = number of read I/Os processed
10 = number of sectors read 
1  = number of write I/Os processed

Explanation of each column:
https://www.kernel.org/doc/html/latest/block/stat.html

while /dev/loop3 doesn't have any:

/sys/block/loop3/stat
::::::::::::::
0 0 0 0 0 0 0 0 0 0 0
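
To compare the counters of every loop device at once, a one-liner like this
works (grep prints each file name next to its contents):

$ grep . /sys/block/loop*/stat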


Which tells me that something during the boot process most likely acquired the
next unused loop device (on purpose or not) and possibly didn't release it
cleanly.

If loop2 is generating errors, and I install a snap, the snap squashfs
will take loop2, making loop3 the next unused loop device.

If I query loop3 with 'parted' right after, there are no errors.

If I reboot and query loop3 again, then I will get an error.

To trigger the errors, the query needs to happen after a reboot, and it only
impacts the first unused loop device available (losetup -f).
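
Putting the observations together, the full sequence as I understand it
('hello-world' is only an example snap here; any new squashfs loop mount
should behave the same way):

$ losetup -f                          # -> /dev/loop2
$ parted -s /dev/loop2 unit s print   # I/O errors
$ snap install hello-world            # squashfs takes loop2
$ losetup -f                          # -> /dev/loop3
$ parted -s /dev/loop3 unit s print   # no errors
$ reboot
$ parted -s /dev/loop3 unit s print   # after reboot: I/O errors again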

This was tested with focal's systemd, which is very close to the latest
upstream code. It has been tested with the latest v5.5 kernel as well. For
now, I don't think it's a kernel problem; I'm more inclined to suspect a
userspace misbehaviour dealing with loop devices at boot.

** Affects: systemd (Ubuntu)
     Importance: Undecided
         Status: New

** Affects: udev (Ubuntu)
     Importance: Undecided
         Status: New


** Tags: sts

https://bugs.launchpad.net/bugs/1856871

Title:
  i/o error if next unused loop device is queried
