** Also affects: linux (Ubuntu Focal)
   Importance: Critical
     Assignee: Colin Ian King (colin-king)
       Status: Confirmed

** Also affects: linux-hwe (Ubuntu Focal)
   Importance: Undecided
       Status: Invalid

** Also affects: linux (Ubuntu Eoan)
   Importance: Undecided
       Status: New

** Also affects: linux-hwe (Ubuntu Eoan)
   Importance: Undecided
       Status: New

** Also affects: linux (Ubuntu Disco)
   Importance: Undecided
       Status: New

** Also affects: linux-hwe (Ubuntu Disco)
   Importance: Undecided
       Status: New

** No longer affects: linux-hwe (Ubuntu Focal)

** No longer affects: linux-hwe (Ubuntu Eoan)

** No longer affects: linux-hwe (Ubuntu Disco)

** Changed in: linux (Ubuntu Focal)
       Status: Confirmed => In Progress

** Changed in: linux-hwe (Ubuntu Bionic)
       Status: Confirmed => In Progress

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1824407

Title:
  remount of multilower moved pivoted-root overlayfs root, results in
  I/O errors on some modified files

Status in linux package in Ubuntu:
  In Progress
Status in linux-hwe package in Ubuntu:
  Invalid
Status in linux-hwe source package in Bionic:
  In Progress
Status in linux source package in Disco:
  New
Status in linux source package in Eoan:
  New
Status in linux source package in Focal:
  In Progress

Bug description:
  == SRU Justification Disco, Eoan, Focal ==

  Multiple squashfs filesystems layered with overlayfs cause file
  corruption when zero-sized files are modified.

  == Fix ==

  The current fix is pending in
  https://github.com/amir73il/linux/commit/b2d4f0ea5af42e16e154254de99da064f3ac551a

  == Test case ==

  With an Ubuntu ISO on the cdrom drive, use:

  #!/bin/bash -x
  mkdir -p /cdrom
  mount -t iso9660 -o ro,noatime /dev/sr0 /cdrom
  sleep 1
  mkdir -p /cow
  mount -t tmpfs -o 'rw,noatime,mode=755' tmpfs /cow
  sleep 1
  mkdir -p /cow/upper
  mkdir -p /cow/work
  modprobe -q -b overlay
  sleep 1
  modprobe -q -b loop
  sleep 1
  dev=$(losetup -f)
  mkdir -p /filesystem.squashfs
  losetup $dev /cdrom/casper/filesystem.squashfs
  mount -t squashfs -o ro,noatime $dev /filesystem.squashfs
  sleep 1

  dev=$(losetup -f)
  mkdir -p /installer.squashfs
  losetup $dev /cdrom/casper/installer.squashfs
  mount -t squashfs -o ro,noatime $dev /installer.squashfs
  sleep 1

  mkdir -p /root-tmp
  mount -t overlay -o \
    'upperdir=/cow/upper,lowerdir=/installer.squashfs:/filesystem.squashfs,workdir=/cow/work' \
    /cow /root-tmp

  FILE=/root-tmp/etc/.pwd.lock

  echo foo > $FILE
  cat $FILE
  sync
  #
  # dropping caches or remounting causes the bug
  #
  echo 3 > /proc/sys/vm/drop_caches
  cat $FILE

  Without the fix, the cat of the file will produce an error. With the
  fix, the cat will work correctly.
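
  The pass/fail criterion of the test case can be sketched as a small
  helper. This is only an illustration: verify_readback is a hypothetical
  name, not part of the original reproducer, and it would be called with
  the file and the expected content ("foo") after the drop_caches step.

```shell
#!/bin/bash
# Hypothetical helper (not in the original test case): succeed only if
# FILE can be read and its content equals EXPECTED.
verify_readback() {
    local file="$1"
    local expected="$2"
    local actual

    # On an affected kernel, cat is expected to fail here with EIO.
    if ! actual=$(cat -- "$file" 2>/dev/null); then
        echo "FAIL: cannot read $file"
        return 1
    fi
    if [ "$actual" != "$expected" ]; then
        echo "FAIL: $file contains unexpected data"
        return 1
    fi
    echo "PASS: $file"
}
```

  For example, "verify_readback /root-tmp/etc/.pwd.lock foo" after the
  final cat would turn the manual check into an exit status.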

  == Regression Potential ==

  There is an unhandled corner case:
      - two filesystems, A and B, both have null uuid
      - upper layer is on A
      - lower layer 1 is also on A
      - lower layer 2 is on B

  However, this corner case is already broken without the fix, and it
  will be addressed by subsequent fixes once they are accepted upstream.
  I think the risk is minimal, considering nobody is complaining about
  these corner cases with the current broken overlayfs squashfs layering.
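
  One part of the corner case above (a lower layer living on the same
  filesystem as the upper layer) can be checked from userspace. The
  following is a rough sketch, assuming GNU stat; lowers_on_upper_fs is a
  hypothetical name, comparing st_dev is only an approximation of what
  the kernel does, and the "null uuid" condition is not checked at all.

```shell
#!/bin/bash
# Hypothetical helper: print every lower directory that is on the same
# filesystem (same st_dev) as the upper directory.
# Usage: lowers_on_upper_fs UPPERDIR LOWERDIR...
lowers_on_upper_fs() {
    local upper="$1"
    local upper_dev
    local lower

    upper_dev=$(stat -c %d -- "$upper") || return 1
    shift
    for lower in "$@"; do
        # Same device number is taken to mean "same filesystem" here.
        if [ "$(stat -c %d -- "$lower" 2>/dev/null)" = "$upper_dev" ]; then
            printf '%s\n' "$lower"
        fi
    done
}
```

  With the layout from the test case, "lowers_on_upper_fs /cow/upper
  /installer.squashfs /filesystem.squashfs" should print nothing, since
  the upper layer is on tmpfs and both lower layers are separate
  squashfs mounts.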

  -----------------------

  1) Download focal subiquity pending image, or eoan release image
  2) Boot, press ESC, and edit the boot command line (F6 in BIOS, e in UEFI)
  3) After --- insert the following options

     break=top debug init=/bin/bash

  4) Continue boot (Enter in BIOS, ctrl+x in UEFI)
  5) In the initramfs, execute:

      rm /scripts/casper-bottom/25adduser
      exit

  6) You will be dropped into the pivoted root filesystem, before systemd
  is execed as pid one.
  7) /run/initramfs/ will contain a debug log showing how everything was
  mounted, i.e. the cdrom is mounted, the squashfs images are set up with
  losetup from there, the multilower overlay is set up from them and moved
  to /root, and finally pivot-root to /root is done so that it ends up
  as /. Underlying layers are moved into /cow for your convenience.

  8) At this point, modifying zero-byte length files that exist in the
  lowest layer but not the middle one, in certain ways, will result in
  them being corrupted after / is remounted.

  9) Corruption examples

  (On both focal & eoan)

  cat /etc/.pwd.lock
  systemd-sysusers
  cat /etc/.pwd.lock
  mount -o remount /
  cat /etc/.pwd.lock
  overlayfs: invalid origin (etc/.pwd.lock, ftype=8000, origin ftype=4000)
  cat: /etc/.pwd.lock: Input/output error

  (Only on eoan)

  cat /etc/machine-id
  systemd-machine-id-setup
  cat /etc/machine-id
  mount -o remount /
  cat /etc/machine-id
  overlayfs: invalid origin (etc/machine-id, ftype=8000, origin ftype=4000)
  cat: /etc/machine-id: Input/output error

  Lots of things break once machine-id and .pwd.lock are corrupted. For
  example, it becomes impossible to dhcp, connect to dbus,
  add/remove/change users or groups, etc.
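
  The files at risk, as described in step 8 (zero-byte files present in
  the lowest layer but absent from the middle one), can be listed with a
  sketch like the following. find_zero_byte_candidates is a hypothetical
  name, and the two directory arguments stand for wherever the lowest
  and middle layers are mounted; the exact paths under /cow are not
  given in this report.

```shell
#!/bin/bash
# Hypothetical helper: list zero-byte regular files that exist in the
# LOWEST layer but have no entry at the same path in the MIDDLE layer.
# Usage: find_zero_byte_candidates LOWEST_DIR MIDDLE_DIR
find_zero_byte_candidates() {
    local lowest="$1"
    local middle="$2"
    local rel

    # -size 0 matches only empty files (sizes are rounded up to blocks).
    ( cd "$lowest" && find . -type f -size 0 ) | while IFS= read -r rel; do
        rel=${rel#./}
        # Skip paths that the middle layer also provides.
        if [ ! -e "$middle/$rel" ] && [ ! -L "$middle/$rel" ]; then
            printf '%s\n' "$rel"
        fi
    done
}
```

  The output is a list of paths relative to the layer root; on the
  affected images one would expect etc/.pwd.lock to be among them.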

  We were unable to recreate the issue outside of booting things with
  casper, i.e. statically on a regular host machine without pivot-root.
  But hopefully booting to a quiet state with nothing running is
  sufficient to reproduce this.

  Instead of booting with `bebroken init=/bin/bash` you can boot with
  `bebroken systemd.mask=systemd-remount-fs.service`. This will complete
  the boot with /etc/machine-id & .pwd.lock modified, meaning that a
  remount of / will cause I/O errors on those files.

  Currently, we are shipping two hacks in casper's 25adduser script to
  "rm" the offending files and create them again on the upper rw layer.
  They then survive a remount without I/O errors. However, we'd rather
  not ship those hacks, and would instead have the kernel overlay fixed
  to work correctly with multi-lower-dir and not corrupt files upon
  remounting /.
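
  The idea behind that workaround (remove the file, then create it again
  so that it exists purely on the upper layer) can be sketched as below.
  This is not the actual code from casper's 25adduser; recreate_on_upper
  is a hypothetical name, and the sketch assumes GNU mktemp and cp.

```shell
#!/bin/bash
# Hypothetical sketch of the "rm and recreate" workaround: replace TARGET
# with a freshly created copy, keeping its content and mode.
recreate_on_upper() {
    local target="$1"
    local tmp

    [ -f "$target" ] || return 0    # nothing to do
    # The temporary file is created new, next to the target.
    tmp=$(mktemp "${target}.XXXXXX") || return 1
    if ! cp -p -- "$target" "$tmp"; then
        rm -f -- "$tmp"
        return 1
    fi
    # Remove the original, then move the new file into place.
    rm -f -- "$target" && mv -- "$tmp" "$target"
}
```

  It would be run for each offending file, e.g. "recreate_on_upper
  /etc/machine-id", before / is remounted.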
