There was an earlier attempt to mimic the behavior of NVMe/SCSI devices
removed while holding a mounted filesystem:
lore.kernel.org/linux-block/b39b96ea-2540-a407-2232-1af91e3e6...@canonical.com/T/#u

It was quite complex, relying on forcing an unmount of the filesystem,
stopping the writeback threads and removing the md block device. It
received few reviews, most of them unfavorable, hence we proposed a
simpler approach, hereby SRUed.

The aforementioned RFC cover letter details the issue at length, so it
may be an interesting read for parties interested in this issue.

Thanks,


Guilherme

** Summary changed:

- md raid0/linear don't show error state if an array member is removed
+ md raid0/linear don't show error state if an array member is removed and 
allows successful writes

** Summary changed:

- md raid0/linear don't show error state if an array member is removed and 
allows successful writes
+ md raid0/linear doesn't show error state if an array member is removed and 
allows successful writes

** Description changed:

- TBD - just a LP number now for SRU, will elaborate the description soon
+ [Impact]
+ 
+ * Currently, mounted raid0/md-linear arrays give no indication/warning
+ when one or more members are removed or hit some non-recoverable error
+ condition.
+ 
+ * Given that, arrays stay mounted, and data regularly written to them
+ goes through the page cache and appears to be successfully written to
+ the devices, even though the writeback threads cannot write it out.
+ For users, this can potentially cause data corruption, given that even
+ the "sync" command will return success although the data is never
+ written to disk. Kernel messages will show I/O errors, though.
+ 
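+ A quick way to observe this pre-patch behavior (a sketch; the mount
+ point and sizes are examples):
+ 
+   # on an array that has already lost a member:
+   dd if=/dev/zero of=/mnt/md0/file bs=1M count=16  # succeeds
+   sync                                             # also returns 0
+   dmesg | tail                                     # yet shows I/O errors
+ 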
+ * The patch proposed in this SRU addresses the issue on two levels:
+ first, it fast-fails write I/Os to raid0/md-linear array devices with
+ one or more failed members. Second, it introduces the "broken" state,
+ which is analogous to "clean" but indicates that the array is not in a
+ good/correct state. A message shown in dmesg helps to clarify when such
+ an array gets a member removed/failed.
+ 
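+ With the patch applied, the new state should also be visible via the
+ md sysfs interface (a sketch; md0 is an example name and the exact
+ output may vary):
+ 
+   cat /sys/block/md0/md/array_state
+   # expected to report "broken" once a member is gone
+   dd if=/dev/zero of=/dev/md0 bs=4k count=1 oflag=direct
+   # expected to now fail fast with an I/O error
+ 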
+ * The commit proposed here, available in Linus' tree as 62f7b1989c02
+ ("md raid0/linear: Mark array as 'broken' and fail BIOs if a member is
+ gone") [http://git.kernel.org/linus/62f7b1989c02], was thoroughly
+ discussed upstream and received a good amount of review/analysis from
+ both the current md maintainer and a former maintainer.
+ 
+ * One important note here is that this patch requires a counter-part
+ in the mdadm tool to be fully functional, which was SRUed in
+ LP: #1847924. It works fine without this counter-part, but for broken
+ arrays the "mdadm --detail" command won't show "broken", and will
+ instead show "clean, FAILED".
+ 
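+ For illustration, the difference appears in the state reported by
+ mdadm (abbreviated, illustrative output):
+ 
+   mdadm --detail /dev/md0 | grep State
+   # with the mdadm counter-part:  State : broken
+   # without it:                   State : clean, FAILED
+ 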
+ * We ask hereby an exception from kernel team to have this backported to
+ kernel 4.15 *only in Bionic* and not in Xenial. The reason is that mdadm
+ code changed too much and we didn't want to introduce a potential
+ regression in the Xenial version from that tool, so we only backported
+ the mdadm counter-part of this patch to Bionic, Disco and Eoan - hence,
+ we'd like to have a match in the kernel backported versions.
+ 
+ [Test case]
+ 
+ * To test this patch, create a raid0 or linear md array on Linux using
+ mdadm, e.g.: "mdadm --create md0 --level=0 --raid-devices=2
+ /dev/nvme0n1 /dev/nvme1n1";
+  
+ * Format the array using a filesystem of your choice (for example
+ ext4) and mount the array;
+ 
+ * Remove one member of the array, for example via the sysfs interface
+ (for nvme: echo 1 > /sys/block/nvme0nX/device/device/remove; for scsi:
+ echo 1 > /sys/block/sdX/device/delete);
+ 
+ * Without this patch, the array partition can still be written
+ successfully, and "mdadm --detail" will show a clean state. A
+ consolidated reproducer is sketched below.
+ 
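+ The whole sequence as a sketch (device names are examples, and the
+ commands are destructive to the listed devices):
+ 
+   mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/nvme0n1 /dev/nvme1n1
+   mkfs.ext4 /dev/md0
+   mount /dev/md0 /mnt
+   echo 1 > /sys/block/nvme0n1/device/device/remove    # drop one member
+   dd if=/dev/zero of=/mnt/file bs=1M count=8 && sync  # succeeds pre-patch
+   mdadm --detail /dev/md0                             # shows "clean" pre-patch
+ 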
+ [Regression potential]
+ 
+ * There's not much regression potential here; we fail write I/Os to
+ bad arrays and show a message/status accordingly, exposing the array's
+ broken state. We believe the most common "issue" that could be reported
+ against this patch is a userspace tool relying on the success of I/O
+ writes or on the "clean" state of an array - after this patch, it can
+ potentially observe a different behavior for a broken array.

** Also affects: linux (Ubuntu Bionic)
   Importance: Undecided
       Status: New

** Also affects: linux (Ubuntu Eoan)
   Importance: High
     Assignee: Guilherme G. Piccoli (gpiccoli)
       Status: Confirmed

** Also affects: linux (Ubuntu Disco)
   Importance: Undecided
       Status: New

** Changed in: linux (Ubuntu Bionic)
       Status: New => Confirmed

** Changed in: linux (Ubuntu Disco)
       Status: New => Confirmed

** Changed in: linux (Ubuntu Disco)
   Importance: Undecided => High

** Changed in: linux (Ubuntu Bionic)
   Importance: Undecided => High

** Changed in: linux (Ubuntu Bionic)
     Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

** Changed in: linux (Ubuntu Disco)
     Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)
