@Phillip Susi / comment #23: Did you actually read what I wrote? :)

I was *NOT* advocating "backup" by having multiple RAID disks constantly 
connected to the array and in sync. It is completely obvious to me that a hot 
running copy of data is NOT a backup.
I was advocating the following procedure:
1. Connect the disk.
2. Wait until it is synced into the array.
3. Shut down the machine.
4. *DISCONNECT* the disk from the machine and consider the completely offline 
disk as a backup.
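
As a rough sketch, steps 1 to 3 might look like this with mdadm. The device names are placeholders, and `run` and `offline_backup` are helper names made up for illustration. The script only prints the commands unless DRY_RUN=0 is set. It assumes the array has a free slot for the new disk; if the array is already complete, the added disk would only become a spare.

```shell
#!/bin/sh
# Sketch only: prints the commands unless DRY_RUN=0 is set.
# "run" and "offline_backup" are hypothetical helper names.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "$*"
    fi
}

offline_backup() {
    md=$1      # the running array, e.g. /dev/md0 (placeholder)
    disk=$2    # the backup member, e.g. /dev/sdc1 (placeholder)

    run mdadm "$md" --add "$disk"   # step 1: the disk is connected, add it
    run mdadm --wait "$md"          # step 2: wait for the resync to finish
    run shutdown -h now             # step 3: shut down the machine
    # Step 4 is manual: physically disconnect the disk.
}
```

Calling `offline_backup /dev/md0 /dev/sdc1` in the default dry-run mode just lists the three commands, which makes it easy to check them before running anything for real.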

This is a backup because the disk is physically disconnected from the machine.
It is much better than rsync/cp, because it provides a *coherent* copy since 
all modifications to the data which happen during the copying process are also 
applied on the backup. With rsync/cp, files which are modified *after* they 
have already been copied are not up to date in the backup, and for applications 
which store data in multiple files (which *many* programs do), their data would 
be corrupt in such a case.

It is relevant to this bugtracker entry because it shows that using multiple 
kinds of disks in a RAID1, such as SATA+USB, is a common desire and not some 
exotic corner case.
Or can you name any other non-exotic, non-beta backup mechanism (btrfs, for 
example, is still beta) which can copy data while the system is in use without 
breaking its coherency? :)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/320638

Title:
  hot-add/remove in mixed (IDE/SATA/USB/SD-card/...)  RAIDs with device
  mapper on top => data corruption (bio too big device md0 (248 > 240))

Status in The Linux Kernel:
  Confirmed
Status in mdadm - Tool for managing linux software RAID arrays.:
  Confirmed
Status in debian-installer package in Ubuntu:
  Invalid
Status in linux package in Ubuntu:
  Won't Fix
Status in mdadm package in Ubuntu:
  Confirmed
Status in ubiquity package in Ubuntu:
  Invalid

Bug description:
  Problem: md changes the max_sector setting of an already running and busy
  md device when a (hotpluggable) device is added or removed. However,
  the device mapper and filesystem layers on top of the raid cannot
  (always?) cope with that.

  Observations:
  * "bio too big device mdX (248 > 240)" messages in the syslog
  * read/write errors (some dropped silently, no noticeable errors reported
    during operation, until things like dhcpclient loses its IP etc.)

  Expected:
  Adding members to and removing them from running raids (hotplugging) should
  not change the raid device characteristics. If the new member supports only
  smaller max_sector values, buffer and split the data stream until the raid
  device can be set up from a clean state with a more appropriate max_sector
  value. To avoid buffering and splitting in the future, md could save the
  smallest max_sector value of the known members in the superblock, and use
  that when setting up the raid even if that member is not present.
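
A minimal userspace sketch of that comparison follows. It assumes the relevant limit is the one the kernel exposes as /sys/block/<disk>/queue/max_sectors_kb; md itself may consult a different internal limit. The function names and the SYSFS_ROOT override are hypothetical and exist only so the logic can be tried without real disks.

```shell
#!/bin/sh
# Sketch: would adding a disk lower the smallest request-size limit?
# SYSFS_ROOT is overridable only for experimenting without real hardware.
SYSFS_ROOT=${SYSFS_ROOT:-/sys}

# Print max_sectors_kb for a whole-disk name such as "sda".
limit_kb() {
    cat "$SYSFS_ROOT/block/$1/queue/max_sectors_kb"
}

# Print the smallest limit among the given disks.
min_limit_kb() {
    min=
    for d in "$@"; do
        v=$(limit_kb "$d") || return 1
        if [ -z "$min" ] || [ "$v" -lt "$min" ]; then
            min=$v
        fi
    done
    echo "$min"
}

# Succeed if the candidate's limit is not below that of the current members.
# usage: safe_to_add CANDIDATE MEMBER...
safe_to_add() {
    cand=$1
    shift
    [ "$(limit_kb "$cand")" -ge "$(min_limit_kb "$@")" ]
}
```

For example, `safe_to_add sdc sda sdb` would succeed only if sdc accepts requests at least as large as the smaller of sda and sdb. Partitions have no queue directory of their own, so the whole-disk name has to be passed.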

  Note: This is reproducible in much more common scenarios than the one the
  original reporter had (e.g. --add a USB (3.0 these days) drive to an
  already running SATA raid1 and grow the number of devices).

  Fix:
  Upstream has no formal bug tracking, only a mailing list. The response was
  that ultimately this needs to be "fixed [outside of mdadm] by cleaning up the
  bio path so that big bios are split by the device that needs the split, not
  by the fs sending the bio."

  However, in the meantime mdadm needs to safeguard against the data
  corruption:

  > > [The mdadm] fix is to reject the added device [if] its limits are
  > > too low.
  > 
  > Good idea to avoid the data corruption. MD could save the
  > max_sectors default limit for arrays. If the array is modified and the new
  > limit gets smaller, postpone the sync until the next assemble/restart.
  >
  > And of course print a message if postponing, that explains when --force
  > would be safe.
  > Whatever that would be: no block device abstraction layer (device mapper,
  > lvm, luks,...) between an unmounted? ext, fat?, ...? filesystem and md?

  As upstream does not do public bug tracking, it remains uncertain whether
  this need will be tracked and remembered.

  
  ---

  This is on an MSI Wind U100 and I've got the following stack running:
  HDD & SD card (USB card reader) -> RAID1 -> LUKS -> LVM -> Reiser

  Whenever I remove the HDD from the RAID1 for power-saving reasons
  > mdadm /dev/md0 --fail /dev/sda2
  > mdadm /dev/md0 --remove /dev/sda2
  I cannot run any apt-related tools.

  > sudo apt-get update
  [...]
  Hit http://de.archive.ubuntu.com intrepid-updates/multiverse Sources
  Reading package lists... Error!
  E: Read error - read (5 Input/output error)
  E: The package lists or status file could not be parsed or opened.

  A look at the kernel log shows (with many more such lines above):
  > dmesg|tail
  [ 9479.330550] bio too big device md0 (248 > 240)
  [ 9479.331375] bio too big device md0 (248 > 240)
  [ 9479.332182] bio too big device md0 (248 > 240)
  [ 9611.980294] bio too big device md0 (248 > 240)
  [ 9742.929761] bio too big device md0 (248 > 240)
  [ 9852.932001] bio too big device md0 (248 > 240)
  [ 9852.935395] bio too big device md0 (248 > 240)
  [ 9852.938064] bio too big device md0 (248 > 240)
  [ 9853.081046] bio too big device md0 (248 > 240)
  [ 9853.081688] bio too big device md0 (248 > 240)
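
To put numbers like these into perspective, a small filter along the following lines could summarise the messages. It assumes the two values in parentheses are counts of 512-byte sectors, so 248 and 240 would correspond to 124 KiB and 120 KiB. The function name is made up for illustration.

```shell
#!/bin/sh
# Sketch: summarise "bio too big" lines read from stdin.
# Assumes both numbers in the message are counts of 512-byte sectors.
bio_too_big_summary() {
    sed -n 's/.*bio too big device \([^ ]*\) (\([0-9]*\) > \([0-9]*\)).*/\1 \2 \3/p' |
    sort | uniq -c |
    while read -r count dev got max; do
        # Two 512-byte sectors make one KiB.
        echo "$dev: $count x bio of $(($got / 2)) KiB, limit $(($max / 2)) KiB"
    done
}
```

It would be used as `dmesg | bio_too_big_summary`, printing one line per distinct device and size pair.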

  $ sudo mdadm --detail /dev/md0
  /dev/md0:
          Version : 00.90
    Creation Time : Tue Jan 13 11:25:57 2009
       Raid Level : raid1
       Array Size : 3871552 (3.69 GiB 3.96 GB)
    Used Dev Size : 3871552 (3.69 GiB 3.96 GB)
     Raid Devices : 2
    Total Devices : 1
  Preferred Minor : 0
      Persistence : Superblock is persistent

    Intent Bitmap : Internal

      Update Time : Fri Jan 23 21:47:35 2009
            State : active, degraded
   Active Devices : 1
  Working Devices : 1
   Failed Devices : 0
    Spare Devices : 0

             UUID : 89863068:bc52a0c0:44a5346e:9d69deca (local to host m-twain)
           Events : 0.8767

      Number   Major   Minor   RaidDevice State
         0       0        0        0      removed
         1       8       17        1      active sync writemostly   /dev/sdb1

  $ sudo ubuntu-bug -p linux-meta
  dpkg-query: failed in buffer_read(fd): copy info file `/var/lib/dpkg/status': Input/output error
  dpkg-query: failed in buffer_read(fd): copy info file `/var/lib/dpkg/status': Input/output error
  [...]

  Will provide separate attachments.
