** Description changed:

  Problem: md changes the max_sector setting of an already running and busy
  md device when a (hot-pluggable) device is added or removed. However, the
  device mapper and filesystem layers on top of the raid cannot (always?)
  cope with that.
  
  Observations:
  * "bio too big device mdX (248 > 240)" messages in the syslog
  * read/write errors (some dropped silently, with no noticeable errors
    reported during operation, until something like dhcpclient loses its IP,
    etc.)
  
  Expected:
  Adding members to, and removing them from, running raids (hotplugging)
  should not change the raid device characteristics. If the new member
  supports only smaller max_sector values, md should buffer and split the
  data stream until the raid device can be set up from a clean state with a
  more appropriate max_sector value. To avoid buffering and splitting in the
  future, md could save the smallest max_sector value of the known members
  in the superblock, and use that when setting up the raid even if that
  member is not present.
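
The proposal to remember the smallest max_sector value can be sketched in a few lines. This is only a user-space illustration, not md's actual code: the function name `array_max_sectors` and the member names are hypothetical, and the values 248 and 240 are borrowed from the syslog message quoted under Observations without knowing which member reported which.

```python
def array_max_sectors(member_limits):
    """Return the largest request size (in sectors) that every known member accepts."""
    if not member_limits:
        raise ValueError("at least one member limit is required")
    return min(member_limits.values())


# Hypothetical per-member limits, as they might be recorded in the superblock.
known_members = {"member_a": 248, "member_b": 240}

# The array is set up with this limit even while one of the members is absent,
# so re-adding that member later does not change the limit of the running array.
limit = array_max_sectors(known_members)
print(limit)  # prints 240
```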
  
- Note: This is reproducible in much more common scenarios than the one the
- original reporter had (e.g. --add a USB (3.0 these days) drive to an
- already running SATA raid1 and grow the number of devices).
+ Note: This is reproducible in much more common scenarios than the one the
+ original reporter had (e.g. --add a USB (3.0 these days) drive to an
+ already running SATA raid1 and grow the number of devices).
+ 
+ Fix:
+ Upstream has no formal bug tracker, only a mailing list. The response
+ there was that ultimately this needs to be "fixed [outside of mdadm] by
+ cleaning up the bio path so that big bios are split by the device that
+ needs the split, not by the fs sending the bio."
+ 
+ However, in the meantime mdadm needs to safeguard against the data
+ corruption:
+ 
+ > > [The mdadm] fix is to reject the added device [if] its limits are
+ > > too low.
+ > 
+ > Good idea to avoid the data corruption. MD could save the
+ > max_sectors default limit for arrays. If the array is modified and the
+ > new limit gets smaller, postpone the sync until the next
+ > assemble/restart.
+ > 
+ > And of course print a message when postponing that explains when
+ > --force would be safe.
+ > Whatever that would be: no block device abstraction layer (device
+ > mapper, lvm, luks, ...)
+ > between an unmounted? ext, fat?, ...? filesystem and md?
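
A rough user-space sketch of the "reject the added device if its limits are too low" check follows. It is not what mdadm does today. It assumes the limits can be read from the sysfs attribute `queue/max_sectors_kb`, which exists for whole disks and md devices but not for partitions; whether that attribute or `max_hw_sectors_kb` is the decisive one may depend on the kernel version. The function names and the `sysfs_root` parameter are hypothetical.

```python
from pathlib import Path


def max_sectors_kb(disk, sysfs_root="/sys"):
    """Read the request size limit (in KiB) of a whole-disk block device."""
    path = Path(sysfs_root) / "block" / disk / "queue" / "max_sectors_kb"
    return int(path.read_text().strip())


def safe_to_hot_add(array, candidate_disk, sysfs_root="/sys"):
    """Return True if the candidate's limit is not below the running array's limit."""
    candidate = max_sectors_kb(candidate_disk, sysfs_root)
    current = max_sectors_kb(array, sysfs_root)
    return candidate >= current
```

For example, `safe_to_hot_add("md0", "sdb")` could be consulted before running `mdadm /dev/md0 --add` on a partition of `sdb`, and the add refused or postponed when it returns False.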
+ 
+ As upstream does not do public bug tracking, it remains uncertain
+ whether this issue will be tracked and remembered there.
+ 
+ 
  ---
  
  This is on an MSI Wind U100 and I've got the following stack running:
  HDD & SD card (USB card reader) -> RAID1 -> LUKS -> LVM -> Reiser
  
  Whenever I remove the HDD from the RAID1
  > mdadm /dev/md0 --fail /dev/sda2
  > mdadm /dev/md0 --remove /dev/sda2
  for power-saving reasons, I cannot run any apt-related tools.
  
  > sudo apt-get update
  [...]
  Hit http://de.archive.ubuntu.com intrepid-updates/multiverse Sources
  Reading package lists... Error!
  E: Read error - read (5 Input/output error)
  E: The package lists or status file could not be parsed or opened.
  
  The kernel log shows the following (with many more such lines above):
  > dmesg|tail
  [ 9479.330550] bio too big device md0 (248 > 240)
  [ 9479.331375] bio too big device md0 (248 > 240)
  [ 9479.332182] bio too big device md0 (248 > 240)
  [ 9611.980294] bio too big device md0 (248 > 240)
  [ 9742.929761] bio too big device md0 (248 > 240)
  [ 9852.932001] bio too big device md0 (248 > 240)
  [ 9852.935395] bio too big device md0 (248 > 240)
  [ 9852.938064] bio too big device md0 (248 > 240)
  [ 9853.081046] bio too big device md0 (248 > 240)
  [ 9853.081688] bio too big device md0 (248 > 240)
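
To make the numbers in these messages easier to read, the lines can be parsed and converted to KiB. The sketch below assumes the two values are counted in 512-byte sectors, which is the usual block layer unit but is not stated in the log itself; the helper name `parse_bio_too_big` is hypothetical.

```python
import re

# Matches messages such as "bio too big device md0 (248 > 240)".
BIO_TOO_BIG = re.compile(r"bio too big device (\S+) \((\d+) > (\d+)\)")
SECTOR_BYTES = 512  # assumption: sizes are given in 512-byte sectors


def parse_bio_too_big(line):
    """Return (device, requested KiB, limit KiB), or None if the line does not match."""
    match = BIO_TOO_BIG.search(line)
    if match is None:
        return None
    device = match.group(1)
    requested_kib = int(match.group(2)) * SECTOR_BYTES // 1024
    limit_kib = int(match.group(3)) * SECTOR_BYTES // 1024
    return device, requested_kib, limit_kib


print(parse_bio_too_big("[ 9479.330550] bio too big device md0 (248 > 240)"))
# prints ('md0', 124, 120)
```

Under that assumption, the layers above md0 were still sending 124 KiB requests while the array accepted at most 120 KiB.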
  
  $ sudo mdadm --detail /dev/md0
  /dev/md0:
          Version : 00.90
    Creation Time : Tue Jan 13 11:25:57 2009
       Raid Level : raid1
       Array Size : 3871552 (3.69 GiB 3.96 GB)
    Used Dev Size : 3871552 (3.69 GiB 3.96 GB)
     Raid Devices : 2
    Total Devices : 1
  Preferred Minor : 0
      Persistence : Superblock is persistent
  
    Intent Bitmap : Internal
  
      Update Time : Fri Jan 23 21:47:35 2009
            State : active, degraded
   Active Devices : 1
  Working Devices : 1
   Failed Devices : 0
    Spare Devices : 0
  
             UUID : 89863068:bc52a0c0:44a5346e:9d69deca (local to host m-twain)
           Events : 0.8767
  
      Number   Major   Minor   RaidDevice State
         0       0        0        0      removed
         1       8       17        1      active sync writemostly   /dev/sdb1
  
  $ sudo ubuntu-bug -p linux-meta
  dpkg-query: failed in buffer_read(fd): copy info file `/var/lib/dpkg/status': Input/output error
  dpkg-query: failed in buffer_read(fd): copy info file `/var/lib/dpkg/status': Input/output error
  [...]
  
  Will provide separate attachments.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/320638

Title:
  hot-add/remove in mixed (IDE/SATA/USB/SD-card/...)  RAIDs with device
  mapper on top => data corruption (bio too big device md0 (248 > 240))
