I finally got back to this bug and figured out what was going on.  It
took a while ...

A few people suggested that what was happening was that the partitioner
was creating partitions that extended beyond the end of the disk.  That
wasn't actually quite right if you looked at the logs in detail and did
the arithmetic; they were entirely within the disk, just extending onto
the last (incomplete) cylinder, and there's nothing wrong with that in
itself.  However, there were log messages indicating that the md layer
in the kernel thought that an md device was overflowing the disk, and
this pointed me in the right direction.

When I tried to fix this bug before, I observed that what was happening
was that mdadm was getting confused between /dev/sda and /dev/sda1 (or
whatever the last partition happened to be).  Since the 0.90 metadata
format stores the superblock at the end of the device, there's obvious
potential for confusion between a partition extending all the way to the
end of the disk and the disk device itself.  I fixed this, or so I
thought, by constraining the installer's partitioner to never use the
last sector of the disk.  This fixed the problem in my tests.

Unfortunately, I apparently didn't quite do enough research on exactly
what was happening.  When I came back to this bug, I read the md(4)
manual page, and found this:

  The common format - known as version 0.90 - has a superblock that is 4K
  long and is written into a 64K aligned block that starts at least 64K
  and less than 128K from the end of the device (i.e. to get the address
  of the superblock round the size of the device down to a multiple of
  64K and then subtract 64K).

(The 1.0 superblock format is similar, but is never more than 12K from
the end of the device, so a fix for 0.90 will fix 1.0 too.  1.1 and 1.2
store their superblocks at or near the start of the device, and do not
suffer from this problem.)
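
To make the md(4) rule quoted above concrete, here is a minimal sketch in
Python.  The function name and constant are my own for illustration, not
anything from mdadm or the kernel; it only applies the "round down to a
multiple of 64K, then subtract 64K" rule to a device size in bytes.

```python
SUPERBLOCK_ALIGN = 64 * 1024  # 0.90 superblocks live in a 64K-aligned block


def superblock_offset_090(device_size: int) -> int:
    """Byte offset of the 0.90 superblock, per the md(4) text quoted above.

    Hypothetical helper for illustration only.
    """
    if device_size < SUPERBLOCK_ALIGN:
        raise ValueError("device is too small to hold a 0.90 superblock")
    # Round the device size down to a multiple of 64K ...
    rounded = (device_size // SUPERBLOCK_ALIGN) * SUPERBLOCK_ALIGN
    # ... and then subtract 64K.
    return rounded - SUPERBLOCK_ALIGN


if __name__ == "__main__":
    size = 500107862016  # the 500GB disk size reported in the logs
    offset = superblock_offset_090(size)
    print(offset, size - offset)
```

For the disk size from the logs, the distance from the superblock to the
end of the device should come out between 64K and 128K, as the manual
page says.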

So, if you do the arithmetic based on partman's current constraints,
the result is that Ubuntu will currently get this wrong for any disk
whose size is an exact multiple of 1048576 bytes (1 MiB) plus a
remainder of between 512 and 65536 bytes.  The 500GB disks common among
commenters on this bug
report are, according to the logs, 500107862016 bytes long, which is
476940 * 1048576 + 24576.  I could never reproduce this in KVM before
because my habit is to create disk images which are an exact number of
megabytes (I usually just say '10G' or thereabouts), and such an image
would never encounter this bug thanks to my previous attempted fix of
avoiding the last sector.
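
The decomposition of the disk size above is easy to check.  This is just
a sketch of the arithmetic; the function name is made up, and reading the
512..65536 bounds as inclusive is my assumption.

```python
MIB = 1048576  # 1 MiB in bytes


def split_mib(disk_size: int) -> tuple[int, int]:
    """Return (whole MiB, leftover bytes) for a disk size in bytes."""
    return divmod(disk_size, MIB)


def in_affected_range(disk_size: int) -> bool:
    """True if the leftover falls in the range described above.

    Assumption: both bounds are treated as inclusive.
    """
    _, remainder = split_mib(disk_size)
    return 512 <= remainder <= 65536


if __name__ == "__main__":
    print(split_mib(500107862016))          # the 500GB disk from the logs
    print(in_affected_range(500107862016))
    print(in_affected_range(10 * 1024**3))  # a '10G' test image
```

A '10G' image has a leftover of zero, which is why such images never hit
the problem.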

The proper fix, then, is for partman to round the disk size down to a
multiple of 64K, subtract one further sector, and avoid any sectors
after that.
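
As a rough sketch of that rule (not partman's actual code), the limit can
be computed as below.  I am assuming 512-byte sectors, treating the result
as an exclusive upper bound on where a partition may end, and using a
hypothetical partition that starts at a 64K-aligned offset of 1 MiB; all
names are invented for the example.

```python
SECTOR = 512           # assumed sector size in bytes
ALIGN_64K = 64 * 1024


def round_down_64k(n: int) -> int:
    """Round n down to a multiple of 64K."""
    return (n // ALIGN_64K) * ALIGN_64K


def superblock_offset_090(device_size: int) -> int:
    """0.90 superblock offset within a device, per md(4)."""
    return round_down_64k(device_size) - ALIGN_64K


def max_partition_end(disk_size: int) -> int:
    """Round the disk size down to 64K, then subtract one further sector.

    Assumption: interpreted as the highest byte offset at which a
    partition is allowed to end.
    """
    return round_down_64k(disk_size) - SECTOR


if __name__ == "__main__":
    disk_size = 500107862016   # 500GB disk from the logs
    part_start = 1048576       # hypothetical 64K-aligned partition start
    part_end = max_partition_end(disk_size)

    # Absolute on-disk positions of the two candidate superblocks.
    disk_sb = superblock_offset_090(disk_size)
    part_sb = part_start + superblock_offset_090(part_end - part_start)
    print(part_end, disk_sb, part_sb, disk_sb != part_sb)
```

Under those assumptions, a superblock written to the last partition lands
one 64K block before the place where a whole-disk superblock would be, so
the two can no longer be mistaken for each other.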

-- 
mount: mounting /dev/md0 on /root/ failed: Invalid argument
https://bugs.launchpad.net/bugs/569900