You're welcome, Dmitrijs. Now that this system is finally behaving itself (for the first time in the better part of a year!), I can look at this properly functioning configuration, compare it with the previously broken one, and see more clearly what happened.
(Note: all of the following are things I have deduced by reading a vast amount of material from different sources [and giving myself many headaches in the process], and some of it may not be an entirely accurate description of what's really happening.)

The superblock contains a field called "name". It's not something like /dev/md1 or /dev/md/1 or /dev/md1p1; on my system it's more like just "5". As it happened, I had two very different arrays (different RAID levels, sizes, etc.) that both had that name. When you run Disk Utility and select a RAID array, this is the Name displayed in the right pane; if it's empty, the pane shows "Name: -", but on my system two arrays showed "Name: 5". I didn't choose this name; I think mdadm assigned it because each array happened to be sitting at /dev/md5 at the time it was created. The two arrays were created at different times, and of course these device names change arbitrarily from one boot to the next.

But it's more complicated than that. That's only part of the name: the superblock actually stores a 'fully qualified name' of the form hostname:name, and Disk Utility displays only the last part of it. The hostname part is simply the system's hostname at the time the array is created.

My system has a long history. A year ago its hostname was different, and one of the arrays was created then. After the system became unstable (when I upgraded to Oneiric and gained a particularly buggy version of mdadm), I stopped using it and backed all the data off. When Precise became available, I did a fresh install onto a non-RAID partition and left all the existing RAID partitions in place for testing purposes. Because I was no longer going to use this system as a file server but as a test machine, I gave it a different hostname (precisetest). A bit later I added another array, and mdadm assigned it the name 5, presumably because it was sitting at /dev/md5 at the time. I did not even notice this at first.
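To make the collision concrete, here is a minimal sketch of how two distinct fully qualified names collapse to the same short name once the hostname: prefix is stripped. The hostname "oldhostname" is hypothetical (my old hostname isn't relevant here); the extraction is just ordinary shell string handling, not mdadm's actual code. The real values on a live system can be seen in the Name line printed by mdadm --examine on a member device.

```shell
#!/bin/sh
# Hypothetical fully qualified names for the two arrays described above:
# one created under the old hostname, one under "precisetest".
full_names="oldhostname:5 precisetest:5"

for fq in $full_names; do
    short="${fq##*:}"   # strip everything up to the last ':'
    echo "fully qualified: $fq  ->  short name: $short"
done
# Both arrays end up with the short name "5", even though the
# fully qualified names are unique.
```

Anything comparing only the short names would treat these as the same array, which is exactly the kind of confusion I suspect at boot time.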
Of course, the fully qualified names stored in the superblocks were actually different, having different hostnames on the front; so even though Disk Utility showed the name 5 on both arrays, they really had different full names.

Although RAID was really messed up on this system, it was only a problem at boot time. After booting, I could go into Disk Utility and manually start all the affected arrays, and from then on the system worked great until the next reboot: RAID was working, and I could access files on any array. So I came to the (perhaps incorrect?) conclusion that these two arrays having the same name was not a problem. After all, I could look at mdadm.conf and see that the arrays really had different names (the fully qualified names are what appear there).

Now I am thinking that it is not sufficient for the fully qualified names to be unique: the part of the name after the ':' has to be unique too, otherwise problems happen at boot, at least on Ubuntu Precise. But I don't think mdadm upstream intended it to be that way. So some part of the boot process (in the udev scripts, perhaps?) is getting hung up on these apparently duplicate names, because it looks only at the short names instead of the fully qualified names. If that code compared hostname:name instead of just name, perhaps this problem would disappear.

--
You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1036366

Title:
  software RAID arrays fail to start on boot

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/mdadm/+bug/1036366/+subscriptions
--
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
