Bug#567468: md homehost
On Wed, Feb 24, 2010 at 11:46 PM, Neil Brown ne...@suse.de wrote:
On Thu, 25 Feb 2010 08:16:14 +0100 Goswin von Brederlow goswin-...@web.de wrote:
Neil Brown ne...@suse.de writes:
On Wed, 24 Feb 2010 14:41:16 +0100 Goswin von Brederlow goswin-...@web.de wrote:
Neil Brown ne...@suse.de writes:
On Tue, 23 Feb 2010 07:27:00 +0100 martin f krafft madd...@madduck.net wrote:

The only issue homehost protects against, I think, is machines that use /dev/md0 directly from grub.conf or fstab.

That is exactly correct. If no code or config file depends on a name like /dev/mdX or /dev/md/foo, then you don't need to be concerned about the whole homehost thing. You can either mount by fs-uuid, or mount e.g. /dev/disk/by-id/md-uuid-8fd0af3f:4fbb94ea:12cc2127:f9855db5

What if you have two raids (one local, one from another host that broke down) and both have LVM on them with /dev/vg/root? Shouldn't it only assemble the local raid (as md0 or whatever) and then only start the local volume group? If it assembles the remote raid as /dev/md127 as well, then LVM will have problems and the boot will likely (even randomly) go wrong, since only one VG can be activated. I think it is pretty common for admins to configure LVM with the same volume group name on different systems. So if you consider raids being plugged into other systems, please keep this in mind.

You are entirely correct. However, LVM problems are not my problems. It has always been my position that the best way to configure md is to explicitly list your arrays in mdadm.conf. But people seem to not like this and want it all to be 'automatic'. So I do my best to make it as automatic as possible, while still removing as much of the confusion this can cause as I can. But I cannot remove it all. If you move disks around and boot, and LVM gets confused because there are two things called /dev/vg/root, then I'm sorry, but there is nothing I can do about that.
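The two mount-by-reference options Neil mentions can be written in /etc/fstab roughly like this (a sketch only; the filesystem UUID and the /data mount point are made up for illustration, while the md uuid is the one from Neil's example):

```
# /etc/fstab sketch - the two lines refer to the same filesystem; use one.
# The fs UUID below is hypothetical; get the real one from blkid(8).
UUID=2d3e1c5a-7f10-4c2b-9e44-0a1b2c3d4e5f  /data  ext4  defaults  0  2

# Alternatively, refer to the md array via udev's persistent by-id symlink
# (md uuid taken from the example in the mail above):
#/dev/disk/by-id/md-uuid-8fd0af3f:4fbb94ea:12cc2127:f9855db5  /data  ext4  defaults  0  2
```

Either form survives the array being assembled under an unexpected /dev/mdX name, which is exactly the failure mode the homehost mechanism tries to paper over.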
If you had an mdadm.conf which listed your md arrays, and had "auto -all", then you could be sure that mdadm would not be contributing to this problem.

NeilBrown

Yes, you can do something about it: only start the raid arrays with the correct homehost.

This is what 'homehost' originally did, but I got a lot of push-back on that. I added the "auto" line in mdadm.conf so that the admin could choose what happens. If the particular metadata type is enabled on the auto line, then the array is assembled with a random name. If it is disabled, it is not assembled at all (unless explicitly listed in mdadm.conf).

I'm not sure exactly how 'auto' interacts with 'homehost'. The documentation I wrote only talks about arrays listed in mdadm.conf or on the command line, not arrays with a valid homehost. I guess I should check. I think I want "auto -all" to still assemble arrays with a valid homehost. I'll confirm that before I release 3.1.2.

If the homehost is only used to decide whether the preferred minor in the metadata is used for the device name, then I feel the feature is entirely useless. It would only help in stupid configurations, i.e. when you use the device name directly.

Yes.

Another scenario where starting a raid with the wrong homehost would be bad is when the raid is degraded and you have a global spare. You probably wouldn't want the global spare of one host to be used to repair a raid of another host.

I only support global spares that are explicitly listed in mdadm.conf, so currently this couldn't happen. One day someone is going to ask for auto-configured global spares. Then I'll have to worry about this (or just say no).

MfG
        Goswin

PS: If a raid is not listed in mdadm.conf, doesn't it currently start too, just possibly under a random name?
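The configuration Neil recommends - explicit ARRAY lines plus a restrictive auto policy - could be sketched in /etc/mdadm/mdadm.conf along these lines (the array uuid is illustrative; check mdadm.conf(5) for your version, since the AUTO semantics discussed here were still being settled around mdadm 3.1.x):

```
# /etc/mdadm/mdadm.conf sketch (uuid is made up for illustration)
HOMEHOST <system>          # take the homehost from gethostname()

# Explicitly listed arrays are always assembled, under these names:
ARRAY /dev/md/root  uuid=8fd0af3f:4fbb94ea:12cc2127:f9855db5

# Refuse to auto-assemble anything not listed above:
AUTO -all
```

With this in place, a foreign array plugged in from another machine is simply not assembled, rather than appearing under a surprise name like /dev/md127.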
It depends on the auto line in mdadm.conf.

Thanks,
NeilBrown

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

I've always tried to assign volume groups a host-unique name anyway. Though I don't currently run enough systems to demand a formal approach, I imagine a dedicated hostname within the VG name would work. You could then use a pattern like sys-${HOSTNAME}, or sys-* to obtain the hostname on a nominal basis; though obviously if working on multiple 'system' volume groups at a time the * method fails...

--
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/4877c76c1002250033o714913dcnaf8970c9aca5e...@mail.gmail.com
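The host-unique VG naming idea above can be sketched as follows (a sketch only: the "sys-" prefix is the pattern proposed in the mail, and the vgchange invocation is echoed rather than executed so the sketch is safe to run on a machine without LVM):

```shell
# Derive a host-unique volume group name, then activate only that VG.
# Dry run: the vgchange command is echoed, not executed.
vg="sys-$(uname -n)"           # e.g. "sys-alpha" on host "alpha"
cmd="vgchange -ay $vg"
echo "would run: $cmd"
```

With a naming scheme like this, plugging in disks from a dead machine named "beta" yields a VG called sys-beta, which cannot collide with the local sys-alpha.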
Bug#567468: md homehost (was: Bug#567468: (boot time consequences of) Linux mdadm superblock question)
On Tue, Feb 23, 2010 at 4:10 PM, Neil Brown ne...@suse.de wrote:
On Tue, 23 Feb 2010 07:27:00 +0100 martin f krafft madd...@madduck.net wrote:
also sprach Neil Brown ne...@suse.de [2010.02.23.0330 +0100]:

The problem to protect against is any consequence of rearranging devices while the host is off, including attaching devices that previously were attached to a different computer.

How often does this happen, and how grave/dangerous are the effects?

a/ no idea.
b/ it all depends...

It is the sort of thing that happens when something has just gone drastically wrong and you need to stitch things back together again as quickly as you can. You aren't exactly panicking, but you are probably hasty and don't want anything else to go wrong. If the array from the 'other' machine with the same name has very different content, then things could go wrong in various ways if we depended on that name.

It is true that the admin would have to be physically present and could presumably get a console and 'fix' things. But it would be best if they didn't have to. They may not even know clearly what to do to 'fix' things - because it always worked perfectly before, but this time, when in a particular hurry, something strange goes wrong. I've been there; I don't want to inflict it on others.

But if '/' is mounted by a name in /dev/md/, I want to be sure mdadm puts the correct array at that name no matter what other arrays might be visible.

Of course it would be nice if this happened, but wouldn't it be acceptable to assume that someone who swaps drives between machines ought to know how to deal with the consequences, or at least be ready to take additional steps to make sure the system still boots as desired?

No. We cannot assume that an average sysadmin will have a deep knowledge of md and mdadm. Many do, many don't. But in either case the behaviour must be predictable.
After all, Debian is for when you have better things to do than fixing systems...

Even if the wrong array appeared as /dev/md0 and was mounted as the root device, is there any actual problem, other than inconvenience? Remember that the person who has previously swapped the drives is physically in front of (or behind ;)) the machine. I am unconvinced. I think we should definitely switch to using filesystem UUIDs over device names, and that is the only real solution to the problem, no?

What exactly are you unconvinced of? I agree completely that mounting filesystems by UUID is the right way to go. (I also happen to think that assembling md arrays by UUID is the right way to go too, but while people seem happy to put fs uuids in /etc/fstab, they seem less happy to put md uuids in /etc/mdadm.conf.)

As you say in another email:

The only issue homehost protects against, I think, is machines that use /dev/md0 directly from grub.conf or fstab.

That is exactly correct. If no code or config file depends on a name like /dev/mdX or /dev/md/foo, then you don't need to be concerned about the whole homehost thing. You can either mount by fs-uuid, or mount e.g. /dev/disk/by-id/md-uuid-8fd0af3f:4fbb94ea:12cc2127:f9855db5

NeilBrown

Would a permissible behavior be to add a third case? If an entry is not detected in the mdadm.conf file AND the homehost is not found to match, ask on the standard console what to do, with something like a 30-second timeout, as well as being noisy in the kernel log so the admin knows why boot was slow. Really there should probably be two questions: 1) Do you want to run this? 2) What name do you want? (with the defaults being "yes" and the currently chosen alternate name pattern).
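The suggested prompt-with-timeout could look something like the sketch below (purely hypothetical - mdadm implements no such prompt; a real implementation would read /dev/tty, whereas this sketch reads /dev/null so it behaves deterministically, failing the read at once exactly as a timeout would):

```shell
# Hypothetical sketch of the proposed boot-time question: assemble an
# array matching neither mdadm.conf nor the homehost? Default "yes"
# after a 30-second timeout.
printf 'Assemble foreign array? [Y/n] (30s timeout) ' >&2
if read -r -t 30 answer </dev/null; then
    :                       # operator answered on the console
else
    answer=y                # timeout, EOF, or no console: take the default
fi
case "$answer" in
    [Nn]*) decision=skip ;;
    *)     decision=assemble ;;
esac
echo "decision: $decision"
```

The second question (which name to use) would follow the same pattern, defaulting to the alternate name mdadm would have chosen anyway.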
Bug#567468: (boot time consequences of) Linux mdadm superblock question.
On Sun, Feb 21, 2010 at 11:06 PM, Goswin von Brederlow goswin-...@web.de wrote:
martin f krafft madd...@madduck.net writes:
also sprach Daniel Reurich dan...@centurion.net.nz [2010.02.19.0351 +0100]:

But if a generated 'system uuid' value (I just suggested the root fs UUID because it would be highly unlikely to change, and nobody would be likely to fiddle with it) were copied into a file called /etc/system_uuid and copied into the initrd, then we could add, to mdadm's hook script in initramfs-tools, code to verify and update the homehost variable in the raid volumes required at boot time whenever a new initrd is installed. (On Debian this generally happens whenever a kernel is installed, and whenever mdadm is installed or upgraded.)

Neil's point is that no such value exists. The root filesystem UUID is not available when the array is created. And updating the homehost in the RAID metadata at boot time would defeat the purpose of homehost in the first place.

As an added protection we could include, in mdadm's shutdown script, a check that warns when mdadm.conf doesn't exist and /etc/system_uuid doesn't match the homehost value in the boot-time assembled raid volumes. If we did use the root filesystem UUID for this, we could compare that as well.

Debian has no policy for this. There is no way to warn a user and interrupt the shutdown process.

It would be useful to have a tool similar to /bin/hostname that could be used to create|read|verify|update the system uuid, and which would update all the relevant locations that store and check against this system uuid.

Yes, it would be useful to have a system UUID that could be generated by the installer and henceforth written to the newly installed system. This is probably something the LSB should push. But you could also bring it up for discussion on debian-devel.

How would that work with network boot, where the initrd would have to work for multiple hosts?
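The hostname-like tool proposed above could be sketched as a small shell function (purely hypothetical - no such tool ships with mdadm or Debian; the /etc/system_uuid path is the one proposed in the mail, made overridable here so the sketch can be tried without root):

```shell
# Hypothetical "system uuid" tool with create|read|verify subcommands.
# UUID_FILE defaults to the /etc/system_uuid path proposed in the thread.
UUID_FILE="${UUID_FILE:-/etc/system_uuid}"

sysuuid() {
    case "$1" in
        create) cat /proc/sys/kernel/random/uuid > "$UUID_FILE" ;;
        read)   cat "$UUID_FILE" ;;
        verify) [ "$(cat "$UUID_FILE")" = "$2" ] ;;   # exit 0 iff it matches
        *)      echo "usage: sysuuid create|read|verify <uuid>" >&2; return 2 ;;
    esac
}
```

Boot and shutdown scripts could then call "sysuuid verify" against the homehost recorded in each array's superblock, as the added-protection check above suggests.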
MfG
        Goswin

I don't know how whatever was mentioned previously would work for that, but I do have a solution: incremental assembly, or examine with all block devices to generate a new mdadm.conf file; then run only devices which are in a complete state.

The next step would be to mount not by uuid but by label. Presuming you have a consistently labeled rootfs in your deployment (say, mandating that the / filesystem be labeled 'root' or some other value, and that no other FS may share that same label), then it should work out just fine.
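The examine-then-assemble approach for network-booted hosts can be sketched as a dry run (the run() wrapper only echoes the commands, so the sketch is safe on a machine with no arrays; both mdadm invocations use documented options, but drop the echo and verify against mdadm(8) before using this for real):

```shell
# Dry-run sketch: regenerate mdadm.conf from whatever block devices are
# visible, then assemble only complete arrays.  run() echoes, not acts.
run() { echo "+ $*"; }

# 1. Scan all visible devices and emit ARRAY lines for a fresh mdadm.conf:
run mdadm --examine --scan

# 2. Assemble everything found, refusing degraded (incomplete) arrays:
run mdadm --assemble --scan --no-degraded
```

Combined with a mount-by-label line such as "LABEL=root / ext4 defaults 0 1" in fstab, the initrd needs no per-host configuration at all.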