Re: Raid5 Debian Yaird Woes
On Sun, 5 Feb 2006 09:07:29 +1100 Lewis Shobbrook wrote:

> Basically it just states waiting X seconds

Please post in public rather than to me privately.

If this debate is related to a bug already filed against the Debian package of yaird then cc that bugreport: <bug number>@bugs.debian.org - and if not then please file a bugreport.

Thanks in advance,

 - Jonas

-- 
* Jonas Smedegaard - idealist og Internet-arkitekt
* Tlf.: +45 40843136 Website: http://dr.jones.dk/

 - Enden er nær: http://www.shibumi.org/eoti.htm
Re: Raid5 Debian Yaird Woes
On Mon, 24 Apr 2006 17:13:42 +0200 Jonas Smedegaard wrote:

> On Sun, 5 Feb 2006 09:07:29 +1100 Lewis Shobbrook wrote:
>
> > Basically it just states waiting X seconds
>
> Please post in public rather than to me privately.

Uh, how embarrassing: I thought I was looking in my inbox, but instead I was looking in the todo box full of old postings I am supposed to deal with.

Sorry for my rant - I guess I already commented on this a long time ago.

Kind regards,

 - Jonas
Re: Raid5 Debian Yaird Woes
On Sun, 5 Feb 2006, Lewis Shobbrook wrote:

> On Saturday 04 February 2006 11:22 am, you wrote:
> > On Sat, 4 Feb 2006, Lewis Shobbrook wrote:
> > > Is there any way to avoid this requirement for input, so that the system skips the missing drive as the raid/initrd system did previously?
> >
> > what boot errors are you getting before it drops you to the root password prompt?
>
> Basically it just states waiting X seconds for /dev/sdx3 (corresponding to the missing raid5 member). Where X cycles from 2,4,8,16 and then drops you into a recovery console, no root pwd prompt. It will only occur if the partition is completely missing, such as a replacement disk with a blank partition table, or a completely missing/failed drive.
>
> > is it trying to fsck some filesystem it doesn't have access to?
>
> No fsck seen for bad extX partitions etc.

try something like this...

    cd /tmp
    mkdir t
    cd t
    zcat /boot/initrd.img-`uname -r` | cpio -i
    grep -r sd.3 .

that should show us what script is directly accessing /dev/sdx3 ... maybe there's something more we can do about it.

i did find a possible deficiency with the patch i posted... looking more closely at my yaird /init i see this:

    mkbdev '/dev/sdb' 'sdb'
    mkbdev '/dev/sdb4' 'sdb/sdb4'
    mkbdev '/dev/sda' 'sda'
    mkbdev '/dev/sda4' 'sda/sda4'

and i think that means that "mdadm -Ac partitions" will fail if one of my root disks ends up somewhere other than sda or sdb... because the device nodes won't exist.

i suspect i should update the patch to use mdrun instead of "mdadm -Ac partitions"... because mdrun will create temporary device nodes for everything in /proc/partitions in order to find all the possible raid pieces.

-dean

-
To unsubscribe from this list: send the line unsubscribe linux-raid in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
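The worry about missing device nodes can be checked from a shell. What follows is only a minimal sketch and not part of yaird or mdadm: missing_nodes is a hypothetical helper that prints every name in a /proc/partitions-style file that has no matching entry under a device directory. It assumes the usual layout of that file (a header line, a blank line, then four columns with the device name last).

```shell
#!/bin/sh
# missing_nodes [PARTITIONS_FILE] [DEV_DIR]
# Hypothetical helper: list partition names that have no node in DEV_DIR.
# Defaults to /proc/partitions and /dev when called without arguments.
missing_nodes() {
    partitions_file=${1:-/proc/partitions}
    dev_dir=${2:-/dev}
    # Skip the header and the blank line; column 4 holds the device name.
    awk 'NR > 2 && NF == 4 { print $4 }' "$partitions_file" |
    while read -r name; do
        # Report the name only if nothing exists at DEV_DIR/name.
        [ -e "$dev_dir/$name" ] || echo "$name"
    done
}
```

Run with no arguments, it compares /proc/partitions against /dev. Any name it prints is a disk or partition the kernel knows about but for which no node exists, which is the situation described above for a root disk that turns up under an unexpected name.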
Re: Raid5 Debian Yaird Woes
This thread is all very relevant. But please cc [EMAIL PROTECTED] rather than me privately.

Regards,

 - Jonas
Re: Raid5 Debian Yaird Woes
On Saturday 04 February 2006 11:22 am, you wrote:

> On Sat, 4 Feb 2006, Lewis Shobbrook wrote:
> > Is there any way to avoid this requirement for input, so that the system skips the missing drive as the raid/initrd system did previously?
>
> what boot errors are you getting before it drops you to the root password prompt?

Basically it just states "waiting X seconds for /dev/sdx3" (corresponding to the missing raid5 member), where X cycles through 2, 4, 8, 16, and then it drops you into a recovery console, with no root password prompt. It only occurs if the partition is completely missing, such as a replacement disk with a blank partition table, or a completely missing/failed drive.

> is it trying to fsck some filesystem it doesn't have access to?

No fsck seen for bad extX partitions etc.

Cheers,
Lewis
Re: Raid5 Debian Yaird Woes
On Friday 03 February 2006 2:02 pm, you wrote:

Hi Dean,

Thanks for the suggestions.

> On Thu, 2 Feb 2006, dean gaudet wrote:
> > i've never looked at yaird in detail -- but you can probably use initramfs-tools instead of yaird...
>
> i take it all back... i just tried initramfs-tools and it failed to boot my system properly... whereas yaird almost got everything right.
>
> the main thing i'd say yaird is doing wrong is that it is specifying the root raid devices explicitly rather than allowing mdadm to scan the partitions list and assemble by UUID...
>
> maybe try the patch below on your yaird configuration and then run:
>
>     dpkg-reconfigure linux-image-`uname -r`
>
> which will rebuild your initrd with this change... then see if it survives your boot testing.
>
> -dean
>
> p.s. this patch has been submitted to debian bugdb...
>
> --- /etc/yaird/Templates.cfg 2006/02/03 02:44:49 1.1
> +++ /etc/yaird/Templates.cfg 2006/02/03 02:46:15
> @@ -299,8 +299,7 @@
>   SCRIPT /init
>   BEGIN
>   !mknod <TMPL_VAR NAME=target> b <TMPL_VAR NAME=major> <TMPL_VAR NAME=minor>
> - !mdadm --assemble <TMPL_VAR NAME=target> --uuid <TMPL_VAR NAME=uuid> \
> -! <TMPL_LOOP NAME=components> <TMPL_VAR NAME=dev></TMPL_LOOP>
> + !mdadm -Ac partitions <TMPL_VAR NAME=target> --uuid <TMPL_VAR NAME=uuid>
>   END SCRIPT
>   END TEMPLATE

I applied the patch as well as modified the mdadm.conf, as you suggested in the previous email, and the system restarted without problem! A positive step forward.

Removing a drive however, results in a disruption to the boot process requiring user input (ctrl D) in the admin console to kick things off again. Notably it works from this point, where previously I had encountered kernel panic.

Is there any way to avoid this requirement for input, so that the system skips the missing drive as the raid/initrd system did previously? If you have a system restart after a power outage combined with a degraded array, the server would be unacceptably kept offline until manual intervention occurred.

Cheers

Thanks,
Lewis
Re: Raid5 Debian Yaird Woes
On Sat, 4 Feb 2006, Lewis Shobbrook wrote:

> Is there any way to avoid this requirement for input, so that the system skips the missing drive as the raid/initrd system did previously?

what boot errors are you getting before it drops you to the root password prompt?

is it trying to fsck some filesystem it doesn't have access to?

-dean
Raid5 Debian Yaird Woes
Hi All,

I'm trying to get my head around the way that the new Debian initrd system yaird and mdadm.conf interact. While running raid5 with yaird, I've discovered that if I replace or remove a healthy drive without first manually using mdadm --set-faulty, the system will not reboot. I get startup messages stating "waiting X seconds for /dev/sdc", eventually dropping me into a maintenance shell that is useless for raid purposes. If I continue to boot by pressing ctrl D, the kernel panics, telling me it has 2/3 members but needs all 3. This seriously undermines the benefit of using raid5.

Problems also occur if the disk is replaced and the raid reconstructed (using an alternate kernel initrd): somehow the new replacement drive is set as faulty again during startup, resulting in the failure described above, unless I first create a fresh yaird initrd.img by re-installing the kernel .deb prior to the system restart.

My mdadm.conf (which I never needed to use at all before the yaird system) is as follows...

    ARRAY /dev/md0 level=raid1 num-devices=3 devices=/dev/sda2,/dev/sdb2,/dev/sdc2 auto=yes
    ARRAY /dev/md1 level=raid5 num-devices=3 auto=yes UUID=a3452240:a1578a31:737679af:58f53690
    DEVICE partitions

The yaird documentation recommended the use of at least auto=md, but using it results in errors (auto=md unknown something or other) that cause kernel installation to fail.

Hoping someone can ease my pain here?

Cheers,
Lewis
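Before rebooting a machine in this state, it can help to confirm which arrays are currently running with a member missing. The following is a hedged sketch, not something taken from yaird or mdadm: degraded_arrays is a hypothetical helper, and it assumes the usual /proc/mdstat layout in which each array's status line ends with a marker such as [UU_], where an underscore stands for a missing member.

```shell
#!/bin/sh
# degraded_arrays [MDSTAT_FILE]
# Hypothetical helper: print the name of every md array whose status
# marker (for example [UU_]) contains an underscore. An array that is
# still rebuilding shows an underscore too, so it is reported as well.
degraded_arrays() {
    mdstat_file=${1:-/proc/mdstat}
    awk '
        # Remember the array name from lines such as "md1 : active raid5 ..."
        /^md[0-9]+ :/ { name = $1 }
        # A marker made only of U and _ with at least one _ means a gap.
        /\[[U_]*_[U_]*\]/ { print name }
    ' "$mdstat_file"
}
```

Called with no arguments it reads /proc/mdstat. Empty output suggests that all listed arrays have their full set of members.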
Re: Raid5 Debian Yaird Woes
i've never looked at yaird in detail -- but you can probably use initramfs-tools instead of yaird... the deb 2.6.14 and later kernels will use whichever one of those is installed. i know that initramfs-tools uses mdrun to start the root partition based on its UUID -- and so it should work fine (to get root mounted) even without dorking around with mdadm.conf.

but if you want to stick with yaird:

On Fri, 3 Feb 2006, Lewis Shobbrook wrote:

> My mdadm.conf (I never needed to use at all previous to the yaird system) is as follows...
>
> ARRAY /dev/md0 level=raid1 num-devices=3 devices=/dev/sda2,/dev/sdb2,/dev/sdc2 auto=yes
> ARRAY /dev/md1 level=raid5 num-devices=3 auto=yes UUID=a3452240:a1578a31:737679af:58f53690
> DEVICE partitions

some wrapping occurred there i'm guessing... you might be a lot happier if your /dev/md0 also specified the UUID rather than the individual devices. this is probably the source of your troubles.

you can get the UUID by doing "mdadm --examine /dev/sda2". or you can try:

    mdadm --examine --scan --brief

... just prepend "DEVICE partitions" in front of that and you should be happy.

-dean
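The suggestion to prepend DEVICE partitions to the scan output can be sketched as a small filter. This is an illustration only: build_mdadm_conf is a hypothetical helper name, and it simply assumes that the scan output arrives on stdin with each array described by a line starting with ARRAY.

```shell
#!/bin/sh
# build_mdadm_conf
# Hypothetical helper: read ARRAY lines on stdin (for example the output
# of "mdadm --examine --scan --brief") and print a candidate config that
# starts with "DEVICE partitions".
build_mdadm_conf() {
    echo 'DEVICE partitions'
    # Pass through only the ARRAY lines; anything else is dropped.
    grep '^ARRAY'
}

# Possible use, as root (writes a candidate file, not the live config):
#   mdadm --examine --scan --brief | build_mdadm_conf > /tmp/mdadm.conf.new
```

Writing to a scratch file first leaves room to compare the result with the existing mdadm.conf before replacing anything.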
Re: Raid5 Debian Yaird Woes
On Thu, 2 Feb 2006, dean gaudet wrote:

> i've never looked at yaird in detail -- but you can probably use initramfs-tools instead of yaird...

i take it all back... i just tried initramfs-tools and it failed to boot my system properly... whereas yaird almost got everything right.

the main thing i'd say yaird is doing wrong is that it is specifying the root raid devices explicitly rather than allowing mdadm to scan the partitions list and assemble by UUID...

maybe try the patch below on your yaird configuration and then run:

    dpkg-reconfigure linux-image-`uname -r`

which will rebuild your initrd with this change... then see if it survives your boot testing.

-dean

p.s. this patch has been submitted to debian bugdb...

--- /etc/yaird/Templates.cfg 2006/02/03 02:44:49 1.1
+++ /etc/yaird/Templates.cfg 2006/02/03 02:46:15
@@ -299,8 +299,7 @@
  SCRIPT /init
  BEGIN
  !mknod <TMPL_VAR NAME=target> b <TMPL_VAR NAME=major> <TMPL_VAR NAME=minor>
- !mdadm --assemble <TMPL_VAR NAME=target> --uuid <TMPL_VAR NAME=uuid> \
-! <TMPL_LOOP NAME=components> <TMPL_VAR NAME=dev></TMPL_LOOP>
+ !mdadm -Ac partitions <TMPL_VAR NAME=target> --uuid <TMPL_VAR NAME=uuid>
  END SCRIPT
  END TEMPLATE
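After rebuilding the initrd it may be worth confirming that the generated init script actually picked up the new assemble line. The sketch below is not part of yaird: assembly_style is a hypothetical helper, and it assumes the initrd has already been unpacked into a scratch directory (for example with zcat and cpio -i) so that the init script is an ordinary file.

```shell
#!/bin/sh
# assembly_style INIT_SCRIPT
# Hypothetical helper: report how an unpacked init script assembles md
# arrays. Prints "scan" if it finds "mdadm -Ac partitions", "explicit" if
# it finds "mdadm --assemble", and "none" if it finds neither.
assembly_style() {
    init=$1
    if grep -q -e 'mdadm.*-Ac partitions' "$init"; then
        echo scan
    elif grep -q -e 'mdadm.*--assemble' "$init"; then
        echo explicit
    else
        echo none
    fi
}

# Possible use, assuming the initrd was unpacked under /tmp/t:
#   assembly_style /tmp/t/init
```

An answer of "explicit" after the rebuild would suggest the template change did not take effect, or that a different template produced the line.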
Re: Raid5 Debian Yaird Woes
On Friday 03 February 2006 1:13 pm, you wrote:

Thanks Dean, I'll try this out...

> i've never looked at yaird in detail -- but you can probably use initramfs-tools instead of yaird... the deb 2.6.14 and later kernels will use whichever one of those is installed. i know that initramfs-tools uses mdrun to start the root partition based on its UUID -- and so it should work fine (to get root mounted) even without dorking around with mdadm.conf.
>
> but if you want to stick with yaird:
>
> On Fri, 3 Feb 2006, Lewis Shobbrook wrote:
> > My mdadm.conf (I never needed to use at all previous to the yaird system) is as follows...
> >
> > ARRAY /dev/md0 level=raid1 num-devices=3 devices=/dev/sda2,/dev/sdb2,/dev/sdc2 auto=yes
> > ARRAY /dev/md1 level=raid5 num-devices=3 auto=yes UUID=a3452240:a1578a31:737679af:58f53690
> > DEVICE partitions
>
> some wrapping occured there i'm guessing... you might be a lot happier if your /dev/md0 also specified the UUID rather than the individual devices. this is probably the source of your troubles.

It seems a bit confusing and fickle of yaird that all md devices must follow the UUID syntax in mdadm.conf. How would you expect this to affect the detection of /dev/md1, where the UUIDs on all components are intact, while /dev/md0 has the 'non-uuid' syntax?

When yaird first arrived (I did not specifically install it, it came with a dist-upgrade), I had initial problems with the boot sequence where the root /dev/md0 wasn't starting, despite being able to start it manually from the recovery console. Specifying the devices in mdadm.conf was the initial fix. I'd never found the need to use mdadm.conf at all previously.

I can't really try this until I get home; if the machine doesn't come back up my wife will have no MythTV playschool episodes for the rugrats. I'll let you know how it goes.

Cheers,
Lewis