Re: Raid5 Debian Yaird Woes

2006-04-24 Thread Jonas Smedegaard
On Sun, 5 Feb 2006 09:07:29 +1100 Lewis Shobbrook wrote:

 Basically it just states waiting X seconds

Please post in public rather than to me privately.

If this debate is related to a bug already filed against the Debian
package of yaird then cc that bug report: <bug number>@bugs.debian.org.
If not, then please file a bug report.


Thanks in advance,

 - Jonas


-- 
* Jonas Smedegaard - idealist and Internet architect
* Tel.: +45 40843136  Website: http://dr.jones.dk/

 - The end is near: http://www.shibumi.org/eoti.htm



Re: Raid5 Debian Yaird Woes

2006-04-24 Thread Jonas Smedegaard
On Mon, 24 Apr 2006 17:13:42 +0200 Jonas Smedegaard wrote:

 On Sun, 5 Feb 2006 09:07:29 +1100 Lewis Shobbrook wrote:
 
  Basically it just states waiting X seconds
 
 Please post in public rather than to me privately.

Uh, how embarrassing: I thought I was looking in my inbox, but instead
was looking in the todo box full of old postings I am supposed to
deal with.

Sorry for my rant - I guess I already commented on this a long time
ago.


Kind regards,

 - Jonas



Re: Raid5 Debian Yaird Woes

2006-02-06 Thread dean gaudet
On Sun, 5 Feb 2006, Lewis Shobbrook wrote:

 On Saturday 04 February 2006 11:22 am, you wrote:
  On Sat, 4 Feb 2006, Lewis Shobbrook wrote:
   Is there any way to avoid this requirement for input, so that the system
   skips the missing drive as the raid/initrd system did previously?
 
  what boot errors are you getting before it drops you to the root password
  prompt?
 
 Basically it just states waiting X seconds for /dev/sdx3 (corresponding to
 the missing raid5 member). Where X cycles from 2,4,8,16 and then drops you
 into a recovery console, no root pwd prompt.
 It will only occur if the partition is completely missing, such as a
 replacement disk with a blank partition table, or a completely
 missing/failed drive.

  is it trying to fsck some filesystem it doesn't have access to?

 No fsck seen for bad extX partitions etc.

try something like this...

cd /tmp
mkdir t
cd t
zcat /boot/initrd.img-`uname -r` | cpio -i
grep -r sd.3 .

that should show us what script is directly accessing /dev/sdx3 ... maybe 
there's something more we can do about it.

i did find a possible deficiency with the patch i posted... looking more 
closely at my yaird /init i see this:

mkbdev '/dev/sdb' 'sdb'
mkbdev '/dev/sdb4' 'sdb/sdb4'
mkbdev '/dev/sda' 'sda'
mkbdev '/dev/sda4' 'sda/sda4'

and i think that means that mdadm -Ac partitions will fail if one of my 
root disks ends up somewhere other than sda or sdb... because the device 
nodes won't exist.

i suspect i should update the patch to use mdrun instead of mdadm -Ac 
partitions... because mdrun will create temporary device nodes for 
everything in /proc/partitions in order to find all the possible raid 
pieces.
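
For illustration, here is a rough sketch of the idea of creating a device
node for everything listed in /proc/partitions. It is not mdrun's actual
code: the function name list_mknod_cmds and the /tmp/md-nodes directory are
made up for this example, and it only prints the mknod commands instead of
running them.

```sh
#!/bin/sh
# Sketch only: list_mknod_cmds and /tmp/md-nodes are hypothetical names.
# Reads /proc/partitions-style text on stdin and prints (does not run)
# one mknod command per block device, so the result can be reviewed first.
list_mknod_cmds() {
    dir=${1:-/tmp/md-nodes}
    # Data lines look like "   8   3   104391 sda3"; the header line and
    # the blank line after it do not start with a number and are skipped.
    awk -v dir="$dir" '$1 ~ /^[0-9]+$/ && NF >= 4 {
        printf "mknod %s/%s b %s %s\n", dir, $4, $1, $2
    }'
}
```

Something like "list_mknod_cmds /dev < /proc/partitions" would then show
which nodes would have to exist for a scan of all partitions to work, no
matter which sdX name a disk ends up with.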

-dean
-
To unsubscribe from this list: send the line unsubscribe linux-raid in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Raid5 Debian Yaird Woes

2006-02-04 Thread Jonas Smedegaard
This thread is all very relevant.

But please cc [EMAIL PROTECTED] rather than me
privately.


Regards,

 - Jonas



Re: Raid5 Debian Yaird Woes

2006-02-04 Thread Lewis Shobbrook
On Saturday 04 February 2006 11:22 am, you wrote:
 On Sat, 4 Feb 2006, Lewis Shobbrook wrote:
  Is there any way to avoid this requirement for input, so that the system
  skips the missing drive as the raid/initrd system did previously?

 what boot errors are you getting before it drops you to the root password
 prompt?

Basically it just states waiting X seconds for /dev/sdx3 (corresponding to the 
missing raid5 member). Where X cycles from 2,4,8,16 and then drops you into a 
recovery console, no root pwd prompt.
It will only occur if the partition is completely missing, such as a 
replacement disk with a blank partition table, or a completely missing/failed 
drive.
 is it trying to fsck some filesystem it doesn't have access to?

No fsck seen for bad extX partitions etc.
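
To make the behaviour described above concrete, here is a minimal sketch of
a wait loop that would produce the same 2, 4, 8, 16 second messages. This is
an assumption about the general shape of such a loop, not yaird's actual
code; the function name wait_for_dev and the SLEEP override are invented for
the example.

```sh
#!/bin/sh
# Sketch only: wait_for_dev and SLEEP are hypothetical names.
# Waits for a device node to appear, doubling the delay each round.
# Returns 0 as soon as the path exists, non-zero if it never shows up.
wait_for_dev() {
    dev=$1
    for delay in 2 4 8 16; do
        [ -e "$dev" ] && return 0
        echo "Waiting $delay seconds for $dev"
        # SLEEP can be set to ":" to skip the real delay when experimenting.
        ${SLEEP:-sleep} "$delay"
    done
    # One last check after the final wait; its status is the return value.
    [ -e "$dev" ]
}
```

With a loop of this shape, a member that is completely missing can never
satisfy the check, so the boot ends up in the recovery console after the
last wait, even though the array could still be started degraded.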

Cheers,

Lewis




Re: Raid5 Debian Yaird Woes

2006-02-03 Thread Lewis Shobbrook
On Friday 03 February 2006 2:02 pm, you wrote:

Hi Dean,
Thanks for the suggestions.
 On Thu, 2 Feb 2006, dean gaudet wrote:
  i've never looked at yaird in detail -- but you can probably use
  initramfs-tools instead of yaird...

 i take it all back... i just tried initramfs-tools and it failed to boot
 my system properly... whereas yaird almost got everything right.

 the main thing i'd say yaird is doing wrong is that it is specifying the
 root raid devices explicitly rather than allowing mdadm to scan the
 partitions list and assemble by UUID...

 maybe try the patch below on your yaird configuration and then run:

   dpkg-reconfigure linux-image-`uname -r`

 which will rebuild your initrd with this change... then see if it survives
 your boot testing.

 -dean

 p.s. this patch has been submitted to debian bugdb...

 --- /etc/yaird/Templates.cfg  2006/02/03 02:44:49 1.1
 +++ /etc/yaird/Templates.cfg  2006/02/03 02:46:15
 @@ -299,8 +299,7 @@
   SCRIPT /init
   BEGIN
   !mknod <TMPL_VAR NAME=target> b <TMPL_VAR NAME=major> <TMPL_VAR NAME=minor>
 - !mdadm --assemble <TMPL_VAR NAME=target> --uuid <TMPL_VAR NAME=uuid> \
 - !   <TMPL_LOOP NAME=components> <TMPL_VAR NAME=dev></TMPL_LOOP>
 + !mdadm -Ac partitions <TMPL_VAR NAME=target> --uuid <TMPL_VAR NAME=uuid>
   END SCRIPT
   END TEMPLATE

I applied the patch as well as modified the mdadm.conf, as you suggested in 
the previous email, and the system restarted without problem! 
A positive step forward.
Removing a drive however, results in a disruption to the boot process 
requiring user input (ctrl D) in the admin console to kick things off again.  
Notably it works from this point, where previously I had encountered kernel 
panic.
Is there any way to avoid this requirement for input, so that the system skips 
the missing drive as the raid/initrd system did previously?  
If you have a system restart after a power outage combined with a degraded 
array, the server would be unacceptably kept offline until manual 
intervention occurred.

Cheers & Thanks,

Lewis


Re: Raid5 Debian Yaird Woes

2006-02-03 Thread dean gaudet
On Sat, 4 Feb 2006, Lewis Shobbrook wrote:

 Is there any way to avoid this requirement for input, so that the system
 skips the missing drive as the raid/initrd system did previously?

what boot errors are you getting before it drops you to the root password 
prompt?

is it trying to fsck some filesystem it doesn't have access to?

-dean


Raid5 Debian Yaird Woes

2006-02-02 Thread Lewis Shobbrook
Hi All,

I'm trying to get my head around the way that the new debian initrd system
yaird and mdadm.conf interact.
While running raid5 with yaird, I've discovered that if I replace or remove a
healthy drive, without manually using mdadm --set-faulty, the system will not
reboot. I get startup messages stating waiting X seconds for /dev/sdc,
eventually dropping me into a useless (for raid purposes) maintenance shell.
If I continue to boot via use of 'ctrl D', the system kernel panics, telling
me it has 2/3 members but needs all 3. This seriously impacts the benefit of
using raid5.
Problems also occur if the disk is replaced and the raid reconstructed
(using an alternate kernel initrd): somehow the new replacement drive is set
as faulty again during startup, resulting in the failure described above,
unless I first create a fresh yaird initrd.img via re-installation of the
kernel.deb prior to the system restart.
My mdadm.conf (I never needed to use at all previous to the yaird system) is
as follows...
ARRAY /dev/md0 level=raid1 num-devices=3 devices=/dev/sda2,/dev/sdb2,/dev/sdc2
auto=yes
ARRAY /dev/md1 level=raid5 num-devices=3 auto=yes
UUID=a3452240:a1578a31:737679af:58f53690
DEVICE partitions

The yaird documentation recommends using at least auto=md, but using it
results in errors (auto=md unknown something or other) that cause
kernel installation to fail.

Hoping someone can ease my pain here?

Cheers,

Lewis


Re: Raid5 Debian Yaird Woes

2006-02-02 Thread dean gaudet
i've never looked at yaird in detail -- but you can probably use 
initramfs-tools instead of yaird... the deb 2.6.14 and later kernels will 
use whichever one of those is installed.  i know that initramfs-tools uses 
mdrun to start the root partition based on its UUID -- and so it should 
work fine (to get root mounted) even without dorking around with 
mdadm.conf.

but if you want to stick with yaird:

On Fri, 3 Feb 2006, Lewis Shobbrook wrote:

 My mdadm.conf (I never needed to use at all previous to the yaird system) is
 as follows...
 ARRAY /dev/md0 level=raid1 num-devices=3 devices=/dev/sda2,/dev/sdb2,/dev/sdc2
 auto=yes
 ARRAY /dev/md1 level=raid5 num-devices=3 auto=yes
 UUID=a3452240:a1578a31:737679af:58f53690
 DEVICE partitions

some wrapping occurred there i'm guessing...

you might be a lot happier if your /dev/md0 also specified the UUID rather 
than the individual devices.  this is probably the source of your 
troubles.

you can get the UUID by doing mdadm --examine /dev/sda2.

or you can try:  mdadm --examine --scan --brief ... just prepend DEVICE 
partitions in front of that and you should be happy.
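
The two suggestions above could be sketched roughly as follows. The function
names make_mdadm_conf and arrays_without_uuid are invented for this example
and are not part of mdadm; the sketch only reads and writes text, so nothing
is assembled or changed on disk.

```sh
#!/bin/sh
# Sketch only: both function names are hypothetical, not part of mdadm.

# Reads "mdadm --examine --scan --brief" output on stdin and prints a
# candidate mdadm.conf with "DEVICE partitions" in front of it.
make_mdadm_conf() {
    echo 'DEVICE partitions'
    cat
}

# Reads an mdadm.conf on stdin and prints the device name of every ARRAY
# line that does not identify its array by UUID. Returns non-zero if any
# such line is found. Wrapped continuation lines are not handled.
arrays_without_uuid() {
    awk '$1 == "ARRAY" && !/[Uu][Uu][Ii][Dd]=/ { print $2; bad = 1 }
         END { exit (bad ? 1 : 0) }'
}
```

For example, "mdadm --examine --scan --brief | make_mdadm_conf" writes the
candidate file to stdout for review, and "arrays_without_uuid <
/etc/mdadm/mdadm.conf" (the usual Debian location) lists arrays such as the
/dev/md0 entry quoted above that still rely on a devices= list.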

-dean


Re: Raid5 Debian Yaird Woes

2006-02-02 Thread dean gaudet
On Thu, 2 Feb 2006, dean gaudet wrote:

 i've never looked at yaird in detail -- but you can probably use 
 initramfs-tools instead of yaird... 

i take it all back... i just tried initramfs-tools and it failed to boot 
my system properly... whereas yaird almost got everything right.

the main thing i'd say yaird is doing wrong is that it is specifying the 
root raid devices explicitly rather than allowing mdadm to scan the 
partitions list and assemble by UUID...

maybe try the patch below on your yaird configuration and then run:

dpkg-reconfigure linux-image-`uname -r`

which will rebuild your initrd with this change... then see if it survives 
your boot testing.

-dean

p.s. this patch has been submitted to debian bugdb...

--- /etc/yaird/Templates.cfg  2006/02/03 02:44:49 1.1
+++ /etc/yaird/Templates.cfg  2006/02/03 02:46:15
@@ -299,8 +299,7 @@
    SCRIPT /init
    BEGIN
    !mknod <TMPL_VAR NAME=target> b <TMPL_VAR NAME=major> <TMPL_VAR NAME=minor>
-   !mdadm --assemble <TMPL_VAR NAME=target> --uuid <TMPL_VAR NAME=uuid> \
-   !   <TMPL_LOOP NAME=components> <TMPL_VAR NAME=dev></TMPL_LOOP>
+   !mdadm -Ac partitions <TMPL_VAR NAME=target> --uuid <TMPL_VAR NAME=uuid>
    END SCRIPT
    END TEMPLATE
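
To show what the patch above changes in the generated /init, here is a small
sketch that prints the two styles of command line side by side. The function
names are invented, and the component names /dev/sda3, /dev/sdb3 and
/dev/sdc3 in the usage note are assumptions for illustration; the commands
are only printed, never run.

```sh
#!/bin/sh
# Sketch only: old_style_cmd and new_style_cmd are hypothetical helpers
# that print the mdadm command line the template would expand to.

# old_style_cmd TARGET UUID DEV...
# Pre-patch form: every component device is listed explicitly.
old_style_cmd() {
    target=$1
    uuid=$2
    shift 2
    echo "mdadm --assemble $target --uuid $uuid $*"
}

# new_style_cmd TARGET UUID
# Post-patch form: mdadm takes its candidate devices from /proc/partitions
# and picks the ones whose superblock carries the given UUID.
new_style_cmd() {
    echo "mdadm -Ac partitions $1 --uuid $2"
}
```

Calling "old_style_cmd /dev/md1 a3452240:a1578a31:737679af:58f53690
/dev/sda3 /dev/sdb3 /dev/sdc3" makes the dependency on fixed device names
visible, while the new form names no component devices at all.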
 


Re: Raid5 Debian Yaird Woes

2006-02-02 Thread Lewis Shobbrook
On Friday 03 February 2006 1:13 pm, you wrote:
Thanks Dean,

I'll try this out...
 i've never looked at yaird in detail -- but you can probably use
 initramfs-tools instead of yaird... the deb 2.6.14 and later kernels will
 use whichever one of those is installed.  i know that initramfs-tools uses
 mdrun to start the root partition based on its UUID -- and so it should
 work fine (to get root mounted) even without dorking around with
 mdadm.conf.

 but if you want to stick with yaird:

 On Fri, 3 Feb 2006, Lewis Shobbrook wrote:
  My mdadm.conf (I never needed to use at all previous to the yaird system)
  is as follows...
  ARRAY /dev/md0 level=raid1 num-devices=3
  devices=/dev/sda2,/dev/sdb2,/dev/sdc2 auto=yes
  ARRAY /dev/md1 level=raid5 num-devices=3 auto=yes
  UUID=a3452240:a1578a31:737679af:58f53690
  DEVICE partitions

 some wrapping occurred there i'm guessing...

 you might be a lot happier if your /dev/md0 also specified the UUID rather
 than the individual devices.  this is probably the source of your
 troubles.
Seems a bit confusing and fickle of yaird that all md devices must follow the
uuid syntax in mdadm.conf.

How do you expect that this would affect the detection of /dev/md1, where the
uuid on all components is intact, and /dev/md0 has the 'non-uuid'
syntax?

When yaird first arrived (did not specifically install it just a 
dist-upgrade), I had initial problems with the boot sequence where the 
root /dev/md0 wasn't starting, despite being able to manually start it from 
the recovery console. Specifying the devices in mdadm.conf was the initial 
fix.  I'd never found the need to use mdadm.conf at all previously. 

I can't really try this til I get home, if the machine doesn't come back up my 
wife will have no MythTV playschool episodes for the rugrats.
I'll let you know how it goes.

Cheers,

Lewis