Checksums wrong on one disk of mirror

2006-11-07 Thread David
I recently installed a server with mirrored disks using software RAID. Everything was working fine for a few days until a normal reboot (not the first). Now the machine will not boot because it appears the superblock is wrong on some of the RAID devices on the first disk. The rough layout

Re: Checksums wrong on one disk of mirror

2006-11-07 Thread Neil Brown
On Tuesday November 7, [EMAIL PROTECTED] wrote: I recently installed a server with mirrored disks using software RAID. Everything was working fine for a few days until a normal reboot (not the first). Now the machine will not boot because it appears the superblock is wrong on some of

Re: Checksums wrong on one disk of mirror

2006-11-07 Thread David
Quoting Neil Brown [EMAIL PROTECTED]: On Tuesday November 7, [EMAIL PROTECTED] wrote: Booting into a live CD, mdadm -E /dev/sdaX shows that the checksum is not what would be expected for sda1,2,3 but is fine for sda6. All of the checksums on drive sdb are correct. I'm surprised it doesn't

[PATCH 002 of 9] md: Fix sizing problem with raid5-reshape and CONFIG_LBD=n

2006-11-07 Thread NeilBrown
I forgot to cast the size-in-blocks to (loff_t) before shifting up to a size-in-bytes. Signed-off-by: Neil Brown [EMAIL PROTECTED] ### Diffstat output ./drivers/md/raid5.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff .prev/drivers/md/raid5.c ./drivers/md/raid5.c ---

[PATCH 008 of 9] md: Allow reads that have bypassed the cache to be retried on failure.

2006-11-07 Thread NeilBrown
From: Raz Ben-Jehuda(caro) [EMAIL PROTECTED] If a bypass-the-cache read fails, we simply try again through the cache. If it fails again it will trigger normal recovery procedures. Signed-off-by: Neil Brown [EMAIL PROTECTED] ### Diffstat output ./drivers/md/raid5.c | 150

[PATCH 005 of 9] md: Change lifetime rules for 'md' devices.

2006-11-07 Thread NeilBrown
Currently md devices are created when first opened and remain in existence until the module is unloaded. This isn't a major problem, but it is somewhat ugly. This patch changes the lifetime rules so that an md device will disappear on the last close if it has no state. Locking rules depend on

[PATCH 007 of 9] md: Handle bypassing the read cache (assuming nothing fails).

2006-11-07 Thread NeilBrown
From: Raz Ben-Jehuda(caro) [EMAIL PROTECTED] Signed-off-by: Neil Brown [EMAIL PROTECTED] ### Diffstat output ./drivers/md/raid5.c | 78 +++ 1 file changed, 78 insertions(+) diff .prev/drivers/md/raid5.c ./drivers/md/raid5.c ---

[PATCH 006 of 9] md: Define raid5_mergeable_bvec

2006-11-07 Thread NeilBrown
From: Raz Ben-Jehuda(caro) [EMAIL PROTECTED] This will encourage read requests to be on only one device, so we will often be able to bypass the cache for read requests. Signed-off-by: Neil Brown [EMAIL PROTECTED] ### Diffstat output ./drivers/md/raid5.c | 24 1 file

[PATCH 004 of 9] md: Tidy up device-change notification when an md array is stopped

2006-11-07 Thread NeilBrown
An md array can be stopped leaving all the settings still in place, or it can be torn down and destroyed. set_capacity and other change notifications only happen in the latter case, but should happen in both. Signed-off-by: Neil Brown [EMAIL PROTECTED] ### Diffstat output ./drivers/md/md.c | 10

[PATCH 001 of 9] md: Change ONLINE/OFFLINE events to a single CHANGE event

2006-11-07 Thread NeilBrown
It turns out that CHANGE is preferred to ONLINE/OFFLINE for various reasons (not least of which being that udev understands it already). So remove the recently added KOBJ_OFFLINE (no-one is likely to care anyway) and change the ONLINE to a CHANGE event Cc: Kay Sievers [EMAIL PROTECTED]

[PATCH 009 of 9] md: Enable bypassing cache for reads.

2006-11-07 Thread NeilBrown
From: Raz Ben-Jehuda(caro) [EMAIL PROTECTED] Call the chunk_aligned_read where appropriate. Signed-off-by: Neil Brown [EMAIL PROTECTED] ### Diffstat output ./drivers/md/raid5.c |5 + 1 file changed, 5 insertions(+) diff .prev/drivers/md/raid5.c ./drivers/md/raid5.c ---

[PATCH 003 of 9] md: Do not freeze md threads for suspend.

2006-11-07 Thread NeilBrown
From: Rafael J. Wysocki [EMAIL PROTECTED] If there's a swap file on a software RAID, it should be possible to use this file for saving the swsusp's suspend image. Also, this file should be available to the memory management subsystem when memory is being freed before the suspend image is

Re: new array not starting

2006-11-07 Thread Robin Bowes
Robin Bowes wrote: If I try to start the array manually: # mdadm --assemble --auto=yes /dev/md2 /dev/hdc /dev/hdd /dev/hde /dev/hdf /dev/hdg /dev/hdh /dev/hdi /dev/hdj mdadm: cannot open device /dev/hdc: No such file or directory mdadm: /dev/hdc has no superblock - assembly aborted

Re: new array not starting

2006-11-07 Thread Robin Bowes
Robin Bowes wrote: This worked: # mdadm --assemble --auto=yes /dev/md2 /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj mdadm: /dev/md2 has been started with 8 drives. However, I'm not sure why it didn't start automatically at boot. Do I need to put it in
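The usual reason an array assembles fine by hand but not at boot is that it is missing from mdadm.conf (and from the initrd, if assembly happens there). A sketch of the relevant entry — the UUID below is a placeholder; substitute the real output of `mdadm --detail --scan` or `mdadm --detail /dev/md2`:

```
# /etc/mdadm.conf -- UUID is a placeholder, use `mdadm --detail --scan`
DEVICE partitions
ARRAY /dev/md2 UUID=00000000:00000000:00000000:00000000
```

Note also that the devices moved from /dev/hdX to /dev/sdX names, so identifying the array by UUID rather than by device path is the more robust choice.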

Re: new array not starting

2006-11-07 Thread Richard Scobie
Robin Bowes wrote: Robin Bowes wrote: This worked: # mdadm --assemble --auto=yes /dev/md2 /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj mdadm: /dev/md2 has been started with 8 drives. However, I'm not sure why it didn't start automatically at boot. Do I need to put

Re: new array not starting

2006-11-07 Thread Robin Bowes
Robin Bowes wrote: Robin Bowes wrote: This worked: # mdadm --assemble --auto=yes /dev/md2 /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj mdadm: /dev/md2 has been started with 8 drives. However, I'm not sure why it didn't start automatically at boot. Do I need to

FYI: [2.6.16 patch] drivers/md/md.c: update START_ARRAY printk

2006-11-07 Thread Adrian Bunk
FYI: I've just committed the patch below in the 2.6.16 tree. cu Adrian commit f919643362f45c65457e01ddd9aed0682497b2f8 Author: Adrian Bunk [EMAIL PROTECTED] Date: Wed Nov 8 08:19:14 2006 +0100 drivers/md/md.c: update START_ARRAY printk START_ARRAY will not be removed in 2.6.16,

Re: RAID5 array showing as degraded after motherboard replacement

2006-11-07 Thread dean gaudet
On Wed, 8 Nov 2006, James Lee wrote: However I'm still seeing the error messages in my dmesg (the ones I posted earlier), and they suggest that there is some kind of hardware fault (based on a quick Google of the error codes). So I'm a little confused. the fact that the error is in a