raid1 oops, 2.6.16

2006-08-05 Thread Jason Lunz
I just had a disk die in a 2.6.16 (debian kernel) raid1 server, and it's
triggered an oops in raid1.

There are a bunch of 2-partition mirrors:

Personalities : [raid1]
md5 : active raid1 hdc7[2](F) hda7[1]
  77625984 blocks [2/1] [_U]

md4 : active raid1 hdc6[2](F) hda6[1]
  16000640 blocks [2/1] [_U]

md3 : active raid1 hdc5[2](F) hda5[1]
  12570752 blocks [2/1] [_U]

md2 : active raid1 hdc3[2](F) hda3[1]
  8000256 blocks [2/1] [_U]

md1 : active raid1 hdc2[2](F) hda2[1]
  200 blocks [2/1] [_U]

md0 : active raid1 hdc1[2](F) hda1[1]
  995904 blocks [2/1] [_U]

unused devices: <none>


Amidst all the failure messages for these arrays in dmesg, I have this:


RAID1 conf printout:
 --- wd:1 rd:2
 disk 1, wo:0, o:1, dev:hda1
Unable to handle kernel NULL pointer dereference at virtual address 0088
 printing eip:
f0831ea8
*pde = 
Oops: 0002 [#1]
Modules linked in: thermal fan button processor ac battery e1000 rtc ext3 jbd 
mbcache raid1 md_mod ide_disk generic siimage ide_core evdev mousedev
CPU:0
EIP:0060:[f0831ea8]Not tainted VLI
EFLAGS: 00010246   (2.6.16-2-686 #1)
EIP is at raid1d+0x2c8/0x4c3 [raid1]
eax: 0008   ebx:    ecx: c9c60100   edx: b1a1ac60
esi:    edi: dd67c6c0   ebp: efd52740   esp: b1ac1f08
ds: 007b   es: 007b   ss: 0068
Process md5_raid1 (pid: 1001, threadinfo=b1ac task=b1bc5a70)
Stack: 000b1 5e341300 003e0387 0001 0001 0008 0008 
025e1458
    b1bc5b98 0001 efd5275c b1ac1fa4 7fff b026e32b 0005
   b1ac b1ac1f84 7fff  b1a1aba0 b1ac1f84 b1ac1fa4 7fff
Call Trace:
 [b026e32b] schedule_timeout+0x13/0x8e
 [f0864095] md_thread+0xe3/0xfb [md_mod]
 [b012522e] autoremove_wake_function+0x0/0x3a
 [b026dced] schedule+0x45f/0x4cd
 [b012522e] autoremove_wake_function+0x0/0x3a
 [f0863fb2] md_thread+0x0/0xfb [md_mod]
 [b0124efe] kthread+0x79/0xa3
 [b0124e85] kthread+0x0/0xa3
 [b01012cd] kernel_thread_helper+0x5/0xb
Code: 83 7c 24 10 00 8b 47 20 0f 84 dc 00 00 00 89 74 24 0c 39 c6 74 63 85 f6 
75 03 8b 75
08 4e 8b 55 04 6b c6 0c 8b 1c 02 8b 44 24 14 01 83 88 00 00 00 85 db 74 3f 8b 
43 70 a8 04 74 38 6a 01 ff 75
 end_request: I/O error, dev hdc, sector 26464418


The server is still running, but processes (like sync(1)) are getting hung in D
state.

Jason



Re: raid5/lvm setup questions

2006-08-05 Thread David Greaves
Shane wrote:
 Hello all,
 
 I'm building a new server which will use a number of disks,
 and I'm not sure of the best way to go about the setup.
 There will be four 320GB SATA drives installed at first.  I'm
 just wondering how to set the system up for upgradability.
 I'll be using raid5, but I'm not sure whether to layer lvm over
 the raid array.
 
 By upgradability, I'd like to do several things.  One is adding
 another drive of the same size to the array; I understand
 reshape can be used here to expand the underlying block
 device.
Yes, it can.
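Roughly like this (device names are only examples, and you need a kernel/mdadm
recent enough to have raid5 reshape - 2.6.17+ and mdadm 2.4+ or so):

# mdadm /dev/md0 --add /dev/sde1              # new disk goes in as a spare
# mdadm --grow /dev/md0 --raid-devices=5      # reshape the 4-disk raid5 into a 5-disk one
  (watch /proc/mdstat; the reshape takes a while)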

  If the block device is the pv of an lvm volume group,
 would that also automatically expand, in which case I could
 create additional lvs in the new space?  If this isn't
 automatic, are there ways to do it manually?
Not automatic AFAIK - but doable.
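Manually it's only a couple of commands once the md device has grown - something
like this, with made-up VG/LV names, and resize2fs only if the LV carries ext3:

# pvresize /dev/md0                       # tell LVM the PV got bigger
# vgdisplay bigvg                         # the free PE count should have jumped
# lvextend -L +100G /dev/bigvg/data       # grow an existing LV (or lvcreate a new one)
# resize2fs /dev/bigvg/data               # then grow the filesystem (xfs_growfs for xfs)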

 What about replacing all four drives with larger units. 
 Say going from 300gbx4 to 500gbx4.  Can one replace them
 one at a time, going through fail/rebuild as appropriate
 and then expand the array into the unused space
Yes.

 or would
 one have to reinstall at that point.
No
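For each disk the cycle would look roughly like this (sdX names are only
examples; let each resync finish before touching the next drive), with one
final grow at the end:

# mdadm /dev/md0 --fail /dev/sdb1 --remove /dev/sdb1
  (physically swap in the 500GB disk, partition it at the new size)
# mdadm /dev/md0 --add /dev/sdb1
  (wait for the rebuild - watch /proc/mdstat - then repeat for the other disks)
# mdadm --grow /dev/md0 --size=max        # once all members are bigger, claim the space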


None of the requirements above forces you to layer lvm over the top.

That's not to say don't do it - but you certainly don't *need* to do it.

Pros:
* allows snapshots (for consistent backups) - see the sketch below
* allows various lvm block movements etc.
* can later grow the vg onto discrete additional block devices without the
raid5 grow limitations (e.g. needing same-ish sized disks)

Cons:
* extra complexity - more risk of bugs/admin errors
* some performance impact
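To illustrate the snapshot point, a consistent backup would go roughly like
this (VG/LV names and sizes are only examples):

# lvcreate -s -n data-snap -L 5G /dev/bigvg/data   # snapshot with 5G of copy-on-write space
# mount -o ro /dev/bigvg/data-snap /mnt/snap
# tar czf /backup/data.tar.gz -C /mnt/snap .
# umount /mnt/snap
# lvremove -f /dev/bigvg/data-snap                 # drop the snapshot when done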

As an example of the cons: I've just set up lvm2 over my raid5 and whilst
testing snapshots, the first thing that happened was a kernel BUG and an oops...

David


Re: raid5/lvm setup questions

2006-08-05 Thread Martin Schröder

2006/8/5, Shane [EMAIL PROTECTED]:

Well, the reason I was looking at LVM is that, since this
is a fairly big array, I didn't want to lose a bunch of
space to ext3 inodes.  For example, the PostgreSQL


Then forget about ext{2|3} and use xfs or reiserfs. ext3 is limited to
4TB anyway.
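If you do go with xfs on top of the raid5, it's worth aligning it to the
stripe - e.g. with a 64k chunk and 4 drives (3 data disks); recent mkfs.xfs
versions can usually work the geometry out from md by themselves:

# mkfs.xfs -d su=64k,sw=3 /dev/md0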

Best
  Martin


Re: raid 5 read performance

2006-08-05 Thread Dan Williams

On 8/5/06, Raz Ben-Jehuda(caro) [EMAIL PROTECTED] wrote:

The patch is with Neil; I do not know when he is going to apply it.
I have applied it on my systems (on 2.6.15), but they are currently in the
lab and not in production.
Raz.
PS
I must say that it saves lots of CPU cycles.


Did you send the 2.6.15 patch in a private message? I can't find it in
the archives.

Thanks,

Dan


Re: [PATCH 010 of 10] md: Allow the write_mostly flag to be set via sysfs.

2006-08-05 Thread Mike Snitzer

On 8/5/06, Mike Snitzer [EMAIL PROTECTED] wrote:

Aside from this write-mostly sysfs support, is there a way to toggle
the write-mostly bit of an md member with mdadm?  I couldn't identify
a clear way to do so.

It'd be nice if mdadm --assemble would honor --write-mostly...


I went ahead and implemented the ability to toggle the write-mostly
bit for all disks in an array.  I did so by adding another type of
--update to --assemble.  This is very useful for a 2-disk raid1 (one
disk local, one remote): when you switch the raidhost you also need
to toggle the write-mostly bit.

I've tested the attached patch with both ver 0.90 and ver 1
superblocks, using mdadm 2.4.1 and 2.5.2.  The patch is against mdadm
2.4.1 but applies cleanly (with fuzz) against mdadm 2.5.2.

# cat /proc/mdstat
...
md2 : active raid1 nbd2[0] sdd[1](W)
 390613952 blocks [2/2] [UU]
 bitmap: 0/187 pages [0KB], 1024KB chunk

# mdadm -S /dev/md2
# mdadm --assemble /dev/md2 --run --update=toggle-write-mostly
/dev/sdd /dev/nbd2
mdadm: /dev/md2 has been started with 2 drives.

# cat /proc/mdstat
...
md2 : active raid1 nbd2[0](W) sdd[1]
 390613952 blocks [2/2] [UU]
 bitmap: 0/187 pages [0KB], 1024KB chunk
diff -Naur mdadm-2.4.1/mdadm.c mdadm-2.4.1_toggle_write_mostly/mdadm.c
--- mdadm-2.4.1/mdadm.c	2006-03-28 21:55:39.0 -0500
+++ mdadm-2.4.1_toggle_write_mostly/mdadm.c	2006-08-05 17:01:48.0 -0400
@@ -587,6 +587,8 @@
 				continue;
 			if (strcmp(update, "uuid")==0)
 				continue;
+			if (strcmp(update, "toggle-write-mostly")==0)
+				continue;
 			if (strcmp(update, "byteorder")==0) {
 				if (ss) {
 					fprintf(stderr, Name ": must not set metadata type with --update=byteorder.\n");
@@ -601,7 +603,7 @@
 
 				continue;
 			}
-			fprintf(stderr, Name ": '--update %s' invalid.  Only 'sparc2.2', 'super-minor', 'uuid', 'resync' or 'summaries' supported\n",update);
+			fprintf(stderr, Name ": '--update %s' invalid.  Only 'sparc2.2', 'super-minor', 'uuid', 'resync', 'summaries' or 'toggle-write-mostly' supported\n",update);
 			exit(2);
 
 		case O(ASSEMBLE,'c'): /* config file */
diff -Naur mdadm-2.4.1/super0.c mdadm-2.4.1_toggle_write_mostly/super0.c
--- mdadm-2.4.1/super0.c	2006-03-28 01:10:51.0 -0500
+++ mdadm-2.4.1_toggle_write_mostly/super0.c	2006-08-05 18:04:45.0 -0400
@@ -382,6 +382,10 @@
 			rv = 1;
 		}
 	}
+	if (strcmp(update, "toggle-write-mostly")==0) {
+		int d = info->disk.number;
+		sb->disks[d].state ^= (1<<MD_DISK_WRITEMOSTLY);
+	}
 	if (strcmp(update, "newdev") == 0) {
 		int d = info->disk.number;
 		memset(&sb->disks[d], 0, sizeof(sb->disks[d]));
diff -Naur mdadm-2.4.1/super1.c mdadm-2.4.1_toggle_write_mostly/super1.c
--- mdadm-2.4.1/super1.c	2006-04-07 00:32:06.0 -0400
+++ mdadm-2.4.1_toggle_write_mostly/super1.c	2006-08-05 18:33:21.0 -0400
@@ -446,6 +446,9 @@
 			rv = 1;
 		}
 	}
+	if (strcmp(update, "toggle-write-mostly")==0) {
+		sb->devflags ^= WriteMostly1;
+	}
 #if 0
 	if (strcmp(update, "newdev") == 0) {
 		int d = info->disk.number;
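(For comparison, once the sysfs support from this patch series is in, the flag
could presumably also be flipped at runtime without a stop/assemble - the
device name below is only an example:)

# echo writemostly  > /sys/block/md2/md/dev-nbd2/state    # set the flag
# echo -writemostly > /sys/block/md2/md/dev-nbd2/state    # clear it again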


issue with mdadm ver1 sb and bitmap on x86_64

2006-08-05 Thread Mike Snitzer

FYI, with both mdadm ver 2.4.1 and 2.5.2 I can't mdadm --create with a
ver1 superblock and a write intent bitmap on x86_64.

running: mdadm --create /dev/md2 -e 1.0 -l 1 --bitmap=internal -n 2
/dev/sdd --write-mostly /dev/nbd2
I get: mdadm: RUN_ARRAY failed: Invalid argument
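A possible workaround (untested here, so no guarantee it avoids the same code
path) might be to create the array without the bitmap and add it afterwards:

# mdadm --create /dev/md2 -e 1.0 -l 1 -n 2 /dev/sdd --write-mostly /dev/nbd2
# mdadm --grow /dev/md2 --bitmap=internal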

Mike