Re: raid1 oops, 2.6.16

2006-08-20 Thread Neil Brown
On Friday August 18, [EMAIL PROTECTED] wrote:
 
> The atomic_add (and any other access to an rdev, for that matter) needs
> to be guarded by a NULL check, I think.
> 
> Neil, does that sound right?

Yes.  Thanks for doing the analysis.

However, after looking closely through the code, I think the atomic_inc
really is in the wrong place.
We are increasing 'corrected_errors' before we try to write out the
good data.  We should really wait until after reading back the good
data.  So rather than just protecting the atomic_inc with if (rdev),
I'm going to move it several lines down.
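
Roughly, the intended reordering looks like this (a sketch only, using
the 2.6.16-era names from the raid1d() error-correction loop in
drivers/md/raid1.c; the actual hunk may differ):

	/* current: the counter is bumped before the NULL test and
	 * before we know the data was really corrected - this is
	 * the oops Jason hit.
	 */
	rdev = conf->mirrors[d].rdev;
	atomic_add(s, &rdev->corrected_errors);	/* oopses when rdev == NULL */
	if (rdev && test_bit(In_sync, &rdev->flags)) {
		if (sync_page_io(rdev->bdev, sect + rdev->data_offset,
				 s << 9, conf->tmppage, WRITE) == 0)
			/* Well, this device is dead */
			md_error(mddev, rdev);
	}

	/* intended: count the correction only once the good data has
	 * been read back successfully.
	 */
	rdev = conf->mirrors[d].rdev;
	if (rdev && test_bit(In_sync, &rdev->flags)) {
		if (sync_page_io(rdev->bdev, sect + rdev->data_offset,
				 s << 9, conf->tmppage, READ) == 0)
			md_error(mddev, rdev);
		else
			atomic_add(s, &rdev->corrected_errors);
	}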

And while I'm at it, there seem to be several places that reference
->rdev that could be cleaned up.

So I'm thinking of something like the following against -mm.

I'll make a smaller patch for -stable.
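
For -stable, the minimal change would be essentially your suggestion:
keep the current placement, but don't touch rdev until after the NULL
test.  A sketch (not the actual submitted patch):

	rdev = conf->mirrors[d].rdev;
	if (rdev && test_bit(In_sync, &rdev->flags)) {
		atomic_add(s, &rdev->corrected_errors);
		if (sync_page_io(rdev->bdev, sect + rdev->data_offset,
				 s << 9, conf->tmppage, WRITE) == 0)
			/* Well, this device is dead */
			md_error(mddev, rdev);
	}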

Thanks,
NeilBrown

Signed-off-by: Neil Brown [EMAIL PROTECTED]

### Diffstat output
 ./drivers/md/raid1.c |   55 +++++++++++++++++++++++++++++++--------------------
 1 file changed, 34 insertions(+), 21 deletions(-)

diff .prev/drivers/md/raid1.c ./drivers/md/raid1.c
--- .prev/drivers/md/raid1.c	2006-08-15 13:33:02.000000000 +1000
+++ ./drivers/md/raid1.c	2006-08-21 09:40:41.000000000 +1000
@@ -930,10 +930,13 @@ static void status(struct seq_file *seq,
 
 	seq_printf(seq, " [%d/%d] [", conf->raid_disks,
 		   conf->raid_disks - mddev->degraded);
-	for (i = 0; i < conf->raid_disks; i++)
+	rcu_read_lock();
+	for (i = 0; i < conf->raid_disks; i++) {
+		mdk_rdev_t *rdev = rcu_dereference(conf->mirrors[i].rdev);
 		seq_printf(seq, "%s",
-			   conf->mirrors[i].rdev &&
-			   test_bit(In_sync, &conf->mirrors[i].rdev->flags) ? "U" : "_");
+			   rdev && test_bit(In_sync, &rdev->flags) ? "U" : "_");
+	}
+	rcu_read_unlock();
 	seq_printf(seq, "]");
 }
 
@@ -976,7 +979,6 @@ static void error(mddev_t *mddev, mdk_rd
 static void print_conf(conf_t *conf)
 {
 	int i;
-	mirror_info_t *tmp;
 
 	printk("RAID1 conf printout:\n");
 	if (!conf) {
@@ -986,14 +988,17 @@ static void print_conf(conf_t *conf)
 	printk(" --- wd:%d rd:%d\n", conf->raid_disks - conf->mddev->degraded,
 		conf->raid_disks);
 
+	rcu_read_lock();
 	for (i = 0; i < conf->raid_disks; i++) {
 		char b[BDEVNAME_SIZE];
-		tmp = conf->mirrors + i;
-		if (tmp->rdev)
+		mdk_rdev_t *rdev = rcu_dereference(conf->mirrors[i].rdev);
+		if (rdev)
 			printk(" disk %d, wo:%d, o:%d, dev:%s\n",
-				i, !test_bit(In_sync, &tmp->rdev->flags), !test_bit(Faulty, &tmp->rdev->flags),
-				bdevname(tmp->rdev->bdev,b));
+			       i, !test_bit(In_sync, &rdev->flags),
+			       !test_bit(Faulty, &rdev->flags),
+			       bdevname(rdev->bdev,b));
 	}
+	rcu_read_unlock();
 }
 
 static void close_sync(conf_t *conf)
@@ -1009,17 +1014,17 @@ static int raid1_spare_active(mddev_t *m
 {
 	int i;
 	conf_t *conf = mddev->private;
-	mirror_info_t *tmp;
 
 	/*
 	 * Find all failed disks within the RAID1 configuration
-	 * and mark them readable
+	 * and mark them readable.
+	 * Called under mddev lock, so rcu protection not needed.
 	 */
 	for (i = 0; i < conf->raid_disks; i++) {
-		tmp = conf->mirrors + i;
-		if (tmp->rdev
-		    && !test_bit(Faulty, &tmp->rdev->flags)
-		    && !test_and_set_bit(In_sync, &tmp->rdev->flags)) {
+		mdk_rdev_t *rdev = conf->mirrors[i].rdev;
+		if (rdev
+		    && !test_bit(Faulty, &rdev->flags)
+		    && !test_and_set_bit(In_sync, &rdev->flags)) {
 			unsigned long flags;
 			spin_lock_irqsave(&conf->device_lock, flags);
 			mddev->degraded--;
@@ -1239,7 +1244,7 @@ static void sync_request_write(mddev_t *
 		/* ouch - failed to read all of that.
 		 * Try some synchronous reads of other devices to get
 		 * good data, much like with normal read errors.  Only
-		 * read into the pages we already have so they we don't
+		 * read into the pages we already have so we don't
 		 * need to re-issue the read request.
 		 * We don't need to freeze the array, because being in an
 		 * active sync request, there is no normal IO, and
@@ -1259,6 +1264,10 @@ static void sync_request_write(mddev_t *
 			s = PAGE_SIZE >> 9;
 			do {
 				if (r1_bio->bios[d]->bi_end_io == end_sync_read) {
+					/* No rcu protection needed here devices
+					 * can only be removed when no resync is
+					 * active, and resync is currently active
+					 */

raid1 oops, 2.6.16

2006-08-05 Thread Jason Lunz
I just had a disk die in a 2.6.16 (debian kernel) raid1 server, and it's
triggered an oops in raid1.

There are a bunch of 2-partition mirrors:

Personalities : [raid1]
md5 : active raid1 hdc7[2](F) hda7[1]
  77625984 blocks [2/1] [_U]

md4 : active raid1 hdc6[2](F) hda6[1]
  16000640 blocks [2/1] [_U]

md3 : active raid1 hdc5[2](F) hda5[1]
  12570752 blocks [2/1] [_U]

md2 : active raid1 hdc3[2](F) hda3[1]
  8000256 blocks [2/1] [_U]

md1 : active raid1 hdc2[2](F) hda2[1]
  2000000 blocks [2/1] [_U]

md0 : active raid1 hdc1[2](F) hda1[1]
  995904 blocks [2/1] [_U]

unused devices: <none>


amidst all the failure messages for these arrays in dmesg, I have this:


RAID1 conf printout:
 --- wd:1 rd:2
 disk 1, wo:0, o:1, dev:hda1
Unable to handle kernel NULL pointer dereference at virtual address 00000088
 printing eip:
f0831ea8
*pde = 00000000
Oops: 0002 [#1]
Modules linked in: thermal fan button processor ac battery e1000 rtc ext3 jbd 
mbcache raid1 md_mod ide_disk generic siimage ide_core evdev mousedev
CPU:    0
EIP:    0060:[<f0831ea8>]    Not tainted VLI
EFLAGS: 00010246   (2.6.16-2-686 #1)
EIP is at raid1d+0x2c8/0x4c3 [raid1]
eax: 00000008   ebx: 00000000   ecx: c9c60100   edx: b1a1ac60
esi: 00000000   edi: dd67c6c0   ebp: efd52740   esp: b1ac1f08
ds: 007b   es: 007b   ss: 0068
Process md5_raid1 (pid: 1001, threadinfo=b1ac0000 task=b1bc5a70)
Stack: 000b1 5e341300 003e0387 00000001 00000001 00000008 00000008 025e1458
       b1bc5b98 00000001 efd5275c b1ac1fa4 7fffffff b026e32b 00000005
       b1ac0000 b1ac1f84 7fffffff 00000000 b1a1aba0 b1ac1f84 b1ac1fa4 7fffffff
Call Trace:
 [<b026e32b>] schedule_timeout+0x13/0x8e
 [<f0864095>] md_thread+0xe3/0xfb [md_mod]
 [<b012522e>] autoremove_wake_function+0x0/0x3a
 [<b026dced>] schedule+0x45f/0x4cd
 [<b012522e>] autoremove_wake_function+0x0/0x3a
 [<f0863fb2>] md_thread+0x0/0xfb [md_mod]
 [<b0124efe>] kthread+0x79/0xa3
 [<b0124e85>] kthread+0x0/0xa3
 [<b01012cd>] kernel_thread_helper+0x5/0xb
Code: 83 7c 24 10 00 8b 47 20 0f 84 dc 00 00 00 89 74 24 0c 39 c6 74 63 85 f6 75 03 8b 75 08 4e 8b 55 04 6b c6 0c 8b 1c 02 8b 44 24 14 01 83 88 00 00 00 85 db 74 3f 8b 43 70 a8 04 74 38 6a 01 ff 75
 end_request: I/O error, dev hdc, sector 26464418


The server is still running, but processes (like sync(1)) are getting hung in D
state.

Jason

-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html