Linus,
  There is a buggy "BUG" in the raid5 code.
 If a request on an underlying device reports an error, raid5 finds out
 which device that was and marks it as failed.  This is fine.
 If another request on the same device then reports an error, raid5
 fails to find that device in its table (because, though it is there,
 it is no longer operational), so it thinks something is wrong and
 calls MD_BUG().  That is very noisy, though not actually harmful
 (except to the confidence of the sysadmin).

 This patch changes the test so that a failure on a drive that is
 known but not operational will be "Expected" and not a "BUG".
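
 To make the difference concrete, here is a minimal standalone sketch of
 the two tests.  It is not the kernel code: struct disk_info is cut down
 to two fields, and error_old(), error_new(), check_repeated_error() and
 the HANDLED/NOT_FOUND return values are hypothetical names used only
 for illustration.  All the superblock bookkeeping is left out.

```c
#include <assert.h>

/* Hypothetical, cut-down stand-in for the per-disk state in raid5.c. */
struct disk_info {
    int dev;          /* device identifier */
    int operational;  /* 1 while the disk is in service */
};

enum { HANDLED = 0, NOT_FOUND = 1 };

/* Old test: only a disk that is still operational counts as a match. */
static int error_old(struct disk_info *disks, int n, int dev)
{
    int i;

    for (i = 0; i < n; i++) {
        if (disks[i].dev == dev && disks[i].operational) {
            disks[i].operational = 0;
            return HANDLED;
        }
    }
    /* Per the description above, this is the path that ends in MD_BUG(). */
    return NOT_FOUND;
}

/* New test: a known device always matches; only the first error changes state. */
static int error_new(struct disk_info *disks, int n, int dev)
{
    int i;

    for (i = 0; i < n; i++) {
        if (disks[i].dev == dev) {
            if (disks[i].operational)
                disks[i].operational = 0;
            return HANDLED;
        }
    }
    return NOT_FOUND;
}

/* Two errors in a row on the same device, under each version of the test. */
static void check_repeated_error(void)
{
    struct disk_info a[2] = { {1, 1}, {2, 1} };
    struct disk_info b[2] = { {1, 1}, {2, 1} };

    assert(error_old(a, 2, 2) == HANDLED);
    assert(error_old(a, 2, 2) == NOT_FOUND);  /* second error: spurious "BUG" */

    assert(error_new(b, 2, 2) == HANDLED);
    assert(error_new(b, 2, 2) == HANDLED);    /* second error: expected */
    assert(b[1].operational == 0);            /* disk stays marked failed */
}
```

 In this sketch a second error on an already-failed device falls through
 the loop with the old test, but is recognised and quietly accepted with
 the new one.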

NeilBrown

--- ./drivers/md/raid5.c        2001/06/21 01:04:05     1.4
+++ ./drivers/md/raid5.c        2001/06/21 01:04:41     1.5
@@ -486,22 +486,24 @@
        PRINTK("raid5_error called\n");
        conf->resync_parity = 0;
        for (i = 0, disk = conf->disks; i < conf->raid_disks; i++, disk++) {
-               if (disk->dev == dev && disk->operational) {
-                       disk->operational = 0;
-                       mark_disk_faulty(sb->disks+disk->number);
-                       mark_disk_nonsync(sb->disks+disk->number);
-                       mark_disk_inactive(sb->disks+disk->number);
-                       sb->active_disks--;
-                       sb->working_disks--;
-                       sb->failed_disks++;
-                       mddev->sb_dirty = 1;
-                       conf->working_disks--;
-                       conf->failed_disks++;
-                       md_wakeup_thread(conf->thread);
-                       printk (KERN_ALERT
-                               "raid5: Disk failure on %s, disabling device."
-                               " Operation continuing on %d devices\n",
-                               partition_name (dev), conf->working_disks);
+               if (disk->dev == dev) {
+                       if (disk->operational) {
+                               disk->operational = 0;
+                               mark_disk_faulty(sb->disks+disk->number);
+                               mark_disk_nonsync(sb->disks+disk->number);
+                               mark_disk_inactive(sb->disks+disk->number);
+                               sb->active_disks--;
+                               sb->working_disks--;
+                               sb->failed_disks++;
+                               mddev->sb_dirty = 1;
+                               conf->working_disks--;
+                               conf->failed_disks++;
+                               md_wakeup_thread(conf->thread);
+                               printk (KERN_ALERT
+                                       "raid5: Disk failure on %s, disabling device."
+                                       " Operation continuing on %d devices\n",
+                                       partition_name (dev), conf->working_disks);
+                       }
                        return 0;
                }
        }