Re: Kernel 2.6.23.9 + mdadm 2.6.2-2 + Auto rebuild RAID1?

2007-12-07 Thread Jan Engelhardt

On Dec 7 2007 07:30, Nix wrote:
On 6 Dec 2007, Jan Engelhardt verbalised:
 On Dec 5 2007 19:29, Nix wrote:
 On Dec 1 2007 06:19, Justin Piszcz wrote:

 RAID1, 0.90.03 superblocks (in order to be compatible with LILO, if
 you use 1.x superblocks with LILO you can't boot)

 Says who? (Don't use LILO ;-)

Well, your kernels must be on a 0.90-superblocked RAID-0 or RAID-1
device. It can't handle booting off 1.x superblocks nor RAID-[56]
(not that I could really hope for the latter).

 If the superblock is at the end (which is the case for 0.90 and 1.0),
 then the offsets for a specific block on /dev/mdX match the ones for
 /dev/sda, so it should be easy to use lilo on 1.0 too, no?

Sure, but you may have to hack /sbin/lilo to convince it to create the
superblock there at all. It's likely to recognise that this is an md
device without a v0.90 superblock and refuse to continue. (But I haven't
tested it.)

In that case, see above - move to a different bootloader.
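
For completeness, creating such a mirror would look roughly like this (a
sketch only; the device names are illustrative and not taken from this
thread). 1.0 metadata, like 0.90, sits at the end of each component, so
block offsets on /dev/md0 line up with those on the underlying partitions:

  mdadm --create /dev/md0 --metadata=1.0 --level=1 --raid-devices=2 \
        /dev/sda1 /dev/sdb1

Whether /sbin/lilo then accepts such an array without being patched is
exactly the untested part above.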


Re: [PATCH] (2nd try) force parallel resync

2007-12-07 Thread Bernd Schubert
Hello Neil,

On Friday 07 December 2007 03:10:37 Neil Brown wrote:
 On Thursday December 6, [EMAIL PROTECTED] wrote:
  Hello,
 
  here is the second version of the patch. With this version the sync_thread
  is also woken up when /sys/block/*/md/sync_force_parallel is set.
  Though, I still don't understand why md_wakeup_thread() is not working.

 Could you give a little more detail on why you want this?  When do you
 want multiple arrays on the same device to sync at the same time?
 What exactly is the hardware like?

I did describe it in my first mail; I guess you just missed that one, so I'm
pasting the relevant part here:

On Monday 03 December 2007 14:37:03 Bernd Schubert wrote:
 Hello Neil,


 when doing software RAID over some Infortrend hardware RAID boxes, we can
 easily drive a single CPU to 100% while the hardware RAID boxes are still
 utilized at only 50% or less.
 So we are using *several* software RAIDs instead of only one. But then it
 also makes no sense for us to resync only one of the md arrays at a time.
 Below is a patch to force parallel resync for specified md sets via sysfs.
 This still leaves one problem, though. If a resync starts before this
 parameter is set, I don't know how to wake up the sleeping resync thread.
 Any ideas?
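
For illustration only (array names and the rest of the output are made up,
not from our setup), this is how the current serialisation shows up in
/proc/mdstat when two md arrays sit on the same underlying devices - the
second one just waits:

  md3 : active raid5 ...
        resync=DELAYED

while the first one resyncs, even though the shared hardware is nowhere near
saturated.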




 md threads generally run for a little while to perform some task, then
 stop and wait to be needed again.  md_wakeup_thread says you are
 needed again.

 The resync/recovery thread is a bit different.  It just runs md_do_sync
 once.  md_wakeup_thread is not really meaningful in that context.

 What you want is:
    wake_up(&resync_wait);

 that will get any thread that is waiting for some other array to
 resync to wake up and see if something needs to be done.

Ah thanks a lot! Now I understand. What about this?


Signed-off-by: Bernd Schubert [EMAIL PROTECTED]


Index: linux-2.6.22/drivers/md/md.c
===
--- linux-2.6.22.orig/drivers/md/md.c   2007-12-06 19:51:55.0 +0100
+++ linux-2.6.22/drivers/md/md.c        2007-12-07 12:07:47.0 +0100
@@ -74,6 +74,8 @@ static DEFINE_SPINLOCK(pers_lock);
 
 static void md_print_devices(void);
 
+static DECLARE_WAIT_QUEUE_HEAD(resync_wait);
+
 #define MD_BUG(x...) { printk("md: bug in file %s, line %d\n", __FILE__, __LINE__); md_print_devices(); }
 
 /*
@@ -2843,6 +2845,34 @@ __ATTR(sync_speed_max, S_IRUGO|S_IWUSR, 
 
 
 static ssize_t
+sync_force_parallel_show(mddev_t *mddev, char *page)
+{
+       return sprintf(page, "%d\n", mddev->parallel_resync);
+}
+
+static ssize_t
+sync_force_parallel_store(mddev_t *mddev, const char *buf, size_t len)
+{
+       char *e;
+       unsigned long n = simple_strtoul(buf, &e, 10);
+
+       if (!*buf || (*e && *e != '\n') || (n != 0 && n != 1))
+               return -EINVAL;
+
+       mddev->parallel_resync = n;
+
+       if (mddev->sync_thread) {
+               wake_up(&resync_wait);
+       }
+       return len;
+}
+
+/* force parallel resync, even with shared block devices */
+static struct md_sysfs_entry md_sync_force_parallel =
+__ATTR(sync_force_parallel, S_IRUGO|S_IWUSR,
+   sync_force_parallel_show, sync_force_parallel_store);
+
+static ssize_t
 sync_speed_show(mddev_t *mddev, char *page)
 {
unsigned long resync, dt, db;
@@ -2980,6 +3010,7 @@ static struct attribute *md_redundancy_a
        &md_sync_min.attr,
        &md_sync_max.attr,
        &md_sync_speed.attr,
+       &md_sync_force_parallel.attr,
        &md_sync_completed.attr,
        &md_suspend_lo.attr,
        &md_suspend_hi.attr,
@@ -5199,8 +5230,6 @@ void md_allow_write(mddev_t *mddev)
 }
 EXPORT_SYMBOL_GPL(md_allow_write);
 
-static DECLARE_WAIT_QUEUE_HEAD(resync_wait);
-
 #define SYNC_MARKS 10
 #define SYNC_MARK_STEP  (3*HZ)
 void md_do_sync(mddev_t *mddev)
@@ -5264,8 +5293,9 @@ void md_do_sync(mddev_t *mddev)
ITERATE_MDDEV(mddev2,tmp) {
if (mddev2 == mddev)
continue;
-               if (mddev2->curr_resync &&
-                   match_mddev_units(mddev,mddev2)) {
+               if (!mddev->parallel_resync
+               &&  mddev2->curr_resync
+               &&  match_mddev_units(mddev,mddev2)) {
                DEFINE_WAIT(wq);
                if (mddev < mddev2 && mddev->curr_resync == 2) {
/* arbitrarily yield */
Index: linux-2.6.22/include/linux/raid/md_k.h
===
--- linux-2.6.22.orig/include/linux/raid/md_k.h 2007-12-06 19:51:55.0 +0100
+++ linux-2.6.22/include/linux/raid/md_k.h      2007-12-06 19:52:33.0 +0100
@@ -170,6 +170,9 @@ struct mddev_s
int sync_speed_min;
int sync_speed_max;
 
+   /* resync even though the same disks are shared among md-devices */
+       int                             parallel_resync;
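
With the attribute in place, forcing parallel resync from user space would
look something like this (md0/md1 are illustrative names; per the patch,
writing 1 sets mddev->parallel_resync and wakes anything sleeping on
resync_wait):

# echo 1 > /sys/block/md0/md/sync_force_parallel
# echo 1 > /sys/block/md1/md/sync_force_parallel
# cat /proc/mdstat

Both arrays should then show resync progress at the same time instead of one
of them sitting in resync=DELAYED.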

Re: raid6 check/repair

2007-12-07 Thread Gabor Gombas
On Wed, Dec 05, 2007 at 03:31:14PM -0500, Bill Davidsen wrote:

 BTW: if this can be done in a user program, mdadm, rather than by code in 
 the kernel, that might well make everyone happy. Okay, realistically less 
 unhappy.

I'm starting to like the idea. Of course you can't repair a running array
from user space (just think of something rewriting the full stripe while
mdadm is trying to fix the old data - you could end up with the data disks
containing the new data but the disks being "fixed" rewritten with the old
data).

We would just need to make the kernel not try to fix anything but merely
report that something is wrong - but wait, using "check" instead of
"repair" does that already.

So the kernel is fine as it is; we just need a simple user-space utility
that can take the components of a non-running array and repair a given
stripe using whatever method is appropriate. Shouldn't be too hard to
write for anyone interested...
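
For reference, this is the existing kernel interface referred to above
(array name illustrative); a check pass only counts and reports
inconsistencies, it does not rewrite anything:

# echo check > /sys/block/md0/md/sync_action
# cat /sys/block/md0/md/mismatch_cnt

A non-zero mismatch_cnt after the check completes is what the hypothetical
user-space repair tool would then act on.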

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
 -


Re: Few questions

2007-12-07 Thread Corey Hickey
Michael Makuch wrote:
 I realize this is the developers list and though I am a developer I'm 
 not a developer
 of linux raid, but I can find no other source of answers to these questions:

Don't worry; it's a user list too.

 $ cat /proc/mdstat
 Personalities : [raid6] [raid5] [raid4]
 md0 : active raid5 etherd/e0.0[0] etherd/e0.2[9](S) etherd/e0.9[8] 
 etherd/e0.8[7] etherd/e0.7[6] etherd/e0.6[5] etherd/e0.5[4] 
 etherd/e0.4[3] etherd/e0.3[2] etherd/e0.1[1]
   3907091968 blocks level 5, 64k chunk, algorithm 2 [9/9] [UUUUUUUUU]
   []  resync = 64.5% (315458352/488386496) 
 finish=2228.0min speed=1292K/sec
 unused devices: <none>
 
 and I have no idea where the raid6 came from.

As far as I understand, the Personalities line just shows what RAID
capabilities are compiled into the kernel (and loaded, if modules). For
example, even though I'm only using raid0, I have:

-
$ cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md0 : active raid0 sdc[0] sdb[1]
  976772992 blocks 64k chunks

unused devices: <none>
-
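
One way to see where the extra personalities come from (commands only,
output varies by system): on recent kernels raid4, raid5 and raid6 are all
provided by the combined raid456 module, so loading it registers all three
personalities even if you only run a raid5 array.

$ lsmod | grep raid
$ grep '^CONFIG_MD_RAID' /boot/config-$(uname -r)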

-Corey


unable to remove failed drive

2007-12-07 Thread Jeff Breidenbach
... and all access to the array hangs indefinitely, resulting in unkillable
zombie processes. I have to hard reboot the machine. Any thoughts on the matter?

===

# cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sde1[6](F) sdg1[1] sdb1[4] sdd1[3] sdc1[2]
  488383936 blocks [6/4] [_UUUU_]

unused devices: <none>

# mdadm --fail /dev/md1 /dev/sde1
mdadm: set /dev/sde1 faulty in /dev/md1

# mdadm --remove /dev/md1 /dev/sde1
mdadm: hot remove failed for /dev/sde1: Device or resource busy

# mdadm -D /dev/md1
/dev/md1:
Version : 00.90.03
  Creation Time : Sun Dec 25 16:12:34 2005
 Raid Level : raid1
 Array Size : 488383936 (465.76 GiB 500.11 GB)
Device Size : 488383936 (465.76 GiB 500.11 GB)
   Raid Devices : 6
  Total Devices : 5
Preferred Minor : 1
Persistence : Superblock is persistent

Update Time : Fri Dec  7 11:37:46 2007
  State : active, degraded
 Active Devices : 4
Working Devices : 4
 Failed Devices : 1
  Spare Devices : 0

   UUID : f3ee6aa3:2f1d5767:f443dfc0:c23e80af
 Events : 0.22331500

    Number   Major   Minor   RaidDevice State
       0       0        0        -      removed
       1       8       97        1      active sync   /dev/sdg1
       2       8       33        2      active sync   /dev/sdc1
       3       8       49        3      active sync   /dev/sdd1
       4       8       17        4      active sync   /dev/sdb1
       5       0        0        -      removed

       6       8       65        0      faulty   /dev/sde1


# dmesg
sd 4:0:0:0: SCSI error: return code = 0x802
sde: Current: sense key: Aborted Command
Additional sense: Scsi parity error
end_request: I/O error, dev sde, sector 594882271
raid1: sde1: rescheduling sector 594882208
ata5: command timeout
ata5: translated ATA stat/err 0xff/00 to SCSI SK/ASC/ASCQ 0xb/47/00
ata5: status=0xff { Busy }
sd 4:0:0:0: SCSI error: return code = 0x802
sde: Current: sense key: Aborted Command
Additional sense: Scsi parity error
end_request: I/O error, dev sde, sector 528737607
raid1: sde1: rescheduling sector 528737544
ata5: command timeout
ata5: translated ATA stat/err 0xff/00 to SCSI SK/ASC/ASCQ 0xb/47/00
ata5: status=0xff { Busy }
sd 4:0:0:0: SCSI error: return code = 0x802
sde: Current: sense key: Aborted Command
Additional sense: Scsi parity error
end_request: I/O error, dev sde, sector 814377071

[...]

md: cannot remove active disk sde1 from md1 ...
md: could not bd_claim sde1.
md: error, md_import_device() returned -16

# cat /proc/version
Linux version 2.6.17-2-amd64 (Debian 2.6.17-7) ([EMAIL PROTECTED]) (gcc
version 4.1.2 20060814 (prerelease) (Debian 4.1.1-11)) #1 SMP Thu Aug
24 16:13:57 UTC 2006
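
If your kernel exposes the per-device md sysfs files (reasonably recent 2.6
kernels do), it may be worth seeing what md itself thinks the device state
is, and trying the sysfs removal path as an alternative to mdadm --remove:

# cat /sys/block/md1/md/dev-sde1/state
# echo remove > /sys/block/md1/md/dev-sde1/state

This is only a diagnostic suggestion; if the ata5 port is wedged, it can
fail with the same "Device or resource busy" as the ioctl path.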


RE: Few questions

2007-12-07 Thread Guy Watkins
man md
man mdadm

I use RAID6.  Happy with it so far, but haven't had a disk failure yet.
RAID5 sucks because if you have 1 failed disk and 1 bad block on any other
disk, you are hosed.

Hope that helps.
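
For the "is raid6 documented anywhere" part: besides the RAID6 notes in man
md and man mdadm, creating a RAID6 set is just a --level=6 create. The device
names and counts below are only illustrative, not taken from your setup:

# mdadm --create /dev/md1 --level=6 --raid-devices=10 --chunk=64 /dev/sd[b-k]1

RAID6 keeps two independent parity blocks per stripe, which is what covers
the one-failed-disk-plus-one-bad-block case mentioned above.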

} -Original Message-
} From: [EMAIL PROTECTED] [mailto:linux-raid-
} [EMAIL PROTECTED] On Behalf Of Michael Makuch
} Sent: Friday, December 07, 2007 7:12 PM
} To: linux-raid@vger.kernel.org
} Subject: Few questions
} 
} I realize this is the developers list and though I am a developer I'm
} not a developer
} of linux raid, but I can find no other source of answers to these
} questions:
} 
} I've been using linux software raid (5) for a couple of years, having
} recently upped
} to the 2.6.23 kernel (FC7, was previously on FC5). I just noticed that
} my /proc/mdstat shows
} 
} $ cat /proc/mdstat
} Personalities : [raid6] [raid5] [raid4]
} md0 : active raid5 etherd/e0.0[0] etherd/e0.2[9](S) etherd/e0.9[8]
} etherd/e0.8[7] etherd/e0.7[6] etherd/e0.6[5] etherd/e0.5[4]
} etherd/e0.4[3] etherd/e0.3[2] etherd/e0.1[1]
}   3907091968 blocks level 5, 64k chunk, algorithm 2 [9/9] [UUUUUUUUU]
}   []  resync = 64.5% (315458352/488386496)
} finish=2228.0min speed=1292K/sec
} unused devices: <none>
} 
} and I have no idea where the raid6 came from. The only thing I've found
} on raid6
} is a wikipedia.org page, nothing on
} http://tldp.org/HOWTO/Software-RAID-HOWTO.html
} 
} So my questions are:
} 
} - Is raid6 documented anywhere? If so, where? I'd like to take advantage
} of it if
} it's really there.
} - Why does my array (which I configured as raid5) have personalities of
} raid6 (I can understand why raid4 would be there)?
} - Is this a.o.k for a raid5 array?
} 
} Thanks



Few questions

2007-12-07 Thread Michael Makuch
I realize this is the developers list, and though I am a developer I'm not a
developer of linux raid, but I can find no other source of answers to these
questions:

I've been using linux software raid (5) for a couple of years, having
recently upped to the 2.6.23 kernel (FC7, was previously on FC5). I just
noticed that my /proc/mdstat shows

$ cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 etherd/e0.0[0] etherd/e0.2[9](S) etherd/e0.9[8] 
etherd/e0.8[7] etherd/e0.7[6] etherd/e0.6[5] etherd/e0.5[4] 
etherd/e0.4[3] etherd/e0.3[2] etherd/e0.1[1]

 3907091968 blocks level 5, 64k chunk, algorithm 2 [9/9] [UUUUUUUUU]
 []  resync = 64.5% (315458352/488386496) 
finish=2228.0min speed=1292K/sec

unused devices: <none>

and I have no idea where the raid6 came from. The only thing I've found on
raid6 is a wikipedia.org page, nothing on
http://tldp.org/HOWTO/Software-RAID-HOWTO.html

So my questions are:

- Is raid6 documented anywhere? If so, where? I'd like to take advantage of
  it if it's really there.
- Why does my array (which I configured as raid5) have personalities of
  raid6 (I can understand why raid4 would be there)?
- Is this a.o.k. for a raid5 array?

Thanks