[patch v2 1/1] md: Software Raid autodetect dev list not array

2007-08-26 Thread Michael J. Evans
From: Michael J. Evans [EMAIL PROTECTED]

In current release kernels the md module (Software RAID) uses a static array
 (dev_t[128]) to store partition/device info temporarily for autostart.

This patch replaces that static array with a list.

Signed-off-by: Michael J. Evans [EMAIL PROTECTED]
--- 
Version 2: Following Neil Brown's requests...
using list_add_tail, and corrected missing i_passed++;.
removed sections of code that would never be reached.
- -
The data/structures are only used within md.c, and very close together.
However I wonder if the structural information shouldn't go in to...
../../include/linux/raid/md_k.h instead.


I discovered this (and that the devices are added as disks/partitions are
discovered at boot) while I was debugging why only one of my MD arrays would
come up whole, while all the others were short a disk.

I eventually discovered that it was enumerating through all of 9 of my 11 hds
(2 had only 4 partitions apiece) while the other 9 have 15 partitions
(I wanted 64 per drive...). The last partition of the 8th drive in my 9 drive
raid 5 sets wasn't added, thus making the final md array short both a parity
and data disk, and it was started later, elsewhere.

Subject: [patch 1/1] md: Software Raid autodetect dev list not array

SOFTWARE RAID (Multiple Disks) SUPPORT
P:  Ingo Molnar
M:  [EMAIL PROTECTED]
P:  Neil Brown
M:  [EMAIL PROTECTED]
L:  linux-raid@vger.kernel.org
S:  Supported
Unless you have a reason NOT to do so, CC [EMAIL PROTECTED]

12: Has been tested with CONFIG_PREEMPT, CONFIG_DEBUG_PREEMPT,
CONFIG_DEBUG_SLAB, CONFIG_DEBUG_PAGEALLOC, CONFIG_DEBUG_MUTEXES,
CONFIG_DEBUG_SPINLOCK, CONFIG_DEBUG_SPINLOCK_SLEEP all simultaneously
enabled.

It has been tested with CONFIG_SMP set and unset (Different x86_64 systems).
It has been tested with CONFIG_PREEMPT set and unset (same system).
CONFIG_LBD isn't even an option in my .config file.

Note: between 2.6.22 and 2.6.23-rc3-git5
rdev = md_import_device(dev,0, 0);
became
rdev = md_import_device(dev,0, 90);
So the patch has been edited to patch around that line. (might be fuzzy)

Signed-off-by: Michael J. Evans [EMAIL PROTECTED]
=
--- linux/drivers/md/md.c.orig  2007-08-21 03:19:42.511576248 -0700
+++ linux/drivers/md/md.c   2007-08-21 04:30:09.775525710 -0700
@@ -24,4 +24,6 @@

+   - autodetect dev list not array: Michael J. Evans [EMAIL PROTECTED]
+
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2, or (at your option)
@@ -5752,13 +5754,24 @@ void md_autodetect_dev(dev_t dev)
  * Searches all registered partitions for autorun RAID arrays
  * at boot time.
  */
-static dev_t detected_devices[128];
-static int dev_cnt;
+
+static LIST_HEAD(all_detected_devices);
+struct detected_devices_node {
+   struct list_head list;
+   dev_t dev;
+};
 
 void md_autodetect_dev(dev_t dev)
 {
-   if (dev_cnt >= 0 && dev_cnt < 127)
-       detected_devices[dev_cnt++] = dev;
+   struct detected_devices_node *node_detected_dev;
+   node_detected_dev = kzalloc(sizeof(*node_detected_dev), GFP_KERNEL);\
+   if (node_detected_dev) {
+       node_detected_dev->dev = dev;
+       list_add_tail(&node_detected_dev->list, &all_detected_devices);
+   } else {
+       printk(KERN_CRIT "md: kzAlloc node failed, skipping device."
+           " : 0x%p.\n", node_detected_dev);
+   }
 }
 
 
@@ -5765,7 +5778,12 @@ static void autostart_arrays(int part)
 static void autostart_arrays(int part)
 {
mdk_rdev_t *rdev;
-   int i;
+   struct detected_devices_node *node_detected_dev;
+   dev_t dev;
+   int i_scanned, i_passed;
+   signed int i_found;
+   i_scanned = 0;
+   i_passed = 0;
 
    printk(KERN_INFO "md: Autodetecting RAID arrays.\n");
 
@@ -5772,3 +5790,8 @@ static void autostart_arrays(int part)
-   for (i = 0; i < dev_cnt; i++) {
-       dev_t dev = detected_devices[i];
-
+   /* FIXME: max 'int' #DEFINEd somewhere?  not   0x7FFF ? */
+   while (!list_empty(&all_detected_devices) && i_scanned < 0x7FFF) {
+       i_scanned++;
+       node_detected_dev = list_entry(all_detected_devices.next,
+                   struct detected_devices_node, list);
+       list_del(&node_detected_dev->list);
+       dev = node_detected_dev->dev;
+       kfree(node_detected_dev);
@@ -5781,8 +5806,11 @@ static void autostart_arrays(int part)
continue;
}
        list_add(&rdev->same_set, &pending_raid_disks);
+   i_passed++;
}
-   dev_cnt = 0;
+
+   printk(KERN_INFO "md: Scanned %d and added %d devices.\n",
+

Re: [patch v2 1/1] md: Software Raid autodetect dev list not array

2007-08-26 Thread Michael Evans
Also, I forgot to mention that I added the counters mostly for
debugging.  However, they are useful in the same way that listing the
partitions when a new disk is added is useful (in fact they augment
that listing and the existing messages the autodetect routines
provide).

As for using autodetect or not... the only way to skip it seems to be
compiling md's raid support as a module.  I checked 2.6.22's
menuconfig and there's no way for me to explicitly turn it on or off
at compile time.  I also feel that forcing the addition of a boot
parameter to de-activate a broken and deprecated system you aren't
even aware you are getting is somehow wrong.  So if you have over 128
devices for it to scan, as I do on one of my PCs, then it can bring up
an array in degraded mode.

... crud.

I also just noticed, while looking to see if there was some existing
way of detecting if debugging were enabled and to be extra-verbose,
that I left in one of my other debugging variables by mistake.
i_found. Since it's signed, it must have been the variable I was using
to detect where my list matched the existing array in my initial
verification runs.


Are there any other things you'd like to see changed before I submit a
third patch version?
-
To unsubscribe from this list: send the line unsubscribe linux-raid in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [patch v2 1/1] md: Software Raid autodetect dev list not array

2007-08-26 Thread Jan Engelhardt

On Aug 26 2007 04:51, Michael J. Evans wrote:
 {
-  if (dev_cnt >= 0 && dev_cnt < 127)
-      detected_devices[dev_cnt++] = dev;
+  struct detected_devices_node *node_detected_dev;
+  node_detected_dev = kzalloc(sizeof(*node_detected_dev), GFP_KERNEL);\

What's the \ good for, besides escaping the newline
that is ignored as whitespace anyway? :-)

@@ -5772,3 +5790,8 @@ static void autostart_arrays(int part)
-  for (i = 0; i < dev_cnt; i++) {
-      dev_t dev = detected_devices[i];
-
+  /* FIXME: max 'int' #DEFINEd somewhere?  not   0x7FFF ? */
+  while (!list_empty(&all_detected_devices) && i_scanned < 0x7FFF) {

I doubt someone really has _that_ many devices. Of course, to be on the
safer side, make it an unsigned int. That way, people could put in about
0xFFFE devs (which is even less likely than 0x7FFF)

+  i_scanned++;
+  node_detected_dev = list_entry(all_detected_devices.next,
+  struct detected_devices_node, list);
+  list_del(&node_detected_dev->list);
+  dev = node_detected_dev->dev;
+  kfree(node_detected_dev);

Jan
-- 


Patch for boot-time assembly of v1.x-metadata-based soft (MD) arrays

2007-08-26 Thread Abe Skolnik
Dear Mr./Dr./Prof. Brown et al,

I recently had the unpleasant experience of creating an MD array for
the purpose of booting off it and then not being able to do so.  Since
I had already made changes to the array's contents relative to that
which I cloned it from, I did not want to reformat the array and
re-clone it just to bring it down to the old 0.90 metadata format so
that I would be able to boot off it, so I searched for a solution, and
I found it.

First I tried the patch (written by Neil Brown) which can be seen at...
  http://www.issociate.de/board/post/277868/

That patch did not work as-is, but with some more hacking, I got it
working.  I then cleaned up my work and added relevant comments.

I know that Mr./Dr./Prof. Brown is against in-kernel boot-time MD
assembly and prefers init[rd/ramfs], but I prefer in-kernel assembly,
and I think several other people do too.  Since this patch does not
(AFAIK) disable the init[rd/ramfs] way of bringing up MDs in boot-time,
I hope that this patch will be accepted and submitted up-stream for
future inclusion in the mainline kernel.org kernel distribution.
This way kernel users can choose their MD assembly strategy at will
without having to restrict themselves to the old metadata format.

I hope that this message finds all those who read it doing well and
feeling fine.

Sincerely,

Abe Skolnik

P.S.  Mr./Dr./Prof. Brown, in case you read this:  thanks!
  And if you want your name removed from the code, just say so.


add-ability-to-start-MDs-with-persistent-superblock-v1.x---update-against-2.6.22.4_.patch
Description: 3723671779-add-ability-to-start-MDs-with-persistent-superblock-v1.x---update-against-2.6.22.4_.patch


Re: Patch for boot-time assembly of v1.x-metadata-based soft (MD) arrays

2007-08-26 Thread Justin Piszcz



On Sun, 26 Aug 2007, Abe Skolnik wrote:






but I prefer in-kernel assembly,
and I think several other people do too.
I concur with this statement, why go through the hassle of init[rd/ramfs] 
if we can just have it done in the kernel?


Justin.


Re: Patch for boot-time assembly of v1.x-metadata-based soft (MD) arrays

2007-08-26 Thread Dan Williams
On 8/26/07, Justin Piszcz [EMAIL PROTECTED] wrote:


 On Sun, 26 Aug 2007, Abe Skolnik wrote:

 

  but I prefer in-kernel assembly,
  and I think several other people do too.
 I concur with this statement, why go through the hassle of init[rd/ramfs]
 if we can just have it done in the kernel?


Because you can rely on the configuration file to be certain about
which disks to pull in and which to ignore.  Without the config file
the auto-detect routine may not always do the right thing because it
will need to make assumptions.

So I turn the question around, why go through the exercise of trying
to improve an auto-detect routine which can never be perfect when the
explicit configuration can be specified by a config file?

I believe the real issue is the need to improve the distributions'
initramfs build-scripts and relieve the hassle of handling MD details.

 Justin.

Regards,
Dan


degenerated raid5 array doesn't rebuild

2007-08-26 Thread Marc Dietrich
Hi,

I installed openSUSE 10.3b2 on my server, which has 4 identical hard drives.
Each drive has a 290 GB partition used for RAID 5. It seems that the
setup program created a degraded array from sd[abc]2 and added the last
partition as a spare. mdadm -D /dev/md1 reports:

/dev/md1:
        Version : 01.00.03
  Creation Time : Sun Aug 26 14:23:29 2007
     Raid Level : raid5
     Array Size : 937488384 (894.06 GiB 959.99 GB)
  Used Dev Size : 624992256 (298.02 GiB 320.00 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 1
    Persistence : Superblock is persistent

  Intent Bitmap : Internal

    Update Time : Sun Aug 26 17:12:52 2007
          State : active, degraded
 Active Devices : 3
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 1

         Layout : left-symmetric
     Chunk Size : 128K

           Name : 1
           UUID : 11c75eb5:f7281221:886b7d04:d46516e8
         Events : 4352

    Number   Major   Minor   RaidDevice State
       0       8        2        0      active sync   /dev/sda2
       1       8       18        1      active sync   /dev/sdb2
       2       8       34        2      active sync   /dev/sdc2
       4       8       50        3      spare rebuilding   /dev/sdd2

and /proc/mdstat shows:

Personalities : [raid6] [raid5] [raid4] [raid0] [raid1] [linear] 
md0 : active raid1 sda1[0] sdd1[3] sdc1[2] sdb1[1]
  72248 blocks super 1.0 [4/4] []
  bitmap: 0/9 pages [0KB], 4KB chunk

md1 : active raid5 sda2[0] sdd2[4] sdc2[2] sdb2[1]
  937488384 blocks super 1.0 level 5, 128k chunk, algorithm 2 [4/3]
[UUU_]
  bitmap: 6/299 pages [24KB], 512KB chunk

unused devices: <none>

There is no hard disk activity, so I guess md1 is not rebuilding even
though mdadm says so.
Is there a way to include the spare disk in the array?
Why is this not done automatically?

Greetings

Marc



Re: [patch v2 1/1] md: Software Raid autodetect dev list not array

2007-08-26 Thread Michael Evans
On 8/26/07, Jan Engelhardt [EMAIL PROTECTED] wrote:

 On Aug 26 2007 04:51, Michael J. Evans wrote:
  {
 -  if (dev_cnt >= 0 && dev_cnt < 127)
 -      detected_devices[dev_cnt++] = dev;
 +  struct detected_devices_node *node_detected_dev;
 +  node_detected_dev = kzalloc(sizeof(*node_detected_dev), GFP_KERNEL);\

 What's the \ good for, besides escaping the newline
 that is ignored as whitespace anyway? :-)


I hadn't even noticed that, I suppose I mashed the key above enter at
some time.  Removing from my local file.

 @@ -5772,3 +5790,8 @@ static void autostart_arrays(int part)
 -  for (i = 0; i < dev_cnt; i++) {
 -      dev_t dev = detected_devices[i];
 -
 +  /* FIXME: max 'int' #DEFINEd somewhere?  not   0x7FFF ? */
 +  while (!list_empty(&all_detected_devices) && i_scanned < 0x7FFF) {

 I doubt someone really has _that_ many devices. Of course, to be on the
 safer side, make it an unsigned int. That way, people could put in about
 0xFFFE devs (which is even less likely than 0x7FFF)


There is that, but I'm almost expecting someone to ask me to remove
both the ints and the printk statement.  (I'd like them as part of some
kind of verbose startup that people would actually think to use,
however.)  Additionally, a thought occurred to me earlier: if there are
that many devices, the chance of a UUID namespace collision might
actually be realistic anyway.  Though I'm not short-sighted enough to
put it past anyone to have more than 32/64K possible block devices.
Anyone with that much cash today is probably buying hardware RAID, but
who knows.

 +  i_scanned++;
 +  node_detected_dev = list_entry(all_detected_devices.next,
 +  struct detected_devices_node, list);
 +  list_del(&node_detected_dev->list);
 +  dev = node_detected_dev->dev;
 +  kfree(node_detected_dev);

 Jan
 --



Re: [patch v2 1/1] md: Software Raid autodetect dev list not array

2007-08-26 Thread Randy Dunlap
On Sun, 26 Aug 2007 04:51:24 -0700 Michael J. Evans wrote:

 CONFIG_LBD isn't even an option in my .config file.

It's not an option on 64BIT builds.

 Note: between 2.6.22 and 2.6.23-rc3-git5
 rdev = md_import_device(dev,0, 0);
 became
 rdev = md_import_device(dev,0, 90);
 So the patch has been edited to patch around that line. (might be fuzzy)
 
 Signed-off-by: Michael J. Evans [EMAIL PROTECTED]
 =
 --- linux/drivers/md/md.c.orig2007-08-21 03:19:42.511576248 -0700
 +++ linux/drivers/md/md.c 2007-08-21 04:30:09.775525710 -0700
 @@ -5752,13 +5754,24 @@ void md_autodetect_dev(dev_t dev)
   * Searches all registered partitions for autorun RAID arrays
   * at boot time.
   */
 -static dev_t detected_devices[128];
 -static int dev_cnt;
 +
 +static LIST_HEAD(all_detected_devices);
 +struct detected_devices_node {
 + struct list_head list;
 + dev_t dev;
 +};
  
  void md_autodetect_dev(dev_t dev)
  {
 - if (dev_cnt >= 0 && dev_cnt < 127)
 -     detected_devices[dev_cnt++] = dev;
 + struct detected_devices_node *node_detected_dev;
 + node_detected_dev = kzalloc(sizeof(*node_detected_dev), GFP_KERNEL);\
 + if (node_detected_dev) {
 +     node_detected_dev->dev = dev;
 +     list_add_tail(&node_detected_dev->list, &all_detected_devices);
 + } else {
 +     printk(KERN_CRIT "md: kzAlloc node failed, skipping device."
 +         " : 0x%p.\n", node_detected_dev);

Is there any way to tell the user what device (or partition?) is
being skipped?  As written, this printk can only print (confirm) that
node_detected_dev is NULL.  Shouldn't it just print dev in
major:minor format?

 + }
  }
  
  
 @@ -5765,7 +5778,12 @@ static void autostart_arrays(int part)
  static void autostart_arrays(int part)
  {
   mdk_rdev_t *rdev;
 - int i;
 + struct detected_devices_node *node_detected_dev;
 + dev_t dev;
 + int i_scanned, i_passed;
 + signed int i_found;

Drop signed, like the surrounding code.
Leave a blank line between data declarations and beginning of code.

 + i_scanned = 0;
 + i_passed = 0;
  
   printk(KERN_INFO "md: Autodetecting RAID arrays.\n");
  
 @@ -5772,3 +5790,8 @@ static void autostart_arrays(int part)
 - for (i = 0; i < dev_cnt; i++) {
 - dev_t dev = detected_devices[i];
 -
 + /* FIXME: max 'int' #DEFINEd somewhere?  not   0x7FFF ? */

include/linux/kernel.h has INT_MAX, UINT_MAX, LONG_MAX, ULONG_MAX,
LLONG_MAX, ULLONG_MAX.

 + while (!list_empty(&all_detected_devices) && i_scanned < 0x7FFF) {
 +     i_scanned++;
 +     node_detected_dev = list_entry(all_detected_devices.next,
 +             struct detected_devices_node, list);
 +     list_del(&node_detected_dev->list);
 +     dev = node_detected_dev->dev;
 + kfree(node_detected_dev);


---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

Re: Patch for boot-time assembly of v1.x-metadata-based soft (MD) arrays

2007-08-26 Thread Mr. James W. Laferriere

On Sun, 26 Aug 2007, Justin Piszcz wrote:

On Sun, 26 Aug 2007, Abe Skolnik wrote:





but I prefer in-kernel assembly,
and I think several other people do too.
I concur with this statement, why go through the hassle of init[rd/ramfs] if 
we can just have it done in the kernel?


Justin.


Motion seconded !-) .  I too agree with Abe & Justin .
I am not in favor of the init[rd/ramfs] layer of abstractions .

Twyl ,  JimL
--
+-+
| James   W.   Laferriere | System   Techniques | Give me VMS |
| NetworkEngineer | 663  Beaumont  Blvd |  Give me Linux  |
| [EMAIL PROTECTED] | Pacifica, CA. 94044 |   only  on  AXP |
+-+


Re: [patch v2 1/1] md: Software Raid autodetect dev list not array

2007-08-26 Thread Michael Evans
On 8/26/07, Randy Dunlap [EMAIL PROTECTED] wrote:
 On Sun, 26 Aug 2007 04:51:24 -0700 Michael J. Evans wrote:

  From: Michael J. Evans [EMAIL PROTECTED]
 


 Is there any way to tell the user what device (or partition?) is
 being skipped?  This printk should just print (confirm) that
 node_detected_dev is NULL.  Shouldn't it just print dev in
 major:minor format?


It would be possible to do this with the MAJOR() and MINOR() macros...
however, it doesn't really help much during troubleshooting. I
tried using the bdevname function, as the caller of this function
does; however, it wants a struct block_device... which I tried
getting with:

container_of(dev, struct block_device, bd_dev)

Of course this didn't quite work out, I got kernel panics on my two
trial attempts.

Here's a snip from a dmesg where I added a printk right under the line
in question.

[   63.033532] sd 11:0:0:0: [sdk] 976773168 512-byte hardware sectors (500108 MB)
[   63.039842] sd 11:0:0:0: [sdk] Write Protect is off
[   63.046012] sd 11:0:0:0: [sdk] Mode Sense: 00 3a 00 00
[   63.046025] sd 11:0:0:0: [sdk] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[   63.052309]  sdk: sdk1 sdk2 sdk3 sdk4 sdk5 sdk6 sdk7 sdk8 sdk9 sdk10 sdk11 sdk12 sdk13 sdk14 sdk15
[   63.082546] md: Autodetect-buffering the above device.
[   63.088893] md: Autodetect-buffering the above device.
[   63.095053] md: Autodetect-buffering the above device.
[   63.101082] md: Autodetect-buffering the above device.
[   63.106956] md: Autodetect-buffering the above device.
[   63.112596] md: Autodetect-buffering the above device.
[   63.117998] md: Autodetect-buffering the above device.
[   63.123396] md: Autodetect-buffering the above device.
[   63.128789] md: Autodetect-buffering the above device.
[   63.134182] md: Autodetect-buffering the above device.
[   63.139576] md: Autodetect-buffering the above device.
[   63.144970] md: Autodetect-buffering the above device.
[   63.150360] md: Autodetect-buffering the above device.
[   63.155749] md: Autodetect-buffering the above device.
[   63.161498] sd 11:0:0:0: [sdk] Attached SCSI disk


 ---
 ~Randy
 *** Remember to use Documentation/SubmitChecklist when testing your code ***



Patch for boot-time assembly of v1.x-metadata-based soft (MD) arrays: reasoning and future plans

2007-08-26 Thread Abe Skolnik
Dear Mr./Dr. Williams,

 Because you can rely on the configuration file to be certain about
 which disks to pull in and which to ignore.  Without the config file
 the auto-detect routine may not always do the right thing because it
 will need to make assumptions.

But kernel parameters can provide the same data, no?  After all, it is
not the file nature of the config file that we are after,
but rather the configuration data itself.  My now-working setup uses a
line in my grub.conf (AKA menu.lst) file in my boot partition that
says something like...
  root=/dev/md0 md=0,v1,/dev/sda1,/dev/sdb1,/dev/sdc1.
This works just fine, and will not go bad unless a drive fails or I
rearrange the SCSI bus.  Even in a case like that, the worst that will
happen (AFAIK) is that my root-RAID will not come up and I will have to
boot the PC from e.g. Knoppix in order to fix the problem
(that, and maybe also replace a broken drive or re-plug a loose power
connector or whatever).  Another MD should not be corrupted, since they
are (supposedly) protected from that by supposedly-unique array UUIDs.

 So I turn the question around, why go through the exercise of trying
 to improve an auto-detect routine which can never be perfect when the
 explicit configuration can be specified by a config file?

I will turn your turning of my question back around at you; I hope it's
not rude to do so.  Why make root-RAID (based on MD, not hardware RAID)
require an initrd/initramfs, especially since (a) that's another system
component to manage, (b) that's another thing that can go wrong,
(c) many systems (the new one I just finished building included) do not
need an initrd/initramfs for any other reason, so why trigger the
need just out of laziness of maintaining some boot code?  Especially
since a patch has already been written and tested as working. ;-)

 I believe the real issue is the need to improve the distributions'
 initramfs build-scripts and relieve the hassle of handling MD
details.

I believe that the mission of the Linux kernel is two-fold: first of
all, to do the obvious stuff like providing hardware drivers and
otherwise providing userland something to stand on.  The second,
less-obvious mission is _choice_: not just a choice between using Free
software and proprietary software, but more; after all, even if Linux
had never been born, we would still have the BSDs and all the
non-Unix-based Free OSes.  However, I _choose_ to use GNU/Linux,
because (to date) it has the design I like the best out of [GNU/Linux,
FreeBSD, NetBSD, OpenBSD].  The fact that Linux (like Perl) provides
More Than One Way To Do It is a Good Thing, not a bad one.

I acknowledge that the present way of starting up an MD without an
init[rd/ramfs] is somewhat hackish; I don't like the requirement for
md=...,/dev/sda,... at all.  I prefer the way older Linux kernels
worked on PCDOS-formatted disks with the 0xFD partition type -
pure autodetect.  Nonetheless, I recognize the potential for problems
that would exist with reviving that old technique, as Neil Brown has
outlined on his blog.  The main source of trouble in the old strategy,
I think, is the naïve use of MD device names (e.g. md0).  I really
do think that the kernel should have the ability to probe for MD
components on its own, so that, for example, [sda,sdb,sdc] becoming
[sdb,sdc,sdd] due to the addition of a new drive which (either through
unfortunate happenstance or some intentional reason) became sda,
thus pushing all the other SCSI drives down in the letter sequence,
does not prevent the array from coming up.
The rational solution to this, I think, is one that is based on the
array's UUID.  After all, the objective is to bring up an array at boot
time for the purpose of being able to boot, not to have specific
devices activated as components.  I should be able to specify e.g.
root=/dev/md0 md0=01234567:12345678:23456789:34567890 in my GRUB
configuration, and let the kernel do the work from that point on.
That way, even a change of a component from being stored on a partition
on a SCSI drive to being stored on a partition on an IDE or SATA drive
is totally tolerable, with no need to change the configuration data!

I have a plan in mind for future improvements, in case my patch
(or something like it) is accepted.  Here is what I have in mind...

* First of all, remove the spurious bad raid superblock messages for
  the cases (like the one I have now) that produce lots of noise
  even though the array works (with nicer messages later on in the boot).

* Second, remove the need to manually specify the format version of
  the metadata.  This should be _correctly_ auto-detected on a
  per-array and per-component basis.

* Implement the kernel parameters md0, md1, ... using a format of
  e.g. md0=01234567:12345678:23456789:34567890.

* If someone would ask me nicely, I would also do the same for
  partitionable MDs.  I'm not using them, so I don't need that
  functionality for myself.

* Once the preceding has been done, use the UUIDs to scan for array
  

Re: Patch for boot-time assembly of v1.x-metadata-based soft (MD) arrays: reasoning and future plans

2007-08-26 Thread Dan Williams
On 8/26/07, Abe Skolnik [EMAIL PROTECTED] wrote:
 Dear Mr./Dr. Williams,

Just Dan is fine :-)

  Because you can rely on the configuration file to be certain about
  which disks to pull in and which to ignore.  Without the config file
  the auto-detect routine may not always do the right thing because it
  will need to make assumptions.

 But kernel parameters can provide the same data, no?  After all, it is
 not the file nature of the config file that we are after,
 but rather the configuration data itself.  My now-working setup uses a
 line in my grub.conf (AKA menu.lst) file in my boot partition that
 says something like...
   root=/dev/md0 md=0,v1,/dev/sda1,/dev/sdb1,/dev/sdc1.
 This works just fine, and will not go bad unless a drive fails or I
 rearrange the SCSI bus.  Even in a case like that, the worst that will
 happen (AFAIK) is that my root-RAID will not come up and I will have to
 boot the PC from e.g. Knoppix in order to fix the problem
 (that, and maybe also replace a broken drive or re-plug a loose power
 connector or whatever).  Another MD should not be corrupted, since they
 are (supposedly) protected from that by supposedly-unique array UUIDs.

Yes, you can get an effect similar to that of the config file by adding
parameters to the kernel command line.  My only point is that if the
initramfs update tools were as simple as:
mkinitrd root=/dev/md0 md=0,v1,/dev/sda1,/dev/sdb1,/dev/sdc1
...then using an initramfs becomes the same amount of work as editing
/etc/grub.conf.

  So I turn the question around, why go through the exercise of trying
  to improve an auto-detect routine which can never be perfect when the
  explicit configuration can be specified by a config file?

 I will turn your turning of my question back around at you; I hope it's
 not rude to do so.  Why make root-RAID (based on MD, not hardware RAID)
 require an initrd/initramfs, especially since (a) that's another system
 component to manage, (b) that's another thing that can go wrong,
 (c) many systems (the new one I just finished building included) do not
 need an initrd/initramfs for any other reason, so why trigger the
 need just out of laziness of maintaining some boot code?  Especially
 since a patch has already been written and tested as working. ;-)

Understood.  It comes down to a question of how much mdadm
functionality should be duplicated in the kernel?  With an initramfs
you get the full functionality and only one codebase to maintain
(mdadm).

[snip]

 Sincerely,

 Abe


Regards,
Dan
-
To unsubscribe from this list: send the line unsubscribe linux-raid in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


raid10 or raid1+0 ?

2007-08-26 Thread T. Eichstädt

Hello all,

I have 4 HDDs and I want to use mirroring and striping.
I am wondering what the difference between the following two solutions is:

- raid0 on top of 2 raid1 devices (raid1+0)
- directly using the raid10 module

Perhaps someone can give me a hint as to what the Linux raid10 module does
differently from the combination of raid1 and raid0.


Another question:
How stable is the raid10 module in the Linux kernel?  I am currently
using the Debian kernel 2.6.18, but I saw some patches from Neil and
others for 2.6.21 and 2.6.22 regarding the raid10 module, and their
descriptions sound as if it could be useful to integrate them when
using raid10.
Okay, backporting isn't complicated, but nevertheless the number of
patches makes me feel that raid10 is perhaps not as stable as raid0
and raid1?


Perhaps someone can blow away my fear :)

Best regards
 Thimo Eichstädt



Re: [patch v2 1/1] md: Software Raid autodetect dev list not array

2007-08-26 Thread Kyle Moffett

On Aug 26, 2007, at 08:20:45, Michael Evans wrote:
Also, I forgot to mention, the reason I added the counters was
mostly for debugging.  However, they're also useful in the same
way that listing the partitions when a new disk is added can be (in
fact this augments that and the existing messages the autodetect
routines provide).


As for using autodetect or not... the only way to skip it seems to
be compiling md's raid support as a module.  I checked 2.6.22's
menuconfig and there's no way for me to explicitly turn it on or
off at compile time.  I also feel that forcing the addition of a
boot parameter to de-activate a broken and deprecated system you
aren't even aware you are getting is somehow wrong.  So if you have
over 128 devices for it to scan, as I do on one of my PCs, then it
can bring up an array in degraded mode.  ... crud.


Well, you could just change the MSDOS disk label to use a different  
Partition Type for your raid partitions.  Just pick the standard  
Linux type and you will get exactly the same behavior that  
everybody who doesn't use MSDOS partition tables gets.
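
For readers unfamiliar with what the partition type refers to, here is a small sketch in C that inspects a raw MBR sector and reports which primary partitions carry type 0xFD (Linux raid autodetect) rather than 0x83 (the standard Linux type).  The function name mbr_raid_autodetect_mask is made up for this illustration, and the sketch only looks at the four primary entries; logical partitions inside an extended partition are not followed.  The on-disk layout assumed is the classic MSDOS one: table at byte 446, 16 bytes per entry, type byte at offset 4 of each entry, signature 0x55 0xAA at bytes 510 and 511.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MBR_TABLE_OFFSET          446   /* start of the 4-entry partition table */
#define MBR_ENTRY_SIZE            16    /* bytes per partition entry */
#define MBR_TYPE_OFFSET           4     /* partition type byte within an entry */
#define PART_TYPE_LINUX           0x83  /* standard Linux partition */
#define PART_TYPE_LINUX_RAID_AUTO 0xFD  /* Linux raid autodetect */

/*
 * Illustrative helper (hypothetical name): return a bitmask with bit i set
 * when primary partition entry i has type 0xFD, or -1 if the sector does
 * not end with the 0x55 0xAA boot signature.
 */
int mbr_raid_autodetect_mask(const uint8_t sector[512])
{
    int mask = 0;

    if (sector[510] != 0x55 || sector[511] != 0xAA)
        return -1;

    for (int i = 0; i < 4; i++) {
        const uint8_t *entry = sector + MBR_TABLE_OFFSET + i * MBR_ENTRY_SIZE;

        if (entry[MBR_TYPE_OFFSET] == PART_TYPE_LINUX_RAID_AUTO)
            mask |= 1 << i;
    }
    return mask;
}

#ifdef MBR_DEMO
/* Build with -DMBR_DEMO to run this demonstration on a synthetic sector. */
int main(void)
{
    uint8_t sector[512];

    memset(sector, 0, sizeof(sector));
    sector[510] = 0x55;
    sector[511] = 0xAA;
    sector[MBR_TABLE_OFFSET + MBR_TYPE_OFFSET] = PART_TYPE_LINUX_RAID_AUTO;

    assert(mbr_raid_autodetect_mask(sector) >= 0);
    printf("autodetect mask: 0x%x\n", (unsigned)mbr_raid_autodetect_mask(sector));
    return 0;
}
#endif
```

A partition retyped from 0xFD to 0x83 drops out of the mask, which is the effect the suggestion above relies on: the boot-time autodetect no longer considers it.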


Cheers,
Kyle Moffett
