[PATCH 001 of 4] md: Set and test the -persistent flag for md devices more consistently.

2008-01-18 Thread NeilBrown

If you try to start an array for which the number of raid disks is
listed as zero, md will currently try to read metadata off any devices
that have been given.  This was done because the value of raid_disks
is used to signal whether array details have been provided by
userspace (raid_disks > 0) or must be read from the devices
(raid_disks == 0).

However for an array without persistent metadata (or with externally
managed metadata) this is the wrong thing to do.  So we add a test in
do_md_run to give an error if raid_disks is zero for non-persistent
arrays.

This requires that mddev->persistent is set correctly at this point,
which it currently isn't for in-kernel autodetected arrays.

So set ->persistent for autodetected arrays, and remove the setting in
super_*_validate, which is now redundant.

Also clear ->persistent when stopping an array so it is consistently
zero when starting an array.

Signed-off-by: Neil Brown [EMAIL PROTECTED]

### Diffstat output
 ./drivers/md/md.c |9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff .prev/drivers/md/md.c ./drivers/md/md.c
--- .prev/drivers/md/md.c   2008-01-18 10:46:49.0 +1100
+++ ./drivers/md/md.c   2008-01-18 11:03:15.0 +1100
@@ -779,7 +779,6 @@ static int super_90_validate(mddev_t *md
mddev->major_version = 0;
mddev->minor_version = sb->minor_version;
mddev->patch_version = sb->patch_version;
-   mddev->persistent = 1;
mddev->external = 0;
mddev->chunk_size = sb->chunk_size;
mddev->ctime = sb->ctime;
@@ -1159,7 +1158,6 @@ static int super_1_validate(mddev_t *mdd
if (mddev->raid_disks == 0) {
mddev->major_version = 1;
mddev->patch_version = 0;
-   mddev->persistent = 1;
mddev->external = 0;
mddev->chunk_size = le32_to_cpu(sb->chunksize) << 9;
mddev->ctime = le64_to_cpu(sb->ctime) & ((1ULL << 32)-1);
@@ -3219,8 +3217,11 @@ static int do_md_run(mddev_t * mddev)
/*
 * Analyze all RAID superblock(s)
 */
-   if (!mddev->raid_disks)
+   if (!mddev->raid_disks) {
+   if (!mddev->persistent)
+   return -EINVAL;
analyze_sbs(mddev);
+   }

chunk_size = mddev->chunk_size;
 
@@ -3627,6 +3628,7 @@ static int do_md_stop(mddev_t * mddev, i
mddev->resync_max = MaxSector;
mddev->reshape_position = MaxSector;
mddev->external = 0;
+   mddev->persistent = 0;

} else if (mddev->pers)
printk(KERN_INFO "md: %s switched to read-only mode.\n",
@@ -3735,6 +3737,7 @@ static void autorun_devices(int part)
mddev_unlock(mddev);
} else {
printk(KERN_INFO "md: created %s\n", mdname(mddev));
+   mddev->persistent = 1;
ITERATE_RDEV_GENERIC(candidates,rdev,tmp) {
list_del_init(rdev->same_set);
if (bind_rdev_to_array(rdev, mddev))


Re: Test 2

2007-10-26 Thread Janek Kozicki
Daniel L. Miller said: (by the date of Thu, 25 Oct 2007 16:32:31 -0700)

 Thanks for the test responses - I have re-subscribed...if I see this 
 myself...I'm back!

I know that Gmail doesn't let you see your own posts on mailing
lists, only posts from other people. Maybe you have a similar problem?


-- 
Janek Kozicki |


Test

2007-10-25 Thread Daniel L. Miller
Sorry for consuming bandwidth - but all of a sudden I'm not seeing 
messages.  Is this going through?


--
Daniel


Test 2

2007-10-25 Thread Daniel L. Miller
Thanks for the test responses - I have re-subscribed...if I see this 
myself...I'm back!

--
Daniel


Re: Test

2007-10-25 Thread Justin Piszcz

Success.

On Thu, 25 Oct 2007, Daniel L. Miller wrote:

Sorry for consuming bandwidth - but all of a sudden I'm not seeing messages. 
Is this going through?


--
Daniel


Re: Test 2

2007-10-25 Thread Justin Piszcz

Success 2.

On Thu, 25 Oct 2007, Daniel L. Miller wrote:

Thanks for the test responses - I have re-subscribed...if I see this 
myself...I'm back!

--
Daniel


[PATCH 005 of 7] md: Improve the is_mddev_idle test fix

2007-05-20 Thread NeilBrown

Don't use 'unsigned' variables to track sync vs non-sync IO, as
the only thing we want to do with them is a signed comparison,
and fix up the comment, which had become quite wrong.

Signed-off-by: Neil Brown [EMAIL PROTECTED]

### Diffstat output
 ./drivers/md/md.c   |   35 ++-
 ./include/linux/raid/md_k.h |2 +-
 2 files changed, 23 insertions(+), 14 deletions(-)

diff .prev/drivers/md/md.c ./drivers/md/md.c
--- .prev/drivers/md/md.c   2007-05-21 11:17:57.0 +1000
+++ ./drivers/md/md.c   2007-05-21 11:18:00.0 +1000
@@ -5092,7 +5092,7 @@ static int is_mddev_idle(mddev_t *mddev)
mdk_rdev_t * rdev;
struct list_head *tmp;
int idle;
-   unsigned long curr_events;
+   long curr_events;
 
idle = 1;
ITERATE_RDEV(mddev,rdev,tmp) {
@@ -5100,20 +5100,29 @@ static int is_mddev_idle(mddev_t *mddev)
curr_events = disk_stat_read(disk, sectors[0]) + 
disk_stat_read(disk, sectors[1]) - 
atomic_read(&disk->sync_io);
-   /* The difference between curr_events and last_events
-* will be affected by any new non-sync IO (making
-* curr_events bigger) and any difference in the amount of
-* in-flight syncio (making current_events bigger or smaller)
-* The amount in-flight is currently limited to
-* 32*64K in raid1/10 and 256*PAGE_SIZE in raid5/6
-* which is at most 4096 sectors.
-* These numbers are fairly fragile and should be made
-* more robust, probably by enforcing the
-* 'window size' that md_do_sync sort-of uses.
+   /* sync IO will cause sync_io to increase before the disk_stats
+* as sync_io is counted when a request starts, and
+* disk_stats is counted when it completes.
+* So resync activity will cause curr_events to be smaller than
+* when there was no such activity.
+* non-sync IO will cause disk_stat to increase without
+* increasing sync_io so curr_events will (eventually)
+* be larger than it was before.  Once it becomes
+* substantially larger, the test below will cause
+* the array to appear non-idle, and resync will slow
+* down.
+* If there is a lot of outstanding resync activity when
+* we set last_event to curr_events, then all that activity
+* completing might cause the array to appear non-idle
+* and resync will be slowed down even though there might
+* not have been non-resync activity.  This will only
+* happen once though.  'last_events' will soon reflect
+* the state where there is little or no outstanding
+* resync requests, and further resync activity will
+* always make curr_events less than last_events.
 *
-* Note: the following is an unsigned comparison.
 */
-   if ((long)curr_events - (long)rdev->last_events > 4096) {
+   if (curr_events - rdev->last_events > 4096) {
rdev->last_events = curr_events;
idle = 0;
}

diff .prev/include/linux/raid/md_k.h ./include/linux/raid/md_k.h
--- .prev/include/linux/raid/md_k.h 2007-05-21 11:17:57.0 +1000
+++ ./include/linux/raid/md_k.h 2007-05-21 11:18:00.0 +1000
@@ -51,7 +51,7 @@ struct mdk_rdev_s
 
sector_t size;  /* Device size (in blocks) */
mddev_t *mddev; /* RAID array if running */
-   unsigned long last_events;  /* IO event timestamp */
+   long last_events;   /* IO event timestamp */
 
struct block_device *bdev;  /* block device handle */
 


[PATCH 002 of 2] md: Improve the is_mddev_idle test

2007-05-10 Thread NeilBrown

During a 'resync' or similar activity, md checks if the devices in the
array are otherwise active and winds back resync activity when they
are.  This test is done in is_mddev_idle, and it is somewhat fragile -
it sometimes thinks there is non-sync io when there isn't.

The test compares the total sectors of io (disk_stat_read) with the sectors
of resync io (disk->sync_io).
This has problems because total sectors gets updated when a request completes,
while resync io gets updated when the request is submitted.  The time difference
can cause large differences between the two which do not actually imply
non-resync activity.  The test currently allows for some fuzz (+/- 4096)
but there are some cases when it is not enough.

The test currently looks for any (non-fuzz) difference, either
positive or negative.  This clearly is not needed.  Any non-sync
activity will cause the total sectors to grow faster than the sync_io
count (never slower), so we only need to look for a positive difference.

If we do this then the amount of in-flight sync io will never cause
the appearance of non-sync IO.  Once enough non-sync IO to worry about
starts happening, resync will be slowed down and the measurements will
thus be more precise (as there is less in-flight) and control of resync
will still be suitably responsive.


Signed-off-by: Neil Brown [EMAIL PROTECTED]

### Diffstat output
 ./drivers/md/md.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff .prev/drivers/md/md.c ./drivers/md/md.c
--- .prev/drivers/md/md.c   2007-05-10 15:51:54.0 +1000
+++ ./drivers/md/md.c   2007-05-10 16:05:10.0 +1000
@@ -5095,7 +5095,7 @@ static int is_mddev_idle(mddev_t *mddev)
 *
 * Note: the following is an unsigned comparison.
 */
-   if ((curr_events - rdev->last_events + 4096) > 8192) {
+   if ((long)curr_events - (long)rdev->last_events > 4096) {
rdev->last_events = curr_events;
idle = 0;
}


Re: [PATCH 002 of 2] md: Improve the is_mddev_idle test

2007-05-10 Thread Andrew Morton
On Thu, 10 May 2007 16:22:31 +1000 NeilBrown [EMAIL PROTECTED] wrote:

 The test currently looks for any (non-fuzz) difference, either
 positive or negative.  This clearly is not needed.  Any non-sync
 activity will cause the total sectors to grow faster than the sync_io
 count (never slower) so we only need to look for a positive differences.
 
 ...

 --- .prev/drivers/md/md.c 2007-05-10 15:51:54.0 +1000
 +++ ./drivers/md/md.c 2007-05-10 16:05:10.0 +1000
 @@ -5095,7 +5095,7 @@ static int is_mddev_idle(mddev_t *mddev)
*
* Note: the following is an unsigned comparison.
*/
 - if ((curr_events - rdev->last_events + 4096) > 8192) {
 + if ((long)curr_events - (long)rdev->last_events > 4096) {
   rdev->last_events = curr_events;
   idle = 0;

In which case would unsigned counters be more appropriate?


Re: [PATCH 002 of 2] md: Improve the is_mddev_idle test

2007-05-10 Thread Neil Brown
On Thursday May 10, [EMAIL PROTECTED] wrote:
 On Thu, 10 May 2007 16:22:31 +1000 NeilBrown [EMAIL PROTECTED] wrote:
 
  The test currently looks for any (non-fuzz) difference, either
  positive or negative.  This clearly is not needed.  Any non-sync
  activity will cause the total sectors to grow faster than the sync_io
  count (never slower) so we only need to look for a positive differences.
  
  ...
 
  --- .prev/drivers/md/md.c   2007-05-10 15:51:54.0 +1000
  +++ ./drivers/md/md.c   2007-05-10 16:05:10.0 +1000
  @@ -5095,7 +5095,7 @@ static int is_mddev_idle(mddev_t *mddev)
   *
   * Note: the following is an unsigned comparison.
   */
  -   if ((curr_events - rdev->last_events + 4096) > 8192) {
  +   if ((long)curr_events - (long)rdev->last_events > 4096) {
  rdev->last_events = curr_events;
  idle = 0;
 
 In which case would unsigned counters be more appropriate?

I guess.

It is really the comparison that I want to be signed; I don't much
care about the counters - they are expected to wrap (though they might
not).
So maybe I really want

 if ((signed long)(curr_events - rdev->last_events) > 4096) {

to make it clear...
But people expect numbers to be signed by default, so that probably
isn't necessary.
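
As an aside, here is a tiny stand-alone sketch (ordinary user-space C
with made-up numbers, not the md code itself) of the case the signed
comparison is meant to handle - curr_events dipping slightly below
last_events because of in-flight resync IO:

/* cc -o idle_demo idle_demo.c */
#include <stdio.h>

int main(void)
{
	/* Suppose some resync IO was in flight when last_events was
	 * sampled; by the next sample the subtracted sync_io count has
	 * caught up, so curr_events is now a little smaller.
	 */
	unsigned long last_events = 100000;
	unsigned long curr_events = 99800;	/* 200 sectors "behind" */

	/* Unsigned: 99800 - 100000 wraps to a huge positive value,
	 * so this test would wrongly report non-sync activity.
	 */
	if (curr_events - last_events > 4096)
		printf("unsigned test: non-idle (diff wraps to %lu)\n",
		       curr_events - last_events);

	/* Signed, as in the patch: the difference is just -200, well
	 * inside the 4096-sector fuzz, so the array still looks idle.
	 */
	if ((long)curr_events - (long)last_events > 4096)
		printf("signed test: non-idle\n");
	else
		printf("signed test: idle (diff = %ld)\n",
		       (long)curr_events - (long)last_events);

	return 0;
}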

Yeah, I'll make them signed one day.

Thanks,
NeilBrown


Re: [PATCH 002 of 2] md: Improve the is_mddev_idle test

2007-05-10 Thread Jan Engelhardt

On May 10 2007 16:22, NeilBrown wrote:

diff .prev/drivers/md/md.c ./drivers/md/md.c
--- .prev/drivers/md/md.c  2007-05-10 15:51:54.0 +1000
+++ ./drivers/md/md.c  2007-05-10 16:05:10.0 +1000
@@ -5095,7 +5095,7 @@ static int is_mddev_idle(mddev_t *mddev)
*
* Note: the following is an unsigned comparison.
*/
-  if ((curr_events - rdev->last_events + 4096) > 8192) {
+  if ((long)curr_events - (long)rdev->last_events > 4096) {
   rdev->last_events = curr_events;
   idle = 0;
   }

What really changed? Unless I am seriously mistaken,

curr_events - last_events + 4096 > 8192

is mathematically equivalent to

curr_events - last_events > 4096

The casting to (long) may however force a signed comparison, which turns
things quite upside down, and the comment does not apply anymore.


Jan


Re: [PATCH 002 of 2] md: Improve the is_mddev_idle test

2007-05-10 Thread Neil Brown
On Thursday May 10, [EMAIL PROTECTED] wrote:
 
 On May 10 2007 16:22, NeilBrown wrote:
 
 diff .prev/drivers/md/md.c ./drivers/md/md.c
 --- .prev/drivers/md/md.c2007-05-10 15:51:54.0 +1000
 +++ ./drivers/md/md.c2007-05-10 16:05:10.0 +1000
 @@ -5095,7 +5095,7 @@ static int is_mddev_idle(mddev_t *mddev)
   *
   * Note: the following is an unsigned comparison.
   */
  -if ((curr_events - rdev->last_events + 4096) > 8192) {
  +if ((long)curr_events - (long)rdev->last_events > 4096) {
   rdev->last_events = curr_events;
  idle = 0;
  }
 
 What really changed? Unless I am seriously mistaken,
 
 curr_events - last_events + 4096 > 8192
 
 is mathematically equivalent to
 
 curr_events - last_events > 4096
 
 The casting to (long) may however force a signed comparison, which turns
 things quite upside down, and the comment does not apply anymore.

Yes, the use of a signed comparison is the significant difference.
And yes, the comment becomes wrong.  I'm in the process of redrafting
that.  It currently stands at:

/* sync IO will cause sync_io to increase before the disk_stats
 * as sync_io is counted when a request starts, and 
 * disk_stats is counted when it completes.
 * So resync activity will cause curr_events to be smaller than
 * when there was no such activity.
 * non-sync IO will cause disk_stat to increase without
 * increasing sync_io so curr_events will (eventually)
 * be larger than it was before.  Once it becomes
 * substantially larger, the test below will cause
 * the array to appear non-idle, and resync will slow
 * down.
 * If there is a lot of outstanding resync activity when
 * we set last_event to curr_events, then all that activity
 * completing might cause the array to appear non-idle
 * and resync will be slowed down even though there might
 * not have been non-resync activity.  This will only
 * happen once though.  'last_events' will soon reflect
 * the state where there is little or no outstanding
 * resync requests, and further resync activity will
 * always make curr_events less than last_events.
 *
 */


Does that read at all well?

NeilBrown


Re: [PATCH 002 of 2] md: Improve the is_mddev_idle test

2007-05-10 Thread Jan Engelhardt

On May 10 2007 20:04, Neil Brown wrote:
 -   if ((curr_events - rdev->last_events + 4096) > 8192) {
 +   if ((long)curr_events - (long)rdev->last_events > 4096) {
 rdev->last_events = curr_events;
 idle = 0;
 }
 
/* sync IO will cause sync_io to increase before the disk_stats
 * as sync_io is counted when a request starts, and 
 * disk_stats is counted when it completes.
 * So resync activity will cause curr_events to be smaller than
 * when there was no such activity.
 * non-sync IO will cause disk_stat to increase without
 * increasing sync_io so curr_events will (eventually)
 * be larger than it was before.  Once it becomes
 * substantially larger, the test below will cause
 * the array to appear non-idle, and resync will slow
 * down.
 * If there is a lot of outstanding resync activity when
 * we set last_event to curr_events, then all that activity
 * completing might cause the array to appear non-idle
 * and resync will be slowed down even though there might
 * not have been non-resync activity.  This will only
 * happen once though.  'last_events' will soon reflect
 * the state where there is little or no outstanding
 * resync requests, and further resync activity will
 * always make curr_events less than last_events.
 *
 */

Does that read at all well?

It is a more verbose explanation of your patch description, yes.


Jan


[PATCH 001 of 5] md: Move test for whether level supports bitmap to correct place.

2007-05-07 Thread NeilBrown

We need to check for internal consistency of the superblock in
load_super.  validate_super is for inter-device consistency.

With the test in the wrong place, a badly created array will confuse md
rather than produce sensible errors.

Signed-off-by: Neil Brown [EMAIL PROTECTED]

### Diffstat output
 ./drivers/md/md.c |   42 ++
 1 file changed, 26 insertions(+), 16 deletions(-)

diff .prev/drivers/md/md.c ./drivers/md/md.c
--- .prev/drivers/md/md.c   2007-05-07 14:33:31.0 +1000
+++ ./drivers/md/md.c   2007-05-07 14:33:31.0 +1000
@@ -695,6 +695,17 @@ static int super_90_load(mdk_rdev_t *rde
rdev->data_offset = 0;
rdev->sb_size = MD_SB_BYTES;

+   if (sb->state & (1<<MD_SB_BITMAP_PRESENT)) {
+   if (sb->level != 1 && sb->level != 4
+    && sb->level != 5 && sb->level != 6
+    && sb->level != 10) {
+   /* FIXME use a better test */
+   printk(KERN_WARNING
+  "md: bitmaps not supported for this level.\n");
+   goto abort;
+   }
+   }
+
if (sb->level == LEVEL_MULTIPATH)
rdev->desc_nr = -1;
else
@@ -793,16 +804,8 @@ static int super_90_validate(mddev_t *md
mddev->max_disks = MD_SB_DISKS;
 
if (sb->state & (1<<MD_SB_BITMAP_PRESENT) &&
-   mddev->bitmap_file == NULL) {
-   if (mddev->level != 1 && mddev->level != 4
-    && mddev->level != 5 && mddev->level != 6
-    && mddev->level != 10) {
-   /* FIXME use a better test */
-   printk(KERN_WARNING "md: bitmaps not supported "
"for this level.\n");
-   return -EINVAL;
-   }
+   mddev->bitmap_file == NULL)
mddev->bitmap_offset = mddev->default_bitmap_offset;
-   }

} else if (mddev->pers == NULL) {
/* Insist on good event counter while assembling */
@@ -1059,6 +1062,18 @@ static int super_1_load(mdk_rdev_t *rdev
   bdevname(rdev->bdev,b));
return -EINVAL;
}
+   if ((le32_to_cpu(sb->feature_map) & MD_FEATURE_BITMAP_OFFSET)) {
+   if (sb->level != cpu_to_le32(1) &&
+   sb->level != cpu_to_le32(4) &&
+   sb->level != cpu_to_le32(5) &&
+   sb->level != cpu_to_le32(6) &&
+   sb->level != cpu_to_le32(10)) {
+   printk(KERN_WARNING
+  "md: bitmaps not supported for this level.\n");
+   return -EINVAL;
+   }
+   }
+
rdev->preferred_minor = 0xffff;
rdev->data_offset = le64_to_cpu(sb->data_offset);
atomic_set(&rdev->corrected_errors, 
le32_to_cpu(sb->cnt_corrected_read));
@@ -1142,14 +1157,9 @@ static int super_1_validate(mddev_t *mdd
mddev->max_disks =  (4096-256)/2;

if ((le32_to_cpu(sb->feature_map) & MD_FEATURE_BITMAP_OFFSET) &&
-   mddev->bitmap_file == NULL ) {
-   if (mddev->level != 1 && mddev->level != 5 && 
mddev->level != 6
-    && mddev->level != 10) {
-   printk(KERN_WARNING "md: bitmaps not supported "
"for this level.\n");
-   return -EINVAL;
-   }
+   mddev->bitmap_file == NULL )
mddev->bitmap_offset = 
(__s32)le32_to_cpu(sb->bitmap_offset);
-   }
+
if ((le32_to_cpu(sb->feature_map) & MD_FEATURE_RESHAPE_ACTIVE)) 
{
mddev->reshape_position = 
le64_to_cpu(sb->reshape_position);
mddev->delta_disks = le32_to_cpu(sb->delta_disks);


Re: strange test results

2007-03-20 Thread Bill Davidsen

Tomka Gergely wrote:

Hi!

I am running tests on our new test device. The device has 2x2 core Xeon, 
intel 5000 chipset, two 3ware sata raid card on pcie, and 15 sata2 disks, 
running debian etch. More info at the bottom.


The first phase of the test is probing various raid levels. So i 
configured the cards to 15 JBOD disks, and hacked together a testing 
script. The script builds raid arrays, waits for sync, and then runs this 
command:


iozone -eM -s 4g -r 1024 -i0 -i1 -i2 -i8 -t16 -+u

The graphs of the results here:

http://gergely.tomka.hu/dt/index.html

And i have a lots of questions.

http://gergely.tomka.hu/dt/1.html

This graph is crazy, like thunderbolts. But the raid50 is generally slower 
than raid5. Why?


http://gergely.tomka.hu/dt/3.html

This is the only graph i can explain :)

http://gergely.tomka.hu/dt/4.html

With random readers, why raid0 slowing down? And why raid10 faster than 
raid0?


Because with two copies of the data there is a better chance that one 
copy will be on a drive which is less busy, and/or has a shorter seek to 
position the heads. If you want to verify this you could create a RAID-1 
with three (or more) copies and run readers against that.


BTW: that's the only one of your questions I could answer quickly.

--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979



strange test results

2007-03-19 Thread Tomka Gergely
Hi!

I am running tests on our new test device. The device has 2x2 core Xeon, 
intel 5000 chipset, two 3ware sata raid card on pcie, and 15 sata2 disks, 
running debian etch. More info at the bottom.

The first phase of the test is probing various raid levels. So I 
configured the cards to 15 JBOD disks, and hacked together a testing 
script. The script builds raid arrays, waits for sync, and then runs this 
command:

iozone -eM -s 4g -r 1024 -i0 -i1 -i2 -i8 -t16 -+u

The graphs of the results here:

http://gergely.tomka.hu/dt/index.html

And I have a lot of questions.

http://gergely.tomka.hu/dt/1.html

This graph is crazy, like thunderbolts. But the raid50 is generally slower 
than raid5. Why?

http://gergely.tomka.hu/dt/3.html

This is the only graph i can explain :)

http://gergely.tomka.hu/dt/4.html

With random readers, why is raid0 slowing down? And why is raid10 faster than 
raid0?

http://gergely.tomka.hu/dt/2.html

Why can't raid6 become faster with multiple disks, like raid5 and raid50?

So lots of questions. I am generally surprised by the non-linearity of 
some results and the lack of acceleration with more disks on other 
results. And now, the details:

Hardware:

Base Board Information
Manufacturer: Supermicro
Product Name: X7DB8
Processor Information
Socket Designation: LGA771/CPU1
Type: Central Processor
Family: Xeon
Manufacturer: Intel
ID: 64 0F 00 00 FF FB EB BF
Signature: Type 0, Family 15, Model 6, Stepping 4
(two cpus)
Memory Device
Array Handle: 0x0017
Error Information Handle: No Error
Total Width: 72 bits
Data Width: 64 bits
Size: 1024 MB
Form Factor: DIMM
Set: 1
Locator: DIMM x 4
Bank Locator: Bank1
Type: DDR2
Type Detail: Synchronous
Speed: 533 MHz (1.9 ns)
Manufacturer: Not Specified
Serial Number: Not Specified
Asset Tag: Not Specified
Part Number: Not Specified
(two of this also)

ursula:~# tw_cli show

Ctl   ModelPorts   Drives   Units   NotOpt   RRate   VRate   BBU

c09590SE-8ML   8   77   01   1   -
c19590SE-8ML   8   88   01   1   -

The tests generally:
mdadm
mkfs.xfs
blockdev --setra 524288 md (maybe not a good idea for multiple arrays)
do iozone test

raid10 is two disks raid1s in raid0 and raid50 is three disk raid6s in 
raid0.

These tests have run for a week and are now slowly finishing. For this reason, 
replicating the test to filter out accidents is not a good option.

Any comments?

-- 
Tomka Gergely, [EMAIL PROTECTED]


mdadm --misc --detail --test ... question

2006-11-17 Thread Russell Hammer
I'm trying to test the status of a raid device using mdadm:

# mdadm --misc --detail --test /dev/md0

However this does not appear to work as documented.  As I read the man
page, the return code is supposed to reflect the status of the raid
device:


MISC MODE
 ...
   --detail
  The  device  should be an active md device.  mdadm will
  display a detailed description of the array. --brief or --scan
  will cause the output to be less detailed and the format to be
  suitable for inclusion in  /etc/mdadm.conf.   The  exit status
  of mdadm will normally be 0 unless mdadm failed to get useful
  information about the device(s).  However if the --test option
  is given, then the exit status will be:

  0  The array is functioning normally.

  1  The array has at least one failed device.

  2  The array has multiple failed devices and hence is 
unusable (raid4 or raid5).

  4  There was an error while trying to get information about 
the device.


Am I missing something here (see below)?

Thanks,
Russ


# mdadm --misc --detail --test /dev/md0; echo -e "\nReturned -- $?"
/dev/md0:
Version : 00.90.03
  Creation Time : Mon Aug 23 12:49:46 2004
 Raid Level : raid1
 Array Size : 104320 (101.89 MiB 106.82 MB)
Device Size : 104320 (101.89 MiB 106.82 MB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0
Persistence : Superblock is persistent

Update Time : Thu Nov 16 13:38:40 2006
  State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

   UUID : 403240c4:4a5c7f60:ce59447e:a08858a9
 Events : 0.4476

Number   Major   Minor   RaidDevice State
   0  3310  active sync   /dev/hde1
   1  3411  active sync   /dev/hdg1

Returned -- 0

# mdadm --manage --fail /dev/md0 /dev/hdg1
mdadm: set /dev/hdg1 faulty in /dev/md0

# mdadm --misc --detail --test /dev/md0; echo -e "\nReturned -- $?"
/dev/md0:
Version : 00.90.03
  Creation Time : Mon Aug 23 12:49:46 2004
 Raid Level : raid1
 Array Size : 104320 (101.89 MiB 106.82 MB)
Device Size : 104320 (101.89 MiB 106.82 MB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0
Persistence : Superblock is persistent

Update Time : Fri Nov 17 14:08:33 2006
  State : clean, degraded
 Active Devices : 1
Working Devices : 1
 Failed Devices : 1
  Spare Devices : 0

   UUID : 403240c4:4a5c7f60:ce59447e:a08858a9
 Events : 0.4478

Number   Major   Minor   RaidDevice State
   0  3310  active sync   /dev/hde1
   1   001  removed

   2  341-  faulty spare   /dev/hdg1

Returned -- 0

# mdadm --manage --remove /dev/md0 /dev/hdg1
mdadm: hot removed /dev/hdg1

# mdadm --misc --detail --test /dev/md0; echo -e "\nReturned -- $?"
/dev/md0:
Version : 00.90.03
  Creation Time : Mon Aug 23 12:49:46 2004
 Raid Level : raid1
 Array Size : 104320 (101.89 MiB 106.82 MB)
Device Size : 104320 (101.89 MiB 106.82 MB)
   Raid Devices : 2
  Total Devices : 1
Preferred Minor : 0
Persistence : Superblock is persistent

Update Time : Fri Nov 17 14:09:02 2006
  State : clean, degraded
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0

   UUID : 403240c4:4a5c7f60:ce59447e:a08858a9
 Events : 0.4480

Number   Major   Minor   RaidDevice State
   0  3310  active sync   /dev/hde1
   1   001  removed

Returned -- 0
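
For context, here is a minimal sketch (not part of mdadm; just a plain C
wrapper written for illustration, assuming mdadm is in PATH and using
/dev/md0 as above) of how a monitor would branch on the exit codes the
man page documents - which is why the 0 return for a degraded array is
surprising:

#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>

int main(void)
{
	int status = system("mdadm --misc --detail --test /dev/md0 >/dev/null");

	if (status == -1) {
		perror("system");
		return 1;
	}

	switch (WEXITSTATUS(status)) {
	case 0:
		printf("array is functioning normally\n");
		break;
	case 1:
		printf("array has at least one failed device\n");
		break;
	case 2:
		printf("array has multiple failed devices and is unusable\n");
		break;
	case 4:
		printf("error getting information about the device\n");
		break;
	default:
		printf("unexpected exit status %d\n", WEXITSTATUS(status));
	}
	return 0;
}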


Re: Newbie: Kernel panic during RAID1 test reboot loses one disk

2006-09-03 Thread Neil Brown
On Monday August 28, [EMAIL PROTECTED] wrote:
 Neil Brown wrote:
  On Saturday August 26, [EMAIL PROTECTED] wrote:
  All,
 
  [...]
 
  * Problem 1: Since moving from 2.4 - 2.6 kernel, a reboot kicks one 
  device out of the array (c.f. post by Andreas Pelzner on 24th Aug 2006).
 
  * Problem 2: When booting my system, unless both disks plugged in, I get 
  a kernel panic (oh dear!):
 
mdadm md0 stopped
mdadm cannot open device /dev/hda6 no such device or address
mdadm /dev/hda6 has wrong uuid
mdadm no devices found for /dev/md0
ext3fs unable to read superblock
ecit 2 - unable to read superblock cramfs
kernel panic attempting to kill init
  
  At a guess, I'd say something is wrong with your initramfs/initrd.
  Can you look inside it and see what /etc/mdadm/mdadm.conf contains?
 
 Sure, this is the first time I've mounted an initrd, here goes:
 
 # file /boot/initrd.img-2.6.8-3-386
 [...]Linux Compressed ROM File System data, little endian size 4333568 
 version #2 sorted_dirs CRC 0xa04ccaa3, edition 0, 2492 blocks, 312 files
 
 # losetup /dev/loop0
 
 # mkdir /tmp/initrdmount
 
 # mount -t cramfs /dev/loop0 /tmp/initrdmount
 
 # ls -al /tmp/initrdmount/etc/
 total 1.0K
 drwxr-xr-x  1 root root 64 1970-01-01 01:00 modprobe.d/
 -rw-r--r--  1 root root  0 1970-01-01 01:00 mtab
 
 There is no mdadm/mdadm.conf! What should I do about this?

Sorry, I don't think.  You'll have to ask on some Debian list.  I
don't know the intricacies of Debian initrd.

NeilBrown



Re: Newbie: Kernel panic during RAID1 test reboot loses one disk

2006-08-28 Thread James Brown

Neil Brown wrote:

On Saturday August 26, [EMAIL PROTECTED] wrote:

All,

[...]

* Problem 1: Since moving from 2.4 - 2.6 kernel, a reboot kicks one 
device out of the array (c.f. post by Andreas Pelzner on 24th Aug 2006).


* Problem 2: When booting my system, unless both disks plugged in, I get 
a kernel panic (oh dear!):


  mdadm md0 stopped
  mdadm cannot open device /dev/hda6 no such device or address
  mdadm /dev/hda6 has wrong uuid
  mdadm no devices found for /dev/md0
  ext3fs unable to read superblock
  ecit 2 - unable to read superblock cramfs
  kernel panic attempting to kill init


At a guess, I'd say something is wrong with your initramfs/initrd.
Can you look inside it and see what /etc/mdadm/mdadm.conf contains?


Sure, this is the first time I've mounted an initrd, here goes:

# file /boot/initrd.img-2.6.8-3-386
[...]Linux Compressed ROM File System data, little endian size 4333568 
version #2 sorted_dirs CRC 0xa04ccaa3, edition 0, 2492 blocks, 312 files


# losetup /dev/loop0

# mkdir /tmp/initrdmount

# mount -t cramfs /dev/loop0 /tmp/initrdmount

# ls -al /tmp/initrdmount/etc/
total 1.0K
drwxr-xr-x  1 root root 64 1970-01-01 01:00 modprobe.d/
-rw-r--r--  1 root root  0 1970-01-01 01:00 mtab

There is no mdadm/mdadm.conf! What should I do about this?


[...]

* System md logs don't mention hdc6
# grep md /var/log/messages


 grep -C 5 md /var/log/messages
 might be better as it gives a bit more context.


I've put the messages here:

http://www.zen6780.zen.co.uk/messages.txt

Many thanks for your time.

James.


But I'm betting on the initramfs being a problem.

NeilBrown


Newbie: Kernel panic during RAID1 test reboot loses one disk

2006-08-26 Thread James Brown

All,

I'm fairly new to Linux/Debian and have been trying to configure mdadm 
for RAID1 with 2x120Gb IDE disks. Unfortunately, I have two problems 
with the configuration and would really appreciate some advice.


* Problem 1: Since moving from 2.4 -> 2.6 kernel, a reboot kicks one 
device out of the array (c.f. post by Andreas Pelzner on 24th Aug 2006).


* Problem 2: When booting my system, unless both disks plugged in, I get 
a kernel panic (oh dear!):


 mdadm md0 stopped
 mdadm cannot open device /dev/hda6 no such device or address
 mdadm /dev/hda6 has wrong uuid
 mdadm no devices found for /dev/md0
 ext3fs unable to read superblock
 ecit 2 - unable to read superblock cramfs
 kernel panic attempting to kill init

Here is the information about my system/config:

* System Info:
# uname -a
Linux cinzano. 2.6.8-3-386 #1 Sat Jul 15 09:26:40 UTC 2006 i686 GNU/Linux

* mdadm Config:
# cat /etc/mdadm/mdadm.conf
DEVICE partitions
ARRAY /dev/md1 level=raid1 num-devices=2 
UUID=cc518d12:0e602331:8715a849:6dac0873

   devices=/dev/hda7,/dev/hdc7
ARRAY /dev/md0 level=raid1 num-devices=2 
UUID=07c5cab1:1b86a5ca:f4599353:4ccfc5c1

   devices=/dev/hda6,/dev/hdc6

* After reboot:
# cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 hda7[0]
  4675264 blocks [2/1] [U_]
md0 : active raid1 hda6[0]
  101562816 blocks [2/1] [U_]

* After hotadding again:
# cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 hda7[0] hdc7[1]
  14675264 blocks [2/2] [UU]
md0 : active raid1 hdc6[1] hda6[0]
  101562816 blocks [2/2] [UU]

* Mdadm version
# apt-show-versions | grep mdadm
mdadm/stable uptodate 1.9.0-4sarge1

* System md logs don't mention hdc6
# grep md /var/log/messages
Aug 26 14:21:32 cinzano kernel: Kernel command line: root=/dev/md0 ro
Aug 26 14:21:32 cinzano kernel: md: md driver 0.90.0 MAX_MD_DEVS=256, 
MD_SB_DISKS=27

Aug 26 14:21:32 cinzano kernel: md: raid1 personality registered as nr 3
Aug 26 14:21:32 cinzano kernel: md: md0 stopped.
Aug 26 14:21:32 cinzano kernel: md: bind<hda6>
Aug 26 14:21:32 cinzano kernel: raid1: raid set md0 active with 1 out of 
2 mirrors

Aug 26 14:21:32 cinzano kernel: EXT3 FS on md0, internal journal
Aug 26 14:21:32 cinzano kernel: md: md1 stopped.
Aug 26 14:21:32 cinzano kernel: md: bind<hdc7>
Aug 26 14:21:32 cinzano kernel: md: bind<hda7>
Aug 26 14:21:32 cinzano kernel: raid1: raid set md1 active with 2 out of 
2 mirrors

Aug 26 14:21:32 cinzano kernel: EXT3 FS on md1, internal journal
Aug 26 14:25:43 cinzano kernel: Kernel command line: root=/dev/md0 ro
Aug 26 14:25:43 cinzano kernel: md: md driver 0.90.0 MAX_MD_DEVS=256, 
MD_SB_DISKS=27

Aug 26 14:25:43 cinzano kernel: md: raid1 personality registered as nr 3
Aug 26 14:25:43 cinzano kernel: md: md0 stopped.
Aug 26 14:25:43 cinzano kernel: md: bind<hda6>
Aug 26 14:25:43 cinzano kernel: raid1: raid set md0 active with 1 out of 
2 mirrors

Aug 26 14:25:43 cinzano kernel: EXT3-fs: md0: orphan cleanup on readonly fs
Aug 26 14:25:43 cinzano kernel: EXT3-fs: md0: 3 orphan inodes deleted
Aug 26 14:25:43 cinzano kernel: EXT3 FS on md0, internal journal
Aug 26 14:25:43 cinzano kernel: md: md1 stopped.
Aug 26 14:25:43 cinzano kernel: md: bind<hdc7>
Aug 26 14:25:43 cinzano kernel: md: bind<hda7>
Aug 26 14:25:43 cinzano kernel: raid1: raid set md1 active with 2 out of 
2 mirrors

Aug 26 14:25:43 cinzano kernel: EXT3 FS on md1, internal journal

* FDisk output
# fdisk -l
Disk /dev/hda: 120.0 GB, 120034123776 bytes
255 heads, 63 sectors/track, 14593 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot  Start End  Blocks   Id  System
/dev/hda1   1   14593   1172182415  Extended
/dev/hda5   1 122  979902   82  Linux swap/Sola.
/dev/hda6 123   12766   101562898+  fd  Linux raid auto.
/dev/hda7   12767   1459314675346   fd  Linux raid auto.

Disk /dev/hdc: 120.0 GB, 120034123776 bytes
255 heads, 63 sectors/track, 14593 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot  Start End  Blocks   Id  System
/dev/hdc1   1   14593   1172182415  Extended
/dev/hdc5   1 122  979902   82  Linux swap/Sola.
/dev/hdc6 123   12766   101562898+  fd  Linux raid auto.
/dev/hdc7   12767   1459314675346   fd  Linux raid auto.

Disk /dev/md0: 104.0 GB, 104000323584 bytes
2 heads, 4 sectors/track, 25390704 cylinders
Units = cylinders of 8 * 512 = 4096 bytes

Disk /dev/md0 doesn't contain a valid partition table

Disk /dev/md1: 15.0 GB, 15027470336 bytes
2 heads, 4 sectors/track, 3668816 cylinders
Units = cylinders of 8 * 512 = 4096 bytes

Disk /dev/md1 doesn't contain a valid partition table

* Grub config
# cat /boot/grub/menu.lst
[...]
# groot=(hd0,5)
[...]
title   Debian GNU/Linux, kernel 2.6.8-3-386
root(hd0,5)
kernel  /boot/vmlinuz-2.6.8-3-386 root=/dev/md0 ro
initrd  

Re: Test feedback 2.6.17.4+libata-tj-stable (EH, hotplug)

2006-07-17 Thread Bill Davidsen

Christian Pernegger wrote:


I finally got around to testing 2.6.17.4 with libata-tj-stable-20060710.

Hardware: ICH7R in ahci mode + WD5000YS's.

EH: much, much better. Before the patch it seemed like errors were
only printed to dmesg but never handed up to any layer above. Now md
actually fails the disk when I pull the (power) plug. I'll try my bad
cable once I can find it.

Hotplug: Unplugging was fine, took about 15s until the driver gave up
on the disk. After re-plugging the driver had to hard-reset the port
once to get the disk back, though that might be by design.

The fact that the disk had changed minor numbers after it was plugged
back in bugs me a bit. (was sdc before, sde after). Additionally udev
removed the sdc device file, so I had to manually recreate it to be
able to remove the 'faulty' disk from its md array.

Thanks for a great patch! I just hope it doesn't eat my data :) 


And thank you for testing!

--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979



Re: Test feedback 2.6.17.4+libata-tj-stable (EH, hotplug)

2006-07-17 Thread Neil Brown
On Tuesday July 11, [EMAIL PROTECTED] wrote:
 Christian Pernegger wrote:
  The fact that the disk had changed minor numbers after it was plugged
  back in bugs me a bit. (was sdc before, sde after). Additionally udev
  removed the sdc device file, so I had to manually recreate it to be
  able to remove the 'faulty' disk from its md array.
 
 That's because md is still holding onto sdc in failed mode.  A 
 hotplug script which checks whether a removed device is in md array and 
 if so removes it from the array will solve the problem.  Not sure 
 whether that would be the correct approach though.

Checking whether the to-be-removed device is in an md array or in use
in any other way first definitely sounds like the right approach to
me.

Exactly what to do if the device is in use is somewhat less obvious.
If the array is completely quiescent then you don't necessarily want
to fail/remove the device from the array

I think the best approach would be to have plug-ins that are called if
an unplugged device is in use, and if it is still in use after those
calls, then don't delete the device.  Maybe it would also be good if
hotplug was told when a device was no longer in use so it could remove
the /dev entry then
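
A rough user-space sketch of the kind of check such a plug-in might do
(purely illustrative - not an existing hotplug interface; it just scans
/proc/mdstat for the kernel name of the unplugged device, e.g. "sdc1"):

#include <stdio.h>
#include <string.h>

/* Return 1 if 'dev' appears as a member on any "mdX : ..." line of
 * /proc/mdstat (members show up there as " sdc1[2]" etc.), else 0.
 */
static int device_in_md_array(const char *dev)
{
	char line[1024];
	char token[64];
	FILE *f = fopen("/proc/mdstat", "r");
	int found = 0;

	if (!f)
		return 0;
	snprintf(token, sizeof(token), " %s[", dev);
	while (!found && fgets(line, sizeof(line), f))
		if (strncmp(line, "md", 2) == 0 && strstr(line, token))
			found = 1;
	fclose(f);
	return found;
}

int main(int argc, char **argv)
{
	/* A real hotplug helper would get the name from its environment;
	 * this just takes it as an argument.
	 */
	const char *dev = argc > 1 ? argv[1] : "sdc1";

	if (device_in_md_array(dev))
		printf("%s is still in an md array; leave its /dev entry alone\n", dev);
	else
		printf("%s is not in any md array\n", dev);
	return 0;
}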

NeilBrown


Test feedback 2.6.17.4+libata-tj-stable (EH, hotplug)

2006-07-10 Thread Christian Pernegger

I finally got around to testing 2.6.17.4 with libata-tj-stable-20060710.

Hardware: ICH7R in ahci mode + WD5000YS's.

EH: much, much better. Before the patch it seemed like errors were
only printed to dmesg but never handed up to any layer above. Now md
actually fails the disk when I pull the (power) plug. I'll try my bad
cable once I can find it.

Hotplug: Unplugging was fine, took about 15s until the driver gave up
on the disk. After re-plugging the driver had to hard-reset the port
once to get the disk back, though that might be by design.

The fact that the disk had changed minor numbers after it was plugged
back in bugs me a bit. (was sdc before, sde after). Additionally udev
removed the sdc device file, so I had to manually recreate it to be
able to remove the 'faulty' disk from its md array.

Thanks for a great patch! I just hope it doesn't eat my data :)

C.


Re: Test feedback 2.6.17.4+libata-tj-stable (EH, hotplug)

2006-07-10 Thread Tejun Heo

Christian Pernegger wrote:

The fact that the disk had changed minor numbers after it was plugged
back in bugs me a bit. (was sdc before, sde after). Additionally udev
removed the sdc device file, so I had to manually recreate it to be
able to remove the 'faulty' disk from its md array.


That's because md is still holding onto sdc in failed mode.  A 
hotplug script which checks whether a removed device is in md array and 
if so removes it from the array will solve the problem.  Not sure 
whether that would be the correct approach though.


Thanks.

--
tejun


Best way to test a new RAID configuration

2001-03-16 Thread David Christensen

I've recently setup a new RAID-5 configuration and wanted to test it
thoroughly before I commit data to it.  I'm not so worried about drive
failures so I don't want to power down drives while the system is running,
but I do want to test the drives out by reading/writing/verifying for a few
days.  Anyone know of any good (easy to setup) applications for doing that,
or perhaps a shell script that might do the same thing?

David Christensen



Re: Best way to test a new RAID configuration

2001-03-16 Thread Art Boulatov

David Christensen wrote:

 I've recently setup a new RAID-5 configuration and wanted to test it
 thoroughly before I commit data to it.  I'm not so worried about drive
 failures so I don't want to power down drives while the system is running,
 but I do want to test the drives out by reading/writing/verifying for a few
 days.  Anyone know of any good (easy to setup) applications for doing that,
 or perhaps a shell script that might do the same thing?
 
 David Christensen

Hi,

I have a setup of 2 SCSI disks with 8 partitions on each
consumed by software RAID0.

I started bonnie++ with really weird parameters
on each of the 8 meta devices at the same time.
So I had 8 bonnies++ chewing my raid0 configuration...

May be there could be more exhaustive tests,
but this one helped me to find a bad block on one
of the brand new IBM SCSI hard drives :)
Did not see any problems with RAID or reiserfs though.

Art.




Re: Best way to test a new RAID configuration

2001-03-16 Thread Alvin Oga


hi ya

best way to test raid5 is to write large ( 1Gb-2Gb ) data files to it...
and then compare the files

-- oooppss... just re-read david's post skip the part about
   powering down the disks..etc...

then pull one of the disks offline
and see if it still compares...

insert a fresh disk in its place...
and see if it re-syncs while you are creating a new "large file"

-- for testing raid1 mirroring...
- write to "A" and take "a" offline
and see if the mirror ( "B" ) has the data you just wrote


taking scsi drives offline is tough ???
taking ide drives offline is easy ... use hdparm to shut it down ???

===
=== best way to make sure you don't lose the data on the Raid
=== is to have a backup somewhere else...
===

have fun raiding
alvin
http://www.Linux-1U.net ... 1U Raid5 ... 500Gb each ..
http://www.linux-consulting.com/Raid/Docs/raid_test*


On Fri, 16 Mar 2001, Derek Vadala wrote:

 On Fri, 16 Mar 2001, David Christensen wrote:
 
  I've recently setup a new RAID-5 configuration and wanted to test it
  thoroughly before I commit data to it.  I'm not so worried about drive
  failures so I don't want to power down drives while the system is running,
  but I do want to test the drives out by reading/writing/verifying for a few
  days.  Anyone know of any good (easy to setup) applications for doing that,
  or perhaps a shell script that might do the same thing?
 
 You could use Bonnie and some Perl scripts to hammer the drives for a few
 days. 
 
 ---
 Derek Vadala, [EMAIL PROTECTED], http://www.cynicism.com/~derek
 
 




Re: Best way to test a new RAID configuration

2001-03-16 Thread Ross Vandegrift

   Anyone know of any good (easy to setup) applications for doing that,
   or perhaps a shell script that might do the same thing?

As a matter of fact, I have a very nice one right here.  Someone mailed this to the 
list back in the day when I asked this same question.  It's pretty killer.

Ross Vandegrift
[EMAIL PROTECTED]
[EMAIL PROTECTED]


#!/bin/bash -
# -*- Shell-script -*-
#
# Copyright (C) 1999 Bibliotech Ltd., 631-633 Fulham Rd., London SW6 5UQ.
#
# $Id: stress.sh,v 1.2 1999/02/10 10:58:04 rich Exp $
#
# Change log:
#
# $Log: stress.sh,v $
# Revision 1.2  1999/02/10 10:58:04  rich
# Use cp instead of tar to copy.
#
# Revision 1.1  1999/02/09 15:13:38  rich
# Added first version of stress test program.
#

# Stress-test a file system by doing multiple
# parallel disk operations. This does everything
# in MOUNTPOINT/stress.

nconcurrent=4
content=/usr/doc
stagger=yes

while getopts "c:n:s" c; do
case $c in
c)
content=$OPTARG
;;
n)
nconcurrent=$OPTARG
;;
s)
stagger=no
;;
*)
echo 'Usage: stress.sh [-options] MOUNTPOINT'
echo 'Options: -c Content directory'
echo ' -n Number of concurrent accesses (default: 4)'
echo ' -s Avoid staggerring start times'
exit 1
;;
esac
done

shift $(($OPTIND-1))
if [ $# -ne 1 ]; then
echo 'For usage: stress.sh -?'
exit 1
fi

mountpoint=$1

echo 'Number of concurrent processes:' $nconcurrent
echo 'Content directory:' $content '(size:' `du -s $content | awk '{print $1}'` 'KB)'

# Check the mount point is really a mount point.

if [ `df | awk '{print $6}' | grep ^$mountpoint\$ | wc -l` -lt 1 ]; then
echo $mountpoint: This doesn\'t seem to be a mountpoint. Try not
echo to use a trailing / character.
exit 1
fi

# Create the directory, if it doesn't exist.

echo Warning: This will DELETE anything in $mountpoint/stress. Type yes to confirm.
read line
if [ "$line" != "yes" ]; then
echo "Script abandoned."
exit 1
fi

if [ ! -d $mountpoint/stress ]; then
rm -rf $mountpoint/stress
if ! mkdir $mountpoint/stress; then
echo Problem creating $mountpoint/stress directory. Do you have sufficient
echo access permissions\?
exit 1
fi
fi

echo Created $mountpoint/stress directory.

# Construct MD5 sums over the content directory.

echo -n "Computing MD5 sums over content directory: "
( cd $content && find . -type f -print0 | xargs -0 md5sum | sort -k 2 -o 
$mountpoint/stress/content.sums )
echo done.

# Start the stressing processes.

echo -n "Starting stress test processes: "

pids=""

p=1
while [ $p -le $nconcurrent ]; do
echo -n "$p "

(

# Wait for all processes to start up.
if [ "$stagger" = "yes" ]; then
sleep $((10*$p))
else
sleep 10
fi

while true; do

# Remove old directories.
echo -n "D$p "
rm -rf $mountpoint/stress/$p

# Copy content - partition.
echo -n "W$p "
mkdir $mountpoint/stress/$p
#( cd $content && tar cf - . ) | ( cd $mountpoint/stress/$p && tar xf - )
cp -ax $content/* $mountpoint/stress/$p

# Compare the content and the copy.
echo -n "R$p "
( cd $mountpoint/stress/$p && find . -type f -print0 | xargs -0 md5sum | 
sort -k 2 -o /tmp/stress.$$.$p )
diff $mountpoint/stress/content.sums /tmp/stress.$$.$p
rm -f /tmp/stress.$$.$p
done
) &

pids="$pids $!"

p=$(($p+1))
done

echo
echo "Process IDs: $pids"
echo "Press ^C to kill all processes"

trap "kill $pids" SIGINT

wait

kill $pids



test

2001-03-12 Thread Scott Sherman




 Scott Sherman   
 Systems Administrator
 design net
 Tel:  +44(0)870 240 0088
 Fax: +44(0)870 240 0099 
 Email: [EMAIL PROTECTED] 


