Re: How many drives are bad?

2008-02-21 Thread Peter Grandi
 On Tue, 19 Feb 2008 14:25:28 -0500, Norman Elton
 [EMAIL PROTECTED] said:

[ ... ]

normelton The box presents 48 drives, split across 6 SATA
normelton controllers. So disks sda-sdh are on one controller,
normelton etc. In our configuration, I run a RAID5 MD array for
normelton each controller, then run LVM on top of these to form
normelton one large VolGroup.

Pure genius! I wonder how many Thumpers have been configured in
this well thought out way :-).

BTW, just to be sure -- you are running LVM in default linear
mode over those 6 RAID5s aren't you?
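
For reference, a quick way to check that with the standard LVM2
reporting tools (read-only commands; a segment type of 'linear' means
plain concatenation):

  pvs -o pv_name,vg_name,pv_size          # which md arrays back the volume group
  lvs -o lv_name,segtype,seg_size,devices # 'linear' segments = concatenation, 'striped' = LVM striping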

normelton I found that it was easiest to set up ext3 with a max
normelton of 2TB partitions. So running on top of the massive
normelton LVM VolGroup are a handful of ext3 partitions, each
normelton mounted in the filesystem.

Uhm, assuming 500GB drives, each RAID set has a capacity of
3.5TB, and odds are that a bit over half of those 2TB volumes
will straddle array boundaries. Such attention to detail is
quite remarkable :-).

normelton This is less than ideal (ZFS would allow us one large
normelton partition),

That would be another stroke of genius! (especially if you were
still using a set of underlying RAID5s instead of letting ZFS do
its RAIDZ thing). :-)

normelton but we're rewriting some software to utilize the
normelton multi-partition scheme.

Good luck!


Re: How many drives are bad?

2008-02-21 Thread Norman Elton

Pure genius! I wonder how many Thumpers have been configured in
this well thought out way :-).


I'm sorry I missed your contributions to the discussion a few weeks ago.

As I said up front, this is a test system. We're still trying a number  
of different configurations, and are learning how best to recover from  
a fault. Guy Watkins proposed one a few weeks ago that we haven't yet  
tried, but given our current situation... it may be a good time to  
give it a shot.


I'm still not convinced we were running a degraded array before this.
One drive mysteriously dropped from the array, showing up as removed
but not failed. We did not receive the notification for it that we did
receive when the second drive actually failed. I'm still thinking it's
just one drive that actually failed.


Assuming we go with Guy's layout of 8 arrays of 6 drives (picking one
from each controller), how would you set up the LVM VolGroups over top
of these already distributed arrays?


Thanks again,

Norman





Re: How many drives are bad?

2008-02-21 Thread pg_mh
 On Thu, 21 Feb 2008 13:12:30 -0500, Norman Elton
 [EMAIL PROTECTED] said:

[ ... ]

normelton Assuming we go with Guy's layout of 8 arrays of 6
normelton drives (picking one from each controller),

Guy Watkins proposed another one too:

   «Assuming the 6 controllers are equal, I would make 3 16 disk
RAID6 arrays using 2 disks from each controller.  That way
any 1 controller can fail and your system will still be
running. 6 disks will be used for redundancy.

Or 6 8 disk RAID6 arrays using 1 disk from each controller).
That way any 2 controllers can fail and your system will
still be running. 12 disks will be used for redundancy.
Might be too excessive!»

So, I would not be overjoyed with either physical configuration,
except in a few particular cases. It is very amusing to read such
worries about host adapter failures, and somewhat depressing to see
"too excessive" used to describe 4+2 parity RAID.

normelton how would you setup the LVM VolGroups over top of
normelton these already distributed arrays?

That looks like a trick question, or at least an incorrect question,
because I would rather not do anything like that except in a very few
cases.

However, if one wants to do a bad thing in the least bad way,
perhaps a volume group per array would be least bad.
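
As a rough sketch only (array and volume group names are hypothetical,
and it assumes the eight 6-disk arrays already exist as /dev/md0 to
/dev/md7), a volume group per array would look something like:

  for md in /dev/md{0..7}; do
      pvcreate "$md"                    # label each array as an LVM physical volume
      vgcreate "vg_${md##*/}" "$md"     # one VG per array: vg_md0, vg_md1, ...
  done

That way a lost array takes down only the logical volumes in its own
volume group, instead of punching a hole through one big concatenated
VolGroup.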

Going back to your original question:

  «So... we're curious how Linux will handle such a beast. Has
   anyone run MD software RAID over so many disks? Then piled
   LVM/ext3 on top of that?»

I haven't, because it sounds rather inappropriate to me.

  «Any suggestions?»

Not easy to respond without a clear statement of what the array will
be used for: RAID levels and file systems are very anisotropic in
both performance and resilience, so a particular configuration may be
very good for something but not for something else.

For example, a 48-drive RAID0 with 'ext2' on top would be very
good for some cases, but perhaps not for archival :-).
In general, I'd use RAID10 (http://WWW.BAARF.com/), RAID5 in
very few cases and RAID6 almost never.

In general, current storage practices do not handle large
single-computer storage pools that well (just consider 'fsck'
times), and beyond 10TB I reckon that currently only multi-host
parallel/cluster file systems are good enough, for example
Lustre (for smaller multi-TB filesystems I'd use JFS or XFS).

But then Lustre can also be used on a single machine with
multiple (say 2TB) block devices, and this may be the best
choice here too if a single virtual filesystem is the goal:

  http://wiki.Lustre.org/index.php?title=Lustre_Howto


Re: How many drives are bad?

2008-02-21 Thread Peter Rabbitson

Peter Grandi wrote:

In general, I'd use RAID10 (http://WWW.BAARF.com/), RAID5 in


Interesting movement. What do you think is their stance on Raid Fix? :)


Re: How many drives are bad?

2008-02-20 Thread Neil Brown
On Tuesday February 19, [EMAIL PROTECTED] wrote:
 So I had my first failure today, when I got a report that one drive
 (/dev/sdam) failed. I've attached the output of mdadm --detail. It
 appears that two drives are listed as removed, but the array is
 still functioning. What does this mean? How many drives actually
 failed?

The array is configured for 8 devices, but only 6 are active.  So you
have lost data.
Of the two missing devices, one is still in the array and is marked as
faulty.  One is simply not present at all.
Hence "Failed Devices : 1", i.e. there is one failed device in the
array.

It looks like you have been running a degraded array for a while
(maybe not a long while) and the device has then failed.

mdadm --monitor

will send you mail if you have a degraded array.

NeilBrown
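
For reference, a typical way to arrange that (the mail address and
polling delay below are placeholders; on RHEL the stock mdmonitor
service should do the equivalent once MAILADDR is set in
/etc/mdadm.conf):

  mdadm --monitor --scan --daemonise --mail=root@localhost --delay=300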

 
 This is all a test system, so I can dink around as much as necessary.
 Thanks for any advice!
 
 Norman Elton
 
 == OUTPUT OF MDADM =
 
        Version : 00.90.03
  Creation Time : Fri Jan 18 13:17:33 2008
     Raid Level : raid5
     Array Size : 6837319552 (6520.58 GiB 7001.42 GB)
    Device Size : 976759936 (931.51 GiB 1000.20 GB)
   Raid Devices : 8
  Total Devices : 7
Preferred Minor : 4
    Persistence : Superblock is persistent

    Update Time : Mon Feb 18 11:49:13 2008
          State : clean, degraded
 Active Devices : 6
Working Devices : 6
 Failed Devices : 1
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

           UUID : b16bdcaf:a20192fb:39c74cb8:e5e60b20
         Events : 0.110

    Number   Major   Minor   RaidDevice State
       0      66        1        0      active sync   /dev/sdag1
       1      66       17        1      active sync   /dev/sdah1
       2      66       33        2      active sync   /dev/sdai1
       3      66       49        3      active sync   /dev/sdaj1
       4      66       65        4      active sync   /dev/sdak1
       5       0        0        5      removed
       6       0        0        6      removed
       7      66      113        7      active sync   /dev/sdan1

       8      66       97        -      faulty spare   /dev/sdam1


Re: How many drives are bad?

2008-02-19 Thread Justin Piszcz

How many drives actually failed?

Failed Devices : 1





Re: How many drives are bad?

2008-02-19 Thread Justin Piszcz

Neil,

Is this a bug?

Also, I have a question for Norman-- how come your drives are sda[a-z]1? 
Typically it is /dev/sda1 /dev/sdb1 etc?


Justin.

On Tue, 19 Feb 2008, Norman Elton wrote:

But why do two show up as removed?? I would expect /dev/sdal1 to show up 
someplace, either active or failed.


Any ideas?

Thanks,

Norman





Re: How many drives are bad?

2008-02-19 Thread Norman Elton
But why do two show up as removed?? I would expect /dev/sdal1 to  
show up someplace, either active or failed.


Any ideas?

Thanks,

Norman



On Feb 19, 2008, at 12:31 PM, Justin Piszcz wrote:


How many drives actually failed?

Failed Devices : 1





Re: How many drives are bad?

2008-02-19 Thread Norman Elton
Justin,

This is a Sun X4500 (Thumper) box, so it's got 48 drives inside.
/dev/sd[a-z] are all there as well, just in other RAID sets. Once you
get to /dev/sdz, it starts up at /dev/sdaa, sdab, etc.

I'd be curious if what I'm experiencing is a bug. What should I try to
restore the array?

Norman
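
Not advice given in the thread, but one common thing to try when a
member has been dropped while still physically healthy is a forced
assembly, so that md re-accepts the disk despite its stale event
count. The device names below are taken from the mdadm output quoted
elsewhere in the thread ("Preferred Minor : 4" suggests /dev/md4);
whether this is safe depends entirely on why the disk was dropped, and
if two members of the RAID5 are really gone it cannot bring the data
back:

  mdadm --stop /dev/md4
  mdadm --examine /dev/sda[g-n]1                    # compare event counts and states first
  mdadm --assemble --force /dev/md4 /dev/sda[g-n]1  # force in the member with the stale superblock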

On 2/19/08, Justin Piszcz [EMAIL PROTECTED] wrote:
 Neil,

 Is this a bug?

 Also, I have a question for Norman-- how come your drives are sda[a-z]1?
 Typically it is /dev/sda1 /dev/sdb1 etc?

 Justin.



Re: How many drives are bad?

2008-02-19 Thread Justin Piszcz

Norman,

I am extremely interested in what distribution you are running on it
and what type of SW RAID you are employing (besides the one you showed
here). Are all 48 drives filled, or?


Justin.



Re: How many drives are bad?

2008-02-19 Thread Norman Elton
Justin,

There was actually a discussion I fired off a few weeks ago about how
to best run SW RAID on this hardware. Here's the recap:

We're running RHEL, so no access to ZFS/XFS. I really wish we could do
ZFS, but no luck.

The box presents 48 drives, split across 6 SATA controllers. So disks
sda-sdh are on one controller, etc. In our configuration, I run a
RAID5 MD array for each controller, then run LVM on top of these to
form one large VolGroup.

I found that it was easiest to set up ext3 with a max of 2TB
partitions. So running on top of the massive LVM VolGroup are a
handful of ext3 partitions, each mounted in the filesystem. This is
less than ideal (ZFS would allow us one large partition), but we're
rewriting some software to utilize the multi-partition scheme.
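
For illustration, the stack described above might be put together
roughly like this (md device names, the volume group name and the LV
name are placeholders, and whole-disk partitions ending in '1' are
assumed):

  # one RAID5 per controller; controller 1 holds sda..sdh, the other five are analogous
  mdadm --create /dev/md0 --level=5 --raid-devices=8 /dev/sd[a-h]1
  # ... repeat for /dev/md1 .. /dev/md5 ...

  # one big volume group across all six arrays (linear/concatenated by default)
  pvcreate /dev/md0 /dev/md1 /dev/md2 /dev/md3 /dev/md4 /dev/md5
  vgcreate bigvg /dev/md0 /dev/md1 /dev/md2 /dev/md3 /dev/md4 /dev/md5

  # carve out ~2TB logical volumes and put ext3 on each
  lvcreate -L 2T -n data0 bigvg
  mkfs.ext3 /dev/bigvg/data0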

In this setup, we should be fairly protected against drive failure. We
are vulnerable to a controller failure. If such a failure occurred,
we'd have to restore from backup.

Hope this helps, let me know if you have any questions or suggestions.
I'm certainly no expert here!

Thanks,

Norman



RE: How many drives are bad?

2008-02-19 Thread Steve Fairbairn

 
 The box presents 48 drives, split across 6 SATA controllers. 
 So disks sda-sdh are on one controller, etc. In our 
 configuration, I run a RAID5 MD array for each controller, 
 then run LVM on top of these to form one large VolGroup.
 

I might be missing something here, and I realise you'd lose 8 drives to
redundancy rather than 6, but wouldn't it have been better to have 8
arrays of 6 drives, each array using a single drive from each
controller?  That way a single controller failure (assuming no other HD
failures) wouldn't actually take any array down?  I do realise that 2
controller failures at the same time would lose everything.

Steve.
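
For illustration only, one of the eight 6-drive arrays described here
might be created like this, assuming the controller groupings from the
thread (sda-sdh, sdi-sdp, sdq-sdx, sdy-sdaf, sdag-sdan, sdao-sdav) and
an arbitrary md device name:

  mdadm --create /dev/md0 --level=5 --raid-devices=6 \
        /dev/sda1 /dev/sdi1 /dev/sdq1 /dev/sdy1 /dev/sdag1 /dev/sdao1

Using RAID6 instead, as Guy suggested, would spend a second disk per
array but survive a controller failure plus one further disk failure.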



RE: How many drives are bad?

2008-02-19 Thread Guy Watkins



Wow.  Sounds like what I said a few months ago.  I think I also recommended
RAID6.

Guy
