Re: How many drives are bad?
> Pure genius! I wonder how many Thumpers have been configured in this well thought out way :-).

I'm sorry I missed your contributions to the discussion a few weeks ago. As I said up front, this is a test system. We're still trying a number of different configurations and are learning how best to recover from a fault. Guy Watkins proposed one a few weeks ago that we haven't yet tried, but given our current situation... it may be a good time to give it a shot.

I'm still not convinced we were running a degraded array before this. One drive mysteriously dropped from the array, showing up as removed but not failed. We did not receive the notification for that drive that we did when the second one actually failed. I'm still thinking it's just one drive that actually failed.

Assuming we go with Guy's layout of 8 arrays of 6 drives (picking one from each controller), how would you set up the LVM VolGroups on top of these already distributed arrays?

Thanks again,

Norman

On Feb 20, 2008, at 2:21 AM, Peter Grandi wrote:

> On Tue, 19 Feb 2008 14:25:28 -0500, Norman Elton [EMAIL PROTECTED] said:
>
> [ ... ]
>
> normelton> The box presents 48 drives, split across 6 SATA
> normelton> controllers. So disks sda-sdh are on one controller,
> normelton> etc. In our configuration, I run a RAID5 MD array for
> normelton> each controller, then run LVM on top of these to form
> normelton> one large VolGroup.
>
> Pure genius! I wonder how many Thumpers have been configured in this well thought out way :-).
>
> BTW, just to be sure -- you are running LVM in default linear mode over those 6 RAID5s, aren't you?
>
> normelton> I found that it was easiest to set up ext3 with a max
> normelton> of 2TB partitions. So running on top of the massive
> normelton> LVM VolGroup are a handful of ext3 partitions, each
> normelton> mounted in the filesystem.
>
> Uhm, assuming 500GB drives, each RAID set has a capacity of 3.5TB, and odds are that a bit over half of those 2TB volumes will straddle array boundaries. Such attention to detail is quite remarkable :-).
>
> normelton> This is less than ideal (ZFS would allow us one large
> normelton> partition),
>
> That would be another stroke of genius! (especially if you were still using a set of underlying RAID5s instead of letting ZFS do its RAIDZ thing). :-)
>
> normelton> but we're rewriting some software to utilize the
> normelton> multi-partition scheme.
>
> Good luck!
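[A minimal sketch of Guy's proposed layout, for reference. It assumes the six controllers map to sd[a-h], sd[i-p], sd[q-x], sd[y-af], sd[ag-an] and sd[ao-av], one partition per disk, and that 2TB ext3 volumes are still the target; the md numbers and the datavg/data00 names are placeholders, not anything from the thread.]

# One RAID5 set per "column": drive N from each of the six controllers.
# Shown for the first two of the eight arrays.
mdadm --create /dev/md0 --level=5 --chunk=64 --raid-devices=6 \
      /dev/sda1 /dev/sdi1 /dev/sdq1 /dev/sdy1 /dev/sdag1 /dev/sdao1
mdadm --create /dev/md1 --level=5 --chunk=64 --raid-devices=6 \
      /dev/sdb1 /dev/sdj1 /dev/sdr1 /dev/sdz1 /dev/sdah1 /dev/sdap1
# ... md2 through md7 follow the same pattern ...

# Then one volume group in the default linear mode across all eight arrays:
pvcreate /dev/md[0-7]
vgcreate datavg /dev/md[0-7]
lvcreate -L 2T -n data00 datavg    # repeat per ~2TB ext3 volume

With this layout a single controller failure costs one member from each array rather than a whole array, which is the point of picking one drive per controller.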
How many drives are bad?
So I had my first failure today, when I got a report that one drive (/dev/sdam) failed. I've attached the output of mdadm --detail. It appears that two drives are listed as removed, but the array is still functioning. What does this mean? How many drives actually failed? This is all a test system, so I can dink around as much as necessary.

Thanks for any advice!

Norman Elton

== OUTPUT OF MDADM ==

        Version : 00.90.03
  Creation Time : Fri Jan 18 13:17:33 2008
     Raid Level : raid5
     Array Size : 6837319552 (6520.58 GiB 7001.42 GB)
    Device Size : 976759936 (931.51 GiB 1000.20 GB)
   Raid Devices : 8
  Total Devices : 7
Preferred Minor : 4
    Persistence : Superblock is persistent

    Update Time : Mon Feb 18 11:49:13 2008
          State : clean, degraded
 Active Devices : 6
Working Devices : 6
 Failed Devices : 1
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

           UUID : b16bdcaf:a20192fb:39c74cb8:e5e60b20
         Events : 0.110

    Number   Major   Minor   RaidDevice State
       0      66        1        0      active sync   /dev/sdag1
       1      66       17        1      active sync   /dev/sdah1
       2      66       33        2      active sync   /dev/sdai1
       3      66       49        3      active sync   /dev/sdaj1
       4      66       65        4      active sync   /dev/sdak1
       5       0        0        5      removed
       6       0        0        6      removed
       7      66      113        7      active sync   /dev/sdan1
       8      66       97        -      faulty spare  /dev/sdam1
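[A few commands that might help pin down what md thinks happened, as a sketch only. It assumes the array is /dev/md4, per the "Preferred Minor : 4" above, and that the member conspicuously absent from the table is /dev/sdal1.]

cat /proc/mdstat           # quick summary of each array's members, e.g. a pattern like [UUUUU__U]
mdadm --detail /dev/md4    # the report quoted above
dmesg | grep -i sdal       # the kernel's view of the member that never appears in the table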
Re: How many drives are bad?
But why do two show up as removed?? I would expect /dev/sdal1 to show up someplace, either active or failed. Any ideas?

Thanks,

Norman

On Feb 19, 2008, at 12:31 PM, Justin Piszcz wrote:

>> How many drives actually failed?
>
> Failed Devices : 1
>
> On Tue, 19 Feb 2008, Norman Elton wrote:
>
>> So I had my first failure today, when I got a report that one drive (/dev/sdam) failed. I've attached the output of mdadm --detail. It appears that two drives are listed as removed, but the array is still functioning. What does this mean? How many drives actually failed? This is all a test system, so I can dink around as much as necessary.
>>
>> Thanks for any advice!
>>
>> Norman Elton
>>
>> [mdadm --detail output quoted in full; see the original message above]
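[One way to tell whether /dev/sdal1 simply dropped out of the array (stale but intact superblock) or genuinely died is to compare its superblock against a member that is still active. A sketch, assuming the device is still visible to the kernel:]

mdadm --examine /dev/sdal1    # the member that vanished: check its Events count, Update Time and State
mdadm --examine /dev/sdag1    # a still-active member, for comparison

If sdal1's superblock reads fine but shows an older event count, the drive was kicked out (or silently dropped) at that point rather than failing outright.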
Re: How many drives are bad?
Justin,

This is a Sun X4500 (Thumper) box, so it's got 48 drives inside. /dev/sd[a-z] are all there as well, just in other RAID sets. Once you get to /dev/sdz, it starts up at /dev/sdaa, sdab, etc.

I'd be curious if what I'm experiencing is a bug. What should I try to restore the array?

Norman

On 2/19/08, Justin Piszcz [EMAIL PROTECTED] wrote:

> Neil,
>
> Is this a bug?
>
> Also, I have a question for Norman -- how come your drives are sda[a-z]1? Typically it is /dev/sda1 /dev/sdb1 etc?
>
> Justin.
>
> On Tue, 19 Feb 2008, Norman Elton wrote:
>
>> But why do two show up as removed?? I would expect /dev/sdal1 to show up someplace, either active or failed. Any ideas?
>>
>> Thanks,
>>
>> Norman
>>
>> On Feb 19, 2008, at 12:31 PM, Justin Piszcz wrote:
>>
>>>> How many drives actually failed?
>>>
>>> Failed Devices : 1
>>>
>>> On Tue, 19 Feb 2008, Norman Elton wrote:
>>>
>>>> So I had my first failure today, when I got a report that one drive (/dev/sdam) failed. I've attached the output of mdadm --detail. It appears that two drives are listed as removed, but the array is still functioning. What does this mean? How many drives actually failed? This is all a test system, so I can dink around as much as necessary.
>>>>
>>>> Thanks for any advice!
>>>>
>>>> Norman Elton
>>>>
>>>> [mdadm --detail output quoted in full; see the original message above]
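[A cautious recovery sketch, not advice from the thread: assuming the array is /dev/md4 and the dropped member is /dev/sdal1, and with a verified backup in hand, the usual sequence looks roughly like this.]

# If sdal1's superblock is merely stale, try putting it back into the array.
# With a 0.90 superblock this will most likely trigger a full resync.
mdadm /dev/md4 --re-add /dev/sdal1
# (if --re-add is refused, "mdadm /dev/md4 --add /dev/sdal1" adds it back as a fresh member)

# If the array refuses to start at all, forced assembly from the freshest
# superblocks is the usual last resort:
mdadm --stop /dev/md4
mdadm --assemble --force /dev/md4 /dev/sda[g-n]1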
Re: How many drives are bad?
Justin,

There was actually a discussion I fired off a few weeks ago about how to best run SW RAID on this hardware. Here's the recap:

We're running RHEL, so no access to ZFS/XFS. I really wish we could do ZFS, but no luck.

The box presents 48 drives, split across 6 SATA controllers. So disks sda-sdh are on one controller, etc. In our configuration, I run a RAID5 MD array for each controller, then run LVM on top of these to form one large VolGroup.

I found that it was easiest to set up ext3 with a max of 2TB partitions. So running on top of the massive LVM VolGroup are a handful of ext3 partitions, each mounted in the filesystem. This is less than ideal (ZFS would allow us one large partition), but we're rewriting some software to utilize the multi-partition scheme.

In this setup, we should be fairly protected against drive failure. We are vulnerable to a controller failure. If such a failure occurred, we'd have to restore from backup.

Hope this helps, let me know if you have any questions or suggestions. I'm certainly no expert here!

Thanks,

Norman

On 2/19/08, Justin Piszcz [EMAIL PROTECTED] wrote:

> Norman,
>
> I am extremely interested in what distribution you are running on it and what type of SW raid you are employing (besides the one you showed here), are all 48 drives filled, or?
>
> Justin.
>
> On Tue, 19 Feb 2008, Norman Elton wrote:
>
>> Justin,
>>
>> This is a Sun X4500 (Thumper) box, so it's got 48 drives inside. /dev/sd[a-z] are all there as well, just in other RAID sets. Once you get to /dev/sdz, it starts up at /dev/sdaa, sdab, etc.
>>
>> I'd be curious if what I'm experiencing is a bug. What should I try to restore the array?
>>
>> Norman
>>
>> On 2/19/08, Justin Piszcz [EMAIL PROTECTED] wrote:
>>
>>> Neil,
>>>
>>> Is this a bug?
>>>
>>> Also, I have a question for Norman -- how come your drives are sda[a-z]1? Typically it is /dev/sda1 /dev/sdb1 etc?
>>>
>>> Justin.
>>>
>>> On Tue, 19 Feb 2008, Norman Elton wrote:
>>>
>>>> But why do two show up as removed?? I would expect /dev/sdal1 to show up someplace, either active or failed. Any ideas?
>>>>
>>>> Thanks,
>>>>
>>>> Norman
>>>>
>>>> On Feb 19, 2008, at 12:31 PM, Justin Piszcz wrote:
>>>>
>>>>>> How many drives actually failed?
>>>>>
>>>>> Failed Devices : 1
>>>>>
>>>>> On Tue, 19 Feb 2008, Norman Elton wrote:
>>>>>
>>>>>> So I had my first failure today, when I got a report that one drive (/dev/sdam) failed. I've attached the output of mdadm --detail. It appears that two drives are listed as removed, but the array is still functioning. What does this mean? How many drives actually failed? This is all a test system, so I can dink around as much as necessary.
>>>>>>
>>>>>> Thanks for any advice!
>>>>>> Norman Elton
>>>>>>
>>>>>> [mdadm --detail output quoted in full; see the original message above]
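[For completeness, a minimal sketch of the configuration Norman recaps above -- one RAID5 per controller, one big linear VolGroup, ~2TB ext3 volumes. The md numbers, the thumpervg/data00 names and the mount points are placeholders; only the layout itself comes from the thread.]

# One 8-drive RAID5 per controller (sd[a-h] shown; sd[i-p], sd[q-x], ... follow the same pattern)
mdadm --create /dev/md0 --level=5 --raid-devices=8 /dev/sd[a-h]1
mdadm --create /dev/md1 --level=5 --raid-devices=8 /dev/sd[i-p]1
# ... md2 through md5 for the remaining controllers ...

# One linear VolGroup across the six arrays, carved into ~2TB ext3 volumes
pvcreate /dev/md[0-5]
vgcreate thumpervg /dev/md[0-5]
lvcreate -L 2T -n data00 thumpervg
mkfs.ext3 /dev/thumpervg/data00
mount /dev/thumpervg/data00 /data/00
# repeat lvcreate / mkfs.ext3 / mount for data01, data02, ...

As Peter points out above, with a linear VG most of those 2TB volumes will span two of the underlying RAID5s, so a single array loss takes out more than one filesystem.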
Re: Raid over 48 disks ... for real now
It is quite a box. There's a picture of the box with the cover removed on Sun's website:

http://www.sun.com/images/k3/k3_sunfirex4500_4.jpg

From the X4500 homepage, there's a gallery of additional pictures. The drives drop in from the top. Massive fans channel air in the small gaps between the drives. It doesn't look like there's much room between the disks, but a lot of cold air gets sucked in the front, and a lot of hot air comes out the back. So it must be doing its job :).

I have not tried a fsck on it yet. I'll probably set up a lot of 2TB partitions rather than a single large partition, then write the software to handle storing data across many partitions.

Norman

On 1/18/08, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:

> Quoting Norman Elton [EMAIL PROTECTED]:
>
>> I posed the question a few weeks ago about how to best accommodate software RAID over an array of 48 disks (a Sun X4500 server, a.k.a. Thumper). I appreciate all the suggestions.
>>
>> Well, the hardware is here. It is indeed six Marvell 88SX6081 SATA controllers, each with eight 1TB drives, for a total raw storage of 48TB. I must admit, it's quite impressive. And loud. More information about the hardware is available online...
>>
>> http://www.sun.com/servers/x64/x4500/arch-wp.pdf
>>
>> It came loaded with Solaris, configured with ZFS. Things seemed to work fine. I did not do any benchmarks, but I can revert to that configuration if necessary.
>>
>> Now I've loaded RHEL onto the box. For a first shot, I've created one RAID-5 array (+ 1 spare) on each of the controllers, then used LVM to create a VolGroup across the arrays.
>>
>> So now I'm trying to figure out what to do with this space. So far, I've tested mke2fs on a 1TB and a 5TB LogVol. I wish RHEL would support XFS/ZFS, but for now, I'm stuck with ext3. Am I better off sticking with relatively small partitions (2-5 TB), or should I crank up the block size and go for one big partition?
>
> Impressive system. I'm curious what the storage drives look like and how they attach to the server with that many disks?
>
> Sounds like you have some time to play around before shoving it into production. I wonder how long it would take to run an fsck on one large filesystem?
>
> Cheers,
>
> Mike
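[Mike's fsck question is easy enough to answer empirically on a test box. A sketch, assuming one of the ~2TB volumes is /dev/thumpervg/data00 (a placeholder name) and is mounted at /data/00:]

umount /data/00                                # the volume must not be mounted during the check
time fsck.ext3 -f -n /dev/thumpervg/data00     # -f forces a full check, -n keeps it read-only

Timing one 2TB volume and one of the larger test LogVols would give a rough feel for how a single huge filesystem would behave after an unclean shutdown.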
Raid over 48 disks ... for real now
I posed the question a few weeks ago about how to best accommodate software RAID over an array of 48 disks (a Sun X4500 server, a.k.a. Thumper). I appreciate all the suggestions.

Well, the hardware is here. It is indeed six Marvell 88SX6081 SATA controllers, each with eight 1TB drives, for a total raw storage of 48TB. I must admit, it's quite impressive. And loud. More information about the hardware is available online...

http://www.sun.com/servers/x64/x4500/arch-wp.pdf

It came loaded with Solaris, configured with ZFS. Things seemed to work fine. I did not do any benchmarks, but I can revert to that configuration if necessary.

Now I've loaded RHEL onto the box. For a first shot, I've created one RAID-5 array (+ 1 spare) on each of the controllers, then used LVM to create a VolGroup across the arrays.

So now I'm trying to figure out what to do with this space. So far, I've tested mke2fs on a 1TB and a 5TB LogVol. I wish RHEL would support XFS/ZFS, but for now, I'm stuck with ext3. Am I better off sticking with relatively small partitions (2-5 TB), or should I crank up the block size and go for one big partition?

Thoughts?

Norman Elton
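[When testing mke2fs on those LogVols it may help to hand ext3 the underlying RAID geometry. A sketch only, assuming the default 64K md chunk and a VolGroup named VolGroup00 -- both assumptions, not details from the post:]

# stride = md chunk / ext3 block size = 64 KiB / 4 KiB = 16
lvcreate -L 5T -n test5t VolGroup00
mkfs.ext3 -b 4096 -E stride=16 /dev/VolGroup00/test5t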
Re: Raid over 48 disks ... for real now
> Hi, sounds like a monster server. I am interested in how you will make the space useful to remote machines -- iSCSI? This is what I am researching currently.

Yes, it's a honker of a box. It will be collecting data from various collector servers. The plan right now is to collect the data into binary files using a daemon (already running on a smaller box), then make the last 30/60/90/?? days available in a database that is populated from these files. If we need to gather older data, then the individual files must be consulted locally.

So, in production, I would probably set up the database partition on its own set of 6 disks, then dedicate the rest to handling/archiving the raw binary files. These files are small (a few MB each), as they get rotated every five minutes.

Hope this makes sense, and provides a little background info on what we're trying to do.

Norman
Raid over 48 disks
We're investigating the possibility of running Linux (RHEL) on top of Sun's X4500 Thumper box:

http://www.sun.com/servers/x64/x4500/

Basically, it's a server with 48 SATA hard drives. No hardware RAID. It's designed for Sun's ZFS filesystem.

So... we're curious how Linux will handle such a beast. Has anyone run MD software RAID over so many disks? Then piled LVM/ext3 on top of that? Any suggestions? Are we crazy to think this is even possible?

Thanks!

Norman Elton
Re: Raid over 48 disks
Thiemo --

I'm not familiar with RocketRaid. Is it handling the RAID for you, or are you using MD?

Thanks, all, for your feedback! I'm still surprised nobody has tried this on one of these Sun boxes yet. I've signed up for some demo hardware. I'll post what I find.

Norman

On Dec 18, 2007, at 2:34 PM, Thiemo Nagel wrote:

> Dear Norman,
>
>> So... we're curious how Linux will handle such a beast. Has anyone run MD software RAID over so many disks? Then piled LVM/ext3 on top of that? Any suggestions? Are we crazy to think this is even possible?
>
> I'm running 22x 500GB disks attached to RocketRaid 2340 and NFORCE-MCP55 onboard controllers on an Athlon DC 5000+ with 1GB RAM:
>
> 9746150400 blocks super 1.2 level 6, 256k chunk, algorithm 2 [22/22]
>
> Performance of the raw device is fair:
>
> # dd if=/dev/md2 of=/dev/zero bs=128k count=64k
> 65536+0 records in
> 65536+0 records out
> 8589934592 bytes (8.6 GB) copied, 15.6071 seconds, 550 MB/s
>
> Somewhat less through ext3 (created with -E stride=64):
>
> # dd if=largetestfile of=/dev/zero bs=128k count=64k
> 65536+0 records in
> 65536+0 records out
> 8589934592 bytes (8.6 GB) copied, 26.4103 seconds, 325 MB/s
>
> There were no problems up to now.
>
> (mkfs.ext3 wants -F to create a filesystem larger than 8TB. The hard maximum is 16TB, so you will need to create partitions if your drives are larger than 350GB...)
>
> Kind regards,
>
> Thiemo Nagel
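[A note on where Thiemo's stride=64 comes from, plus the matching full-stripe figure; the stripe-width line is an extra tuning option not used in his command and may need a newer e2fsprogs than was common at the time:]

stride       = chunk / ext3 block size   = 256 KiB / 4 KiB      = 64
stripe-width = stride * data disks       = 64 * (22 - 2)        = 1280   (RAID6 uses two parity disks)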