On 03/16/2013 03:04 AM, Dan Egli wrote:
>> For a home server I recommend RAID1 or RAID10 over RAID6.
>
> Really? I guess between RAID6 and RAID10 it's not much different, but
> what about someone who has, say, six or eight disks in the server? I'm
> curious why you'd still recommend RAID10. Hypothetically speaking, let's
> assume I wanted a server big enough to hold one year of data downloaded
> from the net at approx. 5 Mbps (with TCP overhead, that comes to approx.
> 1 MB every 2 seconds), 24/7/365. That's nearly 16 TB. A RAID6 could
> handle that with six 4 TB drives; a RAID10 would need eight. I admit
> either would fit in a full tower case, but why spend the extra money on
> two more drives to build the RAID10? I am genuinely curious.
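For what it's worth, the quoted back-of-the-envelope numbers do check out. A quick sketch (decimal TB, using the ~0.5 MB/s payload rate as quoted rather than the raw 5 Mbps):

```python
# Sanity-check the quoted arithmetic: ~1 MB every 2 seconds, 24/7/365.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60            # 31,536,000 s
payload_mb_per_s = 0.5                           # ~1 MB every 2 seconds
total_tb = payload_mb_per_s * SECONDS_PER_YEAR / 1_000_000
print(f"one year of downloads: {total_tb:.1f} TB")  # ~15.8 TB, "nearly 16"
```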
First of all, what do you need all that space for? The kind of data you
store really dictates what kind of setup you need. Disks are cheap, and
if the data is that important to you, then buying twice as many disks as
your capacity needs is really not that big of a deal. The cost of 6 disks
vs. 8 disks is negligible compared to the peace of mind that RAID-10 can
bring you. Furthermore, if the data is that important to you, you will
need a backup system, which at minimum is another set of disks that
receives a full copy of the data periodically and is stored off-line.

Secondly, you'll need a very large power supply and a SATA expansion card
if you're really planning to stuff that many disks in one box. There's a
reason people often buy a SAN array box with its own (redundant) power
supplies.

I'm unsure whether you are now talking about your own personal server for
your house, or your work project. For a home system, I think most people
are served well by just two disks in a RAID-1 configuration, plus a set
of backup disks. Very large sets of data like photos, movies, and maybe
MythTV recordings don't need RAID at all. They don't change often, so a
really good backup is much more important than RAID.

> Well, that's not really an issue, because I finally realized I could
> break my boss down by using some basic math. I showed him using basic
> multiplication how long it would take to fill the 120TB array he wanted
> (more than eight years to reach 25% capacity) and he FINALLY agreed
> that we could do it much cheaper and easier by building a full tower PC
> and filling it with hard disk drives. So we're going to order the parts
> soon. Thank goodness for that. I'm still not sure which chassis he
> wanted. I think he was thinking of going to a company like Aberdeen or
> someone. I have insufficient experience to state whether or not that
> was a good idea, but thankfully it's a moot point now.
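To put numbers on the 6-vs-8-drive question: here's a rough usable-capacity model for the layouts discussed (assuming 4 TB drives and decimal TB; the function name and structure are just for illustration):

```python
# Rough usable-capacity model for the RAID layouts discussed (4 TB drives).
def usable_tb(level, n_drives, drive_tb=4):
    if level == "raid6":
        return (n_drives - 2) * drive_tb   # two drives' worth of parity
    if level == "raid10":
        return n_drives // 2 * drive_tb    # every block mirrored once
    if level == "raid1":
        return drive_tb                    # all drives hold the same data
    raise ValueError(f"unknown level: {level}")

for level, n in [("raid6", 6), ("raid10", 8), ("raid1", 2)]:
    print(f"{level:6} x{n}: {usable_tb(level, n)} TB usable")
```

Both the 6-drive RAID-6 and the 8-drive RAID-10 land on 16 TB usable, so the extra two drives buy you rebuild speed and redundancy behavior, not capacity.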
> I imagine we can fit about 10 disks in a large case (I have to do some
> research on cases to find the one that will let us hold as many hard
> disks as we can), and make a raid out of them.

There are companies that make cases for disks -- poor-man's arrays. The
cases have lots of rails and a big power supply. Most of them just have
eSATA ports on the back that you connect to a PC's eSATA adapter cards
(which you will need, since PCs usually have four or fewer SATA ports on
the motherboard). Also, if you use an eSATA adapter, the drives are
hot-swappable (when not in use!). Here are a couple of ideas:

http://www.istarusa.com/istarusa/product_speclist.php?series=Tower&sub=JBOD%20CASE
http://www.granitedigital.com/SATAproseries8x.aspx

For inter-box connections, eSATA connectors are better than SATA because
the connector has a clip to keep it plugged in, whereas most SATA cables
are held in only by friction.

>> it for years on Solaris without issue). But I'm not sure of the status
>> of the zfs-on-linux project.
>
> So what would you use? Be aware that he's REALLY keen on using a file
> system that includes journaling and data-deduplication. I don't know
> how easy it's going to be to change his mind. It took nearly a week of
> arguments before I got him to abandon the rackmount server idea. I'm
> well aware of many of the advantages of file systems like Ext4 and JFS.
> But try convincing my boss of that. He's one of those people who hears
> about some new idea, likes it, and wants it implemented, despite not
> knowing how it works internally or what would be involved in the
> implementation.

I enjoyed ZFS's features a lot, and I hope that Linux's home-grown BtrFS
gets stable and mature soon, since BtrFS will pretty much match ZFS
feature-for-feature when it's done. Snapshots are the number one feature
of ZFS and BtrFS!
For a file server serving thousands of users, having very cheap snapshots
that let users see their own files as of any point in the last 7 days was
really slick. I used to snapshot every night for the last week, then
every month for the last year. Because of the COW nature of ZFS, these
snapshots cost only the difference in size between the snapshot and the
current version of the files.

Anyway, ZFS is not a journaling filesystem (neither is BtrFS). They
simply don't need journals. They are copy-on-write file systems, which
means they are always consistent; after a failure, all you can lose are
uncommitted blocks.

Deduplication is something you can do at a much higher level. For
example, a script could find identical files and replace them with hard
links. There was an experimental project I saw once called "opendedup":
a FUSE filesystem that you could run on top of any underlying filesystem
to get block-level deduplication without needing a special on-disk
format.

Again, the file system you end up choosing is going to depend entirely on
exactly what he's using it for. For example, in a home server, if my main
storage need were MythTV, I would eschew any form of RAID and keep my
disks formatted as individual Ext4 volumes, because MythTV treats all its
storage as one big pool, so there's no need for one big file system
across all the devices.

From what I can see so far, you really have three native Linux choices:
Ext4, XFS, and BtrFS. Of those, XFS has been used on huge arrays for many
years. BtrFS might be stable enough for your use. Ext4 is very stable and
perfectly capable of being used on multi-terabyte volumes. None of them
has built-in deduplication.
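Since he's sold on dedup, it might help to show how far a simple user-space approach gets you. Here's a minimal sketch of the hard-link trick (the dedup() helper and its behavior are my own illustration, not an existing tool):

```python
#!/usr/bin/env python3
# Sketch of "find identical files and replace them with hard links".
# Illustration only -- a real tool would also worry about permissions,
# ownership, files changing mid-scan, and filesystem boundaries
# (hard links can't span filesystems).
import hashlib
import os
import sys

def file_digest(path, chunk=1 << 20):
    """SHA-256 of a file's contents, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()

def dedup(root):
    """Replace byte-identical regular files under root with hard links.

    Returns the number of files that were re-linked."""
    seen = {}   # (size, sha256) -> first path seen with that content
    linked = 0
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):           # skip symlinks
                continue
            key = (os.path.getsize(path), file_digest(path))
            if key in seen:
                if not os.path.samefile(seen[key], path):
                    os.unlink(path)
                    os.link(seen[key], path)   # same blocks, two names
                    linked += 1
            else:
                seen[key] = path
    return linked

if __name__ == "__main__" and len(sys.argv) > 1:
    print(f"hard-linked {dedup(sys.argv[1])} duplicate files")
```

Running it twice is harmless: paths that already share an inode are caught by os.path.samefile and skipped.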
