Re: [zfs-discuss] [zfs] Petabyte pool?
Well, off the top of my head:

  2 x Storage Heads, 4 x 10G, 256G RAM, 2 x Intel E5 CPU's
  8 x 60-Bay JBOD's with 60 x 4TB SAS drives
  RAIDZ2 stripe over the 8 x JBOD's

That should fit within 1 rack comfortably and provide 1 PB of storage.

Regards,

Kristoffer Sheather
Cloud Central
Scale Your Data Center In The Cloud
Phone: 1300 144 007 | Mobile: +61 414 573 130 | Email: k...@cloudcentral.com.au
Skype: kristoffer.sheather | Twitter: http://twitter.com/kristofferjon

From: Marion Hakanson hakan...@ohsu.edu
Sent: Saturday, March 16, 2013 12:12 PM
To: z...@lists.illumos.org
Subject: [zfs] Petabyte pool?

Greetings,

Has anyone out there built a 1-petabyte pool? I've been asked to look into this, and was told low performance is fine: the workload is likely to be write-once, read-occasionally, archive storage of gene sequencing data. Probably a single 10Gbit NIC for connectivity is sufficient.

We've had decent success with the 45-slot, 4U SuperMicro SAS disk chassis, using 4TB nearline SAS drives, giving over 100TB usable space (raidz3). Back-of-the-envelope might suggest stacking up eight to ten of those, depending on whether you want a raw marketing petabyte or a proper power-of-two usable petabyte.

I get a little nervous at the thought of hooking all that up to a single server, and am a little vague on how much RAM would be advisable, other than as much as will fit (:-). Then again, I've been waiting for something like pNFS/NFSv4.1 to be usable for gluing together multiple NFS servers into a single global namespace, without any sign of that happening anytime soon.

So, has anyone done this? Or come close to it? Thoughts, even if you haven't done it yourself?

Thanks and regards,

Marion
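For anyone sanity-checking the proposal, here is a rough capacity sketch. The 8-wide (6 data + 2 parity) RAIDZ2 vdev width and the decimal-terabyte units are my assumptions, not something stated above, and ZFS metadata/slop overhead is ignored:

```python
# Rough capacity check for the proposed build (assumptions noted above).
jbods = 8
drives_per_jbod = 60
drive_tb = 4                             # 4 TB nearline SAS, decimal TB

total_drives = jbods * drives_per_jbod   # 480 drives
raw_tb = total_drives * drive_tb         # 1920 TB raw

# Assume 8-wide RAIDZ2 vdevs (6 data + 2 parity), one drive per JBOD,
# so every vdev is striped across all 8 JBODs.
usable_tb = raw_tb * 6 / 8               # 1440 TB before ZFS overhead

print(f"raw: {raw_tb} TB, usable (pre-overhead): {usable_tb:.0f} TB")
# raw: 1920 TB, usable (pre-overhead): 1440 TB -- comfortably over 1 PB
# even after metadata, slop space, and TB-vs-TiB accounting.
```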
Re: [zfs-discuss] [zfs] Petabyte pool?
Actually, you could use 3TB drives and, with a 6/8 RAIDZ2 stripe, achieve 1080 TB usable.

You'll also need 8-16 x SAS ports available on each storage head to provide redundant, multi-pathed SAS connectivity to the JBOD's; I'd recommend LSI 9207-8E's for those, and Intel X520-DA2's for the 10G NIC's.
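That 1080 TB figure checks out under the same assumptions (decimal TB, ZFS overhead ignored):

```python
# Verify the 1080 TB usable figure: 3 TB drives, 6-data + 2-parity RAIDZ2.
drives = 8 * 60                    # 480 drives across the 8 JBODs
usable_tb = drives * 3 * (6 / 8)
print(usable_tb)                   # 1080.0 TB, matching the figure above
```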
Re: [zfs-discuss] zfs-discuss Digest, Vol 89, Issue 12
You could always use 40-gigabit Ethernet between the two storage systems, which would speed things up dramatically, or back-to-back 56-gigabit InfiniBand.

From: zfs-discuss-requ...@opensolaris.org
Sent: Monday, March 18, 2013 11:01 PM
To: zfs-discuss@opensolaris.org
Subject: zfs-discuss Digest, Vol 89, Issue 12

Today's Topics:

  1. Re: [zfs] Re: Petabyte pool? (Richard Yao)
  2. Re: [zfs] Re: Petabyte pool? (Trey Palmer)

Message: 1
Date: Sat, 16 Mar 2013 08:23:07 -0400
From: Richard Yao r...@gentoo.org
To: z...@lists.illumos.org
Cc: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] [zfs] Re: Petabyte pool?

On 03/16/2013 12:57 AM, Richard Elling wrote:
> On Mar 15, 2013, at 6:09 PM, Marion Hakanson hakan...@ohsu.edu wrote:
>> So, has anyone done this? Or come close to it? Thoughts, even if you
>> haven't done it yourself?
> Don't forget about backups :-)
>  -- richard

Transferring 1 PB over a 10 gigabit link will take at least 10 days when overhead is taken into account. The backup system should have a dedicated 10 gigabit link at the minimum, and using incremental send/recv will be extremely important.

Message: 2
Date: Sat, 16 Mar 2013 01:30:41 -0400 (EDT)
From: Trey Palmer t...@nerdmagic.com
To: z...@lists.illumos.org
Cc: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] [zfs] Re: Petabyte pool?

I know it's heresy these days, but given the I/O throughput you're looking for and the amount you're going to spend on disks, a T5-2 could make sense when they're released (I think) later this month.

Crucial sells RAM they guarantee for use in SPARC T-series, and since you're at an edu the academic discount is 35%. So a T4-2 with 512GB RAM could be had for under $35K shortly after release, 4-5 months before the E5 Xeon was released. It seemed a surprisingly good deal to me.

The T5-2 has 32 x 3.6GHz cores, 256 threads, and ~150GB/s aggregate memory bandwidth. In my testing a T4-1 can compete with a 12-core E5 box on I/O and memory bandwidth, and this thing is about 5 times bigger than the T4-1. It should have at least 10 PCIe slots and will take 32 DIMMs minimum, maybe 64. And it is likely to cost you less than $50K with aftermarket RAM.

  -- Trey
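Richard Yao's 10-day estimate, and the appeal of the faster interconnects mentioned above, are easy to reproduce. A rough sketch follows; the ~70% effective-throughput figure is my own assumption for protocol and disk overhead, not something from the thread:

```python
# Rough transfer-time estimate for moving 1 PB over various link speeds.
petabyte_bytes = 1e15                 # decimal petabyte

def transfer_days(link_gbit, efficiency=0.7):
    """Days to move 1 PB at the given link speed and assumed efficiency."""
    bytes_per_sec = link_gbit * 1e9 / 8 * efficiency
    return petabyte_bytes / bytes_per_sec / 86400

for gbit in (10, 40, 56):             # 10 GbE, 40 GbE, 56 Gb/s InfiniBand
    print(f"{gbit:>2} Gbit/s: ~{transfer_days(gbit):.1f} days")

# 10 Gbit/s works out to roughly 13 days at 70% efficiency (about 9.3 days
# at pure wire speed), consistent with the "at least 10 days" estimate;
# 40 GbE and 56 Gb/s IB bring it down to a few days.
```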
On Mar 15, 2013, at 10:35 PM, Marion Hakanson hakan...@ohsu.edu wrote:

Ray said:
Using a Dell R720 head unit, plus a bunch of Dell MD1200 JBODs dual pathed to a couple of LSI SAS switches.

Marion said:
How many HBA's in the R720?

Ray said:
We have qty 2 LSI SAS 9201-16e HBA's (Dell resold[1]).

Sounds similar in approach to the Aberdeen product another sender referred to, with this SAS switch layout:
http://www.aberdeeninc.com/images/1-up-petarack2.jpg

One concern I had is that I compared our SuperMicro JBOD, with 40x 4TB drives in it, connected via a dual-port LSI SAS 9200-8e HBA, to the same pool layout on a 40-slot server with 40x SATA drives in it. But the server uses no expanders, instead using SAS-to-SATA octopus cables to connect the drives directly to three internal SAS HBA's (2x 9201-16i's, 1x 9211-8i).

What I found was that the internal pool was significantly faster for both sequential and random I/O than the pool on the external JBOD. My conclusion was that I would not want to exceed ~48 drives on a single 8-port SAS HBA. So I thought that running the I/O of all your hundreds of drives through only two HBA's would be a bottleneck.

LSI's specs say 4800 MBytes/sec for an 8-port SAS HBA, but 4000 MBytes/sec for that card in an x8 PCIe-2.0 slot. Sure, the newer 9207-8e is rated at 8000 MBytes/sec in an x8 PCIe-3.0 slot, but it still has only the same 8 SAS ports going at 4800 MBytes/sec.

Yes, I know the disks probably can't go that fast. But in my tests above, the internal 40-disk pool measures 2000 MBytes/sec sequential reads and writes,
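To put Marion's numbers side by side, here is a small sketch of the per-drive bandwidth share implied by those HBA and slot ceilings. The throughput figures are the ones quoted above; treating them as simple ceilings shared evenly across drives is my simplification:

```python
# Per-drive bandwidth share behind a single 8-port SAS HBA, using the
# throughput ceilings quoted above (MBytes/sec).
sas_ports_mb = 4800                            # 8 x 6Gb/s SAS-2 ports, per LSI
slots = {"x8 PCIe 2.0": 4000, "x8 PCIe 3.0": 8000}

for slot, slot_mb in slots.items():
    ceiling = min(sas_ports_mb, slot_mb)       # whichever limit bites first
    for drives in (40, 48, 240):
        print(f"{slot}, {drives:>3} drives: ~{ceiling / drives:.0f} MB/s per drive")

# Marion's internal 40-disk pool measured ~2000 MB/s sequential, i.e. roughly
# 50 MB/s per drive actually delivered.  Staying under ~48 drives per 8-port
# HBA keeps the HBA/slot ceiling (83-120 MB/s per drive) above that, while
# hundreds of drives behind only two HBA's would clearly be throttled.
```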