A 72-drive, 10 I/O slot 3584 library will hold 2,207 cartridges. With 175 GB per cartridge, that works out to 6 libraries.
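As a rough check of that figure, here is a minimal sketch of the capacity arithmetic. It assumes the numbers quoted in this thread (2 PB counted as 2,097,152 GB, 175 GB per compressed cartridge, 2,207 cartridges per library); the function and constant names are made up for illustration.

```python
import math

# Figures quoted in the thread; the names below are illustrative only.
TOTAL_GB = 2 * 1024 * 1024        # 2 PB expressed in GB (2,097,152)
GB_PER_CARTRIDGE = 175            # 100 GB LTO tape at roughly 1.75:1 compression
CARTRIDGES_PER_LIBRARY = 2207     # 72-drive, 10 I/O slot 3584

def cartridges_needed(total_gb=TOTAL_GB, gb_per_cartridge=GB_PER_CARTRIDGE):
    """Whole cartridges needed to hold one full backup."""
    return math.ceil(total_gb / gb_per_cartridge)

def libraries_for_capacity(cartridges, per_library=CARTRIDGES_PER_LIBRARY):
    """Whole libraries needed to hold that many cartridges."""
    return math.ceil(cartridges / per_library)

carts = cartridges_needed()
print(carts, libraries_for_capacity(carts))  # prints: 11984 6
```

This covers slot capacity only; it says nothing about how many drives are needed to finish inside a backup window.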
Orville L. Lantto
Datatrend Technologies, Inc. (http://www.datatrend.com)
IBM Premier Business Partner
121 Cheshire Lane, Suite 700
Minnetonka, MN 55305
Email: [EMAIL PROTECTED]
V: 952-931-1203
F: 952-931-1293
C: 612-770-9166

Dan Foster <[EMAIL PROTECTED]>
Sent by: "ADSM: Dist Stor Manager" <[EMAIL PROTECTED]>
11/19/2002 11:06 AM
Please respond to "ADSM: Dist Stor Manager"

To: [EMAIL PROTECTED]
cc:
Subject: How do you back up 2 PB of data?

2 PB is 2,048 TB, or 2,097,152 GB. A fun thought exercise:

http://www.cnn.com/2002/TECH/biztech/11/19/ibm.supercomputerr.ap/index.html

Well, assuming several things:

1. Using LTO (just because I know the numbers for this best off the top of my head) -- a 3584 library.

2. LTO delivers a maximum of 30 MB/sec in compressed mode, but 22-23 MB/sec is usually realistic. Let's use 22.5 MB/sec.

3. Typically a 1.7:1 to 1.8:1 ratio for hardware compression. Let's use 1.75, or 175 GB for a 100 GB uncompressed tape.

4. 72 drives per maxed-out LTO setup (1 base frame + 5 expansion frames), for about 2,000 tapes in all frames?

5. A single 3584 complex therefore holds (using hardware compression) a grand total of 175 GB * 72 = 12.6 TB of compressed data on mounted tapes *within* the library at any one time. Assuming the client is constantly streaming data to the ITSM server at peak efficiency, each drive can back up 81 GB per hour (22.5 MB/sec * 3,600 sec) at max write-to-tape speeds.

6. Assuming a 16-hour window for all backups to complete per day (so that you have time for other ITSM server processing), that's 81 * 16, or 1.3 TB per 3584 _drive_ per day. 72 * 1.3 means a single 3584 complex can do about 94 TB per day.

7. For a single full backup of 2 PB, that's 2,048 TB, or 2,097,152 GB... or about 12,000 maxed-out LTO tapes. Since a single fully fleshed-out 3584 library is about 2,000 tapes... that would mean 6 3584 libraries for tape capacity alone.

8. 2,048 TB divided by 94 TB per day yields about 22 3584 libraries to finish within one backup window.

9. Then you've got the small problem of having to come up with an appropriate ITSM server design... for starters, the number of adapter slots required would be incredible. You'd put a max of 2 3580 drives on a single Ultra HVD SCSI adapter... so 72 drives per complex would be 36 slots alone! 36 slots multiplied by 22 complexes would be 792 slots!

10. Not sure about a p690, but I think it's got a couple hundred slots?

11. Then you need more adapters for disk and network controllers. To support 22 MB/sec over 1,584 drives concurrently would take... 465 gigabit ethernet adapters, assuming a perfectly tuned setup that can push 600 Mbps through each adapter.

12. You'd probably kill the bus with so much data zipping around long before you max out the slots... more likely you would need multiple (6-10?) p690 Regatta systems *just* to deal with ITSM backups for 2 PB of data alone.

13. The HVAC requirements for all these disks must be interesting ;) For the disks -- data, diskpool, db... total BTUs/hr would possibly be in the neighborhood of 3 million BTUs/hr, which demands *seriously* beefy HVAC units for the disks alone, never mind the servers, routers, etc...!

14. They probably have their own electrical substation for the computer room(s) alone. Run on a UPS? If they went to the extent of having their own electrical substation, they might as well... The disks alone are probably going to eat about 15,300 amps at the bare minimum... the total for the entire room could be in the neighborhood of 30-40,000 amps when you consider the large network equipment, servers, and other supporting infrastructure.

I listed LTO and pSeries here simply because I know the numbers and hardware the best, but feel free to offer other possible approaches.

Keep in mind, all that is only a small part of the big picture...
this one is *just* for a single full backup, and doesn't take into account long-term needs such as ITSM db sizing or the I/O loading of the db or diskpool disks; each hard drive can only do a finite number of I/Os at any given time.

Then you've got other issues such as performance vs. reliability, which becomes even trickier with extremely large-scale setups because use of RAID-5 could become a *very* real, serious bottleneck that gums up the entire works.

I wonder if ITSM on zSeries hardware would actually be better in this particular scenario, because mainframes typically have superior I/O management, far beyond simple tricks like the I/O pacing that exists on commercial UNIX OSes. Mainframes also have incredible I/O capabilities. I saw a zSeries box that had about 500 I/O controllers and was still humming along just fine even under varying workloads. But I think that's balanced somewhat by the extensive training and support requirements, along with licensing and support contract costs.

I do imagine that if I was the data center manager for that site, I'd be hiring an entire team of senior ITSM administrators with 20 years of experience ;) Teams of operators to deal with tape loads/unloads alone!

I also can't imagine the vaulting requirements if that's 12,000 tapes for a full backup and assuming 10% incremental change daily... 1,200 tapes a day over, say, an 8-week (56-day) cycle... is 67,200, plus that 12k for a full backup... about 79,200 tapes.

That also assumes the tapes can be recycled every 8 weeks... if there are special legal considerations (the data sometimes involves very sensitive material, such as nuclear test results), the backups might have to be kept for years. In which case... 1,200 * 365 * 20 would be 8.76 million tapes. ;)

Encryption might be required -- would 56-bit DES satisfy legal and site requirements? Or you might also have to do network path-based encryption such as IPSec and 3DES in addition to client-side encryption; the network encryption in such a large setup would probably incur a serious CPU hit. You could install crypto accelerators, but that'd imply even more cards...

I'd also be concerned about the potential for hitting some internal ITSM limits that 99.9999% of the sites out there never hit.

Don't even want to think about any disaster recovery requirements, which would make the entire setup *even* larger and more complex!

If I was the (DoE?) IT team looking at this purchase, I'd have put a condition in the vendor RFP indicating that a sale of such a large system must also demonstrate how one would deal with backups. Hopefully they did it as an integral part of the evaluation process, and not as an afterthought.

Anybody want to do the hardware installation? Months, if not years, of assembling and cabling up :-)

Where do I sign up for such a unique and extremely challenging job of administering such a setup? ;)

-Dan
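The throughput side of the estimate above (per-drive rate, libraries needed to finish within the backup window, and gigabit adapter count) can be reproduced with a short sketch. It is only a back-of-the-envelope check under the post's own assumptions: 22.5 MB/sec per drive, 72 drives per library, a 16-hour window, 2,048 TB to back up, and 600 Mbps per adapter. The names are hypothetical, and decimal units (1,000 MB per GB) are used as in the post.

```python
import math

# Assumptions taken from the post; all names are illustrative only.
DRIVE_MB_PER_SEC = 22.5      # realistic compressed LTO write rate
DRIVES_PER_LIBRARY = 72      # maxed-out 3584 (1 base + 5 expansion frames)
WINDOW_HOURS = 16            # daily backup window
TOTAL_TB = 2048              # 2 PB
NETWORK_MB_PER_SEC = 22      # per-drive rate used for the network estimate
ADAPTER_MBPS = 600           # usable throughput per gigabit adapter

def per_drive_gb_per_hour():
    """GB one drive writes per hour (the post's 81 GB/hr)."""
    return DRIVE_MB_PER_SEC * 3600 / 1000

def library_tb_per_day():
    """TB one library writes per window (the post rounds this to about 94)."""
    per_drive_tb = per_drive_gb_per_hour() * WINDOW_HOURS / 1000
    return per_drive_tb * DRIVES_PER_LIBRARY

def libraries_for_window():
    """Whole libraries needed to write TOTAL_TB in one window."""
    return math.ceil(TOTAL_TB / library_tb_per_day())

def gigabit_adapters(libraries):
    """Adapters needed to feed every drive at NETWORK_MB_PER_SEC."""
    drives = libraries * DRIVES_PER_LIBRARY
    total_mbps = drives * NETWORK_MB_PER_SEC * 8   # megabytes -> megabits
    return math.ceil(total_mbps / ADAPTER_MBPS)

libs = libraries_for_window()
print(f"{per_drive_gb_per_hour():.0f} GB/hr per drive")
print(f"{library_tb_per_day():.1f} TB/day per library")
print(f"{libs} libraries, {gigabit_adapters(libs)} adapters")
```

Skipping the intermediate rounding gives about 93.3 TB per library per day rather than 94, but the library and adapter counts still come out at 22 and 465.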
