Let's say I need to store a petabyte of data. I need fast access (tape/DVDs aren't fast enough) and redundancy like RAID.
How should I do it? Initially the data will be in large (several MB) binary objects and could be stored as files, but eventually it will need to be placed into a relational database like Oracle. Let's say I have access to racks that are 44U tall.

I've listed several very different scenarios. Which is best? Is there a better way? How would *you* store 1PB? What do you/your company currently use to store your large datasets?

What kind of drives should I use? Do SCSI drives last longer than SATA? What about IDE or SAS? Do higher RPM drives have a shorter mean time to failure?

Scenario 1:
----------------
EMC CLARiiON CX300 PSI w/ Fibre Channel
14x300GB ultrawide SCSI
$40K
3.9TB
5U
8 per rack
31.2TB per rack
$320K per rack
33 racks needed for 1PB
$10.6 million

Scenario 2:
----------------
Dell PowerVault MD3000
14x146GB SAS 10K drives, +1 hot spare
1.7TB usable w/ RAID 5
3U
$16K each
14 per rack
$224K per rack
23.8TB per rack
43 racks needed for 1PB
$9.6 million

Scenario 3:
----------------
HP ProLiant DL320s Server
12x250GB SATA 7.2k drives
2.7TB usable with RAID 5
3U
$5.8K each
14 per rack
$81.2K per rack
37.8TB per rack
27 racks needed for 1PB
$2.2 million

Scenario 4:
----------------
HP ProLiant DL320 G4
2x500GB SATA 7.2k drives
490GB usable with software RAID 1
1U
$2.8K each
44 per rack
21.56TB per rack
$123.2K per rack
47.5 racks needed for 1PB
$5.9 million
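
For anyone who wants to check or tweak the arithmetic in the four scenarios, here is a minimal Python sketch. It assumes 1PB = 1024TB (the rack counts above, such as 47.5 racks for Scenario 4, only work out under that convention), uses the usable capacities and prices exactly as listed, and ignores power, cooling, switches, and spares. The names `SCENARIOS`, `summarize`, and the result fields are made up for illustration.

```python
import math

RACK_UNITS = 44    # rack height from the post
TARGET_TB = 1024   # assumption: 1PB counted as 1024TB

# (name, usable TB per unit, height in U, price per unit in $K, units per rack)
SCENARIOS = [
    ("EMC CLARiiON CX300", 3.9, 5, 40.0, 8),
    ("Dell PowerVault MD3000", 1.7, 3, 16.0, 14),
    ("HP ProLiant DL320s", 2.7, 3, 5.8, 14),
    ("HP ProLiant DL320 G4", 0.49, 1, 2.8, 44),
]


def summarize(name, tb_per_unit, height_u, price_k, units_per_rack):
    """Return unit count, rack count, and cost needed to reach TARGET_TB."""
    if units_per_rack * height_u > RACK_UNITS:
        raise ValueError(f"{name}: {units_per_rack} units do not fit in {RACK_UNITS}U")
    units = math.ceil(TARGET_TB / tb_per_unit)  # whole units only
    return {
        "name": name,
        "units": units,
        "racks": units / units_per_rack,         # fractional racks, as in the post
        "cost_millions": units * price_k / 1000,
        "cost_per_tb_k": price_k / tb_per_unit,  # $K per usable TB
    }


if __name__ == "__main__":
    for scenario in SCENARIOS:
        r = summarize(*scenario)
        print(
            f"{r['name']:<24} {r['units']:>5} units  "
            f"{r['racks']:>5.1f} racks  "
            f"${r['cost_millions']:.2f}M  "
            f"${r['cost_per_tb_k']:.2f}K/TB"
        )
```

Sorting the results by `cost_per_tb_k` makes it easy to compare the options on price alone, though that number says nothing about drive reliability or throughput, which are the other half of the question.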
