On Sun, Apr 27, 2014 at 09:45:41AM +0100, Jörg Saßmannshausen wrote:
> Dear all,
>
> in some of the discussions here I came across the 'lifespan of a
> cluster' argument. What I was wondering is: how long is that in HPC
> for number crunching? Is it 3 years (end of warranty), 5 years (making
> good use of the hardware), or longer?
It depends: very much on what you're crunching, for how many people, at
what demand profile, how much data you've got ... a whole host of
variables.

It also depends on which manufacturer makes your hardware. IBM server,
IBM disk shelves: 3-5 years of full hardware support for the server,
likewise for the shelf; individual disks may be available for ten
years, though you may end up with factory-refurbished parts for the
last few. That comes at the high cost of a service contract, but you do
get peace of mind - and salesmen pestering you after year 3 to make
that upgrade ...

If you're running a university's set of clusters, you may find that the
department head who can shout loudest / has the most money gets this
year's sexy hardware, and everyone else plays shuffle-down to the next
set of hardware cast off by the department above.

Power / cooling / rack shape and density also change: if you're still
running a working cluster or three that are 10 years old, your power
efficiency is probably not great, and you may well be paying more to
run the cluster than its compute cycles are worth - but the refit costs
of the data centre also add up. There is a good argument on ecology /
power and cooling costs alone (a rough worked example is sketched
below). I have lived in a data centre where it was worth retrofitting
screening and moving some racks to create hot/cold aisle separation to
decrease overall cooling costs - but that decision, in itself, cost
$$$.

Likewise, HPC interconnects and networking are "better" (for some
values of better) on newer technologies.

Have you asked this question of your friendly rivals at Imperial and
Queen Mary? :)

> The reason behind my asking is: I have clusters here which are 10
> years old, and quite a number of them, and I would like to get a
> scheme implemented to have the hardware replaced every X years, with
> X being the 'lifespan of a cluster'. One of the various options
> currently being thrown around is to move from my local data centre
> (3 rooms: one purely for backup/file storage and the other two for
> HPC) into the College's shared data centre (a single room). IF we do
> that, I am a bit worried that I will be told in 5 years' time (for
> the sake of argument): your clusters are end-of-life, you have to get
> rid of them as we need the space / they are consuming too much
> energy.
>
> Thus, I am looking for answers to: how long are clusters typically
> run, and how is that handled in other shared data centres?
>
> The current funding situation here means it is difficult, if not
> impossible, to get HPC hardware from funding agencies. Even if you
> get a bit of money, it is just enough for a new node. So most
> clusters have grown a bit organically, which makes administration
> difficult if you want to get the best out of what you paid for. In an
> ideal world, I would like to have them replaced every 5 years: old
> kit out, new kit in. In the real world, I have to run the kit until
> it falls apart and hope that the Principal Investigator, i.e. the
> owner of the cluster, has some money to replace the old/broken nodes.
> Hence the questions, so I can build up a good case for change.

Central funding from something like the old SERC [Science and
Engineering Research Council] / joint university projects?
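To put rough numbers behind the power-efficiency argument, here is a
minimal keep-vs-replace sketch in Python. Every figure in it (power
draw, tariff, PUE, purchase price, service life) is an illustrative
assumption, not data from anyone's machine room - plug in your own
measurements before drawing conclusions:

    # Keep-vs-replace, back of the envelope. All numbers are made up.
    KWH_PRICE = 0.15      # electricity tariff, GBP/kWh (assumed)
    PUE = 1.8             # cooling overhead, older machine room (assumed)
    HOURS_PER_YEAR = 24 * 365

    def annual_power_cost(it_load_kw, pue=PUE):
        """Yearly electricity bill for a given IT load, cooling included."""
        return it_load_kw * pue * HOURS_PER_YEAR * KWH_PRICE

    # Hypothetical 10-year-old cluster: 40 kW of IT load.
    old_cost = annual_power_cost(40.0)

    # Hypothetical replacement delivering the same throughput from 5 kW,
    # bought for 100k GBP and amortised over a 5-year service life.
    new_cost = annual_power_cost(5.0) + 100_000 / 5

    print(f"old cluster:  {old_cost:9,.0f} GBP/year (power only)")
    print(f"replacement:  {new_cost:9,.0f} GBP/year (power + amortised capex)")
    print(f"difference:   {old_cost - new_cost:9,.0f} GBP/year")

With those made-up figures the old cluster costs ~95k GBP/year in
electricity against ~32k GBP/year all-in for the replacement - the kind
of sum that makes the ecology/power case for a fixed replacement cycle,
and it ignores the one-off data centre refit costs mentioned above.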
> I hope that makes sense to you.
>
> All the best from an overcast London!
>
> Jörg
>
> --
> *************************************************************
> Dr. Jörg Saßmannshausen, MRSC
> University College London
> Department of Chemistry
> Gordon Street
> London
> WC1H 0AJ
>
> email: [email protected]
> web: http://sassy.formativ.net
>
> Please avoid sending me Word or PowerPoint attachments.
> See http://www.gnu.org/philosophy/no-word-attachments.html

_______________________________________________
Beowulf mailing list, [email protected] sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf
