On 02/02/2015 08:38 AM, Michael Di Domenico wrote:
Glenn's article is good and hits on many topics correctly (which I've seen firsthand, having sat on the vendor side of NSF proposals in a former life). However, I'm a little concerned by what I perceive as his attitude toward stripping funding from centers that don't have the technical prowess to run an HPC resource.

NSF's goal is to further science, and stripping funding, I don't believe, is the correct solution. If a center isn't keeping up, or doesn't have the skills from the start, a mentor should be put in place from one of the other, bigger centers. Stripping funding is only going to shrink the pool of knowledge to a few key installations around the US, which probably isn't the best way to spread knowledge. But I do concur that there is a point at which the NSF spreads itself too thin, and it probably already has.

Seems to me the NSF needs to get back to building the HPC community of PEOPLE rather than building hero machines at six or seven installations across the US.


I interpreted it differently. I think he was saying that NSF funding for HPC should be concentrated in fewer sites, similar to what the DOE has done with its leadership computing facilities (LCFs): the Argonne Leadership Computing Facility (ALCF) and the Oak Ridge Leadership Computing Facility (OLCF). By concentrating their resources in fewer locations, they can take advantage of economies of scale:

1. Pay for two large data centers instead of 5 or 10.
2. Hire a somewhat larger, but much more talented, staff whose skills can be spread across several clusters and storage systems, rather than many smaller support staffs with (most likely) less capability at each site.

And on, and on.

By committing heavily to fewer sites, it's easier for the NSF to focus on providing a stable financial footing than to constantly spread the money around many different sites like they're broadcast seeding a lawn.

TL;DR: Put all your eggs into 2-3 baskets, and keep a really good eye on those baskets.

Regarding your comment about 'hero' systems: I read a paper a couple of years ago showing that the large majority of computational scientists don't need these massive exascale systems; most only need a 'department'-sized cluster with ~1024 cores. I believe SDSC did their own study with XSEDE data and came to the same conclusion (Glenn actually told me this; I'm not sure if it is published anywhere).

This reminds me of 'The Long Tail' (http://en.wikipedia.org/wiki/Long_tail): the hero systems cater to the small percentage of extremely talented computational scientists at the top of their fields, while the long tail, which is your 'average' computational science PI or grad student at universities around the world, still has to rely on an antiquated small departmental cluster. The NSF focuses on the hero users to the detriment of the long tail, which actually represents the bulk of its funded scientists.
