Indeed. Some interesting news here: http://www.enterprisetech.com/2016/03/04/docker-acquires-apache-aurora-founders/
Us old style guys are going to have our lunch money stolen by young upstarts. Or is that startups?
Seriously - these guys know how to keep things running at scale and how to tolerate failures.

On 3 March 2016 at 23:30, Christopher Samuel <[email protected]> wrote:

> On 04/03/16 06:40, Douglas Eadline wrote:
>
> > Yes, failure needs to be option.
>
> The Slurm folks have been working on failure management support for a
> little while, the idea being you can have a pool of spare nodes to pick
> from (or alternatively bargain with a scheduler for a node that's
> currently busy to come free later on and then add it to the job,
> potentially extending the walltime to make up for the shortfall).
>
> A better description from someone with higher caffeination is here:
>
> http://slurm.schedmd.com/nonstop.html
>
> All the best,
> Chris
> --
> Christopher Samuel Senior Systems Administrator
> VLSCI - Victorian Life Sciences Computation Initiative
> Email: [email protected] Phone: +61 (0)3 903 55545
> http://www.vlsci.org.au/ http://twitter.com/vlsci
>
> _______________________________________________
> Beowulf mailing list, [email protected] sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
