It seems that SLURM daemons will not be running on every node on Sequoia -
slurmd will run on the I/O nodes but not on the compute nodes, if I read
this presentation correctly:

Multi-Petascale Computing on the Sequoia Architecture:
https://hpcrd.lbl.gov/scidac09/talks/Seager-Sequoia4SciDACv1.pdf
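
If I understand that correctly, the compute nodes would only be described
in slurm.conf for scheduling purposes, while slurmd itself runs on the
I/O / front-end nodes. A rough sketch of what I mean is below - all of the
host names, node counts and parameter values here are made up, and it uses
SLURM's generic front-end style of configuration (FrontendName etc.), not
anything Sequoia-specific:

  # Illustrative front-end style layout; hypothetical names throughout.
  # slurmd runs only on the I/O / front-end nodes, while the compute
  # nodes are defined for scheduling but run no SLURM daemon themselves.
  ClusterName=sequoia-like
  ControlMachine=mgmt1

  # slurmd instances run here, on the I/O / front-end nodes only
  FrontendName=ionode[001-064]

  # Compute nodes are still defined so jobs can be scheduled onto them
  NodeName=cn[00001-98304] CPUs=16 State=UNKNOWN

  PartitionName=batch Nodes=cn[00001-98304] Default=YES State=UP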

Nevertheless, the installations Jette listed are really massive!! The
largest known Grid Engine installation is Sun's Ranger at TACC, which
only has 62,976 processor cores in 3,936 nodes.

As a developer & maintainer of a Grid Engine fork (Oracle ended
developing the open-source SGE code-base in 2010, and thus we forked
the code and started the pure open-source project called "Open Grid
Scheduler"), I think Grid Engine won't be able to scale to those
numbers in the near or not so near future! :-(

Rayson



On Sat, Nov 20, 2010 at 1:49 PM, Jette, Moe <[email protected]> wrote:
> I believe that SLURM can manage any machine that HP can build and a customer 
> can pay for ;-)
>
> We have not seen any scaling issues, and some of the machines running SLURM
> today include:
>   Tianhe-1A in China with 186,368 cores
>   Tera-100 at CEA with 138,368 cores
>   a BlueGene/L at LLNL with 212,992 cores
>
> We plan to run SLURM on LLNL's 20 PFlop BlueGene/Q system next year with
> 1.6 million processors
> (http://www-304.ibm.com/jct03004c/press/us/en/pressrelease/26599.wss), and
> I am not expecting any scalability problems, although task launch on the
> BlueGene systems differs from that on typical Linux systems.
>
> At the other end of the spectrum, Intel is using SLURM on their 48-core 
> "cluster on a chip"
> (http://www.hpcwire.com/features/Intel-Unveils-48-Core-Research-Chip-78378487.html).
> SLURM's architecture with a multitude of plugin options gives it tremendous 
> flexibility.
>
> Moe
> ________________________________________
> From: [email protected] [[email protected]] On 
> Behalf Of Andy Riebs [[email protected]]
> Sent: Friday, November 19, 2010 8:14 AM
> To: [email protected]
> Subject: [slurm-dev] design limits for 2.2?
>
> How large a cluster should one expect to be able to support with Slurm
> 2.2? (One suspects that the number is getting rather large!)
>
> Thanks!
> Andy
>
> --
> Andy Riebs
> Hewlett-Packard Company
> SCI Solutions
> +1-786-263-9743
> My opinions are not necessarily those of HP
>
>
>
