That depends on the MPI implementation. As far as I know, the most
time-consuming part of job startup is MPI_Init() when using
MPICH2/MVAPICH2 with the SLURM PMI implementation. I have launched a
job of 4096 nodes x 12 tasks per node (49,152 tasks) in about 96
seconds before. Open MPI performs a little better because it exchanges
communication information lazily.
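
For a rough sense of scale, here is a small Python sketch of the
arithmetic behind those numbers. The helper name total_tasks is my own,
and the sketch only compares job sizes and the average launch rate of
the one run I mentioned; it is not a prediction of launch time, which
depends on the MPI implementation as noted above.

```python
def total_tasks(nodes: int, tasks_per_node: int) -> int:
    """Number of MPI tasks in a job (hypothetical helper for illustration)."""
    return nodes * tasks_per_node


# Job size from the run mentioned above: 4096 nodes x 12 tasks per node.
reported_tasks = total_tasks(4096, 12)

# Job size from the question: 2000 nodes x 16 cores per node.
asked_tasks = total_tasks(2000, 16)

# Average launch rate of the reported run (about 96 seconds of startup).
reported_seconds = 96
tasks_per_second = reported_tasks / reported_seconds

print(f"reported job: {reported_tasks} tasks")
print(f"asked-about job: {asked_tasks} tasks")
print(f"size ratio (reported / asked): {reported_tasks / asked_tasks:.2f}")
print(f"average launch rate of reported run: {tasks_per_second:.0f} tasks/s")
```

So the run I described was about one and a half times the size of the
job in question, in task count.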
On Thu, 2013-02-28 at 12:58 -0700, Andy Riebs wrote:
> I've been asked (and have no good answer) about how long it takes to
> start a SLURM job on an x86_64 cluster that allocates, say, 2000 16-core
> nodes (32k processes). My googling skills have failed me on this query;
> does anyone have anecdotal evidence or pointers for the time SLURM
> requires to start huge jobs?
>
> Thanks in advance!
> Andy
>