I have not looked at the patch, but changing 72 files is very surprising to me. Adding support for Cray and IBM BlueGene systems each involved changes to about 25 files with the vast majority of changes in new plugins. I'd expect an SGI port to follow a similar pattern with a small number of new plugins and minor changes elsewhere.

Moe Jette
SchedMD LLC


Quoting Andy Riebs <[email protected]>:

Hi Michel,

Some of the things that you should consider as you approach submitting your changes to SLURM:

* SLURM already has the PMI interface (and I see that someone is working on PMI2); do you require more support than the PMI interface, or SPANK plugins, could provide? It might be helpful to identify specific hooks that you need -- others on the list may be able to identify existing mechanisms.
* Are you introducing new functionality that might be of more general use? This may relate to the previous question.
* You mentioned a concern with high cpu counts. The BlueGene code offers an excellent example of "the SLURM way" to handle those problems.
* Are your changes implemented so that they will have little or no impact on those who choose not to use them? (This should also be viewed from the point of view of maintaining the code.)

Changing 72 files is a huge change. Clearly I speak only on behalf of myself, but the SLURM community can be of best help if we understand the pieces of the puzzle, and have a chance to ensure that the changes that you require will also meet the needs of the rest of the community.

Best regards,
Andy

On 11/24/2011 01:31 PM, Michel Bourget wrote:
Hi all,


It's about time I report to this mailing list what "SGI did to SLURM".

Short story:
FYI, we are releasing (and supporting) an "SGI SLURM" product on SGI platforms this November. It is based on version 2.2.7. For the user, it simply introduces the "sgimpi" MPI plugin.

Long story:

SGI MPI integration was not trivial, since we are using the native SGI MPI launcher (array services) underneath slurmstepd. We have introduced the notion of "strack", which allows jobs launched outside SLURM's scope to be tracked process-wise (proctrack) and accounting-wise (job_acct_gather). This introduces a "sentinel" thread in slurmstepd, responsible for adding pgids that were not launched under the slurmstepd umbrella. Those additional pgids are communicated by strack using a simple mailbox file mechanism (slurm.sentinel.<job>.<step>). Essentially, in addition to the native slurmstepd child monitoring, we add hooks to monitor out-of-band pgids via the newly introduced strack/sentinel mechanism.

The resulting source patches to accomplish this integration are not, in our opinion, ready for a proposal
on this mailing list yet for the following reasons:

- we would need to re-base on 2.3 and/or 2.4. Can someone confirm?
- the source patches are quite large.

  initd.sysconfig.patch  : 3 files changed, 37 insertions(+), 16 deletions(-)
  sentinel.patch         : 50 files changed, 3334 insertions(+), 28 deletions(-)
  sgimpi.patch           : 18 files changed, 1089 insertions(+), 5 deletions(-)
  slurm.modulefile.patch : 1 file changed, 28 insertions(+)

We need some guidance on a process acceptable to the SLURM community for submitting the above patches. I presume a documented (details, dos, don'ts, why, ...) approach is probably required.

  Note the source RPM is, of course, shipped on the SGI SLURM iso.
  Please let me know if you'd like to look at it.

We hope to integrate the above into the stock SLURM release in the following year.

- we believe a safe soak time (customer-reported bugs to us, etc.) is necessary.
- the initial SGI release supports the Altix ICE cluster. We don't support large SSI machines yet (a UV with 1024 cores, for example) because they would require additional optimizations. In particular, proctrack/job_acct_gather need to relieve the pressure of reading /proc/<pid>/stat for every pid. Why? Because on an idle 512-CPU machine we see:

  #nproc=8867  #kthreads=8813  kthreads/nproc=99.39%

In other words, kernel threads are useless to scan over and over for every user job and step. I am working on a separate (GPL) solution to scan those kthreads once and for all and share the result, hence relieving the pressure. That separate solution would then be integrated into SLURM as an optional feature:
  - dlopen the optional library
  - if present: use it
  - else: continue as before.

In addition, the SGI MPI plugin would require some adjustments for SSI machines.

Cheers
