Hi,

The slurm_job.py module relies on the --wrap option of the Slurm Workload Manager, which simplifies the code considerably. The code is still at a very early stage and needs a clean-up, plus some work on parallelbuild.py.

https://github.com/jordiblasco/easybuild-framework/tree/slurm
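For illustration, submitting a build through --wrap boils down to constructing a command line like the one below. This is only a sketch of the idea, not slurm_job.py itself; the function name, default job name, and the foo-1.0.eb easyconfig are made up for the example.

```python
import shlex

def wrap_build_command(build_cmd, job_name="eb-build", time_limit="04:00:00"):
    """Return an sbatch command line that submits `build_cmd` via --wrap.

    --wrap makes sbatch generate a trivial batch script around the given
    shell command, so no job-script file has to be written to disk.
    """
    return [
        "sbatch",
        "--job-name", job_name,
        "--time", time_limit,
        "--wrap", build_cmd,
    ]

# Illustrative submission of an EasyBuild run as a single Slurm job.
print(shlex.join(wrap_build_command("eb foo-1.0.eb --robot")))
```

Because --wrap takes the whole shell command as one argument, the per-job scheduler code reduces to string assembly like this, which is what keeps the wrapper small.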
The major problem that we are facing is privilege escalation. We are using a special user account to install all the applications in the right place and with the right permissions, but this requires root privileges, and for that reason I have been looking for quick alternatives. I have developed an EB command-line wrapper that covers most of the needs we have at NeSI. It doesn't resolve the dependencies into different jobs, but it allows us to build the applications on all the architectures at the same time. In addition, it provides some useful features. Thanks to simple rules in the sudoers file, we can submit the jobs as the mentioned user, solving in this way all the potential conflicts regarding ACLs. It also provides a simple logging system that allows you to track who installed what and when.

https://github.com/jordiblasco/slurm-utils

I hope it can be helpful.

Regards,
Jordi

On 9 August 2014 06:16, Pablo Escobar Lopez <[email protected]> wrote:
> Hello Miguel :)
>
> As you already mentioned, neither LSF nor Slurm is officially supported
> yet. Even if they were, I would suggest starting to learn how EasyBuild
> works without the --job option, because that is not a widely tested
> option. So I think it's better to first learn how EasyBuild works without
> submitting to a scheduler, and once you are used to how EasyBuild works,
> start testing with the --job option.
>
> The approach I use to run EasyBuild on different clusters is to have a
> different EasyBuild config file for each of my clusters (where I define
> different paths for install_dir or modules_dir) and then run the same
> easyconfig (.eb file) on the different login nodes using the specific
> EasyBuild config file for that cluster. This way, I write a single
> easyconfig which I execute on each of my clusters' login nodes, so the
> compilation is optimized for each machine. Automating this is quite
> simple.
> If you want more details about this specific setup, just email the list.
>
> Regards,
> Pablo.
>
>
> 2014-08-08 18:08 GMT+02:00 Kenneth Hoste <[email protected]>:
>
>> Hi Riccardo,
>>
>> On 08/08/14 17:48, Riccardo Murri wrote:
>> > Hi Miguel, all,
>> >
>> > On 8 August 2014 12:41, Miguel Bernabeu Diaz <[email protected]> wrote:
>> >> I'm not sure if all, or at least the most common, schedulers' CLIs
>> >> could be abstracted in this manner, as I've only worked with Slurm
>> >> and LSF. Either way, would the community be interested in this kind
>> >> of abstraction? Also, has someone worked on something similar, or a
>> >> port to Slurm or LSF we could extend or reuse?
>> >
>> > We too would probably be interested in batch-system independence,
>> > although we're in no hurry. (This would fit in the framework of a
>> > project that will only start later this year.)
>>
>> I agree this would be a very nice feature indeed. --job is very useful
>> for us, but it probably really only works for us. You basically need
>> Torque + pbs_python (and maybe even align the versions a bit, to make
>> it worse).
>>
>> > Actually, if I am allowed a shameless self-plug, we already have a
>> > Python framework that can submit and manage jobs on different
>> > batch-queuing systems, see http://gc3pie.googlecode.com/
>>
>> That sounds interesting!
>>
>> Let me pick up a crazy project idea we wrote up some time ago:
>> https://gist.github.com/boegel/9225891 .
>>
>> How does gc3pie relate to that?
>>
>> > I am not familiar with EasyBuild internals, but GC3Pie's job control
>> > reduces to a few lines that should be relatively quick to plug in:
>> >
>> > from gc3libs import Application
>> > from gc3libs.core import Engine
>> >
>> > task = Application(['some', '-unix', '+command', 'here'], ...)
>> > engine = Engine(...)
>> > engine.add(task)
>> > # run task and wait for it to finish
>> > engine.progress()
>> >
>> > If there is interest, I can look at the sources and try to estimate
>> > how much work it would be to integrate GC3Pie and EasyBuild.
>>
>> The first step should be to abstract the current support for --job into
>> a generic class, and make what's there now derive from that (probably
>> naming it PbsPython).
>>
>> Then, SLURM & LSF could be just another version of that, and so could
>> gc3pie and DRMAA.
>>
>> Unless gc3pie solves all our problems, that would be even better. ;-)
>>
>> As the project idea gist shows, supporting different batch systems is
>> really a project on its own.
>>
>>
>> K.
>
>
> --
> Pablo Escobar López
> HPC systems engineer
> Biozentrum, University of Basel
> Swiss Institute of Bioinformatics SIB
> Email: [email protected]
> Phone: +41 61 267 15 82
> http://www.biozentrum.unibas.ch
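On the sudoers approach mentioned at the top of the thread: without having seen the wrapper's internals, a minimal sketch of the idea — submitting on behalf of a dedicated build account via sudo, and logging who requested which installation and when — might look like the following. The swbuild account, the %hpcadmins group, and the exact sudoers rule are all hypothetical, made up for the example.

```python
import getpass
import json
import time

BUILD_USER = "swbuild"  # hypothetical dedicated build account

def sudo_submit_command(eb_args):
    """Build a command that submits the build job as BUILD_USER via sudo.

    A sudoers rule along the lines of
        %hpcadmins ALL=(swbuild) NOPASSWD: /usr/bin/sbatch
    would let admins run exactly this, and nothing else, as swbuild.
    """
    return ["sudo", "-u", BUILD_USER,
            "sbatch", "--wrap", "eb " + " ".join(eb_args)]

def log_entry(eb_args):
    """One JSON line recording who requested which installation, and when."""
    return json.dumps({
        "requested_by": getpass.getuser(),
        "eb_args": eb_args,
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%S"),
    })
```

Because the job runs as the build account from the start, the installed files get the right ownership without any root-owned daemon, and the JSON log gives the who-installed-what-and-when audit trail described above.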
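Kenneth's suggested refactoring — a generic job class that the current --job support derives from, with scheduler-specific subclasses — could be sketched like this. The class and method names are illustrative, not the eventual EasyBuild API, and a real PbsPython backend would use the pbs_python bindings rather than shelling out to qsub.

```python
from abc import ABC, abstractmethod

class JobBackend(ABC):
    """Generic interface the current --job support could be abstracted into."""

    @abstractmethod
    def submit_command(self, script, name):
        """Return the scheduler-specific submission command line."""

class PbsPython(JobBackend):
    # Illustrative only: the real backend would call the pbs_python API.
    def submit_command(self, script, name):
        return ["qsub", "-N", name, script]

class Slurm(JobBackend):
    def submit_command(self, script, name):
        return ["sbatch", "--job-name", name, script]

# Framework code depends only on JobBackend, so supporting LSF, DRMAA or a
# gc3pie-based backend would mean adding one more subclass.
for backend in (PbsPython(), Slurm()):
    print(backend.submit_command("build.sh", "eb-job"))
```

The point of the abstraction is exactly what the thread describes: once the interface exists, each scheduler (or a whole meta-layer like GC3Pie) becomes one interchangeable implementation.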

