Hi Jordi,

Thanks for sharing your initial work on this!

On 12/08/14 08:48, Jordi Blasco wrote:
Hi,

The slurm_job.py relies on the --wrap option of the Slurm Workload Manager, which simplifies the code quite a lot. The code is still at a very early stage, and needs a clean-up as well as some effort on parallelbuild.py.
https://github.com/jordiblasco/easybuild-framework/tree/slurm
Easier to spot what actually changed: https://github.com/jordiblasco/easybuild-framework/compare/slurm .
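For readers unfamiliar with --wrap: sbatch generates a trivial batch script around a single command line, so a job backend mostly just has to build an sbatch command. A minimal sketch of that idea, with hypothetical option values (not taken from slurm_job.py itself):

```python
# Sketch of submitting one EasyBuild run through sbatch --wrap, which
# makes sbatch generate the batch script around a single command line.
# Partition, walltime and easyconfig name below are made-up examples.

def build_sbatch_command(easyconfig, partition="compute", time="12:00:00"):
    """Return the argv list for submitting one 'eb' run via sbatch."""
    return [
        "sbatch",
        "--partition=%s" % partition,
        "--time=%s" % time,
        # --wrap lets sbatch wrap this command in a generated script
        "--wrap=eb %s --robot" % easyconfig,
    ]

cmd = build_sbatch_command("GCC-4.8.3.eb")
print(" ".join(cmd))
```

The actual slurm_job.py passes different options, of course; the point is only that --wrap removes the need to generate and stage a job script yourself.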


The major problem we are facing is privilege escalation. We use a special user account to install all the applications in the right place and with the right permissions, but this requires root privileges, so I have been looking for quick alternatives.
Hmm, can you elaborate here? Why exactly would the special user account require root privileges? W.r.t. privilege escalation, do you mean that e.g. running "newgrp - easybuild" and then "eb --job ..." doesn't yield the expected results (i.e., installing something with eb under the 'easybuild' group)?


I have developed an EB command-line wrapper that covers most of the needs we have at NeSI. It doesn't resolve the dependencies into different jobs, but it allows us to build the applications on all the architectures at the same time. In addition, it provides some useful features. Thanks to simple rules in the sudoers file, we can submit the jobs as the aforementioned user, solving all the potential conflicts regarding the ACLs. It also provides a simple logging system that lets you track who installed what, and when.
https://github.com/jordiblasco/slurm-utils
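For anyone wondering what such sudoers rules can look like, here is a minimal sketch; the group, account and wrapper path are made-up placeholders, not the actual NeSI configuration:

```
# Hypothetical /etc/sudoers.d/easybuild entry (edit with visudo):
# members of the 'hpcadmin' group may run the submission wrapper as the
# dedicated 'swmgr' software account, without a password.
%hpcadmin ALL = (swmgr) NOPASSWD: /usr/local/bin/eb-submit
```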

I hope it can be helpful.

Regards,

Jordi



On 9 August 2014 06:16, Pablo Escobar Lopez <[email protected]> wrote:

    Hello Miguel :)

    as you already mentioned, neither LSF nor Slurm is officially
    supported yet. Even if they were supported, I would suggest
    first learning how easybuild works without the --job option,
    because that is not a widely tested option. So I think it's
    better to start learning how easybuild works without submitting
    to a scheduler, and once you are used to how easybuild works,
    then start testing with the --job option.

    The approach I use to run easybuild on different clusters is to
    have a different easybuild config file for each of my clusters
    (where I define different paths for install_dir or modules_dir),
    and then run the same easyconfig (.eb file) on the different
    login nodes using the specific easybuild config file for that
    cluster. This way, I write a single easyconfig which I execute
    on each of my clusters' login nodes, so the compilation is
    optimized for each machine. Automating this is quite simple. If
    you want more details about this specific setup, just email the
    list.
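As a rough illustration of this setup (file names, paths and option values are made up, and the exact EasyBuild option names may differ between versions):

```
# One config file per cluster, e.g. /etc/easybuild/cluster-a.cfg:
#
#   [config]
#   installpath = /apps/cluster-a
#   buildpath   = /scratch/cluster-a/eb-build
#
# then, on cluster-a's login node, point eb at that cluster's config
# file and run the shared easyconfig; repeat on each cluster:
eb --configfiles=/etc/easybuild/cluster-a.cfg GCC-4.8.3.eb --robot
```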

    regards,
    Pablo.







    2014-08-08 18:08 GMT+02:00 Kenneth Hoste <[email protected]>:

        Hi Riccardo,

        On 08/08/14 17:48, Riccardo Murri wrote:
        > Hi Miguel, all,
        >
        > On 8 August 2014 12:41, Miguel Bernabeu Diaz
        > <[email protected]> wrote:
        >> I'm not sure if all or at least the most common schedulers'
        >> CLI could be abstracted in this manner, as I've only worked
        >> with Slurm and LSF. Either way, would the community be
        >> interested in this kind of abstraction? Also, has someone
        >> worked on something similar, or a port to Slurm or LSF we
        >> could extend or reuse?
        > We too would probably be interested in batch-system
        > independence, although we're in no hurry. (This would fit in
        > the framework of a project that will only start later this
        > year.)
        I agree this would be a very nice feature indeed. --job is
        very useful for us, but it probably really only works for us:
        you basically need Torque + pbs_python (and maybe even need
        to align their versions a bit, to make it worse).

        > Actually, if I am allowed a shameless self-plug, we already
        > have a Python framework that can submit and manage jobs on
        > different batch-queuing systems, see
        > http://gc3pie.googlecode.com/

        That sounds interesting!

        Let me pick up a crazy project idea we wrote up some time ago:
        https://gist.github.com/boegel/9225891 .

        How does gc3pie relate to that?

        > I am not familiar with EasyBuild internals, but GC3Pie's
        > job control reduces to a few lines that should be
        > relatively quick to plug in:
        >
        >      from gc3libs import Application
        >      from gc3libs.core import Engine
        >
        >      task = Application(['some', '-unix', '+command', 'here'], ...)
        >      engine = Engine(...)
        >      engine.add(task)
        >      # run task and wait for it to finish
        >      engine.progress()
        >
        > If there is interest, I can look at the sources and try to
        > estimate how much work it would be to integrate GC3Pie and
        > EasyBuild.
        The first step should be to abstract the current support for
        --job into a generic class, and make what's there now derive
        from that (probably naming it PbsPython).

        Then, SLURM & LSF could be just another version of that, and
        so could gc3pie and DRMAA.

        Unless gc3pie solves all our problems, in which case that
        would be even better. ;-)

        As the project idea gist shows, supporting different batch
        systems is really a project of its own.
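The backend abstraction described above could look roughly like the sketch below. The class name PbsPython comes from the mail itself; every other name (the base class, the method names, the Slurm stub) is hypothetical and not actual EasyBuild API:

```python
# Sketch of the proposed refactoring: a generic job-backend class, with
# the existing pbs_python-based code re-expressed as one subclass, and
# other schedulers added as sibling subclasses. Illustrative only.

class JobBackend(object):
    """Generic interface for submitting EasyBuild builds as jobs."""

    def make_job(self, command, name):
        raise NotImplementedError

    def submit(self, job):
        raise NotImplementedError

class PbsPython(JobBackend):
    """Current Torque/pbs_python support, behind the generic interface."""

    def make_job(self, command, name):
        return {"command": command, "name": name, "backend": "pbs_python"}

    def submit(self, job):
        # a real implementation would call into the pbs_python bindings
        return "submitted %s via pbs_python" % job["name"]

class Slurm(JobBackend):
    """A future backend; could e.g. shell out to sbatch --wrap."""

    def make_job(self, command, name):
        return {"command": command, "name": name, "backend": "slurm"}

    def submit(self, job):
        return "submitted %s via sbatch" % job["name"]

backend = PbsPython()
job = backend.make_job("eb GCC-4.8.3.eb --robot", "build-GCC")
print(backend.submit(job))
```

A gc3pie or DRMAA backend would then just be another JobBackend subclass, which is the point of the abstraction.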


        K.




-- 
    Pablo Escobar López
    HPC systems engineer
    Biozentrum, University of Basel
    Swiss Institute of Bioinformatics SIB
    Email: [email protected]
    Phone: +41 61 267 15 82
    http://www.biozentrum.unibas.ch


