On 11/08/14 11:27, Alan O'Cais wrote:



On 8 August 2014 22:18, Kenneth Hoste <[email protected] <mailto:[email protected]>> wrote:

    I just started working with HMNS yesterday and I had a couple
    of issues:

      * Dependencies were quite picky, they have to be exactly
        specified (e.g. if the toolchain in use for an easyconfig was
        goolf and there's a dependency on something built with GCC,
        you need to explicitly include the compiler in the
        dependency). I have a feeling this is to do with when it goes
        searching for the .eb file to figure out what the module
        path/name should have been. It could instead break the
        toolchain down and search for all possibilities (in this case
        gompi and then GCC).

    Excellent idea, I think this matches what's discussed in
    https://github.com/hpcugent/easybuild-framework/issues/741?

    It's "just" a matter of someone putting in the work to make this
    happen, it's not terribly difficult imho, and the required support
    to be able to do this is probably there now.

    Off the top of my head, without trying it at all: the toolchain
    definitions in the framework need to be made aware of which
    subtoolchains 'apply' to them, and then some magic needs to happen
    to figure out the particular version to look for.
    https://github.com/hpcugent/easybuild-framework/issues/422 may
    also be relevant.
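
A minimal sketch of the subtoolchain search discussed above, assuming a hand-written lookup table. The table, its version numbers and both function names are hypothetical and are not part of the EasyBuild framework; only the `<name>-<version>-<toolchain>-<toolchain version>.eb` file naming follows the usual easyconfig convention.

```python
# Hypothetical table: (toolchain, version) -> (subtoolchain, version).
# This stands in for the "magic" that figures out which version to look for;
# the entries are illustrative only.
SUBTOOLCHAIN_VERSIONS = {
    ('goolf', '1.4.10'): ('gompi', '1.4.10'),
    ('gompi', '1.4.10'): ('GCC', '4.7.2'),
}


def toolchain_chain(name, version):
    """Return [(name, version), ...] from the given toolchain down to its last known subtoolchain."""
    chain = [(name, version)]
    while (name, version) in SUBTOOLCHAIN_VERSIONS:
        name, version = SUBTOOLCHAIN_VERSIONS[(name, version)]
        chain.append((name, version))
    return chain


def candidate_easyconfigs(dep_name, dep_version, tc_name, tc_version):
    """List easyconfig filenames to try for a dependency, most specific toolchain first."""
    return ['%s-%s-%s-%s.eb' % (dep_name, dep_version, tc, tc_ver)
            for tc, tc_ver in toolchain_chain(tc_name, tc_version)]


if __name__ == '__main__':
    for filename in candidate_easyconfigs('zlib', '1.2.8', 'goolf', '1.4.10'):
        print(filename)
```

With such a list, a dependency built with plain GCC could be found from a goolf easyconfig without spelling out the compiler in the dependency specification.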


I realise this is perhaps specific to a pure HMNS installation, but for dependencies within this scheme, if I load the toolchain I'm building with and can see the module, then the dependency is met (since the GCC and gompi module paths are automatically included in a goolf toolchain).

Yes and no, see further down.


      * That it goes searching for .eb files was a problem for me
        since I had a couple of "fake" modules wrapping some system
        stuff. Porting them wasn't trivial since I had to also go
        create "fake" eb files.

    I can see why this is an issue, but for a non-trivial custom module
    naming scheme (like a hierarchical one), EB *needs* parsed
    easyconfigs for everything, since it needs to determine module
    names for all dependencies. I don't know how I would otherwise
    determine the 'moduleclass' for example, something which is relied
    upon by the HierarchicalMNS. If you can define a hierarchical
    module naming scheme without needing anything more than 'name',
    'version', 'versionsuffix' and 'toolchain', then you can modify
    the REQUIRED_KEYS list. If the EasyBuild framework notices that no
    other parameters are required, it won't try to search for an
    easyconfig file for all dependencies...
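
To illustrate the REQUIRED_KEYS point, here is a rough standalone sketch. The class below is a toy and not the real HierarchicalMNS or the framework's base class; only the REQUIRED_KEYS attribute and the four key names come from the discussion above. The 'Core' prefix, the toolchain-based layout and the helper `needs_parsed_easyconfig` are assumptions made for the example.

```python
# The four parameters that are available without parsing an easyconfig file.
BASIC_KEYS = set(['name', 'version', 'versionsuffix', 'toolchain'])


class MinimalHierarchicalScheme(object):
    """Toy naming scheme (hypothetical) that only needs the basic keys."""

    REQUIRED_KEYS = ['name', 'version', 'versionsuffix', 'toolchain']

    def det_full_module_name(self, spec):
        """Build a module name from a plain dependency specification (a dict)."""
        missing = [key for key in self.REQUIRED_KEYS if key not in spec]
        if missing:
            raise KeyError("missing keys: %s" % ', '.join(missing))
        tc = spec['toolchain']
        if tc['name'] == 'dummy':
            prefix = 'Core'  # assumed location for toolchain-independent modules
        else:
            prefix = '%s/%s' % (tc['name'], tc['version'])
        return '%s/%s/%s%s' % (prefix, spec['name'], spec['version'], spec['versionsuffix'])


def needs_parsed_easyconfig(scheme):
    """True if the scheme asks for anything beyond the basic keys (e.g. 'moduleclass')."""
    return not set(scheme.REQUIRED_KEYS).issubset(BASIC_KEYS)


if __name__ == '__main__':
    dep = {'name': 'CMake', 'version': '5.0.0', 'versionsuffix': '',
           'toolchain': {'name': 'GCC', 'version': '4.8.3'}}
    scheme = MinimalHierarchicalScheme()
    print(scheme.det_full_module_name(dep))
    print(needs_parsed_easyconfig(scheme))
```

A scheme that also listed 'moduleclass' in REQUIRED_KEYS would make `needs_parsed_easyconfig` return True, which corresponds to the case where an easyconfig file (real or "fake") has to exist for every dependency.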

Wouldn't the same logic as above work in a HMNS? If the toolchain is loaded and it's possible to load <name>/<version>, then the dependency is met. Is this approach too specific to HMNS to be useful?

Yes, but I think EasyBuild is currently checking for availability of modules with the *full* module name rather than the short one. I.e., it'll check for Compiler/GCC/4.8.3/CMake/5.0.0 rather than just CMake/5.0.0 .

This is done so that the GCC module does not have to be loaded (to ensure a correct $MODULEPATH) before EB can check dependencies. Loading is of course only done using the short name (which works, because EB will load the toolchain first).
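
The difference between the full and the short module name can be sketched as follows. This is an illustration only: it assumes the short name is always the last two path components (<name>/<version>), and it checks availability by looking for a module file on disk without an extension, which is not necessarily how EasyBuild performs the check. Both function names are hypothetical.

```python
import os


def split_full_module_name(full_name):
    """Split a full module name into ($MODULEPATH subdirectory, short name).

    Assumes the short name is the last two components, <name>/<version>.
    """
    parts = full_name.split('/')
    return '/'.join(parts[:-2]), '/'.join(parts[-2:])


def module_file_exists(modules_root, full_name):
    """Check for a module file by its full name, without touching $MODULEPATH."""
    return os.path.isfile(os.path.join(modules_root, full_name))


if __name__ == '__main__':
    subdir, short_name = split_full_module_name('Compiler/GCC/4.8.3/CMake/5.0.0')
    print(subdir)      # the part that GCC/4.8.3 would add to $MODULEPATH
    print(short_name)  # the name used for the actual 'module load'
```

Checking by full name works before any compiler module is loaded, whereas the short name is only resolvable once the toolchain has extended $MODULEPATH.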



      * It seemed to have problems with installing a complete set of
        dependencies at once. I tried to use "--try-toolchain=XXX
        ./*.eb" on the complete set of config files to build Python,
        but it said it couldn't find the dependencies. When I did
        them one by one, the build succeeded.

    Someone else has also reported an issue with the recursive
    --try-toolchain (i.e. with --robot enabled) not working when using
    a HierarchicalMNS.
    Please open an issue for this with some more details, this needs
    to be looked into.


Will do

Done, thanks, see https://github.com/hpcugent/easybuild-framework/issues/1001 .


      * I had problems installing EasyBuild via bootstrapping, there
        was some kind of lmod related error that I think I've heard
        of in the past ( value "false"???module path???, tried to
        tell lmod to ignore the cache but didn't work). Happened at
        the end of the day so didn't get to fix it.

    OK, the bootstrap failing is a serious issue. When that happens,
    please enable debug mode by editing the bootstrap script (set
    'debug = True' somewhere at the top of the script), and open an
    issue providing the full debug output.

This is resolved. The problem was that I was trying to transition from the old NS to HMNS and keeping the old modules available while I was doing it. I had "EasyBuild/1.14.0" loaded with lmod when trying to bootstrap 1.14.0 into HMNS, which led to a conflict (since EasyBuild has no dependencies, the names are identical in both schemes). Bootstrapping without the loaded module worked fine.
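
A quick pre-bootstrap check along these lines could look like the sketch below. It relies only on the LOADEDMODULES environment variable, a colon-separated list of loaded modules maintained by the modules tool; the function name is hypothetical and the check is not part of the bootstrap script.

```python
import os


def conflicting_loaded_modules(short_name, environ=None):
    """Return loaded modules that provide the same software as short_name.

    short_name is expected in <name>/<version> form, e.g. 'EasyBuild/1.14.0'.
    """
    environ = os.environ if environ is None else environ
    loaded = [mod for mod in environ.get('LOADEDMODULES', '').split(':') if mod]
    software = short_name.split('/')[0]
    return [mod for mod in loaded if mod.split('/')[0] == software]


if __name__ == '__main__':
    conflicts = conflicting_loaded_modules('EasyBuild/1.14.0')
    if conflicts:
        print("Unload these modules before bootstrapping: %s" % ', '.join(conflicts))
```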

OK, thanks for the info. Mixing modules that were installed with different module naming schemes is indeed bound to lead to problems.

For EasyBuild specifically though, there's a fix in the works for this particular problem: https://github.com/hpcugent/easybuild-easyblocks/pull/428 .


regards,

Kenneth



    On 8 August 2014 17:24, Kenneth Hoste <[email protected]
    <mailto:[email protected]>> wrote:

        Hi Jack,

        On 08/08/14 16:32, Jack Perdue wrote:
        Howdy Ken,

        re: Hierarchical MNSs

        I looked over the notes below.  Before y'all
        head home for the weekend (I may be too late), do
        you happen to have any code updates for supporting
        HMNSs online that I could ponder/play with?

        I was planning to send out a status report on HMNS support
        soon given the significant interest in this feature, and poll
        for experiences with experimenting with it, but you beat me
        to it. ;-)

        Different people have reported a couple of issues, some are
        yet to be resolved.

        Kilian reported a couple of issues with using an Intel
        toolchain (e.g. 'ictce' or 'intel') in combination with a
        hierarchical module naming scheme [1].

        One aspect of that is a couple of mistakes in the
        HierarchicalMNS, fixes are available in framework PR#986 [2].

        Another aspect is that the currently provided easyconfigs for
        impi and imkl use the 'dummy' toolchain, which isn't correct
        in a HMNS context.
        A new version of the intel toolchain that resolves this is
        being contributed in easyconfigs PR#1014 [3].
        In there, the problem of the GCC dependency of icc/ifort also
        extending the $MODULEPATH is being tackled as well (although
        proper support for this in the framework still needs to be
        added, see the comment in the GCC-4.8.3-libs.eb easyconfig
        file included there).

        I'm not keen on changing the other ictce/intel toolchain in a
        similar way, even though the change has limited impact for
        people already using them in production.
        It basically only results in slightly different module names
        for the toolchain components, and one or two additional
        modules (e.g. for the new intermediate iimpi toolchain).

        Next to that, two other issues have been reported. One by Ian
        w.r.t. the conflict statement in generated modules [4] (fix
        is pending, but pretty straightforward), and one by Olav
        w.r.t. 'module load' statements for compiler/MPI being
        included in module files for applications (e.g. "module load
        icc" in a module for HPL) [5]. Not only is this senseless,
        since you need to load the compiler and MPI before you can
        even see the application modules, it's also wrong, and it
        causes problems when unloading (and hence also swapping) modules.
        The latter is a serious bug, that basically renders the
        current HMNS support crippled. Up until now, it has only been
        discussed on the #easybuild IRC channel and via mail (outside
        of the ML).
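
The 'module load' problem reported in [5] can be pictured with the following sketch. It is hypothetical and not the framework's module generator: it assumes the set of modules that extend $MODULEPATH (compiler and MPI) is known up front, and simply leaves those out of the load statements when a hierarchical scheme is in use.

```python
def load_statements(dependencies, modulepath_extenders, hierarchical):
    """Return the 'module load' lines to put in a generated module file.

    dependencies:         names of modules the software depends on
    modulepath_extenders: names of modules that extend $MODULEPATH (compiler, MPI)
    hierarchical:         True when a hierarchical module naming scheme is used
    """
    lines = []
    for dep in dependencies:
        if hierarchical and dep in modulepath_extenders:
            # Already loaded by necessity: the application module is not even
            # visible otherwise, so loading it again only breaks unload/swap.
            continue
        lines.append('module load %s' % dep)
    return lines


if __name__ == '__main__':
    deps = ['icc', 'ifort', 'impi', 'imkl']
    extenders = set(['icc', 'ifort', 'impi'])
    print(load_statements(deps, extenders, hierarchical=True))
    print(load_statements(deps, extenders, hierarchical=False))
```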

        I hope to find time next week to work on the open problems,
        and get the proposed fixes that are already available merged in.
        Maybe I can get around to getting a bugfix release out (v1.14.1),
        but no promises there. In any case, these issues should get
        resolved by EasyBuild v1.15.0 (early Sept'14).

        If anyone is aware of other issues, please come forward.

        Maybe we should also set up a conference call with the people
        interested in using a hierarchical module naming scheme with
        EasyBuild?
        Who would be interested in that?


        regards,

        Kenneth

        [1] https://github.com/hpcugent/easybuild-framework/issues/980
        [2] https://github.com/hpcugent/easybuild-framework/pull/986
        [3] https://github.com/hpcugent/easybuild-easyconfigs/pull/1014
        [4] https://github.com/hpcugent/easybuild-framework/issues/994
        [5] https://github.com/hpcugent/easybuild-framework/issues/996

        We are rolling out a new cluster and would
        really like to rebuild everything using
        HMNS before opening the system for production
        (since changes to the module system will be much
        more painful afterwards).  As such, I've gotten
        approval to spend some time on this.

        My Python kung-fu is not so great, but I'm slowly
        learning more (via deploying/updating EasyBuild and
        Galaxy [a bio/bio web interface]).  I had some
        issues with my initial test of the HMNS (which I
        need to repeat since it was back when 14 first came out).
        I plan on trying again this weekend, so if you
        have a fork somewhere that has your latest updates,
        I'd love to look it over.

        Thanks,

        Jack Perdue
        Lead Systems Administrator
        TAMU Supercomputing Facility
[email protected] <mailto:[email protected]> http://sc.tamu.edu SC Helpdesk:[email protected] <mailto:[email protected]>


        ----- Original Message -----
        From: "Kenneth Hoste" <[email protected]>
        To: "EasyBuild" <[email protected]>
        Sent: Tuesday, August 5, 2014 9:27:54 AM
        Subject: Re: [easybuild] EasyBuild conference call: Aug 5th 2014, 3pm CET

        Notes on the conf call of this afternoon are available at
        https://github.com/hpcugent/easybuild/wiki/Conference-call-notes-20140805 .

        On 05/08/14 09:47, Kenneth Hoste wrote:
        Hello EasyBuilders,

        The next EasyBuild conference call is planned for today, Aug 5th 2014,
        3pm - 3.30pm (CET).

        Topics that will be discussed include:

             *) EasyBuild v1.15 release planning
             *) improvements w.r.t. using a dummy/system toolchain
             *) status update of support for hierarchical modules

        Suggestions for additional topics are welcome.
        Please let me know if you're planning to attend this conf call.

        More information about the EasyBuild conference calls is available at
        https://github.com/hpcugent/easybuild/wiki/Conference-calls .


        regards,

        Kenneth




-- Dr. Alan O'Cais
    Application Support
    Juelich Supercomputing Centre
    Forschungszentrum Juelich GmbH
    52425 Juelich, Germany

    Phone: +49 2461 61 5213
    Fax: +49 2461 61 6656
    E-mail: [email protected]
    WWW: http://www.fz-juelich.de/jsc/


    





