Howdy Gunnar,

If you do decide to use EB you might look at my
OpenMPI-system-GCC-system.eb available in the
tar file here:

http://www.siliconslick.com/easybuild/easyconfigs/terra/

It was basically just an attempt to wrap Intel's provided,
OPA-optimized OpenMPI so I could use it in EasyBuild.
I had full iompi/gompi using Intel's (RPM-based) MPIs,
but am not seeing them at the moment.

But there is certainly a way to do it (e.g., on Power7/8 we
rely upon the vendor-installed version).

Jack Perdue
Lead Systems Administrator
High Performance Research Computing
TAMU Division of Research
[email protected]    http://hprc.tamu.edu
HPRC Helpdesk: [email protected]

On 01/29/2017 07:51 AM, Gunnar Sauer wrote:
Hello Kenneth,
thanks for coming back to my question. I am sorry to say that I cannot follow the EasyBuild route anymore for the purpose of my internship, but I am definitely interested in solving the problems (described below) for myself and for my future career. (I'll have to buy access to some public cluster like Sabalcore or see whether I can work on our university cluster without a specific project. So it may take 1-2 weeks until I can proceed.)

I have run the HPCC benchmark (version 1.5.0 from http://icl.cs.utk.edu/hpcc/software/index.html) once with the preinstalled gcc/openmpi/openblas of the company's Xeon cluster, and a second time with the foss/2016b toolchain built previously on the same cluster. This was meant as a quick check of whether the forum users were right in saying that it doesn't matter for MPI performance whether you use an optimized OpenMPI version or the generic EasyBuild OpenMPI built from source, or whether our engineers were right in saying that you have to use the system tools, including an OpenMPI that has been set up for the Infiniband hardware, if you want any decent MPI performance.

When I presented the numbers below, showing ping pong latencies of 10000 us (EasyBuild) compared to 2 us (system tools), we had a quick discussion, and my task is now to write a build script independent of EasyBuild, respecting the existing tools. Here are the results of the ping pong test, first for the system tools (see also attached hpccoutf.system for the complete HPCC output), second for the foss/2016b toolchain (see also attached hpccoutf.eb):

System compiler, openmpi, openblas:

Major Benchmark results:
------------------------

Max Ping Pong Latency:                 0.002115 msecs
Randomly Ordered Ring Latency:         0.001384 msecs
Min Ping Pong Bandwidth:            2699.150014 MB/s
Naturally Ordered Ring Bandwidth:    549.443306 MB/s
Randomly  Ordered Ring Bandwidth:    508.267423 MB/s

EasyBuild foss/2016b toolchain:

Major Benchmark results:
------------------------

Max Ping Pong Latency:                10.000019 msecs
Randomly Ordered Ring Latency:         4.251704 msecs
Min Ping Pong Bandwidth:              62.532243 MB/s
Naturally Ordered Ring Bandwidth:    134.390539 MB/s
Randomly  Ordered Ring Bandwidth:    144.071750 MB/s
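
To put the two result sets side by side, here is a minimal Python sketch that parses the summary lines quoted above and prints the foss/system ratio for each metric. The helper names (parse, compare) and the regular expression are illustrative assumptions, not part of HPCC or EasyBuild; only the numbers come from the results above.

```python
import re

# Summary lines copied from the two HPCC runs quoted above.
SYSTEM = """\
Max Ping Pong Latency:                 0.002115 msecs
Randomly Ordered Ring Latency:         0.001384 msecs
Min Ping Pong Bandwidth:            2699.150014 MB/s
Naturally Ordered Ring Bandwidth:    549.443306 MB/s
Randomly  Ordered Ring Bandwidth:    508.267423 MB/s
"""

FOSS = """\
Max Ping Pong Latency:                10.000019 msecs
Randomly Ordered Ring Latency:         4.251704 msecs
Min Ping Pong Bandwidth:              62.532243 MB/s
Naturally Ordered Ring Bandwidth:    134.390539 MB/s
Randomly  Ordered Ring Bandwidth:    144.071750 MB/s
"""

# "<metric name>:   <value> <unit>"
LINE = re.compile(r"^(.+?):\s+([0-9.]+)\s+(\S+)\s*$")


def parse(block):
    """Return {metric name: (value, unit)} for one summary block."""
    results = {}
    for line in block.splitlines():
        match = LINE.match(line)
        if match:
            # Collapse repeated spaces so names from both runs compare equal.
            name = " ".join(match.group(1).split())
            results[name] = (float(match.group(2)), match.group(3))
    return results


def compare(system, foss):
    """Return {metric name: foss value / system value}."""
    return {name: foss[name][0] / value for name, (value, _unit) in system.items()}


ratios = compare(parse(SYSTEM), parse(FOSS))
for name, ratio in ratios.items():
    print(f"{name}: foss/system = {ratio:.3g}")
```

For the latency rows a ratio far above 1 means the foss/2016b run is slower; for the bandwidth rows a ratio far below 1 means the same.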

I'd appreciate it if somebody could analyze the attached full outputs and suggest what I have done wrong.

These results are along the lines of previous tests of mine, which showed that EasyBuild's toolchain is not practical for solving large linear equation systems that involve more than one node. As soon as inter-node communication is involved, performance drops from 10-100 Gflop/s to 0.1-0.01 Gflop/s and gets worse the more nodes are involved, even if the system is always scaled to fill 80% of the nodes' memory.

These are just my initial discouraging attempts with EasyBuild. I'll be happy to find out that the performance problem is due to my mistake, because I might not have found the relevant documentation, forgotten to set some compiler flag or something else.

Thank you
Gunnar


2017-01-28 17:38 GMT+01:00 Kenneth Hoste <[email protected]>:

    Hi Gunnar,

    On 25/01/2017 19:08, Gunnar Sauer wrote:
    Hello Jens,

    2017-01-25 13:03 GMT+01:00 Jens Timmerman <[email protected]>:

        Hello Gunnar,


        On 24/01/2017 19:54, Gunnar Sauer wrote:
        > Hello EasyBuild experts,
        >

        > But which toolchain do I choose on the Xeon cluster, which provides
        > all those optimized tools through already existing modules? Can I
        > tweak the goolf toolchain to use the existing system modules?
        Yes, you could create your own toolchain to use the already existing
        modules, this is exactly how the Cray toolchain works, see
        http://easybuild.readthedocs.io/en/latest/Using_external_modules.html
        for more information on how to create your own toolchain from existing
        compilers and libraries.


    Ok, I'll try to understand the details of how to set up a new
    toolchain and go down this path. I have found GCC-system, which
    seems to lead in the right direction. Would it be feasible to
    extend GCC-system to include OpenMPI-system and OpenBLAS-system
    in a similar fashion?

    The GCC-system easyconfig file leverages the SystemCompiler easyblock.

    To also support OpenMPI-system and OpenBLAS-system, a similar
    SystemLibrary easyblock should be created that forces you to
    specify the required information about the system library you
    would like to use.
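
For orientation, a GCC-system style easyconfig looks roughly like the sketch below. It is written from memory of the file shipped with EasyBuild at the time of this thread, so the homepage, description and toolchain values should be treated as assumptions and checked against the actual GCC-system.eb. Easyconfigs use Python syntax, which is why the fragment is shown as Python.

```python
# Sketch of a GCC-system style easyconfig (assumed contents, not a verbatim copy).

# The SystemCompiler easyblock wraps a compiler that is already installed
# on the system instead of building one from source.
easyblock = 'SystemCompiler'

name = 'GCC'
# Using 'system' as the version tells the easyblock to use whatever
# compiler version the system provides.
version = 'system'

homepage = 'http://gcc.gnu.org/'
description = "GCC as provided by the operating system (not built by EasyBuild)"

# Assumption: the 'dummy' toolchain, as used by EasyBuild releases of that era.
toolchain = {'name': 'dummy', 'version': 'dummy'}

moduleclass = 'compiler'
```

An OpenMPI-system or OpenBLAS-system file would need a comparable easyblock for libraries, which, as noted above, did not exist yet.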

    Alan's suggestion of just grabbing the module file that is
    generated using "--module-only --force" and adjusting it as needed
    is a good one though, it may take you a long way...



        And yes, these toolchains have infiniband support.

        So, it would be very nice to know what optimizations are being done at
        your company that make the internal toolchain even better optimized, so
        all EasyBuild users could benefit from this knowledge and potentially
        millions of CPU hours could be saved.


    I will see whether they share the details with me, or whether they
    even have the details. As I understood, the cluster has been set
    up and is maintained by an external company. When we discussed
    today using the foss stack, I only got very discouraging answers:
    infiniband couldn't be configured correctly using a generic MPI
    installation procedure, BLAS would be an order of magnitude
    slower unless you put in the correct parameters for the specific
    architecture, etc.
    Nevertheless, I am currently trying to set up the HPL benchmark,
    and I will compare the results with easybuild's foss toolchain
    and with the cluster's 'builtin' toolchain.

    I'd be very interested in hearing more about this, i.e. how the
    benchmark results turned out, how the existing toolchains were
    configured compared to how we tackle things in EasyBuild, etc.

    It's certainly possible that there was some heavy tuning done
    w.r.t. configuration parameters (in particular for the MPI); the
    downside of the easyconfigs we include in EasyBuild is that we
    need to keep them generic enough so that they'll work out of the box.
    For OpenMPI specifically, it makes a lot of sense to tweak the
    corresponding easyconfig file with additional/different
    system-specific configure options.
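
As a rough illustration of such a tweak, the fragment below shows only the configopts parameter of a site-specific copy of an OpenMPI easyconfig. The exact options are assumptions for the sake of the example: --with-verbs and --with-psm2 are Open MPI configure options of that era, but which ones apply depends entirely on the interconnect and the drivers installed on the cluster.

```python
# Hypothetical fragment of a site-specific OpenMPI easyconfig;
# all other parameters (name, version, toolchain, sources, ...) are omitted.

# Options in the spirit of the generic OpenMPI easyconfigs:
# shared libraries plus InfiniBand verbs support.
configopts = '--enable-shared --with-verbs '

# Assumed site-specific addition: PSM2 support for an Omni-Path fabric.
# Replace this with whatever matches the actual hardware.
configopts += '--with-psm2 '
```

Comparing the output of ompi_info from the system OpenMPI with that of the EasyBuild-built one is a quick way to see which components differ between the two builds.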


        I'm really serious here, if you can share this information, we would
        love to hear it so we can incorporate it, but I do understand that this
        might be proprietary information.

        TL;DR:
        If you can share your highly optimized toolchains with us we will be
        pleased to support them in EasyBuild if they can help us get faster
        software runtimes!


    Also thanks for the other replies! I need to gain some more
    experience with EasyBuild before I can make use of all your
    suggestions.

    Don't hesitate to let us know if you have any questions!



    regards,

    Kenneth


