-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 03/05/13 14:30, Ralph Castain wrote:

> On May 2, 2013, at 9:18 PM, Christopher Samuel 
> <sam...@unimelb.edu.au> wrote:
> 
>> We're using Slurm, and it supports them already apparently, so I'm 
>> not sure if that helps?
> 
> It does - but to be clear: you're saying that you can directly launch 
> processes onto the Phis via srun?

Ah no, Slurm 2.5 supports them as coprocessors, allocated as GPUs are.

I've been told Slurm 2.6 (under development) may support them as nodes
in their own right, but that's not something I've had time to look into
myself (yet).

> If so, then this may not be a problem, assuming you can get
> confirmation that the Phis have direct access to the interconnects.

I'll see what I can do. There is a long README here, which will be my
light reading on the train home tonight:

http://registrationcenter.intel.com/irc_nas/3047/readme-en.txt

The section quoted below seems to indicate how that works, but other
parts of the README imply that it *may* require Intel True Scale
InfiniBand adapters:

3.4  Starting Intel(R) MPSS with OFED Support

  1) Start the Intel(R) MPSS service. Section 2.3, "Starting Intel(R) MPSS 
     Services" explains how.  Do not proceed any further if Intel(R) MPSS is not
     started.    

  2) Start IB and HCA services. 
            user_prompt> sudo service openibd start
            user_prompt> sudo service opensmd start

  3) Start The Intel(R) Xeon Phi(TM) coprocessor specific OFED service.
            user_prompt> sudo service ofed-mic start

  4) To start the experimental ccl-proxy service (see /etc/mpxyd.conf)
            user_prompt> sudo service mpxyd start

3.5  Stopping Intel(R) MPSS with OFED Support 

    o If the installed version is earlier than 2.x.28xx unload the driver using:
            user_prompt> sudo modprobe -r mic

    o If the installed version is 2.x.28xx or later, unload the driver using:   
   
            user_prompt> sudo service ofed-mic stop
            user_prompt> sudo service mpss stop        
            user_prompt> sudo service mpss unload        
            user_prompt> sudo service opensmd stop
            user_prompt> sudo service openibd stop

    o If the experimental ccl-proxy driver was started, unload the driver using:
            user_prompt> sudo service mpxyd stop

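As a rough aid to reading section 3.4, the start order could be written
down as a small dry-run script. This is only a sketch: the function name
ofed_start_cmds is made up for illustration, it only prints the commands
quoted from the README rather than running them, and it leaves out the
MPSS start itself, since section 2.3 is not quoted here.

```shell
#!/bin/sh
# Dry-run sketch: print the README's section 3.4 commands in order.
# Nothing is executed. Starting MPSS itself (section 2.3) is not covered.
# ofed_start_cmds is a hypothetical helper name, not part of MPSS or OFED.
ofed_start_cmds() {
    with_proxy="$1"   # "yes" to include the experimental ccl-proxy service

    # Steps 2 and 3: IB/HCA services first, then the Phi-specific OFED service.
    for svc in openibd opensmd ofed-mic; do
        echo "sudo service $svc start"
    done

    # Step 4 is optional and marked experimental in the README.
    if [ "$with_proxy" = "yes" ]; then
        echo "sudo service mpxyd start"
    fi
}

ofed_start_cmds no
```

Calling it with "yes" instead of "no" appends the mpxyd line from step 4.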
> If the answer to both is "yes", then just srun the MPI procs
> directly - we support direct launch and use PMI to wireup. Problem
> solved :-)

That would be ideal. I'll do more digging into Slurm 2.6 (we had
planned on starting off with that, but with the Phis as coprocessors;
this may be enough for us to change).

> And yes - that support is indeed in the 1.6 series...just configure 
> --with-pmi. You may need to provide the path to where pmi.h is 
> located under the slurm install, but probably not.

Brilliant, thanks!
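
For my own notes, here is a sketch of what that might look like. It only
prints the commands rather than running them. The /usr default for
SLURM_PREFIX, the include/slurm/pmi.h location, the helper name
pmi_configure_args, the process count and the binary name ./my_mpi_app
are all assumptions for illustration, not something confirmed in this
thread.

```shell
#!/bin/sh
# Sketch only: print (not run) an Open MPI configure line with PMI support,
# followed by a direct-launch srun line.
# SLURM_PREFIX is an assumed install location; adjust for your site.
SLURM_PREFIX="${SLURM_PREFIX:-/usr}"

# Hypothetical helper: choose the --with-pmi form for a given Slurm prefix.
pmi_configure_args() {
    prefix="$1"
    # Assumption: Slurm installs pmi.h under <prefix>/include/slurm.
    if [ -f "$prefix/include/slurm/pmi.h" ]; then
        # Header found under this prefix, so pass the path explicitly.
        echo "--with-pmi=$prefix"
    else
        # Otherwise use the bare flag and let configure search for it.
        echo "--with-pmi"
    fi
}

echo "./configure $(pmi_configure_args "$SLURM_PREFIX")"

# Once built, direct launch under Slurm would be along these lines
# (process count and binary name are placeholders):
echo "srun -n 16 ./my_mpi_app"
```

Setting SLURM_PREFIX in the environment before running the script changes
which prefix is checked for pmi.h.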

All the best,
Chris
- -- 
 Christopher Samuel        Senior Systems Administrator
 VLSCI - Victorian Life Sciences Computation Initiative
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
 http://www.vlsci.org.au/      http://twitter.com/vlsci

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iEYEARECAAYFAlGDUOMACgkQO2KABBYQAh9lcQCeIp5KjX2PJ/2Cia6fc51hSjFW
26UAn1eKqTqjZil7S8xwJrDDL5wkGof/
=2A67
-----END PGP SIGNATURE-----
