Paddy,

Are you trying to allow one user to use a MIC on a node while a different
user works on the host?

We don't allow this at TACC, and as such, we don't do anything special
when it comes to scheduling on the MIC. There's basically no way to
prevent a user from running offload code, so I think you'd be hard pressed
to schedule around the situation where one user wants to run native and
another says they're running host-only but actually runs an offload code.

Maybe you have something else in mind?

Best,
Bill.
--
Bill Barth, Ph.D., Director, HPC
[email protected]        |   Phone: (512) 232-7069
Office: ROC 1.435             |   Fax:   (512) 475-9445







On 6/27/14 4:06 AM, "Paddy Doyle" <[email protected]> wrote:

>
>Hi Pardo,
>
>Sorry for the lack of reply.. I've been away etc.
>
>See below for the notes I took while compiling. I relied on a lot of advice
>from Olli-Pekka Lehto, in particular the binfmt_misc hack to allow some of
>the ./configure checks to be executed directly on the Phi cards, even while
>compiling on the host.
>
>It also needed some hacking of the configure/autoconf scripts for munge and
>slurm.
>
>Note that I had to cross-compile munge and openssl as well, for our setup. I
>don't know if people would need to do that in general.
>
>Versions (a bit old by now):
>slurm: 2.6.5
>munge: 0.5.10
>openssl: 1.0.0
>mpss: 3.1.1
>
>
>
>Hope that helps!
>
>I'd still really like to know if anyone has a good (slurm) solution for
>handling the queuing of the many modes of MIC/Phi execution: native,
>offload, and symmetric with Intel MPI?
>
>Of course our users want to be able to mix all 3 modes, but I'm struggling
>to see how that can be done within the confines of the queuing system.
>
>Paddy
>
>
>
>
>
>#######################################################################
># working finally! use the config/make lines below
>#######################################################################
>
># used the binfmt_misc hack, *** while on the host ***
># https://software.intel.com/en-us/forums/topic/391645#comment-1734339
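For readers who have not seen the binfmt_misc trick: the idea is to have the host kernel hand any k1om ELF binary to a wrapper that runs it on the card, so configure test programs "just work". The sketch below only builds the registration rule. It is not taken from the linked forum post; the rule name "k1om" and the wrapper path /usr/local/bin/run-on-mic are assumptions made up for illustration, and the wrapper itself (for example something that runs its argument via ssh on mic0) is not shown. The match relies on the ELF e_machine value for K1OM being 181 (0xb5).

```shell
#!/bin/sh
# Hypothetical wrapper path (assumption); the kernel calls it with the path
# of the matched binary as its argument.
MIC_WRAPPER=${MIC_WRAPPER:-/usr/local/bin/run-on-mic}

# Print a binfmt_misc rule of the form
#   :name:type:offset:magic:mask:interpreter:flags
# that matches files starting with the ELF magic and carrying
# e_machine = 0x00b5 (EM_K1OM, little-endian) at bytes 18-19.
# Bytes 4-17 are ignored through a zero mask.
k1om_binfmt_rule() {
    zeros='\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'  # 14 bytes
    magic='\x7f\x45\x4c\x46'"$zeros"'\xb5\x00'
    mask='\xff\xff\xff\xff'"$zeros"'\xff\xff'
    printf '%s\n' ":k1om:M:0:${magic}:${mask}:${1}:"
}

# Show the rule; registering it needs root and a mounted binfmt_misc:
#   k1om_binfmt_rule "$MIC_WRAPPER" > /proc/sys/fs/binfmt_misc/register
k1om_binfmt_rule "$MIC_WRAPPER"
```

The rule can be removed again by writing -1 to /proc/sys/fs/binfmt_misc/k1om.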
>
>
>###############
># Openssl 1.0.0
>###############
>./Configure --prefix=/home/support/native-mic/install \
>  --cross-compile-prefix=/usr/linux-k1om-4.7/bin/x86_64-k1om-linux- \
>  enable-camellia enable-seed enable-tlsext enable-rfc3779 enable-cms \
>  enable-md2 no-zlib shared linux-generic64
>make depend
>make
>make install
>
>
>###############
># Munge
>###############
># edit the configure to change the LD-ld detection from "-m elf_x86_64"
># to the empty string ""
>#env CC=/usr/linux-k1om-4.7/bin/x86_64-k1om-linux-gcc \
>#  LD=/usr/linux-k1om-4.7/bin/x86_64-k1om-linux-ld \
>#  CXX=/usr/linux-k1om-4.7/bin/x86_64-k1om-linux-g++ \
>#  ./configure --prefix=/home/support/native-mic/install \
>#  --with-openssl-prefix=/home/support/native-mic/install \
>#  --host=x86_64-k1om-linux
>#env CC=/usr/linux-k1om-4.7/bin/x86_64-k1om-linux-gcc \
>#  LD=/usr/linux-k1om-4.7/bin/x86_64-k1om-linux-ld \
>#  CXX=/usr/linux-k1om-4.7/bin/x86_64-k1om-linux-g++ \
>#  NM=/usr/linux-k1om-4.7/bin/x86_64-k1om-linux-nm \
>#  ./configure --prefix=/home/support/native-mic/install \
>#  --with-openssl-prefix=/home/support/native-mic/install \
>#  --host=x86_64-k1om-linux --with-gnu-ld
>env CC=/usr/linux-k1om-4.7/bin/x86_64-k1om-linux-gcc \
>  LD=/usr/linux-k1om-4.7/bin/x86_64-k1om-linux-ld \
>  CXX=/usr/linux-k1om-4.7/bin/x86_64-k1om-linux-g++ \
>  NM=/usr/linux-k1om-4.7/bin/x86_64-k1om-linux-nm \
>  ./configure --prefix=/home/support/native-mic/install \
>  --localstatedir=/var \
>  --with-openssl-prefix=/home/support/native-mic/install \
>  --host=x86_64-k1om-linux --with-gnu-ld
>make
>make install
>make
>make install
>
>
>###############
># Slurm
>###############
># disable the printf NULL check in configure.ac and run 'autoconf' to
># create a new 'configure'
># then edit the configure to change the LD-ld detection from
># "-m elf_x86_64" to the empty string ""
># then the ./configure line below will fail because it tells us it can't
># run a test program while cross-compiling -- edit the configure script
># and force it to run the openssl test, then the ./configure should finish
>env CC=/usr/linux-k1om-4.7/bin/x86_64-k1om-linux-gcc \
>  LD=/usr/linux-k1om-4.7/bin/x86_64-k1om-linux-ld \
>  CXX=/usr/linux-k1om-4.7/bin/x86_64-k1om-linux-g++ \
>  ./configure --prefix=/home/support/native-mic/install \
>  --sysconfdir=/home/support/native-mic/install/etc/slurm \
>  --disable-salloc-background --without-readline --enable-pam \
>  --with-ssl=/home/support/native-mic/install \
>  --with-munge=/home/support/native-mic/install --host=x86_64-k1om-linux
>make
>make install
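The three builds above repeat the same CC/LD/CXX/NM settings on every configure line. One way to cut down on copy-and-paste errors is a small helper that exports them once. This is only a sketch: the function name k1om_env and the K1OM_PREFIX variable are made up here, and the default prefix is simply the toolchain path from the notes, which may differ on other MPSS installs.

```shell
#!/bin/sh
# Cross-toolchain prefix as used in the notes above (override if yours differs).
K1OM_PREFIX=${K1OM_PREFIX:-/usr/linux-k1om-4.7/bin/x86_64-k1om-linux-}

# Export the tool variables that the configure lines pass through env.
k1om_env() {
    export CC="${K1OM_PREFIX}gcc"
    export CXX="${K1OM_PREFIX}g++"
    export LD="${K1OM_PREFIX}ld"
    export NM="${K1OM_PREFIX}nm"
}

k1om_env
echo "CC=$CC"
```

After calling k1om_env in the build shell, the munge and slurm configure lines can drop the leading env assignments and keep only their own options, including --host=x86_64-k1om-linux.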
>
>
>
>
>On Mon, Jun 09, 2014 at 05:31:50AM -0700, Pardo Diaz, Alfonso wrote:
>
>> 
>> Hello!!
>> 
>> Would you mind posting the steps to cross-compile slurm for Xeon Phi
>> here, or a URL? The linkedin discussion doesn't describe the procedure.
>> 
>> 
>> Thanks in advance
>> 
>> 
>> On 05/03/2014, at 11:57, Paddy Doyle <[email protected]> wrote:
>> 
>> > 
>> > Hi all,
>> > 
>> > Just wondering if there have been any developments regarding Phi cards?
>> > 
>> > We have just installed a small 10-node cluster with two MICs per node,
>> > and are wondering how best to use the cards.
>> > 
>> > I intend to try the cross-compilation to install slurm on the cards, and
>> > then have a separate queue, similar to that described in the discussion
>> > on linkedin below.
>> > 
>> > But I think the users would also like to be able to run offload as well,
>> > so that's a bit of an issue.
>> > 
>> > I have an additional question: the users have asked if a portion of the
>> > host memory can be ``reserved'' for comms with the MICs, with the rest
>> > available for whoever runs on the host. Is that possible??
>> > 
>> > Thanks,
>> > Paddy
>> > 
>> > 
>> > On Tue, Feb 18, 2014 at 01:39:44AM -0800, Olli-Pekka Lehto wrote:
>> > 
>> >> 
>> >> Getting the daemon running on the Phi is certainly possible and we tried
>> >> this a year or so ago. The real challenge lies in being able to run
>> >> offload, host-native, symmetric and Phi-native mode programs nicely on
>> >> the same set of nodes. It is something that would really be needed in
>> >> order to maximize utilization of a Phi-cluster.
>> >> 
>> >> Furthermore, having a simple way to maintain affinity of Phi cards with
>> >> their associated hosts when doing symmetric runs would be very useful.
>> >> It's sort of possible with the topology plugin but a bit clunky.
>> >> 
>> >> Also it would be nice to have a more lightweight daemon in order to
>> >> conserve the precious resources of the Phi as was presented in the Slurm
>> >> User Group presentation. I expect that this would be a more significant
>> >> undertaking, however(?)
>> >> 
>> >> I'm wondering if people have been working on these kinds of things?
>> >> 
>> >> Olli-Pekka
>> >> 
>> >> On Feb 18, 2014, at 1:05 PM, Ralph Castain <[email protected]> wrote:
>> >> 
>> >>> 
>> >>> I know others have direct-launched processes onto the Phi before, both
>> >>> with Slurm and just using rsh/ssh. The OpenMPI user mailing list archive
>> >>> talks about the ssh method (search for "phi" and you'll see the chatter)
>> >>> 
>> >>> http://www.open-mpi.org/community/lists/users/
>> >>> 
>> >>> and the folks at Bright talk about how they did it with Slurm here:
>> >>> 
>> >>> 
>> >>> https://www.linkedin.com/groups/Yes-we-run-SLURM-inside-4501392.S.5792769036550955008
>> >>> 
>> >>> Ralph
>> >>> 
>> >>> On Feb 17, 2014, at 5:46 PM, Christopher Samuel <[email protected]> wrote:
>> >>> 
>> >>>> 
>> >>>> -----BEGIN PGP SIGNED MESSAGE-----
>> >>>> Hash: SHA1
>> >>>> 
>> >>>> Hi all,
>> >>>> 
>> >>>> At the Slurm User Group in Oakland last year it was mentioned that
>> >>>> there was intended to be support for a lightweight Slurm daemon on
>> >>>> Xeon Phi (MIC) cards.
>> >>>> 
>> >>>> I had a quick look in the git master last night but couldn't spot
>> >>>> anything related, is this still the intention?
>> >>>> 
>> >>>> Olli-Pekka Lehto from CSC is running a Xeon Phi workshop at VLSCI at
>> >>>> the moment and it's of interest to a number of us.
>> >>>> 
>> >>>> We're going to run a hack day on Wednesday and we'll see if we can
>> >>>> build an LDAP enabled Xeon Phi stack, if we can then we'll see if
>> >>>> we can get standard Slurm going too. Nothing like having lofty goals!
>> >>>> 
>> >>>> All the best!
>> >>>> Chris
>> >>>> - -- 
>> >>>> Christopher Samuel        Senior Systems Administrator
>> >>>> VLSCI - Victorian Life Sciences Computation Initiative
>> >>>> Email: [email protected] Phone: +61 (0)3 903 55545
>> >>>> http://www.vlsci.org.au/      http://twitter.com/vlsci
>> >>>> 
>> >>>> -----BEGIN PGP SIGNATURE-----
>> >>>> Version: GnuPG v1.4.14 (GNU/Linux)
>> >>>> Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/
>> >>>> 
>> >>>> iEYEARECAAYFAlMCuvIACgkQO2KABBYQAh+pwgCcCLPvoUJamArfmpxY5igcJm3I
>> >>>> 0p0AnjF51qUgZfoZtIsKTDLCK+pJe+bf
>> >>>> =7HO3
>> >>>> -----END PGP SIGNATURE-----
>> >> 
>> > 
>> > -- 
>> > Paddy Doyle
>> > Trinity Centre for High Performance Computing,
>> > Lloyd Building, Trinity College Dublin, Dublin 2, Ireland.
>> > Phone: +353-1-896-3725
>> > http://www.tchpc.tcd.ie/
>> 
>
>-- 
>Paddy Doyle
>Trinity Centre for High Performance Computing,
>Lloyd Building, Trinity College Dublin, Dublin 2, Ireland.
>Phone: +353-1-896-3725
>http://www.tchpc.tcd.ie/
