-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi Ralph,  very quick reply as I've got an SGI engineer waiting for
me.. ;-)

On 03/05/13 12:21, Ralph Castain wrote:

> So the first problem is: how to know the Phi's are present, how
> many you have on each node, etc? We could push that into something
> like the hostfile, but that requires that someone build the file.
> Still, it would only have to be built once, so maybe that's not too
> bad - could have a "wildcard" entry if every node is the same,
> etc.

We're using Slurm, which apparently supports them already, so I'm not
sure if that helps?

> Next, we have to launch processes across the PCI bus. We had to do
> an "rsh" launch of the MPI procs onto RR's cell processors as they
> appeared to be separate "hosts", though only visible on the local
> node (i.e., there was a stripped-down OS running on the cell) -
> Paul's cmd line implies this may also be the case here. If the same
> method works here, then we have most of that code still available
> (needs some updating). We would probably want to look at whether or
> not binding could be supported on the Phi local OS.

I believe that is the case - my understanding is that you can log in
to them via SSH.  We've not got that far with ours yet.

> Finally, we have to wire everything up. This is where RR got a
> little tricky, and we may encounter the same thing here. On RR, the
> cells didn't have direct access to the interconnects - any
> messaging had to be relayed by a process running on the main cpu.
> So we had to create the ability to "route" MPI messages from
> processes running on the cells to processes residing on other
> nodes.

Gotcha.

> Solving the first two is relatively straightforward. In my mind,
> the primary issue is the last one - does anyone know if a process
> on the Phi's can "see" interconnects like a TCP NIC or an
> Infiniband adaptor?

I'm not sure, but I can tell you that the Intel RPMs include an OFED
install that looks like it's used on the Phi (if my reading is correct).

cheers,
Chris
- -- 
 Christopher Samuel        Senior Systems Administrator
 VLSCI - Victorian Life Sciences Computation Initiative
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
 http://www.vlsci.org.au/      http://twitter.com/vlsci

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iEYEARECAAYFAlGDOoAACgkQO2KABBYQAh/ZrQCgjwf5PDZWF7LYYcujxfLgiYP4
lLYAn1tMt4AQ0/Jz0o+gJMvudfEGjf99
=vQ5j
-----END PGP SIGNATURE-----