-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi Ralph,

very quick reply as I've got an SGI engineer waiting for me.. ;-)

On 03/05/13 12:21, Ralph Castain wrote:

> So the first problem is: how to know the Phi's are present, how
> many you have on each node, etc? We could push that into something
> like the hostfile, but that requires that someone build the file.
> Still, it would only have to be built once, so maybe that's not too
> bad - could have a "wildcard" entry if every node is the same,
> etc.

We're using Slurm, which apparently supports them already, so I'm
not sure if that helps?

> Next, we have to launch processes across the PCI bus. We had to do
> an "rsh" launch of the MPI procs onto RR's cell processors as they
> appeared to be separate "hosts", though only visible on the local
> node (i.e., there was a stripped-down OS running on the cell) -
> Paul's cmd line implies this may also be the case here. If the same
> method works here, then we have most of that code still available
> (needs some updating). We would probably want to look at whether or
> not binding could be supported on the Phi local OS.

I believe that is the case - my understanding is that you can log in
to them via SSH. We've not got that far with ours yet..

> Finally, we have to wire everything up. This is where RR got a
> little tricky, and we may encounter the same thing here. On RR, the
> cell's didn't have direct access to the interconnects - any
> messaging had to be relayed by a process running on the main cpu.
> So we had to create the ability to "route" MPI messages from
> processes running on the cells to processes residing on other
> nodes.

Gotcha.

> Solving the first two is relatively straightforward. In my mind,
> the primary issue is the last one - does anyone know if a process
> on the Phi's can "see" interconnects like a TCP NIC or an
> Infiniband adaptor?

I'm not sure, but I can tell you that the Intel RPMs include an OFED
install that looks like it's used on the Phi (if my reading is
correct).

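One rough way to check what a process on the card can "see" would be
to log in over SSH and list the devices the kernel exposes. The sketch
below is only an illustration, not something tested on a Phi: it
assumes the card's OS is a Linux that exposes the usual sysfs class
directories (/sys/class/net and /sys/class/infiniband) and that Python
is available there. The function names are made up for this example.

```python
import os


def list_sysfs_devices(class_dir):
    """Return the sorted entry names in a sysfs class directory, or [] if it is absent."""
    try:
        return sorted(os.listdir(class_dir))
    except OSError:
        # Directory missing or unreadable: treat as "no devices visible".
        return []


def visible_interconnects(sys_root="/sys/class"):
    """Collect network interfaces and InfiniBand devices visible to this process.

    sys_root is a parameter only so the function can be pointed at a
    different tree; on a normal Linux system the default is what you want.
    """
    return {
        "net": list_sysfs_devices(os.path.join(sys_root, "net")),
        "infiniband": list_sysfs_devices(os.path.join(sys_root, "infiniband")),
    }


if __name__ == "__main__":
    for kind, devices in visible_interconnects().items():
        print(kind, devices if devices else "(none visible)")
```

An empty "infiniband" list would suggest the card has no direct view of
an adaptor, in which case the relay approach described for RR might
still be needed; a non-empty list only shows that a device is exposed,
not that MPI traffic can actually use it.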
cheers,
Chris
- --
 Christopher Samuel        Senior Systems Administrator
 VLSCI - Victorian Life Sciences Computation Initiative
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
 http://www.vlsci.org.au/      http://twitter.com/vlsci

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iEYEARECAAYFAlGDOoAACgkQO2KABBYQAh/ZrQCgjwf5PDZWF7LYYcujxfLgiYP4
lLYAn1tMt4AQ0/Jz0o+gJMvudfEGjf99
=vQ5j
-----END PGP SIGNATURE-----