Hmmm...well, a few points here. First, the Phis sadly don't show up in the 
hwloc tree, as they are apparently hidden behind the PCIe bridge. I don't know 
if there is a way for hwloc to "probe" and find processors on PCI cards, but 
that's something I'll have to defer to Jeff and Brice.
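
For reference, hwloc 1.x can be asked to include PCI devices when it builds 
the topology. Something like the following (untested) sketch would at least 
tell us whether the card is enumerated as a PCI device - whether the cores 
behind the bridge ever become visible is the part I have to defer on:

    #include <stdio.h>
    #include <hwloc.h>

    int main(void)
    {
        hwloc_topology_t topo;
        hwloc_obj_t obj = NULL;

        hwloc_topology_init(&topo);
        /* ask hwloc to also discover I/O (PCI) devices */
        hwloc_topology_set_flags(topo, HWLOC_TOPOLOGY_FLAG_IO_DEVICES);
        hwloc_topology_load(topo);

        /* walk the PCI devices; a MIC card would show up here (if at all)
         * as an Intel (vendor id 0x8086) PCI device */
        while ((obj = hwloc_get_next_pcidev(topo, obj)) != NULL) {
            printf("PCI %04x:%04x  %s\n",
                   obj->attr->pcidev.vendor_id,
                   obj->attr->pcidev.device_id,
                   obj->name ? obj->name : "(unnamed)");
        }

        hwloc_topology_destroy(topo);
        return 0;
    }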

So the first problem is: how do we know the Phis are present, how many there 
are on each node, etc.? We could push that into something like the hostfile, 
but that requires that someone build the file. Still, it would only have to be 
built once, so maybe that's not too bad - we could have a "wildcard" entry if 
every node is the same, etc.
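
Just to make that concrete, a hostfile along these lines is roughly what I 
have in mind (hostnames and slot counts are purely hypothetical, and the 
syntax for flagging an entry as a MIC would still have to be defined):

    node01          slots=16
    node01-mic0     slots=60
    node01-mic1     slots=60
    node02          slots=16
    node02-mic0     slots=60
    node02-mic1     slots=60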

Next, we have to launch processes across the PCIe bus. We had to do an "rsh" 
launch of the MPI procs onto RR's Cell processors, as they appeared to be 
separate "hosts" that were only visible from the local node (i.e., there was a 
stripped-down OS running on the Cell) - Paul's command line implies this may 
also be the case here. If the same method works here, then we still have most 
of that code available (it needs some updating). We would probably also want 
to look at whether or not process binding can be supported on the Phi's local 
OS.
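
If the Phi really does behave like a separate ssh-reachable host (as Paul's 
Intel MPI example suggests), then I'd expect the user-visible form to end up 
being something like the following - hostnames hypothetical again, and the 
MIC-side executable would have to be built for the Phi:

    mpirun --mca plm rsh -n 2 -host node01      ./host.exe : \
                         -n 4 -host node01-mic0 ./mic.exe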

Finally, we have to wire everything up. This is where RR got a little tricky, 
and we may encounter the same thing here. On RR, the Cells didn't have direct 
access to the interconnects - any messaging had to be relayed by a process 
running on the main CPU. So we had to create the ability to "route" MPI 
messages from processes running on the Cells to processes residing on other 
nodes.
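
For anyone who didn't see the RR work: the relay itself is conceptually 
simple - the hard part was the routing logic on top of it. Purely as an 
illustration (this is not the old RR code), the host-side process basically 
has to shuttle bytes between a connection facing the card and a connection 
facing the real interconnect, along these lines:

    /* illustrative byte relay between two already-connected sockets */
    #include <sys/select.h>
    #include <unistd.h>

    void relay(int mic_fd, int fabric_fd)
    {
        char buf[65536];
        for (;;) {
            fd_set rd;
            FD_ZERO(&rd);
            FD_SET(mic_fd, &rd);
            FD_SET(fabric_fd, &rd);
            int maxfd = (mic_fd > fabric_fd ? mic_fd : fabric_fd) + 1;
            if (select(maxfd, &rd, NULL, NULL, NULL) < 0)
                return;
            /* forward whichever side has data to the other side */
            int from = FD_ISSET(mic_fd, &rd) ? mic_fd : fabric_fd;
            int to   = (from == mic_fd) ? fabric_fd : mic_fd;
            ssize_t n = read(from, buf, sizeof(buf));
            if (n <= 0)
                return;        /* peer closed or error */
            if (write(to, buf, (size_t)n) != n)
                return;        /* treat a short write as fatal here */
        }
    }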

Solving the first two is relatively straightforward. In my mind, the primary 
issue is the last one - does anyone know if a process on the Phi can "see" 
interconnects like a TCP NIC or an InfiniBand adapter?
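
(If anyone has access to such a box, a quick way to answer that - assuming 
the card runs an ssh-reachable Linux as the Intel tooling implies, and that 
the OFED bits are installed on it - would be something like:

    ssh mic0 ip addr show      # which network interfaces does the card's OS see?
    ssh mic0 ibv_devinfo       # does libibverbs report any devices?

If nothing beyond a host-card virtual interface shows up, then we're back to 
relaying.)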


On May 2, 2013, at 6:36 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:

> Jeff,
> 
> I know Intel MPI (MPICH based) "just works" with Phi, but you need to do 
> things like:
>    mpirun -n 2 -host cpu host.exe : -n 4 -host mic0 mic.exe
> if you want to use the Phi for more than just kernel-offload (in which case 
> they won't have/need an MPI rank).
> So, launching procs is PART of the problem, but certainly not all of it.
> 
> At least, unlike RR, the processing elements all share the same endianness!
> 
> -Paul
> 
> 
> On Thu, May 2, 2013 at 6:28 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> 
> wrote:
> I know the MPICH guys did a bunch of work to support the Phis.  I don't know 
> exactly what that means (I haven't read their docs about this stuff), but I 
> suspect that it's more than just launching MPI processes on them...
> 
> 
> On May 2, 2013, at 8:54 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
> 
> > Ralph,
> >
> > I am not an expert, by any means, but based on a presentation I heard 4 
> > hours ago:
> >
> > The Xeon and Phi instruction sets have a large intersection, but neither is 
> > a subset of the other.
> > In particular, Phi has its own SIMD instructions *instead* of Xeon's MMX, 
> > SSEn, etc.
> > There is also no CMPXCHG16B instruction on Phi, among others.
> > So, there will need to be different binaries, or "fat" binaries that branch 
> > based on CPU type.
> >
> > -Paul
> >
> >
> > On Thu, May 2, 2013 at 5:47 PM, Ralph Castain <r...@open-mpi.org> wrote:
> >
> > On May 2, 2013, at 5:12 PM, Christopher Samuel <sam...@unimelb.edu.au> 
> > wrote:
> >
> > > Hi folks,
> > >
> > > The new system we're bringing up has 10 nodes with dual Xeon Phi MIC
> > > cards; are there any plans to support them by launching MPI tasks
> > > directly on the Phis themselves (rather than just as offload devices
> > > for code on the hosts)?
> >
> > We had something similar at one time - I developed it for the Roadrunner 
> > cluster so you could run MPI tasks on the GPUs. Worked well, but eventually 
> > fell into disrepair due to lack of use.
> >
> > In this case, I suspect it will be much easier to do as the Phis appear to 
> > be a lot more visible to the host than the GPU did on RR. Looking at the 
> > documentation, the Phis just sit directly on the PCIe bus, so they should 
> > look just like any other processor, and they are Xeon binary compatible - 
> > so there is no issue with tracking which binary to run on which processor.
> >
> > Brice: do the Phis appear in the hwloc topology object?
> >
> > Chris: can you run lstopo on one of the nodes and send me the output 
> > (off-list)?
> >
> >
> > >
> > > All the best,
> > > Chris
> > > - --
> > > Christopher Samuel        Senior Systems Administrator
> > > VLSCI - Victorian Life Sciences Computation Initiative
> > > Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
> > > http://www.vlsci.org.au/      http://twitter.com/vlsci
> > >
> >
> >
> > --
> > Paul H. Hargrove                          phhargr...@lbl.gov
> > Future Technologies Group
> > Computer and Data Sciences Department     Tel: +1-510-495-2352
> > Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
> 
> 
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to: 
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 
> 
> 
> 
> 
> -- 
> Paul H. Hargrove                          phhargr...@lbl.gov
> Future Technologies Group
> Computer and Data Sciences Department     Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900