Great ... thanks. We will try it out as soon as the common IB backbone is in place.
cheers
Michael

On Fri, Jul 5, 2013 at 6:10 PM, Ralph Castain <r...@open-mpi.org> wrote:

> As long as the IB interfaces can communicate with each other, you should be fine.
>
> On Jul 5, 2013, at 3:26 PM, Michael Thomadakis <drmichaelt7...@gmail.com> wrote:
>
> Sorry about the mvapich2 reference :)
>
> All nodes are attached to a common 1GigE network. We would of course like that, whenever a node pair is *also* connected by a higher-speed fabric (IB FDR or 10GigE), that fabric is used instead of the common 1GigE.
>
> One question: suppose we use nodes with either FDR or QDR IB interfaces, all connected to one common IB fabric and defined on a common IP subnet. Will OpenMPI have any problem with this? Can MPI communication take place over this type of hybrid IB fabric? We already have a sub-cluster with QDR HCAs, and we are attaching it to an IB fabric with an FDR "backbone" alongside another cluster with FDR HCAs.
>
> Do you think there may be some issue with this? The HCAs are FDR and QDR Mellanox devices, and the switching is also over an FDR Mellanox fabric. Mellanox claims that at the IB level this is doable (i.e., FDR link pairs talk to each other at FDR speed and QDR link pairs at QDR).
>
> I guess if we use the RC connection types then it does not matter to OpenMPI.
>
> thanks ....
> Michael
>
> On Fri, Jul 5, 2013 at 4:59 PM, Ralph Castain <r...@open-mpi.org> wrote:
>
>> I can't speak for MVAPICH - you probably need to ask them about this scenario. OMPI will automatically select whatever available transport can reach the intended process. This requires that each communicating pair of processes have access to at least one common transport.
>>
>> So if a process on a node with only 1G-E wants to communicate with another process, the node where that other process is running must also have access to a compatible Ethernet interface (1G can talk to 10G, so they can have different capabilities) on that subnet (or on a subnet that knows how to route to the other one). If both nodes have 10G-E as well as 1G-E interfaces, then OMPI will automatically take the 10G interface, as it is the faster of the two.
>>
>> Note this means that if a process is on a node that only has IB and wants to communicate with a process on a node that only has 1G-E, then the two processes cannot communicate.
>>
>> HTH
>> Ralph
>>
>> On Jul 5, 2013, at 2:34 PM, Michael Thomadakis <drmichaelt7...@gmail.com> wrote:
>>
>> Hello OpenMPI
>>
>> We are seriously considering deploying OpenMPI 1.6.5 for production (and 1.7.2 for testing) on HPC clusters that consist of nodes with *different types of networking interfaces*.
>>
>> 1) Interface selection
>>
>> We are using OpenMPI 1.6.5 and were wondering how one would go about selecting *at run time* which networking interface to use for MPI communications when IB, 10GigE, and 1GigE are all present.
>>
>> This issue arises in a cluster whose nodes are equipped with different types of interfaces:
>>
>> *Some* have IB (QDR or FDR) plus 10GigE and 1GigE, others have *only* 10GigE and 1GigE, and still others only 1GigE.
>>
>> 2) OpenMPI 1.6.5 level of support for heterogeneous fabrics
>>
>> Can OpenMPI support running an MPI application on a mix of nodes with all of the above networking interface combinations?
>>
>> 2.a) Can the same MPI code (SPMD or MPMD) have a subset of its ranks run on nodes with QDR IB and another subset on FDR IB simultaneously? These are Mellanox QDR and FDR HCAs.
>>
>> Mellanox mentioned to us that they support both QDR and FDR HCAs attached to the same IB subnet. Do you think MVAPICH2 will have any issue with this?
>>
>> 2.b) Can the same MPI code (SPMD or MPMD) have a subset of its ranks run on nodes with IB and another subset over 10GigE simultaneously?
>>
>> That is, imagine nodes I1, I2, ..., IN having, say, QDR HCAs and nodes G1, G2, ..., GM having only 10GigE interfaces. Could we have the same MPI application run across both types of nodes?
>>
>> Or should there be, say, 2 communicators, one explicitly overlaid on an IB-only subnet and the other on a 10GigE-only subnet?
>>
>> Please let me know if the above are not very clear.
>>
>> Thank you much
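For reference, the run-time interface selection asked about in (1) is commonly done with MCA parameters on the mpirun command line in Open MPI 1.6.x. A minimal sketch follows; the btl, btl_tcp_if_include, and btl_openib_if_include parameters are standard Open MPI MCA parameters, but the interface name (eth2), HCA name (mlx4_0), process count, and application name are placeholders invented for illustration, not taken from this thread:

    # use only IB (openib), shared memory, and self; skip TCP entirely
    mpirun --mca btl openib,sm,self -np 64 ./my_app

    # allow TCP as well, but restrict it to the 10GigE interface (placeholder: eth2)
    mpirun --mca btl openib,tcp,sm,self \
           --mca btl_tcp_if_include eth2 \
           -np 64 ./my_app

    # optionally pin the openib BTL to a specific HCA (placeholder: mlx4_0)
    mpirun --mca btl openib,tcp,sm,self \
           --mca btl_openib_if_include mlx4_0 \
           --mca btl_tcp_if_include eth2 \
           -np 64 ./my_app

Running "ompi_info --param btl all" on a given build lists the available BTL components and their parameters, which is a quick way to confirm that both openib and tcp are present before relying on them.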
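For the mixed IB / 10GigE case in (2.b), the same mechanism applies: per Ralph's description above, Open MPI chooses a transport per communicating pair, so with both openib and tcp enabled and all nodes sharing the Ethernet subnet, IB-to-IB pairs can use openib while pairs involving 10GigE-only nodes fall back to tcp, all within one MPI_COMM_WORLD and without separate communicators. A sketch under the same assumptions, with invented hostnames (i01..., g01...) and a placeholder hostfile:

    # hosts.txt (placeholder hostnames and slot counts)
    # i01 slots=16     <- node with QDR/FDR HCA plus 10GigE
    # i02 slots=16
    # g01 slots=16     <- node with 10GigE only
    # g02 slots=16

    mpirun --hostfile hosts.txt \
           --mca btl openib,tcp,sm,self \
           --mca btl_tcp_if_include eth2 \
           ./my_app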