On Mon, 2008-06-16 at 17:32 +0300, Yevgeny Kliteynik wrote:
> Jeff,
>
> Jeff Becker wrote:
> > Hi Al
> >
> > Al Chu wrote:
> >> Hey Jeff,
> >>
> >>> That works. The compute nodes need to talk to other compute nodes for
> >>> MPI over one set of links, and they need to talk to the Lustre nodes
> >>> for I/O, but over a different (disjoint) set of links. Thanks.
> >>
> >> Is there a strong belief that a different/disjoint set of links would be
> >> beneficial? Sometime ago, Sasha and I iterated on a patch in which I
> >> found out sometimes not all switch ports would be used. In this
> >> particular case, a chunk of leaf switches were sometimes using only 11
> >> out of 12 uplinks. After the fix, mpigraph showed about 20% improvement
> >> in MPI bandwidth.
> >>
> > Basically, we want to avoid situations where I/O and MPI contend for the
> > same links, and get in each other's way.
>
> What about using different VLs for MPI and I/O?

Adam Moody ran this idea by me some time ago too, and it was something I
thought of looking into later. (We are analyzing/dealing w/ routing
first :-). I have no idea if different service levels can be configured
in MPI implementations. I asked the Lustre people in my hallway, and it
isn't currently configurable for Lustre. This isn't to say it's not
doable, but it would take some effort.

Al

> It won't buy more bandwidth, but it might prevent MPI and I/O from
> congesting each other - they will share the wire according to the
> priority that you will define.
>
> -- Yevgeny
>
> > -jeff

--
Albert Chu
[EMAIL PROTECTED]
925-422-5311
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory
