On Nov 5, 2013, at 2:59 PM, George Bosilca <bosi...@icl.utk.edu> wrote:

> I have a question regarding the extension of this concept to multi-BTL
> runs. Granted we will have to have a local indexing of BTL (I'm not
> concerned about this). But how do we ensure the naming is globally
> consistent (in the sense that all processes in the job will agree that
> usnic0 is index 0) even when we have a heterogeneous environment?

The MPI_T pvars are local-only. So even if index 0 is usnic_0 in proc A, but index 0 is usnic_3 in proc B, it shouldn't matter. More specifically: these values only have meaning within the process from which they were gathered.

I guess I'm trying to say that there's no need to ensure globally consistent ordering between processes. ...unless I'm missing something?

> As an example some of our clusters have 1 NIC on some nodes, and 2 on
> others. Of course we can say we don't guarantee consistent naming, but
> for tools trying to understand communication issues on distributed
> environments having a global view is a clear plus.

A good point. But even with globally consistent ordering, you don't know that usnic_0 in process A communicates with usnic_0 in process B (indeed, we run some QA cases here at Cisco where we deliberately ensure that usnic_X in process A is on the same subnet as usnic_Y in process B, where X!=Y, and everything still works properly).

> Another question is about the level of details. I wonder if this level
> of details is really needed, or providing the aggregate pvar will be
> enough in most cases. The problem I see here is the lack of
> topological knowledge at the upper level. Seeing a large number of
> messages on a particular BTL might suggest that something is wrong
> inside the implementation, when in fact the BTL is the only one
> connecting a subset of peers. Without us exposing this information,
> I'm afraid the tool might get the wrong picture ...

I think exposing network-level information can only be used to infer indirect information about the upper-layer MPI semantics.
However, exposing these counters was not intended to be used for MPI-application-level semantic information; it was more intended to expose information about what is happening on your underlying network -- something that OS bypass networks don't otherwise provide.

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/