Hello Ralph,

Is there something specific below that is the main reason for this mail? I don't see anything obviously incompatible with hwloc here. This all seems mostly hierarchical to me, nothing harder than a NUMA machine with a strange NUMA interconnect.
A couple of notes:

* NICs don't require a PCI bus in hwloc. We can attach OS devices anywhere, and we can add other types of buses if needed (the bridge type is already configurable).

* Your hierarchy of objects may have to use custom types or generic Groups.

* If there are independent lightweight kernels on small parts of the machine, we'll have to be careful: things like binding may not work across the different subparts.

* Could the controller in the "common support infrastructure" of a host or board appear as individual NICs, one for each subcomponent?

* If there are objects within a single parent that have different distances to that common infrastructure, there's still a way to do it: put the NIC under a Group of nearby sockets, and add a distance matrix explaining all this (see the rough sketch at the end of this mail).

If that doesn't work, we'll do a hwloc 2.0 with very intrusive changes, we'll see :)

Brice

On 08/11/2013 17:42, Ralph Castain wrote:
> Hi folks
>
> We are seeing a new architecture appearing in the very near future, and I'm not sure how hwloc will handle it. Consider the following case:
>
> * I have a rack that contains multiple "hosts".
>
> * Each host consists of a box/shelf with common support infrastructure in it: it has some kind of controller, and might have some networking support, maybe a pool of memory that can be allocated across the occupants.
>
> * In the host, I have one or more "boards". Each board again has a controller with some common infrastructure to support its local sockets: it might include some networking that would look like NICs (though not necessarily on a PCIe interface), a board-level memory pool, etc.
>
> * Each socket contains one or more dies. Each die runs its own instance of an OS, probably a lightweight kernel, that can vary between dies (e.g., it might have a tweaked configuration), and has its own associated memory that will physically reside outside the socket. You can think of each die as constituting a "shared-memory locus", i.e., processes running on that die can share memory between them since they sit under the same OS instance.
>
> * Each die has some number of cores/hwthreads/caches etc.
>
> Note that the sockets are not sitting on some PCIe bus; they appear to be directly connected to the overall network just like a "node" would appear today. However, there is a definite need for higher layers (RMs and MPIs) to understand this overall hierarchy and the "distances" between the individual elements.
>
> Any thoughts on how we can support this?
> Ralph
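PS: for what it's worth, here is a rough, untested sketch of how such an assembly could look with the custom-topology API in current hwloc 1.x (hwloc_topology_set_custom, hwloc_custom_insert_group_object_by_parent, hwloc_custom_insert_topology). The per-die synthetic description, the group depths, the board/socket counts and the Misc object name are all made-up placeholders; real NICs would rather show up as OS devices (discovered, or described in XML), and the distances would be added on top of this.

    #include <stdio.h>
    #include <hwloc.h>

    int main(void)
    {
        hwloc_topology_t die, machine;
        hwloc_obj_t host, board, sock;
        unsigned b, s;

        /* Per-die topology: what a single lightweight-kernel instance would
         * report. The synthetic description (1 NUMA node, 1 socket, 4 cores,
         * 1 PU each) is purely illustrative. */
        hwloc_topology_init(&die);
        hwloc_topology_set_synthetic(die, "node:1 socket:1 core:4 pu:1");
        hwloc_topology_load(die);

        /* Assemble the whole host as a custom topology: a "host" Group
         * containing "board" Groups containing "socket" Groups, with one
         * copy of the per-die topology grafted under each socket Group. */
        hwloc_topology_init(&machine);
        hwloc_topology_set_custom(machine);

        host = hwloc_custom_insert_group_object_by_parent(machine,
                   hwloc_get_root_obj(machine), 0 /* host level */);
        for (b = 0; b < 2; b++) {       /* 2 boards per host, made up */
            board = hwloc_custom_insert_group_object_by_parent(machine, host,
                        1 /* board level */);
            for (s = 0; s < 2; s++) {   /* 2 sockets per board, made up */
                sock = hwloc_custom_insert_group_object_by_parent(machine,
                           board, 2 /* socket level */);
                hwloc_custom_insert_topology(machine, sock, die, NULL);
            }
        }

        hwloc_topology_load(machine);

        /* Board-level NIC-like infrastructure could be marked with Misc
         * objects (or better, described as OS devices in XML) under the
         * relevant Group; here we just tag the root as an example. */
        hwloc_topology_insert_misc_object_by_parent(machine,
            hwloc_get_root_obj(machine), "common-infrastructure");

        printf("%d PUs total\n",
               hwloc_get_nbobjs_by_type(machine, HWLOC_OBJ_PU));

        hwloc_topology_destroy(die);
        hwloc_topology_destroy(machine);
        return 0;
    }

The distance matrix would then be attached on top of that, e.g. with something like hwloc_topology_set_distance_matrix() between the Group objects, but I didn't try to show that here.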