Hello Ralph,

Is there something below that is the main reason for this mail? I don't
see anything obviously incompatible with hwloc here. This all seems mostly
hierarchical to me, nothing harder than a NUMA machine with a strange
NUMA interconnect.

A couple notes:
* NICs don't require a PCI bus in hwloc. We can attach OS devices
anywhere, and we can add other types of buses if needed (the bridge
type is already configurable).
* your hierarchy of objects may have to use custom types or generic Groups
* if there are independent lightweight kernels running on small parts of
the machine, we'll have to be careful and assume that things like binding
don't work across the different subparts
* could the controller in the "common support infrastructure" of the host
or boards be exposed as individual NICs, one for each subcomponent?
* if there are objects within a single parent that have different
distances to that common infrastructure, there's still a way to do it:
put the NIC under a Group of nearby sockets, and add a distance matrix
explaining all this (see the sketch after this list).
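
Here's a rough sketch of that last point against the hwloc 1.x API
(untested; the synthetic description, OS indexes and distance values are
just made-up placeholders): the boards are modeled as Group objects, the
board-level controller is attached as a Misc object under its Group, and
a user-provided distance matrix describes the distances between the
Groups.

#include <hwloc.h>

int main(void)
{
  hwloc_topology_t topology;
  hwloc_obj_t board0;
  /* two "boards" (Groups); matrix rows/columns are ordered by the
   * assumed OS indexes 0 and 1 of the Group objects */
  unsigned board_os_index[2] = { 0, 1 };
  float distances[4] = { 10.f, 20.f,   /* board0 -> {board0, board1} */
                         20.f, 10.f }; /* board1 -> {board0, board1} */

  /* error checking omitted for brevity */
  hwloc_topology_init(&topology);
  /* fake machine: 2 boards, each with 2 sockets of 4 cores of 2 PUs */
  hwloc_topology_set_synthetic(topology, "group:2 socket:2 core:4 pu:2");
  /* user-provided distances between the two Group (board) objects */
  hwloc_topology_set_distance_matrix(topology, HWLOC_OBJ_GROUP,
                                     2, board_os_index, distances);
  hwloc_topology_load(topology);

  /* attach the board-level controller/NIC as a Misc object under board 0 */
  board0 = hwloc_get_obj_by_type(topology, HWLOC_OBJ_GROUP, 0);
  if (board0)
    hwloc_topology_insert_misc_object_by_parent(topology, board0,
                                                "board0-controller");

  hwloc_topology_destroy(topology);
  return 0;
}

On a real machine the hierarchy would of course come from an XML export
or a backend instead of a synthetic string, but the Group + Misc +
distance-matrix combination stays the same.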

If that doesn't work, we'll do a hwloc v2.0 with very intrusive changes,
we'll see :)

Brice



On 08/11/2013 17:42, Ralph Castain wrote:
> Hi folks
>
> We are seeing a new architecture appearing in the very near future, and I'm 
> not sure how hwloc will handle it. Consider the following case:
>
> * I have a rack that contains multiple "hosts"
>
> * each host consists of a box/shelf with common support infrastructure in it 
> - it has some kind of controller in it, and might have some networking 
> support, maybe a pool of memory that can be allocated across the occupants.
>
> * in the host, I have one or more "boards". Each board again has a controller 
> in it with some common infrastructure to support its local sockets - might 
> include some networking that would look like NICs (though not necessarily on 
> a PCIe interface), a board-level memory pool, etc.
>
> * each socket contains one or more die. Each die runs its own instance of an 
> OS - probably a lightweight kernel - that can vary between dies (e.g., might 
> have a tweaked configuration), and has its own associated memory that will 
> physically reside outside the socket. You can think of each die as 
> constituting a "shared memory locus" - i.e., processes running on that die 
> can share memory between them as it would sit under the same OS instance.
>
> * each die has some number of cores/hwthreads/caches etc.
>
> Note that the sockets are not sitting in some PCIe bus - they appear to be 
> directly connected to the overall network just like a "node" would appear 
> today. However, there is a definite need for higher layers (RMs and MPIs) to 
> understand this overall hierarchy and the "distances" between the individual 
> elements.
>
> Any thoughts on how we can support this?
> Ralph
>
