Hi Brice

I think Linux cgroups make the most sense as the mechanism for doing this. 
We don't do it yet, but it is something our customers want to see in the 
platform, so we have to provide it.

The basic use-case is for an application to specify a max memory requirement, 
thus allowing us to subdivide the node when allocating resources. In that case, 
we need to ensure that the application remains within that memory limit so we 
don't start swapping. This is a typical "big data" requirement, and the apps 
know how to handle the situation where they run up against the limit (e.g., 
what to do when malloc returns NULL).

System resource managers don't usually provide this capability, so we will do 
it at the ORTE level. We already use hwloc there for resource discovery and 
process placement, so it seems natural to include the ability to specify 
limits. Since ORTE also does the process launching, it could do the final 
cgroup definition and pass it to Linux.
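
To make the launch-time step concrete, here is a minimal sketch, assuming the 
cgroup v1 memory controller (its memory.limit_in_bytes and tasks files). The 
function name apply_memory_limit, the group name "job42", and the pid are 
hypothetical; none of this is existing ORTE or hwloc code.

```python
import os
import tempfile


def apply_memory_limit(cgroup_root, group, limit_bytes, pid):
    """Create a cgroup v1 memory group, set its hard limit, attach one pid.

    cgroup_root is the directory under which the controllers are mounted
    (commonly /sys/fs/cgroup on a real node, which requires privileges).
    """
    group_dir = os.path.join(cgroup_root, "memory", group)
    # On a real cgroup filesystem, creating the directory creates the group.
    os.makedirs(group_dir, exist_ok=True)
    # memory.limit_in_bytes holds the memory limit for the whole group.
    with open(os.path.join(group_dir, "memory.limit_in_bytes"), "w") as f:
        f.write(str(limit_bytes))
    # Writing a pid to "tasks" moves that process into the group.
    with open(os.path.join(group_dir, "tasks"), "w") as f:
        f.write(str(pid))
    return group_dir


if __name__ == "__main__":
    # Dry run against a scratch directory instead of the real cgroup mount.
    scratch = tempfile.mkdtemp()
    print(apply_memory_limit(scratch, "job42", 2 * 1024**3, 1234))
```

The launcher would do this between fork and exec, so the limit is in place 
before the application allocates anything.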

We envision an API modeled on the cgroup structure. What we would want hwloc 
to do is the final step: we pass in the resource constraints, including bind 
and memory policy specs, and hwloc does the "magic" to tell Linux what needs 
to be done.
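
As a rough illustration of what "modeled after the cgroup structure" could 
mean, here is a sketch in which the caller fills in one constraints record and 
a translation step maps it onto cgroup v1 control files. ResourceConstraints 
and to_cgroup_settings are hypothetical names chosen for this example, not a 
proposed hwloc interface; only the control file names (memory.limit_in_bytes, 
cpuset.cpus, cpuset.mems, cpu.shares) come from cgroup v1.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Sequence


@dataclass
class ResourceConstraints:
    """Hypothetical constraints record passed down by the launcher."""
    memory_limit_bytes: Optional[int] = None  # hard memory limit
    cpus: Sequence[int] = ()                  # logical CPUs to bind to
    mems: Sequence[int] = ()                  # NUMA nodes allowed for memory
    cpu_shares: Optional[int] = None          # relative CPU weight


def to_cgroup_settings(rc: ResourceConstraints) -> Dict[str, str]:
    """Map constraints to 'controller/file' -> value pairs for cgroup v1."""
    settings: Dict[str, str] = {}
    if rc.memory_limit_bytes is not None:
        settings["memory/memory.limit_in_bytes"] = str(rc.memory_limit_bytes)
    if rc.cpus:
        # cpuset files accept a comma-separated list of ids.
        settings["cpuset/cpuset.cpus"] = ",".join(str(c) for c in rc.cpus)
    if rc.mems:
        settings["cpuset/cpuset.mems"] = ",".join(str(m) for m in rc.mems)
    if rc.cpu_shares is not None:
        settings["cpu/cpu.shares"] = str(rc.cpu_shares)
    return settings


if __name__ == "__main__":
    rc = ResourceConstraints(memory_limit_bytes=2 * 1024**3, cpus=[0, 1], mems=[0])
    for path, value in to_cgroup_settings(rc).items():
        print(path, "=", value)
```

Keeping the translation separate from the actual writes would let the same 
record drive a non-cgroup backend on systems where cgroups are unavailable.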

Make sense?
Ralph


On Nov 2, 2012, at 2:18 PM, Brice Goglin <brice.gog...@inria.fr> wrote:

> Hello Ralph,
> 
> I am not very familiar with these features. What system mechanism do you
> currently use for this? Linux cgroups? Any concrete example of what you
> would like to do?
> 
> Brice
> 
> 
> 
> On 02/11/2012 22:12, Ralph Castain wrote:
>> Hi folks
>> 
>> We (Greenplum) have a need to support resource limits (e.g., memory and cpu 
>> usage) on processes running under Open MPI's RTE. OMPI uses hwloc for 
>> processor and memory affinity, so this seems a likely place to add the 
>> required support. Jeff tells me that it doesn't yet exist in hwloc - I'm 
>> wondering if you would welcome and/or be willing to consider contributions 
>> from our engineers towards adding this capability?
>> 
>> Obviously, we'd need to discuss how and where to do the extension. Just 
>> wanted to first see if this is an option, or if we should do it directly in 
>> OMPI.
>> Ralph