Re: [systemd-devel] [HEADSUP] cgroup changes

Brian Bockelman Sat, 29 Jun 2013 10:52:50 -0700

On Jun 25, 2013, at 4:56 AM, Lennart Poettering <lenn...@poettering.net> wrote:


> On Tue, 25.06.13 02:21, Brian Bockelman (bbock...@cse.unl.edu) wrote:
> 
>> A few questions came to mind which may provide interesting input 
>> to your design process:
>> 1) I use cgroups heavily for resource accounting.  Do you envision 
>>  me querying via dbus for each accounting attribute?  Or do you 
>>  envision me querying for the cgroup name, then accessing the 
>> controller statistics directly?
> 
> Good question. Tejun wants systemd to cover that too. I am not entirely
> sure. I don't like the extra roundtrip for measuring the accounting
> bits. But maybe we can add a library that avoids the roundtrip, and
> simply provides you with high-level accounting values for cgroups. That
> way, for *changing* things you'd need to go via the bus, for *reading*
> things we'd give you a library that goes directly to the cgroupfs and
> avoids the roundtrip.

I like this idea.  Hopefully single-writer, multiple-reader is a more 
sustainable path forward.
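
To make the read path concrete, here is a minimal sketch of what reading 
accounting values straight from the cgroupfs could look like, with no bus 
roundtrip.  It is not the library discussed above (which does not exist 
yet); the function names are made up for illustration, and it assumes the 
cgroup v1 memory controller file formats (a single integer in files such 
as memory.usage_in_bytes, "key value" lines in memory.stat).

```python
from pathlib import Path


def read_cgroup_counter(cgroup_dir, attribute):
    """Read a single-integer accounting file, e.g. memory.usage_in_bytes."""
    return int(Path(cgroup_dir, attribute).read_text().strip())


def read_cgroup_stat(cgroup_dir, attribute="memory.stat"):
    """Parse a file made of 'key value' lines into a dict of integers."""
    stats = {}
    for line in Path(cgroup_dir, attribute).read_text().splitlines():
        key, _, value = line.partition(" ")
        if value:  # skip blank or malformed lines
            stats[key] = int(value)
    return stats
```

A reader would call something like 
read_cgroup_counter("/sys/fs/cgroup/memory/<group>", "memory.usage_in_bytes"), 
where the mount point and group name depend on the local setup.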

What about the notification APIs?  We currently use memory.oom_control to 
get a notification when a job hits its memory limit (this lets us know the job 
died due to memory issues, as the user code itself typically just SIGSEGV's).  
Is subscribing to notifications considered reading or writing in this case?
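
For reference, this is roughly how the subscription works today, which is 
why the question is not clear-cut: registering requires a *write* to 
cgroup.event_control, while the notification itself arrives as a *read* on 
an eventfd.  A minimal sketch, assuming the cgroup v1 memory controller; the 
function names are hypothetical and error handling is reduced to the bare 
minimum.

```python
import os
import sys


def register_oom_notification(cgroup_dir, event_fd):
    """Register event_fd for OOM events on a cgroup v1 memory cgroup.

    Writes "<event_fd> <fd of memory.oom_control>" to cgroup.event_control.
    Returns the open memory.oom_control descriptor so the caller decides
    when to close it.
    """
    control_fd = os.open(os.path.join(cgroup_dir, "memory.oom_control"),
                         os.O_RDONLY)
    try:
        with open(os.path.join(cgroup_dir, "cgroup.event_control"), "w") as f:
            f.write(f"{event_fd} {control_fd}")
    except BaseException:
        os.close(control_fd)
        raise
    return control_fd


def wait_for_oom(event_fd):
    """Block until the eventfd fires; return its 8-byte counter value."""
    return int.from_bytes(os.read(event_fd, 8), sys.byteorder)
```

The event descriptor would come from eventfd(2), for example os.eventfd(0) 
on Python 3.10 or later.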

> 
>> 2) I currently fork and setup the resource environment (namespaces, 
>>  environment, working directory, etc).  Can an appropriately privileged 
>>  process create a sub-slice, place itself in it, and then drop privs 
>> / exec?
> 
> We'll probably have a way how you can take an existing set of processes
> and turn them dynamically into a new unit in systemd. These units would
> be mostly like service units, except that systemd wouldn't start the
> processes, but they would be "foreign" created. We are not sure about
> the name for this yet (i.e. whether to cover it under the ".service"
> suffix, but we'll probably call it "Scopes" instead, with the suffix
> ".scope").
> 
> The scope units could then be manipulated at runtime for (cgroup based)
> resource management the way normal services are too.
> 
> So basically, a service unit could be assigned to a slice unit, and
> could then create "scope" units which detach subprocesses from the
> original service unit, and get their own cgroup in the same slice or any
> other.
> 

This sounds manageable.
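
For context, the flow from question 2 as we do it now against the raw 
cgroupfs looks roughly like the sketch below.  This is not the scope API 
described above, which is still being designed; the function names are 
made up for illustration, and the directory is whatever cgroup path the 
batch system picks.

```python
import os


def create_and_join_cgroup(cgroup_dir, pid=None):
    """Create the cgroup directory if needed and move a process into it."""
    os.makedirs(cgroup_dir, exist_ok=True)  # mkdir in cgroupfs creates the group
    pid = os.getpid() if pid is None else pid
    with open(os.path.join(cgroup_dir, "cgroup.procs"), "w") as f:
        f.write(str(pid))


def drop_privileges_and_exec(uid, gid, argv):
    """Drop to an unprivileged user, then replace this process with argv."""
    os.setgroups([])  # clear supplementary groups while still privileged
    os.setgid(gid)    # group first: it can no longer be changed after setuid
    os.setuid(uid)
    os.execvp(argv[0], argv)  # does not return on success
```

The forked child calls create_and_join_cgroup() while still privileged, 
then drop_privileges_and_exec() with the job's uid, gid and command line.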

> 
>> 5) Will I be able to delegate management of a subslice to a
>> non-privileged user?
> 
> Unlikely, at least for the beginning. 
> 

(Very) long-term, this is attractive for us.  We prefer to run the batch 
system unprivileged when possible (and to sacrifice as little functionality 
as possible to do so!).

>> I'm excited to see new ideas (again, having system tools be aware of 
>> the batch system activity is intriguing [2]), but am a bit worried about
>> losing functionality and the cost of porting things to the new era!
> 
> There's certainly going to be some lost flexibility. But of course we'll
> try to cover all interesting usecases.

I'll try to lurk and provide guidance about how we nutty batch system folks may 
try to use it.

> 
>> [2] Hopefully something that works better than 
>> "ps xawf -eo pid,user,cgroup,args" which currently segfaults for me :(
> 
> Hmm, could you file a bug, please?
> 

Couldn't figure out a patch -- too little time.  However, I at least tracked 
down the offending code.  Bug report is here:

https://bugzilla.redhat.com/show_bug.cgi?id=977854

Thanks,

Brian


_______________________________________________
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel
