On Jun 25, 2013, at 4:56 AM, Lennart Poettering lenn...@poettering.net wrote:
On Tue, 25.06.13 02:21, Brian Bockelman (bbock...@cse.unl.edu) wrote:
A few questions came to mind which may provide interesting input
to your design process:
1) I use cgroups heavily for resource accounting. Do you envision
me querying via dbus for each accounting attribute? Or do you
envision me querying for the cgroup name, then accessing the
controller statistics directly?
Good question. Tejun wants systemd to cover that too. I am not entirely
sure. I don't like the extra roundtrip for measuring the accounting
bits. But maybe we can add a library that avoids the roundtrip, and
simply provides you with high-level accounting values for cgroups. That
way, for *changing* things you'd need to go via the bus, for *reading*
things we'd give you a library that goes directly to the cgroupfs and
avoids the roundtrip.
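
To illustrate the read-side idea above, here is a minimal sketch of what such a direct-to-cgroupfs reader might look like. It is not an existing systemd library. The function names, the CGROUP_ROOT default, and the assumption of a cgroup v1 layout with one mount per controller are all hypothetical choices made for illustration.

```python
from pathlib import Path

# Assumption: cgroup v1 hierarchy with one directory per controller.
CGROUP_ROOT = Path("/sys/fs/cgroup")


def read_counter(controller, cgroup, attribute, root=CGROUP_ROOT):
    """Read a single-integer accounting file, e.g. memory.usage_in_bytes."""
    path = Path(root) / controller / cgroup.lstrip("/") / attribute
    return int(path.read_text().strip())


def read_keyed(controller, cgroup, attribute, root=CGROUP_ROOT):
    """Read a 'key value' per-line file, e.g. memory.stat, into a dict."""
    path = Path(root) / controller / cgroup.lstrip("/") / attribute
    stats = {}
    for line in path.read_text().splitlines():
        key, _, value = line.partition(" ")
        if value:
            stats[key] = int(value)
    return stats
```

A caller would first obtain the cgroup name (for example via the bus) and then call something like read_counter("memory", name, "memory.usage_in_bytes") as often as needed, with no further roundtrip.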
I like this idea. Hopefully single-writer, multiple-reader is a more sustainable
path forward.
What about the notification APIs? We currently use memory.oom_control to
get a notification when a job hits its limits (this lets us know the job died
due to memory issues, as the user code itself typically just SIGSEGV's). Is
subscribing to notifications considered reading or writing in this case?
2) I currently fork and set up the resource environment (namespaces,
environment, working directory, etc). Can an appropriately privileged
process create a sub-slice, place itself in it, and then drop privs
/ exec?
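
As a point of reference, the direct-cgroupfs version of the workflow in question 2 might look roughly like the sketch below. This is an illustration of the pre-systemd approach, not a proposed systemd interface. The function names are hypothetical, and it assumes the target cgroup directory already exists and exposes a cgroup.procs file.

```python
import os


def attach_self(cgroup_dir):
    """Move the calling process into cgroup_dir via its cgroup.procs file."""
    with open(os.path.join(cgroup_dir, "cgroup.procs"), "w") as f:
        f.write(str(os.getpid()))


def drop_privileges_and_exec(uid, gid, argv):
    """Drop to an unprivileged uid/gid, then replace this process with argv.

    Order matters: supplementary groups and gid must be changed while the
    process is still privileged, before setuid() gives that up.
    """
    os.setgroups([])
    os.setgid(gid)
    os.setuid(uid)
    os.execvp(argv[0], argv)
```

In a forked child, the privileged parent's code would call attach_self() first and drop_privileges_and_exec() last, so the job's processes start out inside the cgroup.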
We'll probably have a way for you to take an existing set of processes
and turn them dynamically into a new unit in systemd. These units would
be mostly like service units, except that systemd wouldn't start the
processes; they would be created externally. We are not sure about
the name for this yet (i.e. whether to cover it under the .service
suffix), but we'll probably call it Scopes instead, with the suffix
.scope.
The scope units could then be manipulated at runtime for (cgroup based)
resource management the way normal services are too.
So basically, a service unit could be assigned to a slice unit, and
could then create scope units which detach subprocesses from the
original service unit, and get their own cgroup in the same slice or any
other.
This sounds manageable.
5) Will I be able to delegate management of a subslice to a
non-privileged user?
Unlikely, at least for the beginning.
(Very) long-term, this is attractive for us. We prefer the batch system to run
unprivileged when possible (and to sacrifice the minimal amount of
functionality to do so!).
I'm excited to see new ideas (again, having system tools be aware of
the batch system activity is intriguing [2]), but am a bit worried about
losing functionality and the cost of porting things to the new era!
There's certainly going to be some lost flexibility. But of course we'll
try to cover all interesting use cases.
I'll try to lurk and provide guidance about how us nutty batch system folks may
try to use it.
[2] Hopefully something that works better than
ps xawf -eo pid,user,cgroup,args which currently segfaults for me :(
Hmm, could you file a bug, please?
Couldn't figure out a patch -- too little time. However, I at least tracked
down the offending code. Bug report is here:
https://bugzilla.redhat.com/show_bug.cgi?id=977854
Thanks,
Brian
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel