On Wed, Nov 20, 2013 at 02:23:59PM -0500, Stéphane Graber wrote:
> This morning at vUDS we discussed adding support for cgroups in Upstart.
>
> Before I go into details about the proposed stanza and overall
> behaviour, I'd begin by saying that contrary to some other init systems,
> our intent is solely related to resource controls which is the main goal
> of cgroups. Process grouping and tracking will remain unaffected by the
> addition of cgroup support.
>
> Cgroup support will be implemented by adding a new "cgroup" stanza which
> will control the application of cgroup based restrictions to the job.
> The limits will be applied to any of the scripts
> (pre-start/post-start/job/pre-stop/post-stop) similar to what's done
> with setuid/setgid/apparmor stanzas.
>
> Now my recommended format for the stanza, which I believe should be
> flexible enough is:
> cgroup <controller> <cgroup name|auto> [<key> <value>]
>
>
> Detail on the fields:
> == controller ==
> Name for one of the cgroup controllers
>
> Currently the valid values are (but won't be hardcoded into upstart):
> - blkio
> - cpu
> - cpuacct
> - cpuset
> - devices
> - freezer
> - hugetlb
> - memory
> - perf_event
>
> == cgroup-name|$auto ==
> Name of the cgroup to use (and create if non-existing)
>
> The name may contain a / (e.g. "db/pgsql" or "db/$auto") indicating that
> it's requesting a sub-cgroup.
>
> "$auto" is the recommended name and will have upstart generate a name
> based on the job instance name.
>
> The main use of that field is for cases where a set of jobs should share
> limits, in such a case the main job should declare the various values and
> the others just refer to the cgroup by name without defining values.
>
> The name may be different for the various controllers but may not differ
> within the same controller.
> Example:
> valid => cgroup memory group1 limit_in_bytes 52428800
>          cgroup cpuset group2 cpus 0-1
>
> invalid => cgroup memory group1 limit_in_bytes 52428800
>            cgroup memory group1 soft_limit_in_bytes 1024
The invalid entry above is actually valid... What I meant was:
invalid => cgroup memory group1 limit_in_bytes 52428800
           cgroup memory group2 soft_limit_in_bytes 1024
Thanks to Serge Hallyn for noticing!
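
To make the naming rule concrete, here is a minimal Python sketch of the
check. It is an illustration only, not Upstart code: the helper name
check_names and the (controller, name) tuple layout are made up for this
example.

```python
def check_names(stanzas):
    """Return the controllers that are given more than one cgroup name.

    `stanzas` is a list of (controller, cgroup_name) pairs, one per
    cgroup stanza of a single job. This layout is hypothetical.
    """
    seen = {}          # controller -> first cgroup name seen for it
    conflicts = set()  # controllers used with two or more distinct names
    for controller, name in stanzas:
        if seen.setdefault(controller, name) != name:
            conflicts.add(controller)
    return sorted(conflicts)


# Different names for different controllers: accepted.
valid = [("memory", "group1"), ("cpuset", "group2")]
# Two different names for the memory controller: rejected.
invalid = [("memory", "group1"), ("memory", "group2")]
# The same name twice for one controller: accepted.
also_valid = [("memory", "group1"), ("memory", "group1")]

print(check_names(valid))       # []
print(check_names(invalid))     # ['memory']
print(check_names(also_valid))  # []
```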
>
> == key ==
> The cgroup control file minus the controller name, so for example
> memory.soft_limit_in_bytes will become soft_limit_in_bytes.
>
> == value ==
> Any value valid for the given control file; upstart itself won't perform
> any validation.
>
> If the value contains spaces, it should be put between double-quotes (e.g.):
> cgroup devices auto allow "c 1:2 rwm"
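
As a rough illustration of how the four fields split apart, here is a small
Python sketch. It is not how Upstart parses job files (that code is C with
its own quoting rules); shlex is only a stand-in for the double-quote
handling, and parse_cgroup_stanza and control_file are hypothetical names.

```python
import shlex


def parse_cgroup_stanza(line):
    """Split one cgroup stanza into (controller, name, key, value).

    key and value are None when the stanza only names a cgroup.
    Hypothetical helper, for illustration only.
    """
    fields = shlex.split(line)  # keeps a double-quoted value as one field
    if not fields or fields[0] != "cgroup":
        raise ValueError("not a cgroup stanza: %r" % line)
    if len(fields) == 3:
        return fields[1], fields[2], None, None
    if len(fields) == 5:
        return fields[1], fields[2], fields[3], fields[4]
    raise ValueError("expected: cgroup <controller> <name> [<key> <value>]")


def control_file(controller, key):
    """Rebuild the control file name by putting the controller name back."""
    return "%s.%s" % (controller, key)


stanza = 'cgroup devices auto allow "c 1:2 rwm"'
controller, name, key, value = parse_cgroup_stanza(stanza)
print(controller, name, key, repr(value))  # devices auto allow 'c 1:2 rwm'
print(control_file(controller, key))       # devices.allow
```

The value is handed on untouched, which matches the point that upstart
itself performs no validation of it.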
>
>
> Upstart won't have any controller-aware logic in its code. Instead,
> it'll simply talk over D-Bus (using a private D-Bus socket) to the cgroup
> manager which will take care of applying the various limits.
> That cgroup manager will be started very early in the boot sequence. Any
> job containing a cgroup stanza will be held until the manager is
> started.
>
> The cgroup will be destroyed when a job is stopped and the cgroup isn't
> shared with another job (task count is 0 and it has no child cgroup).
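
The destruction condition can be sketched in Python as below. This is an
assumption-laden illustration, not the cgroup manager's logic: it assumes a
cgroup-v1 style directory (a "tasks" file with one PID per line, one
subdirectory per child cgroup) and runs against a fake hierarchy in a
temporary directory. can_destroy is a hypothetical name.

```python
import os
import tempfile


def can_destroy(cgroup_dir):
    """True when the cgroup has no tasks and no child cgroups.

    Assumes a cgroup-v1 style layout; hypothetical helper.
    """
    with open(os.path.join(cgroup_dir, "tasks")) as f:
        has_tasks = any(line.strip() for line in f)
    with os.scandir(cgroup_dir) as entries:
        has_children = any(entry.is_dir() for entry in entries)
    return not has_tasks and not has_children


# Build a fake "db" cgroup with one child to exercise the check.
root = tempfile.mkdtemp()
db = os.path.join(root, "db")
child = os.path.join(db, "job2")
os.makedirs(child)
for path, pids in ((db, ""), (child, "1234\n")):
    with open(os.path.join(path, "tasks"), "w") as f:
        f.write(pids)

print(can_destroy(db))     # False: no tasks, but it still has a child
print(can_destroy(child))  # False: the child still has a task

# Once the last task is gone, the child becomes eligible.
open(os.path.join(child, "tasks"), "w").close()
print(can_destroy(child))  # True
```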
>
> It'll be possible to disable cgroup support entirely by either building
> upstart without it (needed for non-Linux systems) or by passing
> --no-cgroup as a parameter to upstart. In that case, the cgroup stanza
> will simply be ignored and the jobs will start without limitations.
>
>
> All of the above is also meant to apply to user sessions. The cgroup
> manager will allow unprivileged cgroup configuration, so as long as the
> user has write access to a sub-section of a controller, it'll be allowed
> to write entries there. Similarly to other restriction stanzas, failure
> to apply a cgroup limit in a user session won't be fatal.
>
>
> Now a few examples to try and illustrate the thoughts behind that proposal:
>
> == Single job simple example ==
> === Job ===
> cgroup memory $auto limit_in_bytes 52428800
>
> === Result ===
> The job will only start once the manager is up and running and will have a
> 50MB memory limit. If the system has less than 50MB of RAM, the job will
> fail to start.
>
> == Single job complex example ==
> === Job ===
> cgroup memory $auto limit_in_bytes 52428800
> cgroup cpuset $auto cpus 0-1
> cgroup blkio slowio throttle.write_bps_device "8:16 1048576"
>
> === Result ===
> The job will only start once the manager is up and running and will have a
> 50MB memory limit, be restricted to CPU ids 0 and 1 and have a 1MB/s
> write limit to the block device 8:16.
> The job will fail to start if the system has less than 50MB of RAM or
> less than 2 CPUs.
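
For anyone checking the numbers in these examples, a short Python sketch of
the arithmetic follows. The helper parse_cpu_list is hypothetical and only
handles the simple "a-b" and comma-separated forms used here.

```python
def parse_cpu_list(spec):
    """Expand a cpuset list such as "0-1" or "0,2-3" into a set of CPU ids."""
    cpus = set()
    for part in spec.split(","):
        first, _, last = part.partition("-")
        cpus.update(range(int(first), int(last or first) + 1))
    return cpus


MIB = 1024 * 1024

print(52428800 // MIB)  # 50, the 50MB memory limit
print(1048576 // MIB)   # 1, the 1MB/s write limit

cpus = parse_cpu_list("0-1")
print(sorted(cpus))     # [0, 1]

# "cpus 0-1" needs CPU ids 0 and 1 to exist, so at least 2 CPUs.
print(max(cpus) + 1)    # 2
```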
>
>
> == Multiple jobs complex example ==
> === Job 1 ===
> cgroup cpuset db cpus 0-1
> cgroup memory db limit_in_bytes 104857600
> cgroup blkio db throttle.write_bps_device "8:16 1048576"
>
> === Job 2 ===
> cgroup cpuset db/$auto cpus 1
> cgroup memory db/$auto limit_in_bytes 52428800
> cgroup blkio db/$auto throttle.write_bps_device "8:17 1048576"
>
> === Job 3 ===
> cgroup cpuset db
> cgroup memory db
>
> === Job 4 ===
> cgroup cpuset db/$auto cpus 2
>
> === Result ===
> This is rather complex, so let's go job by job:
> - Job 1 will start bound to CPU 0 and 1 with a 100MB memory limit and
> 1MB/s write limit to the 8:16 block device. It'll fail to start if
> the system has less than 2 CPUs or less than 100MB of RAM.
>
> - Job 2 will start bound to CPU 1 and with a 50MB memory limit. It'll
> inherit the 1MB/s write limit to 8:16 and, on top of that, rate limit
> writes to 8:17 at 1MB/s as well.
> The job will fail to start if the system has less than 50MB of RAM or
> less than 2 CPUs.
>
> - Job 3 will start in the "db" cpuset and memory cgroups. If it starts
> before Job 1, no limit will be applied at startup time. As soon as Job 1
> starts however Job 3 will be limited to 2 CPUs and 100MB of memory.
> As it doesn't have a blkio statement, it won't have rate limited I/Os.
>
> - Job 4, if started after Job 1, will fail to start as it's requesting a
> CPU that the parent cgroup doesn't have access to. If started before
> Job 1, however, it won't have a parent value set, so it will inherit the
> default and will start so long as the system has at least 3 CPUs.
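
The ordering effect on Job 2 and Job 4 comes down to a subset test between
the child's and the parent's CPU sets. Here is a minimal Python sketch of
that reasoning; the 4-CPU system and the name child_cpus_allowed are
assumptions for illustration, and the "before Job 1" case simply models the
default described in the message.

```python
def child_cpus_allowed(child_cpus, parent_cpus):
    """A child cpuset may only use CPUs that its parent has access to."""
    return set(child_cpus) <= set(parent_cpus)


all_cpus = {0, 1, 2, 3}    # assumed 4-CPU system, illustration only
db_after_job1 = {0, 1}     # Job 1 has set "cpus 0-1" on the "db" cgroup
db_before_job1 = all_cpus  # no value set yet, so "db" has the default

print(child_cpus_allowed({1}, db_after_job1))   # True:  Job 2 (cpus 1)
print(child_cpus_allowed({2}, db_after_job1))   # False: Job 4 after Job 1
print(child_cpus_allowed({2}, db_before_job1))  # True:  Job 4 before Job 1
```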
>
>
>
> I think this pretty much covers all I've got in mind at this point, and I
> believe the above is flexible enough to work with all existing
> controllers.
>
> Questions, comments and suggestions are very welcome!
>
> --
> Stéphane Graber
> Ubuntu developer
> http://www.ubuntu.com
> --
> upstart-devel mailing list
> [email protected]
> Modify settings or unsubscribe at:
> https://lists.ubuntu.com/mailman/listinfo/upstart-devel
--
Stéphane Graber
Ubuntu developer
http://www.ubuntu.com
