The following pull request was submitted through GitHub.
It can be accessed and reviewed at: https://github.com/lxc/lxd/pull/7186

This e-mail was sent by the LXC bot; direct replies will not reach the author
unless they happen to be subscribed to this list.

=== Description (from pull-request) ===
From 4e53e75245ec5a182efa3e0071d6c540b42b3db6 Mon Sep 17 00:00:00 2001
From: Stéphane Graber <stgra...@ubuntu.com>
Date: Tue, 14 Apr 2020 12:52:20 -0400
Subject: [PATCH 1/2] doc/instances: Clarify config conditions
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Closes #7179

Signed-off-by: Stéphane Graber <stgra...@ubuntu.com>
---
 doc/instances.md | 124 +++++++++++++++++++++++------------------------
 1 file changed, 62 insertions(+), 62 deletions(-)

diff --git a/doc/instances.md b/doc/instances.md
index 8305efe943..c212149f0c 100644
--- a/doc/instances.md
+++ b/doc/instances.md
@@ -34,68 +34,68 @@ currently supported:
 
 The currently supported keys are:
 
-Key | Type | Default | Live update | Condition | Description
-:-- | :--- | :------ | :---------- | :---------- | :----------
-boot.autostart | boolean | - | n/a | - | Always start the instance when LXD starts (if not set, restore last state)
-boot.autostart.delay | integer | 0 | n/a | - | Number of seconds to wait after the instance started before starting the next one
-boot.autostart.priority | integer | 0 | n/a | - | What order to start the instances in (starting with highest)
-boot.host\_shutdown\_timeout | integer | 30 | yes | - | Seconds to wait for instance to shutdown before it is force stopped
-boot.stop.priority | integer | 0 | n/a | - | What order to shutdown the instances (starting with highest)
-environment.\* | string | - | yes (exec) | - | key/value environment variables to export to the instance and set on exec
-limits.cpu | string | - (all) | yes | - | Number or range of CPUs to expose to the instance
-limits.cpu.allowance | string | 100% | yes | - | How much of the CPU can be used. Can be a percentage (e.g. 50%) for a soft limit or a chunk of time (25ms/100ms) for a hard limit
-limits.cpu.priority | integer | 10 (maximum) | yes | - | CPU scheduling priority compared to other instances sharing the same CPUs (overcommit) (integer between 0 and 10)
-limits.disk.priority | integer | 5 (medium) | yes | - | When under load, how much priority to give to the instance's I/O requests (integer between 0 and 10)
-limits.hugepages.64KB | string | - | yes | container | Fixed value in bytes (various suffixes supported, see below) to limit number of 64 KB hugepages (Available hugepage sizes are architecture dependent.)
-limits.hugepages.1MB | string | - | yes | container | Fixed value in bytes (various suffixes supported, see below) to limit number of 1 MB hugepages (Available hugepage sizes are architecture dependent.)
-limits.hugepages.2MB | string | - | yes | container | Fixed value in bytes (various suffixes supported, see below) to limit number of 2 MB hugepages (Available hugepage sizes are architecture dependent.)
-limits.hugepages.1GB | string | - | yes | container | Fixed value in bytes (various suffixes supported, see below) to limit number of 1 GB hugepages (Available hugepage sizes are architecture dependent.)
-limits.kernel.\* | string | - | no | container | This limits kernel resources per instance (e.g. number of open files)
-limits.memory | string | - (all) | yes | - | Percentage of the host's memory or fixed value in bytes (various suffixes supported, see below)
-limits.memory.enforce | string | hard | yes | container | If hard, instance can't exceed its memory limit. If soft, the instance can exceed its memory limit when extra host memory is available
-limits.memory.hugepages | boolean | false | no | virtual-machine | Controls whether to back the instance using hugepages rather than regular system memory
-limits.memory.swap | boolean | true | yes | - | Whether to allow some of the instance's memory to be swapped out to disk
-limits.memory.swap.priority | integer | 10 (maximum) | yes | - | The higher this is set, the least likely the instance is to be swapped to disk (integer between 0 and 10)
-limits.network.priority | integer | 0 (minimum) | yes | - | When under load, how much priority to give to the instance's network requests (integer between 0 and 10)
-limits.processes | integer | - (max) | yes | container | Maximum number of processes that can run in the instance
-linux.kernel\_modules | string | - | yes | container | Comma separated list of kernel modules to load before starting the instance
-migration.incremental.memory | boolean | false | yes | container | Incremental memory transfer of the instance's memory to reduce downtime
-migration.incremental.memory.goal | integer | 70 | yes | container | Percentage of memory to have in sync before stopping the instance
-migration.incremental.memory.iterations | integer | 10 | yes | container | Maximum number of transfer operations to go through before stopping the instance
-nvidia.driver.capabilities | string | compute,utility | no | container | What driver capabilities the instance needs (sets libnvidia-container NVIDIA\_DRIVER\_CAPABILITIES)
-nvidia.runtime | boolean | false | no | container | Pass the host NVIDIA and CUDA runtime libraries into the instance
-nvidia.require.cuda | string | - | no | container | Version expression for the required CUDA version (sets libnvidia-container NVIDIA\_REQUIRE\_CUDA)
-nvidia.require.driver | string | - | no | container | Version expression for the required driver version (sets libnvidia-container NVIDIA\_REQUIRE\_DRIVER)
-raw.apparmor | blob | - | yes | container | Apparmor profile entries to be appended to the generated profile
-raw.idmap | blob | - | no | container | Raw idmap configuration (e.g. "both 1000 1000")
-raw.lxc | blob | - | no | container | Raw LXC configuration to be appended to the generated one
-raw.qemu | blob | - | no | virtual-machine | Raw Qemu configuration to be appended to the generated command line
-raw.seccomp | blob | - | no | container | Raw Seccomp configuration
-security.devlxd | boolean | true | no | - | Controls the presence of /dev/lxd in the instance
-security.devlxd.images | boolean | false | no | - | Controls the availability of the /1.0/images API over devlxd
-security.idmap.base | integer | - | no | container | The base host ID to use for the allocation (overrides auto-detection)
-security.idmap.isolated | boolean | false | no | container | Use an idmap for this instance that is unique among instances with isolated set
-security.idmap.size | integer | - | no | container | The size of the idmap to use
-security.nesting | boolean | false | yes | - | Support running lxd (nested) inside the instance
-security.privileged | boolean | false | no | container | Runs the instance in privileged mode
-security.protection.delete | boolean | false | yes | - | Prevents the instance from being deleted
-security.protection.shift | boolean | false | yes | container | Prevents the instance's filesystem from being uid/gid shifted on startup
-security.secureboot | boolean | true | no | virtual-machine | Controls whether UEFI secure boot is enabled with the default Microsoft keys
-security.syscalls.blacklist | string | - | no | container | A '\n' separated list of syscalls to blacklist
-security.syscalls.blacklist\_compat | boolean | false | no | container | On x86\_64 this enables blocking of compat\_\* syscalls, it is a no-op on other arches
-security.syscalls.blacklist\_default | boolean | true | no | container | Enables the default syscall blacklist
-security.syscalls.intercept.mknod | boolean | false | no | container | Handles the `mknod` and `mknodat` system calls (allows creation of a limited subset of char/block devices)
-security.syscalls.intercept.mount | boolean | false | no | container | Handles the `mount` system call
-security.syscalls.intercept.mount.allowed | string | - | yes | container | Specify a comma-separated list of filesystems that are safe to mount for processes inside the instance
-security.syscalls.intercept.mount.fuse | string | - | yes | container | Whether to mount shiftfs on top of filesystems handled through mount syscall interception
-security.syscalls.intercept.mount.shift | boolean | false | yes | container | Whether to redirect mounts of a given filesystem to their fuse implementation (e.g. ext4=fuse2fs)
-security.syscalls.intercept.setxattr | boolean | false | no | container | Handles the `setxattr` system call (allows setting a limited subset of restricted extended attributes)
-security.syscalls.whitelist | string | - | no | container | A '\n' separated list of syscalls to whitelist (mutually exclusive with security.syscalls.blacklist\*)
-snapshots.schedule | string | - | no | - | Cron expression (`<minute> <hour> <dom> <month> <dow>`)
-snapshots.schedule.stopped | bool | false | no | - | Controls whether or not stopped instances are to be snapshotted automatically
-snapshots.pattern | string | snap%d | no | - | Pongo2 template string which represents the snapshot name (used for scheduled snapshots and unnamed snapshots)
-snapshots.expiry | string | - | no | - | Controls when snapshots are to be deleted (expects expression like `1M 2H 3d 4w 5m 6y`)
-user.\* | string | - | n/a | - | Free form user key/value storage (can be used in search)
+Key | Type | Default | Live update | Condition | Description
+:-- | :--- | :------ | :---------- | :---------- | :----------
+boot.autostart | boolean | - | n/a | - | Always start the instance when LXD starts (if not set, restore last state)
+boot.autostart.delay | integer | 0 | n/a | - | Number of seconds to wait after the instance started before starting the next one
+boot.autostart.priority | integer | 0 | n/a | - | What order to start the instances in (starting with highest)
+boot.host\_shutdown\_timeout | integer | 30 | yes | - | Seconds to wait for instance to shutdown before it is force stopped
+boot.stop.priority | integer | 0 | n/a | - | What order to shutdown the instances (starting with highest)
+environment.\* | string | - | yes (exec) | - | key/value environment variables to export to the instance and set on exec
+limits.cpu | string | - (all) | yes | - | Number or range of CPUs to expose to the instance
+limits.cpu.allowance | string | 100% | yes | container | How much of the CPU can be used. Can be a percentage (e.g. 50%) for a soft limit or a chunk of time (25ms/100ms) for a hard limit
+limits.cpu.priority | integer | 10 (maximum) | yes | container | CPU scheduling priority compared to other instances sharing the same CPUs (overcommit) (integer between 0 and 10)
+limits.disk.priority | integer | 5 (medium) | yes | - | When under load, how much priority to give to the instance's I/O requests (integer between 0 and 10)
+limits.hugepages.64KB | string | - | yes | container | Fixed value in bytes (various suffixes supported, see below) to limit number of 64 KB hugepages (Available hugepage sizes are architecture dependent.)
+limits.hugepages.1MB | string | - | yes | container | Fixed value in bytes (various suffixes supported, see below) to limit number of 1 MB hugepages (Available hugepage sizes are architecture dependent.)
+limits.hugepages.2MB | string | - | yes | container | Fixed value in bytes (various suffixes supported, see below) to limit number of 2 MB hugepages (Available hugepage sizes are architecture dependent.)
+limits.hugepages.1GB | string | - | yes | container | Fixed value in bytes (various suffixes supported, see below) to limit number of 1 GB hugepages (Available hugepage sizes are architecture dependent.)
+limits.kernel.\* | string | - | no | container | This limits kernel resources per instance (e.g. number of open files)
+limits.memory | string | - (all) | yes | - | Percentage of the host's memory or fixed value in bytes (various suffixes supported, see below)
+limits.memory.enforce | string | hard | yes | container | If hard, instance can't exceed its memory limit. If soft, the instance can exceed its memory limit when extra host memory is available
+limits.memory.hugepages | boolean | false | no | virtual-machine | Controls whether to back the instance using hugepages rather than regular system memory
+limits.memory.swap | boolean | true | yes | container | Whether to allow some of the instance's memory to be swapped out to disk
+limits.memory.swap.priority | integer | 10 (maximum) | yes | container | The higher this is set, the least likely the instance is to be swapped to disk (integer between 0 and 10)
+limits.network.priority | integer | 0 (minimum) | yes | - | When under load, how much priority to give to the instance's network requests (integer between 0 and 10)
+limits.processes | integer | - (max) | yes | container | Maximum number of processes that can run in the instance
+linux.kernel\_modules | string | - | yes | container | Comma separated list of kernel modules to load before starting the instance
+migration.incremental.memory | boolean | false | yes | container | Incremental memory transfer of the instance's memory to reduce downtime
+migration.incremental.memory.goal | integer | 70 | yes | container | Percentage of memory to have in sync before stopping the instance
+migration.incremental.memory.iterations | integer | 10 | yes | container | Maximum number of transfer operations to go through before stopping the instance
+nvidia.driver.capabilities | string | compute,utility | no | container | What driver capabilities the instance needs (sets libnvidia-container NVIDIA\_DRIVER\_CAPABILITIES)
+nvidia.runtime | boolean | false | no | container | Pass the host NVIDIA and CUDA runtime libraries into the instance
+nvidia.require.cuda | string | - | no | container | Version expression for the required CUDA version (sets libnvidia-container NVIDIA\_REQUIRE\_CUDA)
+nvidia.require.driver | string | - | no | container | Version expression for the required driver version (sets libnvidia-container NVIDIA\_REQUIRE\_DRIVER)
+raw.apparmor | blob | - | yes | container | Apparmor profile entries to be appended to the generated profile
+raw.idmap | blob | - | no | unprivileged container | Raw idmap configuration (e.g. "both 1000 1000")
+raw.lxc | blob | - | no | container | Raw LXC configuration to be appended to the generated one
+raw.qemu | blob | - | no | virtual-machine | Raw Qemu configuration to be appended to the generated command line
+raw.seccomp | blob | - | no | container | Raw Seccomp configuration
+security.devlxd | boolean | true | no | container | Controls the presence of /dev/lxd in the instance
+security.devlxd.images | boolean | false | no | container | Controls the availability of the /1.0/images API over devlxd
+security.idmap.base | integer | - | no | unprivileged container | The base host ID to use for the allocation (overrides auto-detection)
+security.idmap.isolated | boolean | false | no | unprivileged container | Use an idmap for this instance that is unique among instances with isolated set
+security.idmap.size | integer | - | no | unprivileged container | The size of the idmap to use
+security.nesting | boolean | false | yes | container | Support running lxd (nested) inside the instance
+security.privileged | boolean | false | no | container | Runs the instance in privileged mode
+security.protection.delete | boolean | false | yes | - | Prevents the instance from being deleted
+security.protection.shift | boolean | false | yes | container | Prevents the instance's filesystem from being uid/gid shifted on startup
+security.secureboot | boolean | true | no | virtual-machine | Controls whether UEFI secure boot is enabled with the default Microsoft keys
+security.syscalls.blacklist | string | - | no | container | A '\n' separated list of syscalls to blacklist
+security.syscalls.blacklist\_compat | boolean | false | no | container | On x86\_64 this enables blocking of compat\_\* syscalls, it is a no-op on other arches
+security.syscalls.blacklist\_default | boolean | true | no | container | Enables the default syscall blacklist
+security.syscalls.intercept.mknod | boolean | false | no | container | Handles the `mknod` and `mknodat` system calls (allows creation of a limited subset of char/block devices)
+security.syscalls.intercept.mount | boolean | false | no | container | Handles the `mount` system call
+security.syscalls.intercept.mount.allowed | string | - | yes | container | Specify a comma-separated list of filesystems that are safe to mount for processes inside the instance
+security.syscalls.intercept.mount.fuse | string | - | yes | container | Whether to mount shiftfs on top of filesystems handled through mount syscall interception
+security.syscalls.intercept.mount.shift | boolean | false | yes | container | Whether to redirect mounts of a given filesystem to their fuse implementation (e.g. ext4=fuse2fs)
+security.syscalls.intercept.setxattr | boolean | false | no | container | Handles the `setxattr` system call (allows setting a limited subset of restricted extended attributes)
+security.syscalls.whitelist | string | - | no | container | A '\n' separated list of syscalls to whitelist (mutually exclusive with security.syscalls.blacklist\*)
+snapshots.schedule | string | - | no | - | Cron expression (`<minute> <hour> <dom> <month> <dow>`)
+snapshots.schedule.stopped | bool | false | no | - | Controls whether or not stopped instances are to be snapshotted automatically
+snapshots.pattern | string | snap%d | no | - | Pongo2 template string which represents the snapshot name (used for scheduled snapshots and unnamed snapshots)
+snapshots.expiry | string | - | no | - | Controls when snapshots are to be deleted (expects expression like `1M 2H 3d 4w 5m 6y`)
+user.\* | string | - | n/a | - | Free form user key/value storage (can be used in search)
 
 The following volatile keys are currently internally used by LXD:
 

From 24255c43d17d9cd72b92d593b6233f7bb1771de9 Mon Sep 17 00:00:00 2001
From: Stéphane Graber <stgra...@ubuntu.com>
Date: Tue, 14 Apr 2020 12:57:51 -0400
Subject: [PATCH 2/2] doc/index: Clarify bind-mount in FAQ
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Closes #7178

Signed-off-by: Stéphane Graber <stgra...@ubuntu.com>
---
 doc/index.md | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/doc/index.md b/doc/index.md
index b820aa5634..8ca5273edb 100644
--- a/doc/index.md
+++ b/doc/index.md
@@ -277,25 +277,25 @@ experimental stages and may not work for all workloads. Please report bugs on
 lxc-devel, and we can escalate to CRIU lists as necessary.
 
 #### Can I bind mount my home directory in a container?
-Yes. The easiest way to do that is using a privileged container to avoid file ownership issues:
-
-1.a) create a container.
+Yes. This can be done using a disk device:
 
 ```bash
-lxc launch ubuntu privilegedContainerName -c security.privileged=true
+lxc config device add container-name home disk source=/home/$USER path=/home/ubuntu
 ```
 
-1.b) or, if your container already exists.
+For unprivileged containers, you will also need one of:
 
-```bash
-lxc config set privilegedContainerName security.privileged true
-```
+ - Pass `shifted=true` to the `lxc config device add` call. This depends on `shiftfs` being supported (see `lxc info`)
+ - raw.idmap entry (see [Idmaps for user namespace](userns-idmap.md))
+ - Recursive POSIX ACLs placed on your home directory
 
-2) then.
+Either of those can be used to allow the user in the container to have working read/write permissions.
+When not setting one of those, everything will show up as the overflow uid/gid (65536:65536)
+and access to anything that's not world readable will fail.
 
-```bash
-lxc config device add privilegedContainerName shareName disk source=/home/$USER path=/home/ubuntu
-```
+
+Privileged containers do not have this issue as all uid/gid in the container are the same outside.
+But that's also the cause of most of the security issues with such privileged containers.
 
 #### How can I run docker inside a LXD container?
 In order to run Docker inside a LXD container the `security.nesting` property of the container should be set to `true`.
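
For readers trying the new FAQ text, the `raw.idmap` route for an unprivileged container could be sketched as follows. This is a dry run that only prints the `lxc` commands so they can be reviewed first. The container name `c1`, the helper names `idmap_cmd` and `home_device_cmd`, and the assumption that the container user has uid/gid 1000 are all illustrative and not taken from the patch; `both` also assumes the host user's uid and gid are the same number. The final restart is there because the table in patch 1 lists `raw.idmap` as not live-updatable.

```shell
#!/bin/sh
# Dry-run sketch: prints lxc commands for review instead of executing them.
# "c1", the helper names, and uid/gid 1000 inside the container are
# assumptions made for illustration only.

# Print the raw.idmap setting that maps one host uid/gid onto a container id.
# $1 = container, $2 = id on the host, $3 = id inside the container
idmap_cmd() {
    printf 'lxc config set %s raw.idmap "both %s %s"\n' "$1" "$2" "$3"
}

# Print the disk device command from the FAQ entry.
# $1 = container, $2 = directory on the host, $3 = path inside the container
home_device_cmd() {
    printf 'lxc config device add %s home disk source=%s path=%s\n' "$1" "$2" "$3"
}

# Map the invoking host user onto uid/gid 1000 in the container,
# attach the home directory, then restart so the new idmap is applied.
idmap_cmd c1 "$(id -u)" 1000
home_device_cmd c1 "$HOME" /home/ubuntu
printf 'lxc restart %s\n' c1
```

Depending on the host setup, root may also need to be allowed to map that id in `/etc/subuid` and `/etc/subgid`; the idmap document linked from the FAQ entry is the place to check.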
_______________________________________________
lxc-devel mailing list
lxc-devel@lists.linuxcontainers.org
http://lists.linuxcontainers.org/listinfo/lxc-devel