Re: [PR] [cgroups2] Implement Cgroups 2 isolator w/o nested containers and systemd. [mesos]
DevinLeamy commented on code in PR #556: URL: https://github.com/apache/mesos/pull/556#discussion_r1572831258 ## src/slave/containerizer/mesos/isolators/cgroups2/cgroups2.cpp: ## @@ -120,6 +144,569 @@ bool Cgroups2IsolatorProcess::supportsStandalone() return true; } + +Future> Cgroups2IsolatorProcess::prepare( +const ContainerID& containerId, +const ContainerConfig& containerConfig) +{ + if (containerId.has_parent()) { +return Failure("cgroups v2 does not support nested containers"); + } + + if (infos.contains(containerId)) { +return Failure( +"Container with id '" + stringify(containerId) + "' " +"has already been prepared"); + } + + CHECK(containerConfig.container_class() != ContainerClass::DEBUG); + + // Create the non-leaf and leaf cgroups for the container, enable + // controllers in the non-leaf cgroup, and `prepare` each of the controllers. + const string& cgroup = cgroups2_paths::container( + flags.cgroups_root, containerId); + if (cgroups2::exists(cgroup)) { +return Failure("Cgroup '" + cgroup + "' already exists"); + } + + Try createCgroup = cgroups2::create(cgroup); + if (createCgroup.isError()) { +return Failure("Failed to create cgroup '" + cgroup + "': " + + createCgroup.error()); + } + + const string& leafCgroup = cgroups2_paths::container( + flags.cgroups_root, containerId, true); + if (cgroups2::exists(leafCgroup)) { +return Failure("Cgroup '" + leafCgroup + "' already exists"); + } + + Try createLeafCgroup = cgroups2::create(leafCgroup); + if (createLeafCgroup.isError()) { +return Failure("Failed to create cgroup '" + leafCgroup + "': " + + createLeafCgroup.error()); + } + + LOG(INFO) << "Created cgroups '" << cgroup << "' and '" << leafCgroup << "'"; + + infos[containerId] = Owned(new Info(containerId, cgroup)); + vector> prepares; + foreachvalue (const Owned& controller, controllers) { +// Enable controllers in the non-leaf cgroup. +Try enable = + cgroups2::controllers::enable(cgroup, {controller->name()}); + +infos[containerId]->controllers.insert(controller->name()); +prepares.push_back( +controller->prepare(containerId, cgroup, containerConfig)); + } + + return await(prepares) +.then(defer( +PID(this), +::_prepare, +containerId, +containerConfig, +lambda::_1)); +} + + +Future> Cgroups2IsolatorProcess::_prepare( +const ContainerID& containerId, +const ContainerConfig& containerConfig, +const vector>& futures) +{ + vector errors; + foreach (const Future& future, futures) { +if (!future.isReady()) { + errors.push_back(future.isFailed() ? future.failure() : "discarded"); +} + } + + if (!errors.empty()) { +return Failure( +"Failed to prepare controllers: " + strings::join(": ", errors)); + } + + return update( + containerId, + containerConfig.resources(), + containerConfig.limits()) +.then(defer( +PID(this), +::__prepare, +containerId, +containerConfig)); +} + + +Future> Cgroups2IsolatorProcess::__prepare( +const ContainerID& containerId, +const ContainerConfig& containerConfig) +{ + // Only create cgroup mounts for containers with rootfs. + if (!containerConfig.has_rootfs()) { +return None(); + } + + Owned info = cgroupInfo(containerId); + if (!info.get()) { +return Failure( +"Failed to get cgroup for container " +"'" + stringify(containerId) + "'"); + } + + ContainerLaunchInfo launchInfo; + // Create a new cgroup namespace. The child process will only be able to + // see the cgroups that are in its cgroup subtree. + launchInfo.add_clone_namespaces(CLONE_NEWCGROUP); + + // Create a new mount namespace and mount the root cgroup at /sys/fs/cgroup. + launchInfo.add_clone_namespaces(CLONE_NEWNS); + *launchInfo.add_mounts() = protobuf::slave::createContainerMount( + cgroups2::path(info->cgroup), + path::join(containerConfig.rootfs(), "/sys/fs/cgroup"), + MS_BIND | MS_REC); + + return launchInfo; +} + + +Future Cgroups2IsolatorProcess::recover( +const vector& states, +const hashset& orphans) +{ + vector> recovers; + foreach (const ContainerState& state, states) { +recovers.push_back(___recover(state.container_id())); + } + + return await(recovers) +.then(defer( +PID(this), +::_recover, +orphans, +lambda::_1)); +} + + +Future Cgroups2IsolatorProcess::_recover( + const hashset& orphans, + const vector>& futures) +{ + vector errors; + foreach (const Future& future, futures) { +if (!future.isReady()) { + errors.push_back(future.isFailed() ? future.failure() : "discarded"); +} + } + + if (errors.size() > 0) { +return Failure("Failed to recover active containers: " + + strings::join(": ", errors)); + } + + hashset knownOrphans; + hashset
[PR] [cgroups2] Error if `--cgroups_limit_swap` is used when cgroups v2 is used. [mesos]
DevinLeamy opened a new pull request, #565: URL: https://github.com/apache/mesos/pull/565 Mesos does not support limiting swap memory when using cgroups v2. This is because the cgroups v2 API does not allow us to directly control the swap memory usage limit. Rather, we can only control the combined swap _and_ RAM usage limit. Therefore, when the `--cgroups_limit_swap` flag is provided and cgroups v2 is used we error during flag validation. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@mesos.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [cgroups2] Introduces an interface to read a subset of the memory usage statistics. [mesos]
bmahler closed pull request #564: [cgroups2] Introduces an interface to read a subset of the memory usage statistics. URL: https://github.com/apache/mesos/pull/564 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@mesos.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [cgroups2] Error if `--cgroups_limit_swap` is used when cgroups v2 is used. [mesos]
bmahler closed pull request #565: [cgroups2] Error if `--cgroups_limit_swap` is used when cgroups v2 is used. URL: https://github.com/apache/mesos/pull/565 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@mesos.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [cgroups2] Introduces an interface to read a subset of the memory usage statistics. [mesos]
bmahler commented on code in PR #564: URL: https://github.com/apache/mesos/pull/564#discussion_r1573045523 ## src/linux/cgroups2.hpp: ## @@ -243,6 +243,41 @@ Try max(const std::string& cgroup); // See: https://docs.kernel.org/admin-guide/cgroup-v2.html namespace memory { +// Memory usage statistics. +// +// Snapshot of the 'memory.stat' control file. +// +// Note: +// We only record a subset of the memory statistics; a complete list can be +// found in the kernel documentation. +// https://docs.kernel.org/admin-guide/cgroup-v2.html#memory-interface-files +struct Stats +{ + // Anonymous memory usage. Review Comment: let's just use the cgroups names and docs for these? ## src/linux/cgroups2.cpp: ## @@ -964,6 +1005,17 @@ Result high(const string& cgroup) return internal::parse_bytelimit(*contents); } + +Try stats(const string& cgroup) +{ + Try contents = cgroups2::read(cgroup, control::STAT); + if (contents.isError()) { +return Error("Failed to read 'memory.stat': " + contents.error()); + } + + return control::stat::parse(strings::trim(*contents)); Review Comment: hm.. the parse function should know how to parse without the trimming, was this actually needed? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@mesos.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [cgroups2] Error if `--cgroups_limit_swap` is used when cgroups v2 is used. [mesos]
bmahler commented on code in PR #565: URL: https://github.com/apache/mesos/pull/565#discussion_r1573038742 ## src/slave/flags.cpp: ## @@ -652,8 +657,25 @@ mesos::internal::slave::Flags::Flags() add(::cgroups_limit_swap, "cgroups_limit_swap", "Cgroups feature flag to enable memory limits on both memory and\n" - "swap instead of just memory.\n", - false); + "swap instead of just memory. Not supported if cgroups v2 is used.\n", + false, + [](const bool& limit_swap) -> Option { +#ifdef ENABLE_CGROUPS_V2 +Try mounted = cgroups2::mounted(); +if (mounted.isError()) { +return Error("Failed to check if cgroup2 filesystem is mounted: " Review Comment: nit: 2 space indent here and below -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@mesos.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[PR] [cgroups2] Introduces an interface to read a subset of the memory usage statistics. [mesos]
DevinLeamy opened a new pull request, #564: URL: https://github.com/apache/mesos/pull/564 Cgroups v2 exposes memory statistics through the 'memory.stat' control. Here we introduce `cgroups2::memory::stats` to read a subset of the memory usage statistics into a new `memory::Stats` object. These statistics will be used by the `MemoryControllerProcess` to populate a `ResourceStatistics` object, like is done by the `MemorySubsystemProcess` in cgroups v1. Additional statistics from the 'memory.stat' control can be included as they are required. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@mesos.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [cgroups2] Error if `--cgroups_limit_swap` is used when cgroups v2 is used. [mesos]
bmahler commented on PR #565: URL: https://github.com/apache/mesos/pull/565#issuecomment-2067370799 > This is because the cgroups v2 API does not allow us to directly control the swap memory usage limit. Rather, we can only control the combined swap and RAM usage limit. This doesn't sound accurate? in v2 we are able to separately control swap usage -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@mesos.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [cgroups2] Introduces API to make cgroups threaded and read their type. [mesos]
DevinLeamy commented on PR #562: URL: https://github.com/apache/mesos/pull/562#issuecomment-2066644211 Since we want the isolated ("containerized") process to be able to manipulate its own cgroups, if it so chooses, we're opting to no have the leaf cgroup for each container be threaded, hence removing the motivation for this patch. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@mesos.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [cgroups2] Introduces API to make cgroups threaded and read their type. [mesos]
DevinLeamy closed pull request #562: [cgroups2] Introduces API to make cgroups threaded and read their type. URL: https://github.com/apache/mesos/pull/562 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@mesos.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org