Repository: mesos Updated Branches: refs/heads/1.5.x eee8f2124 -> 6e2dfd06c
Improved performance of cgroups::read by verifying after failure. It turns out that cgroups::verify is expensive and leads to a severe performance issue on the agent during container metrics collection if there are a lot of containers on the agent. See MESOS-8418. Since cgroups::verify serves to provide a helpful error message, this patch preserves the error message, but only if the read fails. Longer term, there probably needs to be some re-structuring of the code to make verification caller-controlled, or perhaps the verify code can occur consistently post-operation (as done in this patch), or perhaps verify can be optimized substantially. Review: https://reviews.apache.org/r/67923/ Project: http://git-wip-us.apache.org/repos/asf/mesos/repo Commit: http://git-wip-us.apache.org/repos/asf/mesos/commit/f2be36d4 Tree: http://git-wip-us.apache.org/repos/asf/mesos/tree/f2be36d4 Diff: http://git-wip-us.apache.org/repos/asf/mesos/diff/f2be36d4 Branch: refs/heads/1.5.x Commit: f2be36d4a57206b2c53a0f42b2e392ccd65c8c98 Parents: eee8f21 Author: Benjamin Mahler <bmah...@apache.org> Authored: Mon Jul 16 17:22:45 2018 -0700 Committer: Benjamin Mahler <bmah...@apache.org> Committed: Mon Jul 16 17:36:31 2018 -0700 ---------------------------------------------------------------------- src/linux/cgroups.cpp | 21 +++++++++++++++++---- 1 file changed, 17 insertions(+), 4 deletions(-) ---------------------------------------------------------------------- http://git-wip-us.apache.org/repos/asf/mesos/blob/f2be36d4/src/linux/cgroups.cpp ---------------------------------------------------------------------- diff --git a/src/linux/cgroups.cpp b/src/linux/cgroups.cpp index cdd0f40..96ea289 100644 --- a/src/linux/cgroups.cpp +++ b/src/linux/cgroups.cpp @@ -948,12 +948,25 @@ Try<string> read( const string& cgroup, const string& control) { - Option<Error> error = verify(hierarchy, cgroup, control); - if (error.isSome()) { - return error.get(); + Try<string> read = internal::read(hierarchy, cgroup, control); + + // It turns out that `verify()` is expensive, so rather than + // verifying *prior* to the read, as is currently done for other + // cgroup helpers, we only verify if the read fails. This ensures + // we still provide a good error message. See: MESOS-8418. + // + // TODO(bmahler): Longer term, verification should be done + // explicitly by the caller, or its performance should be + // improved, or verification could be done consistently + // post-failure, as done here. + if (read.isError()) { + Option<Error> error = verify(hierarchy, cgroup, control); + if (error.isSome()) { + return error.get(); + } } - return internal::read(hierarchy, cgroup, control); + return read; }