Re: RFR: 8261242: [Linux] OSContainer::is_containerized() returns true when run outside a container [v2]
On Fri, 3 May 2024 15:58:11 GMT, Severin Gehwolf wrote: >> src/java.base/share/classes/sun/launcher/LauncherHelper.java line 375: >> >>> 373: if (!c.isContainerized()) { >>> 374: ostream.println(INDENT + "System not containerized."); >>> 375: return; >> >> Why return here? Would this not cut the output short in the >> non-containerized case? >> >> And if this not intended, the not-containerized-`-XshowSettings:system` test >> below should test and catch this (e.g. scan for CPU set) > >> Why return here? > > Because it's not useful to see containerized settings (other than the cg > version in use) after this patch. The JVM won't use them (uses the physical > settings instead). Why would you want to show the settings? To clarify. `showSettings:system` output on a host system: Operating System Metrics: Provider: cgroupv1 System not containerized. openjdk 23-internal 2024-09-17 OpenJDK Runtime Environment (fastdebug build 23-internal-adhoc.sgehwolf.jdk-jdk) OpenJDK 64-Bit Server VM (fastdebug build 23-internal-adhoc.sgehwolf.jdk-jdk, mixed mode, sharing) ... and in a container (with memory limit 500m): Operating System Metrics: Provider: cgroupv1 Effective CPU Count: 12 CPU Period: 10us CPU Quota: -1 CPU Shares: -1 List of Processors, 12 total: 0 1 2 3 4 5 6 7 8 9 10 11 List of Effective Processors, 12 total: 0 1 2 3 4 5 6 7 8 9 10 11 List of Memory Nodes, 1 total: 0 List of Available Memory Nodes, 1 total: 0 Memory Limit: 500.00M Memory Soft Limit: Unlimited Memory & Swap Limit: 500.00M Maximum Processes Limit: 2048 openjdk 23-internal 2024-09-17 OpenJDK Runtime Environment (fastdebug build 23-internal-adhoc.sgehwolf.jdk-jdk) OpenJDK 64-Bit Server VM (fastdebug build 23-internal-adhoc.sgehwolf.jdk-jdk, mixed mode, sharing) - PR Review Comment: https://git.openjdk.org/jdk/pull/18201#discussion_r1589407238
Re: RFR: 8261242: [Linux] OSContainer::is_containerized() returns true when run outside a container [v2]
On Tue, 16 Apr 2024 18:21:29 GMT, Thomas Stuefe wrote: > Why return here? Because it's not useful to see containerized settings (other than the cg version in use) after this patch. The JVM won't use them (uses the physical settings instead). Why would you want to show the settings? - PR Review Comment: https://git.openjdk.org/jdk/pull/18201#discussion_r1589396352
Re: RFR: 8261242: [Linux] OSContainer::is_containerized() returns true when run outside a container [v2]
On Tue, 16 Apr 2024 18:10:08 GMT, Thomas Stuefe wrote: >> src/hotspot/os/linux/cgroupSubsystem_linux.cpp line 351: >> >>> 349: // >>> 350: // We collect the read only mount option in the cgroup infos so as >>> to have that >>> 351: // info ready when determining is_containerized(). >> >> Here, and in other places: a comment indicating the line format we scan >> would be appreciated, possibly with argument numbers. Saves the casual code >> reader from looking into proc man page. Even just pasting the example line >> for proc manpage would be fine >> (https://man7.org/linux/man-pages/man5/proc.5.html) (but with order adapted >> to your scanf call, they count major:minor as one) > > Trying to parse the `%s%*[^-]-` > > So, %s parses the mount options, until we encounter whitespace. Then %*[^-]- > parses everything that is not a dash, until we encounter the dash? Then we > eat the dash? This is to skip the optionals? Correct. Note that `%s %*[^-]` doesn't work for files without optionals. Since `%*[^-]` requires a non-empty match and the optionals are, well, optional :-) I've added more verbose comments to clarify this. - PR Review Comment: https://git.openjdk.org/jdk/pull/18201#discussion_r1589390841
Re: RFR: 8261242: [Linux] OSContainer::is_containerized() returns true when run outside a container [v2]
On Tue, 16 Apr 2024 18:16:33 GMT, Thomas Stuefe wrote: >> Severin Gehwolf has updated the pull request with a new target base due to a >> merge or a rebase. The incremental webrev excludes the unrelated changes >> brought in by the merge/rebase. The pull request contains ten additional >> commits since the last revision: >> >> - Merge branch 'master' into jdk-8261242-is-containerized-fix >> - jcheck fixes >> - Fix tests >> - Implement Metrics.isContainerized() >> - Some clean-up >> - Drop cgroups testing on plain Linux >> - Implement fall-back logic for non-ro controller mounts >> - Make find_ro static and local to compilation unit >> - 8261242: [Linux] OSContainer::is_containerized() returns true > > src/hotspot/os/linux/osContainer_linux.cpp line 78: > >> 76: const char *reason; >> 77: bool any_mem_cpu_limit_present = false; >> 78: bool ctrl_ro = cgroup_subsystem->is_containerized(); > > nit: naming? what does ctrl mean in this case? Maybe use > "cgroup_is_containerized"? `ctrl` was short for `controller`. I've changed the naming. - PR Review Comment: https://git.openjdk.org/jdk/pull/18201#discussion_r1589391426
Re: RFR: 8261242: [Linux] OSContainer::is_containerized() returns true when run outside a container [v2]
On Thu, 11 Apr 2024 12:08:02 GMT, Severin Gehwolf wrote: >> Please review this enhancement to the container detection code which allows >> it to figure out whether the JVM is actually running inside a container >> (`podman`, `docker`, `crio`), or with some other means that enforces >> memory/cpu limits by means of the cgroup filesystem. If neither of those >> conditions hold, the JVM runs in not containerized mode, addressing the >> issue described in the JBS tracker. For example, on my Linux system >> `is_containerized() == false" is being indicated with the following trace >> log line: >> >> >> [0.001s][debug][os,container] OSContainer::init: is_containerized() = false >> because no cpu or memory limit is present >> >> >> This state is being exposed by the Java `Metrics` API class using the new >> (still JDK internal) `isContainerized()` method. Example: >> >> >> java -XshowSettings:system --version >> Operating System Metrics: >> Provider: cgroupv1 >> System not containerized. >> openjdk 23-internal 2024-09-17 >> OpenJDK Runtime Environment (fastdebug build >> 23-internal-adhoc.sgehwolf.jdk-jdk) >> OpenJDK 64-Bit Server VM (fastdebug build >> 23-internal-adhoc.sgehwolf.jdk-jdk, mixed mode, sharing) >> >> >> The basic property this is being built on is the observation that the cgroup >> controllers typically get mounted read only into containers. Note that the >> current container tests assert that `OSContainer::is_containerized() == >> true` in various tests. Therefore, using the heuristic of "is any memory or >> cpu limit present" isn't sufficient. I had considered that in an earlier >> iteration, but many container tests failed. >> >> Overall, I think, with this patch we improve the current situation of >> claiming a containerized system being present when it's actually just a >> regular Linux system. >> >> Testing: >> >> - [x] GHA (risc-v failure seems infra related) >> - [x] Container tests on Linux x86_64 of cgroups v1 and cgroups v2 >> (including gtests) >> - [x] Some manual testing using cri-o >> >> Thoughts? > > Severin Gehwolf has updated the pull request with a new target base due to a > merge or a rebase. The incremental webrev excludes the unrelated changes > brought in by the merge/rebase. The pull request contains ten additional > commits since the last revision: > > - Merge branch 'master' into jdk-8261242-is-containerized-fix > - jcheck fixes > - Fix tests > - Implement Metrics.isContainerized() > - Some clean-up > - Drop cgroups testing on plain Linux > - Implement fall-back logic for non-ro controller mounts > - Make find_ro static and local to compilation unit > - 8261242: [Linux] OSContainer::is_containerized() returns true Thanks for your input Larry! - PR Comment: https://git.openjdk.org/jdk/pull/18201#issuecomment-2066136268
Re: RFR: 8261242: [Linux] OSContainer::is_containerized() returns true when run outside a container [v2]
On Thu, 18 Apr 2024 13:27:38 GMT, Jan Kratochvil wrote: > Could not we rename `is_containerized()` to `use_container_limit()` ? As that > is the current only purpose of `is_containerized()`. I'm not sure. There is value to have `is_containerized()` like it would behave after this patch. Specifically the first table row difference in [your comment](https://github.com/openjdk/jdk/pull/18201#issuecomment-2063868908) concerns me. JVMs running in a container without limit wouldn't be detected as "containerized". That seems a large share of deployments to miss. - PR Comment: https://git.openjdk.org/jdk/pull/18201#issuecomment-2064487567
Re: RFR: 8261242: [Linux] OSContainer::is_containerized() returns true when run outside a container [v2]
On Thu, 11 Apr 2024 12:08:02 GMT, Severin Gehwolf wrote: >> Please review this enhancement to the container detection code which allows >> it to figure out whether the JVM is actually running inside a container >> (`podman`, `docker`, `crio`), or with some other means that enforces >> memory/cpu limits by means of the cgroup filesystem. If neither of those >> conditions hold, the JVM runs in not containerized mode, addressing the >> issue described in the JBS tracker. For example, on my Linux system >> `is_containerized() == false" is being indicated with the following trace >> log line: >> >> >> [0.001s][debug][os,container] OSContainer::init: is_containerized() = false >> because no cpu or memory limit is present >> >> >> This state is being exposed by the Java `Metrics` API class using the new >> (still JDK internal) `isContainerized()` method. Example: >> >> >> java -XshowSettings:system --version >> Operating System Metrics: >> Provider: cgroupv1 >> System not containerized. >> openjdk 23-internal 2024-09-17 >> OpenJDK Runtime Environment (fastdebug build >> 23-internal-adhoc.sgehwolf.jdk-jdk) >> OpenJDK 64-Bit Server VM (fastdebug build >> 23-internal-adhoc.sgehwolf.jdk-jdk, mixed mode, sharing) >> >> >> The basic property this is being built on is the observation that the cgroup >> controllers typically get mounted read only into containers. Note that the >> current container tests assert that `OSContainer::is_containerized() == >> true` in various tests. Therefore, using the heuristic of "is any memory or >> cpu limit present" isn't sufficient. I had considered that in an earlier >> iteration, but many container tests failed. >> >> Overall, I think, with this patch we improve the current situation of >> claiming a containerized system being present when it's actually just a >> regular Linux system. >> >> Testing: >> >> - [x] GHA (risc-v failure seems infra related) >> - [x] Container tests on Linux x86_64 of cgroups v1 and cgroups v2 >> (including gtests) >> - [x] Some manual testing using cri-o >> >> Thoughts? > > Severin Gehwolf has updated the pull request with a new target base due to a > merge or a rebase. The incremental webrev excludes the unrelated changes > brought in by the merge/rebase. The pull request contains ten additional > commits since the last revision: > > - Merge branch 'master' into jdk-8261242-is-containerized-fix > - jcheck fixes > - Fix tests > - Implement Metrics.isContainerized() > - Some clean-up > - Drop cgroups testing on plain Linux > - Implement fall-back logic for non-ro controller mounts > - Make find_ro static and local to compilation unit > - 8261242: [Linux] OSContainer::is_containerized() returns true Could not we rename `is_containerized()` to `use_container_limit()` ? As that is the current only purpose of `is_containerized()`. I did not test it but I expect the values will be: | your patch | my trivial patch | Actual deployment scenario | |||| | `true` | `false` |OpenJDK runs in an unprivileged container **without** a cpu/memory limit | | `true` | `true` | OpenJDK runs in an unprivileged container **with** a cpu/memory limit | | `false` | `false` | OpenJDK runs in a privileged container **without** a cpu/memory limit | | `true` | `true` | OpenJDK runs in a privileged container **with** a cpu/memory limit | | `false` | `false` | OpenJDK runs in a systemd slice **without** a cpu/memory limit | | `true` | `true` | OpenJDK runs in a systemd slice **with** a cpu/memory limit | | `false` | `false` | OpenJDK runs on a physical Linux system (VM or bare metal) | - PR Comment: https://git.openjdk.org/jdk/pull/18201#issuecomment-2063868908
Re: RFR: 8261242: [Linux] OSContainer::is_containerized() returns true when run outside a container [v2]
On Wed, 17 Apr 2024 01:07:04 GMT, Jan Kratochvil wrote: >>> IMHO `is_containerized()` is OK to return `false` even when running in a >>> container but with no limitations set. >> >> The idea here is to use this property to tune OpenJDK for in-container, >> specifically k8s, use. In such a setup it's custom to run a single process >> within set resource constraints. In order to do this, we need a reliable way >> to distinguish that vs. non-containerized setup. If somebody really wants to >> run OpenJDK in a container expecting it to run like a physical OpenJDK >> deployment, that's when `-XX:-UseContainerSupport` should be used. > >> The idea here is to use this property to tune OpenJDK for in-container, >> specifically k8s, use. In such a setup it's custom to run a single process >> within set resource constraints. > > The in-container tuning means to use all the available resources. Containers > in the real world have some memory limits set which is where my modified > patch still correctly identifies it as a container to use all the available > resources of the node which is the whole goal of the container detection code. > >> In order to do this, we need a reliable way to distinguish that vs. >> non-containerized setup. > > I expect it should have been written "We need a reliable way to distinguish > real world in-container vs. non-containerized setup. We do not mind behavior > for artificial containers on OpenJDK development machines.". Which is what my > patch does in an easier and less error-prone way. > >> If somebody really wants to run OpenJDK in a container expecting it to run >> like a physical OpenJDK deployment, that's when `-XX:-UseContainerSupport` >> should be used. > > That behaves still the same with my patch. > > Could you give a countercase where my patch behaves wrongly? @jankratochvil I believe this boils down to what we actually want. Should `OSContainer::is_containerized()` return `false` when run *inside* a container? If so, when is it OK to do that? Should `OSContainer::is_containerized()` return `true` on a physical Linux deployment? IMO, the read-only property of the mount points was something that fit naturally since, we already scan those anyway for (cgv1 vs cgv2 detection). Therefore it was something to consider to make heuristics more accurate. The truth table of the patch in this PR looks like this: | `OSContainer::is_containerized()` value | Actual deployment scenario | | - | - | | `true` | OpenJDK runs in an unprivileged container **without** a cpu/memory limit | | `true` | OpenJDK runs in an unprivileged container **with** a cpu/memory limit | | `true` | OpenJDK runs in a privileged container **with** a cpu/memory limit | | `false` | OpenJDK runs in a privileged container **without** a cpu/memory limit | | `false` | OpenJDK runs in a systemd slice **without** a cpu/memory limit | | `true` | OpenJDK runs in a systemd slice **with** a cpu/memory limit | | `false` | OpenJDK runs on a physical Linux system (VM or bare metal) | As you can see, the case of "OpenJDK runs in a privileged container *without* a cpu/memory limit" gives the wrong result. However, I consider this a fairly uncommon setup and there isn't really anything we can do to detect this kind of setup. Even if we did manage to detect it, why would we care? It's a question of probability. > Could you give a countercase where my patch behaves wrongly? Again, it's not a case of right or wrong IMO. Since we are in the land of heuristics, they will be wrong in some cases. We should make them so that we cover the common cases and, perhaps, are able to use those in serviceability tools to help guide diagnosis and/or further tuning. So far the existing capabilities were OK, but prevent further out-of-the-box tuning and/or accurate data collection. Your proposed patch (it's one I had in a previous iteration too, fwiw) would also return `false` for the case of "OpenJDK runs in an unprivileged container **without** a cpu/memory limit", which seems counterintuitive if OpenJDK actually runs in a container! What's more, it seems a fairly common case. Also, there is a chance of the OpenJDK heuristics to fail memory/cpu limit detection because of bugs and new developments. It seems the safer option to know that OpenJDK is containerized (using other heuristics) in that case. Maybe that's just me. Let's have that discussion more broadly and see if we can reach consensus! - PR Comment: https://git.openjdk.org/jdk/pull/18201#issuecomment-2063477204
Re: RFR: 8261242: [Linux] OSContainer::is_containerized() returns true when run outside a container [v2]
On Tue, 16 Apr 2024 15:17:33 GMT, Severin Gehwolf wrote: > The idea here is to use this property to tune OpenJDK for in-container, > specifically k8s, use. In such a setup it's custom to run a single process > within set resource constraints. The in-container tuning means to use all the available resources. Containers in the real world have some memory limits set which is where my modified patch still correctly identifies it as a container to use all the available resources of the node which is the whole goal of the container detection code. > In order to do this, we need a reliable way to distinguish that vs. > non-containerized setup. I expect it should have been written "We need a reliable way to distinguish real world in-container vs. non-containerized setup. We do not mind behavior for artificial containers on OpenJDK development machines.". Which is what my patch does in an easier and less error-prone way. > If somebody really wants to run OpenJDK in a container expecting it to run > like a physical OpenJDK deployment, that's when `-XX:-UseContainerSupport` > should be used. That behaves still the same with my patch. Could you give a countercase where my patch behaves wrongly? - PR Comment: https://git.openjdk.org/jdk/pull/18201#issuecomment-2060158409
Re: RFR: 8261242: [Linux] OSContainer::is_containerized() returns true when run outside a container [v2]
On Sat, 13 Apr 2024 18:29:59 GMT, Thomas Stuefe wrote: >> Severin Gehwolf has updated the pull request with a new target base due to a >> merge or a rebase. The incremental webrev excludes the unrelated changes >> brought in by the merge/rebase. The pull request contains ten additional >> commits since the last revision: >> >> - Merge branch 'master' into jdk-8261242-is-containerized-fix >> - jcheck fixes >> - Fix tests >> - Implement Metrics.isContainerized() >> - Some clean-up >> - Drop cgroups testing on plain Linux >> - Implement fall-back logic for non-ro controller mounts >> - Make find_ro static and local to compilation unit >> - 8261242: [Linux] OSContainer::is_containerized() returns true > > src/hotspot/os/linux/cgroupSubsystem_linux.cpp line 170: > >> 168: } >> 169: } >> 170: return false; > > An alternative, simpler, no need for modifying source string: > > static bool find_ro_opt(const char* o) { > return strcmp(o, "ro") || strstr(o, ",ro") || strstr(o, "ro,"); > } Please disregard my comment. Albeit longer, your version is clearer to read and more fault tolerant. > src/hotspot/os/linux/cgroupSubsystem_linux.cpp line 351: > >> 349: // >> 350: // We collect the read only mount option in the cgroup infos so as >> to have that >> 351: // info ready when determining is_containerized(). > > Here, and in other places: a comment indicating the line format we scan would > be appreciated, possibly with argument numbers. Saves the casual code reader > from looking into proc man page. Even just pasting the example line for proc > manpage would be fine (https://man7.org/linux/man-pages/man5/proc.5.html) > (but with order adapted to your scanf call, they count major:minor as one) Trying to parse the `%s%*[^-]-` So, %s parses the mount options, until we encounter whitespace. Then %*[^-]- parses everything that is not a dash, until we encounter the dash? Then we eat the dash? This is to skip the optionals? - PR Review Comment: https://git.openjdk.org/jdk/pull/18201#discussion_r1567754861 PR Review Comment: https://git.openjdk.org/jdk/pull/18201#discussion_r1567767209
Re: RFR: 8261242: [Linux] OSContainer::is_containerized() returns true when run outside a container [v2]
On Thu, 11 Apr 2024 12:08:02 GMT, Severin Gehwolf wrote: >> Please review this enhancement to the container detection code which allows >> it to figure out whether the JVM is actually running inside a container >> (`podman`, `docker`, `crio`), or with some other means that enforces >> memory/cpu limits by means of the cgroup filesystem. If neither of those >> conditions hold, the JVM runs in not containerized mode, addressing the >> issue described in the JBS tracker. For example, on my Linux system >> `is_containerized() == false" is being indicated with the following trace >> log line: >> >> >> [0.001s][debug][os,container] OSContainer::init: is_containerized() = false >> because no cpu or memory limit is present >> >> >> This state is being exposed by the Java `Metrics` API class using the new >> (still JDK internal) `isContainerized()` method. Example: >> >> >> java -XshowSettings:system --version >> Operating System Metrics: >> Provider: cgroupv1 >> System not containerized. >> openjdk 23-internal 2024-09-17 >> OpenJDK Runtime Environment (fastdebug build >> 23-internal-adhoc.sgehwolf.jdk-jdk) >> OpenJDK 64-Bit Server VM (fastdebug build >> 23-internal-adhoc.sgehwolf.jdk-jdk, mixed mode, sharing) >> >> >> The basic property this is being built on is the observation that the cgroup >> controllers typically get mounted read only into containers. Note that the >> current container tests assert that `OSContainer::is_containerized() == >> true` in various tests. Therefore, using the heuristic of "is any memory or >> cpu limit present" isn't sufficient. I had considered that in an earlier >> iteration, but many container tests failed. >> >> Overall, I think, with this patch we improve the current situation of >> claiming a containerized system being present when it's actually just a >> regular Linux system. >> >> Testing: >> >> - [x] GHA (risc-v failure seems infra related) >> - [x] Container tests on Linux x86_64 of cgroups v1 and cgroups v2 >> (including gtests) >> - [x] Some manual testing using cri-o >> >> Thoughts? > > Severin Gehwolf has updated the pull request with a new target base due to a > merge or a rebase. The incremental webrev excludes the unrelated changes > brought in by the merge/rebase. The pull request contains ten additional > commits since the last revision: > > - Merge branch 'master' into jdk-8261242-is-containerized-fix > - jcheck fixes > - Fix tests > - Implement Metrics.isContainerized() > - Some clean-up > - Drop cgroups testing on plain Linux > - Implement fall-back logic for non-ro controller mounts > - Make find_ro static and local to compilation unit > - 8261242: [Linux] OSContainer::is_containerized() returns true I am not enough of a container expert to judge if the basic approach is right - I trust you on this. This is just a technical code review. src/hotspot/os/linux/cgroupSubsystem_linux.cpp line 170: > 168: } > 169: } > 170: return false; An alternative, simpler, no need for modifying source string: static bool find_ro_opt(const char* o) { return strcmp(o, "ro") || strstr(o, ",ro") || strstr(o, "ro,"); } src/hotspot/os/linux/cgroupSubsystem_linux.cpp line 351: > 349: // > 350: // We collect the read only mount option in the cgroup infos so as > to have that > 351: // info ready when determining is_containerized(). Here, and in other places: a comment indicating the line format we scan would be appreciated, possibly with argument numbers. Saves the casual code reader from looking into proc man page. Even just pasting the example line for proc manpage would be fine (https://man7.org/linux/man-pages/man5/proc.5.html) (but with order adapted to your scanf call, they count major:minor as one) src/hotspot/os/linux/osContainer_linux.cpp line 78: > 76: const char *reason; > 77: bool any_mem_cpu_limit_present = false; > 78: bool ctrl_ro = cgroup_subsystem->is_containerized(); nit: naming? what does ctrl mean in this case? Maybe use "cgroup_is_containerized"? src/java.base/share/classes/sun/launcher/LauncherHelper.java line 375: > 373: if (!c.isContainerized()) { > 374: ostream.println(INDENT + "System not containerized."); > 375: return; Why return here? Would this not cut the output short in the non-containerized case? And if this not intended, the not-containerized-`-XshowSettings:system` test below should test and catch this (e.g. scan for CPU set) - PR Review: https://git.openjdk.org/jdk/pull/18201#pullrequestreview-1999328503 PR Review Comment: https://git.openjdk.org/jdk/pull/18201#discussion_r1564182879 PR Review Comment: https://git.openjdk.org/jdk/pull/18201#discussion_r1567756663 PR Review Comment: https://git.openjdk.org/jdk/pull/18201#discussion_r1567774124 PR Review Comment: https://git.openjdk.org/jdk/pull/18201#discussion_r1567779248
Re: RFR: 8261242: [Linux] OSContainer::is_containerized() returns true when run outside a container [v2]
On Tue, 16 Apr 2024 14:40:46 GMT, Jan Kratochvil wrote: > IMHO `is_containerized()` is OK to return `false` even when running in a > container but with no limitations set. The idea here is to use this property to tune OpenJDK for in-container, specifically k8s, use. In such a setup it's custom to run a single process within set resource constraints. In order to do this, we need a reliable way to distinguish that vs. non-containerized setup. If somebody really wants to run OpenJDK in a container expecting it to run like a physical OpenJDK deployment, that's when `-XX:-UseContainerSupport` should be used. - PR Comment: https://git.openjdk.org/jdk/pull/18201#issuecomment-2059344194
Re: RFR: 8261242: [Linux] OSContainer::is_containerized() returns true when run outside a container [v2]
On Thu, 11 Apr 2024 12:08:02 GMT, Severin Gehwolf wrote: >> Please review this enhancement to the container detection code which allows >> it to figure out whether the JVM is actually running inside a container >> (`podman`, `docker`, `crio`), or with some other means that enforces >> memory/cpu limits by means of the cgroup filesystem. If neither of those >> conditions hold, the JVM runs in not containerized mode, addressing the >> issue described in the JBS tracker. For example, on my Linux system >> `is_containerized() == false" is being indicated with the following trace >> log line: >> >> >> [0.001s][debug][os,container] OSContainer::init: is_containerized() = false >> because no cpu or memory limit is present >> >> >> This state is being exposed by the Java `Metrics` API class using the new >> (still JDK internal) `isContainerized()` method. Example: >> >> >> java -XshowSettings:system --version >> Operating System Metrics: >> Provider: cgroupv1 >> System not containerized. >> openjdk 23-internal 2024-09-17 >> OpenJDK Runtime Environment (fastdebug build >> 23-internal-adhoc.sgehwolf.jdk-jdk) >> OpenJDK 64-Bit Server VM (fastdebug build >> 23-internal-adhoc.sgehwolf.jdk-jdk, mixed mode, sharing) >> >> >> The basic property this is being built on is the observation that the cgroup >> controllers typically get mounted read only into containers. Note that the >> current container tests assert that `OSContainer::is_containerized() == >> true` in various tests. Therefore, using the heuristic of "is any memory or >> cpu limit present" isn't sufficient. I had considered that in an earlier >> iteration, but many container tests failed. >> >> Overall, I think, with this patch we improve the current situation of >> claiming a containerized system being present when it's actually just a >> regular Linux system. >> >> Testing: >> >> - [x] GHA (risc-v failure seems infra related) >> - [x] Container tests on Linux x86_64 of cgroups v1 and cgroups v2 >> (including gtests) >> - [x] Some manual testing using cri-o >> >> Thoughts? > > Severin Gehwolf has updated the pull request with a new target base due to a > merge or a rebase. The incremental webrev excludes the unrelated changes > brought in by the merge/rebase. The pull request contains ten additional > commits since the last revision: > > - Merge branch 'master' into jdk-8261242-is-containerized-fix > - jcheck fixes > - Fix tests > - Implement Metrics.isContainerized() > - Some clean-up > - Drop cgroups testing on plain Linux > - Implement fall-back logic for non-ro controller mounts > - Make find_ro static and local to compilation unit > - 8261242: [Linux] OSContainer::is_containerized() returns true IMHO `is_containerized()` is OK to return `false` even when running in a container but with no limitations set. Container detection is IIUC/AFAIK being used to maximize resource usage by OpenJDK. But if OpenJDK runs in a container with the same limits as the hardware box OpenJDK should still use reduced resources as it is sharing them with other processes on the hardware box. [is-containerized.patch.txt](https://github.com/openjdk/jdk/files/14998503/is-containerized.patch.txt) - PR Comment: https://git.openjdk.org/jdk/pull/18201#issuecomment-2059261807
Re: RFR: 8261242: [Linux] OSContainer::is_containerized() returns true when run outside a container [v2]
> Please review this enhancement to the container detection code which allows > it to figure out whether the JVM is actually running inside a container > (`podman`, `docker`, `crio`), or with some other means that enforces > memory/cpu limits by means of the cgroup filesystem. If neither of those > conditions hold, the JVM runs in not containerized mode, addressing the issue > described in the JBS tracker. For example, on my Linux system > `is_containerized() == false" is being indicated with the following trace log > line: > > > [0.001s][debug][os,container] OSContainer::init: is_containerized() = false > because no cpu or memory limit is present > > > This state is being exposed by the Java `Metrics` API class using the new > (still JDK internal) `isContainerized()` method. Example: > > > java -XshowSettings:system --version > Operating System Metrics: > Provider: cgroupv1 > System not containerized. > openjdk 23-internal 2024-09-17 > OpenJDK Runtime Environment (fastdebug build > 23-internal-adhoc.sgehwolf.jdk-jdk) > OpenJDK 64-Bit Server VM (fastdebug build 23-internal-adhoc.sgehwolf.jdk-jdk, > mixed mode, sharing) > > > The basic property this is being built on is the observation that the cgroup > controllers typically get mounted read only into containers. Note that the > current container tests assert that `OSContainer::is_containerized() == true` > in various tests. Therefore, using the heuristic of "is any memory or cpu > limit present" isn't sufficient. I had considered that in an earlier > iteration, but many container tests failed. > > Overall, I think, with this patch we improve the current situation of > claiming a containerized system being present when it's actually just a > regular Linux system. > > Testing: > > - [x] GHA (risc-v failure seems infra related) > - [x] Container tests on Linux x86_64 of cgroups v1 and cgroups v2 (including > gtests) > - [x] Some manual testing using cri-o > > Thoughts? Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains ten additional commits since the last revision: - Merge branch 'master' into jdk-8261242-is-containerized-fix - jcheck fixes - Fix tests - Implement Metrics.isContainerized() - Some clean-up - Drop cgroups testing on plain Linux - Implement fall-back logic for non-ro controller mounts - Make find_ro static and local to compilation unit - 8261242: [Linux] OSContainer::is_containerized() returns true - Changes: - all: https://git.openjdk.org/jdk/pull/18201/files - new: https://git.openjdk.org/jdk/pull/18201/files/98325f18..0df26ebd Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk=18201=01 - incr: https://webrevs.openjdk.org/?repo=jdk=18201=00-01 Stats: 407791 lines in 3887 files changed: 43423 ins; 33650 del; 330718 mod Patch: https://git.openjdk.org/jdk/pull/18201.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18201/head:pull/18201 PR: https://git.openjdk.org/jdk/pull/18201