On Fri, 21 May 2021 16:11:49 GMT, Severin Gehwolf <sgehw...@openjdk.org> wrote:
>>> @tanghaoth90 My local testing suggests that your fix addresses the issue of >>> CPU quotas set via `--cpu-quota/--cpu-period`. When using `--cpu-shares` >>> the CPU load calculation is wrong since it will (wrongly) report host >>> values. Lets look at them individually, fix the quota and shares case >>> individually (i.e. when not both are set). Once that's done, quota will be >>> preferred in the OperatingSystemMXBean impl, which is reasonable. I don't >>> think we need to account for the shares-preferred-over-quota at this point >>> since that changed in HotSpot in JDK 11 time-frame (JDK-8197589) and >>> OperatingSystemMXBean has only been made container aware in JDK 14 (yes, it >>> got backported, but still). >> >> @jerboaa I have updated my fix by inserting a new branch for `--cpu-shares`. >> By setting up containers with different restrictions based on your >> suggestion, I noticed two problems for the CPU load calculation. Take >> [TestLoad.java](https://bugs.openjdk.java.net/secure/attachment/94530/TestLoad.java) >> as the example. >> >> 1. `--cpu-quota=700000 --cpu-period=100000` as the restriction: the result >> is getting close to `6/7` slowly as time goes by, which indicates that the >> result of >> >> long usageNanos = containerMetrics.getCpuUsage(); >> >> is an accumulated CPU usage rather than a real-time CPU usage. >> According to the javadoc, `getCpuLoad()` _returns the "recent cpu usage"_. >> The current fix obviously does not address this issue. >> >> 2. `--cpu-shares=2048` as the restriction: the "cpu-share" branch only >> returns `-1` since `containerMetrics.getCpuNumPeriods()` returns `0` >> (`nr_periods` of `cpu.stat` stays `0` when only `--cpu-shares` is set). By >> now, I have no idea to compute the elapsed time or the total available CPU >> time with the help of the methods of `jdk.internal.platform.CgroupMetrics`. >> >> Any suggestion is appreciated. @jerboaa @argha-c > > @tanghaoth90 > > As to your observation of 1). Yes, that's true and I'm not sure it's worth > changing. We could change the javadoc. > > As for 2) I've created a PR for your branch with a suggested implementation > for cpu shares based accounting. It uses a similar heuristics than the > getCpuLoad0() uses here: > https://github.com/openjdk/jdk/blob/master/src/jdk.management/linux/native/libmanagement_ext/UnixOperatingSystem.c#L278 > > That's the best I could think of. It's certainly more accurate than the > current implementation. @jerboaa Just read your comment about my observation (due to the weird format of github pull request). Sorry for making further [change](https://github.com/openjdk/jdk/pull/3656/commits/eba3bc10b4ebd) for the `quota` branch without asking for your suggestion. For now, I still hold a different opinion about the `quota` branch. My local test shows that podman run --cpu-shares=4096 <image> podman run --cpu-quota=400000 --cpu-period=100000 <image> yields the same result after the change, otherwise the second one still prints the average cpu load. ------------- PR: https://git.openjdk.java.net/jdk/pull/3656