On Thu, 6 May 2021 02:34:01 GMT, Hao Tang 
<github.com+7947546+tanghaot...@openjdk.org> wrote:

>> Thanks for linking that. It sounds reasonable to me to prefer `quota` in 
>> that case.
>
>> Thanks for linking that. It sounds reasonable to me to prefer `quota` in 
>> that case.
> 
> Yes, flag `PreferContainerQuotaForCPUCount` is [true by 
> default](https://github.com/openjdk/jdk/blob/739769c8/src/hotspot/os/linux/globals_linux.hpp#L62).
>  Therefore, [my current 
> implementation](https://github.com/openjdk/jdk/pull/3656/commits/b052b624c84) 
> might be a minimal implementation.
> 
> We can also cover the case where `PreferContainerQuotaForCPUCount` is false. 
> There are two different ways.
> 1. To access the value of `PreferContainerQuotaForCPUCount`, so that we can 
> decide if we should use `quota` or (`quota` &  `share`);
> 2. Reuse `CgroupSubsystem::active_processor_count`. However, the function 
> returns an integer. It is more reasonable to use a floating number.
> 
> Looking forward to your suggestion.

> @tanghaoth90 My local testing suggests that your fix addresses the issue of 
> CPU quotas set via `--cpu-quota/--cpu-period`. When using `--cpu-shares` the 
> CPU load calculation is wrong since it will (wrongly) report host values. 
> Lets look at them individually, fix the quota and shares case individually 
> (i.e. when not both are set). Once that's done, quota will be preferred in 
> the OperatingSystemMXBean impl, which is reasonable. I don't think we need to 
> account for the shares-preferred-over-quota at this point since that changed 
> in HotSpot in JDK 11 time-frame (JDK-8197589) and OperatingSystemMXBean has 
> only been made container aware in JDK 14 (yes, it got backported, but still).

@jerboaa I have updated my fix by inserting a new branch for `--cpu-shares`. By 
setting up containers with different restrictions based on your suggestion, I 
noticed two problems for the CPU load calculation. Take 
[TestLoad.java](https://bugs.openjdk.java.net/secure/attachment/94530/TestLoad.java)
 as the example.

1. `--cpu-quota=700000 --cpu-period=100000` as the restriction: the result is 
getting close to `6/7` slowly as time goes by, which indicates that the result 
of 

long usageNanos = containerMetrics.getCpuUsage();

is an accumulated CPU usage rather than a real-time CPU usage.
According to the javadoc, `getCpuLoad()` _returns the "recent cpu usage"_. The 
current fix obviously does not address this issue.

2. `--cpu-shares=2048` as the restriction: the "cpu-share" branch only returns 
`-1` since `containerMetrics.getCpuNumPeriods()` returns `0` (`nr_periods` of 
`cpu.stat` stays `0` when only `--cpu-shares` is set). By now, I have no idea 
to compute the elapsed time or the total available CPU time with the help of 
the methods of `jdk.internal.platform.CgroupMetrics`.

Any suggestion is appreciated. @jerboaa @argha-c

-------------

PR: https://git.openjdk.java.net/jdk/pull/3656

Reply via email to