On 4/05/2021 9:07 am, Argha C wrote:
> Hello David,
> It does look quite similar.
> The crux is factoring in the sum of cpu time across all shares, instead
> of a single cpu share.
> My proposal for implementing the fix is slightly different from the PR,
> but I'm happy to trust your advice on taking this forward.
I would suggest liaising with the contributors of
https://github.com/openjdk/jdk/pull/3656
to see if all of the issues can be addressed together.
Cheers,
David
On Mon, May 3, 2021 at 3:21 PM David Holmes <david.hol...@oracle.com> wrote:
Hi,
Is this related to the issue reported here:
https://bugs.openjdk.java.net/browse/JDK-8265836
Cheers,
David
On 4/05/2021 3:14 am, Argha C wrote:
> Hello,
> I wanted to report an issue we're seeing with the load calculation when
> running with cpu shares > 1 in a container environment. Specifically,
> the implementation of /OperatingSystemImpl#getCpuLoad/.
>
> /Problem/
>
> When running with an allocation of multiple cpu shares, i.e. > 1 unit, the
> load numbers do not stay within the expected range of 0-1. In the
> example screenshot, the load goes beyond 4.
> This miscalculation throws off load-based system heuristics when
> running in a container-based environment.
>
> /Proposed solution/
>
> In a container-aware environment, for the load calculation, the
> number of cpu cycles, i.e. /getCpuPeriod/, must be multiplied by the
> number of cpu shares requested by the process, i.e. /getCpuShares/.
> This would ensure that the load calculation uses the correct denominator
> for elapsed time slice periods.
>
> In the screenshot below, this would mean using /getCpuShares/ as a
> multiplier for /periodLength/.
>
> Please consider validating this behavior. I'd be happy to submit a PR,
> but I'm not an OpenJDK author/contributor.
> Thanks for your consideration.
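
For illustration, the proposed change to the denominator can be sketched in Java as follows. This is a minimal sketch, not the JDK's implementation: the class, method, and parameter names are hypothetical, and it assumes that "cpu shares" means the number of whole CPU units allotted to the container, as the original message describes. It compares a load computed against a single period length with one whose denominator is scaled by the CPU units.

```java
// Hypothetical sketch of the proposed denominator fix; not JDK code.
class ContainerLoadSketch {

    // Load computed against one CPU's worth of time per period.
    // With several CPU units in use this can exceed 1.0.
    static double unscaledLoad(long usageDeltaNanos, long elapsedPeriods,
                               long periodLengthNanos) {
        double capacity = (double) elapsedPeriods * periodLengthNanos;
        return usageDeltaNanos / capacity;
    }

    // Load with the period length multiplied by the allotted CPU units
    // (assumption: cpuUnits plays the role of the multiplier in the proposal).
    // The result is clamped to the expected 0-1 range.
    static double scaledLoad(long usageDeltaNanos, long elapsedPeriods,
                             long periodLengthNanos, int cpuUnits) {
        if (elapsedPeriods <= 0 || periodLengthNanos <= 0 || cpuUnits <= 0) {
            return -1.0; // signal "not available" for invalid inputs
        }
        double capacity = (double) elapsedPeriods * periodLengthNanos * cpuUnits;
        double load = usageDeltaNanos / capacity;
        return Math.max(0.0, Math.min(1.0, load));
    }

    public static void main(String[] args) {
        long usage = 2_000_000_000L;   // 2 CPU-seconds consumed
        long periods = 10;             // 10 elapsed periods
        long periodLen = 100_000_000L; // 100 ms per period, i.e. 1 s wall time
        int cpuUnits = 4;              // container allotted 4 CPU units

        System.out.println(unscaledLoad(usage, periods, periodLen));         // 2.0
        System.out.println(scaledLoad(usage, periods, periodLen, cpuUnits)); // 0.5
    }
}
```

With these made-up numbers, the unscaled value leaves the 0-1 range in the way the report describes, while the scaled value stays within it.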