> On Jan 13, 2020, at 4:04 AM, Severin Gehwolf <sgehw...@redhat.com> wrote:
> 
> Hi Bob,
> 
> On Fri, 2020-01-10 at 11:50 -0500, Bob Vandette wrote:
>> Severin,
>> 
>> The changes look ok to me.  I assume you’ve run the container tests
>> on a cgroupv2 and cgroupv1 enabled system, right?
> 
> Yes, I have.
> 
>> You’re going to have to find a “R” reviewer.
> 
> Indeed.
> 
>> What’s the status of the hotspot cgroupv2 changes.  Have these been
>> reviewed?
> 
> Latest webvrev was v5 here:
> http://mail.openjdk.java.net/pipermail/hotspot-dev/2019-November/040120.html

It appears that cgroup v2 is getting supported in RHEL 8 and Oracle Linux 8 so 
I think we need to
make a strong case to get this work reviewed and integrated in JDK 15.

> 
> It included the memory_max_usage_in_bytes synthesized hack for cgroups
> v2, though. Should I remove it there as well? I'd be in favor of that.
> Otherwise nothing new. It's ready from my perspective other than
> finding another Reviewer for it.

Yes, I’d remove it.

Bob.

> 
> Thanks,
> Severin
> 
>> Bob.
>> 
>> 
>>> On Jan 9, 2020, at 2:51 PM, Severin Gehwolf <sgehw...@redhat.com>
>>> wrote:
>>> 
>>> Hi,
>>> 
>>> On Fri, 2020-01-03 at 15:37 -0500, Bob Vandette wrote:
>>>> Here are a few comments from scanning the webrev.
>>>> 
>>>> 
>>>> It looks like you can remove line 151
>>>> 
>>>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06-incremental/webrev/src/java.base/linux/classes/jdk/internal/platform/CgroupSubsystemController.java.sdiff.html
>>>> 
>>>> 151         int[] ints = new int[0];
>>>> 
>>>> —————
>>>> 
>>>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06-incremental/webrev/src/java.base/share/classes/jdk/internal/platform/Metrics.java.sdiff.html
>>>> 
>>>> There are several comments stating that -2 == unlimited.  This is
>>>> not
>>>> the case.
>>>> 
>>>> @return count of elapsed periods or -2 if the quota is unlimited.
>>>> 
>>>> —————
>>>> 
>>>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06/webrev/test/jdk/jdk/internal/platform/docker/TestDockerMemoryMetrics.java.sdiff.html
>>>> 
>>>> Shouldn’t you just check for cgroupv2 before trying to run the
>>>> testKernelMemoryLimit and testOomKillFlag tests?
>>>> 
>>>> —————
>>>> 
>>>> There are a few places where getPerCpuUsage is returning “new
>>>> long[0]” but you changed the interface to expect null
>>>> for not supported.
>>>> 
>>>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06/webrev/src/java.base/linux/classes/jdk/internal/platform/cgroupv2/CgroupV2Subsystem.java.html
>>>> 
>>>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06/webrev/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/CgroupV1Subsystem.java.html
>>>> 
>>>> You probably need to check all users of the APIs which used to
>>>> return
>>>> array[0] which now return null to make sure you
>>>> don’t get null pointer exceptions.
>>>> 
>>>> One example is testCpuConsumption.
>>>> 
>>>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06/webrev/test/lib/jdk/test/lib/containers/cgroup/MetricsTesterCgroupV1.java.html
>>>> 
>>>> Also, 
>>>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06/webrev/test/lib/jdk/test/lib/containers/cgroup/MetricsTesterCgroupV2.java.html:167
>>>> 
>>>> 
>>>> You’ll also have to update copyrights to 2020.
>>> 
>>> Thanks for the review! Should all be fixed now. Updated webrev:
>>> 
>>> incremental: 
>>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/07/incremental/webrev/
>>> full: 
>>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/07/webrev/
>>> 
>>> Note: I'll go through touched files and update copyright dates to
>>> 2020.
>>> Not all have been updated in the full webrev. It'll be done though.
>>> 
>>> Thanks,
>>> Severin
>>> 
>>>> Bob.
>>>> 
>>>> 
>>>>> On Dec 20, 2019, at 9:50 AM, Severin Gehwolf <
>>>>> sgehw...@redhat.com>
>>>>> wrote:
>>>>> 
>>>>> Hi Bob,
>>>>> 
>>>>> Sorry for the delay to get this updated.
>>>>> 
>>>>> Changes done in this latest webrev:
>>>>> 
>>>>> 1. Rebased on top of 8226575: OperatingSystemMXBean should be
>>>>> made
>>>>>    container aware. File read ops now use privileged code.
>>>>> 2. Warning for mixed cgroup controllers and returning null for
>>>>> metrics.
>>>>> 3. Added implementations for getBlkIOServiceCount() and
>>>>>    getBlkIOServiced() using sum of rios/wios and rbytes/wbytes
>>>>> across
>>>>>    devices. Added impl for getTcpMemoryUsage()
>>>>> 4. Returning -2 for not supported (if the metric would return
>>>>> long).
>>>>>    boolean metrics now return Boolean and null if not
>>>>> supported.
>>>>> Same
>>>>>    for array return types. null if not supported for symmetry.
>>>>> 5. -XshowSettings:system output has been updated to return
>>>>> "N/A"
>>>>> for
>>>>>    when a metric is not available. E.g. "Kernel OOM killer
>>>>> enabled"
>>>>>    Boolean.
>>>>> 6. Tests have been updated to reflect this.
>>>>> 
>>>>> webrev:
>>>>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06/webrev/
>>>>> incremental webrev:
>>>>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06-incremental/webrev/
>>>>> 
>>>>> Testing: Docker tests and podman testing on cgroupsv2. I'll run
>>>>> it
>>>>> through jdk/submit as well.
>>>>> 
>>>>> Hopefully this can get pushed with 8230305 early on in the jdk
>>>>> 15
>>>>> cycle
>>>>> :)
>>>>> 
>>>>> A couple of more inline comments below.
>>>>> 
>>>>> On Mon, 2019-12-02 at 12:13 -0500, Bob Vandette wrote:
>>>>>> Sorry for the delay in responding.  See comments below:
>>>>>> 
>>>>>>> On Nov 29, 2019, at 4:05 AM, Severin Gehwolf <
>>>>>>> sgehw...@redhat.com> wrote:
>>>>>>> 
>>>>>>> On Fri, 2019-11-15 at 17:51 +0100, Severin Gehwolf wrote:
>>>>>>>> Hi,
>>>>>>>> 
>>>>>>>> Could I please get reviews of these core-libs, Linux-
>>>>>>>> only,
>>>>>>>> changes to
>>>>>>>> the Metrics subsystem? This patch implements cgroupv2
>>>>>>>> support
>>>>>>>> for
>>>>>>>> Metrics which are currently cgroupv1-only. Fedora 31
>>>>>>>> switched
>>>>>>>> to
>>>>>>>> cgroupv2 by default so it's time to get OpenJDK recognize
>>>>>>>> it.
>>>>>>>> 
>>>>>>>> Note that a couple of metrics are no longer supported
>>>>>>>> with
>>>>>>>> cgroupv2.
>>>>>>>> Most notably (not an exhaustive list, though):
>>>>>>>> 
>>>>>>>> Metrics.getKernel*() family of methods.
>>>>>>>> Metrics.getTcp*() family of methods.
>>>>>>>> Metrics.getBlkIO*() family of methods.
>>>>>>>> Metrics.isMemoryOOMKillEnabled()
>>>>>>>> 
>>>>>>>> A couple of open questions with regards to that:
>>>>>>>> 
>>>>>>>> 1)
>>>>>>>> Most API docs of Metrics make no distiction between
>>>>>>>> "unlimited" and
>>>>>>>> "not supported", both returning -1 for longs, for
>>>>>>>> example.
>>>>>>>> This is a
>>>>>>>> problem, because output of "java -XshowSettings:system
>>>>>>>> -version" will
>>>>>>>> not distinguish between unlimited and not supported
>>>>>>>> metrics.
>>>>>>>> Would it
>>>>>>>> be acceptable to change the API to distinguish those
>>>>>>>> cases so
>>>>>>>> that
>>>>>>>> LauncherHelper could display them appropriately?
>>>>>>>> 
>>>>>>>> 2)
>>>>>>>> How should we deal with "not supported" for
>>>>>>>> booleans/arrays,
>>>>>>>> etc.?
>>>>>>>> Would it make sense to return record objects from the
>>>>>>>> Metrics
>>>>>>>> API so
>>>>>>>> that this could be dealt with? E.g.
>>>>>>>> 
>>>>>>>> Metrics m = ...
>>>>>>>> MetricResult<int[]> result = m.getCpuSetCpus();
>>>>>>>> switch(result.getType()) {
>>>>>>>> case NOT_SUPPORTED: /* do something */; break;
>>>>>>>> case SUPPORTED: int[] val = result.get(); break;
>>>>>>>> ...         
>>>>>>>> }
>>>>>>>> 
>>>>>>>> I'm bringing this up, because the proposed patch doesn't
>>>>>>>> deal
>>>>>>>> with the
>>>>>>>> above open questions as of yet. With that being said,
>>>>>>>> it's
>>>>>>>> mostly
>>>>>>>> identical to the proposed hotspot changes in [1].
>>>>>> 
>>>>>> For issue 1 and 2 …
>>>>>> 
>>>>>> I wonder if we should change the API to throw
>>>>>> UnsupportedOperationException for those methods 
>>>>>> that are not available.  This exception is used quite a lot
>>>>>> in
>>>>>> the java/nio and java/net packages
>>>>>> for dealing with functionality not available on a host
>>>>>> platform.
>>>>>> 
>>>>>> As an alternate suggestion, we already return -2 for "not
>>>>>> supported" for most APIs in the hotspot
>>>>>> implementation.  We could use this for non boolean values in
>>>>>> the
>>>>>> Metrics API.  For boolean
>>>>>> values, we could change the API to return “Boolean”.  A null
>>>>>> return would signify not
>>>>>> implemented.
>>>>> 
>>>>> This alternative has been implemented in the latest webrev.
>>>>> LauncherHelper has been updated to print "N/A" if some property
>>>>> being
>>>>> printed is not supported. Example output for cgroupsv2:
>>>>> 
>>>>> $ ./bin/java -XshowSettings:system -version
>>>>> Operating System Metrics:
>>>>>  Provider: cgroupv2
>>>>>  Effective CPU Count: 4
>>>>>  CPU Period: 100000us
>>>>>  CPU Quota: -1
>>>>>  CPU Shares: -1
>>>>>  List of Processors: N/A
>>>>>  List of Effective Processors: N/A
>>>>>  List of Memory Nodes: N/A
>>>>>  List of Available Memory Nodes: N/A
>>>>>  CPUSet Memory Pressure Enabled: N/A
>>>>>  Memory Limit: Unlimited
>>>>>  Memory Soft Limit: Unlimited
>>>>>  Memory & Swap Limit: Unlimited
>>>>>  Kernel Memory Limit: N/A
>>>>>  TCP Memory Limit: N/A
>>>>>  Out Of Memory Killer Enabled: N/A
>>>>> 
>>>>> openjdk version "15-internal" 2020-09-15
>>>>> OpenJDK Runtime Environment (build 15-internal+0-
>>>>> adhoc.sgehwolf.openjdk-head-2)
>>>>> OpenJDK 64-Bit Server VM (build 15-internal+0-
>>>>> adhoc.sgehwolf.openjdk-head-2, mixed mode, sharing)
>>>>> 
>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8231111
>>>>>>>> webrev: 
>>>>>>>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/04/webrev/
>>>>>>>> 
>>>>>> 
>>>>>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/04/webrev/src/java.base/linux/classes/jdk/internal/platform/CgroupSubsystemFactory.java.html
>>>>>> 
>>>>>> Should we check to make sure that there are no mixed cgroupv1
>>>>>> and
>>>>>> cgroupv2 mounted subsystems since
>>>>>> you are not supporting mixing.
>>>>> 
>>>>> Done. null is returned for metrics and a warning printed to
>>>>> stderr.
>>>>> 
>>>>>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/04/webrev/src/java.base/linux/classes/jdk/internal/platform/cgroupv2/CgroupV2Subsystem.java.html
>>>>>> 
>>>>>> It looks looks like there may be alternate ways of reporting
>>>>>> some
>>>>>> of the metrics that you have classified as
>>>>>> RETVAL_NOT_SUPPORTED.
>>>>>> 
>>>>>> BlkIOServicedCount (io.stat)
>>>>>> KernelMemory (memory.stat)
>>>>>> TcpMemory (memory.stat)
>>>>> 
>>>>> The latest webrev has getBlkIOService* and getTcpMemoryUsage
>>>>> impls.
>>>>> I've left out kernel memory metrics as it wasn't clear what
>>>>> this
>>>>> would
>>>>> be. We can add that in a later patch. The size of this patch is
>>>>> already
>>>>> rather big.
>>>>> 
>>>>>> You could try the same trick for getMemoryAndSwapMaxUsage
>>>>>> which
>>>>>> keeps track of the highest returned value.
>>>>> 
>>>>> We've decided to not do this. getMemoryAndSwapMaxUsage and
>>>>> getMemoryMaxUsage is being returned as not supported in cgroups
>>>>> v2.
>>>>> 
>>>>> Thanks,
>>>>> Severin
>>>>> 
>>>>>> for the benefit of other reviewers, you should provide links
>>>>>> to
>>>>>> the cgroupv1 and v2 docs.
>>>>>> 
>>>>>> https://www.kernel.org/doc/Documentation/cgroup-v2.txt
>>>>>> 
>>>>>>>> Testing: jdk/submit and platform docker tests on Linux
>>>>>>>> x86_64
>>>>>>>> (with
>>>>>>>> hybrid hierarchy, docker/podman) and on Linux x86_64 with
>>>>>>>> unified
>>>>>>>> hierarchy (podman only).
>>>>>>>> 
>>>>>>>> Thoughts? Suggestions?
>>>>>> 
>>>>>> Do you think we should check the docker version being used
>>>>>> for
>>>>>> the tests to make sure it
>>>>>> supports cgroupv2?  I assume a fairly recent version is
>>>>>> required
>>>>>> to work with cgroupv2.
>>>>>> 
>>>>>> Bob.
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> Ping?
>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> Severin
>>>>>>>> 
>>>>>>>> [1] 
>>>>>>>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2019-November/039909.html
> 

Reply via email to