Severin, The changes look ok to me. I assume you’ve run the container tests on a cgroupv2 and cgroupv1 enabled system, right?
You’re going to have to find a “R” reviewer. What’s the status of the hotspot cgroupv2 changes. Have these been reviewed? Bob. > On Jan 9, 2020, at 2:51 PM, Severin Gehwolf <sgehw...@redhat.com> wrote: > > Hi, > > On Fri, 2020-01-03 at 15:37 -0500, Bob Vandette wrote: >> Here are a few comments from scanning the webrev. >> >> >> It looks like you can remove line 151 >> >> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06-incremental/webrev/src/java.base/linux/classes/jdk/internal/platform/CgroupSubsystemController.java.sdiff.html >> >> 151 int[] ints = new int[0]; >> >> ————— >> >> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06-incremental/webrev/src/java.base/share/classes/jdk/internal/platform/Metrics.java.sdiff.html >> >> There are several comments stating that -2 == unlimited. This is not >> the case. >> >> @return count of elapsed periods or -2 if the quota is unlimited. >> >> ————— >> >> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06/webrev/test/jdk/jdk/internal/platform/docker/TestDockerMemoryMetrics.java.sdiff.html >> >> Shouldn’t you just check for cgroupv2 before trying to run the >> testKernelMemoryLimit and testOomKillFlag tests? >> >> ————— >> >> There are a few places where getPerCpuUsage is returning “new >> long[0]” but you changed the interface to expect null >> for not supported. >> >> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06/webrev/src/java.base/linux/classes/jdk/internal/platform/cgroupv2/CgroupV2Subsystem.java.html >> >> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06/webrev/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/CgroupV1Subsystem.java.html >> >> You probably need to check all users of the APIs which used to return >> array[0] which now return null to make sure you >> don’t get null pointer exceptions. >> >> One example is testCpuConsumption. >> >> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06/webrev/test/lib/jdk/test/lib/containers/cgroup/MetricsTesterCgroupV1.java.html >> >> Also, >> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06/webrev/test/lib/jdk/test/lib/containers/cgroup/MetricsTesterCgroupV2.java.html:167 >> >> >> You’ll also have to update copyrights to 2020. > > Thanks for the review! Should all be fixed now. Updated webrev: > > incremental: > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/07/incremental/webrev/ > full: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/07/webrev/ > > Note: I'll go through touched files and update copyright dates to 2020. > Not all have been updated in the full webrev. It'll be done though. > > Thanks, > Severin > >> Bob. >> >> >>> On Dec 20, 2019, at 9:50 AM, Severin Gehwolf <sgehw...@redhat.com> >>> wrote: >>> >>> Hi Bob, >>> >>> Sorry for the delay to get this updated. >>> >>> Changes done in this latest webrev: >>> >>> 1. Rebased on top of 8226575: OperatingSystemMXBean should be >>> made >>> container aware. File read ops now use privileged code. >>> 2. Warning for mixed cgroup controllers and returning null for >>> metrics. >>> 3. Added implementations for getBlkIOServiceCount() and >>> getBlkIOServiced() using sum of rios/wios and rbytes/wbytes >>> across >>> devices. Added impl for getTcpMemoryUsage() >>> 4. Returning -2 for not supported (if the metric would return >>> long). >>> boolean metrics now return Boolean and null if not supported. >>> Same >>> for array return types. null if not supported for symmetry. >>> 5. -XshowSettings:system output has been updated to return "N/A" >>> for >>> when a metric is not available. E.g. "Kernel OOM killer >>> enabled" >>> Boolean. >>> 6. Tests have been updated to reflect this. >>> >>> webrev: >>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06/webrev/ >>> incremental webrev: >>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06-incremental/webrev/ >>> >>> Testing: Docker tests and podman testing on cgroupsv2. I'll run it >>> through jdk/submit as well. >>> >>> Hopefully this can get pushed with 8230305 early on in the jdk 15 >>> cycle >>> :) >>> >>> A couple of more inline comments below. >>> >>> On Mon, 2019-12-02 at 12:13 -0500, Bob Vandette wrote: >>>> Sorry for the delay in responding. See comments below: >>>> >>>>> On Nov 29, 2019, at 4:05 AM, Severin Gehwolf < >>>>> sgehw...@redhat.com> wrote: >>>>> >>>>> On Fri, 2019-11-15 at 17:51 +0100, Severin Gehwolf wrote: >>>>>> Hi, >>>>>> >>>>>> Could I please get reviews of these core-libs, Linux-only, >>>>>> changes to >>>>>> the Metrics subsystem? This patch implements cgroupv2 support >>>>>> for >>>>>> Metrics which are currently cgroupv1-only. Fedora 31 switched >>>>>> to >>>>>> cgroupv2 by default so it's time to get OpenJDK recognize it. >>>>>> >>>>>> Note that a couple of metrics are no longer supported with >>>>>> cgroupv2. >>>>>> Most notably (not an exhaustive list, though): >>>>>> >>>>>> Metrics.getKernel*() family of methods. >>>>>> Metrics.getTcp*() family of methods. >>>>>> Metrics.getBlkIO*() family of methods. >>>>>> Metrics.isMemoryOOMKillEnabled() >>>>>> >>>>>> A couple of open questions with regards to that: >>>>>> >>>>>> 1) >>>>>> Most API docs of Metrics make no distiction between >>>>>> "unlimited" and >>>>>> "not supported", both returning -1 for longs, for example. >>>>>> This is a >>>>>> problem, because output of "java -XshowSettings:system >>>>>> -version" will >>>>>> not distinguish between unlimited and not supported metrics. >>>>>> Would it >>>>>> be acceptable to change the API to distinguish those cases so >>>>>> that >>>>>> LauncherHelper could display them appropriately? >>>>>> >>>>>> 2) >>>>>> How should we deal with "not supported" for booleans/arrays, >>>>>> etc.? >>>>>> Would it make sense to return record objects from the Metrics >>>>>> API so >>>>>> that this could be dealt with? E.g. >>>>>> >>>>>> Metrics m = ... >>>>>> MetricResult<int[]> result = m.getCpuSetCpus(); >>>>>> switch(result.getType()) { >>>>>> case NOT_SUPPORTED: /* do something */; break; >>>>>> case SUPPORTED: int[] val = result.get(); break; >>>>>> ... >>>>>> } >>>>>> >>>>>> I'm bringing this up, because the proposed patch doesn't deal >>>>>> with the >>>>>> above open questions as of yet. With that being said, it's >>>>>> mostly >>>>>> identical to the proposed hotspot changes in [1]. >>>> >>>> For issue 1 and 2 … >>>> >>>> I wonder if we should change the API to throw >>>> UnsupportedOperationException for those methods >>>> that are not available. This exception is used quite a lot in >>>> the java/nio and java/net packages >>>> for dealing with functionality not available on a host platform. >>>> >>>> As an alternate suggestion, we already return -2 for "not >>>> supported" for most APIs in the hotspot >>>> implementation. We could use this for non boolean values in the >>>> Metrics API. For boolean >>>> values, we could change the API to return “Boolean”. A null >>>> return would signify not >>>> implemented. >>> >>> This alternative has been implemented in the latest webrev. >>> LauncherHelper has been updated to print "N/A" if some property >>> being >>> printed is not supported. Example output for cgroupsv2: >>> >>> $ ./bin/java -XshowSettings:system -version >>> Operating System Metrics: >>> Provider: cgroupv2 >>> Effective CPU Count: 4 >>> CPU Period: 100000us >>> CPU Quota: -1 >>> CPU Shares: -1 >>> List of Processors: N/A >>> List of Effective Processors: N/A >>> List of Memory Nodes: N/A >>> List of Available Memory Nodes: N/A >>> CPUSet Memory Pressure Enabled: N/A >>> Memory Limit: Unlimited >>> Memory Soft Limit: Unlimited >>> Memory & Swap Limit: Unlimited >>> Kernel Memory Limit: N/A >>> TCP Memory Limit: N/A >>> Out Of Memory Killer Enabled: N/A >>> >>> openjdk version "15-internal" 2020-09-15 >>> OpenJDK Runtime Environment (build 15-internal+0- >>> adhoc.sgehwolf.openjdk-head-2) >>> OpenJDK 64-Bit Server VM (build 15-internal+0- >>> adhoc.sgehwolf.openjdk-head-2, mixed mode, sharing) >>> >>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8231111 >>>>>> webrev: >>>>>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/04/webrev/ >>>>>> >>>> >>>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/04/webrev/src/java.base/linux/classes/jdk/internal/platform/CgroupSubsystemFactory.java.html >>>> >>>> Should we check to make sure that there are no mixed cgroupv1 and >>>> cgroupv2 mounted subsystems since >>>> you are not supporting mixing. >>> >>> Done. null is returned for metrics and a warning printed to stderr. >>> >>>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/04/webrev/src/java.base/linux/classes/jdk/internal/platform/cgroupv2/CgroupV2Subsystem.java.html >>>> >>>> It looks looks like there may be alternate ways of reporting some >>>> of the metrics that you have classified as RETVAL_NOT_SUPPORTED. >>>> >>>> BlkIOServicedCount (io.stat) >>>> KernelMemory (memory.stat) >>>> TcpMemory (memory.stat) >>> >>> The latest webrev has getBlkIOService* and getTcpMemoryUsage impls. >>> I've left out kernel memory metrics as it wasn't clear what this >>> would >>> be. We can add that in a later patch. The size of this patch is >>> already >>> rather big. >>> >>>> You could try the same trick for getMemoryAndSwapMaxUsage which >>>> keeps track of the highest returned value. >>> >>> We've decided to not do this. getMemoryAndSwapMaxUsage and >>> getMemoryMaxUsage is being returned as not supported in cgroups v2. >>> >>> Thanks, >>> Severin >>> >>>> for the benefit of other reviewers, you should provide links to >>>> the cgroupv1 and v2 docs. >>>> >>>> https://www.kernel.org/doc/Documentation/cgroup-v2.txt >>>> >>>>>> Testing: jdk/submit and platform docker tests on Linux x86_64 >>>>>> (with >>>>>> hybrid hierarchy, docker/podman) and on Linux x86_64 with >>>>>> unified >>>>>> hierarchy (podman only). >>>>>> >>>>>> Thoughts? Suggestions? >>>> >>>> Do you think we should check the docker version being used for >>>> the tests to make sure it >>>> supports cgroupv2? I assume a fairly recent version is required >>>> to work with cgroupv2. >>>> >>>> Bob. >>>> >>>> >>>> >>>>> Ping? >>>>> >>>>>> Thanks, >>>>>> Severin >>>>>> >>>>>> [1] >>>>>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2019-November/039909.html >