Hi Bob, On Fri, 2020-01-10 at 11:50 -0500, Bob Vandette wrote: > Severin, > > The changes look ok to me. I assume you’ve run the container tests > on a cgroupv2 and cgroupv1 enabled system, right?
Yes, I have. > You’re going to have to find a “R” reviewer. Indeed. > What’s the status of the hotspot cgroupv2 changes. Have these been > reviewed? Latest webvrev was v5 here: http://mail.openjdk.java.net/pipermail/hotspot-dev/2019-November/040120.html It included the memory_max_usage_in_bytes synthesized hack for cgroups v2, though. Should I remove it there as well? I'd be in favor of that. Otherwise nothing new. It's ready from my perspective other than finding another Reviewer for it. Thanks, Severin > Bob. > > > > On Jan 9, 2020, at 2:51 PM, Severin Gehwolf <sgehw...@redhat.com> > > wrote: > > > > Hi, > > > > On Fri, 2020-01-03 at 15:37 -0500, Bob Vandette wrote: > > > Here are a few comments from scanning the webrev. > > > > > > > > > It looks like you can remove line 151 > > > > > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06-incremental/webrev/src/java.base/linux/classes/jdk/internal/platform/CgroupSubsystemController.java.sdiff.html > > > > > > 151 int[] ints = new int[0]; > > > > > > ————— > > > > > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06-incremental/webrev/src/java.base/share/classes/jdk/internal/platform/Metrics.java.sdiff.html > > > > > > There are several comments stating that -2 == unlimited. This is > > > not > > > the case. > > > > > > @return count of elapsed periods or -2 if the quota is unlimited. > > > > > > ————— > > > > > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06/webrev/test/jdk/jdk/internal/platform/docker/TestDockerMemoryMetrics.java.sdiff.html > > > > > > Shouldn’t you just check for cgroupv2 before trying to run the > > > testKernelMemoryLimit and testOomKillFlag tests? > > > > > > ————— > > > > > > There are a few places where getPerCpuUsage is returning “new > > > long[0]” but you changed the interface to expect null > > > for not supported. > > > > > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06/webrev/src/java.base/linux/classes/jdk/internal/platform/cgroupv2/CgroupV2Subsystem.java.html > > > > > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06/webrev/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/CgroupV1Subsystem.java.html > > > > > > You probably need to check all users of the APIs which used to > > > return > > > array[0] which now return null to make sure you > > > don’t get null pointer exceptions. > > > > > > One example is testCpuConsumption. > > > > > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06/webrev/test/lib/jdk/test/lib/containers/cgroup/MetricsTesterCgroupV1.java.html > > > > > > Also, > > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06/webrev/test/lib/jdk/test/lib/containers/cgroup/MetricsTesterCgroupV2.java.html:167 > > > > > > > > > You’ll also have to update copyrights to 2020. > > > > Thanks for the review! Should all be fixed now. Updated webrev: > > > > incremental: > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/07/incremental/webrev/ > > full: > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/07/webrev/ > > > > Note: I'll go through touched files and update copyright dates to > > 2020. > > Not all have been updated in the full webrev. It'll be done though. > > > > Thanks, > > Severin > > > > > Bob. > > > > > > > > > > On Dec 20, 2019, at 9:50 AM, Severin Gehwolf < > > > > sgehw...@redhat.com> > > > > wrote: > > > > > > > > Hi Bob, > > > > > > > > Sorry for the delay to get this updated. > > > > > > > > Changes done in this latest webrev: > > > > > > > > 1. Rebased on top of 8226575: OperatingSystemMXBean should be > > > > made > > > > container aware. File read ops now use privileged code. > > > > 2. Warning for mixed cgroup controllers and returning null for > > > > metrics. > > > > 3. Added implementations for getBlkIOServiceCount() and > > > > getBlkIOServiced() using sum of rios/wios and rbytes/wbytes > > > > across > > > > devices. Added impl for getTcpMemoryUsage() > > > > 4. Returning -2 for not supported (if the metric would return > > > > long). > > > > boolean metrics now return Boolean and null if not > > > > supported. > > > > Same > > > > for array return types. null if not supported for symmetry. > > > > 5. -XshowSettings:system output has been updated to return > > > > "N/A" > > > > for > > > > when a metric is not available. E.g. "Kernel OOM killer > > > > enabled" > > > > Boolean. > > > > 6. Tests have been updated to reflect this. > > > > > > > > webrev: > > > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06/webrev/ > > > > incremental webrev: > > > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06-incremental/webrev/ > > > > > > > > Testing: Docker tests and podman testing on cgroupsv2. I'll run > > > > it > > > > through jdk/submit as well. > > > > > > > > Hopefully this can get pushed with 8230305 early on in the jdk > > > > 15 > > > > cycle > > > > :) > > > > > > > > A couple of more inline comments below. > > > > > > > > On Mon, 2019-12-02 at 12:13 -0500, Bob Vandette wrote: > > > > > Sorry for the delay in responding. See comments below: > > > > > > > > > > > On Nov 29, 2019, at 4:05 AM, Severin Gehwolf < > > > > > > sgehw...@redhat.com> wrote: > > > > > > > > > > > > On Fri, 2019-11-15 at 17:51 +0100, Severin Gehwolf wrote: > > > > > > > Hi, > > > > > > > > > > > > > > Could I please get reviews of these core-libs, Linux- > > > > > > > only, > > > > > > > changes to > > > > > > > the Metrics subsystem? This patch implements cgroupv2 > > > > > > > support > > > > > > > for > > > > > > > Metrics which are currently cgroupv1-only. Fedora 31 > > > > > > > switched > > > > > > > to > > > > > > > cgroupv2 by default so it's time to get OpenJDK recognize > > > > > > > it. > > > > > > > > > > > > > > Note that a couple of metrics are no longer supported > > > > > > > with > > > > > > > cgroupv2. > > > > > > > Most notably (not an exhaustive list, though): > > > > > > > > > > > > > > Metrics.getKernel*() family of methods. > > > > > > > Metrics.getTcp*() family of methods. > > > > > > > Metrics.getBlkIO*() family of methods. > > > > > > > Metrics.isMemoryOOMKillEnabled() > > > > > > > > > > > > > > A couple of open questions with regards to that: > > > > > > > > > > > > > > 1) > > > > > > > Most API docs of Metrics make no distiction between > > > > > > > "unlimited" and > > > > > > > "not supported", both returning -1 for longs, for > > > > > > > example. > > > > > > > This is a > > > > > > > problem, because output of "java -XshowSettings:system > > > > > > > -version" will > > > > > > > not distinguish between unlimited and not supported > > > > > > > metrics. > > > > > > > Would it > > > > > > > be acceptable to change the API to distinguish those > > > > > > > cases so > > > > > > > that > > > > > > > LauncherHelper could display them appropriately? > > > > > > > > > > > > > > 2) > > > > > > > How should we deal with "not supported" for > > > > > > > booleans/arrays, > > > > > > > etc.? > > > > > > > Would it make sense to return record objects from the > > > > > > > Metrics > > > > > > > API so > > > > > > > that this could be dealt with? E.g. > > > > > > > > > > > > > > Metrics m = ... > > > > > > > MetricResult<int[]> result = m.getCpuSetCpus(); > > > > > > > switch(result.getType()) { > > > > > > > case NOT_SUPPORTED: /* do something */; break; > > > > > > > case SUPPORTED: int[] val = result.get(); break; > > > > > > > ... > > > > > > > } > > > > > > > > > > > > > > I'm bringing this up, because the proposed patch doesn't > > > > > > > deal > > > > > > > with the > > > > > > > above open questions as of yet. With that being said, > > > > > > > it's > > > > > > > mostly > > > > > > > identical to the proposed hotspot changes in [1]. > > > > > > > > > > For issue 1 and 2 … > > > > > > > > > > I wonder if we should change the API to throw > > > > > UnsupportedOperationException for those methods > > > > > that are not available. This exception is used quite a lot > > > > > in > > > > > the java/nio and java/net packages > > > > > for dealing with functionality not available on a host > > > > > platform. > > > > > > > > > > As an alternate suggestion, we already return -2 for "not > > > > > supported" for most APIs in the hotspot > > > > > implementation. We could use this for non boolean values in > > > > > the > > > > > Metrics API. For boolean > > > > > values, we could change the API to return “Boolean”. A null > > > > > return would signify not > > > > > implemented. > > > > > > > > This alternative has been implemented in the latest webrev. > > > > LauncherHelper has been updated to print "N/A" if some property > > > > being > > > > printed is not supported. Example output for cgroupsv2: > > > > > > > > $ ./bin/java -XshowSettings:system -version > > > > Operating System Metrics: > > > > Provider: cgroupv2 > > > > Effective CPU Count: 4 > > > > CPU Period: 100000us > > > > CPU Quota: -1 > > > > CPU Shares: -1 > > > > List of Processors: N/A > > > > List of Effective Processors: N/A > > > > List of Memory Nodes: N/A > > > > List of Available Memory Nodes: N/A > > > > CPUSet Memory Pressure Enabled: N/A > > > > Memory Limit: Unlimited > > > > Memory Soft Limit: Unlimited > > > > Memory & Swap Limit: Unlimited > > > > Kernel Memory Limit: N/A > > > > TCP Memory Limit: N/A > > > > Out Of Memory Killer Enabled: N/A > > > > > > > > openjdk version "15-internal" 2020-09-15 > > > > OpenJDK Runtime Environment (build 15-internal+0- > > > > adhoc.sgehwolf.openjdk-head-2) > > > > OpenJDK 64-Bit Server VM (build 15-internal+0- > > > > adhoc.sgehwolf.openjdk-head-2, mixed mode, sharing) > > > > > > > > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8231111 > > > > > > > webrev: > > > > > > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/04/webrev/ > > > > > > > > > > > > > > > > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/04/webrev/src/java.base/linux/classes/jdk/internal/platform/CgroupSubsystemFactory.java.html > > > > > > > > > > Should we check to make sure that there are no mixed cgroupv1 > > > > > and > > > > > cgroupv2 mounted subsystems since > > > > > you are not supporting mixing. > > > > > > > > Done. null is returned for metrics and a warning printed to > > > > stderr. > > > > > > > > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/04/webrev/src/java.base/linux/classes/jdk/internal/platform/cgroupv2/CgroupV2Subsystem.java.html > > > > > > > > > > It looks looks like there may be alternate ways of reporting > > > > > some > > > > > of the metrics that you have classified as > > > > > RETVAL_NOT_SUPPORTED. > > > > > > > > > > BlkIOServicedCount (io.stat) > > > > > KernelMemory (memory.stat) > > > > > TcpMemory (memory.stat) > > > > > > > > The latest webrev has getBlkIOService* and getTcpMemoryUsage > > > > impls. > > > > I've left out kernel memory metrics as it wasn't clear what > > > > this > > > > would > > > > be. We can add that in a later patch. The size of this patch is > > > > already > > > > rather big. > > > > > > > > > You could try the same trick for getMemoryAndSwapMaxUsage > > > > > which > > > > > keeps track of the highest returned value. > > > > > > > > We've decided to not do this. getMemoryAndSwapMaxUsage and > > > > getMemoryMaxUsage is being returned as not supported in cgroups > > > > v2. > > > > > > > > Thanks, > > > > Severin > > > > > > > > > for the benefit of other reviewers, you should provide links > > > > > to > > > > > the cgroupv1 and v2 docs. > > > > > > > > > > https://www.kernel.org/doc/Documentation/cgroup-v2.txt > > > > > > > > > > > > Testing: jdk/submit and platform docker tests on Linux > > > > > > > x86_64 > > > > > > > (with > > > > > > > hybrid hierarchy, docker/podman) and on Linux x86_64 with > > > > > > > unified > > > > > > > hierarchy (podman only). > > > > > > > > > > > > > > Thoughts? Suggestions? > > > > > > > > > > Do you think we should check the docker version being used > > > > > for > > > > > the tests to make sure it > > > > > supports cgroupv2? I assume a fairly recent version is > > > > > required > > > > > to work with cgroupv2. > > > > > > > > > > Bob. > > > > > > > > > > > > > > > > > > > > > Ping? > > > > > > > > > > > > > Thanks, > > > > > > > Severin > > > > > > > > > > > > > > [1] > > > > > > > http://mail.openjdk.java.net/pipermail/hotspot-dev/2019-November/039909.html