Hi Bob,

On Fri, 2020-01-10 at 11:50 -0500, Bob Vandette wrote:
> Severin,
> 
> The changes look ok to me.  I assume you’ve run the container tests
> on a cgroupv2 and cgroupv1 enabled system, right?

Yes, I have.

> You’re going to have to find a “R” reviewer.

Indeed.

> What’s the status of the hotspot cgroupv2 changes.  Have these been
> reviewed?

Latest webvrev was v5 here:
http://mail.openjdk.java.net/pipermail/hotspot-dev/2019-November/040120.html

It included the memory_max_usage_in_bytes synthesized hack for cgroups
v2, though. Should I remove it there as well? I'd be in favor of that.
Otherwise nothing new. It's ready from my perspective other than
finding another Reviewer for it.

Thanks,
Severin

> Bob.
> 
> 
> > On Jan 9, 2020, at 2:51 PM, Severin Gehwolf <sgehw...@redhat.com>
> > wrote:
> > 
> > Hi,
> > 
> > On Fri, 2020-01-03 at 15:37 -0500, Bob Vandette wrote:
> > > Here are a few comments from scanning the webrev.
> > > 
> > > 
> > > It looks like you can remove line 151
> > > 
> > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06-incremental/webrev/src/java.base/linux/classes/jdk/internal/platform/CgroupSubsystemController.java.sdiff.html
> > > 
> > > 151         int[] ints = new int[0];
> > > 
> > > —————
> > > 
> > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06-incremental/webrev/src/java.base/share/classes/jdk/internal/platform/Metrics.java.sdiff.html
> > > 
> > > There are several comments stating that -2 == unlimited.  This is
> > > not
> > > the case.
> > > 
> > > @return count of elapsed periods or -2 if the quota is unlimited.
> > > 
> > > —————
> > > 
> > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06/webrev/test/jdk/jdk/internal/platform/docker/TestDockerMemoryMetrics.java.sdiff.html
> > > 
> > > Shouldn’t you just check for cgroupv2 before trying to run the
> > > testKernelMemoryLimit and testOomKillFlag tests?
> > > 
> > > —————
> > > 
> > > There are a few places where getPerCpuUsage is returning “new
> > > long[0]” but you changed the interface to expect null
> > > for not supported.
> > > 
> > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06/webrev/src/java.base/linux/classes/jdk/internal/platform/cgroupv2/CgroupV2Subsystem.java.html
> > > 
> > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06/webrev/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/CgroupV1Subsystem.java.html
> > > 
> > > You probably need to check all users of the APIs which used to
> > > return
> > > array[0] which now return null to make sure you
> > > don’t get null pointer exceptions.
> > > 
> > > One example is testCpuConsumption.
> > > 
> > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06/webrev/test/lib/jdk/test/lib/containers/cgroup/MetricsTesterCgroupV1.java.html
> > > 
> > > Also, 
> > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06/webrev/test/lib/jdk/test/lib/containers/cgroup/MetricsTesterCgroupV2.java.html:167
> > > 
> > > 
> > > You’ll also have to update copyrights to 2020.
> > 
> > Thanks for the review! Should all be fixed now. Updated webrev:
> > 
> > incremental: 
> > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/07/incremental/webrev/
> > full: 
> > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/07/webrev/
> > 
> > Note: I'll go through touched files and update copyright dates to
> > 2020.
> > Not all have been updated in the full webrev. It'll be done though.
> > 
> > Thanks,
> > Severin
> > 
> > > Bob.
> > > 
> > > 
> > > > On Dec 20, 2019, at 9:50 AM, Severin Gehwolf <
> > > > sgehw...@redhat.com>
> > > > wrote:
> > > > 
> > > > Hi Bob,
> > > > 
> > > > Sorry for the delay to get this updated.
> > > > 
> > > > Changes done in this latest webrev:
> > > > 
> > > >  1. Rebased on top of 8226575: OperatingSystemMXBean should be
> > > > made
> > > >     container aware. File read ops now use privileged code.
> > > >  2. Warning for mixed cgroup controllers and returning null for
> > > > metrics.
> > > >  3. Added implementations for getBlkIOServiceCount() and
> > > >     getBlkIOServiced() using sum of rios/wios and rbytes/wbytes
> > > > across
> > > >     devices. Added impl for getTcpMemoryUsage()
> > > >  4. Returning -2 for not supported (if the metric would return
> > > > long).
> > > >     boolean metrics now return Boolean and null if not
> > > > supported.
> > > > Same
> > > >     for array return types. null if not supported for symmetry.
> > > >  5. -XshowSettings:system output has been updated to return
> > > > "N/A"
> > > > for
> > > >     when a metric is not available. E.g. "Kernel OOM killer
> > > > enabled"
> > > >     Boolean.
> > > >  6. Tests have been updated to reflect this.
> > > > 
> > > > webrev:
> > > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06/webrev/
> > > > incremental webrev:
> > > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/06-incremental/webrev/
> > > > 
> > > > Testing: Docker tests and podman testing on cgroupsv2. I'll run
> > > > it
> > > > through jdk/submit as well.
> > > > 
> > > > Hopefully this can get pushed with 8230305 early on in the jdk
> > > > 15
> > > > cycle
> > > > :)
> > > > 
> > > > A couple of more inline comments below.
> > > > 
> > > > On Mon, 2019-12-02 at 12:13 -0500, Bob Vandette wrote:
> > > > > Sorry for the delay in responding.  See comments below:
> > > > > 
> > > > > > On Nov 29, 2019, at 4:05 AM, Severin Gehwolf <
> > > > > > sgehw...@redhat.com> wrote:
> > > > > > 
> > > > > > On Fri, 2019-11-15 at 17:51 +0100, Severin Gehwolf wrote:
> > > > > > > Hi,
> > > > > > > 
> > > > > > > Could I please get reviews of these core-libs, Linux-
> > > > > > > only,
> > > > > > > changes to
> > > > > > > the Metrics subsystem? This patch implements cgroupv2
> > > > > > > support
> > > > > > > for
> > > > > > > Metrics which are currently cgroupv1-only. Fedora 31
> > > > > > > switched
> > > > > > > to
> > > > > > > cgroupv2 by default so it's time to get OpenJDK recognize
> > > > > > > it.
> > > > > > > 
> > > > > > > Note that a couple of metrics are no longer supported
> > > > > > > with
> > > > > > > cgroupv2.
> > > > > > > Most notably (not an exhaustive list, though):
> > > > > > > 
> > > > > > > Metrics.getKernel*() family of methods.
> > > > > > > Metrics.getTcp*() family of methods.
> > > > > > > Metrics.getBlkIO*() family of methods.
> > > > > > > Metrics.isMemoryOOMKillEnabled()
> > > > > > > 
> > > > > > > A couple of open questions with regards to that:
> > > > > > > 
> > > > > > > 1)
> > > > > > > Most API docs of Metrics make no distiction between
> > > > > > > "unlimited" and
> > > > > > > "not supported", both returning -1 for longs, for
> > > > > > > example.
> > > > > > > This is a
> > > > > > > problem, because output of "java -XshowSettings:system
> > > > > > > -version" will
> > > > > > > not distinguish between unlimited and not supported
> > > > > > > metrics.
> > > > > > > Would it
> > > > > > > be acceptable to change the API to distinguish those
> > > > > > > cases so
> > > > > > > that
> > > > > > > LauncherHelper could display them appropriately?
> > > > > > > 
> > > > > > > 2)
> > > > > > > How should we deal with "not supported" for
> > > > > > > booleans/arrays,
> > > > > > > etc.?
> > > > > > > Would it make sense to return record objects from the
> > > > > > > Metrics
> > > > > > > API so
> > > > > > > that this could be dealt with? E.g.
> > > > > > > 
> > > > > > > Metrics m = ...
> > > > > > > MetricResult<int[]> result = m.getCpuSetCpus();
> > > > > > > switch(result.getType()) {
> > > > > > > case NOT_SUPPORTED: /* do something */; break;
> > > > > > > case SUPPORTED: int[] val = result.get(); break;
> > > > > > > ...         
> > > > > > > }
> > > > > > > 
> > > > > > > I'm bringing this up, because the proposed patch doesn't
> > > > > > > deal
> > > > > > > with the
> > > > > > > above open questions as of yet. With that being said,
> > > > > > > it's
> > > > > > > mostly
> > > > > > > identical to the proposed hotspot changes in [1].
> > > > > 
> > > > > For issue 1 and 2 …
> > > > > 
> > > > > I wonder if we should change the API to throw
> > > > > UnsupportedOperationException for those methods 
> > > > > that are not available.  This exception is used quite a lot
> > > > > in
> > > > > the java/nio and java/net packages
> > > > > for dealing with functionality not available on a host
> > > > > platform.
> > > > > 
> > > > > As an alternate suggestion, we already return -2 for "not
> > > > > supported" for most APIs in the hotspot
> > > > > implementation.  We could use this for non boolean values in
> > > > > the
> > > > > Metrics API.  For boolean
> > > > > values, we could change the API to return “Boolean”.  A null
> > > > > return would signify not
> > > > > implemented.
> > > > 
> > > > This alternative has been implemented in the latest webrev.
> > > > LauncherHelper has been updated to print "N/A" if some property
> > > > being
> > > > printed is not supported. Example output for cgroupsv2:
> > > > 
> > > > $ ./bin/java -XshowSettings:system -version
> > > > Operating System Metrics:
> > > >   Provider: cgroupv2
> > > >   Effective CPU Count: 4
> > > >   CPU Period: 100000us
> > > >   CPU Quota: -1
> > > >   CPU Shares: -1
> > > >   List of Processors: N/A
> > > >   List of Effective Processors: N/A
> > > >   List of Memory Nodes: N/A
> > > >   List of Available Memory Nodes: N/A
> > > >   CPUSet Memory Pressure Enabled: N/A
> > > >   Memory Limit: Unlimited
> > > >   Memory Soft Limit: Unlimited
> > > >   Memory & Swap Limit: Unlimited
> > > >   Kernel Memory Limit: N/A
> > > >   TCP Memory Limit: N/A
> > > >   Out Of Memory Killer Enabled: N/A
> > > > 
> > > > openjdk version "15-internal" 2020-09-15
> > > > OpenJDK Runtime Environment (build 15-internal+0-
> > > > adhoc.sgehwolf.openjdk-head-2)
> > > > OpenJDK 64-Bit Server VM (build 15-internal+0-
> > > > adhoc.sgehwolf.openjdk-head-2, mixed mode, sharing)
> > > > 
> > > > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8231111
> > > > > > > webrev: 
> > > > > > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/04/webrev/
> > > > > > > 
> > > > > 
> > > > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/04/webrev/src/java.base/linux/classes/jdk/internal/platform/CgroupSubsystemFactory.java.html
> > > > > 
> > > > > Should we check to make sure that there are no mixed cgroupv1
> > > > > and
> > > > > cgroupv2 mounted subsystems since
> > > > > you are not supporting mixing.
> > > > 
> > > > Done. null is returned for metrics and a warning printed to
> > > > stderr.
> > > > 
> > > > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/04/webrev/src/java.base/linux/classes/jdk/internal/platform/cgroupv2/CgroupV2Subsystem.java.html
> > > > > 
> > > > > It looks looks like there may be alternate ways of reporting
> > > > > some
> > > > > of the metrics that you have classified as
> > > > > RETVAL_NOT_SUPPORTED.
> > > > > 
> > > > > BlkIOServicedCount (io.stat)
> > > > > KernelMemory (memory.stat)
> > > > > TcpMemory (memory.stat)
> > > > 
> > > > The latest webrev has getBlkIOService* and getTcpMemoryUsage
> > > > impls.
> > > > I've left out kernel memory metrics as it wasn't clear what
> > > > this
> > > > would
> > > > be. We can add that in a later patch. The size of this patch is
> > > > already
> > > > rather big.
> > > > 
> > > > > You could try the same trick for getMemoryAndSwapMaxUsage
> > > > > which
> > > > > keeps track of the highest returned value.
> > > > 
> > > > We've decided to not do this. getMemoryAndSwapMaxUsage and
> > > > getMemoryMaxUsage is being returned as not supported in cgroups
> > > > v2.
> > > > 
> > > > Thanks,
> > > > Severin
> > > > 
> > > > > for the benefit of other reviewers, you should provide links
> > > > > to
> > > > > the cgroupv1 and v2 docs.
> > > > > 
> > > > > https://www.kernel.org/doc/Documentation/cgroup-v2.txt
> > > > > 
> > > > > > > Testing: jdk/submit and platform docker tests on Linux
> > > > > > > x86_64
> > > > > > > (with
> > > > > > > hybrid hierarchy, docker/podman) and on Linux x86_64 with
> > > > > > > unified
> > > > > > > hierarchy (podman only).
> > > > > > > 
> > > > > > > Thoughts? Suggestions?
> > > > > 
> > > > > Do you think we should check the docker version being used
> > > > > for
> > > > > the tests to make sure it
> > > > > supports cgroupv2?  I assume a fairly recent version is
> > > > > required
> > > > > to work with cgroupv2.
> > > > > 
> > > > > Bob.
> > > > > 
> > > > > 
> > > > > 
> > > > > > Ping?
> > > > > > 
> > > > > > > Thanks,
> > > > > > > Severin
> > > > > > > 
> > > > > > > [1] 
> > > > > > > http://mail.openjdk.java.net/pipermail/hotspot-dev/2019-November/039909.html

Reply via email to