Hi Gary,

I admittedly know next to nothing about how jstat and jstatd interact, and also know little about RMI, but after reading the bug I can't help but think that your fix might just be hiding a bug else (like in RMI). For example, should UnicastRemoteObject.exportObject() be keeping the remoteHost object live?

thanks,

Chris

On 3/21/19 11:47 AM, Gary Adams wrote:
The jstatd failures look like the server remote endpoint was GC'd after it was bound.
A strong reference should prevent the NoSuchObjectException (JDK-8203026).
Was seeing 1 failure every 50 testruns. Currently through 400 test runs with
no failures with this proposed fix.

diff --git a/src/jdk.jstatd/share/classes/sun/tools/jstatd/Jstatd.java b/src/jdk.jstatd/share/classes/sun/tools/jstatd/Jstatd.java
--- a/src/jdk.jstatd/share/classes/sun/tools/jstatd/Jstatd.java
+++ b/src/jdk.jstatd/share/classes/sun/tools/jstatd/Jstatd.java
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2004, 2018, Oracle and/or its affiliates. All rights reserved.
+ * Copyright (c) 2004, 2019, Oracle and/or its affiliates. All rights reserved.
  * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
  *
  * This code is free software; you can redistribute it and/or modify it
@@ -45,6 +45,7 @@
     private static Registry registry;
     private static int port = -1;
     private static boolean startRegistry = true;
+    private static RemoteHostImpl remoteHost;
 
     private static void printUsage() {
         System.err.println("usage: jstatd [-nr] [-p port] [-n rminame]\n" +
@@ -138,7 +139,7 @@
         try {
             // use 1.5.0 dynamically generated subs.
             System.setProperty("java.rmi.server.ignoreSubClasses", "true");
-            RemoteHostImpl remoteHost = new RemoteHostImpl();
+            remoteHost = new RemoteHostImpl();
             RemoteHost stub = (RemoteHost) UnicastRemoteObject.exportObject(
                     remoteHost, 0);
             bind(name.toString(), remoteHost);


diff --git a/src/jdk.jstatd/share/classes/sun/tools/jstatd/RemoteHostImpl.java b/src/jdk.jstatd/share/classes/sun/tools/jstatd/RemoteHostImpl.java
--- a/src/jdk.jstatd/share/classes/sun/tools/jstatd/RemoteHostImpl.java
+++ b/src/jdk.jstatd/share/classes/sun/tools/jstatd/RemoteHostImpl.java
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2004, Oracle and/or its affiliates. All rights reserved.
+ * Copyright (c) 2004, 2019, Oracle and/or its affiliates. All rights reserved.
  * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
  *
  * This code is free software; you can redistribute it and/or modify it
@@ -50,6 +50,7 @@
 
     private MonitoredHost monitoredHost;
     private Set<Integer> activeVms;
+    private static RemoteVmImpl rvm;
 
     public RemoteHostImpl() throws MonitorException {
         try {
@@ -76,7 +77,7 @@
         try {
             VmIdentifier vmid = new VmIdentifier(vmidStr);
             MonitoredVm mvm = monitoredHost.getMonitoredVm(vmid);
-            RemoteVmImpl rvm = new RemoteVmImpl((BufferedMonitoredVm)mvm);
+            rvm = new RemoteVmImpl((BufferedMonitoredVm)mvm);
             stub = (RemoteVm) UnicastRemoteObject.exportObject(rvm, 0);
         }
         catch (URISyntaxException e) {



On 3/20/19, 8:26 AM, Gary Adams wrote:
I think I have found the source of the dash.
 sun/tools/jstat/RowClosure.java
sets the NaN value to "-".

That should be fine.

It just means in the jstat output tests need to account for the possibility.

...

On 3/19/19, 6:07 PM, Chris Plummer wrote:
The presence of the '-' sounds familiar. I recall reading about it in a review. I know I did a review for these tests that impacted the output, but I don't see it introducing a '-' anywhere. See JDK-8202466. However, that bug references JDK-8199519, which caused problems by changing a value in the output from a '-' to a 0:

--- a/test/hotspot/jtreg/serviceability/tmtools/jstat/utils/JstatGcCauseResults.java	Wed Apr 25 17:50:32 2018 -0400
+++ b/test/hotspot/jtreg/serviceability/tmtools/jstat/utils/JstatGcCauseResults.java	Thu Apr 26 09:45:47 2018 +0900
@@ -74,10 +74,18 @@
         assertThat(GCT >= 0, "Incorrect time value for GCT");
         assertThat(GCT >= YGCT, "GCT < YGCT (total garbage collection time < young generation garbage collection time)");
 
-        int CGC = getIntValue("CGC");
-        float CGCT = getFloatValue("CGCT");
-        assertThat(CGCT >= 0, "Incorrect time value for CGCT");
+        int CGC = 0;
+        float CGCT = 0.0f;
+        try {
+            CGC = getIntValue("CGC");
+        } catch (NumberFormatException e) {
+            if (!e.getMessage().equals("Unparseable number: \"-\"")) {
+                throw e;
+            }
+        }
         if (CGC > 0) {
+            CGCT = getFloatValue("CGCT");
+            assertThat(CGCT >= 0, "Incorrect time value for CGCT");
             assertThat(CGCT > 0, "Number of concurrent GC events is " + CGC + ", but CGCT is 0");
         }
 
I'm not sure if any of this is related to what you are seeing.

Chris

On 3/19/19 11:39 AM, Gary Adams wrote:
A quick follow up on the jstat test failures.

On the failed runs the output looks like this :

----------messages:(3/127)----------
command: shell jstatLineCounts4.sh
reason: User specified action: run shell jstatLineCounts4.sh 
elapsed time (seconds): 7.496
----------System.out:(13/1261)----------
  S0     S1     E      O      M     CCS    YGC     YGCT    FGC    FGCT    CGC    CGCT     GCT   
  0.00   0.00   0.00   0.00      -      -      0    0.000     0    0.000     0    0.000    0.000
  0.00   0.00   0.00   0.00      -      -      0    0.000     0    0.000     0    0.000    0.000
  0.00   0.00   0.00   0.00      -      -      0    0.000     0    0.000     0    0.000    0.000
  0.00   0.00   0.00   0.00      -      -      0    0.000     0    0.000     0    0.000    0.000
  0.00   0.00   0.00   0.00      -      -      0    0.000     0    0.000     0    0.000    0.000
  0.00   0.00   0.00   0.00      -      -      0    0.000     0    0.000     0    0.000    0.000
  0.00   0.00   0.00   0.00      -  28.19      1    0.571     0    0.000     0    0.000    0.571
  0.00 100.00   0.00  14.85  31.29  28.19      1    0.571     0    0.000     0    0.000    0.571
  0.00 100.00   0.00  14.85  31.29  28.19      1    0.571     0    0.000     0    0.000    0.571
  0.00 100.00   0.00  14.85  31.29  28.19      1    0.571     0    0.000     0    0.000    0.571
  S0     S1     E      O      M     CCS    YGC     YGCT    FGC    FGCT    CGC    CGCT     GCT   
  0.00 100.00   0.00  14.85  31.29  28.19      1    0.571     0    0.000     0    0.000    0.571

The awk scripts used to check the output are not aware a dash '-' is allowed the metaspace column.
Here is a quick stab to allow the dashes.
Is anyone aware of recent changes in the gcutil output?

diff --git a/test/jdk/sun/tools/jstat/lineCounts3.awk b/test/jdk/sun/tools/jstat/lineCounts3.awk
--- a/test/jdk/sun/tools/jstat/lineCounts3.awk
+++ b/test/jdk/sun/tools/jstat/lineCounts3.awk
@@ -23,7 +23,7 @@
         headerlines++;
     }
 
-/^[ ]*[0-9]+\.[0-9]+[ ]*[0-9]+\.[0-9]+[ ]*[0-9]+\.[0-9]+[ ]*[0-9]+\.[0-9]+[ ]*[0-9]+\.[0-9]+[ ]*([0-9]+\.[0-9]+)|-[ ]*[0-9]+[ ]*[0-9]+\.[0-9]+[ ]*[0-9]+[ ]*[0-9]+\.[0-9]+[ ]*[0-9]+[ ]*[0-9]+\.[0-9]+[ ]*[0-9]+\.[0-9]+$/    {
+/^[ ]*[0-9]+\.[0-9]+[ ]*[0-9]+\.[0-9]+[ ]*[0-9]+\.[0-9]+[ ]*[0-9]+\.[0-9]+[ ]*([0-9]+\.[0-9]+)|-[ ]*([0-9]+\.[0-9]+)|-[ ]*[0-9]+[ ]*[0-9]+\.[0-9]+[ ]*[0-9]+[ ]*[0-9]+\.[0-9]+[ ]*[0-9]+[ ]*[0-9]+\.[0-9]+[ ]*[0-9]+\.[0-9]+$/    {
         datalines++;
     }
 
diff --git a/test/jdk/sun/tools/jstat/lineCounts4.awk b/test/jdk/sun/tools/jstat/lineCounts4.awk
--- a/test/jdk/sun/tools/jstat/lineCounts4.awk
+++ b/test/jdk/sun/tools/jstat/lineCounts4.awk
@@ -26,7 +26,7 @@
         headerlines++;
     }
 
-/^[ ]*[0-9]+\.[0-9]+[ ]*[0-9]+\.[0-9]+[ ]*[0-9]+\.[0-9]+[ ]*[0-9]+\.[0-9]+[ ]*[0-9]+\.[0-9]+[ ]*([0-9]+\.[0-9]+)|-[ ]*[0-9]+[ ]*[0-9]+\.[0-9]+[ ]*[0-9]+[ ]*[0-9]+\.[0-9]+[ ]*[0-9]+[ ]*[0-9]+\.[0-9]+[ ]*[0-9]+\.[0-9]+$/    {
+/^[ ]*[0-9]+\.[0-9]+[ ]*[0-9]+\.[0-9]+[ ]*[0-9]+\.[0-9]+[ ]*[0-9]+\.[0-9]+[ ]*([0-9]+\.[0-9]+)|-[ ]*([0-9]+\.[0-9]+)|-[ ]*[0-9]+[ ]*[0-9]+\.[0-9]+[ ]*[0-9]+[ ]*[0-9]+\.[0-9]+[ ]*[0-9]+[ ]*[0-9]+\.[0-9]+[ ]*[0-9]+\.[0-9]+$/    {
         if (headerlines == 2) {
             datalines2++;
         }



On 3/19/19, 8:22 AM, Gary Adams wrote:
After 1000 testruns on {solaris,linux,windows,macosx} debug builds
running 
  test/jdk/sun/tools
  test/jdk/com/sun/tools/attach

with this change, no failures were observed in TestJps or TempDirTest.

diff --git a/test/jdk/TEST.ROOT b/test/jdk/TEST.ROOT
--- a/test/jdk/TEST.ROOT
+++ b/test/jdk/TEST.ROOT
@@ -22,7 +22,11 @@
 javax/management sun/awt sun/java2d javax/xml/jaxp/testng/validation java/lang/ProcessHandle
 
 # Tests that cannot run concurrently
-exclusiveAccess.dirs=java/rmi/Naming java/util/prefs sun/management/jmxremote sun/tools/jstatd sun/security/mscapi java/util/stream java/util/Arrays/largeMemory java/util/BitSet/stream javax/rmi
+exclusiveAccess.dirs=java/rmi/Naming java/util/prefs sun/management/jmxremote \
+sun/tools/jstatd sun/tools/jcmd sun/tools/jhsdb sun/tools/jhsdb/heapconfig \
+sun/tools/jinfo sun/tools/jmap sun/tools/jps sun/tools/jstack sun/tools/jstat \
+com/sun/tools/attach sun/security/mscapi java/util/stream java/util/Arrays/largeMemory \
+java/util/BitSet/stream javax/rmi
 # Group definitions
 groups=TEST.groups
 

Failures were observed in the following tests.
sun/tools/jstat/jstatLineCounts4.sh,[],[solaris-sparcv9-debug],[ExitCode: 1],[bug4990825 shell], [jstatLineCounts4.sh],[37605],[] sun/tools/jstat/jstatLineCounts4.sh,[],[solaris-sparcv9-debug],[ExitCode: 1],[bug4990825 shell], [jstatLineCounts4.sh],[37605],[]

Issues:
  https://bugs.openjdk.java.net/browse/JDK-8220295
  https://bugs.openjdk.java.net/browse/JDK-8220242

At this point just looking for feedback, if this is a reasonable direction
to deal with the intermittent timeouts with theses tests to reduce the
concurrency with other attaching tests.
 





Reply via email to