Hi Jon,
On 12/10/2018 11:58 AM, Jonathan Gibbons wrote:
On 10/11/18 3:40 PM, David Holmes wrote:
Hi Erik,
On 12/10/2018 8:29 AM, Erik Joelsson wrote:
Hello,
(adding serviceability-dev and hotspot-dev for test changes)
Bug: https://bugs.openjdk.java.net/browse/JDK-8212028
Webrev:
http://cr.openjdk.java.net/~erikj/8212028/webrev.01/index.html (From
ihse-runtestprebuilt-branch in jdk-sandbox)
In order to fully adopt the new run-test framework, we need to switch
over the automated and distributed testing system at Oracle to the
new framework. To get this to work, a number of issues needed to be
fixed. A brief explanation follows; see the bug for
more details.
For RunTest.gmk and related makefiles there are a number of minor
tweaks to support all the necessary control variables that are
currently used for the old test makefiles, as well as correcting some
test setup settings.
In addition to that, some tests also needed to be modified:
Timeouts
The current default timeoutFactor in the makefiles is 4. However, the
old Mach5 executor overrides that to 10. I don't think the executor
should dabble with such things; it should leave them to the makefiles,
the user, or a specific job definition, so the new run-test executor no
longer overrides it. This means many tests now have a much shorter
effective timeout. Because of this, we need to increase the timeout on
some tests that are now prone to timing out. I have run tier1-5 a few times to
try and find these and added /timeout=300 (which will result in the
same effective timeout as before) when specific tests seemed
problematic.
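
To make the arithmetic above concrete, here is a minimal sketch, assuming
the effective timeout is the per-test timeout multiplied by the
timeoutFactor, and assuming the jtreg default of 2 minutes (120 seconds)
mentioned elsewhere in this thread. The class and method names are
made up for illustration and are not part of jtreg or the makefiles.

```java
// Illustrative sketch only; EffectiveTimeout is a hypothetical helper,
// not part of jtreg or RunTest.gmk.
final class EffectiveTimeout {
    // Assumed jtreg default per-step timeout: 2 minutes.
    static final int DEFAULT_TIMEOUT_SECONDS = 120;

    // Assumption: effective timeout = per-test timeout * timeoutFactor.
    static int effectiveSeconds(int testTimeoutSeconds, int timeoutFactor) {
        return testTimeoutSeconds * timeoutFactor;
    }

    public static void main(String[] args) {
        // Old Mach5 executor, factor 10, default timeout.
        System.out.println(effectiveSeconds(DEFAULT_TIMEOUT_SECONDS, 10));
        // Makefile default, factor 4, default timeout.
        System.out.println(effectiveSeconds(DEFAULT_TIMEOUT_SECONDS, 4));
        // Makefile default, factor 4, test tagged with /timeout=300.
        System.out.println(effectiveSeconds(300, 4));
    }
}
```

Under these assumptions, a test with /timeout=300 and factor 4 gets
300 * 4 = 1200 seconds, the same as the default 120 seconds under the
old factor of 10, while an untagged test drops to 480 seconds.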
This should be fixed in the tier job definitions, not the individual
tests. We have moved away from putting explicit timeouts on individual
tests and instead rely on the framework timeout being set appropriately.
David
-----
David,
That's a suboptimal policy, because it means you're relying on the
framework handling the worst-case test.
Yes. Given we have such a huge range of tests running on a range of
platforms, on machines with a range of capabilities, using a range of VM
flags and using a range of loads on the test machines, this has to be
punted to the framework - otherwise you have to update every test to add
an explicit timeout for the worst case (as experienced by some runner of
the tests).
There's no holy-grail answer here.
My understanding of the current approach was to set the framework timeout so
that the majority of tests running under a given "normal" execution
context pass. Then add multipliers for specific test configurations or
platforms known to take longer (-Xcomp or sparc, for example). Then
tests that don't fit within that chosen timeout either get their own
timeout set or are moved to a tier with a different multiplier.
This change basically lowers the bar that had been set such that more
tests now need explicit timeouts. I'm not sure why that was necessary,
nor do I think it necessarily a good thing.
But after some internal discussions the test folk seem to be okay with
this, so having said my piece I'll let it drop.
As far as jtreg goes, the default timeout for each step is 2 mins, which
is intended to be enough for the test to reliably run within that time
on a reasonably modern developer-class machine. A test which always
times out on a good machine should use a test-specific increased timeout.
Agreed.
Where the framework can help is, if tests are being run on an old or
slow machine, or if test run args are provided that will cause the test
to run significantly slower than usual, then the framework can/should
start scaling up the timeout factor.
Again agreed.
Cheers,
David
-- Jon
test/hotspot/jtreg/runtime/appcds/jvmti/InstrumentationTest.java
This test spawns a child process and tries to locate it using the
attach api, by looking for a unique token in the command line string
of the spawned JVM. The problem is that the command line string it
gets from the attach api is truncated and the token is last on the
command line. This normally works well, but the arguments before it
are 3 files, with full absolute paths inside the jtreg work
directory. With Mach5 we have pretty deep work directories, and with
run-test, we make them even deeper. This unfortunately trips the
limit and the test fails. I have fixed this by reordering the
arguments to the child process.
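
The effect of the reordering can be sketched as follows. This is only an
illustration, not the code in InstrumentationTest.java: the class name,
the helper methods, and the length limit are all hypothetical, and the
truncation is simulated with a plain substring rather than the attach
api.

```java
import java.util.List;

// Illustrative sketch only; TokenPlacement and its methods are hypothetical.
final class TokenPlacement {
    // Stand-in for the truncated command line string seen by the parent.
    // The real limit is not stated in this thread; 'limit' is an assumption.
    static String truncate(String commandLine, int limit) {
        return commandLine.length() <= limit
                ? commandLine
                : commandLine.substring(0, limit);
    }

    // True if the token is still present after the command line is truncated.
    static boolean tokenVisible(List<String> args, String token, int limit) {
        return truncate(String.join(" ", args), limit).contains(token);
    }
}
```

With long absolute paths ahead of it, a token placed last falls beyond
the cut-off and cannot be found; placed first, it survives regardless of
how deep the work directory is.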
/Erik