Revised webrev is available:
 - removed the loop on termination
 - increased the original internal termination timeout to 120 seconds
The termination wait can exit sooner once the services have shut down.
The 120-second timeout is consistent with the jtreg default, but it
also works as a hang check when the test is run standalone.
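For reference, the shape of the change is roughly the following (a minimal
sketch only, not the webrev diff; the executor setup and task submission
are placeholders):

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    // Illustrative sketch of the shutdown pattern under discussion.
    public class TerminationSketch {
        public static void main(String[] args) throws InterruptedException {
            ExecutorService pool = Executors.newCachedThreadPool();
            // ... submit the test's tasks here ...
            pool.shutdown();
            // A single bounded wait replaces the old loop: awaitTermination
            // returns as soon as the service finishes, and the 120-second
            // cap still catches a hang when the test is run standalone.
            if (!pool.awaitTermination(120, TimeUnit.SECONDS))
                throw new AssertionError(
                    "service did not terminate within 120 seconds");
        }
    }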
http://cr.openjdk.java.net/~gadams/6736316/
It may be best to engage Doug Lea with a batch of updates
for the concurrent tests. I agree we need to keep tests in sync.
Hopefully the effort here on slow-machine fixes will be worthwhile
upstream.
I think it is important to monitor unexpected slowdowns in the system
as a whole, but the standard performance benchmarks are better tools
for that kind of analysis.
On 11/21/11 09:08 PM, David Holmes wrote:
Hi Gary,
On 22/11/2011 5:26 AM, Gary Adams wrote:
The original complaint about the flaky mutex regression test is
that it was failing on slower machines. The delay at the end of
the processing is unnecessarily restrictive. Since the test harness
will terminate the test after 120 seconds if it hangs, the test itself
does not need to enforce a tighter deadline.
Here's a proposed compromise that allows the service up to 120 seconds
to terminate and allows an early exit if the service completes sooner.
Alternatively, we could just increase the timeout to the full 120 seconds,
which still exits sooner if the service completes earlier.
http://cr.openjdk.java.net/~gadams/6736316/
The loop serves no purpose. If you want to wait a maximum of 120 seconds then
use a timeout of 120 seconds. As you said, awaitTermination will return
earlier if termination completes.
I support raising the timeout rather than waiting forever, as these tests
should be able to run standalone and in that case they should not hang upon
encountering a bug. I'm wary of assuming there is a higher-level test harness
involved. Also note that java/util/concurrent tests are generally updated in
collaboration with Doug Lea as he also maintains these tests for the jsr166
repository - and runs them.
I do have a concern that raising the timeouts/delays too high means that they
will no longer be of any use in detecting general unexpected slowdowns. This
is a delicate balance between accommodating slow machines and flagging
problems with unexpected slowdowns. Maybe there is nothing we can do about
that, but I want to flag it.
Thanks,
David