Hi Martin, Eric,

Of several hundred failures of this test, most occurred in JRE runs with
-Xcomp set.  A few failures occurred with -Xmixed, none with -Xint.

The printed "elapsed" times (not normalized to hardware or OS) range from
24 to 132 ms, with most falling into several buckets in the 30s, 40s, 50s, and 70s.

I don't spot anything in the Timer.mainLoop code that might break when highly
optimized, but that's one possibility.

Roger


On 6/4/2014 3:25 PM, Martin Buchholz wrote:
Tests for Timer are inherently timing (!) dependent.
It's reasonable for tests to assume that:
- reasonable events like creating a thread and executing a simple task
should complete in less than, say, 2500 ms.
- the system clock will not change by a significant amount (> 1 sec) during the
test.  Yes, that means Timer tests are likely to fail during a daylight
saving time switchover - we can live with that. (We could even try to fix
that by detecting deviations between clock time and elapsed time, but it's
probably not worth it; a rough sketch of the idea is below.)
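
For concreteness, here is a minimal sketch of that clock-deviation check.
The class and method names are my own invention, not anything in the test:

    import java.util.concurrent.TimeUnit;

    // Detect a wall-clock jump (DST switchover, NTP step) during a test
    // by comparing elapsed wall-clock time against elapsed monotonic time.
    class ClockJumpDetector {
        private final long wallStartMs = System.currentTimeMillis();
        private final long monoStartNs = System.nanoTime();

        // True if the wall clock has deviated from the monotonic clock
        // by more than toleranceMs since this detector was created.
        boolean clockJumped(long toleranceMs) {
            long wallElapsedMs = System.currentTimeMillis() - wallStartMs;
            long monoElapsedMs =
                TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - monoStartNs);
            return Math.abs(wallElapsedMs - monoElapsedMs) > toleranceMs;
        }
    }

A test could create one of these at the start and skip, rather than fail,
its timing assertions when clockJumped(1000) returns true.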

Can you detect any real-world unreliability in my latest version of the
test, not counting the daylight saving time switchover?

I continue to resist your efforts to "fix" the test by removing chances for
the SUT code to go wrong.


On Tue, Jun 3, 2014 at 11:28 PM, Eric Wang <yiming.w...@oracle.com> wrote:

  Hi Martin,

Thanks for the explanation; now I understand why you set DELAY_MS to
100 seconds. It's true that this prevents failures on a slow host. However, I
still have some concerns.
Because the test schedules tasks at a time in the past, all 13 tasks
should be executed immediately and finish within a short time. If the
elapsed-time limit is set to 50 s (DELAY_MS/2), the timer has plenty of
time to finish the tasks, so I wonder whether that test point is lost.
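To illustrate the expectation, a self-contained sketch of my own - not the
actual test code; names and constants are mine - that schedules a fixed-rate
task at a time in the past and waits for the 13 catch-up executions:

    import java.util.Date;
    import java.util.Timer;
    import java.util.TimerTask;
    import java.util.concurrent.CountDownLatch;

    public class PastScheduleDemo {
        public static void main(String[] args) throws InterruptedException {
            final long DELAY_MS = 100 * 1000;   // the period, as in your version
            // First execution time 12 periods in the past, so executions
            // at past, past+DELAY_MS, ..., past+12*DELAY_MS are all due now.
            Date past = new Date(System.currentTimeMillis() - 12 * DELAY_MS);
            final CountDownLatch thirteenRuns = new CountDownLatch(13);
            long start = System.nanoTime();
            Timer timer = new Timer();
            timer.scheduleAtFixedRate(new TimerTask() {
                public void run() { thirteenRuns.countDown(); }
            }, past, DELAY_MS);
            thirteenRuns.await();               // the 13 catch-up executions
            long elapsedMs = (System.nanoTime() - start) / 1000000;
            timer.cancel();
            System.out.println("13 executions in " + elapsedMs + " ms");
        }
    }

If the Timer behaves as documented, the latch is released almost
immediately, far below the 50-second limit.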

Back to the original test, I think this is a test-stabilization
issue, because the original test assumes that the timer is cancelled
within < 1 second, before the 14th task is called. This assumption may not
hold, for 2 reasons:
1. the test may be executed in jtreg concurrent mode on a slow host.
2. the system clock of a virtual machine may not be accurate (maybe faster
than the physical clock).

To support this point, I changed the test (as attached) to print the
execution times, to see whether the timer behaves as the API
documentation describes. The result is as expected.

The unrepeated task executed immediately: [1401855509336]
The repeated task executed immediately and repeated every 1 second:
[1401855509337, 1401855510337, 1401855511338]
The fixed-rate task executed immediately and caught up the delay:
[1401855509338, 1401855509338, 1401855509338, 1401855509338, 1401855509338,
1401855509338, 1401855509338, 1401855509338, 1401855509338, 1401855509338,
1401855509338, 1401855509836, 1401855510836]
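
The instrumentation is roughly the following - a simplified sketch of
my own, not the exact attached code:

    import java.util.Date;
    import java.util.List;
    import java.util.Timer;
    import java.util.TimerTask;
    import java.util.concurrent.CopyOnWriteArrayList;

    public class TimestampingDemo {
        public static void main(String[] args) throws InterruptedException {
            // Each execution records its wall-clock time, making the
            // fixed-rate catch-up behavior visible in the output.
            final List<Long> times = new CopyOnWriteArrayList<Long>();
            Timer timer = new Timer();
            Date past = new Date(System.currentTimeMillis() - 10 * 1000);
            timer.scheduleAtFixedRate(new TimerTask() {
                public void run() { times.add(System.currentTimeMillis()); }
            }, past, 1000);
            Thread.sleep(3000);
            timer.cancel();
            System.out.println("The fixed-rate task executed: " + times);
        }
    }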


Thanks,
Eric
On 2014/6/4 9:16, Martin Buchholz wrote:




On Tue, Jun 3, 2014 at 6:12 PM, Eric Wang <yiming.w...@oracle.com> wrote:

Hi Martin,

Sleeping for 1000 ms (sleep(1000)) is not enough to reproduce the failure,
because it is much shorter than the period DELAY_MS (10*1000 ms) of the
repeated task created by "scheduleAtFixedRate(t, counter(y3), past, DELAY_MS)".

With sleep(DELAY_MS), the failure can be reproduced.

  Well sure, then the task is rescheduled, so I expect it to fail in this
case.
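
A toy sketch of the mechanism, separate from the test itself: once the
main thread sleeps past the period, the next scheduled execution time
arrives and the timer legitimately fires again.

    import java.util.Date;
    import java.util.Timer;
    import java.util.TimerTask;
    import java.util.concurrent.atomic.AtomicInteger;

    public class ReschedulingDemo {
        public static void main(String[] args) throws InterruptedException {
            final long DELAY_MS = 2000;
            final AtomicInteger runs = new AtomicInteger();
            Timer timer = new Timer();
            timer.scheduleAtFixedRate(new TimerTask() {
                public void run() { runs.incrementAndGet(); }
            }, new Date(), DELAY_MS);
            Thread.sleep(DELAY_MS / 2);  // shorter than the period
            System.out.println("runs so far: " + runs.get());  // expect 1
            Thread.sleep(DELAY_MS);      // now slept past the period
            System.out.println("runs now: " + runs.get());     // expect 2
            timer.cancel();
        }
    }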

  But in my version I had set DELAY_MS to 100 seconds.  The point of
extending DELAY_MS is to prevent flaky failures on a slow machine.

  Again, how do we know that this test hasn't found a Timer bug?

  I still can't reproduce it.



