Hi Eric, Martin,
I'm fine with the rewrite. I'm not sure why the re-ordering of y3 will change the behavior of the test, but it will provide more debugging info.
Roger
On 6/6/2014 9:32 PM, Martin Buchholz wrote:
If you don't want to go with my rewrite, you can conservatively just
check in a 10x increase in all the constant durations and see whether
the flakiness goes away.
On Thu, Jun 5, 2014 at 9:46 PM, Martin Buchholz <marti...@google.com> wrote:
As with David, the cause of the failure is mystifying. How can things fail when we stay below the timeout value of 500ms? There's a bug either in Timer or in my own understanding of what should be happening.
Anyway, raising the timeout value (as I have done in my minor rewrite) seems prudent. Fortunately, we can write this test in a way that doesn't require actually waiting for the timeout to elapse.
On Wed, Jun 4, 2014 at 1:23 PM, Roger Riggs <roger.ri...@oracle.com> wrote:
Hi Martin, Eric,
Of several hundred failures of this test, most occurred in a JRE run with -Xcomp set. A few failures occurred with -Xmixed, none with -Xint.
The printed "elapsed" times (not normalized to hardware or OS) range from 24 to 132 ms, with most falling into several buckets in the 30s, 40s, 50s, and 70s.
I don't spot anything in the Timer.mainLoop code that might break when highly optimized, but that's one possibility.
Roger
On 6/4/2014 3:25 PM, Martin Buchholz wrote:
Tests for Timer are inherently timing (!) dependent.
It's reasonable for tests to assume that:
- reasonable events like creating a thread and executing a simple task should complete in less than, say, 2500 ms.
- the system clock will not change by a significant amount (> 1 sec) during the test. Yes, that means Timer tests are likely to fail during daylight saving time switchover - we can live with that. (We could even try to fix that by detecting deviations between clock time and elapsed time, but it's probably not worth it.)
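A minimal sketch of such a clock-deviation check, for illustration only (not from the actual test; the class name and the 1-second threshold are assumptions): compare a System.currentTimeMillis() delta against a System.nanoTime() delta and skip the elapsed-time assertions if the wall clock jumped.

    public class ClockJumpGuard {
        public static void main(String[] args) throws InterruptedException {
            long wallStart = System.currentTimeMillis();   // wall-clock time, can jump
            long monoStart = System.nanoTime();            // monotonic elapsed time

            Thread.sleep(500);   // stand-in for the timing-sensitive part of the test

            long wallDelta = System.currentTimeMillis() - wallStart;
            long monoDelta = (System.nanoTime() - monoStart) / 1_000_000;

            // If the two deltas disagree by more than a second, the system clock
            // moved during the test (DST switch, NTP step); skip the assertions.
            if (Math.abs(wallDelta - monoDelta) > 1000) {
                System.out.println("system clock moved; skipping elapsed-time assertions");
                return;
            }
            System.out.println("clock stable; safe to assert on elapsed times");
        }
    }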
Can you detect any real-world unreliability in my latest version of the test, not counting the daylight saving time switch?
I continue to resist your efforts to "fix" the test by removing chances for the SUT code to go wrong.
On Tue, Jun 3, 2014 at 11:28 PM, Eric Wang <yiming.w...@oracle.com> wrote:
Hi Martin,
Thanks for the explanation; now I understand why you set DELAY_MS to 100 seconds. It's true that it prevents failures on a slow host. However, I still have some concerns.
Because the test schedules tasks at a time in the past, all 13 tasks should be executed immediately and finish within a short time. If the elapsed-time limit is set to 50 s (DELAY_MS / 2), the timer has plenty of time to finish the tasks, so I wonder whether that loses the original test point.
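To make this concrete, a minimal sketch (my own class name and constants, not the actual test): schedule several fixed-rate tasks with a firstTime in the past and check that all of the overdue first executions complete well within DELAY_MS / 2, long before any repeat could fire.

    import java.util.Date;
    import java.util.Timer;
    import java.util.TimerTask;
    import java.util.concurrent.CountDownLatch;
    import java.util.concurrent.TimeUnit;

    public class PastScheduling {
        static final int NTASKS = 13;            // echoes the 13 tasks discussed above
        static final long DELAY_MS = 100_000;    // 100 s period: the repeat should never fire

        public static void main(String[] args) throws Exception {
            Timer timer = new Timer();
            final CountDownLatch done = new CountDownLatch(NTASKS);
            Date past = new Date(System.currentTimeMillis() - 10_000);

            long start = System.currentTimeMillis();
            for (int i = 0; i < NTASKS; i++) {
                timer.scheduleAtFixedRate(new TimerTask() {
                    public void run() { done.countDown(); }
                }, past, DELAY_MS);
            }

            // All first executions are overdue, so they should run almost at once;
            // DELAY_MS / 2 is a generous bound that still precedes any repeat.
            if (!done.await(DELAY_MS / 2, TimeUnit.MILLISECONDS))
                throw new AssertionError("tasks did not run promptly");
            System.out.println("elapsed = " + (System.currentTimeMillis() - start) + " ms");
            timer.cancel();
        }
    }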
Back to the original test, I think this is a test stabilization issue, because the original test assumes that the timer is cancelled within less than 1 second, before the 14th task is called. This assumption may not hold for two reasons:
1. The test may be executed in jtreg concurrent mode on a slow host.
2. The system clock of a virtual machine may not be accurate (it may run faster than the physical clock).
To support this point, I changed the test (as attached) to print the execution times and see whether the timer behaves as the API documentation describes. The result is as expected:
The unrepeated task executed immediately: [1401855509336]
The repeated task executed immediately and repeated per 1 second: [1401855509337, 1401855510337, 1401855511338]
The fixed-rate task executed immediately and catch up the delay: [1401855509338, 1401855509338, 1401855509338, 1401855509338, 1401855509338, 1401855509338, 1401855509338, 1401855509338, 1401855509338, 1401855509338, 1401855509338, 1401855509836, 1401855510836]
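A rough sketch of this kind of instrumentation (the real attachment isn't shown here, so class and variable names are my own): each task records System.currentTimeMillis() on every execution, and the three lists are printed after a few seconds.

    import java.util.Date;
    import java.util.List;
    import java.util.Timer;
    import java.util.TimerTask;
    import java.util.Vector;

    public class TimerTrace {
        // Returns a task that appends its execution time to the given list.
        static TimerTask recorder(final List<Long> times) {
            return new TimerTask() {
                public void run() { times.add(System.currentTimeMillis()); }
            };
        }

        public static void main(String[] args) throws Exception {
            Timer timer = new Timer();
            Date past = new Date(System.currentTimeMillis() - 10_000);  // firstTime in the past
            List<Long> once = new Vector<>();       // Vector: written from the timer thread
            List<Long> repeated = new Vector<>();
            List<Long> fixedRate = new Vector<>();

            timer.schedule(recorder(once), past);                        // one-shot, overdue
            timer.schedule(recorder(repeated), past, 1000);              // fixed-delay, 1 s period
            timer.scheduleAtFixedRate(recorder(fixedRate), past, 1000);  // fixed-rate, catches up

            Thread.sleep(3000);
            timer.cancel();

            System.out.println("unrepeated: " + once);
            System.out.println("repeated (fixed-delay): " + repeated);
            System.out.println("fixed-rate (catch-up): " + fixedRate);
        }
    }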
Thanks,
Eric
On 2014/6/4 9:16, Martin Buchholz wrote:
On Tue, Jun 3, 2014 at 6:12 PM, Eric Wang <yiming.w...@oracle.com> wrote:
Hi Martin,
A sleep(1000) is not enough to reproduce the failure, because it is much shorter than the period DELAY_MS (10*1000) of the repeated task created by "scheduleAtFixedRate(t, counter(y3), past, DELAY_MS)". With sleep(DELAY_MS), the failure can be reproduced.
Well sure, then the task is rescheduled, so I expect it to fail in this case. But in my version I had set DELAY_MS to 100 seconds. The point of extending DELAY_MS is to prevent flaky failures on a slow machine.
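A standalone sketch of that failure mode (my own names; the test's counter(y3) helper is not reproduced here): once the main thread sleeps past the fixed-rate period, the task is rescheduled and runs again, so a count-based assertion sees one execution too many.

    import java.util.Date;
    import java.util.Timer;
    import java.util.TimerTask;
    import java.util.concurrent.atomic.AtomicInteger;

    public class ExtraExecution {
        public static void main(String[] args) throws Exception {
            final long DELAY_MS = 1000;   // short period here, just to show the effect quickly
            final AtomicInteger runs = new AtomicInteger();
            Timer timer = new Timer();
            Date past = new Date(System.currentTimeMillis() - 10);

            timer.scheduleAtFixedRate(new TimerTask() {
                public void run() { runs.incrementAndGet(); }
            }, past, DELAY_MS);

            Thread.sleep(DELAY_MS / 2);   // stay below the period: expect exactly 1 execution
            System.out.println("after DELAY_MS/2: runs = " + runs.get());

            Thread.sleep(DELAY_MS);       // now past the period: the task has run again
            System.out.println("after another DELAY_MS: runs = " + runs.get());
            timer.cancel();
        }
    }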
Again, how do we know that this test hasn't found a Timer bug?
I still can't reproduce it.