Pavel, Deal. Handling early returns too: http://cr.openjdk.java.net/~mmimica/8228580/webrev.05/ I will ask there about socket timeout semantics.
On Tue, 24 Sep 2019 at 12:51, Pavel Rappo <pavel.ra...@oracle.com> wrote: > > Milan, > > Thanks for looking into this. I think you should ask a question on the > expected timing semantics and guarantees on net-dev (with maybe a cc to > nio-dev). > As for our test. I agree with you that we should simply work a possibility of > early returns into the check. > > ... > > /* The acceptable variation of early returns from timed socket operations. */ > private static final long PREMATURE_RETURN = adjustTimeout(100); > > ... > > long elapsed = NANOSECONDS.toMillis(System.nanoTime() - startNanos); > if (elapsed < timeout - PREMATURE_RETURN || elapsed > timeout + TOLERANCE) { > String msg = String.format( > "elapsed=%s, timeout=%s, TOLERANCE=%s, PREMATURE_RETURN=%s", > timeout, timeout, TOLERANCE, PREMATURE_RETURN); > throw new RuntimeException(msg); > } > > ... > > Thoughts? > > -Pavel > > > On 24 Sep 2019, at 09:12, Milan Mimica <milan.mim...@gmail.com> wrote: > > > > Hi Pavel > > > > Wow, I find this awesome. I don't have a Windows machine to play with, > > but I think I may have found something. > > The difference is how Java_sun_nio_ch_Net_poll is implemented. On unix > > it uses poll(2), on Windows it uses select(2). Regarding timeouts, > > poll() has "wait at least" semantics and overruns by design[1], while > > select() on windows has "waits at most" semantics, or how they put > > it[2]: "specifies the maximum time that select should wait before > > returning.". It returns early by design! Is this a known thing? I > > don't think there is much one can do here. It probably makes no sense > > to loop it and wait for time remainder. > > Java's soTimeout does not specify[3] should it wait at least or at > > most the specified timeout, so it's fine I guess. Old, "plain" socket > > impl are not much different. > > > > If the above is correct, should I just add a tolerance for the lower bound? > > > > [1] http://man7.org/linux/man-pages/man2/poll.2.html > > [2] > > https://docs.microsoft.com/en-us/windows/win32/api/winsock2/nf-winsock2-select > > [3] > > https://docs.oracle.com/javase/9/docs/api/java/net/Socket.html#setSoTimeout-int- > > > > On Mon, 23 Sep 2019 at 16:15, Pavel Rappo <pavel.ra...@oracle.com> wrote: > >> > >> Milan, > >> > >> I'm observing the latest version (.04) of this test failing quite > >> frequently (4/100) on Windows (Windows Server 2012 R2 6.3 (amd64)) > >> machines. The test passes fine on macOS and Linux. Here's the typical > >> output I see in the logs: > >> > >> java.lang.RuntimeException: Query took 4997 ms. . Timeout value is 5000 > >> java.lang.RuntimeException: Query took 4999 ms. . Timeout value is 5000 > >> java.lang.RuntimeException: Query took 4995 ms. . Timeout value is 5000 > >> java.lang.RuntimeException: Query took 4998 ms. . Timeout value is 5000 > >> ... > >> > >> Now, there might be many reasons for that. One of which would be that the > >> DnsClient code is buggy. The other reason would be that the accuracy > >> guaranteed by Windows implementation of `read` is not what we would > >> expect. Would you be able to investigate that? > >> > >> P.S. The good news is that the CSR has been approved: > >> > >> https://bugs.openjdk.java.net/browse/JDK-8230965 > >> > >>> On 23 Sep 2019, at 14:20, Milan Mimica <milan.mim...@gmail.com> wrote: > >>> > >>> Got it. Thanks Pavel! > >>> > >>> > >>> On Mon, 23 Sep 2019 at 13:37, Pavel Rappo <pavel.ra...@oracle.com> wrote: > >>>> > >>>> Milan, > >>>> > >>>> How do you check which tests are run? That's what I see in the > >>>> /test-support/jtreg_open_test_jdk_com_sun_jndi_dns_ConfigTests_TcpTimeout_java/com/sun/jndi/dns/ConfigTests/TcpTimeout.jtr > >>>> file after I have run the test locally on my machine: > >>>> > >>>> ----------messages:(5/233)---------- > >>>> command: main TcpTimeout > >>>> reason: User specified action: run main TcpTimeout > >>>> Mode: othervm > >>>> Additional options from @modules: --add-modules java.base --add-exports > >>>> java.base/sun.security.util=ALL-UNNAMED > >>>> elapsed time (seconds): 1.751 > >>>> > >>>> ... > >>>> > >>>> ----------messages:(5/313)---------- > >>>> command: main TcpTimeout -Dcom.sun.jndi.dns.timeout.initial=5000 > >>>> reason: User specified action: run main TcpTimeout > >>>> -Dcom.sun.jndi.dns.timeout.initial=5000 > >>>> Mode: othervm > >>>> Additional options from @modules: --add-modules java.base --add-exports > >>>> java.base/sun.security.util=ALL-UNNAMED > >>>> elapsed time (seconds): 5.498 > >>>> > >>>> ------------------------------------ > >>>> > >>>> Which is consistent with what I would expect given the timeout values. > >>>> > >>>> The following output does not tell the full story, just the name of the > >>>> test: > >>>> > >>>> ============================== > >>>> Test summary > >>>> ============================== > >>>> TEST TOTAL PASS FAIL > >>>> ERROR > >>>> jtreg:open/test/jdk/com/sun/jndi/dns/ConfigTests/TcpTimeout.java > >>>> 1 1 0 > >>>> 0 > >>>> ============================== > >>>> TEST SUCCESS > >>>> > >>>> -Pavel > >>>> > >>>>> On 20 Sep 2019, at 15:42, Milan Mimica <milan.mim...@gmail.com> wrote: > >>>>> > >>>>> Pavel, > >>>>> > >>>>> Here it is: http://cr.openjdk.java.net/~mmimica/8228580/webrev.04/ > >>>>> I don't see the test is run twice when I execute "make test > >>>>> TEST=jtreg:test/jdk/com/sun/jndi/dns/ConfigTests/TcpTimeout.java". Am > >>>>> I missing something? > >>>>> > >>> > >>> > >>> -- > >>> Milan Mimica > >> > > > > > > -- > > Milan Mimica > -- Milan Mimica