[ 
https://issues.apache.org/jira/browse/TEZ-4649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor resolved TEZ-4649.
-------------------------------
    Resolution: Fixed

> Java21 upgrade made TestAnalyzer#testWithATS flaky - change entry to nanotime 
> precision
> ---------------------------------------------------------------------------------------
>
>                 Key: TEZ-4649
>                 URL: https://issues.apache.org/jira/browse/TEZ-4649
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: László Bodor
>            Assignee: László Bodor
>            Priority: Major
>             Fix For: 1.0.0
>
>         Attachments: 
> BAD_TEZ_4649_org.apache.tez.analyzer.TestAnalyzer-output.txt, 
> GOOD_TEZ_4649_org.apache.tez.analyzer.TestAnalyzer-output.txt
>
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> tested by:
> {code}
>  mvn clean install -Denforcer.skip=true -nsu -Dtest=TestAnalyzer#testWithATS 
> -pl ./tez-tools/analyzers/job-analyzer -am
> {code}
> git bisected, and double-checked before and after this commit:
> {code}
> commit 03b5f1d68ed30d214d0a0aa3e6ec9fe67777372d
>     TEZ-4636: Move to JDK-21+ (#418)
> {code}
> I discovered this issue while testing jersey upgrade in the scope of TEZ-4648
> I'm afraid we don't run all the unit tests in case of a root pom.xml change 
> either, otherwise TEZ-4636 PR (https://github.com/apache/tez/pull/418), 
> should have shown this problem, which it hasn't (only TestRecovery)
> bad (with JDK21 patch, on JDK21 locally):
>  [^BAD_TEZ_4649_org.apache.tez.analyzer.TestAnalyzer-output.txt] 
> good (without JDK21 patch, on JDK17 locally):
>  [^GOOD_TEZ_4649_org.apache.tez.analyzer.TestAnalyzer-output.txt] 
> important note: both good and bad contain this:
> {code}
> SEVERE: Failed to generate the schema for the JAX-B elements
> javax.xml.bind.JAXBException: TimelineEntity and TimelineEntities has 
> IllegalAnnotation
>       at 
> org.apache.hadoop.yarn.server.applicationhistoryservice.webapp.ContextFactory.createContext(ContextFactory.java)
> {code}
> so this is not what makes the unit test fail (this red herring caused a few 
> hours of useless investigation for me in TEZ-4648)
> instead the problem might be:
> {code}
> 2025-09-17 12:49:38,826 INFO  [qtp830449117-475] requests.applicationhistory 
> (Slf4jRequestLogWriter.java:write(62)) - 127.0.0.1 - - [17/Sep/2025:10:49:38 
> +0000] "GET 
> /ws/v1/timeline/TEZ_TASK_ATTEMPT_ID?limit=100&primaryFilter=TEZ_DAG_ID:dag_1758106144809_0001_3&user.name=lbodor
>  HTTP/1.1" 200 7438 "-" "Java/21.0.8"
> 2025-09-17 12:49:38,827 ERROR [Time-limited test] history.ATSImportTool 
> (ATSImportTool.java:download(165)) - Exception in download
> java.util.zip.ZipException: duplicate entry: part-1758106178826.json
>       at 
> java.base/java.util.zip.ZipOutputStream.putNextEntry(ZipOutputStream.java:259)
>       at 
> org.apache.tez.history.ATSImportTool.downloadJSONArrayFromATS(ATSImportTool.java:265)
>       at 
> org.apache.tez.history.ATSImportTool.downloadData(ATSImportTool.java:233)
>       at org.apache.tez.history.ATSImportTool.download(ATSImportTool.java:162)
>       at org.apache.tez.history.ATSImportTool.run(ATSImportTool.java:350)
>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82)
>       at org.apache.tez.history.ATSImportTool.process(ATSImportTool.java:464)
>       at 
> org.apache.tez.analyzer.TestAnalyzer.getDagInfo(TestAnalyzer.java:264)
>       at org.apache.tez.analyzer.TestAnalyzer.verify(TestAnalyzer.java:252)
>       at org.apache.tez.analyzer.TestAnalyzer.runTests(TestAnalyzer.java:391)
>       at 
> org.apache.tez.analyzer.TestAnalyzer.testWithATS(TestAnalyzer.java:355)
>       at 
> java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
>       at java.base/java.lang.reflect.Method.invoke(Method.java:580)
>       at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>       at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>       at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>       at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>       at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>       at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>       at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
>       at java.base/java.lang.Thread.run(Thread.java:1583)
> {code}
> by more detailed logging, I got:
> {code}
> 2025-09-17 13:09:05,068 INFO  [qtp830449117-835] requests.applicationhistory 
> (Slf4jRequestLogWriter.java:write(62)) - 127.0.0.1 - - [17/Sep/2025:11:09:05 
> +0000] "GET 
> /ws/v1/timeline/TEZ_VERTEX_ID?limit=100&primaryFilter=TEZ_DAG_ID:dag_1758107310713_0001_10&user.name=lbodor
>  HTTP/1.1" 200 5942 "-" "Java/21.0.8"
> 2025-09-17 13:09:05,069 INFO  [Time-limited test] history.ATSImportTool 
> (ATSImportTool.java:downloadJSONArrayFromATS(268)) - Downloading 2 entities 
> to zip entry: part-1758107345069.json
> 2025-09-17 13:09:05,069 INFO  [qtp830449117-831] requests.applicationhistory 
> (Slf4jRequestLogWriter.java:write(62)) - 127.0.0.1 - - [17/Sep/2025:11:09:05 
> +0000] "GET 
> /ws/v1/timeline/TEZ_TASK_ID?limit=100&primaryFilter=TEZ_DAG_ID:dag_1758107310713_0001_10&user.name=lbodor
>  HTTP/1.1" 200 2780 "-" "Java/21.0.8"
> 2025-09-17 13:09:05,070 INFO  [Time-limited test] history.ATSImportTool 
> (ATSImportTool.java:downloadJSONArrayFromATS(268)) - Downloading 2 entities 
> to zip entry: part-1758107345070.json
> 2025-09-17 13:09:05,070 INFO  [qtp830449117-835] requests.applicationhistory 
> (Slf4jRequestLogWriter.java:write(62)) - 127.0.0.1 - - [17/Sep/2025:11:09:05 
> +0000] "GET 
> /ws/v1/timeline/TEZ_TASK_ATTEMPT_ID?limit=100&primaryFilter=TEZ_DAG_ID:dag_1758107310713_0001_10&user.name=lbodor
>  HTTP/1.1" 200 10395 "-" "Java/21.0.8"
> 2025-09-17 13:09:05,070 INFO  [Time-limited test] history.ATSImportTool 
> (ATSImportTool.java:downloadJSONArrayFromATS(268)) - Downloading 4 entities 
> to zip entry: part-1758107345070.json
> {code}
> this is not necessarily related to Java21, as I was able to have a passing 
> test on that also, but somehow Java21 test run made this very flaky, like 
> returning to the same point in the loop more than once within a millisecond
> need to have some randomness here



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to