Chris created TEZ-4640:
--------------------------

             Summary: there are many killed tez tasks in yarn
                 Key: TEZ-4640
                 URL: https://issues.apache.org/jira/browse/TEZ-4640
             Project: Apache Tez
          Issue Type: Bug
    Affects Versions: 0.10.2
         Environment: !image-2025-07-29-09-28-18-539.png!
            Reporter: Chris
         Attachments: image-2025-07-29-09-27-20-373.png, image-2025-07-29-09-28-18-539.png, image-2025-07-29-09-35-54-230.png, image-2025-07-29-09-38-53-667.png

I use Apache DolphinScheduler to run Hive tasks via Tez. However, the YARN Applications page always shows a lot of killed Tez tasks. These tasks are usually very short-lived, mostly about 1 second. This is not caused by a resource limit, because I only run a single job at a time.

I found this entry in the DolphinScheduler task log:
{code:java}
2025-07-28 21:33:11,599 INFO hive.HiveImport: 2025-07-28 21:33:11 INFO TezClient:780 - Could not connect to AM, killing session via YARN, sessionName=HIVE-24438a38-77f0-46fe-8e24-abf7559cf986, applicationId=application_1753709175100_0002
{code}
I checked the source code, and this message is logged when sessionShutdownSuccessful=false.

!image-2025-07-29-09-35-54-230.png!

That, in turn, happens because getAMProxy returns null. My question is: why doesn't the if statement include YarnApplicationState.ACCEPTED?

!image-2025-07-29-09-38-53-667.png!

--
This message was sent by Atlassian Jira
(v8.20.10#820010)