[
https://issues.apache.org/jira/browse/SPARK-34154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17272865#comment-17272865
]
Apache Spark commented on SPARK-34154:
--------------------------------------
User 'attilapiros' has created a pull request for this issue:
https://github.com/apache/spark/pull/31363
> Flaky Test: LocalityPlacementStrategySuite.handle large number of containers
> and tasks (SPARK-18750)
> ----------------------------------------------------------------------------------------------------
>
> Key: SPARK-34154
> URL: https://issues.apache.org/jira/browse/SPARK-34154
> Project: Spark
> Issue Type: Bug
> Components: YARN
> Affects Versions: 3.0.2, 3.2.0, 3.1.1
> Reporter: Dongjoon Hyun
> Priority: Major
>
> `LocalityPlacementStrategySuite` hangs sometimes like the following. We can
> retriever, but it takes our resource significantly because it hangs until the
> timeout (6 hours) occurs.
> [https://github.com/apache/spark/runs/1719480243]
> [https://github.com/apache/spark/runs/1724459002]
> [https://github.com/apache/spark/runs/1717958874]
> [https://github.com/apache/spark/runs/1731673955] (branch-3.0)
> {code:java}
> [info] LocalityPlacementStrategySuite:
> 17299[info] *** Test still running after 3 minutes, 6 seconds: suite name:
> LocalityPlacementStrategySuite, test name: handle large number of containers
> and tasks (SPARK-18750).
> 17300[info] *** Test still running after 8 minutes, 6 seconds: suite name:
> LocalityPlacementStrategySuite, test name: handle large number of containers
> and tasks (SPARK-18750).
> 17301[info] *** Test still running after 13 minutes, 6 seconds: suite name:
> LocalityPlacementStrategySuite, test name: handle large number of containers
> and tasks (SPARK-18750).
> 17302[info] *** Test still running after 18 minutes, 6 seconds: suite name:
> LocalityPlacementStrategySuite, test name: handle large number of containers
> and tasks (SPARK-18750).
> 17303[info] *** Test still running after 23 minutes, 6 seconds: suite name:
> LocalityPlacementStrategySuite, test name: handle large number of containers
> and tasks (SPARK-18750).
> 17304[info] *** Test still running after 28 minutes, 6 seconds: suite name:
> LocalityPlacementStrategySuite, test name: handle large number of containers
> and tasks (SPARK-18750).
> 17305[info] *** Test still running after 33 minutes, 6 seconds: suite name:
> LocalityPlacementStrategySuite, test name: handle large number of containers
> and tasks (SPARK-18750).
> 17306[info] *** Test still running after 38 minutes, 6 seconds: suite name:
> LocalityPlacementStrategySuite, test name: handle large number of containers
> and tasks (SPARK-18750).
> 17307[info] *** Test still running after 43 minutes, 6 seconds: suite name:
> LocalityPlacementStrategySuite, test name: handle large number of containers
> and tasks (SPARK-18750).
> 17308[info] *** Test still running after 48 minutes, 6 seconds: suite name:
> LocalityPlacementStrategySuite, test name: handle large number of containers
> and tasks (SPARK-18750).
> 17309[info] *** Test still running after 53 minutes, 6 seconds: suite name:
> LocalityPlacementStrategySuite, test name: handle large number of containers
> and tasks (SPARK-18750).
> 17310[info] *** Test still running after 58 minutes, 6 seconds: suite name:
> LocalityPlacementStrategySuite, test name: handle large number of containers
> and tasks (SPARK-18750).
> 17311[info] *** Test still running after 1 hour, 3 minutes, 6 seconds: suite
> name: LocalityPlacementStrategySuite, test name: handle large number of
> containers and tasks (SPARK-18750).
> 17312[info] *** Test still running after 1 hour, 8 minutes, 6 seconds: suite
> name: LocalityPlacementStrategySuite, test name: handle large number of
> containers and tasks (SPARK-18750).
> 17313[info] *** Test still running after 1 hour, 13 minutes, 6 seconds: suite
> name: LocalityPlacementStrategySuite, test name: handle large number of
> containers and tasks (SPARK-18750).
> 17314[info] *** Test still running after 1 hour, 18 minutes, 6 seconds: suite
> name: LocalityPlacementStrategySuite, test name: handle large number of
> containers and tasks (SPARK-18750).
> 17315[info] *** Test still running after 1 hour, 23 minutes, 6 seconds: suite
> name: LocalityPlacementStrategySuite, test name: handle large number of
> containers and tasks (SPARK-18750).
> 17316[info] *** Test still running after 1 hour, 28 minutes, 6 seconds: suite
> name: LocalityPlacementStrategySuite, test name: handle large number of
> containers and tasks (SPARK-18750).
> 17317[info] *** Test still running after 1 hour, 33 minutes, 6 seconds: suite
> name: LocalityPlacementStrategySuite, test name: handle large number of
> containers and tasks (SPARK-18750).
> 17318[info] *** Test still running after 1 hour, 38 minutes, 6 seconds: suite
> name: LocalityPlacementStrategySuite, test name: handle large number of
> containers and tasks (SPARK-18750).
> 17319[info] *** Test still running after 1 hour, 43 minutes, 6 seconds: suite
> name: LocalityPlacementStrategySuite, test name: handle large number of
> containers and tasks (SPARK-18750).
> 17320[info] *** Test still running after 1 hour, 48 minutes, 6 seconds: suite
> name: LocalityPlacementStrategySuite, test name: handle large number of
> containers and tasks (SPARK-18750).
> 17321[info] *** Test still running after 1 hour, 53 minutes, 6 seconds: suite
> name: LocalityPlacementStrategySuite, test name: handle large number of
> containers and tasks (SPARK-18750).
> 17322[info] *** Test still running after 1 hour, 58 minutes, 6 seconds: suite
> name: LocalityPlacementStrategySuite, test name: handle large number of
> containers and tasks (SPARK-18750).
> 17323[info] *** Test still running after 2 hours, 3 minutes, 6 seconds: suite
> name: LocalityPlacementStrategySuite, test name: handle large number of
> containers and tasks (SPARK-18750).
> 17324[info] *** Test still running after 2 hours, 8 minutes, 6 seconds: suite
> name: LocalityPlacementStrategySuite, test name: handle large number of
> containers and tasks (SPARK-18750).
> 17325[info] *** Test still running after 2 hours, 13 minutes, 6 seconds:
> suite name: LocalityPlacementStrategySuite, test name: handle large number of
> containers and tasks (SPARK-18750).
> 17326[info] *** Test still running after 2 hours, 18 minutes, 6 seconds:
> suite name: LocalityPlacementStrategySuite, test name: handle large number of
> containers and tasks (SPARK-18750).
> 17327[info] *** Test still running after 2 hours, 23 minutes, 6 seconds:
> suite name: LocalityPlacementStrategySuite, test name: handle large number of
> containers and tasks (SPARK-18750).
> 17328[info] *** Test still running after 2 hours, 28 minutes, 6 seconds:
> suite name: LocalityPlacementStrategySuite, test name: handle large number of
> containers and tasks (SPARK-18750).
> 17329[info] *** Test still running after 2 hours, 33 minutes, 6 seconds:
> suite name: LocalityPlacementStrategySuite, test name: handle large number of
> containers and tasks (SPARK-18750).
> 17330[info] *** Test still running after 2 hours, 38 minutes, 6 seconds:
> suite name: LocalityPlacementStrategySuite, test name: handle large number of
> containers and tasks (SPARK-18750).
> 17331[info] *** Test still running after 2 hours, 43 minutes, 6 seconds:
> suite name: LocalityPlacementStrategySuite, test name: handle large number of
> containers and tasks (SPARK-18750).
> 17332[info] *** Test still running after 2 hours, 48 minutes, 6 seconds:
> suite name: LocalityPlacementStrategySuite, test name: handle large number of
> containers and tasks (SPARK-18750).
> 17333[info] *** Test still running after 2 hours, 53 minutes, 6 seconds:
> suite name: LocalityPlacementStrategySuite, test name: handle large number of
> containers and tasks (SPARK-18750).
> 17334[info] *** Test still running after 2 hours, 58 minutes, 6 seconds:
> suite name: LocalityPlacementStrategySuite, test name: handle large number of
> containers and tasks (SPARK-18750).
> 17335[info] *** Test still running after 3 hours, 3 minutes, 6 seconds: suite
> name: LocalityPlacementStrategySuite, test name: handle large number of
> containers and tasks (SPARK-18750).
> 17336[info] *** Test still running after 3 hours, 8 minutes, 6 seconds: suite
> name: LocalityPlacementStrategySuite, test name: handle large number of
> containers and tasks (SPARK-18750).
> 17337[info] *** Test still running after 3 hours, 13 minutes, 6 seconds:
> suite name: LocalityPlacementStrategySuite, test name: handle large number of
> containers and tasks (SPARK-18750).
> 17338[info] *** Test still running after 3 hours, 18 minutes, 6 seconds:
> suite name: LocalityPlacementStrategySuite, test name: handle large number of
> containers and tasks (SPARK-18750).
> 17339[info] *** Test still running after 3 hours, 23 minutes, 6 seconds:
> suite name: LocalityPlacementStrategySuite, test name: handle large number of
> containers and tasks (SPARK-18750).
> 17340[info] *** Test still running after 3 hours, 28 minutes, 6 seconds:
> suite name: LocalityPlacementStrategySuite, test name: handle large number of
> containers and tasks (SPARK-18750).
> 17341[info] *** Test still running after 3 hours, 33 minutes, 6 seconds:
> suite name: LocalityPlacementStrategySuite, test name: handle large number of
> containers and tasks (SPARK-18750).
> 17342[info] *** Test still running after 3 hours, 38 minutes, 6 seconds:
> suite name: LocalityPlacementStrategySuite, test name: handle large number of
> containers and tasks (SPARK-18750).
> 17343[info] *** Test still running after 3 hours, 43 minutes, 6 seconds:
> suite name: LocalityPlacementStrategySuite, test name: handle large number of
> containers and tasks (SPARK-18750).
> 17344[info] *** Test still running after 3 hours, 48 minutes, 6 seconds:
> suite name: LocalityPlacementStrategySuite, test name: handle large number of
> containers and tasks (SPARK-18750).
> 17345[info] *** Test still running after 3 hours, 53 minutes, 6 seconds:
> suite name: LocalityPlacementStrategySuite, test name: handle large number of
> containers and tasks (SPARK-18750).
> 17346[info] *** Test still running after 3 hours, 58 minutes, 6 seconds:
> suite name: LocalityPlacementStrategySuite, test name: handle large number of
> containers and tasks (SPARK-18750).
> 17347[info] *** Test still running after 4 hours, 3 minutes, 6 seconds: suite
> name: LocalityPlacementStrategySuite, test name: handle large number of
> containers and tasks (SPARK-18750).
> 17348[info] *** Test still running after 4 hours, 8 minutes, 6 seconds: suite
> name: LocalityPlacementStrategySuite, test name: handle large number of
> containers and tasks (SPARK-18750).
> 17349[info] *** Test still running after 4 hours, 13 minutes, 6 seconds:
> suite name: LocalityPlacementStrategySuite, test name: handle large number of
> containers and tasks (SPARK-18750).
> 17350[info] *** Test still running after 4 hours, 18 minutes, 6 seconds:
> suite name: LocalityPlacementStrategySuite, test name: handle large number of
> containers and tasks (SPARK-18750). {code}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]