HyukjinKwon opened a new pull request, #56715:
URL: https://github.com/apache/spark/pull/56715

   ## What changes were proposed in this pull request?
   
   **[DO-NOT-MERGE]** This is a draft PR for validating a CI-stabilization fix. 
It targets `branch-4.2`, where the scheduled Maven build is red.
   
   `BaseYarnClusterSuite` configures a mini `CapacityScheduler` but never sets 
`yarn.scheduler.capacity.maximum-am-resource-percent`, so it defaults to `0.1`: 
at most 10% of the queue's resources may be used by ApplicationMasters. On 
memory-constrained CI runners that AM budget is only ~1GB, which is smaller 
than the 1–2GB AM/driver memory these tests request. Applications then stay 
stuck in the `ACCEPTED` state (never activated), and the suite times out after 
3 minutes with `handle.getState().isFinal() was false`.
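
The arithmetic behind the stall can be sketched as follows. This is a
simplified model, not the CapacityScheduler's actual code: it assumes the AM
budget is simply the queue's memory multiplied by
`maximum-am-resource-percent`. The 10240 MB queue size is an assumed value
chosen to reproduce the ~1GB budget, the 2048 MB request is the value reported
in the YARN diagnostics quoted later in this message, and the function names
are hypothetical.

```python
def am_resource_limit_mb(queue_memory_mb: int, max_am_percent: float) -> int:
    """Simplified model: AM budget = queue memory * maximum-am-resource-percent."""
    return int(queue_memory_mb * max_am_percent)


def can_activate(am_request_mb: int, queue_memory_mb: int, max_am_percent: float) -> bool:
    """In this model an application is activated only if its AM request fits the budget."""
    return am_request_mb <= am_resource_limit_mb(queue_memory_mb, max_am_percent)


# Assumption: a test queue with about 10 GB of memory on a constrained runner.
QUEUE_MEMORY_MB = 10240
DEFAULT_AM_PERCENT = 0.1  # the CapacityScheduler default described above

default_limit = am_resource_limit_mb(QUEUE_MEMORY_MB, DEFAULT_AM_PERCENT)
blocked = not can_activate(2048, QUEUE_MEMORY_MB, DEFAULT_AM_PERCENT)

print(f"AM budget: {default_limit} MB, 2048 MB AM blocked: {blocked}")
```

Under these assumptions a 2048 MB AM request never fits the budget, which
matches the applications staying in `ACCEPTED` until the test times out.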
   
   This sets `maximum-am-resource-percent` to `1.0` (global + `root.default`) 
so AMs can use the whole test queue and applications are always activated.
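
For reference, the two property keys implied by "global + `root.default`" can
be spelled out as below. This is only an illustrative sketch: the actual change
lives in `BaseYarnClusterSuite` and is not shown in this message. The per-queue
key follows the CapacityScheduler convention
`yarn.scheduler.capacity.<queue-path>.maximum-am-resource-percent`, and the
helper name `am_percent_overrides` is hypothetical.

```python
PREFIX = "yarn.scheduler.capacity"
AM_PERCENT = "maximum-am-resource-percent"


def am_percent_overrides(queue_paths, percent="1.0"):
    """Build the global override plus one override per queue path."""
    overrides = {f"{PREFIX}.{AM_PERCENT}": percent}
    for queue_path in queue_paths:
        overrides[f"{PREFIX}.{queue_path}.{AM_PERCENT}"] = percent
    return overrides


# The PR description mentions the global setting and the root.default queue.
overrides = am_percent_overrides(["root.default"])

for key, value in overrides.items():
    print(f"{key}={value}")
```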
   
   The last commit adds a focused YARN-only validation workflow and **must be 
reverted before merge**.
   
   ## Why are the changes needed?
   
   `YarnClusterSuite` fails 6 tests deterministically on the branch-4.2 Maven 
(Scala 2.13, JDK 21) scheduled run (e.g. run 28045133937) and the branch-4.x 
JDK17 run 28042117318:
   - run Spark in yarn-client/cluster mode with different configurations, 
ensuring redaction
   - yarn-cluster should respect conf overrides in SparkHadoopUtil 
(SPARK-16414, SPARK-23630)
   - SPARK-35672: additional jar using URI scheme 'local' (client, cluster, 
client + gateway-replacement)
   
   All six fail with the same 3-minute `eventually` timeout; the YARN 
diagnostics show `Queue's AM resource limit exceeded. AM Resource Request = 
<memory:2048>; Queue Resource Limit for AM = <memory:1024>` repeated more than 
1000 times.
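
When triaging similar failures, the two numbers can be pulled out of the
diagnostic text with a small script. The sketch below is an assumption-laden
helper, not part of this PR: the function name is hypothetical, and the
pattern only matches the abbreviated `<memory:N>` form quoted above, so real
diagnostics that carry additional fields would need a looser pattern.

```python
import re

# The diagnostic as quoted in this message, joined onto one line.
DIAG = (
    "Queue's AM resource limit exceeded. AM Resource Request = <memory:2048>; "
    "Queue Resource Limit for AM = <memory:1024>"
)

PATTERN = re.compile(
    r"AM Resource Request = <memory:(\d+)>; "
    r"Queue Resource Limit for AM = <memory:(\d+)>"
)


def parse_am_limit_diagnostic(line):
    """Return (request_mb, limit_mb), or None if the line does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


parsed = parse_am_limit_diagnostic(DIAG)
if parsed is not None:
    request_mb, limit_mb = parsed
    print(f"request={request_mb} MB, limit={limit_mb} MB, over by {request_mb - limit_mb} MB")
```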
   
   ## Does this PR introduce any user-facing change?
   
   No. Test-only.
   
   ## How was this patch tested?
   
   Ran a focused, YARN-only GitHub Actions workflow on the fork that runs the 
`resource-managers/yarn` tests.
   
   This pull request and its description were written by Isaac.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

