HyukjinKwon opened a new pull request, #56714:
URL: https://github.com/apache/spark/pull/56714
> **[DO-NOT-MERGE]** Verification-only PR. Do not merge. Opened as a draft to
> run CI on a batch of fixes that stabilize currently failing / flaky
> apache/spark master scheduled jobs. Individual fixes will be sent as
> separate, properly-attributed PRs.
### Scope
Stabilize failing/flaky CI jobs (excluding the Pandas 3 jobs the community is
already handling). This is a shared integration branch; commits are focused and
reference the job/area they fix.
### Fixes included so far
- **`UTF8String.getByte` out-of-bounds contract** — `getByte(int)` documents
"if byte index is invalid, returns 0" but did an unchecked `Platform.getByte`,
returning adjacent memory. Surfaced as `UTF8StringSuite.testGetByte`
*expected 0 but got 47* in **Maven (Scala 2.13, JDK 25)**. Added the bounds
check so behavior is deterministic across JDKs.
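
To illustrate the contract described above, here is a minimal sketch of a bounds-checked byte accessor. It is not the Spark patch: `ByteView`, `getByteUnchecked`, and the sample buffer are hypothetical stand-ins, and a plain `byte[]` replaces the `Platform.getByte` access so the example runs on its own. The adjacent byte is `/` (47) purely for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a string view over a larger buffer.
// This is NOT Spark's UTF8String; it only mirrors the documented contract.
final class ByteView {
    private final byte[] base;   // backing buffer; may hold unrelated bytes around the view
    private final int offset;    // start of the view inside base
    private final int numBytes;  // length of the view in bytes

    ByteView(byte[] base, int offset, int numBytes) {
        if (offset < 0 || numBytes < 0 || offset > base.length - numBytes) {
            throw new IllegalArgumentException("view does not fit in buffer");
        }
        this.base = base;
        this.offset = offset;
        this.numBytes = numBytes;
    }

    // No check against the view length: an index past the end of the view
    // returns whatever byte happens to sit next in the backing buffer.
    byte getByteUnchecked(int i) {
        return base[offset + i];
    }

    // Checked read: an invalid index returns 0, as the contract documents.
    byte getByte(int i) {
        if (i < 0 || i >= numBytes) {
            return 0;
        }
        return base[offset + i];
    }

    public static void main(String[] args) {
        byte[] buffer = "abc/def".getBytes(StandardCharsets.US_ASCII);
        ByteView view = new ByteView(buffer, 0, 3); // the view covers "abc" only

        System.out.println(view.getByteUnchecked(3)); // 47, the '/' just past the view
        System.out.println(view.getByte(3));          // 0
        System.out.println(view.getByte(-1));         // 0
        System.out.println(view.getByte(0));          // 97, 'a'
    }
}
```

In this sketch the unchecked read is still bounded by the Java array, so it always returns the same neighbouring byte. With raw memory access the out-of-range result depends on whatever lies next to the string, which is presumably why the failure showed up on only some JDKs.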
### Tracking (other failing lanes, handled separately / in progress)
- [ ] `datetime-formatting.sql` stale `.out.java21` golden file (Maven JDK21
& JDK25, JDK21 SBT) — regenerate JDK21 golden after SPARK-57575
- [ ] spark-connect `SparkSessionE2ESuite` interrupt tests
`RemoteClassLoaderError` flakiness
- [ ] pyspark connect pipelines `Race while writing batch 0` flakiness
- [ ] `MetricsFailureInjectionSuite` timing flakiness
- [ ] branch-4.2 Maven/Build JDK21/JDK25 failures
This pull request and its description were written by Isaac.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.