yew1eb opened a new pull request, #1958:
URL: https://github.com/apache/auron/pull/1958
# Which issue does this PR close?
Closes #1956
# Rationale for this change
# What changes are included in this PR?
## Spark 4 API Compatibility
### Servlet API Migration
Updated `AuronAllExecutionsPage.scala` to support both
`javax.servlet.http.HttpServletRequest` (Spark 3.x) and
`jakarta.servlet.http.HttpServletRequest` (Spark 4.x) via version-specific
`@sparkver` annotations, adapting to Spark 4's migration to Jakarta EE Servlet
API.
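The dual-API approach can be sketched roughly as follows. This is a simplified, self-contained stand-in: the real code uses the project's `@sparkver` annotation to compile exactly one variant per Spark build profile (the version strings and `render` signature below are illustrative), and the servlet classes appear only in comments since neither servlet jar is assumed here.

```scala
// Simplified model of the version-gated servlet handling described above.
// In the actual source, something like the following pair would exist, with
// @sparkver selecting one method per build profile:
//
//   @sparkver("3.0 / 3.1 / 3.2 / 3.3 / 3.4 / 3.5")
//   def render(req: javax.servlet.http.HttpServletRequest): Seq[Node] = ...
//
//   @sparkver("4.0")
//   def render(req: jakarta.servlet.http.HttpServletRequest): Seq[Node] = ...
//
// Runnable stand-in: resolve the servlet request class by Spark major version.
object ServletCompat {
  def servletRequestClass(sparkMajor: Int): String =
    if (sparkMajor >= 4) "jakarta.servlet.http.HttpServletRequest"
    else "javax.servlet.http.HttpServletRequest"
}
```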
### Shuffle API Changes
Adapted the shuffle components to Spark 4.x's refined
`ShuffleWriteProcessor.write` API (SPARK-44605), which triggers early
execution of shuffle writers and therefore diverges from the Spark 3.x
execution logic:
- Enhanced `AuronShuffleDependency` with a version-specific `getInputRdd`
method (returns `null` for Spark 3.x, returns `_rdd` for Spark 4.x) and exposed
the `inputRdd` field to retrieve the original RDD in Spark 4.x's
`ShuffleWriteProcessor.write` method.
- Returned `Iterator.empty` in `NativeRDD.compute()` for
`NativeRDD.ShuffleWrite` plans to defer execution to the
`ShuffleWriteProcessor.write()` method, aligning with Spark 3.x execution logic.
- Added a Spark 4.1-specific `ShuffleWriteProcessor.write` override in
`NativeShuffleExchangeExec` that accepts an `Iterator[_]` as input, asserts
that the iterator is empty (validating the adaptation logic), retrieves the
RDD via `AuronShuffleDependency.inputRdd`, and reuses the core shuffle logic
through `internalWrite` to stay consistent across Spark 3.x/4.x.
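Taken together, the three bullets amount to the flow below. This is a self-contained stand-in, not the actual source: real Spark types are replaced with a stub, and names such as `getInputRdd` and `internalWrite` are taken from the description above.

```scala
// Minimal model of the Spark 4.x shuffle-write adaptation described above.
// Real Spark RDDs are replaced with a tiny stub so the sketch compiles alone.
final case class StubRdd(id: Int)

// The dependency keeps the original RDD and exposes it only on Spark 4.x,
// where ShuffleWriteProcessor.write can no longer rely on receiving the
// real record iterator.
class StubShuffleDependency(private val _rdd: StubRdd) {
  def getInputRdd(sparkMajor: Int): StubRdd =
    if (sparkMajor >= 4) _rdd else null
}

object StubShuffleWriteProcessor {
  // Spark 4.x override: NativeRDD.compute() returned Iterator.empty for
  // ShuffleWrite plans, so write() asserts emptiness, recovers the input
  // RDD from the dependency, and reuses the shared internalWrite path.
  def write(records: Iterator[Any], dep: StubShuffleDependency): StubRdd = {
    assert(records.isEmpty, "ShuffleWrite plans must defer execution to write()")
    internalWrite(dep.getInputRdd(sparkMajor = 4))
  }

  private def internalWrite(rdd: StubRdd): StubRdd = rdd // core shuffle logic elided
}
```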
### SparkSession Package Path Change
Addressed Spark 4.x's `SparkSession` package restructuring:
- Spark 3.x: `org.apache.spark.sql.SparkSession` → Spark 4.x:
`org.apache.spark.sql.classic.SparkSession`
- Updated references in `NativeParquetInsertIntoHiveTableExec.scala` and
`NativeBroadcastExchangeBase.scala`
### New Data Types
Added stubs for Spark 4.x's new `GeographyVal`/`GeometryVal`/`VariantVal`
data types in columnar data structures (`AuronColumnarArray.scala`,
`AuronColumnarStruct.scala`, `AuronColumnarBatchRow.scala`). These stubs throw
`UnsupportedOperationException` to resolve compilation errors while avoiding
impact on Spark 3.x builds.
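The stub pattern looks roughly like this (the accessor name `getVariant` is illustrative; Spark 4's actual columnar accessor names for these types may differ):

```scala
// Illustrative stub mirroring the pattern described above: accessors for the
// new Spark 4 types exist only to satisfy the compiler and fail loudly if
// they are ever reached at runtime.
class ColumnarArrayStub {
  def getVariant(ordinal: Int): Nothing =
    throw new UnsupportedOperationException(
      s"VariantVal is not supported by Auron (ordinal=$ordinal)")
}
```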
# Are there any user-facing changes?
# How was this patch tested?
- [ ] Enabled Spark 4.1 in CI pipeline
- [ ] Passed all existing Unit Tests (UT)
- [ ] Passed all TPC-DS Integration Tests (IT)