yew1eb opened a new pull request, #1958:
URL: https://github.com/apache/auron/pull/1958
# Which issue does this PR close?
Closes #1956
# Rationale for this change
# What changes are included in this PR?
## Spark 4 API Compatibility
### Servlet API Migration
Updated `AuronAllExecutionsPage.scala` to support both
`javax.servlet.http.HttpServletRequest` (Spark 3.x) and
`jakarta.servlet.http.HttpServletRequest` (Spark 4.x) via version-specific
`@sparkver` annotations, adapting to Spark 4's migration to Jakarta EE Servlet
API.
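The dual-API approach can be sketched roughly as follows. This is a simplified, self-contained stand-in: the real code uses the project's `@sparkver` annotation to compile exactly one variant per Spark build profile (the version strings and `render` signature below are illustrative), and the servlet classes appear only in comments since neither servlet jar is assumed here.

```scala
// Simplified model of the version-gated servlet handling described above.
// In the actual source, something like the following pair would exist, with
// @sparkver selecting one method per build profile:
//
//   @sparkver("3.0 / 3.1 / 3.2 / 3.3 / 3.4 / 3.5")
//   def render(req: javax.servlet.http.HttpServletRequest): Seq[Node] = ...
//
//   @sparkver("4.0")
//   def render(req: jakarta.servlet.http.HttpServletRequest): Seq[Node] = ...
//
// Runnable stand-in: resolve the servlet request class by Spark major version.
object ServletCompat {
  def servletRequestClass(sparkMajor: Int): String =
    if (sparkMajor >= 4) "jakarta.servlet.http.HttpServletRequest"
    else "javax.servlet.http.HttpServletRequest"
}
```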
### Shuffle API Changes
Adapted the shuffle components to Spark 4.x's refined
`ShuffleWriteProcessor.write` API (SPARK-44605), which triggers early
execution of shuffle writers and therefore diverges from the Spark 3.x
execution logic:
- Enhanced `AuronShuffleDependency` with a version-specific `getInputRdd`
method (returns `null` for Spark 3.x, returns `_rdd` for Spark 4.x) and exposed
the `inputRdd` field to retrieve the original RDD in Spark 4.x's
`ShuffleWriteProcessor.write` method.
- Returned `Iterator.empty` in `NativeRDD.compute()` for
`NativeRDD.ShuffleWrite` plans to defer execution to the
`ShuffleWriteProcessor.write()` method, aligning with Spark 3.x execution logic.
- Added a Spark 4.1-specific `ShuffleWriteProcessor.write` override in
`NativeShuffleExchangeExec` that accepts an `Iterator[_]` as input, asserts
that the iterator is empty (validating the adaptation logic), retrieves the
RDD via `AuronShuffleDependency.inputRdd`, and reuses the core shuffle logic
through `internalWrite` to stay consistent across Spark 3.x/4.x.
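Taken together, the three bullets amount to the flow below. This is a self-contained stand-in, not the actual source: real Spark types are replaced with a stub, and names such as `getInputRdd` and `internalWrite` are taken from the description above.

```scala
// Minimal model of the Spark 4.x shuffle-write adaptation described above.
// Real Spark RDDs are replaced with a tiny stub so the sketch compiles alone.
final case class StubRdd(id: Int)

// The dependency keeps the original RDD and exposes it only on Spark 4.x,
// where ShuffleWriteProcessor.write can no longer rely on receiving the
// real record iterator.
class StubShuffleDependency(private val _rdd: StubRdd) {
  def getInputRdd(sparkMajor: Int): StubRdd =
    if (sparkMajor >= 4) _rdd else null
}

object StubShuffleWriteProcessor {
  // Spark 4.x override: NativeRDD.compute() returned Iterator.empty for
  // ShuffleWrite plans, so write() asserts emptiness, recovers the input
  // RDD from the dependency, and reuses the shared internalWrite path.
  def write(records: Iterator[Any], dep: StubShuffleDependency): StubRdd = {
    assert(records.isEmpty, "ShuffleWrite plans must defer execution to write()")
    internalWrite(dep.getInputRdd(sparkMajor = 4))
  }

  private def internalWrite(rdd: StubRdd): StubRdd = rdd // core shuffle logic elided
}
```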
### SparkSession Package Path Change
Addressed Spark 4.x's `SparkSession` package restructuring:
- Spark 3.x: `org.apache.spark.sql.SparkSession` → Spark 4.x:
`org.apache.spark.sql.classic.SparkSession`
- Updated references in `NativeParquetInsertIntoHiveTableExec.scala` and
`NativeBroadcastExchangeBase.scala`
### New Data Types
Added stubs for Spark 4.x's new `GeographyVal`/`GeometryVal`/`VariantVal`
data types in columnar data structures (`AuronColumnarArray.scala`,
`AuronColumnarStruct.scala`, `AuronColumnarBatchRow.scala`). These stubs throw
`UnsupportedOperationException` to resolve compilation errors while avoiding
impact on Spark 3.x builds.
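The stub pattern looks roughly like this (the accessor name `getVariant` is illustrative; Spark 4's actual columnar accessor names for these types may differ):

```scala
// Illustrative stub mirroring the pattern described above: accessors for the
// new Spark 4 types exist only to satisfy the compiler and fail loudly if
// they are ever reached at runtime.
class ColumnarArrayStub {
  def getVariant(ordinal: Int): Nothing =
    throw new UnsupportedOperationException(
      s"VariantVal is not supported by Auron (ordinal=$ordinal)")
}
```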
# Are there any user-facing changes?
# How was this patch tested?
- [ ] Enabled Spark 4.1 in CI pipeline
- [ ] Passed all existing Unit Tests (UT)
- [ ] Passed all TPC-DS Integration Tests (IT)