Re: [D] [Discuss] Keep the commit history when adding a new spark version [incubator-gluten]

via GitHub Sun, 04 Jan 2026 00:28:12 -0800


GitHub user baibaichen edited a discussion: [Discuss] Keep the commit history 
when adding a new spark version


Hi guys,

Referring to [this mailing list 
archive](https://lists.apache.org/[email protected]:2022-5:spark%203.3),
I suggest we follow a similar approach when Gluten adds support for a new Spark 
version. The log below is from https://github.com/apache/iceberg (newest commit first):

```
ed26fd7ac - DOAP: add release 1.10.1 (#14917)
317968426 - Spark: Initial support for 4.1.0 => step 3 build 
ddea5658e - Spark: Copy back 4.1 as 4.0      => step 2 copy back
314989243 - Spark: Move 4.0 as 4.1           => step 1 move  
752a2820c - Site: correct release time for 1.10.1 (#14918)
```
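
The three steps in that log can be sketched in a throwaway repository as below. This is only an illustration: the directory names `shims/spark40` and `shims/spark41` and the file `Shim.scala` are hypothetical placeholders, not the actual layout of Iceberg or Gluten.

```shell
#!/bin/sh
# Sketch of the move / copy back / build sequence in a scratch repository.
# All paths and file contents are hypothetical placeholders.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.invalid"

# Existing state: only the 4.0 tree exists.
mkdir -p shims/spark40
printf 'object Shim { val sparkVersion = "4.0" }\n' > shims/spark40/Shim.scala
git add shims
git commit -q -m "Existing 4.0 shim"

# Step 1 (move): rename the 4.0 tree to 4.1 without touching its content,
# so git records a pure rename.
git mv shims/spark40 shims/spark41
git commit -q -m "Spark: Move 4.0 as 4.1"

# Step 2 (copy back): restore 4.0 as a plain copy of what was just moved.
cp -R shims/spark41 shims/spark40
git add shims/spark40
git commit -q -m "Spark: Copy back 4.1 as 4.0"

# Step 3 (build): only now start changing the 4.1 tree.
printf 'object Shim { val sparkVersion = "4.1" }\n' > shims/spark41/Shim.scala
git commit -q -am "Spark: Initial support for 4.1.0"

# List the resulting commit subjects, newest first.
subjects=$(git log --format=%s)
printf '%s\n' "$subjects"
```

Keeping step 1 free of content changes is the point of the split: the move is then an exact rename, which is the easiest case for git's rename detection.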

The proposed changes for Gluten are:

1. GitHub offers three methods for merging PRs. While Iceberg uses **"Rebase 
and merge"** for this type of PR, I suggest we perform a manual rebase first 
and then use **"Create a merge commit"**.
2. Iceberg's **"Initial support for 4.1.0"** is a single large commit. I 
suggest that developers instead create a merge commit locally, so that the 
history of all the smaller individual commits is retained.
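
A minimal sketch of the rebase-then-merge flow from the list above, in a scratch repository. The post does not give commands, so reading "merge commit" as `git merge --no-ff` is my assumption, and the branch name `spark-4.1-fixes` and the files are hypothetical.

```shell
#!/bin/sh
# Sketch: rebase a branch of small commits, then merge it with a merge commit.
# Branch and file names are hypothetical placeholders.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.invalid"

printf 'base\n' > README
git add README
git commit -q -m "Base commit"
main=$(git symbolic-ref --short HEAD)   # default branch name varies by setup

# A feature branch made of several small commits.
git checkout -q -b spark-4.1-fixes
printf 'fix 1\n' > fix1.txt
git add fix1.txt
git commit -q -m "[Fix] First small fix"
printf 'fix 2\n' > fix2.txt
git add fix2.txt
git commit -q -m "[Fix] Second small fix"

# Meanwhile the main branch moves forward.
git checkout -q "$main"
printf 'daily\n' > daily.txt
git add daily.txt
git commit -q -m "Daily update"

# Manual rebase first, so the small commits sit on top of the latest main.
git checkout -q spark-4.1-fixes
git rebase -q "$main"

# --no-ff forces a merge commit even though a fast-forward would be possible,
# which keeps the small commits visible as a side branch in the graph.
git checkout -q "$main"
git merge -q --no-ff -m "Spark: Initial support for 4.1.0" spark-4.1-fixes

git --no-pager log --graph --oneline
```

Because of the rebase, the merge commit introduces no changes of its own; it only groups the small commits under one subject.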

I have opened a PR that demonstrates this: 
https://github.com/apache/incubator-gluten/pull/11347. After merging, the 
history would look like this:

```
*   5d2350e0f test
|\  
| *   d7d78a453 Spark: Initial support for 4.1.0
| |\  
| | * d2ee7884a [Fix] Add GeographyVal and GeometryVal support in ColumnarArrayShim
| | * 2ef147cb2 [4.1.0] Only Run ArrowEvalPythonExecSuite tests up to Spark 4.0, we need update ci python to 3.10
| | * 031b12e03 [4.1.0] Exclude split test in VeloxStringFunctionsSuite
| | * 0e7f6d456 [Fix] Refactor Spark version checks in VeloxHashJoinSuite to improve readability and maintainability
| | * f1ecdf312 [Fix] Fix MiscOperatorSuite to support OneRowRelationExec plan Spark 4.1
| | * 3953a17c8 [Fix] Refactor Spark version checks in VeloxHashJoinSuite to improve readability and maintainability
| | * 6228d63b3 [Fix] Update Scala version to 2.13.17 in pom.xml to fix `java.lang.NoSuchMethodError: 'java.lang.S..
| | * 82b8cc059 [Fix] Using new interface of ParquetFooterReader
| | * df3741ddf [Fix] Adapt to DataSourceV2Relation interface change
| | * 1855fe465 [Fix] Adapt to QueryExecution.createSparkPlan interface change
| | * 61845a8a4 [Fix] Remove TimeAdd from ExpressionConverter and ExpressionMappings for test
| | * 12b401798 [Fix] Add missing StoragePartitionJoinParams import in BatchScanExecShim and AbstractBatchScanExec
| | * 9ba81906b [Fix] Remove unused MDC import in FileSourceScanExecShim.scala
| | * ea32ac233 [Fix] Add printOutputColumns parameter to generateTreeString methods
| | * 937b8c397 [Fix] Use class name instead of class object for streaming call detection to ensure Spark 4.1 comp..
| | * 21293c4bc [Feat] Introduce Spark41Shims and update build configuration to support Spark 4.1.
| |/  
| * 9b3d3038e Spark: Copy back 4.1 as 4.0
| * 68ece8e6b Spark: Move 4.0 as 4.1
|/  
* be247e112 [GLUTEN-6887][VL] Daily Update Velox Version (dft-2026_01_02) (#11349)   => current head
```

Below is how this would appear in different IDEs:

| IDE | Screenshot |
| :--- | :--- |
| VS Code | ![image-1](https://github.com/user-attachments/assets/6df58640-83f7-4444-bb34-f04babf92ac0) |
| IntelliJ | ![image-2](https://github.com/user-attachments/assets/731c91f4-c6f3-4d0d-8e1c-559c5d9ae2d5) |

For instance, PR https://github.com/apache/incubator-gluten/pull/11331 
introduced ColumnarArrayShim. After the merge, the file's history still traces 
back through the move to that PR:

```
d2ee7884a [Fix] Add GeographyVal and GeometryVal support in ColumnarArrayShim
68ece8e6b Spark: Move 4.0 as 4.1
41073d5b1 [GLUTEN-11330][VL] Make PartialProject support array and map with null values (#11331)
```
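
A per-file history like the one above can be checked with `git log --follow`, which continues past a rename. The sketch below uses a scratch repository with a hypothetical file `Example.scala` under hypothetical `shims/spark40` and `shims/spark41` directories, and compares the log with and without `--follow`.

```shell
#!/bin/sh
# Sketch: per-file history across a directory move, with and without --follow.
# Paths and file contents are hypothetical placeholders.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.invalid"

mkdir -p shims/spark40
printf 'object Example { val n = 1 }\n' > shims/spark40/Example.scala
git add shims
git commit -q -m "Original change"

# Pure rename: content is untouched in this commit.
git mv shims/spark40 shims/spark41
git commit -q -m "Spark: Move 4.0 as 4.1"

printf 'object Example { val n = 2 }\n' > shims/spark41/Example.scala
git commit -q -am "[Fix] Follow-up change"

# --follow keeps walking past the rename, back to the original commit.
with_follow=$(git log --follow --format=%s -- shims/spark41/Example.scala)
# Without it, history stops at the commit where the new path first appears.
without_follow=$(git log --format=%s -- shims/spark41/Example.scala)

printf 'with --follow:\n%s\n\nwithout --follow:\n%s\n' "$with_follow" "$without_follow"
```

Note that `--follow` accepts only a single file path, so it is used per file rather than on a whole directory.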

GitHub link: https://github.com/apache/incubator-gluten/discussions/11352
