Hi Anurag,

Thank you for calling this out. TIL about Spark's quarterly updates!

A few naive questions: do we need to support more than 2 major Spark versions in CI?
Is it correct to assume API interface changes should only happen across major version updates?
Is the Spark community doing this with the built-in assumption that minor version upgrades will be relatively easy going forward?

Thank you,
Kurtis C. Wright

On Jun 1, 2026, at 15:25, Anurag Mantripragada <[email protected]> wrote:


Hi all,

With Spark 3.4 now removed after the 1.11 release, and the Spark community proposing quarterly minor releases, I'd like to start a discussion on how Iceberg should adapt its Spark version support strategy going forward.

Where we are today
On main we support three Spark versions: 3.5, 4.0, and 4.1. Our CI matrix runs 16 jobs across these, which is already becoming a bottleneck.

Historically, we have deprecated and removed Spark versions in an ad-hoc fashion. This worked with ~2 Spark minors per year, but with the new quarterly Spark releases it may not scale.

As per the Spark SPIP, this is what is coming next:

Date           Release   Maintenance   Notes
April 2026     4.2       6 months      Non-LTS (Past)
July 2026      4.3       6 months      Non-LTS
October 2026   4.4       6 months      Non-LTS
January 2027   4.5       18 months     LTS
April 2027     5.0       Major
This means 4 Spark minors per year, each non-LTS minor with only a 6-month maintenance window, and an LTS roughly once a year.

I propose we adopt a policy instead of making ad-hoc decisions. Some options I see:
  1. LTS + rolling window of 2 minors: Support the current Spark LTS and the 2 most recent minors. For example, when 4.2 GA ships, add it and deprecate 4.0, and when 4.3 ships, add it and deprecate 4.1. This provides a predictable cadence but also means a version add/drop every quarter.
  2. LTS + selective minors: Support the Spark LTS and choose minors that have meaningful DSv2 API changes, skipping versions that are incremental. More flexible but less predictable for users. (This is the current strategy.)
Any strategy must account for the CI infra ceiling too. Recent improvements have helped, but I think we should support at most 3 versions to keep this under control.
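
To make option 1 concrete, here is a minimal sketch of the rolling-window rule, not an implementation of anything in the Iceberg build. The function name `supported_versions`, its parameters, and the treatment of 3.5 as the current LTS are assumptions for illustration (the 3.5 assumption is only implied by the example in option 1).

```python
from typing import List, Optional, Tuple


def version_key(version: str) -> Tuple[int, ...]:
    """Turn "4.10" into (4, 10) so versions sort numerically, not as text."""
    return tuple(int(part) for part in version.split("."))


def supported_versions(
    released: List[str],
    lts: Optional[str],
    recent_minors: int = 2,
) -> List[str]:
    """Hypothetical option 1 rule: the current LTS plus the N newest releases.

    `released` lists every Spark version that has shipped so far.
    `lts` is the version treated as the current LTS (an assumption of
    this sketch), or None if there is no LTS to keep.
    """
    ordered = sorted(set(released), key=version_key)
    window = ordered[-recent_minors:]
    # Keep the LTS even when it has fallen out of the rolling window.
    if lts is not None and lts in ordered and lts not in window:
        window = [lts] + window
    return sorted(window, key=version_key)


# Assumed for illustration: 3.5 is the LTS that stays while minors roll.
after_4_2 = supported_versions(["3.5", "4.0", "4.1", "4.2"], lts="3.5")
after_4_3 = supported_versions(["3.5", "4.0", "4.1", "4.2", "4.3"], lts="3.5")
print(after_4_2)
print(after_4_3)
```

Under these assumptions the result never exceeds `recent_minors + 1` versions, so the default of 2 stays within the proposed ceiling of 3, and it drops to 2 whenever the LTS is itself one of the two newest releases.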
